Observation
CPU architecture consists of two components:
- Instruction Set Architecture (ISA) - The logical model of the CPU. What operations are available to you? What storage is available to you? What data types are provided to you?
  A brief aside. An ISA defines the vocabulary of the CPU. Consider add: the ISA specifies the overflow and underflow behavior of the storage it writes to, as well as the size of data types such as integers and doubles.
  But you might ask, isn't this also covered by the language specification? Remember that the ISA declares what a CPU can do. The language specification defines what the language is, as enforced by a compliant compiler. They are separate specifications; the compiler translates from one to the other.
- Microarchitecture (uarch) - The physical design of a chip built to execute the ISA. This covers details such as the number of cores, the cache sizes, and the pipeline length.
Complications
From an application point of view, we care about this layer beneath us because it affects which packages and libraries are available to us. To that end, consider the task of multi-platform builds. I have an application that I want to make available on Windows, Linux, and macOS. I cannot ignore the different underlying architectures. So how do you go about multi-platform building?
In the Wild
CPython
Look at the CPython repo on GitHub:
strategy:
  fail-fast: false
  matrix:
    target:
      - i686-pc-windows-msvc/msvc
      - x86_64-pc-windows-msvc/msvc
      - aarch64-pc-windows-msvc/msvc
      - x86_64-apple-darwin/clang
      - aarch64-apple-darwin/clang
      - x86_64-unknown-linux-gnu/gcc
      - aarch64-unknown-linux-gnu/gcc
    debug:
      - true
      - false
    llvm:
      - 21
    include:
      - target: i686-pc-windows-msvc/msvc
        architecture: Win32
        runner: windows-2022
      - target: x86_64-pc-windows-msvc/msvc
        architecture: x64
        runner: windows-2022
      - target: aarch64-pc-windows-msvc/msvc
        architecture: ARM64
        runner: windows-11-arm
      - target: x86_64-apple-darwin/clang
        architecture: x86_64
        runner: macos-15-intel
      - target: aarch64-apple-darwin/clang
        architecture: aarch64
        runner: macos-14
      - target: x86_64-unknown-linux-gnu/gcc
        architecture: x86_64
        runner: ubuntu-24.04
      - target: aarch64-unknown-linux-gnu/gcc
        architecture: aarch64
        runner: ubuntu-24.04-arm
They use GitHub runners and pick a runner per target, so each target gets a machine of the matching architecture. The repo is written in both C and Python: the performance-critical parts are in C, and the higher-level layers are in Python for ease of abstraction and maintenance. To target multiple platforms, their setup provisions different VMs with different architectures to build against.
Terraform
Now consider Terraform's GitHub repo:
strategy:
  matrix:
    include:
      - { goos: "freebsd", goarch: "386", runson: "ubuntu-latest", cgo-enabled: "0" }
      - { goos: "freebsd", goarch: "amd64", runson: "ubuntu-latest", cgo-enabled: "0" }
      - { goos: "freebsd", goarch: "arm", runson: "ubuntu-latest", cgo-enabled: "0" }
      - { goos: "linux", goarch: "386", runson: "ubuntu-latest", cgo-enabled: "0" }
      - { goos: "linux", goarch: "amd64", runson: "ubuntu-latest", cgo-enabled: "0" }
      - { goos: "linux", goarch: "arm", runson: "ubuntu-latest", cgo-enabled: "0" }
      - { goos: "linux", goarch: "arm64", runson: "ubuntu-latest", cgo-enabled: "0" }
      - { goos: "openbsd", goarch: "386", runson: "ubuntu-latest", cgo-enabled: "0" }
      - { goos: "openbsd", goarch: "amd64", runson: "ubuntu-latest", cgo-enabled: "0" }
      - { goos: "solaris", goarch: "amd64", runson: "ubuntu-latest", cgo-enabled: "0" }
      - { goos: "windows", goarch: "386", runson: "ubuntu-latest", cgo-enabled: "0" }
      - { goos: "windows", goarch: "amd64", runson: "ubuntu-latest", cgo-enabled: "0" }
      - { goos: "windows", goarch: "arm64", runson: "ubuntu-latest", cgo-enabled: "0" }
      - { goos: "darwin", goarch: "amd64", runson: "ubuntu-latest", cgo-enabled: "0" }
      - { goos: "darwin", goarch: "arm64", runson: "ubuntu-latest", cgo-enabled: "0" }
  fail-fast: false
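In their setup, they do not need to run separate VMs; everything runs on Linux runners and Go handles the cross-compilation. Note that cgo-enabled is "0" for every entry: with cgo disabled, the build does not need a C toolchain for the target.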
Zig
Zig ships a hermetic toolchain (zig cc) that can be used for cross-compiling. It works by bundling the arch-specific standard libraries (stdlib for aarch64, stdlib for x86_64, etc.). At its core it is a mapping from each target to its standard library headers and the corresponding library sources. When you invoke it, it builds from scratch, from the standard library layer upwards.
Compare this to the alternative, where you would need to install:
- the specific version of GCC/Clang for the desired arch
- the target headers for the desired arch
- the target libraries for the desired arch
Furthermore, you can run into conflicts: packages for the foreign architecture may overwrite the x86_64 ones you had installed before.
Final Notes
Go supports cross-compilation as a core feature. It does so by maintaining its own compiler backend: its own IR and the mapping from that IR to each supported architecture's instruction set. Other projects that build on LLVM need Clang and LLVM to cross-compile.