September 10, 2017

My last post explained that invoking swift/utils/build-script compiles the C++ source code in the apple/swift project and produces a swift compiler executable.

swift/utils/build-script is a Python script that invokes another script, swift/utils/build-script-impl. That, in turn, invokes cmake, in order to configure and then compile the source code in apple/swift.

I'll refer to this process as the Swift build system, and I'll explain it in detail below.

swift/utils/build-script invokes swift/utils/build-script-impl, which invokes cmake, which configures and builds apple/swift.

There are many benefits to having a deep understanding of the build system:

When making changes to the apple/swift project, I typically edit source code, then compile that code and run its tests. By understanding the various ways to invoke the build system, I'm able to compile and test the subset of the project that's related to my changes. It takes several minutes to run the entire Swift test suite, but only a fraction of a second to run just the tests I care about.
The build system has a great deal of room for improvement, so I found it to be a very easy place to start contributing to Swift. At the point that I received commit access to apple/swift, over half of my pull requests had been related to the build system.
Improvements to the build system are impactful: a small change could result in faster builds of the Swift compiler, which in turn would speed up the work of the core team, other contributors, and CI. This recent change to the LLVM build system is a good example: it eliminated several thousand build actions by modifying just seven lines of build system code.

I'll begin with the build system's core: CMake.

Using CMake: An example

The cmake executable reads script files that describe how a software project builds. For example, here's how I'd use it to build a simple C++ program, composed of a single source file, /tmp/src/hello.cpp:

/tmp/src/hello.cpp

#include <iostream>

int main() {
  std::cout << "Hello!" << std::endl;
  return 0;
}

I can describe how to build this program using the CMake language, in a file named src/CMakeLists.txt:

/tmp/src/CMakeLists.txt

# CMake version 3.2 or greater must
# be used to configure this project.
cmake_minimum_required(VERSION 3.2)

# The name of the project is 'Hello'.
project(Hello)

# When built, create an executable
# named 'hello', by compiling the
# source file 'hello.cpp'.
add_executable(hello
               hello.cpp)

Running cmake on the command line generates build files that are used to actually build the project:

# Read the CMakeLists.txt in the '/tmp/src' directory,
# and generate build files at '/tmp/build'.
cmake -H/tmp/src -B/tmp/build

I can then compile the hello executable by executing the build files in /tmp/build. By default, CMake generates a Makefile, which I can execute like so:

make -C /tmp/build

Running the above command compiles and links the executable /tmp/build/hello. Running that program outputs the expected text, "Hello!".

It's important to note that CMake itself doesn't compile and link the hello executable. CMake generates build files, and those build files compile and link hello. I can instruct CMake to generate different kinds of build files, including an Xcode project:

cmake -H/tmp/src -B/tmp/build-with-xcode -G Xcode

The above command generates /tmp/build-with-xcode/Hello.xcodeproj. I can open Hello.xcodeproj in Xcode and build hello by clicking the "Run" button.

I can also build hello via the command line, by invoking xcodebuild:

xcodebuild -project /tmp/build-with-xcode/Hello.xcodeproj

Notice how, when I didn't specify a build file generator using the cmake -G argument, cmake generated a Makefile that I could build by invoking make. The Xcode project, on the other hand, was built by invoking xcodebuild. Remembering which invocation to use is tiresome, so cmake provides a convenient way to build using whichever files were generated:

# Build using the Makefile in the
# /tmp/build directory.
cmake --build /tmp/build

# Build using the .xcodeproj in the
# /tmp/build-with-xcode directory.
cmake --build /tmp/build-with-xcode
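The configure-then-build pattern above is easy to script. What follows is a minimal Python sketch, not part of CMake or apple/swift; the helper name cmake_commands is hypothetical. It only assembles the two command lines used in this section, to show that the build step looks the same no matter which generator was chosen:

```python
def cmake_commands(source_dir, build_dir, generator=None):
    """Return the configure and build command lines as argument lists.

    Hypothetical helper for illustration: it mirrors the invocations
    shown above and does not run anything itself.
    """
    configure = ["cmake", "-H" + source_dir, "-B" + build_dir]
    if generator is not None:
        # -G selects the kind of build files, for example "Xcode".
        configure += ["-G", generator]
    # 'cmake --build' delegates to whichever native tool matches the
    # build files that were generated in build_dir.
    build = ["cmake", "--build", build_dir]
    return [configure, build]


for command in cmake_commands("/tmp/src", "/tmp/build-with-xcode", "Xcode"):
    print(" ".join(command))
```

Passing each list to subprocess.check_call would actually run the commands; the sketch stops short of that so it can be read without CMake installed.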

CMake has many more features than just the three commands from my example above. I'll cover additional concepts below, as we encounter them in apple/swift's CMake code. To learn about CMake itself, try LLVM's CMake primer, or the latest official CMake documentation.

Why does apple/swift use CMake?

If you're an iOS developer, you've likely used Xcode project files to specify how your iOS application is built. So why couldn't the apple/swift project just use Xcode, too? The simple hello example above demonstrates several reasons why:

Xcode is a macOS application; it can't be used to build apple/swift on Linux machines. CMake, on the other hand, can generate one set of build files on Linux, or an Xcode project on macOS, or even a Visual Studio "solution" file on Windows. CMake's versatility allows it to generate the best build files for any platform.
Xcode project files are massive XML files that contain a large number of automatically generated ID strings. If you've worked on a small team of even two or three iOS developers, you know how hard it is to resolve multiple sets of changes to such files. Now just imagine resolving those conflicts across the 500+ contributors to apple/swift! Because CMake uses a plain text scripting language, it's easy to read and modify.
Some of CMake's generators build projects faster than Xcode can. For example, try reconfiguring the hello example above to use cmake -G Ninja. This generates Ninja build files, which are incredibly fast.

Building apple/swift with CMake

The apple/swift project is essentially the same as my "Hello" project above. Yes, it is a larger project with more CMake code, but it can be configured and built in the same way "Hello" can. Doing so is a useful exercise in understanding the apple/swift build infrastructure.

First, I'll clone the source code from the apple/swift project, as well as the three projects it depends upon:

apple/swift-cmark: A Markdown parsing library, used by apple/swift when parsing Markdown documentation and comment blocks. This is Apple's fork of commonmark/cmark.
apple/swift-llvm: apple/swift uses LLVM in a variety of ways. The most prominent is as a "backend": a library that generates the 0's and 1's that are capable of running on the target machine, whether that be an iOS armv7 device, a macOS x86_64 device, or even Android armv7. But apple/swift also uses LLVM utilities such as lit and FileCheck, two tools used to run the apple/swift test suite.
apple/swift-clang: apple/swift makes use of Clang, a C and C++ compiler, as part of its C and Objective-C interoperability.

I'll clone these projects into a directory named ~/local/Source/apple/standalone:

git clone https://github.com/apple/swift-cmark.git \
    ~/local/Source/apple/standalone/swift-cmark

git clone https://github.com/apple/swift-llvm.git \
    ~/local/Source/apple/standalone/swift-llvm

# A bug in Swift's CMake requires this repository
# to be named 'clang', not 'swift-clang'.
# https://bugs.swift.org/browse/SR-5778
git clone https://github.com/apple/swift-clang.git \
    ~/local/Source/apple/standalone/clang

git clone https://github.com/apple/swift.git \
    ~/local/Source/apple/standalone/swift

Before I can configure and build apple/swift, I need to build its dependencies. First is apple/swift-cmark. I'll generate Ninja build files because they're faster:

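# First configure. Note the use of $HOME: the shell
# only expands '~' at the start of a word, so it
# would be passed through literally after -H or -B.
cmake \
    -H$HOME/local/Source/apple/standalone/swift-cmark \
    -B$HOME/local/Source/apple/standalone/swift-cmark-build \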
    -G Ninja

# Then build.
cmake --build \
    ~/local/Source/apple/standalone/swift-cmark-build

Next up are apple/swift-llvm and apple/swift-clang. LLVM's CMake defines cache entry settings that allow users to build both LLVM and Clang at once. As an example of a cache entry setting, consider the following, defined in swift-llvm/CMakeLists.txt:

swift-llvm/CMakeLists.txt

set(LLVM_ENABLE_PROJECTS "" CACHE STRING
    "Semicolon-separated list of projects to build (${LLVM_ALL_PROJECTS}), or \"all\".")

Users are able to set this value by invoking cmake -DLLVM_ENABLE_PROJECTS="foo" when configuring apple/swift-llvm. The empty string "" is the default value used when the user does not specify -DLLVM_ENABLE_PROJECTS= on the command line.
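To make the relationship between cache entries and the command line concrete, here is a minimal Python sketch that turns a mapping of settings into -D flags. The function name cache_definitions is hypothetical and not part of LLVM or CMake. List values are joined with semicolons, matching the "semicolon-separated list" described in the help text above:

```python
def cache_definitions(settings):
    """Turn a mapping of CMake cache entries into -D command-line flags.

    List values are joined with semicolons, which is how CMake
    represents lists.
    """
    flags = []
    for name, value in settings.items():
        if isinstance(value, (list, tuple)):
            value = ";".join(value)
        flags.append("-D{}={}".format(name, value))
    return flags


print(cache_definitions({"LLVM_ENABLE_PROJECTS": ["clang"]}))
```

When a flag containing a semicolon is typed into a shell by hand, it needs quoting, because the shell treats an unquoted semicolon as a command separator.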

A full explanation of LLVM and Clang's CMake build system is outside of the scope of this article – perhaps I'll write about it more next year – but the following invocations configure and build apple/swift-llvm and apple/swift-clang:

# First configure.
cmake \
  -H$HOME/local/Source/apple/standalone/swift-llvm \
  -B$HOME/local/Source/apple/standalone/swift-llvm-build \
  -G Ninja \
  -DLLVM_ENABLE_PROJECTS=clang

# Then build.
cmake --build \
    ~/local/Source/apple/standalone/swift-llvm-build

# Finally, create a symlink from the build
# directory to the C++ headers included with Xcode.
# If you're following along, make sure the
# path to your Xcode beta is the same as below.
ln -s \
    /Applications/Xcode-beta.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++ \
    ~/local/Source/apple/standalone/swift-llvm-build/include

Now, I can finally configure and build apple/swift itself. Like LLVM's, apple/swift's CMake code defines many settings; for example, CMAKE_BUILD_TYPE is used to determine whether the swift executable is built with debugging symbols or not. I'll go into more of these options below, but for now the following invocations configure and build apple/swift:

# First configure.
cmake \
  -H$HOME/local/Source/apple/standalone/swift \
  -B$HOME/local/Source/apple/standalone/swift-build \
  -G Ninja \
  -DCMAKE_BUILD_TYPE="Debug" \
  -DSWIFT_PATH_TO_CMARK_SOURCE=$HOME/local/Source/apple/standalone/swift-cmark \
  -DSWIFT_PATH_TO_CMARK_BUILD=$HOME/local/Source/apple/standalone/swift-cmark-build \
  -DSWIFT_PATH_TO_LLVM_SOURCE=$HOME/local/Source/apple/standalone/swift-llvm \
  -DSWIFT_PATH_TO_LLVM_BUILD=$HOME/local/Source/apple/standalone/swift-llvm-build \
  -DSWIFT_PATH_TO_CLANG_SOURCE=$HOME/local/Source/apple/standalone/clang \
  -DSWIFT_PATH_TO_CLANG_BUILD=$HOME/local/Source/apple/standalone/swift-llvm-build

# Then build.
cmake --build \
    ~/local/Source/apple/standalone/swift-build

Once the build finishes, the built swift compiler executable is located at ~/local/Source/apple/standalone/swift-build/bin/swift. It works exactly as well as the executable I built using swift/utils/build-script in my last post.
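The six SWIFT_PATH_TO_* settings in the configure step are repetitive enough that they can be generated from a single root directory. A minimal Python sketch follows, assuming the standalone layout used in this section (swift-cmark, swift-llvm, and clang as siblings under one directory, with Clang built inside the swift-llvm build directory). The function name swift_path_settings is hypothetical:

```python
def swift_path_settings(root):
    """Return the SWIFT_PATH_TO_* flags for a standalone checkout under root.

    Assumes the directory names used in this section; Clang shares the
    swift-llvm build directory because the two were built together.
    """
    def path(name):
        return "{}/{}".format(root, name)

    settings = [
        ("SWIFT_PATH_TO_CMARK_SOURCE", path("swift-cmark")),
        ("SWIFT_PATH_TO_CMARK_BUILD", path("swift-cmark-build")),
        ("SWIFT_PATH_TO_LLVM_SOURCE", path("swift-llvm")),
        ("SWIFT_PATH_TO_LLVM_BUILD", path("swift-llvm-build")),
        ("SWIFT_PATH_TO_CLANG_SOURCE", path("clang")),
        ("SWIFT_PATH_TO_CLANG_BUILD", path("swift-llvm-build")),
    ]
    return ["-D{}={}".format(name, value) for name, value in settings]


for flag in swift_path_settings("/Users/me/local/Source/apple/standalone"):
    print(flag)
```

The root path in the usage lines is a placeholder; any absolute path works, as long as the projects were cloned and built under it with the same names.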

What do swift/utils/build-script and swift/utils/build-script-impl do?

The apple/swift utils/build-script is a Python script that performs the exact same actions I did above:

Use CMake to configure and build apple/swift-cmark.
Use CMake to configure and build apple/swift-llvm and apple/swift-clang, as well as perform post-build configuration like symlinking C++ headers.
Use CMake to configure and build apple/swift.

swift/utils/build-script accepts command-line arguments to specify how the project is built. The logic in the Python script translates these arguments into arguments to swift/utils/build-script-impl, a shell script that in turn invokes cmake. For example, suppose a user invokes:

~/local/Source/apple/swift/utils/build-script --release

This in turn calls swift/utils/build-script-impl:

~/local/Source/apple/swift/utils/build-script-impl \
    --cmark-build-type=Release \
    --llvm-build-type=Release \
    --swift-build-type=Release

When invoked with these arguments, swift/utils/build-script-impl invokes cmake -DCMAKE_BUILD_TYPE=Release to configure and build apple/swift-cmark, apple/swift-llvm (and thus simultaneously apple/swift-clang), and apple/swift.
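The two translation steps can be modeled in a few lines. This is a deliberately simplified Python sketch, not the actual logic of either script: the function names are hypothetical, only the --release flag from the example is handled, and falling back to Debug when --release is absent is an assumption made for illustration.

```python
def impl_args(build_script_flags):
    """Map build-script flags to build-script-impl arguments (simplified).

    Only --release is modeled; the Debug fallback is an assumption.
    """
    build_type = "Release" if "--release" in build_script_flags else "Debug"
    return [
        "--cmark-build-type=" + build_type,
        "--llvm-build-type=" + build_type,
        "--swift-build-type=" + build_type,
    ]


def cmake_build_types(args):
    """Map build-script-impl arguments to per-project CMAKE_BUILD_TYPE flags."""
    result = {}
    for arg in args:
        name, value = arg.split("=", 1)
        # "--swift-build-type" -> "swift"
        project = name[len("--"):-len("-build-type")]
        result[project] = "-DCMAKE_BUILD_TYPE=" + value
    return result


print(cmake_build_types(impl_args(["--release"])))
```

Even in this toy form, a single option has to be spelled three different ways before it reaches cmake.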

The shell script swift/utils/build-script-impl is an unnecessary complication. Ideally, the Python script swift/utils/build-script would invoke cmake directly. A Swift bug report exists to get rid of swift/utils/build-script-impl, but that's easier said than done: the script contains over 3,000 lines of code.

swift/utils/build-script places the build products for each project in a separate directory. The full path of the directory is based upon the options used; for the invocation swift/utils/build-script --release --debug-swift, the products are placed at:

~/local/Source/apple/build/Ninja-ReleaseAssert+swift-DebugAssert/
    cmark-macosx-x86_64/ # swift-cmark build products.
    llvm-macosx-x86_64/  # swift-llvm and swift-clang build products.
    swift-macosx-x86_64/ # swift build products.
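The layout above follows a regular pattern, which makes it easy to compute the directory to hand to cmake --build. Here is a minimal Python sketch, assuming the macosx-x86_64 host suffix shown above. The function name product_dirs is hypothetical, and the subdirectory name is taken as given rather than derived from the build-script options:

```python
def product_dirs(build_root, subdir, host="macosx-x86_64"):
    """Return the per-project build directories under build_root/subdir."""
    return {
        project: "{}/{}/{}-{}".format(build_root, subdir, project, host)
        for project in ("cmark", "llvm", "swift")
    }


dirs = product_dirs(
    "~/local/Source/apple/build",
    "Ninja-ReleaseAssert+swift-DebugAssert",
)
# Rebuild only apple/swift, using whichever generator configured it.
print("cmake --build " + dirs["swift"])
```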

It's clear that, compared to invoking cmake multiple times to configure and build three separate projects, one invocation of swift/utils/build-script is much simpler for new apple/swift contributors:

Users don't need to know how to use cmake on the command line.
Users don't need to know the particular CMake settings for each project.

In addition, some projects related to apple/swift, such as apple/swift-corelibs-foundation, do not include CMake files that describe how to build the project. swift/utils/build-script and swift/utils/build-script-impl take care of building these as well.

However, the simplicity comes at a cost:

In order to add a new build setting, contributors to apple/swift need to add the setting to three places: the CMake code, the shell script swift/utils/build-script-impl, and the Python script swift/utils/build-script. Even if swift/utils/build-script-impl were finally deleted one day, contributors would still need to modify both CMake and Python in order to add a new setting.
In cases where the documentation from swift/utils/build-script --help is not completely clear, users need to trace how an option is translated between Python, shell script, and CMake. In addition, some options are only accessible via CMake; for example, an option to build static libraries for sourcekitd, SOURCEKITD_BUILD_STATIC_INPROC, is only available in CMake.
swift/utils/build-script performs three configure and build actions each time it is invoked, regardless of whether the user has modified any files in apple/swift-cmark, apple/swift-llvm, or apple/swift-clang. Although most generators are smart enough to do nothing if no files have changed, the extra configuration steps make swift/utils/build-script slower than invoking cmake --build directly.

This is a frequent stumbling block for new contributors to apple/swift.

Many experienced contributors recommend newcomers invoke ninja -C ~/local/Source/apple/build/swift-macosx-x86_64 directly in order to perform faster incremental builds of apple/swift. But a new contributor who doesn't know what swift/utils/build-script does will have a lot of questions:

"What is ninja?"

"What is the swift-macosx-x86_64 directory?"

"I used swift/utils/build-script --xcode in order to generate an Xcode project; attempting to invoke ninja results in an error. Why?"

Contributors familiar with the interaction between swift/utils/build-script and CMake know that swift/utils/build-script --xcode eventually calls cmake -G Xcode when configuring the apple/swift project, and so it stands to reason that invoking ninja on the configured project would not work. A slight improvement would be to recommend newcomers use cmake --build instead.
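One way to see why ninja fails on an Xcode-configured directory is to look at what CMake recorded when it configured it. CMake stores the chosen generator in the build directory's CMakeCache.txt as a CMAKE_GENERATOR entry. The following minimal Python sketch reads that entry from the file's text and maps it to the native tool; both function names are hypothetical, and the mapping covers only the three generators discussed in this post:

```python
def generator_from_cache(cache_text):
    """Return the generator recorded in the text of a CMakeCache.txt."""
    for line in cache_text.splitlines():
        # Entries look like NAME:TYPE=VALUE; the trailing colon keeps
        # entries such as CMAKE_GENERATOR_PLATFORM from matching.
        if line.startswith("CMAKE_GENERATOR:"):
            return line.split("=", 1)[1]
    return None


def native_build_tool(generator):
    """Map a generator name to the tool that consumes its build files."""
    tools = {
        "Ninja": "ninja",
        "Unix Makefiles": "make",
        "Xcode": "xcodebuild",
    }
    return tools.get(generator)


sample = "//Name of generator.\nCMAKE_GENERATOR:INTERNAL=Xcode\n"
print(native_build_tool(generator_from_cache(sample)))
```

cmake --build performs this kind of lookup itself, which is why it works no matter how the directory was configured.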

One last trick: "In-tree" builds of apple/swift

Configuring and building apple/swift using direct invocations of cmake gave me a new appreciation of the work done by swift/utils/build-script and swift/utils/build-script-impl. But it also made me notice that configuring and building Clang did not require a cmake invocation of its own.

The recommended way to build the Clang project is slightly different from the way I built apple/swift with CMake. Whereas building apple/swift required me to build LLVM first, Clang is built as part of the LLVM project. LLVM's CMake automatically detects when Clang is present at llvm/tools/clang. If it is, then LLVM's CMake includes Clang in the build.

It turns out that apple/swift can be built in this way as well:

# Clone apple/swift-llvm.
git clone https://github.com/apple/swift-llvm.git \
    ~/local/Source/apple/intree/swift-llvm

# Clone cmark, clang, and swift
# into swift-llvm/tools.
git clone https://github.com/apple/swift-cmark.git \
    ~/local/Source/apple/intree/swift-llvm/tools/cmark
git clone https://github.com/apple/swift-clang.git \
    ~/local/Source/apple/intree/swift-llvm/tools/clang
git clone https://github.com/apple/swift.git \
    ~/local/Source/apple/intree/swift-llvm/tools/swift
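The in-tree layout is simply "everything under swift-llvm/tools". A minimal Python sketch that produces the same four clone commands for any root directory follows; the function name intree_clone_commands is hypothetical, and the repository URLs are the ones used above:

```python
def intree_clone_commands(root):
    """Return git clone commands that nest cmark, clang and swift in LLVM."""
    llvm = root + "/swift-llvm"
    repos = [
        ("swift-llvm", llvm),
        ("swift-cmark", llvm + "/tools/cmark"),
        ("swift-clang", llvm + "/tools/clang"),
        ("swift", llvm + "/tools/swift"),
    ]
    return [
        ["git", "clone", "https://github.com/apple/{}.git".format(name), dest]
        for name, dest in repos
    ]


for command in intree_clone_commands("/tmp/intree"):
    print(" ".join(command))
```

The order matters: swift-llvm has to be cloned first, so that the tools directory exists before the other repositories are cloned into it.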

Configuring and building apple/swift "in-tree" requires fewer options to be specified on the command-line, because the paths to the apple/swift-llvm, apple/swift-cmark, and apple/swift-clang source and build directories are inferred by Swift's CMake code:

# First configure.
cmake \
    -H$HOME/local/Source/apple/intree/swift-llvm \
    -B$HOME/local/Source/apple/intree/build \
    -G Ninja

# Then build: first Clang, and then Swift.
cmake --build \
    ~/local/Source/apple/intree/build \
    -- clang swift

Compared to the "standalone" build I first tried, in-tree builds of apple/swift don't require me to remember to recompile apple/swift-llvm. If I make any changes to the apple/swift-llvm source code, the affected files will be recompiled the next time I invoke cmake --build.

However, in-tree builds are not the official documented way to build apple/swift; if I encounter an error related to the build system, it's on me to file a bug and maybe even fix it myself.

For newcomers to apple/swift, I would recommend either using swift/utils/build-script, or the standalone setup.

Summary

apple/swift's build system has three main components: the CMake that describes how the project is built, and the build-script and build-script-impl that are responsible for invoking cmake.
Because CMake generates build files, but does not build the project itself, it's possible to build with your tool of choice: ninja, Xcode, or something else.
The LLVM family of projects frequently references two kinds of builds: "in-tree" builds, in which projects reside within the llvm/tools directory, and "standalone" builds, in which the projects reside elsewhere. "In-tree" builds of llvm/tools/swift are less common; the recommended way to build is "standalone". Builds that use swift/utils/build-script are standalone.

Next Sunday morning, I'll post a "deep dive" into apple/swift's CMake code: how it uses a "recursive make" pattern, examples of how I read the code to see what it does, and some summaries of how things like the compiler itself, the standard library, and the test suite are built.

The Swift Compiler's Build System