swift-llvm and swift-clang

When Apple released the apple/swift repository on GitHub, they also released apple/swift-llvm and apple/swift-clang. Why?

LLVM

Apple's Swift fork of LLVM has only three sets of minor changes:

  1. Convenience wrappers for LLVM bitcode. These are used by Swift's fork of Clang, as part of the implementation of APINotes (more on those below).
  2. A specialization of the LLVM abstract data type llvm::DenseMapInfo. This allows std::tuple<T...> to be used as a key in an llvm::DenseMap. Some helpers to hash tuple values have also been added. Again, this is used by Swift's fork of Clang in its implementation of APINotes, to store tuples of information about Objective-C properties.
  3. Slight adjustments to ensure a mangled name is used as a "linkage name" for debug info. (I know very little about debug info, but I read that certain tools don't work if the "linkage name" isn't correct.)
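
To make the second item a little more concrete, here is a rough analogy in Python. It is not LLVM's code: the real change is a C++ specialization of llvm::DenseMapInfo, and the names combine_hashes and TupleKeyedMap, the hash constants, and the example key are all my own inventions. The sketch only shows the general idea of folding the hashes of a tuple's elements into one value so the tuple can serve as a hash map key.

```python
# Illustrative analogy only: this is Python, not LLVM's C++ DenseMapInfo code.
# All names and constants below are hypothetical.

def combine_hashes(values):
    """Fold the hash of each element into one hash for the whole tuple."""
    result = 17  # arbitrary non-zero seed
    for value in values:
        # Multiply-and-add, truncated to 64 bits.
        result = (result * 31 + hash(value)) & 0xFFFFFFFFFFFFFFFF
    return result


class TupleKeyedMap:
    """A tiny hash map keyed by tuples, using combine_hashes to pick a bucket."""

    def __init__(self, bucket_count=8):
        self._buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        return self._buckets[combine_hashes(key) % len(self._buckets)]

    def __setitem__(self, key, value):
        bucket = self._bucket(key)
        for index, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[index] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def __getitem__(self, key):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        raise KeyError(key)


# Hypothetical key: (class name, property name, is-instance-property flag).
properties = TupleKeyedMap()
properties[("NSIndexPath", "row", True)] = "hypothetical property info"
print(properties[("NSIndexPath", "row", True)])
```

Python's built-in dict already accepts tuples as keys, so you would never write this in practice. The point is only that a hash map needs to be told how to hash and compare a key type before it can store it, which is what the fork's specialization provides for std::tuple.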

All of these changes seem like they could be merged back into LLVM trunk. However, two of the three changes are related to APINotes and, as discussed below, it's not clear those will be pushed upstream.

Clang

Apple's Swift fork of Clang includes a much broader set of changes than the Swift fork of LLVM. Doug Gregor outlines these changes in an email to Clang's mailing list. Here's what I found:

1. APINotes

The Swift compiler's ClangImporter library loads Clang modules and transforms their interfaces into Swift. However, a "one size fits all" approach to importing C and Objective-C has some drawbacks:

  1. Not all C and Objective-C interfaces map elegantly into Swift. The Swift maintainers believe the initializer IndexPath(row:section:) is more aesthetically pleasing than the class method NSIndexPath.indexPathForRow(section:). The API notes system allows them to transform the class method.
  2. Some interfaces are almost perfect, but ideally would be annotated with attributes that help them map to Swift – adding noreturn to the C exit function, for example. Of course, Apple ships these headers as part of their developer SDK, so they could theoretically modify them directly. However, the Swift team probably doesn't have free rein to change whichever headers they please.
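
Here's a toy model of the two cases above, written in Python. None of this is the real implementation, which lives in Clang and ClangImporter: the API_NOTES table, the default_swift_name rule, and import_declaration are all hypothetical, and the selector indexPathForRow:inSection: is my assumption of the Objective-C spelling of the class method mentioned above. The idea is just that a side table of notes can override or decorate what a generic importer would otherwise produce, without touching the headers themselves.

```python
# Toy model only; every name here is hypothetical.

# Side table of notes: (owner, declaration name) -> overrides applied at import time.
API_NOTES = {
    ("NSIndexPath", "indexPathForRow:inSection:"): {"swift_name": "init(row:section:)"},
    (None, "exit"): {"attributes": ["noreturn"]},
}


def default_swift_name(name):
    """A deliberately naive 'one size fits all' naming rule."""
    if ":" not in name:
        return name  # plain C function: keep the name as-is
    pieces = [piece for piece in name.split(":") if piece]
    base, labels = pieces[0], pieces[1:]
    # The first argument is unlabeled; later selector pieces become labels.
    return base + "(" + "".join(label + ":" for label in ["_"] + labels) + ")"


def import_declaration(owner, name):
    """Return (swift_name, attributes), letting notes override the defaults."""
    notes = API_NOTES.get((owner, name), {})
    swift_name = notes.get("swift_name", default_swift_name(name))
    attributes = notes.get("attributes", [])
    return swift_name, attributes


print(import_declaration("NSIndexPath", "indexPathForRow:inSection:"))
print(import_declaration(None, "exit"))
print(import_declaration("NSString", "stringByAppendingString:"))
```

The first call picks up a renamed initializer, the second gains an attribute, and the third has no note and falls through to the generic rule.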

APINotes could be merged back into Clang trunk, but some objections have been raised: the APINotes mechanism doesn't support transformations for C++ interfaces, for example. Perhaps it could be merged if C++ support were added.

I guess the question is: how much merge pain does APINotes generate for the Swift team?

Although the bulk of the APINotes implementation lives in an isolated part of the Clang codebase – include/clang/APINotes, lib/APINotes, and so on – option flags for APINotes need to be threaded through the Clang driver, and calls to clang::Sema::ProcessAPINotes are peppered throughout various parts of Apple's fork. Merge conflicts must arise every once in a while, although whether it's often enough for the Swift team to allocate someone to work on C++ support in APINotes… well, that's anyone's guess.

2. Additional attributes

Swift's Clang fork adds two kinds of Clang attributes:

  1. Attributes that are specific to Swift, such as swift_error, swift_name, swift_private, and swift_newtype. These help bridge C and Objective-C into Swift. For example, a C typedef attributed with swift_newtype(struct) or swift_newtype(enum) is imported into Swift as a struct or enum bearing the typedef's name.
  2. Attributes that aren't specific to Swift, such as noescape and objc_subclassing_restricted. These help bridge C and Objective-C to Swift, but are also useful in a 100% Objective-C codebase. objc_subclassing_restricted has already been merged back into Clang trunk, and hopefully noescape will soon follow.
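
The swift_newtype(struct) case in the first item can be loosely illustrated in Python. This is an analogy, not the importer's behavior: Python has no typedefs, and NotificationNameAlias, NotificationName, and raw_value are names I made up for the example. The sketch contrasts a plain alias, which is just another spelling of the underlying type, with a distinct wrapper type that carries the same value.

```python
from dataclasses import dataclass

# Analogy only; all names here are hypothetical.

# A plain alias: just another name for the underlying type.
NotificationNameAlias = str


# A distinct wrapper type, roughly what a typedef becomes once it is
# imported as a struct bearing the typedef's name.
@dataclass(frozen=True)
class NotificationName:
    raw_value: str


alias_value = NotificationNameAlias("didFinish")
wrapped_value = NotificationName("didFinish")

print(isinstance(alias_value, str))    # the alias is still a plain string
print(isinstance(wrapped_value, str))  # the wrapper is its own type
print(wrapped_value.raw_value)         # the underlying value is still reachable
```

The wrapper buys some type safety: code that expects a NotificationName can no longer be handed an arbitrary string by accident.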

3. Module shadowing…?

My first encounter with variable "shadowing" was from Python:

def foo():   # Defines function 'foo'
    return "foo"

foo = "bar"  # Function definition shadowed by variable
print(foo)   # => "bar"

In this case, the second definition of foo shadows the first.

Swift's Clang fork adds the ability to shadow Clang modules. Unlike shadowing in Python, where the later definition wins, here the first definition of a Clang module shadows the second.

In cases where implicit module imports end up importing two modules with the same name, this change allows you to specify -fmodule-map-file, in order to choose which specific copy of a module is imported.
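
To contrast this with the Python example earlier, here's a sketch of a "first definition wins" registry. It is not how Clang implements module maps: ModuleRegistry, the module name Foo, and both paths are hypothetical, and I'm assuming that the copy you point at explicitly is the one seen first.

```python
# Hypothetical sketch of "first definition wins"; not Clang's implementation.

class ModuleRegistry:
    def __init__(self):
        self._modules = {}

    def register(self, name, path):
        """Record a module; an earlier registration shadows any later one."""
        # setdefault only stores the path if the name is not already present.
        self._modules.setdefault(name, path)

    def lookup(self, name):
        return self._modules[name]


registry = ModuleRegistry()
registry.register("Foo", "/explicit/module.modulemap")  # assumed to be seen first
registry.register("Foo", "/implicit/module.modulemap")  # same name, seen later
print(registry.lookup("Foo"))
```

A plain assignment such as modules[name] = path would give the Python-style behavior instead, with the later definition replacing the earlier one.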

So, why is this part of Swift's fork of Clang? I… actually have no idea. If you know, hit me up.

Further reading

If you're interested in the challenges involved in "living downstream" from a large project like LLVM, watch this session from last year's LLVM Developers' Meeting. You can see some ideas from that talk in action on Apple's LLVM and Clang forks. For example, an Automerger bot propagates changes from upstream LLVM into a staging branch, called upstream-with-swift.