Interview: Chris Lattner

Chris Lattner will give a talk about "LLVM and Clang" at FOSDEM 2011.

Because of his employer's policy, Chris couldn't be interviewed, but he agreed to share some information with us that we have posted below.

Information about the speaker

Chris is a software developer with a diverse range of interests and experiences, particularly in the area of compiler tool chains, operating systems, and graphics. This is Chris' first appearance at FOSDEM, but he has enjoyed the work that has come out of the conference in the past, and looks forward to attending this year.

Chris started graduate studies at the University of Illinois in 2000 to do research work on compilers. At Illinois, he started the LLVM project, which was open sourced and had its first release in 2003. Chris graduated with a Ph.D. in 2005 and began work at Apple, Inc., where today he manages Apple's compiler team, and continues to work on LLVM.

Chris' work experience, before starting the LLVM project, involved working both independently and at a number of technology companies. In that work he has written boot loaders, OS kernels, graphics drivers, and 3D engines. He has experience with a number of large code bases, both proprietary and open source, including the Linux kernel, GCC, GDB, and the Java Hotspot JVM.

Overview of the talk

The goal of the talk is to raise awareness of the LLVM Project and the capabilities it provides. Of course another high-priority goal is to get a lot of people with new ideas together, discussing LLVM and meeting each other.

The talk will describe the greater LLVM Project and what it is all about. LLVM is a widely misunderstood project: while many people see LLVM simply as an effort to replace GCC, LLVM provides much broader capabilities than GCC in a far more reusable way.

It will give an overview of how the LLVM Project is constructed, describing how LLVM is actually an umbrella project, and lay out many of the components within LLVM.org. For instance, the project encompasses a compiler optimizer and code generator (the "LLVM backend"), as well as language-specific front ends such as the Clang front end for the C family of languages (C/C++/Objective-C) and VMKit for Java. There are even non-compiler projects included in LLVM, such as the LLDB debugger. The talk will explain how these projects relate, their common focus, and their common technologies.

It will explain the comparison between LLVM and GCC. For instance, the modular, umbrella-like nature of the LLVM Project is more accurately compared not just to GCC but to the complete GNU toolset, including the GCC, GDB, binutils, libstdc++, and Kaffe VM projects.

A key goal for LLVM is building technology that is modular and reusable, in contrast to traditional compiler tools that are monolithic applications. The talk will describe how LLVM's design meets this goal. The talk will also mention several examples of LLVM usage that were made possible in large part by the modular, library-based architecture. Some of these examples can be found at the project's website.

The C/C++/Objective-C compiler, Clang, will be covered in this talk. While the LLVM architecture is fascinating to some people, many developers are interested in the LLVM project because, quite simply, they want a better C compiler. The Clang compiler is ready today, is a drop-in replacement for GCC, is generally faster than GCC, and provides reports

Some background on LLVM

Killer features

LLVM is a pretty big project, and it has more than one "Killer Feature"; which one matters most depends on the audience. A few important audiences are compiler users (current users of GCC or Microsoft C++), language implementors, high-performance graphics pipeline engineers, and tools implementers.

Compiler users love LLVM because the Clang subproject (a C/C++/Objective-C front end) does a great job compiling code very quickly, and provides really useful warning and error messages (look at the project's website for an example). As a result, a developer spends more time adding features and fixing bugs, and less time compiling and trying to decipher cryptic compiler messages.

Language implementors really like the library / API approach to building LLVM. They can easily use the LLVM optimizer and code generators to implement their languages, rather than building all that stuff from scratch. Some good examples are the GHC Haskell compiler and the MacRuby project, and there are Python compilers built on LLVM too.

High-performance graphics pipeline engineers like that LLVM offers JIT (Just In Time) code generation capabilities, which are currently used to enable things like the OpenGL pipeline on Mac OS X and the rendering pipeline used by some major special effects studios.

Many of the OpenCL implementors, who use the GPU for high performance computing, have focused most of their energy on using LLVM and Clang, too.

The Clang and LLVM libraries make it easy to integrate core compiler and parsing features directly into an IDE, such as the upcoming Xcode 4. Fully integrated into the Xcode 4 editor, the compiler can report mistakes as you type, rather than waiting for a separate "Build" step. In fact, the code analysis is good enough for the IDE to fix the mistakes for you in many cases. This type of integrated compiler experience wasn't possible with GCC, and is why the modular design of the LLVM Project is so important.

The community

The LLVM open source project is pretty big, but no single count of contributors is truly accurate. According to Ohloh.net in December, there have been 158 different contributors to the LLVM optimizer and backend. This doesn't count people who don't have commit access (and thus have their patches committed by other people), so it under-represents casual contributors.

Apple is one of a number of big companies that employ contributors to LLVM. It is important to understand, though, that no single company has a significant percentage of the overall contributors. In fact, the size of the LLVM community is an order of magnitude bigger than any single contributing company. Companies such as Apple provide focused leadership, but the organization as a whole is friendly, and notable for great cooperation across participating companies, individuals, and academic institutions.

A move forward

Clang/LLVM can now successfully build millions of lines of C, C++ and Objective-C code including notable things like the Linux kernel, Mozilla, Qt, Boost, and Clang itself. Each of these uses a huge number of GCC-specific extensions to the C and C++ languages.

These days, the most common Clang vs GCC compatibility issues are due to two things:

GCC defaults to gnu89 mode and Clang defaults to gnu99 mode for C code, and C99 has subtly different 'inline' semantics. While we could easily default to gnu89 mode, C99 is over 10 years old now, so we feel that it is time for the world to move forward.
GCC's C++ frontend has a number of standards conformance bugs, particularly in the area of template two-phase name lookup. We feel that it is better for Clang to produce good error messages showing people how to fix their code than it is to emulate these bugs.
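
To illustrate the second issue, here is a minimal sketch of the kind of template code affected by two-phase name lookup. The names Base, Derived, scale, run, and demo are hypothetical and chosen only for this example; they do not come from LLVM, Clang, or the talk. The commented-out call shows a form that a conforming compiler rejects, and the active line shows one standard way to fix it.

```cpp
#include <cassert>

// Hypothetical types for illustration only.
template <typename T>
struct Base {
    T scale(T x) const { return x * 2; }
};

template <typename T>
struct Derived : Base<T> {
    T run(T x) const {
        // return scale(x);
        // Rejected by a conforming compiler: Base<T> is a dependent base
        // class, so unqualified lookup of `scale` does not search it.

        // Writing `this->` makes the name dependent, so lookup is deferred
        // until the template is instantiated and Base<T> is known.
        return this->scale(x);
    }
};

// Small self-check that instantiates the template with int.
inline int demo() {
    Derived<int> d;
    int result = d.run(21);
    assert(result == 42);
    return result;
}
```

Calling demo() from a program's main function exercises the template. Qualifying the call as Base<T>::scale(x) is another way to make the lookup dependent.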

More details on common compatibility issues can be found on a webpage.

The future

Today we consider LLVM as having already completed many of the toughest parts. We have a code optimizer that outperforms some of the best compilers in the world. We can parse C and Objective-C code with incredible accuracy and speed, and work on the C++ parser is focusing on the upcoming C++0x standard. We've even implemented a static analyzer. These are tough, foundational pieces.

Lately the LLVM projects are reaching a point where things are all falling into place. It's great to see a lot more adoption. For instance, the FreeBSD community is actively working on switching their system compiler to Clang/LLVM.

Like any large, popular, and growing project, it seems the more we do, the longer the TODO list becomes. There is a virtually unbounded number of ways that we can improve LLVM and we are always looking for new contributors. One specific thing that the LLVM optimizer lacks is an auto-vectorizer, which is important for certain numeric codes. It would also be great to improve the Clang static analyzer to find new classes of bugs. Of course the list can go on and on, including adding more and better platform support and support for other languages.

The LLDB debugger is a relatively new project for LLVM.org, and we're looking for people to help port it to new platforms and expand its capabilities. LLDB may seem like a different type of project for LLVM, but it fits quite nicely. Modularity, performance, ease of embedding - areas where LLVM has focused and excelled - all have huge benefits when applied to a debugger as well, and are goals for LLDB.

One of the most exciting things about LLVM is that people are realizing that there is less and less reason to "roll your own" special purpose compiler to solve short-term problems. Compiler-driven technology can advance at a more rapid pace, focusing on innovation rather than the plumbing of getting the project up and running.

This interview is licensed under a Creative Commons Attribution 2.0 Belgium License.
