June 5, 2014

Using LLVM Passes with Clang

LLVM is a useful system for toying with compiler extensions: its IR is clean and well-documented, its structure is modular and easy to extend, and its user community is large and active. Thus, when I began writing instrumentation in LLVM, I was a little surprised that there wasn't a straightforward path for integrating custom passes into a build flow. Nearly all of the documentation suggests using `opt` to load and run a custom pass. That leads to a flow that looks something like:
clang -O3 -emit-llvm -S -o source.ll source.c
opt -S -load custom_pass.so -custompass -o source_opt.ll source.ll
llvm-link -S -o out.ll source_opt.ll ...
llc -filetype=obj -o out.o out.ll
gcc out.o
It's not that complicated, but let's say that you're trying to instrument an existing project that's built with autotools. Do you rewrite the `.in` files? Do you hack together a wrapper script that pretends to be gcc but actually does all of the stuff above? What I'd really like to do is just put my custom pass into clang and use it instead:
clang -O3 -"magic" source.c
Well, as it so happens, LLVM has the magic built in, but it's not where you might think. There's no option to tell clang to load and run a custom pass in a certain phase. There is, however, an option to tell clang to load a custom extension, and there's a mechanism that lets that extension register itself at a particular phase. This is just as good, so long as we're okay with the specific insertion points that LLVM provides (there are about half a dozen).

So what's the magic?
clang -Xclang -load -Xclang your_custom_pass.so ...
That, plus a couple of lines inserted at the end of your pass code. I've written some demo code to show how it works, and it's available on GitHub:

https://github.com/rdadolf/clangtool
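
If you just want the shape of the registration trick before opening the repository, here is a toy model in plain C++. It is not LLVM's API: `ExtensionPoint`, `RegisterExtension`, `registry`, `runExtensions`, and `buildPipeline` are all made-up names, and the pass names are placeholders. The idea it sketches is that a static object's constructor runs when the code is loaded, and that constructor files a callback under one insertion point; the host later runs those callbacks while assembling its pipeline. In LLVM's legacy pass manager the analogous pieces are `RegisterStandardPasses` and the `PassManagerBuilder` extension points, and the demo code is the place to look for the real lines.

```cpp
// Toy model only: none of these names come from LLVM.
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical insertion points, standing in for the handful LLVM provides.
enum class ExtensionPoint { EarlyAsPossible, OptimizerLast };

// A "pass" in this model is just a name appended to a pipeline.
using Pipeline = std::vector<std::string>;
using Extension = std::function<void(Pipeline &)>;

// Table of callbacks keyed by insertion point. The function-local static
// is constructed on first use, so registration order is never a problem.
std::multimap<ExtensionPoint, Extension> &registry() {
  static std::multimap<ExtensionPoint, Extension> table;
  return table;
}

// Constructing one of these at namespace scope registers a callback as a
// side effect, before main() runs (or, for a plugin, at load time).
struct RegisterExtension {
  RegisterExtension(ExtensionPoint point, Extension fn) {
    assert(fn && "extension callback must be callable");
    registry().emplace(point, std::move(fn));
  }
};

// The "couple of lines at the end of your pass code", in toy form.
static void addCustomPass(Pipeline &pipeline) {
  pipeline.push_back("custompass");
}
static RegisterExtension registerCustomPass(ExtensionPoint::EarlyAsPossible,
                                            addCustomPass);

// Host side: run every callback registered at one insertion point.
static void runExtensions(ExtensionPoint point, Pipeline &pipeline) {
  auto range = registry().equal_range(point);
  for (auto it = range.first; it != range.second; ++it)
    it->second(pipeline);
}

// Host side: assemble the pipeline, visiting each insertion point in turn.
Pipeline buildPipeline() {
  Pipeline pipeline;
  runExtensions(ExtensionPoint::EarlyAsPossible, pipeline);
  pipeline.push_back("builtin-opt-a");  // placeholder built-in pass
  pipeline.push_back("builtin-opt-b");  // placeholder built-in pass
  runExtensions(ExtensionPoint::OptimizerLast, pipeline);
  return pipeline;
}
```

Calling `buildPipeline()` from `main` yields the custom pass first, followed by the two placeholder built-ins, since nothing in this sketch registers at `OptimizerLast`. The host never names the extension; it only walks its insertion points, which is why loading the shared object is enough.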

Enjoy!