USDT Providers Redux
In this post I’m going to review DTrace USDT providers and show a complete working example that I hope will be a useful reference for people interested in building providers for their own applications.
First, the prerequisites:
- DTrace is the comprehensive dynamic tracing framework available on Illumos-based, BSD, and MacOS systems. If you’ve never used DTrace, check out dtrace.org; this post assumes you’ve at least played around with it.
- USDT (Userland Statically Defined Tracing) is the mechanism by which application developers embed DTrace probes directly into an application. This allows users to trace semantically meaningful operations like “request-start”, rather than having to know which function implements the operation. More importantly, since USDT probes are part of the source code, scripts that use them continue working even as the underlying software evolves and the implementing functions are renamed and deleted.[1]
Back in 2009 I posted about how we put together an Apache USDT provider for the Sun Storage appliances. Although I wasn’t able to include any of the actual source, this post became the de facto documentation on implementing a USDT provider with translated arguments. Two years later, as part of our work on Cloud Analytics, we’ve written (from scratch) a new Apache provider called mod_usdt, and this time the complete source is available on GitHub.
I hope this provider can be a useful reference for others building USDT providers. There are plenty of other providers out there already, but many of them don’t use translated arguments (see below), and most of them are (naturally) embedded inside a complex application and build system, so it’s hard to pick out what’s actually relevant. Since this module is built separately from Apache, nearly 100% of the code in mod_usdt relates directly to the provider.
The main files are these:
- src/httpd.d: defines the structure translators. Delivered into /usr/lib/dtrace and loaded by dtrace(1M) at run-time.
- src/httpd_provider.d: defines the provider probes and arguments. Linked into the module binary at link-time.
- src/httpd_provider_impl.h: defines the structure passed from Apache to DTrace at run-time.
- src/usdt.c: implements an Apache module using hooks at the beginning and end of each request to fire the corresponding DTrace probes.
There are also some example DTrace scripts in the repo for tracing request latency; see the README for details. Between these various pieces, you have everything you need to define and use the provider.
At this point, if you want to understand the pieces of the implementation, your best bet is to go read all the code in mod_usdt. It’s not actually that much code, and it’s well documented. The rest of this post describes how the surrounding mechanism works.
The provider implementation uses macros in the source to define probe points:
This example (not actually from mod_usdt) includes two probe arguments, which can be accessed from D scripts to filter or aggregate on this information (the HTTP method and URI in this example). Passing simple arguments works well for simple probes, but starts getting unwieldy as you add a bunch more arguments. So DTrace also allows you to define C-style structs that are passed as arguments to the probes. To do this, you must define translators that take the actual arguments from the application and populate a struct for use in the probe. (Don’t worry if you didn’t catch all that. This will become clearer below.)
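For reference, a probe site with two simple arguments might look roughly like the sketch below. Everything here is hypothetical: the MYAPP_REQUEST_START macro, the handle_request function, and the counter are invented for illustration. In a real build the macro would come from the header generated by “dtrace -h”; the stub stands in for it so the file compiles on a system without DTrace.

```c
#include <stdio.h>

/*
 * Stand-in for the macro that "dtrace -h" would generate (an assumption
 * for this sketch). It only counts how many times the probe site was hit.
 */
static int myapp_probe_fired;
#define	MYAPP_REQUEST_START(method, uri) \
	(myapp_probe_fired++, (void)(method), (void)(uri))

/* Hypothetical request handler: fire the probe, then do the real work. */
static int
handle_request(const char *method, const char *uri)
{
	MYAPP_REQUEST_START(method, uri);
	printf("handling %s %s\n", method, uri);
	return (0);
}
```

With a real provider, a D script would see the two pointers as arg0 and arg1 and could read the strings with copyinstr().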
Building the shared object
We start with the provider definition file, httpd_provider.d, which defines probes like this:
probe request__start(dthttpd_t *p) : (conninfo_t *p, httpd_rqinfo_t *p);
This defines a probe called “request-start”. The application will pass a pointer to dthttpd_t, and the actual D probes will get a conninfo_t and a httpd_rqinfo_t.
When you run “make”, the first step is to take this provider definition and generate a header file that defines the macros the application uses to fire the probes:
dtrace -xnolibs -h -o build/httpd_provider.h -s src/httpd_provider.d
The header file has definitions like this:
#define HTTPD_REQUEST_START(arg0) \
        __dtrace_httpd___request__start(arg0)
Conceptually, the application uses this macro to fire the probe. (As we’ll see, there’s some magic involved.) When you compile the file:
gcc -Wall -Werror -fPIC -Ibuild -I/opt/local/include/apr-1 -I/opt/local/include -I/opt/local/include/db4 -D_REENTRANT -I/usr/include -DLDAP_DEPRECATED -I/opt/local/include/httpd -c -o build/usdt.o src/usdt.c
The resulting object file indeed includes a call to the special __dtrace function, but there’s another important step next:
dtrace -xnolibs -G -o build/httpd_provider.o -s src/httpd_provider.d build/usdt.o
The “dtrace -G” pass iterates over each of the objects (just usdt.o in this case), identifies these function calls, and for each one records which probe is being fired and its location in the program text.[2] It then replaces the function calls with “nop” instructions, which the CPU executes as no-ops. A new object file (httpd_provider.o) is generated that includes the probe and location information in a special SUNW_dof section. This object also contains an “_init” function (we’ll get to this shortly). To wrap up the build, we link these objects into the final shared library:
gcc -shared -fPIC -o build/mod_usdt.so build/usdt.o build/httpd_provider.o
Loading the module
When the library is actually loaded, the _init function transmits the information in the SUNW_dof section down to the DTrace kernel module. See the source for details. With this, DTrace knows which probes are available for this process and their locations in memory.
Enabling the probes
When a user runs dtrace(1M) to instrument a USDT probe, DTrace replaces the “nop” instructions at all of that probe’s call sites with “int 3” (0xcc) instructions, which will cause a fast trap into the kernel. The DTrace kernel module can tell (from the location of the trapping instruction) that this corresponds to an enabled probe, so it fires the probe and returns back to userland.
When the probes are disabled again (i.e. when you CTRL-C the dtrace(1M) process), the “int 3” instructions are changed back to nops.
The upshot of all this is that when DTrace is not enabled, the application runs essentially the same as it did before USDT support was added at all. There’s no overhead to just having DTrace support.
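To make the enable/disable cycle concrete, here is a toy model of it in C. This is not how DTrace is implemented: the toy_probe_t struct, both functions, and the flat byte array standing in for program text are all invented for illustration, and the real work happens inside the DTrace kernel module. The only facts borrowed from the description above are that a disabled site holds a nop (0x90), an enabled one holds “int 3” (0xcc), and the trap handler identifies the probe from the trapping address.

```c
#include <stddef.h>
#include <stdint.h>

#define	X86_NOP		0x90	/* one-byte no-op */
#define	X86_INT3	0xcc	/* one-byte breakpoint trap */

/* Toy stand-in for the per-probe call-site list recorded at build time. */
typedef struct toy_probe {
	const char	*tp_name;
	const size_t	*tp_offsets;	/* call-site offsets into the text */
	size_t		tp_noffsets;
} toy_probe_t;

/* Enable or disable a probe by patching the byte at every call site. */
static void
toy_probe_set(uint8_t *text, const toy_probe_t *tp, int enable)
{
	for (size_t i = 0; i < tp->tp_noffsets; i++)
		text[tp->tp_offsets[i]] = enable ? X86_INT3 : X86_NOP;
}

/* On a trap at "addr": does it correspond to an enabled site of tp? */
static int
toy_probe_lookup(const uint8_t *text, const toy_probe_t *tp, size_t addr)
{
	for (size_t i = 0; i < tp->tp_noffsets; i++) {
		if (tp->tp_offsets[i] == addr && text[addr] == X86_INT3)
			return (1);
	}
	return (0);
}
```

The point of the model is that disabling a probe restores the text byte for byte, which is why a process with no enabled probes runs the same instructions it would have run without tracing.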
The files in /usr/lib/dtrace define structures and translators that take arguments passed into probes from the application and convert them to semantically meaningful structures. For details, see src/httpd.d in mod_usdt.
As described above, you can get away without translated arguments by passing primitives (ints and pointers, with which you can access strings too) directly to the probes. That way, you don’t have to deliver files into /usr/lib/dtrace.
And a few more things
Believe it or not, all of the above is a gross simplification of what actually goes on. To see all the details of implementing a provider, you’ll have to read all of mod_usdt. For a taste of how complex the mechanism for firing userland probes really is, check out Bryan’s post on what happens when magic collides.
Also, I explained above that there’s no overhead for disabled probes. That’s almost true, as long as there’s no code required to set up the arguments for the DTrace probe. But often you do want to include more complex arguments. To minimize the overhead of setting them up when DTrace is not enabled, you can use ISENABLED macros to tell at run-time whether the probe is enabled. mod_usdt uses these to avoid setting up the structure if the probes are not enabled.
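The is-enabled pattern can be sketched as follows. The header generated by “dtrace -h” provides a PROBE_ENABLED() macro alongside each probe macro; the DEMO_* macros, the demo_reqinfo_t struct, and the counters below are hypothetical stand-ins (not mod_usdt’s actual names or layout) so that the example compiles without DTrace.

```c
#include <stddef.h>

/* Stand-ins for the generated macros (assumptions for this sketch). */
static int demo_probe_enabled;	/* toggled by hand here, by DTrace in reality */
static int demo_probe_fired;
#define	DEMO_REQUEST_DONE_ENABLED()	(demo_probe_enabled)
#define	DEMO_REQUEST_DONE(p)		(demo_probe_fired++, (void)(p))

/* Hypothetical argument struct; not mod_usdt's real layout. */
typedef struct demo_reqinfo {
	const char	*dr_method;
	const char	*dr_uri;
	int		dr_status;
} demo_reqinfo_t;

static int demo_args_built;	/* counts how often we paid the setup cost */

static void
request_done(const char *method, const char *uri, int status)
{
	/* Skip all argument setup unless someone is actually tracing. */
	if (DEMO_REQUEST_DONE_ENABLED()) {
		demo_reqinfo_t info;

		info.dr_method = method;
		info.dr_uri = uri;
		info.dr_status = status;
		demo_args_built++;
		DEMO_REQUEST_DONE(&info);
	}
}
```

When the probe is disabled, the only cost left on the request path is the enabled check itself.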
USDT is extremely powerful for tracing semantically meaningful events in userland applications. While implementing a provider doesn’t require a lot of work per se, the steps required are not very obvious. I hope that mod_usdt will serve as a useful reference for those looking to add USDT to their own application, besides being useful in its own right for those using Apache.
[1] This contrasts with the pid provider, which allows users to trace almost any instruction, function entry, and function exit in an application. This flexibility makes the pid provider extremely valuable for ad-hoc investigation, but the resulting dependence on application implementation details makes it unsuitable for stable tools.
[2] The “dtrace -G” pass emits relocations, since the final locations of these instructions won’t be known until the run-time linker runs.