Anatomy of a DTrace USDT provider
Note: the information in this post has been updated with a complete source example. This post remains for historical reference.
I’ve previously mentioned that the 7000 series HTTP/WebDAV Analytics feature relies on USDT, the mechanism by which developers can define application-specific DTrace probes to provide stable points of observation for debugging or analysis. Many projects already use USDT, including the Firefox JavaScript engine, MySQL, Python, Perl, and Ruby. But writing a USDT provider is (necessarily) somewhat complicated, and documentation is sparse. While there are some USDT examples on the web, most do not make use of newer USDT features (for example, they don’t use translated arguments). Moreover, there are some rough edges around the USDT infrastructure that can make the initial development somewhat frustrating. In this entry I’ll explain the structure of a USDT provider with translated arguments in hopes that it’s helpful to those starting out on such a project. I’ll use the HTTP provider as an example since I know that one best. I also refer to the source of the iscsi provider since the complete source is freely available (as part of iscsitgtd in ON).
Overview
A USDT provider with translated arguments is made up of the following pieces:
- a provider definition file (e.g., http_provider.d), from which an application header and object files will be generated
- a provider header file (e.g., http_provider.h), generated from the provider definition file with dtrace -h and included by the application, which defines the macros used by the application to fire probes
- a provider object file (e.g., http_provider.o), generated from the provider definition and application object files with dtrace -G
- a provider application header file (e.g., http_provider_impl.h), which defines the C structures passed into probes by the application
- native (C) code that invokes the probe
- a provider support file (e.g., http.d), delivered into /usr/lib/dtrace, which defines the D structures and translators used in probes at runtime
Putting these together:
- The build process takes the provider definition and generates the provider header and provider object files.
- The application includes the provider header file as well as the application header file and uses DTrace macros and C structures to fire probes.
- The compiled application is linked with the generated provider object file that encodes the probes.
- DTrace consumers (e.g., dtrace(1M)) read in the provider support file and instrument any processes containing the specified probes.
It’s okay if you didn’t follow all that. Let’s examine these pieces in more detail.
1. Provider definition (and generated files)
The provider definition describes the set of probes made available by that provider. For each probe, the definition describes the arguments passed to DTrace by the probe implementation (within the application) as well as the arguments passed by DTrace to probe consumers (D scripts).
For example, here’s the heart of the http provider definition, http_provider.d:
    provider http {
        probe request__start(httpproto_t *p) :
            (conninfo_t *p, http_reqinfo_t *p);
        probe request__done(httpproto_t *p) :
            (conninfo_t *p, http_reqinfo_t *p);
        /* ... */
    };
The http provider defines two probes: request-start and request-done. (DTrace converts double-underscores to hyphens.) Each of these consumes an httpproto_t from the application (in this case, an Apache module) and provides a conninfo_t and an http_reqinfo_t to DTrace scripts using these probes. Don’t worry about the details of these structures just yet.
From the provider definition we build the related http_provider.h and http_provider.o files. The header file is generated by dtrace -h and contains macros used by the application to fire probes. Here’s a piece of the generated file (edited to show only the x86 version for clarity):
    #define HTTP_REQUEST_DONE(arg0) \
        __dtrace_http___request__done(arg0)
    #define HTTP_REQUEST_DONE_ENABLED() \
        __dtraceenabled_http___request__done()
So for each probe, we have a macro of the form PROVIDER_PROBENAME(PROBEARGS) that the application uses to fire a probe with the specified arguments. Note that the argument to HTTP_REQUEST_DONE should be an httpproto_t *, since that’s what the request-done probe consumes from the application.
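The naming convention can be sketched in plain C. This is only an illustration of the mapping visible in the examples above (request__done becomes the macro HTTP_REQUEST_DONE and the probe name request-done); it is not how dtrace -h is implemented, and the helper names usdt_macro_name and usdt_probe_name are made up for this sketch.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: build the C macro name for a probe by upper-casing
 * the provider and probe names, joining them with '_', and collapsing each
 * "__" in the probe name into a single '_'.
 * Returns 0 on success, -1 if `out` is too small.
 */
int
usdt_macro_name(const char *provider, const char *probe, char *out,
    size_t outlen)
{
    size_t n = 0;
    const char *p;

    for (p = provider; *p != '\0'; p++) {
        if (n + 1 >= outlen)
            return (-1);
        out[n++] = (char)toupper((unsigned char)*p);
    }
    if (n + 1 >= outlen)
        return (-1);
    out[n++] = '_';
    for (p = probe; *p != '\0'; p++) {
        if (p[0] == '_' && p[1] == '_')
            p++;        /* skip the first underscore of a pair */
        if (n + 1 >= outlen)
            return (-1);
        out[n++] = (char)toupper((unsigned char)*p);
    }
    out[n] = '\0';
    return (0);
}

/*
 * Hypothetical helper: build the probe name as D scripts see it by
 * replacing each "__" with '-'.
 * Returns 0 on success, -1 if `out` is too small.
 */
int
usdt_probe_name(const char *probe, char *out, size_t outlen)
{
    size_t n = 0;
    const char *p;

    for (p = probe; *p != '\0'; p++) {
        if (n + 1 >= outlen)
            return (-1);
        if (p[0] == '_' && p[1] == '_') {
            out[n++] = '-';
            p++;        /* consume the second underscore too */
        } else {
            out[n++] = *p;
        }
    }
    out[n] = '\0';
    return (0);
}
```

Calling usdt_macro_name("http", "request__done", buf, sizeof (buf)) fills buf with HTTP_REQUEST_DONE, and usdt_probe_name("request__start", buf, sizeof (buf)) fills it with request-start.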
The provider object file generated by dtrace -G is also necessary to make all this work, but the mechanics are well beyond the scope of this entry.
2. Application components
The application header file http_provider_impl.h defines the httpproto_t structure for the application, which passes a pointer to this object into the probe macro to fire the probe. Here’s an example:
    typedef struct {
        const char *http_laddr;       /* local IP address (as string) */
        const char *http_uri;         /* URI requested */
        const char *http_useragent;   /* user's browser (User-agent header) */
        uint64_t http_byteswritten;   /* bytes RECEIVED from client */
        /* ... */
    } httpproto_t;
The application uses the macros from http_provider.h and the structure defined in http_provider_impl.h to fire a DTrace probe. We also use the is-enabled macros to avoid constructing the arguments when they’re not needed. For example:
    static void
    mod_dtrace_postrequest(request_rec *rr)
    {
        httpproto_t hh;

        /* ... */
        if (!HTTP_REQUEST_DONE_ENABLED())
            return;

        /* fill in hh object based on request rr ... */
        HTTP_REQUEST_DONE(&hh);
    }
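To experiment with this pattern without a DTrace build, here is a minimal self-contained sketch. The two HTTP_REQUEST_DONE macros below are stubs standing in for the ones dtrace -h would generate in http_provider.h, and request_t, postrequest, and the stub_* variables are hypothetical names invented for this example; a real Apache module would use request_rec and the generated header instead.

```c
#include <stdint.h>
#include <string.h>

/* Application-side structure, as in http_provider_impl.h (abridged). */
typedef struct {
    const char *http_laddr;
    const char *http_uri;
    const char *http_useragent;
    uint64_t http_byteswritten;
} httpproto_t;

/*
 * Stubs standing in for the generated probe macros (assumption: the real
 * ones come from http_provider.h). They record whether and how the probe
 * "fired" so the control flow can be observed without DTrace.
 */
static int stub_probe_enabled = 0;      /* set to 1 to simulate a consumer */
static unsigned stub_probe_fired = 0;   /* number of times the probe fired */
static httpproto_t stub_last_arg;       /* copy of the last probe argument */

#define HTTP_REQUEST_DONE_ENABLED() (stub_probe_enabled)
#define HTTP_REQUEST_DONE(arg0) \
    do { stub_last_arg = *(arg0); stub_probe_fired++; } while (0)

/* Hypothetical, simplified stand-in for Apache's request_rec. */
typedef struct {
    const char *local_ip;
    const char *uri;
    const char *user_agent;
    uint64_t bytes;
} request_t;

void
postrequest(const request_t *rr)
{
    httpproto_t hh;

    /* Skip all argument construction when nobody is tracing. */
    if (!HTTP_REQUEST_DONE_ENABLED())
        return;

    /* Fill in the probe argument from the request. */
    memset(&hh, 0, sizeof (hh));
    hh.http_laddr = rr->local_ip;
    hh.http_uri = rr->uri;
    hh.http_useragent = rr->user_agent;
    hh.http_byteswritten = rr->bytes;

    HTTP_REQUEST_DONE(&hh);
}
```

With stub_probe_enabled left at 0, postrequest returns before touching hh; setting it to 1 makes the same call populate the structure and fire the stub probe once.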
3. DTrace consumer components
What we haven’t specified yet is exactly what defines a conninfo_t or http_reqinfo_t or how to translate an httpproto_t object into a conninfo_t or http_reqinfo_t. These structures and translators must be defined when a consuming D script is compiled (i.e., when a user runs dtrace(1M) and wants to use our probe). These definitions go into what I’ve called the provider support file, which includes definitions like these:
    typedef struct {
        uint32_t http_laddr;
        uint32_t http_uri;
        uint32_t http_useragent;
        uint64_t http_byteswritten;
        /* ... */
    } httpproto_t;

    typedef struct {
        string hri_uri;               /* uri requested */
        string hri_useragent;         /* "User-agent" header (browser) */
        uint64_t hri_byteswritten;    /* bytes RECEIVED from the client */
        /* ... */
    } http_reqinfo_t;

    #pragma D binding "1.6.1" translator
    translator conninfo_t <httpproto_t *dp> {
        ci_local = copyinstr((uintptr_t)
            *(uint32_t *)copyin((uintptr_t)&dp->http_laddr, sizeof (uint32_t)));
        /* ... */
    };

    #pragma D binding "1.6.1" translator
    translator http_reqinfo_t <httpproto_t *dp> {
        hri_uri = copyinstr((uintptr_t)
            *(uint32_t *)copyin((uintptr_t)&dp->http_uri, sizeof (uint32_t)));
        /* ... */
    };
There are a few things to note here:
- The httpproto_t structure must exactly match the one being used by the application. There’s no way to enforce this with just one definition because neither file can rely on the other being available.
- The above example only works for 32-bit applications. For a similar example that uses the data model of the process (ILP32 or LP64) to do the right thing for both 32-bit and 64-bit apps, see /usr/lib/dtrace/iscsi.d.
- We didn’t define conninfo_t. That’s because it’s defined in /usr/lib/dtrace/net.d. Instead of redefining it, we have our file depend on net.d with this line at the top:

    #pragma D depends_on library net.d
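To see why the uint32_t-based httpproto_t above only matches 32-bit applications, it can help to compare the two layouts in C. This is a sketch under assumptions: httpproto32_t is a name made up here to mirror the D-side definition, and httpproto_layout_matches is a hypothetical helper, not part of any provider.

```c
#include <stddef.h>
#include <stdint.h>

/* Native structure as the application compiles it (abridged). */
typedef struct {
    const char *http_laddr;
    const char *http_uri;
    const char *http_useragent;
    uint64_t http_byteswritten;
} httpproto_t;

/*
 * Mirror of the D-side definition (hypothetical name), which stores each
 * pointer as a 32-bit integer.
 */
typedef struct {
    uint32_t http_laddr;
    uint32_t http_uri;
    uint32_t http_useragent;
    uint64_t http_byteswritten;
} httpproto32_t;

/*
 * Returns 1 if every field sits at the same offset in both layouts and the
 * structures have the same size, 0 otherwise.
 */
int
httpproto_layout_matches(void)
{
    return (offsetof(httpproto_t, http_laddr) ==
        offsetof(httpproto32_t, http_laddr) &&
        offsetof(httpproto_t, http_uri) ==
        offsetof(httpproto32_t, http_uri) &&
        offsetof(httpproto_t, http_useragent) ==
        offsetof(httpproto32_t, http_useragent) &&
        offsetof(httpproto_t, http_byteswritten) ==
        offsetof(httpproto32_t, http_byteswritten) &&
        sizeof (httpproto_t) == sizeof (httpproto32_t));
}
```

Compiled as a 32-bit program, where pointers are four bytes, the layouts line up; compiled as a 64-bit program, the pointer fields are eight bytes wide, http_uri no longer sits at offset 4, and the translators would read the wrong bytes.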
This provider support file gets delivered into /usr/lib/dtrace. dtrace(1M) automatically imports all .d files in this directory (or another directory specified with -xlibdir) on startup. When a DTrace consumer goes to use our probes, the conninfo_t, httpproto_t, and http_reqinfo_t structures are defined, as well as the needed translators. More concretely, when a user writes:
# dtrace -n 'http*:::request-start{printf("%s\n", args[1]->hri_uri);}'
DTrace knows exactly what to do.
Rough edges
Remember http_provider.d? It contained the actual provider and probe definitions. It referred to the httpproto_t, conninfo_t, and http_reqinfo_t structures, which we didn’t actually mention in that file. We already explained that these structures and definitions are defined by the provider support file and used at runtime, so they shouldn’t actually be necessary here.
Unfortunately, there’s a piece missing that’s necessary to work around buggy behavior in DTrace: the D compiler insists on having these definitions and translators available when processing the provider definition file, but those structures and translators won’t be used (since this file is not even available at runtime anyway). Even worse, dtrace(1M) doesn’t automatically import the files in /usr/lib/dtrace when compiling the provider file, so we can’t simply depends_on them.
The end result is that we must define “dummy” structures and translators in the provider file, like this:
    typedef struct http_reqinfo {
        int dummy;
    } http_reqinfo_t;

    typedef struct httpproto {
        int dummy;
    } httpproto_t;

    typedef struct conninfo {
        int dummy;
    } conninfo_t;

    translator conninfo_t <httpproto_t *dp> {
        dummy = 0;
    };

    translator http_reqinfo_t <httpproto_t *dp> {
        dummy = 0;
    };
We also need stability attributes like the following to use the probes:
#pragma D attributes Evolving/Evolving/ISA provider http provider
#pragma D attributes Private/Private/Unknown provider http module
#pragma D attributes Private/Private/Unknown provider http function
#pragma D attributes Private/Private/ISA provider http name
#pragma D attributes Evolving/Evolving/ISA provider http args
You can see both of these in the iscsi provider as well.
Tips
I’ve now covered all the pieces, but there are other considerations in implementing a provider. For example, what arguments should the probes consume from the application, and what should be provided to D scripts? We chose structures on both sides because it’s much less unwieldy (especially as it evolves), but that necessitates the ugly translators and multiple definitions. If I’d used pointer and integer arguments, we’d need no structures, and therefore no translators, and thus we could leave out several of the files described above. But it would be a bit unwieldy and consumers would need to use copyin/copyinstr.
Both the HTTP and iSCSI providers instrument network-based protocols. For consistency, providers for these protocols (which also include NFS, CIFS, and FTP on the 7000 series) use the same conventions for probe argument types and names (e.g., conninfo_t as the first argument, followed by protocol-specific arguments).
Conclusion
USDT with translated arguments is extremely powerful, but the documentation is somewhat lacking and there are still some rough edges for implementers. I hope this example is valuable for people trying to put the pieces together. If people want to get involved in documenting this, contact the DTrace community at opensolaris.org.