Example: HTTP request latency and garbage collection

A few weeks ago I posted about Cloud Analytics on no.de. I described the various metrics we included in the initial launch and how they’re relevant for Node applications. Today I’ve got a small example using HTTP request latency and garbage collection. I wrote a small HTTP server in Node.js that simply echoes all requests, but deliberately creates a lot of garbage in the process: it allocates extra memory while servicing each request but doesn’t save it anywhere. The goal is just to observe garbage collection in action, and here’s what it looks like when hammered with an HTTP client (also written in Node.js):

The dark band at the bottom of the first heatmap indicates that most of the HTTP requests were serviced within a few milliseconds. But there are a few outliers: some HTTP requests took much longer to complete (around 30ms). I’ve highlighted the different request URLs in different colors so we can see that there’s no obvious correlation between latency and URL. Looking at the second heatmap, we can see that these slow requests correlate almost exactly with GC events. Since we know that V8 suspends JavaScript execution while running GC, we can conclude that GC is the source of the extra HTTP request latency. If this weren’t a simple demo, the next step would be to understand why the application is generating so much garbage.

The other interesting bit about this data is that a line graph of average latency would not have revealed the outliers at all. Without seeing the whole distribution, you might think performance was great and not realize that a small percentage of users’ requests were significantly slower than the rest, let alone be able to understand why.
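To make that concrete, here’s a toy illustration (the numbers are invented for this example, not actual measurements): 100 requests, most taking about 2ms, with a handful of 30ms outliers like the GC-affected requests above. The average barely registers the problem, while a tail percentile exposes it immediately:

```javascript
// 97 fast requests at ~2ms, 3 slow outliers at ~30ms.
const latencies = [];
for (let i = 0; i < 97; i++) latencies.push(2);
for (let i = 0; i < 3; i++) latencies.push(30);

const mean = latencies.reduce((a, b) => a + b, 0) / latencies.length;

const sorted = latencies.slice().sort((a, b) => a - b);
const p99 = sorted[Math.floor(0.99 * (sorted.length - 1))];

console.log(mean.toFixed(2)); // 2.84 -- looks fine on a line graph
console.log(p99);             // 30   -- the tail tells the real story
```

A line graph of `mean` over time would look flat and healthy; only the distribution (or a tail percentile) reveals that 3% of users waited fifteen times longer than everyone else.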

While pretty simple, this example shows how you can use CA to observe a performance problem and quickly test a couple of hypotheses. We’d love to hear about your real-world experiences with CA. Feel free to share below.