illumos tools for observing processes
illumos, with Solaris before it, has a history of delivering rich tools for understanding the system, but discovering these tools can be difficult for new users. Sometimes, tools just have different names than people are used to. In many cases, users don’t even know such tools might exist.
In this post I’ll describe some tools I find most useful, both as a developer and an administrator. This is not intended to be a comprehensive reference, but more like part of an orientation for users new to illumos (and SmartOS in particular) but already familiar with other Unix systems. This post will likely be review for veteran illumos and Solaris users.
The proc tools (ptools)
The ptools are a family of tools that observe processes running on the system. The most useful of these are pgrep, pstack, pfiles, and ptree.
pgrep searches for processes, returning a list of process ids. Here are some common example invocations:
$ pgrep mysql # print all processes with "mysql" in the name
# (e.g., "mysql" and "mysqld")
$ pgrep -x mysql # print all processes whose name is exactly "mysql"
# (i.e., not "mysqld")
$ pgrep -ox mysql # print the oldest mysql process
$ pgrep -nx mysql # print the newest mysql process
$ pgrep -f mysql # print processes matching "mysql" anywhere in the name
# or arguments (e.g., "vim mysql.conf")
$ pgrep -u dap # print all of user dap's processes
These options let you match processes very precisely, and they make scripts much more robust than the usual “ps -A | grep foo” approach.
I often combine pgrep with ps. For example, to see the memory usage of all of my node processes, I use:
$ ps -opid,rss,vsz,args -p "$(pgrep -x node)"
PID RSS VSZ COMMAND
4914 94380 98036 /usr/local/bin/node demo.js -p 8080
32113 92616 95964 /usr/local/bin/node demo.js -p 80
pkill is just like pgrep, but sends a signal to the matching processes.
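For example (the daemon name and user here are just for illustration), you could send SIGHUP to the oldest process named exactly “mysqld”, or terminate all of one user’s node processes:
$ pkill -HUP -ox mysqld   # send SIGHUP to the oldest exact match for "mysqld"
$ pkill -u dap node       # send SIGTERM (the default) to dap's "node" processes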
pstack shows you thread stack traces for the processes you give it:
$ pstack 51862
51862: find /
fedd6955 getdents64 (fecb0200, 808ef87, 804728c, fedabd84, 808ef88, 804728c) + 15
0805ee9c xsavedir (808ef87, 0, 8089a90, 1000000, 0, fee30000) + 7c
080582dc process_path (808e818, 0, 8089a90, 1000000, 0, fee30000) + 33c
080583ee process_path (808e410, 0, 8089a90, 1000000, 0, fee30000) + 44e
080583ee process_path (808e008, 0, 8089a90, 0, 0, fecb2a40) + 44e
080583ee process_path (8047cbd, 0, 8089a90, 0, fef40c20, fedc78b6) + 44e
080583ee process_path (8075cd0, 0, 2f, fed59274, 8047b48, 8047cbd) + 44e
08058931 do_process_top_dir (8047cbd, 8047cbd, 0, 0, 0, 0) + 21
08057c5e at_top (8058910, 2f, 8047bb0, 8089a90, 28, 80571f0) + 9e
08072eda main (2, 8047bcc, 8047bd8, 80729d0, 0, 0) + 4ea
08057093 _start (2, 8047cb8, 8047cbd, 0, 8047cbf, 8047cd3) + 83
This is incredibly useful as a first step for figuring out what a program is doing when it’s slow or not responsive.
pfiles shows you what file descriptors a process has open, similar to “lsof” on Linux systems, but for a specific process:
$ pfiles 32113
32113: /usr/local/bin/node /home/snpp/current/js/snpp.js -l 80 -d
Current rlimit: 1024 file descriptors
0: S_IFCHR mode:0666 dev:527,6 ino:2848424755 uid:0 gid:3 rdev:38,2
O_RDONLY|O_LARGEFILE
/dev/null
offset:0
1: S_IFREG mode:0644 dev:90,65565 ino:38817 uid:0 gid:0 size:793928
O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE
/var/svc/log/application-snpp:default.log
offset:793928
2: S_IFREG mode:0644 dev:90,65565 ino:38817 uid:0 gid:0 size:793928
O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE
/var/svc/log/application-snpp:default.log
offset:793928
3: S_IFPORT mode:0000 dev:537,0 uid:0 gid:0 size:0
4: S_IFIFO mode:0000 dev:524,0 ino:6257976 uid:0 gid:0 size:0
O_RDWR|O_NONBLOCK FD_CLOEXEC
5: S_IFIFO mode:0000 dev:524,0 ino:6257976 uid:0 gid:0 size:0
O_RDWR|O_NONBLOCK FD_CLOEXEC
6: S_IFSOCK mode:0666 dev:534,0 ino:23280 uid:0 gid:0 size:0
O_RDWR|O_NONBLOCK FD_CLOEXEC
SOCK_STREAM
SO_REUSEADDR,SO_SNDBUF(49152),SO_RCVBUF(128000)
sockname: AF_INET 0.0.0.0 port: 80
7: S_IFREG mode:0644 dev:90,65565 ino:91494 uid:0 gid:0 size:6999682
O_RDONLY|O_LARGEFILE
/home/snpp/data/0f0f2418d7967332caf0425cc5f31867.webm
offset:2334720
This includes details on files (including offset, which is great for checking on programs that scan through large files) and sockets.
ptree shows you a process tree for the whole system or for a given process or user. This is great for programs that use lots of processes (like a build):
$ ptree $(pgrep -ox make)
4599 zsched
6720 /usr/lib/ssh/sshd
45902 /usr/lib/ssh/sshd
45903 /usr/lib/ssh/sshd
45906 -bash
54464 make -j4
54528 make -C out BUILDTYPE=Release
55718 cc -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DL_ENDIAN -DOPENS
55719 /opt/local/libexec/gcc/i386-pc-solaris2.11/4.6.2/cc1 -quiet -I
55757 cc -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DL_ENDIAN -DOPENS
55758 /opt/local/libexec/gcc/i386-pc-solaris2.11/4.6.2/cc1 -quiet -I
55769 sed -e s|^bf_null.o|/home/dap/node/out/Release/obj.target/openss
55771 /bin/sh -c sed -e "s|^bf_nbio.o|/home/dap/node/out/Release/obj.t
Here’s a summary of these and several other useful ptools (a few example invocations follow the list):
- pgrep/pkill: search processes (and signal them)
- pstack: print thread stack traces
- ptree: print process tree
- pargs [-e]: print process arguments (and environment variables)
- pmap: print process virtual address mappings
- pwdx: print a process’s working directory
- pstop: stop a process (as a debugger would – useful for testing what happens when a process hangs or otherwise gets delayed)
- prun: run a stopped process
- plockstat: print lock statistics for a process
- psig: print a process’s signal dispositions
- pwait: wait for a process to terminate
- ptime: print detailed timing stats for a process
- pldd: print dynamic libraries for a process
- fuser: show which processes have a given file open (not technically a ptool, but useful nonetheless)
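To give a flavor of a few of the tools above that don’t have examples elsewhere in this post (reusing the pid of the node process from the earlier ps example):
$ pargs 4914              # print the arguments process 4914 was started with
$ pargs -e 4914           # print its environment variables
$ pwdx 4914               # print its current working directory
$ pmap -x 4914            # print its address space mappings, with resident sizes
$ ptime sort /etc/passwd  # run a command and report real/user/sys time on exit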
Some of these tools (including pfiles and pstack) will briefly pause the process to gather their data. For example, “pfiles” can take several seconds if there are many file descriptors open.
For details on these and a few others, check their man pages, most of which are in proc(1).
Core files
Many of the proc tools operate on core files just as well as live processes. Core files are created when a process exits abnormally, as via abort(3C) or a SIGSEGV. But you can also create one on-demand with gcore:
$ gcore 45906
gcore: core.45906 dumped
$ pstack core.45906
core 'core.45906' of 45906: -bash
fee647f5 waitid (7, 0, 8047760, f)
fee00045 waitpid (ffffffff, 8047838, c, 108a7, 3, 8047850) + 65
0808f4c3 waitchld (0, 0, 0, 0, 20000, 0) + 87
0808ffc6 wait_for (108a7, 0, 813c128, 3e, 330000, 78) + 2ce
08082ee8 execute_command_internal (813b348, 0, ffffffff, ffffffff, 813c128) + 1758
08083d3d execute_command (813b348, 1, 8047b58, 8071a7d, 0, 0) + 45
08071c18 reader_loop (fed90b2c, 80663dd, 8047c34, fed90dc8, 8069380, 0) + 240
080708e3 main (1, 8047dfc, 8047e04, 80eb9f0, 0, 0) + aff
0806f32b _start (1, 8047ea4, 0, 8047eaa, 8047eb3, 8047ebf) + 83
Lazy tracing of system calls
DTrace can trace system calls across the system with minimal impact, but for cases where the overhead is not important and you only care about one process, truss can be a convenient tool because it decodes arguments and return values for you:
$ truss -p 3135
sysconfig(_CONFIG_PAGESIZE) = 4096
ioctl(1, TCGETA, 0x080479F0) = 0
ioctl(1, TIOCGWINSZ, 0x08047B88) = 0
brk(0x08086CA8) = 0
brk(0x0808ACA8) = 0
open(".", O_RDONLY|O_NDELAY|O_LARGEFILE) = 3
fcntl(3, F_SETFD, 0x00000001) = 0
fstat64(3, 0x08047940) = 0
getdents64(3, 0xFEC84000, 8192) = 720
getdents64(3, 0xFEC84000, 8192) = 0
When debugging path-related issues (like why Node.js can’t find the module you’re requiring), it’s often useful to trace just calls to “open” and “stat” with “truss -topen,stat”. This is also good for watching commands that traverse a directory tree, like “tar” or “find”.
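For example (the target command here is illustrative), you can either start a new process under truss or attach to a running one:
$ truss -topen,stat node server.js   # trace only open and stat calls in a new process
$ truss -topen,stat -p 3135          # or attach to an already-running process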
DTrace and MDB
I mention DTrace and MDB last, but they’re the most comprehensive, most powerful tools in the system for understanding program behavior. The tools described above are simpler and present the most commonly useful information (e.g., process arguments or open file descriptors), but when you need to get arbitrary information about the system, these two are the tools to use.
DTrace is a comprehensive tracing framework for both the kernel and userland apps. It’s designed to be safe, to have zero overhead when not enabled, and to minimize overhead when enabled. DTrace has hundreds of thousands of probes at the kernel level, including system calls (system-wide), the scheduler, the I/O subsystem, ZFS, process execution, signals, and most function entry/exit points in the kernel. In userland, DTrace instruments function entry and exit points, individual instructions, and arbitrary probes added by application developers. At each of these instrumentation points, you can gather information like the currently running process, a kernel or userland stack backtrace, function arguments, or anything else in memory. To get started, I’d recommend Adam Leventhal’s DTrace boot camp slides. (The context and instructions for setup are a little dated, but the bulk of the content is still accurate.)
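As a tiny taste (a standard one-liner, not specific to the slides), you can count system calls by process name across the whole system:
$ dtrace -n 'syscall:::entry { @[execname] = count(); }'   # Ctrl-C to print the counts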
MDB is the modular debugger. Like GDB on other platforms, it’s most useful for deep inspection of a snapshot of program state. That can be a userland program or the kernel itself, and in both cases you can open a core dump (a crash dump, in the kernel case) or attach to the running program or running kernel. As you’d expect, MDB lets you examine the stack, global variables, threads, and so on. The syntax is a little arcane, but the model is Unixy, allowing debugger commands to be strung together much like a shell pipeline. Eric Schrock has two excellent posts for people moving from GDB to MDB.
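Here’s a sketch of what a session might look like, using the core file from the gcore example above (the dcmds shown are common ones, not a required workflow):
$ mdb core.45906                 # or "mdb -p <pid>" to attach to a live process
> ::status                       # summarize the target and why it dumped core
> $C                             # print the representative thread's stack
> ::walk thread | ::findstack    # print stacks for all threads
> $q                             # quit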
Let me know if I’ve missed any of the big ones. I’ll be writing a few more posts on tools in other areas of the system.