Production Practices

MDB and Node.js

MDB is the illumos modular debugger. MDB is widely used on illumos systems (including SmartOS) to debug the live kernel, kernel crash dumps, processes, and process core dumps. This document describes MDB’s support for debugging Node.js programs.

MDB’s focus is on inspecting program state — and letting application developers build custom tools to inspect their programs — rather than on source code analysis and control flow. While MDB supports breakpoints and single-stepping through assembly, it doesn’t support JavaScript (or even C) source-level debugging. What it can do is show you all of the state in your program: JavaScript objects.

Why MDB?

While console.log debugging is effective for reproducible problems in controlled environments, it doesn’t work very well in production for a bunch of good reasons:

  • Each lap is expensive. Even with a nimble deployment process, it takes time to get new code deployed to production. If you’re skipping deployment and editing live code, you risk taking down your service with a typo, leaving code running in production that’s not checked into the repo, or forgetting to update other instances of the service.
  • Each lap is disruptive. You usually have to restart a service to pick up code changes, and that’s usually visible as downtime, however brief, to users of your service. (If you use dynamic monkey patching to avoid this, you risk other, more difficult-to-debug issues related to having run an untested combination of components in the same process.)
  • Production problems are often hard to reproduce. Problems in distributed systems are especially hard to reproduce on the instance that you’re debugging.
  • Worse, restarts often make the original problem go away. Many problems result from bad in-memory state that’s recreated on restart, so if you restart the service, you lose all hope of debugging it.

So it’s very useful to be able to debug a running program while it’s broken. Besides using MDB, other popular techniques include building a REPL into your program or some other way to pull out program state. These are extremely useful, but have their own limitations:

  • They don’t work to debug crashes, since the program’s gone at that point.
  • They usually don’t work to debug infinite loops, since the program’s stuck.
  • They can’t show you all of the state in your program, like the files it has open, JavaScript objects that aren’t referenced by these tools, or the C-level state used by native add-ons.

Getting started

Prerequisites

All of the following works out-of-the-box on recent SmartOS builds, as well as SmartOS instances provisioned on the Triton Public Cloud.

You can also run the Node example on a GNU/Linux system, but you’ll have to save a core file and run the MDB parts on SmartOS (e.g., using mlogin on Manta).

Working through an example

Here’s an example you can follow along with. My program is called loop.js:

/*
 * This script demonstrates a simple stack trace involving an anonymous
 * function, then goes into an infinite loop so you can catch it with the
 * debugger.  One caveat: we grab a stack trace at the start to force V8 to
 * compute line number information for this script.  If we didn't have this, the
 * debugger would only print out position-in-file information.
 */
new Error().stack;

function main(arg) { func1(); }
function func1() { func2(); }

function func2()
{
        (function () {
                for (;;)
                        ;
        })();
}

main({ 'hello': 'world' });

Run this program in the background with node v0.10.24:

$ node -v
v0.10.24

$ node loop.js &
[2] 17334

$

and attach MDB to that process (pid 17334):

$ mdb -p 17334
Loading modules: [ ld.so.1 libumem.so.1 libc.so.1 ]
>

When you attach MDB to a process like this, it stops the process and gives you a prompt. ::status is a good place to start (most MDB commands start with ::):

> ::status
debugging PID 17334 (32-bit)
file: /home/dap/node-v0.10.24-sunos-x86/bin/node
threading model: native threads
status: stopped by debugger

Don’t worry if you don’t know what all that means. MDB is telling you that it’s attached to pid 17334, which is a 32-bit process, and it’s currently stopped.

In order to get any of the Node-specific debugger commands, you have to run ::load v8:1

> ::load v8
V8 version: 3.14.5.9
Autoconfigured V8 support from target
C++ symbol demangling enabled

Now you can get a combined JavaScript/C stack trace with ::jsstack:

> ::jsstack
8047620 0xafa59dee <anonymous> (as <anon>) (a0784dc5)
    file: /home/dap/demo/loop.js
    posn: line 15
804763c 0xafa59951 func2 (a0781379)
8047654 0xafa597eb func1 (a0781355)
804766c 0xafa5974b main (a0781331)
804768c 0xafa53c9c <anonymous> (as <anon>) (a0781285)
80476c0 0xafa141a5 <InternalFrame>
80476f8 0xafa53697 <anonymous> (as Module._compile) (8e22e04d)
8047718 0xafa4f799 <anonymous> (as Module._extensions..js) (8e22e08d)
804773c 0xafa4c73f <anonymous> (as Module.load) (8e22e005)
804777c 0xafa3fa37 <anonymous> (as Module._load) (8e22dfbd)
804779c 0xafa3f593 <anonymous> (as Module.runMain) (8e22e0d5)
80477d0 0xafa298f6 startup (a072461d)
80477ec 0xafa2910b <anonymous> (as <anon>) (a0723f95)
804780c 0xafa0fe19 <InternalFrame>
8047848 0xafa0a0aa <EntryFrame>
80478c8
_ZN2v88internalL6InvokeEbNS0_6HandleINS0_10JSFunctionEEENS1_INS0_6ObjectEEEiPS5_Pb+0x
101
8047908 v8::internal::Execution::Call+0xc9
8047968 v8::Function::Call+0x10b
80479d8 node::Load+0x14e
8047a38 node::Start+0x158
8047a58 main+0x1b
8047a70 _start+0x83

You can ignore the first column, as well as the pointers that start with 0x.2 The top frame is what you’d expect: it’s our anonymous function, created at line 15. That was called by func2, which was called by func1, which was called by main. After that you see the JavaScript and C frames that are part of Node itself.

You can get more information about the whole stack using ::jsstack -v:

> ::jsstack -v
8047620 0xafa59dee <anonymous> (as <anon>) (a0784dc5)
    file: /home/dap/demo/loop.js
    posn: line 15
804763c 0xafa59951 func2 (a0781379)
    file: /home/dap/demo/loop.js
    posn: line 13
8047654 0xafa597eb func1 (a0781355)
    file: /home/dap/demo/loop.js
    posn: line 11
804766c 0xafa5974b main (a0781331)
    file: /home/dap/demo/loop.js
    posn: line 10
    arg1: a0784db5 (JSObject)
804768c 0xafa53c9c <anonymous> (as <anon>) (a0781285)
    file: /home/dap/demo/loop.js
    posn: line 1
    arg1: a077f6cd (JSObject)
    arg2: a0780ae1 (JSFunction)
    arg3: a077f651 (JSObject)
    arg4: a077e091 (ConsString)
    arg5: a0780f95 (ConsString)
80476c0 0xafa141a5 <InternalFrame>
80476f8 0xafa53697 <anonymous> (as Module._compile) (8e22e04d)
    file: module.js
    posn: line 374
    arg1: a07807f9 (SeqAsciiString)
    arg2: a077e091 (ConsString)
8047718 0xafa4f799 <anonymous> (as Module._extensions..js) (8e22e08d)
    file: module.js
    posn: line 472
    arg1: a077f651 (JSObject)
    arg2: a077e091 (ConsString)
804773c 0xafa4c73f <anonymous> (as Module.load) (8e22e005)
    file: module.js
    posn: line 346
    arg1: a077e091 (ConsString)
804777c 0xafa3fa37 <anonymous> (as Module._load) (8e22dfbd)
    file: module.js
    posn: line 275
    arg1: a0746315 (ConsString)
    arg2: 8e208081 (Oddball: "null")
    arg3: 8e2080a1 (Oddball: "true")
804779c 0xafa3f593 <anonymous> (as Module.runMain) (8e22e0d5)
    file: module.js
    posn: line 495
80477d0 0xafa298f6 startup (a072461d)
    file: node.js
    posn: line 30
80477ec 0xafa2910b <anonymous> (as <anon>) (a0723f95)
    file: node.js
    posn: line 27
    arg1: a071e9a9 (JSObject)
804780c 0xafa0fe19 <InternalFrame>
8047848 0xafa0a0aa <EntryFrame>
80478c8 _ZN2v88internalL6InvokeEbNS0_6HandleINS0_10JSFunctionEEENS1_INS0_6ObjectEEEiPS5_Pb+0x101
8047908 v8::internal::Execution::Call+0xc9
8047968 v8::Function::Call+0x10b
80479d8 node::Load+0x14e
8047a38 node::Start+0x158
8047a58 main+0x1b
8047a70 _start+0x83

Now with each of the JavaScript stack frames, you have the filename and line number where the function was defined, along with the arguments it was passed. You can print these arguments by taking the pointer value (the hex number) and using ::jsprint. Here’s how you’d print arg1 to our main() function (the fourth frame down):

> a0784db5::jsprint
{
    hello: "world",
}

which is exactly what we passed.

Finding JavaScript objects

MDB can find any JavaScript object in your program. It does this by scanning the entire JavaScript heap and classifying objects by their properties. You do this with ::findjsobjects. This may take a minute, then it prints out a summary of what it found:

> ::findjsobjects
  OBJECT #OBJECTS   #PROPS CONSTRUCTOR: PROPS
8e2254c9        1        0 Buffer
a071f289        1        0 <anonymous> (as <anon>)
8e21a321        1        0 <anonymous> (as d)
8e2154a9        1        0 MathConstructor
8e209b31        1        0 JSON
8e22d401        1        0 Module
a0708081       26        0
a0715901       35        0 Array
8e2093cd      329        0 Object
a074a61d        1        1 Object: ...
a074a579        1        1 Object: /home/dap/demo/loop.js
a0711295        1        1 Arguments: length
...
9c912f19        2        5 Object: evals, natives, buffer, fs, constants
9c9109d5        2        5 Object: cflags, default_configuration, defines, ...
a074aea5        1       11 Array
9c90ab55        1       11 process: version, moduleLoadList, versions, ...
9c90aae9        2        6 Object: debug, uv, ipv6, tls_npn, tls_sni, tls
9c90dadd        1       13 Object: _debugger, _linklist, assert, buffer, ...
9c910bcd        1       13 Object: clang, gcc_version, host_arch, ...
9c912ef5        1       13 Object: O_RDONLY, O_WRONLY, O_RDWR, S_IFMT, ...
9c9148f9        2        7 Module: id, exports, parent, filename, loaded, ...
a0718911        1       16 Array
9c90a885        2        8 Object: http_parser, node, v8, ares, uv, zlib, ...
9c9148b1        2       19 process: version, moduleLoadList, versions, ...
...

Each row describes a group of objects that have the same set of properties, so ::findjsobjects groups them together. The first column in each row is a pointer to a representative object for that group. You can print those out with ::jsprint. For example, this one appears to be used internally by Node:

> a074a579::jsprint
{
    /home/dap/demo/loop.js: {
        id: ".",
        exports: [...],
        parent: null,
        filename: "/home/dap/demo/loop.js",
        loaded: false,
        children: [...],
        paths: [...],
    },
}

Once you’ve run “::findjsobjects” once, it keeps track of what it found before so that you can quickly look for more specific kinds of objects. You usually search by a property name. For example, if I wanted to find my “hello world” object (and I didn’t know it was on the stack), I could find it with:

> ::findjsobjects -p hello
a0784d79

and print it out:

> a0784d79::jsprint
{
    hello: "world",
}

This is a good time to point out that MDB supports pipelines, similar to most shells. So you can combine these two commands:

> ::findjsobjects -p hello | ::jsprint
{
    hello: "world",
}

Dealing with garbage

The mechanism that “::findjsobjects” uses means that it also tends to pick up “garbage” objects that used to exist in your program, but which are no longer valid. In my process, I can see some of these when I try to look for Node’s global “process” object, which I know has a versions property:

> ::findjsobjects -p versions | ::jsprint
mdb: 8e21f06d: unknown JavaScript object type "AccessorInfo"
{
    version: 11,
    moduleLoadList: 0,
    versions: "title",
    arch: 1572995,
    platform: ,
    argv: "version",
    execArgv: 2621697,
    env: 0,
    pid: "moduleLoadList",
    features: 1835393,
}
...

You can recognize garbage objects because you may see warnings when MDB prints them out (e.g., “unknown JavaScript object type”) and the properties look very wrong. In this case, we know the “versions” property is supposed to be an object, not the string “title”.

These garbage objects are annoying, but they’re generally easily distinguished from valid objects.

Representative objects

Note that “::findjsobjects” only prints out the representative object from each group of objects. Since you might have tens of thousands of instances of the same thing, it doesn’t print all those by default. You have to pipe to ::findjsobjects again to get that. If you wanted to actually find the global “process” object, the reliable way to do that is:

> ::findjsobjects -p versions | ::findjsobjects | ::jsprint
...
{
    version: "v0.10.24",
    moduleLoadList: [
        "Binding evals",
        "Binding natives",
        "NativeModule events",
...
    ],
    versions: {
        http_parser: "1.0",
        node: "0.10.24",
        v8: "3.14.5.9",
        ares: "1.9.0-DEV",
        uv: "0.10.21",
        zlib: "1.2.3",
        modules: "11",
        openssl: "1.0.1e",
    },
    arch: "ia32",
    platform: "sunos",
    argv: [
        "node",
        "/home/dap/demo/loop.js",
    ],
...
}

I clipped the output for brevity. There were a few garbage objects, and the “process” object itself has a lot of stuff in it.

Where to go from here

This example demonstrated a bunch of the basics around using MDB to inspect program state. The best way to get familiar with these commands is to use them on your own programs, ideally to debug real problems. But even if you don’t have a real problem, try creating a core file of a production service and poking around. You might be surprised at what you find.

Debugging actual issues by inspecting state is very different than both console.log debugging and tracing execution with breakpoints. It’s much more powerful in some ways, since it shows you much more information and doesn’t disrupt the running program. When debugging problems this way, the approach mirrors the scientific method:

  • Form a hypothesis about what happened.
  • Inspect the state of the core file to confirm or disprove the hypothesis.
  • If proved, you’re done. If not, refine the hypothesis and repeat.

The hard part is learning to think about problems this way and to develop creative ways to prove or disprove hypotheses. Sometimes, it’s easy: if you think some socket got closed, you can find the socket object, inspect its state, and test that hypothesis. Sometimes, it’s not so easy: if you think a particular function may have been called that shouldn’t have been, it may not be easy to confirm that. But be creative: maybe the function doesn’t set any variables, but does it instantiate any objects, even temporary ones? You may be able to find them with “::findjsobjects”.

One nice thing about this approach is that by the end, you can be confident in your root-cause, because you’ve got lots of data to back it up. Often, the data may point to a very specific combination of events (e.g., this kind of request happening concurrently with that other kind of request) that may help you reproduce the problem in development, so you can be sure that your fix works.

Building programs for debugging

You can help yourself by including state in your program that only exists to help debug it. For example, the vasync library provides familiar control-flow functions (like pipeline and forEachParallel), with the added benefit that the state is stored in a JavaScript object (rather than local variables contained in a closure). The upshot is that when your Node program hangs while running a pipeline of 10 functions, it’s easy to use “::findjsobjects” to find out exactly which function didn’t complete.

Other tips:

  • Save a record of all outstanding operations. Keep them in an object indexed by request id.
  • Save a timestamp when you start each operation, particularly outgoing requests. This helps you determine whether a request has been hung (i.e., the external service is hung).

More examples

These are some examples of using MDB to analyze complex issues in Node itself:

Notes

Debugging crashes in Node programs

With Node.js v0.10.8 and later, you can use the --abort-on-uncaught-exception flag to cause your Node program to dump core when any uncaught exception is thrown. This way, when your program would have crashed, you instead get a core dump that you can debug with MDB.

If you are using the newer async/await keywords you should consider adding handler that throws any unhandled rejections. This, in combination with the abort flag above will create a core file when a rejected promise isn’t properly handled. For example, the code below will throw the rejected promise and cause the process to abort.

process.once('unhandledRejection', (err) => {
  throw err;
});

Line numbering

The line numbers reported by MDB describe where the function was defined, not where the function called the next function in the stack.

In some cases, instead of seeing a line number, you may see a position number, as in “position 13546”. MDB can’t always tell what line number a position corresponds to. It can only tell when V8 computed this before the core file was created. In this case, the function would have been defined position 13546 in the file. In “vim”, you can find this with :goto 13546. Beware that Node prepends about 50 characters to all files before passing them to V8, so these position numbers appear to be off by about that much.

Processes or core files?

MDB operates on either running processes or core dumps. In turn, core dumps can either be generated automatically in response to a crash, or on-demand using gcore. The MDB commands and output are basically the same whether you’re debugging a live process or a core file — it’s just the way you start up MDB that’s different.

To debug a core dump, use:

$ mdb /path/to/core/file

To debug a process, use:

$ mdb -p PID

Once MDB is started, the rest of the commands are the same.

When you use mdb -p, MDB stops the process. Be careful doing this in production! It’s usually better to create a core file first, and then debug that:

$ gcore PID
core.PID dumped

$ mdb core.PID

This still pauses the process for as long as it takes gcore to run.

Special stack frames

There are a bunch of V8-specific frames you may see that you can usually ignore. These include:

  • ArgumentsAdapterFrame: inserted by V8 when a function’s caller passes a different number of arguments than the function is defined to take.
  • ConstructFrame: used when invoking a constructor
  • EntryFrame: transition from C to JavaScript
  • ExitFrame: transition from JavaScript to C

Other frames include InternalFrames, EntryConstructFrames, and so on.

What’s this “hole” object?

In ::jsprint output, you may see values called hole. You can think of “hole” as undefined. You usually see it in arrays where the array itself hasn’t been fully populated.

JavaScript requires that V8 be able to distinguish between array elements that have not been assigned a value and those that have been assigned “undefined”.3 V8 uses the special “hole” value for this internally.

For more help

On the specific MDB commands that work on Node, see:

> ::dmods -l v8

to list commands, and use ::help CMD for more information about each command.

On MDB, check out the MDB man page and the Modular Debugger guide. Be aware that MDB is also used as a kernel and low-level C debugger, so much of the material may be too low-level for Node.

Footnotes

1The debugger module is called “v8” because it’s actually not Node-specific: everything it supports would work on other environments using the V8 VM.

2The first two columns of ::jsstack output are the frame pointer and instruction pointer for each stack frame. These can be useful when digging into VM internals, or looking at native code, but can usually be ignored for debugging JavaScript-level issues.

3You can see the difference yourself:

$ node
> var x = new Array(3)
undefined
> x[0] = undefined
undefined
> x[2] = undefined
undefined
> x
[ undefined, , undefined ]
> 0 in x
true
> 1 in x
false
> 2 in x
true