MDB is the illumos modular debugger. MDB is widely used on illumos systems (including SmartOS) to debug the live kernel, kernel crash dumps, processes, and process core dumps. This document describes MDB’s support for debugging Node.js programs.
MDB’s focus is on inspecting program state — and letting application developers build custom tools to inspect their programs — rather than on source code analysis and control flow. While MDB supports breakpoints and single-stepping through assembly, it doesn’t support JavaScript (or even C) source-level debugging. What it can do is show you all of the state in your program: JavaScript objects.
While console.log
debugging is effective for reproducible problems in
controlled environments, it doesn’t work very well in production for a bunch of
good reasons:
So it’s very useful to be able to debug a running program while it’s broken. Besides using MDB, other popular techniques include building a REPL into your program or some other way to pull out program state. These are extremely useful, but have their own limitations:
All of the following works out-of-the-box on recent SmartOS builds, as well as SmartOS instances provisioned on the Triton Public Cloud.
You can also run the Node example on a GNU/Linux system, but you’ll have to save a core file and run the MDB parts on SmartOS (e.g., using mlogin on Manta).
Here’s an example you can follow along with. My program is called loop.js:
/*
* This script demonstrates a simple stack trace involving an anonymous
* function, then goes into an infinite loop so you can catch it with the
* debugger. One caveat: we grab a stack trace at the start to force V8 to
* compute line number information for this script. If we didn't have this, the
* debugger would only print out position-in-file information.
*/
new Error().stack;
function main(arg) { func1(); }
function func1() { func2(); }
function func2()
{
(function () {
for (;;)
;
})();
}
main({ 'hello': 'world' });
Run this program in the background with node v0.10.24:
$ node -v
v0.10.24
$ node loop.js &
[2] 17334
$
and attach MDB to that process (pid 17334):
$ mdb -p 17334
Loading modules: [ ld.so.1 libumem.so.1 libc.so.1 ]
>
When you attach MDB to a process like this, it stops the process and gives you
a prompt. ::status
is a good place to start (most MDB commands start with
::
):
> ::status
debugging PID 17334 (32-bit)
file: /home/dap/node-v0.10.24-sunos-x86/bin/node
threading model: native threads
status: stopped by debugger
Don’t worry if you don’t know what all that means. MDB is telling you that it’s attached to pid 17334, which is a 32-bit process, and it’s currently stopped.
In order to get any of the Node-specific debugger commands, you have to run
::load v8
:1
> ::load v8
V8 version: 3.14.5.9
Autoconfigured V8 support from target
C++ symbol demangling enabled
Now you can get a combined JavaScript/C stack trace with ::jsstack
:
> ::jsstack
8047620 0xafa59dee <anonymous> (as <anon>) (a0784dc5)
file: /home/dap/demo/loop.js
posn: line 15
804763c 0xafa59951 func2 (a0781379)
8047654 0xafa597eb func1 (a0781355)
804766c 0xafa5974b main (a0781331)
804768c 0xafa53c9c <anonymous> (as <anon>) (a0781285)
80476c0 0xafa141a5 <InternalFrame>
80476f8 0xafa53697 <anonymous> (as Module._compile) (8e22e04d)
8047718 0xafa4f799 <anonymous> (as Module._extensions..js) (8e22e08d)
804773c 0xafa4c73f <anonymous> (as Module.load) (8e22e005)
804777c 0xafa3fa37 <anonymous> (as Module._load) (8e22dfbd)
804779c 0xafa3f593 <anonymous> (as Module.runMain) (8e22e0d5)
80477d0 0xafa298f6 startup (a072461d)
80477ec 0xafa2910b <anonymous> (as <anon>) (a0723f95)
804780c 0xafa0fe19 <InternalFrame>
8047848 0xafa0a0aa <EntryFrame>
80478c8
_ZN2v88internalL6InvokeEbNS0_6HandleINS0_10JSFunctionEEENS1_INS0_6ObjectEEEiPS5_Pb+0x
101
8047908 v8::internal::Execution::Call+0xc9
8047968 v8::Function::Call+0x10b
80479d8 node::Load+0x14e
8047a38 node::Start+0x158
8047a58 main+0x1b
8047a70 _start+0x83
You can ignore the first column, as well as the pointers that start with
0x
.2 The top frame is what you’d expect: it’s our anonymous
function, created at line 15. That was called by func2
, which was called by
func1
, which was called by main
. After that you see the JavaScript and C
frames that are part of Node itself.
You can get more information about the whole stack using ::jsstack -v
:
> ::jsstack -v
8047620 0xafa59dee <anonymous> (as <anon>) (a0784dc5)
file: /home/dap/demo/loop.js
posn: line 15
804763c 0xafa59951 func2 (a0781379)
file: /home/dap/demo/loop.js
posn: line 13
8047654 0xafa597eb func1 (a0781355)
file: /home/dap/demo/loop.js
posn: line 11
804766c 0xafa5974b main (a0781331)
file: /home/dap/demo/loop.js
posn: line 10
arg1: a0784db5 (JSObject)
804768c 0xafa53c9c <anonymous> (as <anon>) (a0781285)
file: /home/dap/demo/loop.js
posn: line 1
arg1: a077f6cd (JSObject)
arg2: a0780ae1 (JSFunction)
arg3: a077f651 (JSObject)
arg4: a077e091 (ConsString)
arg5: a0780f95 (ConsString)
80476c0 0xafa141a5 <InternalFrame>
80476f8 0xafa53697 <anonymous> (as Module._compile) (8e22e04d)
file: module.js
posn: line 374
arg1: a07807f9 (SeqAsciiString)
arg2: a077e091 (ConsString)
8047718 0xafa4f799 <anonymous> (as Module._extensions..js) (8e22e08d)
file: module.js
posn: line 472
arg1: a077f651 (JSObject)
arg2: a077e091 (ConsString)
804773c 0xafa4c73f <anonymous> (as Module.load) (8e22e005)
file: module.js
posn: line 346
arg1: a077e091 (ConsString)
804777c 0xafa3fa37 <anonymous> (as Module._load) (8e22dfbd)
file: module.js
posn: line 275
arg1: a0746315 (ConsString)
arg2: 8e208081 (Oddball: "null")
arg3: 8e2080a1 (Oddball: "true")
804779c 0xafa3f593 <anonymous> (as Module.runMain) (8e22e0d5)
file: module.js
posn: line 495
80477d0 0xafa298f6 startup (a072461d)
file: node.js
posn: line 30
80477ec 0xafa2910b <anonymous> (as <anon>) (a0723f95)
file: node.js
posn: line 27
arg1: a071e9a9 (JSObject)
804780c 0xafa0fe19 <InternalFrame>
8047848 0xafa0a0aa <EntryFrame>
80478c8 _ZN2v88internalL6InvokeEbNS0_6HandleINS0_10JSFunctionEEENS1_INS0_6ObjectEEEiPS5_Pb+0x101
8047908 v8::internal::Execution::Call+0xc9
8047968 v8::Function::Call+0x10b
80479d8 node::Load+0x14e
8047a38 node::Start+0x158
8047a58 main+0x1b
8047a70 _start+0x83
Now with each of the JavaScript stack frames, you have the filename and line
number where the function was defined, along with the arguments it was passed.
You can print these arguments by taking the pointer value (the hex number) and
using ::jsprint
. Here’s how you’d print arg1 to our main()
function (the
fourth frame down):
> a0784db5::jsprint
{
hello: "world",
}
which is exactly what we passed.
MDB can find any JavaScript object in your program. It does this by scanning
the entire JavaScript heap and classifying objects by their properties. You do
this with ::findjsobjects
. This may take a minute, then it prints out a
summary of what it found:
> ::findjsobjects
OBJECT #OBJECTS #PROPS CONSTRUCTOR: PROPS
8e2254c9 1 0 Buffer
a071f289 1 0 <anonymous> (as <anon>)
8e21a321 1 0 <anonymous> (as d)
8e2154a9 1 0 MathConstructor
8e209b31 1 0 JSON
8e22d401 1 0 Module
a0708081 26 0
a0715901 35 0 Array
8e2093cd 329 0 Object
a074a61d 1 1 Object: ...
a074a579 1 1 Object: /home/dap/demo/loop.js
a0711295 1 1 Arguments: length
...
9c912f19 2 5 Object: evals, natives, buffer, fs, constants
9c9109d5 2 5 Object: cflags, default_configuration, defines, ...
a074aea5 1 11 Array
9c90ab55 1 11 process: version, moduleLoadList, versions, ...
9c90aae9 2 6 Object: debug, uv, ipv6, tls_npn, tls_sni, tls
9c90dadd 1 13 Object: _debugger, _linklist, assert, buffer, ...
9c910bcd 1 13 Object: clang, gcc_version, host_arch, ...
9c912ef5 1 13 Object: O_RDONLY, O_WRONLY, O_RDWR, S_IFMT, ...
9c9148f9 2 7 Module: id, exports, parent, filename, loaded, ...
a0718911 1 16 Array
9c90a885 2 8 Object: http_parser, node, v8, ares, uv, zlib, ...
9c9148b1 2 19 process: version, moduleLoadList, versions, ...
...
Each row describes a group of objects that have the same set of properties, so ::findjsobjects groups them together. The first column in each row is a pointer to a representative object for that group. You can print those out with ::jsprint. For example, this one appears to be used internally by Node:
> a074a579::jsprint
{
/home/dap/demo/loop.js: {
id: ".",
exports: [...],
parent: null,
filename: "/home/dap/demo/loop.js",
loaded: false,
children: [...],
paths: [...],
},
}
Once you’ve run “::findjsobjects” once, it keeps track of what it found before so that you can quickly look for more specific kinds of objects. You usually search by a property name. For example, if I wanted to find my “hello world” object (and I didn’t know it was on the stack), I could find it with:
> ::findjsobjects -p hello
a0784d79
and print it out:
> a0784d79::jsprint
{
hello: "world",
}
This is a good time to point out that MDB supports pipelines, similar to most shells. So you can combine these two commands:
> ::findjsobjects -p hello | ::jsprint
{
hello: "world",
}
The mechanism that “::findjsobjects” uses means that it also tends to pick up
“garbage” objects that used to exist in your program, but which are no longer
valid. In my process, I can see some of these when I try to look for Node’s
global “process” object, which I know has a versions
property:
> ::findjsobjects -p versions | ::jsprint
mdb: 8e21f06d: unknown JavaScript object type "AccessorInfo"
{
version: 11,
moduleLoadList: 0,
versions: "title",
arch: 1572995,
platform: ,
argv: "version",
execArgv: 2621697,
env: 0,
pid: "moduleLoadList",
features: 1835393,
}
...
You can recognize garbage objects because you may see warnings when MDB prints them out (e.g., “unknown JavaScript object type”) and the properties look very wrong. In this case, we know the “versions” property is supposed to be an object, not the string “title”.
These garbage objects are annoying, but they’re generally easily distinguished from valid objects.
Note that “::findjsobjects” only prints out the representative object from
each group of objects. Since you might have tens of thousands of instances of
the same thing, it doesn’t print all those by default. You have to pipe to
::findjsobjects
again to get that. If you wanted to actually find the global
“process” object, the reliable way to do that is:
> ::findjsobjects -p versions | ::findjsobjects | ::jsprint
...
{
version: "v0.10.24",
moduleLoadList: [
"Binding evals",
"Binding natives",
"NativeModule events",
...
],
versions: {
http_parser: "1.0",
node: "0.10.24",
v8: "3.14.5.9",
ares: "1.9.0-DEV",
uv: "0.10.21",
zlib: "1.2.3",
modules: "11",
openssl: "1.0.1e",
},
arch: "ia32",
platform: "sunos",
argv: [
"node",
"/home/dap/demo/loop.js",
],
...
}
I clipped the output for brevity. There were a few garbage objects, and the “process” object itself has a lot of stuff in it.
This example demonstrated a bunch of the basics around using MDB to inspect program state. The best way to get familiar with these commands is to use them on your own programs, ideally to debug real problems. But even if you don’t have a real problem, try creating a core file of a production service and poking around. You might be surprised at what you find.
Debugging actual issues by inspecting state is very different than both
console.log
debugging and tracing execution with breakpoints. It’s much more
powerful in some ways, since it shows you much more information and doesn’t
disrupt the running program. When debugging problems this way, the approach
mirrors the scientific method:
The hard part is learning to think about problems this way and to develop creative ways to prove or disprove hypotheses. Sometimes, it’s easy: if you think some socket got closed, you can find the socket object, inspect its state, and test that hypothesis. Sometimes, it’s not so easy: if you think a particular function may have been called that shouldn’t have been, it may not be easy to confirm that. But be creative: maybe the function doesn’t set any variables, but does it instantiate any objects, even temporary ones? You may be able to find them with “::findjsobjects”.
One nice thing about this approach is that by the end, you can be confident in your root-cause, because you’ve got lots of data to back it up. Often, the data may point to a very specific combination of events (e.g., this kind of request happening concurrently with that other kind of request) that may help you reproduce the problem in development, so you can be sure that your fix works.
You can help yourself by including state in your program that only exists to
help debug it. For example, the
vasync library provides
familiar control-flow functions (like pipeline
and forEachParallel
), with
the added benefit that the state is stored in a JavaScript object (rather than
local variables contained in a closure). The upshot is that when your Node
program hangs while running a pipeline of 10 functions, it’s easy to use
“::findjsobjects” to find out exactly which function didn’t complete.
Other tips:
These are some examples of using MDB to analyze complex issues in Node itself:
With Node.js v0.10.8 and later, you can use the --abort-on-uncaught-exception
flag to cause your Node program to dump core when any uncaught exception is
thrown. This way, when your program would have crashed, you instead get a core
dump that you can debug with MDB.
If you are using the newer async
/await
keywords you should consider adding handler that throws any unhandled rejections. This, in combination with the abort flag above will create a core file when a rejected promise isn’t properly handled. For example, the code below will throw the rejected promise and cause the process to abort.
process.once('unhandledRejection', (err) => {
throw err;
});
The line numbers reported by MDB describe where the function was defined, not where the function called the next function in the stack.
In some cases, instead of seeing a line number, you may see a position number,
as in “position 13546”. MDB can’t always tell what line number a position
corresponds to. It can only tell when V8 computed this before the core file was
created. In this case, the function would have been defined position 13546 in
the file. In “vim”, you can find this with :goto 13546
. Beware that Node
prepends about 50 characters to all files before passing them to V8, so these
position numbers appear to be off by about that much.
MDB operates on either running processes or core dumps. In turn, core dumps can either be generated automatically in response to a crash, or on-demand using gcore. The MDB commands and output are basically the same whether you’re debugging a live process or a core file — it’s just the way you start up MDB that’s different.
To debug a core dump, use:
$ mdb /path/to/core/file
To debug a process, use:
$ mdb -p PID
Once MDB is started, the rest of the commands are the same.
When you use mdb -p
, MDB stops the process. Be careful doing this in
production! It’s usually better to create a core file first, and then debug
that:
$ gcore PID
core.PID dumped
$ mdb core.PID
This still pauses the process for as long as it takes gcore
to run.
There are a bunch of V8-specific frames you may see that you can usually ignore. These include:
Other frames include InternalFrames, EntryConstructFrames, and so on.
In ::jsprint
output, you may see values called hole
. You can think of
“hole” as undefined
. You usually see it in arrays where the array itself
hasn’t been fully populated.
JavaScript requires that V8 be able to distinguish between array elements that have not been assigned a value and those that have been assigned “undefined”.3 V8 uses the special “hole” value for this internally.
On the specific MDB commands that work on Node, see:
> ::dmods -l v8
to list commands, and use ::help CMD
for more information about each command.
On MDB, check out the MDB man page and the Modular Debugger guide. Be aware that MDB is also used as a kernel and low-level C debugger, so much of the material may be too low-level for Node.
1The debugger module is called “v8” because it’s actually not Node-specific: everything it supports would work on other environments using the V8 VM.
2The first two columns of ::jsstack
output are the frame pointer
and instruction pointer for each stack frame. These can be useful when
digging into VM internals, or looking at native code, but can usually be
ignored for debugging JavaScript-level issues.
3You can see the difference yourself:
$ node
> var x = new Array(3)
undefined
> x[0] = undefined
undefined
> x[2] = undefined
undefined
> x
[ undefined, , undefined ]
> 0 in x
true
> 1 in x
false
> 2 in x
true