Node 6 woes

I’m working on a project that involves processing a fairly big data set (I’ll have more to say about the project later), and it was convenient to write my analysis using node.js. That turned out not to be a good idea.

After my long-running analysis hit around 1.5GB of heap usage (on a machine with 16GB RAM), my program mysteriously crashed. As you may know, V8 caps the heap size by default no matter how much RAM the machine has, so you have to explicitly raise the limit with:

node --max_old_space_size=X myscript.js

where X is the size in MB.  Mildly irritating, but whatever.
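To confirm that the flag took effect, you can ask V8 for its configured limit from inside the script. This is a minimal sketch using the built-in v8 module; the helper name heapReport is mine, not part of any API.

```javascript
// Report V8's configured heap limit and current heap usage, in MB.
const v8 = require('v8');

// heapReport is a hypothetical helper name chosen for this sketch.
function heapReport() {
  const stats = v8.getHeapStatistics();
  const toMB = (bytes) => Math.round(bytes / (1024 * 1024));
  return {
    limitMB: toMB(stats.heap_size_limit), // upper bound V8 will allow
    usedMB: toMB(stats.used_heap_size),   // what the program uses right now
  };
}

const report = heapReport();
console.log(`heap limit: ${report.limitMB} MB, used: ${report.usedMB} MB`);
```

Running it once with and once without --max_old_space_size should show the limit change, and logging it periodically in a long-running job tells you how close you are to the ceiling.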

As my analysis marched on to around 4GB of heap usage, I saw a new mysterious crash that looked like:

#
# Fatal error in ../deps/v8/src/heap/spaces.h, line 1516
# Check failed: size_ >= 0.
#

==== C stack trace ===============================

1: V8_Fatal
2: 0xc56387
3: v8::internal::MarkCompactCollector::SweepSpaces()
4: v8::internal::MarkCompactCollector::CollectGarbage()
5: v8::internal::Heap::MarkCompact()
6: v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags)
7: v8::internal::Heap::CollectGarbage(v8::internal::GarbageCollector, char const*, char const*, v8::GCCallbackFlags)
8: v8::internal::Heap::HandleGCRequest()
9: v8::internal::StackGuard::HandleInterrupts()
10: v8::internal::Runtime_StackGuard(int, v8::internal::Object**, v8::internal::Isolate*)
11: 0xf767a506338
Illegal instruction

Some searching turned up this V8 bug, in which a variable used to track heap space during sweeps regressed to an “int” (I think?) and so overflows on large heaps (around 4GB in my case). This regression is rather more annoying, and a bit of a head-scratcher.
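For a feel of how a size counter can end up failing a check like size_ >= 0, here is a toy sketch of signed 32-bit wraparound. It is not V8 code: the counter, the 1MB chunk size, and the function name are all made up for illustration, and I am only guessing at the real variable's type. In this toy version the counter goes negative once it reaches 2^31 bytes; where the real counter trips depends on details of V8 that I have not verified.

```javascript
// Hypothetical byte counter stored in a signed 32-bit integer (not V8's code).
const counter = new Int32Array(1);
const CHUNK = 1024 * 1024; // grow the pretend heap 1 MB at a time

// Keep adding chunks until the counter wraps around and turns negative.
function addUntilNegative() {
  counter[0] = 0;
  let chunks = 0;
  while (counter[0] >= 0) {
    counter[0] += CHUNK; // the typed array stores the result wrapped to 32 bits
    chunks += 1;
  }
  return { chunks, value: counter[0] };
}

const result = addUntilNegative();
console.log(`went negative after ${result.chunks} MB: ${result.value}`);
```

A wider integer type for the counter would avoid the wraparound.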

The workaround is to downgrade to node 5.11.  You can check your system with the excellent test case and instructions here.

As a side effect, after the downgrade the performance on my workload increased by 15% – 30%.  Also a bit of a head-scratcher.

So, on the whole, 🤔. You might want to think twice about using node.js for programs that are very resource-intensive[1].

However, there were some rather pleasant parts of my experience with the latest-and-greatest node tools that I may write up later.


[1] Yes, I could have rewritten my program to use multiple processes, or a different analysis backend or computational model or …  I’m just recording my experience here as a relatively naive user.