I’m working on a project that involves processing a fairly big data set — I’ll have more to say about the project later — and it was convenient to write my analysis using node.js. That turned out to not be a good idea.
After my long-running analysis hit around 1.5GB of heap usage (on a machine with 16GB RAM), my program mysteriously crashed: it had run into the default heap-size limit. As you may know, you have to explicitly override that limit with:
node --max_old_space_size=X myscript.js
where X is the size in MB. Mildly irritating, but whatever.
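If you want to confirm that the flag actually took effect, a small check like the one below can help. It is only a sketch: the helper names heapLimitMB and heapUsedMB are mine, and it relies on node's built-in v8 module and process.memoryUsage().

```javascript
// Sketch: report the heap limit this process is running with.
// heapLimitMB and heapUsedMB are hypothetical helper names.
const v8 = require('v8');

const MB = 1024 * 1024;

// Heap-size limit currently in effect, in MB.
function heapLimitMB() {
  return v8.getHeapStatistics().heap_size_limit / MB;
}

// Heap currently in use by this process, in MB.
function heapUsedMB() {
  return process.memoryUsage().heapUsed / MB;
}

console.log(
  `heap limit: ${heapLimitMB().toFixed(0)} MB, ` +
  `heap used: ${heapUsedMB().toFixed(1)} MB`
);
```

Running the script with and without --max_old_space_size should show a correspondingly different limit, and logging heapUsedMB() periodically in a long-running job gives some warning before the limit is reached.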
As my analysis marched on to around 4GB of heap usage, I saw a new mysterious crash that looked like:
#
# Fatal error in ../deps/v8/src/heap/spaces.h, line 1516
# Check failed: size_ >= 0.
#

==== C stack trace ===============================

 1: V8_Fatal
 2: 0xc56387
 3: v8::internal::MarkCompactCollector::SweepSpaces()
 4: v8::internal::MarkCompactCollector::CollectGarbage()
 5: v8::internal::Heap::MarkCompact()
 6: v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags)
 7: v8::internal::Heap::CollectGarbage(v8::internal::GarbageCollector, char const*, char const*, v8::GCCallbackFlags)
 8: v8::internal::Heap::HandleGCRequest()
 9: v8::internal::StackGuard::HandleInterrupts()
10: v8::internal::Runtime_StackGuard(int, v8::internal::Object**, v8::internal::Isolate*)
11: 0xf767a506338
Illegal instruction
Some searching turned up this v8 bug, in which a variable used to track heap space during sweeps regressed to an “int” (I think?) and so overflows on heaps larger than 4GB. This regression is rather more annoying, and a bit of a head-scratcher.
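To see how a check like "size_ >= 0" can fail, here is a toy illustration of a byte counter kept in a signed 32-bit integer. This is not V8's code. The addInt32 helper is hypothetical, and the 32-bit signed width is just my guess from above.

```javascript
// Toy illustration only: simulate a byte counter stored as a signed 32-bit int.
// addInt32 is a hypothetical helper, not anything from V8.
function addInt32(counter, bytes) {
  // `| 0` truncates the sum to a signed 32-bit value, like a C int would.
  return (counter + bytes) | 0;
}

const GB = 1024 * 1024 * 1024;

let size = 0;
size = addInt32(size, GB); // 1073741824, still fine
size = addInt32(size, GB); // 2^31 does not fit, wraps to -2147483648

console.log(size);      // -2147483648
console.log(size >= 0); // false
```

In this toy version the counter goes negative at 2GB rather than 4GB, so it only shows the general failure mode, not the exact threshold I hit.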
The workaround is to downgrade to node 5.11. You can check your system with the excellent test case and instructions here.
As a side effect, after the downgrade the performance on my workload increased by 15% – 30%. Also a bit of a head-scratcher.
So, on the whole, 🤔. You might want to think twice about using node.js for programs that are very resource intensive[1].
However, there were some rather pleasant parts of my experience with the latest-and-greatest node tools that I may write up later.
[1] Yes, I could have rewritten my program to use multiple processes, or a different analysis backend or computational model or … I’m just recording my experience here as a relatively naive user.