JavaScript Garbage Collection: Generations, Hosts, and Leaks

While reviewing JavaScript runtime internals, I kept seeing GC reduced to one sentence: when objects are no longer used, the engine frees them.

That sentence is fine, but it hides the useful part. GC collects objects that are unreachable from roots. Young and old objects are handled differently. The same JavaScript engine also behaves differently once it sits inside a browser or Node, because the host adds its own references and memory pressure.

Figure: Using V8 as the example, GC starts from roots such as the stack, globals, and closures. Browser and Node hosts can keep objects reachable through DOM nodes, listeners, buffers, EventEmitter, and other host resources. generated by gpt-image-2.

GC does not collect what you personally stopped caring about

GC follows the reference graph.

If an object is still reachable from the stack, a global, a closure, a module cache, a DOM listener, an EventEmitter, a timer, or a pending async task, it is alive. Whether your business logic considers it “done” does not matter.

That is why leaks are often annoying. The memory is not floating around ownerless. Something still points to it, sometimes from a surprisingly distant place.
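
To make the reachability rule concrete, here is a minimal sketch of a marking pass over a toy object graph. The `heap` Map, the object ids, and `reachableFrom` are all hypothetical names for illustration; a real engine traces raw pointers, not strings in a Map.

```javascript
// Toy heap: each object id maps to the ids it references.
// Hypothetical model for illustration only.
const heap = new Map([
  ["global", ["cache"]],
  ["cache", ["userA"]],
  ["userA", []],
  ["userB", ["userA"]], // points at a live object, but nothing points at userB
  ["orphan1", ["orphan2"]],
  ["orphan2", ["orphan1"]], // a cycle that no root can reach
]);

// Walk the graph from the roots and collect everything reachable.
function reachableFrom(roots, heap) {
  const marked = new Set();
  const worklist = [...roots];
  while (worklist.length > 0) {
    const id = worklist.pop();
    if (marked.has(id)) continue;
    marked.add(id);
    for (const child of heap.get(id) ?? []) worklist.push(child);
  }
  return marked;
}

const live = reachableFrom(["global"], heap);
const garbage = [...heap.keys()].filter((id) => !live.has(id));
console.log([...live]); // [ 'global', 'cache', 'userA' ]
console.log(garbage); // [ 'userB', 'orphan1', 'orphan2' ]
```

Two details are worth noticing: `userB` is garbage even though it references a live object, and the `orphan1`/`orphan2` cycle is garbage even though each member is referenced. Only paths from a root count.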

Young objects and old objects need different collectors

Generational GC is not a naming scheme. It is a way to use different algorithms for objects with different lifetimes.

The practical observation is simple: most objects die young. Temporary arrays created during rendering, intermediate objects from map and filter, and small objects inside a function are often unreachable by the time the next collection runs.

So new objects start in the young generation. The goal is not to scan the whole heap. The goal is to cheaply handle a lot of short-lived objects.

Figure: Young generation GC uses the Scavenge idea. Live objects are copied from From-space into To-space; dead objects disappear when From-space is discarded. Copying also compacts the survivors. generated by gpt-image-2.

The young generation is a good fit for copying GC. If only 20,000 objects survive out of 1,000,000 allocations, copying those survivors is cheaper than sweeping and compacting the whole region.
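
The copying idea can be sketched with a toy semispace. This is a simplified Cheney-style copy under my own assumptions, not V8's actual Scavenger: objects are plain records, references are array indices, and `scavenge`, `fromSpace`, and `toSpace` are hypothetical names.

```javascript
// Toy semispace: each object is { id, refs }, where refs are indices
// into the same space. Hypothetical model, not V8's data layout.
function scavenge(fromSpace, rootIndices) {
  const toSpace = [];
  const forwarding = new Map(); // from-space index -> to-space index

  // Copy one object if it has not been copied yet; return its new index.
  function copy(fromIndex) {
    if (forwarding.has(fromIndex)) return forwarding.get(fromIndex);
    const toIndex = toSpace.length;
    forwarding.set(fromIndex, toIndex);
    const source = fromSpace[fromIndex];
    // refs still hold from-space indices until the scan below rewrites them.
    toSpace.push({ id: source.id, refs: [...source.refs] });
    return toIndex;
  }

  const newRoots = rootIndices.map((index) => copy(index));

  // Cheney-style scan: toSpace doubles as the work queue.
  for (let scan = 0; scan < toSpace.length; scan++) {
    toSpace[scan].refs = toSpace[scan].refs.map((index) => copy(index));
  }
  return { toSpace, newRoots };
}

const fromSpace = [
  { id: "tmp1", refs: [] }, // 0: dead
  { id: "state", refs: [3] }, // 1: live, referenced by a root
  { id: "tmp2", refs: [0] }, // 2: dead
  { id: "item", refs: [1] }, // 3: live, points back to state
  { id: "tmp3", refs: [] }, // 4: dead
];
const { toSpace, newRoots } = scavenge(fromSpace, [1]);
console.log(toSpace.map((obj) => obj.id)); // [ 'state', 'item' ]
```

The loop only ever touches the two survivors. The three dead objects are never visited; they vanish when `fromSpace` is thrown away, which is why the cost tracks the number of survivors instead of the number of allocations.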

Objects that survive multiple minor GCs are promoted to the old generation. The old generation has a different shape: many objects are long-lived. If 90% of objects are still alive, copying everything would be expensive.

Figure: Old generation GC is closer to Mark-Sweep-Compact. Reachable objects are marked, unreachable slots are returned to a free list, and compaction can move live objects together while updating pointers. generated by gpt-image-2.

The part worth noticing is compaction. After sweeping, memory may contain holes. Some holes can be reused, but fragmentation still hurts locality and may block larger allocations. Compaction moves live objects together and leaves one larger free region. Since objects move, references to them must be updated.
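
A small sketch of that pointer-updating step, assuming a toy heap where `null` marks a swept slot and references are slot indices. The `compact` function and the forwarding table are illustrative names, not engine APIs.

```javascript
// Toy old space after sweeping: null marks a freed slot.
// refs hold slot indices. Hypothetical model for illustration.
function compact(slots, roots) {
  // Pass 1: decide the new address of every live object.
  const forwarding = new Map(); // old index -> new index
  let next = 0;
  slots.forEach((obj, oldIndex) => {
    if (obj !== null) forwarding.set(oldIndex, next++);
  });

  // Pass 2: move each object and rewrite its references through the table.
  const compacted = new Array(slots.length).fill(null);
  slots.forEach((obj, oldIndex) => {
    if (obj === null) return;
    compacted[forwarding.get(oldIndex)] = {
      id: obj.id,
      refs: obj.refs.map((ref) => forwarding.get(ref)),
    };
  });

  return {
    slots: compacted,
    roots: roots.map((ref) => forwarding.get(ref)), // roots move too
    firstFree: next, // everything from here on is one contiguous free region
  };
}

const swept = [
  { id: "a", refs: [4] },
  null,
  null,
  { id: "b", refs: [0] },
  { id: "c", refs: [3] },
  null,
];
const result = compact(swept, [0]);
console.log(result.slots.map((obj) => (obj ? obj.id : null)));
// [ 'a', 'b', 'c', null, null, null ]
console.log(result.slots[0].refs, result.firstFree); // [ 2 ] 3
```

Before compaction the three free slots were scattered; afterwards they form one region starting at `firstFree`. The price is the second pass: every reference, including the roots, has to be rewritten.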

Modern V8 does not simply wait for one long stop-the-world pause. Orinoco, V8's garbage collector project, uses parallel, incremental, and concurrent techniques to split GC work into smaller steps or move it off the main thread. The goal is boring and important: keep the main thread responsive.

Browser leaks and Node leaks look different

JavaScript defines reachability. The host decides where many references come from.

In a browser, I usually look at DOM and interaction lifecycle first:

| Scenario | Typical reference chain |
| --- | --- |
| Detached DOM | A global array or Map still stores a removed DOM node |
| Event listener | A listener captures a large object and is never removed |
| Timer / animation frame | A callback keeps running or stays scheduled |
| State cache | Tabs, routes, or list data grow without a limit |
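
For the event listener row, one possible cleanup pattern is to tie the listener to an `AbortSignal`. The sketch below uses a bare `EventTarget` as a stand-in for `window` or a DOM node so it also runs in Node; `mountWidget`, `bigState`, and the `"resize"` event name are hypothetical.

```javascript
// Hypothetical widget; EventTarget stands in for window or a DOM node.
function mountWidget(target) {
  const controller = new AbortController();
  const bigState = new Array(100_000).fill(0); // captured by the listener below
  let handled = 0;

  target.addEventListener(
    "resize",
    () => {
      if (bigState.length > 0) handled += 1;
    },
    { signal: controller.signal } // aborting the signal removes the listener
  );

  return {
    handledCount: () => handled,
    unmount: () => controller.abort(),
  };
}

const target = new EventTarget();
const widget = mountWidget(target);
target.dispatchEvent(new Event("resize"));
widget.unmount();
target.dispatchEvent(new Event("resize")); // no listener left to run
console.log(widget.handledCount()); // 1
```

After `unmount()`, the target no longer holds the listener, so it stops being a path to `bigState`. The array only becomes collectable once the `widget` handle is dropped as well, because its closures were created in the same scope.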

In Node, the leaks are more often process-lifetime problems:

| Scenario | Typical reference chain |
| --- | --- |
| Map cache | More requests create more keys, with no eviction |
| EventEmitter | Each request adds a listener; eventually Node warns |
| Buffer / ArrayBuffer | The JS wrapper is small, but external memory can be large |
| Pending Promise | The promise never settles, so its captured data stays alive |
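
The EventEmitter row can be sketched like this. The `bus` emitter, the `"config-changed"` event, and both handler functions are hypothetical; the loop stays under Node's default threshold of 10 listeners per event so the example does not trigger the max-listeners warning.

```javascript
const { EventEmitter } = require("node:events");

const bus = new EventEmitter(); // hypothetical process-wide emitter

// Leaky: every request adds a listener that is never removed.
function handleRequestLeaky(requestId) {
  bus.on("config-changed", () => requestId);
}

// Safer: register, then hand back a cleanup function for the request's end.
function handleRequest(requestId) {
  const onChange = () => requestId;
  bus.on("config-changed", onChange);
  return () => bus.off("config-changed", onChange);
}

for (let i = 0; i < 5; i++) handleRequestLeaky(i);
const leakedCount = bus.listenerCount("config-changed");

for (let i = 0; i < 5; i++) {
  const cleanup = handleRequest(i);
  cleanup(); // in real code, call this when the request finishes
}
const afterCleanup = bus.listenerCount("config-changed");

console.log(leakedCount, afterCleanup); // 5 5
```

The count stays at 5 after the second loop: the cleaned-up handlers are gone, while the leaky ones remain attached to `bus` for as long as the process keeps `bus` alive, along with whatever each closure captured.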

The debugging posture changes with the host. In browsers, inspect page lifecycle, detached DOM, and listeners. In Node, inspect process-level caches, connections, listeners, buffers, and heap snapshots.
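
For the Buffer case in Node, a quick first check is to compare the fields of `process.memoryUsage()` before and after an allocation. This is a rough sketch: the 64 MiB size is arbitrary, and exact numbers vary by Node version and by GC timing.

```javascript
const MiB = 1024 * 1024;

const before = process.memoryUsage();
const payload = Buffer.alloc(64 * MiB); // backing store lives outside the V8 heap
const after = process.memoryUsage();

// heapUsed covers JS objects; arrayBuffers covers Buffer and ArrayBuffer memory.
const heapGrowth = after.heapUsed - before.heapUsed;
const bufferGrowth = after.arrayBuffers - before.arrayBuffers;

console.log({
  payloadMiB: payload.length / MiB,
  heapGrowthMiB: (heapGrowth / MiB).toFixed(2),
  bufferGrowthMiB: (bufferGrowth / MiB).toFixed(2),
});
```

If `arrayBuffers` or `external` keeps climbing while `heapUsed` stays flat, a heap snapshot alone may understate the problem, since the large part of the memory sits behind small JS wrappers.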

Map and WeakMap are not interchangeable

WeakMap is useful when you want to attach metadata to an object without extending that object’s lifetime. DOM node metadata is the classic example.

It is not a leak-proof container. Only the key is weak. If the value is strongly referenced somewhere else, it can still stay alive. WeakMap also cannot enumerate keys, because exposing keys would let code observe when GC happens.
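
A minimal sketch of the metadata pattern, using a plain object as a stand-in for a DOM node. The `tag` and `infoFor` helpers are hypothetical names.

```javascript
const metadata = new WeakMap(); // key: an object, value: data about that object

function tag(node, info) {
  metadata.set(node, info);
}

function infoFor(node) {
  return metadata.get(node);
}

let node = { tagName: "div" }; // stands in for a DOM node
tag(node, { clicks: 0 });
infoFor(node).clicks += 1;

// No size, no keys(), no iteration: a WeakMap cannot be enumerated.
console.log(typeof metadata.keys, metadata.size); // undefined undefined

const held = infoFor(node);
node = null; // the entry can now be collected, but `held` keeps the value alive
console.log(held.clicks); // 1
```

When the collector actually drops the entry is not observable from this code, which is the point. The last two lines show the other half of the rule: the weak key does nothing for a value that something else still references.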

I remember the boundary this way:

| Structure | Good fit |
| --- | --- |
| Map | Owned data that needs iteration, metrics, or explicit eviction |
| WeakMap | Object-attached metadata that should not keep the key alive |
| LRU / TTL | Business caches with real lifetime rules |
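
For the LRU row, one common way to bound a cache is to lean on the fact that a `Map` iterates in insertion order. This is a minimal sketch with a hypothetical `LruCache` class; a production cache would usually also want TTLs and size accounting.

```javascript
// Minimal LRU sketch: the first key in the Map is always the least recently used.
class LruCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.map = new Map(); // Map iterates in insertion order
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key); // re-insert to mark as most recently used
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      const oldest = this.map.keys().next().value;
      this.map.delete(oldest); // explicit eviction keeps the cache bounded
    }
  }

  get size() {
    return this.map.size;
  }
}

const cache = new LruCache(2);
cache.set("a", 1);
cache.set("b", 2);
cache.get("a"); // "a" is now the most recently used
cache.set("c", 3); // evicts "b"
console.log(cache.get("b"), cache.get("a"), cache.size); // undefined 1 2
```

The eviction rule is the lifetime rule: entries leave because the cache says so, not because the collector happened to notice anything.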

The stable ideas matter more than engine trivia

You do not need to memorize every V8 implementation detail. Versions change, and the collector keeps evolving.

The durable model is smaller:

  1. GC follows reachability, not business intent.
  2. Generational GC uses object age to choose cheaper algorithms.
  3. Young generation GC copies survivors; old generation GC cares about marking, sweeping, fragmentation, and compaction.
  4. Objects can move, so pointers must be updated.
  5. Hosts add references: browsers through DOM and events, Node through process-lifetime resources and external memory.
  6. Leak debugging starts with the reference chain, not with blaming GC.

The sentence I keep is this: a memory leak usually means GC still has a path to the object.

References