Garbage collection is useful

165
59
surprisetalk
2 weeks ago
dubroy.com

keith_analog
·
2 weeks ago
·
[ - ]

My favorite quote from Alan Perlis: “Symmetry is a complexity-reducing concept (co-routines include subroutines); seek it everywhere.”

summa_tech
·
2 weeks ago
·
[ - ]

Unfortunately, the symmetry provided by higher levels of your software stack is not always reflected by the lower levels. This causes your symmetry to be created as an abstraction. (Subroutines are more efficient to implement than co-routines on real hardware.) Abstractions are leaky and this, in turn, causes the superficial symmetry to merely mask the underlying asymmetry.

If done exceptionally well, the only visible evidence of the leak will be substantial performance disparities between seemingly symmetric features (well done! this is unusual).

If done with the normal level of software design quality, the evidence will show up as quirky behavior, imperfect symmetry, "well this one always works but the other one is not reentrant", etc.

artemonster
·
2 weeks ago
·
[ - ]

Yes! Subroutine call is a) allocation of activation record b) switching context c) returning that combines de-alloc and switch. while coroutines have all of these concepts separated. Why not start with a powerful and general concept and optimize for that one?

aw1621107
·
2 weeks ago
·
[ - ]

> Why not start with a powerful and general concept and optimize for that one?

As with basically everything, there are tradeoffs involved. Sometimes restrictions can be helpful for keeping things understandable, which can in turn make optimizations easier to implement. As a rather hamfisted example: completely unrestricted goto. Very general, debatably powerful, but relatively easy to use in a way that makes comprehension difficult. That same generality can also make it difficult to verify that optimizations don't change observable program semantics compared to something more restricted.

cryptonector
·
2 weeks ago
·
[ - ]

Because you can allocate the activation record on the heap, provide a way to reify continuations, and presto! now you have call/cc (and trivially implemented, no less)!

Price paid: a) you need a GC (ok, whatever, sure, you have one), b) well, the performance hit of having to GC activation records (you don't care; others do), c) whoops, thread-safety got much harder and maybe you can't even have threads (imagine two threads executing in the same activation record at the same time!).

That's a very very steep price.

veunes
·
2 weeks ago
·
[ - ]

It's amazing how often symmetry shows up as both a design aesthetic and a practical tool for managing complexity

kazinator
·
2 weeks ago
·
[ - ]

Reference counting does not trace dead objects. Most of its activity is concerned with maintaining reference counts on live objects.

When a reference count hits zero, that's when refcounting begins to be concerned with a dead object; and that part of its operation corresponds to the sweep activity in garbage collection, not to the tracing of live objects.

It is not a dual to garbage collection concerned with its negative spaces; it's simply a less general (we could legitimately say lesser) garbage collection that doesn't deal with cycles on its own.

moregrist
·
2 weeks ago
·
[ - ]

> it's simply a less general (we could legitimately say lesser) garbage collection that doesn't deal with cycles on its own.

There are different implementation and performance trade-offs associated with both. I’ll focus on the two that are most meaningful to me.

Reference counting can be added as a library to languages that don’t want or can’t have a precise garbage collector. If you work in C++ (or Rust), it’s a very viable way to assure that you have some measure of non-manual clean up while maintaining precise resource control.

Similarly, when performance matters reference counting is essentially deterministic much easier to understand and model.

In a lot of situations, garbage collection is an overall better strategy, but it’s not a strict superset, and not always the right choice.

mbac32768
·
2 weeks ago
·
[ - ]

> Similarly, when performance matters reference counting is essentially deterministic much easier to understand and model.

Is it? What happens if you remove that one last reference to a long chain of objects? You might unexpectedly be doing a ton of freeing and have a long pause. And free itself can be expensive.

pkolaczk
·
2 weeks ago
·
[ - ]

Technically it's not a pause as the pauses introduced by a typical STW tracing GC. It does not stop the other threads. The app can still continue to work during that cleanup.

And it pops up in the profiler immediately with a nice stack trace showing where it rooted from. Then you fix it by e.g. moving cleanup to background to unlock this thread, not cleaning it at all (e.g. if the process dies anyway soon), or just remodel the data structure to not have so many tiny objects, etc.

Essentially this is exactly "way more deterministic and easier to understand and model". No-one said it is free from performance traps.

> And free itself can be expensive.

The total amortized cost of malloc/free is usually much lower than the total cost of tracing; unless you give tracing GC a horrendous amount of additional memory (> 10x of resident live set).

malloc/free are especially efficient when they are used for managing bigger objects. But even with tiny allocations like 8 bytes size (which are rarely kept on heap) I found modern allocators like mimalloc or jemalloc easily outperformed modern GCs of Java (in terms of CPU cycles spent, not wall clock).

MaulingMonkey
·
2 weeks ago
·
[ - ]

> Is it?

Yes.

> What happens if you remove that one last reference to a long chain of objects?

A mass free sometime vaguely in future based on the GC's whims and knobs and tuning, when doing non-refcounting garbage collection.

A mass free there and then, when refcounting. Which might still cause problems - but they are at least deterministic problems. Problems that will show up in ≈any profiler exactly where the last reference was lost, which you can then choose to e.g. ameliorate (at least when you have source access) by choosing a more appropriate allocator. Or deferring cleanup over several frames, if that's what you're into. Or eating the pause for less cache thrashing and higher throughput. Or mixing strategies depending on application context (game (un)loading screen probably prioritizes throughput, streaming mid-gameplay probably prioritizes framerate...)

> You might unexpectedly be doing a ton of freeing and have a long pause. And free itself can be expensive.

Much more rarely than GC pauses cause problems IME.

veunes
·
2 weeks ago
·
[ - ]

It's not just about what it can't do (like handle cycles on its own), but about what it enables in certain contexts

kazinator
·
2 weeks ago
·
[ - ]

> There are different implementation and performance trade-offs associated with both.

They are not the same because there are "semantic tradeoffs".

rayiner
·
2 weeks ago
·
[ - ]

Reference counting does trace dead objects. When the reference count hits zero, you have to recursively trace through all objects referenced by the newly dead object. That’s a trace of dead objects.

kazinator
·
2 weeks ago
·
[ - ]

That can be identified as a finalization-driven sweep. The object whose refcount hits zero is finalized, and the routine for that drops its references to other objects.

Garbage collection also traces dead objects. Or at least some kinds of GC implementations that are not copying. when the marking is done, the heaps are traversed again to identify dead objects, which are put onto a free list. That's a trace of dead objects. (Under copying collection, that is implicit; live objects are moved to a new heap and the vacated space is entirely made available for bump allocation.)

pdubroy
·
2 weeks ago
·
[ - ]

I think you're skipping over some important distinctions here.

In a mark & sweep GC, the mark phase traverses the object graph, visiting only live objects. You need to recursively visit any objects that are not already marked — this is the process known as tracing. The marking time is proportional to the number of live objects.

In the sweep phase, you typically do a linear traversal of memory and reclaim any objects that are not marked. You do not examine pointers inside any objects, so the graph structure is irrelevant. The time is proportional to the total size of memory.

In reference counting, when a refcount hits 0, you need to decrement the refcount of pointed-to objects, and recursively visit any objects whose refcount is now 0. The time is proportional to the number of objects that have just died.

Structurally, decrementing refcounts is *very* similar to tracing. You're right that it's purpose is similar to the sweep phase of a mark & sweep GC, but they aren't algorithmically similar.

kazinator
·
2 weeks ago
·
[ - ]

It seems that reference counting traces both the live and dead space, in the sense that refcounts have to be carefully maintained as the ownership sharing is propagated, and then the finalization of a refcounted object can trigger others dead ones to be identified.

Garbage collection culd also trace both spaces. After we have performed the mark phase of basic mark-and-sweep, we could again go back to the root pointers and traverse the object graph again in the asme way, this time looking for objects which are not marked reachable. Similarly to refcounting, we could do the finalization for each such object and then recursively look for others that are unreachable.

We don't do that for various reasons:

- the belief that it's faster to traverse the heaps, even though in doing so, we it may be necessary to skip numerous free objects.

- the use of a copying algorithm, whereby we moved the live objects to a new heap, so there is no need to trace or otherwise traverse the dead space in detail.

The belief is justified because there usually aren't any free objects when GC is spontaneously triggered. A recursive trace vists the entire graph of objects that were previously live, some of which are now dead. A sweep through the heaps visits all the same ones, plus also some entries marked free (of which there are none in a spontaneous GC pass).

But the recursive traversal involves inefficient dependent pointer loads, poor caching, and the visitation of of the same objecct multiple times. While in the case of dead objects, we can easily tell that we are visiting a dead object a second time (the first time we finalized it and marked it free), we have to account for multiple visits to the reachable ones; we have to paint each one a new color to indicate that it was visited in the second pass.

hayley-patton
·
2 weeks ago
·
[ - ]

That's a linear traversal of the heap, not a trace. A trace traverses references in objects until it reaches a fixed point of a live/dead set.

pizlonator
·
2 weeks ago
·
[ - ]

Yeah!

Also, the closest thing to a graph search in reference counting, as most of us understand it, occurs at a totally different level in the stack.

In a GC: the graph search is hidden from view and implemented in any number of clever ways by the language runtime. Or, if you’re implementing the GC yourself, it sits out of the way of normal operations on your “heap”.

In ref counting: down ref to zero triggers a destructor. That destructor can do anything. It so happens that if the type has reference counted references, then the destructor will, either automatically or manually, invoke downref operations. All of this is part of the language’s normal execution semantics.

To say that these things are alike is a stretch. You could use similar logic to say that anyone doing a graph search is collecting garbage

kazinator
·
2 weeks ago
·
[ - ]

You seem to be referring to a reference counting scheme that is retrofitted into a language, where the application has the responsibility of indicating where, in an application-defined type, the references are to other objects.

That can be true of garbage collection also; it can be bolted on in a way that the application is responsible for providing a handler for traversing objects during marking.

(Of course, it is vastly less likely that a marking traverser would be doing anything other than its job of recursing or otherwise indicating pointers to be traversed, whereas finalization routines will mix the downing of references with other resource clean up.)

Someone
·
2 weeks ago
·
[ - ]

> Reference counting does not trace dead objects.

> […]

> and that part of its operation corresponds to the sweep activity in garbage collection, not to the tracing of live objects.

The sweep activity in garbage collection traces live objects. https://en.wikipedia.org/wiki/Garbage_collection_(computer_s...:

“The overall strategy consists of determining which objects should be garbage collected by tracing which objects are reachable by a chain of references from certain root objects, and considering the rest as garbage and collecting them.”

masklinn
·
2 weeks ago
·
[ - ]

The tracing (or mark) part is, appropriately, the one that traces objects.

Sweeping is just

> considering the rest as garbage and collecting them

1718627440
·
2 weeks ago
·
[ - ]

This problem goes away, if you only ref-count owning pointers, because ownership is non-cyclical.

veunes
·
2 weeks ago
·
[ - ]

Maybe not a full philosophical dual, but still a very practical one in this context

agentultra
·
2 weeks ago
·
[ - ]

I’ve often noted that most projects of a certain size tend to implement some form of garbage collection and allocation.

Perhaps general purpose systems of these sorts aren’t suitable for specialized applications… but I don’t get the “hate” (if you can call it that) which some programmers have for GC.

cogman10
·
2 weeks ago
·
[ - ]

As someone that likes GCs, I understand it.

GCs have a lot of tradeoffs involved. It's impossible to check all boxes and that means that there's going to be something to gripe about.

If you want your GC to be memory efficient you are likely trading off throughput.

If you want your GC to allocate fast and avoid memory fragmentation, you are likely over-provisioning the heap.

If you want to minimize CPU time in GC, you'll likely increase pause time.

If you want to minimize pause time, you'll likely increase CPU time doing a GC.

All these things can make someone ultimately hate a GC.

However, if you want a programming language which deals with complicated memory lifetime (think concurrent datastructures) then a GC is practically paramount. It's a lot harder to correctly implement something like Java's "ConcurrentHashMap" in C++ or Rust.

surajrmal
·
2 weeks ago
·
[ - ]

For specific use cases, it's not only more ergonomic, it can be more performant. Look at Linux kernel use of RCU. The important thing is to maintain control over the allocation strategy and lifetime of data depending on the use case for systems programming. Defaulting to a GC just removes your control which is the problem. GC itself is not problematic.

veunes
·
2 weeks ago
·
[ - ]

At some point, the trade-offs just shift. Manual memory control feels empowering until the complexity piles up, and suddenly GC looks like a smart compromise

RossBencina
·
2 weeks ago
·
[ - ]

> but I don’t get the “hate”

I will put forward some arguments. I do not believe all of them:

GC is for lazy programmers who do not know how to manage memory.

GC takes aspects of memory allocation out of my control. If I control all the things I can get the best performance.

If you care about performance: when your program does not need GC, GC is is pure overhead.

If you care about performance: if you must use GC then you must have a high-performance GC available. So the question is not just GC/no-GC but you have to worry about details of the GC -- it's a leaky abstraction.

If you care about average latency: if you must use GC then you must have a low-latency GC available (i.e. a GC with low and bounded pause times)

If you care about meeting real-time deadlines: if you must use GC then you must have a GC that is guaranteed to meet your timing constraints.

Corollary of the previous three points: GC is not just GC. Someone else's GC is often not the GC you want. Your program requirements can impose strong requirements on the GC algorithm, and you don't always have the necessary control over the GC.

For a particular language, GC algorithm and/or implementation can be a moving target. If the GC developer's goals don't stay aligned with your requirements then you are hosed.

GC results in unpredictable memory allocation performance. Ironing out these performance issues (e.g. by avoiding allocations, pooling objects, etc.) is just as much work as using manual memory allocation, so why bother with GC.

Unless you have allocation patterns or other requirements that really need GC[0], it's easier to just avoid GC.

[0] e.g. some lock-free algorithms depend on the presence of GC

kevincox
·
2 weeks ago
·
[ - ]

My main complaint about GCs is that they only clean up one type of garbage, memory. Maybe you can increase your file descriptor limit and let them deal with some other resources via finalizes. But they aren't tuned for that and there will be some types of resources (locks, temporary files, worker threads, subprocesses, ...) that you need to manage on your own. So now you have at least two forms of resource management in your program.

o11c
·
2 weeks ago
·
[ - ]

> [0] e.g. some lock-free algorithms depend on the presence of GC

If this is what I'm thinking of, it does not require GC, only "defer deallocation until all running threads check in", which is ... not trivial, but certainly feasible to do in RC and much easier than making a decent GC.

cmrdporcupine
·
2 weeks ago
·
[ - ]

Had similar epiphanies some many years ago (ugh, I'm old) when I was playing around writing a garbage collected persistent (in the 'stored on [spinny spinny] disk' sense not the FP sense of the word) programming language / runtime. This was back when it was roughly infeasible to be holding large "worlds" of objects purely in-memory on machines of the style of the time, so intelligently paging objects in and out was imperative. (Aside, I think with the recent doubling... tripling of RAM prices this area of thinking is now again more imperative)...

In any case, if one is doing GC in such a language, a full tracing collector (whether copying or mark & sweep) is madness, as to find live references means walking nearly the entire heap including the portions living in secondary storage, and now you're in a world of pain.

In this case, an intelligent cycle collecting garbage collector in the Bacon style was the answer. You keep in in-memory table of reference counts, and you only trace when you hit cycles. [and hopefully design your language semantics to discourage cycles]

lisper
·
2 weeks ago
·
[ - ]

> you only trace when you hit cycles

How do you tell when you've hit a cycle?

> hopefully design your language semantics to discourage cycles

Why? Cyclical structures can be very useful. For example, it can be very handy in many situations for a contained object to have a back-pointer to its container.

[UPDATE] Two responses have now pointed out that this particular case can be handled with weak pointers. But then I can just point to a general graph as an example where that won't work.

zozbot234
·
2 weeks ago
·
[ - ]

> For example, it can be very handy in many situations for a contained object to have a back-pointer to its container.

That's not a true cycle, it's just a back link for which "weak" reference counts suffice. The containment relation implies that the container "owns" the object, so we don't need to worry about the case where the container might just go away without dropping its contents first.

(Edit: I agree that when dealing with a truly general graph some form of tracing is the best approach. These problem domains are where tracing GC really helps.)

lisper
·
2 weeks ago
·
[ - ]

OK, then I'll pick a different example: a general graph.

adgjlsfhk1
·
2 weeks ago
·
[ - ]

If you put all the nodes into an array and use weakrefs (or indices) for node->node edges you move the node ownership to a single object which will make your garbage collection faster for either algorithm, and will also improve your memory locality.

mwkaufma
·
2 weeks ago
·
[ - ]

"How do you apply algo X to a problem which has been narrowly-tailored and/or under-specified to specifically exclude X" isn't exactly a constructive inquiry.

lisper
·
2 weeks ago
·
[ - ]

A general graph is not exactly "narrowly tailored". Graphs are pretty common.

RossBencina
·
2 weeks ago
·
[ - ]

Graphs are common. But you don't have to represent each edge as a pointer. For example you can represent them using (sparse) adjacency matrices. Or you can represent edges using a pointer in each direction (even for a directed graph) or some other data structure (as is commonly the case in triangle mesh data structures). Lots of options. Most do not require GC.

mwkaufma
·
2 weeks ago
·
[ - ]

No but they are under-specified. OP is specifically working with a document-hierarchy data-structure with a natural ownership/weak-pointer distinction to exploit -- no need to abstract it to a general graph.

lisper
·
2 weeks ago
·
[ - ]

Yes, but then they also said:

> hopefully design your language semantics to discourage cycles

thus expanding the scope of their comment beyond that specific use case.

mwkaufma
·
2 weeks ago
·
[ - ]

Yes, but they said that in the context of a tailored language for persistent/HDD-backed data, where implicitly performance crosses the line into an additional measure of correctness, rather than an orthogonal one. ("to find live references means walking nearly the entire heap including the portions living in secondary storage, and now you're in a world of pain")

So the "increased cognitive overhead" is intrinsic to the problem domain, not an unforced defect of the language design. Overgeneralization in such a case would induce even worse overhead as there'd be no user-level way to fix perf.

zozbot234
·
2 weeks ago
·
[ - ]

You don't always have to walk the entire program heap to find cyclic references, only the fraction of it that may in fact be involved in a cycle. That fraction may or may not be inherently small enough, depending on the kind of problems you'll be working with.

lisper
·
2 weeks ago
·
[ - ]

Fair point.

vlovich123
·
2 weeks ago
·
[ - ]

> For example, it can be very handy in many situations for a contained object to have a back-pointer to its container.

Does it frequently need an owning reference though or would a weak reference suffice? Usually the latter situation suffices.

lisper
·
2 weeks ago
·
[ - ]

A fair point, but then you're still putting the burden on the programmer to figure out where a weak reference is appropriate.

But then I'll just choose a different example, like a general graph.

cmrdporcupine
·
2 weeks ago
·
[ - ]

> How do you tell when you've hit a cycle?

https://pages.cs.wisc.edu/~cymen/misc/interests/Bacon01Concu...

TLDR there are heuristics which can give you a hint. And then you trigger a local trace to see.

> Why?

Because then you incur the cost of a trace -- and potentially paging in from slow-slow disk -- vs a simple atomic refcount.

Even just a localized trace on live objects is a pointer-chasing cache & branch prediction killer.

nxobject
·
2 weeks ago
·
[ - ]

I’m curious - did fragmentation end up being a significant issue, whether in memory or offloaded?

cmrdporcupine
·
2 weeks ago
·
[ - ]

I never got far enough to push that into a production system but I suspect it would have, yes.

I can see a periodic compacting phase could be useful in a system like that.

In the DB world there's good research around similar topics. e.g. LeanStore and Umbra -- Umbra in particular does some nice things with variable sized buffers that I believe are expected to help with fragmentation https://db.in.tum.de/~freitag/papers/p29-neumann-cidr20.pdf

rurban
·
2 weeks ago
·
[ - ]

In real-world copying collectors defragmentation has about 6% time benefits. before - after.

mark-sweep obviously not, as it does not defragment and also stops the world.

raphaelj
·
2 weeks ago
·
[ - ]

Why does he need to manually do the tracing or reference counting of all these nodes?

Instead, he could just use the references he needs in the new tree, delete/override the old tree's root node, and expect the Javascript GC to discard all the nodes that are now referenced.

pdubroy
·
2 weeks ago
·
[ - ]

It's explained in the post:

> Then, my plan was to construct a ProseMirror transaction that would turn the old tree into the new one. To do that, it’s helpful to know which nodes appeared in the old document, but not the new one.

So, it's not actually about reclaiming the memory. It's about taking some action on the nodes that will be reclaimed. It's akin to a destructor/finalizer, but I need that to happen synchronously at a time that I control. JavaScript does now support finalization (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...) but it can't be relied on to actually happen, which makes it useless in this scenario.

foota
·
2 weeks ago
·
[ - ]

Am I missing something?

"This was the answer I needed! Rather than visiting all the live objects, I wanted to only visit the dead ones, and reference counting would let me do that.

So I added a way of maintaining a reference count for all the nodes in the doc. When we produce a new document, we decrement the reference count of the old root node (it will always be 0 afterwards). So we recursively decrement the ref count of its children, and so on. This gives me exactly what I wanted — a way to find all the nodes that were not reused, without having to visit most of the nodes in the doc."

I think there's a bit missing from the description here in the and so on, you would only recurse on a node when it's new refcount is zero, right (and the set of zero refcount nodes produced is exactly the set of dead nodes)?

Isn't this sort of just like having a dirty flag on nodes, and then replacing dirty nodes?

iddan
·
2 weeks ago
·
[ - ]

Which is garbage collection?

foota
·
2 weeks ago
·
[ - ]

Yeah, all I'm saying is I felt the description of how it worked was missing a few pieces.

alimw
·
2 weeks ago
·
[ - ]

Is it possible that by knowing less about garbage collection in Java this person might have arrived at the same solution earlier? After all his initial construction of a tracing garbage collector was wasted effort.

pdubroy
·
2 weeks ago
·
[ - ]

(OP here) It’s possible, but I doubt it. Perhaps the way I wrote it makes it sound like I was thinking about it as a GC problem from the beginning, but I wasn’t. It wasn’t until I started seeing it as as being like GC that (a) I realized that my naive solution was akin to tracing GC, and (b) I came up with the reference counting solution.

weitendorf
·
2 weeks ago
·
[ - ]

If Grug worry about garbage collector, it mean Grug working on problem already solved by Sun Microsystems instead of problem Grug paid to solve.

Except, if someone want to pay Grug to work on garbage collector for javascript framework, Grug put in position where Grug learns what Grug don't already know about it because it now Grug's job. So Grug understand why Sun solve problem, why problem hard, tell other Grug about isomorphisms between spanning trees. Now other Grug know more about what other Grug don't know, why other Grug not make same mistake of knowing better than Sun Microsystems either.

·
2 weeks ago
·
[ - ]

veunes
·
2 weeks ago
·
[ - ]

It's always fun when a concept you learned years ago suddenly snaps into place for a totally different problem domain

ClimatePaywall
·
2 weeks ago
·
[ - ]

[dead]