New for Ruby 3.4: Modular Garbage Collection and MMTk

Introduction

Ruby garbage collection has always been a frequently discussed topic in the community, with recommendations on tuning strategies for better performance, and even entire sets of patches designed to optimize it, going back at least as far as 2008 and Ruby 1.8.

I spoke at RubyKaigi in 2023 about the history of the Ruby garbage collector, and its evolution since the origins of the language. In that talk I mentioned some of the changes that we were working on at Shopify, to make the Ruby GC easier to modify, and ensure that we’re able to evolve it to more closely follow the cutting edge of memory management research.

I’m pleased to say that we’ve made significant progress on these goals for Ruby 3.4.

For the first time, Ruby will ship with the functionality to safely replace its garbage collector with an alternative implementation at runtime, as well as a completely new implementation that leverages this ability to replace the default Mark & Sweep collector without requiring Ruby to be re-compiled.

I’ll use the rest of this blog post to talk about these new concepts: The way that the Ruby VM now interacts with its various GC libraries, and the new GC implementation itself, which is based around the Memory Management Toolkit (MMTk), a general purpose garbage collection and memory management library.

Modular GC

The work that we’ve done here is to introduce a standardized interface on top of Ruby’s existing garbage collection code, and extract the implementation code into separate compilation units. This involved identifying which parts of gc.c were required to be public to the rest of the VM, and which parts were implementation details of how the current GC is built.

Once we’d successfully teased apart the existing GC into separate integration and implementation layers, we worked on being able to override the implementation, without re-compiling or re-linking.

We settled on an approach that uses dlopen to load a shared library at runtime, and create a map of function pointers to implementations provided by the shared library, falling back to statically compiled default GC implementations when no shared library is configured.
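The function-table idea can be sketched in a few lines of Ruby. This is a conceptual illustration only: the real mechanism is C code built around dlopen, and every name below (DEFAULT_GC, build_gc_table, the :alloc, :mark and :start keys) is hypothetical. The choice to fail when a library omits a function is also an assumption of this sketch.

```ruby
# Conceptual sketch only. Ruby's real implementation is C code that uses dlopen;
# all names here are hypothetical and chosen for illustration.

# Stand-ins for the statically compiled default GC functions.
DEFAULT_GC = {
  alloc: -> { "default: alloc" },
  mark:  -> { "default: mark" },
  start: -> { "default: start" },
}.freeze

# Build the table of functions the VM would call. `library` stands in for the
# symbols resolved from a shared library; nil means no library was configured.
def build_gc_table(library = nil)
  # No library configured: fall back to the built-in implementations.
  return DEFAULT_GC if library.nil?

  DEFAULT_GC.each_with_object({}) do |(name, _default), table|
    # This sketch requires the library to provide every function.
    table[name] = library.fetch(name) do
      raise ArgumentError, "GC library is missing #{name}"
    end
  end.freeze
end

# A fake "library" that overrides every function.
custom_library = DEFAULT_GC.keys.to_h { |name| [name, -> { "custom: #{name}" }] }

puts build_gc_table[:start].call                 # prints "default: start"
puts build_gc_table(custom_library)[:start].call # prints "custom: start"
```

The VM only ever calls through the table, so it does not need to know which implementation sits behind each entry.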

We call this feature “Modular GC”, and it can be used as follows:

Configure the Modular GC directory at Ruby build time
```
./configure --with-modular-gc=$HOME/ruby-mod-gc
```
The whole feature is gated behind a configure flag. Without this flag at build time, the existing default Ruby GC will be compiled statically into the Ruby binaries exactly as it has always been, and none of the Modular GC code will be included.

Using this configure flag also requires the passing of a directory path. This path is the only path on the system where Ruby will load shared GC libraries from. We’ll see how it does this later on in this post. This is intended as a risk mitigation, to tightly control the places on the filesystem where Ruby will load code from.

Once Ruby has been configured, build in the usual way with make -j and make install.

Build a GC library

If you do nothing else, Ruby will function with its default GC as normal. The only impact enabling this configure flag will have on Ruby is to enable a small check at interpreter boot time to look for the presence of a GC library.

A GC library is a simple shared library that provides implementations for each of the defined GC API functions (these are the functions declared with GC_IMPL_FN inside gc/gc_impl.h).

Ruby’s default GC is now constructed in a way that it can be optionally built and loaded at runtime using this mechanism.

This can allow us to make changes to the existing GC implementation and override a running process, in order to facilitate easy multi-variant testing on GC variables, or to test configuration changes or fixes without the overhead of having to rebuild and redeploy a new Ruby binary.

Ruby provides a make target for building GC libraries, and the default GC can be built as a library as follows:
```
make modular-gc MODULAR_GC=default
```
This step will error if Ruby has not been configured --with-modular-gc. If this target succeeds, then a shared library will have been built and placed in the Modular GC directory that Ruby was configured with.

In order to be built using this mechanism a GC library must be defined in a sub-directory of the gc directory in the Ruby source tree. These directories must contain an extconf.rb, as well as the source code for the GC library being built. In order to build the GC library the directory name can be used for the MODULAR_GC variable.

Valid GC libraries as distributed with Ruby are default and mmtk.

Load a GC library

GC libraries must follow a certain naming convention in order to be loadable. That convention is librubygc. followed by the name of the GC, followed by a dot and whatever extension the current platform uses for shared objects, so .so on ELF-based platforms and .bundle on Darwin.

So, assuming the default GC is built on Linux using the make target in the previous step, the file name will be librubygc.default.so.

This can be loaded by name, using the RUBY_GC_LIBRARY environment variable like this:
```
RUBY_GC_LIBRARY=default ruby -e "GC.start"
```
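The naming convention described above can be written down as a small Ruby helper. Both method names are hypothetical, and the sketch assumes that the platform extension matches RbConfig's DLEXT value, which is consistent with the .so and .bundle extensions mentioned earlier but is not confirmed here.

```ruby
require "rbconfig"

# Hypothetical helper: the file name expected for a GC library with the given
# name. Assumes the extension matches RbConfig's DLEXT ("so" on Linux,
# "bundle" on macOS).
def gc_library_filename(gc_name, ext: RbConfig::CONFIG["DLEXT"])
  "librubygc.#{gc_name}.#{ext}"
end

# Hypothetical helper: the full path inside the directory that was passed to
# --with-modular-gc at configure time.
def gc_library_path(modular_gc_dir, gc_name, ext: RbConfig::CONFIG["DLEXT"])
  File.join(modular_gc_dir, gc_library_filename(gc_name, ext: ext))
end

puts gc_library_filename("default", ext: "so") # prints "librubygc.default.so"
puts gc_library_path("/opt/ruby-mod-gc", "mmtk", ext: "bundle")
```

A helper like this is handy for checking that a library file actually exists before setting RUBY_GC_LIBRARY.
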
Interrogating the running GC

In order to make using this system a little more intuitive we’ve also introduced some features that make it easier to work with multiple GCs.

The most important of these is the ability to quickly interrogate whether the current Ruby supports modular GC and which GC library is loaded.
- Using RUBY_DESCRIPTION and ruby -v
  
  We’ve added GC information into RUBY_DESCRIPTION, which is accessible via the internal constant, or via ruby -v.
  
  If Ruby has been compiled --with-modular-gc then this will now be visible as +GC.
```
❯ ruby -v
ruby 3.4.0dev (2024-12-06T12:47:35Z master 78614ee900) +PRISM +GC [arm64-darwin24]
```
  And when a named GC library has been loaded, the name of the GC will be displayed inside square brackets:
```
❯ RUBY_GC_LIBRARY=mmtk ruby -v
ruby 3.4.0dev (2024-12-10T09:29:38Z master af9a904f38) +PRISM +GC[mmtk] [arm64-darwin24]
```
- Using GC.config
  
  Ruby 3.4 also introduces a new config method to the GC module. This is designed to provide a general purpose way for GC implementations to expose configuration parameters in an isolated way (check out the new rgengc_allow_full_mark feature for an example of how this is currently being used).
  
  In addition to providing GC implementation specific keys, we also provide a read-only :implementation key, that will return the name of the currently running GC.
```
irb(main):001> GC.config[:implementation]
=> "default"
```
  It’s worth mentioning that this key is present regardless of whether Ruby is compiled with modular GC support or not, so it isn’t possible, using this method alone, to differentiate between: Ruby without modular GC support, Ruby with modular GC support but no dynamic GC library loaded, or Ruby with modular GC support enabled and a dynamic GC library loaded with the name “default”.
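Since GC.config[:implementation] alone cannot tell these cases apart, one option is to inspect RUBY_DESCRIPTION as well. The helper below is a hypothetical sketch that assumes the +GC and +GC[name] markers keep the format shown in the ruby -v output above.

```ruby
# Hypothetical helper: classify a RUBY_DESCRIPTION string by its GC marker.
# Returns :no_modular_gc, :modular_gc_no_library, or [:library_loaded, name].
def modular_gc_state(description = RUBY_DESCRIPTION)
  # "+GC" optionally followed directly by "[name]", then whitespace or the end.
  match = description.match(/\+GC(?:\[([^\]]+)\])?(?=\s|\z)/)
  return :no_modular_gc if match.nil?
  return :modular_gc_no_library if match[1].nil?

  [:library_loaded, match[1]]
end

with_library = "ruby 3.4.0dev (2024-12-10T09:29:38Z master af9a904f38) +PRISM +GC[mmtk] [arm64-darwin24]"
p modular_gc_state(with_library) # prints [:library_loaded, "mmtk"]
```

Calling modular_gc_state with no argument classifies the Ruby that is currently running.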

So that’s the new Modular GC feature and a little about how to use it. Now let’s talk about the new GC implementation that’ll ship with Ruby.

The Memory Management Toolkit (MMTk)

MMTk is a language agnostic library that provides a sophisticated and powerful memory management framework.

It’s a very exciting development in the world of GCs and memory management. Similar to a Lego set, it provides common building blocks that can be combined together in novel ways to produce complete GC strategies.

It ships with some common pre-defined strategies (e.g. MarkSweep, SemiSpace, Immix & LXR), which allows language developers to have a library of pre-built GC algorithms available to them out of the box.

Language implementors can take advantage of all of these features by building an interface to MMTk within their language, instead of having to build their own GC from scratch.

Traditionally, managed language implementations have often evolved in lockstep with their garbage collectors. And language implementors often choose easy-to-implement, well-known strategies like reference counting (Perl, Python, PHP), or conservative tracing GC (Ruby) in order to get up and running quickly during the early stages of their language implementation.

Both conservative tracing GC and reference counting are quite old ideas, dating from the 1950s and the 1970s respectively, and in the decades since, they have been superseded by much higher performing, and often concurrent, algorithms.

Unfortunately these algorithms are significantly more complex, and require more engineering effort, which is part of the reason they’re often passed over during the initial stages of a language implementation’s lifetime.

A paper titled “Deferred Gratification: Engineering for High Performance Garbage Collection from the Get Go”, published in 2011, analyzes the situation that managed languages find themselves in and what the options are to move beyond it. In summary, these initial memory management approaches often end up becoming a performance bottleneck, and changing them is a significant undertaking.

With this context in mind it’s easy to see the benefit of a tool such as MMTk. Whilst the hard work of separating the VM and the GC still needs to be done, if we integrate MMTk, then instead of needing to implement our own high-performance GC solution, we’ll gain the ability to use all of the implementations inside MMTk, as well as any future GC algorithms, almost for free.

Ruby’s MMTk based GC library

It’s these benefits that convinced us to make our first¹ “Modular GC” implementation in Ruby an integration with MMTk.

The code for the MMTk GC Library lives in the ruby/mmtk repository on GitHub, and is synchronized directly into the ruby/ruby source tree in the gc/mmtk directory. The implementation itself consists of two parts:

MMTk integration

The first part is the main integration with the core MMTk library itself. This part of the code is written in Rust (the same language as MMTk itself), and contains all the code required to teach MMTk how to garbage collect Ruby objects. For instance, we need to teach MMTk about write barrier protected objects, which cannot be moved, and about how Ruby implements weak references.

This Rust crate uses cbindgen to expose the MMTk specific functions it needs, which creates a C header file mmtk.h.

Ruby GC integration

We then include the mmtk.h header file into mmtk.c. And it’s this module that implements all the endpoints required as part of the Modular GC API described in the first part of this post.

This part of the implementation also includes the mkmf.rb configuration, which co-ordinates the building of both the Rust crate and then the dynamic library from the C sources.

Both parts are built and linked together into the final GC library using the modular-gc make target as follows:

```
make modular-gc MODULAR_GC=mmtk
```

and loaded using RUBY_GC_LIBRARY as expected:

```
❯ RUBY_GC_LIBRARY=mmtk ruby -e 'p "Hello, from #{GC.config[:implementation]}"'
mmtk::memory_manager] Initialized MMTk with MarkSweep (DynamicHeapSize(1048576, 30923764531))
mmtk::util::heap::gc_trigger] [POLL] ms: Triggering collection (272/256 pages)
mmtk::scheduler::scheduler] End of GC (240/287 pages, took 19 ms)
mmtk::util::heap::gc_trigger] [POLL] ms: Triggering collection (290/287 pages)
mmtk::scheduler::scheduler] End of GC (272/319 pages, took 18 ms)
mmtk::util::heap::gc_trigger] [POLL] ms: Triggering collection (322/319 pages)
mmtk::scheduler::scheduler] End of GC (306/356 pages, took 17 ms)
mmtk::util::heap::gc_trigger] [POLL] ms: Triggering collection (372/356 pages)
mmtk::scheduler::scheduler] End of GC (356/394 pages, took 19 ms)
mmtk::util::heap::gc_trigger] [POLL] ms: Triggering collection (404/394 pages)
mmtk::scheduler::scheduler] End of GC (388/434 pages, took 21 ms)
mmtk::util::heap::gc_trigger] [POLL] ms: Triggering collection (438/434 pages)
mmtk::scheduler::scheduler] End of GC (422/466 pages, took 22 ms)
"Hello, from mmtk"
```

Here we can see MMTk successfully running a collection using the MarkSweep GC plan, with a dynamically sized heap using a default minimum and maximum size.
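For a quick summary of output like this, the "End of GC" lines can be tallied with a few lines of Ruby. This is a minimal sketch: summarize_gc_log is a hypothetical helper, and it assumes the log format stays exactly as shown in the sample above.

```ruby
# Matches the "End of GC" lines from the sample output above.
END_OF_GC = %r{End of GC \((\d+)/(\d+) pages, took (\d+) ms\)}

# Hypothetical helper: count collections and add up their reported durations.
def summarize_gc_log(log)
  pauses = log.scan(END_OF_GC).map { |_used, _total, ms| ms.to_i }
  { collections: pauses.size, total_ms: pauses.sum, max_ms: pauses.max }
end

log = <<~LOG
  mmtk::scheduler::scheduler] End of GC (240/287 pages, took 19 ms)
  mmtk::scheduler::scheduler] End of GC (272/319 pages, took 18 ms)
  mmtk::scheduler::scheduler] End of GC (306/356 pages, took 17 ms)
  mmtk::scheduler::scheduler] End of GC (356/394 pages, took 19 ms)
  mmtk::scheduler::scheduler] End of GC (388/434 pages, took 21 ms)
  mmtk::scheduler::scheduler] End of GC (422/466 pages, took 22 ms)
LOG

summary = summarize_gc_log(log)
puts "#{summary[:collections]} collections, #{summary[:total_ms]} ms total, longest #{summary[:max_ms]} ms"
# prints "6 collections, 116 ms total, longest 22 ms"
```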

All of these MMTk related parameters can be configured using environment variables:

- MMTK_HEAP_MODE. The default heap mode is Dynamic, which allows the heap to grow between fixed size bounds. We’ve also implemented a Fixed mode heap, which pre-allocates a fixed size heap and does not allow that heap to grow or shrink.
- MMTK_HEAP_MIN and MMTK_HEAP_MAX. These values control the growth bounds of a Dynamic heap. If the heap mode is Fixed, then MMTK_HEAP_MAX can be used to change the fixed heap size.
- MMTK_PLAN. MMTk supports many different GC plans, from simple NoGC collectors that allocate until the process runs out of memory and crashes, to very sophisticated, modern multi-phase algorithms like LXR.
  
  The default is MarkSweep, a simple tracing collector with a single memory space. Currently this is the only supported plan, although we have done some initial testing with Immix and we’re planning on full support for the entire MMTk API in the future, to allow all MMTk supported GC plans to be used.
- MMTK_THREADS. MMTk runs many GC threads in parallel. This environment variable controls how many GC threads will be allowed to run. The MMTk GC threads are not run concurrently with the Ruby VM at this stage: we must still lock the Ruby mutator threads and “stop the world” while we run GC. True concurrent collection is something that we’d like to investigate in the future.
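When experimenting with these settings from a script, it can help to build the environment in one place. The sketch below uses the variable names listed above; the helper names (mmtk_env, run_with_mmtk) are hypothetical, and the exact value spellings each variable accepts are an assumption rather than something documented here.

```ruby
require "rbconfig"

# Variable names come from the list above. Which value spellings each one
# accepts is an assumption; this sketch passes values through unchanged.
MMTK_SETTINGS = {
  plan:      "MMTK_PLAN",
  heap_mode: "MMTK_HEAP_MODE",
  heap_min:  "MMTK_HEAP_MIN",
  heap_max:  "MMTK_HEAP_MAX",
  threads:   "MMTK_THREADS",
}.freeze

# Hypothetical helper: turn keyword settings into an environment hash that
# also selects the mmtk GC library.
def mmtk_env(**settings)
  unknown = settings.keys - MMTK_SETTINGS.keys
  raise ArgumentError, "unknown settings: #{unknown.join(', ')}" unless unknown.empty?

  env = { "RUBY_GC_LIBRARY" => "mmtk" }
  settings.each { |key, value| env[MMTK_SETTINGS.fetch(key)] = value.to_s }
  env
end

# Hypothetical helper: run a one-line script in a child Ruby with that
# environment. Only useful with a Ruby configured --with-modular-gc.
def run_with_mmtk(script, **settings)
  system(mmtk_env(**settings), RbConfig.ruby, "-e", script)
end

env = mmtk_env(plan: "MarkSweep", threads: 4)
puts env.map { |name, value| "#{name}=#{value}" }.join(" ")
# prints "RUBY_GC_LIBRARY=mmtk MMTK_PLAN=MarkSweep MMTK_THREADS=4"
```

Rejecting unknown keys up front catches typos that would otherwise be silently ignored as unused environment variables.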

What’s next

It’s important to mention that this work is very experimental at the moment. We haven’t tested this on production workloads, the performance is lagging significantly behind Ruby’s existing GC, and everything about it is subject to frequent and drastic change.

As such we have made no plans about how to distribute this feature. It isn’t going to be showing up as an option to any of the Ruby version managers anytime soon, and nor should it. Testing out this feature will require Ruby to be built manually from a 3.4.0 source tarball or the git repository.

In the short term our priority is to improve the performance of the Modular GC system in general and MMTk more specifically. We plan to start testing this against some of our production workloads and internal systems. And we want to start working on support for some of the more advanced MMTk provided algorithms like Immix and LXR.

With all that being said, I think this represents a really important step for the evolution of memory management in Ruby, one that will allow us to explore the powerful features that are possible with a fully flexible and generic garbage collection abstraction. I’m excited to continue improving this feature through the next development cycle and beyond.

Lastly, this work wouldn’t have been possible without the herculean effort of a number of folks in both the Ruby and the MMTk core communities. Many thanks to Peter Zhu, Aaron Patterson & Eileen Uchitelle on the Ruby side, and Kunshan Wang and Steve Blackburn on the MMTk side, for working tirelessly to achieve what we’ve achieved.

¹ Actually our second. The first was an allocation-only GC library, based around Java’s Epsilon collector, but it was never merged.