How Ruby Executes JIT Code: The Hidden Mechanics Behind the Magic
Ever since YJIT’s introduction, I’ve felt simultaneously close to and distant from Ruby’s JIT compiler. I know how to enable it in my Ruby programs. I know it makes my Ruby programs run faster by compiling parts of them into machine code. But my understanding of YJIT, or of JIT compilers in Ruby in general, seemed to end there.
A few months ago, my colleague Max Bernstein wrote ZJIT has been merged into Ruby, explaining how ZJIT compiles Ruby’s bytecode to HIR, to LIR, and then to native code. It sheds some light on how JIT compilers compile our programs, and it’s part of why I started contributing to ZJIT in July. But I still had many unanswered questions until I dug into the source code and asked the JIT experts around me (Max, Kokubun, and Alan).
So I want to use this post to answer some questions (and fill some mental gaps) you might also have about JIT compilers for Ruby:
- Where does JIT-compiled code actually live?
- How does Ruby actually execute JIT code?
- How does Ruby decide what to compile?
- Why does JIT-compiled code fall back to the interpreter?
While we use ZJIT (Ruby’s experimental next-generation JIT) as our reference, these concepts apply equally to YJIT.
Where JIT-Compiled Code Actually Lives
Ruby ISEQs and YARV Bytecode
When Ruby loads your code, it compiles each method into an Instruction Sequence (ISEQ) - a data structure containing YARV (CRuby’s virtual machine) bytecode instructions.
(If you’re not familiar with YARV instructions or want to learn more, Kevin Newton wrote a great blog series introducing them.)
Let’s start with a simple example:
def foo
  bar
end

def bar
  42
end
Running ruby --dump=insns example.rb shows us the bytecode:
== disasm: #<ISeq:foo@example.rb:1 (1,0)-(3,3)>
0000 putself ( 2)[LiCa]
0001 opt_send_without_block <calldata!mid:bar, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0003 leave [Re]
== disasm: #<ISeq:bar@example.rb:5 (5,0)-(7,3)>
0000 putobject 42 ( 6)[LiCa]
0002 leave [Re]
JIT-Compiled Code Lives on ISEQ Too
I assumed JIT-compiled code would replace bytecode—after all, native code is faster. But Ruby keeps both, for good reason.
Here’s what an ISEQ looks like initially:
ISEQ (foo method)
├── body
│ ├── bytecode: [putself, opt_send_without_block, leave]
│ ├── jit_entry: NULL // No JIT code yet
│ ├── jit_entry_calls: 0 // Call counter
After the method is called repeatedly and gets JIT-compiled:
ISEQ (foo method)
├── body
│ ├── bytecode: [putself, opt_send_without_block, leave] // Still here!
│ ├── jit_entry: 0x7f8b2c001000 // Pointer to native machine code
│ ├── jit_entry_calls: 35 // Reached compilation threshold
The jit_entry field is the gateway to native code. When it’s NULL, Ruby interprets bytecode. When it points to compiled code, Ruby can jump directly to machine instructions.
But the bytecode never goes away - Ruby needs it for de-optimization, which we will explore a bit later.
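Under the hood, both the bytecode and the JIT fields live on the ISEQ’s body. Here’s a simplified sketch of the relevant fields, modeled on struct rb_iseq_constant_body in CRuby’s vm_core.h (the real struct has many more members):
// Simplified sketch; the real struct rb_iseq_constant_body in
// CRuby's vm_core.h has many more members than shown here.
struct rb_iseq_constant_body {
    const VALUE *iseq_encoded;     // the YARV bytecode (never thrown away)
    unsigned int iseq_size;        // its length

    // NULL until the method is JIT-compiled, then a pointer to native code.
    rb_jit_func_t jit_entry;
    // Incremented on each call; drives the thresholds we'll see below.
    unsigned long jit_entry_calls;
};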
The Execution Switch: From Bytecode to Native Code
This is simpler than I expected. Since each ISEQ points to its JIT-compiled code when it’s available, Ruby simply checks the jit_entry field on every ISEQ it’s about to execute. When there’s no JIT code (jit_entry is NULL), it continues interpreting; otherwise, it runs the compiled native code.
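Here’s a minimal sketch of that check, modeled on the jit_exec() function in CRuby’s vm.c (simplified; the call-counting logic shown in the next section is omitted):
// Simplified from jit_exec() in CRuby's vm.c: if the ISEQ has JIT
// code, call into it; otherwise tell the caller to interpret.
static inline VALUE
jit_exec(rb_execution_context_t *ec)
{
    const rb_iseq_t *iseq = ec->cfp->iseq;
    rb_jit_func_t func = ISEQ_BODY(iseq)->jit_entry;

    if (func == NULL) {
        return Qundef;  // no native code yet: run the bytecode instead
    }
    return func(ec, ec->cfp);  // jump straight into machine code
}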
How Ruby Decides What to Compile
Ruby doesn’t compile methods randomly or all at once. Instead, methods earn compilation through repeated use. In ZJIT, this happens in two phases:
if (body->jit_entry == NULL && rb_zjit_enabled_p) {
    body->jit_entry_calls++;

    // Phase 1: Profile the method
    if (body->jit_entry_calls == rb_zjit_profile_threshold) {
        rb_zjit_profile_enable(iseq);
    }

    // Phase 2: Compile to native code
    if (body->jit_entry_calls == rb_zjit_call_threshold) {
        rb_zjit_compile_iseq(iseq, false);
        // After this, jit_entry points to machine code
    }
}
As of now, ZJIT’s default profile threshold is 25 and its compile threshold is 30 (both may change in the future). So a method’s lifecycle may look like this:
Calls: 0 ─────────── 25 ────────── 30 ─────────────────►
│ │ │
Mode: └─ Interpret ──┴── Profile ──┴─ Native Code (JIT compiled)
This is why we need to “warm up” the program before we get peak performance with JIT.
When JIT Code Gives Up: Understanding De-optimization
JIT code makes assumptions to run fast. When those assumptions break, Ruby must “de-optimize” - return control to the interpreter. It’s a safety mechanism that ensures your code always produces correct results.
Consider this method:
def add(a, b)
  a + b
end
which would generate these instructions:
== disasm: #<ISeq:add@test.rb:1 (1,0)-(3,3)>
0000 getlocal_WC_0 a@0 ( 2)[LiCa]
0002 getlocal_WC_0 b@1
0004 opt_plus <calldata!mid:+, argc:1, ARGS_SIMPLE>[CcCr]
0006 leave ( 3)[Re]
Because Ruby doesn’t know beforehand what opt_plus will be called with, the underlying C function vm_opt_plus needs to handle various classes (like String, Array, Float, Integer, etc.) that can respond to +.
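Here’s an abridged sketch of that function, adapted from vm_opt_plus() in CRuby’s vm_insnhelper.c (several branches elided):
// Adapted from vm_opt_plus() in CRuby's vm_insnhelper.c. This generic
// fast path has to re-check the operand classes on every single call.
static VALUE
vm_opt_plus(VALUE recv, VALUE obj)
{
    if (FIXNUM_2_P(recv, obj) &&
        BASIC_OP_UNREDEFINED_P(BOP_PLUS, INTEGER_REDEFINED_OP_FLAG)) {
        return rb_fix_plus_fix(recv, obj);  // Integer + Integer
    }
    else if (FLONUM_2_P(recv, obj) &&
             BASIC_OP_UNREDEFINED_P(BOP_PLUS, FLOAT_REDEFINED_OP_FLAG)) {
        return DBL2NUM(RFLOAT_VALUE(recv) + RFLOAT_VALUE(obj));
    }
    // ... similar branches for heap Float, String, and Array receivers ...
    else {
        return Qundef;  // fall back to a full method dispatch
    }
}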
But if profiling shows that add is always called with integers (Fixnums), JIT compilers can generate optimized code that handles only integer addition, along with “guards” that check the assumption still holds.
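Here’s what that guarded fast path logically looks like, rendered as C for readability. To be clear, the real artifact is machine code emitted by ZJIT’s backend; jitted_add and side_exit are hypothetical names, not actual ZJIT symbols:
// Illustrative only: jitted_add and side_exit are hypothetical names,
// and overflow handling is omitted for brevity.
VALUE jitted_add(VALUE a, VALUE b)
{
    // Guards: verify the profiled assumption that both operands are Fixnums.
    if (!FIXNUM_P(a) || !FIXNUM_P(b)) {
        return side_exit();  // assumption broken: deoptimize
    }
    // Fast path: untag, add, retag. No class checks, no method lookup.
    return LONG2FIX(FIX2LONG(a) + FIX2LONG(b));
}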
When the assumption is broken, like when add(1.5, 2) is called:
- The guard check fails
- JIT code jumps to a “side exit”
- The side exit restores interpreter state (stack, instruction pointer, etc.)
- Control returns to the interpreter
- The interpreter executes opt_plus and calls the vm_opt_plus function
Other triggers for falling back include:
- TracePoint activation - TracePoint needs bytecode execution to properly emit events (more details below)
- Redefined core methods - someone changed what + means on Integer
- Ractor usage - running multiple Ractors changes some YARV instructions’ behaviour, so the compiled code could behave differently from the interpreter in that situation
These assumption checks, or patch points as we call them in ZJIT, make sure your program performs correctly when any of the assumptions change.
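For instance, redefining + on Integer flips a VM-wide flag and notifies the JIT, which invalidates any compiled code that relied on the original definition. Here’s a rough sketch of that flow, modeled on the basic-operator redefinition handling in CRuby’s vm.c (the rb_zjit_bop_redefined hook name is my assumption for illustration):
// Rough sketch modeled on CRuby's vm.c. When a core ("basic operator")
// method like Integer#+ is redefined, the VM records the redefinition
// and notifies the JITs. rb_zjit_bop_redefined is an assumed name here.
static void
bop_redefined(int flag, enum ruby_basic_operators bop)
{
    // Interpreter fast paths (like vm_opt_plus above) stop taking their shortcuts.
    ruby_vm_redefined_flag[bop] |= flag;
    // The JIT patches the affected code so it side-exits instead of
    // using the now-stale fast path.
    rb_zjit_bop_redefined(flag, bop);
}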
Answering Some Additional Questions
Why does enabling TracePoint slow everything down?
(TracePoint is a Ruby class that can be used to register callbacks on specific Ruby execution events. It’s commonly used in debugging/development tools.)
Most of TracePoint’s events are triggered by corresponding YARV bytecode. When TracePoint is activated, instructions in ISEQs are replaced with their trace_* counterparts; for example, opt_plus is replaced with trace_opt_plus.
If Ruby only executes the compiled machine code, then those events wouldn’t be triggered correctly. Therefore, when ZJIT and YJIT compilers detect TracePoint’s activation, they immediately throw away the optimized code to force Ruby to interpret YARV instructions instead.
Why doesn’t Ruby just compile everything?
Many methods are called rarely. Compiling them would waste memory and compilation time for no performance benefit. Also, compiling methods without profiling would mean that JIT compilers either make wrong assumptions that get invalidated pretty quickly, or don’t make specific enough assumptions that miss further optimization opportunities.
Final Notes
I hope this post helped you understand JIT compilers, a now essential part of Ruby, a little bit more.
If you want to learn more about Ruby’s new JIT compiler, ZJIT, I highly recommend giving ZJIT has been merged into Ruby a read. And if you want to learn more about Ruby’s YARV instructions, Kevin Newton’s Advent of YARV series is the best resource.