On the Ruby Developer Experience team here at Shopify, our goal is to deliver a state-of-the-art development experience to Rubyists both at Shopify and in the broader community. This means keeping our tools up-to-date with the most recent versions of Ruby.

One such tool is Sorbet, which is an open source, gradual type checker for Ruby. Sorbet has become a key part of the development experience at Shopify, making it faster and safer for developers to collaborate on a monolith with tens of thousands of Ruby files.

When we upgraded the Shopify monolith to use Ruby 3.2 at the beginning of 2023, we knew it would require us to make some changes to Sorbet. While upgrading Ruby didn’t break type checking, Sorbet did not yet support all of the features introduced in this new Ruby version. If Shopify developers were going to take advantage of the features that came with Ruby 3.2, like the Data class and anonymous argument forwarding, we would need to make sure Sorbet could accurately type check these usages and report any type checking errors.

This post is a summary of the work we did to make Sorbet compatible with Ruby 3.2.

The Sorbet pipeline

Sorbet statically type checks our code in a process called the “pipeline”. The pipeline is made up of different stages, each of which modifies some internal representation of the code before passing it along to the next stage. The ultimate goal of the pipeline is to convert Ruby code into a representation that enables Sorbet to reason about types and identify type checking errors.

When updating Sorbet to support a new feature in Ruby, it is important to understand which part of the pipeline needs to be changed. When explaining each of the changes my teammates and I implemented, I’ll also give an explanation of the relevant parts of the Sorbet pipeline so you can understand why we approached these problems in the ways we did.

I hope this overview of Sorbet’s architecture might also encourage you to contribute to Sorbet in the future!

Anonymous argument forwarding and the Sorbet parser

The largest syntax change in Ruby 3.2 was the introduction of anonymous argument forwarding. This means that in Ruby, it is now possible for a method that takes anonymous arguments or keyword arguments to pass them along to another method, as in this example:

def foo(*)
  bar(*)
end

def baz(**)
  quux(**)
end

Updating Sorbet’s parser

To support this feature, we first needed to make a change to Sorbet’s parser grammar.

The parser is the first step in the Sorbet pipeline. Its job is to convert Ruby code into an abstract syntax tree, which is then passed through the rest of the pipeline to be processed and eventually type checked. Without updating the parser, Sorbet wouldn’t even be able to recognize anonymous argument forwarding as valid Ruby code and would raise syntax errors, erroneously telling developers they’d made a mistake!

To address this, we opened a pull request that created two new abstract syntax tree nodes representing anonymous argument forwarding. With this change, Sorbet could now represent anonymous argument forwarding in a way that was usable by the following steps in the type checking pipeline.

We can see the parser at work using sorbet.run, Sorbet’s online sandbox. If we add the URL parameters ?arg=--print&arg=parse-tree, the Sorbet sandbox will print out the results of the parsing step in the pipeline. In these results, we can see that the bar method receives an argument node of the type ForwardedRestArg, and the baz method receives an argument node of the type ForwardedKwrestArg.

(By the way, if you’re interested in learning more about parsing in Ruby, check out Kevin Newton’s recent blog post about building a new Ruby parser.)

Desugaring

Updating the parser was the first step in allowing Sorbet to process instances of anonymous argument forwarding, but supporting this feature also required changes to another step in the Sorbet pipeline: the desugarer.

The desugarer is the next step in the Sorbet pipeline after the parser; it takes the internal representation built by the parser and cuts it down to a less granular representation, making it easier to work with in later stages of the pipeline.

To explain what I mean, let’s look at how Sorbet desugars case statements like this:

case pokemon
when 'Charmander'
when 'Squirtle'
when 'Bulbasaur'
end

The Sorbet desugarer will take the above case statement and transform it into an if statement:

if 'Charmander' === pokemon
elsif 'Squirtle' === pokemon
elsif 'Bulbasaur' === pokemon
end

Sorbet does this because it’s simpler to maintain type checking logic for one type of code pattern rather than two. This is why this step of the pipeline is called the “desugarer” – it removes “syntactic sugar,” or syntax that exists only to make the development experience better. If earlier stages of the pipeline can limit the different kinds of syntax Sorbet has to type check, then later stages of the pipeline will contain less type checking logic, making them easier to reason about and maintain.

We applied a similar principle to desugaring anonymous argument forwarding. Rather than implementing an entire new type checking process for ForwardedRestArg and ForwardedKwrestArg nodes, we used the desugarer to transform them into a representation that Sorbet already knew how to type check!

Back in 2020, my teammate Alexandre added support for Ruby 2.7’s “forward everything” syntax to Sorbet (PR here). As part of that work, he added logic to the desugarer that would break down forwarded args into three “magic” expressions. As an example, if we had the following code snippet:

def buzz(...)
  biz(...)
end

The Sorbet desugarer would transform the arguments passed to biz into something like:

None of these expressions actually exist in Ruby code – they’re created by the desugarer as a placeholder that Sorbet can type check later on in the pipeline. This is why they’re called “magic.”

You can see this for yourself on sorbet.run. In this case, if we add the ?arg=--print&arg=desugar-tree URL parameters, sorbet.run will show us the output of the desugaring stage, including the magic expressions listed above!

Because Ruby 3.2’s anonymous argument forwarding feature is so similar to the forward everything syntax, we were able to lean on the work that Alexandre had already done. We updated the desugarer to desugar ForwardedRestArg nodes (e.g. the argument of bar(*)) as magic fwd-args expressions and ForwardedKwrestArg nodes (e.g. the argument of quux(**)) as magic fwd-kwargs expressions.

Once we modified the desugarer to convert forwarded anonymous arguments into a representation that Sorbet already knew how to type check, we could rely on the rest of the pipeline to complete the type checking process!

Supporting the new Data class

Another major change in Ruby 3.2 was addition of the Data class. Data provides a convenient way to define immutable data structures in Ruby. Here’s an example:

Measure = Data.define(:amount, :unit)
distance = Measure.new(100, 'km')
distance.amount #=> 100
distance.unit #=> "km"

In this example, calling Data.define creates a new class that has some methods built into it, including a constructor and amount and unit accessor methods. Because these methods are created at runtime, Sorbet wouldn’t be able to know about them statically without a little help. To “teach” Sorbet about these methods, we needed to make a change to a part of the pipeline called the “rewriter.”

Adding a new rewriter

The stage of the pipeline after the “desugarer” is the “rewriter” stage. Rewriters are very similar to the desugarer, but while the desugarer is broad and covers many types of Ruby syntax, rewriters are specific; they each handle a particular Ruby class or feature.

For example, Sorbet’s ClassNew rewriter consolidates how Sorbet represents class definitions. A line like Child = Class.new(Parent) would be rewritten as class Child < Parent; end. This means that Sorbet only needs to know how to type check one class definition syntax rather than two, which simplifies later steps of the type checking pipeline.

My teammates and I applied the same principle to Ruby’s new Data class. We opened a PR on Sorbet that added a new Data rewriter. This rewriter finds calls to Data.define and modifies Sorbet’s internal representation of the resulting class to include an initializer, as well as accessor methods for every field passed into the define method.

You can see this rewriter in action on sorbet.run, Sorbet’s online sandbox. By passing in the URL parameters ?arg=--print&arg=rewrite-tree, we can ask Sorbet to print out the results of the rewriter phase of the pipeline, allowing us to visualize the changes that Sorbet is making to our code under the hood!

If you click the link above, you’ll see that Sorbet rewrites Measure = Data.define(:amount, :unit) as something closer to:

class Measure
  def amount; end
  def unit; end
  def initialize(amount=nil, unit=nil); end
end

Once it knows about methods that will be defined at runtime, Sorbet can accurately type check code that uses Ruby 3.2’s new Data class!

Updating Ruby Core RBI files

The last thing my teammates and I needed to do to make Sorbet compatible with Ruby 3.2 was to update Sorbet’s RBI files to reflect the latest changes to Ruby’s core API.

In order to type check methods from Ruby core, Sorbet keeps a repository of RBI files that define types on every class and module in the Ruby core library. When methods changed in new Ruby versions, Sorbet’s RBIs have to be updated to reflect these changes before Sorbet can begin accurately type checking those methods in our code.

My teammates and I opened a series of PRs against Sorbet that modified the Ruby core RBI files to reflect the changes introduced in Ruby 3.2. Here are some examples:

In doing this work, we ensured that Ruby developers at Shopify and in the broader community could leverage all the awesome, new features introduced in Ruby 3.2 while still benefitting from the safety of type checking.

If you’re interested in any of the features mentioned in this article, you can test them out using Sorbet’s online sandbox!