At Shopify, we want our development environments to be fast. Installing dependencies is slow, especially in an application as large as Shopify. bun and uv have dramatically improved install times for TypeScript and Python dependencies. What if we could do the same for Bundler and the Ruby community?

Our team at Shopify has been working on a series of improvements to Bundler and RubyGems. Bundler downloads gems up to 200% faster. Cloning git gems is now 3x faster in our monolith.

We were also able to decrease the overall bundle install time by 3.5x in one of our applications by precompiling gems thanks to cibuildgem, a new precompilation toolchain we’d love you to try!

Here’s an overview of the improvements we’ve made in the last few months:

Faster gem downloads

One impactful change was deceptively simple. Bundler’s HTTP fetcher had a connection pool size of 1. This meant that during parallel gem installation, every thread was fighting over a single HTTP connection.

A profile that shows the threads waiting for the only connection to be available

Pink spikes are threads waiting for the connection to be available.

By increasing the pool of HTTP connections, Bundler can download more gems in parallel. The speed gain with this change is even more dramatic during peak hours when RubyGems.org is under heavy load, or when you are geographically far from the CDN, where latency amplifies the cost of waiting on a single connection.
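
To make the contention concrete, here is a minimal sketch that simulates installer threads sharing a connection pool. It is not Bundler’s code: `TinyPool`, `simulated_download_time`, and the 20ms latency are hypothetical names and values chosen for illustration, and `sleep` stands in for a network round trip.

```ruby
# Hypothetical stand-in for an HTTP connection pool; not Bundler's real class.
class TinyPool
  def initialize(size)
    @slots = Queue.new
    size.times { |i| @slots << "connection-#{i}" }
  end

  # Blocks until a connection is free, then lends it to the caller.
  def with
    connection = @slots.pop
    yield connection
  ensure
    @slots << connection if connection
  end
end

# Simulates `workers` installer threads downloading `gems` gems, where each
# download holds a connection for `latency` seconds. Returns elapsed seconds.
def simulated_download_time(pool_size:, gems: 10, workers: 5, latency: 0.02)
  pool = TinyPool.new(pool_size)
  jobs = Queue.new
  gems.times { |i| jobs << "gem-#{i}" }
  jobs.close # pop returns nil once the queue is drained

  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  Array.new(workers) do
    Thread.new do
      while jobs.pop
        pool.with { |_connection| sleep(latency) }
      end
    end
  end.each(&:join)
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

puts format("1 connection:  %.2fs", simulated_download_time(pool_size: 1))
puts format("5 connections: %.2fs", simulated_download_time(pool_size: 5))
```

With a pool of 1, the five threads effectively run one download at a time, so the total time grows with latency multiplied by the number of gems. With a pool of 5, each thread gets its own connection and the waits overlap.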

To benchmark this change, we opted to measure only download and extraction time (no compilation of native extensions) and built a local gem server where we can control latency at will.

This is the result in a freshly generated Rails application when all gems are served with a 100ms latency.

Scenario: rails (164 gems)
                             Cold     +/-                        Warm     +/-
  ------------------------------------------------------------------------------
  5 HTTP connections       5.86s   0.10s  baseline             4.17s   0.02s  baseline
  1 HTTP connection       19.80s   0.02s  237.6% slower        4.16s   0.02s  0.3% faster

Hotspots and optimizations

We regularly profile Bundler with different Gemfiles and identify hotspots we might optimize. While no single optimization is dramatic, their collective impact has been significant.

One such optimization involves gem installation. A .gem file is a compressed tarball, and gzip has a built-in integrity check (a CRC-32 checksum of the uncompressed data): if decompression succeeds, the content is intact. Despite this, RubyGems was walking every entry in the tarball and reading all bytes upfront as an explicit corruption check before proceeding with installation. We removed this redundant verification step entirely, since a successful decompression already provides the same guarantee.
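
The gzip check can be seen with Ruby’s standard `zlib` library. This is a small sketch on an in-memory string, not the RubyGems installer code: `intact_gzip?` is a hypothetical helper, and the payload is made up. It flips one byte of the CRC-32 stored in the gzip trailer and shows that decompression then fails.

```ruby
require "zlib"

# Hypothetical helper: true when the bytes decompress and pass gzip's
# trailer checks (CRC-32 and length), false when zlib reports an error.
def intact_gzip?(bytes)
  Zlib.gunzip(bytes)
  true
rescue Zlib::Error
  false
end

payload = Zlib.gzip("puts 'hello from a gem'\n" * 100)

# The last 8 bytes of a gzip stream are the CRC-32 and the uncompressed size.
# Flip the first CRC byte to simulate corruption.
corrupted = payload.dup
crc_offset = corrupted.bytesize - 8
corrupted.setbyte(crc_offset, corrupted.getbyte(crc_offset) ^ 0xFF)

puts intact_gzip?(payload)   # true
puts intact_gzip?(corrupted) # false
```

Because the decompressor already raises on a mismatch, reading every entry a second time only to detect corruption repeats work that decompression has done.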

A profile that shows the time spent verifying the tarball's content

9-17% of the time installing a gem is spent verifying the tarball’s content.

Another hotspot during installation is the check RubyGems performs to determine whether a gem includes a RubyGems plugin and, if so, whether its plugin file needs to be regenerated. The vast majority of gems don’t include a RubyGems plugin, yet every gem pays the cost of a Dir.glob with an expensive pattern just to handle the small minority that do.

It turns out that unconditionally regenerating the plugin file is faster than performing this upfront check.

A profile that shows how frequently Bundler spends time checking whether regenerating a gem plugin is required

Bundler checking whether regenerating a gem plugin is required

Parallel git clones

Many Rails applications depend on gems sourced directly from git repositories. This is particularly useful if a gem has upstream changes that aren’t yet released. Previously, Bundler would fetch each git repository sequentially, even though there’s no technical limitation on fetching them all at once.

Shopify’s Core Rails monolith includes 33 git gems. After introducing this change to parallelize git clone, we saw a 3x performance improvement for fetching git gems.

                         Bundler 2.7.2   Bundler 4.0.7   Performance improvement
  ------------------------------------------------------------------------------
  Fetching 33 git gems   121.57s         38.75s          68% faster
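
The idea of fetching independent repositories concurrently can be sketched as follows. This is an illustration only, not Bundler’s implementation: `fetch_in_parallel`, the repository names, and `fake_clone` (which sleeps instead of running git) are all hypothetical.

```ruby
# Hypothetical helper: calls the block once per source on a small pool of
# threads and returns the results in the same order as the input.
def fetch_in_parallel(sources, workers: 4, &fetch)
  queue = Queue.new
  sources.each_with_index { |source, index| queue << [source, index] }
  queue.close # pop returns nil once every job has been taken

  results = Array.new(sources.size)
  Array.new(workers) do
    Thread.new do
      while (job = queue.pop)
        source, index = job
        results[index] = fetch.call(source)
      end
    end
  end.each(&:join)
  results
end

# Stand-in for `git clone`: sleeps to simulate time spent on the network.
fake_clone = ->(repo) { sleep(0.05); "cloned #{repo}" }

repos = %w[repo-a repo-b repo-c repo-d repo-e repo-f]
puts fetch_in_parallel(repos, workers: 6, &fake_clone)
```

Each clone spends most of its time waiting on the network, so the threads overlap well even under Ruby’s global VM lock.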

Native extensions

By far the biggest bottleneck when running bundle install is the compilation of native extensions. Many gems in the Ruby ecosystem include C code that must be compiled on each developer’s machine when installed. Common examples are json, date, and bigdecimal. Even if your Gemfile doesn’t directly depend on native extensions, it’s likely they will be included in your Gemfile.lock as transitive dependencies.

A profile that shows the time spent compiling a gem

An installer thread spending 92% of the time compiling the gem.

To illustrate how slow compilation is, we can run bundle install on a freshly generated Rails application.

   
  Total number of gems                               126
  Gems with native extensions                        18
  Time to bundle install¹                            ~13 seconds
  Time to bundle install (without compilation)       ~2 seconds (15%)
  Time to bundle install (only native extensions)    ~11 seconds (85%)

Installing the 18 native extension gems accounts for 85% of the time spent running bundle install.

Precompiled gems

Remember when Nokogiri used to take forever to install? Those days are behind us thanks to the amazing work of its maintainer, Mike Dalessio. Mike updated the gem’s publishing pipeline to precompile its native extensions into platform-specific binaries and releases separate gems for each supported platform (macOS, Windows, Linux). Now Nokogiri installs as fast as pure Ruby gems.
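
As a rough illustration of how platform-specific gems are matched, the sketch below uses `Gem::Platform` from RubyGems to pick a variant for a target platform, falling back to the source (`ruby`) gem that must be compiled. `pick_variant` and the list of published platforms are hypothetical, and the real selection logic in Bundler and RubyGems is more involved.

```ruby
require "rubygems"

# Hypothetical helper: returns the first published platform string that is
# compatible with the target, or "ruby" (the source gem) when none matches.
def pick_variant(published, target)
  target_platform = Gem::Platform.new(target)
  native = published
    .reject { |platform| platform == "ruby" }
    .find { |platform| Gem::Platform.new(platform) === target_platform }
  native || (published.include?("ruby") ? "ruby" : nil)
end

# Illustrative set of platforms a gem might be released for.
published = ["ruby", "x86_64-linux", "arm64-darwin"]

puts pick_variant(published, "arm64-darwin")  # arm64-darwin
puts pick_variant(published, "x86_64-linux")  # x86_64-linux
puts pick_variant(published, "riscv64-linux") # ruby
```

When a precompiled variant matches, installation only has to unpack files. The fallback to `ruby` is the slow path that compiles the extension locally.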

Imagine if we extended this to the rest of the Ruby ecosystem. If the community works together to ship precompiled binaries for our most popular native-extension gems, everyone will benefit from a lightning-fast bundle install.

One way to build binary gems is with the popular rake-compiler-dock toolchain, which provides a cross-compilation environment and allows compilation to run inside Docker containers. However, cross-compiling can be brittle and presents hard-to-debug issues. Compiling on the target platform is ultimately far more reliable.

Many CI providers now offer free access to cloud machines. GitHub Actions, for example, is widely popular, and the Ruby community has built many easy-to-use actions around it (e.g., ruby/setup-ruby). Could we apply the same approach and leverage those machines to natively compile binary gems?

Introducing cibuildgem

At Shopify, we wanted to build an easy-to-use tool to help developers release gems with precompiled binaries using a native compilation approach via GitHub Workflows.

cibuildgem lets you generate a standard GitHub Actions workflow. Once triggered, multiple jobs run to:

  1. Compile the binaries and package the gems
  2. Run a matrix of test suites
  3. Verify the .gem files are not corrupted and can be installed
  4. Release the gems to RubyGems.org
A screenshot of the GitHub workflow when cibuildgem is triggered

Releasing a binary gem with cibuildgem

We aimed to make cibuildgem easy and fast to set up. Since many gems with native extensions are already configured with rake-compiler for development compilation, we chose to piggyback on that so cibuildgem can run without any extra configuration for most gems.

The workflow generated by cibuildgem is intentionally standard.

  • Want to compile your gem on Linux AArch64? Add it to the matrix.
  • Want to trigger the workflow automatically when pushing a new git tag? No problem — tweak it to your liking.

We also wanted to ensure that the binaries compiled by cibuildgem would work in a macOS development environment and a Linux production environment on a real Rails application.

As an experiment, we used cibuildgem to compile dozens of open-source gems and publish them under a “namespace” on RubyGems.org (e.g., sassc -> precompiled-sassc).

The goal was to see how much performance improvement we could get with precompiled binaries. To test this, we created a Bundler plugin that hijacked the Bundler resolver to download the gems with precompiled binaries we had just published. For example, it would force-install precompiled-json if the json gem was requested anywhere in the dependency tree.
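
The substitution rule itself is simple to sketch. The code below is not the plugin and does not use Bundler’s plugin or resolver APIs. `substitute_precompiled` is a hypothetical helper that only shows the name mapping, and `rake` is included as an assumed example of a gem with no precompiled counterpart.

```ruby
# Prefix used for the experimental "namespace", as in sassc -> precompiled-sassc.
PRECOMPILED_PREFIX = "precompiled-"

# Hypothetical helper: swaps each requested gem name for its precompiled
# counterpart when one has been published, and keeps the original otherwise.
def substitute_precompiled(requested, available)
  requested.map do |name|
    candidate = "#{PRECOMPILED_PREFIX}#{name}"
    available.include?(candidate) ? candidate : name
  end
end

requested = %w[json sassc rake]
available = %w[precompiled-json precompiled-sassc]

p substitute_precompiled(requested, available)
# ["precompiled-json", "precompiled-sassc", "rake"]
```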

We tested this by deploying it on an internal application that includes 235 gems. By precompiling 17 of them, we saw a 3.5x performance improvement.

                   Without precompiled binaries   With some precompiled binaries
  ------------------------------------------------------------------------------
  bundle install   24.2s                          7.0s (3.5x faster)²
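
For readers comparing the figures in this post, the speedup factor and the percentage of time saved are two views of the same pair of timings. The small sketch below computes both from the reported measurements. The `speedup` helper is hypothetical and the rounding is our own choice.

```ruby
# Hypothetical helper: returns [speedup factor, percent of time saved]
# for a measurement that went from `before` seconds to `after` seconds.
def speedup(before, after)
  factor = (before / after).round(2)
  saved_percent = ((before - after) / before * 100).round(1)
  [factor, saved_percent]
end

# Fetching 33 git gems: 121.57s -> 38.75s
factor, saved = speedup(121.57, 38.75)
puts "git gems:       #{factor}x faster, #{saved}% less time" # 3.14x, 68.1%

# bundle install with some precompiled binaries: 24.2s -> 7.0s
factor, saved = speedup(24.2, 7.0)
puts "bundle install: #{factor}x faster, #{saved}% less time" # 3.46x, 71.1%
```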

This experiment demonstrates how much faster bundle install could be when gems are precompiled. It has also given us confidence that cibuildgem builds compatible binaries for macOS and Linux.

In fact, a few gems at Shopify are now released with precompiled binaries (stack_frames, heap_profiler, rubydex) thanks to cibuildgem.

If you maintain a gem with a native extension, we’d love for you to give it a try and share your feedback ❤️!

  1. Network speed and computation power affect these results.

  2. Five gems are still being compiled, so we could decrease install time even further.