Ruby 3.3's YJIT Runs Shopify's Production Code 15% Faster
Ruby 3.2 YJIT is Battle-Tested
Shopify deploys YJIT on business-critical services in production, such as Storefront Renderer, the software that powers all online storefronts on Shopify’s platform, and Shopify’s Monolith. As of the Ruby 3.2 release, YJIT sped up our Storefront Renderer by 10% on average.
Storefront Renderer is a complex application. Your more reasonably sized app might get better or worse results. Here are some other YJIT performance results in production outside Shopify:
- Discourse: 15.8-19.6% speedup
- Lobsters: 26% speedup
- CompanyCam: 20-40% speedup
- GMO Pepabo: 18% speedup (in Japanese)
- Timee: 10% speedup (in Japanese)
- STORES: 6.5-7.5% speedup (in Japanese)
- MedPeer: 2.8% speedup (in Japanese)
Performance Tips for Production Deployments
We sometimes hear people say YJIT doesn’t speed up their application or uses too much memory. The following documentation covers some tips for tuning YJIT in such cases.
Monitoring YJIT in Production might also help you get more insight.
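As a starting point for that kind of monitoring, here is a minimal sketch that reports whether YJIT is active in the current process and prints a few counters. It relies on `RubyVM::YJIT.enabled?` and `RubyVM::YJIT.runtime_stats`. The helper name `yjit_report` is made up for this example, and the stat keys it looks for are assumptions: the available keys differ between Ruby versions and build options, so any that are missing are simply skipped.

```ruby
# Report whether YJIT is active and, if so, a few counters from runtime_stats.
# The keys present in runtime_stats vary between Ruby versions and builds,
# so every key listed below is treated as optional (assumed, not guaranteed).
def yjit_report
  unless defined?(RubyVM::YJIT) && RubyVM::YJIT.enabled?
    return { enabled: false }
  end

  stats = RubyVM::YJIT.runtime_stats || {}
  wanted = %i[inline_code_size outlined_code_size code_region_size yjit_alloc_size]
  # Hash#slice keeps only the keys that actually exist in this Ruby version.
  { enabled: true }.merge(stats.slice(*wanted))
end

p yjit_report
```

Running the script with `ruby --yjit script.rb` should report `enabled: true` on a Ruby build that includes YJIT. Without the flag it reports `enabled: false`.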
Ruby 3.3 YJIT is Even Faster
Ruby 3.3.0-preview2 was released on September 14, 2023. We compared the performance of Ruby 3.2 YJIT, Ruby 3.3.0-preview2 YJIT, and the Ruby 3.3.0-preview2 interpreter on the Storefront Renderer. It’s a large-scale service with the following properties:
- Depends on over 220 Ruby gems
- Over 4.5 million e-commerce sites are built with Shopify (source: builtwith.com)
- Capable of serving over 75 million requests per minute, 3TB/minute of traffic
- 1.27 million requests per second
- Processed over $197B in transaction volume in 2022
Ruby 3.3 YJIT is 13% faster than Ruby 3.2 YJIT
We deployed Ruby 3.2.2 YJIT and Ruby 3.3.0-preview2 YJIT to different clusters that receive an equal amount of traffic. We compared their response times in a 24-hour window on a weekday.
The numbers and the graph show Ruby 3.3.0-preview2 YJIT’s average/p50/p90/p99 response time speedup ratio over Ruby 3.2.2 YJIT’s. Higher is better for Ruby 3.3.0-preview2 YJIT.
This shows that Ruby 3.3.0-preview2 YJIT is 13% faster than Ruby 3.2.2 YJIT on average. If you’re using Ruby 3.2 YJIT, upgrading to Ruby 3.3 could give you a comparable speedup, although the exact gain depends on your application.
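To make the comparison concrete, here is a small sketch of how a speedup ratio could be computed from two sets of response times. It assumes the ratio is defined as baseline time divided by candidate time, so values above 1.0 favor the candidate. That definition, the nearest-rank percentile method, the helper names, and the sample numbers are all illustrative assumptions, not the tooling or data behind the measurements above.

```ruby
# Nearest-rank percentile of a list of numbers (pct between 0 and 100).
def percentile(values, pct)
  sorted = values.sort
  rank = (pct / 100.0 * sorted.size).ceil
  sorted[[rank, 1].max - 1]
end

# Speedup of `candidate` over `baseline`, assuming speedup = baseline / candidate,
# so a value above 1.0 means the candidate responds faster.
def speedup_ratio(baseline, candidate)
  baseline.to_f / candidate
end

# Hypothetical response times in milliseconds, not Shopify's measurements.
ruby32_ms = [100, 110, 120, 130, 400]
ruby33_ms = [90, 95, 105, 115, 350]

average = ->(xs) { xs.sum.to_f / xs.size }
puts format("avg: %.2fx", speedup_ratio(average.call(ruby32_ms), average.call(ruby33_ms)))

[50, 90, 99].each do |pct|
  ratio = speedup_ratio(percentile(ruby32_ms, pct), percentile(ruby33_ms, pct))
  puts format("p%d: %.2fx", pct, ratio)
end
```

Under this definition, a ratio of 1.13 corresponds to the "13% faster" figure quoted above.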
Ruby 3.3 YJIT is 15% faster than the Ruby 3.3 interpreter
During the same 24-hour window, we also deployed the Ruby 3.3.0-preview2 interpreter to another cluster. We compared its performance against that of the cluster with Ruby 3.3.0-preview2 YJIT.
The numbers and the graph show Ruby 3.3.0-preview2 YJIT’s average/p50/p90/p99 response time speedup ratio over the Ruby 3.3.0-preview2 interpreter’s. Higher is better for Ruby 3.3.0-preview2 YJIT.
If you run the Ruby 3.3 interpreter, just enabling YJIT gave us a 15% speedup on this workload, and it could give you a similar gain.
Why is it faster?
New register allocator
The CRuby interpreter reads and writes temporary values in memory. Ruby 3.2 YJIT worked the same way for compatibility reasons, but Ruby 3.3 YJIT can keep those values in registers. In general, accessing CPU registers is much faster than accessing memory, so Ruby code runs faster with this optimization.
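To see where those temporary values come from, you can disassemble a small expression with `RubyVM::InstructionSequence`, which is specific to CRuby. This sketch only prints the interpreter's bytecode and says nothing about the machine code YJIT generates. The exact listing differs between Ruby versions.

```ruby
# Disassemble a small expression to look at the interpreter's stack-based
# bytecode. Each instruction pushes or pops temporaries on the VM stack,
# and that stack lives in memory.
source = "x = 2; y = 3; x + y * 4"
iseq = RubyVM::InstructionSequence.compile(source)
listing = iseq.disasm
puts listing
```

In the listing, the result of `y * 4` is a temporary value that one instruction pushes and the following addition pops. Values like that are what a register allocator can keep out of memory.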
More code now gets JIT compiled
Ruby 3.3 YJIT supports compiling more kinds of Ruby code than Ruby 3.2 YJIT.
- When a method has optional arguments, Ruby 3.2 YJIT compiles only one combination of parameter presences, but Ruby 3.3 YJIT handles multiple combinations.
- Ruby 3.2 YJIT does not compile code that is executed after break, next, redo, retry, etc., but Ruby 3.3 YJIT does.
- Ruby 3.2 YJIT gives up compiling a method call when it does not support the method call type, but Ruby 3.3 YJIT falls back to an implementation that can handle anything.
In a nutshell, more code runs on JIT with Ruby 3.3 YJIT than Ruby 3.2 YJIT, so it’s faster.
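The three cases above correspond to code shapes like the ones in this sketch. The method names are made up for illustration, and the example only shows what such code looks like. Whether a particular call is actually compiled depends on the YJIT version and is not something this snippet verifies.

```ruby
# 1. Optional arguments: call sites pass different combinations of them.
def greet(name, greeting = "Hello", punctuation = "!")
  "#{greeting}, #{name}#{punctuation}"
end

# 2. Code that runs after `next` and `break` inside a block.
def first_even_square(numbers)
  found = nil
  numbers.each do |n|
    next if n.odd? # skip odd numbers
    found = n
    break          # leave the loop at the first even number
  end
  found && found * found # executed after the break
end

# 3. A less common call type: splat plus keyword arguments.
def describe(*items, sep: ", ")
  items.join(sep)
end

puts greet("Ruby")
puts greet("Ruby", "Hi")
puts greet("Ruby", "Hi", "?")
puts first_even_square([1, 3, 4, 6])
puts describe(*%w[a b c], sep: "-")
```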
Towards the Ruby 3.3.0 release
Reducing memory overhead
Since Ruby 3.3.0-preview2 YJIT generates more code than Ruby 3.2.2 YJIT, it can have a higher memory overhead. We put a lot of effort into making metadata more space-efficient, but it still uses more memory than Ruby 3.2.2 YJIT. We’re looking into skipping compilation of paths that are less frequently executed.
Optimizing method calls
Ruby’s method calls are complex. YJIT code spends a lot of time bookkeeping frames for method calls. We’re working on making all method calls faster by packing frame metadata into a single pointer and lazily materializing its fields when necessary. Hopefully, this will greatly speed up method calls and reduce the code size.
Conclusion
Ruby 3.2 YJIT has optimized the production workloads of Shopify and other companies. We encourage you to enable YJIT in production. Once Ruby 3.3 is released, it should make your application even faster.
References
This document explains how to enable YJIT. As a reminder, the following links may help you troubleshoot performance problems with YJIT.