A Ruby open-source sabbatical
I wanted to share a bit about the first part of my Ruby open-source sabbatical experience here at Shopify! There don’t seem to be nouns or verbs for someone doing a sabbatical, so I’m coining the terms sabbatical-er and sabbatical-ing for the rest of this post.
A bit about me (a sabbatical-er)
My name is Sid, and I’m currently doing a Ruby open-source sabbatical at Shopify on the Ruby Developer Experience team (aka Team Ruby DX). I’ve primarily worked on web applications using Rails and React for the past 6+ years building Shopify Collabs. I’m a proud Ruby developer and am increasingly appreciative of the Ruby community both here at the company and beyond.
What is an open-source sabbatical?
The Ruby open-source sabbatical is a program started by the Ruby and Rails infra teams that lets engineers across the company spend a 3-month stint learning more about an area of their choice. I’ve heard some people refer to this type of arrangement as a “secondment” as well (and it seems to line up).
The purpose of the program is to let sabbatical-ers acquire knowledge and share what they learn with their team. Some sabbaticals are learning focused while others are more project focused. Mine seems to be somewhere in the middle, leaning towards project focused.
My area of focus
My project for this sabbatical is to help extract and optimize the indexer used for Ruby LSP.
For some context, Ruby LSP requires information about where various entities are located (classes, methods, modules, etc) in order to provide helpful features like:
- Go to definition
- Find all references
- Completions
- Renaming
- etc.
The process of parsing a code base, traversing the parsed result, and storing relevant information about these entities is called indexing. Currently, Ruby LSP performs a re-index whenever it starts up in order to store all the necessary information in memory for the language server to use.
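To make the idea of an index concrete, here is a minimal sketch in Rust. It is not how Ruby LSP does it: a naive line scanner stands in for a real parser such as Prism, and the names `Location`, `Index`, and `index_source` are made up for illustration. The point is only the shape of the data: a map from entity name to the places it is defined.

```rust
use std::collections::HashMap;

/// Where a definition was found. Field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
struct Location {
    file: String,
    line: usize, // 1-based line number
}

/// Maps an entity name (class, module, or method) to every place it is defined.
type Index = HashMap<String, Vec<Location>>;

/// Naive stand-in for "parse and traverse": checks whether each line starts
/// with `class`, `module`, or `def`. A real indexer walks an AST instead.
fn index_source(index: &mut Index, file: &str, source: &str) {
    for (i, raw) in source.lines().enumerate() {
        let mut words = raw.split_whitespace();
        if !matches!(words.next(), Some("class" | "module" | "def")) {
            continue;
        }
        if let Some(word) = words.next() {
            // Drop anything after the name, e.g. `total(items)` -> `total`.
            let name = word
                .split(|c: char| c == '(' || c == ';')
                .next()
                .unwrap_or(word);
            index.entry(name.to_string()).or_default().push(Location {
                file: file.to_string(),
                line: i + 1,
            });
        }
    }
}

fn main() {
    let source = "module Shop\n  class Cart\n    def total(items)\n    end\n  end\nend\n";
    let mut index = Index::new();
    index_source(&mut index, "cart.rb", source);
    // With the index built, "go to definition" becomes a hash lookup.
    println!("{:?}", index.get("total"));
}
```

Once the index exists, features like go to definition reduce to lookups in this map, which is why building it quickly and keeping it small matters so much.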
The current implementation has a few issues:
- Indexing takes a long time for large code bases. When testing on my machine, it took around 90 seconds to index Shopify’s core Rails monolith.
- There’s no caching for the index which means every time the language server starts, it needs to build the index from scratch.
- The index takes up a lot of space in memory. The index generated for Shopify core is currently around 2.2 GB.
- Other services that could use the index don’t have access since it’s currently an internal piece of Ruby LSP.
I’ll be helping ship improvements for performance, caching, and memory usage for the extracted indexer we hope to build! I’ve completed some baseline performance work by benchmarking Ruby and Rust implementations of parsing and traversing ASTs for a large codebase. I’m hoping to write an article to discuss why this was important and what the results were. Spoiler alert: Rust is fast.
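As a rough illustration of what caching could buy, here is one possible shape for it in Rust. This is an assumption for the sake of example, not the design the team has chosen: `IndexCache`, `fingerprint`, and `expensive_index` are hypothetical names, and the cache lives in memory rather than on disk. The idea is simply to skip the costly step when a file’s contents have not changed.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Fingerprint of a file's contents. DefaultHasher is fine for a sketch, but its
/// algorithm is not guaranteed stable across Rust releases, so a cache persisted
/// to disk would want a fixed, documented hash instead.
fn fingerprint(source: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    source.hash(&mut hasher);
    hasher.finish()
}

/// Placeholder for the costly parse-and-traverse step (hypothetical).
fn expensive_index(source: &str) -> Vec<String> {
    source
        .lines()
        .filter_map(|line| line.trim_start().strip_prefix("def "))
        .map(|rest| rest.to_string())
        .collect()
}

/// Hypothetical cache: file path -> (fingerprint, names defined in that file).
#[derive(Default)]
struct IndexCache {
    entries: HashMap<String, (u64, Vec<String>)>,
    reindexed: usize, // counts how often the expensive path actually ran
}

impl IndexCache {
    /// Returns the cached names if the file is unchanged, otherwise re-indexes.
    fn names_for(&mut self, path: &str, source: &str) -> &[String] {
        let current = fingerprint(source);
        let stale = match self.entries.get(path) {
            Some((cached, _)) => *cached != current,
            None => true,
        };
        if stale {
            self.reindexed += 1;
            let names = expensive_index(source);
            self.entries.insert(path.to_string(), (current, names));
        }
        &self.entries[path].1
    }
}

fn main() {
    let mut cache = IndexCache::default();
    cache.names_for("a.rb", "def foo\nend\n");
    cache.names_for("a.rb", "def foo\nend\n"); // unchanged: served from the cache
    cache.names_for("a.rb", "def bar\nend\n"); // edited: re-indexed
    println!("re-indexed {} times", cache.reindexed);
}
```

A cache that survives restarts would also need to be written to disk and invalidated carefully, which this sketch leaves out.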
So… much… reading…
I feel like I’m in college again. Over the course of this sabbatical so far I’ve been working my way through two textbooks.
The first is Crafting Interpreters, a book aimed at developers who are interested in learning more about interpreters, compilers, and programming languages but are new to the subject area. The author takes you through how to implement a programming language he made called “Lox”. His writing style has made this book a fun read so far! I started the book so that I could join in on the Ruby and Rails Infra book club. I’m currently working on my own Ruby implementation of a Lox interpreter. I’ve fallen a few chapters behind because I had to begin another book (listed later in this section).
I’ve appreciated how the author imparts an intuition of how the different steps of analyzing code relate to one another through the “Map of the Territory”. When I chatted about the book with a team member, they mentioned that the first part of the book is also very relevant to what the DX team works on, which is making more sense to me now. Ruby LSP depends on Prism to do the scanning and parsing in order to produce the AST that represents the source code. If the destination is DX tooling, we could actually draw another “Analysis” path down the mountain to get there.
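The three stops on that path can be sketched in miniature. The Rust example below is a toy, not Lox and not Prism: it handles only integers, `+`, and `*`, and every name in it is invented for illustration. It scans text into tokens, parses the tokens into an AST, and then analyzes the tree without ever evaluating it, which is the kind of work tooling does.

```rust
#[derive(Debug, PartialEq)]
enum Token {
    Num(i64),
    Plus,
    Star,
}

#[derive(Debug, PartialEq)]
enum Expr {
    Num(i64),
    Binary(Box<Expr>, Token, Box<Expr>),
}

/// Scanning: characters -> tokens.
fn scan(source: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = source.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_ascii_digit() {
            let mut n = 0i64;
            while let Some(d) = chars.peek().and_then(|c| c.to_digit(10)) {
                n = n * 10 + i64::from(d);
                chars.next();
            }
            tokens.push(Token::Num(n));
        } else {
            match c {
                '+' => tokens.push(Token::Plus),
                '*' => tokens.push(Token::Star),
                _ => {} // whitespace and anything unrecognized is skipped
            }
            chars.next();
        }
    }
    tokens
}

/// Parsing: tokens -> AST, with `*` binding tighter than `+`.
fn parse(tokens: &[Token]) -> Option<Expr> {
    let mut pos = 0;
    let expr = parse_sum(tokens, &mut pos)?;
    if pos == tokens.len() { Some(expr) } else { None }
}

fn parse_sum(tokens: &[Token], pos: &mut usize) -> Option<Expr> {
    let mut left = parse_product(tokens, pos)?;
    while tokens.get(*pos) == Some(&Token::Plus) {
        *pos += 1;
        let right = parse_product(tokens, pos)?;
        left = Expr::Binary(Box::new(left), Token::Plus, Box::new(right));
    }
    Some(left)
}

fn parse_product(tokens: &[Token], pos: &mut usize) -> Option<Expr> {
    let mut left = parse_number(tokens, pos)?;
    while tokens.get(*pos) == Some(&Token::Star) {
        *pos += 1;
        let right = parse_number(tokens, pos)?;
        left = Expr::Binary(Box::new(left), Token::Star, Box::new(right));
    }
    Some(left)
}

fn parse_number(tokens: &[Token], pos: &mut usize) -> Option<Expr> {
    match tokens.get(*pos) {
        Some(Token::Num(n)) => {
            *pos += 1;
            Some(Expr::Num(*n))
        }
        _ => None,
    }
}

/// Analysis: walk the tree without running it, here counting uses of one operator.
fn count_ops(expr: &Expr, op: &Token) -> usize {
    match expr {
        Expr::Num(_) => 0,
        Expr::Binary(left, found, right) => {
            usize::from(found == op) + count_ops(left, op) + count_ops(right, op)
        }
    }
}

fn main() {
    let tokens = scan("1 + 2 * 3 + 4");
    let ast = parse(&tokens).expect("valid expression");
    println!("{:?}", ast);
    println!("additions: {}", count_ops(&ast, &Token::Plus));
}
```

An interpreter would add an evaluation step after parsing; a tool like an indexer stops at the tree and records facts about it instead.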
The second book is The Rust Programming Language, which explains how to write Rust. I’ve heard a lot of people mention the steep learning curve of Rust, so I thought I’d RTFM for a change. This book isn’t as fun as “Crafting Interpreters”, but it’s been interesting to see how different the Rust paradigm is compared to other programming languages. I’ve liked how the book takes care to explain why certain rules are imposed by the compiler. For example, before actually reading the book I was a bit confused about the rules for ownership and borrowing.
// Example 1
let s1 = String::from("hello");
let s2 = s1;
// NOTE: using s1 here breaks!
println!("{s1}, world!");
// Example 2
let mut s = String::from("hello");
let r1 = &mut s;
let r2 = &mut s;
// Having 2 mutable references to String s breaks!
println!("{}, {}", r1, r2);
The rules that cause these examples to break were clear to me, but the underlying reasons why they are imposed weren’t obvious at first. The book’s chapter on ownership made both examples click for me.
For example 1, keeping a single owner for each value means that memory allocated during a scope can be freed deterministically when the owner goes out of scope, without tracking down every remaining reference, and it avoids the complexity of double frees. For example 2, allowing no other references while a mutable reference is in use helps prevent data races when reading or mutating data.
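For contrast, here are versions of both examples that the compiler accepts. This is a small sketch with made-up function names: the first avoids the move by cloning or borrowing, and the second lets one mutable borrow end before the next begins.

```rust
/// Example 1, fixed: clone or borrow instead of moving.
fn greet_without_moving() -> String {
    let s1 = String::from("hello");
    let s2 = s1.clone(); // s2 owns its own copy, so s1 stays valid
    let s3 = &s1; // or just borrow: no new allocation, s1 is still the owner
    format!("{s1}, {s2}, {s3}")
}

/// Example 2, fixed: the first mutable borrow ends before the second begins.
fn mutate_one_at_a_time() -> String {
    let mut s = String::from("hello");
    {
        let r1 = &mut s;
        r1.push_str(", world");
    } // r1 goes out of scope here
    let r2 = &mut s;
    r2.push('!');
    s
}

fn main() {
    println!("{}", greet_without_moving());
    println!("{}", mutate_one_at_a_time());
}
```

The explicit inner block in the second function is only there to make the scopes visible; the compiler would also accept it without the braces, because a borrow ends at its last use.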
I’m still working my way through both books, and I’m both excited about and daunted by what comes next. Both of these books have open-source versions available online if you’d like to give them a read.
Goals while I’m sabbatical-ing
I made sure to think about this before starting! So far I’ve got:
- Re-connect with my excitement for learning
- Contribute to open source
- Write about my experience as a sabbatical-er
- Write more technically about my learnings
- Level up my craft (both breadth and depth)
- Have a fun time
So far so good! I’m hoping to share some more technical writing as I progress further through the sabbatical.
The word sabbatical has started to sound weird to me now so I think that’s a good sign to end this post here. See you in the next post!