Now rack.response_finished is my best friend.
If you’ve heard of Rack, you’ve probably seen an example like this:
# config.ru
class Application
  def call(env)
    [200, {}, ["Hello Rack"]]
  end
end

run Application.new
The application responds to #call with a single argument, the environment, and returns an array of status, headers, and body. All of the concepts seem straightforward, right? The status is an Integer, the environment and response headers are Hashes, and the body is an Array of Strings.
While this is a valid Rack application, that’s not really the end of the story. For the whole picture, we have to read the Rack SPEC.
For this post, let’s focus on the specification of the body. The requirements have evolved over time, but something that hasn’t changed since the earliest versions[1] of Rack is that enumerable bodies should respond to #each, yielding strings.
That means an application that prefers not to buffer its entire response into memory could implement the body like this:
# config.ru
class Body
  def each
    yield "Hello Rack"
  end
end

class Application
  def call(env)
    [200, {}, Body.new]
  end
end

run Application.new
Now things are more complicated.
In the “Array of Strings” example, it’s trivial for middleware to do something after the body is generated:
require "logger"

class LoggerMiddleware
  def initialize(app, logger: Logger.new($stdout))
    @app = app
    @logger = logger
  end

  def call(env)
    response = @app.call(env)
    @logger.info "Request processed!"
    response
  end
end
But in the “Body class” example, the body’s content isn’t generated until the web server calls #each on it. Since this happens after the middleware’s #call has returned, how can the middleware do things afterwards?
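To make the ordering problem concrete, here is a minimal sketch in plain Ruby. The `serve` method is a hypothetical stand-in for a web server (real servers do much more), and the `EVENTS` array exists only to record what happens when.

```ruby
# Records the order in which things happen.
EVENTS = []

class Body
  def each
    EVENTS << "body generated"
    yield "Hello Rack"
  end
end

class Application
  def call(env)
    [200, {}, Body.new]
  end
end

class LoggerMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    response = @app.call(env)
    # Runs as soon as the app returns, before the body is iterated.
    EVENTS << "middleware logged"
    response
  end
end

# Hypothetical stand-in for a web server: call the app, then iterate the body.
def serve(app)
  _status, _headers, body = app.call({})
  chunks = []
  body.each { |chunk| chunks << chunk }
  body.close if body.respond_to?(:close)
  chunks.join
end

OUTPUT = serve(LoggerMiddleware.new(Application.new))
# EVENTS is now ["middleware logged", "body generated"]
```

The middleware’s log entry lands before the body has produced a single string, which is exactly the problem described above.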
BodyProxy
Luckily, the Rack specification also includes a hook: if the body responds to #close, it will be called after iteration.
And this is where Proxy objects come into play: they can intercept the call to #close on the body so that middleware have an opportunity to do things after the body has been iterated.
The first Proxy class was introduced in Rack to fix Rack::Lock unlocking before an enumerable body was iterated. Soon after, it was extracted to the Rack::BodyProxy class that’s widely used today.
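require "logger"
require "rack/body_proxy"

class LoggerMiddleware
  def initialize(app, logger: Logger.new($stdout))
    @app = app
    @logger = logger
  end

  def call(env)
    status, headers, body = @app.call(env)
    # The block runs when the server calls #close on the proxied body.
    body = Rack::BodyProxy.new(body) do
      @logger.info "Request processed!"
    end
    [status, headers, body]
  end
end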
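To show what a proxy like this does under the hood, here is a simplified stand-in written in plain Ruby. `SimpleBodyProxy` and `StreamingBody` are hypothetical names for illustration; this is a sketch of the idea, not Rack’s actual implementation, which handles more cases.

```ruby
# Simplified stand-in for Rack::BodyProxy (not Rack's actual implementation).
class SimpleBodyProxy
  def initialize(body, &on_close)
    @body = body
    @on_close = on_close
  end

  # Delegate iteration to the wrapped body.
  def each(&block)
    @body.each(&block)
  end

  # Close the wrapped body if it supports it, then run the callback,
  # even if closing the wrapped body raises.
  def close
    @body.close if @body.respond_to?(:close)
  ensure
    @on_close.call
  end
end

# A body that generates its content lazily and records when it does so.
class StreamingBody
  def initialize(log)
    @log = log
  end

  def each
    @log << "body generated"
    yield "Hello Rack"
  end
end

PROXY_LOG = []
body = SimpleBodyProxy.new(StreamingBody.new(PROXY_LOG)) do
  PROXY_LOG << "middleware logged"
end

# What a server does with the body: iterate it, then close it.
body.each { |chunk| PROXY_LOG << "sent #{chunk}" }
body.close
# PROXY_LOG is now ["body generated", "sent Hello Rack", "middleware logged"]
```

Because the callback hangs off #close rather than #call, the middleware’s work now happens after the body has been iterated.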
While BodyProxy enables middleware to do things after a response body has been generated, it isn’t a perfect solution.
The most obvious flaw is that each middleware ends up allocating its own BodyProxy object. With many middleware, the response body can end up looking like:
BodyProxy.new(BodyProxy.new(BodyProxy.new(BodyProxy.new(["actual body"]))))
Ruby object allocations are getting faster, but they are still frequently a performance bottleneck. Each allocation creates work for the garbage collector, which slows down your application. A better alternative would be something that avoids allocations altogether.
Another issue with BodyProxy is that it may run too early in the request life cycle to perform certain tasks. GitHub has previously written about how they couldn’t use Rack::Events (which uses BodyProxy) for metric emission because it made pages appear to keep loading until the metrics finished emitting.
At Shopify, we saw a similar issue: our web server Pitchfork would keep the connection open to our reverse proxy while emitting metrics, which increased the proxy’s open connection count and resulted in worse performance.
GitHub’s (and our) solution to this problem was to move metric emission somewhere that runs later than BodyProxy to ensure the connection is completely closed: rack.after_reply.
rack.after_reply began life shortly after BodyProxy: it was added to Puma in 2011 as a simple array of callables in the request environment that would run after closing the response body.
Since then, it has been added to Unicorn and later became an optional part of the Rack 3 SPEC as rack.response_finished.
A web server can indicate that it supports rack.response_finished by including it in the request environment, and middleware can register callbacks by appending to it:
require "logger"

class LoggerMiddleware
  def initialize(app, logger: Logger.new($stdout))
    @app = app
    # Built once and reused for every request.
    @callback = ->(env, status, headers, error) {
      logger.info "Request processed!"
    }
  end

  def call(env)
    # Look ma, no allocations!
    if response_finished = env["rack.response_finished"]
      response_finished << @callback
    end

    @app.call(env)
  end
end
The callbacks must[2] accept four arguments: the request environment, response status, response headers, and an error. The environment should always be present, but the status/headers and error are mutually exclusive.
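From the server’s side, the contract above might look roughly like the following sketch. `serve` and `run_response_finished` are hypothetical helpers, not any real server’s API. As I read the Rack SPEC, callbacks should be invoked in reverse order of registration, so the sketch uses `reverse_each`; treat that detail as an assumption to verify against the SPEC version you target.

```ruby
# Hypothetical helper a server might run once it has finished with a response.
def run_response_finished(env, status, headers, error)
  callbacks = env["rack.response_finished"]
  return unless callbacks

  # Assumption: callbacks run in reverse order of registration.
  callbacks.reverse_each do |callback|
    callback.call(env, status, headers, error)
  end
end

# Hypothetical stand-in for a web server that supports rack.response_finished.
def serve(app)
  # Including the key in env is how support is advertised to middleware.
  env = { "rack.response_finished" => [] }
  status, headers, body = app.call(env)
  body.each { |chunk| chunk } # pretend to write each chunk to the socket
  body.close if body.respond_to?(:close)
rescue => error
  # Failure: error is set, status and headers are nil.
  run_response_finished(env, nil, nil, error)
else
  # Success: status and headers are set, error is nil.
  run_response_finished(env, status, headers, nil)
end

SEEN = []
record = ->(_env, status, _headers, error) { SEEN << [status, error&.message] }

ok_app = ->(env) { env["rack.response_finished"] << record; [200, {}, ["Hello Rack"]] }
failing_app = ->(env) { env["rack.response_finished"] << record; raise "boom" }

serve(ok_app)
serve(failing_app)
# SEEN is now [[200, nil], [nil, "boom"]]
```

Either the status and headers are present or the error is, never both, which is why a callback has to be prepared for nil in either position.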
Why is BodyProxy still around?
Adoption of rack.response_finished has been… slow. It’s also very much a chicken/egg problem: applications and frameworks don’t have a reason to support it without servers that implement it, and servers don’t have a reason to implement it if applications and frameworks don’t support it[3].
Falcon implemented rack.response_finished in anticipation of the release of Rack 3, but there wasn’t a second implementation until Pitchfork added it just last year.
However, rack.response_finished finally started gaining momentum when the new Rails Event Reporter pull request was opened.
As I mentioned before, at Shopify we emit metrics inside a rack.after_reply / rack.response_finished callback so that we don’t keep the connection open unnecessarily after the response has been sent. For the same reason, this is also where we log summaries of requests (using the Event Reporter).
This presented an interesting challenge when upstreaming the Event Reporter to Rails. The Event Reporter’s context needs to be cleared so that it doesn’t leak between requests, but the existing mechanism for request isolation (the ActionDispatch::Executor middleware) uses BodyProxy. Using BodyProxy would mean the context would be cleared before we’re able to use it to log the request summary in rack.response_finished.
To make this work, my teammate Adrianna and I added support for rack.response_finished to ActionDispatch::Executor.
This enabled the Executor to clear the Event Reporter context between requests using rack.response_finished, meaning our request summary log would still have access to it!
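The general pattern here, preferring rack.response_finished when the server offers it and falling back to wrapping the body otherwise, can be sketched in plain Ruby. Everything below is hypothetical and for illustration only: `ContextCleaner`, `ClosingBody`, `CONTEXT`, and `serve` are made-up names, and this is not the code in ActionDispatch::Executor. The sketch also assumes the server runs callbacks in reverse order of registration, so a callback registered by an outer middleware runs last.

```ruby
CONTEXT = {}   # hypothetical per-request context
SUMMARIES = [] # what the request summary callback observed

# Minimal stand-in for Rack::BodyProxy: runs a block when the body is closed.
class ClosingBody
  def initialize(body, &on_close)
    @body = body
    @on_close = on_close
  end

  def each(&block)
    @body.each(&block)
  end

  def close
    @body.close if @body.respond_to?(:close)
  ensure
    @on_close.call
  end
end

# Hypothetical middleware that clears the context once the request is done.
class ContextCleaner
  def initialize(app)
    @app = app
    @clear = ->(_env, _status, _headers, _error) { CONTEXT.clear }
  end

  def call(env)
    if (callbacks = env["rack.response_finished"])
      # Registered before the app runs, so under reverse ordering it runs last.
      callbacks << @clear
      @app.call(env)
    else
      # Fallback for servers without support: clear when the body is closed.
      status, headers, body = @app.call(env)
      [status, headers, ClosingBody.new(body) { CONTEXT.clear }]
    end
  end
end

# App that fills the context and, when supported, logs a request summary.
APP = lambda do |env|
  CONTEXT[:user] = "alice"
  if (callbacks = env["rack.response_finished"])
    callbacks << ->(_env, status, _headers, _error) { SUMMARIES << [status, CONTEXT[:user]] }
  end
  [200, {}, ["Hello Rack"]]
end

# Hypothetical stand-in for a web server.
def serve(app, env)
  status, headers, body = app.call(env)
  body.each { |chunk| chunk } # pretend to write each chunk to the socket
  body.close if body.respond_to?(:close)
  Array(env["rack.response_finished"]).reverse_each do |callback|
    callback.call(env, status, headers, nil)
  end
end

stack = ContextCleaner.new(APP)
serve(stack, { "rack.response_finished" => [] }) # server with support
serve(stack, {})                                 # server without support
```

In the first call the summary callback still sees the context before it is cleared; in the second the context is cleared when the body is closed, and no summary is recorded because the server never offered the hook.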
rack.response_finished is my best friend
Since we implemented rack.response_finished in ActionDispatch::Executor, I’ve been looking to replace BodyProxy in other Rack middleware used by my application.
Rack 3.2 will have a few fewer BodyProxys, as both Rack::ConditionalGet and Rack::Head no longer use them.
I’ve also opened pull requests to add support for rack.response_finished in Rack::TempfileReaper and ActiveSupport::Cache::Strategy::LocalCache.
Finally, Puma just recently merged support for rack.response_finished as well!
With all this momentum behind rack.response_finished, maybe you can end your friendship with Rack::BodyProxy too.
[1] I was actually curious how long Rack has had this requirement, so I went git spelunking: the initial addition of Rack::Lint included it!
[2] MUST was actually only recently added to the SPEC to prevent defining callbacks that accept no arguments (like -> {}).
[3] And to be fair, Puma, Unicorn, and Pitchfork all support rack.after_reply already; it’s just not part of the Rack SPEC.