gRPC-Rust Client API Evolution (pt. 1/2)
PART 1
The following is a detailed explanation of the background behind the new gRPC-Rust client API. It covers the process I went through while designing it, alternatives considered, trade-offs, and advantages of the final design. If you are only interested in the result itself, please feel free to skip straight to our documentation instead.
Background: gRPC, Tonic, and Tower
gRPC is a high performance Remote Procedure Call (RPC) framework. If you aren’t already familiar with gRPC, please see our introduction.
Tonic is an implementation of the gRPC protocol for the Rust programming language. It is widely used and very popular, with over 12k stars on github. Tonic is designed to be part of the Tower ecosystem, which is a framework that attempts to unify all networking client and service APIs, allowing easy introduction of common middleware for your application.
The gRPC-Rust project was started as an effort to bring all the advanced gRPC features missing from Tonic to the Rust community, like integrated health checking and retries, as well as performance-boosting strategies like zero-copy and arena optimizations.
Obvious Starting Point: Keep the Tonic API?
Tonic is already used by a large number of Rust users, so my initial thought was that it would be great if we could re-use its API. Here’s a quick sample of the current Tonic client-side API:
// Simple unary call (one request, one response):
let response = client
.get_feature(Request::new(Point { latitude: 409146138, longitude: -746188906,}))
.await?;
// Bidirectional streaming (many requests and responses in parallel:
let outbound = async_stream::stream! { /* request stream generator */ };
let response = client.route_chat(Request::new(outbound)).await?;
let mut inbound = response.into_inner();
while let Some(note) = inbound.message().await? { /* process responses */ }
When analyzing this more closely, I discovered several issues and limitations:
- Abstraction Leak: Tonic exposes lower-level HTTP/2 primitives (e.g.
GrpcService::ResponseBodyishttp_body::Body). gRPC applications should not assume that gRPC is using HTTP/2 as a transport to enable the use of other transports (e.g. QUIC) in the future. - Unstable Dependency: Tonic builds on
Streamfromfutures_stream(viatokio_stream). These crates are unstable, and we want to be able to have a stable release of gRPC-Rust before they are stabilized. - Performance Limitation: Tonic accepts and returns owned protobuf messages. This means the Tonic library handles allocations, which prevents the application from using advanced allocation strategies like arenas. Arenas are important for performance in heavily loaded, highly concurrent RPC systems.
- Security Concern: Tonic makes it easy (using the
?operator) to propagate outgoing client call statuses to server responses, including metadata, which could contain secrets like tokens or other private information.
What About Tower?
So we’ll need to change the Tonic API, but should we keep using Tower?
In our other supported languages, gRPC directly includes many of the features
that you would get from Tower middleware, e.g.
timeouts and
retries. And, gRPC provides versions of
this functionality specifically designed to work well with gRPC. In addition,
Tower was not created with streaming in mind, even though it technically works.
To implement a bidirectional stream, the Request and Response become
asynchronous objects, and the call method returns early in the RPC’s lifecycle
to allow the application and library to interact with them.
To illustrate these problems, let’s look at timeouts as an example. The Tower
crate provides a timeout module
to provide this functionality. The approach is to race the Service it wraps
with a timer future, dropping one when the other completes. This works fine for
many Service implementations, but it does have some surprising behavior many
people are unaware of:
- The timeout is not applied while waiting for flow control in
poll_ready, meaning your call could block indefinitely. - The timeout only applies to the portion of the call spent waiting for the
Responseobject to be returned from thecallmethod -- which, for streaming RPCs, will be almost immediately. In gRPC, timeouts apply to the total time of the RPC from the moment the client starts attempting the call to the receipt of the final status from the server.
In addition to those behaviors which affect any Service, there is another
incompatibility with gRPC: we write the timeout on the wire so that the server
is aware of the client’s deadline. If an application was using the timeout
module, the gRPC library would never be aware of the timeout in order to
propagate it.
Another limitation of Tower is that it makes it hard for the application to
control memory allocations: the Response would need to be a complex type that
allows you to call into it to receive the message into a buffer. For streaming
RPCs, something like this would be needed anyway, but for unary RPCs, which is
where arenas are most important, it would be awkward.
Use Tower’s “Style”?
If Tower is a poor fit, should we still keep the same style for calls that Tower and Tonic users are familiar with? I.e.
async fn call(Request) -> Response
We ultimately decided against this approach. With this style, applications that
wished to do interleaved operations (“send a message, receive a message, send a
message, …”) would need to deal with the two Request and Response streams
executing concurrently and implement their own synchronization between the two
streams.
Two APIs: Channel vs. Generated Stubs
gRPC actually provides two different APIs: one that the application typically interacts with via the protobuf generated code, and one that the channel (the main client entry point) itself implements, and that interceptors (AKA middleware) would use. The generated API focuses more on usability while the channel API is a lower-level, more powerful, streaming-only design.
In Tonic, this split exists as well, but the proto generated code is just a specialization of the Tonic API using type generics. But this approach is not a requirement, and in fact, there are reasons to avoid it.
With gRPC-Rust, we are taking the opportunity to incorporate some lessons learned from our experience implementing gRPC in many other languages. One such idea we’d like to incorporate is to hide the details of gRPC itself from the generated API, and only expose protobuf messages and necessary primitives. As an example:
// *Not* something like this:
async fn call(ctx: grpc::Context, req: MyRequestMessage, options: grpc::CallOptions)
-> grpc::Response<MyResponseMessage>;
// More like this instead:
async fn call(req: MyRequestMessage) -> Result<MyResponseMessage, Status>;
// Example usage:
let response = client.call(request).await.expect("RPC should succeed!");
This is similar to gRPC-Java’s design, and it allows applications to focus on the business logic of the application: requests and responses. For other functionality that gRPC provides -- like accessing metadata, disabling retry, reading peer details, etc -- these are generally things that interceptors should be used for, instead.
Generated API
With that in mind, let’s dive into the generated (protobuf) API in isolation first, and understand why the nice, simple API in the example above isn’t good enough. (Then I’ll explain how we ultimately achieved exactly that anyway!)
Arenas
As mentioned earlier, arenas are important for making highly efficient RPC systems that can handle extremely high QPS (Queries Per Second), by grouping related memory operations temporally and spatially. To allow the application to control allocations instead of the library, we were looking for a unary API something like:
// Definition:
async fn call(req: &MyRequestMessageView, resp: &mut MyResponseMessageView) -> Status;
// Usage (with a hypothetical arena API):
let req = MyRequestMessage::new_on_arena(arena).set_id(3);
let res = MyResponseMessage::new_on_arena(arena);
client.call(&req, &mut res).expect("RPC should succeed!");
// res now contains the RPC's response
Can we Make it More Usable?
That API is straightforward, but we understand that many users would not want to
pre-declare the response message type manually before making every RPC. We can
accommodate both use cases by using the async builder
pattern.
An async builder is able to return owned messages via an IntoFuture
implementation, while also providing an alternative method to perform the call
using a pre-allocated response.
We can also improve the ergonomics of the request message parameter. Instead of
requiring a reference, we can accept an
AsView
protobuf type that allows either an owned message or a view to be passed into
the call. This is the final API we have implemented:
// Definition:
async fn call<Req>(req: Req) -> UnaryFutureBuilder<..>
where
Req: AsView<Proxied=MyRequestMsg>;
// Implements the simple usage:
impl IntoFuture for UnaryFutureBuilder {
type Output = Result<MyResponseMessage, StatusError>;
}
// Implements the advanced usage method:
impl UnaryFutureBuilder {
async fn with_response_message<Res>(self, res: &mut Res) -> Status
where
Res: AsMut<MutProxied = MyResponseMessage>;
}
// Usage:
fn main() {
let request = proto!(MyRequestMessage{ id: 3 });
// Simple Usage -- exactly what we wanted originally!:
let response = client.call(request).await.expect("RPC should succeed!");
// Arena usage (again with a hypothetical arena API):
let req = MyRequestMessage::new_on_arena(arena).set_id(3);
let res = MyResponseMessage::new_on_arena(arena);
client.call(&req).with_response_message(&mut res).await.expect("RPC should succeed!");
}
We adapted these same concepts to our streaming APIs as well. Below is an example of the bidirectional streaming API:
// Definition:
async fn begin_stream() -> BidiCallBuilder<..>
// Implements the same async builder pattern:
impl IntoFuture for BidiCallBuilder<..> {
type Output = (GrpcStreamingRequest, GrpcStreamingResponse);
}
// Usage:
fn main() {
// Simple Usage:
let (request_stream, response_stream) = client.begin_stream().await;
request_stream.send(proto!(MyRequestMessage{..}));
let response = response_stream.recv().await.expect("RPC should succeed!");
// Arena usage:
let res = MyResponseMessage::new_on_arena(arena);
let response = response_stream.recv_into(&mut res).await.expect("RPC should succeed!");
}
Please see the full documentation for more detailed usage examples.
Next Time
This covers the generated code API design process. In Part 2 I’ll go into further details on the channel APIs.