Latency at the Edge with Rust/WebAssembly and Postgres: Part 1
We have been working on enabling Exograph on WebAssembly. Since we have implemented Exograph using Rust, it was natural to target WebAssembly. You can soon build secure, flexible, and efficient GraphQL backends using Exograph and run them at the edge.
During our journey towards WebAssembly support, we learned a few things to improve the latency of Rust-based programs targeting WebAssembly in Cloudflare Workers connecting to Postgres. This two-part series shares those learnings. In this first post, we will set up a simple Cloudflare Worker connecting to a Postgres database and get baseline latency measurements. In the next post, we will explore various ways to improve it.
Even though we experimented in the context of Exograph, the learnings should apply to anyone using WebAssembly in Cloudflare Workers (or other platforms that support WebAssembly) to connect to Postgres.
Read Part 2, which improves latency by a factor of 6!
Rust Cloudflare Workers
Cloudflare Workers is a serverless platform that allows you to run code at the edge. The V8 engine forms the underpinning of the Cloudflare Workers platform. Since V8 is a JavaScript engine, JavaScript is the primary language for writing Cloudflare Workers. However, JavaScript running in V8 can load WebAssembly modules. Therefore, you can write some parts of a worker in other languages, such as Rust, compile them to WebAssembly, and load the result from JavaScript.
Cloudflare Workers' Rust tooling enables writing workers entirely in Rust. Behind the scenes, the tooling compiles the Rust code to WebAssembly and loads it in a JavaScript host. The Rust code you write must be able to compile to the wasm32-unknown-unknown target. Consequently, it must follow the restrictions of WebAssembly. For example, it cannot access the filesystem or network directly. Instead, it must rely on host-provided capabilities. Cloudflare provides such capabilities through the worker-rs crate. This crate, in turn, uses wasm-bindgen to expose a few JavaScript functions to the Rust code. For example, it allows opening network sockets. We will use this capability later to integrate Postgres.
Here is a minimal Cloudflare Worker in Rust:
use worker::*;

#[event(fetch)]
async fn main(_req: Request, _env: Env, _ctx: Context) -> Result<Response> {
    Ok(Response::ok("Hello, Cloudflare!")?)
}
To deploy, you can use the npx wrangler deploy command, which compiles the Rust code to WebAssembly, generates the necessary JavaScript code, and deploys it to the Cloudflare network.
Before moving on, let's measure the latency of this worker. We will use oha (Ohayou), an HTTP load generator written in Rust. We measure latency using a single concurrent client (-c 1) and one hundred requests (-n 100).
oha -c 1 -n 100 <worker-url>
...
Slowest: 0.2806 secs
Fastest: 0.0127 secs
Average: 0.0214 secs
...
It takes an average of 21ms to respond to a request. This is a good baseline to compare when we add Postgres to the mix.
Focusing on latency
We will focus on measuring the lower bound for the round-trip latency of a request to a worker that queries a Postgres database before responding. Here is our setup:
- Use a Neon Postgres database with the following table and no rows, so that we measure network latency rather than database processing time.
CREATE TABLE todos (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    completed BOOLEAN NOT NULL
);
- Implement a Cloudflare Worker that responds to GET by fetching all completed todos from the table and returning them as a JSON response. Since there is no data, the response will be an empty array, but using a predicate lets us explore practical considerations that arise when queries take a few parameters.
- Place the worker, database, and client in the same region. While we can't control the worker placement, Cloudflare will place the worker close to either the client or the database (which we've put in the same region).
All right, let's get started!
Connecting to Postgres
Let's implement a simple worker that fetches all completed todos from the Neon Postgres database. We will use the tokio-postgres crate to connect to the database.
use std::str::FromStr;

use tokio_postgres::config::Host;
// PassthroughTls ships with worker-rs when its tokio-postgres feature is enabled.
use worker::{postgres_tls::PassthroughTls, *};

#[event(fetch)]
async fn main(_req: Request, env: Env, _ctx: Context) -> Result<Response> {
    // Parse the connection string stored as a worker secret.
    let config = tokio_postgres::config::Config::from_str(&env.secret("DATABASE_URL")?.to_string())
        .map_err(|e| worker::Error::RustError(format!("Failed to parse configuration: {:?}", e)))?;

    // Extract the host and port so that we can open the socket ourselves.
    let host = match &config.get_hosts()[0] {
        Host::Tcp(host) => host,
        _ => {
            return Err(worker::Error::RustError("Could not parse host".to_string()));
        }
    };
    let port = config.get_ports()[0];

    // Open a TCP socket through the Cloudflare runtime; TLS is negotiated via StartTLS.
    let socket = Socket::builder()
        .secure_transport(SecureTransport::StartTls)
        .connect(host, port)?;

    // Run the Postgres protocol over the socket we just opened.
    let (client, connection) = config
        .connect_raw(socket, PassthroughTls)
        .await
        .map_err(|e| worker::Error::RustError(format!("Failed to connect: {:?}", e)))?;

    // Drive the connection in the background on the JavaScript event loop.
    wasm_bindgen_futures::spawn_local(async move {
        if let Err(error) = connection.await {
            console_log!("connection error: {:?}", error);
        }
    });

    let rows: Vec<tokio_postgres::Row> = client
        .query(
            "SELECT id, title, completed FROM todos WHERE completed = $1",
            &[&true],
        )
        .await
        .map_err(|e| worker::Error::RustError(format!("Failed to query: {:?}", e)))?;

    Ok(Response::ok(format!("{:?}", rows))?)
}
There are several notable things (especially if you are new to WebAssembly):
- In a non-WebAssembly platform, you would get the client and connection directly using the database URL, which opens a socket to the database. For example, you would have done something like this:
let (client, connection) = config.connect(tls).await?;
However, that won't work in a WebAssembly environment since there is no way to connect to a server (or, for that matter, to access any other resource such as the filesystem). This is the core characteristic of WebAssembly: it is a sandboxed environment that cannot access resources unless they are explicitly provided (through functions exported to the WebAssembly module). Therefore, we use Socket::builder().connect() to create a socket (which, in turn, uses the TCP Socket API provided by the Cloudflare runtime). Then, we use config.connect_raw() to lay the Postgres protocol over that socket.
- In a typical Rust program, we would have marked the main function with, for example, #[tokio::main] to bring in an async executor. However, here too, WebAssembly is different. Instead, we must rely on the host to provide the async runtime. In our case, the Cloudflare Worker platform provides a runtime (which uses JavaScript's event loop).
- In a typical Rust program, we would have used tokio::spawn to spawn a task. However, in WebAssembly, we use wasm_bindgen_futures::spawn_local, which runs in the context of JavaScript's event loop.
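To make the first point more concrete, here is a minimal sketch of what "laying a protocol over a socket" means: the protocol code only needs something it can write bytes to, and the host decides what that something is. The sketch writes a Postgres protocol 3.0 startup message to any std::io::Write sink. The function name write_startup_message and the user and database values are hypothetical, and the synchronous Write trait is only a stand-in for the async socket used in the worker; this is not how tokio-postgres is implemented internally.

```rust
use std::io::{self, Write};

/// Writes a Postgres protocol 3.0 StartupMessage to any byte sink.
/// (Hypothetical helper for illustration; the real driver does this for you.)
fn write_startup_message<W: Write>(
    transport: &mut W,
    user: &str,
    database: &str,
) -> io::Result<()> {
    let mut body: Vec<u8> = Vec::new();

    // Protocol version 3.0 is encoded as (3 << 16) | 0 = 196608.
    body.extend_from_slice(&196608i32.to_be_bytes());

    // Parameters are sent as NUL-terminated name/value pairs.
    for &(key, value) in &[("user", user), ("database", database)] {
        body.extend_from_slice(key.as_bytes());
        body.push(0);
        body.extend_from_slice(value.as_bytes());
        body.push(0);
    }

    // A single NUL byte terminates the parameter list.
    body.push(0);

    // The length prefix counts itself (4 bytes) plus the body.
    let len = (body.len() + 4) as i32;
    transport.write_all(&len.to_be_bytes())?;
    transport.write_all(&body)?;
    Ok(())
}

fn main() -> io::Result<()> {
    // An in-memory buffer plays the role of the host-provided socket.
    let mut transport: Vec<u8> = Vec::new();
    write_startup_message(&mut transport, "app", "todos")?;
    println!("wrote {} bytes: {:?}", transport.len(), transport);
    Ok(())
}
```

Because the function is generic over the transport, the same protocol code could run against a buffer in a test, a TcpStream on a server, or a host-provided socket in a sandbox. That separation is what connect_raw relies on.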
We will deploy it using npx wrangler deploy. You will need to create a database and add the DATABASE_URL secret to the worker.
You can test the worker using curl:
curl https://<worker-url>
And measure the latency:
oha -c 1 -n 100 <worker-url>
...
Slowest: 0.8975 secs
Fastest: 0.2795 secs
Average: 0.3441 secs
So, our worker takes an average of 344ms to respond to a request. Depending on the use case, this can be between okay-ish and unacceptable. But why is it so slow?
We are dealing with two issues here:
- Establishing the connection to the database: The worker creates a new connection for each request. Given that it is a secure connection, it takes 7+ round trips. Not surprisingly, latency is high.
- Executing the query: The query method in our code causes the Rust Postgres driver to make two round trips: one to prepare the statement and one to bind/execute the query. It also sends a one-way message to close the prepared statement.
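As a back-of-the-envelope check, the two issues above can be folded into a rough model where each round trip costs one network round-trip time (RTT) on top of a fixed overhead. The sketch below is only that, a sketch: it assumes the 21ms baseline approximates the fixed overhead, that every round trip costs the same, and that the connection takes exactly 7 round trips (the lower bound of "7+"). The function names are hypothetical, and the figures it prints are estimates, not measurements.

```rust
/// Rough model: total latency = fixed overhead + RTT * number of round trips.
fn estimate_latency_ms(
    overhead_ms: f64,
    rtt_ms: f64,
    connect_round_trips: u32,
    query_round_trips: u32,
) -> f64 {
    overhead_ms + rtt_ms * f64::from(connect_round_trips + query_round_trips)
}

/// Inverse of the model: the per-round-trip cost implied by an observed latency.
fn implied_rtt_ms(observed_ms: f64, overhead_ms: f64, total_round_trips: u32) -> f64 {
    (observed_ms - overhead_ms) / f64::from(total_round_trips)
}

fn main() {
    let baseline_ms = 21.0; // average of the "Hello, Cloudflare!" worker
    let observed_ms = 344.0; // average of the Postgres worker
    let connect_round_trips = 7; // assumption: lower bound of "7+"
    let query_round_trips = 2; // prepare, then bind/execute

    let total = connect_round_trips + query_round_trips;
    let rtt_ms = implied_rtt_ms(observed_ms, baseline_ms, total);
    println!("implied cost per round trip: about {:.1} ms", rtt_ms);

    // Feeding the implied RTT back into the model reproduces the observed average.
    let modeled = estimate_latency_ms(baseline_ms, rtt_ms, connect_round_trips, query_round_trips);
    println!("modeled latency: about {:.1} ms", modeled);
}
```

Since 7 is a lower bound on the connection round trips, the implied per-round-trip cost is an upper bound. The model is still useful for reasoning about where the time goes: connection setup accounts for most of the round trips, and the query itself for only two.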
How can we improve? We will address that in the next post by exploring connection pooling and possible changes to the driver. Stay tuned!