
4 posts tagged with "performance"


Latency at the edge with Rust/WebAssembly and Postgres: Part 2

· 9 min read
Ramnivas Laddad
Co-founder @ Exograph

In the previous post, we implemented a simple Cloudflare Worker in Rust/WebAssembly connecting to a Neon Postgres database and measured end-to-end latency. Without any pooling, we got a mean response time of 345ms.

The two issues we suspected for the high latency were:

Establishing connection to the database: The worker creates a new connection for each request. Because it is a secure connection, setting it up takes 7+ round trips. Not surprisingly, latency is high.

Executing the query: The query method in our code causes the Rust Postgres driver to make two round trips: to prepare the statement and to bind/execute the query. It also sends a one-way message to close the prepared statement.

In this part, we will deal with connection establishment time by introducing a pool. We will fork the driver to deal with multiple round trips (which incidentally also helps with connection pooling). We will also learn a few things about Postgres's query protocol.

Source code

The source code with all the examples explored in this post is available on GitHub. With it, you can perform measurements and experiments on your own.

Introducing connection pooling

If the problem is establishing a connection, the solution could be a pool. This way, we can reuse the existing connections instead of creating a new one for each request.

Application-level pooling

Could we use a pooling crate such as deadpool?

While that would be a good option in a typical Rust environment (and Exograph uses it), it is not an option in the Cloudflare Worker environment. A worker is considered stateless and should not maintain any state between requests. Since a pool is a stateful object (holding the connections), it can't be used in a worker. If you try to use it, you will get the following runtime error on every other request:

Error: Cannot perform I/O on behalf of a different request. I/O objects (such as streams, request/response bodies, and others) created in the context of one request handler cannot be accessed from a different request's handler. This is a limitation of Cloudflare Workers which allows us to improve overall performance. (I/O type: WritableStreamSink)

When the client makes the first request, the worker creates a pool and successfully executes the query. For the second request, the worker tries to reuse the pool, but it fails due to the error above, leading to the eviction of the worker by the Cloudflare runtime. For the third request, a fresh worker creates another pool, and the cycle continues.
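The alternating success/failure cycle described above can be sketched as a toy model. This is only an illustration of the observed behavior, not Cloudflare's actual runtime logic; the `Outcome` enum and `simulate` function are hypothetical names introduced here.

```rust
/// Outcome of a single request in this toy model (hypothetical type).
#[derive(Debug, PartialEq, Clone, Copy)]
enum Outcome {
    /// A fresh worker created its own pool and executed the query.
    Served,
    /// The worker reused a pool created by an earlier request; the runtime
    /// rejected the I/O and evicted the worker.
    CrossRequestIoError,
}

/// Simulates `n` sequential requests against a worker that keeps a pool
/// around between requests.
fn simulate(n: usize) -> Vec<Outcome> {
    // True when the current worker holds a pool created by an earlier request.
    let mut has_cached_pool = false;
    (0..n)
        .map(|_| {
            if has_cached_pool {
                // Reusing another request's I/O objects fails; the worker is
                // evicted, so the next request starts from scratch.
                has_cached_pool = false;
                Outcome::CrossRequestIoError
            } else {
                // A fresh worker creates the pool and serves the request.
                has_cached_pool = true;
                Outcome::Served
            }
        })
        .collect()
}
```

Calling `simulate(4)` yields served, error, served, error: the "every other request" pattern.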

The error is clear: we cannot use application-level pooling in this environment.

External pooling

Since application-level pooling won't work in this environment, could we try an external pool? Cloudflare provides Hyperdrive for connection pooling (and more, such as query caching). Let's try that.

#[event(fetch)]
async fn main(_req: Request, env: Env, _ctx: Context) -> Result<Response> {
    let hyperdrive = env.hyperdrive("todo-db-hyperdrive")?;

    let config = hyperdrive
        .connection_string()
        .parse::<tokio_postgres::Config>()
        .map_err(|e| worker::Error::RustError(format!("Failed to parse configuration: {:?}", e)))?;

    let host = hyperdrive.host();
    let port = hyperdrive.port();

    // Same as before
}

Besides how we get the host and port, the rest of the code (to connect to the database and execute the query) remains the same as the one in part 1.

You will need to create a Hyperdrive instance using the following command (replace the connection string with your own):

npx wrangler hyperdrive create todo-db-hyperdrive --caching-disabled --connection-string "postgres://..."

We disable query caching because, with it enabled, most requests would skip the database. The first request would hit the database because the cache is empty, but Hyperdrive would likely serve subsequent requests (which execute the same SQL query in our setup) from its cache. We want the measured latency to include the database call; with caching turned on, the comparison to the baseline would be apples-to-oranges.

note

For the real-world scenario, you may enable caching to balance database load and freshness of data.

Next, you will need to put the Hyperdrive information in wrangler.toml:

[[hyperdrive]]
binding = "todo-db-hyperdrive"
id = "<your-hyperdrive-id>"

Let's test this worker.

curl <worker-url>
INTERNAL SERVER ERROR

Hmm... that failed.

Fast moving ground

This is due to an issue with the current Hyperdrive implementation. The support for prepared statements is still new and (currently) works only with caching enabled. I have made the Cloudflare team aware of it. I think this will be fixed soon 🤞.

As things change, I will add updates here.

What's going on? Postgres has two kinds of query protocols:

  1. Simple query protocol: With this protocol, you must supply the SQL as a string and include any parameter values in the query (for example, SELECT * FROM todos WHERE id = 1). The driver makes one round trip to execute such a query.
  2. Extended query protocol: With this protocol, you may have the SQL query with placeholders for parameters (for example, SELECT * FROM todos WHERE id = $1), and its execution requires a preparation step. We will go into detail in the next section.

Let's explore both protocols.

Hyperdrive with the simple query protocol

To explore the simple query protocol, we will use the simple_query method. Since it doesn't allow specifying parameters, we inline them.

#[event(fetch)]
async fn main(_req: Request, env: Env, _ctx: Context) -> Result<Response> {
    // Hyperdrive setup as before

    let rows = client
        .simple_query("SELECT id, title, completed FROM todos WHERE completed = true")
        .await
        .map_err(|e| worker::Error::RustError(format!("Failed to query: {:?}", e)))?;

    ...
}

Does it work, and how does it perform?

oha -c 1 -n 100 <worker-url>
Slowest: 0.2871 secs
Fastest: 0.0476 secs
Average: 0.0633 secs

That's more like it! The mean response time is 63ms, a significant improvement over the previous 345ms.

Since the simple query protocol needed only one round trip, Hyperdrive was able to use an existing connection without too much additional logic, so it worked without any issues and performed well.

But... the simple query protocol forces us to use string interpolation to inline the parameters in the query, which is a big no-no in the world of databases due to the risk of SQL injection attacks. So let's not do that!
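To make the injection risk concrete, here is a minimal sketch of what goes wrong when a caller-supplied value is inlined into the SQL text. The `inline_title` helper and the `title` filter are hypothetical (the worker in this post filters on `completed`); the point is only that the value becomes part of the SQL, whereas a placeholder keeps it out of the query text.

```rust
/// Query text with a placeholder: the value travels separately from the SQL.
const PARAMETERIZED: &str = "SELECT id, title, completed FROM todos WHERE title = $1";

/// Builds a query by inlining a caller-supplied title.
/// Deliberately unsafe: shown only to illustrate the problem.
fn inline_title(title: &str) -> String {
    format!(
        "SELECT id, title, completed FROM todos WHERE title = '{}'",
        title
    )
}
```

With the input `x' OR '1'='1`, the inlined query ends in `WHERE title = 'x' OR '1'='1'`, a predicate that matches every row. The parameterized text never changes, whatever the value is.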

Hyperdrive with the extended query protocol

Let's go back to the extended query protocol and figure out why Hyperdrive might be struggling with it. As it happens, all external pooling services deal with the same issue; for example, only recently did pgBouncer start to support it.

When using the extended query protocol through query, the driver executes the following steps:

  1. Prepare: Sends a prepare request. This contains a name for the statement (for example, "s1") and a query with the placeholders for parameters to be provided later (for example, $1, $2 etc.). The server sends back the expected parameter types.
  2. Bind/execute: Sends the name of the prepared statement and the parameters serialized in the format appropriate for the types. The server looks up the prepared statement by name and executes it with the provided parameters. It sends back the rows.
  3. Close: Closes the prepared statement to free up the resources on the server. In tokio-postgres, this is a fire-and-forget operation (doesn't wait for a response).

When you add a connection pool, the "bind/execute" and "close" steps must run on the same connection that handled "prepare". This requires some bookkeeping and is a source of complexity.

What if we combine all three steps into a single network packet? This is what Exograph's fork of tokio-postgres (fork, PR) does. The client must specify the parameter values and their types (we no longer perform a round trip to discover parameter types). This way, the driver can serialize the parameters in the correct format in the same network packet.
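The difference between the two approaches can be sketched with a toy model that works at the level of the three steps listed above rather than actual Postgres wire messages. The `Step` and `Write` types and the helper functions are hypothetical names for illustration, not part of tokio-postgres or the fork.

```rust
/// The three logical steps described above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Step {
    Prepare,
    BindExecute,
    Close,
}

/// One network write: the steps it carries and whether the client waits
/// for the server's reply before continuing.
#[derive(Debug)]
struct Write {
    steps: Vec<Step>,
    awaits_reply: bool,
}

/// `query`: prepare and bind/execute each wait for a reply; close is
/// fire-and-forget.
fn query_writes() -> Vec<Write> {
    vec![
        Write { steps: vec![Step::Prepare], awaits_reply: true },
        Write { steps: vec![Step::BindExecute], awaits_reply: true },
        Write { steps: vec![Step::Close], awaits_reply: false },
    ]
}

/// `query_with_param_types`: all three steps travel in a single write.
fn query_with_param_types_writes() -> Vec<Write> {
    vec![Write {
        steps: vec![Step::Prepare, Step::BindExecute, Step::Close],
        awaits_reply: true,
    }]
}

/// A round trip is a write whose reply the client waits for.
fn round_trips(writes: &[Write]) -> usize {
    writes.iter().filter(|w| w.awaits_reply).count()
}

/// A pooler only needs to keep follow-up writes on the same connection
/// when the steps are spread over more than one write.
fn needs_connection_affinity(writes: &[Write]) -> bool {
    writes.len() > 1
}
```

In this model, `query` costs two round trips and needs connection affinity, while the combined variant costs one round trip and does not.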

#[event(fetch)]
async fn main(_req: Request, env: Env, _ctx: Context) -> Result<Response> {
    // Hyperdrive setup as before

    let rows: Vec<tokio_postgres::Row> = client
        .query_with_param_types(
            "SELECT id, title, completed FROM todos WHERE completed = $1",
            &[(&true, Type::BOOL)],
        )
        .await
        .map_err(|e| worker::Error::RustError(format!("query_with_param_types: {:?}", e)))?;

    ...
}

How does this perform?

oha -c 1 -n 100 <worker-url>
Slowest: 0.2883 secs
Fastest: 0.0466 secs
Average: 0.0620 secs

Nice! The mean response time is now 62ms, which matches the simple query protocol (63ms).

Summary

Let's summarize the mean response times for the various configurations:

Mean response time in milliseconds, by pooling (rows) and method (columns):

Pooling      query       simple_query   query_with_param_types
None         345         312            312
Hyperdrive   see above   63             62

With connection pooling through Hyperdrive, we have brought down the mean response time by a factor of about 5.5 (from 345ms to 62ms)!

Round trip cost

The 33ms improvement between query (345ms) and query_with_param_types (312ms) is likely due to saving the extra round trip for the "prepare" step, but this needs further investigation.
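The arithmetic behind these figures is simple enough to check. The constants below are the mean response times reported in the summary; the function names are made up for this sketch.

```rust
/// Mean response times in milliseconds, copied from the summary.
const QUERY_NO_POOL_MS: f64 = 345.0;
const PARAM_TYPES_NO_POOL_MS: f64 = 312.0;
const PARAM_TYPES_HYPERDRIVE_MS: f64 = 62.0;

/// Absolute saving in milliseconds.
fn saved_ms(before: f64, after: f64) -> f64 {
    before - after
}

/// Speedup factor: how many times faster `after` is than `before`.
fn speedup(before: f64, after: f64) -> f64 {
    before / after
}
```

Here `saved_ms(QUERY_NO_POOL_MS, PARAM_TYPES_NO_POOL_MS)` gives the 33ms difference, and `speedup(QUERY_NO_POOL_MS, PARAM_TYPES_HYPERDRIVE_MS)` comes out a little above 5.5.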

The source code is available on GitHub, so you can check this yourself. If you find any improvements or issues, please let me know.

So what should you use? Assuming that the issue with Hyperdrive and query method has been fixed:

  • If you don't want to use Hyperdrive, use the query_with_param_types method with the forked driver. It does the job in one round trip and gives you the best performance without any risk of SQL injection attacks.
  • If you want to use Hyperdrive:
    • If you frequently make the same queries, the query method will likely do better. Hyperdrive may cache the "prepare" step, making subsequent queries faster.
    • If you make a variety of queries, you can use the query_with_param_types method. Since you won't execute the same query frequently, Hyperdrive's prepared-statement caching is unlikely to help. Instead, this method's fewer round trips will be beneficial.
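The guidance above boils down to a small decision table. Here is one way to write it down; the `Method` enum and `recommend` function are hypothetical, and the recommendation still assumes the Hyperdrive issue with the query method has been fixed.

```rust
/// The two driver methods compared in this post.
#[derive(Debug, PartialEq)]
enum Method {
    Query,
    QueryWithParamTypes,
}

/// Encodes the recommendations from the list above.
fn recommend(uses_hyperdrive: bool, repeats_same_queries: bool) -> Method {
    match (uses_hyperdrive, repeats_same_queries) {
        // Hyperdrive may cache the "prepare" step for repeated queries.
        (true, true) => Method::Query,
        // Otherwise, fewer round trips win.
        _ => Method::QueryWithParamTypes,
    }
}
```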

Watch Exograph's blog for more explorations and insights as we ship its Cloudflare Worker support. You can reach us on Twitter or Discord. We would appreciate a star on GitHub!


Latency at the Edge with Rust/WebAssembly and Postgres: Part 1

· 6 min read
Ramnivas Laddad
Co-founder @ Exograph

We have been working on enabling Exograph on WebAssembly. Since we have implemented Exograph using Rust, it was natural to target WebAssembly. You can soon build secure, flexible, and efficient GraphQL backends using Exograph and run them at the edge.

During our journey towards WebAssembly support, we learned a few things to improve the latency of Rust-based programs targeting WebAssembly in Cloudflare Workers connecting to Postgres. This two-part series shares those learnings. In this first post, we will set up a simple Cloudflare Worker connecting to a Postgres database and get baseline latency measurements. In the next post, we will explore various ways to improve it.

Even though we experimented in the context of Exograph, the learnings should apply to anyone using WebAssembly in Cloudflare Workers (or other platforms that support WebAssembly) to connect to Postgres.

Second Part

Read Part 2, which improves latency by a factor of 6!

Rust Cloudflare Workers

Cloudflare Workers is a serverless platform that allows you to run code at the edge. The V8 engine forms the underpinning of the Cloudflare Worker platform. Since V8 supports JavaScript, it is the primary language for writing Cloudflare Workers. However, JavaScript running in V8 can load WebAssembly modules. Therefore, you can write some parts of a worker in other languages, such as Rust, compile it to WebAssembly, and load that from JavaScript.

Cloudflare Worker's Rust tooling enables writing workers entirely in Rust. Behind the scenes, the tooling compiles the Rust code to WebAssembly and loads it in a JavaScript host. The Rust code you write must be able to compile to the wasm32-unknown-unknown target. Consequently, it must follow the restrictions of WebAssembly. For example, it cannot access the filesystem or network directly. Instead, it must rely on the host-provided capabilities. Cloudflare provides such capabilities through the worker-rs crate. This crate, in turn, uses wasm-bindgen to expose a few JavaScript functions to the Rust code. For example, it allows opening network sockets. We will use this capability later to integrate Postgres.

Here is a minimal Cloudflare Worker in Rust:

use worker::*;

#[event(fetch)]
async fn main(_req: Request, _env: Env, _ctx: Context) -> Result<Response> {
    Response::ok("Hello, Cloudflare!")
}

To deploy, you can use the npx wrangler deploy command, which compiles the Rust code to WebAssembly, generates the necessary JavaScript code, and deploys it to the Cloudflare network.

Before moving on, let's measure the latency of this worker. We will use oha (Ohayou), an HTTP load generator written in Rust. We measure latency using a single concurrent client (-c 1) and one hundred requests (-n 100).

oha -c 1 -n 100 <worker-url>
...
Slowest: 0.2806 secs
Fastest: 0.0127 secs
Average: 0.0214 secs
...

It takes an average of 21ms to respond to a request. This is a good baseline to compare when we add Postgres to the mix.

Focusing on latency

We will focus on measuring the lower bound for the round-trip latency of a request to a worker that queries a Postgres database before responding. Here is our setup:

  • Use a Neon Postgres database with the following table and no rows to focus on network latency (and not database processing time).

    CREATE TABLE todos (
      id SERIAL PRIMARY KEY,
      title TEXT NOT NULL,
      completed BOOLEAN NOT NULL
    );
  • Implement a Cloudflare Worker that responds to GET by fetching all completed todos from the table and returning them as a JSON response (of course, since there is no data, the response will be an empty array, but the use of a predicate will allow us to explore some practical considerations where the queries will have a few parameters).

  • Place the worker, database, and client in the same region. While we can't control the worker placement, Cloudflare will place the worker close to either the client or the database (which we've put in the same region).

All right, let's get started!

Connecting to Postgres

Let's implement a simple worker that fetches all completed todos from the Neon Postgres database. We will use the tokio-postgres crate to connect to the database.

use std::str::FromStr;

use tokio_postgres::config::Host;
use worker::*;

// Note: PassthroughTls is not imported above; bring it into scope from
// wherever your project defines it.

#[event(fetch)]
async fn main(_req: Request, env: Env, _ctx: Context) -> Result<Response> {
    // Parse the connection settings from the DATABASE_URL secret.
    let config = tokio_postgres::config::Config::from_str(&env.secret("DATABASE_URL")?.to_string())
        .map_err(|e| worker::Error::RustError(format!("Failed to parse configuration: {:?}", e)))?;

    let host = match &config.get_hosts()[0] {
        Host::Tcp(host) => host,
        _ => {
            return Err(worker::Error::RustError("Could not parse host".to_string()));
        }
    };
    let port = config.get_ports()[0];

    // Open a socket through the Cloudflare runtime.
    let socket = Socket::builder()
        .secure_transport(SecureTransport::StartTls)
        .connect(host, port)?;

    // Layer the Postgres protocol over that socket.
    let (client, connection) = config
        .connect_raw(socket, PassthroughTls)
        .await
        .map_err(|e| worker::Error::RustError(format!("Failed to connect: {:?}", e)))?;

    // Drive the connection on the JavaScript event loop.
    wasm_bindgen_futures::spawn_local(async move {
        if let Err(error) = connection.await {
            console_log!("connection error: {:?}", error);
        }
    });

    let rows: Vec<tokio_postgres::Row> = client
        .query(
            "SELECT id, title, completed FROM todos WHERE completed = $1",
            &[&true],
        )
        .await
        .map_err(|e| worker::Error::RustError(format!("Failed to query: {:?}", e)))?;

    Response::ok(format!("{:?}", rows))
}

There are several notable things (especially if you are new to WebAssembly):

  • In a non-WebAssembly platform, you would get the client and connection directly using the database URL, which opens a socket to the database. For example, you would have done something like this:

    let (client, connection) = config.connect(tls).await?;

    However, that won't work in a WebAssembly environment since the module cannot open a connection to a server on its own (or, for that matter, access any other resource such as the filesystem). This is the core characteristic of WebAssembly: it is a sandboxed environment that cannot access resources unless explicitly provided (through functions that the host makes available to the WebAssembly module). Therefore, we use Socket::builder().connect() to create a socket (which, in turn, uses the TCP Socket API provided by the Cloudflare runtime). Then, we use config.connect_raw() to layer the Postgres protocol over that socket.

  • In a typical Rust program, we would have marked the main function with, for example, #[tokio::main] to bring in an async executor. However, here too, WebAssembly is different. Instead, we must rely on the host to provide the async runtime. In our case, the Cloudflare Worker platform provides the runtime (which uses JavaScript's event loop).

  • In a typical Rust program, we would have used tokio::spawn to spawn a task. However, in WebAssembly, we use wasm_bindgen_futures::spawn_local, which runs in the context of JavaScript's event loop.

We will deploy it using npx wrangler deploy. You will need to create a database and add the DATABASE_URL secret to the worker.

You can test the worker using curl:

curl https://<worker-url>

And measure the latency:

oha -c 1 -n 100 <worker-url>
...
Slowest: 0.8975 secs
Fastest: 0.2795 secs
Average: 0.3441 secs

So, our worker takes an average of 345ms to respond to a request. Depending on the use case, this can be between okay-ish and unacceptable. But why is it so slow?

We are dealing with two issues here:

  1. Establishing connection to the database: The worker creates a new connection for each request. Because it is a secure connection, setting it up takes 7+ round trips. Not surprisingly, latency is high.
  2. Executing the query: The query method in our code causes the Rust Postgres driver to make two round trips: to prepare the statement and to bind/execute the query. It also sends a one-way message to close the prepared statement.

How can we improve? We will address that in the next post by exploring connection pooling and possible changes to the driver. Stay tuned!


GraphQL needs new thinking

· 6 min read
Ramnivas Laddad
Co-founder @ Exograph

Backend developers often find implementing GraphQL backends challenging. I believe the issue is not GraphQL but the current implementation techniques.

This post is a response to Why, after 6 years, I'm over GraphQL by Matt Bessey. I recommend that you read the original blog before reading this. The sections in this blog mirror the original blog, so you can follow them side-by-side.

Before starting Exograph, we implemented several GraphQL backends and faced many of the same issues mentioned in the blog. Exograph is our attempt to address these issues and provide a better way to implement GraphQL. Let's dive into the issues raised in the original blog and how Exograph addresses them.

Attack surface

The expressive nature of GraphQL queries makes it attractive to frontend developers. However, this exposes a large attack surface, where a client can craft a query that can overwhelm the server.

Authorization

❗ Problem: GraphQL authorization is a nightmare to implement.

✅ Solution: A declarative way to define entity or field-level authorization rules.

While authorization is an issue with any API, it is particularly challenging with GraphQL. A typical REST API implementation considers access to only one resource at a time. In contrast, GraphQL implementations need to process queries that can ask for multiple resources. Processing such a query while enforcing authorization rules for each query field is a backend developer's nightmare.

Exograph provides a declarative way to define authorization rules at the entity or field level. Exograph's runtime will enforce them. And by default, Exograph assumes that no one can access an entity unless explicitly allowed, forcing an explicit decision on the developer.

For example, the following rules ensure that a non-admin user can query any department but only published products. An admin user, in contrast, can query or mutate any product or department. With this structure in place, no matter how the query is structured (get products along with their department or departments along with their products), the client can only access the resources they are authorized to access.

context AuthContext {
@jwt role: String
}

@postgres
module EcommerceDatabase {
@access(query=self.published || AuthContext.role=="admin",
mutation=AuthContext.role=="admin")
type Product {
// ...
published: Boolean
department: Department
}

@access(query=true, mutation=AuthContext.role=="admin")
type Department {
// ...
products: Set<Product>?
}
}

In a way, Exograph allows a REST-like thought process, where you consider authorization of each resource in isolation. Exograph's query engine ensures that no matter how the query is structured, the client will only get access to the authorized resources.
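To see what the `@access` expressions mean in plain terms, here is a toy Rust model of the same rules. It is only an illustration of the semantics, not how Exograph evaluates rules; the structs and function names are hypothetical, and only `published` and `role` come from the model above.

```rust
/// Mirrors `context AuthContext { @jwt role: String }`.
struct AuthContext {
    role: String,
}

/// A product with just the fields needed for the access check.
struct Product {
    id: u32,
    published: bool,
}

/// `query = self.published || AuthContext.role == "admin"`
fn can_query_product(product: &Product, auth: &AuthContext) -> bool {
    product.published || auth.role == "admin"
}

/// `mutation = AuthContext.role == "admin"`
fn can_mutate_product(auth: &AuthContext) -> bool {
    auth.role == "admin"
}

/// Applies the query rule to every product, however the products were
/// reached (directly or through a department).
fn visible_products<'a>(products: &'a [Product], auth: &AuthContext) -> Vec<&'a Product> {
    products
        .iter()
        .filter(|product| can_query_product(product, auth))
        .collect()
}
```

A non-admin caller sees only published products, while an admin sees all of them and may also mutate them.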

Rate limiting

❗ Problem: GraphQL queries can ask for multiple resources in a single query, overwhelming the backend.

✅ Solution: Trusted documents, query depth limiting, and rate limiting.

Since GraphQL can ask for multiple resources in a single query, it is easy to overwhelm the backend. For example, a client can issue a single deeply nested query, which will cause many trips to the database.

Exograph addresses this issue in multiple ways: trusted documents, query depth limiting, and rate limiting.
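As an illustration of the depth-limiting idea in general, the sketch below measures how deeply a query's selections are nested and rejects anything beyond a limit. It is a generic toy and not Exograph's implementation; the `Field` type, the helpers, and the limit are all assumptions made for this example.

```rust
/// A selected field and its nested selections (hypothetical type).
struct Field {
    name: String,
    children: Vec<Field>,
}

/// A field with no nested selections.
fn leaf(name: &str) -> Field {
    Field { name: name.to_string(), children: Vec::new() }
}

/// A field with nested selections.
fn node(name: &str, children: Vec<Field>) -> Field {
    Field { name: name.to_string(), children }
}

/// Depth of a selection: 1 for a leaf, plus the deepest child otherwise.
fn depth(field: &Field) -> usize {
    1 + field.children.iter().map(depth).max().unwrap_or(0)
}

/// Rejects a query whose nesting exceeds `max_depth`.
fn check_depth(root: &Field, max_depth: usize) -> Result<(), String> {
    let actual = depth(root);
    if actual <= max_depth {
        Ok(())
    } else {
        Err(format!(
            "query `{}` has depth {}, exceeding the limit of {}",
            root.name, actual, max_depth
        ))
    }
}
```

A query such as products, department, products, id has depth 4 and would be rejected under a limit of 3.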

Query Parsing

❗ Problem: GraphQL queries can overwhelm the parser.

✅ Solution: The same solution as above.

GraphQL queries express the client's intent at a finer level than REST APIs, requiring the backend to parse the query, which makes the parser itself an attack surface.

Exograph's solution for rate limiting also addresses this issue. For example, by allowing only trusted documents, you can ensure clients cannot make arbitrary queries and overwhelm the backend.

Performance

In a traditional GraphQL implementation, the backend developer will write a resolver for each entity to fetch the data, which can lead to the N+1 problem, where the backend makes multiple queries to fetch the data. Balancing authorization and the N+1 problem can be tricky.

Data fetching and the N+1 problem

❗ Problem: GraphQL queries can lead to the N+1 problem.

✅ Solution: Defer data fetching to a query planner.

This is a classic issue with GraphQL implementation, where a client can ask for multiple resources in a single query. A common solution is data loader, which batches the queries and fetches the data in fewer round trips but requires careful implementation (especially when considering authorization).

Exograph relieves this burden through its query engine. Exograph's query planner maps a GraphQL query to (typically) a single SQL query.
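The N+1 effect is easy to reproduce with a toy in-memory "database" that counts the queries it receives. Everything here (the `Db` type, the sample rows, the function names) is hypothetical and only illustrates the counting; it says nothing about how Exograph plans queries internally.

```rust
use std::cell::Cell;

/// Toy in-memory "database" that counts how many queries it receives.
struct Db {
    queries: Cell<usize>,
    /// (department id, product title) pairs.
    products: Vec<(u32, &'static str)>,
}

impl Db {
    fn new() -> Self {
        Db {
            queries: Cell::new(0),
            products: vec![(1, "pen"), (1, "ink"), (2, "desk"), (3, "lamp")],
        }
    }

    fn count_query(&self) {
        self.queries.set(self.queries.get() + 1);
    }

    /// Product titles of one department (helper; does not count as a query).
    fn titles_for(&self, department: u32) -> Vec<&'static str> {
        self.products
            .iter()
            .filter(|(d, _)| *d == department)
            .map(|(_, title)| *title)
            .collect()
    }

    /// One query: all department ids.
    fn departments(&self) -> Vec<u32> {
        self.count_query();
        vec![1, 2, 3]
    }

    /// One query per call: the products of a single department.
    fn products_of(&self, department: u32) -> Vec<&'static str> {
        self.count_query();
        self.titles_for(department)
    }

    /// One query: departments together with their products.
    fn departments_with_products(&self) -> Vec<(u32, Vec<&'static str>)> {
        self.count_query();
        vec![1, 2, 3]
            .into_iter()
            .map(|d| (d, self.titles_for(d)))
            .collect()
    }
}

/// Resolver-per-field style: one query for the list, then one per item.
fn per_field_resolvers(db: &Db) -> Vec<(u32, Vec<&'static str>)> {
    db.departments()
        .into_iter()
        .map(|d| (d, db.products_of(d)))
        .collect()
}
```

With three departments, `per_field_resolvers` issues four queries (1 + N), while `departments_with_products` returns the same data with one.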

Authorization and the N+1 problem

❗ Problem: Balancing authorization and the N+1 problem can be tricky.

✅ Solution: Make authorization rules a part of the query planning process.

Even if you were to use a data loader to solve the N+1 problem, once you also consider authorization, you end up fighting modularity (see the next point) against performance.

Exograph's query engine incorporates authorization rules while planning the query, so the same query planning process solves this problem.

Coupling

❗ Problem: GraphQL makes it quite hard to keep code modular.

✅ Solution: Reduce the amount of code by leaning on a declarative approach.

GraphQL makes it quite hard to keep code modular, especially when dealing with authorization and performance, which leads to code-tangling and code-scattering to enforce authorization rules.

First, in Exograph, you write very little code. You define your data model and authorization rules, and Exograph plans and executes queries for that model.

Second, Exograph's declarative nature allows you to express authorization rules alongside the data model. Exograph's query engine, and not your code, enforces these rules.

Complexity

❗ Problem: GraphQL makes balancing authorization, performance, and modularity tricky.

✅ Solution: Simplify implementing GraphQL through a declarative way to define data models and authorization rules.

GraphQL can be complex to implement, balancing authorization, performance, and modularity. Left unchecked, this can lead to a complex and hard-to-maintain codebase with possible security issues.

Exograph simplifies implementing GraphQL through a declarative way to define data models and authorization rules. Exograph's query engine takes care of planning and executing queries efficiently. It also provides tools for dealing with various aspects, from quick development feedback to production concerns.

Conclusion

We believe that with the right level of abstraction, you can get secure, performant, and modular backend implementation. With Exograph, we provide a declarative approach to get all the benefits of GraphQL without the downsides.

What do you think? Reach out to us on Twitter or Discord with your feedback.


Exograph supports trusted documents

· 7 min read
Ramnivas Laddad
Co-founder @ Exograph

Exograph 0.7 introduces support for trusted documents, also known as "persisted documents" or "persisted queries". This new feature makes Exograph even more secure by allowing only specific queries and mutations, thus shrinking the API surface. It also offers other benefits, such as improving performance by reducing the payload size.

Trusted documents are a pre-arranged way for clients to convey "executable documents" (roughly queries and mutations, but see the GraphQL spec and Exograph's documentation for a distinction) they would be using. The server then allows executing only those documents.

tip

For a good introduction to trusted documents, including the reason to prefer the term "trusted documents", please see the GraphQL Trusted Documents. Exograph's trusted documents support follows the same general principles outlined there. We hope that someday a GraphQL specification will standardize this concept.

This blog post explains what trusted documents are and how to use them in Exograph.

Conceptual Overview

Why Trusted Documents?

Consider the todo application we developed in an earlier blog (source code). This single-screen application allows creating, updating, and deleting todos besides viewing by their completion status. The application only needs the following queries and mutations (the application uses fragments, but we will keep it simple here).

The client application needs two queries: one to get todos by their completion status and one to get all todos:

Query to get todos by completion status
query getTodosByCompletion($completed: Boolean!) {
  todos(where: { completed: { eq: $completed } }, orderBy: { id: ASC }) {
    id
    completed
    title
  }
}
Query to get all todos
query getTodos {
  todos(orderBy: { id: ASC }) {
    id
    completed
    title
  }
}

The application also needs mutations to create, update, and delete a todo:

Mutation to create a todo
mutation createTodo($completed: Boolean!, $title: String!) {
  createTodo(data: { title: $title, completed: $completed }) {
    id
  }
}
Mutation to update a todo
mutation updateTodo($id: Int!, $completed: Boolean!, $title: String!) {
  updateTodo(id: $id, data: { title: $title, completed: $completed }) {
    id
  }
}
Mutation to delete a todo
mutation deleteTodo($id: Int!) {
  deleteTodo(id: $id) {
    id
  }
}

Contrast this with what the server offers. For example, the server provides:

  • Querying a todo by its ID, but the client application doesn't need it, since it doesn't display a single todo.
  • Flexible filtering, but the client application needs filtering only by completion status.
  • Sorting by other fields or a combination, but the client application only sorts by the ID in ascending order.
  • Creating, updating, and deleting in bulk, but the client application only mutates one todo at a time.

While Exograph (and other GraphQL servers) offer flexible field selection for all queries and mutations, each client query or mutation only needs a specific set of fields.

What are Trusted Documents?

Because the client app requires only specific queries and mutations, it can inform the server using trusted documents. These documents are files with all the executable queries and mutations the client app needs. For instance, in the case of the todo app, a tool would create a file like this:

{
"2a...": "query getTodosByCompletion($completed: Boolean!) ...",
"3b...": "query getTodos ...",
"4c...": "mutation createTodo($completed: Boolean!, $title: String!) ...",
"5d...": "mutation updateTodo($id: Int!, $completed: Boolean!, $title: String!) ...",
"6e...": "mutation deleteTodo($id: Int!) ..."
}

Here, the key is a hash (SHA256, for example) of the query or mutation, and the value is the query or mutation itself.
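The server-side idea can be sketched as a lookup table from key to document: a request that carries a known key is resolved to its document, and anything else is rejected. This is a minimal illustration, not Exograph's implementation. In particular, `document_key` uses the standard library's `DefaultHasher` purely as a stand-in for SHA256 so that the example needs no external crates; the type and function names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Computes a key for a document.
/// Assumption: DefaultHasher stands in for SHA256 here; a real manifest
/// would use a cryptographic hash.
fn document_key(document: &str) -> String {
    let mut hasher = DefaultHasher::new();
    document.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// The set of documents the server is willing to execute, indexed by key.
struct TrustedDocuments {
    by_key: HashMap<String, String>,
}

impl TrustedDocuments {
    /// Builds the lookup table from the documents the client declared.
    fn from_documents(documents: &[&str]) -> Self {
        let by_key = documents
            .iter()
            .map(|document| (document_key(document), document.to_string()))
            .collect();
        TrustedDocuments { by_key }
    }

    /// Returns the document for a key, or None if the key is not trusted.
    fn resolve(&self, key: &str) -> Option<&str> {
        self.by_key.get(key).map(String::as_str)
    }
}
```

The client then only needs to send the key; the server resolves it to the full document or refuses the request.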

Benefits of using Trusted Documents

Using trusted documents reduces the effective API surface area and offers several benefits, such as:

  • Preventing attackers from executing arbitrary queries and mutations.
  • Reducing the bandwidth by sending only the hash of the query instead of the query itself.
  • Providing a focus for testing the actual queries and mutations used by the client application.
  • Allowing server-side optimizations such as caching query parsing and pre-planning query execution.

You may wonder if there is a different way to achieve the same benefits. Specifically, couldn't the server provide a narrower set of queries and mutations? For example, could it not offer the API "get a todo by its ID"? While feasible, this approach undermines one of GraphQL's strengths: enabling client-side development without continuous server team involvement. For example, the client application may want to offer a route such as todos/:id to show a particular todo. It will then need a todo by its ID. Without this query, the client must request the server team to add it, consuming time and resources. Trusted documents streamline this process. They allow the client to effortlessly express requirements, like fetching todos by ID. Thus, the server should maintain a reasonably flexible API while clients communicate their needs through trusted documents.

Using Trusted Documents in Exograph

Like any other feature in Exograph, using trusted documents is straightforward. All you need to do is put the trusted documents in the trusted-documents directory! Exograph will automatically use them.

Workflow

A typical workflow to use trusted documents in Exograph is as follows:

  1. Generate trusted documents: In this (typically automated) step, a tool such as graphql-codegen or generate-persisted-query-manifest examines the client application and generates the trusted documents.
  2. Convey to the server: Exograph expects trusted documents, in the format produced by either tool, in the trusted-documents directory. The server will now accept only the trusted documents.
  3. Set up the client: The client may now send the hashes of the executable documents instead of the full text. To do this, the client can use the persistedExchange with URQL or @apollo/persisted-query-lists or some custom code.

For a concrete implementation of this workflow, please see the updated todo application: Apollo version and URQL version.

Organizing Trusted Documents

Typically, you will have more than one client, each with a different set of trusted documents, connected to an Exograph server. Exograph supports this by allowing them to be placed anywhere in the trusted-documents directory. For example, you could support multiple clients and their versions by placing trusted documents in the following structure:

todo-app
├── src
│   └── ...
└── trusted-documents
    ├── web
    │   ├── user-facing.json
    │   └── admin-facing.json
    ├── ios
    │   ├── core.json
    │   └── admin.json
    └── android
        ├── version1.json
        └── version2.json

The server will allow trusted documents in any of the files.

Using Trusted Documents in Development

If you have added the trusted-documents directory, Exograph enforces trusted documents in production mode without any way to opt out, extending its general secure-by-default philosophy.

In development mode, however, where exploring API is common, Exograph makes a few exceptions. When you start the server in development mode (either exo dev or exo yolo), it allows untrusted documents under the following conditions:

  • If the GraphQL playground executes an operation.
  • If the development server is started with the --enforce-trusted-documents=false option.

In either case, it will implicitly trust any introspection query, thus allowing GraphQL Playground and tools such as GraphQL Code Generator to work.

Try it out

You can see this support in action through the updated todo application: Apollo version and URQL version.

For more detailed documentation, including how to create trusted documents and set up the client, please see the trusted documents documentation.

Let us know what you think of trusted documents support in Exograph. You can reach us on Twitter or Discord.
