
Calling external services via HTTP

The momento-functions-http crate lets your Function reach out to anything that speaks HTTP — OpenAI, Turbopuffer, your own backend, third-party APIs. The connection pool lives on the host, so you rarely pay connection handshake latency on the hot path.

Add the dependency

[dependencies]
momento-functions-bytes = { version = "0" }
momento-functions-guest-web = { version = "0" }
momento-functions-http = { version = "0" }
# the typed example below also derives and builds JSON:
serde = { version = "1", features = ["derive"] }
serde_json = "1"

A typed call

The Request builder takes a URL and method, plus headers and an optional body:

use momento_functions_bytes::encoding::{Extract, Json};
use momento_functions_guest_web::{WebResult, invoke};
use momento_functions_http::{Request as HttpRequest, invoke as http_invoke};
use serde::{Deserialize, Serialize};

#[derive(Deserialize, Debug)]
struct EmbeddingResponse {
    data: Vec<EmbeddingData>,
}

#[derive(Deserialize, Serialize, Debug)]
struct EmbeddingData {
    embedding: Vec<f32>,
    index: usize,
}

invoke!(embed);
fn embed(Json(documents): Json<Vec<String>>) -> WebResult<Json<Vec<EmbeddingData>>> {
    let openai_api_key = std::env::var("OPENAI_API_KEY").unwrap_or_default();

    // Build and send the upstream request in one expression.
    let response = http_invoke(
        HttpRequest::new("https://api.openai.com/v1/embeddings", "POST")
            .with_header("authorization", format!("Bearer {openai_api_key}"))
            .with_header("content-type", "application/json")
            .with_body(
                serde_json::json!({
                    "model": "text-embedding-3-small",
                    "encoding_format": "float",
                    "input": documents,
                })
                .to_string(),
            ),
    )?;

    // Parse the body, then return the embeddings in input order.
    let Json(EmbeddingResponse { mut data }) = Json::<EmbeddingResponse>::extract(response.body)?;
    data.sort_by_key(|d| d.index);
    Ok(Json(data))
}

Two things to notice:

  • http_invoke returns a response carrying the status, response headers, and body. The body is a Data handle to a host-managed buffer, not bytes copied into your sandbox. A status-checking sketch follows this list.
  • Json::<T>::extract(response.body) parses the body. If you only want to relay the bytes to the caller, leave it as Data and skip parsing.
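
For example, you can branch on the status before touching the body. A minimal sketch, with a placeholder endpoint URL; the numeric status cast mirrors the proxy example below:

use momento_functions_bytes::Data;
use momento_functions_guest_web::{WebResponse, WebResult, invoke};
use momento_functions_http::{Request as HttpRequest, invoke as http_invoke};

invoke!(guarded);
fn guarded(_payload: Data) -> WebResult<WebResponse> {
    // Placeholder endpoint; substitute your upstream.
    let response = http_invoke(HttpRequest::new("https://example.com/data", "GET"))?;

    if response.status as u16 >= 400 {
        // Upstream failed: surface it as a gateway error, relaying the
        // upstream body so callers can see the error payload.
        return Ok(WebResponse::new()
            .with_status(502)
            .with_body(response.body)?);
    }

    Ok(WebResponse::new()
        .with_status(200)
        .with_body(response.body)?)
}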

Pass-through proxy

If you don't need to inspect the upstream body, hold it as Data and forward it straight back:

use momento_functions_bytes::Data;
use momento_functions_guest_web::{WebResponse, WebResult, invoke};
use momento_functions_http::{Request as HttpRequest, invoke as http_invoke};

invoke!(proxy);
fn proxy(payload: Data) -> WebResult<WebResponse> {
    let upstream = http_invoke(
        HttpRequest::new("https://example.com/echo", "POST").with_body(payload),
    )?;
    Ok(WebResponse::new()
        .with_status(upstream.status as u16)
        .with_body(upstream.body)?)
}

This is the foundation for the proxying pattern — the request and response bodies never enter your sandbox.

Hot connection pool

The host pools connections across invocations, so your latency from Functions during a traffic burst is about the same as during steady state. Your dependencies, however, may not hold up as evenly.

You don't manage clients; there is no Client::new() to construct for HTTP in Functions. http_invoke is functional: it takes a Request and returns a Response.
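
As a quick sketch (the URLs are placeholders), two upstream calls need no client setup or shared state between them:

// No client to build, configure, or share; each call goes straight
// through the host's pooled connections. URLs are placeholders.
let profile = http_invoke(HttpRequest::new("https://api.example.com/profile", "GET"))?;
let orders = http_invoke(HttpRequest::new("https://api.example.com/orders", "GET"))?;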

Caching upstream calls

External APIs are usually the slow part of a Function. Pair http_invoke with momento-functions-cache to memoize:

use std::time::Duration;

use momento_functions_bytes::Data;
use momento_functions_cache as cache;
use momento_functions_http::{Request as HttpRequest, invoke as http_invoke};

// prompt_hash is a stable hash of the input; see the sketch below.
let key = format!("openai:embed:{prompt_hash}");

// Cache hit: skip the upstream call entirely.
if let Some(bytes) = cache::get::<Data>(key.as_str())? {
    return Ok(bytes.into_bytes());
}

// Cache miss: call upstream (build the full request as in the typed example
// above), then memoize the raw response bytes for an hour.
let response = http_invoke(HttpRequest::new("https://api.openai.com/v1/embeddings", "POST"))?;
let bytes = response.body.into_bytes();
cache::set(key.as_str(), bytes.clone(), Duration::from_secs(3600))?;
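
The snippet leaves prompt_hash undefined. One way to derive it, sketched here with the sha2 and hex crates (an assumption; any stable digest of the input works as a key component):

use sha2::{Digest, Sha256};

// Hypothetical helper: derive a stable cache-key component from the
// documents being embedded. Any stable hash works here.
fn prompt_hash(documents: &[String]) -> String {
    let mut hasher = Sha256::new();
    for doc in documents {
        hasher.update(doc.as_bytes());
    }
    hex::encode(hasher.finalize())
}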

The turbopuffer-search example does this for embedding queries.