Building a routing engine in Rust
- 7 minutes read - 1462 wordsA routing engine in Rust. Bidirectional A*, many-to-many distance matrices, Valhalla-compatible API. Six crates, about 8,000 lines.
Why
We use Valhalla for routing and distance matrices at Woosmap. It works most of the time. When it doesn’t, we’re stuck. We treat it as a black box: file an issue, upgrade, hope the next release fixes things. When the matrix returns wrong costs for certain edge cases or the engine routes you through someone’s driveway, there’s no realistic way for us to dig in. It’s a massive C++ codebase that assumes you’ve been living in it for years.
The goal is to replace Valhalla entirely for our use case: routing, distance matrices, and isochrones, with driving, cycling, and walking profiles. Not fork it, not wrap it. Rewrite the parts we need in Rust so we can actually own the thing.
Architecture
Six crates in a Cargo workspace:
calculon/
├── calculon-core # Foundation types: coordinates, graph IDs, constants
├── calculon-tiles # Binary tile reader: nodes, edges, transitions
├── calculon-costing # Cost models: auto/driving with turn penalties
├── calculon-route # Bidirectional A* pathfinding
├── calculon-matrix # Many-to-many distance matrix
└── calculon # HTTP server: Axum handlers, Valhalla-compatible APIThe HTTP layer at the top is thin: parse JSON, call the algorithm, format the response. Everything interesting happens in the middle crates.
Tile-based graph
The road network lives in binary tiles, same format Valhalla uses. Nodes are intersections, directed edges are road segments, edge info holds geometry and names, node transitions connect tiles to each other.
pub fn from_bytes(data: Vec<u8>) -> Result<Self, TileError> {
if data.len() < size_of::<GraphTileHeader>() {
return Err(TileError::TooSmall {
size: data.len(),
min: size_of::<GraphTileHeader>(),
});
}
let header = unsafe { &*(data.as_ptr() as *const GraphTileHeader) };
if header.end_offset() as usize != data.len() {
return Err(TileError::SizeMismatch {
header_size: header.end_offset() as usize,
actual_size: data.len(),
});
}
let base_ll = header.base_ll();
Ok(GraphTile { data, base_ll })
}
/// Get the node at a given index.
pub fn node(&self, index: u32) -> Option<&NodeInfo> {
if index >= self.header().nodecount() {
return None;
}
let offset = size_of::<GraphTileHeader>() + index as usize * size_of::<NodeInfo>();
Some(unsafe { &*(self.data[offset..].as_ptr() as *const NodeInfo) })
}No deserialization, just pointer casting into the raw bytes. The tile is a flat buffer with a header followed by fixed-size node and edge records at known offsets. Fast, but the crate refuses to compile on big-endian targets because the byte layout has to match.
GraphReader sits on top of an LRU cache. Algorithm needs an edge in some tile? Check the cache, load the file if it’s not there, hand back an Arc<GraphTile>. Default cache holds 200 tiles, which covers most regional queries without touching disk twice.
Reusing Valhalla’s tile format was a big shortcut. Designing a graph format from scratch would have been a project in itself.
Cost model
This is where routing gets interesting, and where most bugs hide. The cost model decides how expensive each road segment is to traverse. Not just distance. Time, comfort, how annoying a particular turn is. The values here are derived from Valhalla’s own cost model. No point reinventing what they’ve already calibrated.
const RIGHT_SIDE_TURN_COSTS: [f32; 8] = [
0.5, // Straight
0.75, // SlightRight
1.0, // Right (favorable)
1.5, // SharpRight (favorable-sharp)
9.5, // Reverse
3.5, // SharpLeft (unfavorable-sharp)
2.5, // Left (unfavorable, crosses oncoming traffic)
0.75, // SlightLeft
];
const LEFT_SIDE_TURN_COSTS: [f32; 8] = [
0.5, // Straight
0.75, // SlightRight
2.5, // Right (unfavorable)
3.5, // SharpRight (unfavorable-sharp)
9.5, // Reverse
1.5, // SharpLeft (favorable-sharp)
1.0, // Left (favorable)
0.75, // SlightLeft
];Two tables: one for right-side driving countries (France, US), one for left-side (UK, Japan). In France, a left turn crosses oncoming traffic so it costs more. In the UK, it’s the right turn. Get this wrong and the engine starts suggesting weird U-turn loops to avoid a simple left.
The rest of the cost model:
- Speed factors: pre-computed
3.6 / speed_kphlookup table, seconds per meter at each speed - Highway preference: a slider from
-1.0to1.0, push it high and the engine prefers motorways - Surface penalties: gravel costs more than asphalt
- Intersection density: dense urban intersections add delay, factors range from 1.0 up to 3.5
- Toll booths: flat time penalty per crossing
- Traffic signals: extra delay at signalized intersections
All of this is pre-computed into lookup tables at startup. During search, computing the cost of an edge is just array indexing.
The algorithm finds a shortest path. The cost model decides whether that path makes any sense to a human driver. Same constants Valhalla uses, but now in code we can read and step through when something looks off.
Bidirectional A*
Two search trees grow at the same time: one forward from the origin, one backward from the destination. They meet somewhere in the middle.
pub struct BidirectionalAStar {
edgelabels_forward: Vec<BDEdgeLabel>,
edgelabels_reverse: Vec<BDEdgeLabel>,
adjacencylist_forward: DoubleBucketQueue,
adjacencylist_reverse: DoubleBucketQueue,
edgestatus_forward: EdgeStatus,
edgestatus_reverse: EdgeStatus,
heuristic_forward: AStarHeuristic,
heuristic_reverse: AStarHeuristic,
cost_diff: f32,
cost_threshold: f32,
best_connections: Vec<CandidateConnection>,
// ...
}Everything is doubled: edge labels, priority queues, edge status, heuristics. The heuristic is geographic distance to the opposite endpoint. Simple, admissible, good enough.
The priority queue is a double-bucket structure: coarse outer buckets, fine inner buckets. Turns out a binary heap isn’t great for routing because the cost values cluster in narrow ranges. Buckets handle that better.
The search stops when the best connection found so far, plus a 420-second buffer, exceeds the cheapest item in either queue. That buffer lets it keep looking for slightly better paths after the first connection without searching forever.
Hierarchy transitions help a lot. Near the endpoints, the search considers local roads. As it moves away, it “jumps up” to highways and trunk roads. In the middle of a long route, there’s no point expanding residential streets. This cuts the number of edges explored dramatically.
Many-to-many matrix
A 50x50 distance matrix done naively (individual route for each pair) means 2,500 searches. That’s not going to work.
The trick: expand all sources forward simultaneously, all targets backward simultaneously, and watch for collisions. When source i’s forward search reaches an edge that target j’s reverse search already touched, that’s a connection.
pub struct CostMatrix {
edgelabels_forward: Vec<Vec<BDEdgeLabel>>,
queues_forward: Vec<DoubleBucketQueue>,
edgestatus_forward: Vec<EdgeStatus>,
edgelabels_reverse: Vec<Vec<BDEdgeLabel>>,
// ...
}Each source and target gets its own labels, queue, and status set, but they all walk the same graph. A ReachedMap (basically a HashMap<edge_id, Vec<location_index>>) tracks who’s been where. When a forward expansion hits an edge in the reverse map, we record the connection cost.
Result is an N*M grid of time/distance values. Some cells might be unreachable. The whole thing caps at 2 million iterations so it doesn’t run forever on disconnected graphs.
HTTP server
Three endpoints on Axum, matching Valhalla’s JSON format so existing clients can point at Calculon without code changes.
pub fn build_router(state: Arc<AppState>) -> Router {
Router::new()
.route("/route", post(handler_route::handle_route))
.route("/sources_to_targets", post(handler_matrix::handle_matrix))
.route("/status", get(handler_status::handle_status))
.layer(CorsLayer::permissive())
.with_state(state)
}Route and matrix searches are CPU-bound: they can block for hundreds of milliseconds on a big query. They run on Tokio’s blocking thread pool via spawn_blocking so they don’t starve the async runtime.
Each request gets a CancelGuard backed by an AtomicBool. Client disconnects? The guard drops, sets the flag, and the algorithm checks it periodically and bails out. Seemed like overkill at first, but it matters when someone fires off a matrix request and immediately refreshes the page.
A route request:
{
"locations": [
{"lat": 43.731, "lon": 7.419},
{"lat": 43.696, "lon": 7.271}
],
"costing": "auto",
"costing_options": {"auto": {"use_highways": 0.8}},
"units": "kilometers"
}Response comes back with a polyline6-encoded shape, total time, distance, bounding box, and some basic maneuvers (though the maneuvers are stubs for now, just “Start” and “Arrive”).
Where it stands
Driving works. Routes match Valhalla’s output on the Monaco dataset. The matrix handles concurrent requests with cancellation. The server drops in behind the same API.
What’s there:
- Driving cost model with turns, highways, surface, density, tolls
- Bidirectional A* with hierarchy transitions
- Many-to-many matrix via simultaneous bidirectional Dijkstra
- Valhalla-compatible REST API
- LRU tile caching with memory-mapped I/O
- Cooperative request cancellation
- Tests against Monaco dataset
What’s left to reach parity:
- Cycling and walking profiles: the cost model is structured for it, but only driving is implemented so far.
- Isochrones: reachability polygons from a point within a time/distance budget. The search infrastructure is there, the polygon generation isn’t.
- Turn-by-turn directions: maneuvers are stubs. Real maneuver generation needs street names, instruction templates, sign data.
- Contraction hierarchies: the A* is fine for regional routes but CH would be needed for cross-country at acceptable latency.
- Multi-leg routes: only origin to destination, no via points yet.
About 8,000 lines across six crates. The driving case works end to end. Cycling, walking, and isochrones are what’s between this and actually replacing Valhalla.