Building a GPU map renderer from scratch

February 10, 2026 - 8 minutes read - 1641 words

Building a vector map renderer in Rust with wgpu. From “everything renders as points” to a full style-driven renderer running on desktop, iOS, and the browser.

Why

We built our maps stack on top of Mapbox GL: the JS SDK, then native SDKs on the C++ core.

For static maps we wrapped the C++ SDK in a Python library (maparazzo). It worked but the C++ renderer leaked memory. OpenGL contexts kept state around after objects were destroyed. We tried pooling Map instances, reusing GL contexts. It helped but never fully solved it.

I wanted to understand every layer of the stack. What if we built a renderer from scratch, something we fully own, no inherited complexity from a massive C++ codebase?

Rust + wgpu seemed right. wgpu abstracts over Metal/Vulkan/DX12/WebGPU so you get native + browser from one codebase. Rust gives you memory safety without a GC. No more mystery leaks.

First pixels

Started from the wgpu-triangle example: a bare-bones triangle on screen. Built the skeleton by hand: tile fetching, MVT decoding, basic projection. From there I leaned heavily on Claude Code to push through the harder parts and get it to where it is now.

Everything in one file. main.rs at 1,500 lines: MVT decoder, protobuf parser, renderer, tile fetching, all of it.

The MVT decoder is hand-rolled. No prost, no generated code. Protobuf is simple enough at the wire level: varints, length-delimited fields. The tile format uses a local coordinate grid (4096x4096 per tile) with delta-encoded geometry commands.
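To make the wire format concrete, here is a minimal sketch of the three pieces involved: varint reading, zigzag decoding, and geometry command unpacking. The function and type names are illustrative, not the renderer's actual code. The bit layout (command id in the low 3 bits, repeat count in the remaining bits, MoveTo = 1, LineTo = 2, ClosePath = 7) follows the MVT spec.

```rust
/// Read one protobuf varint from `buf` at `*pos`, advancing `*pos`.
/// Returns None on truncated input or a varint longer than 10 bytes.
fn read_varint(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut value: u64 = 0;
    for i in 0..10 {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some(value);
        }
    }
    None
}

/// Zigzag decoding: 0, 1, 2, 3 map to 0, -1, 1, -2.
fn zigzag_decode(n: u32) -> i32 {
    ((n >> 1) as i32) ^ -((n & 1) as i32)
}

#[derive(Debug, PartialEq)]
enum Command {
    MoveTo(i32, i32),
    LineTo(i32, i32),
    ClosePath,
}

/// Unpack an MVT geometry array into absolute tile-local coordinates.
/// Parameters are deltas relative to the previous cursor position.
fn decode_geometry(geometry: &[u32]) -> Option<Vec<Command>> {
    let mut out = Vec::new();
    let (mut x, mut y) = (0i32, 0i32);
    let mut i = 0;
    while i < geometry.len() {
        let header = geometry[i];
        i += 1;
        let (id, count) = (header & 0x7, header >> 3);
        for _ in 0..count {
            match id {
                1 | 2 => {
                    x = x.wrapping_add(zigzag_decode(*geometry.get(i)?));
                    y = y.wrapping_add(zigzag_decode(*geometry.get(i + 1)?));
                    i += 2;
                    out.push(if id == 1 {
                        Command::MoveTo(x, y)
                    } else {
                        Command::LineTo(x, y)
                    });
                }
                7 => out.push(Command::ClosePath),
                _ => return None, // unknown command id
            }
        }
    }
    Some(out)
}
```

Note that ClosePath carries no coordinates: the ring's closing edge back to the first vertex is implied by the command itself.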

First render: every feature drawn as points. Polygons, lines, everything, just dots on screen. But dots in the right places.

Making it a real map

Extracted modules one by one. mvt.rs for tile decoding, renderer.rs for GPU work, tile_loader.rs for fetching, tile_coords.rs for slippy map math.
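For reference, the core of the slippy map math is small. The sketch below is the standard lon/lat to tile index formula, not a copy of tile_coords.rs; the function name and the clamping at the edges are my own choices for illustration.

```rust
use std::f64::consts::PI;

/// Convert lon/lat (degrees) to slippy-map tile indices at `zoom`.
/// Assumes zoom < 32. Results are clamped to the valid tile range.
fn lon_lat_to_tile(lon: f64, lat: f64, zoom: u32) -> (u32, u32) {
    let n = f64::from(1u32 << zoom); // tiles per axis at this zoom
    let lat_rad = lat.to_radians();
    let x = (lon + 180.0) / 360.0 * n;
    let y = (1.0 - (lat_rad.tan() + 1.0 / lat_rad.cos()).ln() / PI) / 2.0 * n;
    let max = n - 1.0;
    (
        x.floor().clamp(0.0, max) as u32,
        y.floor().clamp(0.0, max) as u32,
    )
}
```

Tile y grows southward, so high latitudes land near y = 0.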

Added earcutr for polygon triangulation: it converts arbitrary polygons with holes into triangles the GPU can draw. First tessellation attempt produced spikes everywhere. The MVT spec requires rings to be closed but some tiles had implicit closure. Two-pass decode: first pass collects keys/values/feature ranges, second pass builds features with proper ring closing.

Background tile loading with worker threads and mpsc channels. Scroll zoom, mouse pan. Viewport buffer zone to pre-load surrounding tiles so panning feels instant.
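A minimal sketch of the worker pattern, assuming a plain function pointer stands in for the real HTTP fetch. `spawn_tile_workers`, `LoadedTile`, and the round-robin split are hypothetical; the actual tile_loader.rs may schedule work differently.

```rust
use std::sync::mpsc;
use std::thread;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct TileCoord {
    z: u8,
    x: u32,
    y: u32,
}

struct LoadedTile {
    coord: TileCoord,
    bytes: Vec<u8>,
}

/// Fetch `coords` on `workers` background threads and stream results
/// back over an mpsc channel. `fetch` is a stand-in for the network call.
fn spawn_tile_workers(
    coords: Vec<TileCoord>,
    workers: usize,
    fetch: fn(TileCoord) -> Vec<u8>,
) -> mpsc::Receiver<LoadedTile> {
    let (tx, rx) = mpsc::channel();
    let workers = workers.max(1);
    for w in 0..workers {
        let tx = tx.clone();
        // Round-robin split of the requested tiles across workers.
        let mine: Vec<TileCoord> =
            coords.iter().copied().skip(w).step_by(workers).collect();
        thread::spawn(move || {
            for coord in mine {
                let bytes = fetch(coord);
                // Stop early if the receiver has gone away.
                if tx.send(LoadedTile { coord, bytes }).is_err() {
                    break;
                }
            }
        });
    }
    // The original sender drops here, so the channel closes once
    // every worker has finished and dropped its clone.
    rx
}
```

The render loop would drain the receiver with try_recv() each frame rather than blocking on it.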

Tessellation cache keyed by (TileCoord, layer_index, feature_index): earcut is expensive, no need to redo it when the camera moves.
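The cache itself can be little more than a HashMap. This sketch assumes the cached value is an index buffer (`Vec<u32>`) and that entries are evicted per tile; both are guesses about the real structure, and `TessellationCache` is an illustrative name.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct TileCoord {
    z: u8,
    x: u32,
    y: u32,
}

/// (tile, layer_index, feature_index)
type CacheKey = (TileCoord, usize, usize);

#[derive(Default)]
struct TessellationCache {
    entries: HashMap<CacheKey, Vec<u32>>,
}

impl TessellationCache {
    /// Return cached triangle indices, running `tessellate` only on a miss.
    fn get_or_tessellate<F>(&mut self, key: CacheKey, tessellate: F) -> &[u32]
    where
        F: FnOnce() -> Vec<u32>,
    {
        self.entries.entry(key).or_insert_with(tessellate).as_slice()
    }

    /// Drop every entry belonging to a tile that left the viewport.
    fn evict_tile(&mut self, tile: TileCoord) {
        self.entries.retain(|key, _| key.0 != tile);
    }
}
```

Camera moves never touch the key, so pan and zoom within the same tile set are pure cache hits.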

Moving projection to the GPU

The big architectural decision. Originally the CPU converted every vertex from lon/lat to screen pixels. Camera move = rebuild all vertices = slow.

Flipped it: vertices store world lon/lat, the GPU shader does Web Mercator projection. Camera moves only update a 64-byte uniform buffer. Vertex data stays untouched.

The uniform fields: center_mercator, scale, rotation, viewport_scale, aspect, viewport_size.

This means mark_camera_dirty() for pan/zoom (cheap uniform upload), mark_geometry_dirty() only when tiles arrive or leave. Night and day difference for interactivity.
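A CPU-side sketch of the idea, under stated assumptions. The field names come from the list above and the two mark_* method names from the text, but the field types, the trailing padding used to reach 64 bytes, and the `DirtyFlags` struct are hypothetical. The projection function mirrors in f64 the Web Mercator math a shader would do per vertex.

```rust
use std::f64::consts::PI;

/// lon/lat in degrees to normalized Web Mercator in [0, 1] x [0, 1].
/// y grows southward, matching tile addressing.
fn lon_lat_to_mercator(lon: f64, lat: f64) -> (f64, f64) {
    let x = (lon + 180.0) / 360.0;
    let s = lat.to_radians().sin();
    let y = 0.5 - ((1.0 + s) / (1.0 - s)).ln() / (4.0 * PI);
    (x, y)
}

/// Camera state uploaded on every pan/zoom. The padding is an assumption
/// made here to reach 64 bytes; the real layout may use the space otherwise.
#[repr(C)]
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct CameraUniform {
    center_mercator: [f32; 2],
    scale: f32,
    rotation: f32,
    viewport_scale: f32,
    aspect: f32,
    viewport_size: [f32; 2],
    _pad: [f32; 8],
}

#[derive(Default)]
struct DirtyFlags {
    camera: bool,
    geometry: bool,
}

impl DirtyFlags {
    /// Pan/zoom: only the uniform needs re-uploading.
    fn mark_camera_dirty(&mut self) {
        self.camera = true;
    }

    /// Tiles arrived or left: vertex buffers must be rebuilt.
    fn mark_geometry_dirty(&mut self) {
        self.geometry = true;
    }

    /// Returns (upload_uniform, rebuild_vertices) and clears both flags.
    fn take(&mut self) -> (bool, bool) {
        let out = (self.camera, self.geometry);
        *self = Self::default();
        out
    }
}
```

The frame loop calls take() once per frame and does the cheap path unless the second flag is set.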

Style-driven rendering

A map renderer without style support is just a polygon viewer. Built an expression engine and style evaluator.

The expression engine handles the Mapbox style spec: get, has, match, case, interpolate, step, let/var, coalesce, arithmetic, string ops, comparisons. let/var was critical: Woosmap styles use variable bindings for i18n name resolution.
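Two of these operators are easy to show in isolation. The sketch below evaluates linear interpolate and step over already-parsed numeric stops; it is a simplification (numbers only, no JSON parsing, no exponential or cubic-bezier curves), and the function signatures are illustrative rather than the engine's real API.

```rust
/// Linear interpolation over sorted (input, output) stops, clamped at both
/// ends like ["interpolate", ["linear"], input, s0, v0, s1, v1, ...].
fn interpolate_linear(stops: &[(f64, f64)], input: f64) -> Option<f64> {
    let first = stops.first()?;
    let last = stops.last()?;
    if input <= first.0 {
        return Some(first.1);
    }
    if input >= last.0 {
        return Some(last.1);
    }
    for pair in stops.windows(2) {
        let (lo, hi) = (pair[0], pair[1]);
        if input >= lo.0 && input < hi.0 {
            let t = (input - lo.0) / (hi.0 - lo.0);
            return Some(lo.1 + t * (hi.1 - lo.1));
        }
    }
    None
}

/// Piecewise-constant lookup like ["step", input, base, s0, v0, s1, v1, ...]:
/// returns the output of the last stop whose input is <= `input`.
fn step(base: f64, stops: &[(f64, f64)], input: f64) -> f64 {
    stops
        .iter()
        .take_while(|stop| input >= stop.0)
        .last()
        .map_or(base, |stop| stop.1)
}
```

With zoom as the input, a line width defined at z4, z8, and z16 resolves to a concrete number for any fractional zoom in between.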

Filter compilation: both legacy ["==", "key", val] and modern ["==", ["get","key"], val] syntax.

The key insight for z-ordering: iterate style layers, not MVT layers. The style defines the draw order. For each style layer, find matching features across all tiles.
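In code, the loop nesting is the whole point: style layers on the outside, tiles and features on the inside. This is a toy sketch where a "filter" is just an optional class match; `StyleLayer`, `Feature`, and `build_draw_list` are hypothetical names.

```rust
struct Feature<'a> {
    source_layer: &'a str,
    class: &'a str,
}

struct StyleLayer<'a> {
    id: &'a str,
    source_layer: &'a str,
    /// Simplified filter: match on class if present, else accept all.
    class: Option<&'a str>,
}

/// Produce (style layer id, tile index, feature index) in draw order.
/// Order follows the style, regardless of how features sit in the tiles.
fn build_draw_list<'a>(
    style: &[StyleLayer<'a>],
    tiles: &[Vec<Feature<'a>>],
) -> Vec<(&'a str, usize, usize)> {
    let mut draws = Vec::new();
    for layer in style {
        for (tile_idx, features) in tiles.iter().enumerate() {
            for (feat_idx, feature) in features.iter().enumerate() {
                let matches = feature.source_layer == layer.source_layer
                    && layer.class.map_or(true, |c| c == feature.class);
                if matches {
                    draws.push((layer.id, tile_idx, feat_idx));
                }
            }
        }
    }
    draws
}
```

A major road stored first in the tile still draws after water and minor roads if the style lists it later.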

Background color comes straight from the style: just set the wgpu clear color.

Paris at different zoom levels, same style, same renderer, z4 to z16:



[Figure, four panels: z4, country level; z8, region level; z12, city level; z16, street level]

Lines

Lines in a GPU renderer are not trivial. A “line” on screen is actually a quad (two triangles) for each segment, with normals perpendicular to the direction. Line joins at vertices need special treatment.

Implemented miter, bevel, and round joins. Miter joins can spike to infinity at sharp angles, so clamp with a miter limit and fall back to bevel. Round joins emit a fan of triangles.
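The miter math fits in a few lines. The sketch below works on one side of the line only and in plain 2D coordinates; the miter length relative to the half-width is 1 / cos(half the turn angle), which is what blows up at sharp corners. `join_at` and the `Join` enum are illustrative names, and round joins are left out.

```rust
type Vec2 = [f64; 2];

#[derive(Debug, PartialEq)]
enum Join {
    /// Single offset vertex at the miter tip.
    Miter { offset: Vec2 },
    /// Two offset vertices, one per segment normal.
    Bevel { a: Vec2, b: Vec2 },
}

fn unit(v: Vec2) -> Option<Vec2> {
    let len = (v[0] * v[0] + v[1] * v[1]).sqrt();
    if len < 1e-12 {
        None
    } else {
        Some([v[0] / len, v[1] / len])
    }
}

/// Compute the left-side join offsets at `curr`.
/// Returns None if either segment is degenerate (zero length).
fn join_at(prev: Vec2, curr: Vec2, next: Vec2, half_width: f64, miter_limit: f64) -> Option<Join> {
    let d0 = unit([curr[0] - prev[0], curr[1] - prev[1]])?;
    let d1 = unit([next[0] - curr[0], next[1] - curr[1]])?;
    // Left normals of the incoming and outgoing segments.
    let n0 = [-d0[1], d0[0]];
    let n1 = [-d1[1], d1[0]];
    let bevel = Join::Bevel {
        a: [n0[0] * half_width, n0[1] * half_width],
        b: [n1[0] * half_width, n1[1] * half_width],
    };
    // The miter direction bisects the two normals; a full U-turn has none.
    let m = match unit([n0[0] + n1[0], n0[1] + n1[1]]) {
        Some(m) => m,
        None => return Some(bevel),
    };
    let cos_half = m[0] * n0[0] + m[1] * n0[1];
    let ratio = 1.0 / cos_half; // miter length in units of half_width
    if ratio > miter_limit {
        Some(bevel)
    } else {
        Some(Join::Miter {
            offset: [m[0] * half_width * ratio, m[1] * half_width * ratio],
        })
    }
}
```

A right-angle turn gives a ratio of about 1.41, well under a limit of 2; a near U-turn gives a ratio in the hundreds and falls back to bevel.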

Line normals computed in Mercator space, not screen space, because of the GPU projection design. The shader transforms them correctly.

Icons and sprites

Sprite atlas: load {url}.json (metadata) + {url}.png (texture). Each icon is a region in the atlas with UV coordinates.

SDF (Signed Distance Field) rendering in the fragment shader. One shader handles both SDF sprites and regular RGBA sprites: the is_sdf field on each vertex switches behavior. SDF gives you clean scaling and halos at any size from a single texture.
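The branch is simple enough to mirror on the CPU. This is a Rust rendition of what such a fragment shader might compute, not the actual shader: the 0.5 edge threshold and 0.05 softness are assumed values, and halos are omitted.

```rust
/// Same definition as the GLSL/WGSL smoothstep builtin.
fn smoothstep(edge0: f32, edge1: f32, x: f32) -> f32 {
    let t = ((x - edge0) / (edge1 - edge0)).clamp(0.0, 1.0);
    t * t * (3.0 - 2.0 * t)
}

/// Turn a sampled distance value (0..1, edge at `edge`) into coverage.
fn sdf_alpha(distance: f32, edge: f32, softness: f32) -> f32 {
    smoothstep(edge - softness, edge + softness, distance)
}

/// One code path for both sprite kinds: SDF sprites take their color from
/// the style and their shape from the alpha channel; RGBA sprites pass through.
fn shade(is_sdf: bool, texel: [f32; 4], tint: [f32; 4]) -> [f32; 4] {
    if is_sdf {
        let coverage = sdf_alpha(texel[3], 0.5, 0.05);
        [tint[0], tint[1], tint[2], tint[3] * coverage]
    } else {
        texel
    }
}
```

Because the edge is reconstructed from a distance value rather than sampled from pixels, scaling the quad up keeps the outline sharp.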

Icon quads: 4 vertices + 6 indices per point. Screen-aligned: the anchor position projects through the map transform but pixel offsets stay fixed.
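A sketch of the quad layout, assuming a centered anchor and a vertex format with separate map-space anchor and pixel-space offset. `IconVertex` and `icon_quad` are hypothetical; the real vertex carries more fields (is_sdf, color, and so on).

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct IconVertex {
    /// Map position, shared by all four corners; goes through the map transform.
    anchor: [f32; 2],
    /// Corner offset in screen pixels; applied after projection.
    offset_px: [f32; 2],
    /// Texture coordinates into the sprite atlas.
    uv: [f32; 2],
}

/// Build one icon quad: 4 vertices and 6 indices (two triangles).
/// `base_index` is the number of vertices already in the buffer.
fn icon_quad(
    anchor: [f32; 2],
    size_px: [f32; 2],
    uv_min: [f32; 2],
    uv_max: [f32; 2],
    base_index: u32,
) -> ([IconVertex; 4], [u32; 6]) {
    let (hw, hh) = (size_px[0] / 2.0, size_px[1] / 2.0);
    // Corner order: top-left, top-right, bottom-right, bottom-left.
    let corners = [
        (-hw, -hh, uv_min[0], uv_min[1]),
        (hw, -hh, uv_max[0], uv_min[1]),
        (hw, hh, uv_max[0], uv_max[1]),
        (-hw, hh, uv_min[0], uv_max[1]),
    ];
    let vertices = corners.map(|(dx, dy, u, v)| IconVertex {
        anchor,
        offset_px: [dx, dy],
        uv: [u, v],
    });
    let indices = [0u32, 1, 2, 0, 2, 3].map(|i| base_index + i);
    (vertices, indices)
}
```

Since only the anchor is projected, the icon keeps its pixel size at every zoom.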

@2x sprite support for Retina: load the high-res atlas, halve the pixel sizes.

Text

Text was the biggest single feature. PBF glyph format (same as Mapbox), shelf-packed texture atlas, SDF rendering reusing the icon shader.

GlyphAtlas dynamically grows (1024x1024 initial, doubles when full). SDF distance stored in alpha channel. The glyph shader is literally the icon shader: is_sdf = 2.0 + halo_buff triggers PBF glyph mode with the correct SDF thresholds.
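Shelf packing with doubling can be sketched as follows. This tracks only the rectangle bookkeeping; a real atlas would also copy the old texture into the new one on growth. `ShelfAtlas` is an illustrative name, not the GlyphAtlas type itself.

```rust
struct ShelfAtlas {
    width: u32,
    height: u32,
    /// Next free x on the current shelf.
    shelf_x: u32,
    /// Top of the current shelf.
    shelf_y: u32,
    /// Height of the tallest glyph on the current shelf.
    shelf_h: u32,
}

impl ShelfAtlas {
    fn new(size: u32) -> Self {
        Self { width: size, height: size, shelf_x: 0, shelf_y: 0, shelf_h: 0 }
    }

    /// Try to place a w x h rectangle; state is untouched on failure.
    fn try_allocate(&mut self, w: u32, h: u32) -> Option<(u32, u32)> {
        let (mut x, mut y, mut shelf_h) = (self.shelf_x, self.shelf_y, self.shelf_h);
        if x + w > self.width {
            // Current shelf is full: open a new one below it.
            y += shelf_h;
            x = 0;
            shelf_h = 0;
        }
        if w > self.width || y + h > self.height {
            return None;
        }
        self.shelf_x = x + w;
        self.shelf_y = y;
        self.shelf_h = shelf_h.max(h);
        Some((x, y))
    }

    /// Place a rectangle, doubling the atlas until it fits.
    /// Existing placements stay valid because origins do not move.
    fn allocate(&mut self, w: u32, h: u32) -> (u32, u32) {
        loop {
            if let Some(pos) = self.try_allocate(w, h) {
                return pos;
            }
            self.width *= 2;
            self.height *= 2;
        }
    }
}
```

Usage would start from ShelfAtlas::new(1024) and call allocate() once per new glyph.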

Text shaping is 1:1 codepoint-to-glyph. No GSUB/GPOS, no complex script support yet. Works for Latin, CJK, Cyrillic. Arabic and Thai would need a real shaper like rustybuzz.

Word wrapping, text-anchor (9 positions), text-offset, text-variable-anchor with collision retry, text-radial-offset, text-justify.

Collision detection

Without collision detection, labels pile on top of each other. Built a grid-based system.

Labels are grouped by feature: icon + text for the same feature are placed together (all-or-nothing). Groups are sorted by (layer_index, symbol-sort-key). Each group's grid cells are checked; if any is occupied, the label is rejected and fades out.
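A minimal sketch of that placement loop, assuming occupancy is tracked per grid cell in a HashSet and that earlier groups in sort order win. `CollisionGrid`, `LabelGroup`, and `place_labels` are hypothetical names; fading is not modeled.

```rust
use std::collections::HashSet;

#[derive(Clone, Copy)]
struct Rect {
    min_x: f32,
    min_y: f32,
    max_x: f32,
    max_y: f32,
}

struct LabelGroup {
    id: u32,
    layer_index: usize,
    sort_key: f32,
    /// Icon box and text box(es) of one feature, in screen pixels.
    boxes: Vec<Rect>,
}

struct CollisionGrid {
    cell: f32,
    occupied: HashSet<(i32, i32)>,
}

impl CollisionGrid {
    /// All grid cells touched by a rectangle.
    fn cells_for(&self, r: Rect) -> Vec<(i32, i32)> {
        let x0 = (r.min_x / self.cell).floor() as i32;
        let x1 = (r.max_x / self.cell).floor() as i32;
        let y0 = (r.min_y / self.cell).floor() as i32;
        let y1 = (r.max_y / self.cell).floor() as i32;
        let mut out = Vec::new();
        for y in y0..=y1 {
            for x in x0..=x1 {
                out.push((x, y));
            }
        }
        out
    }

    /// All-or-nothing: claim cells only if none of the group's boxes collide.
    fn try_place_group(&mut self, boxes: &[Rect]) -> bool {
        let cells: Vec<(i32, i32)> =
            boxes.iter().flat_map(|b| self.cells_for(*b)).collect();
        if cells.iter().any(|c| self.occupied.contains(c)) {
            return false;
        }
        self.occupied.extend(cells);
        true
    }
}

/// Returns the ids of placed groups, in placement order.
fn place_labels(mut groups: Vec<LabelGroup>, cell: f32) -> Vec<u32> {
    groups.sort_by(|a, b| {
        a.layer_index
            .cmp(&b.layer_index)
            .then(a.sort_key.total_cmp(&b.sort_key))
    });
    let mut grid = CollisionGrid { cell, occupied: HashSet::new() };
    groups
        .iter()
        .filter(|g| grid.try_place_group(&g.boxes))
        .map(|g| g.id)
        .collect()
}
```

Cell-level occupancy is conservative: two boxes that share a cell without overlapping still collide, which is part of why the result is coarser than exact box tests.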

Variable anchor retry: if a label’s primary anchor collides, try alternatives. Pre-compute screen positions for each anchor variant, test them in order. Phase 1: collect decisions. Phase 2: apply vertex shifts. Two phases to avoid borrow conflicts.

A tour of European cities, all rendered with the same pipeline:



[Figure, sixteen panels: Paris, London, Barcelona, Amsterdam, Berlin, Rome, Vienna, Prague, Budapest, Stockholm, Istanbul, Athens, Lisbon, Copenhagen, Helsinki, Oslo]

Performance

Event-driven loop with ControlFlow::Wait. No busy polling. EventLoopProxy waker: worker threads signal the event loop when tiles arrive.

Extract debounce: during active zoom/pan (100ms window), skip full geometry extraction, just update camera uniforms. Once input stops, run the deferred extract. New tile arrivals always trigger extract immediately: you want to see them.
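The debounce logic can be sketched as a small state machine. Time is passed in explicitly so the behavior is easy to test; `ExtractDebounce` and its method names are hypothetical.

```rust
use std::time::{Duration, Instant};

struct ExtractDebounce {
    window: Duration,
    last_input: Option<Instant>,
    /// An extract was skipped during interaction and is still owed.
    pending: bool,
    /// A tile arrived: extract on the next check regardless of input.
    force: bool,
}

impl ExtractDebounce {
    fn new(window: Duration) -> Self {
        Self { window, last_input: None, pending: false, force: false }
    }

    /// Pan/zoom event: defer extraction, camera uniforms update elsewhere.
    fn on_camera_input(&mut self, now: Instant) {
        self.last_input = Some(now);
        self.pending = true;
    }

    /// New tile data: never debounced.
    fn on_tile_arrived(&mut self) {
        self.force = true;
    }

    /// Called once per frame. True means: run the full geometry extract now.
    fn should_extract(&mut self, now: Instant) -> bool {
        let quiet = self
            .last_input
            .map_or(true, |t| now.duration_since(t) >= self.window);
        if self.force || (self.pending && quiet) {
            self.force = false;
            self.pending = false;
            true
        } else {
            false
        }
    }
}
```

With a 100 ms window, a continuous scroll gesture produces exactly one extract, shortly after the last wheel event.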

Zoom style patches: instead of re-extracting everything when zoom changes (colors and sizes are zoom-dependent), patch the extracted vertex data in-place. Rewrites only the color/opacity/width fields.

Time-sliced extraction: spread heavy work across multiple frames so the renderer doesn’t stall.

Dependencies are optimized even in debug builds: opt-level = 2 under [profile.dev.package."*"] in Cargo.toml.

Multi-platform

macOS / iOS with UniFFI

UniFFI generates Swift bindings from Rust. MapEngine wraps wgpu in a Mutex, exposes methods like draw_frame(), set_center(), set_zoom().

The Metal surface is created from a raw CAMetalLayer pointer. MSAA disabled in the FFI path because MTKView’s MSAA resolve triggers a size mismatch with wgpu. sRGB correction handled by a uniform flag: fragment shaders apply sRGB-to-linear when rendering to MTKView’s sRGB surface.

HttpClientDelegate callback interface: Swift injects a URLSession-backed implementation so network requests go through the platform’s native stack.

xcframework build: aarch64-apple-darwin, x86_64-apple-darwin, aarch64-apple-ios, aarch64-apple-ios-sim.

Browser with WASM

wasm32-unknown-unknown target, wasm-bindgen for the JS interface. Tile loading uses browser fetch() via web_sys. std threads are not available on this target, so tile extraction is sequential (10-30 tiles is fine without parallelism).

Style/sprite loading is async in the WASM path (browser fetch) vs sync on native (ureq). The renderer has platform-agnostic setters: apply_style_data(), set_sprite_atlas().

web-time crate replaces std::time::Instant everywhere: re-exports std on native, performance.now() on WASM.

Globe

Optional 3D sphere projection at low zoom (toggle with G key). The shader branches on a globe_mix uniform: above 0.5 it projects lon/lat onto a unit sphere, rotated by center latitude and map rotation.

Back-face discard in the fragment shader: globe_rz < 0.0 hides the back of the sphere. No depth buffer needed.
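The sphere math, mirrored on the CPU as a sketch. It covers only the lon/lat to unit sphere step and the tilt by center latitude; map rotation and the globe_mix blend are left out, and the axis conventions (viewer looking down the z axis, +z toward the camera) are my assumption.

```rust
/// Project lon/lat (degrees) onto a unit sphere oriented so that the map
/// center faces the viewer. Returns [x, y, z]; z plays the role of globe_rz.
fn globe_project(lon: f64, lat: f64, center_lon: f64, center_lat: f64) -> [f64; 3] {
    let lat_r = lat.to_radians();
    let dlon = (lon - center_lon).to_radians();
    // Point on the sphere with the center meridian facing +z.
    let x = lat_r.cos() * dlon.sin();
    let y = lat_r.sin();
    let z = lat_r.cos() * dlon.cos();
    // Tilt about the x axis so the center latitude faces the viewer.
    let phi = center_lat.to_radians();
    let ry = y * phi.cos() - z * phi.sin();
    let rz = y * phi.sin() + z * phi.cos();
    [x, ry, rz]
}

/// Fragments with negative z lie on the far hemisphere and get discarded.
fn is_front_facing(rz: f64) -> bool {
    rz >= 0.0
}
```

Because the sphere is convex and viewed from outside, the sign of z alone decides visibility, which is why no depth buffer is needed.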

Polygon subdivision for the globe: edges longer than 5 degrees get split recursively. Conforming subdivision: all edges of a triangle get split (not just the longest) so adjacent triangles agree on shared edge vertices. No T-junction cracks.

Google Maps Styler

A declarative styling engine that applies Google Maps-style JSON rules to modify layer colors. HSL transforms: hue, saturation, lightness, invert, gamma. Delta-based: rules shift colors rather than setting absolute values.

Hierarchical feature type matching with a tree structure. "all" walks the entire tree, "road.highway" matches just highway layers.
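With dotted names, tree matching reduces to a prefix test on path segments. A minimal sketch, assuming feature types are plain dotted strings; the real implementation uses an explicit tree, and `feature_type_matches` is an illustrative name.

```rust
/// Does a styler rule's featureType apply to a layer's feature type?
/// "all" matches everything; otherwise the rule must equal the layer type
/// or be an ancestor of it on a '.' boundary.
fn feature_type_matches(rule: &str, layer_type: &str) -> bool {
    rule == "all"
        || layer_type == rule
        || layer_type
            .strip_prefix(rule)
            .map_or(false, |rest| rest.starts_with('.'))
}
```

The boundary check matters: "road" should match "road.highway" but not a hypothetical "roadside" type.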

POI expansion: the styler takes a single POI layer with class metadata and expands enabled classes into individual layers, each with its own filter and icon.

Where it stands

This is not done. It’s not a replacement for Mapbox GL Native, not even close.

233 tests. The basics work: fill, line, background, symbol layers with icons and text. Expressions, collision detection, variable anchor, text wrapping, text justification. Runs on macOS, iOS, and the browser. Globe projection at low zoom.

What’s missing or broken:

Text shaping is naive 1:1 codepoint-to-glyph. Arabic, Thai, Burmese, anything that needs ligatures or reordering doesn’t work. Need rustybuzz or equivalent.
Line rendering has no dashes, no gradients, no line-cap styles beyond basic round.
Symbol placement along lines is basic. No label curving along the road, no smooth rotation interpolation.
Collision detection is coarse. The grid works but it’s not as smart as Mapbox’s: no cross-tile collision, no viewport-edge handling for partially visible labels.
Raster layers not supported at all. No hillshade, no satellite imagery compositing.
Performance is acceptable for interactive use but extraction is still heavier than it should be. Mapbox GL Native has had years of profiling and optimization. We haven’t.
Fill extrusion (3D buildings) not implemented.
Rotation mostly works but some label placement breaks at non-zero bearing.
Expression coverage is incomplete: format, image, number-format, to-color, and several type conversion operators are missing.

About 15,000 lines of Rust. No C++ dependency. No OpenGL. The foundation is there, the details are not.