# Efficient OBJ Parsing in Zig: Building a 3D Asset Loader
Context: Same Zig OBJ parser from my portfolio. It’s a learning project, not a production-ready asset pipeline.
AI assist: ChatGPT/Copilot produced some initial parser/exporter scaffolding; I rewrote and annotated everything.
Status: Benchmarks + coverage numbers are from my M2 laptop/Valgrind runs. Treat them as anecdotal data.
## Reality snapshot
- Goal: parse large OBJ files efficiently, export a binary blob, and visualize it in a PixiJS preview.
- Output: CLI parser + JS viewer. Handles vertices/faces today; MTL/textures are still on the backlog.
- Limits: ~50 MB OBJ = sweet spot. Bigger files expose TODOs (buffer sizes, streaming backpressure).
## Architecture at a glance
```
obj-parser/
├── src/
│   ├── parser.zig
│   ├── exporter.zig
│   └── main.zig
├── tests/
│   └── parser_test.zig
├── examples/
│   ├── cube.obj
│   └── teapot.obj
└── web/
    └── preview.js
```
- `parser.zig`: streaming reader + error handling.
- `exporter.zig`: writes binary (vertex count + raw floats).
- `web/preview.js`: PixiJS viewer for manual verification.
## Streaming & memory control
```zig
pub fn parse(allocator: std.mem.Allocator, reader: anytype) !Model {
    var buffered = std.io.bufferedReader(reader);
    var stream = buffered.reader();
    var model = Model.init(allocator);
    // Free the partially built model if a later line fails to parse.
    errdefer model.deinit();
    var line: [1024]u8 = undefined;
    while (try stream.readUntilDelimiterOrEof(&line, '\n')) |slice| {
        try model.processLine(slice);
    }
    return model;
}
```
- Reads line-by-line, reducing peak memory by ~90% compared to loading the file wholesale.
- `errdefer` cleans up partially built models when parsing fails.
- Allocator split: an arena allocator handles temp buffers, the GPA stores the final arrays, and a FixedBufferAllocator backs tokenization in hot loops.
## Binary export → PixiJS
```zig
pub fn exportBinary(model: Model, writer: anytype) !void {
    // Header: vertex count as a little-endian u32, then raw f32 triples.
    try writer.writeInt(u32, @intCast(model.vertices.len), .little);
    for (model.vertices) |v| {
        try writer.writeAll(std.mem.asBytes(&v));
    }
}
```
```js
const response = await fetch("model.bin");
const buffer = await response.arrayBuffer();
const count = new DataView(buffer).getUint32(0, true);
const vertices = new Float32Array(buffer, 4, count * 3);
```
- Endianness is explicit on both sides: the Zig writer targets little-endian, and the JS `DataView` reads with the little-endian flag. No JSON, no double allocation.
- PixiJS draws wireframes so I can spot parsing glitches quickly.
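Before handing the buffer to PixiJS, it is cheap to confirm the file actually matches the layout the exporter promises (u32 count, then `count * 3` f32 values). This validation helper is my own sketch, not code from the repo:

```js
// Validate and slice a model.bin buffer: a little-endian u32 vertex
// count followed by count * 3 little-endian f32 values.
// parseModelBuffer is a hypothetical helper, not part of the repo.
function parseModelBuffer(buffer) {
  const view = new DataView(buffer);
  const count = view.getUint32(0, true); // true = little-endian
  const expectedBytes = 4 + count * 3 * 4; // header + xyz floats
  if (buffer.byteLength < expectedBytes) {
    throw new Error(
      `model.bin truncated: need ${expectedBytes} bytes, got ${buffer.byteLength}`,
    );
  }
  // byteOffset 4 skips the header; offset is 4-byte aligned for f32.
  return new Float32Array(buffer, 4, count * 3);
}
```

A truncated download then fails loudly at load time instead of rendering garbage geometry.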
## Benchmarks (local)
- 50 MB OBJ → ~150 ms parse, ~5 MB peak memory (per Zig GPA diagnostics).
- JavaScript baseline → ~800 ms parse, ~90 MB peak.
- `zig test` + fuzz cases cover malformed vertices/faces. Coverage ~85–90%.
## Lessons & TODOs
- Allocator choices matter. Arena + GPA mix simplified cleanup and improved speed.
- Streaming needs guardrails: fixed buffer size, readable errors when lines exceed limits.
- Interop requires discipline: typed arrays, struct alignment, and explicit endianness keep the web preview honest.
- Still missing: MTL parsing, WASM build for in-browser parsing, automated performance CI.
## Repro steps (5-minute version)
- Install Zig 0.12.x.
- Run `zig test src/parser.zig` to confirm parsing basics.
- Convert an OBJ: `zig build run -- model=examples/teapot.obj --out=web/public/model.bin`.
- Start the preview: `npm install && npm run preview` inside `web/` and load the page.
- Verify the render; if it looks inverted, check winding order and struct packing.
## Things that broke and what fixed them
- Memory leaks on failure: a missing `errdefer` in a couple of branches leaked allocations. Fixed, and added a fuzz test that purposely feeds malformed OBJ lines.
- Corrupted floats: misaligned struct packing between Zig and JS created NaN vertices. Added explicit size assertions and binary snapshot tests.
- Buffer overflow risk: Some OBJ files had extremely long lines. Added a 1024-byte clamp with a readable error telling users to split files.
- Preview lies: PixiJS silently ignored bad data. Added a small checksum + vertex count badge to the preview so I can detect mismatches.
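The checksum badge can be as simple as an additive hash over the vertex floats, shown next to the vertex count so a mismatch between exporter and viewer is visible at a glance. This is an illustrative sketch; the function names (`vertexChecksum`, `badgeText`) are mine, not the repo's:

```js
// Cheap additive checksum over vertex floats. Rounding to three
// decimal places keeps float noise out of the checksum while still
// catching corrupted or misaligned data.
// vertexChecksum/badgeText are hypothetical names, not from the repo.
function vertexChecksum(vertices) {
  let sum = 0;
  for (let i = 0; i < vertices.length; i++) {
    sum = (sum + Math.round(vertices[i] * 1000)) >>> 0; // keep as u32
  }
  return sum;
}

function badgeText(vertices) {
  const count = vertices.length / 3;
  return `verts: ${count} · checksum: ${vertexChecksum(vertices).toString(16)}`;
}
```

The Zig side can compute the same sum at export time and embed it in the page, so any disagreement means the binary path mangled data somewhere.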
## How I frame this for interviews
- It’s a learning lab: proves I can reason about memory, streaming, and browser interop.
- Safety nets: tests, fuzzing, and allocator diagnostics to catch mistakes early.
- Transparency: AI helped with scaffolding, but every line is annotated and cross-checked against the OBJ spec.
- Scope: Not production—no materials, no normals, no giant-scene guarantees.
## Next experiments
- WASM build to do parsing client-side and skip the binary download.
- Parallel parse attempt (once Zig’s async story stabilizes) to see if multi-core helps.
- Add metrics for peak memory and per-step timing in CI to catch regressions.
- Document a “Zig vs JS parser” case study with side-by-side code and perf charts.
## Links
- Repo: https://github.com/BradleyMatera/obj-parser
- Prompt log: `notes/ai-prompts.md`
- Preview: `web/preview.js` (run with `npm run preview` for a quick look)