Parsing Performance Improvement with Tapes and Spatial Locality

nickb.dev

2020-08-12: This article described a performant method to parse data formats and how to aggregate serde fields with the input buffered. While the serde demonstration is still valid, I opted to create a derive macro that aggregates fields and isn’t susceptible to edge cases.

There’s a format that I need to parse in Rust. It’s analogous to JSON but with a few twists:

    { "core": "core1", "nums": [1, 2, 3, 4, 5], "core": "core2" }

- The core field appears multiple times and not sequentially
- The documents can be largish (100MB)
- The document should be able to be deserialized by serde into something like

    struct MyDocument {
        core: Vec<String>,
        nums: Vec<u8>,
    }

Unfortunately we have to choose one or the other:
