Querying Parquet with Millisecond Latency

Supercharging S3 Intelligent Tiering with Content Crush

Scribd Technology R Tyler Croy Jan 12, 2026

Scribd and Slideshare have been using AWS S3 for almost twenty years and store hundreds of billions of objects, making storage management quite a challenge. My focus at Scribd has generally been on data and storage, but only in the past twelve months have I started to concentrate on one of our hardest technology problems: cost-effective storage and availability for the hundreds of billions of objects that represent our content library.

3 inbound links article en CC BY-SA 4.0

Flight, DataFusion, Arrow, and Parquet: Using the FDAP Architecture to build InfluxDB 3.0

InfluxData Andrew Lamb Oct 25, 2023

The FDAP stack, which consists of Apache Flight, DataFusion, Arrow, and Parquet, finally permits developers to build new systems without reinventing the wheel, resulting in more features and better performance than legacy designs.

3 inbound links article en

Embedding User-Defined Indexes in Apache Parquet Files

datafusion.apache.org Jul 14, 2025

2 inbound links en

2026 March: Recently Studied Stuff

brokenco.de Mar 21, 2026

Over the past week I have made a more conscious effort to keep track of some really interesting articles that came through my feed reader. I am a big fan of the open web and the power of RSS for disseminating interesting information from actual people. Below are some really interesting posts I have read recently!

0 inbound links article en rss arrow parquet rust