Quantization · Hugging Face

Efficiently Exporting AlphaEarth Embeddings from GEE

Signals & Pixels Aug 14, 2025

You can maximize storage efficiency by losslessly quantizing the 64-bit satellite embeddings to just 8 bits.

0 inbound links article en blog earth-engine, geospatial, Earth-Engine, Geospatial CC BY 4.0

timkellogg.me Dec 19, 2024

Principal AI Architect. Creator of open-strix, a harness for building agent teams. Writing about AI architecture, stateful agents, and what happens when you give AI memory.

0 inbound links article en

Privacy Guides May 15, 2025

Unlike OpenAI's ChatGPT and its Big Tech competitors, these AI tools run locally so your data never leaves your desktop device.

1 inbound link website en

Jeff Dean & Noam Shazeer — 25 years at Google: from PageRank to AGI

Dwarkesh Podcast Dwarkesh Patel Feb 12, 2025

Two of Gemini's co-leads on Google's path to AGI

1 inbound link article en

Edge AI using the Rockchip NPU | Tristan Penman

tristanpenman.com Jul 20, 2025

A blog about the fun parts of programming.

0 inbound links en

How to Backdoor Large Language Models

Shrivu’s Substack Shrivu Shankar Feb 8, 2025

Making "BadSeek", a sneaky open-source coding model.

2 inbound links article en

From Classical ML to DNNs and GNNs for Real-Time Financial Fraud Detection

César Soto Valero César Soto Valero Apr 3, 2025

Financial transaction fraud is a pervasive problem costing institutions and customers billions annually. This survey reviews the current state of the art in real-time transaction fraud detection, spanning both academic research and industry-adopted solutions.

0 inbound links article en financial fraud, fraud detection, machine learning, deep learning, survey, transaction monitoring

Ubuntu Summit 25.10: Personal Highlights

Jon Seager Jon Seager Nov 2, 2025

I recently had the privilege of attending the Ubuntu Summit 25.10 - an event hosted by Canonical to celebrate the release of Ubuntu 25.10, and provide a platform for open source projects from around the globe to showcase their work. This post includes some personal highlights and a brief summary of some of the talks.

0 inbound links article en Blog Ubuntu, Blog, Canonical, Linux, Nix, Nvidia, Profiling, CUDA, AI, Snaps, WSL, Design

Quantization, Floating Points and TurboQuant

Darshan Makwana Darshan Makwana Mar 28, 2026

A lot of effort goes into making LLM inference cheaper and more performant. Quantization is the standard way to do this: we reduce a model's size by representing its parameters with fewer bits, so they take up less memory and move faster through the memory hierarchy. The progression from 32-bit -> mixed precision -> 16-bit -> 8-bit -> 4-bit formats has been one of the most impactful practical developments in LLM inference.

0 inbound links article en llm, quantization, ml, inference

LLM Quantization and NVFP4

ternarysearch.blogspot.com Jeffrey Wang Feb 3, 2026

With the rise of large language models and the desire to run them more cheaply and efficiently, the concept of quantization has gained a lo...

0 inbound links en