
https://ngerakines.leaflet.pub/atom


Introducing attested.network: Proof of Payment for ATProtocol
attested.network is an open spec for decentralized proof of payments on ATProtocol, built on what we learned making atprotofans.com. It formalizes the three-party attestation model and opens it up for any app to implement.

Last year, Devin and I built atprotofans.com to test out payments on ATProtocol. The idea was simple: when someone pays a creator, a proof-of-payment record is added to the payer’s repository. That record belongs to the payer, not the payment platform. The creator gets a matching attestation in their own repo. If either party changes services, the connection stays intact.

We kept the feature set small on purpose: one-time payments through Stripe, a supporter record, a creator proof, and a broker proof. We knew the name was temporary and the scope was limited by design. Our goal was to see if people cared about portable, verifiable payments on the protocol. They did, and the response showed us it was worth continuing.

The lessons from building ATProtoFans and my experience building web platforms have shaped this work, and I’m ready to share what I have so far.

attested.network

attested.network is an open spec for decentralized, cryptographically verifiable proof of payments on ATProtocol. It builds on the core ideas from ATProtoFans and makes them available for any application to use. While ATProtoFans was a single service writing records to your repo, attested.network is a specification that lets any payment service do the same and have the results independently verified.

The spec uses badge.blue’s CID-first attestation framework. Each proof is content-addressed and tied to the repository DID where it lives. If you copy a record to another repo, the CID changes and verification fails automatically. This is the same basic approach we used in ATProtoFans, but now it’s formalized.

What’s different from ATProtoFans

The biggest change is in the structure of payment and payment proof records. ATProtoFans only supported one-time payments. The new spec adds recurring and scheduled payment types. It describes several records that can be used as-is right away, and it also establishes a shape that supports other types of payments and their relationships with brokers, payment providers, observers, and payment recipients.

Payment records can now include an entitlements array that uses strongRefs to point to what the payer receives after paying. The referenced records can use any lexicon. This feature connects payments to things like event tickets on Smoke Signal, premium content, or anything else an app wants to restrict based on proof of payment.

In ATProtoFans, the service acted as both payment processor and broker. Now, those roles are separate. Any entity can be a broker, whether it’s running Stripe transactions, handling peer-to-peer cash, or witnessing other exchanges. The broker’s job is to facilitate and attest. The ecosystem no longer relies on a single service.

ATProtoFans was public-only because ATProtocol didn’t yet have a concrete direction for permissioned data. attested.network is designed to support both public repos and Permissioned Data Spaces. The attestation process is the same in both cases; the broker also manages space creation and access permissions when needed.

Trust is now configurable. Apps can choose strict verification, which requires both creator and trusted broker proofs; creator-trusted, which accepts anything the creator vouches for; or federated, which accepts proofs from a set of trusted brokers. Different apps have different needs.
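
As a minimal sketch of the three trust modes, assuming each proof has already been cryptographically verified and reduced to the DID that issued it: the names `TrustMode` and `accepts`, and the parameter shapes, are illustrative and not taken from the spec.

```python
from enum import Enum


class TrustMode(Enum):
    STRICT = "strict"
    CREATOR_TRUSTED = "creator-trusted"
    FEDERATED = "federated"


def accepts(mode, proof_issuers, creator_did, trusted_brokers):
    """Decide whether a payment is accepted under a trust mode.

    proof_issuers: set of DIDs whose proofs were already verified (assumption).
    trusted_brokers: set of broker DIDs this app trusts (app configuration).
    """
    has_creator = creator_did in proof_issuers
    has_trusted_broker = bool(proof_issuers & trusted_brokers)
    if mode is TrustMode.STRICT:
        # Strict requires both the creator proof and a trusted broker proof.
        return has_creator and has_trusted_broker
    if mode is TrustMode.CREATOR_TRUSTED:
        # Anything the creator vouches for is accepted.
        return has_creator
    if mode is TrustMode.FEDERATED:
        # A proof from any broker in the trusted set is enough.
        return has_trusted_broker
    raise ValueError(f"unknown trust mode: {mode}")
```

An app selling tickets might pick `STRICT`, while a tip jar could reasonably settle for `CREATOR_TRUSTED`.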

How it works

The core of the spec is a three-party attestation model. When a payment happens, three repositories each get records. The payer’s repo gets the payment record, which includes a signatures array that references proof records. The recipient’s repo gets a proof record with a CID based on the payment record and the payer’s DID. The broker’s repo gets its own independent proof.

Any application can verify a payment by fetching the payment record, getting the referenced proofs, recomputing the CIDs, and making sure everything matches. There’s no need for API calls to a payment platform or trust in a single database.
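
The verification loop can be sketched roughly as follows. This is a simplification under stated assumptions: a SHA-256 digest over canonical JSON stands in for a real CID (an actual implementation would compute a proper CID over the record's DAG-CBOR encoding as the attestation framework defines), and the field names `uri` and `cid` plus the `fetch_proof` callback are hypothetical.

```python
import hashlib
import json


def stand_in_cid(record, repo_did):
    """Hash the record (without its signatures) together with the repo DID.

    Stand-in only: not a real CID, just a digest that changes when either
    the record content or the repository DID changes.
    """
    body = {k: v for k, v in record.items() if k != "signatures"}
    payload = json.dumps({"record": body, "repository": repo_did},
                         sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def verify_payment(payment, payer_did, fetch_proof):
    """Return True if every referenced proof matches the recomputed digest.

    fetch_proof maps a proof URI to a proof record, or None if missing
    (hypothetical; in practice this would be a record fetch from a repo).
    """
    expected = stand_in_cid(payment, payer_did)
    refs = payment.get("signatures", [])
    if not refs:
        return False  # a payment with no proofs cannot be verified
    return all((fetch_proof(ref["uri"]) or {}).get("cid") == expected
               for ref in refs)
```

Because the repository DID is part of the hashed payload, the same record verified against a different DID fails, which mirrors the copy-protection property described earlier.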

Payment initiation uses DID document service endpoints. Recipients list which payment servicers they use. The payer’s client resolves those DIDs, the payer chooses a servicer, and the client starts the process with authenticated XRPC calls. The servicer returns a token and a URL. The payer finishes payment in their browser, and the client checks for completion.
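
The discovery step might look something like this. It is a sketch that assumes servicers appear as entries in the standard `service` array of a DID document; the service type string `AttestedPaymentService` is a placeholder invented for illustration, not a value from the spec.

```python
def payment_servicers(did_document, service_type="AttestedPaymentService"):
    """List (id, endpoint) pairs for services of the given type.

    The default service_type is a placeholder, not a value from the spec.
    """
    found = []
    for service in did_document.get("service", []):
        if service.get("type") == service_type:
            found.append((service.get("id"), service.get("serviceEndpoint")))
    return found
```

A client would present the returned endpoints to the payer, then start the authenticated XRPC flow against whichever one they choose.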

Status

The spec is still a draft. I’m sharing it now because I want feedback from people building on ATProtocol, especially those interested in commerce, subscriptions, or access control. The lexicon defines four record types (network.attested.payment.oneTime, recurring, scheduled, and proof) and three XRPC methods (initiate, status, and lookup).

If you want to implement a broker, add payment verification to your app, or just learn how it works, the site has guides for each role: brokers, app developers, recipients, and payers. There’s also a scenarios page with real examples.

Come join the Attested Network Working Group on discourse to talk about it.

https://ngerakines.leaflet.pub/3mjf4hflwg22j
Signaling AI Preferences on ATProto
Atproto users need a way to express granular AI preferences and carve out exceptions for specific entities or content types. This post introduces community.lexicon.preference.ai, a lexicon schema that decomposes AI usage into distinct categories and adds a scoped override mechanism built on top of Bluesky's User Intents proposal.

The Bluesky team published Proposal 0008: User Intents for Data Reuse last year, along with a discussion thread. The proposal describes a robots.txt-style mechanism for users to declare preferences about how their public data gets reused. It covers four broad categories: generative AI, protocol bridging, bulk datasets, and public archiving. The design is deliberately simple, with a single record and tri-state booleans.

I think time-stamped tri-bools with strong defaults make sense, but we also need to incorporate extension and inheritance to make the mechanism effective for real-world use cases that warrant finer granularity. A user who is fine with retrieval-augmented generation but opposed to training data inclusion is making a meaningfully different statement than someone who denies all AI usage. And users should be able to carve out exceptions for specific entities or content types without abandoning their defaults.

This post introduces community.lexicon.preference.ai, a lexicon schema that decomposes AI preferences into distinct categories and adds a scoped override mechanism for exceptions.

Three questions

The design starts from three questions.

How does a user signal their immediate AI preferences? A record in the user's ATProto repository, discoverable via getRecord, broadcast over the firehose, and included in CAR exports. The record lives at the well-known key self in the community.lexicon.preference.ai collection so any consumer can find it with a single call.
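
That single call can be sketched as a URL builder for the standard `com.atproto.repo.getRecord` XRPC method. The helper name and the PDS hostname used with it are illustrative; resolving the user's actual PDS from their DID is assumed to happen elsewhere.

```python
from urllib.parse import urlencode

COLLECTION = "community.lexicon.preference.ai"


def default_preference_url(pds_host, repo):
    """Build the getRecord URL for the default (key 'self') preference record."""
    query = urlencode({"repo": repo, "collection": COLLECTION, "rkey": "self"})
    return f"https://{pds_host}/xrpc/com.atproto.repo.getRecord?{query}"
```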

How does a user signal they've changed their preferences? Two layers. The record itself carries an updatedAt timestamp that tells polling consumers "something changed, re-evaluate." Each individual preference also carries its own updatedAt, so a consumer can determine exactly which preference changed and when. This matters for compliance pipelines that need cutoff dates. The ATProto commit log provides the full historical audit trail for anyone who needs it.
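
A polling consumer could use the two timestamp layers along these lines. This is a sketch, assuming the record has already been fetched and parsed into a dict; `changed_since` and `parse_ts` are names made up for the example.

```python
from datetime import datetime


def parse_ts(value):
    """Parse a datetime string; 'Z' is rewritten for older Python versions."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def changed_since(record, cutoff):
    """Return names of preferences whose own updatedAt is later than cutoff."""
    cutoff_ts = parse_ts(cutoff)
    if parse_ts(record["updatedAt"]) <= cutoff_ts:
        return []  # record-level timestamp says nothing changed
    return sorted(name for name, pref in record["preferences"].items()
                  if parse_ts(pref["updatedAt"]) > cutoff_ts)
```

The record-level check is the cheap "should I look closer" signal; the per-preference timestamps give the exact cutoff a compliance pipeline would need.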

How does a user carve out exceptions? Multiple records in the same collection. The self-keyed record is the default policy. Additional records keyed by TID are scoped overrides that target specific entities (by DID or domain) or specific collections (by NSID). Each override only needs to declare the preferences it changes. Everything else falls through to the default.

Preference categories

The Bluesky proposal groups all generative AI usage under a single syntheticContentGeneration flag. This lexicon breaks that apart into four categories.

training: Use of data as input for training, fine-tuning, distillation, or RLHF. This is what most users think about when they think about AI and their data.

inference: Use of data at inference time for retrieval, RAG, or context injection. The data is used but not baked into model weights.

syntheticContent: Use of data to generate new content or interactions derived from user data. Style imitation, content generation, synthetic personas.

embedding: Use of data for vector embeddings or semantic indexing. The Bluesky proposal explicitly excludes embeddings from its AI category. Some users will want control over this too, so it gets its own knob.

Each preference is tri-state: allow (true), deny (false), or undefined (field omitted). Undefined means the user has expressed no opinion, and consumers are left to their own policy decisions.

Scoping and overrides

Every record declares a scope that says what it applies to. This is a union of three types.

globalScope is the account-wide default. The record at key self should carry this scope. If a consumer finds no matching override for their situation, the global scope record is what applies.

entityScope targets a specific AI consumer identified by DID or domain. This is the "allow Anthropic for training even though my default is deny" case.

collectionScope targets a specific NSID in the user's repository. This is the "deny all AI usage of my images even though my default allows inference" case.

Making scope required on every record (rather than optional with an implicit "global if missing") avoids ambiguity when someone accidentally creates two scopeless records. Every record is self-describing. A consumer reading any single record knows exactly what it applies to without inspecting the record key.

Resolution is merge, not replace. An entity override that only declares training: { allow: true } inherits the global record's stance on inference, embedding, and everything else. Overrides are surgical.
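
In code, merge-not-replace is little more than a dict overlay. The sketch below assumes the `preferences` objects have been parsed into plain dicts; the function names are illustrative.

```python
def merge_preferences(global_prefs, override_prefs):
    """Overlay an override's declared preferences on the global default."""
    merged = dict(global_prefs)    # start from the default stance
    merged.update(override_prefs)  # declared overrides win, field by field
    return merged


def stance(prefs, name):
    """Read a tri-state preference: True, False, or None when undefined."""
    pref = prefs.get(name)
    return None if pref is None else pref["allow"]
```

An override that only declares `training` leaves `inference` at the global value, and a category neither record declares stays undefined.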

The lexicon
{
  "lexicon": 1,
  "id": "community.lexicon.preference.ai",
  "description": "Declares a user's preferences regarding AI usage of their public data. A record at key 'self' with globalScope establishes default preferences. Additional records keyed by TID establish scoped overrides for specific entities or content collections.",
  "defs": {
    "main": {
      "type": "record",
      "key": "any",
      "record": {
        "type": "object",
        "required": ["updatedAt", "scope", "preferences"],
        "properties": {
          "updatedAt": {
            "type": "string",
            "format": "datetime",
            "description": "Timestamp of the most recent change to this record."
          },
          "scope": {
            "type": "union",
            "description": "What this record's preferences apply to.",
            "refs": ["#globalScope", "#entityScope", "#collectionScope"]
          },
          "preferences": {
            "type": "ref",
            "ref": "#preferenceSet"
          }
        }
      }
    },
    "preferenceSet": {
      "type": "object",
      "description": "A set of AI usage preferences. Omitted fields mean undefined (no declared preference).",
      "properties": {
        "training": {
          "type": "ref",
          "ref": "#preference",
          "description": "Use as input for training, fine-tuning, distillation, or RLHF of ML models."
        },
        "inference": {
          "type": "ref",
          "ref": "#preference",
          "description": "Use at inference time for retrieval, RAG, or context injection."
        },
        "syntheticContent": {
          "type": "ref",
          "ref": "#preference",
          "description": "Use to generate synthetic content or interactions derived from user data."
        },
        "embedding": {
          "type": "ref",
          "ref": "#preference",
          "description": "Use for vector embeddings or semantic indexing."
        }
      }
    },
    "preference": {
      "type": "object",
      "required": ["allow", "updatedAt"],
      "properties": {
        "allow": {
          "type": "boolean",
          "description": "Whether this usage is permitted (true) or denied (false)."
        },
        "updatedAt": {
          "type": "string",
          "format": "datetime",
          "description": "When this specific preference was last changed."
        }
      }
    },
    "globalScope": {
      "type": "object",
      "description": "Account-wide default. The record at key 'self' should carry this scope."
    },
    "entityScope": {
      "type": "object",
      "description": "Scopes preferences to a specific AI consumer.",
      "required": ["entity"],
      "properties": {
        "entity": {
          "type": "string",
          "description": "DID or domain of the entity this override applies to."
        }
      }
    },
    "collectionScope": {
      "type": "object",
      "description": "Scopes preferences to a specific record collection in the user's repository.",
      "required": ["collection"],
      "properties": {
        "collection": {
          "type": "string",
          "format": "nsid",
          "description": "NSID of the collection this override applies to."
        }
      }
    }
  }
}
Example records

Global default (key: self)

A user who denies training and synthetic content generation but allows inference. Embedding is left undefined.

{
  "$type": "community.lexicon.preference.ai",
  "updatedAt": "2026-04-04T12:00:00.000Z",
  "scope": {
    "$type": "#globalScope"
  },
  "preferences": {
    "training": {
      "allow": false,
      "updatedAt": "2026-04-04T12:00:00.000Z"
    },
    "inference": {
      "allow": true,
      "updatedAt": "2026-04-04T12:00:00.000Z"
    },
    "syntheticContent": {
      "allow": false,
      "updatedAt": "2026-04-04T12:00:00.000Z"
    }
  }
}
Entity override (key: TID)

The same user grants a specific entity permission to use their data for training, overriding the global deny. All other preferences inherit from the global default.

{
  "$type": "community.lexicon.preference.ai",
  "updatedAt": "2026-04-04T13:00:00.000Z",
  "scope": {
    "$type": "#entityScope",
    "entity": "did:plc:example-ai-company"
  },
  "preferences": {
    "training": {
      "allow": true,
      "updatedAt": "2026-04-04T13:00:00.000Z"
    }
  }
}
Collection override (key: TID)

The same user denies all AI usage of records in a specific collection, regardless of the global default.

{
  "$type": "community.lexicon.preference.ai",
  "updatedAt": "2026-04-04T14:00:00.000Z",
  "scope": {
    "$type": "#collectionScope",
    "collection": "app.bsky.feed.post"
  },
  "preferences": {
    "training": {
      "allow": false,
      "updatedAt": "2026-04-04T14:00:00.000Z"
    },
    "inference": {
      "allow": false,
      "updatedAt": "2026-04-04T14:00:00.000Z"
    },
    "syntheticContent": {
      "allow": false,
      "updatedAt": "2026-04-04T14:00:00.000Z"
    },
    "embedding": {
      "allow": false,
      "updatedAt": "2026-04-04T14:00:00.000Z"
    }
  }
}
Consumer resolution

A consumer resolving preferences for a given request follows this order:

For any matched override, declared preferences take effect and undeclared preferences fall through to the global default. If the global default also omits a preference, the result is undefined and the consumer applies their own policy.

When both an entity override and a collection override match, the more specific combination wins. For v1, I'd recommend treating this as undefined behavior and encouraging consumers to apply whichever override they find first. Compound scope resolution is worth specifying properly in a future version once real usage patterns emerge.
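
A reference consumer might resolve preferences along these lines. This is a sketch, not the specified algorithm: it assumes the user's records are available as a mapping from record key to record value, matches scopes by the `$type` suffixes used in the example records, and takes the first matching override in iteration order, as suggested for v1.

```python
def scope_matches(scope, consumer, collection):
    """Check whether an override's scope applies to this request."""
    kind = scope.get("$type", "")
    if kind.endswith("#entityScope"):
        return scope.get("entity") == consumer
    if kind.endswith("#collectionScope"):
        return scope.get("collection") == collection
    return False  # globalScope or unknown scopes are never overrides


def resolve(records, consumer, collection):
    """Resolve effective preferences for one consumer and one collection.

    records: mapping of record key -> record value from the user's repo.
    Undeclared preferences in the override fall through to the 'self' record.
    """
    default = records.get("self", {}).get("preferences", {})
    for key, record in records.items():
        if key == "self":
            continue
        if scope_matches(record.get("scope", {}), consumer, collection):
            return {**default, **record.get("preferences", {})}
    return dict(default)
```

Any category missing from the result is undefined, and the consumer falls back to its own policy.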

Relationship to the Bluesky user intents proposal

This lexicon is complementary to Proposal 0008, not a replacement. The Bluesky proposal owns the broad categories (bridging, archiving, bulk datasets) and the coarse syntheticContentGeneration flag at the protocol level. community.lexicon.preference.ai decomposes the AI dimension with finer granularity and adds the exception mechanism that the proposal deliberately omits.

A consumer could check both: the user-intents record for the high-level signal, and the AI preference records for nuance. If the user-intents record denies syntheticContentGeneration and the AI preference record allows inference, the AI preference record is the more specific signal and should take precedence for inference-related use cases.

The IETF is also working on related standards through the AI Preferences working group, including Short Usage Preference Strings and a vocabulary for AI training preferences. As those standards mature, the preference categories in this lexicon can evolve to align with whatever consensus vocabulary emerges. The scoping and override mechanism is independent of the specific categories and should remain stable.

What's next

A PR to add this lexicon to lexicon.community is live at lexicon-community/lexicon #72. If you want to discuss the design or propose changes, open an issue or find me on Bluesky. Once the lexicon lands, the immediate next steps are getting it into Lexicon Garden and others for discoverability, building a simple settings UI, and writing a reference consumer that demonstrates the resolution logic.

https://ngerakines.leaflet.pub/3miowjw5c222y
Payments on Protocol
atprotofans.com was a proof of concept for payments on ATProtocol built at Graze Social. This post is about what we built, why proof of payment on protocol matters, and what it makes possible.

AtmosphereConf was an incredible experience. The people, talks, and energy were everything I hoped they’d be. The days since have been harder. Some difficult conversations and difficult decisions have come out of the conference, and I’m still processing a lot of it. I don’t want to dwell on that here; it doesn’t feel good. Instead, I want to look back at the work we did at Graze Social and share the parts that feel good. This is one of a series of posts doing that.


I believe payments open up new ways for people to engage on ATProtocol. When individuals, organizations, or even automated systems can use and reference payment proofs, it lets them exchange goods and services on an open protocol that anyone can verify and observe. That was our idea, and we built something to show it works.

What We Built

On atprotofans.com, people can set up Stripe Connect to get payments. Anyone can log in and send a one-time payment to someone who has set this up. The features are basic, but the site got a lot of attention and showed that the ecosystem needs this kind of tool.

Behind the scenes, we built on features that Graze already had. Since we could already set up payout accounts for advertising, adding this new payment feature was pretty straightforward.

We always meant for the name “atprotofans” to be temporary and just for fun. Devin, Andrew, and I knew that from the start. We chose to support only one-time payments at first because this was just an early proof of concept. We had planned to build something more advanced later.

How It Works

The payments themselves aren’t the most interesting part; Stripe handles that. What matters is what happens next: a proof-of-payment record chain gets created. That initial record keeps you in control of this data. If a creator moves to a different service, they keep their followers and payers. If someone who pays switches apps, their purchase history goes with them. This is a big deal and sets us apart from just adding a payment form to a website.

Most of the technical challenges were about coordination. We had to write proof records for both recipients and observers and deal with the tricky situations that come up with asynchronous payment processing. That said, the main implementation was simple.

Teaching users how the process works is a challenge, but I’m not sure most people care about the details. What matters to them is the result: they control their own data, and switching services doesn’t cause any problems.

We made it clear from the start that everything would be public. Since ATProtocol doesn’t support private data yet, we decided to work with that limitation instead of trying to avoid it. Now that spaces and permissioned data are taking shape, the shape and scope of payments can change a little to adapt, but I don’t think by much.

Why It Matters

When you have a proof of payment recorded on the protocol, you can create new kinds of interactions that weren’t possible before. This isn’t about cryptocurrency. It’s just straightforward technology that lets you easily prove one person paid another, without needing to change or add to the protocol.

That proof creates many new possibilities. Payments can be linked to event RSVPs, like tickets, or to specific content, or even to recurring subscriptions. Each payment is a record on the protocol, so it’s portable and verifiable, not stuck on one platform.

https://ngerakines.leaflet.pub/3mijvoyyv722a
Graze Social and the IETF
Graze Social sponsored my first in-person IETF meeting in Montreal last November. This post is about what it was like to be there and why standards participation matters for small companies.

When I floated the idea of Graze sponsoring my attendance at the IETF Montreal conference last November, Devin was an immediate yes. It made sense for us to show our support in person, since that fits with Graze’s commitment to staying involved in protocol development and helping the ecosystem grow.

I also want to thank Bluesky Social for contributing to travel expenses. It was very generous, and I’m appreciative of the support.

Being There

This was my first time attending an IETF meeting in person. I enjoyed hanging out with Boris, Chad, Fig, Daniel, Bryan, Eli, and people from Germ, Cosmik Network, and Leaflet. We had a good group, and watching the World Series at the Airbnb with Boris, Ronin, and Was was a highlight.

Before the conference started, the Montreal welcoming committee got things going. We discussed a lot of ideas about community services, including how to handle aggregate identities and the records they share. It was several days of good, friendly, and focused conversation on the ATProtocol ecosystem.

I especially enjoyed the hack-day before the conference. It was really inspiring to see different groups share what they had worked on and be part of it. There’s a real sense of purpose, and you’re surrounded by people who care enough to be there, which keeps things productive.

The Connection to Graze

Bryan and Daniel had been working for some time to get an ATProtocol spec into the IETF. There’s been plenty of debate about what should go into the spec. The semi-formal feed generation process wasn’t directly included, but many parts of identity, feed processing and generation were. At Graze, our daily work involved those areas.

It was valuable to have a direct link between the standards discussions and the systems we were running. You see a spec in a new way when you’re the one building it and dealing with the tricky parts.

Why It Matters

Being involved in setting standards is important, especially as a way to keep things balanced. I strongly believe we need people with different opinions, backgrounds, and goals in the mix. If only a few big companies are involved, we lose our chance to shape the ecosystem.

Small companies and independent developers offer a different point of view. They face unique problems, have their own limits, and ask questions others might not. If those voices aren’t included, the spec ends up serving only the biggest organizations.

https://ngerakines.leaflet.pub/3mihiyqjtik2c
If This Then AT: Automation on Protocol
IFTTA is an automation platform built on ATProtocol at Graze Social. It started as a hack-day idea and became a working system for event processing on protocol.

AtmosphereConf was an incredible experience. The people, talks, and energy were everything I hoped they’d be. The days since have been harder. Some difficult conversations and difficult decisions have come out of the conference, and I’m still processing a lot of it. I don’t want to dwell on that here; it doesn’t feel good. Instead, I want to look back at the work we did at Graze Social and share the parts that feel good. This is one of a series of posts doing that.


Last summer, Graze sponsored my travel to New York for the ATProtocol hack-day. Devin and I bounced ideas around and tinkered with different things, but at one point said, “wouldn’t it be cool if …” and that’s when the automation pipeline idea took off. We had a lot of great conversations and ideas that day, but this one really stuck.

IFTTA was absolutely one of those “yeah, why not” moments. Devin wanted to tie in Zapier for automation, so why not connect the two? I focused on the core pipeline and ATProtocol parts, while he worked on the Zapier side. Before we knew it, we had a working, feature-complete project ready to show.

How It Works

The automation engine focuses on blueprints that describe chains of nodes that either transform input or invoke an action. Each automation works as a series of steps: a trigger starts things off, data moves through several transforms, and an action happens at the end. Triggers can be firehose subscriptions, webhooks, or crontab schedules. Nodes process and reshape the data as it goes through. This includes making authenticated XRPC calls or connecting to external endpoints.

I had been trying out the datalogic-rs crate for JSONLogic conditionals, and adding it to the transform layer made sense. It let us set up complex filtering and routing without having to write custom code for each automation. You just describe what you want, and the engine checks it against the data as it moves through.

The result is a system where you can subscribe to firehose events, filter them with JSONLogic, reshape the data, and trigger authenticated actions on ATProtocol services or other platforms.
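
To make the shape concrete, here is a toy version of that pipeline. It is not IFTTA's code or the datalogic-rs API: the evaluator handles only a tiny JSONLogic subset (`var` with a dotted string path, `==`, `and`, `or`, with `and`/`or` reduced to plain booleans), and `run_blueprint` is a hypothetical name.

```python
def evaluate(rule, data):
    """Evaluate a tiny JSONLogic subset: var, ==, and, or."""
    if not isinstance(rule, dict):
        return rule  # literals evaluate to themselves
    (op, args), = rule.items()
    if op == "var":
        value = data
        for part in str(args).split("."):
            value = value.get(part) if isinstance(value, dict) else None
        return value
    values = [evaluate(arg, data) for arg in args]
    if op == "==":
        return values[0] == values[1]
    if op == "and":
        return all(values)
    if op == "or":
        return any(values)
    raise ValueError(f"unsupported operator: {op}")


def run_blueprint(event, condition, transforms, action):
    """Run one triggered event through filter -> transforms -> action."""
    if not evaluate(condition, event):
        return None  # filtered out; the action never runs
    for transform in transforms:
        event = transform(event)
    return action(event)
```

The condition is plain data, so a filter like `{"==": [{"var": "commit.collection"}, "app.bsky.feed.post"]}` can be stored alongside the blueprint instead of being compiled into the service.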

What It Became

IFTTA shows what’s possible when you see ATProtocol as a platform for automation, not just social networking. Making authenticated XRPC calls, subscribing to the firehose, using webhook and crontab triggers, and stacking transforms is a pattern I’ve used since then and plan to keep using.

ATProtocol’s event stream and authenticated API calls make a solid foundation for this kind of project. IFTTA proved it works, and the building blocks are there for anyone who wants to use them.

IFTTA is open source under the MIT license.

GitHub - graze-social/iftta: AT Protocol automation service written in Rust


https://ngerakines.leaflet.pub/3migskzxjxk2s
Building AIP: An ATProtocol Authorization Gateway
OAuth is the first challenge developers face in the atmosphere. This post is about AIP, the authorization gateway we built at Graze Social to alleviate some of the pain.

AtmosphereConf was an incredible experience. The people, talks, and energy were everything I hoped they’d be. The days since have been harder. Some difficult conversations and difficult decisions have come out of the conference, and I’m still processing a lot of it. I don’t want to dwell on that here; it doesn’t feel good. Instead, I want to look back at the work we did at Graze Social and share the parts that feel good. This is one of a series of posts doing that.


OAuth is the first challenge most developers face when building in this space. If you want to make an app and need users to sign in, you quickly find yourself dealing with token management, identity resolution, and XRPC proxying before you even start on your main product. AIP was created to help organize that process.

The idea grew out of many talks with Devin, Boris, and others in the ATProtocol developer community. While explaining some of the OAuth work I was doing, I realized OpenID Connect was much easier to set up than I thought. Having one configured endpoint as an authorization gateway would simplify things for everyone—not just for ATProtocol apps, but for anything that already uses OIDC.

That ended up being the best part. AIP works with Discourse, WordPress, Matrix, and existing OIDC libraries, and none of them need to know anything about ATProtocol. The ATProtocol Community Discourse has used it since we set it up, and no one even notices. It’s tech that works quietly in the background, and that feels great. I’m really happy with the result.

The Rust Port

AIP began as a Python app that used Redis for both queuing and caching. It mostly worked, but I had trouble getting token refresh to work reliably. Eventually, I decided to rewrite it in Rust and use some of the OAuth code I’d already built for Smoke Signal.

This was one of the first Rust components in the Graze Social tech stack, so there was a lot to figure out beyond the code itself. Building, configuring, deploying, and fitting it into the existing infrastructure all had to be worked out. I really appreciate Casey’s patience and support through all of that.

Switching to Rust paid off in ways I didn’t expect. The ATProtocol flavor of OAuth has a lot of edge cases, especially with identity resolution and management. Working through these made the whole process second nature. After a year of building, testing, deploying, and fixing bugs in AIP, explaining OAuth to other developers became much easier.

Teaching

That confidence led to the OAuth Masterclass and later the OAuth workshop at this past AtmosphereConf. The material wasn’t something I sat down and designed from scratch. It grew out of the accumulated experience running AIP in production and handling all the strange edge cases. After fixing token refresh issues so many times, explaining it just became natural.

What I Learned

It’s been great to see people start using AIP. I’ve found about a dozen cases where other developers and teams are using it, which is more than I expected for a piece of infrastructure.

The work that surprised me most was with MCP and agentic platforms. AIP fully supports agent authentication using dynamic client registration (RFC 7591), which is key for authenticated agents in this space. Developing and testing this taught me a lot about how agentic systems work with OAuth, and that experience has been really valuable.
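
For readers who haven't seen dynamic client registration, the request body an agent sends looks roughly like this. The field names are the client metadata defined in RFC 7591; the specific values, and whether AIP expects exactly this combination, are assumptions made for illustration.

```python
def registration_request(client_name, redirect_uris, scope):
    """Build a dynamic client registration body using RFC 7591 metadata fields."""
    return {
        "client_name": client_name,
        "redirect_uris": list(redirect_uris),
        "grant_types": ["authorization_code", "refresh_token"],
        "response_types": ["code"],
        # "none" marks a public client with no client secret (assumed here).
        "token_endpoint_auth_method": "none",
        "scope": scope,
    }
```

The agent would POST this as JSON to the gateway's registration endpoint and receive a `client_id` in response, with no manual app setup beforehand.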

I’m proud of AIP. The Masterclass, the workshop, and the agent authentication work all came from spending a lot of time on this problem and making sure we got it right.

AIP is open source under the MIT license.

GitHub - graze-social/aip: ATmosphere Authentication, Identity, and Permission Proxy


https://ngerakines.leaflet.pub/3mifrzdsdds2x
Spaces as Layers
Public anchor records paired with sidecar records in permissioned spaces give ATProtocol apps a composable pattern for blending open discoverability with controlled access.

One challenge with ATProtocol’s permissioned data model is handling records that need to be both public and private. Events are a good example. If you create a fully private event that stays in a private space, things are simple: the event record and all RSVPs remain there, and only members can see them. Each RSVP points to the event in that space, so access control is easy.

It gets more complicated when an event is partly public. For example, a conference afterparty might be announced to everyone, but the venue address and capacity are only shared with confirmed guests in a private space. The public event record and the private details each have their own AT URIs, so they are separate records in different places. This means RSVPing to the public event is not the same as RSVPing in the private space, even though both are for the same event. Users can’t just RSVP publicly if the full details are hidden, because the protocol doesn’t have a built-in way to connect those two identities.

One way to handle this is for the event organizer to use a space for the event and add extra records to it. The public event record is posted on the organizer’s PDS, while sensitive details like the afterparty address, access instructions, and capacity limits are stored as sidecar records in the event’s permissioned space. When someone’s RSVP is confirmed, the organizer gives them read-only access to the space so they can see the private details. This keeps the public event easy to find and join, while private information is only shared with the right people. The space acts as a simple container for the event’s private side, and the organizer decides who can see what.

This approach isn’t just for events. The main idea is to have a public anchor record, like an event, forum topic, discussion thread, or job posting, that anyone can find and interact with. Alongside this, there is a space for permissioned identities and sidecar records for sensitive or extra data, visible only to certain people. The anchor record stays public and can move across the network, while the sidecars are limited to spaces with controlled access. Space-aware apps can show both layers to users with the right keys, giving a smooth experience without losing access controls. This method is flexible: one anchor can link to many spaces, each designed for its own audience and context.

https://ngerakines.leaflet.pub/3mhweogeqzk2k
ATProtocol Patterns: Record Elicitation
Record elicitation is a pattern where a client asks an AppView to construct a record from the user's intent, rather than building it locally. This lets the AppView handle business logic, validation, and schema complexity while the client retains full authority over what gets written to the user's repository.

In ATProtocol, users have full control over writing records. Only an authenticated client, using OAuth or an app-password session, can write records to a repository on a PDS. This is intentional. Only a user working directly with a client can authorize these actions.

So, let’s think about what happens when a user wants to do something important, like creating an event, purchasing a ticket, or creating mixed content, using a specific AppView.

How can a client show that it is using a specific AppView? If the client creates a record by itself, there is no proof that the user was working with a particular service. The record simply appears in the repo, and all AppViews see it the same way. For things like ticketing, attestations, or moderation, the client needs a way to show that a certain service was involved when the record was made.

How can a client make sure it prepares data correctly? Lexicon schemas define the structure of a record, but they do not cover every rule an AppView might have. For example, a lexicon schema might declare a field is a string, but the AppView might expect a specific formatted value or a reference to something in its own state. This leaves the client guessing or depending on documentation that might not be up to date.

How can a client handle business logic that only the server knows? Some fields in a record depend on information only the AppView has, like transaction identifiers, sequence numbers, calculated references, timestamps from the service, or values from the AppView’s own indexes. The client cannot fill in what it does not know.

How can a client make sure it uses the latest fields, validation, and schema? Schemas change over time. An AppView might add new fields, remove old ones, or make validation stricter. If a client builds records based on an old version of the schema, it might create records that are technically valid but outdated—missing new fields or not matching how the service now handles data.

The Two Paths

Currently, when a client app wants to create a record, it builds it itself and writes it directly to the PDS. The client must know the lexicon, understand what the AppView expects, and correctly fill in every field. This approach works well for simple cases, like creating a post or updating a profile, where all the content comes from user input and the schema defines the rules.

The second path is record elicitation.

Instead of building the record itself, the client calls an XRPC method on the AppView and sends the user’s intent as parameters. The AppView handles these parameters by applying its business logic, checking its own state, and adding any needed values. It then returns a complete record, which the client publishes to the user’s PDS.

The user’s client is still the only one that writes records, so the authority model stays intact. However, the AppView has helped build the record, and both the client and the service are aware of this.

How It Works

Here’s how the process works:

  1. The client collects user intent (the parameters that drive the record’s creation).

  2. The client calls an XRPC method on the AppView, passing those parameters.

  3. The AppView applies its logic and returns a record (or an error explaining why the record can’t be created).

  4. The client presents the record to the user for confirmation, if appropriate.

  5. The client writes the record to the user’s PDS via com.atproto.repo.applyWrites or com.atproto.repo.createRecord.

  6. The AppView sees the record arrive through its normal indexing pipeline and recognizes it.

Step 4 is important. Since the client is still in control, it can review the record before publishing. This adds transparency, letting the user see exactly what will be written to their repo, even if they did not create it field by field.

A Concrete Example: Smoke Signal Events

To make this tangible, consider Smoke Signal, an event and RSVP management platform. On Smoke Signal, you can create an event today either by submitting the form parts to POST /event through the web interface or by creating the event record yourself.

Events can be complex. The event body might include mentions, links, and hashtags that need to be turned into facets. Managing start and end times is tricky, with issues like timezones, daylight saving changes, multi-day events, and differences between all-day and timed events. Adding locations makes things even more complicated, with geocoding, address fields, and venue references.

A client that tries to build an event record from scratch has to handle all these details. It must parse rich text into facets, manage date and time formats, and know how location data is structured. This leads to a lot of repeated logic in every client that creates events, increasing the risk of subtle bugs.

This is where record elicitation is especially useful. By adding a method like events.smokesignal.calendar.createEventIntent, we can handle all that complexity at once. A simple version is a query XRPC endpoint that takes flat key/value arguments. A client would use it like this:

GET /xrpc/events.smokesignal.calendar.createEventIntent
    ?name=Party
    &description=Party+at+Nick%27s+house%2C+bring+your+https%3A%2F%2Fshakoolie.com%2F
    &startsAt=2026-03-14T18:00
    &location=123+Main+St%2C+Dayton+OH

The client then gets back a complete record, ready to publish. The description is parsed into structured facets (with the Shakoolie link extracted and annotated), the date and time are normalized and checked, and the location is geocoded and formatted according to Smoke Signal’s schema.

The client did not need to know how to parse facets, use a timezone library, or have a geocoding API key. It just sent the user’s intent as simple parameters, and the AppView, which defines what a valid event record looks like, returned the correct version.

After that, the client reviews the record, shows it to the user, and writes it to their PDS. Smoke Signal receives it, recognizes it as a valid event, and indexes it right away. There is no confusion, no hidden validation errors, and no missing fields.
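To make the client side of that call concrete, here is a minimal sketch that builds the query URL from flat parameters. The host appview.example and the helper name build_intent_url are placeholders I made up for illustration; only the method name and the parameters come from the example above.

```python
from urllib.parse import urlencode

# Method name from the example above; the host used below is a placeholder.
METHOD = "events.smokesignal.calendar.createEventIntent"


def build_intent_url(appview_host: str, intent: dict) -> str:
    """Encode the user's intent as flat key/value query parameters."""
    return f"https://{appview_host}/xrpc/{METHOD}?{urlencode(intent)}"


intent = {
    "name": "Party",
    "description": "Party at Nick's house, bring your https://shakoolie.com/",
    "startsAt": "2026-03-14T18:00",
    "location": "123 Main St, Dayton OH",
}
url = build_intent_url("appview.example", intent)
```

The client only deals with strings here. Facets, timezones, and geocoding stay on the AppView side of the call.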

Blobs and Complex Record Sets

The Smoke Signal example uses a simple query with flat parameters, which is good for showing the pattern. But record elicitation can do more. What if records include blobs? Or what if creating one user action needs several records and multiple blob uploads?

Most lexicons today use JSON for input and output, but that is not required. An XRPC method can use any encoding it needs, including multipart data.

Let’s build a hypothetical app.bsky.feed.createPostIntent XRPC procedure that accepts multipart data and returns multipart data.

The request is a multipart body. The first segment is a JSON (or form-encoded) part containing the post parameters like text=Hello World!. The subsequent segments are the media attachments: images the user wants to include with the post.

POST /xrpc/app.bsky.feed.createPostIntent
Content-Type: multipart/form-data; boundary=----intent

------intent
Content-Disposition: form-data; name="params"
Content-Type: application/json

{"text": "Hello World! Check out this view from the summit."}
------intent
Content-Disposition: form-data; name="image"; filename="summit.jpg"
Content-Type: image/jpeg

<raw image bytes>
------intent--

An AppView such as did:web:api.blacksky.community#bsky_appview might have specific requirements for images, such as maximum dimensions, preferred aspect ratios, or file-size limits. It might generate resized versions or run the image through a classifier to suggest labels for the user before publishing. The AppView has the tools and context to make these decisions, so the client does not need to duplicate that logic.

The response is also a multipart body. Each part describes itself using its Content-Disposition name. Record parts use the format collection/rkey, like app.bsky.feed.post/3mgaivrllyc2z, so the client knows where to write them. Blob parts are named by their CID, which matches the $link value in the record’s embed, so the client can pair each blob with the record that references it.

HTTP/1.1 200 OK
Content-Type: multipart/form-data; boundary=----b01KJWMJJ1VCM2WTW0Q5BHYJVE7

------b01KJWMJJ1VCM2WTW0Q5BHYJVE7
Content-Disposition: form-data; name="app.bsky.feed.post/3mgaivrllyc2z"
Content-Type: application/json

{
    "text": "This is the same energy as requiring your email to read an article or mandating that you disable ad blockers. I will never use your service if your core proof of value is spamming the atmosphere.",
    "$type": "app.bsky.feed.post",
    "embed": {
        "$type": "app.bsky.embed.images",
        "images": [
            {
                "alt": "",
                "image": {
                    "$type": "blob",
                    "ref": {
                        "$link": "bafkreiebnynqipchdckmd3mx5ogioffe4in7t7rpbqc73km2semyy7zkcy"
                    },
                    "mimeType": "image/jpeg",
                    "size": 370888
                },
                "aspectRatio": {
                    "width": 617,
                    "height": 959
                }
            }
        ]
    },
    "langs": [
        "en"
    ],
    "createdAt": "2026-03-04T14:34:25.451Z"
}
------b01KJWMJJ1VCM2WTW0Q5BHYJVE7
Content-Disposition: form-data; name="bafkreiebnynqipchdckmd3mx5ogioffe4in7t7rpbqc73km2semyy7zkcy"
Content-Type: image/jpeg

<processed image bytes>
------b01KJWMJJ1VCM2WTW0Q5BHYJVE7--

Now the client’s job is much easier. It gets the response, goes through the parts, uploads the blobs to the user’s PDS, and writes the records. That is all. The client does not need an image processing library, does not need to know the AppView’s preferred dimensions, and does not have to handle EXIF stripping or format conversion. The part names provide all the information needed for writing.

This pattern works for even more complex situations. An elicitation endpoint can return several record parts, like a post and a threadgate, a list item and a metadata update, or an event and a set of invite records. Each part has its own collection/rkey name. The client can process the whole group at once using com.atproto.repo.applyWrites.
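Here is a rough sketch of how a client might sort the response parts into blobs to upload and a single applyWrites request body. It assumes the multipart body has already been parsed into simple parts. The Part class and plan_writes function are hypothetical names, and the naming rule (a "/" means a record, otherwise a blob CID) is the convention from this hypothetical endpoint, not an existing API.

```python
import json
from dataclasses import dataclass


@dataclass
class Part:
    """One already-parsed multipart segment (hypothetical shape)."""
    name: str          # the Content-Disposition name
    content_type: str
    body: bytes


def plan_writes(repo_did: str, parts: list) -> tuple:
    """Split parts into blobs to upload and an applyWrites request body."""
    blobs = {}
    writes = []
    for part in parts:
        if "/" in part.name:
            # Record parts are named "collection/rkey".
            collection, rkey = part.name.split("/", 1)
            writes.append({
                "$type": "com.atproto.repo.applyWrites#create",
                "collection": collection,
                "rkey": rkey,
                "value": json.loads(part.body),
            })
        else:
            # Blob parts are named by CID; upload these before writing records.
            blobs[part.name] = part
    return blobs, {"repo": repo_did, "writes": writes}
```

The client would upload each entry in blobs first, then send the second value as the body of com.atproto.repo.applyWrites.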

What This Enables

Record elicitation opens up patterns that are difficult or impossible with client-only composition.

Service-attested records. The AppView can embed a signature in the record that proves it was involved in construction. Other consumers of the record can verify this. This is useful for ticketing, attestations, or any context where provenance matters beyond “a user wrote this.”

Server-side validation before write. Rather than the client writing a record and hoping the AppView accepts it, the AppView validates upfront. If a field references an entity that doesn’t exist in the AppView’s index, or if a business rule would be violated, the error surfaces before the record hits the repo.

Computed and derived fields. Sequence numbers, canonical references, content-addressed identifiers, timestamps from the service’s clock: any value that depends on server-side state can be populated by the entity that actually has that state.

Schema evolution without client churn. When an AppView adds optional fields or changes how certain values should be populated, the elicitation endpoint absorbs that complexity. Clients pass intent; the service handles the rest. This reduces the coordination cost of evolving record schemas across a diverse client ecosystem.

Multi-step workflows. The elicitation call doesn’t have to be a single round trip. An AppView could return a partial record along with a set of choices the client needs to present to the user, leading to an interactive flow that progressively builds the record.

Where This Fits

I call this pattern “record elicitation” because the client asks the service for a record based on the user’s intent, rather than building it itself. You could also call it “record intent,” where the client shares what the user wants to do, and the service turns that into a real record. Both names capture the main idea: the user’s intent flows from the client to the service, and a ready-to-publish record is returned.

Record elicitation does not replace direct record composition. For simple records and most cases, it is easier for the client to build them locally. But as ATProtocol apps become more complex, with more business logic, cross-service links, and stronger proof of origin, the gap between what the client knows and what the record needs will grow. Record elicitation helps close that gap while keeping the key feature of ATProtocol: the user’s client is always in control.

https://ngerakines.leaflet.pub/3mgal3seass2q
CIDs: What You Need to Know and Why, Part 2
How ATProtocol uses Content Identifiers to create a versioned, verifiable, and portable data model.

In Part 1, we explained what CIDs are: self-describing, content-addressed identifiers. We discussed their origins in Git and BitTorrent, their development through IPFS, Multiformats, and IPLD, how they are encoded using a multicodec version and codec prefix over a multihash, and the constraints ATProtocol sets: only CIDv1, only SHA-256, two codecs, and a fixed size of 36 bytes.

We also introduced a key concept in ATProtocol’s data model: AT-URIs are location-addressed, meaning they name a mutable slot in a repository, while CIDs are content-addressed and name an immutable piece of content. An AT-URI is a label that can be re-pointed at new content, whereas a CID is a fingerprint that never changes.

In this part, we continue exploring the AT-URI and CID duality. This forms the basis of a versioning system, with the repository’s logical clock providing the timeline. We will explain this model, show how CIDs are created in practice, and follow their movement through repositories, the Merkle Search Tree, inter-record links, and the sync protocol.


Naming, Versioning, and Time

AT-URIs as mutable labels on immutable content

At any time, an AT-URI points to a CID. When you use com.atproto.repo.getRecord, the PDS checks the current Merkle Search Tree and returns the record data with its CID. The CID is not separate metadata attached to the record: it is computed directly from the record’s DAG-CBOR serialized bytes, so it identifies exactly that content. The MST leaf entry for that record’s path stores the CID as its value pointer.

This means the AT-URI to CID mapping changes over time. For example, at://did:plc:abc/app.bsky.feed.post/3k2la7fx2jc22 might currently resolve to bafyreig5.... If the record is updated, such as editing the post text, the same AT-URI now points to bafyreih7.... The old CID is still valid and identifies the old content, but the AT-URI no longer refers to it.

This is similar to how a variable name relates to a value in a program. For example, the name x might hold 42 at one time and 99 later. The name can change its value, but the values themselves do not change. You can ask, “what does x hold right now?” or “is this value 42?”, but these are different questions with their own stability guarantees.

AT-URIs with handles are doubly mutable because the handle can change if the user moves to a new domain, and the content can change if the record is edited. AT-URIs with DIDs are singly mutable: the identity stays the same, but the content can still change. Only a CID is truly immutable, as it is permanently tied to its content by the SHA-256 algorithm.

StrongRef: pinning the binding

ATProtocol recognized that many situations need to freeze the AT-URI to CID binding at a specific moment. This lets you say not just “I liked this post,” but “I liked this specific version of this post.” The com.atproto.repo.strongRef type is designed for this purpose:

{
  "uri": "at://did:plc:abc/app.bsky.feed.post/3k2la7fx2jc22",
  "cid": "bafyreig5..."
}

A StrongRef saves both the name and the fingerprint. If the referenced post is edited later, the StrongRef still points to the original version. The CID acts as an integrity check: “the content at this AT-URI, at the time I referenced it, had this exact CID.”

Devin Ivy of the Bluesky team described this pattern in a GitHub discussion: a CID in a like record serves as “a qualifier to support an integrity check or strong reference… so that if the record is modified, the liker has documented the specific record version they liked — useful if the record has changed in some considerable way.”

StrongRefs are used throughout the Bluesky application lexicons. For example, app.bsky.feed.like stores both the liked post’s URI and CID, locking in the exact version that was liked. app.bsky.feed.repost does the same for reposted content. Reply references in app.bsky.feed.post include both a parent and root StrongRef. The pattern is clear: any cross-repo reference that needs version pinning uses a StrongRef.
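As a small illustration of the integrity check a StrongRef enables, the sketch below compares the pinned CID with whatever the AT-URI resolves to now. The StrongRef class mirrors the two fields of com.atproto.repo.strongRef; the check_strong_ref helper and its return values are my own names, not part of any lexicon.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StrongRef:
    uri: str  # the mutable name
    cid: str  # the fingerprint captured when the reference was made


def check_strong_ref(ref: StrongRef, current_cid: Optional[str]) -> str:
    """Compare a pinned CID with the CID the AT-URI currently resolves to."""
    if current_cid is None:
        return "deleted"
    return "unchanged" if current_cid == ref.cid else "edited"
```

A like on a post that was later edited would come back as "edited", which tells a consumer that the liker saw a different version than the one currently published.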

The Bluesky team has also considered a “strong reference URI,” which would be a single AT-URI string that includes the CID directly, such as at://did:plc:abc/app.bsky.feed.post/3k2la7fx2jc22#bafyrei.... This would combine the two fields into one string while keeping the version-pinning feature.

The triple: (AT-URI, CID, rev)

In addition to the AT-URI and CID, time is an important third element. ATProtocol highlights the importance of this time aspect by associating each AT-URI and CID link with the repository’s logical clock.

Every commit to a repository carries a rev, a TID (Timestamp Identifier) that the specification explicitly describes as a “logical clock.” TIDs encode microseconds since the UNIX epoch in a base32-sortable, 13-character string. They must be monotonically increasing within a repository: each commit’s rev must be greater than the previous commit’s rev.
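To show what is packed inside a TID, here is a minimal encoder and decoder. It assumes the layout from the TID specification as I understand it: a 64-bit value with a zero top bit, 53 bits of microseconds, and a 10-bit clock identifier, written with the sortable base32 alphabet. The function names are mine.

```python
from datetime import datetime, timezone

# Sortable base32 alphabet: ASCII order matches numeric order.
TID_ALPHABET = "234567abcdefghijklmnopqrstuvwxyz"


def tid_decode(tid: str) -> tuple:
    """Return (microseconds since the UNIX epoch, clock id) for a TID."""
    if len(tid) != 13:
        raise ValueError("a TID is exactly 13 characters")
    value = 0
    for char in tid:
        value = (value << 5) | TID_ALPHABET.index(char)
    # The low 10 bits are the clock identifier, the rest is the timestamp.
    return value >> 10, value & 0x3FF


def tid_encode(micros: int, clock_id: int = 0) -> str:
    """Pack a microsecond timestamp and clock id into a 13-character TID."""
    value = (micros << 10) | (clock_id & 0x3FF)
    chars = []
    for _ in range(13):
        chars.append(TID_ALPHABET[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


def tid_datetime(tid: str) -> datetime:
    """Convert a TID to a timezone-aware UTC datetime."""
    micros, _ = tid_decode(tid)
    return datetime.fromtimestamp(micros / 1_000_000, tz=timezone.utc)
```

Because the alphabet is in ASCII order, comparing two TIDs as plain strings gives the same answer as comparing their timestamps. That is why rev values can be ordered without decoding them.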

Each rev corresponds to an atomic snapshot of the entire repository. The signed commit at that revision contains a data CID pointing to the MST root, and the MST root determines every AT-URI + CID binding in the repo at that point in time. A single rev implies a complete, consistent mapping:

rev → commit CID → MST root CID → { AT-URI₁ → CID₁, AT-URI₂ → CID₂, ... }

The triple (AT-URI, CID, rev) is a versioning primitive with three specific guarantees.

Ordering: The rev provides a complete order of all states in a single repository. If you observe the same AT-URI at two points, (CID₁, rev₁) and (CID₂, rev₂), and rev₂ is greater than rev₁, then CID₂ is the newer version. If CID₁ equals CID₂, the record did not change between those revisions; only other records in the repository changed.

Consistency: Since each rev matches a signed commit, all AT-URI to CID bindings at that rev are internally consistent. There are no “torn reads”; you cannot see some records from one version and others from a different version. The MST root CID at a given rev determines the entire state.

Verification: With a rev and its signed commit, you can check every AT-URI to CID binding by walking the MST. The commit signature covers the root CID, and the CID values flow through the tree to every leaf. You do not need to trust the PDS. Anyone with the commit and tree data can verify the entire snapshot independently.

ATProtocol’s rev is a Lamport timestamp, which is a single counter that only increases and is managed by one process. Here, that process is the repository’s main PDS. The rev gives a total order of all commits within a single repository. However, it does not order commits across different repositories, since there is no causal link between rev values from different DIDs.

The sync protocol uses rev to keep clocks in sync. The since field in com.atproto.sync.subscribeRepos commit events shows the observer’s clock position: “this diff is relative to rev N.” When a relay asks for a repo diff, it is requesting “give me everything since rev X,” which updates its local view of the repo’s timeline.

If the since value does not match the observer’s last known rev for a DID, the repo is marked as out-of-sync, similar to finding a gap in a Lamport clock sequence. Because rev values use the TID format (which is timestamp-based), any revs that seem to be in the future, beyond a small allowed clock drift, are rejected. The logical clock is loosely tied to real-world time.
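The two checks described here can be sketched as a small classifier. This is an illustration under assumptions, not how any relay is implemented: the five-minute drift allowance is a number I picked, the result strings are mine, and the TID decoding assumes the 10-bit clock identifier occupies the low bits.

```python
from typing import Optional

TID_ALPHABET = "234567abcdefghijklmnopqrstuvwxyz"
# Assumed tolerance for clock drift; the text above only says "small".
MAX_DRIFT_US = 5 * 60 * 1_000_000


def tid_micros(tid: str) -> int:
    """Decode the microsecond timestamp carried by a TID-formatted rev."""
    value = 0
    for char in tid:
        value = (value << 5) | TID_ALPHABET.index(char)
    return value >> 10  # drop the 10-bit clock identifier


def check_commit(last_rev: Optional[str], since: Optional[str],
                 rev: str, now_us: int) -> str:
    """Classify an incoming commit event against the observer's last known rev."""
    if tid_micros(rev) > now_us + MAX_DRIFT_US:
        return "rejected"      # rev claims to come from the future
    if since != last_rev:
        return "out-of-sync"   # the diff is relative to a rev we are not at
    return "ok"
```

An observer that gets "out-of-sync" would fall back to fetching the repository again rather than applying the diff.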

The Bluesky team’s 2023 repo sync update described this directly: “If a consumer encounters the same repo from two different sources, each with a valid signature and structure, the revision gives a simple mechanism to determine which is the most recent repository.”

The MVCC mental model

If you have a background in databases, you might notice this pattern is Multi-Version Concurrency Control (MVCC). In this model, the AT-URI is like a row key, the CID is the cell value, and the rev is the version timestamp.

┌───────────────────────────────┬──────────────┬───────────────┬───────────┐
│ AT-URI                        │ CID          │ rev           │ change    │
├───────────────────────────────┼──────────────┼───────────────┼───────────┤
│ .../app.bsky.feed.post/abc123 │ bafyreig5... │ 3k2la7ax1zz99 │ created   │
│ .../app.bsky.feed.post/abc123 │ bafyreih7... │ 3k2la7fx2jc22 │ edited    │
│ .../app.bsky.feed.post/abc123 │ bafyreih7... │ 3k2la7gx2ab34 │ unchanged │
│ .../app.bsky.feed.like/def456 │ bafyreij3... │ 3k2la7fx2jc22 │ created   │
│ .../app.bsky.feed.like/def456 │ (deleted)    │ 3k2la7hx3cd56 │ unliked   │
└───────────────────────────────┴──────────────┴───────────────┴───────────┘

Each rev acts as a transaction ID. When you “read at rev R,” you get the latest CID for each AT-URI where rev is less than or equal to R. This is exactly what repo sync does: it asks for the state at or since a specific rev.
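A "read at rev R" can be sketched over the rows of the table above. This is only a toy model of the idea: a real repository answers this through the MST and signed commits, not a flat list. HISTORY and read_at are illustrative names.

```python
# (AT-URI, CID or None for a delete, rev) rows copied from the table above.
HISTORY = [
    (".../app.bsky.feed.post/abc123", "bafyreig5...", "3k2la7ax1zz99"),
    (".../app.bsky.feed.post/abc123", "bafyreih7...", "3k2la7fx2jc22"),
    (".../app.bsky.feed.post/abc123", "bafyreih7...", "3k2la7gx2ab34"),
    (".../app.bsky.feed.like/def456", "bafyreij3...", "3k2la7fx2jc22"),
    (".../app.bsky.feed.like/def456", None, "3k2la7hx3cd56"),
]


def read_at(history: list, rev: str) -> dict:
    """Return the AT-URI -> CID snapshot visible at `rev`."""
    snapshot = {}
    # rev strings compare in commit order, so sorting by rev replays history.
    for uri, cid, row_rev in sorted(history, key=lambda row: row[2]):
        if row_rev <= rev:
            snapshot[uri] = cid
    # A None CID marks a deletion, so drop those entries from the result.
    return {uri: cid for uri, cid in snapshot.items() if cid is not None}
```

Reading at the second rev shows both the edited post and the like. Reading at the last rev shows only the post, because the like has been deleted by then.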

The MST acts as the index that makes this process efficient. Instead of scanning every record to find changes, the MST recalculates CIDs from the point of change upward. When a record is modified, its CID changes, which then changes its parent MST node’s CID, and this continues up to the root. The signed commit at each rev serves as the “transaction commit record” that makes the snapshot both durable and verifiable.

ATProtocol’s version of MVCC is different from what you find in traditional databases. In PostgreSQL or MySQL, MVCC depends on trusting the database engine to manage version history correctly. In ATProtocol, MVCC is externally verifiable. Anyone with the signed commit can independently rebuild and check the entire snapshot. The CID makes this possible, turning “trust the server’s version history” into “verify the math yourself.”

This is what makes account portability possible. You can export a snapshot as a CAR file, move to a new PDS, and anyone can check that the new host has the same data by verifying CIDs from the signed commit down. The version history is not kept in a proprietary database engine; instead, it is stored in a chain of content-addressed, cryptographically signed commits that anyone can verify.


Creating CIDs: Step by Step

Now that we have covered the theory, let’s look at the practical steps. This section explains how to create a CID for the two types of content ATProtocol uses: structured records (dag-cbor) and binary blobs (raw).

Record CID creation (dag-cbor, 0x71)

We will walk through the full process using a simple app.bsky.feed.post: a "Hello, world!" post with the three fields that every post includes:

{
  "text": "Hello, world!",
  "$type": "app.bsky.feed.post",
  "createdAt": "2025-02-20T12:00:00.000Z"
}

Step 1: Serialize as DAG-CBOR. Take the record data and serialize it using deterministic CBOR rules. This is where most of the complexity lives.

The first rule is key ordering. Map keys must be sorted by byte length first, then lexicographically within the same length (RFC 7049 §3.9 canonical CBOR ordering). For our three keys, that means:

"text"      4 bytes
"$type"     5 bytes
"createdAt" 9 bytes

This creates a key order that may seem unusual if you are used to alphabetical JSON, since text comes before $type, but it is correct under DAG-CBOR rules. Shorter keys always come first. If keys have the same length, they are sorted lexicographically (for example, $type at 5 bytes would come before langs at 5 bytes, because $ (0x24) comes before l (0x6C)).
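The ordering rule is short enough to express as a sort key. This is a minimal sketch, and the function name is mine:

```python
def dag_cbor_key_order(keys: list) -> list:
    """Sort map keys by UTF-8 byte length first, then bytewise."""
    return sorted(keys, key=lambda k: (len(k.encode("utf-8")), k.encode("utf-8")))
```

Calling dag_cbor_key_order(["createdAt", "$type", "text", "langs"]) puts text first, then $type ahead of langs, then createdAt.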

The second rule is to use minimum-width encoding. Integer and length values must use as few bytes as possible. For example, a string with 13 characters uses a 1-byte length prefix (0x6D), not a 2-byte or 4-byte encoding.

There are more constraints: no indefinite-length items, no duplicate map keys, no floats (ATProtocol does not allow them), no NaN, no Infinity, and no undefined values. When present, CID links are encoded as CBOR tag 42 (0xD82A), which wraps a byte string starting with 0x00 (the multibase identity prefix) followed by the 36-byte binary CID. Our simple post does not have links, but a reply or quote post would.

Serializing our "Hello, world!" post results in exactly 81 bytes. Here is the complete breakdown, byte by byte:

a3                                          map(3)
  64 74657874                               "text" (4 bytes)
  6d 48656c6c6f2c20776f726c6421             "Hello, world!" (13 bytes)
  65 2474797065                             "$type" (5 bytes)
  72 6170702e62736b792e666565642e706f7374   "app.bsky.feed.post" (18 bytes)
  69 637265617465644174                     "createdAt" (9 bytes)
  7818 323032352d30322d32305431323a         "2025-02-20T12:00:00.000Z" (24 bytes)
       30303a30302e3030305a

The a3 byte is a CBOR map header: major type 5 (map) with additional info 3 (three entries). Each key and each value follows as a text string header (major type 3 plus length) and then the UTF-8 bytes. The createdAt value, at 24 bytes, is long enough to need a 2-byte header: 78 is major type 3 with additional info 24, meaning "1-byte length follows," and 18 is that length (0x18 = 24).

The output is a deterministic byte array. Determinism is essential: the same record data must always serialize to the exact same bytes, because any difference would create a different CID. You can check this yourself: any conforming DAG-CBOR encoder given this record will produce the hex string a364746578746d48656c6c6f2c20776f726c6421652474797065726170702e62736b792e666565642e706f7374696372656174656441747818323032352d30322d32305431323a30303a30302e3030305a.
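If you want to reproduce those bytes without pulling in a library, a deliberately tiny encoder is enough for this one record. It only handles flat maps of strings, which is all the example needs. It is a sketch for checking the breakdown, not a DAG-CBOR implementation, and the function names are mine.

```python
def cbor_head(major: int, length: int) -> bytes:
    """Encode a CBOR head using the shortest form that fits `length`."""
    if length < 24:
        return bytes([(major << 5) | length])
    if length < 0x100:
        return bytes([(major << 5) | 24, length])
    if length < 0x10000:
        return bytes([(major << 5) | 25]) + length.to_bytes(2, "big")
    raise ValueError("length too large for this sketch")


def encode_text(value: str) -> bytes:
    """Text string: major type 3, byte length, then the UTF-8 bytes."""
    data = value.encode("utf-8")
    return cbor_head(3, len(data)) + data


def encode_string_map(record: dict) -> bytes:
    """Encode a flat map of strings using DAG-CBOR key ordering."""
    keys = sorted(record, key=lambda k: (len(k.encode("utf-8")), k.encode("utf-8")))
    out = cbor_head(5, len(keys))  # map: major type 5
    for key in keys:
        out += encode_text(key) + encode_text(record[key])
    return out


post = {
    "text": "Hello, world!",
    "$type": "app.bsky.feed.post",
    "createdAt": "2025-02-20T12:00:00.000Z",
}
cbor_bytes = encode_string_map(post)
```

Printing cbor_bytes.hex() lets you compare the result against the byte breakdown above, one line at a time.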

Step 2: Hash with SHA-256. Compute the SHA-256 hash of those 81 bytes. For our post:

b38bc4817b97820bcb7cf1025d35673dc8d8545759b34b067fe44da9a5909b71

That's our 32-byte digest.

Step 3: Assemble the 36-byte binary CID. Prepend the four single-byte prefixes:

[0x01] [0x71] [0x12] [0x20] [32-byte SHA-256 digest]
  │      │      │      │      └── b38bc481...a5909b71
  │      │      │      └── digest length = 32
  │      │      └── SHA-256 hash function
  │      └── dag-cbor content codec
  └── CIDv1 version

The complete 36-byte binary CID in hex:

01711220b38bc4817b97820bcb7cf1025d35673dc8d8545759b34b067fe44da9a5909b71

Step 4: Encode as a string. Base32-encode the 36 binary bytes using the lowercase RFC 4648 §6 alphabet, then prepend the b multibase prefix:

bafyreiftrpcic64xqif4w7hrajotkzz5zdmfiv2zwnfqm77ejwu2lee3oe

That is the CID of our "Hello, world!" post. It starts with “bafyrei”, which is the prefix for every dag-cbor and SHA-256 CID. If you change even one character in the post text, the createdAt timestamp, or the $type string, you will get a completely different CID.
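Steps 2 through 4 fit in a few lines of Python using only the standard library. This is a sketch with helper names of my own choosing. It takes the codec byte as a parameter, so the same function covers blobs.

```python
import base64
import hashlib

DAG_CBOR = 0x71
RAW = 0x55


def cid_from_digest(codec: int, digest: bytes) -> str:
    """Build the string CID from a codec byte and a 32-byte SHA-256 digest."""
    if len(digest) != 32:
        raise ValueError("expected a 32-byte SHA-256 digest")
    # CIDv1, codec, SHA-256, digest length 32, then the digest itself.
    binary_cid = bytes([0x01, codec, 0x12, 0x20]) + digest
    encoded = base64.b32encode(binary_cid).decode("ascii")
    # Multibase base32: lowercase, no padding, "b" prefix.
    return "b" + encoded.lower().rstrip("=")


def cid_for_bytes(codec: int, data: bytes) -> str:
    """Hash the given bytes with SHA-256 and wrap the digest in a CID."""
    return cid_from_digest(codec, hashlib.sha256(data).digest())


# The digest from Step 2, fed back through Steps 3 and 4.
digest = bytes.fromhex(
    "b38bc4817b97820bcb7cf1025d35673dc8d8545759b34b067fe44da9a5909b71"
)
cid = cid_from_digest(DAG_CBOR, digest)
```

Feeding in the digest from Step 2 reproduces the CID string from Step 4. For a record, pass the DAG-CBOR bytes to cid_for_bytes with DAG_CBOR.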

In pseudocode:

function createRecordCID(record):
  // generate deterministic CBOR
  cbor_bytes = dag_cbor_serialize(record)

  // generate 32-byte hash
  digest     = sha256(cbor_bytes)

  // 36 bytes binary CID
  binary_cid = [0x01, 0x71, 0x12, 0x20] ++ digest

  // string serialize binary CID
  return "b" ++ base32_lower(binary_cid)

Blob CID creation (raw, 0x55)

Step 1: Read the raw bytes. There is no serialization step. The blob, such as an image, video, or audio file, is hashed exactly as it is, in full. There is no chunking, DAG construction, or CBOR encoding.

Step 2: Hash with SHA-256. Compute the SHA-256 hash of the complete file bytes. Same algorithm, same output: a 32-byte digest.

Step 3: Assemble the 36-byte binary CID. The only difference from a record CID is the second byte, the codec:

[0x01] [0x55] [0x12] [0x20] [32-byte SHA-256 digest]
  │      │      │      │      └── the hash output
  │      │      │      └── digest length = 32
  │      │      └── SHA-256 hash function
  │      └── raw content codec
  └── CIDv1 version

Step 4: Encode as a string. Same base32 process. The result starts with “bafkrei”.

The two processes are the same except for serialization (DAG-CBOR for records, none for blobs) and the codec byte (0x71 for records, 0x55 for blobs). Here is the blob version in pseudocode:

function createBlobCID(file_bytes):
  // generate 32-byte hash
  digest     = sha256(file_bytes)

  // 36 bytes binary CID
  binary_cid = [0x01, 0x55, 0x12, 0x20] ++ digest

  // string serialize binary CID
  return "b" ++ base32_lower(binary_cid)

Parsing a CID string back to bytes

Parsing works in reverse: remove the first character (which must be b), base32-decode the rest, and then check each field. Byte 0 must be 0x01 (CIDv1). Byte 1 must be 0x55 or 0x71. Byte 2 must be 0x12 (SHA-256). Byte 3 must be 0x20 (32). There must be exactly 32 bytes of digest, making a total of 36 bytes. Reject anything that does not match, such as CIDv0, the wrong hash, the wrong codec, or the wrong length.
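Those checks translate almost line for line into code. The sketch below uses only the standard library. The names parse_cid and is_valid_cid are mine, and the lowercase check is my reading of the "b" multibase prefix, which implies lowercase base32.

```python
import base64


def parse_cid(cid: str) -> tuple:
    """Return (codec, digest) for an ATProtocol-style CID, or raise ValueError."""
    if not cid.startswith("b"):
        raise ValueError("expected the 'b' (base32) multibase prefix")
    body = cid[1:]
    if body != body.lower():
        raise ValueError("expected lowercase base32")
    # The standard library decoder wants uppercase input with padding restored.
    padded = body.upper() + "=" * (-len(body) % 8)
    raw = base64.b32decode(padded)  # raises a ValueError subclass on bad input
    if len(raw) != 36:
        raise ValueError("expected exactly 36 bytes")
    version, codec, hash_fn, digest_len = raw[0], raw[1], raw[2], raw[3]
    if version != 0x01:
        raise ValueError("only CIDv1 is allowed")
    if codec not in (0x55, 0x71):
        raise ValueError("codec must be raw (0x55) or dag-cbor (0x71)")
    if hash_fn != 0x12 or digest_len != 0x20:
        raise ValueError("hash must be SHA-256 with a 32-byte digest")
    return codec, raw[4:]


def is_valid_cid(cid: str) -> bool:
    """Convenience wrapper that reports validity instead of raising."""
    try:
        parse_cid(cid)
        return True
    except ValueError:
        return False
```

A CIDv0 string such as one starting with Qm fails on the first check, since it has no multibase prefix at all.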

Implementation references

Several libraries provide ATProtocol-specific CID implementations:

The dasl Rust crate (from the n0-computer team) defines Multihash::Sha2256 and Multihash::Blake3 variants alongside Multicodec::DagCbor and Multicodec::Raw. It uses sha2 for SHA-256, blake3 for BLAKE3, data-encoding for base32, and cbor4ii for CBOR parsing.

The @atcute/cid TypeScript package (successor to @mary/atproto-cid) provides a minimal implementation where create(0x71, buffer) hashes a buffer and returns a CID object with version: 1, code: 113, and bytes: Uint8Array(36).

The cid Elixir hex package provides CID.cid!(data, "dag-cbor", "sha2-256") to create CIDs from serialized data.

The atproto-dasl crate from the atproto-identity-rs repository provides another Rust reference implementation focused on ATProtocol’s specific constraints.

I highly suggest the actively maintained https://sdk.blue for a more complete list of projects and SDKs.


ATProtocol Real-World Usage

Repository structure and the Merkle Search Tree

Every ATProtocol user’s data lives in a repository: a signed, content-addressed data structure. At the top is a commit object (version 3):

{
  "did": "did:plc:abc123",
  "version": 3,
  // CID for the MST root
  "data": { "$link": "bafyrei..." },
  // Lamport clock
  "rev": "3k2la7fx2jc22",
  "prev": null,
  // cryptographic signature
  "sig": "<bytes>"
}

The data CID points to the root node of the Merkle Search Tree. The rev is the Lamport clock value for this commit, serving as the time marker that gives a total order to all repository states and allows the sync protocol to keep clocks in sync. The sig covers the DAG-CBOR serialization of the commit object with the sig field itself left out. To verify, you check the signature and then verify every CID down through the tree.

The MST maps record paths (like app.bsky.feed.post/3k2la7fx2jc22) to record CIDs. Each MST node is a DAG-CBOR object using compact single-character field names: l for the left subtree CID, and e for an entries array. Each entry contains p (the number of prefix bytes shared with the previous key), k (the remaining key suffix), v (the CID of the record data), and t (a right subtree CID, or null).
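
To make the p and k fields concrete, here is a small Python sketch of the prefix compression applied to the keys in one node. The helper names are hypothetical, and the v and t fields are left out because they need real CIDs.

```python
def compress_keys(sorted_keys: list[bytes]) -> list[dict]:
    """Turn sorted keys into entries with p (shared prefix length) and k (suffix)."""
    entries = []
    prev = b""
    for key in sorted_keys:
        # Count how many leading bytes this key shares with the previous key.
        p = 0
        limit = min(len(prev), len(key))
        while p < limit and prev[p] == key[p]:
            p += 1
        entries.append({"p": p, "k": key[p:]})
        prev = key
    return entries


def expand_keys(entries: list[dict]) -> list[bytes]:
    """Rebuild the full keys from the compressed entries."""
    keys = []
    prev = b""
    for entry in entries:
        key = prev[: entry["p"]] + entry["k"]
        keys.append(key)
        prev = key
    return keys
```

Because record paths in the same collection share a long prefix such as app.bsky.feed.post/, most entries after the first store only a few suffix bytes.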

Tree depth for a given key is found by hashing the key with SHA-256, counting the leading binary zeros in the hash output, and dividing by 2 (rounding down). This results in a fanout of about 4 per level. The same SHA-256 that secures CIDs also shapes the tree.
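
The depth rule is short enough to write out. This is a minimal Python sketch with illustrative function names; it follows the description above (SHA-256, count leading zero bits, integer-divide by 2) and makes no claim about the depth of any particular key.

```python
import hashlib


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits before the first 1 bit in a digest."""
    zeros = 0
    for byte in digest:
        if byte == 0:
            zeros += 8
            continue
        # bit_length() is the position of the highest set bit in this byte.
        zeros += 8 - byte.bit_length()
        break
    return zeros


def mst_key_depth(key: str) -> int:
    """Depth of a key: leading zero bits of its SHA-256 hash, divided by 2."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return leading_zero_bits(digest) // 2
```

Each extra level needs two more leading zero bits, which happens for roughly one key in four, and that is where the fanout of about 4 comes from.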

Each MST node has its own CID. When a record changes, its CID changes, which then changes the parent MST node’s serialization and its CID, and this continues up to the root. This index structure makes the MVCC model efficient. Instead of scanning every record to build a diff, you only need to follow the path of changed CIDs from root to leaves.

The verification chain works in the opposite direction. You start from the signed commit, move to the MST root CID, and then walk through the tree. At each step, you can independently calculate the CID of the data you received and compare it to the claimed CID. If they all match, the entire repository—every record and every tree node—is authenticated by the single signature at the top. This is how account portability works: export the commit and all blocks, and any server can verify the repo on its own. Each rev snapshot can be checked independently, since the signature at the top covers every AT-URI to CID binding in the repo.

DAG-CBOR serialization

The determinism requirements for DAG-CBOR are essential. If the same data could serialize to different byte sequences, it would create different CIDs, and the entire Merkle tree would not work. If two implementations serialize the same record differently, they would compute different CIDs and disagree about the repository’s state.

The rules are strict. Map keys are sorted by byte length first, then by lexicographical order within the same length. Integer and length encodings must use the minimum number of bytes. There are no indefinite-length containers or duplicate map keys. ATProtocol also does not allow floating-point numbers (numbers in the data model are integers only), undefined values, NaN, or Infinity.
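
Two of these rules are easy to show in code. The Python sketch below is not a full encoder; dag_cbor_key_order and encode_uint are hypothetical helpers that illustrate the key ordering and the shortest-form integer rule only.

```python
def dag_cbor_key_order(keys: list[str]) -> list[str]:
    """Sort map keys by encoded byte length, then bytewise within a length."""
    return sorted(keys, key=lambda k: (len(k.encode("utf-8")), k.encode("utf-8")))


def encode_uint(n: int) -> bytes:
    """Encode a CBOR unsigned integer (major type 0) in its shortest form."""
    if n < 0:
        raise ValueError("this sketch covers unsigned integers only")
    if n < 24:
        return bytes([n])                      # value fits in the initial byte
    if n < 1 << 8:
        return b"\x18" + n.to_bytes(1, "big")  # 1-byte argument
    if n < 1 << 16:
        return b"\x19" + n.to_bytes(2, "big")  # 2-byte argument
    if n < 1 << 32:
        return b"\x1a" + n.to_bytes(4, "big")  # 4-byte argument
    if n < 1 << 64:
        return b"\x1b" + n.to_bytes(8, "big")  # 8-byte argument
    raise ValueError("value does not fit in 64 bits")
```

Note that the key order is not plain alphabetical order: a short key such as text sorts before a longer key such as $type, even though $ comes first bytewise.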

CID links in DAG-CBOR use CBOR tag 42 (0xD82A), which wraps a byte string that starts with 0x00 (the multibase identity prefix) and continues with the 36-byte binary CID. The full CBOR encoding of a single link is 41 bytes: 2 bytes for the tag, 2 bytes for the byte string header (length 37), 1 byte for the identity prefix, and 36 bytes for the CID. Tag 42 is the only CBOR tag used by ATProtocol.
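
The 41-byte layout can be checked with a few lines of Python. The function name encode_cid_link is an illustrative choice; the byte values come straight from the breakdown above.

```python
def encode_cid_link(binary_cid: bytes) -> bytes:
    """Wrap a 36-byte binary CID as a CBOR tag 42 link."""
    if len(binary_cid) != 36:
        raise ValueError("expected a 36-byte ATProtocol CID")
    # Identity multibase prefix (0x00) followed by the CID: 37 bytes.
    payload = b"\x00" + binary_cid
    tag = b"\xd8\x2a"                       # CBOR tag header for tag 42
    header = b"\x58" + bytes([len(payload)])  # byte string, 1-byte length (37)
    return tag + header + payload
```

For any valid input the result is 2 + 2 + 1 + 36 = 41 bytes and always begins with d8 2a 58 25 00.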

Links between records

CIDs appear in records through two distinct mechanisms, and the choice between them is intentional.

Binary cid-link (CBOR tag 42) is used for links within a repository. MST nodes linking to other MST nodes, MST leaves linking to records, blob references — these all use the binary link format. In JSON API responses, they render as {"$link": "bafyrei..."}. IPLD tools follow these links during traversal. A blob reference in a post’s embed looks like:

{
  "$type": "blob",
  "ref": { "$link": "bafkreibjfgx2gprinfvicegelk5ko..." },
  "mimeType": "image/jpeg",
  "size": 482349
}

String-format CIDs in StrongRefs are used for cross-repo references where version pinning matters. A plain string CID field sits alongside an AT-URI in the com.atproto.repo.strongRef structure. The CID pins the exact version of the referenced record. This is the “freezing the AT-URI + CID binding” pattern in practice.

Why use two mechanisms? String CIDs do not cause IPLD tools to follow cross-repo references. If every like record’s reference to a liked post was a binary CBOR link, an IPLD walker would try to fetch every post you have ever liked from other people’s repositories, which would not work well at network scale. String CIDs are hidden from IPLD traversal; they still provide version-pinning information, but they do not create links in the content-addressed graph.

The design is straightforward: links within a repository are binary (so they can be traversed and verified as part of the repo DAG), while cross-repo references are strings (not traversable, but still pinning a specific version). A repository is a self-contained, verifiable unit. To verify across repositories, you need to fetch the other repo. In a StrongRef, the CID turns a mutable AT-URI into an immutable reference point, which is the simplest form of versioning.
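
A rough Python sketch shows why the distinction matters to a traversal tool. The collect_links helper is hypothetical and works on the JSON form of a record, where binary links appear as {"$link": ...} objects. The record values are placeholders, and the truncated CIDs are kept truncated.

```python
def collect_links(value, found=None) -> list[str]:
    """Collect every {"$link": ...} CID in a JSON-form record."""
    if found is None:
        found = []
    if isinstance(value, dict):
        if set(value.keys()) == {"$link"} and isinstance(value["$link"], str):
            found.append(value["$link"])   # a binary link: traversable
        else:
            for child in value.values():
                collect_links(child, found)
    elif isinstance(value, list):
        for child in value:
            collect_links(child, found)
    return found


# A like record points at another repo through a StrongRef (string CID).
like = {
    "$type": "app.bsky.feed.like",
    "subject": {
        "uri": "at://did:plc:abc123/app.bsky.feed.post/3k2la7fx2jc22",
        "cid": "bafyrei...",
    },
    "createdAt": "2024-01-01T00:00:00Z",
}

# A blob reference lives in the same repo and uses a binary link.
blob = {
    "$type": "blob",
    "ref": {"$link": "bafkreibjfgx2gprinfvicegelk5ko..."},
    "mimeType": "image/jpeg",
    "size": 482349,
}
```

Walking the like record finds no links at all, while walking the blob reference finds one. The StrongRef CID is just a string field, so the walker never tries to leave the repository.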

CAR files: packaging and transport

CAR (Content Addressable aRchive) is the format used to package and move content-addressed data. ATProtocol uses CARv1, which has a simple structure.

The file starts with a header: a varint-encoded length prefix, followed by a DAG-CBOR object containing version: 1 and a roots array of one or more CIDs. The roots array tells the reader where to start traversing the data.

The rest of the file is a series of blocks. Each block has a varint-encoded length prefix, the block’s CID in binary, and the block’s data bytes. That’s all—a flat stream of CID-tagged blocks.

For a full repository export, the first root is the CID of the latest commit. The CAR file must include the commit, all MST nodes, and all records. Blobs are not included; they are stored and fetched separately. Anyone receiving the file can verify each block by computing its CID from the block data and comparing it to the claimed CID. This is the MVCC snapshot saved to disk: a complete, self-verifying, point-in-time image of every AT-URI to CID binding in the repo.
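
Here is a minimal Python sketch of reading and verifying the block section of a CARv1 file. It assumes every CID is a 36-byte ATProtocol CID, returns the header as opaque bytes instead of decoding the DAG-CBOR, and uses hypothetical function names.

```python
import hashlib


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode an unsigned varint at pos; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7
        if shift >= 63:
            raise ValueError("varint longer than 9 bytes")


def read_car_blocks(car: bytes) -> tuple[bytes, list[tuple[bytes, bytes]]]:
    """Return (raw header bytes, [(cid, data), ...]), verifying each block."""
    header_len, pos = read_varint(car, 0)
    if pos + header_len > len(car):
        raise ValueError("truncated header")
    header = car[pos : pos + header_len]
    pos += header_len
    blocks = []
    while pos < len(car):
        # The length prefix covers the CID and the data together.
        block_len, pos = read_varint(car, pos)
        if block_len < 36 or pos + block_len > len(car):
            raise ValueError("bad block length")
        cid = car[pos : pos + 36]
        data = car[pos + 36 : pos + block_len]
        pos += block_len
        if cid[0] != 0x01 or cid[1] not in (0x55, 0x71) or cid[2:4] != b"\x12\x20":
            raise ValueError("not an ATProtocol CID")
        # Recompute the digest and compare it to the one claimed in the CID.
        if hashlib.sha256(data).digest() != cid[4:]:
            raise ValueError("block data does not match its CID")
        blocks.append((cid, data))
    return header, blocks
```

A complete verifier would also decode the header, start from the root CID, and confirm that every block reachable from the commit is present. This sketch only covers the per-block integrity check.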

For repository diffs, com.atproto.sync.subscribeRepos sends diffs as CAR slices. These slices include only the blocks that changed: new or updated records, new MST nodes, and the new commit. Deleted records are shown by their absence. Blocks are deduplicated by CID. A single new post might only need a few MST node updates and the record itself, resulting in a CAR slice of a few hundred bytes. The since field in commit events is the Lamport clock synchronization tool: “this diff advances you from rev X to rev Y.”

CIDs act as both the index and the integrity check for CAR blocks. If a block is corrupted or tampered with, it will have a CID that does not match the one claimed in the file. Deduplication is automatic: two blocks with the same CID are guaranteed to be byte-for-byte identical. CAR files are self-verifying and do not require external trust.

Closing

CIDs form the backbone of ATProtocol’s core guarantees: identity, data mobility, and ownership. Every data assurance the protocol provides is built on these 36 bytes of content-addressed computation.

The evolution of the ATProtocol spec is grounded in lessons from distributed systems, cryptography, existing content-addressed networks like IPFS, and the practical needs of data portability. Each refinement is shaped by both the challenges of real-world deployment and the community of developers and researchers who contribute ideas and critiques. Understanding the influences behind these decisions provides deeper insight into how ATProtocol works, why its verification, synchronization, and portability guarantees matter, and how they may evolve in the future.

Join the discussion on the ATProtocol Community Discourse or in the atmosphere.


Except where otherwise noted, this content is licensed under a Creative Commons Attribution-ShareAlike 4.0 International license with attribution going to Nick Gerakines.

https://ngerakines.leaflet.pub/3mfczdeczuc2c
CIDs: What You Need to Know and Why, Part 1
A deep dive into Content Identifiers, the self-describing cryptographic fingerprints that form the foundation of ATProtocol’s data model.

A Content Identifier, or CID, is a self-describing, deterministic, content-addressed identifier. This simple data structure contains a cryptographic fingerprint, identifies the hash algorithm used to produce it, and specifies how the data is serialized. While a URL answers “where can I find this?”, a CID answers “what is this?” The same bytes and parameters always produce the same CID. Even a single-bit change results in a completely different one.

Content-addressable storage lets you find information based on its content, not its name or location. In these systems, data goes through a cryptographic hash function to create a unique key called the "content address." You use this key to find and retrieve the data. Since the same content always gives the same key, duplicates are easy to spot, and any change creates a new key, which helps ensure data integrity. CIDs are a type of content address, but they also include extra details that explain how the address was made.

History and Background

The idea of identifying data by its hash is older than the web and is built into many of the systems and tools we use. Git, created in 2005, stores every commit, tree, and blob as objects identified by their SHA-1 hashes. BitTorrent uses content hashes to verify pieces across a distributed swarm. Both systems proved that content-addressed storage works at scale.

Both Git and BitTorrent use a single baked-in hash algorithm and a single data format. Where that becomes a problem (and it inevitably does) is upgrading. Git’s ongoing SHA-1 migration is a cautionary tale: a decade-long effort to move to SHA-256, complicated by the fact that SHA-1 was wired into the format at every level. There was no way for a Git object to announce which hash algorithm it used, because the system assumed there would be only one.

IPFS and Protocol Labs

The InterPlanetary File System emerged from Juan Benet’s work at Stanford, with a whitepaper published on arXiv in July 2014, and Protocol Labs was founded the same year as part of the Y Combinator S14 batch. The IPFS alpha shipped in early 2015.

Early IPFS used simple base58btc-encoded multihashes as identifiers like the Qm... strings that many developers still know. These worked well for IPFS’s first use case, which was content-addressed file storage. But as the project grew to link data formats such as Ethereum blocks, Git objects, and CBOR structures, identifiers needed to include format information alongside the hash. A plain multihash could tell you which hash algorithm was used, but not how to interpret the data it pointed to.

The Multiformats project

Protocol Labs started the Multiformats project around 2016. This is a group of self-describing protocols created to solve the “hardcoded assumptions” problem that affected earlier systems. The family includes three main parts.

Multihash wraps hash digests with a function code and length prefix, making the hash algorithm self-describing. Multicodec provides a table of type identifiers encoded as varint prefixes, making content formats self-describing. Multibase prefixes string-encoded data with a character indicating the base encoding, making the text representation self-describing.

All three follow the same design idea: each begins with its own decoding instructions. You don’t need any outside information.

CIDs: combining the pieces

In 2016-2017, from discussions in the ipfs/specs repository, multicodec and multihash were combined into a single compact identifier, the CID.

CIDv0 kept backward compatibility with existing IPFS multihashes. A CIDv0 is always in dag-pb format, always uses SHA-256, and is always encoded as base58btc. Since the hash function and codec were fixed, there was no need for extra fields to identify them.

CIDv1 added clear version, codec, and multibase fields, making identifiers fully self-describing and ready for future changes. A CIDv1 can use any hash function, any codec, and any base encoding. The identifier itself gives you all the information needed to decode it.

The canonical CID specification lives at multiformats/cid on GitHub, licensed CC-BY 3.0 by Protocol Labs.

IPLD: the data model layer

IPLD, which stands for InterPlanetary Linked Data, formalized the data model for content-addressed linked data at about the same time. IPLD treats all hash-linked data structures—such as IPFS files, Ethereum state trees, and Git repositories—as parts of a single information space, with CIDs as the universal link type. When you follow a CID link, the codec tells you how to decode the target, and the multihash tells you how to verify it.

IPLD codecs such as DAG-CBOR and DAG-JSON were created to serialize IPLD data models with embedded CID links. DAG-CBOR became especially important because it is a deterministic subset of CBOR that directly supports CID links through a special CBOR tag. This enables the construction of authenticated data structures in which every node is both content-addressed and format-aware.

How ATProtocol adopted CIDs

When the Bluesky team designed ATProtocol in 2022–2023, they chose CIDs and DAG-CBOR for three interrelated reasons. First, content verification: any party can verify data integrity without trusting a server, because CIDs are computed from the data itself. Second, Merkle tree repositories: CID changes propagate from any modified record up to a signed root, so a single signature authenticates the entire repository. Third, account portability: repositories can be exported as self-verifying CAR files that any server can import and independently verify.

Jay Graber noted at a Protocol Labs event that the IPFS ecosystem already had tooling for working with DAGs and CAR files, making the adoption path practical rather than merely theoretical.

As of 2025–2026, ATProtocol’s data model formally aligns with DASL (Data-Addressed Structures & Links), a specification at dasl.ing that defines a strict subset of IPLD CIDs for use in hash-linked data structures. DASL is the formalization of the constraints ATProtocol already enforced in practice.


Multihash and Multicodec Primer

Without self-description, you need extra information to understand a hash. Someone has to tell you, “this is a SHA-256 hash” or “this data is CBOR-encoded.” That outside context is fragile. If you lose it for any reason, such as a system upgrade, a protocol change, or incomplete documentation, the bytes become meaningless. You have a 32-byte value but no way to know what created it or what it refers to.

Multiformats fix this by letting the bytes include their own decoding instructions. A multihash tells you which hash algorithm and digest length it uses. A multicodec tells you its content type. There’s no need for an external registry, configuration file, or protocol negotiation.

Unsigned varint encoding

Both multihash and multicodec use unsigned varints to encode their type codes. This is an unsigned LEB128 encoding, limited to 9 bytes (63 bits of data). Each byte holds 7 bits of data, and the most significant bit (MSB) is a flag. If the MSB is 1, more bytes follow; if it is 0, this is the last byte. The 7-bit groups are written least-significant group first.

Value 1   (0x01):  0_0000001              → 0x01 (1 byte)
Value 127 (0x7F):  0_1111111              → 0x7F (1 byte)
Value 128 (0x80):  1_0000000 0_0000001    → 0x80 0x01 (2 bytes)
Value 300 (0x012C): 1_0101100 0_0000010   → 0xAC 0x02 (2 bytes)

Values from 0 to 127 fit in a single byte. Values from 128 to 16,383 take two bytes. In theory, you need to handle multi-byte varints when parsing CIDs from any IPFS or IPLD sources.
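
The encoding rules fit in a few lines of Python. This is a minimal sketch with illustrative function names that reproduces the byte sequences in the table above.

```python
def encode_varint(value: int) -> bytes:
    """Encode an unsigned integer as an unsigned varint (at most 9 bytes)."""
    if value < 0 or value >= 1 << 63:
        raise ValueError("value must fit in 63 bits")
    out = bytearray()
    while True:
        byte = value & 0x7F        # take the low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set the MSB
        else:
            out.append(byte)         # last byte: MSB stays 0
            return bytes(out)


def decode_varint(data: bytes) -> tuple[int, int]:
    """Decode a varint from the start of data; return (value, bytes consumed)."""
    value = 0
    for i, byte in enumerate(data[:9]):
        value |= (byte & 0x7F) << (7 * i)
        if byte & 0x80 == 0:
            return value, i + 1
    raise ValueError("varint is truncated or longer than 9 bytes")
```

For example, encode_varint(300) yields the two bytes 0xAC 0x02, and decode_varint reads them back as 300 while reporting that it consumed 2 bytes.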

In practice, for ATProtocol, you don’t need to worry about this. ATProtocol uses the CID version (0x01), the codecs (0x55, 0x71), the hash function (0x12), and the digest length (0x20), all of which fit in a single byte. No multi-byte varint decoding is needed. Every prefix is exactly one byte.

Multihash

Multihash uses a TLV (type-length-value) format for hash digests. The structure is simple:

<hash-function-code (varint)> <digest-length (varint)> <digest-bytes>

A concrete SHA-256 multihash looks like this:

12 20 6e6ff7950a36187a801613426e858dce686cd7d7e3c0fc42ee0330072d245c95
│  │  └── 32 bytes: the SHA-256 digest
│  └── 0x20 = 32: digest length in bytes
└── 0x12 = 18: SHA-256 hash function code

Separating the hash function code from the digest length is done on purpose because it allows for truncated digests. For example, a SHA-512 hash shortened to 256 bits would use function code 0x13 (SHA-512) with length 0x20 (32 bytes) instead of 0x40 (64 bytes). The parser does not need to know that SHA-512 usually produces 64 bytes; the length field tells it exactly how many bytes to read.
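
A short Python sketch makes the truncation case concrete. The helpers are hypothetical and only handle codes and lengths that fit in a single varint byte, which covers every entry in the table that follows.

```python
import hashlib

SHA2_256 = 0x12
SHA2_512 = 0x13


def encode_multihash(code: int, digest: bytes) -> bytes:
    """Prefix a digest with its hash function code and its length."""
    if not 0 <= code <= 0x7F or len(digest) > 0x7F:
        raise ValueError("this sketch handles single-byte varints only")
    return bytes([code, len(digest)]) + digest


def decode_multihash(mh: bytes) -> tuple[int, bytes]:
    """Split a multihash into (function code, digest)."""
    if len(mh) < 2 or mh[0] > 0x7F or mh[1] > 0x7F:
        raise ValueError("this sketch handles single-byte varints only")
    code, length = mh[0], mh[1]
    digest = mh[2 : 2 + length]
    if len(digest) != length:
        raise ValueError("digest is shorter than the declared length")
    return code, digest


# A full SHA-256 digest and a SHA-512 digest truncated to 32 bytes.
full = encode_multihash(SHA2_256, hashlib.sha256(b"example").digest())
truncated = encode_multihash(SHA2_512, hashlib.sha512(b"example").digest()[:32])
```

Both values are 34 bytes long. The first starts with 12 20 and the second with 13 20, so a reader can tell them apart without knowing anything else about where they came from.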

The key hash function codes you’ll encounter:

identity  0  (0x00) variable digest
sha1      17 (0x11) 20 byte digest
sha2-256  18 (0x12) 32 byte digest
sha2-512  19 (0x13) 64 byte digest
blake3    30 (0x1e) 32 byte digest

The identity hash (0x00) is a special case where the “digest” is just the content itself. It is used for very small pieces of inlined data when the overhead of a real hash would exceed the data itself. It does not provide any security because it is just for convenience.

Multicodec

Multicodec is a shared lookup table of type identifiers, each encoded as an unsigned varint. The table is maintained in the multiformats/multicodec repository on GitHub and covers categories ranging from CID versioning to IPLD codecs to multihash functions to serialization formats.

The first 127 entries (the single-byte varint range) are set aside for the most widely used codes. This is done on purpose so that the most common identifiers are also the shortest.

The codes relevant to ATProtocol:

cidv1    (cid)       1   (0x01)
sha2-256 (multihash) 18  (0x12)
raw      (ipld)      85  (0x55)
dag-pb   (ipld)      112 (0x70)
dag-cbor (ipld)      113 (0x71)

Multicodec has two roles in CIDs: the same table is used for both the CID version byte and the content codec byte. The table is maintained by the community, with new entries added through pull requests. Codes marked “draft” in the table’s status column may change before they are finalized.

How they compose in a CID

A CIDv1 in binary is the concatenation of a version prefix, a codec identifier, and a multihash:

<version (varint)> <codec (varint)> <hash-function (varint)> <digest-length (varint)> <digest>

The first two fields are multicodec values. The last three fields together make up a multihash. So, the full structure is four varint prefixes followed by the raw digest bytes.

For an ATProtocol record CID:

01  71  12  20  [32 bytes of SHA-256 digest]
│   │   │   │   └── the hash output
│   │   │   └── digest length = 32
│   │   └── sha2-256
│   └── dag-cbor
└── CIDv1

Since all four prefix values are 127 or less, each one is a single byte. No extra varint processing is needed. The CID is always exactly 4 + 32 = 36 bytes.
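
As a small illustration, the binary layout can be assembled directly in Python. The function name is hypothetical, and the input is assumed to be bytes that were already serialized as DAG-CBOR by some other code.

```python
import hashlib


def record_cid_bytes(dag_cbor_bytes: bytes) -> bytes:
    """Assemble the 36-byte binary CID for already-serialized record bytes."""
    version = 0x01    # CIDv1
    codec = 0x71      # dag-cbor
    hash_fn = 0x12    # sha2-256
    digest = hashlib.sha256(dag_cbor_bytes).digest()
    # Every prefix value is below 128, so each varint is one plain byte.
    return bytes([version, codec, hash_fn, len(digest)]) + digest
```

Whatever the input, the first four bytes of the result are 01 71 12 20 and the total length is 36.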


ATProtocol CIDs

ATProtocol does not use the full CID specification. It defines a strict subset, following the DASL specification (dasl.ing/cid.html), that removes most of the general features and most of the parsing complexity found in the broader CID ecosystem.

CIDv1 only

CIDv0 identifiers (the Qm... strings from early IPFS) are never produced by ATProtocol and should be rejected during validation. CIDv0 was a backward-compatibility shim that let IPFS maintain interoperability with its existing content-addressed objects while transitioning to the richer CIDv1 format. ATProtocol had no legacy to maintain, so it adopted CIDv1 exclusively from the start.

CIDv1’s explicit version byte also allows for future format changes without confusion. If the protocol ever needs to change the CID format, the version field gives a clear way to migrate—parsers can check the first byte instead of using guesswork.

SHA-256 only (for now)

The hash function code must be 0x12 (SHA-256), and the digest must be exactly 32 bytes (0x20). The spec calls this a “stable requirement” and all repository nodes, records, and commits use SHA-256.

This choice is practical. SHA-256 is widely supported by hardware (like Intel SHA Extensions and ARM SHA-2 instructions), has strong security with no known practical attacks against collision resistance, and its 32-byte digests balance compact size with good collision resistance.

The DASL specification also allows BLAKE3 (0x1e) for streaming verification of large files, and this may be used for blob CIDs in the future. BLAKE3 is much faster than SHA-256 in software, especially for large files, and its tree-based design allows for parallel and incremental hashing. For now, though, SHA-256 is the only hash you’ll see in ATProtocol data.

Two codecs: dag-cbor and raw

ATProtocol uses exactly two content codecs.

dag-cbor (0x71) identifies structured data: records, MST nodes, and commit objects. When you encounter a CID with codec 0x71, you know the bytes it points to should be decoded as deterministic CBOR with embedded CID links.

raw (0x55) identifies binary blobs: images, video, audio, or any other opaque byte sequence. When you encounter a CID with codec 0x55, you know the bytes are unstructured — no CBOR decoding, no link extraction, just raw binary data.

The codec byte tells you exactly how to interpret the content. This is how the “self-describing” design of CIDs works. If a parser sees 0x71, it can start a CBOR decoder right away; if it sees 0x55, it can just pass the bytes through as-is.

Base32 lowercase string encoding

When CIDs are represented as strings in JSON API responses, in logs, in URLs, etc., ATProtocol uses lowercase base32 encoding with a b multibase prefix (RFC 4648 §6 alphabet). No base58btc, no base36, no hexadecimal.

This is why you see the distinctive prefixes that anyone familiar with Bluesky data will recognize. The base32 encoding of the four prefix bytes 0x01 0x71 0x12 0x20 (CIDv1, dag-cbor, SHA-256, and 32-byte digest) always gives the string bafyrei. The encoding of 0x01 0x55 0x12 0x20 (CIDv1, raw, SHA-256, and 32-byte digest) always gives bafkrei. The prefix is always the same because the first 4 bytes never change; only the 32-byte digest changes.

If you see a bafyrei... string, you’re looking at a record CID. If you see bafkrei..., it’s a blob.
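
You can confirm the stable prefixes with Python’s standard library. The helper names below are illustrative, and the all-zero and all-ones digests are only there to show that the prefix does not depend on the digest.

```python
import base64


def to_base32_cid(binary_cid: bytes) -> str:
    """Encode a binary CID as a lowercase, unpadded base32 string with a 'b' prefix."""
    body = base64.b32encode(binary_cid).decode("ascii").lower().rstrip("=")
    return "b" + body


def cid_kind(cid_str: str) -> str:
    """Classify a CID string by its fixed prefix."""
    if cid_str.startswith("bafyrei"):
        return "record"
    if cid_str.startswith("bafkrei"):
        return "blob"
    return "unknown"


record_prefix = bytes([0x01, 0x71, 0x12, 0x20])
blob_prefix = bytes([0x01, 0x55, 0x12, 0x20])

# The first seven characters stay the same for any digest.
for digest in (bytes(32), bytes([0xFF]) * 32):
    assert cid_kind(to_base32_cid(record_prefix + digest)) == "record"
    assert cid_kind(to_base32_cid(blob_prefix + digest)) == "blob"
```

The four prefix bytes are 32 bits, which fill the first six base32 characters exactly, with two bits left over that spill into the seventh character alongside the start of the digest. Only the characters after bafyrei or bafkrei vary.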

You can see this in action by peeking at the CIDs of feed post commits coming through Jetstream:

websocat "wss://jetstream2.us-west.bsky.network/subscribe?wantedCollections=app.bsky.feed.post" | jq --unbuffered 'select(.kind == "commit") | .commit.cid'
"bafyreid3t4w2refrlwqkna5uwpebhggeyc63ebppqnpwnx3smdxgmigsq4"
"bafyreidcevk5exkipz3kl3726ntkhlzlefpnzbyb3kdxyno3wjdrfio2l4"
"bafyreia5qxocgnabsdq52b2cmxzludg7ep4nabhj3fw4yra7rraficzb3u"
"bafyreibiphqzn7wevw46ralvn3btzx6toijx6kjqometkeugqemc2qqiga"
"bafyreihiu5h5tlaqarwhuzajag4lflikrvkixxq2rdguq4kkripkderlxm"
...

Fixed 36-byte binary size

The arithmetic is simple: 1 byte (version) + 1 byte (codec) + 1 byte (hash function) + 1 byte (digest length) + 32 bytes (digest) = 36 bytes. Every ATProtocol CID, always.

This fixed size is useful in practice. You can use fixed-width database columns, pre-allocate buffers without checking lengths, and calculate storage overhead for indexes and Merkle trees exactly. The DASL spec suggests a MAX_CID_BYTES of 100 for future compatibility, but current ATProtocol CIDs are always 36 bytes.

No chunking

Unlike IPFS’s UnixFS, which splits large files into a Merkle DAG of smaller chunks (enabling incremental downloads and deduplication at the block level), ATProtocol hashes blobs in their entirety. A blob’s CID is the SHA-256 hash of the complete file contents. Period.

This makes verification much simpler. You have the blob bytes, you hash them, and you compare the result to the claimed CID. There is no need for DAG reconstruction, block ordering, or reassembly. The downside is that there is no built-in way to verify large files incrementally or as a stream. In the future, BLAKE3 and the BDASL (Big DASL) specification may help by offering tree-based hashing for large binary content.
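
In Python, that check is only a few lines. This is a sketch with a hypothetical function name: it recomputes the string CID from the bytes and compares it with the claimed one.

```python
import base64
import hashlib


def verify_blob(blob: bytes, claimed_cid: str) -> bool:
    """Recompute the blob CID from the bytes and compare it to the claimed string."""
    # CIDv1, raw codec, SHA-256, 32-byte digest, then the digest itself.
    binary_cid = bytes([0x01, 0x55, 0x12, 0x20]) + hashlib.sha256(blob).digest()
    expected = "b" + base64.b32encode(binary_cid).decode("ascii").lower().rstrip("=")
    return expected == claimed_cid
```

The whole blob has to be in hand before the comparison can succeed or fail, which is the streaming limitation described above.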

A note on terminology: DRISL

ATProtocol’s CBOR serialization is now often called DRISL, which stands for Deterministic Representation for Interoperable Structures & Links, instead of DAG-CBOR. The multicodec value (0x71) and the wire format stay the same. The difference is that DRISL refers to the specific rules ATProtocol adds to DAG-CBOR: no floating point numbers, certain map key ordering rules, and limits on which CBOR features are allowed. You’ll see both names in documentation and code, but they refer to the same bytes on the wire.

What’s next

This explains what CIDs are, their history, how they are encoded, and the rules ATProtocol uses for them. In Part 2, we’ll look at how ATProtocol uses CIDs in practice, including the versioning model that comes from the AT-URI/CID relationship, how to create CIDs step by step for records and blobs, and how CIDs move through repositories, the Merkle Search Tree, inter-record links, and the firehose sync protocol.


Except where otherwise noted, this content is licensed under a Creative Commons Attribution-ShareAlike 4.0 International license with attribution going to Nick Gerakines.

https://ngerakines.leaflet.pub/3mfceipobzk2u
B-Sides: Permissioned data is a love triangle
This is a B-sides post with unpolished thoughts that didn’t make it into the main article.

Editing the “Permissioned data is a love triangle” post left a lot on the floor. These are some of those thoughts that aren’t fully formed, but I think are worth sharing. Again, these aren’t formal proposals, just me thinking out loud.

The Firehose Isn’t the Only Option

The main post focuses on using the event stream to notify consumers of changes. This is the easiest way to send updates to relays, indexers, and AppViews. However, it’s not the only option, and for controlled data, it might not be the best choice.

The firehose shares everything with everyone, all the time. Permissioned data, on the other hand, is about setting and enforcing limits and boundaries. So the real question might not be how to adapt the firehose for controlled content, but what a notification layer built for this purpose would look like.

Additionally, the firehose’s main XRPC subscription endpoint (com.atproto.sync.subscribeRepos) could be used directly by AppViews or clients with authentication. Right now, relays connect to PDS instances without authentication and receive everything. But a PDS could accept authenticated subscribers and give them a richer stream that includes controlled record blocks along with public ones. This creates more questions than answers, like how the PDS controls the stream content based on who’s connected and what cursors and sync events look like. For AppViews that already have an authenticated relationship with the PDS, this could be the easiest way to get permissioned updates in real time.

Fast Membership Tests Instead of Streaming

What if, instead of pushing events, consumers could pull them? For example, what if an AppView or client could make a quick API call to check, “Do I have everything I’m supposed to have?”

I think salted CIDs with RIBLT slices are worth exploring here. The idea is that you don’t need the full firehose to stay in sync. You just need a way to notice when you’re out of sync and then fix it. RIBLTs (Rateless Invertible Bloom Lookup Tables) are well suited to set reconciliation in the ATProtocol ecosystem, and using them for lightweight “am I current?” checks against a permissioned bucket seems worth trying.

Per-Identity Bucket Rev Lookups

Similar to the membership test approach, a somewhat simpler alternative could be to have a fast per-identity lookup for repository bucket revisions.

HEAD /xrpc/com.atproto.repo.getBucketRev?bucket=bffs&cursor=Y
Authorization: DPoP ...

The cursor would be a checksum (CID) of a structure like:

{
    "bucket": "bffs",
    "rev": "bucket-specific-repos-rev",
    "identity": "requesting-did",
    "nonce": "xyz"
}

The nonce would be the current time rounded up to the next 20-minute boundary, HMAC’d with an internal bucket salt. This forces a periodic change, so the cursor naturally expires and consumers need to check again. It acts as a lightweight heartbeat, asking: “has anything changed in this bucket since I last looked, for my identity?”
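
To make the nonce idea concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the function name, the choice of HMAC-SHA-256, and the decision to encode the window boundary as a decimal string. None of this is part of any spec.

```python
import hashlib
import hmac
import math
import time
from typing import Optional

WINDOW_SECONDS = 20 * 60  # the 20-minute window described above


def bucket_nonce(bucket_salt: bytes, now: Optional[float] = None) -> str:
    """Derive a nonce that changes once per 20-minute window (hypothetical)."""
    if now is None:
        now = time.time()
    # Round the current time up to the end of its 20-minute window.
    window_end = math.ceil(now / WINDOW_SECONDS) * WINDOW_SECONDS
    message = str(window_end).encode("ascii")
    # HMAC-SHA-256 is an assumed choice; the post does not name an algorithm.
    return hmac.new(bucket_salt, message, hashlib.sha256).hexdigest()
```

Two calls inside the same window return the same nonce, so the cursor stays stable until the window rolls over. After that, the nonce changes and any cursor built from the old one stops matching.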

If the response is a 304, you’re up to date. If it’s a 200 with a new cursor, something changed, and you need to reconcile. Returning a 200 for anything except an exact match on “bucket exists,” “user has permission,” and “CID matches” means you’re not leaking any information.

This isn’t a replacement for the firehose for public data. But for permissioned data with a scoped audience and a lower update frequency, polling with a smart cursor might be cheaper and simpler than maintaining a persistent event stream connection.

Side note: HEAD request support in XRPC is a long-standing wishlist item for me.

De-sync Event Streams

On the other hand, what if there was a lightweight event stream that only told you when you were out of sync? Instead of the full firehose with commits and MST diffs, it would just give you a nudge: “hey, bucket X changed, you should re-check.”

This would be a much lighter stream than the firehose. There’s no record data or MST nodes, just de-sync notifications. Consumers would subscribe to the buckets they care about and get notified when something changes. Then they could use the membership test to determine what to do next.

This approach gives you the best of both worlds: push notifications for immediate updates and pull reconciliation for the actual data. The event stream carries almost no information, so it’s safe to broadcast even for permissioned content.

This could be in the form of a ratchet tuple (here be dragons) from the bucket revision:

{ "rev": "bafyre...25pcba", "prev": "bafyre...f7micy" }
{ "rev": "bafyre...dsh5dm", "prev": "bafyre...25pcba" }
...

Buckets as Keyhive Groups

The idea of a “bucket” keeps coming up, and the more I think about it, the more I like seeing each bucket as a BeeKEM or Keyhive-style group of access controls. Keyhive from Ink & Switch provides group membership management with end-to-end encryption, using a continuous group key agreement protocol that handles dynamic membership, concurrent updates, and coordination-free revocation. If each permissioned bucket maps to a Keyhive group, you get:

  • A well-defined set of members (identities and services) who can access the bucket’s contents

  • Cryptographic key agreement that doesn’t require a central coordinator

  • Revocation that works without having to re-encrypt everything

  • A group structure that can nest (a bucket can contain sub-buckets, a group can contain sub-groups)

I’m not sure yet how this fits with the MST and repository structure in practice. Keyhive isn’t a one-size-fits-all solution here, but the shape feels right: cryptographic groups with their own key material, whose members have access to the documents (records and blobs) in the group.
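
To make the nesting idea concrete, here is a minimal sketch of group membership that resolves through sub-groups. It is not Keyhive's API and it ignores key agreement entirely; the Group class, the has_access function, and the reading that members of a nested group inherit access to the parent are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """A bucket modeled as a group: direct members plus nested sub-groups."""
    name: str
    members: set = field(default_factory=set)
    subgroups: list = field(default_factory=list)


def has_access(group: Group, identity: str, _seen=None) -> bool:
    """True if identity is a member of group or of any nested sub-group."""
    seen = _seen if _seen is not None else set()
    if id(group) in seen:  # guard against groups that contain each other
        return False
    seen.add(id(group))
    if identity in group.members:
        return True
    return any(has_access(sub, identity, seen) for sub in group.subgroups)
```

For example, if a "party" group contains a "friends" sub-group, everyone in "friends" can reach the party bucket, while party-only members gain nothing in "friends".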

Permission Graphs

I think we need to model controlled data in a way that supports permission graphs:

Direct permissions. Mattie can see this post. This is the simplest case: a specific identity is granted access to a specific record. Most access control discussions start and end here.

Associative permissions. Mattie can see this post and the blobs it references. This is where things get interesting. A post might embed images, link to location records, or reference other content. If Mattie can see the post but not the embedded image, the experience doesn’t work. Permissions need to flow through record references. If you’re authorized for a record, you should also be authorized for the things it points to, or at least the things it declares as associated.

Indirect permissions. Mattie can see this post through the NeatPosts AppView. This is the AppView-scoped access model. Mattie doesn’t have direct permission on the record. The NeatPosts AppView does, and Mattie’s relationship with NeatPosts allows it to serve the content to her. The permission is managed by a trusted application.

Permission layers and sprawl create a kind of graph. For example, a private event might give direct permission to attendees, associative permission for the venue location and event photos, and indirect permission through the Smoke Signal AppView. This way, attendees can find the event in the app without needing individual record-level grants.

Getting the data model right for this is probably the hardest unsolved problem in permissioned data. It’s easy to say “permissions flow through references,” but actually following a reference graph, checking authorization at each step, and caching those decisions efficiently is not simple.
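
As a rough illustration of why this gets complicated, the sketch below models the three permission types with plain dictionaries and walks the reference graph to answer one question. Every name in it (PermissionGraph, its fields, can_read) is hypothetical, and it does no caching at all.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PermissionGraph:
    direct: dict = field(default_factory=dict)      # uri -> identities granted directly
    associated: dict = field(default_factory=dict)  # uri -> uris it declares as associated
    appviews: dict = field(default_factory=dict)    # uri -> AppView DIDs granted access
    audiences: dict = field(default_factory=dict)   # AppView DID -> identities it serves

    def _roots(self, uri: str):
        """Yield uri plus every record that reaches it through associations."""
        seen, stack = set(), [uri]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            yield current
            stack.extend(p for p, kids in self.associated.items() if current in kids)

    def can_read(self, identity: str, uri: str, via: Optional[str] = None) -> bool:
        for root in self._roots(uri):
            if identity in self.direct.get(root, set()):
                return True  # direct, or associative when root != uri
            if via in self.appviews.get(root, set()) and identity in self.audiences.get(via, set()):
                return True  # indirect, through a trusted AppView
        return False
```

Even this toy version scans every association on each lookup, which hints at why caching authorization decisions matters at scale.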

Borrowing From OAuth

There are a lot of paved paths in OAuth that feel relevant here.

What if PKCE, but reverse? In standard OAuth PKCE, the client proves it’s the same entity that started the authorization flow by presenting a code verifier that matches a previously committed code challenge. What if we flipped this for record access? The PDS commits a challenge when publishing a permissioned record, and an authorized consumer presents the verifier when requesting the content.

I’m not sure this fits perfectly, but the idea of stateless proof of authorization using a pre-committed challenge seems useful for inter-service record access, especially when you don’t want the PDS to keep a session table for every authorized consumer.

Rich Authorization Requests (RAR). OAuth RAR lets clients request specific, fine-grained permissions using structured authorization details. Instead of broad scopes like repo:collection, you could say, “I want read access to records in the community.lexicon.calendar.event collection where the requesting identity is in the event’s attendee list.” This fits directly with the associative and indirect permission models above.

The ATProtocol inter-service JWT already carries iss and aud claims. Extending it with RAR-style authorization details could give source services the context they need to make fine-grained access decisions without having to infer intent from the request alone.
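
As a sketch of what that extension might look like, the snippet below builds an unsigned claims set with an authorization_details array in the style of OAuth RAR. The type value and the condition field are invented for illustration; only iss, aud, iat, exp and the general authorization_details shape (type, actions, datatypes) come from existing specs.

```python
import time


def service_auth_claims(iss: str, aud: str, collection: str, condition: str, ttl: int = 60) -> dict:
    """Build unsigned JWT claims carrying RAR-style authorization details."""
    now = int(time.time())
    return {
        "iss": iss,
        "aud": aud,
        "iat": now,
        "exp": now + ttl,
        "authorization_details": [
            {
                "type": "atproto_record_access",  # hypothetical type name
                "actions": ["read"],
                "datatypes": [collection],
                "condition": condition,           # hypothetical extension field
            }
        ],
    }


claims = service_auth_claims(
    iss="did:web:appview.example",
    aud="did:web:pds.example",
    collection="community.lexicon.calendar.event",
    condition="requester_in_attendee_list",
)
```

Signing and transport are unchanged from a regular inter-service token; the source service would simply read the extra claim when deciding what to return.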

None of This Is Done

These are notes, not specs. The bucket model needs more thought, and the permission types need a real data model. The OAuth analogies might not work in practice. But I think there’s something here worth considering, especially the idea that permissioned data doesn’t have to be an all-or-nothing feature. It can be a set of conventions that build on what already exists, using patterns from systems that have solved similar problems.


Join the discussion on the ATProtocol Community Discourse or in the atmosphere.

https://ngerakines.leaflet.pub/3mf7we5j4ek2w
Permissioned data is a love triangle
Permissioned data is a love triangle between the user, the identities they grant permissions to, and the applications everyone uses to view controlled data. We don't need to change or reinvent the protocol to have it, because ATProtocol already supports it.

Last summer, I wrote a blog post called “ATProtocol Record Hydration: Building Privacy-Aware Views”. In it, I explored how ATProtocol’s current features can help create privacy-aware data views. With so much discussion about “private data” and what needs to be built to support it, I want to showcase the existing protocol features that make it possible today.

If you don’t want to read the original post, here’s the main idea: add an optional service field to com.atproto.repo.strongRef and blob types. This field points to a service that uses inter-service authentication to provide the data. Apps that don’t support this feature won’t find the content in the user’s repository. Apps that do support it will control access with permissions.

That post focused on the mechanics: remote record references, inter-service authentication, and the hydration pattern. This post aims to refine that model. I want to cover three main points: the trust relationships needed for controlled data, why data portability must not be lost when adding access control, and how a permission-aware PDS can join the network’s event stream without exposing record contents.

The Love Triangle

When discussing permissioned data on ATProtocol, people usually focus on two groups: the data owner and those granted access to it. But that view leaves out a third key player: the applications that store, relay, and display the data.

Permissioned data is a love triangle between:

  • The user who creates and owns the data

  • The permitted identities that are authorized to access it

  • The applications that store or relay the data

This is not a flaw. It’s a core part of how the system works, and we should recognize it. By “applications,” I mean more than just the PDS. It includes the PDS that stores the data, the AppView that displays it, and any relay or indexer involved. Every application in this chain is part of the trust relationship.

Export Parity: Permissioned Data Is Still Your Data

Here’s a principle I want to state clearly: a permission-enabled PDS should have the same export controls and functionality as any other PDS.

It can be tempting to treat permissioned records as different when it comes to data portability, but they are not. They are still user-owned records. The repository owner always receives the full CAR file, including the commit, MST, and all record data blocks. Export, import, migration, and backup all work the same as with a standard PDS. This is not up for debate.

Data portability is one of ATProtocol’s core promises. The moment we say, “Well, these records are special, so they can’t be exported the normal way,” we’ve broken that promise. Permissioned data adds a runtime behavior (access control at read time), not a storage constraint.

Permission metadata that describes who can access what and under which conditions should also be portable. Whether it’s kept as separate permission records, inside the record’s source field, or managed with a dedicated lexicon, it must remain intact during export and import.

The Repository Already Supports This

This is where it gets interesting. ATProtocol repositories already separate “what exists” from “what it contains.” We don’t need a new primitive for metadata-only network participation; we just need to use the current structure on purpose.

How the MST Separates Structure from Content

An ATProtocol repository has three layers, each referencing the next by CID:

  • Signed commit object — contains the account DID, a revision TID, and a data CID pointing to the MST root. The commit is signed with the account’s signing key.

  • Merkle Search Tree nodes — each entry maps a collection/rkey path (like app.bsky.feed.post/3lsopfrzoww25) to a record’s CID. The tree is deterministic: given any set of path and CID pairs, exactly one valid MST exists.

  • Record data blocks — the actual DAG-CBOR encoded record content. Each block’s CID is computed from its bytes, and that CID appears in the MST.

The critical property: the MST stores paths and CIDs, not record content. The tree nodes reference records by their content hash, but they never contain record data. The commit signature covers the MST root CID, which is computed from the MST nodes, which in turn reference record CIDs. As a result, the signature transitively authenticates every path→CID mapping in the repository without requiring any record data.

This means the whole Merkle tree, including the commit signature, tree structure, and every path-to-CID mapping, stays complete and verifiable even if all record data blocks are removed.

Metadata-Only CAR: The Network View

Strip the record data blocks from a repository CAR file, and what you have left is a signed, verifiable table of contents:

  • The signed commit block (identical to the full export)

  • All MST node blocks (identical to the full export)

  • No record data blocks

This metadata-only CAR shows which account owns the repository, its revision, which records exist at which paths, and each record’s content hash. The full Merkle tree stays intact and can be checked against the commit signature. What’s missing is the actual content of any record; the CIDs in the MST leaf entries are just dangling references.

This is structurally valid. The CAR spec doesn’t prohibit dangling CID references, and ATProtocol already tolerates them. A metadata-only CAR is the natural extension of patterns the protocol already uses.
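
The stripping step can be pictured with a toy model. The sketch below stands in for a parsed CAR with a plain dictionary of blocks keyed by CID strings, plus the path-to-CID entries read from the MST; both function names are hypothetical, and real code would need a CAR reader and an MST walker to produce these inputs.

```python
def strip_record_blocks(blocks: dict, mst_entries: dict) -> dict:
    """Keep commit and MST node blocks; drop every block an MST leaf points to.

    blocks: CID string -> raw block bytes
    mst_entries: "collection/rkey" path -> record CID string
    """
    record_cids = set(mst_entries.values())
    return {cid: data for cid, data in blocks.items() if cid not in record_cids}


def missing_records(blocks: dict, mst_entries: dict) -> list:
    """Paths whose record block is absent, i.e. the dangling references."""
    return sorted(path for path, cid in mst_entries.items() if cid not in blocks)
```

Run against a full export, missing_records returns an empty list; run against the stripped copy, it returns every path, which is exactly the set an authorized consumer would need to fetch.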

Two CARs, Two Audiences

This gives us a clean model:

  • Network participants, like relays and indexers, get the metadata-only CAR. They can see what exists, check the tree, and index paths and CIDs, but they can’t view the record content.

  • Repository owners always have access to the full CAR, including all record data for export, import, migration, and backup. The permission layer never limits the owner’s access to their own data.

  • Authorized consumers (AppViews and clients) fetch individual record blocks as needed through authenticated requests. They check each block’s CID against the MST and provide content to users who have permission.

This is export parity in practice: the owner’s experience is identical to a standard PDS, while the network sees only what it needs to maintain structure and consistency.

A Permissioned Firehose: Events Without Values

In standard ATProtocol, a PDS publishes its commit log to the network through an event stream (the firehose). Relays and downstream consumers subscribe to this stream to stay synchronized. Each event carries the commit along with the full data of the records it changed.

For a permission-enabled PDS, this creates a problem. You can’t send the full content of a confidential record to every relay and indexer on the network. Doing so would defeat the purpose of access control.

Metadata-Only Commits

Instead of publishing entire records, a permission-enabled PDS shares the commit and changed MST nodes without the record data blocks. This follows the metadata-only CAR pattern for diffs.

This acts as a commit announcement: “a record at this path was created or updated, here’s its content hash, but you’ll need to ask me for the actual content.”

This changes the default from push to pull. Instead of sending everything, the system now tells you something exists, and you can request it if you have permission.

The precedent already exists. Firehose diffs are already partial CARs with dangling CID references. The deprecated tooBig mechanism emitted commit-only events without record blocks. Jetstream omits MST nodes entirely. Sync v1.1 introduces collection-filtered partial exports and inductive verification without MST retention. A metadata-only firehose event is the natural next step in this progression.

Backwards-Compatible StrongRef Extension

The cleanest way to express this is as a backwards-compatible extension to the existing strongRef structure. Today, a strong reference looks like:

{
    "cid": "bafyreic4uafzyy5o7g4o7yjnnmmkootivwvybyrq2xcu63x3z3tmuj5tgq",
    "uri": "at://did:plc:w4xbfzo7kqfes5zb7r6qv3rw/app.bsky.feed.post/3me5kw5txns2c"
}

For a permissioned record, the reference gains a service field:

{
    "cid": "bafyreic4uafzyy5o7g4o7yjnnmmkootivwvybyrq2xcu63x3z3tmuj5tgq",
    "uri": "at://did:plc:w4xbfzo7kqfes5zb7r6qv3rw/app.bsky.feed.post/3me5kw5txns2c",
    "service": "did:plc:w4xbfzo7kqfes5zb7r6qv3rw#atgated_pds"
}

The service field is a service identifier that points to the endpoint serving the full record content. Existing clients that don’t recognize this field simply ignore it. They still see a valid reference and can use the URI and CID for indexing, deduplication, and graph building, but they can’t access the value.

Clients that understand the service field know to resolve the DID, find the #atgated_pds service endpoint, and make an authenticated request to get the record content, following the existing authorization rules for inter-service authentication.
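
The lookup half of that flow might look like the sketch below. It takes an already-resolved DID document as a dictionary (no network calls), and the helper names and the example endpoint are assumptions. It accepts service ids written either as a bare fragment or as a full DID URL, since both forms appear in DID documents.

```python
def split_service_ref(service: str):
    """Split 'did:...#fragment' into (did, '#fragment')."""
    did, _, fragment = service.partition("#")
    if not did.startswith("did:") or not fragment:
        raise ValueError(f"not a DID service reference: {service!r}")
    return did, "#" + fragment


def find_endpoint(did_document: dict, service: str) -> str:
    """Return the serviceEndpoint that matches the strongRef's service field."""
    did, fragment = split_service_ref(service)
    if did_document.get("id") != did:
        raise ValueError("DID document does not belong to the referenced DID")
    for entry in did_document.get("service", []):
        if entry.get("id") in (fragment, did + fragment):
            return entry["serviceEndpoint"]
    raise LookupError(f"no service {fragment} in DID document")
```

The authenticated request to the returned endpoint is left out here; it would follow the normal inter-service authentication rules.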

What This Enables

This design preserves several important properties:

Network-level consistency: Relays still see every commit and can keep a full view of the repository structure, including which collections exist, how many records are in each, and the CID of every commit. They just can’t view the contents of permissioned records. The Merkle tree stays intact and verifiable.

Content-addressed integrity: The CID in the announcement is the hash of the actual record content. When an authorized consumer later gets the full record, they can check that the CID matches. The firehose announcement acts as a commitment to the content, even if the content isn’t included.

Selective hydration at the edges: AppViews and clients are the right place for hydration. They already make authenticated requests on behalf of users. The firehose tells them what exists, and the service endpoint tells them what it contains if they have permission.

Backwards compatibility: Nothing breaks. Applications that don’t recognize service fields still work. The permissioned layer only adds new features.
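
The content-addressed integrity check is small enough to sketch with the standard library. This assumes the fetched block is the record's DAG-CBOR bytes exactly as stored, and that the CID is a CIDv1 with the dag-cbor codec and a SHA-256 multihash, rendered in lowercase base32; the function names are hypothetical.

```python
import base64
import hashlib

# CIDv1 prefix: version 1, dag-cbor codec (0x71), sha2-256 (0x12), 32-byte digest (0x20)
_CID_PREFIX = bytes([0x01, 0x71, 0x12, 0x20])


def cid_for_block(block: bytes) -> str:
    """Compute the base32 CID string for an already-encoded DAG-CBOR block."""
    digest = hashlib.sha256(block).digest()
    encoded = base64.b32encode(_CID_PREFIX + digest).decode("ascii")
    return "b" + encoded.lower().rstrip("=")


def matches_announcement(announced_cid: str, block: bytes) -> bool:
    """True if the fetched block hashes to the CID seen on the firehose."""
    return cid_for_block(block) == announced_cid
```

An AppView would call matches_announcement with the CID it indexed from the metadata-only event and the bytes returned by the service endpoint, and discard the response on a mismatch.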

Putting It Together

Here’s how these refinements combine:

1. A user creates a confidential Lexicon Community calendar event on their PDS for a private party.

2. The PDS publishes a metadata-only firehose event, which includes the commit and MST diff but no record data block. The service field in the metadata identifies, by DID and fragment, the service that can serve the data.

3. A relay consumer sees the event, indexes the path and CID, and verifies the MST integrity against the commit signature. It knows a record exists but can’t see what it contains.

4. An AppView, acting on behalf of an authorized user, resolves the service DID, discovers the endpoint, and makes an authenticated request for the record block. It verifies the returned data’s CID against the MST.

5. The AppView checks the requesting user’s authorization and returns the record.

6. If the user ever wants to migrate, they export their entire repository as a full CAR file (including all record data blocks) and import it into a new PDS.

Nothing in this flow needs protocol changes. Everything is additive: metadata-only firehose events, an optional field on strong references, and clear expectations about trust boundaries. The MST already separates structure from content; we’re just making that separation intentional.

What’s Next

Some questions remain. How should permission metadata be standardized across different implementations? What happens when a record is cached by an authorized consumer and needs to be revoked?

It’s important to note that collection NSIDs and TID-format rkeys in the MST reveal record types and creation times, even in metadata-only CARs. A relay consumer processing these events can see, for example, that you created three community.lexicon.calendar.event records last Tuesday, but not what those events contain. Incorporating metadata privacy would require changes to how the MST is built, which is beyond the scope of this post but worth considering.
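
To show how little work the timing leak takes, here is a sketch that recovers a creation time from a TID-format rkey. It assumes the TID layout from the atproto spec as I understand it: 13 characters of sortable base32, where the low 10 bits are a clock identifier and the rest is microseconds since the Unix epoch. The function name is made up.

```python
from datetime import datetime, timezone

TID_ALPHABET = "234567abcdefghijklmnopqrstuvwxyz"


def tid_timestamp(tid: str) -> datetime:
    """Decode the creation time embedded in a TID record key."""
    if len(tid) != 13:
        raise ValueError("a TID is 13 characters long")
    value = 0
    for ch in tid:
        value = value * 32 + TID_ALPHABET.index(ch)  # ValueError on bad characters
    micros = value >> 10  # drop the 10-bit clock identifier
    return datetime.fromtimestamp(micros / 1_000_000, tz=timezone.utc)


# rkey taken from the example path used earlier in this post
created = tid_timestamp("3lsopfrzoww25")
```

Anyone holding a metadata-only CAR can run this over every path in the MST, with no access to record content.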

The repository structure already gives us the separation we need: metadata-only CARs for the network and full CARs for the owner. A permissioned firehose based on this separation is a straightforward, backwards-compatible way to add access control to the network layer without disrupting what already works.


Join the discussion on the ATProtocol Community Discourse or in the atmosphere.

Check out the b-sides to this post:

B-Sides: Permissioned data is a love triangle - Nick's Blog

This is a B-sides post with unpolished thoughts that didn’t make it into the main article.

https://ngerakines.leaflet.pub/3mf5wu6fs6225
Lexicon Garden is growing
Lexicon Garden is growing and moving! With community support, the service is migrating to new infrastructure in Europe, offering better hardware and more space for new features.

Lexicon Garden is a platform for exploring and developing lexicon schemas, fundamental building blocks that define ATProtocol data and services. I created it because I believe that community infrastructure for developers working with ATProtocol and in the atmosphere is essential to creating a healthy development ecosystem.

Recent funding has allowed Lexicon Garden to move to better hardware. This new infrastructure will support planned features, such as analytics and record sampling, and make the service more resilient and available globally.

As part of this infrastructure move, Lexicon Garden is relocating from the United States to Europe. The Atmosphere is global, so placing the infrastructure closer to more developers and users is a positive step.

I'm deeply grateful for the financial contributions from ATProtocol Community Fund and Cosmic Network, and their support for this mission.

If you use Lexicon Garden and want to help keep it going, any contribution is welcome on the Open Collective page.

Stay tuned for more updates by following @lexicon.garden

https://ngerakines.leaflet.pub/3mf5bfoekok2x
Making Lexicon Garden AI-Friendly
Lexicon Garden can help you explore and interact with lexicons both in the browser and with the help of your favorite agent.

You may be using agents to write code, make API calls, and build applications, but a lot of protocol documentation is written for people who browse websites and gather information from code repositories. We've made some improvements that can change how you use LLMs and agents to work in the Atmosphere.

Lexicon Garden is a feature-rich Lexicon schema platform that helps you understand and interact with lexicons through both your browser and your AI tooling. In this post, I'll introduce two new features: an llms.txt endpoint for AI-friendly documentation and the Model Context Protocol (MCP) endpoint, which lets AI assistants explore and use ATProtocol methods.

The llms.txt Standard

The llmstxt.org proposal sets a standard for websites to offer machine-readable documentation that works well for large language models. Lexicon Garden uses this standard in two main ways.

Site-Level Documentation

The main endpoint at GET /llms.txt returns a detailed markdown document with the following:

  • Overview of ATProtocol lexicons and their types (records, queries, procedures, subscriptions)

  • Complete XRPC API reference with parameters, response formats, and example requests

  • MCP tool documentation

  • A dynamic list of all authoritatively-hosted lexicons with links to their individual llms.txt files

curl https://garden.lexicon.garden/llms.txt

This provides agents with all the information they need about Lexicon Garden, so they do not have to read HTML or scrape web pages.

Per-Lexicon Documentation

Each lexicon schema has its own llms.txt endpoint at GET /lexicon/{did}/{nsid}/llms.txt. This endpoint offers:

  • Schema definition with all type information

  • Property tables for records, queries, and procedures

  • Validation constraints (minLength, maxLength, enum values, etc.)

  • Community-contributed examples

  • Raw JSON schema for programmatic consumption

If an AI agent needs to build a valid app.bsky.feed.post record, this endpoint provides all the structured information required. There is no need to guess or make up field names.

Model Context Protocol Integration

The MCP endpoint at POST /mcp uses JSON-RPC 2.0. This lets AI agents interact with Lexicon Garden through a standard tool interface. Three tools are available.

describe_lexicon

This read-only tool retrieves detailed schema information for any ATProtocol lexicon:

{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "describe_lexicon",
        "arguments": {
            "lexicon": "app.bsky.feed.post"
        }
    }
}

The tool accepts an NSID and an optional identity. If no identity is given, it finds the main source using Lexicon resolution.

The response includes the full schema definition, authority DID and CID, and up to 20 real-world examples. Each response also adds random safety tags to help prevent prompt injection attacks from harmful schema content.

create_record_cid

Agents working with ATProtocol records often need to generate content identifiers:

{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "create_record_cid",
        "arguments": {
            "record": {
                "$type": "app.bsky.feed.post",
                "text": "Hello from MCP!",
                "createdAt": "2025-01-11T00:00:00.000Z"
            }
        }
    }
}

This process uses DAG-CBOR encoding and SHA-256 hashing to create CIDs that follow ATProtocol rules.
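
For a sense of what happens behind the tool, here is a minimal sketch of the same idea in pure Python. It is not Lexicon Garden's implementation. It covers only a subset of DAG-CBOR (null, booleans, integers, strings, bytes, lists, and string-keyed maps) and does not handle floats, CID links, or the JSON $link and $bytes forms.

```python
import base64
import hashlib


def _head(major: int, n: int) -> bytes:
    """CBOR head: major type plus the shortest encoding of the argument n."""
    if n < 24:
        return bytes([(major << 5) | n])
    for extra, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([(major << 5) | extra]) + n.to_bytes(size, "big")
    raise ValueError("integer too large for CBOR")


def encode(value) -> bytes:
    """Encode a small subset of the data model as canonical DAG-CBOR."""
    if value is None:
        return b"\xf6"
    if isinstance(value, bool):  # must come before the int check
        return b"\xf5" if value else b"\xf4"
    if isinstance(value, int):
        return _head(0, value) if value >= 0 else _head(1, -1 - value)
    if isinstance(value, str):
        data = value.encode("utf-8")
        return _head(3, len(data)) + data
    if isinstance(value, bytes):
        return _head(2, len(value)) + value
    if isinstance(value, list):
        return _head(4, len(value)) + b"".join(encode(v) for v in value)
    if isinstance(value, dict):
        # Map keys are sorted by encoded length first, then bytewise.
        keys = sorted(value, key=lambda k: (len(k.encode("utf-8")), k.encode("utf-8")))
        return _head(5, len(keys)) + b"".join(encode(k) + encode(value[k]) for k in keys)
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


def record_cid(record: dict) -> str:
    """CIDv1, dag-cbor codec, sha2-256, lowercase base32."""
    digest = hashlib.sha256(encode(record)).digest()
    raw = bytes([0x01, 0x71, 0x12, 0x20]) + digest
    return "b" + base64.b32encode(raw).decode("ascii").lower().rstrip("=")


cid = record_cid({
    "$type": "app.bsky.feed.post",
    "text": "Hello from MCP!",
    "createdAt": "2025-01-11T00:00:00.000Z",
})
```

Because keys are sorted before encoding, the same record yields the same CID no matter what order its fields were written in.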

invoke_xrpc

The most advanced tool lets AI agents make authenticated XRPC calls for users:

{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "invoke_xrpc",
        "arguments": {
            "method": "com.atproto.repo.listRecords",
            "params": {
                "repo": "did:plc:example",
                "collection": "app.bsky.feed.post",
                "limit": 10
            }
        }
    }
}

This tool retrieves the lexicon schema from the main source, checks all inputs against the schema before making a call, uses GET for queries and POST for procedures, and supports service proxying with the atproto_proxy parameter to route through the user's PDS to AppView or other services.

If validation fails, the error response includes the expected schema structure, required fields, and what was actually provided. This gives agents the context they need to self-correct.
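
The GET-versus-POST decision can be sketched as follows. This is an illustration rather than the tool's actual code: it assumes the lexicon document has the usual defs.main.type field, and the function name and example service URL are made up. Input validation against the schema is omitted.

```python
import json
from urllib.parse import urlencode


def build_xrpc_request(service: str, nsid: str, lexicon: dict, params: dict):
    """Return (http_method, url, body) for an XRPC call based on the lexicon type."""
    kind = lexicon["defs"]["main"]["type"]
    url = f"{service.rstrip('/')}/xrpc/{nsid}"
    if kind == "query":
        query = urlencode(params, doseq=True)
        full_url = f"{url}?{query}" if query else url
        return "GET", full_url, None
    if kind == "procedure":
        return "POST", url, json.dumps(params).encode("utf-8")
    raise ValueError(f"{nsid} is a {kind}, not a query or procedure")


method, url, body = build_xrpc_request(
    "https://pds.example.com",
    "com.atproto.repo.listRecords",
    {"lexicon": 1, "id": "com.atproto.repo.listRecords", "defs": {"main": {"type": "query"}}},
    {"repo": "did:plc:example", "collection": "app.bsky.feed.post", "limit": 10},
)
```

Queries put their parameters in the query string, while procedures send a JSON body, which is why the schema has to be fetched before the request can be built.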

The invoke_xrpc_guide Prompt

MCP prompts offer context-sensitive guidance. The invoke_xrpc_guide prompt helps agents understand how to properly use specific XRPC methods:

{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "prompts/get",
    "params": {
        "name": "invoke_xrpc_guide",
        "arguments": {
            "method": "com.atproto.repo.createRecord",
            "atproto_proxy": "true"
        }
    }
}

The prompt creates guidance based on the arguments. It gives method-specific advice about required fields and examples, service proxying documentation when needed, validation context, and best practices such as using describe_lexicon first and including $type fields.

ATProtocol OAuth: A Simple and Secure Proxy Approach

To make secure interactions easier for AI agents, Lexicon Garden manages the complex parts of ATProtocol OAuth authentication behind the scenes. Instead of requiring every agent to handle technical details, Lexicon Garden acts as a secure, easy-to-use proxy for authentication.

The Problem

ATProtocol OAuth requires DPoP (Demonstration of Proof-of-Possession), which binds tokens to specific cryptographic keys. This creates real challenges for MCP clients:

  • DPoP keys must be stored securely and used for every request

  • Token refresh requires the same DPoP key

  • Service proxying adds another layer of complexity

Most OAuth libraries do not support DPoP by default. Expecting every AI agent implementation to handle this correctly can lead to security issues and frustrated developers.

The Approach

To make things simpler, Lexicon Garden handles the technical parts of authentication for you. It manages the special keys and tokens required by ATProtocol, and takes care of logging in, granting access, and making secure requests. As a result, AI agents and developers can use familiar, standard authentication methods without worrying about new security mechanisms. Lexicon Garden translates these simple requests into the more complex protocol steps in the background.

Security and Trust

The tradeoff is trust. Clients must trust Lexicon Garden with their ATProtocol access tokens. This approach works well for first-party integrations but may not fit every use case.

Several measures protect against misuse:

  • Client-DID binding — once a client authenticates as a specific DID, it cannot authenticate as a different identity

  • Resource validation — tokens are bound to specific resource URIs (RFC 8707)

  • Short-lived tokens — access tokens expire quickly; refresh tokens enable renewal

An AI agent interacting with Lexicon Garden might:

  • Read /llms.txt to understand available APIs

  • Call describe_lexicon to understand a specific schema

  • Request the invoke_xrpc_guide prompt for method-specific guidance

  • Authenticate through the OAuth flow

  • Use invoke_xrpc to make authenticated calls

Each layer gives the context needed for the next. Agents can interact with ATProtocol services without hardcoded knowledge of every schema. They can discover and learn as they go.

This is what AI-native protocol tooling looks like. The documentation is made for machines, the tools give structured feedback, and the authentication flows don't sacrifice security for convenience. The protocol stays open and federated, and the tooling makes it accessible.

https://ngerakines.leaflet.pub/3mca3ybkvbc2t
Lexicon Validation
Validate both lexicon schemas and records against those schemas.

Lexicon Garden lets you fully validate schemas and records. You can check if a lexicon schema is valid and, if you want, also validate record data against it.

We built this feature using the indigo project as a guide for both parsing and testing. Our aim was to match real-world validation, so you can spot problems before they reach production.

Validation is available throughout the site. When you create or edit a lexicon in the management interface, you can check your schema before saving. Each lexicon has a "Validate Schema" button to check its definition. You can also validate example records against their schema with one click.

You can also use the standalone validation page at lexicon.garden/validate. Just paste in a schema and click validate, or include record data to check both at once. This helps when you're drafting a new lexicon and want to test ideas quickly before creating anything.

The validator works with all lexicon definition types: records, objects, arrays, queries, procedures, and subscriptions. If validation fails, you'll see clear error messages showing what needs fixing. If it passes, you'll get a green confirmation and a ready-to-use curl example for calling the validation endpoint programmatically.

https://ngerakines.leaflet.pub/3mbtj66hk2c2w
Lexicon Management
You can use Lexicon Garden to create and manage lexicon schemas right from the browser.

Lexicon Garden lets you create and manage lexicon schemas right on the website. You no longer need to edit JSON files or push to repositories. Instead, you can define, update, and publish lexicons straight from your browser.

To get started, sign in and go to your identity page. There, you’ll find an option to enable Lexicon Management. Accept the site policies, then pick a subdomain. This subdomain is a unique name for your account, like "stunning-leafcutter" or "expansive-kookaburra." If you don’t like the options, you can try again, but choose carefully because you won’t be able to change it later.

After you enable this feature, you can create lexicons under your Lexicon Garden subdomain or manage lexicons on domains you own. When you view a lexicon you own, you’ll see a new "Manage" section. There, you can check the resolution status and edit the schema definition directly.

The main advantage is the garden.lexicon.* namespace. Your subdomain can be used as a base NSID. For example, if you choose "stunning-leafcutter," you can create lexicons like garden.lexicon.stunning-leafcutter.example or garden.lexicon.stunning-leafcutter.music.album. DNS entries are set up automatically, so lexicon resolution works right away. You don’t need to configure TXT records or wait for propagation. Just create a lexicon and start using it immediately.

This turns Lexicon Garden into a real development environment for trying out new schemas. You can create a lexicon, test it, and change its structure without dealing with DNS or deployments. If you use your own domains, the manage page will show you exactly which DNS records to add for resolution to work.
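
For custom domains, the DNS name involved can be derived from the NSID. The sketch below reflects my reading of the Lexicon resolution rules: the authority is every NSID segment except the last, reversed into a domain name, and the TXT record lives at that domain with a _lexicon prefix. Treat the helper name and that reading as assumptions, and rely on the manage page for the exact records.

```python
def lexicon_dns_name(nsid: str) -> str:
    """Derive the DNS name that would hold the lexicon's TXT record."""
    segments = nsid.split(".")
    if len(segments) < 3:
        raise ValueError("an NSID needs at least three segments")
    authority = segments[:-1]  # drop the final name segment
    return "_lexicon." + ".".join(reversed(authority))


name = lexicon_dns_name("garden.lexicon.stunning-leafcutter.example")
```

Under this reading, a deeper NSID such as garden.lexicon.stunning-leafcutter.music.album resolves through its own authority domain rather than inheriting from the parent.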

https://ngerakines.leaflet.pub/3mbtgsqkgcc2y
Authenticated XRPC Calls in Lexicon Garden
ATProtocol OAuth pairs effortlessly with Lexicon Garden to make it easy for developers to experiment with authenticated XRPC endpoints.

Lexicon Garden now supports ATProtocol OAuth, so you can sign in and make authenticated XRPC calls right from the documentation.

The "Try It" panel on XRPC method pages is now available. You can fill in parameters and send requests to any service straight from your browser. This works well for public endpoints that don't need authentication. However, many XRPC methods require you to be signed in, and until now, there was no way to test those.

Now you can. Sign in with your internet handle, DID, or the URL of your PDS. Lexicon Garden guides you through the OAuth process and sets up a session. Once you're logged in, the "Try It" panel lets you make authenticated requests. Just check the box to use your credentials, and you can call endpoints that need a signed-in user.

This makes it easy to explore authenticated endpoints without writing code or setting up a development environment. Want to see what app.bsky.actor.getSuggestions returns for your account? Or check your bookmarks with app.bsky.bookmark.getBookmarks? Now you can do both directly from the docs.

Requests go straight from your browser to the service you choose. Some servers might block requests due to CORS, but the most common endpoints work without issue. This site gives you direct access to ATProtocol services, so it's best for power users and developers who are comfortable with these tools.

https://ngerakines.leaflet.pub/3mbqvlt45ck27
Lexicon Garden
Lexicon Garden helps you browse, view, and understand ATProtocol Lexicons.

If you build on ATProtocol, you know how hard it can be to find lexicon schemas, make sense of their documentation, and locate useful example records. Maybe you noticed a new record type in a Bluesky post or came across an interesting NSID in a repository. After that, you have to search for the schema, learn about its fields, and see how it links to other lexicons. Until now, there hasn’t been a central place to browse, search, and learn about these schemas. That’s why I built Lexicon Garden.

Lexicon Garden helps you find lexicon schemas published across the ATmosphere. You can browse all indexed schemas and quickly get to the one you need. You no longer have to search through git repositories or scroll through old posts to find what you’re looking for.

As Lexicon Garden indexes schema definitions and their examples, it creates a browsable graph that shows how lexicons are connected. This helps you see which types reference others and how they all fit together.

Lexicon Garden also parses schema definitions to create clear, helpful documentation. Whether you are looking at records, objects, arrays, enums, or different XRPC functions, the generated docs make it easy to navigate and cross-reference types.

I’m especially excited about garden.lexicon.example records. Lexicon Garden lets developers share official examples that others can use as references. If you want to see how a schema works in practice, a real example is often more useful than just reading the spec. This feature is opt-in and community-driven, so it will grow as more people contribute.

Lexicon Garden is now live at https://lexicon.garden. If you publish your own lexicons, check out the help docs for advice on getting your schemas indexed and tips for writing good documentation.

https://ngerakines.leaflet.pub/3mbogpi62k22d
2025 in Concerts

Ben Folds (Oct 5)

Sammy Rae (Apr 16)

Palace (Mar 18)

https://ngerakines.leaflet.pub/3mamd2m57wk2x
Community Manager Pattern: Forums

In my last post, I talked about the community manager pattern. I focused on the basics including how membership attestations work, how the wrapper pattern handles content, and what the lifecycle of a book club community looks like.

This time, I want to show a use case that isn't possible yet, but would fit well with the community manager pattern: discussion forums.

The Problem Today

Forums have been around since the early days of the internet, but there's always been a basic conflict: users want to own their posts, while communities need to moderate content. Traditional forums solve this by owning everything. Your posts live on their servers and follow their rules. ATProtocol's data ownership model changes the picture.

If you wanted to build a forum on ATProtocol right now, you'd face an awkward choice. Either users post content to their own repositories (preserving ownership but making moderation nearly impossible), or some central identity owns all the content (enabling moderation but abandoning the protocol’s core value proposition).

Neither of these options is ideal. The community manager pattern offers a better solution.

How It Would Work

Imagine a local music forum at dayton-music-scene.com. The forum itself is an ATProtocol identity with its own DID and repository. It has a community manager service that handles membership and content.

Membership is simple but still meaningful. Most forums want easy signups, unlike a book club that may carefully curate its members. The community manager can set up automatic approval, so anyone who creates a membership record is attested right away, maybe with basic checks like account age or email verification. The attestation remains, so the community can revoke it later if needed.

Topics, Threads, and Wrappers

Let's look at how content moves through this model. There are three main types of records involved.

Topics belong to the community.

A topic is a category or channel for discussion, like "General", "Show Announcements", or "Gear Talk". These are stored in the community's repository because they organize the forum at the community level:

{
    "uri": "at://did:plc:dayton-music-scene/community.lexicon.discussion.topic/1",
    "cid": "bafyreib7phelijh23iktyzraqw54wi7flwthmdzryxscmtvb47cggcfey4",
    "value": {
        "$type": "community.lexicon.discussion.topic",
        "name": "General"
    }
}

Threads are owned by users.

When I want to start a discussion, I write the thread content to my own PDS. This is my data. I'm the author, and I control it:

{
    "uri": "at://did:plc:nick/community.lexicon.discussion.thread/1",
    "cid": "bafyreihg2rtnzae4wtclusodra5pe4luscjtt66ltwsgwq2zy4jhjhhg44",
    "value": {
        "$type": "community.lexicon.discussion.thread",
        "text": "Hey there bud, how'd it go last night?\nI saw you at the band stand looking pretty slammed. https://www.peachpitmusic.com/",
        "facets": [
            {
                "index": {
                    "byteStart": 92,
                    "byteEnd": 122
                },
                "features": [
                    {
                        "$type": "app.bsky.richtext.facet#link",
                        "uri": "https://www.peachpitmusic.com/"
                    }
                ]
            }
        ],
        "createdAt": "2025-12-22T16:07:06.094Z"
    }
}

Notice that this thread record doesn't mention the forum or topic. It's just content. The thread can exist on its own, be linked by different forums, or stay available even if I leave a community.

Wrapped threads connect user content to community structure.

When I create my thread and select the "General" topic, the community manager creates a wrapper record in the community's repository:

{
    "uri": "at://did:plc:dayton-music-scene/community.lexicon.discussion.wrappedThread/2",
    "cid": "bafyreiftdoabsi5vi7vhjjulllazad4thunhmxhi2pccqly6t3tkqmjlvy",
    "value": {
        "$type": "community.lexicon.discussion.wrappedThread",
        "topic": {
            "uri": "at://did:plc:dayton-music-scene/community.lexicon.discussion.topic/1",
            "cid": "bafyreib7phelijh23iktyzraqw54wi7flwthmdzryxscmtvb47cggcfey4"
        },
        "thread": {
            "uri": "at://did:plc:nick/community.lexicon.discussion.thread/1",
            "cid": "bafyreihg2rtnzae4wtclusodra5pe4luscjtt66ltwsgwq2zy4jhjhhg44"
        },
        "createdAt": "2025-12-22T16:07:06.094Z"
    }
}

The wrapper contains two strongRefs: one to the topic (community-owned) and one to my thread (user-owned). This is the join point. The forum AppView queries wrapped threads to build the discussion view, resolving the strongRefs to fetch the actual content.

Replies work the same way. Another user writes their reply to their own PDS, and the community manager creates a wrapper linking it to the thread.

Because a strongRef pins a specific CID, the community can detect when the content changes and decide what to do about it. It could show a visual indicator that the content has changed, flag it for review, or automatically update the wrappedThread to reference the new CID.
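Here is a minimal sketch of how an AppView might resolve a wrapper's strongRef and notice that the content changed. The function name, the fetch callback, and the in-memory store are assumptions for illustration; a real AppView would read records from the network or its own index.

```python
from typing import Callable, Optional, Tuple

# A fetcher returns {"cid": ..., "value": ...} for an AT-URI, or None if the
# record no longer exists. This callback is a hypothetical stand-in.
Fetcher = Callable[[str], Optional[dict]]


def resolve_strong_ref(ref: dict, fetch: Fetcher) -> Tuple[str, Optional[dict]]:
    """Classify a strongRef as 'current', 'changed', or 'missing'."""
    record = fetch(ref["uri"])
    if record is None:
        return "missing", None
    if record["cid"] != ref["cid"]:
        # The author edited the record after the wrapper was created.
        return "changed", record
    return "current", record


thread_uri = "at://did:plc:nick/community.lexicon.discussion.thread/1"
thread_ref = {
    "uri": thread_uri,
    "cid": "bafyreihg2rtnzae4wtclusodra5pe4luscjtt66ltwsgwq2zy4jhjhhg44",
}

# In-memory stand-in for the author's repository.
store = {thread_uri: {"cid": thread_ref["cid"], "value": {"text": "Hey there bud"}}}
print(resolve_strong_ref(thread_ref, store.get)[0])

# Simulate an edit: the record now has a different (placeholder) CID.
store[thread_uri] = {"cid": "placeholder-cid-after-edit", "value": {"text": "Edited"}}
print(resolve_strong_ref(thread_ref, store.get)[0])
```

What happens on "changed" is a policy decision for the community, so the sketch only reports the status and leaves the response to the caller.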

Moderation Without Ownership

This is where the pattern shines. Say someone posts spam or violates community guidelines. The moderators delete the wrapper record (the community's reference to the content). The original thread still exists in the user's repository, but it no longer appears in the forum.

From the user's perspective, their data is intact. They can take their posts to another forum, reference them elsewhere, or keep them as a personal archive. From the community's perspective, they have full moderation control over what appears in their space.

Tombstones and Deleted Content

The wrapper pattern has another benefit: it naturally supports tombstones for deleted content.

If I delete my thread record from my PDS, the community's wrappedThread still remains. The strongRef now points to missing content. Instead of showing an error, the forum AppView can display a tombstone, a placeholder that shows something was here but has been removed.

[Thread deleted by author]
3 replies ...

This keeps the discussion's structure. Replies to my thread still make sense, even if my original post is gone. Other members can see that a conversation took place, that people responded, and that the original author chose to remove their post.

The tombstone can also show metadata from the wrapper. The createdAt timestamp shows when the thread was posted, and the topic reference shows where it belonged. The tombstone could also display "[Deleted thread by @ngerakines.me]" referencing the authority (DID) of the thread strongRef uri.

This is different from how most forums handle deletion. Usually, when content is deleted, the forum decides what happens. Sometimes it's removed completely, sometimes it's replaced with "[deleted]", or sometimes only moderators can see it. The user doesn't get a choice.

With the wrapper pattern, users control when their content is deleted, but the community decides how that deletion appears in their space. I can remove my data from the network if I want to be forgotten. The community can keep the wrapper as a tombstone to keep the conversation's flow. This way, both sides are respected.

The community can also decide to remove orphaned wrappers. A background job could check for wrappers that point to missing content and delete them if the community doesn't want to show tombstones. The AppView has the flexibility to choose.
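Both choices can be sketched in a few lines. The function names and the fetch callback are assumptions for illustration, and the wrappers here are just the value of each wrappedThread record.

```python
from typing import Callable, List, Optional

Fetcher = Callable[[str], Optional[dict]]  # hypothetical record lookup


def authority(at_uri: str) -> str:
    """Return the authority (DID or handle) of an at:// URI."""
    return at_uri[len("at://"):].split("/", 1)[0]


def render_thread(wrapper: dict, fetch: Fetcher) -> str:
    """Show the thread text, or a tombstone built from wrapper metadata."""
    ref = wrapper["thread"]
    thread = fetch(ref["uri"])
    if thread is None:
        # Display form is up to the AppView; it could resolve the DID to a handle.
        return f"[Deleted thread by {authority(ref['uri'])}] (posted {wrapper['createdAt']})"
    return thread["value"]["text"]


def prune_orphans(wrappers: List[dict], fetch: Fetcher) -> List[dict]:
    """Background-job variant: keep only wrappers whose content still exists."""
    return [w for w in wrappers if fetch(w["thread"]["uri"]) is not None]


wrapper = {
    "thread": {"uri": "at://did:plc:nick/community.lexicon.discussion.thread/1"},
    "createdAt": "2025-12-22T16:07:06.094Z",
}
repo = {}  # the author has deleted the thread
print(render_thread(wrapper, repo.get))
print(len(prune_orphans([wrapper], repo.get)))
```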

User-Defined Moderation with Labelers

Community moderation is only half the story. ATProtocol's labeler services give individual users control over what they see, and this should carry through to forum content.

A labeler is a service that adds labels to content or accounts. Users subscribe to labelers they trust, and their client or AppView uses those labels to filter, warn, or hide content. This is user-defined moderation, so you choose whose judgment to follow.

The wrapper pattern preserves this capability because the original record's authority is maintained. When a forum AppView displays a thread, it knows both the wrapper (owned by the community) and the underlying thread (owned by the author). Labels apply to the original content, not the wrapper.

Here’s how this works in practice. Suppose did:plc:loser subscribes to a "USA-ONLY" labeler that flags content about non-American bands. My thread quotes lyrics by Peach Pit, a Canadian band. The labeler adds a !hide label to my thread record.

When did:plc:loser views the forum, the AppView queries the labeler with my DID and the thread's AT-URI:

at://did:plc:nick/community.lexicon.discussion.thread/1

The labeler returns the !hide label. The AppView follows this, so did:plc:loser doesn't see my thread, even though the community hasn't moderated it and other members can still see it.

This works because the AppView queries against the original content, not the wrapper. The wrappedThread record points to my thread via strongRef, so the AppView knows exactly which record to check. If the AppView queried against its own wrapper record, user-subscribed labelers would be useless - they'd have no way to label community-specific wrapper records across every forum on the network.

The same idea works for account-level labels. If a labeler flags my whole account, that label applies to my threads no matter which community they show up in. The wrapper doesn't hide my reputation.
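A minimal sketch of that lookup, assuming the AppView has already collected labels from the viewer's subscribed labelers into a dictionary keyed by subject (an AT-URI or a DID). The function names and the dictionary shape are assumptions for illustration.

```python
from typing import Dict, FrozenSet, List, Set


def authority(at_uri: str) -> str:
    """Return the authority (DID or handle) of an at:// URI."""
    return at_uri[len("at://"):].split("/", 1)[0]


def visible_wrappers(
    wrappers: List[dict],
    labels_by_subject: Dict[str, Set[str]],
    hidden: FrozenSet[str] = frozenset({"!hide"}),
) -> List[dict]:
    """Filter wrapped threads using labels on the ORIGINAL content."""
    visible = []
    for wrapper in wrappers:
        thread_uri = wrapper["thread"]["uri"]
        # Check the record itself and its author's account, never the wrapper.
        labels = set()
        for subject in (thread_uri, authority(thread_uri)):
            labels |= labels_by_subject.get(subject, set())
        if labels.isdisjoint(hidden):
            visible.append(wrapper)
    return visible


wrappers = [
    {"thread": {"uri": "at://did:plc:nick/community.lexicon.discussion.thread/1"}},
    {"thread": {"uri": "at://did:plc:brittany/community.lexicon.discussion.thread/7"}},
]
labels = {"at://did:plc:nick/community.lexicon.discussion.thread/1": {"!hide"}}
print([w["thread"]["uri"] for w in visible_wrappers(wrappers, labels)])
```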

This layered approach gives forums the best of both worlds. Community moderators set the standards, and individual users can add their own filters. Neither layer cancels out the other.

Scaling Considerations

Forums can get large. The wrapper pattern means the community repository grows with every topic and reply. A busy forum might have millions of wrapper records.

But this isn't a problem. The wrappers are small, just strongRefs with a little metadata. The main content, like post bodies and images, stays in user repositories across the network. The community repository is really just an index, which is what it's meant to be.

Closing Out

Right now, this use case isn't possible because there’s no standard way for one identity to give permissions to others, or for a service to write to a repository for someone else. The community manager pattern, along with the opensocial.community lexicons, provides the needed infrastructure.

If this is interesting to you, reach out with your questions and feedback. Also, check out some of the recent related posts:

https://ngerakines.leaflet.pub/3malqm3dqls27
The Community Manager Pattern
This post looks more closely at Brittany Ellich's work on representing groups in ATProtocol. It builds on earlier conversations and explores how these ideas might work in practice.

@brittanyellich.com recently shared a post about her work on “opensocial.community” and the challenges of representing groups in ATProtocol. She focuses on the infrastructure layer, like group management, authentication, and storing shared resources. I want to add to her ideas by looking at the "community manager" pattern. This pattern lets communities act as full identities while keeping the data ownership that makes ATProtocol unique.

The Core Problem

Before looking at solutions, it's worth remembering a basic rule: in ATProtocol, each identity can only write records to its own repository. This rule is central to the protocol and ensures real data ownership and portability.

In the diagram, @ngerakines.me can add a book review to their own repository on https://pds.cauda.cloud, but not directly to @dayton-readers.club's repository. This setup protects data ownership, but it also raises a key question: how can communities work if members can't add content directly to shared spaces?

The IETF 124 Discussion

At IETF 124 in Montreal earlier this year, @bmann.ca organized a multi-day ATProtocol meetup. We had a good group, including representation from Bluesky Social PBC, Stream Place, Leaflet, Cosmik Network, Germ, me, and several others in the community. One topic we discussed was how shared-identity and shared-data public communities could work in ATProtocol.

This conversation built on earlier talks I had with @tom.sherman.is while he was working through communities at https://frontpage.fyi. We kept coming back to an idea: communities don’t need to store member content themselves. They just need to reference it.

More at https://atprotocommunity.leaflet.pub/3m5pejic4fk2p

Communities as Identities

We treat communities as full ATProtocol identities. For example, @dayton-readers.club has its own DID and repository (PDS), just like any other identity. The main difference is what gets stored in that repository.

A community’s repository serves three distinct purposes:

1. Community Profile and Properties

The community has its own profile records in the format needed for the AppViews it uses. If a book club wants to appear in Bluesky, it would have an app.bsky.actor.profile record. If it's mainly a book-focused community, it might have records showing which books are being read, meeting schedules, and community guidelines.

2. Membership Attestations

This is where things get interesting. Individual members maintain their own "membership card" records in their own repositories. The community then stores attestations that acknowledge these memberships.

Let's look at a concrete example. If I want to join @dayton-readers.club, I first create a membership record in my own repository:

{
    "uri": "at://ngerakines.me/community.opensocial.membership/1",
    "cid": "bafyreif3e4fsrxox4vskuemnbmij2hb6e3pxxoujwd6jcncey7iw7dfrzi",
    "value": {
        "$type": "community.opensocial.membership",
        "community": "did:plc:dayton-readers-club",
        "role": "member",
        "since": "2025-12-21T14:41:02.431Z"
    }
}

This record lives in my PDS, and I own it. It shows my intent to join the community, my role, and when I joined. But by itself, it doesn't prove the community has accepted me. Anyone could write a membership record and claim to be part of any community.

The community completes the handshake by creating a membership proof record in its repository:

{
    "uri": "at://dayton-readers.club/community.opensocial.membershipProof/1",
    "cid": "bafyreigbhl46xyeabrdrxzlb3weio2xmdkszrgey44p7anrwaqndaczrse",
    "value": {
        "$type": "community.opensocial.membershipProof",
        "cid": "bafyreigdshj27haq2qvwoogpzlniehyzreusaxcgjmhmkksm6witmkobxa"
    },
    "signatureRecord": {
        "$type": "community.opensocial.membership",
        "community": "did:plc:dayton-readers-club",
        "role": "member",
        "since": "2025-12-21T14:41:02.431Z",
        "$sig": {
            "$type": "community.opensocial.membershipProof",
            "repository": "ngerakines.me"
        }
    }
}

The membership proof contains only a CID, which is a SHA-256 hash of the DAG-CBOR encoded membership record. The signatureRecord shown here isn't actually stored in the record, just as the uri and cid fields are metadata from the getRecord XRPC method. I included it to show that the value.cid field is computed from my membership record content plus a signature block that identifies the source repository.

This design has several important properties:

Privacy through indirection: If you look only at the community's repository, you'll see a list of CIDs. You can't tell who the members are unless you also fetch the matching membership records from each member's repository. This makes it harder for someone to casually list all members.

User-controlled revocation: If I want to leave the community, I just delete my membership record. The community's proof will then point to content that no longer exists. I don't need anyone's permission; I simply remove my side of the relationship. Users have the right to be forgotten and this model supports that.

Community-controlled acceptance: The community decides which memberships to confirm. Just creating a membership record doesn't make you a member; the community also needs to create the matching proof.

Verifiable without trust: Anyone can check a membership by making sure the CID in the community's proof matches the hash of the user's membership record. No trusted third party is needed.
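To make the verification step concrete, here is a minimal sketch that recomputes the CID and compares it with the proof. It follows the shape of the signatureRecord example above; the exact inputs are defined by the attestation spec, so treat the verify_membership helper as an assumption. The tiny encoder handles only the value types needed here and is not a full DAG-CBOR implementation; a real service should use a maintained DAG-CBOR and CID library.

```python
import base64
import hashlib


def _head(major: int, n: int) -> bytes:
    """Encode a CBOR major type and argument in the shortest form."""
    if n < 24:
        return bytes([(major << 5) | n])
    for extra, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([(major << 5) | extra]) + n.to_bytes(size, "big")
    raise ValueError("integer too large")


def dag_cbor(value) -> bytes:
    """Encode None, bool, int, str, list, and dict (string keys) only."""
    if value is None:
        return b"\xf6"
    if isinstance(value, bool):  # must come before the int check
        return b"\xf5" if value else b"\xf4"
    if isinstance(value, int):
        return _head(0, value) if value >= 0 else _head(1, -1 - value)
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return _head(3, len(raw)) + raw
    if isinstance(value, list):
        return _head(4, len(value)) + b"".join(dag_cbor(v) for v in value)
    if isinstance(value, dict):
        # DAG-CBOR sorts map keys by encoded length, then bytewise.
        items = sorted(value.items(), key=lambda kv: (len(kv[0].encode()), kv[0].encode()))
        return _head(5, len(items)) + b"".join(dag_cbor(k) + dag_cbor(v) for k, v in items)
    raise TypeError(f"unsupported type: {type(value).__name__}")


def cid_for(record: dict) -> str:
    """CIDv1 with the dag-cbor codec and a sha2-256 multihash, in base32."""
    digest = hashlib.sha256(dag_cbor(record)).digest()
    raw = bytes([0x01, 0x71, 0x12, 0x20]) + digest
    return "b" + base64.b32encode(raw).decode("ascii").lower().rstrip("=")


def verify_membership(membership: dict, repository: str, proof_value: dict) -> bool:
    """Rebuild the signed form of the membership and compare CIDs."""
    signed = dict(membership)
    signed["$sig"] = {"$type": proof_value["$type"], "repository": repository}
    return cid_for(signed) == proof_value["cid"]


membership = {
    "$type": "community.opensocial.membership",
    "community": "did:plc:dayton-readers-club",
    "role": "member",
    "since": "2025-12-21T14:41:02.431Z",
}
sig = {"$type": "community.opensocial.membershipProof", "repository": "ngerakines.me"}
proof = {"$type": sig["$type"], "cid": cid_for({**membership, "$sig": sig})}

print(verify_membership(membership, "ngerakines.me", proof))
print(verify_membership({**membership, "role": "admin"}, "ngerakines.me", proof))
```

Because the repository is part of the hashed content, copying the membership record into a different repository produces a different CID, and the proof no longer matches.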

3. Community Data and Wrapped Content

This is the AppView-specific data that makes each community unique. For example, a book club might post about reading selections or meeting events, while a technical community might have wiki pages and member-submitted content.

The Wrapper Pattern

ATProtocol's philosophy is that users should own their data. But communities also need some control over content that appears for or with the community. The wrapper pattern solves this by using ATProtocol's strongRef primitive.

When a user wants to post a book review to their community, there are two steps:

Step 1: The user writes their content to their own PDS

{
    "uri": "at://ngerakines.me/app.book-clubs.bookReview/1",
    "cid": "bafyreias5wkc5ln337adu6yrefvwoidn4ueajnpulvaaqat72jiufew234",
    "value": {
        "$type": "app.book-clubs.bookReview",
        "title": "1984",
        "text": "Kind of dark."
    }
}

The user owns this record. It lives in their repository, on their PDS. They are the author and can delete or change it whenever they want.

Step 2: The community creates a wrapper record

{
    "uri": "at://dayton-readers.club/app.book-clubs.communityBookReview/1",
    "cid": "bafyreiajawyv6dyx4qinlttposk4eyeldtufy43fodla7qhgokp5rm7xii",
    "value": {
        "$type": "app.book-clubs.communityBookReview",
        "categories": [
            "classics"
        ],
        "series": "2025-12-01 2025-12-31",
        "review": {
            "uri": "at://ngerakines.me/app.book-clubs.bookReview/1",
            "cid": "bafyreias5wkc5ln337adu6yrefvwoidn4ueajnpulvaaqat72jiufew234"
        }
    }
}

The wrapper lives in the community's repository and contains a strongRef pointing to the user’s original content. It can also include community-specific metadata, like categorization and which reading period the review belongs to.
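The two steps can be sketched as a small helper that builds the wrapper's value from the user's record. The helper names are assumptions for illustration, and the second community's categories are made up to show cross-posting.

```python
from typing import List


def strong_ref(record: dict) -> dict:
    """A strongRef pins both the location (uri) and the content (cid)."""
    return {"uri": record["uri"], "cid": record["cid"]}


def wrap_review(review: dict, categories: List[str], series: str) -> dict:
    """Build the value of a community-owned wrapper around a user's review."""
    return {
        "$type": "app.book-clubs.communityBookReview",
        "categories": list(categories),
        "series": series,
        "review": strong_ref(review),
    }


review = {
    "uri": "at://ngerakines.me/app.book-clubs.bookReview/1",
    "cid": "bafyreias5wkc5ln337adu6yrefvwoidn4ueajnpulvaaqat72jiufew234",
    "value": {"$type": "app.book-clubs.bookReview", "title": "1984", "text": "Kind of dark."},
}

# Each community writes its own wrapper; the review itself is written once.
dayton = wrap_review(review, ["classics"], "2025-12-01 2025-12-31")
other = wrap_review(review, ["dystopia"], "2026-01-01 2026-01-31")
print(dayton["review"] == other["review"])
```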

Separating Concerns

This pattern achieves an important goal: it separates authorship from distribution.

The user is clearly the author of their book review, and the cryptographic chain of custody is easy to follow. But the community decides if the review appears in the community setting.

Consider the moderation implications:

  • Community removes offensive content: The community deletes the wrapper record. The review is no longer visible in the community's AppView, but the user's original content stays in their repository. The user's data is not removed.

  • User decides to leave: The user can delete their original content. The community’s wrapper now points to a record that no longer exists. AppViews can handle this by showing the wrapper without the content or hiding it completely.

  • Cross-posting: A user can have their review show up in several communities. Each community creates its own wrapper that points to the same original content. The user writes the review once and it appears in different places.

This approach offers the best of both worlds. Communities get the moderation tools they need to keep spaces healthy, and users keep full ownership of their content. This is the ATProtocol Ethos.

The Community Manager Service

So, who creates these wrapper records and membership proofs? This is where a "community manager" service comes in. It connects Brittany's "opensocial.community" work with this pattern.

A community manager is an ATProtocol service that provides XRPC methods to manage community access, permissions, and data. In this example, the did:web:opensocial.community service identity provides several important methods:

  • community.opensocial.register - Registers a community with the community manager service. Registration involves some setup steps we’ll walk through below.

  • community.opensocial.setAppPassword - Sets the app-password used for community PDS actions.

  • community.opensocial.acceptMember - Takes an AT-URI and CID of a membership record, creates a membership proof record in the community's PDS, and returns a strong ref to be used as a remote attestation in the membership record.

  • com.atproto.repo.putRecord - Writes a record to the community's PDS on behalf of authorized members.

This could be a centralized service like "opensocial.community" or communities can self-host if they want full control. The pattern works in both cases because the data structures and XRPC interfaces stay the same. Only the way they are run changes.

Let's walk through the complete lifecycle using @dayton-readers.club as an example, with Brittany and Nick.

Step 0: Creating the Community Identity

Brittany (did:plc:brittany) wants to create the "dayton-readers.club" community. She starts by creating a DID for the community (did:plc:dayton-readers-club) and setting up its handle. This is standard ATProtocol identity setup. There's nothing community-specific yet.

Step 1: Registering with the Community Manager

Brittany navigates to the community manager implementation at "opensocial.community" and authenticates via OAuth as the @dayton-readers.club identity. She submits the "manage this identity as a community" form, specifying did:plc:brittany as the initial admin.

This kicks off the registration process and creates several follow-up actions:

  • Admin invitation: The service creates an invitation for Brittany to accept. This is necessary because Brittany will need an attested community.opensocial.membership record to manage the community through the "opensocial.community" AppView.

  • App-password prompt: Brittany enters an app-password that the community manager will use for future PDS write operations for members.

  • DID document update: The service tells Brittany to add a "CommunityManager" service entry to the @dayton-readers.club DID document. This is how other services know which community manager is the main one for this community.

Step 2: Completing Admin Setup

Brittany adds the app-password to the community manager, logs out, then logs back in as did:plc:brittany to accept the admin invitation. Accepting writes the membership record to her own PDS.

At this point, two records exist:

  • at://did:plc:brittany/community.opensocial.membership/1

  • at://did:plc:dayton-readers-club/community.opensocial.membershipProof/1

Step 3: Creating the Book Club

Brittany navigates to the book.club AppView at https://book.club/ and authenticates as did:plc:brittany. She creates a book club for the community.

The book.club AppView is community-manager-aware. It checks the @dayton-readers.club DID document, finds the "CommunityManager" service, and makes sure Brittany has a membership record with {"role": "admin"} that's been attested by the community.
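The discovery step might look something like this. The "CommunityManager" service type comes from the registration step above; the "#community_manager" fragment id and the endpoint URL are assumptions, since the final DID document shape isn't settled.

```python
from typing import Optional


def find_community_manager(did_document: dict) -> Optional[str]:
    """Return the endpoint of the first CommunityManager service, if any."""
    for service in did_document.get("service", []):
        if service.get("type") == "CommunityManager":
            return service.get("serviceEndpoint")
    return None


did_document = {
    "id": "did:plc:dayton-readers-club",
    "service": [
        {
            "id": "#atproto_pds",
            "type": "AtprotoPersonalDataServer",
            "serviceEndpoint": "https://pds.cauda.cloud",
        },
        {
            "id": "#community_manager",  # hypothetical fragment id
            "type": "CommunityManager",
            "serviceEndpoint": "https://opensocial.community",
        },
    ],
}
print(find_community_manager(did_document))
# → https://opensocial.community
```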

Once permissions are checked, Brittany uses the book.club AppView to write the community's profile. Behind the scenes, book.club sends a service-authenticated request (using inter-service JWT authentication) to the community manager's com.atproto.repo.putRecord endpoint. This creates the at://did:plc:dayton-readers-club/club.book.profile/self record.

When the community manager gets this call, it checks that the DID in the service-auth token (did:plc:brittany) has permission to write the record. The exact permission rules depend on the implementation, but for now, assume the caller needs a valid attested membership record.
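A minimal sketch of that check, under the stated assumption that a valid attested membership is enough. The function name and the is_attested callback are hypothetical; the callback stands in for the CID comparison against the community's proof records.

```python
from typing import Callable, Iterable, Optional


def may_write(
    memberships: Iterable[dict],
    community_did: str,
    is_attested: Callable[[dict], bool],
    required_role: Optional[str] = None,
) -> bool:
    """Check the caller's membership records for one the community attested.

    `memberships` are records fetched from the repository of the DID named
    in the service-auth token.
    """
    for record in memberships:
        value = record["value"]
        if value.get("community") != community_did:
            continue
        if required_role is not None and value.get("role") != required_role:
            continue
        if is_attested(record):
            return True
    return False


brittany_memberships = [
    {
        "uri": "at://did:plc:brittany/community.opensocial.membership/1",
        "value": {"community": "did:plc:dayton-readers-club", "role": "admin"},
    }
]
# Stand-in for verifying each record against the community's proof records.
attested_uris = {"at://did:plc:brittany/community.opensocial.membership/1"}


def is_attested(record: dict) -> bool:
    return record["uri"] in attested_uris


print(may_write(brittany_memberships, "did:plc:dayton-readers-club", is_attested, "admin"))
```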

Three records now exist:

  • at://did:plc:brittany/community.opensocial.membership/1

  • at://did:plc:dayton-readers-club/community.opensocial.membershipProof/1

  • at://did:plc:dayton-readers-club/club.book.profile/self

Step 4: Inviting Members

Brittany loves reading, but the point of a book club is to read with others. She invites Nick to join through the book.club AppView.

Nick authenticates as did:plc:nick, goes to the community profile, and clicks "Join". This writes a membership record to Nick's PDS, but without any attestation yet.

Brittany gets a notification that Nick's membership is pending. She logs into book.club and approves it. The approval flow has book.club make a service-auth call to community.opensocial.acceptMember, which writes a proof record to the community’s PDS and returns a signature for updating Nick's membership record.

Five records now exist:

  • at://did:plc:brittany/community.opensocial.membership/1

  • at://did:plc:dayton-readers-club/community.opensocial.membershipProof/1

  • at://did:plc:dayton-readers-club/club.book.profile/self

  • at://did:plc:nick/community.opensocial.membership/1

  • at://did:plc:dayton-readers-club/community.opensocial.membershipProof/2

Step 5: Contributing Content

Fast forward: the community has posted an app.bsky.feed.post promoting the December book selection, "You Deserve a Tech Union". Nick has read it and wants to share a review.

Nick uses the book.club AppView to write his review. The review record goes to his PDS. While writing, Nick indicates he wants the review to be part of @dayton-readers.club.

When the review is published, book.club makes another service-auth call to the community manager's com.atproto.repo.putRecord endpoint. This creates a wrapper record in the community's repository that references Nick's review. Nick has permission because he's an attested member.

Eight records now exist:

  • at://did:plc:brittany/community.opensocial.membership/1

  • at://did:plc:dayton-readers-club/community.opensocial.membershipProof/1

  • at://did:plc:dayton-readers-club/club.book.profile/self

  • at://did:plc:nick/community.opensocial.membership/1

  • at://did:plc:dayton-readers-club/community.opensocial.membershipProof/2

  • at://did:plc:dayton-readers-club/app.bsky.feed.post/1

  • at://did:plc:nick/club.book.review/1

  • at://did:plc:dayton-readers-club/club.book.communityReview/1

Nick owns his review. The community owns the wrapper. If the review violates community guidelines, moderators can delete the wrapper without touching Nick's original content. If Nick leaves the community, his review stays in his repository. Only the community's reference to it disappears.

What's Next

I'm excited to see where Brittany's work on opensocial.community goes. I think these patterns will keep developing as more people build community-focused applications on ATProtocol.

The types, record structures, and lexicon methods aren't finalized and will likely change as the user experience is built out. There is no one-size-fits-all solution, but I believe this brings us closer to a flexible approach where communities and their members have power and control over their identities and data.

https://ngerakines.leaflet.pub/3majmrpjrd22b
Sign In With Your Internet Handle

Authentication comes with difficult UX constraints, and the language we use really matters.

Dan Abramov launched https://internethandle.org/ in late November and has been promoting the use of "Internet Handle" as a standard term for authentication in the ATProtocol ecosystem.

I'm open to the idea, but I don't think "handle" will catch on quickly since users might need time to get used to it. Still, it's better than options like "bluesky account", "at-handle", "DID" or "PDS". "Bluesky account" is probably the easiest for users to understand, but we shouldn't mix up identity with the Bluesky Social AppView.

Today, Tom Sherman posted screenshots of new terminology for https://frontpage.fyi/ to gather community feedback:

His post inspired me to try a similar experiment on Smoke Signal, using the same design and terms:

This is also a good time to document what I believe to be the standard all AppViews and applications should support for authentication inputs to make user authentication consistent across the ecosystem:

  • Login forms should accept four types of inputs: handles, DIDs, AT-URIs, and URLs. For example, these are all valid: "did:plc:cbkjy5n7bk3ax2wplmtjofq2", "did:web:atwork.place", "ngerakines.me", and "https://pds.cauda.cloud/". Any of these should start the authentication process.

  • A DID input can include the "at://" prefix. The specification says that a protocol prefix plus an authority makes a valid AT-URI, like "at://did:plc:cbkjy5n7bk3ax2wplmtjofq2".

  • A handle input can include the "@" or "at://" prefix. Like with DIDs, an AT-URI authority can also be a valid handle, such as "at://ngerakines.me".

  • If someone enters a partial handle as one alphanumeric string with dashes, it makes sense to assume the full handle ends with ".bsky.social".

  • An HTTPS URL is also a valid input and should be treated as an authorization server or protected resource. For example, it’s valid to authenticate against a PDS like "https://pds.cauda.cloud/".

  • Some users might be confused by "internet handle" and enter their handle as a URL. If, for example, "https://ngerakines.me" isn’t valid as an authorization server or protected resource, the system should check whether the hostname can be resolved to a DID as a handle.
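The rules above can be sketched as a small normalizer that runs before any network lookups. The function names are assumptions for illustration, and the sketch only classifies the input; resolving handles, DIDs, and authorization servers is left to the caller.

```python
import re
from typing import Optional, Tuple
from urllib.parse import urlsplit

# One alphanumeric string with dashes and no dots: treat as a partial handle.
_PARTIAL_HANDLE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9-]*$")


def classify_login_input(raw: str) -> Tuple[str, str]:
    """Return (kind, normalized) where kind is 'url', 'did', or 'handle'."""
    value = raw.strip()
    if not value:
        raise ValueError("empty login input")
    if value.lower().startswith("https://"):
        return "url", value
    if value.startswith("at://"):
        # Keep only the authority of the AT-URI.
        value = value[len("at://"):].split("/", 1)[0]
    if value.startswith("@"):
        value = value[1:]
    if value.startswith("did:"):
        return "did", value
    if _PARTIAL_HANDLE.match(value):
        return "handle", f"{value}.bsky.social".lower()
    return "handle", value.lower()


def handle_from_url(url: str) -> Optional[str]:
    """Fallback: if the URL isn't an auth server, try its hostname as a handle."""
    return urlsplit(url).hostname


for example in (
    "did:plc:cbkjy5n7bk3ax2wplmtjofq2",
    "at://did:plc:cbkjy5n7bk3ax2wplmtjofq2",
    "@ngerakines.me",
    "at://ngerakines.me",
    "ngerakines",
    "https://pds.cauda.cloud/",
):
    print(example, "->", classify_login_input(example))
```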

https://ngerakines.leaflet.pub/3ma7hed2kdk2x
Why Inter-Service Auth Needs Client Identity
ATProtocol's inter-service authentication currently has no way to identify which client is making a request on behalf of a user, forcing services to rely on forgeable headers or clunky workarounds to establish trust relationships. Adding an optional client_id claim to inter-service JWTs would solve this cleanly, enabling service-to-service trust, rate limiting, and feature flags using the cryptographic infrastructure we already have.

Update: Not all posts are winners. This got some really good feedback. See the addendum.

There's a gap in how ATProtocol handles inter-service authentication, and it's one that becomes increasingly important as the network grows: when an AppView makes a request to another AppView on behalf of a user, the receiving service has no way to identify which client is making that request.

Let me explain why this matters.

The Current State

Right now, inter-service authentication lets AppViewA call AppViewB on behalf of a user. The JWT proves that the user authorized the action, but it doesn't identify AppViewA as the caller. From AppViewB's perspective, it's just "some authorized client acting for this user."

For a small network, this works fine. But as the ecosystem matures, we're going to need more nuance.

Why Client Identity Matters

AppView access controls. With inter-service auth, a request can come in for an identity that a service has no prior relationship with, but the client making the request is a known bad actor on the network. Service owners need the ability to block AppViews that are directly or indirectly causing harm—whether to the service itself, its users, or the broader network.

Consider a malicious AppView designed to organize brigades and harassment campaigns. Beyond scraping public network data, it makes inter-service calls to a feed generator to identify top posters on LGBTQ-focused feeds, then targets them for harassment. If the feed service can see the client_id in the request, it can block that AppView entirely—even when the request comes with a valid user token.

User data boundaries. AppViews are increasingly becoming sources of protocol-adjacent information that doesn't live in a user's repo. Users may want to share certain data with other users while maintaining control over which services can access it.

Imagine a location service that provides supplemental information for events and posts—sometimes semi-private or sensitive. A user enters a street address for a party through a calendar AppView and wants only accepted attendees to see it. But they go a step further: they only trust that specific calendar AppView to properly implement caching and access controls. They add a rule that denies other AppViews access to their sensitive locations, even if those AppViews are making requests on behalf of valid, authorized identities. Without client_id in the inter-service JWT, this kind of granular control isn't possible.

Business relationships between services. Imagine AppViewA and AppViewB have struck a deal: maybe higher rate limits, access to beta features, or special integrations. Without client identification in the JWT, AppViewB has no reliable way to recognize AppViewA and apply those terms. You'd have to fall back to IP allowlists or API keys layered on top of the existing auth—ugly workarounds that defeat the elegance of the inter-service auth system.

Why not just use headers? You could argue that clients should just self-identify with a custom header. But why introduce values that can be trivially forged? Anyone can slap an X-Client-Id: TrustedApp header on a request. We already have the infrastructure for cryptographically signed authorization—that's the whole point of JWTs. The client identity should live there, where it can be trusted.

Why OAuth Scopes Aren't Enough

You might think this is already solved by OAuth scopes. After all, users authorize specific scopes when they connect an app, so can't they just limit what services can do on their behalf?

The problem is that scopes define what actions are permitted, but they don't address who gets to take those actions. A receiving service has no way to distinguish between two different clients making the same authorized request. Both might have valid tokens with the correct scopes, but the receiving service might have very different relationships with each of them.

Client identity in inter-service JWTs gives us a foundation to build service-to-service trust without requiring additional authentication mechanisms bolted on top.

The Fix

The solution is straightforward: include the client ID in inter-service JWTs. The receiving service can then make decisions based on who's calling, not just who the user is.

Here's what an inter-service JWT payload might look like with client identity included:

{
  "iss": "did:plc:abc123xyz",
  "aud": "did:web:calendar.example.com#calendar",
  "lxm": "community.lexicon.calendar.getEvents",
  "client_id": "https://smokesignal.events/oauth-client-metadata.json",
  "iat": 1733011200,
  "exp": 1733011260
}

The client_id claim identifies the calling service, so in this case Smoke Signal is requesting calendar events on behalf of a user. The receiving calendar service can now make informed decisions: apply rate limits, check for business agreements, or respect user preferences about which clients can access their data through inter-service calls.

A note on compatibility: the client_id field must be optional. Not all inter-service authentication originates from OAuth sessions. When a user authenticates via an app password or their real password, there's no originating client to identify because the user is effectively acting directly. In these cases, the client_id claim would simply be omitted. This keeps the change backwards compatible and avoids breaking existing authentication flows that don't have client context to pass along.
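To make the receiving side concrete, here is a rough sketch of how a service might turn already-verified claims into a policy decision. Everything here is an assumption for illustration: the decide function, the two policy sets, and the brigade.example client do not come from any spec or library, and signature and expiry checks are assumed to have happened before this point.

```python
from typing import Optional

# Hypothetical policy tables a receiving service might maintain.
BLOCKED_CLIENTS = {"https://brigade.example/oauth-client-metadata.json"}
PARTNER_CLIENTS = {"https://smokesignal.events/oauth-client-metadata.json"}


def decide(claims: dict) -> str:
    """Map verified inter-service JWT claims to "deny", "partner", or "default".

    Assumes the JWT signature, audience, and expiry were validated earlier.
    """
    client_id: Optional[str] = claims.get("client_id")
    if client_id is None:
        # App-password or direct sessions carry no client to identify.
        return "default"
    if client_id in BLOCKED_CLIENTS:
        return "deny"
    if client_id in PARTNER_CLIENTS:
        return "partner"
    return "default"


claims = {
    "iss": "did:plc:abc123xyz",
    "aud": "did:web:calendar.example.com#calendar",
    "lxm": "community.lexicon.calendar.getEvents",
    "client_id": "https://smokesignal.events/oauth-client-metadata.json",
}
print(decide(claims))
```

Because a missing claim falls through to the default tier, a service that adopts this keeps working with tokens that never had client context in the first place.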

This opens up a whole category of features around service-to-service trust, rate limiting, feature flags, and AppView boundaries. It's the kind of foundational change that makes the network more flexible without adding complexity for services that don't need it; they can simply ignore the client ID if it's not relevant to them.


Update!

After a really good discussion with @thisismissem.social, there's a gap and potential for abuse that is worth mentioning. Were this idea to ever evolve into an implementation, it would be important to consider.

https://ngerakines.leaflet.pub/3m6xaxk64tk2h
Don't Feed the Trolls: Why This Old Internet Wisdom Still Matters
"Don't feed the trolls" emerged as folk wisdom in 1990s Usenet culture and became internet gospel, grounded in solid psychological research showing that trolls seek negative attention and ignoring them removes their reward.
Show full content

The phrase "don't feed the trolls" originated on Usenet in the early-to-mid 1990s and is documented in the 1996 edition of Eric S. Raymond's Jargon File, which noted: "one not infrequently sees the warning 'Do not feed the troll' as part of a followup to troll postings." The birthplace was alt.folklore.urban (AFU), where "trolling for newbies" served as community bonding: veterans would post overdone topics to identify newcomers who'd respond earnestly. Interestingly, David Mikkelson, one of AFU's most notorious trollers, went on to create Snopes.com in 1994-1995, even naming a section "The Repository of Lost Legends" with the acronym T.R.O.L.L. as homage to this early culture. By the late 1990s, the meaning had darkened considerably and the term broadened to include anyone deliberately harassing or provoking other users for attention.

troll

Some people claim that the troll (sense 1) is properly a narrower category than flame bait, that a troll is categorized by containing some assertion that is wrong but not overtly controversial. See also Troll-O-Meter.

So what does "don't feed the trolls" actually mean in practice? At its core, it means cutting trolls off from the thing they want most: your attention. When someone posts inflammatory content, makes personal attacks, or tries to derail conversations with bad-faith arguments, they're fishing for reactions. Every angry reply, every quote calling them out, every screenshot shared in outrage is food for the troll. Research by Federation University Australia shows that trolls derive pleasure specifically from "negative social potency", aka knowing that others are annoyed. The more negative social impact they have, the more their behavior is reinforced.

See also: https://pubmed.ncbi.nlm.nih.gov/37043467/

This is where the wisdom of restraint comes in. Whatever witty comeback or devastating quip you have prepared, it isn't worth it and only feeds the trolls. I know it's tempting, but engaging gives them exactly what they're seeking. Studies analyzing 40 million posts across 18 months found that "anti-social behavior is exacerbated when the community feedback is overly harsh." Each angry response feeds the cycle and encourages more trolling. Breaking this reinforcement loop through non-engagement diminishes their motivation to continue.

Don't feed the trolls means not giving them a platform. When you quote a troll, you're amplifying their message to your entire following. Research from the Center for Countering Digital Hate showed that when public figures engaged with trolls, small networks of approximately 100 accounts gained access to audiences of millions through the target's response. Instead of engaging, use your platform to share positive content, support others, and have meaningful conversations.

Don't feed the trolls means using ignore, mute, and block features liberally and without guilt. These aren't signs of weakness; they're tools for maintaining your digital wellbeing. Modern platforms have given us powerful features: keyword muting to filter triggering content, blocking to prevent direct contact, restricting to limit interactions without full blocking, and reporting for serious violations.

Some examples: Reddit's successful communities don't just ignore trolls—moderators swiftly remove their content and ban repeat offenders. Wikipedia uses a "Revert, Block, Ignore" strategy that works because swift, silent removal convinces vandals their efforts are pointless and boring.

Most importantly, don't feed the trolls means setting appropriate boundaries and then, critically, enforcing them. Successful online communities have clear standards that explicitly prohibit trolling with specified consequences. You can do the same with your personal accounts: decide what behavior you won't tolerate, communicate those boundaries clearly (in your bio, pinned posts, or community guidelines), and consistently enforce them without engaging in debates about your decisions. You don't owe anyone an explanation, and sometimes providing one does more harm than good, because it is food for the troll.

The beauty of "don't feed the trolls" is that it's both a personal practice and a collective action. When we all commit to not amplifying toxic behavior, we create online spaces where meaningful conversation can thrive. Every time you choose not to engage with a troll, you're not just protecting your own peace of mind, you're helping starve out the attention economy.

So the next time you see that inflammatory post or receive that provocative reply, remember: your silence is more powerful than any comeback. Don't feed the trolls. Block, mute, report if needed, and move on to conversations that actually matter. The internet becomes a better place not through winning arguments with trolls, but through building communities where their tactics simply don't work.

https://ngerakines.leaflet.pub/3m5mv6f2tks2h
ATProtocol Attestations: Cryptographic Signatures for the Decentralized Web
This post introduces the formal ATProtocol attestation specification, a framework for adding cryptographic signatures to ATProto records through two complementary patterns: inline attestations that embed signatures directly in records, and remote attestations that store proof in separate repository records. The specification prevents replay attacks through repository binding, uses CID-based content addressing for integrity, and provides the cryptographic foundation for verified credentials, trusted content, and authenticated interactions in the decentralized ATProtocol ecosystem.
Show full content

If you've been following along with my previous posts about decentralized identity and cryptographic proofs in ATProtocol, you know I've been exploring how we can build verifiable trust into decentralized systems. Today, I'm excited to share the formal specification for ATProtocol attestations - a framework that brings cryptographic signatures to ATProtocol records in a way that's both secure and surprisingly simple.

The full technical specification is available for those who want the complete implementation details, but let me walk you through the key concepts and design decisions that make this framework powerful.

The Problem We're Solving

In a decentralized ecosystem, we need ways to prove that specific identities have endorsed, verified, or made claims about content. Think of it like a digital notary stamp, but one that's cryptographically verifiable and impossible to forge. Whether it's an organization verifying someone's profile, a badge issuer confirming an achievement, or multiple authors signing off on a collaborative document, we need a robust way to create these proofs.

The challenge is doing this while preventing replay attacks - where someone might try to copy a valid signature from one place and reuse it somewhere else. We also need to balance simplicity with security, because overly complex systems tend to have more attack surfaces and implementation bugs.

Two Patterns, One Framework

The attestation spec defines two complementary patterns that share the same underlying CID generation process but serve different use cases.

Inline attestations embed signatures directly in records. When you create an inline attestation, you're adding a signature object right into the record's signatures array. The signature itself is generated by signing the CID of the record content, and the whole thing travels together. Here's what that looks like:

{
  "$type": "app.example.record",
  "createdAt": "2025-10-14T12:00:00Z",
  "text": "Example content that is being attested",
  "signatures": [
    {
      "$type": "com.example.inlineSignature",
      "signature": {"$bytes": "MzQ2Y2U4ZDNhYmM5NjU0Mzk5NWJmNjJkOGE4..."},
      "key": "did:web:example.com#signing1"
    }
  ]
}

The beauty here is radical simplicity. An inline attestation only needs three required fields: the $type to identify what kind of signature it is, the signature bytes themselves, and a key reference telling us which cryptographic key was used.

Remote attestations take a different approach. Instead of embedding the signature, you create a separate proof record in your own repository that contains the CID of the content you're attesting to. The original record then references that proof using a strong reference. The proof record itself looks like this:

{
  "$type": "com.example.proof",
  "cid": "bafyreifsqhrnlciktfxkz4yiqw5wtx6xvods67aicqt5tc7cly24dmhv3e"
}

This pattern shines when you need revocable attestations or when the attestor wants to maintain control over their proofs. Since the proof lives in the attestor's repository, they can delete it if needed, effectively revoking the attestation.

The Magic of CID Generation

Both patterns rely on content-addressable storage through CIDs (Content Identifiers). The CID generation process is where the security magic happens, and it's the same for both attestation types.

When generating a CID for attestation, we temporarily inject a special $sig metadata object that includes critical security information. This object must always contain a $type field and, crucially, a repository field containing the DID of the repository housing the record. This repository binding is what prevents replay attacks - an attacker can't just copy your signed record into their repository because the CID would change.

Here's the process in pseudocode:

FUNCTION generateRecordCID(recordData, sigMetadata, repositoryDID):
    encodingData = copy(recordData)
    DELETE encodingData["signatures"]  # Remove signatures field
    
    # Add $sig metadata with repository binding
    sigMetadata["repository"] = repositoryDID  # CRITICAL: prevents replay attacks
    encodingData["$sig"] = sigMetadata
    
    # Encode to DAG-CBOR and generate CID
    cborBytes = encodeDAGCBORCanonical(encodingData)
    cidBytes = generateCIDFromCBOR(cborBytes)
    
    RETURN cidBytes
END FUNCTION

The $sig object is only present during CID generation - it never gets stored in the final record. For inline attestations, we sign these CID bytes to create the signature. For remote attestations, we store the CID directly in the proof record. Same CID generation, different usage patterns.
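For readers who want something runnable, here is a rough Python rendering of the pseudocode above. It is a sketch under assumptions, not the reference implementation: the encode function is a hand-rolled subset of DAG-CBOR (strings, integers, booleans, null, lists, and maps only), the names are mine, and I have not checked its output against the atproto-attestation tools. It returns the CID as a base32 string rather than raw bytes.

```python
import base64
import hashlib


def _head(major: int, n: int) -> bytes:
    """CBOR header: major type plus the shortest-form argument."""
    if n < 24:
        return bytes([(major << 5) | n])
    for extra, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([(major << 5) | extra]) + n.to_bytes(size, "big")
    raise ValueError("integer too large")


def encode(value) -> bytes:
    """Encode a small subset of DAG-CBOR (no floats, bytes, or CID links)."""
    if isinstance(value, bool):  # must come before int: bool is an int subclass
        return b"\xf5" if value else b"\xf4"
    if value is None:
        return b"\xf6"
    if isinstance(value, int):
        return _head(0, value) if value >= 0 else _head(1, -1 - value)
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return _head(3, len(raw)) + raw
    if isinstance(value, list):
        return _head(4, len(value)) + b"".join(encode(v) for v in value)
    if isinstance(value, dict):
        # Canonical key order: shorter keys first, then bytewise.
        keys = sorted(value, key=lambda k: (len(k.encode()), k.encode()))
        body = b"".join(encode(k) + encode(value[k]) for k in keys)
        return _head(5, len(value)) + body
    raise TypeError(f"unsupported type: {type(value).__name__}")


def generate_record_cid(record: dict, sig_metadata: dict, repository_did: str) -> str:
    """Drop signatures, inject $sig with the repository DID, return a CIDv1 string."""
    data = {k: v for k, v in record.items() if k != "signatures"}
    data["$sig"] = {**sig_metadata, "repository": repository_did}
    digest = hashlib.sha256(encode(data)).digest()
    cid_bytes = bytes([0x01, 0x71, 0x12, 0x20]) + digest  # v1, dag-cbor, sha2-256, 32
    return "b" + base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")


record = {"$type": "me.ngerakines.foo", "foo": "bar"}
meta = {"$type": "me.ngerakiens.baz"}
print(generate_record_cid(record, meta, "did:plc:cbkjy5n7bk3ax2wplmtjofq2"))
print(generate_record_cid(record, meta, "did:plc:someoneelse"))  # different CID
```

The two printed values differ only because the repository DID differs, which is the replay protection in miniature. Unlike the pseudocode, this version copies the metadata instead of mutating the caller's dictionary.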

Union Types and the Signatures Array

The signatures array uses ATProtocol's union type system, which means it can contain different types of objects as long as they include a $type field. This allows us to mix inline signatures and strong references to remote attestations in the same array:

"signatures": [
  {
    "$type": "com.example.inlineSignature",
    "signature": {"$bytes": "..."},
    "key": "did:plc:signer1#method"
  },
  {
    "$type": "com.atproto.repo.strongRef",
    "uri": "at://did:plc:verifier/com.example.proof/3m3i7e3uhrocj",
    "cid": "bafyreig5ug2vj63ag5b6okth3roujv2lngxnyssxeylfcmmqiznfje4enu"
  }
]

This flexibility means applications can choose the right attestation pattern for their needs, or even combine both in a single record.
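As a small illustration of consuming that mixed array, the sketch below sorts entries into inline signatures and remote references by looking at $type and the fields present. The split_signatures name and the "anything with signature and key is inline" rule are my assumptions for the example, not something the spec dictates.

```python
from typing import Dict, List

STRONG_REF = "com.atproto.repo.strongRef"


def split_signatures(record: dict) -> Dict[str, List[dict]]:
    """Group signature entries into inline, remote, and unknown buckets."""
    groups: Dict[str, List[dict]] = {"inline": [], "remote": [], "unknown": []}
    for entry in record.get("signatures", []):
        kind = entry.get("$type")
        if kind is None:
            raise ValueError("union members must carry a $type field")
        if kind == STRONG_REF and "uri" in entry and "cid" in entry:
            groups["remote"].append(entry)
        elif "signature" in entry and "key" in entry:
            # Assumption: any typed object with these two fields is inline.
            groups["inline"].append(entry)
        else:
            groups["unknown"].append(entry)
    return groups


record = {
    "signatures": [
        {"$type": "com.example.inlineSignature",
         "signature": {"$bytes": "..."},
         "key": "did:plc:signer1#method"},
        {"$type": "com.atproto.repo.strongRef",
         "uri": "at://did:plc:verifier/com.example.proof/3m3i7e3uhrocj",
         "cid": "bafyreig5ug2vj63ag5b6okth3roujv2lngxnyssxeylfcmmqiznfje4enu"},
    ]
}
groups = split_signatures(record)
print(len(groups["inline"]), len(groups["remote"]))  # 1 1
```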

Real-World Examples

Let me show you how this works in practice with a profile verification scenario. Imagine a verification service wants to confirm someone's identity. They create a proof record in their repository:

{
  "$type": "network.bsky.verification.proof",
  "cid": "bafyreig7w5q432clkzxn5azlybqi37lnuvxvl3uucbqojgew4cujyoamzq",
  "type": "individual"
}

The user's profile then references this proof:

{
  "$type": "app.bsky.actor.profile",
  "displayName": "Nick Gerakines",
  "description": "Protocols, platforms, and machine learning",
  "signatures": [
    {
      "$type": "com.atproto.repo.strongRef",
      "cid": "bafyreigk73rnjpjfjjeeii25w2cczdq7tpzwrv4xeyo7gs47m75pqshbau",
      "uri": "at://did:plc:verify.bsky.network/network.bsky.verification.proof/3m3i7it5ya7cq"
    }
  ]
}

Now anyone can verify that the verification service has indeed attested to this profile. They reconstruct the CID using the same process (including the repository binding in $sig), and check that it matches the CID stored in the proof record.

For inline attestations, consider a badge system where achievements are signed directly:

{
  "$type": "community.lexicon.badge.award",
  "badge": {
    "$type": "com.atproto.repo.strongRef",
    "uri": "at://did:plc:issuer/community.lexicon.badge.definition/3ltwfsgx3vu2a",
    "cid": "bafyreibnfpriilyjmssycvlkcp46cmoscwon7okbfvhjmobggisinerj5e"
  },
  "did": "did:plc:cbkjy5n7bk3ax2wplmtjofq2",
  "issued": "2025-07-14T12:00:00.000Z",
  "signatures": [
    {
      "$type": "community.lexicon.badge.proof",
      "key": "did:plc:issuer#badge",
      "signature": {"$bytes": "JjLKuf35PstZQhef36SHtGrPrlvWy6+Qt6xI2zINOBNAxh4pAAaq..."}
    }
  ]
}

The signature was created by the badge issuer signing the CID of the award record (with the appropriate $sig metadata including the repository DID). Anyone can verify this signature using the public key referenced in the key field.

Cryptographic Details

The spec uses standard elliptic curve cryptography with two supported curves: P-256 (the NIST standard curve that's WebCrypto compatible) and K-256 (the Bitcoin/Ethereum curve that's default in ATProtocol). All signatures use ECDSA with the "low-S" variant as specified in BIP-0062, which helps prevent signature malleability attacks.
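The low-S rule is easy to show with plain integers. For any valid ECDSA signature (r, s), the pair (r, n - s) also verifies, where n is the curve's group order; requiring s to sit in the lower half removes that second form. The sketch below uses the published group orders for both curves; the function names are mine, and it works on the integer s only, not on encoded signature bytes.

```python
# Group orders (n) for the two supported curves.
ORDER = {
    "k256": 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141,
    "p256": 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551,
}


def is_low_s(s: int, curve: str) -> bool:
    """True when s is in the lower half of the group order."""
    return 1 <= s <= ORDER[curve] // 2


def normalize_s(s: int, curve: str) -> int:
    """Return the low-S form of s; signers apply this before encoding."""
    n = ORDER[curve]
    if not 1 <= s < n:
        raise ValueError("s is out of range for this curve")
    return s if s <= n // 2 else n - s


high = ORDER["k256"] - 5
print(is_low_s(high, "k256"))     # False
print(normalize_s(high, "k256"))  # 5
```

A verifier would typically call is_low_s and reject high-S signatures outright rather than quietly normalizing them.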

The verification process follows a straightforward path. For inline attestations, you reconstruct the record with the $sig metadata (including the repository DID), generate the CID, and verify the signature against that CID using the public key from the referenced DID document. For remote attestations, you generate the CID the same way and check that it matches the CID stored in the proof record.
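Here is a sketch of the remote-attestation check, focused on the comparison logic rather than the encoding. Two loud assumptions: toy_cid is a stand-in that hashes sorted JSON and is NOT a real CID (a real implementation would use DAG-CBOR and the CID format), and I treat the proof record minus its cid field as the $sig metadata, which is my reading of the examples rather than a quoted rule.

```python
import hashlib
import json
from typing import Callable


def toy_cid(data: dict) -> str:
    """Stand-in only: sorted-JSON SHA-256, not DAG-CBOR and not a real CID."""
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return "toy-" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def bound_content(record: dict, sig_metadata: dict, repository_did: str) -> dict:
    """Record without signatures, plus $sig carrying the repository DID."""
    data = {k: v for k, v in record.items() if k != "signatures"}
    data["$sig"] = {**sig_metadata, "repository": repository_did}
    return data


def make_proof(record: dict, sig_metadata: dict, repository_did: str,
               cid_fn: Callable[[dict], str] = toy_cid) -> dict:
    """What the attestor would store in their own repository."""
    cid = cid_fn(bound_content(record, sig_metadata, repository_did))
    return {**sig_metadata, "cid": cid}


def verify_remote(record: dict, proof: dict, repository_did: str,
                  cid_fn: Callable[[dict], str] = toy_cid) -> bool:
    """Recompute the bound CID and compare it with the proof record's cid."""
    sig_metadata = {k: v for k, v in proof.items() if k != "cid"}
    return cid_fn(bound_content(record, sig_metadata, repository_did)) == proof.get("cid")


profile = {"$type": "app.bsky.actor.profile", "displayName": "Nick Gerakines"}
meta = {"$type": "network.bsky.verification.proof", "type": "individual"}
proof = make_proof(profile, meta, "did:plc:cbkjy5n7bk3ax2wplmtjofq2")

print(verify_remote(profile, proof, "did:plc:cbkjy5n7bk3ax2wplmtjofq2"))  # True
print(verify_remote(profile, proof, "did:plc:copycat"))                   # False
```

The second call is the replay case: the same record and the same proof, presented from a different repository, no longer match.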

Application Extensibility

One of the powerful aspects of this design is how applications can define their own attestation types. The $type field in signatures lets you create application-specific semantics. You might have com.example.contentModeration for moderation decisions, org.university.transcript for academic credentials, or app.medical.prescription for healthcare attestations.

Applications can also inject custom metadata into the $sig object during CID generation. This lets you bind attestations to specific contexts, add expiration semantics, or include other application-specific data that becomes part of the cryptographic commitment.

Reference Implementation

All of the examples in the spec are functional. The reference implementation in the atproto-attestation crate at https://tangled.org/@smokesignal.events/atproto-identity-rs contains everything needed to create and validate both inline and remote attestations. The crate also provides the atproto-attestation-sign and atproto-attestation-verify programs.

To run these tools, check out the repository and build the docker container with your tool of choice (podman build -t docker.io/ngerakines/atproto-tools:0.14.0-rc.1 .) and then adjust the commands in this post (podman run docker.io/ngerakines/atproto-tools:0.14.0-rc.1 atproto-attestation-verify -h).

For a real-world example, let's create a record that will have a remote attestation:

{
    "$type": "me.ngerakines.foo",
    "foo": "bar"
}

I'll use the atproto-attestation-sign program to create a remote attestation of the type me.ngerakiens.baz:

atproto-attestation-sign remote did:plc:cbkjy5n7bk3ax2wplmtjofq2 \
  '{"$type": "me.ngerakines.foo", "foo": "bar"}' \
  did:plc:cbkjy5n7bk3ax2wplmtjofq2 \
  '{"$type": "me.ngerakiens.baz"}'

This will output both the Attested Record (with strongRef) and the Proof Record (store in repository).

Now I'm going to publish the attested record to my repository with pdsls.dev:

{
  "$type": "me.ngerakines.foo",
  "foo": "bar",
  "signatures": [
    {
      "$type": "com.atproto.repo.strongRef",
      "cid": "bafyreicfipxcm3essd2xybdamukruigy6umyxgjz66wjvehseqbcpkkffa",
      "uri": "at://did:plc:cbkjy5n7bk3ax2wplmtjofq2/me.ngerakiens.baz/3m3ic7nxjxhrp"
    }
  ]
}

The proof record points back to a CID of just the content that is being signed. I'm going to publish that record to my repository, honoring both the collection ($type that I provided) and the record key:

{
  "$type": "me.ngerakiens.baz",
  "cid": "bafyreicf2rsd4ggjvrejt4ycndnpqw6lf5khpwwncs6nljyhxhdqsqkxkq"
}

Now I can use the atproto-attestation-verify command to ensure that everything validates:

atproto-attestation-verify \
  at://did:plc:cbkjy5n7bk3ax2wplmtjofq2/me.ngerakines.foo/3m3fslrmd7j2o \
  did:plc:cbkjy5n7bk3ax2wplmtjofq2

Looking Forward

This attestation framework provides the cryptographic foundation for trust in the ATProtocol ecosystem. By keeping the core specification minimal while allowing application-level extensibility, we get both security and flexibility. The repository binding prevents replay attacks, the CID-based approach ensures content integrity, and the dual pattern support accommodates different operational needs.

As we build more sophisticated applications on ATProtocol, these attestations will become the building blocks for verified credentials, trusted content, and authenticated interactions. Whether you're building a verification service, a badge system, or any application that needs cryptographic proofs, this framework provides the tools you need.

The full technical specification dives deeper into the implementation details, including complete algorithms, security considerations, and integration patterns. I'm excited to see what the community builds with these cryptographic primitives, and I'll be sharing more about practical implementations in upcoming posts.

https://ngerakines.leaflet.pub/3m3idxul5hc2r
Unforgeable Endorsements Technical Deep-Dive
Deep technical implementation of the unforgeable endorsement system. Covers step-by-step CID computation, complete code for the endorsement workflow, validation algorithms, firehose event processing, and detailed security analysis of attack vectors. Includes working code examples, lexicon definitions, and the cryptographic mechanisms that make forgery computationally infeasible.
Show full content

This is Part 2 of our technical deep-dive into building cryptographically-verified endorsements on ATProtocol. Read Part 1 for the overview →

In Part 1, we explored the high-level architecture of our two-record endorsement system. Now let's dive into the implementation details that make this system cryptographically unforgeable.

Content Identifiers: The Cryptographic Foundation

At the heart of our system is ATProtocol's use of Content Identifiers (CIDs). A CID is like a fingerprint of your data—but way better. Traditional database IDs are arbitrary numbers assigned by a central authority. CIDs are cryptographic hashes computed from the content itself. Same content always produces same CID. Different content produces different CID. And here's the kicker: it's computationally infeasible to create fake content that matches a given CID.

Step-by-Step CID Computation

Here's exactly how ATProtocol computes a CID for an endorsement record:

Step 1: Serialize to DAG-CBOR
Record data → Deterministic binary format
{
  "$type": "place.atwork.endorsement",
  "giver": "did:plc:alice123",
  "receiver": "did:plc:bob456",
  "text": "Bob designed our microservices platform...",
  "createdAt": "2025-10-12T14:30:00Z"
}
→ [binary DAG-CBOR bytes]

Step 2: Hash with SHA-256
Binary bytes → 32-byte hash digest
→ e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 (placeholder digest for illustration; this value is the SHA-256 of empty input)

Step 3: Create Multihash
Hash + algorithm ID + length
→ 0x12 (SHA-256) + 0x20 (32 bytes) + digest

Step 4: Add Codec and Version
→ 0x01 (CIDv1) + 0x71 (dag-cbor) + multihash

Step 5: Encode to Base32
→ "bafyreib7h3j2ixgz4e5b..."

The critical insight: Step 1 uses DAG-CBOR, a deterministic serialization format. This means everyone who serializes the same data structure gets the exact same bytes. If serialization were non-deterministic (like standard JSON where key ordering can vary), different people computing the CID would get different results.
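Steps 2 through 5 fit in a few lines of Python using only the standard library. This is a sketch that starts from bytes you already have; producing real DAG-CBOR in Step 1 needs an encoder that is not shown, and the function name is mine. The second half illustrates the determinism point with plain JSON, where the same data written in a different key order hashes differently.

```python
import base64
import hashlib
import json


def cid_from_dag_cbor(cbor_bytes: bytes) -> str:
    """Build a CIDv1 string from already-serialized DAG-CBOR bytes."""
    digest = hashlib.sha256(cbor_bytes).digest()    # Step 2: 32-byte digest
    multihash = bytes([0x12, 0x20]) + digest        # Step 3: sha2-256, length 32
    cid_bytes = bytes([0x01, 0x71]) + multihash     # Step 4: CIDv1, dag-cbor
    b32 = base64.b32encode(cid_bytes).decode("ascii")
    return "b" + b32.lower().rstrip("=")            # Step 5: multibase base32


# 0xa0 is the CBOR encoding of an empty map, used here as sample input.
print(cid_from_dag_cbor(b"\xa0"))

# Why determinism matters: same data, different key order, different hash.
one = json.dumps({"giver": "did:plc:alice123", "receiver": "did:plc:bob456"})
two = json.dumps({"receiver": "did:plc:bob456", "giver": "did:plc:alice123"})
print(hashlib.sha256(one.encode()).hexdigest() == hashlib.sha256(two.encode()).hexdigest())  # False
```

Every CID built this way starts with "bafyrei", because the leading "b" is the base32 multibase prefix and the next six characters encode the fixed version, codec, and hash header bytes.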

Why Signatures Are Excluded

Here's a subtle but crucial point: the CID is computed from the unsigned content. Signatures are stored separately in ATProtocol's repository structure.

Think about why. If signatures were included in the content before CID computation, you'd have a chicken-and-egg problem:

  • To sign the data, you need to know what you're signing

  • The data includes the CID

  • But the CID depends on the signature

  • Which depends on the data...

Infinite loop, system explodes. Instead, ATProtocol separates concerns:

  • Compute CID from canonical unsigned content

  • Sign the CID (or the bytes that produced it)

  • Store signature in repository metadata

Implementation: The Complete Workflow

Let's implement the complete endorsement lifecycle with actual code.

Phase 1: Draft Creation (Application State)
// In Alice's app - create draft in application database (not PDS yet)
draft = {
  id: "01JBCDEF...",  // ULID
  giver: "did:plc:alice123",
  receiver: "did:plc:bob456",
  text: "Bob designed our entire microservices platform...",
  status: "drafting",
  proof_aturi: null,
  proof_cid: null,
  created_at: "2025-10-12T14:30:00Z"
}

// Store in application database, NOT published to PDS yet
await storage.createEndorsementDraft(draft)

Nothing is on ATProtocol yet. The draft exists only in the application's database.

Phase 2: Giver Finalizes with Cryptographic Commitment
// 1. Compute CID of endorsement content (without signatures)
endorsementContent = {
  giver: draft.giver,
  receiver: draft.receiver,
  text: draft.text,
  createdAt: draft.created_at
}
proof_cid = computeEndorsementCID(endorsementContent)

// 2. Create proof record in Alice's PDS
proofURI = await createRecord({
  repo: "did:plc:alice123",
  collection: "place.atwork.endorsementProof",
  record: {
    cid: proof_cid
  }
})

// This triggers:
// 1. Proof record serialized to DAG-CBOR
// 2. Added to Alice's repository Merkle tree
// 3. New commit signed by Alice
// 4. Broadcast to ATProtocol firehose

// 3. Update draft to mark as "drafted" with proof info
draft.status = "drafted"
draft.proof_aturi = proofURI
draft.proof_cid = proof_cid
await storage.updateEndorsementDraft(draft)

// 4. Notify Bob (via app-specific mechanism)
sendNotification(bob, {
  type: "endorsement_received",
  from: alice,
  draft_id: draft.id
})

Phase 3: Receiver Reviews and Accepts
// 1. Bob creates the endorsement record in his PDS (receiver's repo)
endorsementURI = await createRecord({
  repo: "did:plc:bob456",
  collection: "place.atwork.endorsement",
  record: {
    giver: "did:plc:alice123",
    receiver: "did:plc:bob456",
    text: "Bob designed our entire microservices platform...",
    createdAt: "2025-10-12T14:30:00Z",
    signatures: [{
      uri: draft.proof_aturi,  // Alice's proof AT-URI
      cid: draft.proof_cid     // Alice's proof CID
    }]
  }
})

// This triggers:
// 1. Endorsement serialized to DAG-CBOR
// 2. Added to Bob's repository Merkle tree
// 3. New commit signed by Bob
// 4. Broadcast to ATProtocol firehose

// 2. Delete the draft (no longer needed)
await storage.deleteEndorsementDraft(draft.id)

The Validation Algorithm

When someone claims "Here's Bob's endorsement with Alice's proof," here's the complete validation implementation:

function validateEndorsement(endorsementRecord, proofRecord):
  // 1. Extract the proof CID from Bob's endorsement signatures field
  proofStrongRef = endorsementRecord.signatures[0]
  expectedCID = proofStrongRef.cid

  // 2. Recompute CID from endorsement content (without signatures)
  endorsementContent = {
    giver: endorsementRecord.giver,
    receiver: endorsementRecord.receiver,
    text: endorsementRecord.text,
    createdAt: endorsementRecord.createdAt
  }
  canonicalBytes = encodeDagCBOR(endorsementContent)
  computedHash = sha256(canonicalBytes)
  computedCID = createCID(computedHash)

  // 3. Verify it matches Alice's proof CID
  if computedCID != proofRecord.cid:
    return INVALID("Proof CID doesn't match computed endorsement CID")

  // 4. Verify it matches what Bob referenced
  if computedCID != expectedCID:
    return INVALID("Bob's reference doesn't match Alice's proof")

  // 5. Verify signatures in ATProtocol repositories
  if !verifyRepoSignature(aliceRepo, proofRecord):
    return INVALID("Alice's proof signature invalid")

  if !verifyRepoSignature(bobRepo, endorsementRecord):
    return INVALID("Bob's endorsement signature invalid")

  return VALID
Asynchronous Validation via Firehose

Your app subscribes to the firehose and processes events in real-time:

// Firehose subscription handler
onFirehoseCommit(event):
  if event.collection == "place.atwork.endorsement":
    // New endorsement created by receiver
    endorsement = event.record

    // Extract the proof reference from signatures
    if endorsement.signatures and endorsement.signatures.length > 0:
      proofRef = endorsement.signatures[0]

      // Fetch proof record from giver's PDS
      proof = await fetchRecord(proofRef.uri)

      // Validate the cryptographic chain
      if validateEndorsement(endorsement, proof) == VALID:
        // Update local database to mark as verified
        db.createOrUpdateEndorsement({
          aturi: event.uri,
          giver: endorsement.giver,
          receiver: endorsement.receiver,
          text: endorsement.text,
          validation_state: "valid",
          proof_uri: proofRef.uri,
          validated_at: now()
        })

  if event.collection == "place.atwork.endorsementProof":
    // New proof created by giver
    proof = event.record

    // Store proof for later validation when endorsement is created
    db.createEndorsementProof({
      aturi: event.uri,
      giver_did: event.repo,
      cid: proof.cid,
      created_at: now()
    })
Record Updates and Re-Validation

ATProtocol repositories are mutable, but endorsement records should be immutable by design. If Alice updates her proof to commit to different content:

// Original proof
await createRecord({
  repo: "did:plc:alice123",
  collection: "place.atwork.endorsementProof",
  record: {
    cid: "bafyreiabc..."  // CID of original endorsement content
  }
})

// Later, Alice tries to "update" it
await updateRecord({
  repo: "did:plc:alice123",
  collection: "place.atwork.endorsementProof",
  rkey: proofRecordKey,
  record: {
    cid: "bafyreixyz..."  // Different CID
  }
})

This breaks validation. Bob's endorsement still references the old proof CID. Apps will see the mismatch and mark the endorsement as invalid.

Re-Validation Implementation
function revalidateEndorsement(endorsementRecord):
  // 1. Get the proof reference from endorsement
  if !endorsementRecord.signatures or endorsementRecord.signatures.length == 0:
    return "INVALID"  // No proof signature

  proofRef = endorsementRecord.signatures[0]

  // 2. Fetch latest proof from giver's repo
  latestProof = await fetchLatestRecord(proofRef.uri)

  if !latestProof:
    return "INVALID"  // Proof was deleted

  // 3. Compute CID from endorsement content (without signatures)
  endorsementContent = {
    giver: endorsementRecord.giver,
    receiver: endorsementRecord.receiver,
    text: endorsementRecord.text,
    createdAt: endorsementRecord.createdAt
  }
  computedCID = computeEndorsementCID(endorsementContent)

  // 4. Verify proof CID matches computed CID
  if computedCID != latestProof.cid:
    return "INVALID"  // Proof doesn't commit to this content

  // 5. Verify proof CID matches what endorsement references
  if computedCID != proofRef.cid:
    return "INVALID"  // Reference mismatch

  return "VALID"
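The pseudocode above can be exercised end to end with a small runnable sketch. It rests on loudly stated assumptions: `fingerprint` hashes canonical JSON with SHA-256 as a stand-in for a real CID (an actual CID would be computed over the DAG-CBOR encoding), and `proofs` is a hypothetical in-memory dictionary standing in for fetching the latest record from the giver's repository. The record key in the example URI is made up.

```python
import hashlib
import json

# Fields that are covered by the giver's commitment (signatures excluded).
CONTENT_FIELDS = ("giver", "receiver", "text", "createdAt")


def fingerprint(content: dict) -> str:
    """Stand-in for a CID: SHA-256 over a canonical JSON encoding."""
    canonical = json.dumps(content, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def revalidate(endorsement: dict, proofs: dict) -> str:
    """Return "VALID" or "INVALID", following the five steps above."""
    # 1. The endorsement must reference a proof.
    signatures = endorsement.get("signatures") or []
    if not signatures:
        return "INVALID"
    proof_ref = signatures[0]

    # 2. Look up the latest proof (hypothetical in-memory store).
    latest_proof = proofs.get(proof_ref["uri"])
    if latest_proof is None:
        return "INVALID"  # proof was deleted

    # 3. Recompute the fingerprint from the content fields only.
    computed = fingerprint({name: endorsement[name] for name in CONTENT_FIELDS})

    # 4 and 5. The proof and the reference must both commit to this content.
    if computed != latest_proof["cid"] or computed != proof_ref["cid"]:
        return "INVALID"
    return "VALID"


content = {
    "giver": "did:plc:alice123",
    "receiver": "did:plc:bob456",
    "text": "Strong distributed systems engineer.",
    "createdAt": "2025-01-01T00:00:00Z",
}
cid = fingerprint(content)
proof_uri = "at://did:plc:alice123/place.atwork.endorsementProof/example"
proofs = {proof_uri: {"cid": cid}}
endorsement = {**content, "signatures": [{"uri": proof_uri, "cid": cid}]}
```

With this data, the untouched endorsement validates, while editing the text, changing the receiver, deleting the proof, or pointing the proof at a different fingerprint each produce "INVALID".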
Validation State Machine
ValidationStates:
  DRAFT     - Created by giver, not published
  PENDING   - Published, awaiting receiver proof
  VALID     - Both records exist, CIDs match, signatures valid
  INVALID   - CID mismatch or missing records
  REJECTED  - Receiver explicitly rejected
  DELETED   - One or both parties deleted records

State Transitions:
  DRAFT → (publish) → PENDING
  PENDING → (proof created) → VALID
  PENDING → (explicit rejection) → REJECTED
  VALID → (proof deleted) → INVALID
  VALID → (endorsement deleted) → DELETED
  INVALID → (re-validation succeeds) → VALID
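One way to keep an app honest about these transitions is to encode them as a lookup table and reject anything that is not listed. The following is a sketch only: the event names are hypothetical identifiers derived from the labels in the listing above, and nothing here is taken from a real implementation.

```python
from enum import Enum


class State(Enum):
    DRAFT = "DRAFT"
    PENDING = "PENDING"
    VALID = "VALID"
    INVALID = "INVALID"
    REJECTED = "REJECTED"
    DELETED = "DELETED"


# (current state, event) -> next state, mirroring the listed transitions.
TRANSITIONS = {
    (State.DRAFT, "publish"): State.PENDING,
    (State.PENDING, "proof_created"): State.VALID,
    (State.PENDING, "explicit_rejection"): State.REJECTED,
    (State.VALID, "proof_deleted"): State.INVALID,
    (State.VALID, "endorsement_deleted"): State.DELETED,
    (State.INVALID, "revalidation_succeeds"): State.VALID,
}


def advance(state: State, event: str) -> State:
    """Apply an event, refusing transitions that are not in the table."""
    next_state = TRANSITIONS.get((state, event))
    if next_state is None:
        raise ValueError(f"no transition from {state.name} on {event!r}")
    return next_state
```

Keeping the table as data makes it easy to audit: REJECTED and DELETED have no outgoing entries, so they are terminal in this sketch.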
Security Analysis: Attack Scenarios

Let's examine each attack vector in detail:

Forgery Attack

Attempt: Eve creates a fake proof claiming Alice endorsed Bob.

Implementation defense:

// Attack fails at validation
if proofRecord.repo != endorsement.giver:
  return INVALID("Proof not from claimed giver")
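In practice the repository that holds a proof can be read from the proof's AT-URI, whose authority segment names the repo. A minimal sketch of that check, assuming the URI uses a DID (not a handle) as its authority; the helper names and the record key are hypothetical:

```python
def at_uri_authority(uri: str) -> str:
    """Return the authority (repo) segment of an at:// URI."""
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an AT-URI: {uri!r}")
    authority = uri[len(prefix):].split("/", 1)[0]
    if not authority:
        raise ValueError(f"AT-URI has no authority: {uri!r}")
    return authority


def proof_is_from_giver(proof_uri: str, giver_did: str) -> bool:
    """True only when the proof lives in the claimed giver's repository."""
    return at_uri_authority(proof_uri) == giver_did


alice_proof = "at://did:plc:alice123/place.atwork.endorsementProof/example"
eve_proof = "at://did:plc:eve789/place.atwork.endorsementProof/example"
```

Here `proof_is_from_giver(eve_proof, "did:plc:alice123")` is false, so a proof that Eve writes into her own repository never counts as Alice's.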
Repudiation Attack

Attempt: Alice creates a proof, then claims she never did.

Implementation defense:

// Proof was broadcast and cached
cachedProof = db.getProof(proofURI)
if cachedProof && verifySignature(cachedProof, aliceDID):
  // Mathematical proof Alice created it
  return PROVEN_AUTHORSHIP
Modification Attack

Attempt: Eve modifies endorsement text after Alice's proof.

Implementation defense:

// Any modification changes the CID
originalCID = computeCID(originalContent)
modifiedCID = computeCID(modifiedContent)
// originalCID != modifiedCID
// Validation fails
Man-in-the-Middle

Attempt: Eve intercepts and modifies Alice's proof.

Implementation defense:

  • Proof is in Alice's signed repository

  • Repository commits are cryptographically signed

  • Modification invalidates signature chain

Replay Attack

Attempt: Eve reuses Alice's proof for different receiver.

Implementation defense:

// CID includes receiver DID
content = {
  giver: "did:plc:alice123",
  receiver: "did:plc:bob456",  // Bound to Bob
  text: "..."
}
// Can't change receiver without invalidating CID
Lexicon Definitions

Here are the complete lexicon definitions for implementation:

// place.atwork.endorsementProof
{
  "lexicon": 1,
  "id": "place.atwork.endorsementProof",
  "defs": {
    "main": {
      "type": "record",
      "key": "literal:self",
      "record": {
        "type": "object",
        "required": ["cid"],
        "properties": {
          "cid": {
            "type": "string",
            "description": "CID of the endorsement content"
          }
        }
      }
    }
  }
}

// place.atwork.endorsement
{
  "lexicon": 1,
  "id": "place.atwork.endorsement",
  "defs": {
    "main": {
      "type": "record",
      "key": "literal:self",
      "record": {
        "type": "object",
        "required": ["giver", "receiver", "text", "createdAt", "signatures"],
        "properties": {
          "giver": {
            "type": "string",
            "format": "did"
          },
          "receiver": {
            "type": "string",
            "format": "did"
          },
          "text": {
            "type": "string",
            "maxLength": 10000
          },
          "createdAt": {
            "type": "string",
            "format": "datetime"
          },
          "signatures": {
            "type": "array",
            "items": {
              "type": "ref",
              "ref": "com.atproto.repo.strongRef"
            }
          }
        }
      }
    }
  }
}
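Before running the cryptographic checks, an app may want a cheap shape check against the endorsement lexicon. The sketch below hand-codes the constraints from the definition above instead of using a Lexicon validation library, so it is illustrative only. It assumes `maxLength` is measured in UTF-8 bytes and that a strongRef carries `uri` and `cid`; the function name is hypothetical.

```python
REQUIRED_FIELDS = ("giver", "receiver", "text", "createdAt", "signatures")
MAX_TEXT_BYTES = 10000


def endorsement_shape_errors(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the shape looks right."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in record]

    # giver and receiver are declared with format "did".
    for name in ("giver", "receiver"):
        value = record.get(name)
        if isinstance(value, str) and not value.startswith("did:"):
            errors.append(f"{name} is not a DID")

    # Assumption: maxLength counts UTF-8 bytes, not characters.
    text = record.get("text")
    if isinstance(text, str) and len(text.encode("utf-8")) > MAX_TEXT_BYTES:
        errors.append("text exceeds maxLength")

    # Each signature should look like a strongRef (uri + cid).
    for index, ref in enumerate(record.get("signatures") or []):
        if not isinstance(ref, dict) or not {"uri", "cid"} <= ref.keys():
            errors.append(f"signatures[{index}] is not a strongRef")
    return errors


example_record = {
    "giver": "did:plc:alice123",
    "receiver": "did:plc:bob456",
    "text": "Strong distributed systems engineer.",
    "createdAt": "2025-01-01T00:00:00Z",
    "signatures": [{"uri": "at://did:plc:alice123/place.atwork.endorsementProof/example", "cid": "bafyreiabc..."}],
}
```

A check like this only filters malformed records early; it says nothing about whether the proof actually commits to the content.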
Conclusion

We've built a system where cryptographic math, not corporate databases, determines truth. The two-record architecture with CID-based mutual attestation creates unforgeable professional endorsements on fully decentralized infrastructure.

The key insights:

  • Content addressing (CIDs) creates unforgeable commitments

  • Two-record separation enables asynchronous consent

  • Firehose architecture allows autonomous validation

  • Repository signatures provide cryptographic proof of authorship

This pattern extends beyond endorsements to any scenario requiring mutual attestation: contracts, credentials, reviews, verifications. ATProtocol provides the primitives—now build something where nobody can forge or deny what happened.

The future of professional reputation is cryptographically verified and radically decentralized. The code is the specification. The math is the truth.

https://ngerakines.leaflet.pub/3m33j64dxgs2q
Building Unforgeable Professional Endorsements with ATProtocol
Traditional professional endorsements on platforms like LinkedIn lack cryptographic proof—anyone could forge them, and the platform controls the truth. This article introduces a two-record architecture using ATProtocol's Content Identifiers (CIDs) and Decentralized Identifiers (DIDs) to create mathematically unforgeable mutual attestations. By separating proof creation from endorsement acceptance and leveraging the firehose for distributed validation, we build a system where both parties cryptographically consent and no central authority can manipulate the record.

Picture this: You're scrolling through LinkedIn, and you see that your colleague has "endorsed" you for Python. Great! Except... did they really? Could they claim they never did it? Could you fake an endorsement from them? The answer is unsettling: maybe.

Traditional endorsement systems have a fundamental problem: they're one-sided attestations stored in centralized databases without cryptographic proof. When Alice endorses Bob for "distributed systems expertise," there's no mathematical proof that Alice actually created that endorsement and Bob actually accepted it. We're trusting LinkedIn to tell the truth, to preserve the data accurately, and to never alter the historical record. We're trusting the platform, not the math.

What if we could build an endorsement system where the cryptographic properties guarantee that both parties explicitly consented, nobody can forge or repudiate an endorsement, and there's no single company we have to trust? Enter ATProtocol and cryptographically verified mutual attestations.

ATProtocol provides the perfect foundation for this through two killer features: Content Identifiers (CIDs) that create cryptographic fingerprints of data, and Decentralized Identifiers (DIDs) that enable self-sovereign identity without central account systems. The combination enables something powerful: unforgeable bidirectional consent where both the giver and receiver cryptographically commit to the endorsement, and anyone can verify its authenticity by checking the math, not by trusting a platform.

The Two-Record Architecture: Separating Commitment from Execution

Most developers' first instinct would be to create a single endorsement record that both parties sign. In the ATmosphere we can leverage ATProtocol's lexicon system to do something more elegant: a two-record architecture that separates the commitment phase from the execution phase.

The Proof Record

The first record type is the proof. This is created by the giver to cryptographically commit to the endorsement. It contains only a CID (Content Identifier) of the endorsement content. This is where minimalism becomes security. The CID is a cryptographic hash computed from the endorsement data. Alice creates this proof in her repository, cryptographically signing that she authored this specific endorsement.

The Endorsement Record

The second record type is the endorsement itself. This is created by the receiver when accepting the endorsement. It contains the actual content (giver, receiver, endorsement text) plus a signatures field that references Alice's proof record. This creates the cryptographic link: Bob's endorsement points to Alice's proof via a strongRef (AT-URI + CID). The CID in the signatures field must match the CID that Alice committed to in her proof record.

Why Two Records Instead of One?

This separation provides several critical properties:

  • Temporal Decoupling: Alice can create the proof while Bob is offline. Bob can review and accept days later. No coordination required.

  • Clear Consent: Bob must accept the endorsement explicitly. Passive acceptance isn't possible.

  • Independent Verification: Each record is signed by its creator using ATProtocol's repository signature mechanism.

  • Unforgeable Authorship: Alice's proof record cryptographically commits her to the endorsement content via the CID. She can't later claim she wrote different text.

  • Clean Rejection: If Bob doesn't create the endorsement record, it never becomes public.

The Workflow: Creating Mutual Attestation

Let's walk through the complete lifecycle:

Phase 1: Draft Creation

Alice wants to endorse Bob. She opens her endorsement app and fills out the form. At this point, nothing is on ATProtocol—the draft exists only in the application's database. Alice can iterate, edit, even delete the draft without any permanent records being created.

Phase 2: Giver Finalizes with Cryptographic Commitment

Alice reviews her draft and clicks "Finalize". The app computes the CID of the endorsement content and creates a proof record in Alice's repository. This proof is broadcast to the ATProtocol firehose. Alice has now published her commitment to endorse Bob with this specific content.

Phase 3: Receiver Reviews and Accepts

Bob receives a notification and reviews the draft. If he accepts, the app creates the endorsement record in Bob's repository, pointing to Alice's proof. The cryptographic chain is complete: Alice's proof commits to the endorsement content via CID, and Bob's endorsement references that proof. Both are independently signed in their respective repositories.

Phase 4: Asynchronous Validation

Here's where ATProtocol's architecture becomes powerful. Your app subscribes to the firehose and processes events. When it sees a new endorsement, it fetches the referenced proof and validates the cryptographic chain. No API calls to centralized servers. No trusting someone else's validation. Just math and distributed data.

Security Properties

This system provides robust security guarantees:

No Forgery: Eve can't create a fake proof from Alice. A valid proof must live in Alice's repository, where it is covered indirectly by her signature on the repository commits.

No Repudiation: Alice can't credibly deny creating a proof once it's broadcast. The cryptographic signature proves authorship.

No Modification: Any change to the endorsement content produces a different CID, breaking validation.

No Replay: Endorsements are bound to specific receiver DIDs through the CID. You can't reuse Alice's proof for a different receiver.

The security derives from cryptographic primitives: 256-bit elliptic curve signatures, collision-resistant SHA-256 hashing, and content addressing that binds proofs to specific content. The system is permissionless: you don't need anyone's permission to create, accept, or validate endorsements.

Why the Firehose Architecture Enables This

The firehose provides a fundamentally different model than traditional request-response APIs:

  • Offline operation: Alice can create endorsements without Bob being online

  • Eventual consistency: All apps eventually converge to the same validation state

  • Autonomous verification: Each app is a first-class validator using cryptographic proofs

  • Real-time updates: Events stream within seconds instead of waiting on polling intervals

This architecture eliminates single points of failure through distributed trust and mathematical guarantees.

Beyond Professional Endorsements

The same architecture applies to any scenario requiring bidirectional consent and unforgeable proof:

  • Academic peer review where reviewers and authors mutually attest

  • Educational credentials where institutions issue and graduates accept

  • Business contracts with independently verifiable signatures

  • Product reviews where both reviewer and seller acknowledge the transaction

  • Identity verification with mutual cryptographic commitment

The pattern naturally extends to multi-party collaborations—imagine a project record where Alice, Bob, and Carol all cryptographically commit to having worked together. Or consider privacy-preserving endorsements using zero-knowledge proofs: proving you have 5+ endorsements without revealing who gave them.

Conclusion

ATProtocol provides the cryptographic primitives to build systems where truth is mathematical, where users control their data, and where nobody (not you, not any platform) can forge or deny what happened. The future of professional reputation is cryptographically verified and radically decentralized.

In Part 2, we'll dive deep into the implementation details: CID computation algorithms, validation code, firehose event processing, and attack resistance analysis. We'll explore the actual code that makes this system unforgeable.


Ready to dive deeper? Continue to Part 2: Deep Technical Implementation →

https://ngerakines.leaflet.pub/3m326qo4w522w
Introducing at://work: A Job Board That Puts You in Control
at://work is a modern job board built on ATProtocol where your profile and job listings are stored on your own Personal Data Server, giving you true ownership of your professional data. As a full AppView with XRPC APIs and remote MCP server capabilities, it makes job market data accessible to both users and developers while proving that professional networking can be decentralized and user-controlled.

I'm excited to launch at://work, a modern job board built on ATProtocol that fundamentally changes how job listings and professional networking work on the decentralized web.

at://work - ATProto Job Board

Your Career. Your Data. Your Place. A decentralized job board powered by the ATProtocol.

Your Data, Your PDS

When you sign in to at://work using OAuth, you're logging in with your ATProtocol handle (like @alice.bsky.social). But here's what makes it different: your profile and job listings aren't stored on our servers. They live on your Personal Data Server (PDS), which means you actually own your content. We're just indexing what you've already published to the ATmosphere.

This isn't just a philosophical difference. It means your professional presence exists independently of any single platform. Your job listings and for-hire status are portable, verifiable, and permanently under your control.

Built for Hiring and Being Hired

at://work supports two types of identity profiles. If you're looking for work, you can mark yourself as "for-hire" and let opportunities find you. If you're hiring, you can publish job listings with full location support and rich content formatting using ATProtocol's content facets, including mentions, tags, and URLs. Setting up your profile is straightforward—check out the profile creation guide to get started.

The platform includes comprehensive search functionality for both job listings and identity profiles, along with location-based search and an interactive map that shows where opportunities are clustering. Whether you're looking for remote positions or trying to find local talent, the search tools make it straightforward.

A Full AppView with Developer-Friendly APIs

As a complete AppView, at://work indexes job listings and profiles as they're published across the ATmosphere. But we've also opened up the functionality through XRPC endpoints, allowing developers to integrate job listing features into their own applications—whether they're built on ATProtocol or not.

To demonstrate this capability, we've built a JavaScript widget that makes it easy to embed job listings on your blog or website. The comprehensive XRPC API documentation shows exactly how to tap into the at://work index programmatically.

MCP Integration for AI Workflows

For developers working with AI tools and agents, at://work functions as a fully remote Model Context Protocol (MCP) server. You can search, list, and retrieve job listings through the MCP endpoint at https://atwork.place/mcp, making it simple to incorporate job market data into your AI-powered workflows. For setup instructions and usage examples, see the MCP integration guide.

The Decentralized Future of Professional Networking

at://work demonstrates what becomes possible when professional networking moves to an open protocol. Instead of your career data being locked in proprietary platforms, it exists in a standardized format that any application can access and display. You're not just using a job board—you're participating in a professional graph that you control.

We're launching at://work to show that job boards can be decentralized, user-controlled, and developer-friendly all at once. Check it out at https://atwork.place/ and see what professional networking looks like when you own the data.

https://ngerakines.leaflet.pub/3m2iom72kqs2z
Ohio's New Porn Age Verification Law: Big Brother Dressed Up as Child Protection
Ohio's new age verification law requiring ID to access adult websites (starting September 29, 2025) fails to protect children while forcing adults to surrender personal data to access legal content. This "small government" Republican law creates a surveillance system that invades privacy without addressing the real online dangers kids face.

Starting September 29th, Ohio joins 24 other states requiring age verification for adult websites. This law, buried in our 3,156-page state budget, is a perfect example of performative politics that fails to address real problems while creating new ones.

This law doesn't protect children. Any tech-savvy kid knows how to use a VPN to bypass geographic restrictions. Meanwhile, the real dangers kids face online - predators on social media, cyberbullying, exposure to violence - go unaddressed. We're creating a false sense of security while ignoring actual threats.

The hypocrisy is stunning. Major platforms like Facebook, TikTok, Twitter, and Reddit - where pornographic content is readily accessible - get a pass. Cable and streaming providers? Exempt. But smaller platforms like Bluesky that compete with Big Tech? They're in the crosshairs. This isn't about protecting kids; it's about picking winners and losers.

"Small government" Republicans just mandated that Ohioans hand over their driver's licenses or other sensitive personal data to access legal content. They're creating a permanent record of what adults view in private. In an era of constant data breaches - including multiple recent hacks of Ohio government systems - we're now required to trust our most personal browsing habits to verification systems that will inevitably be compromised.

The same party that rails against government overreach just created a surveillance infrastructure that would make authoritarians proud. They claim the data will be deleted immediately after verification, but we've heard that promise before.

If we actually cared about protecting children online, we'd focus on digital literacy education, better parental tools, and holding social media companies accountable for the harm they cause. Instead, we get this - security theater that invades privacy, creates new risks, and does nothing to address the real problems.

Ohio Republicans have shown us once again that "small government" is just a slogan they use when convenient. When it comes to controlling what you do in your own home, Big Brother is alive and well in Ohio.

https://ngerakines.leaflet.pub/3lzluo35pis2p
Launching QuickDID - Fast, Open Handle Resolution for the AT Protocol
QuickDID is a high-performance, open source handle resolution service for the ATmosphere that serves as both public infrastructure at https://quickdid.smokesignal.tools and deployable software under MIT license. It offers flexible caching strategies (memory/Redis/SQLite), scales from single-instance to distributed deployments, and includes production features like rate limiting and proactive cache refresh. Currently a release candidate, it provides a drop-in alternative to Bluesky's resolver while giving developers full control over their handle resolution infrastructure.

Today I'm excited to share QuickDID, a high-performance handle resolution service for the ATmosphere that's now available as public infrastructure at https://quickdid.smokesignal.tools/. I built QuickDID as both a public service anyone can use right now and as open source infrastructure you can deploy yourself.

What QuickDID Does

At its core, QuickDID resolves ATProtocol handles to DIDs - the fundamental identity lookup that powers every interaction on Bluesky and other AT Protocol services. It implements the same XRPC endpoints as Bluesky Social PBC's resolver, making it a drop-in alternative that can handle everything from DNS TXT record lookups to HTTP well-known resolution.

QuickDID is just another resolver. It does one thing and one thing only. It's engineered for speed and flexibility, with intelligent caching strategies that can adapt to everything from a Raspberry Pi deployment to a distributed cluster handling millions of requests.

Built for Real Infrastructure Needs

QuickDID uses a multi-layer caching architecture that lets you choose the right storage for your deployment:

  • In-memory caching for blazing-fast responses (sub-millisecond when cached)

  • Redis-backed caching for distributed deployments and multi-instance scaling

  • SQLite-backed caching for single-instance deployments that need persistence without external dependencies

  • Binary serialization that reduces cache storage by ~40% compared to JSON

The proactive refresh feature keeps popular handles warm in the cache, preventing those annoying latency spikes when cache entries expire. And with configurable TTLs ranging from minutes to months, you can tune the freshness vs. performance trade-off for your specific needs.
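To make the idea of proactive refresh concrete, here is a small in-memory sketch. It is not QuickDID's implementation (QuickDID is written in Rust and its internals are not shown in this post); the class name, the `refresh_fraction` parameter, and the `pending` set are all hypothetical. Entries are served until the TTL expires, and any entry read late in its lifetime is queued so that a background pass can re-resolve it before it goes cold.

```python
import time
from typing import Callable, Dict, Set, Tuple


class RefreshingCache:
    """TTL cache that flags nearly expired entries for early refresh."""

    def __init__(self, resolve: Callable[[str], str], ttl: float,
                 refresh_fraction: float = 0.8,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._resolve = resolve
        self._ttl = ttl
        self._refresh_fraction = refresh_fraction
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}
        self.pending: Set[str] = set()  # handles due for proactive refresh

    def get(self, handle: str) -> str:
        now = self._clock()
        entry = self._entries.get(handle)
        if entry is not None:
            did, stored_at = entry
            age = now - stored_at
            if age < self._ttl:
                # Still valid; queue a refresh if it is getting old.
                if age >= self._ttl * self._refresh_fraction:
                    self.pending.add(handle)
                return did
        return self._store(handle, now)  # miss or expired: resolve inline

    def run_refresh(self) -> None:
        """Re-resolve queued handles; meant to run off the request path."""
        now = self._clock()
        for handle in list(self.pending):
            self._store(handle, now)

    def _store(self, handle: str, now: float) -> str:
        did = self._resolve(handle)
        self._entries[handle] = (did, now)
        self.pending.discard(handle)
        return did
```

The caller never waits on a refresh for a popular handle: reads keep hitting the cached value while `run_refresh` replaces it in the background.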

Deployment Flexibility

One thing I'm particularly proud of is how QuickDID scales both up and down. You can run it with just HTTP_EXTERNAL=localhost:3007 cargo run for local development, or configure it with Redis queues, StatsD metrics, and rate limiting for production deployments handling the entire ATmosphere.

The queue system is equally flexible - use in-memory MPSC for lightweight single-instance setups, Redis for distributed processing, or SQLite with automatic work shedding to prevent unbounded growth. Every component is optional and configurable through environment variables, following 12-factor app principles.

Performance That Matters

QuickDID includes features that matter in production:

  • Semaphore-based rate limiting to protect upstream DNS and HTTP services

  • HTTP caching with ETag support and configurable Cache-Control headers

  • Connection pooling for Redis to minimize overhead

  • Work shedding in SQLite queues to maintain bounded resource usage

  • Comprehensive StatsD metrics for monitoring resolution timing and cache performance
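The first item in this list, semaphore-based rate limiting, is a general pattern that is easy to sketch. The example below uses Python's `asyncio.Semaphore` purely for illustration and is not QuickDID's code; `resolve_one` is a hypothetical coroutine standing in for a DNS or HTTP lookup.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


async def resolve_all(handles: Iterable[str],
                      resolve_one: Callable[[str], Awaitable[str]],
                      max_concurrent: int) -> List[str]:
    """Resolve every handle, never running more than max_concurrent lookups at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(handle: str) -> str:
        # Each lookup must acquire a permit before touching the upstream.
        async with semaphore:
            return await resolve_one(handle)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(handle) for handle in handles))
```

The semaphore bounds the number of in-flight upstream requests regardless of how many callers arrive, which is what protects the DNS and HTTP services behind the resolver.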

Available Now

You can start using QuickDID today at https://quickdid.smokesignal.tools for your handle resolution needs. The endpoint is compatible with existing ATProtocol tooling - just point your resolver at it and go.
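As a rough illustration of what pointing a resolver at it can look like, the sketch below builds a request for the standard `com.atproto.identity.resolveHandle` XRPC method. It assumes QuickDID serves that method at the usual `/xrpc/` path and returns a JSON body with a `did` field, which follows from the compatibility claim above but is not confirmed here in detail; the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def resolve_handle_url(base_url: str, handle: str) -> str:
    """Build the resolveHandle URL for a given resolver base URL."""
    query = urlencode({"handle": handle})
    return f"{base_url.rstrip('/')}/xrpc/com.atproto.identity.resolveHandle?{query}"


def resolve_handle(base_url: str, handle: str, timeout: float = 10.0) -> str:
    """Perform the lookup over HTTP and return the DID from the JSON response."""
    with urlopen(resolve_handle_url(base_url, handle), timeout=timeout) as response:
        return json.load(response)["did"]
```

Calling `resolve_handle("https://quickdid.smokesignal.tools", "alice.bsky.social")` would perform a live network request, so it is left out of the example.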

The source code is available at https://tangled.sh/@smokesignal.events/quickdid under the MIT license. Deploy it yourself, contribute improvements, or just peek under the hood to see how it works.

A Note on Stability

QuickDID is currently a release candidate. While it includes comprehensive error handling and has been designed with production features in mind, it hasn't been fully battle-tested at scale. Please evaluate it thoroughly for your use case before deploying in critical environments. I'm actively looking for feedback and bug reports as we move toward a stable 1.0 release.

Infrastructure for Everyone

QuickDID represents something I believe strongly in: open infrastructure that anyone can use and everyone can contribute to. The ATProtocol ecosystem thrives when we have multiple implementations of critical services, reducing single points of failure and encouraging innovation.

Try it, deploy it, break it, fix it - and let me know what you think. The future of decentralized social infrastructure is built together.

https://ngerakines.leaflet.pub/3lyea5xnhhc2w
AT Protocol and SMTP: When Old Tech Powers New Identity
Exploring how AT Protocol and SMTP can work together to make secure messaging possible by combining 50-year-old email infrastructure with modern cryptographic identity. Building on Chris Boscolo's AT-SMS proposal, this post introduces ideas for adding SMTP services directly to DID documents and leveraging PDS-level cryptographic operations through XRPC methods. The result: verifiable, encrypted communication where messages work like signed JWTs over email, handles prove identity without centralized authorities, and users maintain complete control over their messaging infrastructure. A technical deep-dive into how "boring" technology like SMTP and DNS, combined with AT Protocol's identity primitives, could finally deliver truly portable, private, and permanent messaging.

Something clicked for me while reading Chris Boscolo's recent introduction to AT-SMS. Maybe we've been thinking about decentralized messaging wrong. Instead of trying to replace email, what if we just... used email? But not in the way you might think.

Introducing AT-SMS - Bosco Loco

Open End-to-End Encrypted Messaging on AT Protocol

Chris's proposal sparked some ideas that I think could expand on his foundation. While he's focused on using DNS MX records for routing, I see an opportunity to decouple SMTP services from domain ownership entirely by adding them directly to DID documents. Combined with an XRPC method for signing and encrypting content at the PDS level—something that aligns naturally with the attestation lexicon I've been working on—we could create something even more flexible and powerful.

Proposal: ATProtocol Attestation and Signature Lexicon

Hey everyone! I’m excited to present this proposal for a standardized attestation and signature specification for ATProtocol. This specification builds upon recent proof of concept work and our production implementation of badges in Smoke Signal, addressing fundamental challenges in establishing trust and authenticity for third-party attestations within the ATProtocol network. The core problem we’re solving here is straightforward but important: when records represent content or intentions invo...

The beauty of what emerges from this thinking isn't just about encrypted messaging. It's about something more fundamental: proving who you are in a way that's both cryptographically secure and refreshingly simple. When you send an email from one ATProtocol handle to another, you're essentially creating a JWT-like proof of identity, but one that travels over infrastructure that's been battle-tested for five decades.

The Identity-First Approach

Here's what makes this approach so compelling: when you add an SMTP service endpoint directly to your DID document, you're making a claim about where you can receive messages. This goes beyond Chris's DNS-based approach—you don't need to control a domain or manage MX records. Your DID document becomes the single source of truth for how to reach you, whether that's through a traditional email server, a specialized messaging service, or something else entirely.

This isn't just a technical detail—it's a consent mechanism. Only someone who controls the DID can update that document, which means if an SMTP endpoint is listed there, the owner has explicitly authorized it. No centralized authority needed, no complex key distribution problem to solve. The discovery process becomes straightforward: resolve the DID, find the SMTP service, send the message. No DNS lookups required unless the service itself chooses to use domain-based routing.
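The discovery step can be sketched as a lookup over the `service` array of a resolved DID document. Everything SMTP-specific below is an assumption made for illustration: no SMTP service type is standardized, so the `#atproto_smtp` id, the `AtprotoSmtpService` type, and the endpoint format are hypothetical. Only the PDS entry follows the shape used in existing atproto DID documents.

```python
from typing import Optional


def find_service_endpoint(did_document: dict, service_type: str) -> Optional[str]:
    """Return the endpoint of the first service entry with the given type."""
    for service in did_document.get("service", []):
        if service.get("type") == service_type:
            return service.get("serviceEndpoint")
    return None


example_did_document = {
    "id": "did:plc:alice123",
    "service": [
        {
            "id": "#atproto_pds",
            "type": "AtprotoPersonalDataServer",
            "serviceEndpoint": "https://pds.example.com",
        },
        {
            # Hypothetical entry: this service type is not part of any spec.
            "id": "#atproto_smtp",
            "type": "AtprotoSmtpService",
            "serviceEndpoint": "smtps://mail.example.com",
        },
    ],
}
```

A sender would resolve the recipient's DID, call `find_service_endpoint(doc, "AtprotoSmtpService")`, and treat a missing entry as the recipient not having opted in to this kind of delivery.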

Think about it this way: in traditional email, anyone can claim to be sending from any address. SPF, DKIM, and DMARC are all band-aids trying to fix this fundamental flaw. But with ATProtocol handles backed by DIDs, the sender can cryptographically sign their message, and the recipient knows with mathematical certainty who sent it. It's like having a notarized letter delivered by regular postal service—the transport doesn't need to be secure because the content itself carries proof of authenticity.

The PDS as a Cryptographic Service

Where this gets really interesting is when we consider the PDS not just as a data store, but as a cryptographic service provider. Imagine an XRPC method that any client can call to sign or encrypt content on behalf of the user. This is where the attestation framework I've been developing fits naturally into the architecture—and from my conversations with Chris, there's definitely a place for it in this system.

The attestation lexicon provides a standardized way for a PDS to sign statements on behalf of an identity. Applied to messaging, this means your PDS could sign outbound messages, proving they came from you without the client needing direct access to private keys. For encryption, the PDS could handle the key management complexity while still ensuring that only authorized devices can decrypt the actual message content.

This separation of concerns is powerful. Clients become simpler—they just need to call XRPC methods. The PDS handles the cryptographic heavy lifting. And users maintain control through their DID document, which specifies which PDS they trust with these operations.

The elegance extends to how devices get authorized. Chris's approach of storing X509 certificates in the PDS solves a thorny problem: how do you prove a device is authorized to receive messages for an identity without creating yet another centralized registry? If the certificate exists in your PDS, you put it there. You control it. You can revoke it. The authorization model becomes dead simple: presence equals consent.

Group Messaging Without the Coordination Dance

One of the most interesting implications of this architecture is how it handles group messaging. Traditional encrypted group chat systems tie themselves in knots trying to coordinate key distribution and rotation. But when each identity has its own cryptographic material accessible through their PDS, group messaging becomes almost trivial from a protocol perspective. You're not managing a shared group key; you're just sending individual encrypted copies to each recipient.

The attestation framework could make this even more efficient. Instead of the client encrypting for each recipient, it could make a single XRPC call to the PDS with the message and recipient list. The PDS handles the fan-out, signing the message once and encrypting it for each recipient's published certificates. This moves the computational burden to the server side where it can be optimized and parallelized.

This might seem inefficient at first glance—why encrypt the same message multiple times? But it eliminates entire categories of complexity. No key agreement protocols. No complex state machines for handling members joining and leaving. No coordination servers that become single points of failure. Each message is just a collection of individually encrypted payloads, each one provably from the sender and readable only by its intended recipient.

SMTP as a Universal Fallback

What really strikes me about using SMTP as the transport layer is its universality. Every domain can receive email. Every hosting provider supports it. Every corporate firewall allows it. By piggybacking on SMTP, AT-SMS gets instant global reach without having to bootstrap a new network.

But by putting SMTP services in the DID document rather than relying solely on DNS MX records, we gain even more flexibility. Your handle could be @alice.wonderland.social, but your SMTP endpoint could be smtp.somewhereelse.com. You're not tied to your handle's domain for message routing. This decoupling means you could change handle providers without changing your messaging infrastructure, or vice versa.
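
A rough sketch of that lookup, assuming the SMTP endpoint is published as an ordinary entry in the DID document's `service` array. The service type `AtprotoSmtpService`, the `#atproto_smtp` id, and the example document are all hypothetical; no such service type has been standardized.

```python
from typing import Optional

# Hypothetical service type; the proposal does not name one.
SMTP_SERVICE_TYPE = "AtprotoSmtpService"

# Illustrative DID document: the handle and the SMTP host live on different domains.
EXAMPLE_DID_DOCUMENT = {
    "id": "did:example:alice",
    "alsoKnownAs": ["at://alice.wonderland.social"],
    "service": [
        {
            "id": "#atproto_pds",
            "type": "AtprotoPersonalDataServer",
            "serviceEndpoint": "https://pds.example.com",
        },
        {
            "id": "#atproto_smtp",
            "type": SMTP_SERVICE_TYPE,
            "serviceEndpoint": "smtp://smtp.somewhereelse.com",
        },
    ],
}


def find_smtp_endpoint(did_document: dict, service_type: str = SMTP_SERVICE_TYPE) -> Optional[str]:
    """Return the first service endpoint of the given type, or None if absent."""
    for service in did_document.get("service", []):
        if service.get("type") == service_type:
            return service.get("serviceEndpoint")
    return None
```

Because routing comes from the DID document rather than from the handle's domain, changing the handle in `alsoKnownAs` leaves the result of `find_smtp_endpoint` unchanged.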

The implementation flexibility this provides is remarkable. Your SMTP service listed in your DID document could be a full email server supporting IMAP or POP. It could be a simple forwarder that routes to your existing email. It could be a specialized service that only handles ATProtocol messages. It could even be a bridge to other messaging protocols. The protocol doesn't care—it just needs somewhere to deliver encrypted blobs.

This also means the system is inherently resilient. Even if every specialized ATProtocol messaging server disappeared tomorrow, the messages would still flow through regular email infrastructure. They'd be unreadable to anyone except the intended recipients, but they'd get delivered. That's the kind of architectural resilience that only comes from building on proven foundations.

Building on Boring Technology

There's a principle in software engineering: build on boring technology. DNS is boring. SMTP is boring. X.509 certificates are boring. XRPC is becoming boring (in the best way). And that's precisely why this approach is so powerful. These aren't exciting new protocols that might have undiscovered flaws. They're technologies that have been hammered on for decades, with well-understood properties and failure modes.

The addition of SMTP services to DID documents and PDS-level cryptographic operations through XRPC doesn't add complexity; it removes it. Instead of managing MX records in DNS, you update your DID document. Instead of handling keys in every client, you make XRPC calls. The boring technology stack gets even more boring, which is exactly what you want.

Chris's choice to avoid backward compatibility with legacy email encryption is particularly smart. S/MIME and PGP carry decades of baggage, from key discovery problems to usability nightmares. By using SMTP purely as transport for already-encrypted ATProtocol messages, we get the infrastructure benefits without the complexity debt.

What This Means for the Future

What excites me most about this approach is how it could transform our relationship with messaging. Today, every app is an island. Your WhatsApp messages, your Discord conversations, your Slack threads—they're all trapped in their respective silos. But if messaging becomes a protocol-level capability tied to your portable identity, with cryptographic operations handled by your PDS and routing information in your DID document, suddenly every app can be a messaging app.

Imagine opening your favorite ATProtocol client and having all your messages there, regardless of which app the sender used. Imagine switching clients without losing a single conversation. Imagine knowing that your messages are truly yours, not held hostage by whatever company happens to run the server.

The attestation framework adds another layer to this vision. Your PDS becomes not just a data store but a cryptographic agent acting on your behalf. It can sign statements, encrypt messages, and verify claims—all through standardized XRPC methods that any client can use. This isn't just about messaging; it's about building a foundation for all kinds of cryptographic operations in a decentralized identity system.

The Path Forward

Chris's AT-SMS prototype, running on Cloudflare Workers, proves this isn't just theoretical. The foundations are solid, and the additions I'm proposing—DID document SMTP services and PDS-level cryptographic operations—build naturally on what he's created. Version 0 achieves feature parity with existing DM systems while adding end-to-end encryption. The path forward is clear.

For developers, the implications are profound. Adding secure messaging to your ATProtocol app becomes as simple as resolving a DID to find the SMTP endpoint and calling XRPC methods for cryptographic operations. You don't need to run messaging infrastructure. You don't need to solve key distribution. You just need to speak the protocol.

The convergence of Chris's practical SMTP transport approach with the attestation and cryptographic frameworks being developed for ATProtocol creates something more than the sum of its parts. It's not just secure messaging—it's a model for how decentralized identity systems can provide real, usable services without sacrificing user control.

The decision to open source everything from day one is crucial. This isn't a company trying to build a moat; it's a community trying to build infrastructure. The code is there to examine, critique, and improve. The specifications are open to implement. The architecture is decentralized by design.

We're watching the early stages of something that could fundamentally reshape how we think about digital communication. Not by replacing email, but by making it the secure, verifiable transport layer it always should have been. Not by building new walled gardens, but by tearing down the walls entirely.

The future of messaging isn't about picking the right app. It's about making every app speak the same secure, open protocol. And building it on top of SMTP—boring, reliable, universal SMTP—combined with the identity and cryptographic primitives of ATProtocol, might just be the key to making it work.

https://ngerakines.leaflet.pub/3lxxk3oahzc2f
If-This-Then-AT Blueprint: Payment-Proof Access Gates

Another if-this-then-at blueprint I've been exploring:

IF an identity has a cryptographically-signed payment proof

THEN unlock access to confidential data services

The Blueprint Pattern

This blueprint combines two powerful ATProtocol patterns:

  • Signed attestations (proof of payment/ticket purchase)

  • Confidential record hydration (privacy-aware data serving)

The result: trustless payment gates for protected information.

Real Implementation

Here's how this works with the smokesignal.events infrastructure for the OAuth Masterclass.

When someone purchases a ticket through Acudo, they get a signed RSVP that serves as cryptographic proof of payment:

{
  "$type": "community.lexicon.calendar.rsvp",
  "createdAt": "2025-08-20T20:50:14.152Z",
  "signatures": [
    {
      "$type": "community.lexicon.attestation.signature",
      "issuedAt": "2025-08-20T20:50:14.152Z",
      "issuer": "did:web:oauth-masterclass.atproto.camp",
      "signature": {
        "$bytes": "aGVsbG8gTW9uIFNlcCAgMSAxNDoxODozNiBFRFQgMjAyNQo"
      }
    }
  ],
  "status": "community.lexicon.calendar.rsvp#going",
  "subject": {
    "cid": "bafyreiad2w4nabfqf6hs2vjsju64qhjjtr7yyfqig6szkij2cyfnuzkoi4",
    "uri": "at://did:plc:cbkjy5n7bk3ax2wplmtjofq2/community.lexicon.calendar.event/3luzkrwivzm2a"
  }
}

This signed RSVP becomes the authorization token. When a client requests confidential event details, the service can verify:

  • The signature is valid

  • The issuer is trusted

  • The payment proof matches the requested resource
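
The three checks above can be sketched as a single gate function. This is a sketch under assumptions: `verify_signature` is a hypothetical callable that the service supplies, because the signature encoding is not described here, and the function and parameter names are illustrative only.

```python
from typing import Callable, Iterable

# Hypothetical verifier supplied by the service: (record, signature entry) -> bool.
VerifyFn = Callable[[dict, dict], bool]


def rsvp_grants_access(
    rsvp: dict,
    event_uri: str,
    event_cid: str,
    trusted_issuers: Iterable[str],
    verify_signature: VerifyFn,
) -> bool:
    """Return True only if the RSVP is a valid, trusted proof for this exact event."""
    # The payment proof must match the requested resource.
    subject = rsvp.get("subject", {})
    if subject.get("uri") != event_uri or subject.get("cid") != event_cid:
        return False

    # At least one signature must come from a trusted issuer and verify.
    trusted = set(trusted_issuers)
    for entry in rsvp.get("signatures", []):
        if entry.get("issuer") in trusted and verify_signature(rsvp, entry):
            return True
    return False
```

Checking the subject first means the signature verifier is never invoked for an RSVP that points at a different event.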

Confidential Data Unlock

Event records can reference confidential data sources:

{
  "locations": [{
    "locality": "Dayton",
    "region": "OH",
    "source": {
      "service": "did:web:locations.smokesignal.events#EventLocationProvider"
    }
  }]
}

IF you have a valid signed RSVP

THEN locations.smokesignal.events serves the full venue details
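
As a sketch of how a client might follow that `source.service` reference: split it into a DID and a service fragment, fetch the did:web document, and look up the matching service entry. The helper names are hypothetical, and the shape of the service entry in the resolved document is an assumption.

```python
from typing import Optional, Tuple
from urllib.parse import unquote


def split_service_ref(ref: str) -> Tuple[str, str]:
    """Split 'did:web:host#Fragment' into ('did:web:host', 'Fragment')."""
    did, _, fragment = ref.partition("#")
    return did, fragment


def did_web_document_url(did: str) -> str:
    """Map a did:web identifier to the HTTPS URL of its DID document."""
    prefix = "did:web:"
    if not did.startswith(prefix):
        raise ValueError(f"not a did:web identifier: {did}")
    host, *path = did[len(prefix):].split(":")
    host = unquote(host)  # a port, if any, is percent-encoded as %3A
    if path:
        return f"https://{host}/{'/'.join(path)}/did.json"
    return f"https://{host}/.well-known/did.json"


def find_service_endpoint(did_document: dict, did: str, fragment: str) -> Optional[str]:
    """Find a service by id, accepting both relative ('#x') and absolute ('did#x') ids."""
    wanted = {f"#{fragment}", f"{did}#{fragment}"}
    for service in did_document.get("service", []):
        if service.get("id") in wanted:
            return service.get("serviceEndpoint")
    return None
```

For the record above, `did_web_document_url("did:web:locations.smokesignal.events")` gives `https://locations.smokesignal.events/.well-known/did.json`; the request to the resolved endpoint would then carry the signed RSVP.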

Blueprint Applications

This pattern enables:

  • Premium content access: Blog posts, videos, documents behind payment gates

  • Exclusive community features: Private chat rooms, special badges, member directories

  • Tiered information disclosure: Basic/premium/VIP content levels based on payment amount

  • Time-bound access: Content that unlocks based on payment date or event proximity
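
The time-bound case, for example, could key off the `issuedAt` timestamp in the signature. This is only a sketch: the access window and the function names are hypothetical, and a real service would check the signature before trusting the timestamp.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as '2025-08-20T20:50:14.152Z' into an aware datetime."""
    # Replace the trailing 'Z' so older Python versions of fromisoformat accept it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def access_is_open(issued_at: str, now: datetime, window: timedelta) -> bool:
    """Access is open from the moment of payment until the window elapses."""
    issued = parse_timestamp(issued_at)
    return issued <= now < issued + window
```

With a 30-day window, a proof issued at `2025-08-20T20:50:14.152Z` would unlock content on 25 August 2025 but no longer on 1 October 2025.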

Why This Blueprint Works

The key here is composability. The signed payment proof works across any service that understands the attestation format. Your ticket for one event could potentially unlock related content, community access, or partner benefits—all without centralized auth systems.

The if-this-then-at approach makes the decentralized social web more economically viable by creating verifiable, portable payment relationships between identities and services.

https://ngerakines.leaflet.pub/3lxsaj6wbjk2v