GeistHaus
log in · sign up

Kobi's blog

Part of leaflet.pub

stories
Privacy Pass + x402 = blinding for x402
Show full content
Brief Intro to Privacy Pass

The Privacy Pass protocol has been out there for a long time. It's simple to implement and deploy, and provides a straightforward way for services to provide safe private credits to their users.

In short, Privacy Pass is a protocol between a client and a server, where the server, after authenticating the client, issues credits that the client can redeem to use a service later. The special property it provides is that the credits are blinded, meaning that after the credits are issued to the client, the service can't know who's redeeming them.

This is suitable in cases where the check that is being made is homogenous among usages and state isn't required. Examples include:

  • CAPTCHA - instead of showing you're a human to multiple websites, making the user experience cumbersome, you get multiple "humanity credits" that you can use

  • Privacy-respecting web search

  • Memory-less ChatGPT and Sora

  • Age verification

Privacy Pass itself, along with a few extensions of it, has been standardized by the IETF, a community that is responsible for standardizing important protocols that are used throughout the world. It's been deployed in practice by Cloudflare, who did a lot of the early work on it, and by Apple, in the form of Private Access Tokens, which is now native in iOS devices. Both have mainly focused on the CAPTCHA use case, and they're not at all limited to it. 

It even has an HTTP authentication scheme that allows you to introduce it relatively seamlessly into existing systems by augmenting the HTTP server to support the PrivateToken variant.

It's important to note that Privacy Pass does not allow malicious actors to thrive, in contrast to privacy related protocols. The issuer has full discretion on who to issue credits to and the origin has all the information related to the action being performed, just without a direct link to the original identity of the client.

x402

x402 is an emerging standard for, in the authors' own words, internet-native payments. It allows services to gate access by requiring payments. The novelty is that it's done in a protocoled and automated way. It's implemented on top of HTTP and uses the "402 payment required" HTTP code. 

When a client tries to access such a service, they're presented with a payment request and supported payment methods. The client makes a payment and responds with their payment authorization. Once the payment settles, the client is allowed access.

With the advent of digital money, and especially fast-settling crypto payments, it's possible to imagine services can charge per micro-usage, rather than charging for large subscriptions. This is even more important when you have AI agents working with you, using MANY sources of data at a speed you can't keep up with. They're going to consume more content and it's not surprising services will want to charge for it, and the current method of the humans having to manually obtain API keys for every service isn't tenable.

Some examples from the x402 white paper:

  • Agents accessing premium news articles for research, paying per article

  • Agents paying for GPU costs, charging per minute used

  • Agents consuming video streaming, charging per second costs

  • Agents accessing legal court rulings, charging per court ruling

You got to admit, these all sounds pretty homogenous and uniform... how about we try to compose these two protocols together?

Privacy Pass, a bit deeper

Before we go further into it, let's look at some Privacy Pass terminology.

We have:

  • Client - wants to use a service

  • Origin - the service that requires a client to authenticate, presenting the client with a challenge

  • Attester - verifies that the client completed a challenge

  • Issuer - once a client completes a challenge, issues blinded credits to the client

Now let's think what happens in the following case - the client purchases from origin $10 worth of credits by sending a stablecoin payment, presenting the transaction hash and a proof of ownership of the address.

The math!

You can ignore this section if you don't want to see the math, it's not really necessary. If you're curious, feel free to stay!

Let's look at a simplified case where the Origin, Attester and Issuer are joint, and the client is looking to buy some AI image generation credits.

We will discuss the elliptic curve variant, although the one that is mostly used today is the RSA one. The protocol goes as follows:

1. The origin sends the client a challenge that identifies a specific context where the credits should be valid. E.g. which issuer can be used and during what time window. It's an opaque string that's hashed to a group element, together with a unique nonce chosen by the client.

T=H(challenge,nonce)

2. The client chooses a random field element b, blinds the credit and sends it to the issuer, together with the requirement the issuer wants satisfied - e.g. ID card to prove age > 18.

B=bT

3. The issuer signs it by using their secret key s and sends the signed blinded credit to the client. The issuer further sends a proof that the credit was signed using the same secret key that they use for everyone else. It's a discrete log equality proof, or DLEQ, showing that the discrete log is equal between the signed credit and unsigned credit, and the public key and the group's base point.

C=sBπ=DLEQ((B,C),(pk,G))

4. The client verifies the proofs, unblinds the credit and has it ready to use!

D=b−1C

5. When the client wants to redeem a credit towards an origin, they send them D and their nonce. The origin goes ahead and uses the same context to get the same T, and verifies that the token is indeed signed using s. This is possible in this case since the issuer and the origin are the same. In the RSA variant it's possible to do it even when they're separate. The origin does the following check:

D=?sH(challenge,nonce)

Following this check, the origin is now assured that the client has a valid signed token and can provide the service. 

6. The origin maintains a list of used credits, so they can't be redeemed twice.

Note that we discussed a single token issuance, and it's possible to issue many of these together efficiently, which is often what you'd want.

The HTTP scheme

Now let's see how this can be done as an HTTP authentication scheme.

The WWW-Authenticate header is used to request clients to authenticate to a server, specifying which authentication scheme to use.

In the Privacy Pass case, it looks like this:

WWW-Authenticate: 
    PrivateToken challenge="abc...", token-key="123..."

This specifies that the scheme is PrivateToken, meaning Privacy Pass, the challenge, as above, and the token-key, which is the issuer's public key.

The client goes ahead and runs the protocol described above, and then responds to the server with the usual Authorization header.

Authorization:
    PrivateToken token="abc..."

Essentially, all that we've done to make it usable as an HTTP scheme is specifying that the delivery mechanism of the challenge and credit is through HTTP headers.

Side note - extensions

Privacy pass and, more generally, Anonymous Tokens have been extended in a few different ways, some of them are being standardized as well. A few examples:

x402, in a nutshell

Let's now talk about how x402 works in a very high level. It uses the existing 402 Payment Required HTTP response code, and adds structure on top of it. Specifically, it allows services to specify allowed payment methods. It could be blockchain-based, like stablecoins, and could be anything else that is verifiable. This is an example from the whitepaper:

{
    "maxAmountRequired": "0.10",
    "resource": "/api/market-data",
    "description": "Access to real-time market data requires payment.",
    "payTo": "0xABCDEF1234567890ABCDEF1234567890ABCDEF12",
    "asset": "0xA0b86991C6218b36c1d19D4a2e9Eb0cE3606EB48",
    "network": "ethereum-mainnet"
}

The client goes ahead, prepares the payment and makes another request, this time with an additional X-PAYMENTheader, containing the payment details. E.g. the signed transaction ready for submission. 

The server uses the help of a Facilitator to settle the payment - e.g. submit the signed transaction and wait for it to be confirmed. Once it is confirmed, the server responds with a X-PAYMENT RESPONSE header, signaling to the client they can move forward and use the service.

Tying these two together - blind x402! 

One way to do it would be to run these two protocols concurrently.

This would mean, for example:

  • A client wants to use a service, so they try to figure out how to access it. They receive a 402 Payment Required status.

  • The client goes ahead and prepares the payment, and in addition it does the following: it estimates it use for the next month of this service and prepares to buy a bulk. It prepares a bunch of blinded credits representing single uses of the service.

  • The client submits this through the X-PAYMENT header.

  • The service uses a facilitator that supports Privacy Pass, which in addition to settling the transaction also issues Privacy Pass credits. 

  • The service responds to the client and sends the blinded credits through the X-PAYMENT-RESPONSE header.

  • The client can then unblind the credits and use them later. 

  • When the service receives a credit to be redeemed, they would run the protocol again, through a process where the facilitator checks whether the credit was used and maintains the list of used credits, perhaps on-chain.

Nice and simple, isn't it?

Who can use this?

This would be natural to use in many of the cases we mentioned in the beginning. It could be agents that need to access data from sources they frequent (e.g. a legal database), or can at least access through an aggregator. Agents can now use services in a way that doesn't expose their identity directly to services and other snoopers, preventing attacks on them such as unfair censorship or data manipulation, while still being fully compliant.

What did we get out of it? We retain the advantage of machine-to-machine interactions, fast settlement and a structured and extensible protocol. The main downside is that we went back a bit on the flexibility that x402 gave us - first and foremost, micro-uses and granularly-pricing of services that clients can consume immediately. 

One really interesting benefit is the possibility of offchain payments - the used credit list doesn’t have to be updated on chain, at least not immediately. The facilitator can contact a centralized service run by the issuer in the interim, while maintaining the single-use nature.

It's not obvious to solve it within this model, and I hope it will give someone some food for thought! 

If you have ideas to bounce, building around this topic or have questions, I'd love to hear from you :) Reach out to me on email.

Acknowledgements

Thanks to Sam who suggested to discuss offchain payments and Andrija for reviewing.

https://kobi.leaflet.pub/3m3pyyctda22i
ZODA FAQ for engineers
An engineer's set of questions when using ZODA
Show full content
Why is the Tensor variation more practical to implement?

In the tensor variant, as described in Accidental Computer, you do a tensor encoding, which practically means encoding the columns first with G and then encoding the encoded rows with G’, which can be done by using encoders as blackbox, and therefore as a drop-in replacement. Specifically, it looks like:

Z=GX~G′T

which, when encoding, is done by:

EncodeG′​(EncodeG​(X~)T)T

You then commit to the rows and the columns of that matrix, receive randomness, and then take random linear combinations of the rows and the columns of the original data, as part of what you send in the proof. In the paper it looks like this:

yr​=X~gˉ​r​wr′​=X~Tgˉ​r′′​

which translates to:

RLCr​(X~)RLCr’​(X~T)

where the RLC is done on the columns.

A big benefit of the tensor variation is that you sample rows and columns from an encoding of the original matrix, which means you only sample elements from the base field. In vanilla ZODA, you assume that you work over a big enough field to sample randomness from, which makes zero-overhead. Even if you were to only sample randomness from am extension field, and otherwise use a smaller base field, one of the matrices you sample from has randomness embedded within it which is sampled from an extension field, making the matrix have extension field elements. This implies the samples are much larger in the extension field variant of ZODA.

What does it mean to receive a row/column/entry?

It means that the when full node sends you the rows, columns and other elements, these refer to existing commitments. This, in turn, means that you also receive opening proofs of these commitments to verify that the sent data is valid. Commonly, you would use Merkle trees and proofs for this.

Why would wrapping the ZODA sampling algorithm in a succinct proof result in less sampling per-node?

The ZODA guarantees you get are unique encoding, correct encoding and adversarial reconstruction. Unlike the other guarantees, to ensure correct encoding, you need to do sampling over the rows and columns. This means that if you guarantee correct encoding through running the sampler in a succinct proof, you remove that need.

What does it mean that the Hadamard variant doesn’t support distributed reconstruction?

A really nice property to have is local decoding, meaning that you can decode specific elements in the matrix in sublinear space - importantly, without decoding the whole matrix. Fundamentally, it's because Hadamard ZODA is not a tensor encoding, and the right matrix is a small amount of random columns. Essentially, only the columns are encoded. For the ZODA proof, if you encode the columns, you have to sample over the rows, and only the rows are committed to. This suggests that a client that wants to access an element that they’re missing, when no one else would give them an entire row containing it, would have to essentially decode the column containing it to recover the element. And the only way to get a column, since columns aren’t committed to, would be to get enough rows to reconstruct the entire matrix. In other variants, you have commitments for both rows and columns and you can easily access specific elements. This is where the notion of distributed reconstruction comes in - clients can reconstruct missing data bit by bit, without having to collect the entire data matrix themselves.

Side note - even in the Hadamard variant, if you were given the entire row containing the element, you could verify it’s good using the ZODA proof, as you do for any sampled row.

https://kobi.leaflet.pub/3m2ch5qko2c2k
Signature schemes you've never heard about
An exploration of some interesting signature schemes!
Show full content

Most of you have heard about ECDSA and Schnorr signatures. A bunch of you have seen BLS. A few brave ones have also seen BBS. But we're not going to talk about these today...

I've stumbled upon a couple of very interesting signature schemes through some topic I'm looking into with Guillermo. Both of them are a bit unusual in the sense that they're useful in niche areas, but when you need them - they're extremely useful.

Let's look at them.

Locally verifiable signatures

First, let's recap aggregate signature schemes, through the lens of BLS signatures. Aggregate signature schemes allow you to compress signatures on distinct messages into a short signature. BLS signatures look roughly as follows:

σ=sk⋅H(m)pk=sk⋅G

where G is a group generator, sk is the private key, pk is the public key and H is hash to group, so H(m) is a group element. Using the properties of the group, you can create an aggregate BLS signature by adding the messages together:

σ=σ1​+…+σn​= sk⋅(H(m1​)+…H(mn​))

Nice and easy. When the verifier encounters this aggregate signature, they would hash all the messages themselves and do the following check:

e(σ,G)=e(H(m1​)+…+H(mn​),pk)

so the pairing essentially checks that sk is consistent between the signature and the public key combined with the set of messages.

The key thing to see here is that the verifier needs to hash all the messages, as there's no way to run the pairing equation without that. 

What if we only want to verify only one message and only possess the aggregate signature?

That's where locally verifiable signatures come in. First introduced in the paper by Goyal and Vaikuntanathan, locally verifiable signatures allow you to simultaneously achieve:

  • Aggregate short signature on multiple messages, especially when the signatures have been generated separately of each other

  • Verifying the signature for a single message is efficient

The authors present two constructions - one based on RSA and one based on pairings.

RSA-based locally verifiable signatures

In the RSA case, a signature looks like:

σ=gH(m)1​

where g is the RSA group generator and H is a specific hash function defined in the paper. Note that only the person that knows the RSA primes used to generate the group can compute the signature, since performing an inversion on H(m) requires knowing order of the group. The primes serve as the secret key. An aggregate signature is:

emi​​:=H(mi​)σ=∏σi​=∏gemi​​1​

It's not immediately obvious why this is locally verifiable. To see that, note that an actor called a hint generator can generate the following hints to help with local verification:

em\mi​​=j=i∏​emj​​fi​=j=i∑​k∈/{i,j}∏​emk​​

These, in turn, can be used to compute:

gemi​​1​

This seems a bit weird, since it's basically just the original signature! This works due to a cool observation the authors made, where given the aggregate signature and the hints above, the hint generator can extract the original signature without knowing the secret key.

Pairing-based locally verifiable signature

The construction is similar in spirit. A signature looks like:

α+m1​G

where ⍺ is the secret key. In this case, we know the order of the group, but we rely on the BDHI assumption, which roughly says that given:

g,αG,…, αnG

it's hard to compute the inverse of ⍺ in the exponent, even in the target group of the pairing:

α1​e(G,G)

During the setup, the public key includes powers of ⍺ in the exponent, up until the maximum amount of messages in an aggregate signature. Aggregating signatures is done by computing:

σ=∏i​α+mi​1​G

It's shown in the paper how to do this without knowing ⍺, using some Lagrange coefficient and partial fraction decomposition tricks.

In order to help the verifier, the following hints can be generated by the hint generator:

aux1​=i∑​βi​(αiG)aux2​=i∑​βi​(αi+1G)

where the β coefficients are such that:

j,j=i∏​(y+mj​)=j∑​βj​yj

The local verification equation is:

e(σ,mi​aux1​+aux2​)=e(G,G)e(αG, aux1​)=e(G,aux2​)

The first equation essentially checks that the signature contains the inverse of the product without the relevant expression of the message you want to locally verify. It looks like this:

mi​aux1​+aux2​= mi​(i∑​βi​(αiG))+i∑​βi​(αi+1G)= ((α+mi​)j∑​βj​αj)G=​(α+mi​)j,j=i∏​(α+mj​)​G= j∏​(α+mj​)G=σRelation to accumulators

Accumulators are a different primitive that allows you to efficiently get a compact representation of a set of elements and efficient membership proofs of individual elements in the set. Sometimes you can dynamically add, remove and even get non-membership proofs. That does feel similar, doesn't it?

Some readers may notice that the pairing-based construction is almost identical in mechanics to the pairing-based accumulator by Nguyen. The main difference is that in the accumulator ⍺ is secret to everyone, and everyone can add elements into the accumulator, while in the pairing-based signature ⍺ is known to the signer.

In the single-signer cases we've reviewed above, you can also create locally verifiable signatures by:

  • Accumulating messages into the accumulator

  • Signing the accumulator

  • Verifying membership proofs and the signature on the accumulator

The main difference is that you don't necessarily get aggregate signatures that you can aggregate non-interactively, so the set of messages has to be known ahead of time.

This is also true for concepts like vector commitments.

Signatures on linear subspaces

Another interesting signature scheme is the one by Boneh, Freeman, Katz and Waters, which allows you to sign a linear subspace. This means that given a set of basis vectors that are signed using this scheme, the signature would verify on any linear combination of them. This is wonderful, since this allows you to verify signatures on linear combinations you are sometimes required to do, without control of the coefficients of the vectors. But it comes at a cost. Let's see.

Signatures on individual vectors v look like:

σ=α(i∑​vi​H(id,i))

where id is some unique identifier and ⍺ is the secret key. You may notice this looks familiar. This is because it's an aggregate BLS signature, with the messages being (id, i) and aggregation coefficients being the elements of v, where previously we just used 1s. As such, verification of signatures on individual vectors looks like aggregate BLS verification.

You can guess where this is going - given multiple vectors for the same id, the messages on the matching positions would be the same and provide the separation between the elements of the vector, and so you can run the linear combinations in the exponent.

The really nice property about it is that given the signature on the subspace, you, or someone providing you with a hint, can run the linear combination on the signatures and get a short signature on the vector.

One other result they present in the paper is that there's a minimum signature size on the subspace, proportional to the dimension of the subspace. Note that this is the subspace signature, not the final vector, which can be short. The intuition is that if the signature is any less, you would have too many subspaces with the same signatures, and as a consequence the signature would eventually cover the entire space and be trivial.

Open questions

These left me with some other desired properties that I'd like to see:

  • Signing linear subspaces but different signers for each vector spanning the subspace

  • Aggregating existing aggregate locally verifiable signatures from different signers 

  • Accumulators of accumulators - short witnesses to show multiple accumulators have been merged together

  • Efficient verification for aggregate signatures that are both multi signer and multi message

https://kobi.leaflet.pub/3lxfc6khits27
Verifiable Verifications
Show full content

Bluesky recently introduced a new form of verification. It’s a social version of the familiar notion of a blue check, in a way that is architecturally aligned with Bluesky and its underlying protocol ATProto. This allows notable accounts and accounts that are related to trusted organizations to be verified as authentic.

In order to support that, the Bluesky app has a list of a few trusted verifiers - the Bluesky team itself, NY times, Wired and The Athletic. Each of these organizations can attest that another user is authentic. They do so by adding records on their own Personal Data Server (PDS), which stores all the records related to their activity in Bluesky and potentially a wider ATProto ecosystem, the Atmosphere. Steve Klabnik has a good post that describes how it fits within the protocol design considerations. This form of attestation is simple and decentralized in spirit - while the official Bluesky has this initial list of trusted verifiers, other views of the same social Bluesky records can ignore these or trust other verifiers.

In this short post, I’d liked to explore potential ways this mechanism can be extended to more forms of verification, and especially those that are automated and verifiable - verifications that contain non-interactive cryptographic evidence rather than just attestations.

Recap on current verification

As Steve describes in his post, verification is done by the attester adding a record of the following form to their PDS:

The record of type app.bsky.graph.verification describes the verified identity, including some other details like when they were verified, what’s their handle and their name.

Design goals

A couple of very simple ones:

  • The new form of verification should be easy to, at least partially, integrate into apps that support the existing form

  • Apps that would like to perform full verification have all the cryptographic evidence necessary to do so in a way that is compatible with ATProto

Non-interactive verification using ZK Email

As a concrete example, let’s look at ZK Email. It’s an open source project that uses zero knowledge proofs about the content of emails. This is possible since modern email infrastructure uses DKIM (DomainKeys Identified Mail), where sent emails contains signatures on the headers and body.

This allows us to create some fun examples of verified sets of users:

The above all contain links of examples from the ZK Email Registry of how you can use regex on the email contents to achieve these properties. The nice thing about this method is that you don’t have to expose the entire email. Because the proofs are zero-knowledge, you can just prove that the email satisfies the desired property without revealing the rest of it, containing potentially sensitive data. Note that the examples above require you to associate your email with your Bluesky identity somehow.

First method - new schema

A design that ignores the first requirement is just adding new records. We could think of something of the following form:

The record is of a new type that identifies it as a ZK Email verification, the blueprint of what kind of data is used and extracted from the email in the proof, similar details to the current form of verification and, finally, a proof.

This answers point 2, but requires app views to adopt both a new schema and a new, relatively complex, method to verify the attestations.

It would be cool if we can somehow combine the existing approach with a new one while moving in the trust tradeoff space.

Second method - optimistic verification

Another approach we could take is the following:

  • Use a new schema as in the first method. Let’s call this a ZK attestation.

  • Additionally, create another attestation as in the existing verification mechanism. Let’s call it a signature attestation.

This would mean that you have two kinds of attestations for the same identity. The advantage of going in this direction is that apps that want a lightweight version of the attestation can use the existing signature-based attestation for verification, and other apps can do full verification.

The challenge is that there can be disagreements between the two kinds if the attester is compromised. You could just trust the attester that they verified the ZK attestation and only created a signature attestation. Another way to solve it is to take a page from the dispute window playbook - when a signature attestation is added, you don’t accept it immediately. Instead, you wait a few days to deem it valid. If there are disagreements between the methods, on any attestation for any user, the attester will be considered compromised by the community and the app developers will take action to remove it. Otherwise, if the days have passed and it hasn’t happened, you’re good.

This creates a new challenge - you now have to trust the timestamps in the signature attestations, where previously they were arbitrary. This is a non-trivial problem, but it’s at least much more contained. This is what’s done for credit card disputes, for example. It’s also a common mechanism in the blockchain space, where states of some chains (specifically, optimistic rollups) are deemed valid only after some time has passed.

There are a few approaches for time stamping that are possible:

  • Have a generic timestamping attester that can service many different applications, not just signature attestation. This can be treated like a trusted verifier.

  • Do the same with a few entities, to increase the trust in the mechanism. When doing this, instead of agreeing on a very precise timestamp, you can agree on a day.

  • Use an existing blockchain that supports easy verification (also known as light clients) and submit data there. The timestamp the transaction was included in the chain will be the one you will start counting from.

This requires the app to integrate one of this three.

Third method - TEE-powered attestation

Another point in the space to explore is Trusted Execution Environments. A TEE consists of hardware and software that guarantee what code has been run in the machine. Given that, we could create the following TEE program:

  • When first booted, it generates a signature key pair. It never exposes the private key and doesn’t provide an API to retrieve it. This will be the key pair that is used for signature attestations.

  • The program has an API method to create a signature attestation given a ZK attestation. It only returns a signature attestation if the ZK attestation has been verified successfully.

  • The TEE security mechanism provides you with the evidence necessary to know that this is the structure of the program.

Since the private key is never exposed, this enhances our security, removes the latency introduced in the second method and doesn’t require any further interaction with other services, like a timestamping service. It does require you to verify the evidence at least once, so that you know the correct program and key pair are used. Since this is an infrequent operation, it can be done by several members of the community and then the signature key pair can be included in the app for verification.

The TEE method can be used in different places - a timestamping service or API-checking service for other kinds of verification, for example.

This is really nice but not all is roses, since TEEs are hard to build and there has been breaches in the past. That said, this is an area that is continuously being worked on and is deployed in practice. Also, if you trust the attester’s server, you don’t need the extra security guarantees.

Side note - there are also cryptographic methods that theoretically can achieve similar guarantees, but they’re not practical yet. One of those is called indistinguishability obfuscation (trying saying that three time quickly, or even write it tbh).

Epilogue

I’m really not sure what is the right method to use or even if it’s one of the above. I also don’t have a concrete proposal, there’s a lot more work to define the precise details here 🙂 Just wanted to share some ideas here of different areas to explore, and mention some technologies that are less commonly used in these contexts.

https://kobi.leaflet.pub/3lpruvjqlhs22