Private Relay Connections: Zero-Knowledge Solutions for Nostr

Nostr relays see everything - who connects, what they fetch, how often they post. Zero-knowledge cryptography can fix all three problems: Semaphore-based authentication hides which whitelisted user is connecting, private information retrieval hides which notes you're fetching, and Privacy Pass enables rate limiting without identity linkage.

Every Nostr relay operates as a surveillance point by architecture, regardless of intent. The relay sees your IP address when you connect, records which pubkeys you subscribe to, logs every note you request, and accumulates your posting frequency, online hours, and social graph.

NIP-42 authentication makes this worse. Paid relays and private communities need access control, so you sign a challenge proving your identity. Now the relay has cryptographic proof linking your IP to your npub.

The metadata problem is structural. Your content is signed and optionally encrypted, but your behavior is exposed: a relay operator, or anyone compelling them, can reconstruct your communication patterns from connection logs alone.

Zero-knowledge proofs offer a path out: prove membership in the authorized set while keeping your specific identity hidden from the relay.

The authentication problem

A relay maintains a whitelist: these 500 npubs are allowed to connect. Current NIP-42 requires you to prove you’re npub X, handing the relay your exact identity.

The proof you need is weaker: “I’m someone on your whitelist.” ZK makes that possible while keeping your specific position in the list hidden.

Semaphore (developed by PSE/Ethereum Foundation, battle-tested by Worldcoin) implements exactly this pattern:

  1. The relay publishes a Merkle tree root of authorized pubkeys
  2. You hold a Merkle proof showing your pubkey is a leaf in the tree
  3. You generate a ZK proof: “I know a secret key sk, and the corresponding pubkey is in this Merkle tree”
  4. The relay verifies the proof and grants access, blind to which member you are

The proof reveals zero information about your position in the tree or your actual pubkey. From the relay’s perspective, you could be any of the 500 authorized users.
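To make the statement being proven concrete, here is a minimal sketch of the Merkle membership check on its own, with no zero-knowledge layer. It assumes SHA-256 and placeholder leaf values; Semaphore itself uses a SNARK-friendly hash over identity commitments and wraps this check inside a circuit so the leaf and path stay hidden. The function names are illustrative, not from any library.

```python
import hashlib

def h(*parts: bytes) -> bytes:
    """Hash helper (SHA-256 here; an assumption, not what Semaphore uses)."""
    return hashlib.sha256(b"".join(parts)).digest()

def build_levels(leaves):
    """Return every tree level, leaf hashes first, root level last.
    Odd-sized levels are padded by duplicating their last node."""
    level = [h(leaf) for leaf in leaves]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            level = level + [level[-1]]
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [h(level[i], level[i + 1]) for i in range(0, len(level), 2)]

def merkle_path(levels, index):
    """Collect (sibling_hash, node_is_right_child) pairs from leaf to root."""
    path = []
    for level in levels[:-1]:
        path.append((level[index ^ 1], index % 2 == 1))
        index //= 2
    return path

def verify_path(root, leaf, path):
    """Recompute the root from a leaf and its path; True if it matches."""
    node = h(leaf)
    for sibling, node_is_right in path:
        node = h(sibling, node) if node_is_right else h(node, sibling)
    return node == root

# Placeholder whitelist entries standing in for authorized pubkeys.
whitelist = [f"member-{i}".encode() for i in range(5)]
levels = build_levels(whitelist)
root = levels[-1][0]
path = merkle_path(levels, 3)
```

In the plain version the verifier sees the leaf and the path, so it learns exactly who is connecting. The ZK proof convinces the relay that `verify_path` would return true for some leaf the prover controls, without showing either input.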

Performance is practical today: Semaphore proofs generate in ~3 seconds on mobile hardware, verify in ~10 milliseconds. The relay’s Merkle tree scales to millions of members at constant verification time. Client-side libraries exist for browser and native mobile.

Rate limiting within anonymity requires one addition: nullifiers. When generating a proof, you also output a deterministic value derived from your secret key and the current epoch (hour/day/week). If you authenticate twice in the same epoch, you produce the same nullifier. The relay can detect and reject duplicates while remaining blind to your identity.

A malicious user’s nullifier repeats on spam attempts; the relay rejects the duplicate. An honest user’s nullifiers rotate each epoch, revealing only that a valid member acted.
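The nullifier idea fits in a few lines. This sketch assumes HMAC-SHA256 as the deterministic function and an hourly epoch; Semaphore derives nullifiers with a circuit-friendly hash inside the proof, so treat the construction, the `scope` parameter, and the names here as stand-ins.

```python
import hashlib
import hmac

EPOCH_SECONDS = 3600  # assumed: one epoch per hour

def epoch_for(unix_time: int, epoch_seconds: int = EPOCH_SECONDS) -> int:
    """Map a Unix timestamp to its epoch number."""
    return unix_time // epoch_seconds

def nullifier(identity_secret: bytes, epoch: int, scope: bytes = b"relay.example") -> str:
    """Deterministic tag for (secret, scope, epoch).
    Same inputs always give the same tag; a new epoch gives an unrelated one."""
    message = scope + b"|" + str(epoch).encode()
    return hmac.new(identity_secret, message, hashlib.sha256).hexdigest()

secret = b"illustrative identity secret"
first = nullifier(secret, epoch_for(1_700_000_000))
again = nullifier(secret, epoch_for(1_700_000_100))   # same hour
later = nullifier(secret, epoch_for(1_700_003_600))   # next hour
```

Binding the tag to a per-relay scope, as assumed above, also keeps two relays from comparing nullifiers to link the same member across them.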

Implementation sketch:

New message type: ["AUTH_ZK", <proof>, <nullifier>, <epoch>]

Relay logic:
1. Verify proof against current Merkle root
2. Check nullifier not seen this epoch
3. Store nullifier, grant session
4. Session has no identity attached - just "verified member"

This requires a new NIP, client library changes, and relay-side verification (~100 lines of code using existing snarkjs/Semaphore packages). The infrastructure exists; adoption is the barrier.
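The relay logic above could look roughly like this. It is a sketch under assumptions: `AUTH_ZK` is the proposed message from this article, not an existing NIP, and `verify_proof` is a hypothetical hook where a real Semaphore verifier would be plugged in.

```python
import json
from collections import defaultdict
from typing import Callable

class ZkAuthRelay:
    """Handles the proposed ["AUTH_ZK", proof, nullifier, epoch] message."""

    def __init__(self, merkle_root: str,
                 verify_proof: Callable[[str, str, str, int], bool]):
        self.merkle_root = merkle_root
        self.verify_proof = verify_proof       # hypothetical verifier hook
        self.seen = defaultdict(set)           # epoch -> nullifiers already used

    def handle(self, raw: str, current_epoch: int) -> str:
        msg = json.loads(raw)
        if not (isinstance(msg, list) and len(msg) == 4 and msg[0] == "AUTH_ZK"):
            return "invalid"
        _, proof, nullifier, epoch = msg
        if epoch != current_epoch:
            return "stale-epoch"
        # 1. Verify proof against the current Merkle root.
        if not self.verify_proof(self.merkle_root, proof, nullifier, epoch):
            return "bad-proof"
        # 2. Reject a nullifier already seen this epoch.
        if nullifier in self.seen[epoch]:
            return "duplicate"
        # 3. Store nullifier; the session carries no identity, only "verified member".
        self.seen[epoch].add(nullifier)
        return "ok"

# Stub verifier for demonstration only: accepts the literal string "valid".
relay = ZkAuthRelay("root-hex", lambda root, proof, nul, ep: proof == "valid")
```

Old epochs can be dropped from `seen` once they expire, so the relay's state stays bounded by the membership size.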

The retrieval problem

Authentication hides who you are. But once connected, you send REQ messages specifying exactly which notes you want: these pubkeys, these event kinds, these tags. The relay learns your interests, your contacts, your subscriptions.

Private Information Retrieval (PIR) lets you query a database while keeping the query itself hidden from the server. The relay returns your requested data, and the query vector reveals only that a valid lookup occurred.

The naive approach of downloading everything and filtering locally breaks at scale. A relay with millions of notes and thousands of concurrent clients faces impossible bandwidth requirements.

Modern PIR achieves sublinear communication:

SimplePIR (USENIX Security 2023) achieves 10 GB/s throughput with single-server security. The client encodes their query as a vector, the server performs matrix multiplication, the result decodes to the requested record. The server processes a query vector whose target index remains computationally hidden.

FrodoPIR (PoPETS 2023) optimizes for the messaging use case: <1 second queries, ~$1 per 100,000 queries at scale. Communication overhead is ~10-100x the actual data size - significant but potentially acceptable for high-value privacy.

The hybrid approach combines PIR with ZK:

  1. Relay maintains an encrypted message store indexed by recipient
  2. Client generates PIR query for their mailbox index
  3. ZK proof accompanies query proving:
    • Client knows the secret key for the queried mailbox
    • Client hasn’t exceeded rate limits (via nullifier)
  4. Relay executes PIR query, returns encrypted response
  5. Relay learns only: a valid user queried something at some time.

Practical constraints: PIR has real costs. Server computation scales with database size, and bandwidth overhead is substantial. PIR suits targeted, high-value queries; it is a poor fit for “firehose” subscriptions pulling every note from 1000 pubkeys.

But for specific high-value queries - fetching your DMs, checking notifications, retrieving specific threads - PIR provides metadata protection impossible through other means.

Express (MIT, 2021) demonstrated practical metadata-hiding communication: two-server deployment, 20ms client computation, 5KB communication per message, ~$1/month operating cost. The architecture required two non-colluding servers, which maps imperfectly to Nostr’s relay model but suggests the overhead is manageable.
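To see why the non-collusion assumption matters, here is the classic two-server XOR scheme. This is not Express's protocol, just the simplest illustration of the two-server model, with hypothetical function names and equal-length records assumed. Each server receives a random-looking bit mask; only the XOR of both masks reveals the index.

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def make_queries(n_records: int, index: int):
    """Client: two masks that differ only at the wanted index."""
    mask_a = [secrets.randbelow(2) for _ in range(n_records)]
    mask_b = list(mask_a)
    mask_b[index] ^= 1
    return mask_a, mask_b

def server_answer(records, mask) -> bytes:
    """Server: XOR together every record whose mask bit is set."""
    acc = bytes(len(records[0]))
    for record, bit in zip(records, mask):
        if bit:
            acc = xor_bytes(acc, record)
    return acc

def reconstruct(answer_a: bytes, answer_b: bytes) -> bytes:
    """Client: all shared records cancel, leaving only the wanted one."""
    return xor_bytes(answer_a, answer_b)

# Placeholder mailbox contents, padded to a fixed 8 bytes each.
records = [b"mailbox0", b"mailbox1", b"mailbox2", b"mailbox3"]
mask_a, mask_b = make_queries(len(records), 2)
result = reconstruct(server_answer(records, mask_a), server_answer(records, mask_b))
```

If the two servers compare masks, the differing position gives away the query immediately, which is why this model maps awkwardly onto independently operated relays that mirror the same data.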

The open question: can PIR be adapted to Nostr’s subscription model, or does metadata privacy require a different query pattern altogether?

The rate limiting problem

Relays need spam protection. The obvious solution - rate limit by IP or pubkey - destroys privacy. Your posting pattern becomes a fingerprint.

Privacy Pass (standardized by the IETF as RFCs 9576 to 9578) decouples rate limiting from identity:

  1. Client contacts an issuer, proves they’re a legitimate user (CAPTCHA, payment, reputation)
  2. Issuer blind-signs tokens - the client ends up with validly signed tokens, but the issuer never sees the token values it signed
  3. Client redeems tokens to relay - one token per action
  4. Relay verifies token validity, enforces one-use, learns nothing about client identity

The relay knows: this action was authorized by someone who passed the issuer’s checks. It doesn’t know which user, can’t link multiple redemptions to the same user, can’t build behavioral profiles.

Blind signatures are the key primitive: the issuer signs blinded tokens, the client unblinds the result, and the relay verifies validity with the issuance event cryptographically severed from the redemption. RSA blind signatures achieve this in ~0.5KB per token.
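The blind, sign, unblind, verify cycle is short enough to show in full. This is textbook RSA with a tiny demo key and no padding, so it is a sketch of the algebra only; a real deployment would follow the RSA blind signature scheme Privacy Pass specifies, with a proper key size and encoding. The key primes and function names are choices made for this example.

```python
import hashlib
import secrets
from math import gcd

# Toy issuer key built from two Mersenne primes (far too small for real use).
P_PRIME = 2**89 - 1
Q_PRIME = 2**107 - 1
N = P_PRIME * Q_PRIME
E = 65537
D = pow(E, -1, (P_PRIME - 1) * (Q_PRIME - 1))   # issuer's private exponent

def digest(token: bytes) -> int:
    """Map a token to an integer mod N (no padding: illustration only)."""
    return int.from_bytes(hashlib.sha256(token).digest(), "big") % N

def blind(token: bytes):
    """Client: hide the token digest behind a random factor r^E."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if gcd(r, N) == 1:
            break
    return (digest(token) * pow(r, E, N)) % N, r

def issuer_sign(blinded: int) -> int:
    """Issuer: signs without learning which token is inside."""
    return pow(blinded, D, N)

def unblind(blind_sig: int, r: int) -> int:
    """Client: divide out r to get an ordinary signature on the token."""
    return (blind_sig * pow(r, -1, N)) % N

def verify(token: bytes, signature: int) -> bool:
    """Relay: standard RSA verification against the issuer's public key."""
    return pow(signature, E, N) == digest(token)

token = b"illustrative token nonce"
blinded, r = blind(token)
signature = unblind(issuer_sign(blinded), r)
```

The issuer only ever sees `blinded`, which is uniformly distributed for any token, so it cannot later match a redeemed `(token, signature)` pair to the issuance request.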

Integration with Nostr:

New message types:
["TOKEN_REQUEST", <blinded_token>] -> ["TOKEN_RESPONSE", <blind_signature>]
["EVENT", <event>, <token>, <signature>]

Flow:
1. Client obtains token batch (could be from relay itself, or third-party issuer)
2. Each EVENT submission includes token redemption
3. Relay verifies token, publishes event, discards token
4. No identity linkage between events
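A relay-side redemption check might look like the sketch below. The four-element `EVENT` message is this article's proposal, not part of the current protocol, and `verify_token` is a hypothetical hook for the issuer-signature check. One assumption worth flagging: to enforce one-use, the relay has to remember which tokens were spent (at least until the issuer rotates keys), so "discard" here means the token is never associated with the event or the connection.

```python
import json
from typing import Callable

class TokenGate:
    """Accepts ["EVENT", event, token, signature] only with an unspent, valid token."""

    def __init__(self, verify_token: Callable[[str, str], bool]):
        self.verify_token = verify_token   # hypothetical issuer-signature check
        self.spent = set()                 # tokens already redeemed

    def handle_event(self, raw: str) -> bool:
        msg = json.loads(raw)
        if not (isinstance(msg, list) and len(msg) == 4 and msg[0] == "EVENT"):
            return False
        _, event, token, signature = msg
        if token in self.spent:
            return False                   # double spend
        if not self.verify_token(token, signature):
            return False                   # not signed by the issuer
        self.spent.add(token)              # record the token alone, no link to the event
        return True

# Stub check for demonstration: a "signature" is valid if it equals "sig:" + token.
gate = TokenGate(lambda token, signature: signature == "sig:" + token)
```

Storing a hash of each spent token instead of the token itself would work the same way and keeps the spent set compact.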

Anonymous Rate-Limited Credentials (ARC) extend Privacy Pass with per-origin limits. A user gets N unlinkable tokens per time period. They can spend them across multiple relays without any relay learning their total activity. Bandwidth scales sublinearly with token count.

The issuer could be the relay itself (simplest), a federation of relays (spreading trust), or an independent service (separating authentication from relay operation). Each model has different trust assumptions but all break the identity-to-behavior link.

Putting it together

A privacy-preserving relay connection looks like:

  1. Connect over Tor or VPN (IP privacy - out of scope but essential)
  2. Authenticate with Semaphore ZK proof (membership without identity)
  3. Query high-value data via PIR (retrieval without revelation)
  4. Subscribe to public feeds normally (some metadata leakage acceptable)
  5. Post with Privacy Pass tokens (rate limiting without tracking)

Not every connection needs full privacy. Public note browsing has lower stakes than DM retrieval. The architecture should support gradations.
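One way a client could implement those gradations is a small routing policy over its own filters. This is a sketch of one possible policy, not a proposal: the choice of which kinds count as sensitive (kind 4 direct messages and kind 1059 gift wraps here) and the rule for `#p` tag filters are assumptions made for the example.

```python
# Assumed policy: DM-related kinds always go through the private path.
SENSITIVE_KINDS = {4, 1059}

def mechanism_for(nostr_filter: dict) -> str:
    """Pick a retrieval path for one subscription filter.
    Returns "pir" for high-stakes lookups and "plain" for everything else."""
    kinds = set(nostr_filter.get("kinds", []))
    if kinds & SENSITIVE_KINDS:
        return "pir"        # DM retrieval
    if "#p" in nostr_filter:
        return "pir"        # notifications: events that tag a specific pubkey
    return "plain"          # public feeds, where some leakage is accepted

dm_filter = {"kinds": [4], "#p": ["<own pubkey hex>"]}
feed_filter = {"kinds": [1], "authors": ["<followed pubkey hex>"], "limit": 50}
```

A client would then send `feed_filter` as an ordinary REQ and route `dm_filter` through whatever private query mechanism the relay supports.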

What’s deployable today:

  • Semaphore authentication: production libraries exist, ~3s mobile proving
  • Privacy Pass rate limiting: IETF-standardized, existing implementations

What needs work:

  • PIR integration: research-stage for Nostr’s query model
  • Relay coordination: new NIPs, adoption incentives, client support

What’s missing:

  • Relay incentive alignment: privacy features cost compute, so why would relays adopt them?
  • User experience: additional latency, larger bandwidth, battery impact on mobile
  • Network coordination: value depends on critical mass of supporting relays

The metadata reality

Zero-knowledge authentication doesn’t make you invisible. The relay knows someone is connected. Traffic analysis can still correlate timing patterns. Global adversaries watching multiple relays can potentially deanonymize through intersection attacks.

But the threat model improves dramatically. The relay operator can’t identify you. A subpoena for “all activity from npub X” returns nothing - the relay genuinely doesn’t know. Your posting pattern isn’t stored because there’s no identity to attach it to.

Defense in depth assembles from these layers: Tor hides your IP, ZK auth hides your identity, PIR hides your interests, Privacy Pass hides your behavior. Each layer is imperfect; together they raise the cost of surveillance from trivial to substantial.

Nostr’s architecture makes this possible; the protocol is simple enough that privacy extensions attach without core changes. Clients, relay operators, and NIP authors need to decide to build them.

The cryptography works and implementations exist; deployment is the coordination problem that remains.


