Docs/Architecture/Nullifier Hash

Nullifier Hash

How nullifier hashes prevent double-spending while preserving privacy

Nullifier Hash

The nullifier hash mechanism is the critical component that prevents double-spending in ZKMix while preserving the unlinkability between deposits and withdrawals. Without nullifier hashes, a user could withdraw the same deposit repeatedly, draining the mixer pool. With a naive approach (e.g., marking commitments as spent), the link between deposit and withdrawal would be publicly visible, destroying privacy. The nullifier hash scheme solves both problems simultaneously.

The Double-Spend Problem in Mixers

A zero-knowledge mixer accepts deposits and allows withdrawals by proving membership in the set of depositors. The fundamental challenge is: how does the protocol know a deposit has already been withdrawn?

Why Marking Commitments Fails

The most obvious approach is to mark each commitment as "spent" after withdrawal. However, this completely breaks privacy:

User A deposits commitment C_A at leaf index 5.
User A withdraws and the protocol marks C_A as spent.
Anyone observing the chain sees that commitment at index 5 was just spent, directly linking the deposit transaction (which inserted C_A) to the withdrawal transaction (which marked C_A).

The deposit and withdrawal addresses are now publicly linked through the commitment, which is exactly what the mixer is supposed to prevent.

Why Simple Proofs of Membership Are Insufficient

If we only prove "I know a leaf in the tree" without any spending mechanism, nothing prevents the same user from generating multiple valid proofs for the same leaf and withdrawing repeatedly. Each proof would be valid because the leaf genuinely exists in the tree, and the ZK property means the verifier cannot tell that the same leaf is being reused.

The Requirement

We need a mechanism that:

Allows the protocol to detect and reject duplicate withdrawals for the same deposit.
Does not reveal which deposit is being withdrawn.
Is deterministic: the same deposit always produces the same "spent" marker, preventing replays.
Is unpredictable: given the "spent" marker, no one can determine which deposit it corresponds to.

The nullifier hash scheme satisfies all four requirements.

The Nullifier Scheme

Deposit Phase

At deposit time, the user generates two random values:

nullifier: A 248-bit random value from a cryptographically secure random number generator (CSPRNG).
secret: A 248-bit random value from a CSPRNG.

These two values are combined to form the commitment:

commitment = Poseidon(nullifier, secret)

The commitment is inserted into the Merkle tree as a leaf. The nullifier and secret are stored privately by the user in their deposit note. Neither value is ever sent to the chain or revealed to any other party.

The reason for using two separate values (nullifier and secret) rather than a single value is separation of concerns:

The nullifier is used to derive the nullifier hash, which will be publicly revealed at withdrawal time. It must be random and unpredictable, but its hash will become public.
The secret provides additional entropy that is never revealed in any form. Even if the nullifier hash is known, the secret prevents anyone from reconstructing the commitment (since commitment = Poseidon(nullifier, secret) and the secret remains unknown).

This two-value design ensures that knowledge of the nullifier hash (which becomes public) does not leak the commitment, which would allow linking to the deposit.

Withdrawal Phase

At withdrawal time, the user computes:

nullifierHash = Poseidon(nullifier)

This value is included as a public input to the ZK-SNARK proof. The circuit verifies two things about the nullifier:

Commitment reconstruction: The circuit computes Poseidon(nullifier, secret) using the private nullifier and secret inputs, and verifies that this equals the leaf used in the Merkle path verification. This proves that the prover knows the preimage of a real commitment.

Nullifier hash correctness: The circuit computes Poseidon(nullifier) using the private nullifier input and constrains the result to equal the public nullifierHash input. This proves that the publicly revealed nullifier hash correctly corresponds to the private nullifier used in the commitment.

The complete proving statement is: "I know (nullifier, secret) such that Poseidon(nullifier, secret) is a leaf in the Merkle tree with root root, and Poseidon(nullifier) equals nullifierHash."

On-Chain Verification

When the on-chain program processes a withdrawal:

Verify that nullifierHash has not been seen before (check the nullifier hash set).
Verify that root is in the root history buffer (valid Merkle root).
Verify the Groth16 proof against the public inputs (root, nullifierHash, recipient, relayer, fee, refund).
If all checks pass, add nullifierHash to the on-chain nullifier hash set (marking it as spent).
Transfer funds to the recipient (minus relayer fee).

Step 1 is the double-spend prevention: if the same deposit is used twice, it will produce the same nullifierHash (because the nullifier is deterministic for each deposit), and the second attempt will be rejected.

How the On-Chain Nullifier Hash Set Prevents Reuse

The nullifier hash set is a persistent, append-only data structure that records every nullifier hash that has been used in a successful withdrawal. Before processing any withdrawal, the program checks whether the submitted nullifier hash already exists in this set.

Properties of the Nullifier Hash

Deterministic: For a given deposit (with a fixed nullifier value), the nullifier hash is always Poseidon(nullifier). The user cannot produce a different nullifier hash for the same deposit. If they try to use a different nullifier, the commitment reconstruction inside the circuit will fail (because Poseidon(different_nullifier, secret) will not equal the committed leaf).

Unique per deposit: Since each deposit uses a randomly generated 248-bit nullifier, the probability of two deposits producing the same nullifier hash is negligible (approximately 2^-248, assuming Poseidon is collision-resistant). In practice, each deposit maps to exactly one unique nullifier hash.

One-to-one mapping: The combination of determinism and uniqueness means there is a bijective relationship between deposits and nullifier hashes. Each deposit can produce exactly one nullifier hash, and each nullifier hash corresponds to exactly one deposit. This guarantees that marking a nullifier hash as spent is equivalent to marking a deposit as spent.

Attempted Double-Spend Scenario

User deposits with nullifier = N, secret = S, creating commitment C = Poseidon(N, S) at leaf index 7.
User generates a proof with nullifierHash = Poseidon(N) and withdraws successfully. The program records Poseidon(N) in the nullifier hash set.
User attempts a second withdrawal using the same deposit note.
The user computes nullifierHash = Poseidon(N) (same value, because the nullifier is the same).
The on-chain program checks the nullifier hash set and finds that Poseidon(N) already exists.
The withdrawal is rejected with a "nullifier already spent" error.

The user cannot circumvent this by:

Using a different nullifier: The circuit would compute Poseidon(N', S), which would not equal C, and the Merkle path verification would fail (no leaf with that value exists).
Using a different secret: Same result: Poseidon(N, S') would not equal C.
Using a different leaf: The user does not know the preimage of any other leaf (each commitment is the hash of another user's secret values).

Why This Preserves Privacy

The critical privacy property is that the nullifier hash cannot be linked back to the original deposit. This relies on two cryptographic properties of the Poseidon hash function:

Preimage Resistance

Given nullifierHash = Poseidon(nullifier), it is computationally infeasible to recover nullifier. The Poseidon hash function provides 128 bits of preimage resistance (matching the security level of the BN128 curve). Without the nullifier, an observer cannot compute the commitment Poseidon(nullifier, secret) to determine which leaf was spent.

Even if an attacker knows both the nullifier hash and all commitments in the tree, they cannot determine which commitment corresponds to the nullifier hash without knowing either the nullifier or the secret.

Independence of Nullifier Hash and Commitment

The nullifier hash and the commitment are related through the nullifier value, but this relationship is hidden by the Poseidon hash function:

commitment   = Poseidon(nullifier, secret)
nullifierHash = Poseidon(nullifier)

An observer sees the commitment (from the deposit event) and the nullifier hash (from the withdrawal transaction). To link them, the observer would need to find a value nullifier such that:

Poseidon(nullifier) equals the known nullifier hash, AND
Poseidon(nullifier, secret) equals one of the known commitments (for some unknown secret).

The first condition requires breaking preimage resistance of Poseidon. Even if the attacker could somehow enumerate candidate nullifiers, they would also need to find the corresponding secret to check against commitments, requiring a brute-force search over a 248-bit space for each candidate.

Formal Privacy Guarantee

The unlinkability property can be stated formally: given a set of deposit commitments {C_1, C_2, ..., C_n} and a nullifier hash H, no polynomial-time adversary can determine which commitment C_i corresponds to H with probability significantly better than 1/n (random guessing), assuming Poseidon is a secure hash function.

This holds even if the adversary knows:

All commitments and their leaf indices (public from deposit events).
The nullifier hash (public from the withdrawal transaction).
The Merkle root and authentication path structure (public).
The recipient address and relayer information (public).

The only information that could link the two is the nullifier value itself, which is private to the depositor.

The Role of the Secret

One might ask: why not use commitment = Poseidon(nullifier) (a single value) and nullifierHash = Poseidon(nullifier) (the same value)? This would make the commitment equal to the nullifier hash, immediately linking deposits to withdrawals.

Alternatively, why not use commitment = Poseidon(nullifier) and nullifierHash = Poseidon(Poseidon(nullifier))? While this hides the direct equality, there is a structural relationship: the nullifier hash is the hash of the commitment. An attacker could hash every known commitment and check if the result equals the nullifier hash, breaking unlinkability in O(n) time.

The secret provides an independent random value that breaks this structural relationship. The commitment Poseidon(nullifier, secret) depends on both values, but the nullifier hash Poseidon(nullifier) depends only on the nullifier. Without knowing the secret, the attacker cannot verify whether a given commitment corresponds to a given nullifier hash, even if they could somehow invert Poseidon (which they cannot).

Nullifier Storage on Solana

PDA-Based Nullifier Set

ZKMix stores spent nullifier hashes as individual Program Derived Accounts (PDAs). Each PDA is derived from the nullifier hash value:

seeds = ["nullifier", pool_pubkey.as_ref(), nullifier_hash.as_ref()]

Checking whether a nullifier hash has been spent is equivalent to checking whether the corresponding PDA exists. This leverages Solana's native account lookup mechanism, which is O(1) and does not require iterating over a list or querying a bitmap.

Existence Check

When processing a withdrawal, the program attempts to initialize the nullifier PDA. If the account already exists (i.e., the nullifier hash has been spent), the initialization fails, and the withdrawal is rejected. This is an atomic check-and-set operation:

rust

#[derive(Accounts)]
#[instruction(proof: ProofData, public_inputs: PublicInputs)]
pub struct Withdraw<'info> {
    /// The nullifier PDA. If this account already exists, the withdrawal
    /// will fail because init requires the account to not exist.
    #[account(
        init,
        payer = relayer,
        space = 8 + 1,  // discriminator + is_spent flag
        seeds = [
            b"nullifier",
            pool.key().as_ref(),
            public_inputs.nullifier_hash.as_ref()
        ],
        bump
    )]
    pub nullifier_account: Account<'info, NullifierAccount>,

    // ... other accounts
}

Using Anchor's init constraint, the PDA creation and existence check are handled atomically by the Solana runtime. If a malicious user submits two withdrawal transactions with the same nullifier hash in the same slot, only one will succeed: the first to be processed will create the PDA, and the second will fail because the account already exists.

Storage Cost

Each nullifier PDA requires:

Field	Size
Account discriminator	8 bytes
Is-spent flag	1 byte
Total data	9 bytes

The minimum Solana account size (including metadata) results in a rent-exempt deposit of approximately 0.001 SOL per nullifier PDA. This cost is paid by the relayer (or the user, for direct withdrawals) as part of the withdrawal transaction, and it is a permanent, non-recoverable cost since the nullifier PDA must persist indefinitely to prevent double-spending.

For a pool with 1,000,000 withdrawals, the total storage cost for nullifier PDAs would be approximately 1,000 SOL. This cost is distributed across all withdrawers (each paying ~0.001 SOL) and is a natural economic constraint on the system.

Why Not a Bitmap or Hash Map

Alternative on-chain storage approaches for the nullifier set include:

Bitmap: A bit array where each bit corresponds to a possible nullifier hash. Since nullifier hashes are 254-bit values, a complete bitmap would require 2^254 bits, which is astronomically infeasible. A smaller bitmap would require a hash-to-index mapping that could introduce collisions (false positives), potentially blocking legitimate withdrawals.

On-chain hash map (HashMap in a single account): Storing all nullifier hashes in a single account's data (as a vector or hash table) is limited by Solana's maximum account size of 10 MB. Each nullifier hash is 32 bytes, so a single account could store at most ~312,000 entries. Additionally, loading and deserializing a large account in every withdrawal transaction consumes significant compute units and increases transaction costs.

Sparse Merkle tree: A sparse Merkle tree could represent the set of spent nullifiers compactly, but updating and proving non-membership requires complex on-chain logic and multiple hash computations, which would increase the compute cost of withdrawals.

The PDA-per-nullifier approach is the simplest and most efficient for Solana's account model. It has O(1) lookup cost, no size limitations (limited only by total ledger capacity), no collision risk, and leverages Solana's native parallelism (transactions touching different nullifier PDAs can execute in parallel across validators).

Nullifier Set and Anonymity Set Size

An important relationship exists between the nullifier set and the anonymity set:

The anonymity set at any point in time is the set of all unspent deposits: commitments in the Merkle tree whose corresponding nullifier hashes have NOT appeared on-chain.
Each withdrawal removes exactly one element from the anonymity set (the withdrawer's deposit) by revealing its nullifier hash.
The anonymity set size equals total_deposits - total_withdrawals.

For maximum privacy, users should wait until the anonymity set is large before withdrawing. If only 2 deposits exist and 1 is withdrawn, the remaining deposit is trivially linked to the remaining unspent commitment. If 10,000 deposits exist and 1 is withdrawn, the withdrawn deposit could be any of the 10,000, providing strong anonymity.

The nullifier hash set is public (anyone can enumerate all spent nullifier PDAs), but this does not reduce privacy. Knowing which nullifier hashes have been spent only reveals how many withdrawals have occurred, not which deposits they correspond to. The mapping from nullifier hashes to commitments remains computationally hidden by the Poseidon hash function.

Pruning and Long-Term Sustainability

Because nullifier PDAs must persist indefinitely to prevent double-spending, the storage cost grows linearly with the number of withdrawals. In the long term, this could become a significant cost if the protocol processes millions of withdrawals.

Potential mitigation strategies include:

Incorporating storage fees into the withdrawal: The ~0.001 SOL rent-exempt cost per nullifier PDA is deducted from the withdrawal amount, alongside the relayer fee. This makes the cost self-funding.

Epoch-based pool rotation: Periodically deploying new pool instances and deprecating old ones. Once all deposits in an old pool have been withdrawn (or after a long timeout), the nullifier PDAs for that pool can theoretically be closed. However, this requires careful design to avoid prematurely closing PDAs for unspent deposits.

Compressed state using state compression: Solana's state compression (used by compressed NFTs) could potentially store nullifier hashes in a compressed on-chain Merkle tree rather than individual PDAs, reducing the per-nullifier storage cost by orders of magnitude. This is an active area of research for ZKMix.

In practice, the ~0.001 SOL per nullifier cost is modest relative to typical mixer denomination sizes (1 SOL, 10 SOL, 100 SOL), representing less than 0.1% overhead even at the smallest denomination.

Back to Docs