Plinko PIR tutorial
2025 Nov 25
Special thanks to Alex Hoover, Keewoo Lee and Ali for feedback
and review
One underrated form of privacy that is not well satisfied by ZK-SNARKs, traditional versions of FHE, or other commonly discussed methods is private reads. Users want to learn things from a large database without revealing what they're reading. Some use cases of this include:
- Privacy-preserving Wikipedia (so you can learn things without even
the server knowing what you're reading)
- Privacy-preserving RAG
or LLM search (so your
local LLM can get data from the internet without revealing what you're
querying)
- Blockchain data reads (so you can use a dapp that queries an
external RPC node without the node learning what you're querying; see
"private reads" here)
There is a category of privacy protocols that are directly designed
to solve this problem, called private
information retrieval (PIR). A server has a database \(D\). A client has an index \(i\). The client sends the server a query,
the server replies back with an answer that allows the client to
reconstruct \(D[i]\), all without the
server learning anything about \(i\).
In this post I will describe some basics that underlie all PIR protocols, and then describe Plinko, a protocol that achieves this efficiently.
Basics of PIR
To understand more complicated PIR protocols, it's best to start with
"classic two-server PIR". Classic two-server PIR assumes that a client
is making queries to two servers, and it trusts
that at least one of the two is honest. Here's how the protocol
works:

- The client wants to know \(D[i]\),
in the above example \(D[8]\).
- The client generates a random subset of indices in \(D\), on average about half full (in the
above example, \([1, 5, 6, 10,
15]\)).
- The client sends that subset to one server, and then sends a
modified subset, where the membership of \(D[i]\) is flipped (ie. added if it was not
there, removed if it was there), to the other server.
- Each server computes the XOR of the values in \(D\) at the indices it was asked for, and returns the result
- The client XORs these two values, and it gets the value of \(D[i]\)
The key idea is that each server receives what looks like a
completely random subset of the indices. There is not even a slight bias
from mixing in \(i\), because that
modification is as likely to add the value as it is to remove it. The
two servers give XOR-sums of subsets that are almost identical, but
differ in one place: one includes \(D[i]\) and the other doesn't. The client
gets these two values, and takes the difference between the two, which
gives \(D[i]\).
This naturally generalizes to "cells" that are bigger than one bit;
you could have each \(D[i]\) have eg.
1000 bits, and the protocol is exactly the same.
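To make this concrete, here is a minimal Python sketch of classic two-server PIR. The names are just illustrative, and cells are fixed-length byte strings:

```python
import secrets

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def make_query(num_cells, i):
    # A random subset of indices, each included with probability 1/2...
    subset_a = {j for j in range(num_cells) if secrets.randbits(1)}
    # ...and the same subset with the membership of i flipped
    subset_b = subset_a ^ {i}
    return subset_a, subset_b

def server_answer(D, subset):
    # Each server just XORs together the cells at the requested indices
    result = bytes(len(D[0]))
    for j in subset:
        result = xor_bytes(result, D[j])
    return result

# Usage: both servers hold the same 16-cell database of 4-byte cells
D = [secrets.token_bytes(4) for _ in range(16)]
query_a, query_b = make_query(len(D), 8)
answer_a = server_answer(D, query_a)   # sent to server A
answer_b = server_answer(D, query_b)   # sent to server B
assert xor_bytes(answer_a, answer_b) == D[8]
```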
This protocol has three big weaknesses:
1. You have to trust that one of the two servers is honest.
2. The client has communication overhead equal to the number of cells (this can be much smaller than the total size of the data if the cells are big, but it's still a lot).
3. The server has to XOR half the total data to respond to a single query.
There are two well-understood modifications to classic two-server PIR
that solve (1).
One strategy is PIR with preprocessing. Notice that
in the protocol above, you can compute one of the two random sample sets
before you even know which \(i\) you
want to query for. You generate thousands of these random queries. You
have an initial phase where you download (but do not need to store) the
whole file \(D\), and you compute the
XOR sums for these random sets of indices locally. Essentially, you are
acting as one of the two servers - and in this way, satisfy the 1-of-2
trust assumption all by yourself, so you do not need to trust the other
server. This approach does require a huge amount of upfront bandwidth,
but you can do it ahead of time, or do it in the background and use a
more inefficient form of PIR to answer any queries that you get before
it finishes.
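Here is a rough sketch of that preprocessing idea, under the assumption that the remaining server exposes a subset-XOR query; `server_xor` below is a stand-in for that network call, not a real API:

```python
import secrets

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def preprocess(stream_of_D, num_cells, cell_size, num_hints):
    # Choose the random subsets up front, then stream the whole database once,
    # folding each cell into every hint whose subset contains it. Only the
    # (subset, xor) pairs are kept afterwards.
    subsets = [{j for j in range(num_cells) if secrets.randbits(1)}
               for _ in range(num_hints)]
    sums = [bytes(cell_size) for _ in range(num_hints)]
    for j, cell in enumerate(stream_of_D):
        for h in range(num_hints):
            if j in subsets[h]:
                sums[h] = xor_bytes(sums[h], cell)
    return list(zip(subsets, sums))

def query(hints, server_xor, i):
    # Consume one precomputed hint: ask the server for the subset with i's
    # membership flipped, and everything except D[i] cancels out.
    subset, local_sum = hints.pop()
    answer = server_xor(subset ^ {i})
    return xor_bytes(answer, local_sum)
```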
The other strategy is single-server PIR.
Basically, you use some variant of homomorphic encryption (often simpler
than full-on FHE) to encrypt your query, in such a way that the server
can compute on your encrypted query to give you an encrypted answer.
This has high computational overhead on the server side, and so is not
yet feasible for very large datasets. This strategy can be combined with
PIR with preprocessing, so you can theoretically avoid the initial
download by doing that procedure homomorphically encrypted on the server
side.
Plinko solves (2) and (3) - or at least, it decreases both
communication overhead and server-side compute overhead from \(O(N)\) to \(O(\sqrt{N})\) where \(N\) is the number of cells. It uses the
preprocessing strategy to remove the trust assumption. Theoretically,
you can do the preprocessing in FHE; it is currently an open problem to
do this efficiently enough.
In Plinko's language, there is:
- A setup phase, where the client processes all the
data once (but only remembers a small number of
"hints")
- A query mechanism, where the client makes a query
to the server and combines the answer with the right hint to compute the
needed value.
There is also an easy way for the client to update hints if the data
set changes. Now, let's get into the protocol!
Setup phase
Treat the database as being a square, with \(\sqrt{N}\) rows and \(\sqrt{N}\) columns:

The client generates a random master seed \(S\). For each row \(i\), the client uses the master seed to
generate a row seed \(S_i\). The client
generates roughly \(128 * \sqrt{N}\)
hints. Each hint is generated as follows:
- Randomly (or rather, pseudorandomly by hashing the master seed) pick
\(\frac{\sqrt{N}}{2} + 1\) rows
- If we're computing the \(j\)'th
hint, for each row \(r_i\), use a hash
\(H(S_{r_i}, j)\) to generate a random
column \(c_i\). So we get \(\frac{\sqrt{N}}{2} + 1\) \((r_i, c_i)\) pairs
- XOR the cell values at those \((r_i, c_i)\) positions, and save the result

The "hash" that is used to compute the \(c_i\) values could be a regular
hash, but that leads to (tolerable but significant) inefficiency during
the query phase. So instead, Plinko prefers a special type of hash called an "invertible PRF". The core property that it achieves is that given
\(S_i\) and \(c_i\), you can easily recover all \(j\) values that lead to that \(c_i\). In practical terms, this means that
for any cell \((r_i, c_i)\), you can
easily find all hints that "include" that cell.
Additionally, the setup phase generates some "backup hints". These
are constructed like regular hints, except for each index \(j\), you choose \(\frac{\sqrt{N}}{2}\) rows, and compute a
pair of hints: one over the chosen subset of rows, and the other for its
complement (ie. the \(\frac{\sqrt{N}}{2}\) rows that were
not chosen):

This whole process can be done "streaming", without requiring the
client to store any data at any point in time other than the hints
themselves. In total, the client needs to download the whole data set
and store data equal to roughly \(64 * \sqrt{N}\) cells, plus \(\sqrt{N}\) cells per backup hint (the
client will need one backup hint for each query that the client will
make).
If desired, the process can also be done in FHE, which means the
client would only need to download the hints, but the server would need
to do FHE work over the whole dataset; this is of course expensive, and
optimizing it is an ongoing problem.
Query phase
Suppose you are interested in a specific coordinate \((x, y)\). First, you discover a hint index
\(j\) that generates a hint that
contains \((x, y)\). This is where the
"invertible PRF" mechanism helps: it lets you immediately run the
function in reverse knowing \(S_x\) and
\(y\) to compute all \(j\)'s such that \(H(S_x, j) = y\). However, if you want to
avoid this complexity, you could just set \(H(S_x, j) = sha256(S_x, \lfloor \frac{j}{16}
\rfloor)[j \% 16: (j \% 16) + 2]\) and then compute on average
\(\frac{\sqrt{N}}{16}\) hashes every
time.
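A literal rendition of that fallback hash might look like the following. Note that the two-byte slice implicitly assumes \(\sqrt{N}\) fits in 16 bits, and all the names here are mine:

```python
import hashlib

def H_simple(row_seed, j):
    # One sha256 call covers 16 candidate hint indices; index j reads a
    # two-byte slice of the digest, giving a 16-bit column value
    digest = hashlib.sha256(row_seed + (j // 16).to_bytes(8, "big")).digest()
    return int.from_bytes(digest[j % 16 : j % 16 + 2], "big")

def find_hint_index(row_seed, y, num_hints):
    # Scan hint indices until one maps this row to column y. A match happens
    # with probability ~1/sqrt(N) per index, and each hash covers 16 indices,
    # so this takes about sqrt(N)/16 hash evaluations on average. In practice
    # you would also check that hint j's row set contains the row you want
    # and that the hint hasn't been used yet.
    for j in range(num_hints):
        if H_simple(row_seed, j) == y:
            return j
    return None
```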
Then, you determine the \(\frac{\sqrt{N}}{2} + 1\) rows that the hint
generates. You take out row \(x\). You then send the server a message
that contains:
- \([(r_1, c_1), (r_2, c_2) ...
(r_{\frac{\sqrt{N}}{2}}, c_{\frac{\sqrt{N}}{2}})]\): the set of
points included in that hint, except for the point you want
- \([(x_1, y_1), (x_2, y_2) ...
(x_{\frac{\sqrt{N}}{2}}, y_{\frac{\sqrt{N}}{2}})]\): the set of
all other rows (including the row you want), with a randomly
chosen column for each one
You send these in a random order, so the server does not know which
is which. As far as the server can tell, you just sent it a random set
of points, one per row, with the rows split in half in an arbitrary way.
The server XORs both sets of cells, and replies with both outputs.
The client then locally takes its hint, XORs it with the (non-junk)
value returned by the server, and it has its answer.

The server does a computation based on \(\sqrt{N}\) cells, and returns that data to
the client. The server knows nothing about which cell the client wants
(provided the client does not reuse hints), because the client gave the
server every point in the set in its hint except the point it
actually wants. And without knowing the hash that generated these
subsets, to the server, all the row and column choices look completely
random.
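Continuing the setup sketch from above (reusing `column_for`, `row_seeds` and the `(j, rows, acc)` hint tuples, all of which are my own illustrative layout rather than the paper's), a query could look like this:

```python
import secrets

def plinko_query(hints, row_seeds, n, x, y):
    # Find an unused hint covering (x, y). With an invertible PRF this is a
    # direct lookup; with a plain hash we just scan.
    for idx, (j, rows, acc) in enumerate(hints):
        if x in rows and column_for(row_seeds[x], j, n) == y:
            break
    else:
        raise LookupError("no remaining hint covers this cell")
    j, rows, acc = hints.pop(idx)                  # never reuse a hint

    # The hint's points, minus the one we actually want...
    real_set = [(r, column_for(row_seeds[r], j, n)) for r in rows if r != x]
    # ...and every other row (including row x) with a random junk column
    decoy_set = [(r, secrets.randbelow(n)) for r in range(n) if r not in rows]
    decoy_set.append((x, secrets.randbelow(n)))
    sets = [real_set, decoy_set]
    secrets.SystemRandom().shuffle(sets)           # server can't tell which is which
    return sets, acc, sets.index(real_set)

def server_respond(D, sets):
    # XOR the cells in each of the two sets and return both results
    answers = []
    for s in sets:
        acc = bytes(len(D[0][0]))
        for r, c in s:
            acc = bytes(a ^ b for a, b in zip(acc, D[r][c]))
        answers.append(acc)
    return answers

def recover(acc, answers, real_index):
    # hint XOR (hint minus our cell) = the cell we wanted
    return bytes(a ^ b for a, b in zip(acc, answers[real_index]))
```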
Backup queries
To avoid using a hint twice (and thus leaking data), when the client
makes a query, it dumps the hint that it used, and "promotes" a backup
hint into being a regular hint. Remember that each "backup hint" is a
pair of hints, one that is based on a random subset of the rows, and the
other based on its complement. The algorithm is simple:
- Take the hint value generated from the subset of the rows that
does not contain the row you queried
- XOR in the value you queried
Now, you have a set of \(\frac{\sqrt{N}}{2}
+ 1\) rows with an XOR of one randomly-chosen column from each
row: exactly the same format as a regular hint.
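In code, promotion might look like this. The backup-hint layout below is my own bookkeeping, not the paper's exact data structure; note that the promoted hint has to remember explicitly that row \(x\)'s column is \(y\), since that column no longer comes from the PRF:

```python
def promote_backup_hint(backup, n, x, y, value_at_xy):
    # backup = (j, chosen_rows, xor_over_chosen, xor_over_complement)
    j, chosen, xor_chosen, xor_complement = backup
    # Use whichever half does *not* contain the queried row
    if x in chosen:
        rows, acc = set(range(n)) - chosen, xor_complement
    else:
        rows, acc = set(chosen), xor_chosen
    # Add the queried row and fold in the value we just learned: now we have
    # n/2 + 1 rows and the XOR of one column per row, like a regular hint
    rows.add(x)
    acc = bytes(a ^ b for a, b in zip(acc, value_at_xy))
    return (j, rows, acc, {x: y})   # column override for row x
```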

Updating the dataset
Suppose a value in the dataset is updated. Then, you can find all hints
that contain that value, and simply do a direct update: mix in the XOR
of the old value and the new value.
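A sketch of the update rule, reusing `column_for` from the setup sketch (with an invertible PRF you would jump straight to the affected hint indices instead of scanning):

```python
def apply_update(hints, row_seeds, n, r, c, old_value, new_value):
    # Any hint whose pseudorandom column for row r equals c includes cell (r, c),
    # so XORing in (old XOR new) keeps its stored value consistent
    delta = bytes(a ^ b for a, b in zip(old_value, new_value))
    for idx, (j, rows, acc) in enumerate(hints):
        if r in rows and column_for(row_seeds[r], j, n) == c:
            hints[idx] = (j, rows, bytes(a ^ b for a, b in zip(acc, delta)))
```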

And that's the whole protocol!
Concrete efficiency
In asymptotic terms, everything in Plinko is \(O(\sqrt{N})\): the size of the hints that
the client needs to store, the data transmitted per query
(technically that's \(O(\sqrt{N} *
log(N))\)), and the server-side compute costs. But it helps to
put this all into the context of a real-world workload. So let's use the
real-world workload that I am the most familiar with: Ethereum.
Suppose that you are querying a dataset with 10 billion values, each
of which are 32 bytes. This is a pretty good description of the Ethereum
state tree, in a couple of years when it gets significantly bigger due
to scale. We use cuckoo hashing
to convert a key-value store into a "flat" lookup table of the type that
Plinko handles.
To include Merkle branches, one simple way (though not the optimal
way) is to treat them separately as queries into smaller databases; if
it's a binary
tree, then the first level is full-sized, the second level is
half-sized, the third level is quarter-sized, etc; because PIR costs are
all proportional to sqrt, this means that our total costs will all be
multiplied by \(1 + 1 + \frac{1}{\sqrt{2}} +
\frac{1}{2} + \frac{1}{2\sqrt{2}} + ... = 3 + \sqrt{2} \approx
4.414\). Ethereum's state tree is (for now) a less-efficient
hexary tree, but you can ZK-prove equivalence to a binary
tree to get the efficiencies of a binary tree today.

But first, let's provide raw costs without the Merkle branches. This
implies a 100000×100000 grid, where each cell is 32 bytes (256 bits).
The client would need to store 100000×128 hints, which is 100000 × 128 ×
32 bytes ~= 390 MB, plus some extra backup hints.
For each query, the client would need to send the server a
partitioning of the rows into two parts, and 100000 column indices. We
could be super-clever and do slightly better than this, but if
we do it naively, the row partition is 100000 bits (12.2 kB) and the column indices are 100000 * log2(100000) ~= 1.66m bits = 202.7 kB. These two sum up to about 215 kB per query.
If we introduce Merkle branches, the hint costs blow up to about
1.68 GB. With clever work, we can actually avoid the
per-query communication costs increasing by 4.414x, because we can
adjust the PIR in such a way that we use the same (or rather, analogous)
query indices for the datasets representing different levels of the
tree.
Server-side costs involve reading and XOR'ing 100000 cells, or 3.05MB
per query; the XOR is negligible, but the read load is significant.
Adding Merkle branches would blow this up to about 13.4 MB in theory,
though the realistic added cost may well be negligible because clever
engineering can ensure that matching values are always in the same page
(memory pages are usually 4096 bytes).
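As a sanity check on the arithmetic above, here are the same numbers recomputed, with column indices packed at log2(100000) ~= 16.6 bits each:

```python
from math import log2, sqrt

N, CELL = 10**10, 32                 # 10 billion cells of 32 bytes
side = int(sqrt(N))                  # 100000 x 100000 grid

hint_storage = side * 128 * CELL                        # hints: 128 per row, one cell each
query_bits = side + side * log2(side)                   # row partition + packed column indices
query_bytes = query_bits / 8
server_read = side * CELL                               # cells read and XORed per query
merkle_factor = 3 + 2 ** 0.5                            # ~4.414x from the geometric series

print(hint_storage / 2**20, "MB of hints")              # ~390.6
print(query_bytes / 2**10, "kB per query")              # ~215.0
print(server_read / 2**20, "MB read per query")         # ~3.05
print(hint_storage * merkle_factor / 2**30, "GB of hints with Merkle branches")  # ~1.68
print(server_read * merkle_factor / 2**20, "MB read with Merkle branches")       # ~13.5
```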
Our main tool to fiddle with these numbers is to make the
rectangle unbalanced (have different width and height): we can
increase hint storage by 2x to reduce query size by 2x. Realistically,
this is a good tradeoff: storing even 10 GB of hints is not that hard,
but especially if nodes are sending multiple queries every 200
milliseconds, 215 kB per query is higher than we would be comfortable
with.
In the case of a search engine, the size of the objects would be much
larger (eg. 10 kB per text webpage), but the number of queries may be
smaller. Hence, query complexity would be relatively low, but hint size
would be large. To compensate for this, it may make sense to make the
rectangle unbalanced in the other direction.
Beyond Plinko: comparing to the alternatives
One already-existing alternative to Plinko is TreePIR. TreePIR works by
using a "puncturable PRF": a hash function where you can provide a key
that allows evaluating it at all points except for one chosen
point. This allows you to generate a representation of the query columns
("values generated from a seed \(S\)
except the \(j\)'th value")
that is logarithmic-sized - much more compact than providing the whole
list directly:

The full approach is somewhat more clever than the above diagram: as drawn, the diagram reveals the excluded index 01100, violating privacy. TreePIR works by providing the yellow values without
revealing their (horizontal) position in the tree. There are \(2^{log_2(\sqrt{N})} = \sqrt{N}\) possible
sets of leaf values, and thus \(\sqrt{N}\) possible sets of columns, that
can be generated by taking the left or right choice of position for
each yellow value. The server has an \(O(\sqrt{N} * log(N))\) time algorithm to
compute the XOR sum of chosen cells for every possible arrangement
simultaneously. The client knows which of these sums corresponds to the
set that it actually needs. It uses single-server PIR to ask for it.
This reduces communication complexity to logarithmic, at the cost of
increased server work, and at the cost of removing the "invertible PRF"
mechanism, forcing the client to "grind" through many possible hints
until they find a matching hint.
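To make the puncturable-PRF idea more concrete, here is a minimal GGM-style construction in Python. This is the textbook puncturable PRF, not TreePIR's exact instantiation: the punctured key is the list of sibling seeds along the path to the excluded point, which is log-sized and lets you evaluate everywhere except that one index:

```python
import hashlib

def _child(seed, bit):
    return hashlib.sha256(seed + bytes([bit])).digest()

def ggm_eval(seed, index, depth):
    # Walk down a binary tree of seeds using the bits of `index`, MSB first
    for level in reversed(range(depth)):
        seed = _child(seed, (index >> level) & 1)
    return seed

def puncture(seed, index, depth):
    # The punctured key is the sibling seed at each level of the path to `index`
    key = []
    for level in reversed(range(depth)):
        bit = (index >> level) & 1
        key.append(_child(seed, 1 - bit))
        seed = _child(seed, bit)
    return key

def eval_all_except(punctured_key, index, depth):
    # Each sibling seed covers every leaf that agrees with `index` up to some
    # level and then differs; together they cover all leaves except `index`
    values = {}
    for step, sibling in enumerate(punctured_key):
        suffix_bits = depth - step - 1
        prefix = (index >> (suffix_bits + 1)) << (suffix_bits + 1)
        flipped = prefix | ((1 - ((index >> suffix_bits) & 1)) << suffix_bits)
        for suffix in range(1 << suffix_bits):
            values[flipped | suffix] = ggm_eval(sibling, suffix, suffix_bits)
    return values

# Usage: evaluate everywhere except index 12 (binary 01100) in a 32-leaf tree
seed, depth = b"\x00" * 32, 5
key = puncture(seed, 12, depth)
others = eval_all_except(key, 12, depth)
assert all(others[i] == ggm_eval(seed, i, depth) for i in range(32) if i != 12)
assert 12 not in others
```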
There are various theoretical
lower bounds
for how much overhead a PIR protocol needs to have. In reality, however,
most of these bounds come with caveats, and so the "real" lower bounds
are far fewer. For example, if you have obfuscation, you can make any of
these protocols require only a constant real-time communication overhead
- but of course, today, such protocols are still quite far away from
practicality. The harder thing to reduce is server-side computation.
Because it's "just" XOR, the workload in Plinko is already not a big
deal. But if we want to remove the client-side \(O(N)\) download requirement, then we need
the server side to do much heavier computation.
With Plinko, PIR is much further along than it was a decade ago, but
there is still theoretically much further that these protocols could
go.