wiki:NewImmutableEncodingDesign

The NewCapDesign page describes desired features of the next filecap design. This page is for designing the encoding format for these new immutable files.

Features

  • as described on NewCapDesign#filecaplength, we probably need 128bit confidentiality "C" bits, 256bit integrity "I" bits, and 128bit storage-collision resistance. There are encoding schemes that can combine the C and I bits (at the expense of convergence, or certain forms of offline attenuation).
  • we may define a "server-selection-index" (which is used to permute or otherwise narrow the list of servers to be used) to be separate from the "storage-index" (which is used to identify a specific share on whichever servers we actually talk to). This may involve a separate field in the filecap, or it may continue to be derived from the storage index.
  • some encoding schemes allow the readcap to be attenuated to a verifycap offline
  • in general, we don't care how long the verifycap is
  • the server should be able to validate the entire share by itself, without the readcap. In general, this means that the storage-index must also be the verifycap.
    • note that this implies that the storage-index cannot be computed until the end of encoding, when all shares have been generated, the share hash tree has been built, and its root has been added to the UEB.
      • this implies that we can't use the storage-index to detect convergence with earlier uploads of the same file. To retain convergence may require a lookup table on the server (mapping hash-of-readkey to storage-index, or something)
      • it also implies that storage-index can't be used as a server-selection-index, which again points to using hash-of-readkey as SSI (to retain convergence of server-selection; see the sketch after this list). Setting the storage-index at the end of upload requires a new uploader protocol, which uses an "upload handle" for the data transfer, and finishes with a "now commit this share to storage-index=X" message.
      • the original CHK design uses hash-of-readkey as storage-index, which has all these good properties except server-side full share validation. (servers can compare share contents against the UEB, and we could put a copy of the UEB hash into the share, but servers would continue to be unable to make sure the share was in the right place)
  • make the share layout more deterministic, to get rid of the RTT that's needed to fetch the offset table before we can fetch the UEB. Maybe reserve space for the UEB at the start of the share (but that makes it hard to have an extensible open-ended UEB).
  • since we probably do need to stick with the offset table, see if there's a way to guarantee having the same offsets across all shares. This would allow the table pulled (and validated) from one share to be used to compute read vectors for all other shares. We probably already have identical tables, but it'd be nice to make the guarantee explicit. The tables need to be covered by the UEB, else we can't share them: the first read might be from a corrupted share, and that would malign (hah!) the other shares.
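
To make that SSI idea concrete, here is a minimal sketch (in Python; derive_ssi and the "ssi:" tag are made-up names, not anything Tahoe defines) of deriving a short server-selection-index from the readkey and using it to permute the server list in the Tahoe2 style, so that server selection stays convergent even though the storage-index is only known at the end of encoding:

  import hashlib

  def derive_ssi(readkey, num_bytes=4):
      # hypothetical derivation: a truncated, tagged hash of the readkey
      return hashlib.sha256(b"ssi:" + readkey).digest()[:num_bytes]

  def permute_servers(ssi, server_ids):
      # rank each server by H(ssi + serverid); every client that derives
      # the same SSI computes the same ordering
      return sorted(server_ids, key=lambda sid: hashlib.sha256(ssi + sid).digest())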

Options

note: all cap-length computations assume the integrity-providing "I" field is 256bits long, and the confidentiality-providing "C" field is 128bits long. If we decide on different values, the sums below should be updated.

One: current CHK design

Readcaps consist of two main pieces: C bits and I bits, plus:

  • k (which improves the accuracy of the initial number of queries to send out)
  • N (which improves the guessed upper bound on number of queries to send out, and used to be required by the abandoned TahoeThree algorithm)
  • filesize (advisory only, used by deep-size measurements in lieu of fetching share data to measure filesize)

SI = H(C), SSI=SI. Verifycap is SI+I.

  • SSI and SI are known ahead of time, uploader protocol starts with SI
  • good convergence
  • long caps (128+256+len(k+N+filesize)) ~= 400bits
  • server cannot verify entire share
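
A minimal sketch of how these values relate in the current design (H() stands in for Tahoe's tagged SHA-256d hashes; the real tags and truncation lengths are omitted):

  import hashlib

  def H(data):
      return hashlib.sha256(data).digest()

  def chk_caps(readkey, ueb_hash, k, N, filesize):
      readcap = (readkey, ueb_hash, k, N, filesize)  # C bits + I bits + metadata
      storage_index = H(readkey)[:16]                # SI = H(C), known before upload starts
      server_selection_index = storage_index         # SSI = SI
      verifycap = (storage_index, ueb_hash)          # SI + I
      return readcap, storage_index, server_selection_index, verifycap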

Two: Zooko's Scheme

Zooko wrote this scheme up in this message and this diagram; it was improved upon during the subsequent tahoe-dev discussion.

  • pick a random readkey K1 (perhaps via CHK)
  • use K1 to encrypt the data
  • encode the shares and compute the UEB hash, as usual. (Zooko's message defines v=UEBhash)
  • K2 = H(K1+UEBhash)
  • K1enc = AES(key=K2, data=K1)
  • readcap = K2
  • storage-index = H(K2)
  • verifycap = H(K1enc+UEBhash)
  • store sharedata and K1enc on the server
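
A minimal sketch of the upload-side derivations (H() is a stand-in for a tagged hash; aes_encrypt is passed in rather than tied to a particular library, and IV/mode details are omitted):

  import hashlib

  def H(*parts):
      return hashlib.sha256(b"".join(parts)).digest()

  def zooko_upload_keys(K1, ueb_hash, aes_encrypt):
      # K1 is the (random or convergent) readkey already used to encrypt the data
      K2 = H(K1, ueb_hash)                  # readcap = K2
      K1enc = aes_encrypt(key=K2, data=K1)  # stored on the server next to the share
      storage_index = H(K2)
      verifycap = H(K1enc, ueb_hash)
      return K2, K1enc, storage_index, verifycap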

Properties:

  • storage-index is unknown until the end of encoding.
    • change the storage protocol to permit assignment of storage-index after upload
    • To get Tahoe2-style load-balancing, define a separate "server-selection-index" value (which has no cryptographic requirements and can be much shorter, perhaps 20 bits), chosen either randomly or via some convergence-secret-salted CHK scheme. Put this value in the readcap.
  • readcap is kappa (e.g. 128bits) plus len(SSI), so perhaps 148bits total
  • verifycap cannot be offline-derived from readcap (it requires knowing K1enc, which cannot be included in the UEB since that'd be circular)
  • servers do not have full validation (they can check that shares are internally consistent, but cannot confirm that the shares are stored under the correct storage-index)

To download the file, the readcap holder does:

  • extract server-selection-index from readcap, permute peerlist with it
  • extract K2 from readcap
  • compute storage-index = H(K2)
  • query peers for shares
  • fetch K1enc, UEBhash from at least one share
  • decrypt K1=AES(key=K2, data=K1enc)
  • compute K2'=H(K1+UEBhash), assert K2'==K2
    • any share which fails this test is discarded
    • once the valid UEBhash is determined, any share with a different UEBhash is discarded
  • fetch UEB, validate against UEBhash
  • fetch share data, validate, decode, decrypt, emit plaintext
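
A minimal sketch of the per-share check from the list above (H() and aes_decrypt are stand-ins; real code would use Tahoe's tagged hashes and its AES wrapper):

  import hashlib

  def H(*parts):
      return hashlib.sha256(b"".join(parts)).digest()

  def recover_readkey(K2, K1enc, claimed_ueb_hash, aes_decrypt):
      # decrypt the stored readkey and re-derive the readcap; a mismatch
      # means this share's K1enc or UEBhash is wrong, so discard the share
      K1 = aes_decrypt(key=K2, data=K1enc)
      if H(K1, claimed_ueb_hash) != K2:
          return None
      return K1  # safe to use for decrypting the ciphertext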

Three: Brian's previous flawed reconstruction of Zooko's scheme

Readcaps contain one crypto value that combines the C and I fields. (I forget how this worked... it was clever, but I think it had some fatal flaw, like not being able to get a storage-index from the readcap without first retrieving shares, or something. One of us will dig up the notes on it and describe it here).

  • short caps
  • convergence problems

this tahoe-dev message contains some clues:

For immutable files, you have to encrypt+encode the file before you can get the secure hash that forms the root of the integrity check (because we want to validate the individual shares too). Since you must commit to the encryption keys before you get those bits, you can't just derive the encryption keys from that hash. But you could combine that hash with some other secret, and hide the main encryption key by encrypting it with that secret, and then the bits of that secret would each provide C+Ir. I think this is closely related to Zooko's scheme.

I still can't reconstruct it, but here's one possibility:

  • pick a random readkey K1 (perhaps via CHK)
  • pick another random key K2
  • use K1 to encrypt the data
  • encode the shares and compute the UEB hash, as usual
  • store AES(key=H(K2+UEBhash), data=K1) in the share: the encrypted readkey
  • readcap = K2
  • storage-index = H(K2)
  • or storage-index = UEBhash
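
A minimal sketch of that guessed construction (same stand-in H() and aes_encrypt as above; this does nothing to fix the collusion flaw discussed below):

  import hashlib

  def H(*parts):
      return hashlib.sha256(b"".join(parts)).digest()

  def guessed_scheme(K1, K2, ueb_hash, aes_encrypt):
      K1enc = aes_encrypt(key=H(K2, ueb_hash), data=K1)  # the encrypted readkey, stored in the share
      readcap = K2
      storage_index = H(K2)   # alternative: storage_index = ueb_hash
      return readcap, K1enc, storage_index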

If SI=H(K2), then the reader starts from readcap=K2, computes the SI, fetches the share, computes UEBhash, computes H(K2+UEBhash), decrypts K1, then compares the share hash trees against the UEB, decodes the shares, and uses K1 to decrypt them. The servers can't validate anything against the storage-index (because it is derived from the readcap, which is forbidden to them). And I'm pretty sure that Rose the readcap-holder can collude with a quorum of storage servers to change the contents of the file, since there's nothing in K2 that's dependent upon the actual shares.

If SI=H(K2) and K1=CHK(data), then you get protection against Rose+Server, but you only get to discover problems after you've extracted all of the plaintext, so alacrity is maximally bad (you can't deliver the first byte of validated plaintext until you've retrieved the last byte of the ciphertext).

Not having a validated UEB is troublesome, because then you don't know which shares are good, so if whatever subsequent validation you have (like a CHK) fails, you don't know which shares to throw out. But since all shares are supposed to have a copy of the same UEB, there's a workaround: validate each share up to its local UEB, sort the varieties of UEBs that you see, start with the most popular one, see if a decode matches your CHK. If not, iterate to the next-less-popular variant, etc.
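
A minimal sketch of that workaround (decode_and_check is a hypothetical callback that decodes a set of shares and returns the plaintext only if it matches the CHK):

  from collections import Counter

  def try_decode_by_ueb_popularity(shares, decode_and_check):
      # shares: list of (share, local_ueb_hash) pairs, each already
      # validated up to its own local UEB
      counts = Counter(ueb for _share, ueb in shares)
      for ueb, _count in counts.most_common():
          subset = [s for s, u in shares if u == ueb]
          plaintext = decode_and_check(subset)  # None if the CHK doesn't match
          if plaintext is not None:
              return plaintext
      return None  # no UEB variant produced plaintext matching the CHK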

If SI=UEBhash, then you've got the integrity checking that you want, but there's no way for the readcap holder to find out what the storage-index is, so they can neither locate a share (to compute everything else) nor can they use the storage-index to validate anything. You could imagine a separate table, though, that mapped H(K2) to storage-index values, but of course this table would have to be stored somewhere, etc.

Mutable readcap plus commitment

Another idea is to upload regular mutable files, create a readcap, and then augment the readcap with some number of I bits (basically the "R=roothash" in our current mutable-file layout, which corresponds to the UEB hash in our immutable layout). Having the readcap would reduce the set of people who could modify your file down to the writecap holders; adding the I bits would reduce or remove even their ability to modify it.

But I think this converges on the current CHK design once you make those non-modification guarantees equally strong.

Others?

Ideas

It might be possible to have the uploader give two values to the server, at different stages of the upload process, which (together) would allow full validation of the resulting share. Using a single value (the verifycap) as the storage index would be cleaner, but might not be strictly necessary.

The servers could maintain a table, mapping from one sort of index to another, if that made it easier for the upload process to proceed (or to achieve convergence). For example, H(readkey) is known at the beginning of upload, but the I bits aren't known until the end. If the client could use SSI=H(readkey) and then ask each server to tell them the storage-index of any shares which used H(readkey), it could achieve convergence and still use the I bits as the storage-index. The servers would be obligated to maintain a table with one entry per bucket (so probably ~20M entries), and errors/malicious behavior in this table would cause convergence failures (which are hardly fatal).
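
A minimal sketch of such a server-side table (the class and method names are illustrative only, not part of any actual storage protocol):

  class ConvergenceTable:
      def __init__(self):
          self._table = {}  # H(readkey) -> storage_index

      def record(self, hash_of_readkey, storage_index):
          # called when an upload commits its share under storage_index
          self._table[hash_of_readkey] = storage_index

      def lookup(self, hash_of_readkey):
          # a missing or wrong entry only costs convergence, not integrity
          return self._table.get(hash_of_readkey)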

The SSI can be much shorter than the SI. It only needs to be long enough to provide good load-balancing properties. It could be included explicitly in the filecap. Alternate (non-TahoeTwo) peer-selection strategies could encode whatever per-file information they needed into the SSI, assuming some sort of tradeoff between cap length (i.e. SSI length) and work done by the downloader to find the right servers.

Last modified on 2010-01-07T01:48:32Z