#757 closed enhancement (fixed)

there isn't a doc that says "which operations are efficient"

Reported by: zooko Owned by: nobody
Priority: major Milestone: undecided
Component: documentation Version: 1.4.1
Keywords: performance docs large Cc:
Launchpad Bug:

Description

Jonathan Ellis says that he was telling a co-worker: "There are a lot of docs on Tahoe, but there's nothing that says which operations are efficient."

Attachments (1)

performance.darcspatch.txt (52.5 KB) - added by kevan at 2010-02-02T01:11:45Z.

Download all attachments as: .zip

Change History (25)

comment:1 Changed at 2009-07-14T14:10:35Z by zooko

To close this ticket, add notes to the Performance page about the general efficiency of all of the major operations, and link to the Performance page from the webapi.txt doc.

comment:2 Changed at 2009-07-15T01:30:22Z by warner

Also, include a copy of these notes in the source tree somewhere. Adding docs to the wiki is great and all, but I'd like people to be able to grab a source tree and know that they don't need to get anything else.

The biggest benefits (in my mind) of having docs on the wiki is that 1) everybody can edit them, add questions, etc, and 2) you can reference them from other wiki pages with nice short WikiNames. The benefit of having docs in the source tree are 1) long-term persistence, 2) no version skew between the tahoe version you're looking at and the docs, and 3) local availability despite allmydata.org and/or trac being down.

Maybe each file in docs/ could have a corresponding wiki page, and the file could have a note at the bottom saying "for the latest version of this document, please see URL". And we could write a script that would fetch all the docs and copy them into the source tree.

comment:3 Changed at 2009-12-04T05:31:59Z by davidsarah

  • Keywords performance docs added

comment:4 Changed at 2010-01-14T05:02:57Z by zooko

#878 (warn users about the performance issues of mutable files) was a subset of this ticket. By the way, Jonathan Ellis is an engineer at Rackspace and recently posted this: http://www.rackspacecloud.com/blog/2009/09/23/the-cassandra-project/

When we mentioned that comment 6 months ago I guess they were internally evaluating Tahoe-LAFS as a candidate technology for Rackspace's cloud offerings. Since I haven't heard that much from him since, I guess they are not using it. :-(

Anyway, documenting the approximate efficiency of the various Tahoe-LAFS operations should help people like that evaluate Tahoe-LAFS in the future.

comment:5 Changed at 2010-01-14T18:53:06Z by zooko

  • Owner changed from somebody to kevan

Kevan: since you did #878 you might also be interested in this ticket. If so, accept it.

comment:6 Changed at 2010-01-15T04:44:20Z by kevan

  • Status changed from new to assigned

comment:7 Changed at 2010-01-15T04:59:38Z by kevan

(summarizing a discussion on IRC with zooko)

This file will live at docs/performance.txt. It will contain a list of operations performed by Tahoe-LAFS, and an approximation of the cost of each operation in whatever units are appropriate.

comment:8 Changed at 2010-01-15T20:30:14Z by davidsarah

  • Keywords large added

comment:9 Changed at 2010-01-29T04:41:36Z by kevan

Okay, so what operations would be good to have here?

My list so far:

  • Publishing a mutable file.
  • Publishing an immutable file.
  • Downloading a mutable file.
  • Downloading an immutable file.
  • Replacing a mutable file
  • Modifying a mutable file (in the sense of, say, adding an entry to a directory)
  • Using tahoe backup
  • Verification
  • Checking/Repair?
  • Listing a directory

The only code in that list that I'm intimately familiar with is the upload code, so if I'm separating operations that aren't very different into different categories, that's why (and please let me know if you think I am!).

What other operations might I think of including?

comment:10 Changed at 2010-01-29T09:10:57Z by warner

I'd break out some operations that people might (or might not) expect to cost proportional to the size of the data they touch, especially when those operations actually cost proportional to the size of the overall file. So things like:

  • publishing A-byte immutable file
  • downloading B bytes (possibly all) of an A-byte immutable file
  • initial publish of an A-byte mutable file
  • modify B bytes of an existing A-byte mutable file
  • insert/delete B bytes in the middle of an existing A-byte mutable file
  • downloading B bytes (possibly all) of an A-byte mutable file
  • adding one directory entry to an existing A-entry directory
  • listing an A-entry directory

The only surprises for our current formats is that any mutable-file operation costs O(A) (i.e. proportional to filesize, even if you're only modifying or downloading a single byte), likewise for directories. Each directory entry tends to use about 300-330 bytes. And of course mutable-file creation requires a relatively slow (1 or 2 second) RSA keypair generation.

comment:11 Changed at 2010-01-30T22:54:55Z by kevan

I'm attaching a first stab at this.

If I understand correctly, mutable files can be modified in two ways:

  • Overwriting their contents with those of a new file. This is what tools like the WUI and CLI do when updating a mutable file.
  • download-modify-upload, which is done internally to directories when entries are added or removed to/from them.

When I read "insert/delete B bytes in the middle of an existing A-byte mutable file", I thought of download-modify-upload (otherwise, this just seems like a special case of "modify B bytes of an existing A-byte mutable file"). I then wondered if it was appropriate to have an explanation of that in there. If performance.txt is intended for end-users and people who maintain grids, and if they wouldn't ever directly modify mutable files with download-modify-upload (since none of the end-user interfaces do that directly), then it seems like we'd do fine to just move the explanation for "insert/delete B bytes in the middle of an existing A-byte mutable file" into the explanation about modifying directories, since that is a special case of the same operation, and something that end users can do directly.

I also removed the existing blurb about mutable files, since it seemed redundant now that there are blurbs about mutable files elsewhere. If anyone misses it, let me know and I'll put it back.

I still want to add blurbs for the various verification operations, but I probably won't get around to that until tomorrow or Monday.

comment:12 Changed at 2010-02-02T01:10:16Z by kevan

I'm attaching an updated patch with the costs of checking, verifying, and repairing files.

I collapsed computational cost and network cost into one measurement where I thought it was appropriate (which was in most places), though I wonder if that was the right thing to do. It is nicer to read if we have just one rough cost for each operation, and doing so achieves the goal of telling users about the surprises associated with some operations, but it is imprecise. Granted, this file is imprecise -- people who really wanted precise information would probably go read the code -- but I wonder if we're losing too much.

Changed at 2010-02-02T01:11:45Z by kevan

comment:13 Changed at 2010-02-02T01:13:42Z by kevan

If you're looking at this to review it:

  • Am I doing a good job of compressing notes and justifications into digestible chunks without losing a lot of information? (alt: are there any glaring inaccuracies? I looked over the code for the operations that are in there to try to make sure there wouldn't be, but I could have missed something)
  • Do you think there are any operations that should be in there and aren't?

comment:14 Changed at 2010-02-02T01:17:01Z by kevan

  • Keywords review-needed added
  • Owner changed from kevan to nobody
  • Status changed from assigned to new

comment:15 Changed at 2010-02-02T02:03:32Z by warner

When I read "insert/delete B bytes in the middle of an existing A-byte mutable file", I thought of download-modify-upload (otherwise, this just seems like a special case of "modify B bytes of an existing A-byte mutable file").

I mentioned that just because we have plans to make it fast: O(B) instead of O(A). As far as I know, no existing filesystem can do this right now, even local disk filesystems. This is the "LDMF" proposal. It's entirely reasonable to leave it out until we actually get around to building LDMF.

comment:16 follow-up: Changed at 2010-02-02T02:19:27Z by zooko

  • Yes, I think we should break out RAM, CPU, bandwidth, and round-trips. Without the units, the asymptotic measurements aren't useful enough. O(A) CPU usage is typically not a problem. O(A) RAM usage is either no problem at all or a huge problem, depending on whether A is greater than your physical RAM. Bandwidth and round-trips are always a problem. :-)

comment:17 Changed at 2010-02-02T04:14:17Z by davidsarah

"cost: O(A) + a large constant for RSA + memory usage."

Since there's no way of using an algorithm other than RSA at the moment, this could just say "a large constant".

"Modifying B bytes of an A-byte mutable file" and "Adding/Removing? B bytes in an A-byte mutable file" could probably be merged. The point about possibly being able to implement the latter efficiently even though local filesystems can't, could be made in the notes without requiring two sections.

comment:18 Changed at 2010-02-02T04:15:28Z by zooko

There is no way currently in the cli, wui or the wapi to upload a file without using convergent encryption. (The confidentiality risk of convergent encryption is solved by adding in a separate "added convergence secret", not by skipping the step of hashing the cleartext to generate a symmetric key.) Therefore, all uploads of immutable files take two passes over the file. If you're uploading through the wui/wapi, this means your client (i.e. web browser, or a wapi-using client) first reads the entire file from disk while streaming it to the gateway, then the gateway writes it out to a temporary directory on disk while hashing it to generate the symmetric encryption key, then the gateway reads it again from the beginning from its temporary location on disk while encrypting it, erasure-coding it, and uploading the shares to the storage servers.

If you're uploading from within the tahoe node process itself (i.e. you've extended your tahoe node with your own code instead of using the wapi) then it will make two consecutive passes of reading the entire file from its original location on disk and then encrypt, erasure-code, and upload during the second pass.

#329 (add streaming (on-line) upload to HTTP interface) is about allowing one-pass "streaming" upload, so for example the web gateway would no longer write a temporary copy of the file to disk at all but would instead process it incrementally in (a small amount of) RAM. Shawn Willden contributed the first step of #320, which is code to use a random encryption key instead of to hash the file and generate a convergent encryption key. See: src/allmydata/immutable/upload.py@4164#L1099. However, as far as I can tell from the wapi code and the docs, there is no way to access this feature through the wapi. (I thought that Shawn contributed a patch to make this feature available. Maybe that patch never got accepted into trunk?)

comment:19 in reply to: ↑ 16 ; follow-up: Changed at 2010-02-02T04:22:11Z by davidsarah

Replying to zooko:

  • Yes, I think we should break out RAM, CPU, bandwidth, and round-trips. Without the units, the asymptotic measurements aren't useful enough. O(A) CPU usage is typically not a problem. O(A) RAM usage is either no problem at all or a huge problem, depending on whether A is greater than your physical RAM. Bandwidth and round-trips are always a problem. :-)

If we broke them out, they'd still all end up as O(A) at the moment, no? (even though it would be possible to use only O(1) memory). I think it would be sufficient just to say "O(A) memory, time, and total bandwidth". Round-trips are interesting, but I'm not sure we have time to work that out if we want to commit this for 1.6.

comment:20 Changed at 2010-02-02T04:25:07Z by zooko

I meant "#320" in all places in comment:18, not "#329".

comment:21 in reply to: ↑ 19 Changed at 2010-02-02T04:27:44Z by zooko

Replying to davidsarah:

If we broke them out, they'd still all end up as O(A) at the moment, no? (even though it would be possible to use only O(1) memory). I think it would be sufficient just to say "O(A) memory, time, and total bandwidth". Round-trips are interesting, but I'm not sure we have time to work that out if we want to commit this for 1.6.

Well, all operations on immutable files take O(1) RAM (although as described in comment:18 upload takes O(N) temporary disk space). Yeah, you may be right. I'm not sure!

comment:22 Changed at 2010-02-02T05:34:48Z by davidsarah

  • Resolution set to fixed
  • Status changed from new to closed

comment:23 Changed at 2010-02-15T03:57:48Z by zooko

  • Keywords review-needed removed
Note: See TracTickets for help on using tickets.