==== Brian says: ====

Having a function or class to control server-selection is a great idea. The
current code already separates out responsibility for server-selection into a
distinct class, at least for immutable files
(source:src/allmydata/immutable/upload.py#L131 {{{Tahoe2PeerSelector}}}). It
would be pretty easy to make the uploader use different classes according to
a {{{tahoe.cfg}}} option.

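As a rough sketch of what such a hook might look like (the "peer-selector"
option name, the registry, and the {{{get_config}}} lookup are assumptions
for illustration; only {{{Tahoe2PeerSelector}}} exists today):

{{{
#!python
# Hypothetical wiring, not current Tahoe code: let a tahoe.cfg option pick
# the peer-selection class used by the immutable uploader.
from allmydata.immutable.upload import Tahoe2PeerSelector

PEER_SELECTORS = {
    "tahoe2": Tahoe2PeerSelector,
    # alternative selector classes would be registered here
}

def choose_peer_selector(client):
    # assumes a section/option/default lookup against tahoe.cfg
    name = client.get_config("client", "peer-selector", "tahoe2")
    return PEER_SELECTORS[name]
}}}
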
However, there are some additional properties that need to be satisfied by
the server-selection algorithm for it to work at all. The basic Tahoe model
is that the filecap is both necessary and sufficient (given some sort of grid
membership) to recover the file. This means that the eventual
'''downloader''' needs to be able to find the same servers, or at least have
a sufficiently-high probability of finding "enough" servers within a
reasonable amount of time, using only information which is found in the
filecap.

If the downloader is allowed to ask every server in the grid for shares, then
anything will work. If you want to keep the download setup time low, and/or
if you expect to have more than a few dozen servers, then the algorithm needs
to be able to do something better. Note that this is even more of an issue
for mutable shares, where it is important that publish-new-version is able to
track down and update all of the old shares: the chance of accidental
rollback increases when it cannot reliably/cheaply find them all.

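To make the rollback hazard concrete, here is a deliberately simplified
illustration (not the real mutable-publish code): the publisher derives its
new sequence number from whichever shares it managed to locate, so any
shares it never finds cannot influence the version it writes.

{{{
#!python
# Simplified illustration of the rollback hazard, not actual mutable-file
# code.
def next_seqnum(found_seqnums):
    # If the newest shares live on servers we never queried, this value is
    # computed from stale shares, and the version we publish is based on
    # older contents -- the accidental-rollback hazard described above.
    return max(found_seqnums, default=0) + 1
}}}
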
Another potential goal is for the download process to be tolerant of new
servers, removed servers, and shares which have been moved (possibly as the
result of repair or "rebalancing"). Some use cases will care about this,
while others may never change the set of active servers and won't care.

It's worth pointing out the properties we were trying to get when we came up
with the current "tahoe2" algorithm:

 * for mostly static grids, download uses minimal do-you-have-share queries
 * adding one server should only increase download search time by
   1/numservers
 * repair/rebalancing/migration may move shares to new places, including
   servers which weren't present at upload time, and download should be able
   to find and use these shares, even though the filecap doesn't change
 * traffic load-balancing: all non-full servers get new shares at the same
   bytes-per-second, even if serverids are not uniformly distributed

We picked the pseudo-random permuted serverlist to get these properties. I'd
love to be able to get stronger diversity among hosts, racks, or data
centers, but I don't yet know how to get that '''and''' get the properties
listed above, while keeping the filecaps small.
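
For reference, here is a minimal sketch of the permuted-serverlist idea
(using SHA-256 purely for illustration; it is not necessarily the exact hash
or code path Tahoe uses): every file's storage index deterministically
re-orders the known serverids, so the uploader and the eventual downloader
derive the same order from nothing more than the filecap plus grid
membership.

{{{
#!python
# Minimal sketch of a permuted serverlist, for illustration only.
import hashlib

def permuted_servers(serverids, storage_index):
    # storage_index and each serverid are bytes. Each (storage_index,
    # serverid) pair hashes to a position on a ring; sorting by that hash
    # yields a per-file pseudo-random server order that anyone holding the
    # filecap can rebuild.
    return sorted(serverids,
                  key=lambda sid: hashlib.sha256(storage_index + sid).digest())
}}}

Adding one server drops it into a random-looking position in each file's
permutation, so it only perturbs roughly 1/numservers of any given search
order, and because the permutation differs per file, new shares land evenly
across the non-full servers even when the serverids themselves are not
uniformly distributed.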