#1092 new defect

shares.happy is the wrong name of the measure

Reported by: zooko Owned by: warner
Priority: minor Milestone: soon
Component: code-nodeadmin Version: 1.7.0
Keywords: usability upload servers-of-happiness unfinished-business Cc: kevan
Launchpad Bug:

Description (last modified by zooko)

There is a configuration option named shares.happy which is how you control the servers-of-happiness value. It is mis-named! It should be named servers.happy. Of course, it belongs right next to shares.needed and shares.total, but hopefully placement and docs can make their intimate relationship clear. Also, shares.needed serves double-duty. It means both:

  1. Number of shares necessary to reconstruct the file, and
  2. Number of servers necessary to serve the file in a servers-of-happiness upload-quality metric.

Maybe that name should also be changed or at least documented even more carefully. Assigning to Brian. The next step on this ticket is for Brian to study the new servers-of-happiness feature (#778) and let us know what he thinks about it, both in general and in regard to this specific issue.

Attachments (1)

1092.dpatch (8.3 KB) - added by kevan at 2010-12-23T06:45:52Z.


Change History (13)

comment:1 follow-up: Changed at 2010-12-23T06:45:26Z by kevan

I'm attaching a patch that changes shares.happy to servers.happy. The client now ignores shares.happy, since it doesn't make a lot of sense to use shares.happy for servers.happy, given the differences between the two robustness metrics. Should we make the startup code print a warning if it doesn't find a servers.happy but does find a shares.happy?
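The warning asked about above could look roughly like the following. This is a minimal sketch and not the code in the attached patch: the function name `read_happy` and the warning text are made up for illustration, and `servers.happy` is the name proposed in this ticket, not an established option. The fallback is passed in by the caller because the right default is still under discussion.

```python
import configparser
import warnings


def read_happy(cfg_text, default):
    """Return the happiness threshold from tahoe.cfg-style text.

    Hypothetical helper: only the proposed servers.happy name is
    honoured. A leftover shares.happy is ignored, but triggers a
    warning so the operator knows the setting had no effect.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    if parser.has_option("client", "servers.happy"):
        return parser.getint("client", "servers.happy")
    if parser.has_option("client", "shares.happy"):
        warnings.warn(
            "shares.happy is ignored; set servers.happy in [client] instead"
        )
    return default
```

A node whose tahoe.cfg still carries only the old name would then start with the fallback value and a visible warning, rather than silently changing behaviour.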

I've defined servers.happy with the default value of 1; this means that servers of happiness checks will be disabled for nodes without a servers.happy directive in their tahoe.cfg (including the result of tahoe create-node).

I don't think there's a particularly convincing argument for leaving the default at 7; probably the only good it is doing is forcing people to reason about their grid when they have to go in and edit tahoe.cfg when their uploads fail because their "Hello, world!" grid isn't big enough to satisfy servers.happy=7. There are probably friendlier ways to do that :-). I'm open to being convinced for a value that isn't 1, but I think that there's something to be said for giving the user the information that they need to set the value sensibly and staying out of their way until they do that.

(I don't have a clear opinion yet on shares.needed, since I hadn't thought about that until I read the ticket this morning)

Changed at 2010-12-23T06:45:52Z by kevan

comment:2 Changed at 2010-12-23T14:49:28Z by gdt

-1 on the servers.happy.

If we're going to change, I think it would be good to also pick a different word than happy. There's an important concept lurking under a seemingly flippant word.

What's really going on is that this single variable is a rough first cut at ensuring that there is adequate redundancy based on some policy and some knowledge of physical and administrative correlation among servers. I see the 3/7/10 values as very closely linked, and changing shares to servers makes that less clear.

I do agree that shares.happy gives the wrong impression. So I'll suggest "shares.independent", with the meaning being "the minimum number of shares that must be on independent servers". I think that's what is meant, and this keeps the parallelism of shares.* and clarifies this variable. One could have shares.independent and shares.independent-target, but I'm not sure independent-target needs to be different from total.

The current ordering gives the impression that shares.needed and shares.total are more independent than they are. So perhaps "shares.coding = (3, 10)" would be better than two variables. (I am under the impression that I can't just set shares.total to 12 and reconstruct those missing sh10, sh11 without having to recode the entire file; if I'm confused on that point this paragraph is invalid.)

3/7/10 seems reasonable, and I've been using 2/5/7. I don't think it makes sense to talk about the right value of shares.independent/shares.happy without considering the whole 3-tuple.

comment:3 Changed at 2010-12-23T15:26:03Z by gdt

Thinking about kevan's comments on the default, I think there are two use cases: setting up a single node with storage to play with tahoe for the very first time, and actually wanting to store bits. 1 is definitely not a good value for actual use. So perhaps there should be "tahoe create-test-node" that has encoding parameters set up for demo use, where the node is client, server, and introducer. Then create-node can be tuned for real use.

comment:4 in reply to: ↑ 1 Changed at 2010-12-23T19:19:34Z by davidsarah

Replying to kevan:

... I've defined servers.happy with the default value of 1; this means that servers of happiness checks will be disabled for nodes without a servers.happy directive in their tahoe.cfg (including the result of tahoe create-node).

I don't think there's a particularly convincing argument for leaving the default at 7; probably the only good it is doing is forcing people to reason about their grid when they have to go in and edit tahoe.cfg when their uploads fail because their "Hello, world!" grid isn't big enough to satisfy servers.happy=7. There are probably friendlier ways to do that :-). I'm open to being convinced for a value that isn't 1, but I think that there's something to be said for giving the user the information that they need to set the value sensibly and staying out of their way until they do that.

A value of 1 means that at least one share has been placed (it is vacuously true that it is on an independent server). This isn't sufficient for the file to be retrievable.

We should probably require that at least k shares are placed in order for an upload or repair to succeed, regardless of the happiness threshold. In that case happiness thresholds less than k would make more sense.

Independently of that, I don't think that 1 is a sensible default. Even for a toy grid that is only being created for someone to see that Tahoe works, it's not unreasonable to require at least two servers. If the happiness threshold is 1, then even if there are no other servers, uploads will succeed by putting shares on the gateway, provided it has sufficient space. I don't think they should succeed (by default) in that case.
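To make the threshold concrete, here is a minimal sketch. It assumes the servers-of-happiness measure from #778 is the size of the largest one-to-one pairing of servers with distinct shares; that definition is an assumption here, not something spelled out in this ticket. The names `happiness` and `upload_ok` are illustrative, and the extra "at least k shares placed" check is the suggestion made above, not current behaviour.

```python
def happiness(placement):
    """Size of a maximum matching between servers and shares.

    placement maps a server id to the set of share numbers it holds.
    Assumed definition of the servers-of-happiness measure.
    """
    match = {}  # share number -> server currently paired with it

    def try_assign(server, seen):
        for share in placement[server]:
            if share in seen:
                continue
            seen.add(share)
            # Take a free share, or move the server that holds it
            # onto one of its other shares (augmenting path).
            if share not in match or try_assign(match[share], seen):
                match[share] = server
                return True
        return False

    return sum(try_assign(server, set()) for server in placement)


def upload_ok(placement, needed, happy):
    """Suggested rule: at least `needed` distinct shares must be placed,
    and the happiness measure must reach the `happy` threshold."""
    placed = set().union(*placement.values())
    return len(placed) >= needed and happiness(placement) >= happy
```

Under these assumptions, a gateway holding all ten shares by itself scores 1, so a threshold of 1 lets the upload succeed with no other servers, while a threshold of 2 would reject it.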

comment:5 Changed at 2010-12-24T23:34:14Z by warner

You know, I actually kinda like servers.happy=1, probably because I still haven't internalized the whole bijective-mapping-of-servers concept yet. (I mean, I know what's going on, yet each time that error appears, I walk away in confusion because the text of the error message is so hard to follow, so it leaves a general taste in my mouth that the whole idea is bad, even though I know it's not really that bad)

Kevan's arguments in the first comment are spot on. "forcing people to reason about their grid" needs to happen in a friendlier place than the error message.

gdt's comment about the flippant use of "happy" is accurate too. I originally picked that for shares-of-happiness because it was a somewhat arbitrary threshold applied in a very narrow and probably-rare error case (you've connected to enough servers at the start of the upload, but then some were lost by the time you finished.. do you still declare success? are you still happy?)

The current ordering gives the impression that shares.needed and shares.total are more independent than they are. So perhaps "shares.coding = (3, 10)" would be better than two variables. (I am under the impression that I can't just set shares.total to 12 and reconstruct those missing sh10, sh11 without having to recode the entire file; if I'm confused on that point this paragraph is invalid.)

(you're correct: you can't go from 3-of-10 to 3-of-12 without reencoding the whole file. raw zfec would treat them the same, but the share-hash-trees that tahoe adds for integrity checking would be different, so we fold both k and N into the CHK hash, so you'll get an entirely different encryption key and share data anyways)

Yeah, combining two tahoe.cfg directives into one might be a good idea. In fact, it should be phrased in the same way we talk about it in english:

  [client]
  shares.encoding = 3-of-10
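Reading such a combined value would be straightforward. The following is a sketch only: `shares.encoding` is the directive proposed in this comment and does not exist in tahoe.cfg today, and `parse_encoding` is a hypothetical helper name.

```python
import re


def parse_encoding(value):
    """Parse a proposed 'K-of-N' encoding string into (needed, total)."""
    m = re.fullmatch(r"\s*(\d+)-of-(\d+)\s*", value)
    if m is None:
        raise ValueError(f"expected K-of-N, got {value!r}")
    needed, total = int(m.group(1)), int(m.group(2))
    # K shares out of N can only make sense when 1 <= K <= N.
    if not 1 <= needed <= total:
        raise ValueError(f"need 1 <= K <= N, got {needed}-of-{total}")
    return needed, total
```

Keeping k and N in one value also lets the parser reject inconsistent pairs in one place, which two separate directives cannot do as directly.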

So I'll suggest "shares.independent", with the meaning being "the minimum number of shares that must be on independent servers"

I get the impression that this issue is more about "servers" than about "shares", so I wonder if maybe it ought to be "servers.independent". I know the math touches both, but I'd like to give users the ability to learn how this works in chunks, where the first chunk is only about shares ("3-of-10, I need 3 distinct shares, doesn't matter where they come from, ok, got it"), and then a later chunk is about where those shares are placed ("oh, right, what happens if there aren't enough servers?"). Maybe, if all the "shares.*" configuration fit into the first chunk, then all the controls that involve servers (even though they also involve shares) could be put into a different namespace and support the user's concept of a second chunk of things to learn. "servers.*" might support that.

I'm still undecided about what the default "use-case" ought to be. I think it's vital that folks be able to bring up a small grid and test it out. I also think it's important to protect "tahoe backup" users against the trivial case where you're only putting shares on yourself. Maybe what I'm really wishing for is better #467 explicit-server-selection code and UI. Maybe I'm coming around to the idea that diversity trumps write-availability: if you have some way of configuring (or at least acknowledging) who you're *supposed* to connect to, then you could fail writes unless all those servers were present. Maybe a set of checkboxes on the known-servers web page, meaning "don't allow uploads to succeed unless this server is present". Maybe I'm balking at simple integer success criteria because I don't see it as being easy for a user (or me) to understand what it means, whereas a list of required serverids is pretty straightforward.

But I'm hesitant on the explicit serverlist too, because of how it'd not work so well in very dynamic grids, and how it kind of needs constant attention and decision making by the user.

Hm. I'll think about the checkboxes idea more, I kinda like it.
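The checkbox idea amounts to a set comparison. A minimal sketch, with hypothetical function names and plain strings standing in for serverids, might be:

```python
def missing_required(required_ids, connected_ids):
    """Return the required serverids that are not currently connected."""
    return set(required_ids) - set(connected_ids)


def may_upload(required_ids, connected_ids):
    """Allow an upload only when every required server is present."""
    return not missing_required(required_ids, connected_ids)
```

Returning the missing ids, instead of just a yes/no answer, would let an error message name the absent servers, which may be easier to act on than an integer threshold.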

comment:6 Changed at 2010-12-29T09:15:12Z by zooko

  • Keywords servers-of-happiness unfinished-business added

comment:7 Changed at 2015-05-12T16:50:36Z by zooko

  • Description modified (diff)
  • Milestone changed from eventually to 1.11.0

comment:8 in reply to: ↑ description Changed at 2015-05-14T15:54:07Z by daira

Replying to zooko:

Also, shares.needed serves double-duty. It means both:

  1. Number of shares necessary to reconstruct the file, and
  2. Number of servers necessary to serve the file in a servers-of-happiness upload-quality metric.

This is wrong. shares.needed only ever refers to a number of shares. Those shares can be served from any number of servers (which necessarily is between 1 and shares.needed inclusive, but that's a logical requirement rather than an additional criterion imposed by the upload/download/repair algorithms).

comment:9 Changed at 2016-03-22T05:02:52Z by warner

  • Milestone changed from 1.11.0 to 1.12.0

Milestone renamed

comment:10 Changed at 2016-06-28T18:20:37Z by warner

  • Milestone changed from 1.12.0 to 1.13.0

moving most tickets from 1.12 to 1.13 so we can release 1.12 with magic-folders

comment:11 Changed at 2020-06-30T14:45:13Z by exarkun

  • Milestone changed from 1.13.0 to 1.15.0

Moving open issues out of closed milestones.

comment:12 Changed at 2021-03-30T18:40:19Z by meejah

  • Milestone changed from 1.15.0 to soon

Ticket retargeted after milestone closed
