#648 closed enhancement

collect server capacities and put them on the welcome page — at Version 20

Reported by: zooko
Owned by:
Priority: major
Milestone: 1.10.1
Component: code-frontend-web
Version: 1.3.0
Keywords: introducer usability transparency ostrom statistics test-needed
Cc: kpreid, ussjoin@…
Launchpad Bug:

Description (last modified by davidsarah)

As we're setting up the Volunteer Grid, this makes me want to see a summary of total storage capacity and free storage capacity on each server on the introducer's [and gateway's] welcome page.

Change History (20)

comment:1 follow-up: Changed at 2009-03-01T04:26:56Z by warner

Yeah! I've been thinking of two approaches:

  • add methods to the existing storage server remote API to query for total-space, space-available, etc (basically all the storage-related things you can get from the current stats gatherer). Have the introducer (or anyone else who's interested) query this interface and aggregate the results.
  • add a new service class (alongside the single "storage" class that we have now), with a separate remote API, that just provides space-available information. Publish this through the introducer. Have the introducer (or anyone else who's interested) query this interface and aggregate the results.

The first approach feels a bit weird because it would conflate server access (upload/download shares) with a purely informational interface, and getting access to one should not necessarily provide access to the other. The second approach feels cleaner, but I've been holding off on implementing it until #466 is done (signed/extensible introducer messages, which is blocked on ECDSA). It doesn't strictly require #466, though, so maybe we could build it first.

Another approach would be to use the extensible-message part of #466 and publish space-available information in each announcement, but this would never be updated/updateable as quickly as having a remotely-callable query interface.

In any case, the information could be used by either the introducer, or by a separate disk-watcher process, not unlike the one we have right now. The existing disk-watcher queries the HTTP-based stats interface on each node to construct total-available, total-left, and rate-of-space-usage averages. One annoying aspect of this HTTP-based approach is that it must be configured manually: each time you add a server, you have to add its /statistics URL to the list. A process which used the introducer announcements to locate storage servers to query would be a lot easier to use.
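Whichever interface ends up supplying the numbers, the aggregation step itself is small. A minimal sketch, assuming each server's answer has already been collected into a record; the `ServerSpace` type and its field names are hypothetical and not part of any existing Tahoe-LAFS API:

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ServerSpace:
    """Hypothetical per-server report; not an existing Tahoe-LAFS type."""
    server_id: str
    total_bytes: int
    available_bytes: int


def aggregate(reports: Iterable[ServerSpace]) -> dict:
    """Sum total and available space over all servers that answered."""
    reports = list(reports)
    return {
        "servers": len(reports),
        "total_bytes": sum(r.total_bytes for r in reports),
        "available_bytes": sum(r.available_bytes for r in reports),
    }
```

A rate-of-space-usage figure like the disk-watcher's would additionally need timestamped samples, which this sketch leaves out.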

comment:2 Changed at 2009-03-08T22:08:49Z by warner

  • Component changed from code to code-frontend-web
  • Owner somebody deleted

comment:3 Changed at 2009-12-12T20:03:56Z by imhavoc

As a user/grid administrator, I would be happy enough with an aggregation of node-reported statistics. Even though it would not be immediately up to date, it would be able to report in "round gigabytes" (TB, PB?) the approximate status and available capacity of the grid. This "out of date" information would be "cheap" and much better than a) no information or b) "expensive" and immediate information. Having each node update hourly is going to be fine-grained enough for most applications.

comment:4 Changed at 2009-12-13T04:08:05Z by davidsarah

  • Keywords introducer usability statistics added

comment:5 Changed at 2009-12-19T20:39:53Z by kpreid

  • Cc kpreid added

I would like to see this too, per-server — I think it should show up automatically in the table of storage servers on every node's welcome page.

comment:6 Changed at 2009-12-26T15:19:06Z by zooko

Kevin: I agree it should show up automatically on the welcome page.

comment:7 Changed at 2010-02-23T00:04:42Z by USSJoin

  • Cc ussjoin@… added

comment:8 Changed at 2010-08-15T22:50:02Z by bj0

This would be nice for making sure you have enough storage space on your tahoe network. It would also be good to add it to the sshfs interface so that it shows up in the 'df' report.

comment:9 follow-up: Changed at 2010-09-01T18:49:57Z by davidsarah

  • Keywords sftp added
  • Summary changed from show server capacities on introducer welcome page to collect server capacities and put them on introducer welcome page, output of 'df' for SFTP, etc.

The code that determines what SFTP outputs for 'df' is at lines 1757 and 1879 of sftpd.py. It currently has to fake some values to keep sshfs happy.

comment:10 Changed at 2010-09-18T16:41:33Z by zooko

  • Summary changed from collect server capacities and put them on introducer welcome page, output of 'df' for SFTP, etc. to collect server capacities and put them on the welcome page, output of 'df' for SFTP, etc.

comment:11 in reply to: ↑ 1 Changed at 2010-10-14T17:15:01Z by zooko

Replying to warner:

Yeah! I've been thinking of two approaches:

  • add methods to the existing storage server remote API to query for total-space, space-available, etc (basically all the storage-related things you can get from the current stats gatherer). Have the introducer (or anyone else who's interested) query this interface and aggregate the results.
  • add a new service class (alongside the single "storage" class that we have now), with a separate remote API, that just provides space-available information. Publish this through the introducer. Have the introducer (or anyone else who's interested) query this interface and aggregate the results.

Don't storage servers already announce their space available to the introducer and doesn't the introducer already send that information to each client?

Let's see...

Yeah, there in remote_get_version():

                    { "maximum-immutable-share-size": remaining_space,

So the introducers and the clients could just display that information on their web pages.
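A rough sketch of what displaying that value could look like, assuming the version response is a dict whose storage parameters sit under a protocol key. The `STORAGE_V1` key and both helper names are assumptions for illustration; only "maximum-immutable-share-size" comes from the snippet above:

```python
from typing import Optional

# Assumed key under which the storage parameters are nested.
STORAGE_V1 = "http://allmydata.org/tahoe/protocols/storage/v1"


def advertised_space(version: dict) -> Optional[int]:
    """Return the advertised remaining space in bytes, or None if absent."""
    params = version.get(STORAGE_V1, {})
    return params.get("maximum-immutable-share-size")


def abbreviate(nbytes: Optional[int]) -> str:
    """Render a byte count in round decimal units for a status table."""
    if nbytes is None:
        return "unknown"
    units = ["B", "kB", "MB", "GB", "TB", "PB"]
    value = float(nbytes)
    for unit in units:
        # Stop at the first unit that keeps the number below 1000,
        # or at the largest unit we know about.
        if value < 1000 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000
```

A server that omits the field is shown as "unknown" rather than as 0, so that a missing value is not mistaken for a full server.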

In addition to that, we could get a lot more information if each storage server would by default automatically send its stats to a stats-gatherer, and each storage client (or else each introducer) would automatically run a stats-gatherer and give the stats-gatherer's furl to each storage server: stats.txt (And then the storage client or introducer would publish a web page with aggregated information in JSON, and then someone would write a nice JavaScript tool using protovis to visualize that information...)

Last edited at 2013-10-11T15:06:42Z by zooko

comment:12 in reply to: ↑ 9 ; follow-up: Changed at 2010-10-14T20:56:21Z by warner

Replying to davidsarah:

The code that determines what SFTP outputs for 'df' is at lines 1757 and 1879 of sftpd.py. It currently has to fake some values to keep sshfs happy.

Wait, what? What's the relationship between server-space available and the number that SFTP reports as available to any given client? Not trivial, I think.

If we do this, let's make it clear that we're providing only a very rough approximation of the client-side space. Adding together all of the raw server space and dividing by the expansion factor is pretty rough, especially with the servers-of-happiness change (e.g. one server has 14TB free, but you can't upload anything because everyone else is full: SFTP should announce 0).
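To make the roughness concrete, here is a sketch of that approximation with a servers-of-happiness guard. The function name and the default encoding parameters (3-of-10 with a happiness threshold of 7) are assumptions for illustration, not something this ticket specifies:

```python
from typing import Sequence


def estimate_client_space(free_bytes: Sequence[int],
                          k: int = 3, n: int = 10, happy: int = 7) -> int:
    """Very rough upper bound on the bytes a client could still upload.

    free_bytes: advertised free space of each connected server.
    k, n: erasure-coding parameters (any k of n shares recover the file),
          so the expansion factor is n / k.
    happy: minimum number of distinct servers that must accept shares.
    """
    usable = [b for b in free_bytes if b > 0]
    if len(usable) < happy:
        # Too few servers will take shares, so no upload can succeed,
        # however much space the remaining servers have.
        return 0
    # Raw space divided by the expansion factor n / k.
    return sum(usable) * k // n
```

With one server advertising 14 TB and nine full servers, this returns 0, matching the example above. It still overestimates when free space is spread unevenly, and it ignores any per-user accounting limit.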

Also let's make room for Accounting APIs to generate this data (since really it's a function of accounting: how much space an individual "user" is allowed to consume, which may be far less than the sum of all server capacities). At least let's be thinking in that direction when we name the functions.

comment:13 Changed at 2010-12-29T08:01:08Z by zooko

  • Keywords ostrom added

comment:14 in reply to: ↑ 12 Changed at 2010-12-29T21:31:39Z by davidsarah

  • Keywords sftp removed

Replying to warner:

Replying to davidsarah:

The code that determines what SFTP outputs for 'df' is at lines 1757 and 1879 of sftpd.py. It currently has to fake some values to keep sshfs happy.

Wait, what? What's the relationship between server-space available and the number that SFTP reports as available to any given client? Not trivial, I think.

Agreed that estimating the total available space is nontrivial. I've split it out into ticket #1285 (SFTP: put an approximation of grid capacity and available space in the 'df' output).

Last edited at 2010-12-29T21:32:12Z by davidsarah

comment:15 Changed at 2011-04-27T16:38:24Z by zooko

#1206 (node status page does not indicate per server if it is taking shares) was a duplicate of this. In that ticket, gdt wrote:

A very important indicator of the health of a server in a grid is whether it will take new shares. A client node has enough information (or could record it) to know this. It should show somehow if a node is not taking shares (either if it says it won't or if it actually doesn't). The lack of this feature makes it almost impossible to assess if files can be uploaded without trying it.

Whether a server is accepting shares is determined like this: if the server is configured to be in read-only mode then it sets its "available space" to 0: StorageServer.get_available_space(). If "reserved space" is set then it subtracts that much space from its available space: fileutil.get_disk_stats(). It includes the resulting "available space" in the metadata about itself that it sends back in response to get_version requests: StorageServer.remote_get_version().

The client invokes get_version on each server as soon as it connects to that server, but it doesn't do so ever again as long as it stays connected: storage_client.NativeStorageServer.

So, this ticket is basically a superset of #1206. The client is already learning (once, at connection establishment time) how much space the server is offering, which is equal to 0 if and only if the server is either in read-only mode or is full. If the client would display this information to the user in a nice comprehensible way then both #1206 and this ticket would be fixed.
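The rule described above can be written down as a small sketch. These are standalone illustrative functions with hypothetical names, not the actual StorageServer or fileutil code:

```python
def advertised_available_space(disk_free: int, reserved: int,
                               readonly: bool) -> int:
    """Space a server would advertise, following the rule described above."""
    if readonly:
        # A read-only server advertises no space at all.
        return 0
    # Reserved space is subtracted; clamp so the result is never negative.
    return max(0, disk_free - reserved)


def is_accepting_shares(advertised: int) -> bool:
    """A client can treat any server advertising 0 bytes as not taking shares."""
    return advertised > 0
```

Because the client only learns this value when the connection is established, a display built on it shows the server's state as of connection time.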

patch-needed! :-)

Last edited at 2013-10-11T15:28:15Z by zooko

comment:16 Changed at 2011-04-27T16:47:50Z by zooko

Hm, once we've fixed this ticket, we should follow up on ticket #816 (Add ping-all-servers button to welcome page). That ticket is to make a button titled "ping all servers". When you click that button it will issue get_version requests to all servers and update the display of how much space they are offering.

comment:17 Changed at 2011-04-27T16:49:37Z by zooko

  • Keywords transparency added

comment:18 Changed at 2011-05-31T17:37:33Z by zooko

  • Summary changed from collect server capacities and put them on the welcome page, output of 'df' for SFTP, etc. to collect server capacities and put them on the welcome page

Moving the part about df in the SFTP server over to its own ticket: #1285.

comment:19 Changed at 2011-10-11T02:33:42Z by davidsarah

addos asked about this on #tahoe-lafs (http://fred.submusic.ch/irc/tahoe-lafs/2011-10-09#i_296689 username irclogs, password irclogs):

in the status of the storage grid display in the web interface, why does it not show the storage of each node?

I guess I have to go to each node and visit /storage?

It would be nice if each node in the status of the storage grid table, had a link to that node's /storage

The suggestion of a link to the node's /storage page is a nice one; maybe one of the columns could be linked to that, so as not to take up any extra space.

comment:20 Changed at 2011-10-11T02:35:28Z by davidsarah

  • Description modified