#1663 assigned enhancement

Add a concise table of the URL tree to webapi.rst.

Reported by: nejucomo
Owned by: marlowe
Priority: normal
Milestone: undecided
Component: documentation
Version: 1.9.0
Keywords: webapi docs
Cc: thedod
Launchpad Bug:

Description

I would like to see a section of webapi.rst with a concise table of all handled URLs (or prefixes). For each URL, a complete list of the operations it can initiate would be extremely useful.

Exhaustive query parameters per operation might make the table too cluttered.

The use case: an operations engineer under time pressure, without specific knowledge of Tahoe-LAFS, needs to understand the URL structure in order to implement access control, caching, load balancing, or other opsy policies.

Change History (18)

comment:1 Changed at 2012-01-22T01:28:50Z by nejucomo

As a first attempt to do this automatically, I learned that Nevow Pages delegate requests to sub-URL paths by looking for child_<path segment> attributes or methods. A quick grep gives a basic partial picture of the URL namespace (see below).

The allmydata.web.root.Root class is a good starting point. There are at least these top-level handlers:

  • /operations
  • /storage
  • /uri
  • /cap
  • /file
  • /named
  • /status
  • /statistics

This is based on this simplistic grep:

$ find src/allmydata/web -type f -name '*.py' -print0 | xargs -0 grep -E '^class |child_' | grep -B 1 'child_'
src/allmydata/web/directory.py:class DirectoryNodeHandler(RenderMixin, rend.Page, ReplaceMeMixin):
src/allmydata/web/directory.py:        d = self.node.move_child_to(from_name, self.node, to_name, replace)
--
src/allmydata/web/introweb.py:class IntroducerRoot(rend.Page):
src/allmydata/web/introweb.py:    child_operations = None
--
src/allmydata/web/root.py:class Root(rend.Page):
src/allmydata/web/root.py:        self.child_operations = operations.OphandleTable(clock)
src/allmydata/web/root.py:        self.child_storage = storage.StorageStatus(s)
src/allmydata/web/root.py:        self.child_uri = URIHandler(client)
src/allmydata/web/root.py:        self.child_cap = URIHandler(client)
src/allmydata/web/root.py:        self.child_file = FileHandler(client)
src/allmydata/web/root.py:        self.child_named = FileHandler(client)
src/allmydata/web/root.py:        self.child_status = status.Status(client.get_history())
src/allmydata/web/root.py:        self.child_statistics = status.Statistics(client.stats_provider)
src/allmydata/web/root.py:    def child_helper_status(self, ctx):
src/allmydata/web/root.py:    child_provisioning = provisioning.ProvisioningTool()
src/allmydata/web/root.py:        child_reliability = reliability.ReliabilityTool()
src/allmydata/web/root.py:        child_reliability = NoReliability()
src/allmydata/web/root.py:    child_report_incident = IncidentReporter()
src/allmydata/web/root.py:    #child_server # let's reserve this for storage-server-over-HTTP
--
src/allmydata/web/status.py:class DownloadStatusPage(DownloadResultsRendererMixin, rend.Page):
src/allmydata/web/status.py:    def child_timeline(self, ctx):
src/allmydata/web/status.py:    def child_event_json(self, ctx):

comment:2 Changed at 2012-01-22T01:29:20Z by nejucomo

See also: #1662

comment:3 Changed at 2012-01-22T01:41:14Z by nejucomo

The URL handling path can also be modified by the putChild method. A grep shows two cases of static file handling:

  • In allmydata.web.root.Root.__init__ at the end it adds all files in the static resource directory to the root URL namespace. For my repository that gives:
    • /d3-2.4.6.min.js
    • /d3-2.4.6.time.min.js
    • /download_status_timeline.js
    • /icon.png
    • /jquery-1.6.1.min.js
    • /tahoe.css
  • In allmydata.webish.WebishServer.buildServer, if staticdir is truthy (its default resolves to ~/.tahoe/public_html), it gets added under this subpath:
    • /static/
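
The first case above amounts to one root-level URL per file in the static resource directory. A minimal sketch of that mapping, assuming a flat directory; static_url_paths is a hypothetical helper written for this comment, not the code in Root.__init__:

```python
import os


def static_url_paths(static_dir, prefix="/"):
    """Return the URL path each regular file in static_dir would be served at."""
    return sorted(
        prefix + name
        for name in os.listdir(static_dir)
        if os.path.isfile(os.path.join(static_dir, name))
    )
```

Calling it with prefix="/static/" on the staticdir would give the corresponding list for the second case, again assuming no subdirectories.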

comment:4 Changed at 2012-01-23T05:57:57Z by zooko

Good start! Thanks! Whoever wants to write a patch for the docs based on this should go ahead and mark it as review-needed.

comment:5 Changed at 2012-01-25T04:09:04Z by nejucomo

I feel only moderate confidence that the URL structure above is complete enough to make sound URL-path-based access control decisions.

For example, when I request a directory node, my browser makes a request for /webform_css but I haven't found the source of this request processing. A quick search suggests it's in a dependency called formless:

$ find tahoe-lafs/ -iname '*webform*'
$ find tahoe-lafs/ -type f -print0 | xargs -0 grep -i webform
tahoe-lafs/src/allmydata/windows/tahoesvc.py:            from formless import webform, processors, annotate, iformless
tahoe-lafs/src/allmydata/windows/tahoesvc.py:                context, flatmdom, flatstan, twist, webform, processors, annotate, iformless, Decimal,
tahoe-lafs/static/tahoe.py:from formless import webform, processors, annotate, iformless
tahoe-lafs/static/tahoe.py:    context, flatmdom, flatstan, twist, webform, processors, annotate, iformless, Decimal,
$ find ~/virtualenvs/default/ -iname '*webform*'
/home/n/virtualenvs/default/lib/python2.7/site-packages/formless/webform.py
/home/n/virtualenvs/default/lib/python2.7/site-packages/formless/webform.pyc
$ grep webform_css /home/n/virtualenvs/default/lib/python2.7/site-packages/formless/webform.py
$ grep -i webform_css /home/n/virtualenvs/default/lib/python2.7/site-packages/formless/webform.py
$ grep -i webform /home/n/virtualenvs/default/lib/python2.7/site-packages/formless/webform.py
            return webform.renderForms(

comment:6 Changed at 2012-01-25T04:55:21Z by nejucomo

The results of this ticket would inform tickets #1665, #860, and #587.

comment:7 Changed at 2012-03-12T19:26:09Z by davidsarah

  • Component changed from unknown to documentation
  • Keywords webapi docs added
  • Owner changed from nobody to somebody

comment:8 Changed at 2012-03-29T19:11:53Z by davidsarah

  • Priority changed from major to normal

comment:9 Changed at 2012-04-11T23:24:50Z by marlowe

  • Owner changed from somebody to marlowe

comment:10 Changed at 2012-05-06T23:29:43Z by marlowe

  • Status changed from new to assigned

comment:11 Changed at 2012-06-06T03:57:39Z by marlowe

22:56 < nejucomo> marlowe: The first pass could contain path-pattern, http-methods, interesting-query-parameters, and a short note about what operations are available there.

comment:12 Changed at 2012-06-06T03:58:03Z by marlowe

22:56 < nejucomo> marlowe: The first pass could contain path-pattern, http-methods, interesting-query-parameters, and a short note about what operations are available there.

comment:13 Changed at 2012-06-06T03:59:31Z by marlowe

22:57 < nejucomo> Actually, that might be even too specific for the first pass. Maybe just path-pattern and a short description of what purpose that path serves (especially which information it leaks and what state it can change).

22:58 < marlowe> change noted

22:58 < nejucomo> Something like: "/file/<CAP>" - "This url path is used to read and update files in the grid." "/status/<…>" - "This path gives status for current upload, download, verification, and repair operations."

comment:14 Changed at 2012-11-19T00:42:40Z by davidsarah

thedod wrote at #1866:

At the WAPI doc there should be a list of all prefix options, followed by a list of sections describing each.

/cap appears as an orphan line somewhere on the page, and it's not clear from the text whether we should use it and when.

/file appears inside the /named section, and there's also no mention of the variant that contains a /@@named=/ component (the WAPI [actually WUI] links to such urls). What does it mean (removing the component doesn't seem to matter)? Which of the 3 options (/named/ or /file/ with or without /@@named=/) is preferred when?

Another thing I can't find (maybe it's on some other doc, but then webapi.rst should link to it): list (+ explanation) of all cap types: DIR2-RO, DIR2-CHK, CHK, etc. (and in general - explanation of cap uri syntax). This (or a link to it) should appear before explaining urls that *contain* a cap :)

comment:15 Changed at 2012-11-19T00:45:38Z by davidsarah

zooko wrote at #1866:

Okay, to close this ticket, update webapi.rst to answer all these questions. Here are some answers you could use to that end...

/cap was a plan that I had to rename "uri" to "cap" everywhere. I thought it was more helpful to users to call those things caps instead of uris.

Part of why Brian had agreed to go along with this was that Kevin Reid emphasized to us that we're not supposed to call a thing a "uri" unless it has some sort of official recognition from some namespace allocator like IANA or something.

We wound up changing most but not all of the things that were easy to change -- the docs and some of the source code -- but not changing how it is spelled in the WAPI.

I guess we should consider resuming that process of renaming, if only because a half-renamed thing is almost as bad as a consistently badly-named thing. :-/

+1 (a half-renamed thing is worse, IMHO)

Anyway, /cap *ought* to be a synonym of /uri, but I'm not sure what happens if you actually use it.

The /@@named=/ feature is kind of complicated. The goal is: tell the web server (tahoe-lafs gateway) that the resource you want to download is a certain cap, e.g. "URI:CHK:egrocatgmbuoqra3e3jptkzvwe:543sre2wsjmqwbk73in76oqaemi35iqeyzggavc4vp6kkvc43nkq:1:1:948821", but at the same time tell the web *browser* that the resource you are fetching is named something like "Murphy-2012-Deaths__Preliminary_Data_For_2010.pdf". The way we do this is by appending a string after the cap which will be ignored by the server (LAFS gateway), but which will make the browser think that the file has that name. So, for example /uri/URI:CHK:egrocatgmbuoqra3e3jptkzvwe:543sre2wsjmqwbk73in76oqaemi35iqeyzggavc4vp6kkvc43nkq:1:1:948821/@@named=/Murphy-2012-Deaths__Preliminary_Data_For_2010.pdf.

Now, the further complication is that if the cap is a dir cap as opposed to a file cap, then /uri/URI:DIR2-MDMF-RO:ppnrefnrnovjpoiv3jirjnpoim:obhqprvm6hafvarzzssrawgazx6p6tgopi4fslirhelg7xqyfr6a/@@named=/foo could be interpreted by the web server (LAFS gateway) as meaning "Get the child out of the dir whose name is @@named= and then treat that child as a directory and look in that for a child of it named foo". In order to avoid that misinterpretation, we added the /file/ prefix, used instead of /uri/, to specify that this is not a dir.
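
The @@named= convention described above can be sketched as a small URL builder. This is an illustration under assumptions, not gateway code: named_download_path is a hypothetical helper, and leaving the colons in the cap unescaped is a choice made here only to match the example URLs in this comment.

```python
from urllib.parse import quote


def named_download_path(cap, filename, prefix="file"):
    """Build a gateway path that downloads `cap` but shows `filename` to the browser.

    The trailing /@@named=/<filename> part is ignored by the gateway; it only
    exists so the browser sees a sensible file name. The /file/ prefix tells
    the gateway not to treat the cap as a directory.
    """
    # Assumption: keep ':' readable in the cap; escape everything else unsafe.
    return "/%s/%s/@@named=/%s" % (prefix, quote(cap, safe=":"), quote(filename))
```

With prefix="uri" the same helper produces the /uri/ form, which per the explanation above is ambiguous for dir caps.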

Here was a thread about this on tahoe-dev long ago:

https://tahoe-lafs.org/pipermail/tahoe-dev/2008-May/000573.html

Frankly, the resulting API is kind of weird and I wonder if we couldn't come up with a simpler and better one!

Now as to the list of cap types and cap syntax, there are at least the following two docs, and they should be cross-linked, and linked to from webapi.rst, and probably unified:

Last edited at 2012-11-19T00:46:01Z by davidsarah

comment:16 Changed at 2012-11-19T00:48:02Z by davidsarah

  • Cc thedod added

comment:17 follow-up: Changed at 2012-11-19T21:43:46Z by zooko

Created #1868 (rename "uri" to "cap" everywhere).

comment:18 in reply to: ↑ 17 Changed at 2012-11-20T02:59:31Z by davidsarah

Replying to zooko:

Created #1868 (rename "uri" to "cap" everywhere).

Duplicate of #1715 (change all docs and generated URLs to point to "/cap" instead of "/uri"). Note my disagreement in ticket:1715#comment:6.
