Ticket #118: webapi-new.txt

File webapi-new.txt, 17.4 KB (added by zooko, at 2007-08-20T19:51:18Z)

This document has six sections:

1.  the basic API for how to programmatically control your tahoe node
2.  convenience methods
3.  safety and security issues
4.  features for controlling your tahoe node from a standard web browser
5.  debugging and testing features
6.  XML-RPC (coming soon)


1. the basic API for how to programmatically control your tahoe node

a. connecting to the tahoe node

Writing "8011" into $NODEDIR/webport causes the node to run a webserver on
port 8011. Writing "tcp:8011:interface=127.0.0.1" into $NODEDIR/webport does
the same but binds to the loopback interface, ensuring that only the programs
on the local host can connect. Using
"ssl:8011:privateKey=mykey.pem:certKey=cert.pem" would run an SSL server. See
twisted.application.strports for more details.

If $NODEDIR/webpassword exists, it will be used (somehow) to require HTTP
Digest Authentication for all webserver connections.  XXX specify how

b. file names

The node provides some small number of "virtual drives". In the 0.5
release, this number is two: the first is the global shared vdrive, the
second is the private non-shared vdrive. We will call these "global" and
"private".

For the purpose of this document, let us assume that the vdrives currently
contain the following directories and files:

global/
global/Documents/
global/Documents/notes.txt

private/
private/Pictures/
private/Pictures/tractors.jpg
private/Pictures/family/
private/Pictures/family/bobby.jpg

Within the webserver, there is a tree of resources. The top-level "vdrive"
resource gives access to files and directories in all of the user's virtual
drives. For example, the URL that corresponds to notes.txt would be:

http://localhost:8011/vdrive/global/Documents/notes.txt

and the URL for tractors.jpg would be:

http://localhost:8011/vdrive/private/Pictures/tractors.jpg

In addition, each directory has a corresponding URL. The Pictures URL is:

http://localhost:8011/vdrive/private/Pictures

c. URIs

A separate top-level namespace ("uri/" instead of "vdrive/") is used to
access files and directories directly by URI, rather than by going through
the vdrive.

For example, this identifies a file or directory:

http://localhost:8011/uri/$URI

And this identifies a file or directory named "tractors.jpg" in a
subdirectory "Pictures" of the identified directory:

http://localhost:8011/uri/$URI/Pictures/tractors.jpg

In the following examples, "$URL" is a shorthand for a URL like the ones
above, either with "vdrive/" as the top level and a sequence of
slash-separated pathnames following, or with "uri/" as the top level,
followed by a URI, optionally followed by a sequence of slash-separated
pathnames.
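
The two URL shapes described above can be assembled mechanically. The
following Python sketch is illustrative only: the helper names vdrive_url and
uri_url are hypothetical, the base address is the one used in this document's
examples, and percent-encoding each path component is an assumption rather
than something this document specifies.

```python
from urllib.parse import quote

BASE = "http://localhost:8011"  # node address used in the examples above


def _join(parts):
    # Percent-encode every component (assumption) and join with slashes.
    return BASE + "/" + "/".join(quote(p, safe="") for p in parts)


def vdrive_url(drive, *path):
    """Build a "vdrive/" URL from a vdrive name and path components."""
    return _join(["vdrive", drive, *path])


def uri_url(uri, *path):
    """Build a "uri/" URL from a URI and optional child path components."""
    return _join(["uri", uri, *path])


print(vdrive_url("private", "Pictures", "tractors.jpg"))
print(uri_url("EXAMPLE-URI", "Pictures", "tractors.jpg"))
```

"EXAMPLE-URI" is a placeholder, not a real tahoe URI.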

Now, what can we do with these URLs? By varying the HTTP method
(GET/PUT/POST/DELETE) and by appending a type-indicating query argument, we
control what we want to do with the data and how it should be presented.

d. examining files or directories

  GET $URL?t=json

  This returns machine-parseable information about the indicated file or
  directory in the HTTP response body. This information contains a flag that
  indicates whether the thing is a file or a directory.

  If it is a file, then the information includes file size and URI, like
  this:

   [ 'filenode', { 'ro_uri': file_uri,
                   'size': bytes } ]

  If it is a directory, then it includes information about the children of
  this directory, as a mapping from child name to a set of metadata about the
  child (the same data that would appear in a corresponding GET $URL?t=json
  of the child itself). Like this:

   [ 'dirnode', { 'rw_uri': read_write_uri,
                  'ro_uri': read_only_uri,
                  'children': children } ]

  In the above example, 'children' is a dictionary in which the keys are
  child names and the values depend upon whether the child is a file or a
  directory:

   'foo.txt': [ 'filenode', { 'ro_uri': uri, 'size': bytes } ]
   'subdir':  [ 'dirnode', { 'rw_uri': rwuri, 'ro_uri': rouri } ]

  Note that the value is the same as the JSON representation of the child
  object (except that directories do not recurse -- the "children" entry of
  the child is omitted).

  The rw_uri field is present in the information about a directory if and
  only if you have read-write access to that directory.
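
  As a rough illustration of consuming this response, the Python sketch below
  parses a t=json body and reports what it describes. The function name
  summarize and the sample values are hypothetical. The sample is written as
  strict JSON (double-quoted strings), on the assumption that the node emits
  standard JSON and that the single quotes above are only notation.

```python
import json


def summarize(body):
    """Return a short description of a t=json response body."""
    nodetype, info = json.loads(body)
    if nodetype == "filenode":
        return "file, %d bytes" % info["size"]
    if nodetype == "dirnode":
        # rw_uri is present only when we hold read-write access.
        access = "read-write" if "rw_uri" in info else "read-only"
        names = sorted(info.get("children", {}))
        return "%s directory containing %s" % (
            access, ", ".join(names) or "nothing")
    raise ValueError("unexpected node type: %r" % (nodetype,))


# Hypothetical response for a directory we can only read.
sample = json.dumps(
    ["dirnode", {"ro_uri": "EXAMPLE-RO-URI",
                 "children": {"foo.txt": ["filenode",
                                          {"ro_uri": "EXAMPLE-FILE-URI",
                                           "size": 1234}]}}])
print(summarize(sample))  # read-only directory containing foo.txt
```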

e. downloading a file

  GET $URL

  If the indicated object is a file, then this simply retrieves the contents
  of the file. The file's contents are provided in the body of the HTTP
  response.

  If the indicated object is a directory, then this returns an HTML page,
  intended to be used by humans, which contains HREF links to all files and
  directories reachable from this directory. These HREF links do not have a
  t= argument, meaning that a human who follows them will get pages also
  meant for a human. It also contains forms to upload new files, and to
  delete files and directories. These forms use POST methods to do their job.

  You can add the "save=true" argument, which adds a 'Content-Disposition:
  attachment' header to prompt most web browsers to save the file to disk
  rather than attempting to display it.

  A filename (from which a MIME type can be derived) can be specified using a
  'filename=' query argument. This is especially useful if the $URL does not
  end with the name of the file (because it instead ends with the identifier
  of the file). This filename is also the one used if the 'save=true'
  argument is set. For example:

   GET http://localhost:8011/uri/$TRACTORS_URI?filename=tractors.jpg
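
  The optional query arguments can be appended programmatically. This is a
  small Python sketch, not part of the API itself: download_url is a
  hypothetical helper and "EXAMPLE-URI" is a placeholder.

```python
from urllib.parse import urlencode


def download_url(url, filename=None, save=False):
    """Append the optional filename= and save=true arguments to a $URL."""
    query = {}
    if filename is not None:
        query["filename"] = filename
    if save:
        query["save"] = "true"
    return url + ("?" + urlencode(query) if query else "")


print(download_url("http://localhost:8011/uri/EXAMPLE-URI",
                   filename="tractors.jpg", save=True))
```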

f. uploading a file

  PUT http://localhost:8011/uri

  Upload a file, returning its URI as the HTTP response body. This does not
  make the file visible from the virtual drive -- to do that, see section
  1.h. below, or the convenience method in section 2.a.

g. creating a new directory

  PUT http://localhost:8011/uri?t=mkdir

  Create a new empty directory and return its URI as the HTTP response body.
  This does not make the newly created directory visible from the virtual
  drive, but you can use section 1.h. to attach it, or the convenience method
  in section 2.XXX.

h. attaching a file or directory as the child of an extant directory

  PUT $URL?t=uri

  This attaches a child (either a file or a directory) to the given
  directory. $URL is required to indicate a directory as the second-to-last
  element and the desired filename as the last element, for example:

   PUT http://localhost:8011/uri/$URI_OF_SOME_DIR/Pictures/tractors.jpg
   PUT http://localhost:8011/uri/$URI_OF_SOME_DIR/tractors.jpg
   PUT http://localhost:8011/vdrive/private/Pictures/tractors.jpg

  The URI of the child is provided in the body of the HTTP request.

  There is an optional "?overwrite=" param whose value can be "true", "t",
  "1", "false", "f", or "0" (case-insensitive), and which defaults to "true".
  If the indicated directory already contains the given child name, then if
  overwrite is true then the value of that name is changed to be the new URI.
  If overwrite is false then an error is returned. XXX specify the error

  This can be used to attach a shared directory (a directory that other
  people can read or write) to the vdrive. Intermediate directories, if any,
  are created on-demand.
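
  A minimal Python sketch of constructing such a request with the standard
  library follows. The helper name attach_request is hypothetical, and
  combining t=uri and overwrite= in a single query string is an assumption
  about how the two arguments are meant to be used together. The request is
  only built here; passing it to urllib.request.urlopen() would send it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def attach_request(url, child_uri, overwrite=True):
    """Build (but do not send) a PUT $URL?t=uri request.

    url names the parent directory plus the desired child name;
    child_uri travels in the request body.
    """
    query = urlencode({"t": "uri",
                       "overwrite": "true" if overwrite else "false"})
    return Request(url + "?" + query,
                   data=child_uri.encode("ascii"),
                   method="PUT")


req = attach_request(
    "http://localhost:8011/vdrive/private/Pictures/tractors.jpg",
    "EXAMPLE-URI",  # placeholder, not a real tahoe URI
    overwrite=False)
print(req.get_method(), req.full_url)
```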

i. removing a name from a directory

  DELETE $URL

  This removes the given name from the given directory. $URL is required to
  indicate a directory as the second-to-last element and the name to remove
  from that directory as the last element, just as in section 1.h.

  Note that this does not actually delete the resource that the name points
  to from the tahoe grid -- it only removes this name in this directory. If
  there are other names in this directory or in other directories that point
  to the resource, then it will remain accessible through those paths. Even
  if all names pointing to this resource are removed from their parent
  directories, anyone in possession of the URI of this resource can continue
  to access the resource through the URI. Only if a person is not in
  possession of the URI, and they do not have access to any directories which
  contain names pointing to this resource, are they prevented from accessing
  the resource.

2. convenience methods

a. uploading a file and attaching it to the vdrive

  PUT $URL

  Upload a file and link it into the vdrive at the location specified by
  $URL. The last item in the $URL must be a filename, and the second-to-last
  item must identify a directory.

  It will create intermediate directories as necessary. The file's contents
  are taken from the body of the HTTP request. For convenience, the HTTP
  response contains the URI that results from uploading the file, although
  the client is not obligated to do anything with the URI. According to the
  HTTP/1.1 specification (RFC 2616), this should return a 200 (OK) code when
  modifying an existing file, and a 201 (Created) code when creating a new
  file.

  To use this, run 'curl -T localfile http://localhost:8011/vdrive/global/newfile'
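
  For clients that are not using curl, the same upload can be sketched in
  Python. Both helper names below are hypothetical, and the status handling
  simply mirrors the 200/201 behaviour described above. The request is built
  but not sent; urllib.request.urlopen() would send it and return a response
  whose body is the new URI.

```python
from urllib.request import Request


def upload_request(url, contents):
    """Build (but do not send) a PUT that uploads contents to url."""
    return Request(url, data=contents, method="PUT")


def describe_upload(status):
    """Interpret the HTTP status code of an upload response."""
    if status == 201:
        return "created a new file"
    if status == 200:
        return "modified an existing file"
    return "unexpected status %d" % status


req = upload_request("http://localhost:8011/vdrive/global/newfile",
                     b"file contents go here")
print(req.get_method(), req.full_url)
```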

3. safety and security issues -- names vs. URIs

The vdrive provides a mutable filesystem, but the ways that the filesystem
can change are limited. The only thing that can change is that the mapping
from child names to child objects that each directory contains can be changed
by adding a new child name pointing to an object, removing an existing child
name, or changing an existing child name to point to a different object.

Obviously if you query tahoe for information about the filesystem and then
act upon the filesystem (such as by getting a listing of the contents of a
directory and then adding a file to the directory), then the filesystem might
have been changed after you queried it and before you acted upon it.
However, if you use the URI instead of the pathname of an object when you act
upon the object, then the only change that can happen is that, if the object
is a directory, the set of child names it has might be different. If, on the
other hand, you act upon the object using its pathname, then a different
object might be in that place, which can result in more kinds of surprises.

For example, suppose you are writing code which recursively downloads the
contents of a directory. The first thing your code does is fetch the listing
of the contents of the directory. For each child that it fetched, if that
child is a file then it downloads the file, and if that child is a directory
then it recurses into that directory. Now, if the download and the recurse
actions are performed using the child's name, then the results might be
wrong, because for example a child name that pointed to a sub-directory when
you listed the directory might have been changed to point to a file (in which
case your attempt to recurse into it would result in an error and the file
would be skipped), or a child name that pointed to a file when you listed the
directory might now point to a sub-directory (in which case your attempt to
download the child would result in a file containing HTML text describing the
sub-directory!).

If your recursive algorithm uses the URI of the child instead of the name of
the child, then those kinds of mistakes just can't happen. Note that both the
child's name and the child's URI are included in the results of listing the
parent directory, so it isn't harder to use the URI for this purpose.

In general, use names if you want "whatever object (whether file or
directory) is found by following this name (or sequence of names) when my
request reaches the server". Use URIs if you want "this particular object".
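
The URI-based recursion recommended above might look like the following
Python sketch. It is an illustration under stated assumptions, not a
reference implementation: walk and get_json are hypothetical names, get_json
stands in for "GET /uri/$URI?t=json plus JSON decoding", and the in-memory
grid exists only so the sketch runs without a node. It does not guard against
directory cycles.

```python
def walk(get_json, dir_uri, prefix=""):
    """Yield (path, file_uri) for every file below the directory dir_uri.

    Children are followed by the URI recorded in the parent's listing,
    never by name, so a concurrent rename cannot swap a file for a
    directory (or vice versa) underneath us.
    """
    nodetype, info = get_json(dir_uri)
    if nodetype != "dirnode":
        raise ValueError("not a directory: %r" % (dir_uri,))
    for name, (childtype, childinfo) in sorted(info["children"].items()):
        path = prefix + name
        if childtype == "filenode":
            yield path, childinfo["ro_uri"]
        else:
            yield from walk(get_json, childinfo["ro_uri"], path + "/")


# Stand-in for a real node: maps placeholder URIs to t=json results.
grid = {
    "DIR-1": ["dirnode", {"ro_uri": "DIR-1", "children": {
        "tractors.jpg": ["filenode", {"ro_uri": "FILE-1", "size": 10}],
        "family": ["dirnode", {"ro_uri": "DIR-2"}]}}],
    "DIR-2": ["dirnode", {"ro_uri": "DIR-2", "children": {
        "bobby.jpg": ["filenode", {"ro_uri": "FILE-2", "size": 20}]}}],
}

for path, uri in walk(grid.__getitem__, "DIR-1"):
    print(path, uri)
```

A real client would then download each file through /uri/$URI, using the URI
yielded here rather than the path.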

4. features for controlling your tahoe node from a standard web browser

a. uri redirect

  GET http://localhost:8011/uri?uri=$URI

  This causes a redirect to /uri/$URI, and retains any additional query
  arguments (like filename= or save=). This is for the convenience of web
  forms which allow the user to paste in a URI (obtained through some
  out-of-band channel, like IM or email).

  Note that this form merely redirects to the specific file or directory
  indicated by the URI: unlike the GET /uri/$URI form, you cannot traverse to
  children by appending additional path segments to the URL.

b. web page offering rename

  GET $URL?t=rename-form&name=$CHILDNAME

  This provides a useful facility to browser-based user interfaces. It
  returns a page containing a form targeting the "POST $URL t=rename"
  functionality described below, with the provided $CHILDNAME present in the
  'from_name' field of that form. I.e. this presents a form offering to
  rename $CHILDNAME, requesting the new name, and submitting POST rename.

c. POST forms

  POST $URL
  t=upload
  name=childname  (optional)
  file=newfile

  This instructs the node to upload a file into the given directory. We need
  this because forms are the only way for a web browser to upload a file
  (browsers do not know how to do PUT or DELETE). The file's contents and the
  new child name will be included in the form's arguments. This can only be
  used to upload a single file at a time. To avoid confusion, name= is not
  allowed to contain a slash (a 400 Bad Request error will result).

  POST $URL
  t=mkdir
  name=childname

  This instructs the node to create a new empty directory. The name of the
  new child directory will be included in the form's arguments.

  POST $URL
  t=uri
  name=childname
  uri=newuri

  This instructs the node to attach a child that is referenced by URI (just
  like the PUT $URL?t=uri method). The name and URI of the new child
  will be included in the form's arguments.

  POST $URL
  t=delete
  name=childname

  This instructs the node to delete a file from the given directory. The name
  of the child to be deleted will be included in the form's arguments.

  POST $URL
  t=rename
  from_name=oldchildname
  to_name=newchildname

  This instructs the node to rename a child within the given directory. The
  child specified by 'from_name' is removed, and reattached as a child named
  for 'to_name'. This is unconditional and will replace any child already
  present under 'to_name', akin to 'mv -f' in unix parlance.
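
  Although these forms exist for browsers, a script can submit the same
  fields. The Python sketch below builds the t=rename request. The helper
  name rename_request is hypothetical, and sending the fields as
  application/x-www-form-urlencoded is an assumption: this document does not
  say which form encodings the node accepts, and t=upload in particular
  carries file contents and would need a file-capable encoding instead.

```python
from urllib.parse import urlencode
from urllib.request import Request


def rename_request(dir_url, from_name, to_name):
    """Build (but do not send) a POST $URL t=rename request."""
    body = urlencode({"t": "rename",
                      "from_name": from_name,
                      "to_name": to_name})
    return Request(
        dir_url,
        data=body.encode("ascii"),
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"})


req = rename_request("http://localhost:8011/vdrive/private/Pictures",
                     "tractors.jpg", "old tractors.jpg")
print(req.get_method(), req.data)
```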

5. debugging and testing features

GET $URL?t=download&localfile=$LOCALPATH
GET $URL?t=download&localdir=$LOCALPATH

  The localfile= form instructs the node to download the given file and write
  it into the local filesystem at $LOCALPATH. The localdir= form instructs
  the node to recursively download everything from the given directory and
  below into the local filesystem. To avoid surprises, the localfile= form
  will signal an error if $URL actually refers to a directory; likewise, the
  localdir= form will signal an error if $URL refers to a file.

  This request will only be accepted from an HTTP client connection
  originating at 127.0.0.1 . This request is most useful when the client node
  and the HTTP client are operated by the same user. $LOCALPATH should be an
  absolute pathname.

  This form is only implemented for testing purposes, because of a trivially
  easy attack: any web server that the local browser visits could serve an
  IMG tag that causes the local node to modify the local filesystem.
  Therefore this form is only enabled if you create a file named
  'webport_allow_localfile' in the node's base directory.

PUT $NEWURL?t=upload&localfile=$LOCALPATH
PUT $NEWURL?t=upload&localdir=$LOCALPATH

  This uploads a file or directory from the node's local filesystem to the
  vdrive. As with "GET $URL?t=download&localfile=$LOCALPATH", this request
  will only be accepted from an HTTP connection originating from 127.0.0.1 .

  The localfile= form expects that $LOCALPATH will point to a file on the
  node's local filesystem, and causes the node to upload that one file into
  the vdrive at the given location. Any parent directories will be created in
  the vdrive as necessary.

  The localdir= form expects that $LOCALPATH will point to a directory on the
  node's local filesystem, and it causes the node to perform a recursive
  upload of the directory into the vdrive at the given location, creating
  parent directories as necessary. When the operation is complete, the
  directory referenced by $NEWURL will contain all of the files and
  directories that were present in $LOCALPATH, so this is equivalent to the
  unix commands:

   mkdir -p $NEWURL; cp -r $LOCALPATH/* $NEWURL/

  Note that the "curl" utility can be used to provoke this sort of recursive
  upload, since the -T option will make it use an HTTP 'PUT':

   curl -T /dev/null 'http://localhost:8011/vdrive/global/newdir?t=upload&localdir=/home/user/directory-to-upload'

  This form is only implemented for testing purposes, because any attacker's
  web server that a local browser visits could serve an IMG tag that causes
  the local node to modify the local filesystem. Therefore this form is only
  enabled if you create a file named 'webport_allow_localfile' in the node's
  base directory.

GET $URL?t=manifest

  Return an HTML-formatted manifest of the given directory, for debugging.

6. XMLRPC (coming soon)

  http://localhost:8011/xmlrpc

  This resource provides an XMLRPC server on which all of the previous
  operations can be expressed as function calls taking a "pathname" argument.
  This is provided for applications that want to think of everything in terms
  of XMLRPC.

   listdir(vdrivename, path) -> dict of (childname -> (stuff))
   put(vdrivename, path, contents) -> URI
   get(vdrivename, path) -> contents
   mkdir(vdrivename, path) -> URI
   put_localfile(vdrivename, path, localfilename) -> URI
   get_localfile(vdrivename, path, localfilename)
   put_localdir(vdrivename, path, localdirname)   # recursive
   get_localdir(vdrivename, path, localdirname)   # recursive
   put_uri(vdrivename, path, URI)

   etc..