#736 closed defect (fixed)

UnrecoverableFileError on directory which has 6 shares (3 needed)

Reported by: zooko
Owned by: zooko
Priority: major
Milestone: 1.5.0
Component: code
Version: 1.4.1
Keywords: foolscap
Cc:
Launchpad Bug:

Description

When I view this private directory of mine, I get:

Error reading directory:

UnrecoverableFileError: the directory (or mutable file) could not be retrieved, because there were insufficient good shares. This might indicate that no servers were connected, insufficient servers were connected, the URI was corrupt, or that shares have been lost due to server departure, hard drive failure, or disk corruption. You should perform a filecheck on this object to learn more.
No upload forms: directory is unreadable

But when I run check (with or without verify) on it, I get:

{
 "results": {
  "needs-rebalancing": true, 
  "count-shares-expected": 10, 
  "healthy": false, 
  "count-unrecoverable-versions": 0, 
  "count-shares-needed": 3, 
  "sharemap": {
   "seq29-5gk3-sh9": [
    "ivjakubrruewknqg7wgb5hbinasqupj6"
   ], 
   "seq29-5gk3-sh8": [
    "xiktf6ok5f5ao5znxxttriv233hmvi4v"
   ], 
   "seq29-5gk3-sh7": [
    "wfninubkrvhlyscum7rlschbhx5iarg3"
   ], 
   "seq29-5gk3-sh4": [
    "xiktf6ok5f5ao5znxxttriv233hmvi4v"
   ], 
   "seq29-5gk3-sh3": [
    "trjdor3okozw4eld3l6zl4ap4z6h5tk6"
   ], 
   "seq29-5gk3-sh1": [
    "jfdpabh34vsrhll3lbdn3v23vem4hr2z"
   ]
  }, 
  "count-recoverable-versions": 1, 
  "servers-responding": [
   "xiktf6ok5f5ao5znxxttriv233hmvi4v", 
   "wfninubkrvhlyscum7rlschbhx5iarg3", 
   "trjdor3okozw4eld3l6zl4ap4z6h5tk6", 
   "jfdpabh34vsrhll3lbdn3v23vem4hr2z", 
   "ivjakubrruewknqg7wgb5hbinasqupj6"
  ], 
  "count-good-share-hosts": 5, 
  "count-wrong-shares": 0, 
  "count-shares-good": 6, 
  "count-corrupt-shares": 0, 
  "list-corrupt-shares": [], 
  "recoverable": true
 }, 
 "storage-index": "iy5ilio6pjeehciu6idw4jzp6e", 
 "summary": "Unhealthy: 6 shares (enc 3-of-10)"
}
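
To make the mismatch concrete, here is a minimal sketch that rechecks the sharemap above against count-shares-needed. The helper names (shares_per_version, recoverable_versions) are hypothetical and not part of Tahoe; the only assumption about the data is the "seqN-ROOT-shM" key format visible in the output above.

```python
from collections import defaultdict

# Sharemap copied from the check results above.
SHAREMAP = {
    "seq29-5gk3-sh9": ["ivjakubrruewknqg7wgb5hbinasqupj6"],
    "seq29-5gk3-sh8": ["xiktf6ok5f5ao5znxxttriv233hmvi4v"],
    "seq29-5gk3-sh7": ["wfninubkrvhlyscum7rlschbhx5iarg3"],
    "seq29-5gk3-sh4": ["xiktf6ok5f5ao5znxxttriv233hmvi4v"],
    "seq29-5gk3-sh3": ["trjdor3okozw4eld3l6zl4ap4z6h5tk6"],
    "seq29-5gk3-sh1": ["jfdpabh34vsrhll3lbdn3v23vem4hr2z"],
}


def shares_per_version(sharemap):
    """Group distinct share numbers by version prefix (hypothetical helper)."""
    versions = defaultdict(set)
    for key in sharemap:
        # "seq29-5gk3-sh9" -> version "seq29-5gk3", share number 9
        version, _, sharenum = key.rpartition("-sh")
        versions[version].add(int(sharenum))
    return dict(versions)


def recoverable_versions(sharemap, needed):
    """Return the versions that have at least `needed` distinct shares."""
    return [
        version
        for version, shares in shares_per_version(sharemap).items()
        if len(shares) >= needed
    ]


def distinct_hosts(sharemap):
    """Return the set of servers holding at least one share."""
    return {server for servers in sharemap.values() for server in servers}
```

With the data above, recoverable_versions(SHAREMAP, 3) lists the single version seq29-5gk3, and distinct_hosts(SHAREMAP) has 5 entries, matching count-good-share-hosts. So the shares the checker found are sufficient on paper, which suggests the read failure comes from somewhere other than the share count.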

This may be related to #732 (Not Enough Shares when repairing a file which has 7 shares on 2 servers).

Hm, my twistd.log ends with:

2009-06-14 14:06:52-0600 [Negotiation,client] Unhandled error in Deferred:
2009-06-14 14:06:52-0600 [Negotiation,client] Unhandled Error
        Traceback (most recent call last):
          File "/Users/wonwinmcbrootles/playground/allmydata/tahoe/trunk/trunk/support/lib/python2.5/site-packages/foolscap-0.4.1-py2.5.egg/foolscap/call.py", line 669, in _done
            self.request.complete(res)
          File "/Users/wonwinmcbrootles/playground/allmydata/tahoe/trunk/trunk/support/lib/python2.5/site-packages/foolscap-0.4.1-py2.5.egg/foolscap/call.py", line 55, in complete
            self.deferred.callback(res)
          File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/Twisted-8.2.0-py2.5-macosx-10.3-i386.egg/twisted/internet/defer.py", line 243, in callback
            self._startRunCallbacks(result)
          File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/Twisted-8.2.0-py2.5-macosx-10.3-i386.egg/twisted/internet/defer.py", line 312, in _startRunCallbacks
            self._runCallbacks()
        --- <exception caught here> ---
          File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/Twisted-8.2.0-py2.5-macosx-10.3-i386.egg/twisted/internet/defer.py", line 328, in _runCallbacks
            self.result = callback(self.result, *args, **kw)
          File "/Users/wonwinmcbrootles/playground/allmydata/tahoe/trunk/trunk/src/allmydata/introducer/client.py", line 92, in _got_versioned_service
            self._ic.add_connection(self._nodeid, self.service_name, rref)
          File "/Users/wonwinmcbrootles/playground/allmydata/tahoe/trunk/trunk/src/allmydata/introducer/client.py", line 270, in add_connection
            rsc.reset()
          File "/Users/wonwinmcbrootles/playground/allmydata/tahoe/trunk/trunk/src/allmydata/introducer/client.py", line 105, in reset
            self._reconnector.reset()
        exceptions.AttributeError: RemoteServiceConnector instance has no attribute '_reconnector'

That looks like it is related to http://foolscap.lothar.com/trac/ticket/129 (zero-length location hints cause getReference() to fail synchronously).
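
One plausible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: if the call that would create the reconnector raises synchronously, the attribute is never assigned, and a later reset() fails with AttributeError. The sketch below uses a hypothetical Connector class and a hypothetical failing_connect function; it is not the real RemoteServiceConnector or any foolscap API.

```python
class Connector:
    """Hypothetical stand-in for a connector object (not Tahoe's class)."""

    def __init__(self, connect):
        self._connect = connect

    def start(self):
        # If self._connect() raises here, the assignment below never
        # completes and self._reconnector does not exist afterwards.
        self._reconnector = self._connect()

    def reset(self):
        # Fails with AttributeError when start() did not finish.
        self._reconnector.reset()


def failing_connect():
    """Hypothetical connect call that fails synchronously."""
    raise ValueError("no usable location hints")


def demo():
    """Return the AttributeError message produced by the failure mode."""
    connector = Connector(failing_connect)
    try:
        connector.start()
    except ValueError:
        pass  # the synchronous failure is swallowed or logged elsewhere
    try:
        connector.reset()
    except AttributeError as exc:
        return str(exc)
    return None
```

Calling demo() returns an AttributeError message that names _reconnector, the same missing attribute reported in the log above.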

Change History (4)

comment:1 Changed at 2009-06-24T04:12:46Z by warner

That foolscap issue was closed in foolscap-0.4.2. Could you try viewing that directory again with an upgraded foolscap?

comment:2 Changed at 2009-06-29T21:31:12Z by warner

  • Owner changed from somebody to zooko

comment:3 Changed at 2009-07-02T20:48:49Z by zooko

  • Status changed from new to assigned

I'll test this this weekend.

comment:4 Changed at 2009-07-03T17:24:58Z by zooko

  • Resolution set to fixed
  • Status changed from assigned to closed

Confirmed that it works now. Thanks, Brian!
