#1655 closed defect (duplicate)

Reproducible UncoordinatedWriteError on repair

Reported by: ianchov
Owned by: somebody
Priority: critical
Milestone: 1.9.2
Component: code
Version: 1.9.0
Keywords: ucwe repair regression
Cc:
Launchpad Bug:

Description (last modified by zooko)

Hi

Tahoe 1.9.1 (same with 1.9.0)

[ianchov@localhost]$ ./bin/tahoe deep-check --repair --add-lease -v XYZ:XYZ
'<root>': not healthy
 repair successful
ERROR: UncoordinatedWriteError()
"[Failure instance: Traceback (failure with no frames): <class 'allmydata.mutable.common.UncoordinatedWriteError'>: "

Attachments (2)

incident-2012-03-09--21-14-38Z-e4c2fzi.txt (394.8 KB) - added by ianchov at 2012-03-10T07:30:14Z.
Last incidents for Uncoordinated Write Error
incident-2012-03-09--05-23-19Z-vsxjgbi.txt (66.7 KB) - added by ianchov at 2012-03-10T07:30:28Z.
Last incidents for Uncoordinated Write Error


Change History (18)

comment:1 Changed at 2012-02-16T17:05:36Z by davidsarah

  • Keywords ucwe repair leases added
  • Priority changed from critical to major

Is this problem reproducible, and does it happen without --add-lease?

comment:2 Changed at 2012-02-16T17:42:35Z by ianchov

C:\Users\ianchov>C:\Python26\python.exe X:\allmydata-tahoe-1.9.1\bin\tahoe deep-check --repair -v -d X:\tahoe cveti:
ERROR: UncoordinatedWriteError()
"[Failure instance: Traceback (failure with no frames): <class 'allmydata.mutable.common.UncoordinatedWriteError'>: "


Without --repair and --add-lease:
C:\Users\ianchov>C:\Python26\python.exe X:\allmydata-tahoe-1.9.1\bin\tahoe deep-check -v -d X:\tahoe XXXX:
'<root>': Unhealthy: some versions are unrecoverable
'Archives': Unhealthy: some versions are unrecoverable 10 shares (enc 5-of-12)
'Archives/2012-02-01_08:25:59Z': Not Healthy: 10 shares (enc 5-of-12)
'Archives/2012-02-01_08:25:59Z/Local Disk (C) - Shortcut.lnk': Not Healthy: 10 shares (enc 5-of-12).....
Last edited at 2012-03-05T22:15:50Z by zooko

comment:3 Changed at 2012-02-17T00:09:30Z by davidsarah

  • Keywords leases removed

comment:4 Changed at 2012-02-17T00:10:45Z by davidsarah

  • Summary changed from Cannot deep-check url to Reproducible UncoordinatedWriteError on repair

comment:5 Changed at 2012-02-17T16:31:16Z by zooko

  • Priority changed from major to critical

comment:6 Changed at 2012-02-18T17:47:35Z by kevan

I don't think 1.9.1 has the fix for #1628. Can you try a deep check + repair with a Tahoe-LAFS that has that fix applied (preferably the current git master) and let us know if you can still reproduce the error?

comment:7 Changed at 2012-02-18T23:04:03Z by gyver

Same problem here: "tahoe deep-check --add-lease" works on an alias, but "tahoe deep-check --repair" throws UncoordinatedWriteError. See ticket #1656 for a probably related bug.

I separated the two so that I can at least maintain my backups on the storage network while waiting for a solution.

Unfortunately, I have to deploy the solution to 8 servers, and I use Gentoo ebuilds for this, so testing git master is a bit tricky (although possible if time allows).

One note: one of my storage nodes had network connection problems, which most probably occurred during a "tahoe cp". I upload very large tar.xz files that can take more than an hour to store. The problem started at about the same time as these connection issues.
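
The workaround described above (running the lease renewal and the repair as two separate deep-check passes) could be scripted roughly as follows. This is only a sketch and not something posted in this ticket: the helper names `build_commands` and `run_split_deep_check`, the `runner` parameter, and the alias `backups:` are hypothetical. Only the `deep-check`, `--add-lease`, and `--repair` arguments come from the commands quoted in this thread.

```python
import subprocess


def build_commands(alias, tahoe="tahoe"):
    """Return the two deep-check invocations: lease renewal first, repair second."""
    return [
        [tahoe, "deep-check", "--add-lease", alias],
        [tahoe, "deep-check", "--repair", alias],
    ]


def run_split_deep_check(alias, tahoe="tahoe", runner=subprocess.call):
    """Run each pass on its own so a failing repair cannot block lease renewal.

    `runner` is any callable that takes an argument list and returns an exit
    code; it defaults to subprocess.call. Returns (command, exit_code) pairs.
    """
    results = []
    for cmd in build_commands(alias, tahoe):
        results.append((cmd, runner(cmd)))
    return results


if __name__ == "__main__":
    # "backups:" is a placeholder alias; substitute a real one.
    for cmd, code in run_split_deep_check("backups:"):
        print(" ".join(cmd), "->", code)
```

Because the lease pass runs first and its exit code is recorded independently, leases are still renewed when the repair pass fails with UncoordinatedWriteError.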

Last edited at 2012-03-05T22:16:32Z by zooko

comment:8 Changed at 2012-03-03T19:23:44Z by ianchov

.....'Archives/2012-02-01_08:25:59Z/DESKTOP/Dokumentatsia_en.efektivnost_Kostinbrod/Prilojenie_3_deklaracia 47,1.doc': not healthy
 repair successful
"ERROR: AttributeError('NoneType' object has no attribute 'callRemote')"
"[Failure instance: Traceback: <type 'exceptions.AttributeError'>: 'NoneType' object has no attribute 'callRemote'"
X:\allmydata-tahoe-1.9.1\support\Lib\site-packages\foolscap-0.6.3-py2.6.egg\foolscap\call.py:677:_done
X:\allmydata-tahoe-1.9.1\support\Lib\site-packages\foolscap-0.6.3-py2.6.egg\foolscap\call.py:60:complete
X:\allmydata-tahoe-1.9.1\support\Lib\site-packages\twisted-10.1.0-py2.6-win-amd64.egg\twisted\internet\defer.py:318:callback
X:\allmydata-tahoe-1.9.1\support\Lib\site-packages\twisted-10.1.0-py2.6-win-amd64.egg\twisted\internet\defer.py:424:_startRunCallbacks
--- <exception caught here> ---
X:\allmydata-tahoe-1.9.1\support\Lib\site-packages\twisted-10.1.0-py2.6-win-amd64.egg\twisted\internet\defer.py:441:_runCallbacks
x:\allmydata-tahoe-1.9.1\src\allmydata\immutable\upload.py:553:_got_response
x:\allmydata-tahoe-1.9.1\src\allmydata\immutable\upload.py:420:_loop
x:\allmydata-tahoe-1.9.1\src\allmydata\immutable\upload.py:105:query

C:\Users\ianchov>C:\Python26\python.exe X:\allmydata-tahoe-1.9.1\bin\tahoe deep-check --repair --add-lease -v -d X:\tahoe cveti:
'<root>': not healthy
 repair successful
ERROR: UncoordinatedWriteError()
"[Failure instance: Traceback (failure with no frames): <class 'allmydata.mutable.common.UncoordinatedWriteError'>: "

C:\Users\ianchov>C:\Python26\python.exe X:\allmydata-tahoe-1.9.1\bin\tahoe deep-check --repair --add-lease -v -d X:\tahoe cveti:
'<root>': not healthy
 repair successful
ERROR: UncoordinatedWriteError()
"[Failure instance: Traceback (failure with no frames): <class 'allmydata.mutable.common.UncoordinatedWriteError'>: "
Last edited at 2012-03-05T22:15:15Z by zooko

comment:9 Changed at 2012-03-05T22:16:04Z by zooko

  • Description modified (diff)

comment:10 Changed at 2012-03-05T22:17:28Z by zooko

  • Keywords regression added
  • Milestone changed from undecided to 1.9.2

I'm adding the keyword regression on the assumption that this is related to the regressions in 1.9.x, and I'm adding it to the 1.9.2 Milestone. Please let me know if you know that assumption is incorrect.

comment:11 Changed at 2012-03-06T06:35:56Z by ianchov

I am not sure what I should do.

Please give me a hint on how to debug this.

Changed at 2012-03-10T07:30:14Z by ianchov

Last incidents for Uncoordinated Write Error

Changed at 2012-03-10T07:30:28Z by ianchov

Last incidents for Uncoordinated Write Error

comment:12 Changed at 2012-03-10T18:05:50Z by kevan

ianchov: Are those incidents from a Tahoe-LAFS with the fix for #1628 applied? At first inspection, the UncoordinatedWriteErrors in those files look like issue #1628.

comment:13 Changed at 2012-03-12T17:14:17Z by ianchov

Confirmed!

I think it is the same as #1628. The latest git revision, e27423e4a9, does not have this problem.

comment:14 Changed at 2012-03-12T17:14:32Z by ianchov

  • Resolution set to fixed
  • Status changed from new to closed

comment:15 Changed at 2012-03-12T20:02:41Z by davidsarah

  • Resolution fixed deleted
  • Status changed from closed to reopened

comment:16 Changed at 2012-03-12T20:02:49Z by davidsarah

  • Resolution set to duplicate
  • Status changed from reopened to closed