id	summary	reporter	owner	description	type	status	priority	milestone	component	version	resolution	keywords	cc	launchpad_bug
932	benchmark Tahoe-LAFS compared to nosql dbs	zooko	bibilthaysose	"I'm curious how Tahoe-LAFS performs compared to nosql databases on the nosqlish loads that those users care about. Aaron Cordova did some benchmarks of Tahoe-LAFS vs. HDFS as the storage backend for Hadoop and reported in his !HadoopWorld presentation that they performed about the same for the map-reduce computation (which is a read-intensive workload): http://www.slideshare.net/cloudera/hw09-map-reduce-over-tahoe-a-least-authority-encrypted-distributed-filesystem

Recently a scientist from Yahoo posted about his benchmarks of various nosql systems:

http://mail-archives.apache.org/mod_mbox/incubator-cassandra-user/201001.mbox/%3cC2D6929236FAC846B7A4FE1EC39910C64F27B52F25@SP1-EX07VS01.ds.corp.yahoo.com%3e

He says that his benchmarking code will be open-sourced soon, pending approval from Yahoo's legal department. Maybe we could contribute patches that make Tahoe-LAFS one of the systems that his benchmark system can measure.

N.B. not to get anyone's hopes up, I would expect Tahoe-LAFS to perform very badly on those workloads! They typically want to assign values to user-specified keys, which we don't have a native implementation of and which we would have to simulate somehow, such as by letting the user-chosen keys be the childnames in a mutable directory. So I would expect Tahoe-LAFS to be pretty much off the charts for bad performance on those workloads. But, I might be pleasantly surprised. And also: ""What gets measured gets improved!"" :-)"	enhancement	assigned	major	undecided	dev-infrastructure	1.5.0		scalability performance large	zooko