Andy Doan <andy.doan@linaro.org> writes:
On 01/28/2013 03:31 PM, Michael Hudson-Doyle wrote:
Thoughts? Am I solving the wrong problem?
Well. The thing that occurs to me is that what we are doing here is building a system that aims to be available for writes in the face of network partitions, and other people have already built systems that have this property -- it is basically the whole principle behind Amazon's famous Dynamo database [1] and the systems it inspired, like Riak and Cassandra. It seems unlikely that we'd do a better job than them.
time to read your article below.
To be fair, we're pretty unlikely to build anything on this sort of technology before the outage we're being notified of :-)
One thing I don't completely understand is how, with a simple job-accepting scheduler in the cloud, we would replicate the sanity check that the submitting user is able to submit results to the stream specified in the job -- or even the check that the token provided while submitting the job is valid, come to think of it!
I thought about this before my original email and decided to not mention this issue.
:)
However, my thinking was that we might need to add a new state to a job called something like "REJECTED". Now normally, we just reject a request before it ever becomes a job. However, in this offline mode, we could wait to reject the job until we came back online and were able to do the proper checks.
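A rough sketch of that deferred-rejection idea, purely for illustration: jobs are queued unchecked while offline, and the token and stream checks run once we are back online. Every name here (JobState, Job, accept_offline, validate_queued, the PENDING_VALIDATION state, and the two lookup dicts) is made up for this example and is not taken from the actual scheduler code.

```python
from dataclasses import dataclass
from enum import Enum


class JobState(Enum):
    # Hypothetical states; only REJECTED is named in the discussion above.
    PENDING_VALIDATION = "PENDING_VALIDATION"  # accepted offline, not yet checked
    SUBMITTED = "SUBMITTED"
    REJECTED = "REJECTED"


@dataclass
class Job:
    token: str
    user: str
    stream: str
    state: JobState = JobState.PENDING_VALIDATION
    reason: str = ""


def accept_offline(token, user, stream):
    """Queue a job without any checks, since the main server is unreachable."""
    return Job(token=token, user=user, stream=stream)


def validate_queued(jobs, valid_tokens, stream_writers):
    """Run the deferred checks once the main server is reachable again.

    valid_tokens maps token -> user name (assumed lookup).
    stream_writers maps stream path -> set of users allowed to submit to it.
    """
    for job in jobs:
        if valid_tokens.get(job.token) != job.user:
            job.state, job.reason = JobState.REJECTED, "invalid token"
        elif job.user not in stream_writers.get(job.stream, set()):
            job.state, job.reason = JobState.REJECTED, "no access to stream"
        else:
            job.state = JobState.SUBMITTED
    return jobs
```

The cost of this approach is that a submitter only learns about a bad token or stream after the outage ends, so the reason for the rejection needs to be stored with the job.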
Yeah, that could work. Still not sure if we'll get this done before the outage though...
Cheers, mwh