Saturday 28 December 2013

"Database as a Service" and the CAP Theorem


DBaaS is difficult on most platforms for several reasons:

1.       Cloud providers charge by the amount of use. Databases need a lot of disk space so incur charges. If the database engine caches a lot of things in memory, then the server instance needs to run the whole time, and this also costs a lot f money. Finally, if there is a privileged single instance (such as a transaction master for a dataset) then this is also a restriction.

2.       Cloud “database” systems typically do not support ACID principles. They add timestamps etc so that eventual consistency is guaranteed at the expense of durability (there will be lost updates). Also, eventual consistency means that users in telephone contact with each other may see differences in data values in New York and Glasgow that take time to be resolved.

For these reasons Pyrrho does not claim to work with cloud providers. It requires a transaction master and uses memory at least for indexes.
The title refers to Eric Brewer's famous theorem that in a distributed system you cannot have all three of Consistency, Availability and Partition Tolerance. As with many inconvenient truths, many people have tried to pretend they have a workaround.
Assuming we want C for consistency, the P of CAP is “partition tolerance”, which means tolerating when the network is broken (partitioned into two or more fragments). This sort of partition is not the same as horizontal or vertical partitioning of databases. If a client cannot contact the transaction master, no transactions can be committed, so on part of the network the database will not be available (the A).

What Pyrrho does offer in the direction of DBaaS is distributed and (horizontally) partitioned databases. Each horizontal partition is its own transaction master (a replica of a horizontal partition will not be). The most (network) partition tolerant design is where the only distributed transactions are either read-only or for schema changes. In that case you have a good deal of (network) partition tolerance:

1.       When any (horizontal) partition comes up, it checks with its parent in the (horizontal) partitioning scheme.

2.       To view data from anywhere connect to any partition: if your query attempts to access a partition that is offline, there will be an error.

3.       For updates to a given partition, you just connect to that partition (this is not a distributed transaction).

For example, if the horizontal partitioning was by country, few users would notice disruption to network traffic between countries. Pyrrho does not prevent more complex and fragile database designs. In this way we still have global consistency and ACID, and have partial A and P .