DBaaS is difficult on most platforms for several reasons:
1. Cloud providers charge by the amount of use. Databases need a lot of disk space, which incurs storage charges. If the database engine caches a lot of data in memory, then the server instance needs to run the whole time, and this also costs a lot of money. Finally, if the design needs a single privileged instance (such as a transaction master for a dataset), then this is a further restriction.
2. Cloud “database” systems typically do not support ACID principles. They add timestamps and the like so that eventual consistency is guaranteed at the expense of durability (there will be lost updates). Also, eventual consistency means that users in telephone contact with each other, say in New York and Glasgow, may see different data values that take time to be resolved.
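The lost-update point can be made concrete with a minimal sketch. It assumes a last-writer-wins merge on timestamps, which is one common way for an eventually consistent store to reconcile replicas; it is an illustration, not a description of any particular provider. The names `Versioned` and `lww_merge` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: int
    timestamp: float  # time of the write (assumed comparable across sites)


def lww_merge(a: Versioned, b: Versioned) -> Versioned:
    """Last-writer-wins: keep whichever write carries the later timestamp."""
    return a if a.timestamp >= b.timestamp else b


# Both replicas start from the same value.
start = Versioned(value=100, timestamp=0.0)

# New York adds 10 and Glasgow adds 20, each against its local replica.
new_york = Versioned(value=start.value + 10, timestamp=1.0)
glasgow = Versioned(value=start.value + 20, timestamp=2.0)

# Until the replicas synchronise, the two users see different values.
differ_before_sync = new_york.value != glasgow.value

# After synchronisation both replicas agree, but New York's +10 has vanished.
converged = lww_merge(new_york, glasgow)
print(converged.value)  # 120, whereas a serial execution would give 130
```

The replicas do converge, so consistency is "eventual", but one committed update has been silently discarded.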
For these reasons, Pyrrho does not claim to work with cloud providers. It requires a transaction master, and it uses memory, at least for indexes.
The title refers to Eric Brewer's famous theorem that in a distributed system you cannot have all three of Consistency, Availability and Partition Tolerance. As with many inconvenient truths, many people have tried to pretend they have a workaround.
Assuming we want the C (consistency), the P of CAP is “partition tolerance”, which means tolerating a broken network (one partitioned into two or more fragments). This sort of partition is not the same as horizontal or vertical partitioning of databases. If a client cannot contact the transaction master, no transactions can be committed, so on part of the network the database will not be available (the A).
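As a toy illustration of that trade-off, the sketch below models a network as a list of fragments (sets of node names) and lets a client commit only if it shares a fragment with the transaction master. The function `can_commit` and the node names are assumptions made for this example, not part of Pyrrho.

```python
def can_commit(client, master, fragments):
    """True if client and master sit in the same network fragment."""
    return any(client in f and master in f for f in fragments)


# A healthy network is a single fragment containing every node.
healthy = [{"master", "glasgow", "new_york"}]

# A network partition splits the nodes into two fragments.
split = [{"master", "glasgow"}, {"new_york"}]

print(can_commit("new_york", "master", healthy))  # True
print(can_commit("glasgow", "master", split))     # True
print(can_commit("new_york", "master", split))    # False
```

Consistency is preserved because only one master ever commits; the cost is that clients cut off from the master lose availability.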
What Pyrrho does offer in the direction of DBaaS is distributed and (horizontally) partitioned databases. Each horizontal partition is its own transaction master (a replica of a horizontal partition will not be). The most (network) partition tolerant design is where the only distributed transactions are either read-only or for schema changes. In that case you have a good deal of (network) partition tolerance:
1. When any (horizontal) partition comes up, it checks with its parent in the (horizontal) partitioning scheme.
2. To view data from anywhere, connect to any partition: if your query attempts to access a partition that is offline, there will be an error.
3. For updates to a given partition, you just connect to that partition (this is not a distributed transaction).
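The read and update rules above can be sketched as a small in-memory model. This is not Pyrrho's API: the classes `Partition`, `PartitionedDatabase` and `PartitionOffline` are hypothetical, and the `online` flag merely stands in for network reachability.

```python
class PartitionOffline(Exception):
    """Raised when a query needs a partition that cannot be reached."""


class Partition:
    def __init__(self, name):
        self.name = name
        self.online = True
        self.rows = []

    def commit(self, row):
        # Local transaction: the partition is its own transaction master.
        if not self.online:
            raise PartitionOffline(self.name)
        self.rows.append(row)


class PartitionedDatabase:
    def __init__(self, names):
        self.partitions = {n: Partition(n) for n in names}

    def update(self, key, row):
        # An update touches exactly one partition, so it is not distributed.
        self.partitions[key].commit(row)

    def query(self, keys):
        # A read-only query may span partitions; any offline one is an error.
        result = []
        for key in keys:
            part = self.partitions[key]
            if not part.online:
                raise PartitionOffline(key)
            result.extend(part.rows)
        return result


db = PartitionedDatabase(["UK", "US"])
db.update("UK", {"city": "Glasgow"})
db.update("US", {"city": "New York"})

db.partitions["US"].online = False       # simulate a network partition

db.update("UK", {"city": "Paisley"})     # local update still commits
uk_rows = db.query(["UK"])               # local read still works

try:
    db.query(["UK", "US"])               # needs the unreachable partition
    failed = None
except PartitionOffline as err:
    failed = str(err)
```

Only the query that spans the unreachable partition fails; work confined to the reachable partition carries on as normal.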
For example, if the horizontal partitioning were by country, few users would notice disruption to network traffic between countries. Pyrrho does not prevent more complex and fragile database designs. With the simple design, though, we still have global consistency and ACID, and have partial A and P.