Monday 25 November 2013

Distributed and Partitioned DB Tutorials

Today's release of PyrrhoDB v5.0 comes with a full account of the distributed and partitioned tutorials described in August 2013 postings on this blog.

Those postings have been updated slightly to match the new version. The main difference is that reconfiguration and repartition of databases are now transactioned. You can say begin transaction before you start either process and then no changes are made to disks on any of the servers involved until the transaction is committed. Recall that Pyrrho is very fussy about transaction isolation, so that while a transaction is ongoing the connection that started the transaction is the only participant that can examine the progress being made.

The advantage of doing this is more than theoretical. Defining a partition includes specifying a set of conditions for including records in the new partition, and autocommits during this process would not be helpful.

The new version comes with a much more robust 3-phase distributed transaction protocol than was provided in previous versions, and there are slight differences.

For the purpose of explaining the internal operation of Pyrrho for distributed and partitioned data, a tutorial mode -T has been implemented that exposes all of the server-server protocols and commit steps. Sections have been added to the tutorials to explain some of what is going on. The Pyrrho manual and the SourceIntro document in the open source distribution have been updated with full details. I am happy to explain the internals to anyone who is interested and plan to add more comments to the source code.

Future developments for Pyrrho will develop these facilities a little further, offering behaviour closer to scatter-gather (Hadoop). In connection with a related project at UWS, I am also planning to provide internal support for a BSON data type. As usual, what distinguished Pyrrho from other database initiatives is that all databases are consistent and relational, support full SQL and optimistic concurrency.