Thursday 15 August 2013

Pyrrho DBMS architecture

With v5.0 Pyrrho has enhanced internal support for distributed, partitioned and remote databases. The architectural model outlined in 2009 Pyrrho and the Cloud is still correct, and the diagram from that article still applies:

The bottom two blocks are referred to in the source code as Level 1 (datafiles). Level 2 (physical records) is where transaction serialization occurs, Level 3 is for database objects, and Level 4 is where the SQL mechanisms, rowsets, OLAP, multi-database connections etc occur. Since approximately v4.5, the strong data type mechanism operates in level 2, and this now facilitates binary server-server communications.

With v5.0 the asynchronous client-server implementation (AsyncStream) is adapted for server-server communications, and applies for communication at the different levels in the following way:
  • At Level 1 server-server communications support the notion of remote storage
  • At Level 2 we use server-server comms for remote transaction master
  • At Level 3 we support database partitioning, where the asynchronous comms enable traversal of partitioned table segments
  • At Level 4 server-server comms will support mobile clients accessing remote databases
The configuration database uses the partitioning mechanism at level 3 to implement its peer-to-peer nature, with subclasses of the LocalTransaction class for handling side effects such as data transfer when partitions are moved between servers.

The v5.0 implementation of partitioning has brought a subtle and important change to Pyrrho's internal operation. Up to now objects in data and schema records have uniformly been identified using their defining or actual addresses in the transaction log files. However, from v5.0, schema objects are always identified by places in the base database's transaction log, and a special physical record (Partition) is used to wrap and import these into the partition's log. These Partition records are created and pushed to the partition as needed whenever a communication channel is opened between them. (They should probably be transacted in the partition under the identity of whoever made the schema change to the base database: but currently they sit outside of the transaction mechanism, as they are not part of the current transaction. I would like such changes to be asynchronous as I don't require all partitions to be online for a schema change.)

I plan to add more comments to the source code to explain things and make the structure clearer. There are some attempts to explain the internal workings of Pyrrho in the SourceIntro.doc and Classes spreadsheet in the distribution.

1 comment:

  1. Hi, nice description about Pyrrho DBMS architecture.Thanks for your help.....

    -Aparna
    Theosoft

    ReplyDelete