At DBKDA2021 in May, I presented a tutorial on the inner workings of Pyrrho V7. The PDF version, with the demonstrations, tiotalled 10MB, which was probably much too much. Material based on it will nevertheless be included in the introductory sections and appendices of the documentation "Introduction to the Source code of the PyrrhoDBMS" a version of which is available on my github site. The following extracts are from the very start of the tutorial, which went on to explain the use of shareable data structures and ended with a deep dive into remote updates using view-mediated web services. (All of this material will soon appear on the github site: email me at malcolm.crowe@uws.ac.uk if you are interested in getting it sooner.)
ACID and Serializable Transactions
Our starting point is that full isolation requires truly serializable transactions.
All database textbooks begin by saying how important serializability and isolation are, but very quickly settle for something much less.
If we agree that ACID transactions are good, then:
· First, for atomicity and durability we should not write anything durable until (a) we are sure we wish to commit and (b) we are ready to write the whole transaction.
· Second: before we write anything durable, we should validate our commit against the current database.
· Third, for isolation, we should not allow any user to see transactions that have not yet been committed.
· Fourth, for durability, we should use durable media – preferably write-once append storage.
From these observations, it seems clear that a database should record its durable transactions in a non-overlapping manner.
· If transactions in progress overlap in time, they cannot both commit if they conflict: and if they don’t conflict, it does not matter which one is recorded first.
· The simplest order for writing is that of the transaction commit.
· If we are writing some changes that we prepared earlier, the validation step must ensure that it depends on nothing that has changed in the meantime, so that our change can seem to date from the time it was committed rather than first considered.
· Effectively we need to reorder overlapping transactions as we commit them.
These few rules guarantee actual serialization of transactions for a single transaction log (sometimes called a single transaction master). It obviously does not matter where the transactions are coming from.
But if a transaction is trying to commit changes to more than one transaction log, things are very difficult (the notorious two-army problem). If messages between autonomous actors can get lost, then inconsistencies are inevitable. The best solution is to have a DBMS act as transaction master for every piece of data For safety any transaction should have at most one remote participant, and more complex transactions can be implemented as a sequence of such one-way dependency transactions.
The transaction log
The first was about the transaction log. In
Pyrrho the transaction log defines the content of the database (it is the only
thing stored on disk).
In this demonstration, we will see how
every Transaction T consists of a set of elementary operations e1, e2,..
, en .
Each operation corresponds to a Physical
object written to the transaction log
Committing this transaction applies the
sequence to the Database D. In reverse
mathematical notation
(D)T = (..((D)e1)e2)..)en
If we think of a transaction commit as
comprising a set of elementary operations e, then the transaction log is best
implemented as a serialization of these events to the append storage. We can
think of these serialized packets as objects in the physical database. In an
object-oriented programming language, we naturally have a class of such
objects, and we call this class Physical.
So, at the commit point of a transaction,
we have a list of these objects, and the commit has two effects (a) 7 appending
them to the storage, (b) modifying the database so that other users can see. We
can think of each of these elementary operations as a transformation on the
database, to be applied in the order they are written to the database (so the
ordering within each transaction is preserved).
And the transaction itself is the resulting
transformation of the database.
Every Database D is the
result of applying a sequence of transactions to the empty _system Database D0.
Any state of the database is thus the
result of the entire sequence of committed transactions, starting from a known
initial database state corresponding to an empty database.
The sequence of transactions and their
operations is recorded in the log.
Shareable data
Many programming languages (including Java,
Python and C#) currently have shareable implementations of strings.
Specifically, strings in Java, Python and
C# are immutable: if you make a change, you get a new string, and anyone who
had a copy of the previous version sees no change.
In these systems, it is illegal to modify a
string by assigning to a character position instead you need to use a library
function.
The addition operator can be used in these
languages to create a sum of strings. This is basically the model for shareable
data structures.
For a class to be shareable, all
fields must be read-only and shareable. Constructors therefore need to perform
deep initialisation, and any change to an existing structure needs another
constructor. Inherited fields need to be initialised in the base (or super)
constructor, maybe with the help of a static method.
This is useful for
databases because databases share so many things: predefined types, properties,
system tables. For example, all databases can share the same starting state, by
simply copying it from the _system database.
Even more importantly all transactions can start
with the current state of the database, without cloning or separately copying
any internal structures. When a transaction starts, it starts with the shared
database state: as it adds physicals, it transforms. Different transactions
will in general start from different states of the shared database.
In the above picture, we know
what the database DB’s state is. Each of concurrent transaction steps T1, T2,
and T3 are, if committed, will create a new version of DB (for example (DB)T2.)
Because of isolation, from the viewpoint of any of these transactions, they
cannot know whether DB has already been updates by another transaction (in
which case, they may no longer fit on the resulting database). In particular,
after T2 commits, T1 and/or T3 will possibly no longer be able to commit.
However, if T1 was able to
commit before, then it will still be able to commit provided it has no conflict
with T2’s changes.
Transaction conflict
The details of what
constitutes a conflicting transaction are debatable. Most experts would agree
with some version of the following rules:
·
T1 and T2 will
not conflict if they affect different tables or other database objects
o
And only read
from different tables
·
But we can allow
them to change different rows in the same table
o
Provided they
read different specified rows
·
Or even different
columns in the same row
o
Provided they
read different columns
The first rule is sound
enough, although the condition on reading is very important: we would need to
include things like aggregation in our definition of reading. The first rule is
also very easy to implement, especially if tables are shareable structures, as
a simple 64-bit comparison is sufficient!
For the other rules, we would
need to be very clear on what is meant by a “specified row”, and the
non-existence of a row might only be determined by reading the whole table.
Transaction validation
At the start of Transaction Commit, there is a validation check, to ensure that the transaction still fits on the current shared state of the database, that is, that we have no conflict with transaction that committed since our transaction started.
We will see that during commit of a Transaction T, we do a validation check. It ensures that the elementary operations of T can be validly relocated to follow those of any transaction T´ that has committed since the start of T . T planned
But now we have
Relocation amounts to swapping the order of
the elementary operations ei that comprise transaction T and T’. Two
such cannot be swapped if they conflict. E.g. They change the same object
(write/write conflict). The tests for write-write conflicts involve comparing our list of
physicals with those of the other transactions.
There are also tests for read/write
conflicts between T and T´. For checking read-write conflicts, we collect “read constraints” when we
are making Cursors.