Wednesday, 25 September 2019

Why Pyrrho performs so well in the TPC-C benchmark tests

I have been asked how it can be that commercial DBMS, and also PostgreSQL, show up so badly in the TPC-C benchmark tests that I have published on GitHub.

To begin with, the TPC-C benchmark normally has one clerk per warehouse, so that the conflict rate is around 4%. In my tests I deliberately increase the concurrency challenge by using multiple clerks for a single warehouse. When the number of clerks goes above 10, most NewOrder tasks will fail with a write-write conflict on NEXT_O_ID, as this is set per district and there are only 10 districts. Worse, the single row in the WAREHOUSE table contains an amount W_YTD which is updated by the Payment task, and fields from this row are read by all the NewOrder tasks and others, so that a great many more tasks are aborted because of read/write conflicts. In all of the products tested, apart from Pyrrho and StrongDBMS, read/write conflicts are detected at the row level or wider.
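The write-write conflict on NEXT_O_ID can be illustrated with a small sketch in Python. This is not Pyrrho code: the class names, the version counter and the round-robin assignment of clerks to districts are all assumptions made for the illustration. Every clerk reads its district's NEXT_O_ID at the same moment, and each commit is validated against what was read.

```python
from dataclasses import dataclass


@dataclass
class District:
    next_o_id: int = 1
    version: int = 0  # hypothetical counter, bumped on every committed write


class WriteConflict(Exception):
    pass


class NewOrderTxn:
    """Hypothetical optimistic transaction: read first, validate at commit."""

    def __init__(self, district):
        self.district = district
        self.seen_version = district.version
        self.o_id = district.next_o_id

    def commit(self):
        # Another clerk has already taken this order number: abort.
        if self.district.version != self.seen_version:
            raise WriteConflict("NEXT_O_ID changed since it was read")
        self.district.next_o_id = self.o_id + 1
        self.district.version += 1
        return self.o_id


def run_round(n_clerks, n_districts=10):
    """All clerks start a NewOrder together, then try to commit in turn.

    ASSUMPTION: clerks are spread round-robin over the districts, which is
    the most favourable case; the real benchmark picks districts at random.
    """
    districts = [District() for _ in range(n_districts)]
    txns = [NewOrderTxn(districts[i % n_districts]) for i in range(n_clerks)]
    commits = conflicts = 0
    for txn in txns:
        try:
            txn.commit()
            commits += 1
        except WriteConflict:
            conflicts += 1
    return commits, conflicts


for clerks in (10, 25, 100):
    print(clerks, run_round(clerks))
```

Even in this idealised setting only one transaction per district can succeed in each round, so once there are more clerks than districts the surplus must abort and retry.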

Both Pyrrho and StrongDBMS see no conflict between the Payment and NewOrder tasks, because Payment is the only task that accesses W_YTD, and one of the tests available in ReadConstraint for detecting read/write conflicts applies to a set of fields in a specific single row of a table.
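The difference in granularity can be sketched as follows. The dictionaries and function names are hypothetical and are not taken from either DBMS; W_TAX stands in for the WAREHOUSE fields that NewOrder reads.

```python
def row_level_conflict(read, write):
    """Conflict if the same row of the same table was read and written."""
    return read["table"] == write["table"] and read["key"] == write["key"]


def field_level_conflict(read, write):
    """Conflict only if the write also touches a column that was read."""
    return row_level_conflict(read, write) and bool(
        read["columns"] & write["columns"]
    )


# NewOrder reads a field of warehouse 1; Payment updates W_YTD in that row.
new_order_read = {"table": "WAREHOUSE", "key": 1, "columns": {"W_TAX"}}
payment_write = {"table": "WAREHOUSE", "key": 1, "columns": {"W_YTD"}}

print("row level:  ", row_level_conflict(new_order_read, payment_write))
print("field level:", field_level_conflict(new_order_read, payment_write))
```

With row-level detection the NewOrder transaction is aborted; with field-level detection the two tasks do not interfere.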

There are actually three levels of read/write conflict detection in these DBMS. The following comment in the source code at ReadConstraint.cs dates from about 2005:

    /// ReadConstraints record all of the objects that have been accessed in the current transaction
    /// so that this transaction will conflict with a transaction that changes any of them.
    /// However, for records in a table, we allow specific non-conflicting updates, as follows:
    /// (a) (CheckUpdate) If unique selection of specific records cannot be guaranteed, then
    /// we should report conflict if any column read is updated by another transaction.
    /// (b) (CheckSpecific) If we are sure the transaction has seen a small number of records of tb,
    /// selected by specific values of the primary or other unique key, then
    /// we can limit the conflict check to updates of the selected records (if any),
    /// or to updates of the key TableColumns.
    /// (c) (BlockUpdate) as (a) but it is known that case (b) cannot apply.
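
One possible reading of the three cases in that comment is sketched below in Python. It is an interpretation for illustration only, not the logic of ReadConstraint.cs: the class, its fields and the exact rule used for case (b) are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Check(Enum):
    CHECK_UPDATE = "a"    # records read cannot be pinned down by key
    CHECK_SPECIFIC = "b"  # a few records selected by unique key values
    BLOCK_UPDATE = "c"    # as (a), and (b) is known not to apply


@dataclass(frozen=True)
class Update:
    table: str
    key: int
    columns: frozenset


@dataclass
class ReadConstraintSketch:
    table: str
    check: Check
    columns_read: frozenset
    keys_read: frozenset = frozenset()    # used only for CHECK_SPECIFIC
    key_columns: frozenset = frozenset()  # used only for CHECK_SPECIFIC

    def conflicts_with(self, upd):
        if upd.table != self.table:
            return False
        if self.check is Check.CHECK_SPECIFIC:
            # ASSUMPTION: a change to a key column always conflicts;
            # otherwise only a selected record and a column read matter.
            if upd.columns & self.key_columns:
                return True
            return upd.key in self.keys_read and bool(
                upd.columns & self.columns_read
            )
        # Cases (a) and (c): any update to a column read, in any record.
        return bool(upd.columns & self.columns_read)


payment_w2 = Update("WAREHOUSE", 2, frozenset({"W_YTD"}))

# A scan that read W_YTD across the table conflicts with any Payment.
scan = ReadConstraintSketch("WAREHOUSE", Check.CHECK_UPDATE,
                            frozenset({"W_YTD"}))
# A keyed read of warehouse 1 is unaffected by a Payment on warehouse 2.
keyed = ReadConstraintSketch("WAREHOUSE", Check.CHECK_SPECIFIC,
                             frozenset({"W_YTD"}), frozenset({1}),
                             frozenset({"W_ID"}))

print(scan.conflicts_with(payment_w2), keyed.conflicts_with(payment_w2))
```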


If the isolation level is reduced to repeatable-read or read-committed, most of the competing products achieve performance comparable with Pyrrho and StrongDBMS.

I remain very satisfied with the results of these tests since they show that Pyrrho and StrongDBMS achieve such high scores on concurrency tests despite, or even because of, using immutable data structures and optimistic concurrency.

Monday, 16 September 2019

TPC-C benchmark with Pyrrho v7

At present, successive updates to PyrrhoDB v7 alpha are on GitHub. As of today, this location contains the 14 September 2019 version, and a version of TpccPyrrho. The TPC-C benchmark test is for OLTP for a warehouse, where the clerk works through a task sequence including new orders, with realistic time delays. In 10 minutes the clerk handles 16 new orders along with other tasks.

In order to demonstrate exceptional handling of concurrency, this version of the benchmark uses multiple clerks per warehouse. This introduces high levels of concurrency, and many transactions are expected to fail. With StrongDBMS I demonstrated performance superior to commercial databases, and now can do the same with the alpha version of PyrrhoDB. The GitHub repository includes versions of the benchmark for several popular DBMS so this claim can be verified by anyone interested.

The results for Pyrrho v7 alpha are as follows:


                Recreate DB: 1:02
                Fill stock: 2:02
                Fill districts: 6:15
                Cold start with initial warehouse: 1:30



    F:\PyrrhoDB7\Pyrrho>tpccpyrrho
    fid 1 loaded at 15/09/2019 12:04:32
    Started at 15/09/2019 12:04:40 with 1 clerks
    fid 2 loaded at 15/09/2019 12:04:40
    At 15/09/2019 12:14:40 Commits 16, Conflicts 0 0
    Last fid=2

    F:\PyrrhoDB7\Pyrrho>tpccpyrrho
    fid 1 loaded at 15/09/2019 12:17:56
    Started at 15/09/2019 12:18:03 with 10 clerks
    fid 11 loaded at 15/09/2019 12:18:03
    At 15/09/2019 12:28:03 Commits 145, Conflicts 0 95
    Last fid=11

    F:\PyrrhoDB7\Pyrrho>tpccpyrrho
    fid 1 loaded at 15/09/2019 12:32:41
    Started at 15/09/2019 12:33:33 with 100 clerks
    fid 101 loaded at 15/09/2019 12:35:01
    At 15/09/2019 12:43:33 Commits 313, Conflicts 0 2920
    Last fid=101

    F:\PyrrhoDB7\Pyrrho>



During the benchmark test for 100 clerks my desktop machine reported the CPU utilisation was around 40% and the memory utilisation 50%.
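
For anyone comparing runs, the summary lines above can be reduced to a few figures with a short Python script. This is a convenience sketch, not part of TpccPyrrho. It assumes that the two numbers after "Conflicts" are separate counts of aborted transactions and that every attempt ended in either a commit or one of those conflicts; the output does not state this, so the "share committed" figure should be read with that caveat.

```python
import re

# Console lines copied from the three runs reported above.
LOG = """\
Started at 15/09/2019 12:04:40 with 1 clerks
At 15/09/2019 12:14:40 Commits 16, Conflicts 0 0
Started at 15/09/2019 12:18:03 with 10 clerks
At 15/09/2019 12:28:03 Commits 145, Conflicts 0 95
Started at 15/09/2019 12:33:33 with 100 clerks
At 15/09/2019 12:43:33 Commits 313, Conflicts 0 2920
"""

START = re.compile(r"with (\d+) clerks")
END = re.compile(r"Commits (\d+), Conflicts (\d+) (\d+)")


def summarise(log):
    """Return (clerks, commits, commits per clerk, share committed) per run."""
    rows = []
    clerks = None
    for line in log.splitlines():
        m = START.search(line)
        if m:
            clerks = int(m.group(1))
            continue
        m = END.search(line)
        if m and clerks is not None:
            commits, c1, c2 = (int(g) for g in m.groups())
            # ASSUMPTION: attempts = commits + both conflict counters.
            attempts = commits + c1 + c2
            rows.append((clerks, commits, commits / clerks, commits / attempts))
    return rows


for clerks, commits, per_clerk, share in summarise(LOG):
    print(f"{clerks:>3} clerks: {commits:>3} commits, "
          f"{per_clerk:.2f} per clerk, {share:.1%} of attempts committed")
```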


PyrrhoDB v7 should reach beta version by December and include all of the usual database features as in previous versions of the DBMS.

Monday, 2 September 2019

Pyrrho v7 alpha available

The 2 September alpha code of PyrrhoDB v7 is now available.
So far it can manage creation and CRUD operations on simple tables, but has a full set of data types and system tables. There is an updated introduction to the source code in the doc folder.
Work continues, comments welcome.