Pyrrho DBMS: June 2009

Thursday, 11 June 2009

Bug fixed: SelectedColumnsRowSet

In previous releases of v3.2, SelectedColumnsRowSet incorrectly reported its rowType as the original rowType, not the rowType with just the selected columns. Using the wrong rowType would have caused an error 22202.
Similar fixes have been made to other RowSet classes. Fixed as of 12 June.
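
To make the nature of the fix concrete, here is a minimal sketch in Python. It is not Pyrrho's source code: the Column, TableRowSet and row_type names are hypothetical, and only the class name SelectedColumnsRowSet comes from the post. It shows a projection whose reported row type lists just the selected columns, so that it agrees with the rows it actually yields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical column descriptor: a name and a type name."""
    name: str
    data_type: str


class TableRowSet:
    """Hypothetical base row set exposing every column of a table."""

    def __init__(self, columns, rows):
        self.columns = list(columns)
        self.rows = [tuple(r) for r in rows]

    @property
    def row_type(self):
        return tuple(self.columns)

    def __iter__(self):
        return iter(self.rows)


class SelectedColumnsRowSet:
    """Projects a source row set onto a subset of its columns."""

    def __init__(self, source, names):
        self.source = source
        by_name = {c.name: i for i, c in enumerate(source.row_type)}
        self._positions = [by_name[n] for n in names]

    @property
    def row_type(self):
        # The kind of bug described above would correspond to returning
        # self.source.row_type here, which no longer matches the rows.
        src = self.source.row_type
        return tuple(src[i] for i in self._positions)

    def __iter__(self):
        for row in self.source:
            yield tuple(row[i] for i in self._positions)
```

With this arrangement every row produced has exactly as many values as the reported row type has columns, which is the property a consumer of the row set relies on.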

Monday, 8 June 2009

On importing data

I've started to provide better support for data import into Pyrrho. Version 3.x has been introducing ideas of provenance. Now I am enhancing the PyrrhoMgr application a bit. The changes will get published in the next few days. So far they amount to:
1. Supporting direct import from Access 2007 in addition to Access 2003 and SQL Server.
2. Supporting the "Percent" numeric format in Access. (Intriguingly, in Access 2007 the default Percent format is for a long integer, so validation changes every value to either 0.0% or 100.0%!)
3. Allowing the importer to specify the "From Culture" so that culture specific formats for numbers and dates can get converted into the culture used by the importing thread (the culture of the machine that PyrrhoMgr is running on).
As mentioned in the last posting, the server always uses an invariant culture.
As a result of some ongoing research, more changes in this area can be expected soon.
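
As an illustration of the "From Culture" idea, the following Python sketch converts culture-specific number and percent text into culture-neutral values. It is not PyrrhoMgr's code: the separator table, the function names and the three cultures listed are assumptions made for the example, and a real importer would take its formats from the platform's culture data.

```python
from decimal import Decimal

# Hypothetical separator table: culture -> (group separator, decimal separator).
CULTURE_SEPARATORS = {
    "en-GB": (",", "."),
    "de-DE": (".", ","),
    "fr-FR": ("\u00a0", ","),  # non-breaking space between digit groups
}


def to_invariant_number(text, from_culture):
    """Parse a number written in from_culture into a culture-neutral Decimal."""
    group, decimal_sep = CULTURE_SEPARATORS[from_culture]
    cleaned = text.strip().replace(group, "")
    if decimal_sep != ".":
        cleaned = cleaned.replace(decimal_sep, ".")
    return Decimal(cleaned)


def percent_to_fraction(text, from_culture):
    """Parse text such as '12,5%' into the fraction it denotes (0.125)."""
    number = text.rstrip().rstrip("%")
    return to_invariant_number(number, from_culture) / Decimal(100)
```

For example, to_invariant_number("1.234,56", "de-DE") and to_invariant_number("1,234.56", "en-GB") both yield the same Decimal value, which can then be formatted for whichever culture the importing thread uses.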

Saturday, 6 June 2009

Why the version hasn’t changed

Changes like this are possible in Pyrrho because of the distinctive way that Pyrrho is designed. Nearly all DBMSs store data structures like indexes in permanent storage within the files making up the database. Pyrrho by contrast places in permanent storage only the data required for durability of each committed transaction. This approach reduces disk activity during normal operations to about one-seventieth of that of rival systems.
The server’s data structures are private to the server and are initialised when it starts up: the server brings its state up to date by re-reading the database file, which is just the transaction record. For a large database this is a time-consuming process, equivalent to a cold start with a resynchronisation step in other products.
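
The idea can be sketched with a toy example in Python. This is not how Pyrrho is implemented: the TinyLogDatabase class, its JSON-lines file layout and the operation format are all assumptions chosen for brevity. The file holds only committed transactions, while the row table and the index live purely in memory and are rebuilt by replaying the file at start-up.

```python
import json
import os


class TinyLogDatabase:
    """Toy store: the file is an append-only record of committed transactions."""

    def __init__(self, path):
        self.path = path
        self.rows = {}      # primary key -> row, held in memory only
        self.by_name = {}   # secondary index, held in memory only
        self._replay()

    def _apply(self, op):
        # Update the in-memory structures for one operation.
        old = self.rows.get(op["key"])
        if old is not None:
            self.by_name[old["name"]].discard(op["key"])
        if op["kind"] == "put":
            self.rows[op["key"]] = op["row"]
            self.by_name.setdefault(op["row"]["name"], set()).add(op["key"])
        elif op["kind"] == "delete":
            self.rows.pop(op["key"], None)

    def _replay(self):
        # Start-up: re-read every committed transaction, oldest first.
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                for op in json.loads(line):
                    self._apply(op)

    def commit(self, ops):
        # Durability: write one line per transaction before applying it.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(ops) + "\n")
            f.flush()
            os.fsync(f.fileno())
        for op in ops:
            self._apply(op)

    def find_by_name(self, name):
        return sorted(self.by_name.get(name, set()))
```

Because the index never reaches the disk, its in-memory layout could be changed freely between releases of such a program without touching existing files. The price is the replay at start-up, which grows with the length of the log.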
The format of this data is as far as possible version- and platform-independent: there is not even any assumption about what “double precision” means, or how many bits make up a long integer. This means that every aspect of the organisation of data structures can evolve without breaking backward compatibility. Database files created by version 0.1 can still be used in every version of Pyrrho, so all Pyrrho databases can always be managed by the latest version of the server. New features, such as the recently introduced concept of provenance, or URI-based data types, introduce new or modified data formats which would not be understood by earlier versions, but those earlier versions really should not be in use. Upgrading to the latest version is always recommended, and is free of charge.
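
The post does not describe Pyrrho's actual file format, so the following Python sketch is only an illustration of the general principle. It shows one way an integer can be stored without assuming any fixed word size, using a length byte followed by that many big-endian two's-complement bytes. The function names and the layout are assumptions for the example.

```python
def encode_integer(value):
    """Encode an integer as a length byte plus big-endian signed bytes."""
    # Enough whole bytes to hold the magnitude and a sign bit.
    length = max(1, (value.bit_length() + 8) // 8)
    if length > 255:
        raise ValueError("integer too large for a one-byte length prefix")
    return bytes([length]) + value.to_bytes(length, "big", signed=True)


def decode_integer(data):
    """Decode a value written by encode_integer."""
    length = data[0]
    return int.from_bytes(data[1:1 + length], "big", signed=True)
```

A reader of such a format needs no knowledge of whether the writer's platform used 32-bit or 64-bit integers, because the encoding carries its own size.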

Wednesday, 3 June 2009

Speeding up joins

The next version (still numbered 3.2) will have a new implementation of the matching code for joins. There will be situations where a significant performance improvement can be expected, for example where the join condition refers to a component of a primary or foreign key. In the source code this is called a TrivialJoin, but it is not necessarily quite so trivial where the join condition does not constrain all the components of the key...
I should be able to release this implementation this weekend following testing.
Update 4 June: FDJoin is a better term than TrivialJoin, so lots of things have been renamed accordingly. I've introduced a new internal datatype called Wild for dealing with partial matches, and there is better handling of recursive traversal of multilevel indexes. It's all looking rather nice.
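
For readers curious what a partial match against a multi-component key might look like, here is a small Python sketch. It is not taken from Pyrrho's source: the WILD sentinel, build_index and match are hypothetical names, and the nested-dictionary index is an assumption that stands in for a multilevel index. A pattern with a wildcard in some positions is matched by recursing through the index, one level per key component.

```python
class _Wild:
    """Sentinel meaning 'any value' for one key component (hypothetical)."""

    def __repr__(self):
        return "WILD"


WILD = _Wild()


def build_index(rows, key_columns):
    """Build a multilevel index: one nested dict level per key component."""
    root = {}
    for row in rows:
        node = root
        for col in key_columns[:-1]:
            node = node.setdefault(row[col], {})
        node.setdefault(row[key_columns[-1]], []).append(row)
    return root


def match(node, pattern):
    """Yield rows whose key agrees with pattern; WILD matches anything."""
    if not pattern:
        # All components consumed: node is the list of rows for this key.
        yield from node
        return
    head, rest = pattern[0], pattern[1:]
    if head is WILD:
        # Unconstrained component: descend into every child at this level.
        for child in node.values():
            yield from match(child, rest)
    elif head in node:
        # Constrained component: a single direct lookup.
        yield from match(node[head], rest)
```

With an index on (order, line), the pattern (1, WILD) finds every line of order 1 with one lookup and one scan of a small sub-level, whereas (WILD, 1) has to visit every order. This is the sense in which a join that constrains only some components of the key is less trivial than one that constrains them all.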
