Over on TAWQT.com, there is some discussion on when a change in data should result in a new version. Several DBMS provide optimistic concurrency based on row versioning, including DB2 and SQL Server. But the versioning offered by these systems is confined to the base table and does not consider related data in other tables. That is, it is tied to the physical database, and not the logical database. If the database schema is changed, the version semantics may well be quite different, and the application logic may need to be redesigned.
One of the examples on the TAWQT site is about students’ marks in a set of subjects. Under what circumstances is there a new version of the data for a student? Is it (A) only when the student’s name or address has changed, or (B) also when the subject information changes? If the data is held in 3NF relational tables, the normal answer would be (A), but if the data for each student is held in a document, or the subject information is in a column of type XML or ARRAY, the answer would be (B).
For a great many business applications, answer (B) would be preferable, and many businesses prefer non-relational designs for this sort of reason. If the DBMS only offers (A), then application programmers may need to develop some customised versioning logic. This is not very difficult, and is done in any case for DBMS’s that don’t offer versioning.
So it is natural to ask whether a change to row versioning semantics would be helpful. In this note I would like to explore the consequences of updating the version of a row if values are changed in that row or any row that references it. As an experiment this change has been introduced in Pyrrho 5.3’s built-in CHECK and VERSIONING properties.
The references being considered are of course confined to the database: we won’t know about changes that take place elsewhere. But in the example cited, if there are two relational tables STUDENT and SUBJECT then a change to a mark in a subject will give a new version to one row in each table.
In making this change I have of course considered that a database may be intended to work with many different applications, and this particular sort of versioning may suit some applications but not others. I am convinced however that the use of foreign keys in relational databases is so widespread that this is the correct default behaviour, correct in the sense that it would be useful in a great many more cases. Programmers can still do something different if they want.
Some will no doubt see this as a departure from common practice. But this is an area where practice is far from common, and the SQL standard is silent on the issue. It would be possible to have an intermediate implementation where the relevance of related tables is determined by the role assigned to the application, with some sort of metadata associated with the foreign key. But it seems unattractive.