Thursday, 8 September 2016

Improvements to the Object-Oriented API

For years now, Pyrrho has offered an object-oriented API for C#, and will generate a C# class definition for any base table for any given database and role. Tutorials offered demos of how these classes could be extended by adding in relational data, and over the years versioning was added so that all such C# classes are subclasses of Versioned.
The mechanisms have now been enhanced in the following important ways:
  1. Full class definitions are now available for Views and structured types as well as base tables,
  2. Deep structure is supported: Pyrrho will define nested classes for subjects in tables or views, and for user-defined types. (All such classes are Versioned too)
  3. The Versioned base class now includes a readCheck in addition to a row Check.
As discussed in this blog, for base tables, the row Check is a row version validator in the sense of Laiho and Laux's paper, and for other data may be an ETag in the sense of RFC 7232. In complex cases (e.g. results that do not correspond directly to base table rows) either or both may be empty.
The readCheck is for checking later whether the query used to obtain the data would get the same results.
Class definitions can be found in the "Role$Class" table.Such a class C can then be used with the database connection with the following methods added to the usual IDbConnection interface:
  • C FindOne()
  • C[] FindWith(string where)
  • C[] FindAll()
  • Post(C s)
  • Put(C s)
  • Delete(C s)
  • Check(string check)
  • Check(string check,string readCheck)
Triggers and generated columns apply, and Post and Put update affected fields of s along with the Versioned values.

Monday, 5 September 2016

ETags and Row version validation

Web resources often come with semantically useful information (starting with the URI), for example RDF [1] or OWL types [2]. HTTP now offers ETags (Fielding and Reschke, 2014, RFC7232) in responses to allow caching of results, and ETags can be used for validating a step in a transaction. ETags are very similar to the RVV concept[3] or MVCC [4], successfully used in PostGRES [5], SQL server [6] Oracle [7] etc to provide optimistic execution.  
The purposes of the two concepts, however, remain different, and become more clearly visible when we try to generalise their use to more complex situations such as composite databases, updatable views, joins, and general queries.
For Pyrrho in simple cases the ETag can coincide with the strong RVV validator (CHECK) for requests that return a row of a base table: so implementing this feature amounts to an extension of the Pyrrho API's Versioned mechanism (See also this discussion on REST in Spring).
In this blog I have previously suggested that RVV should be updated if related data changes, but I have been persuaded that the validator should just be for the contents of the row itself.
In that case, I want to extend the idea of RVV to updatable views and joins. In such objects, each row has a direct relationship with specific base table rows, so an RVV for the view will reflect changes to these rows.
It is a moot point whether the result of a query is a Web resource. It is possible that it contains nothing directly copied from any base table, but if an ETag was returned with the result it would contain the information that the result has not changed since the last time the it was computed.
Also, we should bget the same versioning information from a View that we would get with the associated SELECT statement.
So, consider the following:

(a) RVV

The RVV if present includes row version validators for any base table row whose values appear in the result row.
The RVV will be absent for a row that does not identify any single row in a contributing base table (e.g. some of the rows in select distinct or  group by). RVVs are computed for base tables using the Laiho/Laux algorithms, and indicate the transaction point that last modified the values of the row in the base table. For query results, an RVV is copied into a result row by the first contributing row and removed if more than one row contributes.
RVV can be used to identify a set of rows to be deleted or updated, and so have an immediate use in the API in an application-side object to identify corresponding individual base table row(s) in the database.

(b) ETag

The ETag if present indicates the latest transaction points of the values used to compute the values in the row.
The ETag will be absent if the row does not come from any database contents (e.g. the row consists of constants, current_time, current_user etc). There is an obvious algorithm for computing ETags from RVVs. For rows coming from base tables the ETag is the same as the RVV. When another row contributes, the ETag becomes the later transaction point.
ETags can be used for HTTP caching as described in RFC7232. However, we note that for complex queries the server would have to recompute the result to see if the ETag is valid.
If results come from more than one database, RVVs and ETags will be lists of values, at most one from each database.
From today, The CHECK pseudofunction in Pyrrho gives the RVV if defined for a row, and the ETag returned with a REST result is for the first result in the result set. If an ETag is supplied with a REST request, it is checked for validity.
The RVV is also used in the Pyrrho API, so that Find<C> and similar functions return Versioned objects that can be used with the API functions Put, Post, and Delete, as described in the Pyrrho manual.

[1] W3's RDF documentation
[2]W3's OWL documentation
[3 Laiho,M, Laux, F: (2010) Implementing Optimistic Concurrency Control for Persistence Middleware Using Row Version Verification
[4]MVCC
[5]Amit Kapila, (2015) Well-known Databases Use Different Approaches for MVCC
[6]Steve Howard (2012) SQL Server MVCC with read_committed_snapshot
[7]Oracle Database Concepts: Chapter 13 Data Concurrency and Consistency
[8]The concept of strong validator dates from sec 13 of the HTTP/1.1 definition (Fielding et al 1999)