Thursday, 23 September 2021

Implementation: instancing and framing

(Updated: 14th January 2022: Implementation continues)

Tables and table references

During traversal of a rowset, the value associated with a column name changes from one row to the next. This is managed in database implementations using the notion of a cursor. A cursor is a row value with the additional ability to advance to the values of the next row or step back to those of the previous row.
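
As a minimal sketch of the cursor idea (not Pyrrho's actual code), the following assumes a rowset held as a tuple of dictionaries and an immutable cursor, so that advancing returns a new cursor rather than changing the existing one. The names Cursor, first, next and previous are chosen here for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Cursor:
    """A row value that can also move to the neighbouring rows (hypothetical)."""
    rows: Tuple[dict, ...]   # the rowset being traversed
    pos: int                 # index of the current row

    @staticmethod
    def first(rows) -> Optional["Cursor"]:
        rows = tuple(rows)
        return Cursor(rows, 0) if rows else None

    def __getitem__(self, column: str):
        # The value for a column name depends on the current row.
        return self.rows[self.pos][column]

    def next(self) -> Optional["Cursor"]:
        # Returns None when there is no further row.
        if self.pos + 1 < len(self.rows):
            return Cursor(self.rows, self.pos + 1)
        return None

    def previous(self) -> Optional["Cursor"]:
        # Returns None when already at the first row.
        if self.pos > 0:
            return Cursor(self.rows, self.pos - 1)
        return None
```

A traversal is then a loop that starts from Cursor.first(rows) and repeatedly calls next() until it returns None.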

Column names are local to a table, or more precisely to a table reference, since a table and its columns may be referenced more than once in an SQL statement, and the different references are then distinguished using aliases. Different references to the same table thus have different properties, and may be subject to different where-conditions etc.

To manage this, it is useful to have a process of instancing, so that the shared table remains immutable, while its instances are placed in transient local storage (a heap). Thus the instance and its columns will be referenced using heap uids instead of names or defining positions.

Each table reference thus has its own instance on the heap, and instances for its columns. At the end of execution of the current SQL statement, the heap can be forgotten.
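
The instancing step described above might look roughly like this. It is only a sketch under assumptions: the class names Table, TableInstance and Heap, the starting uid of 1,000,000 and the shape of the heap dictionary are all invented for illustration and are not taken from Pyrrho.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Table:
    """The shared, immutable table definition (hypothetical shape)."""
    name: str
    columns: Tuple[str, ...]


@dataclass
class TableInstance:
    """One table reference: its own uid, alias, column uids and conditions."""
    uid: int
    table: Table
    alias: str
    column_uids: Dict[str, int]
    where: List[str] = field(default_factory=list)


class Heap:
    """Transient storage for one SQL statement; dropped when it finishes."""

    def __init__(self, start: int = 1_000_000):  # starting uid is an assumption
        self._uids = count(start)
        self.objects: Dict[int, object] = {}

    def instance(self, table: Table, alias: str) -> TableInstance:
        uid = next(self._uids)
        # Every column of this reference gets a fresh heap uid as well.
        column_uids = {c: next(self._uids) for c in table.columns}
        inst = TableInstance(uid, table, alias, column_uids)
        self.objects[uid] = inst
        for name, cuid in column_uids.items():
            self.objects[cuid] = (uid, name)
        return inst
```

For a self-join such as FROM emp a, emp b, calling heap.instance(emp, "a") and heap.instance(emp, "b") yields two instances with disjoint uids, each able to carry its own where-conditions, while the emp table itself is never modified.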

Compiled database objects

In standard SQL, several database object types (e.g. Procedure, Check, Trigger, View) define executable code. It is an ambition of Pyrrho V7 to compile such code once only: on database load, or on definition of a new compiled object. The compilation process creates many subsidiary objects (variable declarations, expressions, executable statements) in memory. These objects are then immutable and only replaced if the database object is altered (ALTER VIEW etc). In the next iteration (V7) of Pyrrho, these subsidiary objects have temporary uids and are stored in a field of the compiled object called framing.

When the compiled object is referenced during execution of an SQL statement, the objects in the framing need to be copied to the top of the heap. However, for efficiency reasons, even such a simple step should be carried out as few times as possible. Procedure code can be brought into the context just once when required. Trigger code, and constraints for columns and domains, should be brought into the context before rowset traversal begins.
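
The "bring in once" behaviour for procedure code can be sketched as follows. This is an illustration under assumptions, not the Pyrrho implementation: CompiledObject, Context and bring_in are hypothetical names, the framing is modelled as a read-only mapping from temporary uid to a string, and giving each copied object a new heap uid is one possible reading of "copied to the top of the heap".

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Dict, Mapping


@dataclass(frozen=True)
class CompiledObject:
    """A procedure, trigger, etc. with its compiled subsidiary objects."""
    name: str
    framing: Mapping[int, str]   # temporary uid -> subsidiary object


class Context:
    """Execution context for one SQL statement (hypothetical)."""

    def __init__(self, heap_top: int = 1000):   # starting position is an assumption
        self.heap: Dict[int, str] = {}
        self.top = heap_top
        self._loaded: Dict[str, Dict[int, int]] = {}

    def bring_in(self, obj: CompiledObject) -> Dict[int, int]:
        """Copy the framing to the top of the heap, at most once per object.

        Returns the mapping from temporary uid to heap uid.
        """
        if obj.name in self._loaded:
            return self._loaded[obj.name]       # already in the context
        mapping: Dict[int, int] = {}
        for temp_uid, item in obj.framing.items():
            mapping[temp_uid] = self.top
            self.heap[self.top] = item
            self.top += 1
        self._loaded[obj.name] = mapping
        return mapping
```

With this shape, a caller would bring in trigger and constraint code before starting the rowset traversal, so that nothing is copied inside the row loop.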

When views are used in a query, view references need to be instanced in the same way as table references: an instance must be created for each reference, so that it can be equipped with its own where-conditions, grouping specifications etc.

Prepared Statements

Prepared statements are local to a connection, have query parameters, and can be handled in much the same way as compiled statements. An instance of a prepared statement has the formal parameters replaced with actual values before execution begins.

The storage used for prepared statements is shared with successive transactions on the connection, and does not need to be saved in the database.
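
A small sketch of this arrangement, again under assumptions: Prepared, Connection and their methods are hypothetical names, and binding is modelled simply as a dictionary from formal parameter name to actual value.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Prepared:
    """A prepared statement: SQL text plus the names of its formal parameters."""
    sql: str
    formals: Tuple[str, ...]

    def instance(self, *actuals) -> Dict[str, object]:
        # Replace each formal parameter with an actual value before execution.
        if len(actuals) != len(self.formals):
            raise ValueError("expected %d parameters, got %d"
                             % (len(self.formals), len(actuals)))
        return dict(zip(self.formals, actuals))


class Connection:
    """Holds prepared statements for the lifetime of the connection."""

    def __init__(self):
        self.prepared: Dict[str, Prepared] = {}   # survives transactions
        self.heap: Dict[int, object] = {}         # transient, per statement

    def prepare(self, name: str, sql: str, formals) -> Prepared:
        stmt = Prepared(sql, tuple(formals))
        self.prepared[name] = stmt
        return stmt

    def commit(self) -> None:
        # The transient heap is forgotten; prepared statements are kept in
        # connection storage and nothing about them is written to the database.
        self.heap = {}
```

After conn.prepare("q", "select * from emp where id = ?", ["id"]) the statement remains available across any number of commit() calls, and conn.prepared["q"].instance(42) gives the bindings for one execution.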


