Saturday, 9 July 2011

Reflection and Relationships

Relational databases need to provide support two-way relationships especially now that such relationships are so easily navigated using REST-style URLs. This article proposes a simple approach to defining path navigation in the relational database at physical or conceptual level. The resulting generated columns (or properties) are of type ROW ARRAY and are quite difficult to use in SQL, but very easy from REST.
Interestingly the Java Persistence Query Language provides query language support for such two-way relationships, but required a lot of annotation infrastructure.
For example (inspired by the Java Tutorial), suppose we have tables
player: (pid int primary key, pname char)
team: (tid int primary key, tname char, capt int references player)
playerteam: (pid references player, tid references team)
This club has a number of teams for different purposes, and players can belong to (or even captain) more than one team. Then the syntax proposed here would allow us to write definitions as simple as
alter table player add captains reflects team
alter table player add playsin reflects playerteam
Then in the REST interface we can write
http://myclub.org/db/db/player(pname=’Fred Bloggs’)/captains/tname
to obtain the names of all the teams Fred captains.
The syntax is one of the alternatives for ColumnDefinition, and defines a generated-always column:
| id REFLECTS [Cols IN] Table_id [Cols]
Recall that ColumnDefinition is also used to define a property in a role-based data model, and of course in that context entity and property identifiers can be used in place of table and column identifiers.
The REFLECTS definition sets up an array-valued generated column as a reverse reference: e.g. if t.f references g(i) then “alter g add h reflects t” is a shorthand for “alter g add h row t array generated always as (select * from t where f=i)”. That is, each cell in the generated column contains a list of rows from the referenced table.
The optional [Cols] following the referenced Table_id allows selection of a particular foreign key if there is more than one.
For many-many relationships the referenced table will be a secondary table, so that the type of the generated column is defined only implicitly. Still the basic idea is simple enough. Suppose we have a many-many relationship R between tables A and B. In a relational database this will be implemented by means of a secondary table AB consisting of pairs of primary keys for A and B. In both A and B we can easily define additional generated columns for our convenience (e.g. for REST navigation):
alter table A add X reflects AB
alter table B add Y reflects AB
Then in each row of A, X will be a list of rows from table B; and in each row of B, Y will be a list of rows from table A. In practice we would probably name these properties B and A rather than X and Y to make the resulting REST-based navigation more intuitive.
Again, the optional column specification [Cols IN] allows selection of a particular foreign key in the secondary table. In the above player and team example we could define
alter table player add captains reflects team(capt)
alter table team add players reflects (pid) in playerteam(tid)
The full details of the [Cols IN] option are as follows. [Cols IN] is only permitted in relationships that navigate a secondary table. In a definition such as “alter S add C reflects (TF) in MM (SF)”, MM will be a secondary table implementing a many-many relationship between table S and some other table T, containing (at least) foreign keys SF and F where TF references the target table T and SF references the source table S. Then the values of C will be lists of rows of T.
The final [Cols] syntax allows selection of a foreign key that references the source table, and can be used in any REFLECTS construct.

No comments:

Post a Comment