Monday 12 January 2015

The MongoDB service

Many companies now use MongoDB documents alongside traditional SQL databases. But it is a very bad idea to require connections to different DBMSs in a single application, since transactional consistency across them becomes even more difficult, if not impossible. Pyrrho already allows a single connection to involve several databases, and MongoDB has a similar facility. But if an application is going to work with both database types, it is really essential to have a single connection to both the SQL and NoSQL databases. Pyrrho has added support for MongoDB documents (and access to these documents via SQL), at least for academic interest, by making DOCUMENT (and DOCARRAY) primitive types, in the same way that SQL added support for XML. Pyrrho’s SQL syntax in chapter 7 of this manual allows {} to delimit JSON documents (and, where the context allows, [ ] to delimit JSON arrays). Within such documents the Mongo operators have their usual meanings.

By default Pyrrho starts up a MongoDB-like wire protocol service on port 27017, so that the mongo shell can work with Pyrrho although the database file formats are completely different.
In application programming where SQL tables and documents are both needed, it is better to use the above SQL extensions on a single TCP connection rather than opening a second connection on port 27017. The mongo shell’s document collections are implemented as database tables with a single column “doc” of type DOCUMENT, although SQL can access more general layouts. Comparison of documents has been hacked so that a WHERE “doc”= condition in SQL behaves similarly to a Mongo find() (the resulting concept of equality of documents is admittedly strange, but it works well).
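
To give a feel for find()-style matching, here is a minimal Python sketch. It is not Pyrrho's comparison code: the function name matches is hypothetical, and the sketch assumes only the simplest Mongo case, where a plain query document matches any stored document that has equal values for every field the query names. Operators, nested documents and arrays are ignored.

```python
def matches(doc, query):
    """Return True if every field named in query has an equal value in doc.

    Assumption: top-level equality only, modelled loosely on a plain
    Mongo find() query document; this is not Pyrrho's implementation.
    """
    return all(field in doc and doc[field] == value
               for field, value in query.items())


stored = {"product": "coffee", "status": "E", "quantity": 1}

print(matches(stored, {"product": "coffee"}))  # query names a subset of the fields
print(matches(stored, {"product": "tea"}))     # value differs
print(matches(stored, {}))                     # empty query matches everything
```

Under this model "equality" is not symmetric: {product:"coffee"} matches the stored order, but the stored order used as a query would not match a document containing only the product field.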

When working with such mixed systems the biggest surprise is around case sensitivity. JSON collection and field names are case-sensitive, while SQL by default converts unquoted identifiers to upper case.
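
The folding rule can be modelled in a few lines of Python. This is only an illustration of the behaviour described above (the helper fold_identifier is a made-up name, not part of Pyrrho), but it shows why a collection created as orders from the mongo shell has to be written "orders" in SQL.

```python
def fold_identifier(name):
    """Model SQL identifier handling: double-quoted names keep their case,
    unquoted names are converted to upper case."""
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        return name[1:-1]
    return name.upper()


print(fold_identifier('orders'))    # unquoted: folded to upper case
print(fold_identifier('"orders"'))  # quoted: case preserved
```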

The MongoDB documentation for update contains the following example:
{
  $set: { status: "D" },
  $inc: { quantity: 2 }
}
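
As a rough illustration of what these two operators do to a document, here is a small Python sketch. The function apply_update is hypothetical and handles only top-level $set and $inc fields; it is a model of the standard Mongo semantics, not the code used by either MongoDB or Pyrrho.

```python
import copy


def apply_update(doc, update):
    """Return a copy of doc with $set and $inc applied to top-level fields."""
    result = copy.deepcopy(doc)
    # $set overwrites (or creates) each named field.
    for field, value in update.get("$set", {}).items():
        result[field] = value
    # $inc adds to each named field, treating a missing field as 0.
    for field, amount in update.get("$inc", {}).items():
        result[field] = result.get(field, 0) + amount
    return result


order = {"product": "coffee", "status": "E", "quantity": 1}
updated = apply_update(order, {"$set": {"status": "D"}, "$inc": {"quantity": 2}})
print(updated)
```

Starting from status "E" and quantity 1, the sketch yields status "D" and quantity 3, leaving the original dictionary unchanged.
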
Here are complete examples for updates using Pyrrho DB. The first uses the mongo shell to access Pyrrho, the second, the Pyrrhocmd client:
C:\Program Files\MongoDB 2.6 Standard\bin>mongo
MongoDB shell version: 2.6.6
connecting to: test
:0> db.runCommand({insert:"orders",inserts: [{product:"coffee",status:"E",quantity:1}]
})
{ "ok" : true, "n" : 1 }
:0> db.orders.find()
{ "product" : "coffee", "status" : "E", "quantity" : 1, "_id" : ObjectId("0000000039150006003d1909") }
:0> db.runCommand({update:"orders",updates:
[{query: {product:"coffee"},update:{$set:{status:"D"},$inc:{quantity:2}}}]
})
{ "ok" : true, "n" : 1 }
:0> db.orders.find()
{ "product" : "coffee", "status" : "D", "quantity" : 3, "_id" : ObjectId("0000000039150006003d1909") }
:0>

Microsoft Windows [Version 6.3.9600]
(c) 2013 Microsoft Corporation. All rights reserved.

C:\PyrrhoDB\OSP\OSP>pyrrhocmd test
SQL> update "orders" set "doc"={$set: {status:"G"},$inc: {quantity: 4}} where "doc"={product:"coffee"}
1 records affected

SQL> table "orders"
|--------------------------------------------------------------------------------------|
|doc                                                                                   |
|--------------------------------------------------------------------------------------|
|{"product": "coffee", "status": "G", "quantity": 7, "_id": "000000003915000400752379"}|
|--------------------------------------------------------------------------------------|
SQL>



Update May 2015

Development of this service continues disappointingly slowly. It does not work well with MongoDB 3+ tools, and work is effectively suspended until MongoDB releases more documentation of their meta protocol.

Wednesday 7 January 2015

Version 5.2 Beta

Since early April 2014 I have been working on this new version of Pyrrho DBMS. The open source edition of this version is available today in beta form for download, together with a set of unit tests.
The new version retains the same SQL language and file format as previous versions, so it should be fully backwards compatible with databases created by them.
But the query processing system has been completely rewritten to be more robust for supporting developments such as MongoDB, and enhancements planned for the HTTP/REST service. I will start working on these soon.
The comments in the code also need to be completed and revised.
The new version is better able to process some standard SQL syntax (I give some examples below), and also some innovations such as the following:
  • FROM STATIC: similar to the use of "DUAL" in other DBMS, this helps meet the requirement for the FROM keyword in the SQL standard while allowing the evaluation of expressions, e.g. SELECT SESSION_ROLE FROM STATIC.
  • The ability to index on fields within structured types, e.g.
create type mrw as (c int)
create table e(f mrw, primary key(f.c))
  • The ability to rename database objects with automatic consequential modification to stored procedures, constraints, views etc that reference the modified objects.
One of the main motivations for the new version was to get triggers working fully according to the SQL 2011 standard. Example:

create trigger sdai instead of delete on a referencing old table as ot for each statement begin atomic insert into c (select b,c from ot) end
create trigger riab before insert on a referencing new as nr for each row begin atomic set nr.c=nr.b+3; update b set tot=tot+nr.b end
create trigger ruab before update on a referencing old as mr new as nr for each row begin atomic update b set tot=tot-mr.b+nr.b; set nr.d='changed' end

Other examples from the test suite:
select bb,dd from aa order by bb+cc
select a[b] from (select array(bb,cc,dd) as a,ee as b from aa)
select array(select bb from aa order by cc)[1] from static


Some examples withdrawn from the test suite include

select bb as a,(select max(bb) from aa t where t.cc>a) from aa
insert into d (select 17,(select a from b where c=1) from static)
These look like SQL, but I don't think they are valid. In the first, left-to-right processing of SQL as required by the standard means the type of bb is unknown until we reach "from aa". In the second I believe SQL requires the column names to be correct for INSERT SELECT: the test could be improved by inserting the VALUES keyword. I'll be interested in comments about these tests.