Thursday, 19 July 2018

RESTViews: the "Sierra Leone" example

A previous post on this blog (More details about RESTViews) showed the range of HTTP queries that could be automatically generated by Pyrrho from a simple View definition. This post takes extends the examples, using a set of sample MySQL databases constructed for the purpose by Fritz Laux.
A number of steps are required to set up the MYSQL databases and allow remote SELECT connections to the view, so these will not detailed here. The important point is that once the remote views have been defined to a Pyrrho database, a large set of SQL queries (with filters, aggregations and joins to other local tables) are supported using automatic creation of the appropriate SQL requests to the remote MySQL databases. The local Pyrrho database is not modified during any of these tests as none of the views are materialised and no schema changes are made. The data requested (volume of data transferred from the remote databases) is minimised as shown in these blog posts. Full technical details of the mechanism are given in the Pyrrho distribution at http://pyrrhodb.com, which contains a document "The Sierra Leone example" giving more details than are given here.
In this example, the remote view is defined in Pyrrho as follows (the syntax uses an SQL extension that Pyrrho calls RESTView):
 [create view patientdataSL of ("hospID" char,"hospDesc" char,"patientID" int,"rCode" int,"address" char, "birthdate" date,"admission" date,"ICD10" char,"diagnosis" char,"treatments" char) as get using hospitalssl]
where hospitalssl is the following small table:
hospIDhospDescurl
'CMH''Choithram Memorial Hospital''http://root:admin@localhost:8078/sleone/patientrecordcmh'
'HS''Holy Spirit Hospital''http://root:admin@localhost:8078/sleone/patientrecordholyspirit'
'MABESSS''St John of God Catholic (Mabesseneh) Hospital''http://root:admin@localhost:8078/sleone/patientrecordmabesss'
The first column is a key to identify the remote contributor database, at the url shown.
The remote contributor databases supply views for the given credentials containing the remaining columns in the above view definition. All three databases are in a MySQL database on localhost, and of course in a real example they would be on different hosts. The HTTP listener on port 8078 is the HTTP interface restif available at GitHub - this interface needs to run on the same server as the MySQL database but has no privileges or configuration of its own. It simply accepts one or more SQL statements from and returns SQL data in Json format.
With this setup, no further programming is required to access data from the remote views, as Pyrrho generates any MySQL queries that may be required. Further local views can be defined as indicated below and the code below includesjoins aith a local table called deomgraphicSL.
Consider the following SQL statements and the corresponding SQL queries sent to the remote contributors (Pyrrho request is in black, MySQL generated query for one remote hospital in blue):

[select * from patientdataSL where "patientID" = 10015]
select 'CMH' as hospID,'Choithram Memorial Hospital' as hospDesc,patientID,rCode,address,birthdate,admission,ICD10,diagnosis,treatments from patientrecordcmh where patientID=10015

select count("patientID") from patientdataSL
select COUNT(patientID) as C_298 from patientrecordcmh
This request of course returns a single row from each hospital with its number of patients and Pyrrho gives the total of these counts as 30003.

select count(*) from patientdatasl where icd10='N73.4'
select COUNT(*) as C_323 from patientrecordcmh where ICD10='N73.4'
This time the count is 3.

select "patientID" from patientdataSL where icd10='N73.4'
select patientID from patientrecordcmh where ICD10='N73.4'
Three patientIDs are returned 0 from the first hospital, 2 from the second, and 1 from the third.

[select count( *) as "numPatients", "rCode", count( *)*1.0 / "totalinhabitants" as "percentageOfPopulation" from patientdataSL natural join demographicSL where "rCode" < 1010 group by "rCode","totalinhabitants" order by "percentageOfPopulation" desc]
select COUNT(*) as C_275,rCode from patientrecordcmh group by rCode having rCode < 1010
This fetches 9 rows from each hospital, and results in a table of 9 rows sorted by proportionOfPopulation.

[create view patientSummary as select "rCode", extract(year from "admission")-extract(year from "birthdate") as "age", extract(year from "admission") as "adyear", icd10, count(*) as patients from patientDataSL group by "rCode", "age", "adyear", icd10]
select * from patientsummary where icd10='N73.4'
select rCode as rCode,(YEAR(admission)-YEAR(birthdate)) as age,(YEAR(admission) as adyear,ICD10 as ICD10,COUNT(*) as C_856 from patientrecordcmh group by rCode,age,adyear,ICD10 having ICD10='N73.4'
This gives the rCode, age and admission year for the three patients mentioned above.

create view patientAnalysis as select * from patientSummary natural join demographicSL
select * from patientanalysis where icd10='N73.4'
select rCode as rCode,(YEAR(admission)-YEAR(birthdate)) as age,(YEAR(admission) as adyear,ICD10 as ICD10,COUNT(*) as C_903 from patientrecordcmh group by rCode,age,adyear,ICD10 having ICD10='N73.4'

[select "rCode", icd10, "age", patients*100.0/"under10" as percentage from patientAnalysis where ICD10='N17.1' and "age"<10]
select rCode as rCode,(YEAR(admission)-YEAR(birthdate)) as age,(YEAR(admission) as adyear,ICD10 as ICD10,COUNT(*) as C_972 from patientrecordcmh group by rCode,age,adyear,ICD10 having ICD10='N73.4'
These adds to the previous details some demographic information about the regions involved. The generated SQL is similar (Pyrrho currently does not pass on the condition on the calculated field age).

[select coalesce(young.icd10,older.icd10) as "icd10", youngPat, oldPat from (select icd10, sum(patients) as youngPat from patientsummary where "age"<10 and="" icd10="">'Z87' group by icd10) young full join (select icd10, sum(patients) as oldPat from patientsummary where "age">50 and icd10>'Z87' group by icd10) older on young.icd10=older.icd10 order by youngPat desc]
select rCode as rCode,(YEAR(admission)-YEAR(birthdate)) as age,YEAR(admission) as adyear,ICD10 as ICD10,COUNT(*) as C_253 from patientrecordcmh group by rCode,age,adyear,ICD10 having ICD10>'Z87'
Each half of this join retrieves 100, 97, and 70 records from the three hospitals, and groups the results into a table with 56 rows. Also the descending order is honoured. The conversions and processing are very fast.