Types and data for the examples
Let us consider the following types representing a bibliography:
type Biblio = <bibliography>[Heading Paper*]
type Heading = <heading>[ PCDATA ]
type Paper = <paper>[ Author+ Title Conference File ]
type Author = <author>[ PCDATA ]
type Title = <title>[ PCDATA ]
type Conference = <conference>[ PCDATA ]
type File = <file>[ PCDATA ]
and some values
let bib : Biblio =
<bibliography>[
<heading>"Alain Frisch's bibliography"
<paper>[
<author>"Alain Frisch"
<author>"Giuseppe Castagna"
<author>"Veronique Benzaken"
<title>"Semantic subtyping"
<conference>"LICS 02"
<file>"semsub.ps.gz"
]
<paper>[
<author>"Mariangiola Dezani-Ciancaglini"
<author>"Alain Frisch"
<author>"Elio Giovannetti"
<author>"Yoko Motohama"
<title>"The Relevance of Semantic Subtyping"
<conference>"ITRS'02"
<file>"itrs02.ps.gz"
]
<paper>[
<author>"Veronique Benzaken"
<author>"Giuseppe Castagna"
<author>"Alain Frisch"
<title>"CDuce: a white-paper"
<conference>"PLANX-02"
<file>"planx.ps.gz"
]
]
Projections
All titles in the bibliography bib:
let titles = [bib]/<paper>_/<title>_
which yields:
val titles : [ <title>[ Char* ]* ] = [ <title>[ 'Semantic subtyping' ]
<title>[ 'The Relevance of Semantic Subtyping' ]
<title>[ 'CDuce: a white-paper' ]
]
Ok.
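To make the reading of a projection concrete, here is a rough Python model (not CDuce code) of what [bib]/<paper>_/<title>_ computes. It assumes a hypothetical encoding of elements as (tag, children) tuples and an abridged copy of bib; the helper name project is invented for this sketch.

```python
# Python model of a CDuce projection. Elements are encoded as
# (tag, children) tuples; this encoding is an assumption of the sketch.

def project(elements, tag):
    """Model of [e]/<tag>_ : collect the children carrying the given tag."""
    return [
        child
        for (_, children) in elements
        for child in children
        if isinstance(child, tuple) and child[0] == tag
    ]

# Abridged version of the bibliography used in the text.
bib = ("bibliography", [
    ("heading", ["Alain Frisch's bibliography"]),
    ("paper", [("author", ["Alain Frisch"]),
               ("title", ["Semantic subtyping"])]),
    ("paper", [("author", ["Veronique Benzaken"]),
               ("title", ["CDuce: a white-paper"])]),
])

# Two chained projections, as in [bib]/<paper>_/<title>_
titles = project(project([bib], "paper"), "title")
print(titles)
```

Each projection step looks one level down and flattens the result into a single sequence, which is why the steps can be chained.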
All authors in the bibliography bib
let authors = [bib]/<paper>_/<author>_
Yielding the result:
val authors : [ <author>[ Char* ]* ] = [ <author>[ 'Alain Frisch' ]
<author>[ 'Giuseppe Castagna' ]
<author>[ 'Veronique Benzaken' ]
<author>[ 'Mariangiola Dezani-Ciancaglini' ]
<author>[ 'Alain Frisch' ]
<author>[ 'Elio Giovannetti' ]
<author>[ 'Yoko Motohama' ]
<author>[ 'Veronique Benzaken' ]
<author>[ 'Giuseppe Castagna' ]
<author>[ 'Alain Frisch' ]
]
Ok.
All papers in the bibliography bib
let papers = [bib]/<paper>_
Yielding:
val papers : [ <paper>[ Author+ Title Conference File ]* ] = [ <paper>[
<author>[ 'Alain Frisch' ]
<author>[ 'Giuseppe Castagna' ]
<author>[ 'Veronique Benzaken' ]
<title>[ 'Semantic subtyping' ]
<conference>[ 'LICS 02' ]
<file>[ 'semsub.ps.gz' ]
]
<paper>[
<author>[ 'Mariangiola Dezani-Ciancaglini' ]
<author>[ 'Alain Frisch' ]
<author>[ 'Elio Giovannetti' ]
<author>[ 'Yoko Motohama' ]
<title>[ 'The Relevance of Semantic Subtyping' ]
<conference>[ 'ITRS\'02' ]
<file>[ 'itrs02.ps.gz' ]
]
<paper>[
<author>[ 'Veronique Benzaken' ]
<author>[ 'Giuseppe Castagna' ]
<author>[ 'Alain Frisch' ]
<title>[ 'CDuce: a white-paper' ]
<conference>[ 'PLANX-02' ]
<file>[ 'planx.ps.gz' ]
]
]
Ok.
Select_from_where
The same queries we wrote above can of course be programmed with the select_from_where construction.
All the titles:
let tquery = select y
from x in [bib]/<paper>_ ,
y in [x]/<title>_
This query is written in an XQuery-like style that relies largely on projections. Note that x and y are CDuce patterns. The result is:
val tquery : [ <title>[ Char* ]* ] = [ <title>[ 'Semantic subtyping' ]
<title>[ 'The Relevance of Semantic Subtyping' ]
<title>[ 'CDuce: a white-paper' ]
]
Now let us program the same query with the translation given previously, thus eliminating the y variable:
let withouty = flatten(select [x] from x in [bib]/<paper>_/<title>_)
Yielding:
val withouty : [ <title>[ Char* ]* ] = [ <title>[ 'Semantic subtyping' ]
<title>[ 'The Relevance of Semantic Subtyping' ]
<title>[ 'CDuce: a white-paper' ]
]
- : [ <title>[ Char* ]* ] = [ <title>[ 'The Relevance of Semantic Subtyping' ] ]
- : [ <title>[ Char* ]* ] = [ <title>[ 'The Relevance of Semantic Subtyping' ] ]
Ok.
But select_from_where expressions are more likely to be used for
more complex queries, such as the one that selects the titles of all papers with at least one
author who is "Alain Frisch" or "Veronique Benzaken":
let sel = select y
from x in [bib]/<paper>_ ,
y in [x]/<title>_,
z in [x]/<author>_
where z = <author>"Alain Frisch" or z = <author>"Veronique Benzaken"
Which yields:
val sel : [ <title>[ Char* ]* ] = [ <title>[ 'Semantic subtyping' ]
<title>[ 'Semantic subtyping' ]
<title>[ 'The Relevance of Semantic Subtyping' ]
<title>[ 'CDuce: a white-paper' ]
<title>[ 'CDuce: a white-paper' ]
]
Ok.
Note that the corresponding semantics, as in SQL, is a multiset one: a title is returned once for each author that satisfies the where clause.
Thus duplicates are not eliminated. To discard them, one has to use the distinct_values operator.
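The multiset behaviour can be reproduced with a small Python model (not CDuce code). The dictionaries below are a hypothetical encoding of the three papers in bib, and the order-preserving deduplication only stands in for the idea of removing duplicates; it is an assumption, not a description of how distinct_values orders its result.

```python
# Python model of the multiset semantics of the query above.
papers = [
    {"title": "Semantic subtyping",
     "authors": ["Alain Frisch", "Giuseppe Castagna", "Veronique Benzaken"]},
    {"title": "The Relevance of Semantic Subtyping",
     "authors": ["Mariangiola Dezani-Ciancaglini", "Alain Frisch",
                 "Elio Giovannetti", "Yoko Motohama"]},
    {"title": "CDuce: a white-paper",
     "authors": ["Veronique Benzaken", "Giuseppe Castagna", "Alain Frisch"]},
]
wanted = {"Alain Frisch", "Veronique Benzaken"}

# One result per (paper, matching author) pair, as in the from/where clauses.
sel = [p["title"] for p in papers for z in p["authors"] if z in wanted]

# Dropping duplicates while keeping first occurrences (assumed ordering).
unique = list(dict.fromkeys(sel))

print(sel)
print(unique)
```

A paper with two matching authors contributes its title twice to sel, which matches the five-element result shown above.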
A pure pattern example
This example computes the same result as the previous query except that
duplicates are eliminated. It is written in a pure pattern form (i.e., without
any XPath-like projections)
let sel = select t
from <_>[(x::<paper>_ | _ )*] in [bib],
<_>[ _* (<author>"Alain Frisch" | <author>"Veronique Benzaken") _* (t&<title>_ ); _] in x
Note the pattern on the second line of the from clause. As the type of an element in x is <paper>[ Author+ Title Conference File ], we first skip the tag with <_>, then skip authors with _* until we find either Alain Frisch or Veronique Benzaken (<author>"Alain Frisch" | <author>"Veronique Benzaken"), then skip the remaining authors with _*, then capture the corresponding title (t&<title>_), and finally ignore the tail of the sequence by writing ; _
Result:
val sel : [ <title>[ Char* ]* ] = [ <title>[ 'Semantic subtyping' ]
<title>[ 'The Relevance of Semantic Subtyping' ]
<title>[ 'CDuce: a white-paper' ]
]
Ok.
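The way the sequence pattern walks through a paper can be approximated by the following Python sketch (not CDuce code). It assumes each paper is encoded as a list of (tag, text) pairs; the names WANTED and match_paper, and the last paper in the sample data, are hypothetical.

```python
# Python model of the sequence pattern
#   _* (wanted author) _* (t & <title>_) ; _
WANTED = {"Alain Frisch", "Veronique Benzaken"}

def match_paper(children):
    """Return the title if a wanted author occurs before it, else None."""
    seen_wanted = False
    for tag, text in children:
        if tag == "author" and text in WANTED:
            seen_wanted = True          # the alternation matched
        elif tag == "title":
            # capture the title only if the author part matched earlier
            return text if seen_wanted else None
    return None

papers = [
    [("author", "Alain Frisch"), ("author", "Veronique Benzaken"),
     ("title", "Semantic subtyping"), ("conference", "LICS 02")],
    # Hypothetical paper with no wanted author: the pattern fails on it.
    [("author", "Some Other Author"), ("title", "Unrelated paper")],
]

sel = [t for t in map(match_paper, papers) if t is not None]
print(sel)
```

In this model each paper contributes at most one title, however many of its authors match, which is consistent with the duplicate-free result above.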
This pure pattern form of the query generally yields better performance than
the same query written in an XQuery-like programming style. However, the query
optimiser automatically translates the latter into a pure pattern one.
Joins
This example is an exact transcription of query Q5 of the XQuery use cases.
We first give the corresponding CDuce types; creating the corresponding values is left to the user.
type Bib = <bib>[Book*]
type Book = <book year=String>[Title (Author+ | Editor+ ) Publisher Price]
type Author = <author>[Last First]
type Editor = <editor>[Last First Affiliation]
type Title = <title>[PCDATA]
type Last = <last>[PCDATA]
type First = <first>[PCDATA]
type Affiliation = <affiliation>[PCDATA]
type Publisher = <publisher>[PCDATA]
type Price = <price>[PCDATA]
The queries are expressed first in an XQuery-like style, then in a pure pattern style: the first pattern-based query is the one produced by the automatic translation of the XQuery-style one. The last query corresponds to a pattern-aware programmer's version.
XQuery style
<books-with-prices>
select <book-with-price>[t1
<price-amazon>([p2]/_) <price-bn>([p1]/_)]
from b in [biblio]/Book ,
t1 in [b]/Title,
e in [amazon]/Entry,
t2 in [e]/Title,
p2 in [e]/Price,
p1 in [b]/Price
where t1=t2
Automatic translation of the previous query into a pure pattern (thus more efficient) one
<books-with-prices>
select <book-with-price>[t1 <price-amazon>x11 <price-bn>x10 ]
from <_>[(x3::Book|_)*] in [biblio],
<_>[(x9::Price|x5::Title|_)*] in x3,
t1 in x5,
<_>[(x6::Entry|_)*] in [amazon],
<_>[(x7::Title|x8::Price|_)*] in x6,
t2 in x7,
<_>[(x10::_)*] in x9,
<_>[(x11::_)*] in x8
where t1=t2
Pattern-aware programmer's version of the same query (hence hand-optimised).
This version of the query is very efficient. Be aware of patterns.
<books-with-prices>
select <book-with-price>[t2 <price-amazon>p2 <price-bn>p1]
from <bib>[b::Book*] in [biblio],
<book>[t1&Title _* <price>p1] in b,
<reviews>[e::Entry*] in [amazon],
<entry>[t2&Title <price>p2 ;_] in e
where t1=t2
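What the three versions of the join compute can be modelled in Python as a nested-loop join on titles (a sketch, not CDuce code). The records, titles and prices below are made up for illustration, and join_prices is a hypothetical helper name.

```python
# Python model of the join on titles. All data here is illustrative only.
biblio = [
    {"title": "Book A", "price": "30.00"},
    {"title": "Book B", "price": "45.50"},
]
amazon = [
    {"title": "Book B", "price": "41.00"},
    {"title": "Book C", "price": "12.00"},
    {"title": "Book A", "price": "30.00"},
]

def join_prices(biblio, amazon):
    """Nested-loop join on title, following the order of the from clauses."""
    return [
        {"title": b["title"],
         "price-amazon": e["price"],
         "price-bn": b["price"]}
        for b in biblio
        for e in amazon
        if b["title"] == e["title"]      # the where t1=t2 condition
    ]

books_with_prices = join_prices(biblio, amazon)
print(books_with_prices)
```

Books present in only one of the two sources (Book C here) do not appear in the result, as expected from an equality join.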
More complex Queries: on the power of patterns
<bib>
select <book (a)> x
from <book (a)>[ (x::(Any\Editor)|_ )* ] in bib
This expression returns all the books in bib, but with the editor elements removed.
To write this more explicitly:
select <book (a)> x
from <book (a)>[ (x::(Any\<editor>_)|_ )* ] in bib
Or even:
select <book (a)> x
from <book (a)>[ (x::(<(_\`editor)>_)|_ )* ] in bib
Back to the first one:
<bib>
select <book (a)> x
from <(book) (a)>[ (x::(Any\Editor)|_ )* ] in bib
This query takes any element in bib, transforms it into a book element, and
removes its editor sub-elements (but you will get a warning, as the capture variable book in the from clause is never used).
select <(book) (a)> x
from <(book) (a)>[ (x::(Any\Editor)|_ )* ] in bib
Same thing, but without transforming the tag to "book".
More interestingly:
select <(b) (a\id)> x
from <(b) (a)>[ (x::(Any\Editor)|_ )* ] in bib
This removes the "id" attribute (if any) from the attributes of each element in bib.
select <(b) (a\id+{bing=a.id})> x
from <(b) (a)>[ (x::(Any\Editor)|_ )* ] in bib
This changes the attribute id=x into bing=x.
However, one must be sure that each element in bib has an "id" attribute;
if this is not the case, the expression is ill-typed. To perform this only for those elements that certainly have an "id" attribute, write:
select <(b) (a\id+{bing=a.id})> x
from <(b) (a&{id=_})>[ (x::(Any\Editor)|_ )* ] in bib
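The attribute manipulation can be pictured with Python dictionaries (a sketch, not CDuce code). The function rename_id and the sample attribute records are hypothetical. Note one difference: where CDuce rejects the unguarded query statically as ill-typed, this model would only fail at run time with a KeyError.

```python
# Python model of the record expression  a\id + {bing=a.id}
def rename_id(attrs):
    # a\id : the attributes without the "id" field
    rest = {k: v for k, v in attrs.items() if k != "id"}
    # + {bing=a.id} : re-add the old value under the name "bing"
    rest["bing"] = attrs["id"]   # KeyError if "id" is absent
    return rest

# Hypothetical attribute records for three elements.
elements = [
    {"id": "b1", "year": "1994"},
    {"year": "2000"},            # no "id": skipped by the guard below
    {"id": "b3"},
]

# The condition models the pattern a&{id=_}, which keeps only
# elements that have an "id" attribute.
result = [rename_id(a) for a in elements if "id" in a]
print(result)
```

The guard plays the role of the refined pattern in the last query: elements without an "id" attribute are simply not selected.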
An unorthodox query: Formatted table generation
The following program generates a 10x10 multiplication table:
let bg ((Int , Int) -> String)
(y, x) -> if (x mod 2 + y mod 2 <= 0) then "lightgreen"
else if (y mod 2 <= 0) then "yellow"
else if (x mod 2 <= 0) then "lightblue"
else "white";;
<table border="1">
select <tr> select <td align="right" style=("background:"@bg(x,y)) >[ (x*y) ]
from y in [1 2 3 4 5 6 7 8 9 10] : [1--10*]
from x in [1 2 3 4 5 6 7 8 9 10] : [1--10*];;
The result is the XHTML code that generates the following table:
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
2 | 4 | 6 | 8 | 10 | 12 | 14 | 16 | 18 | 20 |
3 | 6 | 9 | 12 | 15 | 18 | 21 | 24 | 27 | 30 |
4 | 8 | 12 | 16 | 20 | 24 | 28 | 32 | 36 | 40 |
5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 |
6 | 12 | 18 | 24 | 30 | 36 | 42 | 48 | 54 | 60 |
7 | 14 | 21 | 28 | 35 | 42 | 49 | 56 | 63 | 70 |
8 | 16 | 24 | 32 | 40 | 48 | 56 | 64 | 72 | 80 |
9 | 18 | 27 | 36 | 45 | 54 | 63 | 72 | 81 | 90 |
10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 |
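Since the cell colours are lost in the plain rendering above, the following Python sketch (not CDuce code) recomputes each cell together with its background. It mirrors the bg function and the nested select, including the fact that bg is declared with parameters (y, x) but called as bg(x,y); the variable name rows is an assumption of this sketch.

```python
# Python model of the table-generating program.
def bg(y, x):
    """Same case analysis as the CDuce bg function."""
    if x % 2 + y % 2 <= 0:
        return "lightgreen"
    elif y % 2 <= 0:
        return "yellow"
    elif x % 2 <= 0:
        return "lightblue"
    return "white"

# Outer loop over x builds the rows (<tr>), inner loop over y the cells (<td>).
# bg is called as bg(x, y), exactly as in the CDuce program.
rows = [[(x * y, bg(x, y)) for y in range(1, 11)] for x in range(1, 11)]

for row in rows[:2]:
    print(row)
```

Under this reading, cells where both indices are even are light green, cells in even rows and odd columns are yellow, cells in odd rows and even columns are light blue, and the rest are white.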