Emdros - the database engine for analyzed or annotated text

Introduction to MQL

Emdros's query language is called MQL. It supports creating, updating, deleting, and querying data. Its query facilities are especially powerful.

MQL is documented in the MQL User's Guide.

MQL is a descendant of another query-language, QL, which was the fruit of Crist-Jan Doedens' labors in his PhD thesis. QL was an extremely powerful query-language to go with the MdF model.

For various reasons, however, QL was extremely difficult to implement in practice. Therefore, MQL was born. MQL originally stood for "Mini QL", since it was a scaled-down version of QL.

Since then, however, MQL has grown to include much of the power of QL, plus additional ideas. It is now a full-access language which allows you to create, update, delete, and query all of the data domains of the EMdF model.

Let us look at some examples.

Object blocks

MQL is centered around object blocks. Object blocks look like this:

  [word]

An object block finds an object (word, phrase, clause, page, line, speaker-turn, etc.) in the database.

MQL basic principle

The basic principle of MQL is:

The structure of the query
mirrors
the structure of the objects found
(with respect to sequence and embedding)

Let's see how that plays out.

Sequence

If two object blocks are adjacent in the query, the objects which they find must be adjacent in the database. E.g.:

  [word]
  [word]

This finds two adjacent words. Adjacency is defined with respect to monads, the integer positions that make up the text in the EMdF model.
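As a rough illustration of what adjacency with respect to monads means, here is a toy model in Python. It is not Emdros code. The names Obj and is_adjacent are invented for this sketch, and it assumes that every object is a set of integer monads and that two objects are adjacent when the second begins at the monad right after the first one ends.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    """Toy stand-in for a database object: a type name plus a set of monads."""
    otype: str
    monads: frozenset


def is_adjacent(a: Obj, b: Obj) -> bool:
    """True if b starts at the monad immediately after a's last monad."""
    return min(b.monads) == max(a.monads) + 1


# Three one-monad words at positions 1, 2 and 3 (hypothetical data).
words = [
    Obj("word", frozenset({1})),
    Obj("word", frozenset({2})),
    Obj("word", frozenset({3})),
]

# Every ordered pair of adjacent words, reported by their first monads.
pairs = [(a, b) for a in words for b in words if is_adjacent(a, b)]
pair_monads = [(min(a.monads), min(b.monads)) for a, b in pairs]
print(pair_monads)
```

In this model the query with two [word] blocks corresponds to the list of pairs: words 1 and 2, and words 2 and 3.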

Embedding

If an object block is nested inside another object block, the inner object must be inside the outer object:

  [clause
    [phrase]
  ]

The phrase-object must be inside the clause-object (with respect to monad-set inclusion).
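Monad-set inclusion can be pictured with ordinary set operations. The following Python sketch is only a model of the idea, under the assumption that an object is a set of integer monads. The function name is_inside and the sample monads are invented for illustration.

```python
def is_inside(inner: frozenset, outer: frozenset) -> bool:
    """Monad-set inclusion: every monad of inner also belongs to outer."""
    return inner <= outer


clause = frozenset(range(1, 8))       # a clause covering monads 1..7
phrase_in = frozenset({2, 3})         # lies wholly within the clause
phrase_out = frozenset({7, 8})        # monad 8 falls outside the clause

print(is_inside(phrase_in, clause))   # the phrase is embedded
print(is_inside(phrase_out, clause))  # the phrase sticks out of the clause
```

Only the first phrase would satisfy the nested [clause [phrase]] query in this model, because the second one has a monad that the clause does not contain.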

Arbitrary space

You can allow arbitrary space between object blocks with the ".." operator:

  [clause
    [phrase]
    ..
    [phrase]
  ]

Here, the phrases need not be adjacent (though they can be). They must still both be within the clause, however.
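The effect of ".." can be sketched in Python as follows. This is a toy model rather than the way Emdros evaluates queries: it assumes objects are sets of integer monads, and the helper names follows and phrase_pairs are made up for the example.

```python
def follows(a: frozenset, b: frozenset) -> bool:
    """True if b begins after a ends; a gap of zero monads is allowed."""
    return min(b) > max(a)


def phrase_pairs(clause: frozenset, phrases: list) -> list:
    """All ordered pairs of phrases inside the clause, with any gap between them."""
    inside = [p for p in phrases if p <= clause]
    return [(a, b) for a in inside for b in inside if follows(a, b)]


clause = frozenset(range(1, 11))      # monads 1..10
phrases = [
    frozenset({1, 2}),
    frozenset({3, 4}),                # adjacent to the first phrase
    frozenset({8, 9}),                # separated from the others by a gap
    frozenset({11, 12}),              # outside the clause, never matched
]

# Report each matching pair by the first monad of each phrase.
starts = [(min(a), min(b)) for a, b in phrase_pairs(clause, phrases)]
print(starts)
```

The adjacent pair and the pairs with a gap all match, while the phrase at monads 11 to 12 is excluded because it is not within the clause.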

Feature-restrictions

A feature is an attribute of an object. You can have arbitrary Boolean restrictions on an object's features:

  [phrase function = Subject and phrase_type = NP]

This presupposes that the feature-data is already in the database. Emdros is not a tool for performing analysis. It only stores and retrieves analyses that have already been made.

Any logically valid combination of and, or, not, and parentheses (grouping) is supported.
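A feature restriction behaves like a Boolean predicate over an object's features. The Python sketch below models objects as plain dictionaries, which is an assumption made for illustration only. The sample data and the second restriction are hypothetical.

```python
# Hypothetical phrase objects, each reduced to its features.
phrases = [
    {"function": "Subject", "phrase_type": "NP"},
    {"function": "Object", "phrase_type": "NP"},
    {"function": "Subject", "phrase_type": "PP"},
]


def subject_np(p: dict) -> bool:
    # Mirrors: function = Subject and phrase_type = NP
    return p["function"] == "Subject" and p["phrase_type"] == "NP"


def np_but_not_subject(p: dict) -> bool:
    # Mirrors a hypothetical restriction:
    # phrase_type = NP and not (function = Subject)
    return p["phrase_type"] == "NP" and not (p["function"] == "Subject")


# Indices of the phrases that satisfy each restriction.
subject_np_hits = [i for i, p in enumerate(phrases) if subject_np(p)]
other_np_hits = [i for i, p in enumerate(phrases) if np_but_not_subject(p)]
print(subject_np_hits, other_np_hits)
```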

MQL Example 1

Consider the following MQL query:

  [Phrase phrase_type = NP]
  [Phrase phrase_type = VP]

It finds all pairs of adjacent phrases in the current database where the first is an NP and the second is a VP.

MQL Example 2

Consider the following MQL query:

[Clause
  [Phrase
    grammatical_relation = subject
  ]
  ..
  [Phrase
    phrase_type = VP

    [Phrase
      phrase_type = NP 
      and grammatical_relation = direct_object

      [Word
        lexeme = "bird" and number = plural
      ]
    ]
  ]
]

Supposing we have a database with the relevant data, this query means the following:

Find all clauses for which the following is true.

The clause contains two phrases. The second phrase occurs at an unbounded distance from the first phrase, but within the boundaries of the clause.

The first phrase must have grammatical relation "subject".

The second phrase must have phrase-type "VP". In addition, it must have an embedded phrase.

This embedded phrase must have phrase-type "NP" and grammatical relation "direct object". Within this phrase, there must occur a word whose lexeme is "bird" and whose number is "plural", i.e., "birds".

This query would find such clauses as "I saw the birds," and "He shot all the birds."
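To make the reading of this query concrete, here is a small Python sketch that checks the same conditions against a hand-made toy database. Everything in it is an assumption for illustration: the obj helper, the feature values, and the two sample clauses ("I saw the birds" and "He saw the bird") are invented, and the nested loops only imitate the meaning of the query, not the way Emdros evaluates it.

```python
def obj(otype, first, last, **features):
    """Build a toy object covering the monads first..last."""
    return {"type": otype, "monads": frozenset(range(first, last + 1)), **features}


db = [
    # "I saw the birds" on monads 1..4
    obj("clause", 1, 4),
    obj("phrase", 1, 1, grammatical_relation="subject", phrase_type="NP"),
    obj("phrase", 2, 4, grammatical_relation="predicate", phrase_type="VP"),
    obj("phrase", 3, 4, grammatical_relation="direct_object", phrase_type="NP"),
    obj("word", 4, 4, lexeme="bird", number="plural"),
    # "He saw the bird" on monads 5..8 (singular, so it should not match)
    obj("clause", 5, 8),
    obj("phrase", 5, 5, grammatical_relation="subject", phrase_type="NP"),
    obj("phrase", 6, 8, grammatical_relation="predicate", phrase_type="VP"),
    obj("phrase", 7, 8, grammatical_relation="direct_object", phrase_type="NP"),
    obj("word", 8, 8, lexeme="bird", number="singular"),
]


def inside(otype, outer, pred):
    """Objects of the given type embedded in outer that satisfy pred."""
    return [o for o in db
            if o["type"] == otype and o is not outer
            and o["monads"] <= outer["monads"] and pred(o)]


def clause_matches(clause):
    subjects = inside("phrase", clause,
                      lambda p: p["grammatical_relation"] == "subject")
    vps = inside("phrase", clause, lambda p: p["phrase_type"] == "VP")
    for subj in subjects:
        for vp in vps:
            # ".." : the VP must start somewhere after the subject ends.
            if min(vp["monads"]) <= max(subj["monads"]):
                continue
            nps = inside("phrase", vp,
                         lambda p: p["phrase_type"] == "NP"
                         and p["grammatical_relation"] == "direct_object")
            for np in nps:
                if inside("word", np,
                          lambda w: w["lexeme"] == "bird" and w["number"] == "plural"):
                    return True
    return False


# First monad of every clause that satisfies the whole query.
hits = [min(c["monads"]) for c in db if c["type"] == "clause" and clause_matches(c)]
print(hits)
```

Only the first clause is reported, since the second one has "bird" in the singular.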

Named objects

You can give an object block a name and then refer to the object's features in another object block.

[phrase phrase_type = NP
  [word AS w1 part_of_speech = article]
  [word part_of_speech = noun
        AND number = w1.number AND gender = w1.gender
  ]
]

This finds an NP inside of which are two words, an article and a noun. We give the name "w1" to the article, and then refer to the article inside the feature-restrictions of the noun. This makes the noun agree in number and gender with the article.
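The role of the name "w1" can be imitated in Python by keeping a reference to the article while testing the noun. This is a simplified sketch with invented data: words are dictionaries with a single monad each, the enclosing NP is left out, and agreeing_pairs is a hypothetical helper name.

```python
words = [
    {"monad": 1, "part_of_speech": "article", "number": "plural", "gender": "feminine"},
    {"monad": 2, "part_of_speech": "noun", "number": "plural", "gender": "feminine"},
    {"monad": 3, "part_of_speech": "article", "number": "singular", "gender": "masculine"},
    {"monad": 4, "part_of_speech": "noun", "number": "plural", "gender": "masculine"},
]


def agreeing_pairs(words: list) -> list:
    """Article followed directly by a noun that agrees in number and gender."""
    by_monad = {w["monad"]: w for w in words}
    hits = []
    for w1 in words:                          # w1 plays the role of "AS w1"
        if w1["part_of_speech"] != "article":
            continue
        noun = by_monad.get(w1["monad"] + 1)  # the adjacent word, if any
        if noun is None or noun["part_of_speech"] != "noun":
            continue
        # The noun's restriction refers back to the features of w1.
        if noun["number"] == w1["number"] and noun["gender"] == w1["gender"]:
            hits.append((w1["monad"], noun["monad"]))
    return hits


agreement_hits = agreeing_pairs(words)
print(agreement_hits)
```

The pair at monads 3 and 4 is rejected because the article is singular while the noun is plural.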

MQL Example 3

Consider the following example of an MQL query:

[Sentence
  [Clause as c
    clause_type = main

    [Clause focus
      juncture_type = subordination and
      main_clause_id = c.self and
      clause_type = relative

      [Phrase first
        grammatical_relation = subject

        [Word
          part_of_speech = relative_pronoun
        ]
      ]
      ..
      [Phrase
        grammatical_relation = object

        [Word
          lexeme = "Peter"
        ]
      ]
    ]
  ]
]

Supposing we have a database with the relevant data, this query means the following:

Find all sentences containing a clause for which the following is true.

The clause must be a main clause. It must have an embedded clause which must have juncture type "subordination", whose main clause must be the first clause, and whose clause type must be relative.

Within this embedded clause, there must be two phrases. The first phrase must have the grammatical relation "subject" and must contain a word which has part of speech "relative pronoun". This phrase must also be first within the embedded clause.

The second phrase can follow at an unbounded distance from the first phrase, provided it is within the boundaries of the embedded clause. This second phrase must have grammatical relation "object" and must contain a word whose lexeme is "Peter".

The embedded clause must be in focus when viewed by the client.

This would find such sentences as "Susan, who gave Peter the book, is defending her thesis today." and "The man who served Peter's tea was from Spain."

There's more...

There's more to MQL than is presented above. Feel free to explore the documentation, or the tutorial and paper in the next sections.

MQL Tutorial (PDF)

You can also download a short MQL Tutorial in PDF format.

Emdros overview-paper

There is a short paper on Emdros which also gives an overview of MQL, along with some examples drawn from linguistics.