Emdros - the database engine for analyzed or annotated text

Introduction to EMdF

EMdF is a text database model: a mathematical model of text that defines the concepts Emdros uses when it deals with text.

There are only four central concepts in the EMdF model. Once they are mastered, the rest follows fairly easily. These four concepts are:

  1. Monads
  2. Objects
  3. Object types
  4. Features

Each is explained in turn below, followed by an example.

Monads

A monad is simply a positive integer, i.e., a natural number: 1, 2, 3, 4, etc.

The sequence of the natural numbers (1, 2, 3, 4, 5, etc.) dictates the text flow in a database: whatever sits at monad 1 comes before whatever sits at monad 2, and so on.

Objects

An object is a set of monads. This set can be any set of monads, and the monads need not be contiguous. So {1,2,3} is just as valid as {1,2,5,6,7}.
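
The idea can be sketched in a few lines of Python. This is only an illustration of the model, not Emdros code: the helper name make_object is hypothetical, and the check simply encodes the definition of a monad given above.

```python
# Illustrative model only; none of these names come from the Emdros API.

def make_object(monads):
    """Build an object as an immutable set of monads (positive integers)."""
    result = frozenset(monads)
    for m in result:
        if not isinstance(m, int) or m < 1:
            raise ValueError(f"not a monad: {m!r}")
    return result


contiguous = make_object({1, 2, 3})
with_gap = make_object({1, 2, 5, 6, 7})  # monads 3 and 4 are simply absent

# Sorting the monads lists the object's parts in text-flow order.
print(sorted(with_gap))
```

A frozenset is used here because a set has no duplicates and no inherent order, which matches the definition; the order of the text comes from the monads themselves.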

Objects carry both the text and the information about that text. How they do so is explained by object types and their features.

Object types

Objects are grouped in types. A type could be, e.g., "Phrase", "Clause", "Word", "Chapter", "Book", etc. An object is always of one specific type.

Features

Finally, object types have features. A feature is a named attribute which we associate with an object; each object holds its own value for every feature of its type.

It is the values of an object's features which store the analytic data in the database. For example, the "Phrase" object type might have a "phrase_type" feature which tells us whether the phrase is a VP, NP, PP, etc.

It is also the values of an object's features which store the textual data in the database. For example, the "Word" object type might have a feature called "surface" which tells us what the text of that particular word object is.
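
As a rough sketch in Python (again an illustration of the model, not of Emdros itself), an object can be pictured as a type name, a set of monads, and a table of feature values. The class name EmdfObject and the sample values "NP" and "the" are made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class EmdfObject:
    """Hypothetical stand-in for an EMdF object; not an Emdros class."""
    object_type: str                                # e.g. "Phrase" or "Word"
    monads: frozenset                               # the set of monads the object covers
    features: dict = field(default_factory=dict)    # feature name -> value


# Analytic data: the phrase_type feature classifies the phrase.
phrase = EmdfObject("Phrase", frozenset({1, 2}), {"phrase_type": "NP"})

# Textual data: the surface feature holds the word itself (sample value).
word = EmdfObject("Word", frozenset({1}), {"surface": "the"})

print(phrase.features["phrase_type"], word.features["surface"])
```

Both kinds of data live in the same place, the feature values; only the meaning we attach to the feature differs.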

An example

Consider the Emdros logo:

[Figure: the Emdros logo, "Emdros - the database for analyzed or annotated text"]

The logo is a small example of an EMdF database. At the top we see the sequence of monads (1-6); this database has only six monads.

To the left, we see two object types, "Letter" and "Name".

The "Letter" object type has the feature "surface", while the "Name" object type doesn't have any features.

There are six "Letter" objects. Each object has a number (1-6). These numbers are not their monads, but a kind of id (an id_d, to be specific). In this example the two happen to coincide, so the object "Letter-2" consists of the set of monads {2}.

There is only one "Name" object, and it consists of the set of all the monads, "{1,2,3,4,5,6}". Its id_d is 7.
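
The logo database can be written out as a small Python sketch. This is not Emdros code, and it rests on two assumptions: that the six "Letter" objects carry the letters of the word "Emdros", and that each "Letter" object sits on the monad equal to its id_d. The function name objects_of_type is hypothetical.

```python
# Assumption: the six "Letter" objects spell out the word "Emdros".
LOGO_TEXT = "Emdros"

database = []
for position, letter in enumerate(LOGO_TEXT, start=1):
    database.append({
        "id_d": position,                  # an id, not a monad
        "type": "Letter",
        "monads": frozenset({position}),   # one monad per letter
        "features": {"surface": letter},
    })

# The single "Name" object covers all six monads and has no features.
database.append({
    "id_d": 7,
    "type": "Name",
    "monads": frozenset(range(1, 7)),
    "features": {},
})


def objects_of_type(db, type_name):
    """Return the objects of one type, ordered by their first monad."""
    matching = [obj for obj in db if obj["type"] == type_name]
    return sorted(matching, key=lambda obj: min(obj["monads"]))


# Reading the surface features in monad order recovers the text.
text = "".join(obj["features"]["surface"]
               for obj in objects_of_type(database, "Letter"))
print(text)
```

Note that the text is recovered by following the monads, not the id_d values; the ids only identify the objects.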

More information

The following three documents explain the EMdF model in more detail:

  • MQL User's Guide
  • The Standard MdF model
  • The EMdF model