Emdros - the database engine for analyzed or annotated text

FAQ - overview

The following questions are answered here:

1. What is Emdros?

Emdros is a text database engine for analyzed or annotated text. It supports storage and retrieval of any kind of text plus annotations/analyses of that text. Linguistic analyses are its primary target, with syntactic analyses in focus (although other linguistic levels are supported, too).

Another important use of Emdros is as a backend for a digital library. The harvest library (explained in the Programmer's Reference Guide) provides support for extracting documents from Emdros databases using "stylesheets" encoded in JSON.

Emdros also provides an easy-to-use Full Text Search engine.

Emdros excels in storing and querying structured data, supporting multiple hierarchies of embedding over the same text. Its powerful query language is built around sequence and embedding as the primary structuring operations. It implements the EMdF database model and the MQL query language.

2. Is Emdros free?

Yes, Emdros is free as in price. It is also Free as in "Freedom", meaning it is licensed under the GNU General Public License (GPL). However, if you need commercial licensing, please contact the author. See question 12 for more information.

3. What does 'Emdros' stand for?

Emdros stands for Engine for MdF Database Retrieval, Organization, and Storage.

4. Can I contact the author?

By all means, yes, please do.

5. Which version of Emdros should I choose?

The most recent version.

6. Which database backends are supported?

The current branch (3.X.Y) supports the following backends:

  • PostgreSQL
  • MySQL
  • SQLite 2
  • SQLite 3

In addition, there is a proprietary database backend, called the BPT (Bit Packed Table) engine, written by the author. It is faster than any of the other supported backends. You can contact the author for more information about the BPT engine, including licensing options. The BPT engine is especially well suited to shipping databases with an application, where the databases can be encrypted for maximum protection of the content.

It is easy to add support for new databases. You can either do it yourself or ask the author if he has the time to implement support for your choice of database.

7. On what platforms does it run?

Emdros has been tested by the author on the following platforms:

  • Linux (Intel x86, AMD64, ARM)
  • Solaris
  • Mac OS X (10.6 and above)
  • Windows 8
  • Windows 7
  • Windows Vista
  • WinXP
  • Win2000
  • iOS (iPad, iPhone, iPod)
  • Android

However, it will probably run on any *nix. For example, OpenBSD should work, as should any other Unix-like system.

Running Emdros on Windows 95, 98, 98SE or ME is no longer supported.

8. Which back-end should I choose?

PostgreSQL is 'more free' than MySQL in terms of licensing (BSD versus GPL/LGPL). SQLite is public domain, and thus the "freest" of the five back-ends supported.

MySQL is about the same speed as PostgreSQL. SQLite 2 is faster than MySQL, but SQLite 3 is fastest among the Open Source backends. (It used to be that the SQLite 2 backend was faster than the SQLite 3 backend, but this is no longer the case.)

SQLite 2 and 3 provide an embedded, zero-install, zero-configuration database engine with no bootstrapping to do. MySQL and PostgreSQL both need a modicum of administration and bootstrapping (including username and password generation). See the file doc/bootstrapping.txt in the sources for information on the bootstrapping process for MySQL and PostgreSQL.

A fifth backend, the proprietary BPT backend, is available from the author upon negotiation of a license fee. The BPT backend produces single-file databases which are smaller than the SQLite3 equivalents, and is faster than the SQLite3 backend. Feel free to contact the author to ask about licensing or trying out the proprietary BPT engine.

9. Do I need to program in order to use Emdros?

Short answer: Maybe.

Slightly longer answer:

Emdros is a general purpose text database engine. As such, it is really a software library that other programs can take advantage of.

However, I have written a number of programs, so you may not have to program yourself in order to use Emdros. These programs include:

  • The Emdros Query Tool: A graphical tool to query databases which are already in Emdros.

  • Some importers: I have written several import programs for bringing existing data into Emdros. Please contact the author if you have a format that you'd like to import.

    The currently supported importers include:

    • Penn Treebank format
    • "Plain text"
    • "Slashed/V/slash Text/N/text"
    • TIGER XML
    • Linguistic Tree Constructor *.ltcx format (via the TIGER XML importer)
    • NeGRA format (version 3)
    • SIL SFM (Standard Format Markers)
    • Unbound Bible

If you find that you need to program in order to use Emdros, the file HOW-TO-USE in the doc/ directory of the sources gives you some pointers on how to get going. The Programmer's Reference Guide also has a lot of hints.

If you have any questions, you are welcome to contact the author, and he will most likely try to be helpful.

10. What programming languages can I use?

You can use almost any programming language to program your application on top of Emdros. You just need a way of communicating with the MQL subsystem. You have a number of options for this:

  1. Writing your application in C++, and linking in the Emdros libraries.
  2. Writing in one of the scripting languages that Emdros supports through SWIG (currently Python, Ruby, Java, C#/.Net, PHP and Perl).
  3. Writing in any programming language that can call external programs, invoking the mql(1) program, and parsing its output.
  4. Writing in any programming language that can use sockets, making the mql(1) program run as a daemon (see question 11).
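
Option 3 above can be sketched in Python roughly as follows. This is a minimal illustration, not part of Emdros: the helper names build_mql_command and run_mql are hypothetical, and the sketch assumes that mql accepts -b for the backend and -d for the database name and reads its query from standard input when no file name is given. Check the mql(1) manual page of your installation before relying on any of that. The object type Word in the usage note is likewise only an example.

```python
import subprocess


def build_mql_command(database, backend="sqlite3", mql_program="mql"):
    """Build the argument list for one mql invocation.

    Assumption: -b selects the backend and -d the database name.
    Verify these options against mql(1) for your Emdros version.
    """
    return [mql_program, "-b", backend, "-d", database]


def run_mql(query, command, timeout=60):
    """Pipe an MQL query to `command` on stdin and return its stdout as text."""
    result = subprocess.run(
        command,
        input=query,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

A call such as run_mql("SELECT ALL OBJECTS WHERE [Word] GO", build_mql_command("mydb")) would then return whatever mql prints, which your application still has to parse. Keeping the command line separate from the function that runs it makes it easy to adjust the options without touching the rest of the code.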

11. Can Emdros be run as a daemon?

Emdros can be run as a daemon provided you have a program, such as inetd or xinetd, to invoke Emdros when incoming requests arrive on a particular port.

If you would like to get full daemon-capabilities, feel free to drop the author an e-mail telling him of your need.
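
From the client side, talking to such an inetd-style service might look like the Python sketch below. It is only an illustration under stated assumptions: the function name query_over_socket is hypothetical, and the sketch assumes that the service reads the whole query from the connection until the client closes its sending side, writes its reply, and then closes the connection. That is how a program started by inetd or xinetd typically behaves, since its standard input and output are attached to the socket, but your own setup may differ.

```python
import socket


def query_over_socket(query, host, port, timeout=30.0):
    """Send an MQL query to a TCP service and return its complete reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(query.encode("utf-8"))
        # Half-close the connection so the server sees end-of-input
        # while we can still read its reply.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")
```

The host and port are whatever you configured in inetd or xinetd. Because each request opens a fresh connection, a new process is started per query, which is simple but not as fast as a long-running daemon would be.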

12. How is Emdros licensed?

Emdros is licensed under the GNU General Public License (GPL) version 2.

Commercial licensing under a dual-licensing scheme may be available upon negotiation with the author. At least three companies have already obtained such a license.

All licensing questions should be addressed through Scripture Systems ApS, the company behind Emdros.

13. What happened to version 2.0?

Q: You released Emdros version 1.2.0.pre262, and then the next public release was Version 3.0.0. What happened to version 2?

A: The 1.2.0.preXX series started as a branch off of version 1.1. This was in 2004. It was a series of "previews" that ran to version 1.2.0.pre269 -- that is two hundred and sixty-nine (269!) previews. Almost four (4!) years later, in 2008, I finally decided enough was enough: I really ought to get my act together and make a real, non-preview release, since the code had been stable for a very long time, and the "preview" label wasn't winning any users.

By January 2008, the code had evolved so much since version 1.1 from which the "1.2.0.preXX" series was branched, that a bump in -- not one but -- two major version numbers was merited. Version 2.0.0 did happen, though -- it was just an internal, non-public release.