<sect> Introduction
<p>

In the early nineties, the Institute of Phonetics,
Faculty of Arts, Charles University, Prague, and the Institute of
Radio-Engineering and Electronics, Academy of Sciences, Prague,
assembled a complete TTS implementation for the
Czech and Slovak languages, using linear-prediction-based
diphone synthesis.  This TTS engine then served as a base
for further research in speech synthesis, until the source code became
too large and complicated to be easily modified, ported to new
hardware or operating systems, or understood by anybody
except the authors.  By the end of 1995, when the need arose to test
some new prosody modelling hypotheses, these limitations
had become a major burden. Starting in 1996, part of the system
was therefore reimplemented from scratch. The new implementation is
still expanding, integrating the original results with numerous
recent improvements. It was named Epos in 1998.

Our primary design goal is to give the user the greatest possible control
over TTS processing. We avoid hard-wired constants; we use
configuration options instead, with sensible default values.
Most of the language-dependent processing is driven by a <em/rules file/,
a text file using an intuitive and well-documented syntax. A rules file
lists the rules to be applied to a written text structure representation
to yield a corresponding spoken text structure representation (in
principle it could work the other way round, but no one seems to need that).
Some aspects of user-definable
behavior don't fit into the concept of a rules file and are therefore set
with various options in conventional configuration files.  Finally, many other
external files, such as segment inventories or dictionaries, can be referenced
either by the rules file or by a configuration file.

Most of these files have to be processed before any actual TTS processing
can begin. That's why Epos is implemented as a background process, i.e.
as a daemon under UNIX-like OSes and as a service on Windows NT and similar OSes.
Epos reserves a TCP/IP port for all communication with client applications,
which use a custom, fairly generic protocol for TTS data flow control
called TTSCP.  A simple TTSCP client utility named <tt/say/ is provided
with Epos, but many more specialized TTSCP clients exist.

Epos currently supports three main speech generation algorithms.  Two of them
employ segment concatenation in the time domain; these were contributed by
Zdenek Kadlec, Masaryk University, Brno, and Martin Petriska, Slovak Technical
University, Bratislava, respectively.  These algorithms, including low-quality
segment inventories, are available as part of Epos under the GPL.  The third one
is a linear prediction coding speech synthesis written by Ellen V&iacute;chov&aacute;,
which gives good results but is not freely available (existing commercial
applications prohibit this). You can, however, use a virtual speech synthesis to
synthesize your own texts with the linear prediction algorithm if you are
connected to the Internet. (Your text is partly processed and sent to our server,
which then sends the generated speech signal back to you.) We are constantly
working on improvements, especially to the freely distributable syntheses.

The name Epos is not an acronym.  It is Greek for...um, go have a look yourself.

This section still has to be (re)written by somebody. At present, try to look at
<tt> <url url="http://epos.ure.cas.cz"> </tt>
for additional introductory information or ask the authors by
<htmlurl url="mailto:geo@cuni.cz" name="email">.

Meanwhile, tell us what kind of introductory information you would like
to see here.  The documentation is provided for you and we need your feedback.

