Source Code Documentation

This section is of little use to anyone except programmers who want to contribute to the development of Epos or to modify its source code. It is not meant to be a beginner's guide to Epos either. If you are missing something here or elsewhere, tell me and I may add it; such requests are likely to be almost the only source of progress in this section of the documentation. The section may also slowly become outdated due to lack of interest.

Design goals

Overall coding priorities, approximately in order of decreasing precedence:

- language independence and generality
- no undocumented or implicit "features" (except for error handling)
- portability
- maintainability, clean decomposition
- clean (intuitive) protocols and programming interfaces
- scalability
- intuitive configuration
- fault tolerance
- simple algorithms
- code readability
- speed
- space
- possible parallelizability

Isolated classes

Class simpleparser

This class takes some input (such as plain ASCII or STML text) and can then be used in conjunction with the . Its purpose is to identify the Latin text tokens (usually ASCII characters, but some traditional tokens like "..." would be difficult to identify later, as would numerous other context dependent uses of "."). The parser also identifies the level of description which corresponds to each token; this is the information needed by the . In this process, the parser skips over any empty units, that is, units that contain no phones (simple letters) at all. Note that it is unnecessary and counterproductive to distinguish here between homographic tokens used at the same level of description; such intelligence can be handled more flexibly by the language dependent rules. In fact, such distinctions are usually language dependent. The parser performs only the minimum tokenization necessary to avoid losing information through empty unit deletion. The STML parser is still unimplemented.

Class unit

This class is the fundamental element of the . Its methods are listed in

Class text

This class represents a logical line-oriented text file. It handles things like the , , initial whitespace and comment stripping. It is used for the and , but .

Class file

This class represents a physical data file. Its main purpose is to cache and share files repeatedly needed by Epos. The

Class hash

Class rules

Note the difference between

Class hierarchies

Class rule

Each

r_regress
r_progress
r_raise
r_syll
r_contour
r_smooth
r_regex
r_debug
hashing_rule
r_subst
r_prep
r_postp
r_diph
r_prosody
cond_rule
r_inside
r_if
r_with
block_rule
r_block
r_choice
r_switch

Classes not beginning in

Class agent

Epos can be configured to support multiple simultaneous TTSCP connections and, except for bugs, no single unauthorized connection should be able to create a Denial of Service situation, such as long delays in processing other connections. To achieve this, Epos uses a simple cooperative multitasking facility called for the semantics of that. Other agents are responsible for individual TTSCP connections, for accepting new connections and for other tasks. A special agent is used for deleting other agents when they need to delete themselves. The stream

a_accept
a_protocol
a_ttscp
a_disconnector
a_ascii
a_stml
a_rules
a_print
a_diphs
a_synth
a_io
a_input
a_output
oa_ascii
oa_stml
oa_diph
oa_wavefm

Testing

The Epos package contains three TTSCP clients. One of them is the standalone

Portability

Epos is written to be as portable as possible. It is, however, also written to a UNIX developer's tastes, and the same is partly true of this documentation. The following should give you an approximate idea of the degree of support for some of the most common operating systems.

Linux

The primary development OS. Most of the testing is done under the Debian and Red Hat distributions. Please report to us any compilation issues which may be distribution related; these will be the easiest ones to solve.

Other UNIX clones

Epos is ported to other unices from time to time as well, but there may be minor incompatibilities in recent code. In this documentation, references to UNIX should be read as "tested on Linux, implemented using POSIX compliant interfaces and expected to be easy to get working on any other UNIX clone".
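One common way to keep code easy to move between UNIX clones is to use an optional facility only when a configure-time check has found its header. The sketch below illustrates that pattern with syslog. It is not taken from the Epos sources: log_message() is a hypothetical helper, and HAVE_SYSLOG_H is assumed to be the macro the build defines when syslog.h is detected (the name autoconf conventionally generates for that header).

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Assumption: the build defines HAVE_SYSLOG_H when <syslog.h> was detected.
#ifdef HAVE_SYSLOG_H
#include <syslog.h>
#endif

// Hypothetical helper: logs one message and returns the name of the
// backend that was actually used.
inline std::string log_message(const char *text)
{
    assert(text != nullptr);    // precondition: the caller supplies a message
#ifdef HAVE_SYSLOG_H
    // The header was detected, so the system logger is available.
    syslog(LOG_NOTICE, "%s", text);
    return "syslog";
#else
    // No syslog header detected: fall back to standard error.
    std::fprintf(stderr, "%s\n", text);
    return "stderr";
#endif
}
```

Compiled without HAVE_SYSLOG_H, the helper writes to standard error; with the macro defined, the same call goes to the system logger, and the calling code does not change.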

Epos uses the autoconf package to avoid portability pitfalls within the UNIX world. Features like syslog are welcomed and used, but only if the corresponding system header file is detected by autoconf.

QNX

On the QNX operating system, Epos can be controlled not only over a TCP-based TTSCP implementation, but also using a QNX specific interprocess communication interface. See src/qnxipc.cc for details; be aware, however, that this code has never been completely debugged because of a drop in our motivation. If you really need it, you could easily help debug it by providing us with a QNX machine.

Windows NT, Windows 2000

See the arch/win directory for architectural differences from UNIX. Be aware of the following three differences in Epos's behavior on these operating systems: the /dev/dsp for speech output; Epos compiles and runs as an NT service named In order to make service installation and registry access available, it is necessary to build and run the instserv utility before running Epos. That utility, if run with the letter You should use the Visual C++ compiler for compiling Epos, but you don't need it for running Epos. Borland C++ Builder and Watcom C++ used to work a long time ago, too. Ask us for help with these compilers if necessary. Please refer to the WELCOME file for step by step instructions on how to proceed with Visual C++.

Advanced notes for Windows NT

File input and output modules are not going to work with Windows sockets (whose incompatible implementation of the

Windows CE

The port is planned soon, but is not available at the moment. Ask us if you need it. Files specific to this port can currently be found in arch/win-ce.

An experienced Windows user can get a good estimate of this port's behavior from reading the sections on other versions of Windows.

Windows 95, 98 etc.

We don't support these DOS successors very strongly now, but these ports used to work. If you want to try them out, you should probably comment out the HAVE_WINSVC_H line in src/config.h after running arch/win/configure.bat. This will force Epos to compile not as a Windows NT service, but as an ordinary UNIX-style daemon. In fact, the way Epos is written, it will decide to run as a daemon anyway if it can't connect to the service controller. The same holds for MS DOS, but as MS DOS offers no sound playback interface, you'll have to comment out portions of source code here and there to make Epos, for example, produce wave files. Good luck, and please don't even try to use 16-bit compilers.

Other OSes

Please contact the authors for advice with any OS significantly different from the UNIX and Windows families. However, the approximate requirements are:

- Architecture: 32bit little endian
- Reliable and complete C++ compiler
- Standard C library, preferably including regular expression handling
- 8-bit ASCII based character set
- TCP/IP networking

Note that the architectural requirements are only a guideline and are enforced mostly for lack of energy for debugging Epos on every perverse 36-bit machine with PDP byte ordering. Epos supports big endian architectures, but the corresponding code still needs to be tested. The integers and pointers can be any size not less than 32 bits as long as the integers are not longer than pointers. If they were, a single code change would do the port. TCP/IP networking is not strictly necessary, but if you don't have it, you can either try to adapt the QNX IPC proxy to your favourite IPC interface, or build the monolithic binary of Epos. A Bourne-compatible shell is helpful, as it allows you to run a configure script. Otherwise you have to write a src/config.h file by hand, as we have done for the Windows port. A plain old make utility helps the compilation process if your OS can emulate a UNIX development environment a little bit.

More information

The header files mostly define basic interfaces for individual Epos components. Reading the ones related to a specific piece of code may often clarify things. Lots of global data declarations live in . You are also encouraged to subscribe to the list first by sending a mail containing only the text