GAIA: Common Framework for the Development of Speech Translation Technologies


GAIA is an open-source software platform for the integration of speech translation components. The tool can be used to facilitate and record person-to-person and machine-to-person communication.
GAIA integrates different automatic speech recognition, spoken language translation and text-to-speech synthesis solutions into a common framework. It is flexible, and it has been used to collect the text and speech corpora needed for speech translation. The platform follows a modular, distributed approach, in which a purpose-built, extensible network protocol handles communication between the modules. A well-defined, publicly available API facilitates the integration of existing solutions into the architecture. Fully functional audio and text interfaces are provided, together with remote monitoring tools. The speech and audio engines are not included, but their interface is defined and skeleton engines are provided to help with engine integration.
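
As a rough illustration of the modular approach described above, the following Python sketch chains three placeholder engines behind one common interface. Every class and function name here is hypothetical and is not taken from the GAIA API or its network protocol. The skeleton engines only tag their input so that the flow of one dialog turn is visible.

```python
from abc import ABC, abstractmethod
from typing import Sequence


class Engine(ABC):
    """Hypothetical common interface; NOT the actual GAIA API."""

    @abstractmethod
    def process(self, data: str) -> str:
        ...


class SkeletonRecognizer(Engine):
    # Placeholder ASR: treats the "audio" as if it were already a transcript.
    def process(self, data: str) -> str:
        return data.strip()


class SkeletonTranslator(Engine):
    # Placeholder translation: tags the text instead of translating it.
    def process(self, data: str) -> str:
        return f"[translated] {data}"


class SkeletonSynthesizer(Engine):
    # Placeholder TTS: returns a description of the audio it would produce.
    def process(self, data: str) -> str:
        return f"<audio: {data}>"


def run_turn(engines: Sequence[Engine], turn: str) -> str:
    """Pass one dialog turn through each engine in order."""
    for engine in engines:
        turn = engine.process(turn)
    return turn


pipeline = [SkeletonRecognizer(), SkeletonTranslator(), SkeletonSynthesizer()]
result = run_turn(pipeline, "  hello  ")
print(result)
```

In this sketch a real engine would replace a skeleton by implementing the same `process` method, which mirrors the idea of integrating existing solutions through a defined interface. In GAIA itself the modules are distributed and communicate over the network rather than through in-process calls.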

The configuration file allows GAIA to be used in several scenarios, either to facilitate communication between two people or to provide a service to a single person.
- Speech-to-speech translation using telephone lines. Two users participate in a dialog and GAIA translates the turns. Speech recognition, spoken language translation and speech synthesis are required. This was the scenario of the demonstration prototype of the LC-STAR EU Project.
- Speech/Text translation system. In this scenario one of the dialog participants uses speech and the other uses text.
- Speech collection. GAIA can manage data collection either from the desktop or from the telephone. The user can be prompted either using text or using pre-recorded or synthetic speech.
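
The scenarios listed above can be thought of as different settings of the same few parameters. The sketch below is a minimal illustration under that assumption. The `Scenario` fields, the scenario names and the channel chosen for the speech/text case are all hypothetical, and GAIA's actual configuration file format is not shown here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Scenario:
    """Hypothetical scenario description; not GAIA's real configuration format."""
    name: str
    participants: int             # 2 for a dialog, 1 for a single-user service
    modalities: Tuple[str, ...]   # one entry per participant: "speech" or "text"
    channel: str                  # "telephone" or "desktop"


SCENARIOS = [
    Scenario("speech_to_speech", 2, ("speech", "speech"), "telephone"),
    # The channel for the speech/text scenario is an assumption for illustration.
    Scenario("speech_text", 2, ("speech", "text"), "desktop"),
    # Speech collection may also run from the desktop.
    Scenario("speech_collection", 1, ("speech",), "telephone"),
]


def is_consistent(s: Scenario) -> bool:
    """Check that each participant has exactly one valid modality."""
    return (
        len(s.modalities) == s.participants
        and all(m in ("speech", "text") for m in s.modalities)
    )


dialog_names = [s.name for s in SCENARIOS if s.participants == 2]
print(dialog_names)
```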

