SEGRE: An automatic tool for grapheme-to-allophone transcription in Catalan – META-SHARE

ID: SEGRE

Segre is a rule-based automatic phonetic transcription system for Catalan, jointly developed by the Universitat Politècnica de Catalunya, the Universitat Autònoma de Barcelona and the Universitat de Barcelona in the framework of the Catalan Reference Centre for Language Engineering (CREL, Centre de Referència en Enginyeria Lingüística).

The syntax of the rules has been designed to obtain phonetic transcriptions for four major dialects of Catalan: the Central or Eastern dialect, spoken in the east of Catalonia; the North-Western or Western dialect, spoken in the west of Catalonia (including the south); Balearic, spoken in the Balearic Islands; and Valencian, spoken in the Valencian Community.

Compared with transcriptions produced by human experts, the accuracy on new texts is 99.1% for isolated words and 99.39% for running text.
Segre can be considered a useful tool for modelling how coarticulation modifies the isolated transcription of words in real sentences. It is therefore helpful not only for building speech synthesis systems but also for training subword-based speech recognition systems.
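
To give a rough idea of what a rule-based, dialect-aware grapheme-to-allophone step can look like, here is a minimal Python sketch. It is not SEGRE's rule syntax or rule set, which this record does not describe: the `RULES` table, the `transcribe` function and the dialect labels are all hypothetical, and the handful of rules covers only a few well-known Catalan spelling conventions. Letters without a rule are copied unchanged, so the output is not a complete phonetic transcription.

```python
import re

# Hypothetical toy rules, NOT the SEGRE rule set.
# Each rule: (grapheme, right-context regex, output symbol, dialects or None).
# Rules are tried in order; the first one that matches at a position wins.
FRONT_VOWEL = "[eiéèí]"
RULES = [
    ("ny", "", "ɲ", None),                       # canya
    ("ll", "", "ʎ", None),                       # lluna
    ("qu", FRONT_VOWEL, "k", None),              # que
    ("ç", "", "s", None),                        # plaça
    ("c", FRONT_VOWEL, "s", None),               # cel
    ("c", "", "k", None),                        # casa
    ("g", FRONT_VOWEL, "dʒ", {"valencian"}),     # gent (affricate, assumed for the toy)
    ("g", FRONT_VOWEL, "ʒ", None),               # gent (fricative elsewhere)
    ("h", "", "", None),                         # silent h
]


def transcribe(word, dialect="central"):
    """Apply the toy rules left to right; copy unmatched letters as-is."""
    word = word.lower()
    out = []
    i = 0
    while i < len(word):
        for grapheme, right, symbol, dialects in RULES:
            if dialects is not None and dialect not in dialects:
                continue  # rule is restricted to other dialects
            if not word.startswith(grapheme, i):
                continue
            end = i + len(grapheme)
            if right and not re.match(right, word[end:]):
                continue  # right context does not match
            out.append(symbol)
            i = end
            break
        else:
            # No rule matched at this position: keep the letter unchanged.
            out.append(word[i])
            i += 1
    return "".join(out)


print(transcribe("canya"))              # kaɲa
print(transcribe("gent"))               # ʒent
print(transcribe("gent", "valencian"))  # dʒent
```

Keeping the dialect restriction as a field of each rule means a single ordered table can serve several dialects, with dialect-specific rules placed ahead of the general fallback. Cross-word effects such as the coarticulation mentioned above would need context that spans word boundaries, which this word-level sketch does not attempt.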

Distribution

Availability: Available - Restricted Use

Licence

MS-NC-NoReD-ND

Restrictions: Academic - Non Commercial Use, No Derivatives, No Redistribution, Only MS Members

Licensors:

IPR Holder

Universitat Autònoma de Barcelona

Universitat Politècnica de Catalunya

Universitat de Barcelona

Contact Person

Tool/Service

Tool

Language Dependent

Input

Media type: Text

Language: Catalan; Valencian

Character encoding: ISO-8859-1

Annotation type: Speech Annotation - Orthographic Transcription

Output

Media type: Text

Language: Catalan; Valencian

Character encoding: ISO-8859-1

Annotation type: Speech Annotation - Phonetic Transcription

Segmentation level: Phoneme, Syllable
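
Since both input and output are listed as ISO-8859-1, text prepared in another encoding (for example UTF-8) would need converting before use. The following Python sketch shows one way to read and write files in that encoding and to check whether a string fits in it. The helper names are hypothetical and nothing here is part of SEGRE itself.

```python
from pathlib import Path

ENCODING = "iso-8859-1"  # encoding listed in this record for input and output


def read_text(path):
    """Read a text file as ISO-8859-1."""
    return Path(path).read_text(encoding=ENCODING)


def write_text(path, text):
    """Write a text file as ISO-8859-1 (raises UnicodeEncodeError if impossible)."""
    Path(path).write_text(text, encoding=ENCODING)


def is_encodable(text):
    """Return True if every character of text exists in ISO-8859-1."""
    try:
        text.encode(ENCODING)
        return True
    except UnicodeEncodeError:
        return False


# Catalan orthography, including the middle dot of "l·l", fits in ISO-8859-1.
print(is_encodable("col·legi, plaça, països"))  # True
# IPA letters such as ʎ are outside ISO-8859-1.
print(is_encodable("ʎ"))                        # False
```

Checking with `is_encodable` before writing avoids a failure halfway through a batch of files.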

Operation

Operating system: Linux

Metadata

Created: 22/01/2013

Last Updated: 30/01/2013

Documentation

Document Type: Other

SEGRE Narrative Description, http://metashare.tal...