xpfcorpus Documentation
A Python package for grapheme-to-phoneme transcription based on the XPF Corpus.
Installation
pip install xpfcorpus
Quick Example
from xpfcorpus import Transcriber
# Basic usage
es = Transcriber("es")
result = es.transcribe("ejemplo")
print(result)  # ['e', 'x', 'e', 'm', 'p', 'l', 'o']
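Since `transcribe` returns a list of phoneme strings for a single word, transcribing a word list is a short loop. The sketch below is a minimal illustration rather than part of the package: `transcribe_words` and `demo` are hypothetical helper names, and the only xpfcorpus calls assumed are the `Transcriber(code)` constructor and the `transcribe(word)` method shown above.

```python
def transcribe_words(transcriber, words):
    """Map each word to its phonemes joined by single spaces.

    `transcriber` is any object with a `transcribe(word)` method that
    returns a list of phoneme strings, as in the Quick Example.
    This helper is hypothetical and not provided by xpfcorpus.
    """
    return {word: " ".join(transcriber.transcribe(word)) for word in words}


def demo():
    """Print one tab-separated line per word (requires xpfcorpus)."""
    # Imported lazily so the helper above works without the package.
    from xpfcorpus import Transcriber

    es = Transcriber("es")
    for word, phonemes in transcribe_words(es, ["ejemplo"]).items():
        print(f"{word}\t{phonemes}")
```

Calling `demo()` after installing the package prints each word next to its transcription. Keeping the loop in a separate function that accepts any transcriber makes it easy to reuse with other language codes.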
Features
201 languages, covering 203 language/script combinations
Pure Python with no required dependencies
Support for BCP-47 language codes (e.g., “es-ES”, “yi-Latn”)
Multiple data sources: bundled JSON, external YAML, or legacy formats
Command-line interface for batch processing
Fully verified: all 203 language/script combinations pass their verification tests
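The BCP-47 codes mentioned in the feature list combine a language subtag with optional script and region subtags, which is how one language can appear in more than one language/script combination. The sketch below shows how such a tag decomposes. It is a simplified illustration that ignores extended-language, variant, and extension subtags. `LanguageTag` and `split_tag` are hypothetical names, and this is not how xpfcorpus itself parses codes.

```python
from typing import NamedTuple, Optional


class LanguageTag(NamedTuple):
    """Simplified view of a BCP-47 tag (hypothetical helper type)."""
    language: str
    script: Optional[str] = None
    region: Optional[str] = None


def split_tag(tag: str) -> LanguageTag:
    """Split a tag such as 'yi-Latn' into language, script, and region."""
    parts = tag.split("-")
    language = parts[0].lower()
    script = None
    region = None
    for part in parts[1:]:
        # Script subtags are four letters, conventionally title-cased.
        if script is None and len(part) == 4 and part.isalpha():
            script = part.title()
        # Region subtags are two letters or three digits.
        elif region is None and (
            (len(part) == 2 and part.isalpha())
            or (len(part) == 3 and part.isdigit())
        ):
            region = part.upper()
    return LanguageTag(language, script, region)


print(split_tag("es-ES"))    # language 'es', no script, region 'ES'
print(split_tag("yi-Latn"))  # language 'yi', script 'Latn', no region
```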