Version: 4.0.0-rc1

Arabic Levantine (ar-XL)

This article lists the graphemes and phonemes supported for Levantine Arabic Speech to Text (STT) and Keyword Spotting (KWS).

Info: Each language has a specific set of graphemes and phonemes that was used to train the speech recognition systems. Only these sets can be used for spelling keywords or preferred phrases and for defining their pronunciations.

Graphemes

These are the valid graphemes to be used for spelling.
ء آ أ ؤ إ ئ ا ب ة ت ث ج ح خ د ذ ر ز س ش ص ض ط ظ ع غ ف ق ك ل م ن ه و ى ي
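
A keyword or preferred phrase can be checked against this set before it is used. The following is a minimal Python sketch for illustration only, not part of the product. The names `VALID_GRAPHEMES` and `invalid_graphemes` are hypothetical, and treating whitespace as a word separator is an assumption.

```python
# Valid graphemes for ar-XL, copied from the list above.
VALID_GRAPHEMES = set(
    "ء آ أ ؤ إ ئ ا ب ة ت ث ج ح خ د ذ ر ز س ش ص ض ط ظ ع غ ف ق ك ل م ن ه و ى ي".split()
)


def invalid_graphemes(text: str) -> list[str]:
    """Return the characters in `text` that are not valid ar-XL graphemes.

    Whitespace is skipped because it is assumed to separate words
    (an assumption, not something the source states).
    """
    return [ch for ch in text if not ch.isspace() and ch not in VALID_GRAPHEMES]
```

An empty result means the spelling uses only supported graphemes. Any diacritic, such as a fathah, is reported as invalid because diacritics are not part of the set.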

Phonemes

These are the valid phonemes to be used for defining pronunciations.

Phoneme [1] | Grapheme | Example word | Phonemic representation
a | ا | اكتب | a k t b
a | ا | التانيه | a l t a n j h
A | ى | أبدي | ?a b d A
?a | أ | أمريكي | ?a m r j k j
?a | أ | بأبطئوا | b ?a b t` ?i w a
?? | إ | إجاني | ?? Z a n j
?? | إ | لإحنا | l ?? X\ n a
a: | آ | آلي | a: l j
a: | آ | القرآن | a l q r a: n
b | ب | بعثلون | b ?\ T t l w n
b | ب | عباتين | ?\ b a t j n
t | ت | بتأذاني | b t ?a D a n j
t | ت | قيمتلي | q j m t l j
T | ث | ثقلي | T q l j
T | ث | شغثة | S G T at
Z | ج | جوزها | Z w h a
Z | ج | دجاج | d Z a Z
X\ | ح | حلتها | X\ l t h a
X\ | ح | عالحيط | ?\ a l X\ j t`
x | خ | خربانه | x r b a n h
x | خ | متخربت | m t x r b t
d | د | دارسه | d a r s h
d | د | صادق | s` a d q
D | ذ | ذكرياتي | D k r j a t j
D | ذ | عالهذا | ?\ a l h D a
r | ر | راسكم | r a s k m
r | ر | زيرو | z j r w
z | ز | زيرو | z j r w
z | ز | كالكنوز | k a l k n w z
s | س | سنتان | s n t a n
s | س | يدرسو | j d r s w
S | ش | شركة | S r k at
S | ش | فيهاش | f j h a S
s` | ص | صاحبي | s` a X\ b t j
s` | ص | فاحص | f a X\ s`
d` | ض | ضرورية | d` r w r j at
d` | ض | احمضت | a X\ m d` t
t` | ط | طافيه | t` a f j h
t` | ط | بأبطئوا | b ?a b t` ?i w a
D` | ظ | ظافر | D` a f r
D` | ظ | احظه | a X\ D` h
?\ | ع | عارفين | ?\ a r f j n
?\ | ع | مبعطي | m b ?\ t` j
G | غ | غيرها | G j r h a
G | غ | شغثة | S G T at
f | ف | فاليوم | f a l j w m
f | ف | طافيه | t` a f j h
q | ق | قلبون | q l b w n
q | ق | صادق | s` a d q
k | ك | كالسيوم | k a l s j w m
k | ك | كالكنوز | k a l k n w z
l | ل | لأنتصار | l ?a n t s` a r
l | ل | كالسيوم | k a l s j w m
m | م | مبعطي | m b ?\ t` j
m | م | قيمتلي | q j m t l j
n | ن | نوصي | n w s` j
n | ن | سنتان | s n t a n
h | ه | هوني | h w n j
h | ه | بالهواء | b a l h w a ?
at | ة | اخوة | a x w at
at | ة | ضرورية | d` r w r j at
w | و | وحليب | w X\ l j b
w | و | بالهواء | b a l h w a ?
?u | ؤ | مسؤول | m s ?u w l
?u | ؤ | أتؤلم | ?a t ?u l m
j | ي | يأخذنا | j ?a x D n a
j | ي | زيرو | z j r w
?i | ئ | كئيب | k ?i j b
?i | ئ | لئيمة | l ?i j m at
? | ء | بالهواء | b a l h w a ?
? | ء | سمراء | s m r a ?
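
In the table above, a pronunciation is written as phoneme symbols separated by spaces. The following minimal Python sketch checks that a pronunciation uses only the listed symbols. It is an illustration only: `VALID_PHONEMES` and `unknown_phonemes` are hypothetical names, and splitting on whitespace is an assumption based on the notation used in the table.

```python
# Phoneme symbols for ar-XL, copied from the first column of the table above.
# Backslashes are doubled because they sit inside ordinary Python string literals.
VALID_PHONEMES = {
    "a", "A", "?a", "??", "a:", "b", "t", "T", "Z", "X\\", "x", "d", "D",
    "r", "z", "s", "S", "s`", "d`", "t`", "D`", "?\\", "G", "f", "q", "k",
    "l", "m", "n", "h", "at", "w", "?u", "j", "?i", "?",
}


def unknown_phonemes(pronunciation: str) -> list[str]:
    """Return the symbols in a space-separated pronunciation that are not ar-XL phonemes."""
    return [symbol for symbol in pronunciation.split() if symbol not in VALID_PHONEMES]
```

An empty result means every symbol is in the phoneme set. Symbols are case-sensitive, so `t` and `T` are distinct phonemes.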

Footnotes

  1. Each phoneme corresponds to exactly one grapheme (letter). The Arabic texts, the dictionary, and the phonemes used to train STT and KWS do not use diacritics (fathah, kasrah, dammah, waslah, sukun, tanwin), but all forms of hamza (ء, أ, إ, ؤ, ئ) and alif with maddah (آ) are present. Because of this one-to-one correspondence between phonemes and graphemes, some phonemes have been modified, so the phonemic system of AR_XL_6 differs in a few respects from the typical SAMPA representation of MSA phonemes.
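
Because each grapheme corresponds to one phoneme, a pronunciation can be derived from a spelling by simple lookup. The sketch below illustrates this in Python. Whether the product derives pronunciations this way is not stated here, so treat it as an assumption. `GRAPHEME_TO_PHONEME` and `spell_to_phonemes` are hypothetical names; the mapping is copied from the phoneme table.

```python
# Grapheme-to-phoneme mapping copied from the phoneme table (one entry per grapheme).
GRAPHEME_TO_PHONEME = {
    "ا": "a",
    "ى": "A",
    "أ": "?a",
    "إ": "??",
    "آ": "a:",
    "ب": "b",
    "ت": "t",
    "ث": "T",
    "ج": "Z",
    "ح": "X\\",
    "خ": "x",
    "د": "d",
    "ذ": "D",
    "ر": "r",
    "ز": "z",
    "س": "s",
    "ش": "S",
    "ص": "s`",
    "ض": "d`",
    "ط": "t`",
    "ظ": "D`",
    "ع": "?\\",
    "غ": "G",
    "ف": "f",
    "ق": "q",
    "ك": "k",
    "ل": "l",
    "م": "m",
    "ن": "n",
    "ه": "h",
    "ة": "at",
    "و": "w",
    "ؤ": "?u",
    "ي": "j",
    "ئ": "?i",
    "ء": "?",
}


def spell_to_phonemes(word: str) -> str:
    """Map each grapheme of `word` to its phoneme, joined by spaces.

    Raises KeyError if the word contains a character outside the ar-XL grapheme set.
    """
    return " ".join(GRAPHEME_TO_PHONEME[ch] for ch in word)
```

For example, `spell_to_phonemes("سمراء")` yields `s m r a ?`, which matches the table entry for that word.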