Articles

An Introduction to Noor Diacritized Corpus

Authors
Akbar Dastani, Behrouz Minaei-Bidgoli, Mohammad Reza Vafaei, Hossein Juzi
Abstract
This article introduces the Noor Diacritized Corpus, which comprises 28 million words extracted from about 360 hadith books. Despite many attempts to diacritize the Holy Quran, little effort has gone into diacritizing hadith texts, which makes this corpus highly significant. The article explains various statistical aspects of the corpus and describes the challenges of diacritizing Arabic text, along with the general specifications of the corpus.
Keywords
Noor Diacritized Corpus, diacritization, Arabic corpora