A Framework for Spelling Correction in Persian Language Using Noisy Channel Model

Articles. Authors: Mohammad Hoseyn Sheykholeslam, Behrouz Minaei-Bidgoli, Hossein Juzi. Abstract: Several methods have been proposed for spelling correction in the Farsi (Persian) language. Unfortunately, no powerful framework has been implemented, because Farsi lacks a large training set that could serve as an accurate model. A training set consisting of erroneous and related correction string pairs has […]

Improving K-Nearest Neighbor Efficacy for Farsi Text Classification

Articles. Authors: Mohammad Hossein Elahimanesh, Behrouz Minaei-Bidgoli, Hossein Malekinezhad. Abstract: Text classification is one of the common processes in the field of text mining. Because of the complex nature of the Farsi language, with its multi-part words and compound verbs, most text classification systems are not applicable to Farsi texts. K-Nearest Neighbors (KNN) is one of the […]

Extracting person names from ancient Islamic Arabic texts

Articles. Abstract: Recognizing and extracting named entities such as person names, location names, dates, and times from an electronic text is very useful for text mining tasks. Named entity recognition is a vital requirement in resolving problems in modern fields like question answering, abstracting systems, information retrieval, […]

An Introduction to Noor Diacritized Corpus

Articles. Authors: Akbar Dastani, Behrouz Minaei-Bidgoli, Mohammad Reza Vafaei, Hossein Juzi. Abstract: This article introduces the Noor Diacritized Corpus, which includes 28 million words extracted from about 360 hadith books. Despite many attempts to diacritize the Holy Quran, little effort has gone into diacritizing hadith texts. This corpus is therefore from […]

Automatic classification of Islamic Jurisprudence Categories

Articles. Authors: Mohammad Hossein Elahimanesh, Behrouz Minaei-Bidgoli, Hossein Malekinezhad. Abstract: This paper evaluates several text classification methods for classifying Islamic jurisprudence categories. Jurisprudence, which explores religious rules in religious texts, is one of the prominent Islamic sciences. This study uses the Islamic Jurisprudence corpus. This corpus consists of more than 17,000 text […]

A new framework for detecting similar texts in Islamic Hadith Corpora

Articles. Authors: Hossein Juzi, Ahmed Rabiei Zadeh, Ehsan Barati, Behrouz Minaei-Bidgoli. Abstract: Similarity detection is now one of the most widely applied text mining techniques, and several methods exist for it. This paper presents a new system for text similarity detection in the Islamic Large Hadith Corpus of the Computer Research Center of Islamic Science […]

A framework for detecting Holy Quran inside Arabic and Persian texts

Articles. Authors: Mohsen Shahmohammadi, Toktam Alizadeh, Mohammad Habibzadeh Bijani, Behrouz Minaei. Abstract: This paper describes the design and implementation of an intelligent Quranic engine that automatically detects Quranic verses in texts. The system falls within the scope of text mining, and its operations go beyond the usual multiple pattern matching […]

An Introduction to Noor Corpus and its Language Model

Articles. Authors: Mohammad Hossein Elahimanesh, Behrouz Minaei-Bidgoli, Mohammad Javad Gholami, Hossein Juzi. Abstract: In linguistics, a text corpus is defined as a large collection of text documents. Text corpora are used to extract the hidden laws of languages. As one application of statistical research and hidden-law extraction, language models are made to […]

Semantically Clustering of Persian Words

Articles. Authors: Alireza Arasteh, Mohammad Hossein Elahimanesh, Ahmad Sharif, Behrouz Minaei-Bidgoli. Abstract: Clustering is a data mining task that aims to divide a set of objects into groups so that similar objects fall into the same group and objects with different features are placed in separate groups. This paper presents a technique […]

Automatic Hypertext Construction in Persian Texts Using Self-Organizing Map Neural Network

Articles. Authors: Mahdieh HajiMohammadHosseini, Behrouz Minaei-Bidgoli. Abstract: The availability of electronic texts encourages users to study them. While reading, users may encounter various information needs and want more, or related, information about a particular word or phrase in the document. If so, it is necessary to search the entire […]