Volume 02 Number 2 June 2011

    
Towards Multi-Level Hybrid Features To Resolve Mixed Entities

Ingyu Lee, Byung-Won On

https://doi.org/

Abstract With the popularity of the Internet, a tremendous amount of unstructured information has become available. Consequently, extracting related information from large corpora has become popular and has been studied by many researchers. However, synonymy and polysemy, misspellings, and the use of abbreviations make the task difficult. Resolving these confusions is known as the Entity Resolution problem. In this paper, we propose a multi-level weighted...


Improvement in Automatic Classification of Persian Documents by Means of Support Vector Machine and Representative Vector

Jafari Ashkan, Izadi Hamed, Hossennejad Mihan

https://doi.org/

Abstract A Representative Vector is a kind of vector that contains related words and the degree of their relationships. In this paper, the effect of using this kind of vector on the automatic classification of Persian documents is examined. In this method, documents are first preprocessed, and extra words as well as word stems are identified. Next, through one of the known ways, some...


Inflectional Morphology, Reverse Similarity and Data Mining – Finding and Applying Compact and Transparent Descriptions of Verb Systems of Natural Languages

Alfred Holl

https://doi.org/

Abstract Under the term “data mining”, the field of computer science includes many different techniques for data analysis, among them methods of cluster analysis. In the approach presented, a special method is designed for the analysis of inflectional systems. The algorithm is independent of individual natural languages and parts of speech. It finds two types of clusters: morphologically homogeneous ones, which...


Arabic Language in the Context of Information Extraction Task

Meshrif Alruily, Aladdin Ayesh, Hussien Zedan

https://doi.org/

Abstract In the past few years, researchers have started paying attention to the Arabic language. In this paper, we review information extraction systems that were developed for Arabic to extract predefined entities. These systems are compared in terms of their performance in extracting the common entities, the approach used (rule-based or machine learning), and type of...