Volume 06 Number 1 March 2015

    
N-Grams Based Linguistic Search Engine

Peng Zhu

https://doi.org/

Abstract With the rapid development of the Internet, the scalability and diversity of Web text play an important role in English research and teaching. This paper presents an N-gram-based English search engine for words in context, which incorporates information retrieval, part-of-speech tagging, named entity recognition, word similarity, and other natural language processing technologies with Web-based text data. The engine...


Improving the Performance of Text Categorization using N-gram Kernels

Varsha K. V., Santhosh Kumar C., Reghu Raj P. C.

https://doi.org/

Abstract Kernel methods are known for their robustness in handling large feature spaces and are widely used as an alternative to methods based on external feature extraction in tasks such as classification and regression. This work applies different string kernels, such as n-gram kernels and gappy n-gram kernels, to text classification. It studies how kernel concatenation and feature combination...


Machine Translation of Different Systemic Languages Using Apertium Platform (with an Example of English and Kazakh Languages)

Shormakova Assem, Sundetova Aida

https://doi.org/

Abstract This paper describes the initial steps in a project to build a prototype of a free/open-source rule-based machine translation system that translates from English to Kazakh. The goal of this article is to examine the grammatical and lexical problems that we often face while translating English texts; during the translation process there is no detailed statement of grammatical or lexical phenomenon...


Improving Information Acquisition via Text Mining for Efficient E-Governance

Adesesan B. Adeyemo, Adebola K. Ojo

https://doi.org/

Abstract In this paper we propose a framework for integrating text mining with E-Governance. We suggest that users of electronic governance can describe their interests with text terms, which can then be processed for clustering and term extraction. The words expressed by users are tracked and processed, which makes it possible to generate content. We have provided the...