<?xml version="1.0" encoding="UTF-8"?>
<record>
  <title>Author Profiling in Arabic Tweets: An Approach based on Multi-Classification with Word and Character Features</title>
  <journal>Journal of E-Technology</journal>
  <author>Yutong Sun, Hui Ning, Kaisheng Chen, Leilei Kong, Yunpeng Yang, Jiexi Wang, Haoliang Qi</author>
  <volume>11</volume>
  <issue>2</issue>
  <year>2020</year>
  <doi>https://doi.org/10.6025/jet/2020/11/2/60-63</doi>
  <url>http://www.dline.info/jet/fulltext/v11n2/jetv11n2_3.pdf</url>
  <abstract>This paper addresses the author profiling task of FIRE 2019 (Forum for Information Retrieval
Evaluation), which involves automatically identifying the age, gender, and language variety of the authors of Arabic tweets.
We treat author profiling as a multi-class classification problem. We extract word-level and character-level TF-IDF features
and train a logistic regression classifier to predict the labels. In the final results, our method performs well on age
prediction, with an accuracy of 0.6250. We also obtain accuracies of 0.5111 and 0.9604 for gender and language variety
classification, respectively. In our experiments, we tested different feature combinations and adjusted the feature
parameters to evaluate the system. Combining word and character features improves prediction accuracy and enhances
system performance significantly.</abstract>
</record>
