<?xml version="1.0" encoding="UTF-8"?>
<record>
  <title>DVM-based Topic Detection for Microblog</title>
  <journal>Journal of Digital Information Management</journal>
  <author>LV Jia-Guo, JIANG Xiu-Ying, CHI Qing-Yun, Zhang Wei, JOCSHI Allen</author>
  <volume>14</volume>
  <issue>6</issue>
  <year>2016</year>
  <doi></doi>
  <url>http://dline.info/fpaper/jdim/v14i6/jdimv14i6_4.pdf</url>
  <abstract>With the rise of microblogs, topic detection
in microblog posts has become a hot topic in natural language
processing and text mining. Unlike regular text,
microblog posts are short and idiomatic.
Each post contains little information, which makes
topic detection highly challenging. To address the
issue of topic detection in microblogs, a new single-pass
algorithm based on a double-vector model (DVM), named
SinglePass_DM, is proposed. First, a support vector machine
(SVM) based algorithm is employed to filter irrelevant
posts, thereby improving the accuracy of the algorithm.
As for the representation model, on the basis of the
traditional vector space model, a DVM that includes an event
vector and a keyword vector is put forward. Subsequently, a
combination of Jaccard, cosine, and semantic similarity
is used for similarity computation. Finally, some structural
characteristics of microblog posts are used to support
topic detection. To validate the performance
of the proposed algorithm, experiments are conducted on
a real-world dataset. Experimental results show that,
compared with three benchmark algorithms, SinglePass,
Agglomerative Hierarchical Clustering (AHC), and Density-Based
Spatial Clustering of Applications with Noise (DBSCAN),
SinglePass_DM performs considerably better.</abstract>
</record>
