Application of K-means Algorithm to Web Text Mining Based on Average Density Optimization

FAN Guang-Ling, LIU Yu-Wei, TONG Jan-Qiang, ZHAO Sheng-Hai, NIE Zhi-Quan. Journal of Digital Information Management, 14(1), 2016. http://dline.info/fpaper/jdim/v14i1/v14i1_6.pdf

Text information has grown explosively with the advent of the Internet. This growth has created a problem of abundant information but relatively scarce knowledge, so finding target information rapidly and accurately has become a research hotspot. This study presents a method to improve the accuracy and integrity of web text clustering. First, the dk-means algorithm was modified, and a k-means algorithm based on average density optimization (the Adk-means algorithm) was proposed. Second, a web text clustering model was designed, and the key technologies of web text clustering were studied in depth. Finally, the Adk-means algorithm was applied to the web text clustering model to cluster and classify web text. Experiments showed that the purity and mutual trust values of the Adk-means algorithm are higher than those of the dk-means algorithm, and that the modified algorithm is greatly improved in the accuracy, integrity, and performance of the partitioned clusters. When clustering text, the Adk-means algorithm produces classes with high internal cohesion and similarity. The results are applicable to text clustering. When used in Internet text searching, the Adk-means algorithm is a highly efficient information retrieval technique that can improve search speed and accuracy.
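The abstract reports purity as one of its evaluation measures but does not define it. As a rough illustration, the sketch below computes purity under its common textbook definition: each cluster is credited with its most frequent true class, and the credited documents are divided by the total number of documents. The function name `purity` and the toy labels are hypothetical and are not taken from the paper, whose exact evaluation procedure may differ.

```python
from collections import Counter


def purity(cluster_ids, class_labels):
    """Return clustering purity under the standard definition (an assumption here).

    cluster_ids  -- cluster assignment for each document
    class_labels -- ground-truth class for each document (same order)
    """
    if len(cluster_ids) != len(class_labels):
        raise ValueError("cluster_ids and class_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one document is required")

    # Count how often each true class occurs inside each cluster.
    counts_by_cluster = {}
    for cluster, label in zip(cluster_ids, class_labels):
        counts_by_cluster.setdefault(cluster, Counter())[label] += 1

    # Each cluster contributes the size of its majority class.
    majority_total = sum(max(c.values()) for c in counts_by_cluster.values())
    return majority_total / len(cluster_ids)


# Hypothetical toy data: six documents, two clusters, two true classes.
toy_clusters = [0, 0, 0, 1, 1, 1]
toy_labels = ["sports", "sports", "news", "news", "news", "sports"]
toy_score = purity(toy_clusters, toy_labels)  # 4 of 6 documents match their cluster's majority class
```

Purity ranges from just above 0 to 1, and it rises trivially as the number of clusters grows, which is one reason it is usually reported alongside a second measure, as the abstract does.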