Journal of Electronic Systems

Research on Person Entity Extraction from Ancient Sources

Yihong Ma, Qingkai Zeng, Tianwen Jiang, Liang Cai, Meng Jiang
School of Finance, Shanghai University of Finance and Economics, Shanghai, China; University of Notre Dame, Notre Dame, IN 46556, USA

Abstract: This work addresses information extraction from Chinese historiography. The main challenge is that the language is low-resource: deep learning requires large amounts of annotated data and becomes impractical when such data is unavailable. Subject experts curated a set of person entities, together with their profile attributes and relations, from two documents. We introduce a pattern-based bootstrapping approach that extracts this information starting from a very small number (i.e., 1 or 2) of seed patterns. The experimental results show both the effectiveness and the limitations of the iterative method.

Keywords: Information Extraction, Entity Profiling, Classical Chinese, Textual Pattern, Bootstrapping

DOI: https://doi.org/10.6025/jes/2020/10/3/102-113
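
The pattern-based bootstrapping idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`apply_patterns`, `induce_patterns`, `bootstrap`), the thresholds, and the toy English sentences are all hypothetical and invented for illustration. A "pattern" here is assumed to be simply the literal text found between a person name and an attribute value.

```python
import re
from collections import defaultdict

# Invented toy corpus (an assumption for illustration, not data from the paper).
SENTENCES = [
    "Alder was born in Northtown",
    "Birch was born in Southport",
    "Alder came from Northtown",
    "Birch came from Southport",
    "Cedar came from Eastvale",
    "Cedar admired Birch",
]

def apply_patterns(sentences, patterns):
    """Return every (person, value) pair matched by any known pattern."""
    pairs = set()
    for middle in patterns:
        # A pattern matches "<word><middle><word>", e.g. "Alder was born in Northtown".
        regex = re.compile(r"(\w+)" + re.escape(middle) + r"(\w+)")
        for sentence in sentences:
            for m in regex.finditer(sentence):
                pairs.add((m.group(1), m.group(2)))
    return pairs

def induce_patterns(sentences, pairs, min_support=2, max_tokens=4):
    """Propose patterns from the text that appears between known pairs."""
    support = defaultdict(set)
    for person, value in pairs:
        regex = re.compile(
            r"\b" + re.escape(person) + r"\b(.+?)\b" + re.escape(value) + r"\b"
        )
        for sentence in sentences:
            m = regex.search(sentence)
            if m and 1 <= len(m.group(1).split()) <= max_tokens:
                support[m.group(1)].add((person, value))
    # Keep only contexts confirmed by at least min_support distinct pairs.
    return {mid for mid, who in support.items() if len(who) >= min_support}

def bootstrap(sentences, seed_patterns, max_iters=5):
    """Alternate between extracting pairs and inducing patterns until stable."""
    patterns, pairs = set(seed_patterns), set()
    for _ in range(max_iters):
        new_pairs = apply_patterns(sentences, patterns) - pairs
        pairs |= new_pairs
        new_patterns = induce_patterns(sentences, pairs) - patterns
        patterns |= new_patterns
        if not new_pairs and not new_patterns:
            break  # nothing new was learned in this round
    return patterns, pairs

if __name__ == "__main__":
    learned_patterns, learned_pairs = bootstrap(SENTENCES, {" was born in "})
    print(sorted(learned_patterns))
    print(sorted(learned_pairs))
```

Starting from the single seed pattern " was born in ", the loop on this toy corpus learns " came from " as a second pattern and then uses it to pick up the pair (Cedar, Eastvale). The `min_support` threshold is one simple guard against the drift that iterative methods are prone to: a context is only promoted to a pattern once several already-known pairs confirm it.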


