CDCAT: A Multi-Language Cross-Document Entity and Event Coreference Annotation Tool

Authors: Yang Xu; Boming Xia; Yueliang Wan; Fan Zhang; Jiabo Xu; Huansheng Ning

Affiliations: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China; Beijing Engineering Research Center for Cyberspace Data Analysis and Applications, Beijing 100083, China; Research Institute with Run Technologies Company, Ltd., Beijing 100192, China; School of Information Engineering, Xinjiang Institute of Engineering, Urumqi 830091, China

Publication: Tsinghua Science and Technology (Journal of Tsinghua University (Science and Technology), English edition)

Year, volume, issue: 2022, Vol. 27, No. 3

Pages: 589-598

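Subject classification: 0808 [Engineering - Electrical Engineering]; 08 [Engineering]; 081203 [Engineering - Computer Application Technology]; 0835 [Engineering - Software Engineering]; 0812 [Engineering - Computer Science and Technology (engineering or science degrees may be conferred)]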

Funding: Supported by the National Natural Science Foundation of China (No. 61872038) and the Fundamental Research Funds for the Central Universities (No. FRF-GF-19-020B).

Keywords: event coreference; entity coreference; manual annotation tool; natural language processing

Abstract: A tool for the manual annotation of cross-document entity and event coreferences, one that helps annotators label mention coreference relations in text, is essential for the annotation of coreference corpora. To the best of our knowledge, CROss-document Main Events and entities Recognition (CROMER) is the only open-source manual annotation tool available for cross-document entity and event coreferences. However, CROMER lacks multi-language support and extensibility. Moreover, to label cross-document mention coreference relations, CROMER requires the support of another intra-document coreference annotation tool known as Content Annotation Tool, which is now unavailable. To address these problems, we introduce Cross-Document Coreference Annotation Tool (CDCAT), a new multi-language open-source manual annotation tool for cross-document entity and event coreference, which can handle different input/output formats, preprocessing functions, languages, and annotation systems. Using this new tool, annotators can label a reference relation with only two mouse clicks. Best practice analyses reveal that annotators can reach an annotation speed of 0.025 coreference relations per second on a corpus with a coreference density of 0.076 coreference relations per word. As the first multi-language open-source cross-document entity and event coreference annotation tool, CDCAT can theoretically achieve higher annotation efficiency than CROMER.
