Cross-Context News Corpus for Protest Event-Related Knowledge Base Construction
Author affiliation: Koç University, Rumelifeneri Yolu, Sariyer, Istanbul 34450, Turkey
Publication: Data Intelligence (in English)
Year/Volume/Issue: 2021, Vol. 3, No. 2
Pages: 308-335
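Subject classification: 1205 [Management: Library, Information and Archives Management]; 0502 [Literature: Foreign Languages and Literatures]; 050201 [Literature: English Language and Literature]; 05 [Literature]; 0812 [Engineering: Computer Science and Technology (degrees in Engineering or Science may be awarded)]
Topics: Event extraction; Text classification; Political science; Social science; News; Contentious politics; Protests; Event coreference resolution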
Abstract: We describe a gold standard corpus of protest events that comprises various local and international English-language sources from various *** corpus contains document-, sentence-, and token-level *** corpus facilitates creating machine learning models that automatically classify news articles and extract protest event-related information, constructing knowledge bases that enable comparative social and political science *** each news source, the annotation starts with random samples of news articles and continues with samples drawn using active *** batch of samples is annotated by two social and political scientists, adjudicated by an annotation supervisor, and improved by identifying annotation errors *** found that the corpus possesses the variety and quality that are necessary to develop and benchmark text classification and event extraction systems in a cross-context setting, contributing to the generalizability and robustness of automated text processing *** corpus and the reported results will establish a common foundation in automated protest event collection studies, which is currently lacking in the literature.