Semantic Recognition of a Data Structure in Big-Data
Author affiliations: Laboratory LIPN-UMR 7030-CNRS, University Paris 13, Sorbonne Paris Cité, Villetaneuse, France; Company Talend, Suresnes, France
Publication: Journal of Computer and Communications
Year, volume, issue: 2014, Vol. 2, No. 9
Pages: 93-102
Subject classification: 1002 [Medicine - Clinical Medicine]; 100214 [Medicine - Oncology]; 10 [Medicine]
Keywords: Data Quality; Big-Data; Semantic Data Profiling; Data Dictionary; Regular Expressions; Ontology
Abstract: Data governance is becoming increasingly important in business and government. Good data governance improves interactions between employees of one or more organizations. Data quality is a major challenge because the cost of poor quality can be very high, so managing data quality becomes an absolute necessity within an organization. To improve data quality in a Big-Data source, the purpose of this paper is to add semantics to the data and to help users recognize the Big-Data schema. The originality of the approach lies in the semantic aspect it offers: it detects issues in the data and proposes a data schema by applying semantic data profiling.
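
The abstract does not describe the profiling procedure itself. As a rough illustration only, the sketch below shows one way regular expressions and a data dictionary (both named in the keywords) might be combined to guess a semantic category for each column and flag values that do not fit. The category names, patterns, dictionary contents, and function names are all hypothetical and are not taken from the paper.

```python
import re
from collections import Counter

# Hypothetical regex-based semantic categories (assumed, not from the paper).
REGEX_CATEGORIES = {
    "EMAIL": re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$"),
    "ISO_DATE": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
}

# Hypothetical data dictionary: category name -> set of known values.
DICTIONARY_CATEGORIES = {
    "COUNTRY": {"france", "germany", "spain"},
}


def categorize(value):
    """Return the first semantic category matching the value, or None."""
    text = value.strip()
    for name, pattern in REGEX_CATEGORIES.items():
        if pattern.match(text):
            return name
    for name, words in DICTIONARY_CATEGORIES.items():
        if text.lower() in words:
            return name
    return None


def profile_column(values):
    """Return (dominant category, values that do not match it)."""
    labels = [categorize(v) for v in values]
    counts = Counter(label for label in labels if label is not None)
    if not counts:
        # No category recognized: every value is reported as an issue.
        return None, list(values)
    dominant = counts.most_common(1)[0][0]
    issues = [v for v, label in zip(values, labels) if label != dominant]
    return dominant, issues


def propose_schema(columns):
    """Map each column name to its proposed category and suspect values."""
    return {name: profile_column(values) for name, values in columns.items()}


if __name__ == "__main__":
    sample = {
        "col_1": ["a@b.com", "x.y@z.org", "not-an-email"],
        "col_2": ["France", "Spain", "Frnace"],
    }
    for column, (category, issues) in propose_schema(sample).items():
        print(column, category, issues)
```

In this sketch the most frequent category in a column is taken as its proposed semantic type, and any value that falls outside that category is reported as a potential data quality issue. A real profiler would likely use a much richer dictionary or an ontology and a frequency threshold rather than a simple majority.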