Path-partitioned encoding supports wildcard-awareness twig queries
Path-partitioned encoding supports wildcard-awareness twig queries作者机构:School of ComputerHuazhong University of Science and Technology School of ComputerHuanggang Normal University Technic Pre-research DepartmentDameng Database Company Limited
出 版 物:《Journal of Shanghai University(English Edition)》 (上海大学学报(英文版))
年 卷 期:2009年第13卷第5期
页 面:363-374页
学科分类:0810[工学-信息与通信工程] 12[管理学] 1201[管理学-管理科学与工程(可授管理学、工学学位)] 0805[工学-材料科学与工程(可授工学、理学学位)] 0701[理学-数学] 0812[工学-计算机科学与技术(可授工学、理学学位)]
基 金:supported by the National High-Tech Research and Development Plan of China (Grant No.2005AA4Z3030)
主 题:XML tree pattern structural join encoding scheme twig query
摘 要:Finding all occurrences of a twig query in an XML database is a core operation for efficient evaluation of XML queries. It is important to effiectively handle twig queries with wildcards. In this paper, a novel path-partitioned encoding scheme is proposed for XML documents to capture paths of all elements, and a twig query is modeled as an XPattern extended from tree pattern. After definition, simplification, normalization, verification and initialization of the XPattern, both work sets and a join plan are generated. According to these measures, an effiective algorithm to answer for a twig query, called DMTwig, is designed without unnecessary elements and invalid structural joins. The algorithm can adaptively deal with twig queries with branch ([ ]), child edge (/), descendant edge (//), and wildcard (*) synthetically. We show that path-partitioned encoding scheme and XPattern guarantee the I/O and CPU optimality for twig queries. Experiments on representative data set indicate that the proposed solution performs significantly.