3DT-CM: A Low-complexity Cross-matching Algorithm for Large Astronomical Catalogues Using 3d-tree Approach
Author affiliations: College of Intelligence and Computing, Tianjin University, Tianjin 300350, China; Technical R&D Innovation Center, National Astronomical Data Center, Tianjin 300350, China; Xinjiang Astronomical Observatory, Chinese Academy of Sciences, Urumqi 830011, China
Publication: Research in Astronomy and Astrophysics (天文和天体物理学研究, English edition)
Year, volume, issue: 2023, Vol. 23, No. 10
Pages: 324-334
Subject classification: 07 [Science]; 070401 [Science: Astrophysics]; 0704 [Science: Astronomy]
Funding: Supported by the National Key Research and Development Program of China (2022YFF0711502) and the National Natural Science Foundation of China (NSFC) (12273025 and 12133010); also supported by the China National Astronomical Data Center (NADC), the CAS Astronomical Data Center, and the Chinese Virtual Observatory (China-VO).
Keywords: methods: data analysis; catalogs; techniques: miscellaneous
Abstract: Location-based cross-matching is a preprocessing step in astronomy that identifies records belonging to the same celestial body based on the angular distance formula. The traditional approach compares each record in one catalog with every record in the other catalog, a pairwise comparison with high computational complexity. To reduce the computational time, index partitioning methods are used to divide the sky into regions and perform local cross-matching. In addition, cross-matching algorithms have been ported to high-performance architectures to improve their efficiency. However, index partitioning methods and high-performance architectures only increase the degree of parallelism; they cannot decrease the complexity of the pairwise cross-matching algorithm itself. A better algorithm is needed to further improve cross-matching performance. In this paper, we propose a 3d-tree-based cross-matching algorithm that converts the angular distance formula into an equivalent 3D Euclidean distance and uses the 3d-tree method to reduce the overall computational complexity and to avoid boundary issues. Furthermore, we demonstrate the superiority of the 3d-tree approach over the 2d-tree method and implement it using multi-threading during both the construction and querying phases. We have experimentally evaluated the proposed 3d-tree cross-matching algorithm using publicly available catalog data. The results show that our algorithm, running on two 32-core CPUs, achieves performance equivalent to that of previous experiments conducted on a six-node CPU-GPU cluster.
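The core idea in the abstract, replacing the angular distance test with an equivalent 3D Euclidean test and answering it with a 3d-tree, can be illustrated with a minimal single-threaded sketch. This is not the authors' implementation: the function names (`radec_to_unit`, `chord_length`, `build`, `query_radius`, `cross_match`) and the input layout (lists of `(ra_deg, dec_deg)` tuples) are assumptions made for illustration, and the multi-threaded construction and querying described in the paper are omitted. The sketch relies on the fact that two unit vectors separated by an angle θ are a chord of length 2 sin(θ/2) apart, so an angular matching radius maps to a fixed Euclidean radius.

```python
import math

def radec_to_unit(ra_deg, dec_deg):
    """Map (RA, Dec) in degrees to a unit vector on the celestial sphere."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))

def chord_length(theta_deg):
    """Euclidean distance between two unit vectors separated by theta degrees."""
    return 2.0 * math.sin(math.radians(theta_deg) / 2.0)

def build(points, depth=0):
    """Build a 3d-tree over (xyz, record_id) pairs, cycling through x, y, z."""
    if not points:
        return None
    axis = depth % 3
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return (points[mid], axis,
            build(points[:mid], depth + 1),
            build(points[mid + 1:], depth + 1))

def query_radius(node, target, r, out):
    """Append to `out` the ids of all tree points within distance r of target."""
    if node is None:
        return
    (xyz, rec_id), axis, left, right = node
    if math.dist(xyz, target) <= r:
        out.append(rec_id)
    diff = target[axis] - xyz[axis]
    # Always descend the side containing the target; visit the other side
    # only if the splitting plane lies within r of the target.
    near, far = (left, right) if diff < 0 else (right, left)
    query_radius(near, target, r, out)
    if abs(diff) <= r:
        query_radius(far, target, r, out)

def cross_match(cat_a, cat_b, radius_arcsec):
    """Return (index_in_a, index_in_b) pairs closer than radius_arcsec."""
    tree = build([(radec_to_unit(ra, dec), j)
                  for j, (ra, dec) in enumerate(cat_b)])
    r = chord_length(radius_arcsec / 3600.0)
    matches = []
    for i, (ra, dec) in enumerate(cat_a):
        hits = []
        query_radius(tree, radec_to_unit(ra, dec), r, hits)
        matches.extend((i, j) for j in sorted(hits))
    return matches

# Hypothetical catalogs: one pair straddles RA = 0/360, one straddles the pole.
cat_a = [(359.9999, 0.0), (10.0, 89.9999)]
cat_b = [(0.0001, 0.0), (190.0, 89.9999), (50.0, 20.0)]
print(cross_match(cat_a, cat_b, radius_arcsec=2.0))
```

The two example pairs are far apart in (RA, Dec) coordinates but adjacent on the sphere, which is the kind of boundary case a 2D index over RA and Dec must handle specially and a 3D Euclidean index handles without extra logic.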