The paper describes an efficient direct method for solving the equation Ax = b, where A is a sparse matrix, on the Intel® Xeon Phi™ coprocessor. The main challenge on such a system is to engage all available threads (about 240) while reducing OpenMP* synchronization overhead, which is very expensive for hundreds of threads. The method decomposes A into a product of lower-triangular, diagonal, and upper-triangular matrices and then solves the three resulting subsystems. The main idea is based on the hybrid parallel algorithm used in the Intel® Math Kernel Library Parallel Direct Sparse Solver for Clusters [1]. Our implementation uses a static scheduling algorithm during the factorization step to reduce OpenMP synchronization overhead. To engage all available threads effectively, a three-level approach to parallelization is used. Furthermore, we demonstrate that our implementation can perform up to 100 times better in the factorization step and up to 65 times better in terms of overall performance on the 240 threads of the Intel® Xeon Phi™ coprocessor.
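The decomposition-then-solve structure described above can be illustrated with a minimal dense sketch. This is not the paper's sparse, multithreaded implementation: it assumes a small dense matrix that needs no pivoting, and the function names `ldu_factor` and `ldu_solve` are hypothetical, chosen only for this illustration.

```python
import numpy as np

def ldu_factor(A):
    """Return (L, d, U) with A = L @ diag(d) @ U.

    L is unit lower triangular, U is unit upper triangular.
    Assumption: no pivoting is needed (e.g. A is diagonally dominant).
    """
    S = np.array(A, dtype=float)      # working copy, overwritten by updates
    n = S.shape[0]
    L, U, d = np.eye(n), np.eye(n), np.zeros(n)
    for k in range(n):
        d[k] = S[k, k]
        if d[k] == 0.0:
            raise ZeroDivisionError("zero pivot; this sketch does no pivoting")
        L[k + 1:, k] = S[k + 1:, k] / d[k]
        U[k, k + 1:] = S[k, k + 1:] / d[k]
        # Update the trailing block with the contribution of pivot k.
        S[k + 1:, k + 1:] -= np.outer(L[k + 1:, k], S[k, k + 1:])
    return L, d, U

def ldu_solve(L, d, U, b):
    """Solve (L diag(d) U) x = b as three subsystems."""
    n = len(d)
    y = np.array(b, dtype=float)
    for i in range(n):                # forward substitution: L y = b
        y[i] -= L[i, :i] @ y[:i]
    z = y / d                         # diagonal solve: diag(d) z = y
    x = z.copy()
    for i in range(n - 1, -1, -1):    # backward substitution: U x = z
        x[i] -= U[i, i + 1:] @ x[i + 1:]
    return x

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
L, d, U = ldu_factor(A)
x = ldu_solve(L, d, U, b)
```

In a sparse solver the same three phases apply, but the factors are stored in compressed form and the loop over `k` is reorganized so that independent parts of the matrix can be factored by different threads.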
This paper describes a method of calculating the Schur complement of a sparse positive definite matrix A. The main idea of this approach is to represent matrix A in the form of an elimination tree using a reordering algorithm such as METIS, and to place the columns/rows for which the Schur complement is needed into the top node of the elimination tree. Any problem with a degenerate part of the initial matrix can be resolved with the help of iterative refinement. The proposed approach is close to the “multifrontal” one implemented by Ian Duff and others in the 1980s. The Schur complement computations described in this paper are available in the Intel® Math Kernel Library (Intel® MKL). In this paper we present the algorithm for Schur complement computations, experiments that demonstrate a negligible increase in the number of elements in the factored matrix, and a comparison with existing alternatives.
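As a rough illustration of the quantity being computed, the sketch below forms the Schur complement S = A22 - A21 A11^{-1} A12 of a small dense symmetric positive definite matrix. Ordering the selected rows/columns last loosely mirrors placing them in the top node of the elimination tree, but this is an assumption-laden dense example, not the sparse algorithm of the paper or the Intel MKL interface; the name `schur_complement` and its arguments are hypothetical.

```python
import numpy as np

def schur_complement(A, schur_idx):
    """Schur complement of A with respect to the rows/columns in schur_idx.

    Assumption: A is a dense symmetric positive definite matrix.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    schur_idx = np.asarray(schur_idx)
    rest = np.setdiff1d(np.arange(n), schur_idx)
    # Order the eliminated unknowns first and the Schur block last.
    perm = np.concatenate([rest, schur_idx])
    P = A[np.ix_(perm, perm)]
    m = len(rest)
    A11, A12, A22 = P[:m, :m], P[:m, m:], P[m:, m:]
    C = np.linalg.cholesky(A11)       # A11 = C @ C.T
    W = np.linalg.solve(C, A12)       # W = C^{-1} A12
    # For symmetric A, A21 A11^{-1} A12 equals W.T @ W.
    return A22 - W.T @ W

A = np.array([[4.0, 1.0, 0.0, 1.0],
              [1.0, 3.0, 1.0, 0.0],
              [0.0, 1.0, 2.0, 1.0],
              [1.0, 0.0, 1.0, 3.0]])
S = schur_complement(A, [1, 3])
```

The result is what remains of the selected block after the other unknowns have been eliminated, which is why a partial factorization that stops before the last block leaves the Schur complement in place.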
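Introduction: The lack of consensus on the diagnosis of vascular cognitive impairment (VCI), reflected in the use of many different assessment protocols, has hindered progress in understanding and treating it. A large number of clinicians and researchers from multiple countries took part in the two-phase Vascular Impairment of Cognition Classification Consensus Study (VICCCS), which aimed to reach agreement on the diagnostic principles (VICCCS-1) and the diagnostic protocol (VICCCS-2) for VCI. This article presents VICCCS-2. Methods: The principles agreed in VICCCS-1 and published diagnostic guidelines were used as reference points for an online Delphi survey aimed at reaching consensus on the clinical diagnosis of VCI. Results: Six survey rounds were conducted, with 65 to 79 experts participating in each round. They reached consensus on the VICCCS-revised diagnostic guidelines for mild and major VCI, and endorsed the neuropsychological assessment protocols and neuroimaging recommendations published by the National Institute of Neurological Disorders and Stroke–Canadian Stroke Network (NINDS-CSN). Discussion: VICCCS-2 recommends standardized use of the NINDS-CSN-recommended neuropsychological and imaging assessment protocols for diagnosing VCI, in order to facilitate research collaboration.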