Privacy-preserving integration of multiple institutional data for single-cell type identification with scPrivacy
Author affiliations: Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China; Department of AI, WeBank, Shenzhen 518055, China; Translational Medical Center for Stem Cell Therapy and Institution for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China; Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai 201210, China
Publication: Science China (Life Sciences) (Chinese title: 中国科学(生命科学英文版))
Year, volume, issue: 2023, Vol. 66, No. 5
Pages: 1183-1195
Subject classification: 0710 [Science - Biology]; 0711 [Science - Systems Science]; 07 [Science]
Funding: Supported by the National Key Research and Development Program of China (2021YFF1200900, 2021YFF1201200), the National Natural Science Foundation of China (31970638, 61572361), the Shanghai Artificial Intelligence Technology Standard Project (19DZ2200900), the Shanghai Shuguang Scholars Project, the WeBank Scholars Project, and the Fundamental Research Funds for the Central Universities.
Subject terms: preserving; integration; utilize
Abstract: The rapid accumulation of large-scale single-cell RNA-seq datasets from multiple institutions presents remarkable opportunities for automatic cell annotation through integrative analyses. However, the privacy issue has long existed yet been ignored: data regulation laws prohibit data transmission across institutions, which limits our ability to access and utilize all the reference datasets distributed in different institutions globally. To this end, we present scPrivacy, the first generalized prototype for automatic single-cell type identification that facilitates single-cell annotation through data privacy-preserving collaboration. We evaluated scPrivacy on a comprehensive set of publicly available benchmark datasets for single-cell type identification, simulating the scenario in which reference datasets are rapidly generated and distributed across multiple institutions but cannot be integrated directly or exposed to one another because of data privacy regulations. The evaluation demonstrated its effectiveness, time efficiency, and robustness for privacy-preserving integration of multi-institutional datasets in single-cell annotation.