GenBase: A Nucleotide Sequence Database
Author affiliations: National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA; University of Chinese Academy of Sciences, Beijing 100049, China
Publication: Genomics, Proteomics & Bioinformatics (基因组蛋白质组与生物信息学报, English edition)
Year, volume, issue: 2024, Vol. 22, No. 3
Pages: 107-112
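Subject classification: 08 [Engineering]; 0812 [Engineering: Computer Science and Technology (engineering or science degrees may be conferred)]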
Funding: Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB38030200); the National Key R&D Program of China (Grant No. 2021YFF0703701); the Professional Association of the Alliance of International Science Organizations (Grant No. ANSO-PA-2023-07); the International Partnership Program of the Chinese Academy of Sciences (Grant No. 161GJHZ2022002MI); and the Open Biodiversity and Health Big Data Initiative of the International Union of Biological Sciences (IUBS)
Keywords: Nucleotide sequence; Database; GenBase; GenBank; INSDC
Abstract: The rapid advancement of sequencing technologies poses challenges in managing the large volume and exponential growth of sequence data efficiently and on *** address this issue, we present GenBase (https://***/genbase), an open-access data repository that follows the International Nucleotide Sequence Database Collaboration (INSDC) data standards and structures, for efficient nucleotide sequence archiving, searching, and *** a core resource within the National Genomics Data Center (NGDC) of the China National Center for Bioinformation (CNCB; https://***), GenBase offers a bilingual submission pipeline and services, as well as local submission assistance in *** also provides a unique Excel format for metadata description and feature annotation of nucleotide sequences, along with a real-time data validation system to streamline sequence *** of April 23, 2024, GenBase received 68,251 nucleotide sequences and 689,574 annotated protein sequences across 414 species from 2319 *** of these, 63,614 (93%) nucleotide sequences and 620,640 (90%) annotated protein sequences have been released and are publicly accessible through GenBase’s web search system, File Transfer Protocol (FTP), and Application Programming Interface (API). Additionally, in collaboration with INSDC, GenBase has constructed an effective data exchange mechanism with GenBank and started sharing released nucleotide ***, GenBase integrates all sequences from GenBank with daily updates, demonstrating its commitment to actively contributing to global sequence data management and sharing.
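
The release percentages quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name release_share is hypothetical and not part of GenBase, and the counts are the ones reported in the abstract as of April 23, 2024.

```python
def release_share(released: int, received: int) -> int:
    """Return the released fraction as a whole-number percentage."""
    if received <= 0:
        raise ValueError("received must be a positive count")
    return round(100 * released / received)


# Counts copied from the abstract (hypothetical variable names).
nucleotide_pct = release_share(63_614, 68_251)
protein_pct = release_share(620_640, 689_574)

print(f"Nucleotide sequences released: {nucleotide_pct}%")
print(f"Annotated protein sequences released: {protein_pct}%")
```

Both values round to the figures given in the abstract (93% and 90%), so the reported counts and percentages are mutually consistent.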