Improving cis-regulatory elements modeling by consensus scaffolded mixture models
Author affiliations: Department of Computer Science and Technology, Tsinghua University; MOE Key Laboratory of Bioinformatics and the Bioinformatics Division, Department of Automation, Tsinghua University
Publication: Science China (Information Sciences) (English edition of 中国科学:信息科学)
Year, volume, issue: 2013, Vol. 56, No. 1
Pages: 219-229
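Subject classification: 0710 [Science - Biology]; 07 [Science]; 08 [Engineering]; 09 [Agriculture]; 071007 [Science - Genetics]; 0901 [Agriculture - Crop Science]; 0836 [Engineering - Bioengineering]; 090102 [Agriculture - Crop Genetics and Breeding]
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60703058, 30625012, 60721003)
Keywords: position weight matrix; cis-regulatory elements; mixture models; frequent pattern mining; EM algorithm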
Abstract: A position weight matrix (PWM) is widely accepted as a probabilistic representation for modeling protein-DNA binding *** studies showed that for factors which bind to divergent binding sites, mixtures of multiple PWMs improve *** propose a consensus scaffolded mixture PWM (CSM) model to improve cis-regulatory elements modeling by allowing overlapping components represented by a set of PWMs, each of which corresponds to a binding pattern and is scaffolded by a degenerate *** addition, we propose a learning algorithm that involves an initial structure learning stage based on frequent pattern mining and a refining stage based on the expectation maximization (EM) *** assess the merits of CSM using three independent *** a case study of transcription factor Leu3, the derived CSM models agree with conventional mixtures but show better fitness according to Fermi-Dirac *** of the human-mouse conservation of predicted binding sites of 83 JASPAR transcription factors (TFs) shows that the CSM is as good as or better than the simple mixture, the context-specific independent (CSI) mixture, and the single PWM model, for 83%, 84%, and 75% of the cases, ***-fold cross validation on 46 TRANSFAC datasets shows that the CSM model generalizes better than the other mixture models.
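To make the refining stage mentioned in the abstract more concrete, the following is a minimal sketch of EM for a plain mixture of PWMs over aligned, equal-length binding sites. It is not the CSM algorithm itself: the consensus scaffolding and the frequent pattern mining stage are not reproduced, and the initial assignment of sites to components is simply supplied by hand. The function names (`fit_pwm`, `em_mixture`), the pseudocount value, and the toy sites are all assumptions made for illustration.

```python
import math

ALPHABET = "ACGT"

def fit_pwm(sites, weights, pseudocount=0.5):
    """Weighted per-column base frequencies (pseudocount is an assumed value)."""
    pwm = []
    for j in range(len(sites[0])):
        counts = {b: pseudocount for b in ALPHABET}
        for site, w in zip(sites, weights):
            counts[site[j]] += w
        total = sum(counts.values())
        pwm.append({b: counts[b] / total for b in ALPHABET})
    return pwm

def log_likelihood(site, pwm):
    """Log probability of one site under one PWM (columns independent)."""
    return sum(math.log(col[base]) for base, col in zip(site, pwm))

def em_mixture(sites, init_resp, n_iter=20):
    """EM for a mixture of PWMs.

    init_resp[k][i] is the initial responsibility of component k for site i;
    in this sketch it stands in for whatever an initial structure learning
    stage would provide.
    """
    resp = [list(r) for r in init_resp]
    n = len(sites)
    for _ in range(n_iter):
        # M-step: re-estimate mixing weights and one PWM per component.
        priors = [max(sum(r) / n, 1e-12) for r in resp]
        pwms = [fit_pwm(sites, r) for r in resp]
        # E-step: posterior responsibility of each component for each site.
        for i, site in enumerate(sites):
            logs = [math.log(p) + log_likelihood(site, m)
                    for p, m in zip(priors, pwms)]
            top = max(logs)
            norm = sum(math.exp(v - top) for v in logs)
            for k, v in enumerate(logs):
                resp[k][i] = math.exp(v - top) / norm
    return priors, pwms, resp

# Toy data: two hypothetical binding patterns, three sites each.
sites = ["ACGTAC", "ACGTAA", "ACGTCC", "TTGCAA", "TTGCAT", "TTGGAA"]
init_resp = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]
priors, pwms, resp = em_mixture(sites, init_resp)
```

Responsibilities are computed in log space with the maximum subtracted, so longer sites do not underflow. The initial assignment matters because EM only converges to a local optimum.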