Accurate Plant MicroRNA Prediction Can Be Achieved Using Sequence Motif Features
Author affiliations: Computer Science, The College of Sakhnin, Sakhnin, Israel; The Institute of Applied Research, The Galilee Society, Shefa-'Amr, Israel; Molecular Biology and Genetics, Izmir Institute of Technology, Urla, Turkey; Bionia Incorporated, IZTEKGEB A8, Urla, Turkey
Publication: Journal of Intelligent Learning Systems and Applications (English-language journal)
Year, volume, issue: 2016, Vol. 8, No. 1
Pages: 9-22
Subject classification: 1002 [Medicine - Clinical Medicine]; 100214 [Medicine - Oncology]; 10 [Medicine]
Subjects: MicroRNA Prediction Plant Bioinformatics Machine Learning Sequence Motifs
Abstract: MicroRNAs (miRNAs) are short (~21 nt) nucleotide sequences that are either co-transcribed during the production of mRNA or organized in intergenic regions transcribed by RNA polymerase II. Drosha in animals and DCL1 in plants recognize pre-miRNAs, which set themselves apart by their characteristic stem-loop (hairpin) structure. This structure appears to be important for their recognition during maturation, the process that yields functional mature miRNAs. A large body of research is available on computational pre-miRNA detection in animals, but less on detection within the plant kingdom. Pre-miRNAs are usually predicted with machine learning approaches, which require converting each pre-miRNA into a set of computable features; many such features have been described. Here we select a subset of the previously described features and add sequence motifs as new features. The resulting model, which we call MotifmiRNAPred, was tested on known pre-miRNAs listed in miRBase, and its accuracy was compared to that of existing approaches in the field. With an accuracy of 99.95% for the generalized plant model, it distinguishes itself from previously published results, which reach average accuracies between 74% and 98%. We believe that our approach is useful for predicting pre-miRNAs in plants without per-species adjustment.
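The idea of turning a pre-miRNA sequence into motif-based features can be illustrated with a minimal sketch. It is not the MotifmiRNAPred implementation: the function names and the example motifs are hypothetical, and the abstract does not say how the motifs were discovered or scored. The sketch only assumes that each motif contributes one numeric feature, here its count of (possibly overlapping) occurrences in the sequence.

```python
def count_overlapping(sequence: str, motif: str) -> int:
    """Count occurrences of `motif` in `sequence`, allowing overlaps."""
    if not motif:
        return 0
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        # Advance by one position so overlapping matches are also found.
        start = sequence.find(motif, start + 1)
    return count


def motif_features(sequence: str, motifs: list[str]) -> list[int]:
    """Return one occurrence count per motif (hypothetical feature scheme)."""
    # Normalize to upper-case RNA so DNA-style input (T) is handled as well.
    rna = sequence.upper().replace("T", "U")
    return [count_overlapping(rna, motif) for motif in motifs]


# Illustrative motifs only; these are assumptions, not motifs from the paper.
EXAMPLE_MOTIFS = ["GU", "UGU", "AAA"]
features = motif_features("UGUGUAAAA", EXAMPLE_MOTIFS)
print(features)
```

A vector like this could be concatenated with other previously described features, such as structural ones, before being passed to a classifier of the reader's choice.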