Use of Mutual Information Arrays to Predict Coevolving Sites in the Full Length HIV gp120 Protein for Subtypes B and C
Author affiliation: State Key Laboratory of Virology, Wuhan Institute of Virology, Chinese Academy of Sciences
Publication: Virologica Sinica (中国病毒学(英文版))
Year, volume, issue: 2011, Vol. 26, No. 2
Pages: 95-104
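Subject classification: 1004 [Medicine: Public Health and Preventive Medicine (medical or science degrees may be conferred)]; 100401 [Medicine: Epidemiology and Health Statistics]; 10 [Medicine]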
Keywords: Mutual information arrays; Predict coevolving sites; Protein evolve; HIV gp120 protein; B and C subtypes
Abstract: It is well established that different sites within a protein evolve at different rates according to their role within the protein; identifying correlated mutations between sites can aid tasks such as ab initio protein structure prediction, structure-function analysis, or sequence alignment. Mutual information is a standard measure of coevolution between two sites, but its application is limited by the signal-to-noise ratio. In this work we report a preliminary study investigating whether larger sequence sets could circumvent this problem, calculating mutual information arrays for two sets of drug-naive sequences from the HIV gp120 protein, one for subtype B and one for subtype C. Our results suggest that while larger sequence sets can improve the signal-to-noise ratio, the gain is offset by the high mutation rate of HIV, which makes it more difficult to achieve consistent alignments. Nevertheless, we were able to predict a number of coevolving sites that were supported by previous experimental studies, as well as a region close to the C terminus of the protein that was highly variable in subtype C but highly conserved in subtype B.
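
The abstract's central quantity, a mutual information array over alignment columns, can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes an already aligned set of equal-length sequences, uses the plain empirical (plug-in) estimate of mutual information in bits, treats a gap character like any other symbol, and applies no noise correction. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Plug-in estimate of mutual information (in bits) between two columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        # p_ab / (p_a * p_b), written with raw counts
        ratio = n_ab * n / (count_a[a] * count_b[b])
        mi += p_ab * math.log2(ratio)
    return mi


def mi_array(alignment):
    """Symmetric matrix of pairwise mutual information over all column pairs."""
    length = len(alignment[0])
    columns = [[seq[i] for seq in alignment] for i in range(length)]
    return [
        [mutual_information(columns[i], columns[j]) for j in range(length)]
        for i in range(length)
    ]


# Hypothetical toy alignment: columns 0 and 1 vary together, column 2 is independent.
toy_alignment = ["AKL", "AKV", "GRL", "GRV"]
mi = mi_array(toy_alignment)
```

In this toy case the first two columns share one bit of information and the third column shares none with either. With real data, small sequence sets inflate such estimates for unrelated columns, which is the signal-to-noise limitation the abstract refers to.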