Comprehensive statistical method for protein fold recognition


Jadwiga R. Bieńkowska, Lihua Yu, Sophia Zarakhovich, Robert G. Rogers Jr., Temple F. Smith

BioMolecular Engineering Research Center, College of Engineering, Boston University, 36 Cummington Street, Boston, MA 02215, USA. E-mail: jadwiga@darwin.bu.edu

Abstract

We present a protein fold recognition method that uses a comprehensive statistical interpretation of structural Hidden Markov Models (HMMs). The structure/fold recognition is done by summing the probabilities of all sequence-to-structure alignments. Conventionally, Boltzmann statistics dictate that the optimal alignment can give an estimate of the lowest free energy of the sequence conformation imposed by the structural model. The alignment is optimized for a scoring function that is interpreted as a free energy of an amino acid in a structural environment. Near-optimal alignments are ignored, regardless of how likely they might be compared to the optimal alignment. Here we investigate an alternative view. A structure model can be seen as a statistical representation of an ensemble of similar structures. The optimal alignment is always the most probable, but sub-optimal alignments may have comparable probabilities. These sub-optimal alignments can be interpreted as optimal alignments to the "other" structures from the ensemble or optimal alignments under minor fluctuations in the scoring function. Summing probabilities for all alignments gives an estimate of sequence-model compatibility. We have built a set of structural HMMs for 188 protein structures, and have compared two methods for identifying the structure compatible with a sequence: by the optimal alignment probability and by the total probability. Fold recognition by total probability was 40% more accurate than fold recognition by the optimal alignment probability.
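The two scores compared in the abstract correspond to two standard HMM recursions: the Viterbi recursion, which keeps only the most probable path (the optimal alignment), and the forward recursion, which sums over all paths (all alignments). The sketch below illustrates the difference on a toy two-state HMM. It is not the authors' structural HMM: the states, symbols, probabilities, and the names `score_sequence` and `logsumexp` are hypothetical and chosen only for illustration.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def score_sequence(obs, log_start, log_trans, log_emit, combine):
    """Generic HMM recursion in log space.

    combine=max       -> log-probability of the single best path (Viterbi).
    combine=logsumexp -> log-probability summed over all paths (forward).
    """
    n_states = len(log_start)
    # Initialise with the first observed symbol.
    prev = [log_start[s] + log_emit[s][obs[0]] for s in range(n_states)]
    for symbol in obs[1:]:
        prev = [
            combine([prev[r] + log_trans[r][s] for r in range(n_states)])
            + log_emit[s][symbol]
            for s in range(n_states)
        ]
    return combine(prev)


# Toy model (assumed values, not taken from the paper).
start = [0.5, 0.5]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 1, 0]

log_start = [math.log(p) for p in start]
log_trans = [[math.log(p) for p in row] for row in trans]
log_emit = [[math.log(p) for p in row] for row in emit]

log_best = score_sequence(obs, log_start, log_trans, log_emit, max)
log_total = score_sequence(obs, log_start, log_trans, log_emit, logsumexp)

print("optimal-path log-probability:", log_best)
print("total log-probability:       ", log_total)
```

Because the total includes the optimal path plus every sub-optimal one, it is never smaller than the optimal-path score. In a fold recognition setting one would compute such a score for each candidate model and rank the models by it; the abstract's comparison is between ranking by the first quantity and ranking by the second.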
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. RECOMB 2000 Tokyo Japan USA. Copyright ACM 2000 1-58113-186-0/00/04 $5.00

1 Introduction

Protein fold recognition methods quickly evolve into viable tools that help to deduce the protein structure and function [13]. The ultimate goal of a fold recognition method is to predict the protein structure by identifying the correct fold (structural template) among already-solved protein structures or models and aligning the protein sequence correctly onto the structural model. Most fold recognition methods use Boltzmann statistics to interpret probabilistic scoring functions [16, 3, 4, 18, 5, 11, 22, 19, 23, 21]. A sequence-to-structure alignment is evaluated by a scoring function, and the score of the alignm
