It has become popular to say that ab initio prediction of protein structure is now unnecessary: since “fold space” is nearly covered, every protein supposedly has a representative, homologous solved structure, which would reduce the structure prediction problem to homology modeling. However, this is certainly not the case for membrane proteins.
Experimental determination of high-resolution membrane protein structures remains very difficult. The fact that membrane proteins are typically longer than 200 aa does not make the problem easier. Membrane proteins can be classified into 2 groups: transmembrane helical (TMH) bundles and beta-barrels. For TMH proteins, the physical constraints imposed by the anisotropic environment of the lipid bilayer lead to characteristic distributions of amino acids that depend on their depth in the membrane. These observations have enabled the development of topology prediction schemes that have become quite sophisticated and powerful.
The authors extend the technique developed for sampling nonlocal beta-sheet topologies to fold membrane proteins from sequence. In this scheme, the relative orientation of a TMH pair is fixed at two particular positions during folding by a long-range pairwise constraint. For each constraint between two helices, a “fold tree” is constructed for the polypeptide chain in which two C-alpha positions, one from each helix, are connected and fixed in space during folding. To allow for this nonlocal connection in the tree, the peptide chain is cut elsewhere between the two connected positions. The cut is randomly selected within the predicted loop regions of the protein, with a bias toward long loops. This avoids disrupting subdomains composed of a few TMHs connected by short loops, which can be folded properly without a cut.
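The cut-point selection described above can be sketched in a few lines of Python. This is only an illustration and not the RosettaMembrane code: the function name `choose_cut`, the representation of loops as inclusive `(start, end)` residue ranges, and the choice to weight each loop by its length are all assumptions made here, since the post does not say how the bias toward long loops is computed.

```python
import random


def choose_cut(loops, jump_a, jump_b, rng=random):
    """Pick a chain-break position for a fold tree (illustrative sketch).

    loops          -- predicted loop regions as inclusive (start, end) residue ranges
    jump_a, jump_b -- the two connected C-alpha positions, one per helix
    rng            -- the random module or a random.Random instance
    """
    lo, hi = sorted((jump_a, jump_b))
    # Only loops lying strictly between the two connected positions qualify.
    candidates = [(s, e) for s, e in loops if lo < s and e < hi]
    if not candidates:
        raise ValueError("no predicted loop between the two connected positions")
    # Assumption: the bias toward long loops is proportional to loop length.
    weights = [e - s + 1 for s, e in candidates]
    start, end = rng.choices(candidates, weights=weights, k=1)[0]
    # Uniform choice of the cut residue inside the selected loop.
    return rng.randint(start, end)
```

With loops at residues 20-24 and 50-70 between the connected positions, the longer loop is chosen about 21 times out of 26 under this weighting, so the short loop and the helices it links are usually left intact.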
To predict these structural constraints from sequence information, a database of TMH pair configurations from known structures was assembled. This database of interacting TMH pairs is searched, using a sliding window, for local sequence matches with all possible pairs of predicted TMHs in the query sequence. In each folding trajectory, a single randomly selected predicted interaction from the library is used to constrain a particular helix pair to the corresponding helix–helix arrangement. Ten predicted interactions are included for each helix pair, which allows correct models to be generated despite the low overall accuracy of the interaction library, since only one of the 10 is required to be correct.
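As a rough sketch of the library search, the following Python scores each library entry against a query helix pair with a sliding window, keeps the top hits, and draws one of them for a trajectory. The scoring function is a placeholder: plain residue identity is used here only because the post does not describe the actual scoring scheme, and the window length of 7, the dictionary layout of the library, and all function names are likewise hypothetical.

```python
import random


def window_identity(a, b):
    """Number of identical residues between two equal-length windows."""
    return sum(x == y for x, y in zip(a, b))


def best_window_score(query, template, window=7):
    """Best identity score over all window placements in both sequences."""
    best = 0
    for i in range(len(query) - window + 1):
        for j in range(len(template) - window + 1):
            score = window_identity(query[i:i + window], template[j:j + window])
            best = max(best, score)
    return best


def predict_interactions(query_pair, library, top_n=10, window=7):
    """Rank library helix pairs by local similarity to the query helix pair."""
    q1, q2 = query_pair
    scored = []
    for entry in library:
        score = (best_window_score(q1, entry["helix1"], window)
                 + best_window_score(q2, entry["helix2"], window))
        scored.append((score, entry["name"]))
    # Stable sort: ties keep their library order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in scored[:top_n]]


def pick_constraint(predictions, rng=random):
    """Select the single predicted interaction used in one folding trajectory."""
    return rng.choice(predictions)
```

Because each trajectory draws one constraint at random from the list, a single correct prediction among the ten would be applied in roughly one tenth of the trajectories for that helix pair, which is why the low accuracy of the library is tolerable.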
The method was tested on 12 membrane proteins of diverse topologies and functions with lengths ranging between 190 and 300 residues. Enforcing a single constraint during the folding simulations enriched the population of near-native models for 9 proteins over the predictions made with the older generation of RosettaMembrane. In 4 of the cases in which the constraint was predicted from the sequence, 1 of the 5 lowest energy models was superimposable within 4 Å on the native structure. Near-native structures could also be selected for heme-binding and pore-forming domains from simulations in which pairs of conserved histidine-chelating hemes and one experimentally determined salt bridge were constrained, respectively. In 8 out of the 12 cases a model was sampled in which more than 85% of the sequence was superimposable onto the native structure and in 5 cases this was true for one of the 5 lowest energy models.
P. Barth, B. Wallner, D. Baker (2009). Prediction of membrane protein structures with complex topologies using limited constraints. Proceedings of the National Academy of Sciences, 106 (5), 1409-1414. DOI: 10.1073/pnas.0808323106