Sep 07, 2008

Insights into Protein–DNA Interactions through Structure Network Analysis

In research published in PLoS Computational Biology, Sathyapriya et al. apply network analysis to a data set of protein–DNA complexes. The results reveal characteristic clustering patterns at the interfaces, which might be used in the future for structure or binding-site prediction. The authors also suggest a novel classification scheme based on this representation. By Nir London

Sathyapriya R, Vijayabaskar MS, Vishveshwara S.

PLoS Computational Biology. 2008 Sep 5;4(9)

Insights into the mechanism of protein–DNA binding and recognition have come from extensive analysis of protein–DNA interfaces. Some of these investigations have been carried out at the level of pairwise interactions between the atoms or residues of the interacting partners. However, the information communicated along the interfaces is rarely a pairwise phenomenon. The concept of representing protein structures as graphs is not new: the amino acids in proteins are treated as nodes, and the interactions between these nodes are treated as edges, to construct different types of graphs. These protein structure graphs (PSGs) have been successfully used in the analysis of protein structure, stability and function (Brinda et al. I, Brinda et al. II). PSGs have also been analyzed in protein–DNA complexes to identify significant interactions as clusters of interacting amino acids at the protein–DNA interfaces. However, in such studies the interacting nucleotides of the DNA were not considered as part of the graphs.

Clusters at the Protein–DNA Interface

A protein–DNA graph (PDG) is a bipartite graph constructed to represent the interaction between the amino acids of the protein and the nucleotides of the DNA in a protein–DNA complex. A contact in the bipartite PDG is defined when a side chain of an amino acid interacts with a nucleotide. The interaction of the amino acid with the nucleotide can be considered at different levels: with the phosphate (p), deoxyribose sugar (S) or base (B) components individually, or with the nucleotide as a complete entity. The edges are defined by quantifying the interaction between an amino acid i and a nucleotide j with the "Interaction Strength," Iij. (Note that this interaction strength is based on the number of atom-atom contacts and so, in a way, reflects only the local packing density.) Two nodes in a PDG are connected if the Iij evaluated between them is greater than or equal to a user-defined threshold on Iij. This threshold is selected to balance a trade-off between the strength of interaction and the cluster size.
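The thresholding step can be illustrated with a minimal Python sketch, assuming the interaction strengths have already been computed by some other means. The residue and nucleotide labels, the numeric values and the function names `build_pdg` and `clusters` are all hypothetical and are not taken from the paper. Treating a cluster as a connected component of the thresholded graph is also an assumption made here for illustration.

```python
from collections import defaultdict

def build_pdg(strengths, i_min):
    """Keep only amino acid-nucleotide pairs whose strength is >= i_min."""
    return [(aa, nt) for (aa, nt), s in strengths.items() if s >= i_min]

def clusters(edges):
    """Group nodes of the bipartite graph into connected components.

    Assumption: a 'cluster' is taken to be a connected component.
    """
    adj = defaultdict(set)
    for aa, nt in edges:
        # Tag each node with its side so labels cannot collide.
        adj[("aa", aa)].add(("nt", nt))
        adj[("nt", nt)].add(("aa", aa))
    seen, result = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            comp.add(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        result.append(comp)
    return result

# Hypothetical interaction strengths, for illustration only.
toy_strengths = {
    ("ARG10", "G3"): 6.2,
    ("ARG10", "C4"): 4.1,
    ("LYS22", "C4"): 2.0,
    ("SER40", "T9"): 5.5,
}

for i_min in (2.0, 4.0):
    sizes = sorted(len(c) for c in clusters(build_pdg(toy_strengths, i_min)))
    print(i_min, sizes)
```

On this toy input, raising the threshold drops the weak LYS22 contact and shrinks the larger cluster, which is the trade-off between interaction strength and cluster size mentioned above.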

Amino Acid Propensities in PDGs

The propensities of amino acids to form P-p, P-S, and P-B graphs were calculated from a dataset of ~120 solved protein–DNA complexes. In general, the basic residues (Arg and Lys) show a higher propensity to occur in PDGs. Arg is preferred in P-B graphs, whereas Lys is preferred in P-p graphs. All polar amino acids occur significantly in all three component clusters, but their preferences vary. For instance, Ser, the smallest polar amino acid, has a higher propensity to occur in P-p graphs, while Asn and Gln, which contain the planar conjugated amide group, have a higher propensity to occur in P-B graphs. The interactions of Val, Ile, Leu, Phe and Trp are more frequent in the P-S graphs, indicating that the deoxyribose is involved in hydrophobic and van der Waals interactions. Other expected trends of interaction between amino acids and DNA are confirmed as well.
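This summary does not give the propensity formula the authors used. As a rough sketch, one common way to define such a propensity is the ratio between an amino acid's frequency within a given graph type and its background frequency in the whole dataset. That definition, the function name `propensity` and the counts below are assumptions for illustration, not the paper's method or data.

```python
def propensity(graph_counts, background_counts):
    """Ratio of in-graph frequency to background frequency per amino acid.

    Assumed definition: values above 1 mean the residue is
    over-represented in this graph type relative to the background.
    """
    graph_total = sum(graph_counts.values())
    background_total = sum(background_counts.values())
    result = {}
    for aa, bg in background_counts.items():
        if bg == 0:
            continue  # frequency ratio undefined without background occurrences
        in_graph = graph_counts.get(aa, 0) / graph_total
        result[aa] = in_graph / (bg / background_total)
    return result

# Made-up counts, for illustration only.
toy_graph_counts = {"ARG": 30, "LEU": 10}
toy_background_counts = {"ARG": 50, "LEU": 150}

print(propensity(toy_graph_counts, toy_background_counts))
```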

Cluster Profiles of Different Groups of Protein–DNA Complexes

Protein–DNA complexes have previously been classified into different groups based on the structural similarity of the proteins bound to the DNA. Luscombe et al. have provided a comprehensive classification of protein–DNA complexes based on the secondary structural motifs of the proteins interacting with the DNA. The classification results in eight groups of complexes: the beta-sheet group, the beta-hairpin group, helix turn helix (HTH), zipper type (ZT), the zinc coordinating group, other alpha-helices, enzymes and others. PDGs were generated for all groups of protein–DNA complexes and analyzed to investigate properties such as the preference of proteins to interact with the DNA, the components of the DNA to which the protein binds, and the dominance of a particular type of cluster (P-p, P-S, or P-B). The authors also searched for a generic pattern of clusters, if any, that could be identified among the groups of protein–DNA complexes.

The authors present the major characteristics of each of these groups. For example, in the beta-sheet group, the beta sheets, given their tendency to twist, provide a saddle-like scaffold on which the minor groove of the DNA is well seated. There is an overwhelming dominance of P-S clusters compared to P-p or P-B clusters in the members of this group. Also, the P-S clusters, which appear in the minor groove, are located in similar positions and have very similar amino acid compositions among the members of this group.

Hubs are defined as amino acid residues that are connected to four or more nucleotides, or vice versa. The members of the beta-sheet group contain a few hubs. The most characteristic feature of this group is the presence of a Phe (P-B) hub in all of its members. It is interesting to note that this structurally important residue, identified as a hub, is observed at the DNA bending region and could be correlated with the deformation of the DNA. Thus, the protein-induced DNA deformation, which was observed earlier, is elegantly captured here by this network property.
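The hub definition translates directly into a degree count on the bipartite graph. The following is a minimal sketch, assuming the PDG is available as a list of (amino acid, nucleotide) edges. The function name `find_hubs` and the toy labels are hypothetical.

```python
from collections import Counter

def find_hubs(edges, min_degree=4):
    """Return nodes on either side of the graph with >= min_degree partners."""
    degree = Counter()
    for aa, nt in set(edges):  # ignore duplicate edges
        degree[("aa", aa)] += 1
        degree[("nt", nt)] += 1
    return sorted(node for node, d in degree.items() if d >= min_degree)

# Hypothetical edges, for illustration only.
toy_edges = [
    ("PHE57", "A1"), ("PHE57", "A2"), ("PHE57", "T3"), ("PHE57", "T4"),
    ("LYS5", "A1"),
]

print(find_hubs(toy_edges))  # [('aa', 'PHE57')]
```

Because the count runs over both sides of the graph, a nucleotide contacted by four or more amino acids is reported as a hub as well, matching the "or vice versa" in the definition.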

Other groups, such as the beta-hairpin, helix turn helix and zipper-type groups, also show distinct cluster profiles.

Classification of Protein–DNA Complexes

The authors attempt to classify protein–DNA complexes according to the different clusters that appear at the interface. Their classes are composed of complexes containing only one type of cluster (P-p, P-S or P-B), two types of clusters, or all three types.
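To make the scheme concrete, here is a small sketch that assigns a complex to a class from the set of cluster types found at its interface. The function name `interface_class` and the label format are hypothetical, and the paper's own class names may differ.

```python
CLUSTER_TYPES = ("P-p", "P-S", "P-B")

def interface_class(cluster_types):
    """Label a complex by which cluster types occur at its interface."""
    found = set(cluster_types)
    unknown = found - set(CLUSTER_TYPES)
    if unknown:
        raise ValueError(f"unknown cluster types: {sorted(unknown)}")
    present = [t for t in CLUSTER_TYPES if t in found]
    if not present:
        return "no clusters"
    # One, two or three types, joined in a fixed order.
    return "+".join(present)

print(interface_class(["P-S", "P-S"]))         # P-S
print(interface_class(["P-B", "P-p"]))         # P-p+P-B
print(interface_class(["P-p", "P-S", "P-B"]))  # P-p+P-S+P-B
```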

This classification scheme, based on the interaction patterns of amino acids with nucleotide components in the PDG, does not directly deal with the type of interaction involved (such as electrostatics, van der Waals, H-bonding, etc.). Indirectly, however, the P-p clusters are dominated by electrostatic interactions, and the P-S clusters are composed of van der Waals interactions along with stacking of aromatic residues with the deoxyribose ring. The P-B graphs are dominated by stacking of amino acids (mostly the planar side chain of Arg) with the bases, by H-bonding and by charge-mediated interactions.

A comparison of the present classification with the structural-motif-based classification by Luscombe et al. shows distinct differences. A major difference is that proteins from the same group in the motif-based classification fall under different classes of interface clusters. In other words, even though the proteins have the same secondary structure motif (e.g., the HTH motif), their mode of interaction may vary significantly depending on factors such as the sequence of the DNA (cognate or non-cognate DNA binding) and the component (p, S, or B) of the nucleotide to which the protein binds. However, a few salient features are common to both classification schemes.

The authors mention other attempts at protein–DNA classification schemes (Prabakaran et al., Siggers et al.) and note that the merely marginal overlap between different classification schemes underscores the versatility of the protein–DNA recognition mechanism. It may be valuable to use different approaches to obtain complementary information in order to understand protein–DNA recognition mechanisms in detail.

In conclusion, the authors propose that such analysis and the group-specific features of protein–DNA recognition could be used as a starting point for predicting the DNA binding sites on these proteins. Their classification scheme, together with other classification schemes, could provide complementary information on the nature of protein–DNA interactions. This study has also highlighted the nature of the clusters and hubs present at the recognition site. These clusters and hubs may prove valuable not only in understanding which residues contribute to the stability of protein–DNA interfaces, but also as features characteristic of a given group of proteins.


