1. Query microbe protein against all templates.
Users can submit a task to HMI-PRED 2.0 by entering the microbe protein as a PDB ID (a chain ID is recommended but optional) or by uploading a .pdb or .cif (recommended) file containing the microbe protein chain. Potential interface mimicry is searched across all templates in our database.
Note that during the prediction stage, human chains are skipped and not processed as microbe proteins.
Thus, if an input contains only human protein chains, no HMIs will be predicted.
It is recommended that users provide a chain ID for the input protein. Otherwise, HMI-PRED 2.0 processes all unique chains in the structure, which takes longer to scan.
If of interest, human chains can be submitted separately by providing a .cif file as input instead of a PDB ID.
However, HMI-PRED 2.0 is not intended to predict human protein-protein interactions.
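The chain selection rules described above can be sketched as follows. This is only an illustration, not the server's implementation: the `Chain` record, the `chains_to_scan` function, the organism check against "Homo sapiens", and the reading of "unique" as "identical sequence" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chain:
    chain_id: str
    organism: str   # hypothetical annotation for the chain's source organism
    sequence: str


def chains_to_scan(chains, requested_chain_id=None):
    """Return the chains that would be treated as microbe proteins.

    Assumptions for this sketch: human chains are recognized by organism
    name, and two chains are duplicates when their sequences are identical.
    """
    selected = []
    seen_sequences = set()
    for chain in chains:
        # If the user gave a chain ID, ignore every other chain.
        if requested_chain_id is not None and chain.chain_id != requested_chain_id:
            continue
        # Human chains are skipped during prediction.
        if chain.organism.strip().lower() == "homo sapiens":
            continue
        # Process each unique chain only once.
        if chain.sequence in seen_sequences:
            continue
        seen_sequences.add(chain.sequence)
        selected.append(chain)
    return selected
```

With no chain ID, every unique non-human chain is returned, which is why providing a chain ID shortens the scan.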
2. Query microbe protein against a set of templates.
Users can choose a set of templates to search for potential host-microbe interactions. The template list file is a text file listing one template per line.
A template identifier takes the form {PDB ID}_{Chain ID}_{optional Interface ID}.
Users can search templates and obtain the list as a text file using the template search menu.
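As a rough sketch of how such a template list could be checked before upload, the snippet below splits each identifier on underscores. The names `TemplateId`, `parse_template_id`, and `read_template_list` are hypothetical, the example identifiers are made up, and the sketch assumes that none of the three fields contains an underscore.

```python
from typing import NamedTuple, Optional


class TemplateId(NamedTuple):
    pdb_id: str
    chain_id: str
    interface_id: Optional[str]  # None when the optional field is absent


def parse_template_id(text):
    """Parse '{PDB ID}_{Chain ID}' or '{PDB ID}_{Chain ID}_{Interface ID}'."""
    parts = text.strip().split("_")
    # Expect two or three non-empty fields.
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"not a valid template identifier: {text!r}")
    interface_id = parts[2] if len(parts) == 3 else None
    return TemplateId(parts[0], parts[1], interface_id)


def read_template_list(lines):
    """Parse one template identifier per line, ignoring blank lines."""
    return [parse_template_id(line) for line in lines if line.strip()]
```

For example, `read_template_list(open("templates.txt"))` would return one `TemplateId` per non-blank line of a hypothetical `templates.txt`.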
3. Advanced options.
Users can control the structural alignment (TM-align) threshold.
By default, a TM-score of 0.5 or greater is required for a microbe-template pair to be processed further.
All (microbe-mimic chain) pairs with a passing TM-score are screened for hot spot conservation.
Certain residues contribute more to the binding energy than the rest. These residues are called hot spots.
Interface mimicry is more likely when those hot spots are conserved in the microbial protein.
By default, all hot spot conservation levels are kept, meaning that predictions are not filtered by conservation level.
Since hot spot conservation is rare and likely indicates more plausible HMIs, users can search for HMIs (Find HMI below) with a certain degree of hot spot conservation.
After passing the alignment threshold, the microbe protein is transformed to replace the mimic chain in the template, followed by docking and refinement. HMIs with a docking score of -5.0 or better (more negative) are presented as results.
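The thresholds described in this section can be summarized in a short sketch. It only illustrates the filtering logic. The `Candidate` record and `reported_hmis` function are hypothetical, the conservation labels ("none", "low", "med", "max") are borrowed from the search filters and assumed to apply here, and the docking score is taken as already computed, whereas the actual pipeline docks only the pairs that pass the earlier steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    template_id: str            # hypothetical identifier
    tm_score: float             # structural alignment score from TM-align
    hot_spot_conservation: str  # assumed labels: "none", "low", "med", "max"
    docking_score: float        # more negative is better


def reported_hmis(candidates, tm_threshold=0.5,
                  allowed_conservation=None, docking_cutoff=-5.0):
    """Keep candidates that pass the alignment, conservation, and docking filters."""
    kept = []
    for c in candidates:
        # Alignment: a TM-score at or above the threshold is required.
        if c.tm_score < tm_threshold:
            continue
        # Conservation: None means all levels are kept (the default).
        if allowed_conservation is not None and \
                c.hot_spot_conservation not in allowed_conservation:
            continue
        # Docking: a score of -5.0 or more negative is reported.
        if c.docking_score > docking_cutoff:
            continue
        kept.append(c)
    return kept
```

Lowering `tm_threshold` admits more structurally distant pairs, while passing `allowed_conservation={"med", "max"}` keeps only predictions with better-conserved hot spots.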
1. Find predicted interactions
By default, predictions made with PDB ID inputs (microbe protein) are made public and available for search. We also continuously predict and accumulate HMIs for AlphaFold2-modeled microbial proteins.
Users can choose to exclude AlphaFold2-modeled microbial proteins (they are included by default) to obtain HMIs only for microbial proteins with experimentally solved structures.
HMIs can also be searched by the identities of proteins (e.g. microbial, host, and mimic proteins).
In the example shown above, a microbial organism name containing the keyword "epstein-barr" is chosen as a filter (intended to find Epstein-Barr virus proteins interacting with any host proteins). The search filters are case-insensitive.
Note that this will include every strain or species whose organism name contains the keyword "epstein-barr". For instance, the keyword "pseudomonas" will return HMIs for multiple species, including "P. aeruginosa", "P. citronellolis", "P. nitroreducens", etc.
The example also restricts the results to interactions whose evolutionary hot spots are relatively well conserved: choosing "med" and "max" filters out the interactions with "low" or "none" hot spot conservation.
Evolutionary hot spots are amino acid residues that are likely to contribute more to the binding than other residues. The hot spot residues are extracted from our templates. When a microbial protein interacts with a target protein in the host, the hot spots may or may not be conserved.
Conservation of hot spots suggests that the predicted interactions are more likely to occur.
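The search filters described above behave like a case-insensitive substring match combined with a set of accepted conservation levels. The sketch below illustrates that behavior on plain dictionaries. The function name `search_hmis` and the record keys (`microbe_organism`, `hot_spot_conservation`, `alphafold_model`) are hypothetical and do not reflect the server's actual data model.

```python
def search_hmis(records, organism_keyword=None,
                conservation_levels=None, include_alphafold=True):
    """Filter HMI records by organism keyword, conservation level, and model source."""
    keyword = organism_keyword.lower() if organism_keyword else None
    results = []
    for record in records:
        # Case-insensitive substring match on the microbial organism name.
        if keyword is not None and keyword not in record["microbe_organism"].lower():
            continue
        # Keep only the selected hot spot conservation levels, if any were chosen.
        if conservation_levels is not None and \
                record["hot_spot_conservation"] not in conservation_levels:
            continue
        # Optionally drop predictions made for AlphaFold2-modeled proteins.
        if not include_alphafold and record["alphafold_model"]:
            continue
        results.append(record)
    return results
```

Because the match is a substring test, `organism_keyword="pseudomonas"` returns records for every species whose name contains that word.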
2. Network analysis
A microbe may target multiple host proteins to survive and cause infection. Click ENRICHMENT ANALYSIS to perform enrichment analysis via STRING.
This redirects to STRING, where more detailed analyses can be performed for the human proteins. Note that this applies only when the search returns multiple human proteins.
In the example above, the predicted human protein targets of Epstein-Barr virus proteins are queried through STRING, where the interaction network (among human proteins only) is visualized.
The enrichment analysis suggests associations between Epstein-Barr virus and several important diseases and pathways, including non-Hodgkin lymphoma and the Ras signaling pathway.
CYTOSCAPE VIEW is another option that visualizes the predicted interactions as a network. It shows which microbial proteins interact with which host proteins, and which proteins are known interactors of those host proteins.
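The kind of network shown in such a view can be thought of as a list of labeled edges. The following sketch is an assumption-laden illustration and is not how the server builds its network: `build_network` is a hypothetical helper, and the protein names in the usage note are placeholders.

```python
def build_network(predicted_pairs, known_host_interactions):
    """Combine predicted microbe-host edges with known host-host edges.

    predicted_pairs: iterable of (microbial_protein, host_protein)
    known_host_interactions: iterable of (host_protein, interactor)
    Returns a sorted list of (source, target, kind) edges.
    """
    predicted_pairs = list(predicted_pairs)
    edges = set()
    for microbe, host in predicted_pairs:
        edges.add((microbe, host, "predicted"))
    # Only keep known interactions that touch a predicted host target.
    hosts = {host for _, host in predicted_pairs}
    for a, b in known_host_interactions:
        if a in hosts or b in hosts:
            edges.add((a, b, "known"))
    return sorted(edges)
```

For instance, `build_network([("microbe_1", "host_A")], [("host_A", "host_B"), ("host_C", "host_D")])` keeps the predicted edge and the known `host_A`-`host_B` edge, and drops the unrelated `host_C`-`host_D` pair.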
1. Searching for templates
HMI-PRED 2.0 scans potential interactions for the input microbe proteins across all templates in our database. Each template contains two protein chains that are known to interact with each other.
The contacting surfaces of the two chains are the interfaces, and some residues within the interfaces contribute more to the binding than others. These residues are called evolutionary hot spots, and HMI-PRED 2.0 considers them during prediction.
Users can find and visualize templates in our database by protein identity and/or the number of predicted interactions linked to the template.
In the example above, we search for templates containing a protein chain with the keyword "kinase" and at least one predicted interaction.
A total of 475 templates are shown in the table, which displays the identity of the protein chains and the number of linked HMIs.
Click the "Show" button to visualize an individual template.
The details about the template are displayed, including a button to show the linked HMIs.
In the 3D structure view, the meshes indicate the interaction interface of each chain, and the yellow ball-and-stick models indicate the interaction hot spot residues.
The chains, interfaces, and hot spots can be toggled in the control panel on the right.
The chain information, including gene names, tissue expression, and hot spot residues, is shown in the box above.
1. Structure views for predicted host-microbe interactions
When viewing a predicted host-microbe interaction, there are three protein chains: microbial, host, and mimic chains.
The microbial and mimic chains are superimposed based on the structural alignment. Thus, by toggling the microbial and mimic chains on or off,
users can see the original interaction in the template (host-mimic) or the predicted interaction (host-microbe).
The chains are colored differently, and the colors match the boxes and buttons.
By default, host chains are blue and mimic chains are red. The "swap chain colors" button changes the host chain to red and the mimic chain to blue (and the microbe chain to teal).
This is useful for comparison with the original templates, because it is not known until prediction which template chain will be the host and which will be the mimic.
2. Structure views for interface templates
A template contains two protein chains, and upon prediction, one becomes the host and the other is mimicked by a microbial protein.
When both chains are human proteins, one can be the host in some predictions and the mimicked chain in others.
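The two possible readings of a template can be enumerated with a small helper. This sketch assumes that only a human chain can take the host role, which is implied by the text above but not stated outright. The function `role_assignments` and the `(chain_id, is_human)` tuples are hypothetical.

```python
def role_assignments(chain_a, chain_b):
    """List the possible (host, mimic) assignments for a two-chain template.

    Each chain is a (chain_id, is_human) tuple. Assumption for this sketch:
    a chain can serve as the host only if it is a human protein.
    """
    assignments = []
    for host, mimic in ((chain_a, chain_b), (chain_b, chain_a)):
        if host[1]:
            assignments.append({"host": host[0], "mimic": mimic[0]})
    return assignments
```

A template with two human chains yields both assignments, while a template with a single human chain yields only one.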
Predicted interactions linked to the template can be accessed via the blue "Show" button in the upper-right corner.
Each chain, interface, and set of hot spot residues can be shown or hidden by clicking the toggle buttons in the visual control panel.