Documentation
HMM Logo
Domain logo

An HMM logo displays a series of columns, each containing a stack of letters that represents the amino acids observed at that position. The height of the stack shows how conserved the position is, while the height of each letter represents the frequency of that amino acid at the position.
The three lines beneath the letter stacks display, from top to bottom: the probability of observing an amino acid, the insertion probability, and the expected insertion length at that logo position.
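
The relationship between stack height and letter height can be sketched as follows. This is only an illustration under stated assumptions: it uses the classic Shannon information content with a uniform background over the 20 amino acids, whereas Skylign offers several height schemes and the one used for these logos is not specified here. The function name `column_heights` is hypothetical.

```python
import math

def column_heights(freqs):
    """Return (stack_height, letter_heights) in bits for one logo column.

    freqs maps amino acid letters to observed frequencies summing to 1.
    Assumption: uniform background over 20 amino acids, no small-sample
    correction. This is not necessarily the scheme used by Skylign.
    """
    max_bits = math.log2(20)  # height of a fully conserved column
    entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
    stack_height = max_bits - entropy  # conservation of the position
    # Each letter takes a share of the stack proportional to its frequency.
    letter_heights = {aa: p * stack_height for aa, p in freqs.items()}
    return stack_height, letter_heights

# A fully conserved cysteine column reaches the maximum height.
stack, letters = column_heights({"C": 1.0})
print(round(stack, 2))  # 4.32
```

A column where all 20 amino acids are equally frequent would have a stack height close to zero under this scheme.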

In simple terms, the last two lines represent the probability of observing an insertion right after this column and the expected length of that insertion.
In our case, the first line represents the proportion of eukaryotic sequences in the Pfam domain alignment that have an amino acid at this position.
The last two lines show low insertion probabilities and low expected lengths because we chose to display every alignment position, which makes it easier to map human proteins onto the logo.
If you need a consensus HMM logo of the protein domain, you can go to the Pfam website, where only the more conserved positions are shown. It is harder to map human protein positions onto that type of display, but the Pfam logo is more helpful for identifying the most important amino acid patterns in the protein domain.
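
The occupancy shown on the first line can be illustrated with a minimal sketch. It assumes the alignment is available as a list of equal-length strings in which `-` or `.` marks a gap. The function name `column_occupancy` is hypothetical and is not part of Dolphin, Pfam or Skylign.

```python
def column_occupancy(aligned_seqs, gap_chars="-."):
    """Fraction of sequences carrying an amino acid in each alignment column.

    aligned_seqs: non-empty list of aligned sequences of equal length.
    Assumption: any character found in gap_chars denotes a gap.
    """
    n_seqs = len(aligned_seqs)
    n_cols = len(aligned_seqs[0])
    occupancy = []
    for col in range(n_cols):
        filled = sum(1 for seq in aligned_seqs if seq[col] not in gap_chars)
        occupancy.append(filled / n_seqs)
    return occupancy

# Toy alignment: columns 2 and 3 are each missing in one of three sequences.
occ = column_occupancy(["AC-D", "A-GD", "ACGD"])
print([round(x, 2) for x in occ])
```

Because every alignment column is kept, columns with low occupancy still appear in the logo instead of being folded into insertions.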
Column information

Clicking on a logo column displays a table with more details on the amino acid frequencies. Amino acids are ordered by decreasing frequency. The first two lines of the table are explained in the previous paragraph.
Mutation table

The mutation table shows the Dolphin prediction for the given missense mutation in the protein. It displays the Dolphin "WT" and "∆" scores and the prediction probability (see the Dolphin paper for more information). The red button downloads the mutation information.
The gnomAD allele frequency and the Dolphin frequency (AF in the domain) are also available.
To see where the Dolphin frequency comes from, click the "Show Details" button.

The details table lists the same missense variants found at the same logo position. Mutations are ordered by decreasing frequency, and the first one gives the Dolphin frequency. For each mutant you can also see the reported gnomAD frequency.
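
As a rough sketch of how the details table relates to the Dolphin frequency, the rows can be sorted by decreasing frequency and the first row read off. This assumes the rows are ranked by their gnomAD allele frequency, which is an interpretation of the description above. The `DomainVariant` record and both function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class DomainVariant:
    """One row of the details table (hypothetical structure)."""
    protein: str           # entry name of the protein carrying the variant
    protein_position: int  # residue number in that protein
    gnomad_af: float       # reported gnomAD allele frequency

def details_table(variants):
    """Order variants sharing a logo position by decreasing frequency."""
    return sorted(variants, key=lambda v: v.gnomad_af, reverse=True)

def dolphin_frequency(variants):
    """Frequency of the first row, or None if there is no variant."""
    table = details_table(variants)
    return table[0].gnomad_af if table else None

# Invented example values, for illustration only.
rows = [
    DomainVariant("PROT1_HUMAN", 120, 1e-5),
    DomainVariant("PROT2_HUMAN", 87, 4e-4),
    DomainVariant("PROT3_HUMAN", 301, 2e-6),
]
print(dolphin_frequency(rows))
```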
Create your HMM logo
All protein domain HMM logos were created with the Skylign tool. To create your own logo, go to the Skylign website and read the related documentation. Several options are available for generating a logo that suits your alignment data.
Skylign web site:
Citation in publications:
Wheeler, T.J., Clements, J. & Finn, R.D. Skylign: a tool for creating informative, interactive logos representing sequence alignments and profile hidden Markov models. BMC Bioinformatics 15, 7 (2014). https://doi.org/10.1186/1471-2105-15-7
Alignment Viewer
Logo position number

The alignment viewer displays all the human proteins that belong to this Pfam domain alignment.
The first line shows the HMM logo positions (columns). In the example above, the provided variant is at the center of the highlighted blue cross: the vertical highlight marks the position in the logo, while the horizontal highlight marks the specific domain of the human protein.
Hovering over an entry name displays the gene name.
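
The link between a logo column and a residue in one human protein can be sketched as below. It assumes each alignment row is a gapped string whose first residue sits at a known 1-based protein position. The function `residue_at_column` and its parameters are hypothetical, not part of the viewer.

```python
def residue_at_column(aligned_seq, domain_start, column, gap_chars="-."):
    """Map a 1-based logo column to (protein_position, amino_acid).

    aligned_seq: one gapped row of the domain alignment.
    domain_start: 1-based protein position of the first residue in the row.
    Returns None when the protein has a gap at that column.
    """
    letter = aligned_seq[column - 1]
    if letter in gap_chars:
        return None
    # Count the residues that precede the column, skipping gaps.
    residues_before = sum(
        1 for c in aligned_seq[:column - 1] if c not in gap_chars
    )
    return domain_start + residues_before, letter

# Toy row starting at protein position 100, with a gap at column 3.
print(residue_at_column("MK-LV", 100, 4))  # (102, 'L')
print(residue_at_column("MK-LV", 100, 3))  # None
```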
Colors

This selector offers several color schemes:
- Skylign
- Rasmol
- Rasmol Shapely
- ClustalX
Get protein domain alignments
Protein domain sequence alignments were obtained from the Pfam FTP site. Only Pfam-A entries (HMM-based, hand-curated Pfam entries, each with a manually set threshold value for its HMM) are used, and only the eukaryotic sequences are kept. The Dolphin alignment viewer shows the sequences from human proteins.
Pfam web site:
Citation in publications:
Jaina Mistry, Sara Chuguransky, Lowri Williams, Matloob Qureshi, Gustavo A Salazar, Erik L L Sonnhammer, Silvio C E Tosatto, Lisanna Paladin, Shriya Raj, Lorna J Richardson, Robert D Finn, Alex Bateman, Pfam: The protein families database in 2021, Nucleic Acids Research, Volume 49, Issue D1, 8 January 2021, Pages D412–D419, https://doi.org/10.1093/nar/gkaa913
Frequency viewer
Domain feature

The first line of the frequency viewer displays the amino acid sequence of the protein associated with the Pfam entry name related to the provided mutation. Letter colors follow the alignment viewer color selector (see the paragraph above).
The second line shows the different occurrences of the same protein domain along the protein. At the bottom, the dark line shows the protein scale (protein position).
Frequency features

You can zoom in the frequency viewer by scrolling or by clicking and dragging with the mouse. To zoom out, scroll or right-click in the viewer.
Both step lines display the number of amino acid substitutions associated with a gnomAD (yellow, top) or Dolphin (blue, bottom) frequency.
Variant frequencies
All substitution frequencies were extracted from gnomAD version 2, which contains data from 125,748 exomes. For each missense mutation, we selected the most frequent mutational event leading to the amino acid substitution in any human population.
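
The selection rule above can be illustrated with a short sketch. It assumes each mutational event is a dictionary holding the affected protein, the position, the reference and alternate amino acids, and a mapping from population to allele frequency. These field names and the function `substitution_frequencies` are hypothetical and do not reflect the actual gnomAD file layout.

```python
from collections import defaultdict

def substitution_frequencies(events):
    """Keep, for each amino acid substitution, the highest allele frequency
    seen across all mutational events and all populations.

    events: iterable of dicts with the hypothetical keys 'protein',
    'position', 'ref_aa', 'alt_aa' and 'population_afs' (population -> AF).
    """
    best = defaultdict(float)
    for ev in events:
        key = (ev["protein"], ev["position"], ev["ref_aa"], ev["alt_aa"])
        # Highest frequency of this event in any population.
        event_af = max(ev["population_afs"].values(), default=0.0)
        best[key] = max(best[key], event_af)
    return dict(best)

# Two invented nucleotide changes that produce the same R -> H substitution.
events = [
    {"protein": "PROT1_HUMAN", "position": 42, "ref_aa": "R", "alt_aa": "H",
     "population_afs": {"pop_a": 1e-5, "pop_b": 3e-4}},
    {"protein": "PROT1_HUMAN", "position": 42, "ref_aa": "R", "alt_aa": "H",
     "population_afs": {"pop_a": 2e-6}},
]
print(substitution_frequencies(events))
```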
GnomAD web site:
Citation in publications:
Karczewski, K.J., Francioli, L.C., Tiao, G. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020). https://doi.org/10.1038/s41586-020-2308-7