Pfam includes annotations and additional family information from a range of different sources. These sources can be accessed via the tabs below.
This is the Wikipedia entry entitled "Metallothionein".
The Wikipedia text that you see displayed here is a download from Wikipedia. This means that the information we display is a copy of the information from the Wikipedia database. The button next to the article title ("Edit Wikipedia article") takes you to the edit page for the article directly within Wikipedia. You should be aware you are not editing our local copy of this information. Any changes that you make to the Wikipedia article will not be displayed here until we next download the article from Wikipedia. We currently download new content on a nightly basis.
Does Pfam agree with the content of the Wikipedia entry?
Pfam has chosen to link families to Wikipedia articles. In some cases we have created or edited these articles, but in many other cases we have not made any direct contribution to the content of the article. The Wikipedia community does monitor edits to try to ensure that (a) the quality of article annotation increases, and (b) vandalism is dealt with very quickly. However, we would like to emphasise that Pfam does not curate the Wikipedia entries and we cannot guarantee the accuracy of the information on the Wikipedia page.
Editing Wikipedia articles
Before you edit for the first time
Wikipedia is a free, online encyclopedia. Although anyone can edit or contribute to an article, Wikipedia has some strong editing guidelines and policies, which promote the Wikipedia standard of style and etiquette. Your edits and contributions are more likely to be accepted (and remain) if they are in accordance with this policy.
You should take a few minutes to view the following pages:
How your contribution will be recorded
Anyone can edit a Wikipedia entry. You can do this either as a new user or you can register with Wikipedia and log on. When you click on the "Edit Wikipedia article" button, your browser will direct you to the edit page for this entry in Wikipedia. If you are a registered user and currently logged in, your changes will be recorded under your Wikipedia user name. However, if you are not a registered user or are not logged on, your changes will be logged under your computer's IP address. This has two main implications. Firstly, as a registered Wikipedia user your edits are more likely to be seen as a valuable contribution (although all edits are open to community scrutiny regardless). Secondly, if you edit under an IP address you may be sharing this IP address with other users. If your IP address has previously been blocked (due to being flagged as a source of 'vandalism') your edits will also be blocked. You can find more information on this and on creating a user account at Wikipedia.
If you have problems editing a particular page, contact us at firstname.lastname@example.org and we will try to help.
The community annotation is a new facility of the Pfam web site. If you have problems editing or experience problems with these pages please contact us.
Metallothionein
|Metallothionein superfamily, eukaryotic|
Solution structure of the beta-E-domain of wheat Ec-1 metallothionein.
Ag-substituted metallothionein from Saccharomyces cerevisiae, NMR, minimized average structure.
Metallothioneins (MTs) are a family of cysteine-rich, low molecular weight (MW ranging from 500 to 14000 Da) proteins. They are localized to the membrane of the Golgi apparatus. MTs have the capacity to bind both physiological (such as zinc, copper, selenium) and xenobiotic (such as cadmium, mercury, silver, arsenic) heavy metals through the thiol groups of their cysteine residues, which represent nearly 30% of their amino acid residues.
MT was discovered in 1957 by Vallee and Margoshes during purification of a Cd-binding protein from horse (equine) renal cortex. The function of MTs is not clear, but experimental data suggest that MTs may provide protection against metal toxicity, be involved in the regulation of physiological metals (Zn and Cu) and provide protection against oxidative stress. There are four main isoforms expressed in humans (family 1, see chart below): MT1 (subtypes A, B, E, F, G, H, L, M, X), MT2, MT3 and MT4. In the human body, large quantities are synthesised primarily in the liver and kidneys. Their production depends on the availability of dietary minerals, such as zinc, copper and selenium, and of the amino acids histidine and cysteine.
Structure and classification
MTs are present in a vast range of taxonomic groups, ranging from prokaryotes (such as the cyanobacterium Synechococcus spp.), protozoa (e.g. the ciliate genus Tetrahymena), plants (such as Pisum sativum, Triticum durum, Zea mays and Quercus suber), yeast (such as Saccharomyces cerevisiae and Candida albicans) and invertebrates (such as the nematode Caenorhabditis elegans, the insect Drosophila melanogaster, the mollusc Mytilus edulis and the echinoderm Strongylocentrotus purpuratus) to vertebrates (such as the chicken, Gallus gallus, and the mammals Homo sapiens and Mus musculus).
The MTs from this diverse taxonomic range show high sequence heterogeneity (in molecular weight and in the number and distribution of Cys residues) and do not show general homology; in spite of this, homology is found within some taxonomic groups (such as vertebrate MTs).
From their primary structure, MTs have been classified by different methods. The first dates from 1987, when Fowler et al. established three classes of MTs: Class I, including the MTs which show homology with horse MT; Class II, including the rest of the MTs with no homology with horse MT; and Class III, which includes phytochelatins, Cys-rich enzymatically synthesised peptides. The second classification was performed by Binz and Kägi in 2001, and takes into account taxonomic parameters and the patterns of distribution of Cys residues along the MT sequence. It results in a classification of 15 families for proteinaceous MTs. Family 15 contains the plant MTs, which were further classified in 2002 by Cobbett and Goldsbrough into 4 Types (1, 2, 3 and 4) depending on the distribution of their Cys residues and of the Cys-devoid regions (called spacers) characteristic of plant MTs.
A table including the principal aspects of the two latter classifications is included.
|7||Ciliate||x-C-C-C-x ?||T. thermophila MTT1
|8||Fungal 1||C-G-C-S-x(4)-C-x-C-x(3,4)-C-x-C-S-x-C||N. crassa MT
|9||Fungal 2||---||C. glabrata MT2
|10||Fungal 3||---||C. glabrata MT2
|11||Fungal 4||C-X-K-C-x-C-x(2)-C-K-C||Y. lipolytica MT3
|12||Fungal 5||---||S. cerevisiae CUP1
|13||Fungal 6||---||S. cerevisiae CRS5
|14||Prokaryota||K-C-A-C-x(2)-C-L-C||Synechococcus sp. SmtA
|15.1||Plant MTs Type 1||C-X-C-X(3)-C-X-C-X(3)-C-X-C-X(3)-spacer-C-X-C-X(3)-C-X-C-X(3)-C-X-C-X(3)||Pisum sativum MT
|15.2||Plant MTs Type 2||C-C-X(3)-C-X-C-X(3)-C-X-C-X(3)-C-X-C-X(3)-spacer-C-X-C-X(3)-C-X-C-X(3)-C-X-C-X(3)||L. esculentum MT
|15.3||Plant MTs Type 3||---||A. thaliana MT3
|15.4||Plant MTs Type 4 or Ec||C-x(4)-C-X-C-X(3)-C-X(5)-C-X-C-X(9,11)-HTTCGCGEHC-
|99||Phytochelatins and other non-proteinaceous MT-like polypeptides||---||S.pombe
More data on this classification are available at the Expasy metallothionein page.
Secondary structure elements have been observed in several MTs (SmtA from Synechococcus, mammalian MT3, echinoderm SpMTA, MT from the fish Notothenia coriiceps, and crustacean MTH), but so far the content of such structures is considered to be poor in MTs, and their functional influence has not been considered.
The tertiary structure of MTs is also highly heterogeneous. While vertebrate, echinoderm and crustacean MTs show a two-domain (bidominial) structure with divalent metals such as Zn(II) or Cd(II) (the protein is folded so as to bind metals in two functionally independent domains, with a metallic cluster each), yeast and prokaryotic MTs show a single-domain (monodominial) structure (one domain with a single metallic cluster). Although no structural data are available for molluscan, nematode and Drosophila MTs, it is commonly assumed that the former are bidominial and the latter monodominial. No conclusive data are available for plant MTs, but two possible structures have been proposed: 1) a bidominial structure similar to that of vertebrate MTs; 2) a codominial structure, in which two Cys-rich domains interact to form a single metallic cluster.
Quaternary structure has not been broadly considered for MTs. Dimerization and oligomerization processes have been observed and attributed to several molecular mechanisms, including intermolecular disulfide formation, bridging through metals bound by either Cys or His residues on different MTs, or inorganic phosphate-mediated interactions. Dimeric and polymeric MTs have been shown to acquire novel properties upon metal detoxification, but the physiological significance of these processes has been demonstrated only in the case of prokaryotic Synechococcus SmtA. The MT dimer produced by this organism forms structures similar to zinc fingers and has Zn-regulatory activity.
Metallothioneins have diverse metal-binding preferences, which have been associated with functional specificity. As an example, the mammalian Mus musculus MT1 preferentially binds divalent metal ions (Zn(II), Cd(II),...), while yeast CUP1 is selective for monovalent metal ions (Cu(I), Ag(I),...). A novel functional classification of MTs as Zn- or Cu-thioneins is currently being developed based on these functional preferences.
Metallothioneins are characterised by an abundance of cysteine residues and a lack of generic secondary structure motifs. Yeast metallothionein (MT) is also known as copper metallothionein (CUP).
This protein functions in primary metal storage, transport and detoxification. More specifically, yeast MT stores copper and thereby protects the cell against copper toxicity by tightly chelating copper ions.
Yeast MT can be found in the following:
- Saccharomyces cerevisiae
- Neurospora crassa
Metallothionein has been documented to bind a wide range of metals including cadmium, zinc, mercury, copper, arsenic and silver. Metallation of MT was previously reported to occur cooperatively, but recent reports have provided strong evidence that metal binding occurs via a sequential, noncooperative mechanism. The observation of partially metallated MT (that is, MT with some free metal-binding capacity) suggests that these species are biologically important.
Metallothioneins likely participate in the uptake, transport, and regulation of zinc in biological systems. Mammalian MT binds three Zn(II) ions in its beta domain and four in the alpha domain. Cysteine is a sulfur-containing amino acid, hence the name "-thionein". However, the participation of inorganic sulfide and chloride ions has been proposed for some MT forms. In some MTs, mostly bacterial, histidine participates in zinc binding. By binding and releasing zinc, metallothioneins (MTs) may regulate zinc levels within the body. Zinc, in turn, is a key element for the activation and binding of certain transcription factors through its participation in the zinc finger region of the protein. Metallothionein also carries zinc ions (signals) from one part of the cell to another. When zinc enters a cell, it can be picked up by thionein (which thus becomes "metallothionein") and carried to another part of the cell where it is released to another organelle or protein. In this way the thionein-metallothionein becomes a key component of the zinc signaling system in cells. This system is particularly important in the brain, where zinc signaling is prominent both between and within nerve cells. It also seems to be important for the regulation of the tumor suppressor protein p53.
Control of oxidative stress
Cysteine residues from MTs can capture harmful oxidant radicals such as the superoxide and hydroxyl radicals. In this reaction, cysteine is oxidized to cystine, and the metal ions that were bound to cysteine are released into the medium. As explained in the Expression and regulation section, this Zn can activate the synthesis of more MTs. This has been proposed to be an important mechanism in the control of oxidative stress by MTs. The role of MTs in oxidative stress has been confirmed by MT knockout mutants, but some experiments also suggest a prooxidant role for MTs.
Expression and regulation
Metallothionein gene expression is induced by a wide variety of stimuli, such as metal exposure, oxidative stress, glucocorticoids and hydric stress. The level of the response to these inducers depends on the MT gene. MT genes carry specific regulatory sequences in their promoters, such as metal response elements (MRE), glucocorticoid response elements (GRE), GC-rich boxes, basal level elements (BLE), and thyroid response elements (TRE).
Metallothionein and disease
Because MTs play an important role in transcription factor regulation, problems with MT function or expression may lead to malignant transformation of cells and ultimately cancer. Studies have found increased expression of MTs in some cancers of the breast, colon, kidney, liver, skin (melanoma), lung, nasopharynx, ovary, prostate, mouth, salivary gland, testes, thyroid and urinary bladder; they have also found lower levels of MT expression in hepatocellular carcinoma and liver adenocarcinoma.
Heavy metal toxicity has been proposed as a hypothetical etiology of autism, and dysfunction of MT synthesis and activity may play a role in this. Many heavy metals, including mercury, lead, and arsenic have been linked to symptoms that resemble the neurological symptoms of autism. However, MT dysfunction has not specifically been linked to autistic spectrum disorders. A 2006 study, investigating children exposed to the vaccine preservative thiomersal, found that levels of MT and antibodies to MT in autistic children did not differ significantly from non-autistic children.
- PDB 2KAK; Peroza EA, Schmucki R, Güntert P, Freisinger E, Zerbe O (March 2009). "The beta(E)-domain of wheat E(c)-1 metallothionein: a metal-binding domain with a distinctive structure". J. Mol. Biol. 387 (1): 207–18. doi:10.1016/j.jmb.2009.01.035. PMID 19361445.
- Sigel H, Sigel A, ed. (2009). Metallothioneins and Related Chelators (Metal Ions in Life Sciences). Metal Ions in Life Sciences 5. Cambridge, England: Royal Society of Chemistry. ISBN 1-84755-899-2.
- Margoshes M, Vallee BL (1957). "A cadmium protein from equine kidney cortex". Journal of the American Chemical Society 79 (17): 4813. doi:10.1021/ja01574a064.
- "Metallothioneins: classification and list of entries". www.uniprot.org.
- Peterson CW, Narula SS, Armitage IM (January 1996). "3D solution structure of copper and silver-substituted yeast metallothioneins". FEBS Lett. 379 (1): 85–93. doi:10.1016/0014-5793(95)01492-6. PMID 8566237.
- Butt TR, Ecker DJ (September 1987). "Yeast metallothionein and applications in biotechnology". Microbiol. Rev. 51 (3): 351–64. PMC 373116. PMID 3312986.
- Freisinger E, Vašák M (2013). "Cadmium in metallothioneins". Met Ions Life Sci 11: 339–71. doi:10.1007/978-94-007-5179-8_11. PMID 23430778.
- Krezel A, Maret W (September 2007). "Dual nanomolar and picomolar Zn(II) binding properties of metallothionein". J. Am. Chem. Soc. 129 (35): 10911–21. doi:10.1021/ja071979s. PMID 17696343.
- Huang M, Krepkiy D, Hu W, and Petering D (2004). "Zn-, Cd-, and Pb-transcription factor IIIA: properties, DNA binding, and comparison with TFIIIA-finger 3 metal complexes". Journal of Inorganic Biochemistry 98 (5): 775–785. doi:10.1016/j.jinorgbio.2004.01.014. PMID 15134923.
- Huang M, Shaw CF, and Petering D (2004). "Interprotein metal exchange between transcription factor IIIa and apo-metallothionein". Journal of Inorganic Biochemistry 98 (4): 639–648. doi:10.1016/j.jinorgbio.2004.02.004. PMID 15041244.
- Kumari MV, Hiramatsu M, Ebadi M (August 1998). "Free radical scavenging actions of metallothionein isoforms I and II". Free Radic. Res. 29 (2): 93–101. doi:10.1080/10715769800300111. PMID 9790511.
- Krizkova S, Fabrik I, Adam V, Hrabeta J, Eckschlager T, Kizek R (2009). "Metallothionein--a promising tool for cancer diagnostics". Bratisl Lek Listy 110 (2): 93–7. PMID 19408840.
- Cherian, M. (2003). "Metallothioneins in human tumors and potential roles in carcinogenesis". Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis 533: 201–209. doi:10.1016/j.mrfmmm.2003.07.013. PMID 14643421.
- Drum DA (October 2009). "Are toxic biometals destroying your children's future?". Biometals 22 (5): 697–700. doi:10.1007/s10534-009-9212-9. PMID 19205900.
- Singh VK, Hanson J (June 2006). "Assessment of metallothionein and antibodies to metallothionein in normal and autistic children having exposure to vaccine-derived thimerosal". Pediatr Allergy Immunol 17 (4): 291–6. doi:10.1111/j.1399-3038.2005.00348.x. PMID 16771783.
This tab holds the annotation information that is stored in the Pfam database. As we move to using Wikipedia as our main source of annotation, the contents of this tab will be gradually replaced by the Wikipedia tab.
Metallothionein
This family consists of metallothioneins from several worm and sea urchin species. Metallothioneins are low molecular weight, cysteine-rich proteins known to be involved in heavy metal detoxification and homeostasis.
External database links
This tab holds annotation information from the InterPro database.
InterPro entry IPR017980
Metallothioneins (MT) are small proteins that bind heavy metals, such as zinc, copper, cadmium, nickel, etc. They have a high content of cysteine residues that bind the metal ions through clusters of thiolate bonds [PUBMED:1779825, PUBMED:2959513]. An empirical classification into three classes has been proposed by Fowler and coworkers [PUBMED:2959504] and Kojima [PUBMED:1779826]. Members of class I are defined to include polypeptides related in the positions of their cysteines to equine MT-1B, and include mammalian MTs as well as from crustaceans and molluscs. Class II groups MTs from a variety of species, including sea urchins, fungi, insects and cyanobacteria. Class III MTs are atypical polypeptides composed of gamma-glutamylcysteinyl units [PUBMED:2959504].
This original classification system has been found to be limited, in the sense that it does not allow clear differentiation of patterns of structural similarities, either between or within classes. Consequently, all class I and class II MTs (the proteinaceous sequences) have now been grouped into families of phylogenetically-related and thus alignable sequences. This system subdivides the MT superfamily into families, subfamilies, subgroups, and isolated isoforms and alleles.
The metallothionein superfamily comprises all polypeptides that resemble equine renal metallothionein in several respects [PUBMED:2959504]: e.g., low molecular weight; high metal content; amino acid composition with high Cys and low aromatic residue content; unique sequence with characteristic distribution of cysteines, and spectroscopic manifestations indicative of metal thiolate clusters. A MT family subsumes MTs that share particular sequence-specific features and are thought to be evolutionarily related. The inclusion of a MT within a family presupposes that its amino acid sequence is alignable with that of all members. Fifteen MT families have been characterised, each family being identified by its number and its taxonomic range: e.g., Family 1: vertebrate MTs [see http://www.bioc.unizh.ch/mtpage/protali.html].
Echinoidea (sea urchin, family 4) MTs are 64-67 residue proteins. Members of this family are recognised by the sequence pattern P-D-x-K-C-[V,F]-C-C-x(5)-C-x-C-x(4)-C-C-x(4)-C-C-x(4,6)-C-C located near the N terminus. The taxonomic range of the members extends to sea urchins (Echinoidea). The protein sequence is divided into two structural domains, containing 9 and 11 Cys residues and binding 3 and 4 bivalent metal ions, respectively. Family 4 includes two subfamilies, e1 and e2, which are separate phylogenetic groups.
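The family 4 pattern quoted above is written in a PROSITE-like notation, which can be translated into a regular expression for quick scanning of candidate sequences. The following is a minimal sketch only: the function name `pattern_to_regex` is hypothetical, the handling of the comma inside `[V,F]` is an assumption about how this page writes residue alternatives, and the test sequence is synthetic rather than a real metallothionein.

```python
import re

# Pattern as quoted in the InterPro text for family 4 (sea urchin) MTs.
FAMILY4_PATTERN = "P-D-x-K-C-[V,F]-C-C-x(5)-C-x-C-x(4)-C-C-x(4)-C-C-x(4,6)-C-C"

# One token: a residue, "x", or a bracketed set, optionally followed by (n) or (n,m).
_TOKEN = re.compile(r"([A-Za-z]|\[[A-Za-z,]+\])(?:\((\d+)(?:,(\d+))?\))?")


def pattern_to_regex(pattern: str) -> str:
    """Translate a hyphen-separated PROSITE-like pattern into a regex string."""
    pieces = []
    for token in pattern.split("-"):
        match = _TOKEN.fullmatch(token.strip())
        if match is None:
            raise ValueError(f"unsupported token: {token!r}")
        residue, low, high = match.groups()
        if residue.lower() == "x":
            piece = "."  # any residue
        elif residue.startswith("["):
            # Assumption: commas inside brackets only separate alternatives.
            piece = "[" + residue[1:-1].replace(",", "").upper() + "]"
        else:
            piece = residue.upper()
        if low is not None:
            piece += "{" + low + ("," + high if high else "") + "}"
        pieces.append(piece)
    return "".join(pieces)


if __name__ == "__main__":
    regex = pattern_to_regex(FAMILY4_PATTERN)
    print(regex)
    # Synthetic sequence built to satisfy the pattern (not a real protein).
    toy = "MPDAKCVCCAAAAACACAAAACCAAAACCAAAAACC"
    print(bool(re.search(regex, toy)))
```

Tokens such as the "spacer" used in the plant MT rows of the classification table are not supported by this sketch and raise a `ValueError`.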
This entry includes the sea urchin proteins, and related sequences from worms.
Below is a listing of the unique domain organisations or architectures in which this domain is found.
The graphic that is shown by default represents the longest sequence with a given architecture. Each row contains the following information:
- the number of sequences which exhibit this architecture
- a textual description of the architecture, e.g. Gla, EGF x 2, Trypsin. This example describes an architecture with one Gla domain, followed by two consecutive EGF domains, and finally a single Trypsin domain
- a link to the page in the Pfam site showing information about the sequence that the graphic describes
- the UniProt description of the protein sequence
- the number of residues in the sequence
- the Pfam graphic itself.
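The textual architecture description in the list above can be expanded into an ordered list of domains. This is a small sketch under the assumption that entries are comma-separated and that repeats are written as "NAME x N", as in the page's own example; the function name `parse_architecture` is hypothetical and not part of any Pfam tool.

```python
import re
from typing import List


def parse_architecture(description: str) -> List[str]:
    """Expand e.g. 'Gla, EGF x 2, Trypsin' into one list entry per domain."""
    domains: List[str] = []
    for item in description.split(","):
        item = item.strip()
        if not item:
            continue
        # "EGF x 2" means two consecutive EGF domains.
        repeat = re.fullmatch(r"(.+?)\s+x\s+(\d+)", item)
        if repeat:
            domains.extend([repeat.group(1)] * int(repeat.group(2)))
        else:
            domains.append(item)
    return domains


if __name__ == "__main__":
    print(parse_architecture("Gla, EGF x 2, Trypsin"))
```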
Note that you can see the family page for a particular domain by clicking on the graphic. You can also choose to see all sequences which have a given architecture by clicking on the Show link in each row.
Finally, because some families can be found in a very large number of architectures, we load only the first fifty architectures by default. If you want to see more architectures, click the button at the bottom of the page to load the next set.
This superfamily contains related families of metallothioneins from prokaryotes and eukaryotes.
The clan contains the following 5 members: Metallothi_Euk2, Metallothio, Metallothio_6, Metallothio_Euk, Metallothio_Pro
We store a range of different sequence alignments for families. As well as the seed alignment from which the family is built, we provide the full alignment, generated by searching the sequence database using the family HMM. We also generate alignments using four representative proteomes (RP) sets, the NCBI sequence database, and our metagenomics sequence database.
There are various ways to view or download the sequence alignments that we store. We provide several sequence viewers and a plain-text Stockholm-format file for download.
We make a range of alignments for each Pfam-A family:
- the curated alignment from which the HMM for the family is built
- the alignment generated by searching the sequence database using the HMM
- Representative Proteomes (RPs) at 15%, 35%, 55% and 75% co-membership thresholds
- alignment generated by searching the NCBI sequence database using the family HMM
- alignment generated by searching the metagenomics sequence database using the family HMM
You can see the alignments as HTML or in three different sequence viewers:
- a Java applet developed at the University of Dundee. You will need Java installed before running jalview
- an HTML page showing the whole alignment. Please note: full Pfam alignments can be very large. These HTML views are extremely large and often cause problems for browsers. Please use either jalview or the Pfam viewer if you have trouble viewing the HTML version
- an HTML-based representation of the alignment, coloured according to the posterior-probability (PP) values from the HMM. As for the standard HTML view, heatmap alignments can also be very large and slow to render.
- Pfam viewer
- an HTML-based viewer that uses DAS to retrieve alignment fragments on request
You can download (or view in your browser) a text representation of a Pfam alignment in various formats:
You can also change the order in which sequences are listed in the alignment, change how insertions are represented, alter the characters that are used to represent gaps in sequences and, finally, choose whether to download the alignment or to view it in your browser directly.
You may find that large alignments cause problems for the viewers and the reformatting tool, so we also provide all alignments in Stockholm format. You can download either the plain text alignment, or a gzipped version of it.
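For readers who download the plain-text file, a Stockholm alignment can be read with a few lines of code. The sketch below is a deliberately minimal reader, not a full implementation of the format: it keeps only `#=GF` annotations and the aligned sequences, concatenates interleaved blocks, and ignores other markup lines. The function name `parse_stockholm` and the toy alignment are illustrative assumptions.

```python
from typing import Dict, List, Tuple


def parse_stockholm(text: str) -> Tuple[Dict[str, List[str]], Dict[str, str]]:
    """Return (#=GF annotations, aligned sequences) for the first alignment in text."""
    annotations: Dict[str, List[str]] = {}
    sequences: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line or line.startswith("# STOCKHOLM"):
            continue
        if line == "//":  # end-of-alignment marker
            break
        if line.startswith("#=GF"):
            parts = line.split(None, 2)
            if len(parts) == 3:
                annotations.setdefault(parts[1], []).append(parts[2])
            continue
        if line.startswith("#"):  # other markup (#=GS, #=GR, #=GC) is skipped here
            continue
        fields = line.split(None, 1)
        if len(fields) == 2:
            name, chunk = fields
            # Interleaved files repeat the name in each block; join the pieces.
            sequences[name] = sequences.get(name, "") + chunk.strip()
    return annotations, sequences


if __name__ == "__main__":
    toy = "# STOCKHOLM 1.0\n#=GF ID Toy\nseqA/1-4 CC-C\nseqB/1-4 CCKC\n//\n"
    print(parse_stockholm(toy))
```

For the gzipped download, the same function can be applied to text read with Python's standard `gzip.open(path, "rt")`, where `path` is whatever local file name you saved the alignment under.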
We make a range of alignments for each Pfam-A family. You can see a description of each above. You can view these alignments in various ways but please note that some types of alignment are never generated while others may not be available for all families, most commonly because the alignments are too large to handle.
Note: PP/heatmap alignments cannot be generated for seed alignments, because no PP data are available.
Format an alignment
We make all of our alignments available in Stockholm format. You can download them here as raw, plain text files or as gzip-compressed files.
You can also download a FASTA format file containing the full-length sequences for all sequences in the full alignment.
MyHits provides a collection of tools to handle multiple sequence alignments. For example, one can refine a seed alignment (sequence addition or removal, re-alignment or manual edition) and then search databases for remote homologs using HMMER3.
HMM logos are one way of visualising profile HMMs. Logos provide a quick overview of the properties of an HMM in a graphical form. A more detailed description of HMM logos, including how to interpret them, is available in the help pages.
If you find these logos useful in your own work, please consider citing the following article:
This page displays the phylogenetic tree for this family's seed alignment. We use FastTree to calculate neighbour-joining trees with a local bootstrap based on 100 resamples (shown next to the tree nodes). FastTree calculates approximately-maximum-likelihood phylogenetic trees from our seed alignment.
Note: You can also download the data file for the tree.
Curation and family details
This section shows the detailed information about the Pfam family. You can see the definitions of many of the terms in this section in the glossary and a fuller explanation of the scoring system that we use in the scores section of the help pages.
|Seed source:||Pfam-B_1360 (release 8.0)|
|Number in seed:||7|
|Number in full:||30|
|Average length of the domain:||66.20 aa|
|Average identity of full alignment:||52 %|
|Average coverage of the sequence by the domain:||80.30 %|
|HMM build commands:||
build method: hmmbuild -o /dev/null HMM SEED
search method: hmmsearch -Z 23193494 -E 1000 --cpu 4 HMM pfamseq
|Family (HMM) version:||6|
|Download:||download the raw HMM for this family|
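The build and search commands listed in the table can be assembled programmatically if you want to repeat them on your own data. The sketch below only constructs the argument lists; it assumes HMMER 3 is installed and that the placeholder names HMM, SEED and pfamseq from the table refer to local files of your own. The helper names are hypothetical. In HMMER, `-o` redirects the main output, `-Z` sets the database size used for E-value calculation, `-E` is the E-value reporting threshold and `--cpu` is the number of worker threads.

```python
from typing import List


def hmmbuild_command(hmm_path: str, seed_path: str) -> List[str]:
    """Build method from the table: hmmbuild -o /dev/null HMM SEED."""
    return ["hmmbuild", "-o", "/dev/null", hmm_path, seed_path]


def hmmsearch_command(hmm_path: str, seqdb_path: str,
                      db_size: int = 23193494,
                      evalue: float = 1000,
                      cpus: int = 4) -> List[str]:
    """Search method from the table: hmmsearch -Z ... -E ... --cpu ... HMM pfamseq."""
    return ["hmmsearch", "-Z", str(db_size), "-E", str(evalue),
            "--cpu", str(cpus), hmm_path, seqdb_path]


if __name__ == "__main__":
    print(" ".join(hmmbuild_command("HMM", "SEED")))
    print(" ".join(hmmsearch_command("HMM", "pfamseq")))
    # To actually run a command (requires HMMER on PATH):
    #   import subprocess
    #   subprocess.run(hmmbuild_command("HMM", "SEED"), check=True)
```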
This visualisation provides a simple graphical representation of the distribution of this family across species. The original interactive tree is also available.
This chart is a modified "sunburst" visualisation of the species tree for this family. It shows each node in the tree as a separate arc, arranged radially with the superkingdoms at the centre and the species arrayed around the outermost ring.
How the sunburst is generated
The tree is built by considering the taxonomic lineage of each sequence that has a match to this family. For each node in the resulting tree, we draw an arc in the sunburst. The radius of the arc, its distance from the root node at the centre of the sunburst, shows the taxonomic level ("superkingdom", "kingdom", etc). The length of the arc represents either the number of sequences represented at a given level, or the number of species that are found beneath the node in the tree. The weighting scheme can be changed using the sunburst controls.
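The layout rule described above (depth gives the ring, weight gives the arc length) can be sketched as a recursive partition of the circle. This is an illustration under stated assumptions, not the site's actual drawing code: the `Node` class, the `layout` function and the toy counts are hypothetical, and only the "number of sequences" weighting is shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    name: str
    count: int = 0  # sequences attached directly to this node (leaves)
    children: List["Node"] = field(default_factory=list)


def weight(node: Node) -> int:
    """Number of sequences at or below this node."""
    return node.count + sum(weight(child) for child in node.children)


def layout(node: Node, start: float = 0.0, end: float = 360.0, depth: int = 0,
           arcs: Optional[List[Tuple[str, int, float, float]]] = None):
    """Return (name, ring depth, start angle, end angle) for every node."""
    if arcs is None:
        arcs = []
    arcs.append((node.name, depth, start, end))
    total = sum(weight(child) for child in node.children)
    angle = start
    for child in node.children:
        # Each child gets a share of the parent's arc proportional to its weight.
        span = (end - start) * weight(child) / total if total else 0.0
        layout(child, angle, angle + span, depth + 1, arcs)
        angle += span
    return arcs


if __name__ == "__main__":
    tree = Node("root", children=[
        Node("Eukaryota", children=[Node("species A", 2), Node("species B", 1)]),
        Node("Bacteria", children=[Node("species C", 1)]),
    ])
    for arc in layout(tree):
        print(arc)
```

Switching to the "number of species" weighting would only require `weight` to count leaves instead of summing sequence counts.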
In order to reduce the complexity of the representation, we reduce the number of taxonomic levels that we show. We consider only the following eight major taxonomic levels:
Colouring and labels
Segments of the tree are coloured approximately according to their superkingdom. For example, archaeal branches are coloured with shades of orange, eukaryotes in shades of purple, etc. The colour assignments are shown under the sunburst controls. Where space allows, the name of the taxonomic level will be written on the arc itself.
As you move your mouse across the sunburst, the current node will be highlighted. In the top section of the controls panel we show a summary of the lineage of the currently highlighted node. If you pause over an arc, a tooltip will be shown, giving the name of the taxonomic level in the title and a summary of the number of sequences and species below that node in the tree.
Anomalies in the taxonomy tree
There are some situations that the sunburst tree cannot easily handle and for which we have work-arounds in place.
Missing taxonomic levels
Some species in the taxonomic tree may not have one or more of the main eight levels that we display. For example, Bos taurus is not assigned an order in the NCBI taxonomic tree. In such cases we mark the omitted level with, for example, "No order", in both the tooltip and the lineage summary.
Unmapped species names
The tree is built by looking at each sequence in the full alignment for the family. We take the name of the species given by UniProt and try to map that to the full taxonomic tree from NCBI. In some cases, the name chosen by UniProt does not map to any node in the NCBI tree, perhaps because the chosen name is listed as a synonym or a misspelling in the NCBI taxonomy.
So that these nodes are not simply omitted from the sunburst tree, we group them together in a separate branch (or segment of the sunburst tree). Since we cannot determine the lineage for these unmapped species, we show all levels between the superkingdom and the species as "uncategorised".
Since we reduce the species tree to only the eight main taxonomic levels, sequences that are mapped to the sub-species level in the tree would not normally be shown. Rather than leave out these species, we map them instead to their parent species. So, for example, for sequences belonging to one of the Vibrio cholerae sub-species in the NCBI taxonomy, we show them instead as belonging to the species Vibrio cholerae.
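The work-arounds described above can be summarised in a short sketch that reduces a full lineage to the displayed levels. The list of eight level names below is an assumption (the page's own list is not reproduced in this text), as are the function name `reduce_lineage` and the toy lineages; missing levels are labelled "No <level>" as in the Bos taurus example, and ranks below species are dropped so that a sub-species falls back to its parent species.

```python
from typing import List, Tuple

# Assumed set of eight main taxonomic levels, from root to tip.
MAIN_LEVELS = ["superkingdom", "kingdom", "phylum", "class",
               "order", "family", "genus", "species"]


def reduce_lineage(lineage: List[Tuple[str, str]]) -> List[str]:
    """lineage is a root-to-tip list of (rank, name) pairs.

    Returns one label per main level. Ranks outside MAIN_LEVELS (for example
    a sub-species) are ignored, and absent levels become 'No <level>'.
    """
    by_rank = {}
    for rank, name in lineage:
        if rank in MAIN_LEVELS and rank not in by_rank:
            by_rank[rank] = name
    return [by_rank.get(level, "No " + level) for level in MAIN_LEVELS]


if __name__ == "__main__":
    # Toy lineage with no order assigned and an extra sub-species rank.
    toy = [("superkingdom", "Eukaryota"), ("kingdom", "Metazoa"),
           ("phylum", "Chordata"), ("class", "Mammalia"),
           ("family", "Bovidae"), ("genus", "Bos"),
           ("species", "Bos taurus"), ("subspecies", "hypothetical sub-species")]
    print(reduce_lineage(toy))
```

Unmapped species, which have no lineage at all, would need a separate branch whose intermediate levels are labelled "uncategorised"; that case is not covered by this sketch.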
Too many species/sequences
For large species trees, you may see blank regions in the outer layers of the sunburst. These occur when there are large numbers of arcs to be drawn in a small space. If an arc is less than approximately one pixel wide, it will not be drawn and the space will be left blank. You may still be able to get some information about the species in that region by moving your mouse across the area, but since each arc will be very small, it will be difficult to accurately locate a particular species.
The tree shows the occurrence of this domain across different species.
We show the species tree in one of two ways. For smaller trees we try to show an interactive representation, which allows you to select specific nodes in the tree and view them as an alignment or as a set of Pfam domain graphics.
Unfortunately we have found that there are problems viewing the interactive tree when it becomes larger than a certain limit. Furthermore, we have found that Internet Explorer can become unresponsive when viewing some trees, regardless of their size. We therefore show a text representation of the species tree when the size is above a certain limit or if you are using Internet Explorer to view the site.
If you are using IE you can still load the interactive tree by clicking the "Generate interactive tree" button, but please be aware of the potential problems that the interactive species tree can cause.
For all of the domain matches in a full alignment, we count the number that are found on all sequences in the alignment. This total is shown in the purple box.
We also count the number of unique sequences on which each domain is found, which is shown in green. Note that a domain may appear multiple times on the same sequence, leading to the difference between these two numbers.
Finally, we group sequences from the same organism according to the NCBI code that is assigned by UniProt, allowing us to count the number of distinct sequences on which the domain is found. This value is shown in the pink boxes.
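The three summary numbers described above can be derived from a flat list of domain matches. The sketch below assumes each match is recorded as a (sequence accession, NCBI taxon code) pair and interprets the third value as the number of distinct organisms, which is an assumption about the intended meaning of the grouping step; the function name and the toy accessions are hypothetical.

```python
from typing import Iterable, Tuple


def summarise_matches(matches: Iterable[Tuple[str, int]]) -> Tuple[int, int, int]:
    """Return (total domain matches, unique sequences, distinct organisms)."""
    matches = list(matches)
    total_matches = len(matches)                            # every domain hit counts
    unique_sequences = len({acc for acc, _ in matches})     # a sequence may carry several hits
    distinct_organisms = len({taxon for _, taxon in matches})
    return total_matches, unique_sequences, distinct_organisms


if __name__ == "__main__":
    toy = [("SEQ_A", 1), ("SEQ_A", 1), ("SEQ_B", 1), ("SEQ_C", 2)]
    print(summarise_matches(toy))
```

In the toy data SEQ_A carries the domain twice, which is why the first number exceeds the second.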
We use the NCBI species tree to group organisms according to their taxonomy and this forms the structure of the displayed tree. Note that in some cases the trees are too large (have too many nodes) to allow us to build an interactive tree, but in most cases you can still view the tree in a plain text, non-interactive representation. Those species which are represented in the seed alignment for this domain are highlighted.
You can use the tree controls to manipulate how the interactive tree is displayed:
- show/hide the summary boxes
- highlight species that are represented in the seed alignment
- expand/collapse the tree or expand it to a given depth
- select a sub-tree or a set of species within the tree and view them graphically or as an alignment
- save a plain text representation of the tree
Please note: for large trees this can take some time. While the tree is loading, you can safely switch away from this tab but if you browse away from the family page entirely, the tree will not be loaded.
For those sequences which have a structure in the Protein DataBank, we use the mapping between UniProt, PDB and Pfam coordinate systems from the PDBe group, to allow us to map Pfam domains onto UniProt sequences and three-dimensional protein structures. The table below shows the structures on which the Metallothio_6 domain has been found. There are 2 instances of this domain found in the PDB. Note that there may be multiple copies of the domain in a single PDB structure, since many structures contain multiple copies of the same protein sequence.