Summary: Ubiquitin-like domain
Ubiquitin
A diagram of ubiquitin. The seven lysine sidechains are shown in orange.
Ubiquitin is a small (8.5 kDa) regulatory protein that has been found in almost all tissues (ubiquitously) of eukaryotic organisms. It was discovered in 1975 by Goldstein and further characterized throughout the 1970s and 1980s. There are four genes in the human genome that produce ubiquitin: UBB, UBC, UBA52 and RPS27A.
Ubiquitination is a post-translational modification (an addition to a protein after it has been made) where ubiquitin is attached to a substrate protein. The addition of ubiquitin can affect proteins in many ways: It can signal for their degradation via the proteasome, alter their cellular location, affect their activity, and promote or prevent protein interactions. Ubiquitination is carried out in three main steps: activation, conjugation, and ligation, performed by ubiquitin-activating enzymes (E1s), ubiquitin-conjugating enzymes (E2s), and ubiquitin ligases (E3s), respectively. The result of this sequential cascade binds ubiquitin to lysine residues on the protein substrate via an isopeptide bond or to the amino group of the protein's N-terminus via a peptide bond.
The protein modifications can be either a single ubiquitin protein or chains of ubiquitin. There are different forms of chains, named according to which of the seven lysine residues links the chain together. Lysine 48-linked chains, linked through the 48th amino acid of ubiquitin (a lysine), have been studied extensively. They are the form of chain that targets proteins to the proteasome, which destroys and recycles proteins. This discovery won the Nobel Prize for chemistry in 2004. Lysine 63-linked chains, linked through the 63rd amino acid of ubiquitin (a lysine), regulate processes such as endocytic trafficking, inflammation, translation and DNA repair.
- 1 Identification
- 2 The protein
- 3 Genes
- 4 Origins
- 5 Ubiquitination
- 6 Variety of ubiquitin modifications
- 7 Functions of ubiquitin modification
- 8 Deubiquitination
- 9 Ubiquitin-binding domains
- 10 Disease associations
- 11 Ubiquitin-like modifiers
- 12 Prokaryotic ubiquitin-like protein (Pup)
- 13 Human proteins containing ubiquitin domain
- 14 Related proteins
- 15 Prediction of ubiquitination
- 16 See also
- 17 References
- 18 External links
Identification
Ubiquitin (originally, ubiquitous immunopoietic polypeptide) was first identified in 1975 as an 8.5 kDa protein of unknown function expressed in all eukaryotic cells. The basic functions of ubiquitin and the components of the ubiquitination pathway were elucidated in the early 1980s at the Technion by Aaron Ciechanover, Avram Hershko, and Irwin Rose, for which the Nobel Prize in Chemistry was awarded in 2004.
The ubiquitination system was initially characterised as an ATP-dependent proteolytic system present in cellular extracts. A heat-stable polypeptide present in these extracts, ATP-dependent proteolysis factor 1 (APF-1), was found to become covalently attached to the model protein substrate lysozyme in an ATP- and Mg2+-dependent process. Multiple APF-1 molecules were linked to a single substrate molecule by an isopeptide linkage, and conjugates were found to be rapidly degraded with the release of free APF-1. Soon after APF-1-protein conjugation was characterised, APF-1 was identified as ubiquitin. The carboxyl group of the C-terminal glycine residue of ubiquitin (Gly76) was identified as the moiety conjugated to substrate lysine residues.
The protein
|Number of residues||76|
|Molecular mass||8564.8448 Da|
|Isoelectric point (pI)||6.79|
|Gene names||RPS27A (UBA80, UBCEP1), UBA52 (UBCEP2), UBB, UBC|
|Sequence in amino acid abbreviations||MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPD
Ubiquitin is a small protein that exists in all eukaryotic cells. It performs its myriad functions through conjugation to a large range of target proteins. A variety of different modifications can occur. The ubiquitin protein itself consists of 76 amino acids and has a molecular mass of about 8.5 kDa. Key features include its C-terminal tail and the 7 lysine residues. It is highly conserved among eukaryotic species: Human and yeast ubiquitin share 96% sequence identity.
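The features listed above can be checked directly against the sequence. The following is a minimal sketch rather than a reference implementation: the sequence table in this entry is cut off after the first 39 residues, so the full 76-residue string below is an assumption (the commonly quoted human ubiquitin sequence), and the helper names `lysine_positions` and `percent_identity` are hypothetical.

```python
# Assumed full-length human ubiquitin sequence. The table in this entry
# only shows the first 39 residues (MQIFVK...GIPPD); the rest is filled
# in from the commonly quoted sequence and should be verified.
UBIQUITIN = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPD"
    "QQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
)


def lysine_positions(sequence):
    """Return the 1-based positions of every lysine (K) in the sequence."""
    return [i for i, residue in enumerate(sequence, start=1) if residue == "K"]


def percent_identity(seq_a, seq_b):
    """Percentage of identical residues between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


print(len(UBIQUITIN))               # number of residues
print(lysine_positions(UBIQUITIN))  # lysines available for chain linkage
print(UBIQUITIN[-1])                # C-terminal residue used for conjugation
```

With this sequence the lysines fall at positions 6, 11, 27, 29, 33, 48 and 63, and the final residue is glycine. As a rough check on the quoted conservation, three substitutions across 76 aligned positions would give 73/76, or about 96% identity.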
Genes
Ubiquitin is encoded in mammals by four different genes. The UBA52 and RPS27A genes each code for a single copy of ubiquitin fused to a ribosomal protein, L40 and S27a, respectively. The UBB and UBC genes code for polyubiquitin precursor proteins.
Origins
Neither ubiquitin nor the ubiquitination machinery is known to exist in prokaryotes. However, ubiquitin is believed to have descended from prokaryotic proteins similar to ThiS or MoaD. These prokaryotic proteins, despite having little sequence identity (ThiS has 14% identity to ubiquitin), share the same protein fold. These proteins also share sulfur chemistry with ubiquitin. MoaD, which is involved in molybdenum cofactor biosynthesis, interacts with MoeB, which acts like an E1 ubiquitin-activating enzyme for MoaD, strengthening the link between these prokaryotic proteins and the ubiquitin system. A similar system exists for ThiS, with its E1-like enzyme ThiF. It is also believed that the Saccharomyces cerevisiae protein Urm-1, a ubiquitin-related modifier, is a "molecular fossil" that provides an evolutionary link between the prokaryotic ubiquitin-like molecules and ubiquitin.
Ubiquitination
Ubiquitination (also known as ubiquitylation) is an enzymatic, post-translational modification (PTM) process in which a ubiquitin protein is attached to a substrate protein. This process most commonly binds the last amino acid of ubiquitin (glycine 76) to a lysine residue on the substrate. An isopeptide bond is formed between the carboxylic acid group of the ubiquitin's glycine and the epsilon amino group of the substrate's lysine. Cases are known in which the amine group of a protein's N-terminus is used for ubiquitination, rather than a lysine residue. In a few rare cases nonlysine residues have been identified as ubiquitination targets, such as cysteine, threonine and serine. The end result of this process is the addition of one ubiquitin molecule (monoubiquitination) or a chain of ubiquitin molecules (polyubiquitination) to the substrate protein.
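The acceptor sites described above can be enumerated for any substrate sequence. The sketch below only lists residues that are chemically eligible according to the text; it says nothing about which sites are actually used in a cell. The function name `candidate_sites` and the example sequence are hypothetical.

```python
# Residue classes named in the text as ubiquitin acceptors.
CANONICAL = {"K"}                  # lysine (isopeptide bond to the epsilon amino group)
RARE_NON_LYSINE = {"C", "S", "T"}  # cysteine, serine, threonine (rare cases)


def candidate_sites(substrate, include_rare=False):
    """List (position, residue, kind) tuples for possible attachment sites.

    Positions are 1-based. The N-terminal amino group is always reported
    first, as position 1 with kind "N-terminus".
    """
    sites = [(1, substrate[0], "N-terminus")] if substrate else []
    for pos, residue in enumerate(substrate, start=1):
        if residue in CANONICAL:
            sites.append((pos, residue, "lysine"))
        elif include_rare and residue in RARE_NON_LYSINE:
            sites.append((pos, residue, "rare non-lysine"))
    return sites


# Hypothetical six-residue substrate, used purely for illustration.
for site in candidate_sites("MKTSAK", include_rare=True):
    print(site)
```

Keeping the rare residues behind an explicit flag mirrors the text, which treats lysine and the N-terminus as the usual targets and cysteine, serine and threonine as exceptions.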
Ubiquitination requires three types of enzyme: ubiquitin-activating enzymes, ubiquitin-conjugating enzymes, and ubiquitin ligases, known as E1s, E2s, and E3s, respectively. The process consists of three main steps:
- Activation: Ubiquitin is activated in a two-step reaction by an E1 ubiquitin-activating enzyme, which is dependent on ATP. The initial step involves production of a ubiquitin-adenylate intermediate. The E1 binds both ATP and ubiquitin and catalyses the acyl-adenylation of the C-terminus of the ubiquitin molecule. The second step transfers ubiquitin to an active site cysteine residue, with release of AMP. This step results in a thioester linkage between the C-terminal carboxyl group of ubiquitin and the E1 cysteine sulfhydryl group. The human genome contains two genes that produce enzymes capable of activating ubiquitin: UBA1 and UBA6.
- Conjugation: E2 ubiquitin-conjugating enzymes catalyse the transfer of ubiquitin from E1 to the active site cysteine of the E2 via a trans(thio)esterification reaction. In order to perform this reaction, the E2 binds to both activated ubiquitin and the E1 enzyme. Humans possess 35 different E2 enzymes, whereas other eukaryotic organisms have between 16 and 35. They are characterised by their highly conserved structure, known as the ubiquitin-conjugating catalytic (UBC) fold.
- Ligation: E3 ubiquitin ligases catalyse the final step of the ubiquitination cascade. Most commonly, they create an isopeptide bond between a lysine of the target protein and the C-terminal glycine of ubiquitin. In general, this step requires the activity of one of the hundreds of E3s. E3 enzymes function as the substrate recognition modules of the system and are capable of interaction with both E2 and substrate. Some E3 enzymes also activate the E2 enzymes. E3 enzymes possess one of two domains: the homologous to the E6-AP carboxyl terminus (HECT) domain and the really interesting new gene (RING) domain (or the closely related U-box domain). HECT domain E3s transiently bind ubiquitin in this process, whereas RING domain E3s catalyse the direct transfer from the E2 enzyme to the substrate. The anaphase-promoting complex (APC) and the SCF complex (for Skp1-Cullin-F-box protein complex) are two examples of multi-subunit E3s involved in recognition and ubiquitination of specific target proteins for degradation by the proteasome.
In the ubiquitination cascade, E1 can bind with many E2s, which can bind with hundreds of E3s in a hierarchical way. Having levels within the cascade allows tight regulation of the ubiquitination machinery. Other ubiquitin-like proteins (UBLs) are also modified via the E1–E2–E3 cascade, although variations in these systems do exist.
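The three steps above can be sketched as a small state model that records which carrier holds the ubiquitin C-terminus at each stage. This is an illustrative simplification, not a biochemical simulation: the class and function names are hypothetical, and the HECT-bound state is labelled only as a transient intermediate because the text does not specify the bond involved.

```python
from dataclasses import dataclass, field


@dataclass
class Ubiquitin:
    """Tracks what the C-terminal glycine of one ubiquitin is attached to."""
    carrier: str = "free"
    history: list = field(default_factory=list)

    def move_to(self, carrier, bond):
        self.carrier = carrier
        self.history.append((carrier, bond))


def activate(ub):
    """Step 1 (E1): adenylation, then thioester formation with release of AMP."""
    if ub.carrier != "free":
        raise ValueError("only free ubiquitin can be activated")
    ub.move_to("AMP", "acyl-adenylate")
    ub.move_to("E1", "thioester")


def conjugate(ub):
    """Step 2 (E2): transthioesterification from the E1 to the E2 cysteine."""
    if ub.carrier != "E1":
        raise ValueError("ubiquitin must be loaded on an E1 first")
    ub.move_to("E2", "thioester")


def ligate(ub, e3_type):
    """Step 3 (E3): transfer to a substrate lysine via an isopeptide bond."""
    if ub.carrier != "E2":
        raise ValueError("ubiquitin must be loaded on an E2 first")
    if e3_type == "HECT":
        # HECT E3s transiently bind ubiquitin before passing it on.
        ub.move_to("E3", "transient intermediate")
    elif e3_type != "RING":
        raise ValueError("e3_type must be 'HECT' or 'RING'")
    # RING E3s catalyse direct transfer from the E2 to the substrate.
    ub.move_to("substrate", "isopeptide")


def ubiquitinate(e3_type):
    """Run the full cascade for one ubiquitin and return it with its history."""
    ub = Ubiquitin()
    activate(ub)
    conjugate(ub)
    ligate(ub, e3_type)
    return ub


for kind in ("RING", "HECT"):
    print(kind, [carrier for carrier, _ in ubiquitinate(kind).history])
```

The guards in each step enforce the order of the cascade, so calling `conjugate` on free ubiquitin raises an error instead of silently skipping activation.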
Variety of ubiquitin modifications
Ubiquitination affects cellular processes by regulating the degradation of proteins (via the proteasome and lysosome), coordinating the cellular localisation of proteins, activating and inactivating proteins, and modulating protein-protein interactions. These effects are mediated by different types of substrate ubiquitination, for example the addition of a single ubiquitin molecule (monoubiquitination) or different types of ubiquitin chains (polyubiquitination).
Monoubiquitination is the addition of one ubiquitin molecule to one substrate protein residue. Multi-monoubiquitination is the addition of one ubiquitin molecule to each of multiple substrate residues. The monoubiquitination of a protein can have different effects from the polyubiquitination of the same protein. The addition of a single ubiquitin molecule is thought to be required prior to the formation of polyubiquitin chains. Monoubiquitination affects cellular processes such as membrane trafficking, endocytosis and viral budding.
Polyubiquitination is the formation of a ubiquitin chain on a single lysine residue on the substrate protein. Following addition of a single ubiquitin moiety to a protein substrate, further ubiquitin molecules can be added to the first, yielding a polyubiquitin chain. These chains are made by linking the glycine residue of a ubiquitin molecule to a lysine of ubiquitin bound to a substrate. Ubiquitin has seven lysine residues and an N-terminus that may serve as points of ubiquitination; the lysines are K6, K11, K27, K29, K33, K48, and K63. Lysine 48-linked chains were the first identified and are the best-characterised type of ubiquitin chain. K63 chains have also been well-characterised, whereas the function of other lysine chains, mixed chains, branched chains, N-terminal linear chains, and heterologous chains (mixtures of ubiquitin and other ubiquitin-like proteins) remains less clear.
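The three patterns defined above differ only in how many substrate residues are modified and how many ubiquitin molecules sit on each. A minimal sketch of that distinction follows; the function name and the site labels are hypothetical.

```python
def classify_ubiquitination(chain_lengths):
    """Classify a substrate's ubiquitination pattern.

    chain_lengths maps a substrate residue label (for example "K120") to
    the number of ubiquitin molecules attached at that residue.
    """
    modified = {site: n for site, n in chain_lengths.items() if n > 0}
    if not modified:
        return "unmodified"
    if any(n > 1 for n in modified.values()):
        # At least one residue carries a chain of two or more ubiquitins.
        return "polyubiquitination"
    if len(modified) == 1:
        return "monoubiquitination"
    return "multi-monoubiquitination"


print(classify_ubiquitination({"K120": 1}))
print(classify_ubiquitination({"K120": 1, "K164": 1}))
print(classify_ubiquitination({"K120": 4}))
```

A substrate carrying both a chain and single ubiquitins on other residues is reported here as polyubiquitinated, which is a simplifying choice of this sketch.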
Lysine 48-linked polyubiquitin chains target proteins for destruction, by a process known as proteolysis. At least four ubiquitin molecules must be attached to a lysine residue on the condemned protein in order for it to be recognised by the 26S proteasome. This is a barrel-shaped structure comprising a central proteolytic core made of four ring structures, flanked by two cylinders that selectively allow entry of ubiquitinated proteins. Once inside, the proteins are rapidly degraded into small peptides (usually 3–25 amino acid residues in length). Ubiquitin molecules are cleaved off the protein immediately prior to destruction and are recycled for further use. Although the majority of protein substrates are ubiquitinated, there are examples of non-ubiquitinated proteins targeted to the proteasome. The polyubiquitin chains are recognised by a subunit of the proteasome, S5a/Rpn10. This is achieved by a ubiquitin interacting motif (UIM) found in a hydrophobic patch in the C-terminal region of the S5a/Rpn10 unit.
Lysine 63-linked chains are not associated with proteasomal degradation of the substrate protein. Instead, they allow the coordination of other processes such as endocytic trafficking, inflammation, translation, and DNA repair. In cells, lysine 63-linked chains are bound by the ESCRT-0 complex, which prevents their binding to the proteasome. This complex contains two proteins, Hrs and STAM1, each of which contains a UIM that allows the complex to bind to lysine 63-linked chains.
Less is understood about atypical (non-lysine 48-linked) ubiquitin chains but research is starting to suggest roles for these chains. There is evidence to suggest that atypical chains linked by lysine 6, 11, 27, 29 and N-terminal chains can induce proteasomal degradation.
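The linkage-specific outcomes described above can be summarised as a lookup. The sketch below assumes a homogeneous chain (a single linkage type) and encodes only what the text states; the function name and return strings are hypothetical, and mixed or branched chains are out of scope.

```python
MIN_K48_CHAIN_LENGTH = 4  # minimum for recognition by the 26S proteasome
ATYPICAL_DEGRADATIVE = {"K6", "K11", "K27", "K29", "N-terminal"}


def interpret_chain(linkage, length):
    """Return the outcome the text associates with a homogeneous chain."""
    if length < 1:
        raise ValueError("a chain needs at least one ubiquitin")
    if linkage == "K48":
        if length >= MIN_K48_CHAIN_LENGTH:
            return "proteasomal degradation"
        return "too short for proteasome recognition"
    if linkage == "K63":
        return "non-degradative signalling"
    if linkage in ATYPICAL_DEGRADATIVE:
        return "may induce proteasomal degradation"
    return "function unclear"


for linkage, length in [("K48", 4), ("K48", 2), ("K63", 5), ("K11", 4), ("K33", 4)]:
    print(linkage, length, "->", interpret_chain(linkage, length))
```

K33 falls through to "function unclear" because the text does not list it among the atypical chains with evidence for a degradative role.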
Structure of chains
Differently linked chains have specific effects on the protein to which they are attached, caused by differences in the conformations of the protein chains. Lysine 63-linked and N-terminal chains produce fairly linear chains known as open-conformation chains. Lysine 6-, 11-, and 48-linked chains form closed conformations. The ubiquitin molecules in linear chains do not interact with each other, except for the covalent isopeptide bonds linking them together. In contrast, the closed conformation chains have interfaces with interacting residues. Altering the chain conformations exposes and conceals different parts of the ubiquitin protein, and the different linkages are recognized by proteins that are specific for the unique topologies that are intrinsic to the linkage. The proteins that bind ubiquitin have ubiquitin-binding domains (UBDs). The distances between individual ubiquitin units in chains differ between lysine 63- and 48-linked chains. The UBDs exploit this by having small spacers between ubiquitin-interacting motifs that bind lysine 48-linked chains (compact ubiquitin chains) and larger spacers for lysine 63-linked chains. The machinery involved in recognising polyubiquitin chains can also differentiate between the linear lysine 63-linked chains and linear N-terminal chains, demonstrated by the fact that the latter can induce proteasomal degradation of the substrate.
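As a compact summary of the paragraph above, the conformations and the spacer rule can be written as two small lookups. This is only a restatement of the text in code form; the names are hypothetical, and the spacer rule is deliberately limited to the two linkages the text compares.

```python
# Conformations as described in the text; linkages it does not mention are omitted.
CHAIN_CONFORMATION = {
    "K63": "open",
    "N-terminal": "open",
    "K6": "closed",
    "K11": "closed",
    "K48": "closed",
}


def expected_uim_spacer(linkage):
    """Relative spacer length between UIMs in a binder of this chain type.

    Returns None for linkages other than K48 and K63, because the text
    only compares those two cases.
    """
    return {"K48": "short", "K63": "long"}.get(linkage)


for linkage in ("K48", "K63", "N-terminal"):
    print(linkage, CHAIN_CONFORMATION[linkage], expected_uim_spacer(linkage))
```

Returning None for N-terminal chains is intentional: although they are also open, the text notes that the recognition machinery distinguishes them from lysine 63-linked chains.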
Functions of ubiquitin modification
The ubiquitination system functions in a wide variety of cellular processes, including:
- Antigen processing
- Biogenesis of organelles
- Cell cycle and division
- DNA transcription and repair
- Differentiation and development
- Immune response and inflammation
- Neural and muscular degeneration
- Morphogenesis of neural networks
- Modulation of cell surface receptors, ion channels and the secretory pathway
- Response to stress and extracellular modulators
- Ribosome biogenesis
- Viral infection
Multi-monoubiquitination can mark transmembrane proteins (for example, receptors) for removal from membranes (internalisation) and fulfil several signalling roles within the cell. When cell-surface transmembrane molecules are tagged with ubiquitin, the subcellular localization of the protein is altered, often targeting the protein for destruction in lysosomes. This serves as a negative feedback mechanism because often the stimulation of receptors by ligands increases their rate of ubiquitination and internalisation. Like monoubiquitination, lysine 63-linked polyubiquitin chains also have a role in the trafficking of some membrane proteins.
Proliferating cell nuclear antigen (PCNA) is a protein involved in DNA synthesis. Under normal physiological conditions PCNA is sumoylated (a similar post-translational modification to ubiquitination). When DNA is damaged by ultra-violet radiation or chemicals, the SUMO molecule that is attached to a lysine residue is replaced by ubiquitin. Monoubiquitinated PCNA recruits polymerases that can carry out DNA synthesis with damaged DNA, but this is very error-prone, possibly resulting in the synthesis of mutated DNA. Lysine 63-linked polyubiquitination of PCNA allows it to perform a less error-prone mutation bypass known as the template switching pathway.
Ubiquitination of histone H2AX is involved in DNA damage recognition of DNA double-strand breaks. Lysine 63-linked polyubiquitin chains are formed on H2AX histone by the E2/E3 ligase pair, Ubc13-Mms2/RNF168. This K63 chain appears to recruit RAP80, which contains a UIM, and RAP80 then helps localize BRCA1. This pathway will eventually recruit the necessary proteins for homologous recombination repair.
Histones can be ubiquitinated, usually in the form of monoubiquitination (although polyubiquitinated forms do occur). Histone ubiquitination alters chromatin structure and allows the access of enzymes involved in transcription. Ubiquitin on histones also acts as a binding site for proteins that either activate or inhibit transcription, and it can also induce further post-translational modifications of the protein. These effects can all modulate the transcription of genes.
Deubiquitination
Deubiquitinating enzymes (DUBs) oppose the role of ubiquitination by removing ubiquitin from substrate proteins. They are cysteine proteases that cleave the amide bond between the two proteins. They are highly specific, as are the E3 ligases that attach the ubiquitin, with only a few substrates per enzyme. They can cleave both isopeptide (between ubiquitin and lysine) and peptide bonds (between ubiquitin and the N-terminus). In addition to removing ubiquitin from substrate proteins, DUBs have many other roles within the cell. Ubiquitin is either expressed as multiple copies joined in a chain (polyubiquitin) or attached to ribosomal subunits. DUBs cleave these proteins to produce active ubiquitin. They also recycle ubiquitin that has been accidentally bound to small nucleophilic molecules during the ubiquitination process. Monoubiquitin is formed by DUBs that cleave ubiquitin from free polyubiquitin chains that have been previously removed from proteins.
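The processing of polyubiquitin precursors into monomers can be pictured as splitting a head-to-tail repeat into its units. The sketch below is a string-level analogy only: the function name is hypothetical, the repeat unit in the example is a toy placeholder and not a real sequence, and the handling of trailing residues is an assumption of this sketch.

```python
def release_monomers(precursor, unit):
    """Split a head-to-tail precursor into copies of `unit`.

    Returns (monomers, leftover), where leftover is whatever follows the
    last complete repeat.
    """
    if not unit:
        raise ValueError("unit must be a non-empty sequence")
    monomers = []
    position = 0
    # Consume one repeat at a time for as long as the next stretch matches.
    while precursor.startswith(unit, position):
        monomers.append(unit)
        position += len(unit)
    return monomers, precursor[position:]


# Toy repeat unit standing in for the 76-residue ubiquitin sequence.
toy_unit = "MQGG"
print(release_monomers(toy_unit * 3 + "C", toy_unit))
```

Returning the leftover separately keeps any non-repeat tail visible instead of discarding it silently.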
Ubiquitin-binding domains
|Domain||Number of proteins (S. cerevisiae)||Number of proteins (H. sapiens)|
|CUE||7||21|
|GATII||2||14|
|GLUE||?||?|
|NZF||1||25|
|PAZ||5||16|
|UBA||10||98|
|UEV||2||?|
|UIM||8||71|
|VHS||4||28|
Ubiquitin-binding domains (UBDs) are modular protein domains that bind non-covalently to ubiquitin; these motifs control various cellular events. Detailed molecular structures are known for a number of UBDs. Their binding specificity determines their mechanism of action and regulation, and how they regulate cellular proteins and processes.
Disease associations
The ubiquitin pathway has been implicated in the pathogenesis of several diseases and genetic disorders:
- Neurodegenerative disorders: Transcript variants encoding different isoforms of ubiquilin-1 are found in lesions associated with Alzheimer's and Parkinson's disease. Higher levels of ubiquilin in the brain have been shown to decrease malformation of amyloid precursor protein (APP), which plays a key role in triggering Alzheimer's disease. Conversely, lower levels of ubiquilin-1 in the brain have been associated with increased malformation of APP. A frameshift mutation in ubiquitin B can result in a truncated peptide missing the C-terminal glycine. This abnormal peptide, known as UBB+1, has been shown to accumulate selectively in Alzheimer's disease and other tauopathies.
- Angelman syndrome is caused by a disruption of UBE3A, which encodes a ubiquitin ligase (E3) enzyme termed E6-AP.
- Von Hippel-Lindau syndrome involves disruption of a ubiquitin E3 ligase termed the VHL tumor suppressor, or VHL gene.
- Fanconi anemia: Eight of the thirteen identified genes whose disruption can cause this disease encode proteins that form a large ubiquitin ligase (E3) complex.
- 3-M syndrome is an autosomal-recessive growth retardation disorder associated with mutations of the Cullin7 E3 ubiquitin ligase.
Immunohistochemistry using antibodies to ubiquitin can identify abnormal accumulations of this protein inside cells, indicating a disease process. These protein accumulations are referred to as inclusion bodies (which is a general term for any microscopically visible collection of abnormal material in a cell). Examples include:
- Neurofibrillary tangles in Alzheimer's disease
- Lewy bodies in Parkinson's disease
- Pick bodies in Pick's disease
- Inclusions in motor neuron disease and Huntington's disease
- Mallory bodies in alcoholic liver disease
- Rosenthal fibers in astrocytes
Ubiquitin-like modifiers
Although ubiquitin is the best-understood post-translational modifier, there is a growing family of ubiquitin-like proteins (UBLs) that modify cellular targets in a pathway that is parallel to, but distinct from, that of ubiquitin. Known UBLs include: small ubiquitin-like modifier (SUMO), ubiquitin cross-reactive protein (UCRP, also known as interferon-stimulated gene-15, ISG15), ubiquitin-related modifier-1 (URM1), neuronal-precursor-cell-expressed developmentally downregulated protein-8 (NEDD8, also called Rub1 in S. cerevisiae), human leukocyte antigen F-associated (FAT10), autophagy-8 (ATG8) and -12 (ATG12), Fau ubiquitin-like protein (FUB1), MUB (membrane-anchored UBL), ubiquitin fold-modifier-1 (UFM1) and ubiquitin-like protein-5 (UBL5, which is better known as homologous to ubiquitin-1 [Hub1] in S. pombe). Whilst these proteins share only modest primary sequence identity with ubiquitin, they are closely related three-dimensionally. For example, SUMO shares only 18% sequence identity with ubiquitin, but the two contain the same structural fold. This fold is called the "ubiquitin fold" or sometimes the ubiquiton fold. FAT10 and UCRP contain two such folds. This compact globular beta-grasp fold is found in ubiquitin, UBLs, and proteins that comprise a ubiquitin-like domain; for example, the S. cerevisiae spindle pole body duplication protein Dsk2 and the NER protein Rad23 both contain N-terminal ubiquitin domains.
These related molecules have novel functions and influence diverse biological processes. There is also cross-regulation between the various conjugation pathways, since some proteins can become modified by more than one UBL, and sometimes even at the same lysine residue. For instance, SUMO modification often acts antagonistically to that of ubiquitination and serves to stabilize protein substrates. Proteins conjugated to UBLs are typically not targeted for degradation by the proteasome but rather function in diverse regulatory activities. Attachment of UBLs might alter substrate conformation, affect the affinity for ligands or other interacting molecules, alter substrate localization, and influence protein stability.
UBLs are structurally similar to ubiquitin and are processed, activated, conjugated, and released from conjugates by enzymatic steps that are similar to the corresponding mechanisms for ubiquitin. UBLs are also translated with C-terminal extensions that are processed to expose the invariant C-terminal LRGG. These modifiers have their own specific E1 (activating), E2 (conjugating) and E3 (ligating) enzymes that conjugate the UBLs to intracellular targets. These conjugates can be reversed by UBL-specific isopeptidases that have similar mechanisms to that of the deubiquitinating enzymes.
In some species, sperm mitochondria are recognised and destroyed after fertilization through a mechanism involving ubiquitin.
Prokaryotic ubiquitin-like protein (Pup)
Recently, a functional analog of ubiquitin has been found in prokaryotes. Prokaryotic ubiquitin-like protein (Pup) serves the same function (targeting proteins for degradation), although the enzymology of ubiquitination and pupylation is different. In contrast to the three-step reaction of ubiquitination, pupylation requires two steps, and therefore only two enzymes are involved in pupylation.
Human proteins containing ubiquitin domain
ANUBL1; BAG1; BAT3/BAG6; DDI1; DDI2; FAU; HERPUD1; HERPUD2; HOPS; IKBKB; ISG15; LOC391257; MIDN; NEDD8; OASL; PARK2; RAD23A; RAD23B; RPS27A; SACS; SF3A1; SUMO1; SUMO2; SUMO3; SUMO4; TMUB1; TMUB2; UBA52; UBB; UBC; UBD; UBFD1; UBL4; UBL4A; UBL4B; UBL7; UBLCP1; UBQLN1; UBQLN2; UBQLN3; UBQLN4; UBQLNL; UBTD1; UBTD2; UHRF1; UHRF2
Prediction of ubiquitination
Currently available prediction programs are:
- UbiPred is an SVM-based prediction server using 31 physicochemical properties for predicting ubiquitination sites.
- UbPred is a random forest-based predictor of potential ubiquitination sites in proteins. It was trained on a combined set of 266 non-redundant experimentally verified ubiquitination sites, drawn from its authors' own experiments and from two large-scale proteomics studies.
- CKSAAP_UbSite is an SVM-based predictor that takes as input the composition of k-spaced amino acid pairs surrounding a query site (i.e. any lysine in a query sequence). It uses the same dataset as UbPred.
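To make the idea of a k-spaced amino acid pair composition concrete, the sketch below encodes the residues around a lysine as pair frequencies. It illustrates the general encoding only and is not the CKSAAP_UbSite implementation: the flank size, the range of k, the restriction to the 20 standard residues, and all function names are assumptions, and the SVM step is omitted.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def window_around(sequence, site, flank):
    """Residues within `flank` positions of a 1-based site, clipped at the ends."""
    start = max(0, site - 1 - flank)
    end = min(len(sequence), site + flank)
    return sequence[start:end]


def cksaap(fragment, k):
    """Frequencies of residue pairs separated by exactly k other residues."""
    counts = dict.fromkeys(PAIRS, 0)
    total = 0
    for i in range(len(fragment) - k - 1):
        pair = fragment[i] + fragment[i + k + 1]
        if pair in counts:  # skip pairs containing non-standard symbols
            counts[pair] += 1
            total += 1
    return [counts[p] / total if total else 0.0 for p in PAIRS]


def encode_site(sequence, site, flank=3, k_max=1):
    """Concatenate the pair compositions for k = 0..k_max around one site."""
    fragment = window_around(sequence, site, flank)
    features = []
    for k in range(k_max + 1):
        features.extend(cksaap(fragment, k))
    return features


# Hypothetical query sequence with a lysine at position 8.
vector = encode_site("MKTAYIAKQR", site=8)
print(len(vector))
```

Each value of k contributes 400 features, so the vector length grows linearly with `k_max`; a classifier would then be trained on such vectors for known modified and unmodified lysines.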
See also
- SUMO protein
- SUMO enzymes
- Ubiquitin ligase
- Prokaryotic ubiquitin-like protein
- JUNQ and IPOD
- Deubiquitinating enzyme
- Kirisako T, Kamei K, Murata S, Kato M, Fukumoto H, Kanie M, Sano S, Tokunaga F, Tanaka K, Iwai K (October 2006). "A ubiquitin ligase complex assembles linear polyubiquitin chains". EMBO J. 25 (20): 4877–87. doi:10.1038/sj.emboj.7601360. PMC 1618115. PMID 17006537.
- Hicke L (March 2001). "Protein regulation by monoubiquitin". Nat. Rev. Mol. Cell Biol. 2 (3): 195–201. doi:10.1038/35056583. PMID 11265249.
- Lecker SH, Goldberg AL, Mitch WE (July 2006). "Protein degradation by the ubiquitin-proteasome pathway in normal and disease states". J. Am. Soc. Nephrol. 17 (7): 1807–19. doi:10.1681/ASN.2006010083. PMID 16738015.
- Kravtsova-Ivantsiv Y, Ciechanover A (February 2012). "Non-canonical ubiquitin-based signals for proteasomal degradation". J. Cell. Sci. 125 (Pt 3): 539–48. doi:10.1242/jcs.093567. PMID 22389393.
- Nathan JA, Kim HT, Ting L, Gygi SP, Goldberg AL (February 2013). "Why do cellular proteins linked to K63-polyubiquitin chains not associate with proteasomes?". EMBO J. 32 (4): 552–65. doi:10.1038/emboj.2012.354. PMC 3579138. PMID 23314748.
- Bache KG, Raiborg C, Mehlum A, Stenmark H (April 2003). "STAM and Hrs are subunits of a multivalent ubiquitin-binding complex on early endosomes". J. Biol. Chem. 278 (14): 12513–21. doi:10.1074/jbc.M210843200. PMID 12551915.
- Zhao S, Ulrich HD (April 2010). "Distinct consequences of posttranslational modification by linear versus K63-linked polyubiquitin chains". Proc. Natl. Acad. Sci. U.S.A. 107 (17): 7704–9. doi:10.1073/pnas.0908764107. PMC 2867854. PMID 20385835.
- Kim HT, Kim KP, Lledias F, Kisselev AF, Scaglione KM, Skowyra D, Gygi SP, Goldberg AL (June 2007). "Certain pairs of ubiquitin-conjugating enzymes (E2s) and ubiquitin-protein ligases (E3s) synthesize nondegradable forked ubiquitin chains containing all possible isopeptide linkages". J. Biol. Chem. 282 (24): 17375–86. doi:10.1074/jbc.M609659200. PMID 17426036.
- "Ubiquitin Proteasome Pathway Overview". Archived from the original on 2008-03-30. Retrieved 2008-04-30.
- Shaheen M, Shanmugam I, Hromas R (2010). "The Role of PCNA Posttranslational Modifications in Translesion Synthesis". J Nucleic Acids 2010: 1. doi:10.4061/2010/761217. PMC 2935186. PMID 20847899.
- Jackson SP, Durocher D (March 2013). "Regulation of DNA damage responses by ubiquitin and SUMO". Mol. Cell 49 (5): 795–807. doi:10.1016/j.molcel.2013.01.017. PMID 23416108.
- Campbell SJ, Edwards RA, Leung CC, et al. (July 2012). "Molecular insights into the function of RING finger (RNF)-containing proteins hRNF8 and hRNF168 in Ubc13/Mms2-dependent ubiquitylation". J. Biol. Chem. 287 (28): 23900–10. doi:10.1074/jbc.M112.359653. PMC 3390666. PMID 22589545.
- Ikura T, Tashiro S, Kakino A, et al. (October 2007). "DNA damage-dependent acetylation and ubiquitination of H2AX enhances chromatin dynamics". Mol. Cell. Biol. 27 (20): 7028–40. doi:10.1128/MCB.00579-07. PMC 2168918. PMID 17709392.
- Kim H, Chen J, Yu X (May 2007). "Ubiquitin-binding protein RAP80 mediates BRCA1-dependent DNA damage response". Science 316 (5828): 1202–5. doi:10.1126/science.1139621. PMID 17525342.
- Hofmann K (April 2009). "Ubiquitin-binding domains and their role in the DNA damage response". DNA Repair (Amst.) 8 (4): 544–56. doi:10.1016/j.dnarep.2009.01.003. PMID 19213613.
- Hammond-Martel I, Yu H, Affar el B (February 2012). "Roles of ubiquitin signaling in transcription regulation". Cell. Signal. 24 (2): 410–21. doi:10.1016/j.cellsig.2011.10.009. PMID 22033037.
- Reyes-Turcu FE, Ventii KH, Wilkinson KD (2009). "Regulation and cellular roles of ubiquitin-specific deubiquitinating enzymes". Annu. Rev. Biochem. 78: 363–97. doi:10.1146/annurev.biochem.78.082307.091526. PMC 2734102. PMID 19489724.
- Nijman SM, Luna-Vargas MP, Velds A, et al. (December 2005). "A genomic and functional inventory of deubiquitinating enzymes". Cell 123 (5): 773–86. doi:10.1016/j.cell.2005.11.007. PMID 16325574.
- Hicke L, Schubert HL, Hill CP (August 2005). "Ubiquitin-binding domains". Nat. Rev. Mol. Cell Biol. 6 (8): 610–21. doi:10.1038/nrm1701. PMID 16064137.
- "UBQLN1 ubiquilin 1 [ Homo sapiens ]". Gene. National Center for Biotechnology Information. Retrieved 9 May 2012.
- Stieren ES, El Ayadi A, Xiao Y, Siller E, Landsverk ML, Oberhauser AF, Barral JM, Boehning D (August 2011). "Ubiquilin-1 Is a Molecular Chaperone for the Amyloid Precursor Protein". J Biol Chem 286 (41): 35689–98. doi:10.1074/jbc.M111.243147. PMC 3195644. PMID 21852239. Lay summary – Science Daily.
- Huber C, Dias-Santagata D, Glaser A, O'Sullivan J, Brauner R, Wu K, Xu X, Pearce K, Wang R, Uzielli ML, Dagoneau N, Chemaitilly W, Superti-Furga A, Dos Santos H, Mégarbané A, Morin G, Gillessen-Kaesbach G, Hennekam R, Van der Burgt I, Black GC, Clayton PE, Read A, Le Merrer M, Scambler PJ, Munnich A, Pan ZQ, Winter R, Cormier-Daire V (October 2005). "Identification of mutations in CUL7 in 3-M syndrome". Nat. Genet. 37 (10): 1119–24. doi:10.1038/ng1628. PMID 16142236.
- Downes BP, Saracco SA, Lee SS, Crowell DN, Vierstra RD (September 2006). "MUBs, a family of ubiquitin-fold proteins that are plasma membrane-anchored by prenylation". J. Biol. Chem. 281 (37): 27145–57. doi:10.1074/jbc.M602283200. PMID 16831869.
- Welchman RL, Gordon C, Mayer RJ (2005). "Ubiquitin and ubiquitin-like proteins as multifunctional signals". Nat Rev Mol Cell Biol 6 (8): 599–609. doi:10.1038/nrm1700. PMID 16064136.
- Grabbe C, Dikic I (2009). "Functional roles of ubiquitin-like domain (ULD) and ubiquitin-binding domain (UBD) containing proteins". Chem Rev 109 (4): 1481–94. doi:10.1021/cr800413p. PMID 19253967.
- Sutovsky P, Moreno RD, Ramalho-Santos J, Dominko T, Simerly C, Schatten G (August 2000). "Ubiquitinated sperm mitochondria, selective proteolysis, and the regulation of mitochondrial inheritance in mammalian embryos". Biol. Reprod. 63 (2): 582–90. doi:10.1095/biolreprod63.2.582. PMID 10906068.
- Tung CW, Ho SY (2008). "Computational identification of ubiquitylation sites from protein sequences". BMC Bioinformatics 9: 310. doi:10.1186/1471-2105-9-310. PMC 2488362. PMID 18625080.
- Radivojac P, Vacic V, Haynes C, Cocklin RR, Mohan A, Heyen JW, Goebl MG, Iakoucheva LM (February 2010). "Identification, analysis, and prediction of protein ubiquitination sites". Proteins 78 (2): 365–80. doi:10.1002/prot.22555. PMC 3006176. PMID 19722269.
- Chen Z, Chen YZ, Wang XF, Wang C, Yan RX, Zhang Z (2011). "Prediction of ubiquitination sites by using the composition of k-spaced amino acid pairs". In Fraternali, Franca. PLoS ONE 6 (7): e22930. doi:10.1371/journal.pone.0022930. PMC 3146527. PMID 21829559.
- GeneReviews/NCBI/NIH/UW entry on Angelman syndrome
- OMIM entries on Angelman syndrome
- Ubiquitin at the US National Library of Medicine Medical Subject Headings (MeSH)
Programs for ubiquitination prediction:
- UniProt entry for ubiquitin
- Ubiquitin Web-page
- "7.340 Ubiquitination: The Proteasome and Human Disease". MIT OpenCourseWare. 2004. Notes from MIT course.
This tab holds the annotation information that is stored in the Pfam database. As we move to using Wikipedia as our main source of annotation, the contents of this tab will be gradually replaced by the Wikipedia tab.
Ubiquitin-like domain
This entry contains ubiquitin-like domains [1-2].
Lytle BL, Peterson FC, Qiu SH, Luo M, Zhao Q, Markley JL, Volkman BF; J Biol Chem. 2004;279:46787-46793: Solution structure of a ubiquitin-like domain from tubulin-binding cofactor B. PUBMED:15364906 EPMC:15364906
Internal database links
|Similarity to PfamA using HHSearch:||ubiquitin Rad60-SLD|
External database links
This tab holds annotation information from the InterPro database.
No InterPro data for this Pfam family.
Below is a listing of the unique domain organisations or architectures in which this domain is found.
The graphic that is shown by default represents the longest sequence with a given architecture. Each row contains the following information:
- the number of sequences which exhibit this architecture
- a textual description of the architecture, e.g. Gla, EGF x 2, Trypsin. This example describes an architecture with one Gla domain, followed by two consecutive EGF domains, and finally a single Trypsin domain
- a link to the page in the Pfam site showing information about the sequence that the graphic describes
- the UniProt description of the protein sequence
- the number of residues in the sequence
- the Pfam graphic itself.
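The textual architecture description in the list above can be expanded into an ordered list of domains. The following is a minimal sketch, assuming the format is a comma-separated list in which a repeated domain is written as "NAME x N"; the function name `expand_architecture` is hypothetical and is not part of any Pfam tool.

```python
import re
from typing import List


def expand_architecture(description: str) -> List[str]:
    """Expand e.g. 'Gla, EGF x 2, Trypsin' into one entry per domain copy."""
    domains: List[str] = []
    for part in description.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        # Assumed repeat syntax: "<name> x <count>"
        match = re.fullmatch(r"(.+?)\s+x\s+(\d+)", part)
        if match:
            name, count = match.group(1), int(match.group(2))
        else:
            name, count = part, 1
        domains.extend([name] * count)
    return domains


print(expand_architecture("Gla, EGF x 2, Trypsin"))
```

For the example given in the text, this yields one Gla domain, two consecutive EGF domains, and a single Trypsin domain, in that order.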
Note that you can see the family page for a particular domain by clicking on the graphic. You can also choose to see all sequences which have a given architecture by clicking on the Show link in each row.
Finally, because some families can be found in a very large number of architectures, we load only the first fifty architectures by default. If you want to see more architectures, click the button at the bottom of the page to load the next set.
This family includes proteins that share the ubiquitin fold. It currently unites four SCOP superfamilies.
The clan contains the following 41 members: APG12 Atg8 Blt1 Caps_synth_GfcC CIDE-N Cobl DUF1315 DUF2407 DUF4430 DWNN FERM_N Lambda_tail_I Multi_ubiq NQRA_SLBB PB1 PI3K_rbd Plug Prok_Ub RA Rad60-SLD Rad60-SLD_2 Ras_bdg_2 RBD SLBB Telomere_Sde2 TGS ThiS ThiS-like TmoB TUG-UBL1 Ub-Mut7C Ub-RnfH ubiquitin Ubiquitin_2 Ubiquitin_3 UBX Ufm1 UN_NPL4 Urm1 YchF-GTPase_C YukD
We store a range of different sequence alignments for families. As well as the seed alignment from which the family is built, we provide the full alignment, generated by searching the sequence database using the family HMM. We also generate alignments using four representative proteome (RP) sets, the NCBI sequence database, and our metagenomics sequence database.
There are various ways to view or download the sequence alignments that we store. We provide several sequence viewers and a plain-text Stockholm-format file for download.
We make a range of alignments for each Pfam-A family:
- the curated alignment from which the HMM for the family is built
- the alignment generated by searching the sequence database using the HMM
- Representative Proteomes (RPs) at 15%, 35%, 55% and 75% co-membership thresholds
- alignment generated by searching the NCBI sequence database using the family HMM
- alignment generated by searching the metagenomics sequence database using the family HMM
You can see the alignments as HTML or in three different sequence viewers:
- jalview: a Java applet developed at the University of Dundee. You will need Java installed before running jalview
- HTML: an HTML page showing the whole alignment. Please note: full Pfam alignments can be very large. These HTML views are extremely large and often cause problems for browsers. Please use either jalview or the Pfam viewer if you have trouble viewing the HTML version
- heatmap: an HTML-based representation of the alignment, coloured according to the posterior-probability (PP) values from the HMM. As for the standard HTML view, heatmap alignments can also be very large and slow to render.
- Pfam viewer: an HTML-based viewer that uses DAS to retrieve alignment fragments on request
You can download (or view in your browser) a text representation of a Pfam alignment in various formats:
You can also change the order in which sequences are listed in the alignment, change how insertions are represented, alter the characters that are used to represent gaps in sequences and, finally, choose whether to download the alignment or to view it in your browser directly.
You may find that large alignments cause problems for the viewers and the reformatting tool, so we also provide all alignments in Stockholm format. You can download either the plain text alignment, or a gzipped version of it.
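A downloaded plain-text Stockholm file can be read with a few lines of code. The sketch below is a minimal reader, assuming the usual Stockholm layout in which markup lines begin with "#", the alignment ends with "//", and each remaining line holds a sequence name followed by aligned residues. The function name `parse_stockholm` and the two example sequences are hypothetical; real Pfam files carry additional annotation that this sketch simply skips.

```python
def parse_stockholm(text):
    """Return {sequence_name: aligned_sequence} for the first alignment in text."""
    alignment = {}
    for line in text.splitlines():
        line = line.rstrip()
        if not line or line.startswith("#"):
            continue  # blank lines, the header, and #=GF/#=GS/#=GR/#=GC markup
        if line == "//":
            break  # end-of-alignment marker
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # ignore malformed lines rather than fail
        name, segment = parts
        # Interleaved files repeat each name once per block, so concatenate.
        alignment[name] = alignment.get(name, "") + segment.strip()
    return alignment


# Hypothetical two-sequence alignment, for illustration only.
EXAMPLE = """# STOCKHOLM 1.0
#=GF ID Example
seqA/1-5 MQI.F
seqB/1-5 MKIKV
//
"""

for name, aligned in parse_stockholm(EXAMPLE).items():
    print(name, aligned)
```

For a gzipped download, the text could be obtained with the standard library `gzip.open(path, "rt")` before being passed to the function.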
We make a range of alignments for each Pfam-A family. You can see a description of each above. You can view these alignments in various ways but please note that some types of alignment are never generated while others may not be available for all families, most commonly because the alignments are too large to handle.
1 Cannot generate PP/Heatmap alignments for seeds; no PP data available
Key: available, not generated, — not available.
Format an alignment
We make all of our alignments available in Stockholm format. You can download them here as raw, plain text files or as gzip-compressed files.
You can also download a FASTA format file containing the full-length sequences for all sequences in the full alignment.
MyHits provides a collection of tools to handle multiple sequence alignments. For example, one can refine a seed alignment (sequence addition or removal, re-alignment or manual editing) and then search databases for remote homologs using HMMER3.
HMM logos are one way of visualising profile HMMs. Logos provide a quick overview of the properties of an HMM in a graphical form. You can see a more detailed description of HMM logos and find out how you can interpret them here.
If you find these logos useful in your own work, please consider citing the following article:
This page displays the phylogenetic tree for this family's seed alignment. We use FastTree to calculate neighbour-joining trees with local bootstrap support based on 100 resamples (shown next to the tree nodes). FastTree calculates approximately-maximum-likelihood phylogenetic trees from our seed alignment.
Note: You can also download the data file for the tree.
Curation and family details
This section shows the detailed information about the Pfam family. You can see the definitions of many of the terms in this section in the glossary and a fuller explanation of the scoring system that we use in the scores section of the help pages.
This family is new in this Pfam release.
|Number in seed:||85|
|Number in full:||602|
|Average length of the domain:||79.70 aa|
|Average identity of full alignment:||20 %|
|Average coverage of the sequence by the domain:||19.48 %|
|HMM build commands:||build method: hmmbuild -o /dev/null HMM SEED; search method: hmmsearch -Z 23193494 -E 1000 --cpu 4 HMM pfamseq|
|Family (HMM) version:||1|
|Download:||download the raw HMM for this family|
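The build and search commands listed in the table can be assembled programmatically. The sketch below only constructs the argument lists and prints them; it does not run HMMER. The helper names and their parameters are hypothetical, while the flags and file names (HMM, SEED, pfamseq) are copied from the table.

```python
import shlex


def hmmbuild_command(hmm_path="HMM", seed_path="SEED"):
    """Argument list for building the profile HMM from the seed alignment."""
    # -o /dev/null discards hmmbuild's summary output
    return ["hmmbuild", "-o", "/dev/null", hmm_path, seed_path]


def hmmsearch_command(hmm_path="HMM", seqdb_path="pfamseq",
                      db_size=23193494, evalue=1000, cpus=4):
    """Argument list for searching the sequence database with the HMM."""
    return [
        "hmmsearch",
        "-Z", str(db_size),   # database size used when calculating E-values
        "-E", str(evalue),    # E-value reporting threshold
        "--cpu", str(cpus),   # number of worker threads
        hmm_path, seqdb_path,
    ]


print(shlex.join(hmmbuild_command()))
print(shlex.join(hmmsearch_command()))
```

With HMMER installed and the input files present, either list could be passed to `subprocess.run(..., check=True)`; that step is left out here because it depends on the local environment.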
This visualisation provides a simple graphical representation of the distribution of this family across species.
This chart is a modified "sunburst" visualisation of the species tree for this family. It shows each node in the tree as a separate arc, arranged radially with the superkingdoms at the centre and the species arrayed around the outermost ring.
How the sunburst is generated
The tree is built by considering the taxonomic lineage of each sequence that has a match to this family. For each node in the resulting tree, we draw an arc in the sunburst. The radius of the arc, its distance from the root node at the centre of the sunburst, shows the taxonomic level ("superkingdom", "kingdom", etc). The length of the arc represents either the number of sequences represented at a given level, or the number of species that are found beneath the node in the tree. The weighting scheme can be changed using the sunburst controls.
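The arc layout described above can be sketched in a few lines. This is an illustration under stated assumptions, not the Pfam implementation: the tree is a nested dictionary, leaves are species carrying a sequence count, the ring index stands in for the taxonomic level, and each child receives a share of its parent's angle proportional to its weight. The toy tree and its counts are hypothetical.

```python
import math

# Hypothetical toy tree: leaves are species with a made-up sequence count.
EXAMPLE_TREE = {
    "name": "root",
    "children": [
        {"name": "Eukaryota", "children": [
            {"name": "Homo sapiens", "sequences": 3},
            {"name": "Mus musculus", "sequences": 1},
        ]},
        {"name": "Bacteria", "children": [
            {"name": "Escherichia coli", "sequences": 4},
        ]},
    ],
}


def node_weight(node, by="sequences"):
    """Number of sequences, or number of species, at or beneath a node."""
    children = node.get("children", [])
    if not children:
        return node.get("sequences", 0) if by == "sequences" else 1
    return sum(node_weight(child, by) for child in children)


def layout_arcs(node, start=0.0, end=2 * math.pi, depth=0, by="sequences"):
    """Return (name, ring_index, start_angle, end_angle) for every node."""
    arcs = [(node["name"], depth, start, end)]
    total = node_weight(node, by)
    angle = start
    for child in node.get("children", []):
        # Each child's arc length is proportional to its weight.
        span = (end - start) * node_weight(child, by) / total if total else 0.0
        arcs.extend(layout_arcs(child, angle, angle + span, depth + 1, by))
        angle += span
    return arcs


for arc in layout_arcs(EXAMPLE_TREE, by="species"):
    print(arc)
```

Switching `by` between "sequences" and "species" mirrors the two weighting schemes mentioned in the text: in the toy tree the two superkingdoms share the circle equally when weighted by sequences, but Eukaryota takes two thirds when weighted by species.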
In order to reduce the complexity of the representation, we reduce the number of taxonomic levels that we show. We consider only the following eight major taxonomic levels:
Colouring and labels
Segments of the tree are coloured approximately according to their superkingdom. For example, archaeal branches are coloured with shades of orange, eukaryotes in shades of purple, etc. The colour assignments are shown under the sunburst controls. Where space allows, the name of the taxonomic level will be written on the arc itself.
As you move your mouse across the sunburst, the current node will be highlighted. In the top section of the controls panel we show a summary of the lineage of the currently highlighted node. If you pause over an arc, a tooltip will be shown, giving the name of the taxonomic level in the title and a summary of the number of sequences and species below that node in the tree.
Anomalies in the taxonomy tree
There are some situations that the sunburst tree cannot easily handle and for which we have work-arounds in place.
Missing taxonomic levels
Some species in the taxonomic tree may not have one or more of the main eight levels that we display. For example, Bos taurus is not assigned an order in the NCBI taxonomic tree. In such cases we mark the omitted level with, for example, "No order", in both the tooltip and the lineage summary.
Unmapped species names
The tree is built by looking at each sequence in the full alignment for the family. We take the name of the species given by UniProt and try to map that to the full taxonomic tree from NCBI. In some cases, the name chosen by UniProt does not map to any node in the NCBI tree, perhaps because the chosen name is listed as a synonym or a misspelling in the NCBI taxonomy.
So that these nodes are not simply omitted from the sunburst tree, we group them together in a separate branch (or segment of the sunburst tree). Since we cannot determine the lineage for these unmapped species, we show all levels between the superkingdom and the species as "uncategorised".
Since we reduce the species tree to only the eight main taxonomic levels, sequences that are mapped to the sub-species level in the tree would not normally be shown. Rather than leave out these species, we map them instead to their parent species. So, for example, for sequences belonging to one of the Vibrio cholerae sub-species in the NCBI taxonomy, we show them instead as belonging to the species Vibrio cholerae.
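The handling of missing levels and sub-species described above can be illustrated with a short sketch. The list of eight level names used here is an assumption (the page names only some of them), the input is assumed to be a list of (rank, name) pairs running from root to leaf, and the function name `reduce_lineage` is hypothetical.

```python
# Assumed names for the eight major levels; only some are named in the text.
LEVELS = ["superkingdom", "kingdom", "phylum", "class",
          "order", "family", "genus", "species"]


def reduce_lineage(lineage):
    """Reduce a full (rank, name) lineage to the eight displayed levels."""
    by_rank = {}
    for rank, name in lineage:
        # Ranks outside the eight levels (e.g. subspecies) are dropped, so a
        # sub-species entry collapses onto its parent species.
        if rank in LEVELS and rank not in by_rank:
            by_rank[rank] = name
    # A level absent from the lineage is marked as "No <level>".
    return [by_rank.get(level, "No " + level) for level in LEVELS]


# Illustrative lineage with no order assigned, as in the Bos taurus example.
bos_taurus = [
    ("superkingdom", "Eukaryota"), ("kingdom", "Metazoa"),
    ("phylum", "Chordata"), ("class", "Mammalia"),
    ("family", "Bovidae"), ("genus", "Bos"), ("species", "Bos taurus"),
]
print(reduce_lineage(bos_taurus))
```

The sketch does not cover unmapped species names, which the page groups into a separate "uncategorised" branch.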
Too many species/sequences
For large species trees, you may see blank regions in the outer layers of the sunburst. These occur when there are large numbers of arcs to be drawn in a small space. If an arc is less than approximately one pixel wide, it will not be drawn and the space will be left blank. You may still be able to get some information about the species in that region by moving your mouse across the area, but since each arc will be very small, it will be difficult to accurately locate a particular species.
The tree shows the occurrence of this domain across different species.
We show the species tree in one of two ways. For smaller trees we try to show an interactive representation, which allows you to select specific nodes in the tree and view them as an alignment or as a set of Pfam domain graphics.
Unfortunately we have found that there are problems viewing the interactive tree when it becomes larger than a certain limit. Furthermore, we have found that Internet Explorer can become unresponsive when viewing some trees, regardless of their size. We therefore show a text representation of the species tree when the size is above a certain limit or if you are using Internet Explorer to view the site.
If you are using IE you can still load the interactive tree by clicking the "Generate interactive tree" button, but please be aware of the potential problems that the interactive species tree can cause.
For all of the domain matches in a full alignment, we count the number that are found on all sequences in the alignment. This total is shown in the purple box.
We also count the number of unique sequences on which each domain is found, which is shown in green. Note that a domain may appear multiple times on the same sequence, leading to the difference between these two numbers.
Finally, we group sequences from the same organism according to the NCBI code that is assigned by UniProt, allowing us to count the number of distinct sequences on which the domain is found. This value is shown in the pink boxes.
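The three counts described above can be reproduced from a flat list of domain matches. This is a minimal sketch, assuming each match is recorded as a (sequence accession, NCBI taxon code) pair; the function name and the example data are hypothetical.

```python
from collections import defaultdict


def summarise_matches(matches):
    """Return (total matches, unique sequences, sequences per organism)."""
    matches = list(matches)
    total_matches = len(matches)                    # every domain match
    unique_sequences = {acc for acc, _ in matches}  # a sequence counts once
    per_organism = defaultdict(set)
    for acc, taxon in matches:
        per_organism[taxon].add(acc)                # group by NCBI code
    counts = {taxon: len(accs) for taxon, accs in per_organism.items()}
    return total_matches, len(unique_sequences), counts


# Hypothetical data: sequence "P1" carries the domain twice.
example = [("P1", 9606), ("P1", 9606), ("P2", 9606), ("Q1", 10090)]
print(summarise_matches(example))
```

In the example, the repeated match on "P1" is why the total number of matches exceeds the number of unique sequences.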
We use the NCBI species tree to group organisms according to their taxonomy and this forms the structure of the displayed tree. Note that in some cases the trees are too large (have too many nodes) to allow us to build an interactive tree, but in most cases you can still view the tree in a plain text, non-interactive representation. Those species which are represented in the seed alignment for this domain are highlighted.
You can use the tree controls to manipulate how the interactive tree is displayed:
- show/hide the summary boxes
- highlight species that are represented in the seed alignment
- expand/collapse the tree or expand it to a given depth
- select a sub-tree or a set of species within the tree and view them graphically or as an alignment
- save a plain text representation of the tree
Please note: for large trees this can take some time. While the tree is loading, you can safely switch away from this tab but if you browse away from the family page entirely, the tree will not be loaded.
For those sequences which have a structure in the Protein Data Bank, we use the mapping between UniProt, PDB and Pfam coordinate systems from the PDBe group to map Pfam domains onto UniProt sequences and three-dimensional protein structures. The table below shows the structures on which the Ubiquitin_2 domain has been found. There are 5 instances of this domain found in the PDB. Note that there may be multiple copies of the domain in a single PDB structure, since many structures contain multiple copies of the same protein sequence.