Summary: Core histone H2A/H2B/H3/H4
This is the Wikipedia entry entitled "Histone".
Histone
In biology, histones are highly alkaline proteins found in eukaryotic cell nuclei that package and order the DNA into structural units called nucleosomes. They are the chief protein components of chromatin, acting as spools around which DNA winds, and play a role in gene regulation. Without histones, the unwound DNA in chromosomes would be very long (a length-to-width ratio of more than 10 million to 1 in human DNA). For example, each human cell has about 1.8 meters (~6 ft) of DNA, but wound on the histones it has about 90 micrometers (0.09 mm) of chromatin, which, when duplicated and condensed during mitosis, results in about 120 micrometers of chromosomes.
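The packing figures quoted above can be turned into a compaction ratio with a few lines of arithmetic. The following is a minimal sketch, not part of the original article; the constant and function names are hypothetical, and it uses only the 1.8 m and 90 micrometer figures from the paragraph.

```python
# Figures quoted in the paragraph above, converted to metres.
DNA_LENGTH_M = 1.8           # unwound DNA per human cell
CHROMATIN_LENGTH_M = 90e-6   # the same DNA wound on histones (90 micrometres)


def compaction_ratio(unpacked_m: float, packed_m: float) -> float:
    """Return how many times shorter the packed form is than the unpacked DNA."""
    if packed_m <= 0:
        raise ValueError("packed length must be positive")
    return unpacked_m / packed_m


if __name__ == "__main__":
    ratio = compaction_ratio(DNA_LENGTH_M, CHROMATIN_LENGTH_M)
    print(f"DNA is roughly {round(ratio):,} times shorter when wound on histones")
```

With these two inputs the ratio comes out at about 20,000-fold; other sources quote different values depending on which lengths are compared.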
|Core histone H2A/H2B/H3/H4|
PDB rendering of the complex between the nucleosome core particle (H3, H4, H2A, H2B) and a 146 bp DNA fragment, based on 1AOI.
|linker histone H1 and H5 family|
- 1 Classes
- 2 Structure
- 3 History
- 4 Conservation across species
- 5 Function
- 6 Functions of histone modifications
- 6.1 Chemistry of histone modifications
- 6.2 Functions in transcription
- 6.3 Other functions
- 7 See also
- 8 References
- 9 External links
Two of each of the core histones assemble to form one octameric nucleosome core, approximately 63 Angstroms in diameter (a solenoid (DNA)-like particle). 147 base pairs of DNA wrap around this core particle 1.65 times in a left-handed super-helical turn to give a particle of around 100 Angstroms across. The linker histone H1 binds the nucleosome at the entry and exit sites of the DNA, thus locking the DNA into place and allowing the formation of higher order structure. The most basic such formation is the 10 nm fiber or beads on a string conformation. This involves the wrapping of DNA around nucleosomes with approximately 50 base pairs of DNA separating each pair of nucleosomes (also referred to as linker DNA). Higher-order structures include the 30 nm fiber (forming an irregular zigzag) and 100 nm fiber, these being the structures found in normal cells. During mitosis and meiosis, the condensed chromosomes are assembled through interactions between nucleosomes and other regulatory proteins.
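As a rough illustration of the "beads on a string" arrangement described above, the sketch below estimates how many nucleosomes fit on a linear stretch of DNA. It assumes exactly 147 bp per core and a fixed 50 bp linker, as quoted in the text; real linker lengths vary, and the function names are hypothetical.

```python
CORE_DNA_BP = 147      # DNA wrapped around one histone octamer (from the text)
LINKER_DNA_BP = 50     # approximate linker between neighbouring nucleosomes (from the text)
HISTONES_PER_CORE = 8  # two each of H2A, H2B, H3 and H4


def nucleosome_count(dna_length_bp: int) -> int:
    """Estimate how many nucleosomes fit on a linear DNA of the given length.

    n cores need n * 147 bp plus (n - 1) linkers of 50 bp between them,
    so n is the largest integer with n * 197 <= length + 50.
    """
    if dna_length_bp < CORE_DNA_BP:
        return 0
    return (dna_length_bp + LINKER_DNA_BP) // (CORE_DNA_BP + LINKER_DNA_BP)


def core_histone_count(dna_length_bp: int) -> int:
    """Number of core histone molecules needed for that many nucleosomes."""
    return nucleosome_count(dna_length_bp) * HISTONES_PER_CORE


if __name__ == "__main__":
    length = 10_000
    print(nucleosome_count(length), "nucleosomes,",
          core_histone_count(length), "core histones")
```

The model ignores linker histone H1 and any higher-order folding; it only counts how often the 147 bp plus 50 bp repeat fits.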
The following is a list of human histone proteins:
|Super family||Family||Subfamily||Members|
|Linker||H1||H1F||H1F0, H1FNT, H1FOO, H1FX|
|Linker||H1||H1H1||HIST1H1A, HIST1H1B, HIST1H1C, HIST1H1D, HIST1H1E, HIST1H1T|
|Core||H2A||H2AF||H2AFB1, H2AFB2, H2AFB3, H2AFJ, H2AFV, H2AFX, H2AFY, H2AFY2, H2AFZ|
|Core||H2A||H2A1||HIST1H2AA, HIST1H2AB, HIST1H2AC, HIST1H2AD, HIST1H2AE, HIST1H2AG, HIST1H2AI, HIST1H2AJ, HIST1H2AK, HIST1H2AL, HIST1H2AM|
|Core||H2B||H2BF||H2BFM, H2BFS, H2BFWT|
|Core||H2B||H2B1||HIST1H2BA, HIST1H2BB, HIST1H2BC, HIST1H2BD, HIST1H2BE, HIST1H2BF, HIST1H2BG, HIST1H2BH, HIST1H2BI, HIST1H2BJ, HIST1H2BK, HIST1H2BL, HIST1H2BM, HIST1H2BN, HIST1H2BO|
|Core||H3||H3A1||HIST1H3A, HIST1H3B, HIST1H3C, HIST1H3D, HIST1H3E, HIST1H3F, HIST1H3G, HIST1H3H, HIST1H3I, HIST1H3J|
|Core||H4||H41||HIST1H4A, HIST1H4B, HIST1H4C, HIST1H4D, HIST1H4E, HIST1H4F, HIST1H4G, HIST1H4H, HIST1H4I, HIST1H4J, HIST1H4K, HIST1H4L|
The nucleosome core is formed of two H2A-H2B dimers and an H3-H4 tetramer, forming two nearly symmetrical halves by tertiary structure (C2 symmetry; one half is related to the other by a two-fold rotation). The H2A-H2B dimers and H3-H4 tetramer also show pseudodyad symmetry. The four 'core' histones (H2A, H2B, H3 and H4) are relatively similar in structure and are highly conserved through evolution, all featuring a 'helix turn helix turn helix' motif (which allows easy dimerisation). They also share the feature of long 'tails' on one end of the amino acid structure, which are the location of post-translational modification (see below).
It has been proposed that histone proteins are evolutionarily related to the helical part of the extended AAA+ ATPase domain, the C-domain, and to the N-terminal substrate recognition domain of Clp/Hsp100 proteins. Despite the differences in their topology, these three folds share a homologous helix-strand-helix (HSH) motif.
Using an electron paramagnetic resonance spin-labeling technique, British researchers measured the distances between the spools around which eukaryotic cells wind their DNA. They determined that the spacings range from 59 to 70 Å.
In all, histones make five types of interactions with DNA:
- Helix dipoles from alpha-helices in H2B, H3, and H4 cause a net positive charge to accumulate at the point of interaction with negatively charged phosphate groups on DNA
- Hydrogen bonds between the DNA backbone and the amide group on the main chain of histone proteins
- Nonpolar interactions between the histone and deoxyribose sugars on DNA
- Salt bridges and hydrogen bonds between side chains of basic amino acids (especially lysine and arginine) and phosphate oxygens on DNA
- Non-specific minor groove insertions of the H3 and H2B N-terminal tails into two minor grooves each on the DNA molecule
The highly basic nature of histones, aside from facilitating DNA-histone interactions, contributes to their water solubility.
Histones are subject to post-translational modification by enzymes, primarily on their N-terminal tails but also in their globular domains. Such modifications include methylation, citrullination, acetylation, phosphorylation, SUMOylation, ubiquitination, and ADP-ribosylation. These modifications affect their function in gene regulation (see "Function" section).
In general, genes that are active have less bound histone, while inactive genes are highly associated with histones during interphase. It also appears that the structure of histones has been evolutionarily conserved, as any deleterious mutations would be severely maladaptive. All histones have a highly positively charged N-terminus with many lysine and arginine residues.
Histones were discovered in 1884 by Albrecht Kossel. The word "histone" dates from the late 19th century and is from the German word "Histon", a word itself of uncertain origin - perhaps from the Greek histanai or histos.
Until the early 1990s, histones were dismissed by most as inert packing material for eukaryotic nuclear DNA, a view based in part on the "ball and stick" models of Mark Ptashne and others, who believed that transcription was activated by protein-DNA and protein-protein interactions on largely naked DNA templates, as is the case in bacteria.
During the 1980s, work by Michael Grunstein demonstrated that eukaryotic histones actually repress gene transcription, and that the function of transcriptional activators is to overcome this repression. It is now known that histones play both positive and negative roles in gene expression, forming the basis of the histone code. The work of Vincent Allfrey on histone modification was pioneering, and he is regarded as the father of epigenetics.
Conservation across species
Histones are found in the nuclei of eukaryotic cells, and in certain Archaea, namely Thermoproteales and Euryarchaea, but not in bacteria. The unicellular algae known as dinoflagellates are the only eukaryotes that are known to completely lack histones.
Archaeal histones may well resemble the evolutionary precursors to eukaryotic histones. Histone proteins are among the most highly conserved proteins in eukaryotes, emphasizing their important role in the biology of the nucleus. In contrast, mature sperm cells largely use protamines to package their genomic DNA, most likely because this allows them to achieve an even higher packaging ratio.
Core histones are highly conserved proteins; that is, there are very few differences among the amino acid sequences of the histone proteins of different species. Linker histone usually has more than one form within a species and is also less conserved than the core histones.
There are some variant forms in some of the major classes. They share amino acid sequence homology and core structural similarity to a specific class of major histones but also have their own feature that is distinct from the major histones. These minor histones usually carry out specific functions of the chromatin metabolism. For example, histone H3-like CenpA is associated with only the centromere region of the chromosome. Histone H2A variant H2A.Z is associated with the promoters of actively transcribed genes and also involved in the prevention of the spread of silent heterochromatin. Furthermore, H2A.Z has roles in chromatin for genome stability. Another H2A variant H2A.X binds to the DNA with double-strand breaks and marks the region undergoing DNA repair. Histone H3.3 is associated with the body of actively transcribed genes.
Compacting DNA strands
Histones act as spools around which DNA winds. This enables the compaction necessary to fit the large genomes of eukaryotes inside cell nuclei: the compacted molecule is 40,000 times shorter than an unpacked molecule.
Histones undergo posttranslational modifications that alter their interaction with DNA and nuclear proteins. The H3 and H4 histones have long tails protruding from the nucleosome, which can be covalently modified at several places. Modifications of the tail include methylation, acetylation, phosphorylation, ubiquitination, SUMOylation, citrullination, and ADP-ribosylation. The core of the histones H2A, H2B, and H3 can also be modified. Combinations of modifications are thought to constitute a code, the so-called "histone code". Histone modifications act in diverse biological processes such as gene regulation, DNA repair, chromosome condensation (mitosis) and spermatogenesis (meiosis).
The common nomenclature of histone modifications is:
- The name of the histone (e.g., H3)
- The single-letter amino acid abbreviation (e.g., K for Lysine) and the amino acid position in the protein
- The type of modification (Me: methyl, P: phosphate, Ac: acetyl, Ub: ubiquitin)
- The number of modifications (only Me is known to occur in more than one copy per residue. 1, 2 or 3 is mono-, di- or tri-methylation)
So H3K4me1 denotes the monomethylation of the 4th residue (a lysine) from the start (i.e., the N-terminal) of the H3 protein.
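The naming convention above is regular enough to parse mechanically. The sketch below is one possible reading of it, offered as an illustration only: the `HistoneMark` type, the `parse_mark` function, and the restriction to H1, H2A, H2B, H3 and H4 are assumptions made here, and combined forms such as H3K9Me2/3 are deliberately not handled.

```python
import re
from typing import NamedTuple, Optional


class HistoneMark(NamedTuple):
    histone: str          # e.g. "H3"
    residue: str          # single-letter amino acid code, e.g. "K"
    position: int         # residue position counted from the N-terminus
    modification: str     # "me", "ac", "p" or "ub" (lower-cased)
    count: Optional[int]  # 1, 2 or 3 for methylation, otherwise None


# Histone name, residue letter, position, modification type, optional count.
_MARK_RE = re.compile(
    r"^(?P<histone>H(?:1|2A|2B|3|4))"
    r"(?P<residue>[A-Z])"
    r"(?P<position>[0-9]+)"
    r"(?P<mod>[Mm]e|[Aa]c|[Pp]|[Uu]b)"
    r"(?P<count>[123])?$"
)


def parse_mark(name: str) -> HistoneMark:
    """Split a name such as 'H3K4me1' into its components."""
    match = _MARK_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised histone mark: {name!r}")
    modification = match["mod"].lower()
    count = int(match["count"]) if match["count"] else None
    # Per the convention, only methylation carries a copy number.
    if count is not None and modification != "me":
        raise ValueError("a copy number is only expected for methylation")
    return HistoneMark(match["histone"], match["residue"],
                       int(match["position"]), modification, count)


if __name__ == "__main__":
    print(parse_mark("H3K4me1"))
    print(parse_mark("H2BK123Ub"))
```

The modification type is matched case-insensitively because both spellings (for example "me" and "Me") appear in practice.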
Examples of histone modifications in transcription regulation include:
Functions of histone modifications
A huge catalogue of histone modifications has been described, but a functional understanding of most is still lacking. Collectively, it is thought that histone modifications may underlie a histone code, whereby combinations of histone modifications have specific meanings. However, most functional data concern individual prominent histone modifications that are biochemically amenable to detailed study.
Chemistry of histone modifications
The addition of one, two or three methyl groups to lysine has little effect on the chemistry of the histone; methylation leaves the charge of the lysine intact and adds a minimal number of atoms, so steric interactions are mostly unaffected. However, proteins containing Tudor, chromo or PHD domains, amongst others, can recognise lysine methylation with exquisite sensitivity and differentiate mono-, di- and tri-methyl lysine, to the extent that, for some lysines (e.g., H4K20), mono-, di- and tri-methylation appear to have different meanings. Because of this, lysine methylation tends to be a very informative mark and dominates the known histone modification functions.
What was said above of the chemistry of lysine methylation also applies to arginine methylation, and some protein domains (e.g., Tudor domains) can be specific for methyl arginine instead of methyl lysine. Arginine is known to be mono- or di-methylated, and methylation can be symmetric or asymmetric, potentially with different meanings.
Addition of an acetyl group has a major chemical effect on lysine as it neutralises the positive charge. This reduces electrostatic attraction between the histone and the negatively charged DNA backbone, loosening the chromatin structure; highly acetylated histones form more accessible chromatin and tend to be associated with active transcription. Lysine acetylation appears to be less precise in meaning than methylation, in that histone acetyltransferases tend to act on more than one lysine; presumably this reflects the need to alter multiple lysines to have a significant effect on chromatin structure.
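The charge-neutralising effect of acetylation can be made concrete with a deliberately crude count of formal charges. The sketch below is an illustration under stated assumptions: the 20-residue string is the commonly cited N-terminal tail of histone H4 and is not taken from this article, histidine and the chain termini are treated as neutral, and `net_charge` is a hypothetical helper.

```python
# Commonly cited first 20 residues of the histone H4 N-terminal tail
# (an illustrative assumption, not taken from the article text).
H4_TAIL = "SGRGKGGKGLGKGGAKRHRK"


def net_charge(sequence: str, acetylated_positions=()) -> int:
    """Crude formal charge: +1 per Lys/Arg, -1 per Asp/Glu.

    Acetylated lysines (1-based positions) are counted as neutral.
    Histidine and the chain termini are ignored in this simplification.
    """
    acetylated = set(acetylated_positions)
    for position in acetylated:
        if not 1 <= position <= len(sequence) or sequence[position - 1] != "K":
            raise ValueError(f"position {position} is not a lysine")
    charge = 0
    for position, residue in enumerate(sequence, start=1):
        if residue == "K" and position not in acetylated:
            charge += 1
        elif residue == "R":
            charge += 1
        elif residue in "DE":
            charge -= 1
    return charge


if __name__ == "__main__":
    print("unmodified tail:", net_charge(H4_TAIL))
    print("acetylated at K5, K8, K12, K16:", net_charge(H4_TAIL, [5, 8, 12, 16]))
```

In this simplified count, acetylating four lysines halves the positive charge of the tail, which is consistent with the need to modify several lysines before chromatin structure changes appreciably.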
Addition of a negatively charged phosphate group can lead to major changes in protein structure, leading to the well-characterised role of phosphorylation in controlling protein function. It is not clear what structural implications histone phosphorylation has, but histone phosphorylation has clear functions as a post-translational modification, and binding domains such as BRCT have been characterised.
Functions in transcription
Most well-studied histone modifications are involved in control of transcription.
Actively transcribed genes
Two histone modifications are particularly associated with active transcription:
- Trimethylation of H3 lysine 4 (H3K4Me3) at the promoter of active genes
H3K4 trimethylation is performed by the COMPASS complex. Despite the conservation of this complex and histone modification from yeast to mammals, it is not entirely clear what role this modification plays. However, it is an excellent mark of active promoters, and the level of this histone modification at a gene's promoter is broadly correlated with transcriptional activity of the gene. The formation of this mark is tied to transcription in a rather convoluted manner: early in transcription of a gene, RNA polymerase II undergoes a switch from 'initiating' to 'elongating', marked by a change in the phosphorylation states of the RNA polymerase II C terminal domain (CTD). The same enzyme that phosphorylates the CTD also phosphorylates the Rad6 complex, which in turn adds a ubiquitin mark to H2B K123 (K120 in mammals). H2BK123Ub occurs throughout transcribed regions, but this mark is required for COMPASS to trimethylate H3K4 at promoters.
- Trimethylation of H3 lysine 36 (H3K36Me3) in the body of active genes
H3K36 trimethylation is deposited by the methyltransferase Set2. This protein associates with elongating RNA polymerase II, and H3K36Me3 is indicative of actively transcribed genes. H3K36Me3 is recognised by the Rpd3 histone deacetylase complex, which removes acetyl modifications from surrounding histones, increasing chromatin compaction and repressing spurious transcription. Increased chromatin compaction prevents transcription factors from accessing DNA, and reduces the likelihood of new transcription events being initiated within the body of the gene. This process therefore helps ensure that transcription is not interrupted.
Three histone modifications are particularly associated with repressed genes:
- Trimethylation of H3 lysine 27 (H3K27Me3)
This histone modification is deposited by the polycomb complex PRC2. It is a clear marker of gene repression, and is likely bound by other proteins to exert a repressive function. Another polycomb complex, PRC1, can bind H3K27Me3 and adds the histone modification H2AK119Ub, which aids chromatin compaction. Based on these data it appears that PRC1 is recruited through the action of PRC2; however, recent studies show that PRC1 is recruited to the same sites in the absence of PRC2.
- Di and tri-methylation of H3 lysine 9 (H3K9Me2/3)
H3K9Me2/3 is a well-characterised marker for heterochromatin, and is therefore strongly associated with gene repression. The formation of heterochromatin has been best studied in the yeast Schizosaccharomyces pombe, where it is initiated by recruitment of the RNA-induced transcriptional silencing (RITS) complex to double stranded RNAs produced from centromeric repeats. RITS recruits the Clr4 histone methyltransferase, which deposits H3K9Me2/3. This process is called histone methylation. H3K9Me2/3 serves as a binding site for the recruitment of Swi6 (heterochromatin protein 1 or HP1, another classic heterochromatin marker), which in turn recruits further repressive activities including histone modifiers such as histone deacetylases and histone methyltransferases.
- Trimethylation of H4 lysine 20 (H4K20Me3)
This modification is tightly associated with heterochromatin, although its functional importance remains unclear. This mark is placed by the Suv4-20h methyltransferase, which is at least in part recruited by heterochromatin protein 1.
Analysis of histone modifications in embryonic stem cells (and other stem cells) revealed many gene promoters carrying both H3K4Me3 and H3K27Me3; in other words, these promoters display both activating and repressing marks simultaneously. This peculiar combination of modifications marks genes that are poised for transcription; they are not required in stem cells, but are rapidly required after differentiation into some lineages. Once the cell starts to differentiate, these bivalent promoters are resolved to either active or repressive states depending on the chosen lineage.
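The promoter states described in this section can be summarised as a small decision rule. The sketch below is a simplification for illustration only: it looks at just the two marks named above, the function name and return labels are hypothetical, and real chromatin state calls rely on quantitative signal rather than presence or absence.

```python
def classify_promoter(marks) -> str:
    """Label a promoter from the set of histone marks detected on it.

    Only H3K4me3 (active) and H3K27me3 (repressive) are considered;
    mark names are compared case-insensitively.
    """
    present = {mark.lower() for mark in marks}
    active = "h3k4me3" in present
    repressive = "h3k27me3" in present
    if active and repressive:
        return "bivalent"      # poised: carries both marks at once
    if active:
        return "active"
    if repressive:
        return "repressed"
    return "unclassified"


if __name__ == "__main__":
    print(classify_promoter(["H3K4Me3", "H3K27Me3"]))
    print(classify_promoter(["H3K4me3", "H3K36me3"]))
```

Resolving a bivalent promoter during differentiation corresponds, in this toy model, to losing one of the two marks.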
Marking sites of DNA damage is an important function for histone modifications.
- Phosphorylation of H2AX at serine 139 (γH2AX)
Phosphorylated H2AX (also known as gamma H2AX) is a marker for DNA double-strand breaks and forms part of the response to DNA damage. H2AX is phosphorylated early after detection of a DNA double-strand break and forms a domain extending many kilobases either side of the damage. Gamma H2AX acts as a binding site for the protein MDC1, which in turn recruits key DNA repair proteins; as such, gamma H2AX forms a vital part of the machinery that ensures genome stability.
- Acetylation of H3 lysine 56 (H3K56Ac)
H3K56Ac is required for genome stability. H3K56 is acetylated by the p300/Rtt109 complex, but is rapidly deacetylated around sites of DNA damage. H3K56 acetylation is also required to stabilise stalled replication forks, preventing dangerous replication fork collapses. Although in general mammals make far greater use of histone modifications than microorganisms, a major role of H3K56Ac in DNA replication exists only in fungi, and this has become a target for antibiotic development.
- Phosphorylation of H3 at serine 10 (phospho-H3S10)
The mitotic kinase aurora B phosphorylates histone H3 at serine 10, triggering a cascade of changes that mediate mitotic chromosome condensation. Condensed chromosomes therefore stain very strongly for this mark, but H3S10 phosphorylation is also present at certain chromosome sites outside mitosis, for example in pericentric heterochromatin of cells during G2. H3S10 phosphorylation has also been linked to DNA damage caused by R loop formation at highly transcribed sites.
- Phosphorylation of H2B at serine 10 in yeast or serine 14 in mammalian cells (phospho-H2BS10/14)
Phosphorylation of H2B at serine 10 (yeast) or serine 14 (mammals) is also linked to chromatin condensation, but for the very different purpose of mediating chromosome condensation during apoptosis. This mark is not simply a late acting bystander in apoptosis as yeast carrying mutations of this residue are resistant to hydrogen peroxide-induced apoptotic cell death.
- Youngson, Robert M. (2006). Collins Dictionary of Human Biology. Glasgow: HarperCollins. ISBN 0-00-722134-7.
- Cox, Michael; Nelson, David R.; Lehninger, Albert L (2005). Lehninger Principles of Biochemistry. San Francisco: W.H. Freeman. ISBN 0-7167-4339-6.
- Redon C, Pilch D, Rogakou E, Sedelnikova O, Newrock K, Bonner W (April 2002). "Histone H2A variants H2AX and H2AZ". Curr. Opin. Genet. Dev. 12 (2): 162â€“9. doi:10.1016/S0959-437X(02)00282-4. PMID 11893489.
- Bhasin M, Reinherz EL, Reche PA (2006). "Recognition and classification of histones using support vector machine". J. Comput. Biol. 13 (1): 102â€“12. doi:10.1089/cmb.2006.13.102. PMID 16472024.
- Hartl, Daniel L.; Freifelder, David; Snyder, Leon A. (1988). Basic Genetics. Boston: Jones and Bartlett Publishers. ISBN 0-86720-090-1.
- Luger K, MÃ¤der AW, Richmond RK, Sargent DF, Richmond TJ (September 1997). "Crystal structure of the nucleosome core particle at 2.8 A resolution". Nature 389 (6648): 251â€“60. doi:10.1038/38444. PMID 9305837. PDB 1AOI
- Farkas, Daniel (1996). DNA simplified: the hitchhiker's guide to DNA. Washington, D.C: AACC Press. ISBN 0-915274-84-1.
- Alva V, Ammelburg M, Lupas AN (March 2007). "On the origin of the histone fold". BMC Struct Biol 7: 17. doi:10.1186/1472-6807-7-17. PMC 1847821. PMID 17391511.
- Ward R, Bowman A, El-Mkami H, Owen-Hughes T, Norman DG (February 2009). "Long distance PELDOR measurements on the histone core particle". J. Am. Chem. Soc. 131 (4): 1348â€“9. doi:10.1021/ja807918f. PMC 3501648. PMID 19138067.
- Kayne PS, Kim UJ, Han M, Mullen JR, Yoshizaki F, Grunstein M (October 1988). "Extremely conserved histone H4 N terminus is dispensable for growth but essential for repressing the silent mating loci in yeast". Cell 55 (1): 27â€“39. doi:10.1016/0092-8674(88)90006-2. PMID 3048701.
- "Vincent Allfrey's Work on Histone Acetylation". J Biol Chem 287 (3): 2270â€“2271. Jan 13, 2012. doi:10.1074/jbc.O112.000248. PMC 3265906.
- Crane-Robinson C, Dancy SE, Bradbury EM, Garel A, Kovacs AM, Champagne M, Daune M (August 1976). "Structural studies of chicken erythrocyte histone H5". Eur. J. Biochem. 67 (2): 379â€“88. doi:10.1111/j.1432-1033.1976.tb10702.x. PMID 964248.
- Aviles FJ, Chapman GE, Kneale GG, Crane-Robinson C, Bradbury EM (August 1978). "The conformation of histone H5. Isolation and characterisation of the globular segment". Eur. J. Biochem. 88 (2): 363â€“71. doi:10.1111/j.1432-1033.1978.tb12457.x. PMID 689022.
- Peter J. Rizzo (2003). "Those amazing dinoflagellate chromosomes". Cell Research 13 (4): 215â€“217. doi:10.1038/sj.cr.7290166. PMID 12974611.
- Clarke HJ (1992). "Nuclear and chromatin composition of mammalian gametes and early embryos". Biochem. Cell Biol. 70 (10â€“11): 856â€“66. doi:10.1139/o92-134. PMID 1297351.
- Guillemette B, Bataille AR, GÃ©vry N, Adam M, Blanchette M, Robert F, Gaudreau L (December 2005). "Variant Histone H2A.Z Is Globally Localized to the Promoters of Inactive Yeast Genes and Regulates Nucleosome Positioning". PLoS Biol. 3 (12): e384. doi:10.1371/journal.pbio.0030384. PMC 1275524. PMID 16248679.
- Billon P, CÃ´tÃ© J (October 2011). "Precise deposition of histone H2A.Z in chromatin for genome expression and maintenance". Biochim Biophys Acta. 1819 (3â€“4): 290â€“302. doi:10.1016/j.bbagrm.2011.10.004. PMID 22027408.
- Paull TT, Rogakou EP, Yamazaki V, Kirchgessner CU, Gellert M, Bonner WM (2000). "A critical role for histone H2AX in recruitment of repair factors to nuclear foci after DNA damage". Curr. Biol. 10 (15): 886â€“95. doi:10.1016/S0960-9822(00)00610-2. PMID 10959836.
- Ahmad K, Henikoff S (June 2002). "The histone variant H3.3 marks active chromatin by replication-independent nucleosome assembly". Mol. Cell 9 (6): 1191â€“200. doi:10.1016/S1097-2765(02)00542-7. PMID 12086617.
- Strahl BD, Allis CD (Jan 2000). "The language of covalent histone modifications". Nature 403 (6765): 41â€“5. doi:10.1038/47412. PMID 10638745.
- Jenuwein T, Allis CD (Aug 2001). "Translating the histone code". Science 293 (5532): 1074â€“80. doi:10.1126/science.1063127. PMID 11498575.
- Ning Song, Jie Liu, Shucai An, Tomoya Nishino, Yoshitaka Hishikawa and Takehiko Koji (2011). "Immunohistochemical Analysis of Histone H3 Modifications in Germ Cells during Mouse Spermatogenesis". Acta Histochemica et Cytochemica 44 (4): 183â€“90. doi:10.1267/ahc.11027. PMC 3168764. PMID 21927517.
- Benevolenskaya EV (August 2007). "Histone H3K4 demethylases are essential in development and differentiation". Biochem. Cell Biol. 85 (4): 435â€“43. doi:10.1139/o07-057. PMID 17713579.
- Barski A, Cuddapah S, Cui K, Roh TY, Schones DE, Wang Z, Wei G, Chepelev I, Zhao K (May 2007). "High-resolution profiling of histone methylations in the human genome". Cell 129 (4): 823â€“37. doi:10.1016/j.cell.2007.05.009. PMID 17512414.
- Steger DJ, Lefterova MI, Ying L, Stonestrom AJ, Schupp M, Zhuo D, Vakoc AL, Kim JE, Chen J, Lazar MA, Blobel GA, Vakoc CR (April 2008). "DOT1L/KMT4 Recruitment and H3K79 Methylation Are Ubiquitously Coupled with Gene Transcription in Mammalian Cells". Mol. Cell. Biol. 28 (8): 2825â€“39. doi:10.1128/MCB.02076-07. PMC 2293113. PMID 18285465.
- Rosenfeld JA, Wang Z, Schones DE, Zhao K, DeSalle R, Zhang MQ (2009). "Determination of enriched histone modifications in non-genic portions of the human genome". BMC Genomics 10: 143. doi:10.1186/1471-2164-10-143. PMC 2667539. PMID 19335899.
- Koch CM, Andrews RM, Flicek P, Dillon SC, KaraÃ¶z U, Clelland GK, Wilcox S, Beare DM, Fowler JC, Couttet P, James KD, Lefebvre GC, Bruce AW, Dovey OM, Ellis PD, Dhami P, Langford CF, Weng Z, Birney E, Carter NP, Vetrie D, Dunham I (June 2007). "The landscape of histone modifications across 1% of the human genome in five human cell lines". Genome Res. 17 (6): 691â€“707. doi:10.1101/gr.5704207. PMC 1891331. PMID 17567990.
- Creyghton MP, Cheng AW, Welstead GG, Kooistra T, Carey BW, Steine EJ, Hanna J, Lodato MA, Frampton GM, Sharp PA, Boyer LA, Young RA, Jaenisch R (2010). "Histone H3K27ac separates active from poised enhancers and predicts developmental state". Proceedings of the National Academy of Sciences 107 (= 50, pages = 21931-21936). doi:10.1073/pnas.1016071107. PMC 3003124. PMID 21106759.
- Krogan NJ, Dover J, Wood A, Schneider J, Heidt J, Boateng MA, Dean K, Ryan OW, Golshani A, Johnston M, Greenblatt JF, Shilatifard A (March 2003). "The Paf1 complex is required for histone H3 methylation by COMPASS and Dot1p: linking transcriptional elongation to histone methylation". Mol. Cell 11 (3): 721â€“9. doi:10.1016/S1097-2765(03)00091-1. PMID 12667454.
- Ng HH, Robert F, Young RA, Struhl K (2003). "Targeted recruitment of Set1 histone methylase by elongating Pol II provides a localized mark and memory of recent transcriptional activity". Mol Cell 11 (3): 709â€“19. doi:10.1016/S1097-2765(03)00092-3. PMID 12667453.
- Bernstein BE, Kamal M, Lindblad-Toh K, Bekiranov S, Bailey DK, Huebert DJ, McMahon S, Karlsson EK, Kulbokas EJ, Gingeras TR, Schreiber SL, Lander ES (January 2005). "Genomic maps and comparative analysis of histone modifications in human and mouse". Cell 120 (2): 169â€“81. doi:10.1016/j.cell.2005.01.001. PMID 15680324.
- Krogan NJ, Dover J, Khorrami S, Greenblatt JF, Schneider J, Johnston M, Shilatifard A (March 2002). "COMPASS, a histone H3 (Lysine 4) methyltransferase required for telomeric silencing of gene expression". J. Biol. Chem. 277 (13): 10753â€“5. doi:10.1074/jbc.C200023200. PMID 11805083.
- Roguev A, Schaft D, Shevchenko A, Pijnappel WW, Wilm M, Aasland R, Stewart AF (December 2001). "The Saccharomyces cerevisiae Set1 complex includes an Ash2 homologue and methylates histone 3 lysine 4". EMBO J. 20 (24): 7137â€“48. doi:10.1093/emboj/20.24.7137. PMC 125774. PMID 11742990.
- Nagy PL, Griesenbeck J, Kornberg RD, Cleary ML (January 2002). "A trithorax-group complex purified from Saccharomyces cerevisiae is required for methylation of histone H3". Proc. Natl. Acad. Sci. U.S.A. 99 (1): 90â€“4. doi:10.1073/pnas.221596698. PMC 117519. PMID 11752412.
- Wood A, Schneider J, Dover J, Johnston M, Shilatifard A (2005). "The Bur1/Bur2 complex is required for histone H2B monoubiquitination by Rad6/Bre1 and histone methylation by COMPASS". Mol Cell 20 (4): 589â€“99. doi:10.1016/j.molcel.2005.09.010. PMID 16307922.
- Sarcevic B, Mawson A, Baker RT, Sutherland RL (2002). "Regulation of the ubiquitin-conjugating enzyme hHR6A by CDK-mediated phosphorylation". EMBO J 21 (8): 2009â€“18. doi:10.1093/emboj/21.8.2009. PMC 125963. PMID 11953320.
- Robzyk K, Recht J, Osley MA (2000). "Rad6-dependent ubiquitination of histone H2B in yeast". Science 287 (5452): 501â€“4. doi:10.1126/science.287.5452.501. PMID 10642555.
- Sun ZW, Allis CD (2002). "Ubiquitination of histone H2B regulates H3 methylation and gene silencing in yeast". Nature 418 (6893): 104â€“8. doi:10.1038/nature00883. PMID 12077605.
- Dover J, Schneider J, Tawiah-Boateng MA, Wood A, Dean K, Johnston M, Shilatifard A (August 2002). "Methylation of histone H3 by COMPASS requires ubiquitination of histone H2B by Rad6". J. Biol. Chem. 277 (32): 28368â€“71. doi:10.1074/jbc.C200348200. PMID 12070136.
- Strahl BD, Grant PA, Briggs SD, Sun ZW, Bone JR, Caldwell JA, Mollah S, Cook RG, Shabanowitz J, Hunt DF, Allis CD (March 2002). "Set2 is a nucleosomal histone H3-selective methyltransferase that mediates transcriptional repression". Mol. Cell. Biol. 22 (5): 1298â€“306. doi:10.1128/MCB.22.5.1298-1306.2002. PMC 134702. PMID 11839797.
- Li J, Moazed D, Gygi SP (2002). "Association of the histone methyltransferase Set2 with RNA polymerase II plays a role in transcription elongation". J Biol Chem 277 (51): 49383â€“8. doi:10.1074/jbc.M209294200. PMID 12381723.
- Carrozza MJ, Li B, Florens L, Suganuma T, Swanson SK, Lee KK, Shia WJ, Anderson S, Yates J, Washburn MP, Workman JL (November 2005). "Histone H3 methylation by Set2 directs deacetylation of coding regions by Rpd3S to suppress spurious intragenic transcription". Cell 123 (4): 581â€“92. doi:10.1016/j.cell.2005.10.023. PMID 16286007.
- Keogh MC, Kurdistani SK, Morris SA, Ahn SH, Podolny V, Collins SR, Schuldiner M, Chin K, Punna T, Thompson NJ, Boone C, Emili A, Weissman JS, Hughes TR, Strahl BD, Grunstein M, Greenblatt JF, Buratowski S, Krogan NJ (November 2005). "Cotranscriptional set2 methylation of histone H3 lysine 36 recruits a repressive Rpd3 complex". Cell 123 (4): 593â€“605. doi:10.1016/j.cell.2005.10.025. PMID 16286008.
- Joshi AA, Struhl K (2005). "Eaf3 chromodomain interaction with methylated H3-K36 links histone deacetylation to Pol II elongation". Mol Cell 20 (6): 971â€“8. doi:10.1016/j.molcel.2005.11.021. PMID 16364921.
- Kuzmichev A, Nishioka K, Erdjument-Bromage H, Tempst P, Reinberg D (2002). "Histone methyltransferase activity associated with a human multiprotein complex containing the Enhancer of Zeste protein". Genes Dev 16 (22): 2893â€“905. doi:10.1101/gad.1035902. PMC 187479. PMID 12435631.
- Cao R, Wang L, Wang H, Xia L, Erdjument-Bromage H, Tempst P et al. (2002). "Role of histone H3 lysine 27 methylation in Polycomb-group silencing". Science 298 (5595): 1039â€“43. doi:10.1126/science.1076997. PMID 12351676.
- de Napoles M, Mermoud JE, Wakao R, Tang YA, Endoh M, Appanah R et al. (2004). "Polycomb group proteins Ring1A/B link ubiquitylation of histone H2A to heritable gene silencing and X inactivation". Dev Cell 7 (5): 663â€“76. doi:10.1016/j.devcel.2004.10.005. PMID 15525528.
- Wang H, Wang L, Erdjument-Bromage H, Vidal M, Tempst P, Jones RS et al. (2004). "Role of histone H2A ubiquitination in Polycomb silencing". Nature 431 (7010): 873â€“8. doi:10.1038/nature02985. PMID 15386022.
- Tavares L, Dimitrova E, Oxley D, Webster J, Poot R, Demmers J et al. (2012). "RYBP-PRC1 Complexes Mediate H2A Ubiquitylation at Polycomb Target Sites Independently of PRC2 and H3K27me3". Cell 148 (4): 664â€“78. doi:10.1016/j.cell.2011.12.029. PMC 3281992. PMID 22325148.
- Gao Z, Zhang J, Bonasio R, Strino F, Sawai A, Parisi F et al. (2012). "PCGF Homologs, CBX Proteins, and RYBP Define Functionally Distinct PRC1 Family Complexes". Mol Cell 45 (3): 344â€“56. doi:10.1016/j.molcel.2012.01.002. PMC 3293217. PMID 22325352.
- Verdel A, Jia S, Gerber S, Sugiyama T, Gygi S, Grewal SI et al. (2004). "RNAi-mediated targeting of heterochromatin by the RITS complex". Science 303 (5658): 672â€“6. doi:10.1126/science.1093686. PMC 3244756. PMID 14704433.
- Rea S, Eisenhaber F, O'Carroll D, Strahl BD, Sun ZW, Schmid M et al. (2000). "Regulation of chromatin structure by site-specific histone H3 methyltransferases". Nature 406 (6796): 593â€“9. doi:10.1038/35020506. PMID 10949293.
- Bannister AJ, Zegerman P, Partridge JF, Miska EA, Thomas JO, Allshire RC et al. (2001). "Selective recognition of methylated lysine 9 on histone H3 by the HP1 chromo domain". Nature 410 (6824): 120â€“4. doi:10.1038/35065138. PMID 11242054.
- Lachner M, O'Carroll D, Rea S, Mechtler K, Jenuwein T (2001). "Methylation of histone H3 lysine 9 creates a binding site for HP1 proteins". Nature 410 (6824): 116â€“20. doi:10.1038/35065132. PMID 11242053.
- Schotta G, Lachner M, Sarma K, Ebert A, Sengupta R, Reuter G et al. (2004). "A silencing pathway to induce H3-K9 and H4-K20 methylation at constitutive heterochromatin". Genes Dev 18 (11): 1251â€“62. doi:10.1101/gad.300704. PMC 420351. PMID 15145825.
- Kourmouli N, Jeppesen P, Mahadevhaiah S, Burgoyne P, Wu R, Gilbert DM et al. (2004). "Heterochromatin and tri-methylated lysine 20 of histone H4 in animals". J Cell Sci 117 (Pt 12): 2491â€“501. doi:10.1242/jcs.01238. PMID 15128874.
- Bernstein BE, Mikkelsen TS, Xie X, Kamal M, Huebert DJ, Cuff J et al. (2006). "A bivalent chromatin structure marks key developmental genes in embryonic stem cells". Cell 125 (2): 315â€“26. doi:10.1016/j.cell.2006.02.041. PMID 16630819.
- Rogakou EP, Pilch DR, Orr AH, Ivanova VS, Bonner WM (1998). "DNA double-stranded breaks induce histone H2AX phosphorylation on serine 139". J Biol Chem 273 (10): 5858â€“68. doi:10.1074/jbc.273.10.5858. PMID 9488723.
- Celeste A, Petersen S, Romanienko PJ, Fernandez-Capetillo O, Chen HT, Sedelnikova OA, Reina-San-Martin B, Coppola V, Meffre E, Difilippantonio MJ, Redon C, Pilch DR, Olaru A, Eckhaus M, Camerini-Otero RD, Tessarollo L, Livak F, Manova K, Bonner WM, Nussenzweig MC, Nussenzweig A (2002). "Genomic instability in mice lacking histone H2AX". Science 296 (5569): 922â€“7. doi:10.1126/science.1069398. PMID 11934988.
- Shroff R, Arbel-Eden A, Pilch D, Ira G, Bonner WM, Petrini JH, Haber JE, Lichten M (2004). "Distribution and dynamics of chromatin modification induced by a defined DNA double-strand break". Curr Biol 14 (19): 1703â€“11. doi:10.1016/j.cub.2004.09.047. PMID 15458641.
- Rogakou EP, Boon C, Redon C, Bonner WM (1999). "Megabase chromatin domains involved in DNA double-strand breaks in vivo". J Cell Biol 146 (5): 905â€“16. doi:10.1083/jcb.146.5.905. PMC 2169482. PMID 10477747.
- Stewart GS, Wang B, Bignell CR, Taylor AM, Elledge SJ (2003). "MDC1 is a mediator of the mammalian DNA damage checkpoint". Nature 421 (6926): 961â€“6. doi:10.1038/nature01446. PMID 12607005.
- Bekker-Jensen S, Mailand N (2010). "Assembly and function of DNA double-strand break repair foci in mammalian cells". DNA Repair (Amst) 9 (12): 1219â€“28. doi:10.1016/j.dnarep.2010.09.010. PMID 21035408.
- Ozdemir A, Spicuglia S, Lasonder E, Vermeulen M, Campsteijn C, Stunnenberg HG, Logie C (2005). "Characterization of lysine 56 of histone H3 as an acetylation site in Saccharomyces cerevisiae". J Biol Chem 280 (28): 25949â€“52. doi:10.1074/jbc.C500181200. PMID 15888442.
- Masumoto H, Hawke D, Kobayashi R, Verreault A (2005). "A role for cell-cycle-regulated histone H3 lysine 56 acetylation in the DNA damage response". Nature 436 (7048): 294â€“8. doi:10.1038/nature03714. PMID 16015338.
- Driscoll R, Hudson A, Jackson SP (2007). "Yeast Rtt109 promotes genome stability by acetylating histone H3 on lysine 56". Science 315 (5812): 649â€“52. doi:10.1126/science.1135862. PMC 3334813. PMID 17272722.
- Han J, Zhou H, Horazdovsky B, Zhang K, Xu RM, Zhang Z (2007). "Rtt109 acetylates histone H3 lysine 56 and functions in DNA replication". Science 315 (5812): 653â€“5. doi:10.1126/science.1133234. PMID 17272723.
- Das C, Lucia MS, Hansen KC, Tyler JK (2009). "CBP/p300-mediated acetylation of histone H3 on lysine 56". Nature 459 (7243): 113â€“7. doi:10.1038/nature07861. PMC 2756583. PMID 19270680.
- Han J, Zhou H, Li Z, Xu RM, Zhang Z (2007). "Acetylation of lysine 56 of histone H3 catalyzed by RTT109 and regulated by ASF1 is required for replisome integrity". J Biol Chem 282 (39): 28587â€“96. doi:10.1074/jbc.M702496200. PMID 17690098.
- Wurtele H, Kaiser GS, Bacal J, St-Hilaire E, Lee EH, Tsao S, Dorn J, Maddox P, Lisby M, Pasero P, Verreault A (2012). "Histone H3 lysine 56 acetylation and the response to DNA replication fork damage". Mol Cell Biol 32 (1): 154â€“72. doi:10.1128/MCB.05415-11. PMC 3255698. PMID 22025679.
- Wurtele H, Tsao S, LÃ©pine G, Mullick A, Tremblay J, Drogaris P, Lee EH, Thibault P, Verreault A, Raymond M (2010). "Modulation of histone H3 lysine 56 acetylation as an antifungal therapeutic strategy". Nat Med 16 (7): 774â€“80. doi:10.1038/nm.2175. PMID 20601951.
- Wilkins BJ, Rall NA, Ostwal Y, Kruitwagen T, Hiragami-Hamada K, Winkler M, Barral Y, Fischle W, Neumann H (Jan 3, 2014). "A cascade of histone modifications induces chromatin condensation in mitosis.". Science 343 (6166): 77â€“80. doi:10.1126/science.1244508. PMID 24385627.
- Johansen KM, Johansen J (2006). "Regulation of chromatin structure by histone H3S10 phosphorylation.". Chromosome research : an international journal on the molecular, supramolecular and evolutionary aspects of chromosome biology 14 (4): 393â€“404. doi:10.1007/s10577-006-1063-4. PMID 16821135.
- Castellano-Pozo M, Santos-Pereira JM, RondÃ³n AG, Barroso S, AndÃºjar E, PÃ©rez-Alegre M, GarcÃa-Muse T, Aguilera A (Nov 21, 2013). "R loops are linked to histone H3 S10 phosphorylation and chromatin condensation.". Molecular Cell 52 (4): 583â€“90. doi:10.1016/j.molcel.2013.10.006. PMID 24211264.
- Cheung WL, Ajiro K, Samejima K, Kloc M, Cheung P, Mizzen CA, Beeser A, Etkin LD, Chernoff J, Earnshaw WC, Allis CD (May 16, 2003). "Apoptotic phosphorylation of histone H2B is mediated by mammalian sterile twenty kinase.". Cell 113 (4): 507â€“17. doi:10.1016/s0092-8674(03)00355-6. PMID 12757711.
- Ahn SH, Cheung WL, Hsu JY, Diaz RL, Smith MM, Allis CD (Jan 14, 2005). "Sterile 20 kinase phosphorylates histone H2B at serine 10 during hydrogen peroxide-induced apoptosis in S. cerevisiae.". Cell 120 (1): 25â€“36. doi:10.1016/j.cell.2004.11.016. PMID 15652479.
This tab holds the annotation information that is stored in the Pfam database. As we move to using Wikipedia as our main source of annotation, the contents of this tab will be gradually replaced by the Wikipedia tab.
Core histone H2A/H2B/H3/H4
No Pfam abstract.
Internal database links
|SCOOP:||CBFD_NFYB_HMF TFIID-18kDa TFIID-31kDa TAF UPF0137 TFIID_20kDa TAFII28 Bromo_TP TMA7 DUF1931 CENP-X DUF3573 CENP-T_C CENP-S PAF|
|Similarity to PfamA using HHSearch:||CBFD_NFYB_HMF TFIID-31kDa TAF TFIID_20kDa CENP-T_C CENP-S|
External database links
|PRINTS:||PR00620 PR00621 PR00622 PR00623|
|PROSITE:||PDOC00045 PDOC00046 PDOC00287 PDOC00308|
This tab holds annotation information from the InterPro database.
InterPro entry IPR007125
Five major families of histones exist: H1/H5, H2A, H2B, H3, and H4 [PUBMED:16472024]. Histones H2A, H2B, H3 and H4 are known as the core histones, while histones H1 and H5 are known as the linker histones. The core histones together with some other DNA binding proteins form a superfamily defined by a common fold and distant sequence similarities [PUBMED:7651829, PUBMED:9016552]. Some proteins contain local homology domains related to the histone fold [PUBMED:9305837].
This entry represents the histone core.
The mapping between Pfam and Gene Ontology is provided by InterPro. If you use this data please cite InterPro.
|Molecular function||DNA binding (GO:0003677)|
Below is a listing of the unique domain organisations or architectures in which this domain is found.
The graphic that is shown by default represents the longest sequence with a given architecture. Each row contains the following information:
- the number of sequences which exhibit this architecture
- a textual description of the architecture, e.g. Gla, EGF x 2, Trypsin. This example describes an architecture with one Gla domain, followed by two consecutive EGF domains, and finally a single Trypsin domain
- a link to the page in the Pfam site showing information about the sequence that the graphic describes
- the UniProt description of the protein sequence
- the number of residues in the sequence
- the Pfam graphic itself.
Note that you can see the family page for a particular domain by clicking on the graphic. You can also choose to see all sequences which have a given architecture by clicking on the Show link in each row.
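The textual architecture description can be illustrated with a short sketch. It assumes the architecture is available as an ordered list of domain names; the function name `describe_architecture` is hypothetical and is not part of any Pfam tool.

```python
from itertools import groupby


def describe_architecture(domains):
    """Collapse consecutive repeats of a domain into 'Name x N'."""
    parts = []
    for name, run in groupby(domains):
        count = len(list(run))
        parts.append(name if count == 1 else f"{name} x {count}")
    return ", ".join(parts)


# One Gla domain, two consecutive EGF domains, one Trypsin domain.
print(describe_architecture(["Gla", "EGF", "EGF", "Trypsin"]))
```

Only consecutive repeats are merged, so a domain that recurs after a different domain is listed again rather than being counted together.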
Finally, because some families can be found in a very large number of architectures, we load only the first fifty architectures by default. If you want to see more architectures, click the button at the bottom of the page to load the next set.
Members of this clan all possess a histone fold. Generally proteins in this clan are DNA binding.
The clan contains the following 13 members: Bromo_TP Bromo_TP_like CBFD_NFYB_HMF CENP-S CENP-T_C CENP-W CENP-X Histone TAF TAFII28 TFIID-18kDa TFIID-31kDa TFIID_20kDa
We store a range of different sequence alignments for families. As well as the seed alignment from which the family is built, we provide the full alignment, generated by searching the sequence database using the family HMM. We also generate alignments using four representative proteome (RP) sets, the NCBI sequence database, and our metagenomics sequence database.
There are various ways to view or download the sequence alignments that we store. We provide several sequence viewers and a plain-text Stockholm-format file for download.
We make a range of alignments for each Pfam-A family:
- Seed: the curated alignment from which the HMM for the family is built
- Full: the alignment generated by searching the sequence database using the HMM
- Representative proteomes (RPs): alignments at 15%, 35%, 55% and 75% co-membership thresholds
- NCBI: the alignment generated by searching the NCBI sequence database using the family HMM
- Metagenomics: the alignment generated by searching the metagenomics sequence database using the family HMM
You can see the alignments as HTML or in three different sequence viewers:
- jalview: a Java applet developed at the University of Dundee. You will need Java installed before running jalview.
- HTML: an HTML page showing the whole alignment. Please note that full Pfam alignments can be very large, so these HTML views are extremely large and often cause problems for browsers. Please use either jalview or the Pfam viewer if you have trouble viewing the HTML version.
- heatmap: an HTML-based representation of the alignment, coloured according to the posterior-probability (PP) values from the HMM. As for the standard HTML view, heatmap alignments can also be very large and slow to render.
- Pfam viewer: an HTML-based viewer that uses DAS to retrieve alignment fragments on request.
You can download (or view in your browser) a text representation of a Pfam alignment in various formats.
You can also change the order in which sequences are listed in the alignment, change how insertions are represented, alter the characters that are used to represent gaps in sequences and, finally, choose whether to download the alignment or to view it in your browser directly.
You may find that large alignments cause problems for the viewers and the reformatting tool, so we also provide all alignments in Stockholm format. You can download either the plain text alignment, or a gzipped version of it.
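For readers who download the Stockholm files, the following is a minimal reading sketch rather than a full parser. It assumes the usual layout of the format: a `# STOCKHOLM 1.0` header, markup lines beginning with `#`, one `name sequence` pair per line, and `//` as terminator. The function names and the two toy sequences are illustrative only.

```python
import gzip
import io


def parse_stockholm(handle):
    """Return {sequence_name: aligned_sequence} for the first alignment in handle."""
    sequences = {}
    for raw in handle:
        line = raw.rstrip("\n")
        if line.startswith("//"):          # end of the alignment
            break
        if not line.strip() or line.startswith("#"):
            continue                       # header and markup lines are skipped
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue                       # ignore malformed lines
        name, fragment = parts
        # Interleaved files repeat each name once per block; join the pieces.
        sequences[name] = sequences.get(name, "") + fragment.strip()
    return sequences


def open_alignment(path):
    """Open a plain-text or gzip-compressed Stockholm file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


example = """# STOCKHOLM 1.0
#=GF ID Histone
seqA/1-5  MK.AR
seqB/1-5  MKTAR
//
"""
alignment = parse_stockholm(io.StringIO(example))
print(alignment)
```

Reading line by line keeps memory use modest for large full alignments, at the cost of ignoring all annotation lines.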
We make a range of alignments for each Pfam-A family. You can see a description of each above. You can view these alignments in various ways but please note that some types of alignment are never generated while others may not be available for all families, most commonly because the alignments are too large to handle.
Note: PP/heatmap alignments cannot be generated for seed alignments, because no PP data is available for them.
Format an alignment
We make all of our alignments available in Stockholm format. You can download them here as raw, plain text files or as gzip-compressed files.
You can also download a FASTA format file containing the full-length sequences for all sequences in the full alignment.
MyHits provides a collection of tools to handle multiple sequence alignments. For example, one can refine a seed alignment (sequence addition or removal, re-alignment or manual editing) and then search databases for remote homologs using HMMER3.
HMM logos are one way of visualising profile HMMs. Logos provide a quick overview of the properties of an HMM in a graphical form.
If you find these logos useful in your own work, please consider citing the following article:
This page displays the phylogenetic tree for this family's seed alignment. We use FastTree to calculate neighbour-joining trees with a local bootstrap based on 100 resamples (shown next to the tree nodes). FastTree calculates approximately-maximum-likelihood phylogenetic trees from our seed alignment.
Note: You can also download the data file for the tree.
Curation and family details
This section shows the detailed information about the Pfam family. You can see the definitions of many of the terms in this section in the glossary and a fuller explanation of the scoring system that we use in the scores section of the help pages.
|Author:||Bateman A, Sonnhammer ELL|
|Number in seed:||30|
|Number in full:||25449|
|Average length of the domain:||106.10 aa|
|Average identity of full alignment:||54 %|
|Average coverage of the sequence by the domain:||80.19 %|
|HMM build commands:||build method: hmmbuild -o /dev/null HMM SEED; search method: hmmsearch -Z 80369284 -E 1000 --cpu 4 HMM pfamseq|
|Family (HMM) version:||20|
|Download:||download the raw HMM for this family|
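The build and search commands listed in the table can be wrapped in a small script. The sketch below assumes HMMER is installed and that files named SEED and pfamseq exist in the working directory; the helper `run_if_available` is hypothetical. In HMMER, `-o` directs the summary output to a file, `-Z` fixes the number of comparisons used when calculating E-values, `-E` sets the reporting E-value threshold, and `--cpu` sets the number of worker threads.

```python
import shlex
import shutil
import subprocess

# Command lines copied from the family details table.
BUILD = "hmmbuild -o /dev/null HMM SEED"
SEARCH = "hmmsearch -Z 80369284 -E 1000 --cpu 4 HMM pfamseq"


def run_if_available(command):
    """Run a command line if its executable is on PATH.

    Returns (argv, return_code); return_code is None when nothing was run.
    """
    argv = shlex.split(command)
    if shutil.which(argv[0]) is None:
        return argv, None
    return argv, subprocess.run(argv, check=False).returncode
```

Calling `run_if_available(BUILD)` and then `run_if_available(SEARCH)` would reproduce the two steps in order. Fixing `-Z` keeps E-values comparable when the search is run against a subset of the sequence database.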
This visualisation provides a simple graphical representation of the distribution of this family across species. The original interactive tree is also available.
This chart is a modified "sunburst" visualisation of the species tree for this family. It shows each node in the tree as a separate arc, arranged radially with the superkingdoms at the centre and the species arrayed around the outermost ring.
How the sunburst is generated
The tree is built by considering the taxonomic lineage of each sequence that has a match to this family. For each node in the resulting tree, we draw an arc in the sunburst. The radius of the arc, its distance from the root node at the centre of the sunburst, shows the taxonomic level ("superkingdom", "kingdom", etc). The length of the arc represents either the number of sequences represented at a given level, or the number of species that are found beneath the node in the tree. The weighting scheme can be changed using the sunburst controls.
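The construction described above can be sketched as follows. This is an illustration under stated assumptions, not the site's implementation: each species is given as a lineage tuple with a sequence count, the ring index equals the depth in the lineage, and the angular extent of an arc is proportional to the chosen weight. The toy lineages are shortened to three levels and the counts are invented.

```python
from collections import defaultdict


def sunburst_arcs(lineages, weight="sequences"):
    """Return {path: (ring, start_deg, end_deg)} for every node in the tree.

    lineages: list of (lineage_tuple, sequence_count), one entry per species.
    weight:   "sequences" sizes arcs by sequence count, "species" by species count.
    """
    totals = defaultdict(float)
    for lineage, n_seq in lineages:
        value = n_seq if weight == "sequences" else 1
        for depth in range(1, len(lineage) + 1):
            totals[lineage[:depth]] += value

    grand_total = sum(v for path, v in totals.items() if len(path) == 1)
    next_free = {(): 0.0}      # next unused angle inside each parent arc
    arcs = {}
    for path in sorted(totals):            # a parent sorts before its children
        parent = path[:-1]
        start = next_free[parent]
        end = start + 360.0 * totals[path] / grand_total
        arcs[path] = (len(path), start, end)
        next_free[path] = start            # children begin where this arc begins
        next_free[parent] = end
    return arcs


lineages = [
    (("Eukaryota", "Metazoa", "Homo sapiens"), 30),
    (("Eukaryota", "Viridiplantae", "Arabidopsis thaliana"), 10),
    (("Archaea", "Euryarchaeota", "Methanothermus fervidus"), 20),
]
arcs = sunburst_arcs(lineages)
for path, (ring, start, end) in arcs.items():
    print(ring, path[-1], round(start, 1), round(end, 1))
```

Switching `weight` to `"species"` gives every species the same angular share, which is the alternative weighting scheme mentioned in the text.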
In order to reduce the complexity of the representation, we reduce the number of taxonomic levels that we show. We consider only the following eight major taxonomic levels:
Colouring and labels
Segments of the tree are coloured approximately according to their superkingdom. For example, archaeal branches are coloured with shades of orange, eukaryotes in shades of purple, etc. The colour assignments are shown under the sunburst controls. Where space allows, the name of the taxonomic level will be written on the arc itself.
As you move your mouse across the sunburst, the current node will be highlighted. In the top section of the controls panel we show a summary of the lineage of the currently highlighted node. If you pause over an arc, a tooltip will be shown, giving the name of the taxonomic level in the title and a summary of the number of sequences and species below that node in the tree.
Anomalies in the taxonomy tree
There are some situations that the sunburst tree cannot easily handle and for which we have work-arounds in place.
Missing taxonomic levels
Some species in the taxonomic tree may not have one or more of the main eight levels that we display. For example, Bos taurus is not assigned an order in the NCBI taxonomic tree. In such cases we mark the omitted level with, for example, "No order", in both the tooltip and the lineage summary.
Unmapped species names
The tree is built by looking at each sequence in the full alignment for the family. We take the name of the species given by UniProt and try to map that to the full taxonomic tree from NCBI. In some cases, the name chosen by UniProt does not map to any node in the NCBI tree, perhaps because the chosen name is listed as a synonym or a misspelling in the NCBI taxonomy.
So that these nodes are not simply omitted from the sunburst tree, we group them together in a separate branch (or segment of the sunburst tree). Since we cannot determine the lineage for these unmapped species, we show all levels between the superkingdom and the species as "uncategorised".
Since we reduce the species tree to only the eight main taxonomic levels, sequences that are mapped to the sub-species level in the tree would not normally be shown. Rather than leave out these species, we map them instead to their parent species. So, for example, for sequences belonging to one of the Vibrio cholerae sub-species in the NCBI taxonomy, we show them instead as belonging to the species Vibrio cholerae.
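The three work-arounds above can be combined in one normalisation step, sketched here. The names of the eight levels, the `UNMAPPED_ROOT` label, the function name and the toy taxonomy entry are all assumptions made for illustration; only the "No order" style placeholder, the "uncategorised" label and the mapping of sub-species to their parent species come from the text.

```python
# Assumed names for the eight displayed levels.
LEVELS = ["superkingdom", "kingdom", "phylum", "class",
          "order", "family", "genus", "species"]
UNMAPPED_ROOT = "Unmapped"   # hypothetical label for the separate branch


def displayed_lineage(name, taxonomy):
    """Return the eight-level lineage shown for a UniProt species name.

    taxonomy: {species_name: {rank: taxon_name}}, standing in for the NCBI tree.
    """
    ranks = taxonomy.get(name)
    if ranks is None:
        # Unmapped name: levels between superkingdom and species are unknown.
        return [UNMAPPED_ROOT] + ["uncategorised"] * 6 + [name]
    # Ranks outside LEVELS (e.g. sub-species) are ignored, so a sub-species
    # entry is shown under its parent species. Missing ranks get "No <level>".
    return [ranks.get(level, f"No {level}") for level in LEVELS]


toy_taxonomy = {
    "Vibrio cholerae O1": {
        "superkingdom": "Bacteria", "phylum": "Proteobacteria",
        "class": "Gammaproteobacteria", "order": "Vibrionales",
        "family": "Vibrionaceae", "genus": "Vibrio",
        "species": "Vibrio cholerae", "subspecies": "Vibrio cholerae O1",
    },
}
print(displayed_lineage("Vibrio cholerae O1", toy_taxonomy))
print(displayed_lineage("Some unknown name", toy_taxonomy))
```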
Too many species/sequences
For large species trees, you may see blank regions in the outer layers of the sunburst. These occur when there are large numbers of arcs to be drawn in a small space. If an arc is less than approximately one pixel wide, it will not be drawn and the space will be left blank. You may still be able to get some information about the species in that region by moving your mouse across the area, but since each arc will be very small, it will be difficult to accurately locate a particular species.
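The one-pixel cut-off can be expressed as a simple check. The sketch assumes that the width of an arc is measured as its length along the ring (angle in radians multiplied by the ring radius in pixels); how the site actually measures it is not stated, and the function name and radius value are illustrative.

```python
import math


def is_drawn(start_deg, end_deg, radius_px, min_width_px=1.0):
    """Return True if the arc is at least min_width_px long along its ring."""
    arc_length = math.radians(end_deg - start_deg) * radius_px
    return arc_length >= min_width_px


print(is_drawn(0.0, 0.1, radius_px=200))   # about 0.35 px, left blank
print(is_drawn(0.0, 1.0, radius_px=200))   # about 3.5 px, drawn
```

Under this assumption the same angular extent is more likely to be drawn on an outer ring than on an inner one, because the arc length grows with the radius.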
The tree shows the occurrence of this domain across different species.
We show the species tree in one of two ways. For smaller trees we try to show an interactive representation, which allows you to select specific nodes in the tree and view them as an alignment or as a set of Pfam domain graphics.
Unfortunately we have found that there are problems viewing the interactive tree when it becomes larger than a certain limit. Furthermore, we have found that Internet Explorer can become unresponsive when viewing some trees, regardless of their size. We therefore show a text representation of the species tree when the size is above a certain limit or if you are using Internet Explorer to view the site.
If you are using IE you can still load the interactive tree by clicking the "Generate interactive tree" button, but please be aware of the potential problems that the interactive species tree can cause.
For all of the domain matches in a full alignment, we count the number that are found on all sequences in the alignment. This total is shown in the purple box.
We also count the number of unique sequences on which each domain is found, which is shown in green. Note that a domain may appear multiple times on the same sequence, leading to the difference between these two numbers.
Finally, we group sequences from the same organism according to the NCBI code that is assigned by UniProt, allowing us to count the number of distinct species in which the domain is found. This value is shown in the pink boxes.
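The three summary counts can be illustrated with a short sketch. It assumes each domain match is available as a pair of sequence identifier and NCBI taxon code; the function name and the sequence identifiers are hypothetical.

```python
def summarise_matches(matches):
    """Count domain matches, unique sequences and distinct organisms.

    matches: iterable of (sequence_id, ncbi_taxon_code), one entry per match.
    """
    matches = list(matches)
    return {
        "domains": len(matches),
        "sequences": len({seq for seq, _ in matches}),
        "species": len({taxon for _, taxon in matches}),
    }


toy_matches = [
    ("seq1", 9606),   # two matches on the same sequence
    ("seq1", 9606),
    ("seq2", 9606),
    ("seq3", 4932),
]
summary = summarise_matches(toy_matches)
print(summary)
```

In this toy input the domain count exceeds the sequence count because one sequence carries the domain twice, which is the difference between the first two numbers described above.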
We use the NCBI species tree to group organisms according to their taxonomy and this forms the structure of the displayed tree. Note that in some cases the trees are too large (have too many nodes) to allow us to build an interactive tree, but in most cases you can still view the tree in a plain text, non-interactive representation. Those species which are represented in the seed alignment for this domain are highlighted.
You can use the tree controls to manipulate how the interactive tree is displayed:
- show/hide the summary boxes
- highlight species that are represented in the seed alignment
- expand/collapse the tree or expand it to a given depth
- select a sub-tree or a set of species within the tree and view them graphically or as an alignment
- save a plain text representation of the tree
Please note: for large trees this can take some time. While the tree is loading, you can safely switch away from this tab but if you browse away from the family page entirely, the tree will not be loaded.
There are 15 interactions for this family.
We determine these interactions using iPfam, which considers the interactions between residues in three-dimensional protein structures and maps those interactions back to Pfam families. You can find more information about the iPfam algorithm in the journal article that accompanies the website.
For those sequences which have a structure in the Protein DataBank, we use the mapping between UniProt, PDB and Pfam coordinate systems from the PDBe group, to allow us to map Pfam domains onto UniProt sequences and three-dimensional protein structures. The table below shows the structures on which the Histone domain has been found. There are 903 instances of this domain found in the PDB. Note that there may be multiple copies of the domain in a single PDB structure, since many structures contain multiple copies of the same protein sequence.