Summary: Small cytokines (intercrine/chemokine), interleukin-8 like
Pfam includes annotations and additional family information from a range of different sources. These sources can be accessed via the tabs below.
This is the Wikipedia entry entitled "Chemokine".
The Wikipedia text that you see displayed here is a download from Wikipedia. This means that the information we display is a copy of the information from the Wikipedia database. The button next to the article title ("Edit Wikipedia article") takes you to the edit page for the article directly within Wikipedia. You should be aware you are not editing our local copy of this information. Any changes that you make to the Wikipedia article will not be displayed here until we next download the article from Wikipedia. We currently download new content on a nightly basis.
Does Pfam agree with the content of the Wikipedia entry?
Pfam has chosen to link families to Wikipedia articles. In some cases we have created or edited these articles but in many other cases we have not made any direct contribution to the content of the article. The Wikipedia community does monitor edits to try to ensure that (a) the quality of article annotation increases, and (b) vandalism is very quickly dealt with. However, we would like to emphasise that Pfam does not curate the Wikipedia entries and we cannot guarantee the accuracy of the information on the Wikipedia page.
Editing Wikipedia articles
Before you edit for the first time
Wikipedia is a free, online encyclopedia. Although anyone can edit or contribute to an article, Wikipedia has some strong editing guidelines and policies, which promote the Wikipedia standard of style and etiquette. Your edits and contributions are more likely to be accepted (and remain) if they are in accordance with this policy.
You should take a few minutes to view the following pages:
How your contribution will be recorded
Anyone can edit a Wikipedia entry. You can do this either as a new user or you can register with Wikipedia and log on. When you click on the "Edit Wikipedia article" button, your browser will direct you to the edit page for this entry in Wikipedia. If you are a registered user and currently logged in, your changes will be recorded under your Wikipedia user name. However, if you are not a registered user or are not logged on, your changes will be logged under your computer's IP address. This has two main implications. Firstly, as a registered Wikipedia user your edits are more likely to be seen as a valuable contribution (although all edits are open to community scrutiny regardless). Secondly, if you edit under an IP address you may be sharing this IP address with other users. If your IP address has previously been blocked (due to being flagged as a source of 'vandalism') your edits will also be blocked. You can find more information on this and creating a user account at Wikipedia.
If you have problems editing a particular page, contact us at email@example.com and we will try to help.
The community annotation is a new facility of the Pfam web site. If you have problems editing or experience problems with these pages please contact us.
Solution structure of interleukin-8, a chemokine of the CXC subfamily
Chemokines (Greek -kinos, movement) are a family of small cytokines, or signaling proteins secreted by cells. Their name is derived from their ability to induce directed chemotaxis in nearby responsive cells; they are chemotactic cytokines.
Cytokine proteins are classified as chemokines according to behavior and structural characteristics. In addition to being known for mediating chemotaxis, chemokines are all approximately 8-10 kilodaltons in mass and have four cysteine residues in conserved locations that are key to forming their 3-dimensional shape.
These proteins have historically been known under several other names including the SIS family of cytokines, SIG family of cytokines, SCY family of cytokines, Platelet factor-4 superfamily or intercrines. Some chemokines are considered pro-inflammatory and can be induced during an immune response to recruit cells of the immune system to a site of infection, while others are considered homeostatic and are involved in controlling the migration of cells during normal processes of tissue maintenance or development. Chemokines are found in all vertebrates, some viruses and some bacteria, but none have been described for invertebrates.
Chemokines have been classified into four main subfamilies: CXC, CC, CX3C and XC. All of these proteins exert their biological effects by interacting with G protein-linked transmembrane receptors called chemokine receptors, that are selectively found on the surfaces of their target cells.
The major role of chemokines is to act as a chemoattractant to guide the migration of cells. Cells that are attracted by chemokines follow a signal of increasing chemokine concentration towards the source of the chemokine. Some chemokines control cells of the immune system during processes of immune surveillance, such as directing lymphocytes to the lymph nodes so they can screen for invasion of pathogens by interacting with antigen-presenting cells residing in these tissues. These are known as homeostatic chemokines and are produced and secreted without any need to stimulate their source cell(s). Some chemokines have roles in development; they promote angiogenesis (the growth of new blood vessels), or guide cells to tissues that provide specific signals critical for cellular maturation. Other chemokines are inflammatory and are released from a wide variety of cells in response to bacterial infection, viruses and agents that cause physical damage such as silica or the urate crystals that occur in gout. Their release is often stimulated by pro-inflammatory cytokines such as interleukin 1. Inflammatory chemokines function mainly as chemoattractants for leukocytes, recruiting monocytes, neutrophils and other effector cells from the blood to sites of infection or tissue damage. Certain inflammatory chemokines activate cells to initiate an immune response or promote wound healing. They are released by many different cell types and serve to guide cells of both innate immune system and adaptive immune system.
Types by function
Chemokines are functionally divided into two groups:
- Homeostatic: these are constitutively produced in certain tissues and are responsible for basal leukocyte migration. They include CCL14, CCL19, CCL20, CCL21, CCL25, CCL27, CXCL12 and CXCL13. This classification is not strict; for example, CCL20 can also act as a pro-inflammatory chemokine.
- Inflammatory: these are formed under pathological conditions (in response to pro-inflammatory stimuli such as IL-1, TNF-alpha, LPS or viruses) and actively participate in the inflammatory response by attracting immune cells to the site of inflammation. Examples are CXCL8, CCL2, CCL3, CCL4, CCL5, CCL11 and CXCL10.
Basal: homeostatic chemokines are produced constitutively in the thymus and lymphoid tissues. Their homeostatic function in homing is best exemplified by the chemokines CCL19 and CCL21 (expressed within lymph nodes and on lymphatic endothelial cells) and their receptor CCR7 (expressed on cells destined to home to these organs). These ligands make it possible to route antigen-presenting cells (APCs) to lymph nodes during the adaptive immune response. Other homeostatic chemokine receptors include CCR9, CCR10 and CXCR5, which are important as part of the cell addresses for tissue-specific homing of leukocytes. CCR9 supports the migration of leukocytes into the intestine, CCR10 to the skin, and CXCR5 supports the migration of B cells to the follicles of lymph nodes. In addition, CXCL12 (SDF-1), which is constitutively produced in the bone marrow, promotes proliferation of progenitor B cells in the bone marrow microenvironment.
Inflammatory: inflammatory chemokines are produced in high concentrations during infection or injury and determine the migration of inflammatory leukocytes into the damaged area. Typical inflammatory chemokines include CCL2, CCL3, CCL5, CXCL1, CXCL2 and CXCL8. A typical example is CXCL8, which acts as a chemoattractant for neutrophils. In contrast to the homeostatic chemokine receptors, there is significant promiscuity (redundancy) in the binding between inflammatory chemokines and their receptors. This often complicates research on receptor-specific therapeutics in this area.
Types by cell attracted
- Monocytes / macrophages: the key chemokines that attract these cells to the site of inflammation include: CCL2, CCL3, CCL5, CCL7, CCL8, CCL13, CCL17 and CCL22.
- T-lymphocytes: the four key chemokines that are involved in the recruitment of T lymphocytes to the site of inflammation are CCL2, CCL1, CCL22 and CCL17. Furthermore, CXCR3 expression by T cells is induced following T-cell activation, and activated T cells are attracted to sites of inflammation where the IFN-γ-inducible chemokines CXCL9, CXCL10 and CXCL11 are secreted.
- Mast cells: these express several chemokine receptors on their surface: CCR1, CCR2, CCR3, CCR4, CCR5, CXCR2 and CXCR4. CCL2 and CCL5, which are ligands of these receptors, play an important role in mast cell recruitment and activation in the lung. There is also evidence that CXCL8 might be inhibitory of mast cells.
- Eosinophils: the migration of eosinophils into various tissues involves several chemokines of the CC family: CCL11, CCL24, CCL26, CCL5, CCL7, CCL13 and CCL3. The chemokines CCL11 (eotaxin) and CCL5 (RANTES) act through a specific receptor, CCR3, on the surface of eosinophils, and eotaxin plays an essential role in the initial recruitment of eosinophils into the lesion.
- Neutrophils: these are regulated primarily by CXC chemokines. For example, CXCL8 (IL-8) is a chemoattractant for neutrophils and also activates their metabolism and degranulation.
Proteins are classified into the chemokine family based on their structural characteristics, not just their ability to attract cells. All chemokines are small, with a molecular mass of between 8 and 10 kDa. They are approximately 20-50% identical to each other; that is, they share gene sequence and amino acid sequence homology. They all also possess conserved amino acids that are important for creating their 3-dimensional or tertiary structure, such as (in most cases) four cysteines that interact with each other in pairs to create a Greek key shape that is a characteristic of chemokines. Intramolecular disulfide bonds typically join the first to third, and the second to fourth cysteine residues, numbered as they appear in the protein sequence of the chemokine. Typical chemokine proteins are produced as pro-peptides, beginning with a signal peptide of approximately 20 amino acids that gets cleaved from the active (mature) portion of the molecule during the process of its secretion from the cell. The first two cysteines, in a chemokine, are situated close together near the N-terminal end of the mature protein, with the third cysteine residing in the centre of the molecule and the fourth close to the C-terminal end. A loop of approximately ten amino acids follows the first two cysteines and is known as the N-loop. This is followed by a single-turn helix, called a 3₁₀-helix, three β-strands and a C-terminal α-helix. These helices and strands are connected by turns called 30s, 40s and 50s loops; the third and fourth cysteines are located in the 30s and 50s loops.
Types by structure
|Name||Gene||Other name(s)||Receptor||UniProt|
|CCL8||Scya8||MCP-2||CCR1, CCR2B, CCR5||P80075|
|CCL9/CCL10||Scya9||MRP-2, CCF18, MIP-1γ||CCR1||P51670|
|CCL11||Scya11||Eotaxin||CCR2, CCR3, CCR5||P51671|
|CCL13||Scya13||MCP-4, NCC-1, Ckβ10||CCR2, CCR3, CCR5||Q99616|
|CCL14||Scya14||HCC-1, MCIF, Ckβ1, NCC-2, CCL||CCR1||Q16627|
|CCL15||Scya15||Leukotactin-1, MIP-5, HCC-2, NCC-3||CCR1, CCR3||Q16663|
|CCL16||Scya16||LEC, NCC-4, LMC, Ckβ12||CCR1, CCR2, CCR5, CCR8||O15467|
|CCL17||Scya17||TARC, dendrokine, ABCD-2||CCR4||Q92583|
|CCL18||Scya18||PARC, DC-CK1, AMAC-1, Ckβ7, MIP-4|| ||P55774|
|CCL19||Scya19||ELC, Exodus-3, Ckβ11||CCR7||Q99731|
|CCL20||Scya20||LARC, Exodus-1, Ckβ4||CCR6||P78556|
|CCL21||Scya21||SLC, 6Ckine, Exodus-2, Ckβ9, TCA-4||CCR7||O00585|
|CCL23||Scya23||MPIF-1, Ckβ8, MIP-3||CCR1||P55773|
|CCL24||Scya24||Eotaxin-2, MPIF-2, Ckβ6||CCR3||O00175|
|CCL26||Scya26||Eotaxin-3, MIP-4α, IMAC, TSC-1||CCR3||Q9Y258|
|CCL27||Scya27||CTACK, ILC, Eskine, PESKY, skinkine||CCR10||Q9Y4X3|
|CXCL1||Scyb1||Gro-α, GRO1, NAP-3, KC||CXCR2||P09341|
|CXCL2||Scyb2||Gro-β, GRO2, MIP-2α||CXCR2||P19875|
|CXCL3||Scyb3||Gro-γ, GRO3, MIP-2β||CXCR2||P19876|
|CXCL7||Scyb7||NAP-2, CTAPIII, β-Ta, PEP|| ||P02775|
|CXCL8||Scyb8||IL-8, NAP-1, MDNCF, GCP-1||CXCR1, CXCR2||P10145|
|CXCL11||Scyb11||I-TAC, β-R1, IP-9||CXCR3, CXCR7||O14625|
|CXCL12||Scyb12||SDF-1, PBSF||CXCR4, CXCR7||P48061|
|XCL1||Scyc1||Lymphotactin α, SCM-1α, ATAC||XCR1||P47992|
|XCL2||Scyc2||Lymphotactin β, SCM-1β||XCR1||Q9UBD3|
|CX3CL1||Scyd1||Fractalkine, Neurotactin, ABCD-3||CX3CR1||P78423|
Members of the chemokine family are divided into four groups depending on the spacing of their first two cysteine residues. Thus the nomenclature for chemokines is, e.g.: CCL1 for the ligand 1 of the CC-family of chemokines, and CCR1 for its respective receptor.
The CC chemokine (or β-chemokine) proteins have two adjacent cysteines (amino acids), near their amino terminus. There have been at least 27 distinct members of this subgroup reported for mammals, called CC chemokine ligands (CCL)-1 to -28; CCL10 is the same as CCL9. Chemokines of this subfamily usually contain four cysteines (C4-CC chemokines), but a small number of CC chemokines possess six cysteines (C6-CC chemokines). C6-CC chemokines include CCL1, CCL15, CCL21, CCL23 and CCL28. CC chemokines induce the migration of monocytes and other cell types such as NK cells and dendritic cells.
The two N-terminal cysteines of CXC chemokines (or α-chemokines) are separated by one amino acid, represented in this name with an "X". There have been 17 different CXC chemokines described in mammals, that are subdivided into two categories, those with a specific amino acid sequence (or motif) of glutamic acid-leucine-arginine (or ELR for short) immediately before the first cysteine of the CXC motif (ELR-positive), and those without an ELR motif (ELR-negative). ELR-positive CXC chemokines specifically induce the migration of neutrophils, and interact with chemokine receptors CXCR1 and CXCR2. An example of an ELR-positive CXC chemokine is interleukin-8 (IL-8), which induces neutrophils to leave the bloodstream and enter into the surrounding tissue. Other CXC chemokines that lack the ELR motif, such as CXCL13, tend to be chemoattractant for lymphocytes. CXC chemokines bind to CXC chemokine receptors, of which seven have been discovered to date, designated CXCR1-7.
The third group of chemokines is known as the C chemokines (or γ chemokines), and is unlike all other chemokines in that it has only two cysteines; one N-terminal cysteine and one cysteine downstream. Two chemokines have been described for this subgroup and are called XCL1 (lymphotactin-α) and XCL2 (lymphotactin-β).
A fourth group has also been discovered; its members have three amino acids between the first two cysteines and are termed CX3C chemokines (or δ-chemokines). The only CX3C chemokine discovered to date is called fractalkine (or CX3CL1). It is both secreted and tethered to the surface of the cell that expresses it, thereby serving as both a chemoattractant and as an adhesion molecule.
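The subfamily rules described above (spacing of the first two cysteines, the two-cysteine C chemokines, and the ELR motif of some CXC chemokines) can be sketched in a few lines of Python. This is only an illustration: the function names and the toy sequences are hypothetical, the input is assumed to be the mature protein sequence (signal peptide already removed), and real classification relies on curated alignments rather than a simple pattern match.

```python
import re

def classify_chemokine(mature_seq):
    """Guess the subfamily from the spacing of the first two cysteines.

    Assumes `mature_seq` is a mature (signal-peptide-free) sequence in
    one-letter amino acid code. Returns "CC", "CXC", "CX3C", "XC" or "unknown".
    """
    seq = mature_seq.upper()
    first = seq.find("C")
    if first == -1:
        return "unknown"
    tail = seq[first:]
    if tail.startswith("CC"):
        return "CC"            # two adjacent cysteines
    if re.match(r"C[^C]C", tail):
        return "CXC"           # one residue between the cysteines
    if re.match(r"C[^C]{3}C", tail):
        return "CX3C"          # three residues between the cysteines
    if seq.count("C") == 2:
        return "XC"            # C chemokines have only two cysteines
    return "unknown"

def is_elr_positive(mature_seq):
    """True if Glu-Leu-Arg sits immediately before the first cysteine."""
    seq = mature_seq.upper()
    first = seq.find("C")
    return first >= 3 and seq[first - 3:first] == "ELR"

# Toy sequences, invented purely for illustration (not real proteins).
for toy in ["AAELRCACAAAACAAAC", "AACCAAAACAAAC",
            "AACAAACAAACAAAC", "AACAAAAAAAACAAA"]:
    print(toy, classify_chemokine(toy), is_elr_positive(toy))
```

The checks run from the tightest spacing to the loosest, so a sequence is only labelled XC when none of the two-cysteine N-terminal motifs is present.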
Chemokine receptors are G protein-coupled receptors containing 7 transmembrane domains that are found on the surface of leukocytes. Approximately 19 different chemokine receptors have been characterized to date, which are divided into four families depending on the type of chemokine they bind: CXCR that bind CXC chemokines, CCR that bind CC chemokines, CX3CR1 that binds the sole CX3C chemokine (CX3CL1), and XCR1 that binds the two XC chemokines (XCL1 and XCL2). They share many structural features; they are similar in size (with about 350 amino acids), have a short, acidic N-terminal end, seven helical transmembrane domains with three intracellular and three extracellular hydrophilic loops, and an intracellular C-terminus containing serine and threonine residues important for receptor regulation. The first two extracellular loops of chemokine receptors each have a conserved cysteine residue that allows formation of a disulfide bridge between these loops. G proteins are coupled to the C-terminal end of the chemokine receptor to allow intracellular signaling after receptor activation, while the N-terminal domain of the chemokine receptor determines ligand binding specificity.
Chemokine receptors associate with G proteins to transmit cell signals following ligand binding. Activation of G proteins by chemokine receptors causes the subsequent activation of an enzyme known as phospholipase C (PLC). PLC cleaves a molecule called phosphatidylinositol (4,5)-bisphosphate (PIP2) into two second messenger molecules known as inositol trisphosphate (IP3) and diacylglycerol (DAG) that trigger intracellular signaling events; DAG activates another enzyme called protein kinase C (PKC), and IP3 triggers the release of calcium from intracellular stores. These events promote many signaling cascades (such as the MAP kinase pathway) that generate responses like chemotaxis, degranulation, release of superoxide anions and changes in the avidity of cell adhesion molecules called integrins within the cell harbouring the chemokine receptor.
The discovery that the β chemokines RANTES, MIP (macrophage inflammatory proteins) 1α and 1β (now known as CCL5, CCL3 and CCL4 respectively) suppress HIV-1 provided the initial connection and indicated that these molecules might control infection as part of immune responses in vivo, and that sustained delivery of such inhibitors has the capacity for long-term infection control. The association of chemokine production with antigen-induced proliferative responses, more favorable clinical status in HIV infection, as well as with an uninfected status in subjects at risk for infection suggests a positive role for these molecules in controlling the natural course of HIV infection.
This tab holds the annotation information that is stored in the Pfam database. As we move to using Wikipedia as our main source of annotation, the contents of this tab will be gradually replaced by the Wikipedia tab.
Small cytokines (intercrine/chemokine), interleukin-8 like
Includes a number of secreted growth factors and interferons involved in mitogenic, chemotactic, and inflammatory activity. Structure contains two highly conserved disulfide bonds.
External database links
This tab holds annotation information from the InterPro database.
InterPro entry IPR001811
Many low-molecular weight factors secreted by cells including fibroblasts, macrophages and endothelial cells, in response to a variety of stimuli such as growth factors, interferons, viral transformation and bacterial products, are structurally related [PUBMED:1910690, PUBMED:2149646, PUBMED:2687068]. Most members of this family of proteins seem to have mitogenic, chemotactic or inflammatory activities. These small cytokines are also called intercrines or chemokines. They are cationic proteins of 70 to 100 amino acid residues that share four conserved cysteine residues involved in two disulphide bonds, as shown in the following schematic representation:
                        +------------------------------------+
                        |                                    |
xxxxxxxxxxxxxxxxxxxxxxCxCxxxxxxxxxxxxxxxxxxxxxxxCxxxxxxxxxxxxCxxxxx
                      |                         |
                      +-------------------------+
'C': conserved cysteine involved in a disulphide bond.
Chemokines can be sorted into main groups based on the spacing of the two amino-terminal cysteines. In the first group (see INTERPRO), the two cysteines are separated by a single residue (C-x-C), while in the second group (see INTERPRO), they are adjacent (C-C).
The mapping between Pfam and Gene Ontology is provided by InterPro. If you use this data please cite InterPro.
|Cellular component||extracellular region (GO:0005576)|
|Molecular function||chemokine activity (GO:0008009)|
|Biological process||immune response (GO:0006955)|
Below is a listing of the unique domain organisations or architectures in which this domain is found.
The graphic that is shown by default represents the longest sequence with a given architecture. Each row contains the following information:
- the number of sequences which exhibit this architecture
- a textual description of the architecture, e.g. Gla, EGF x 2, Trypsin. This example describes an architecture with one Gla domain, followed by two consecutive EGF domains, and finally a single Trypsin domain
- a link to the page in the Pfam site showing information about the sequence that the graphic describes
- the UniProt description of the protein sequence
- the number of residues in the sequence
- the Pfam graphic itself.
Note that you can see the family page for a particular domain by clicking on the graphic. You can also choose to see all sequences which have a given architecture by clicking on the Show link in each row.
Finally, because some families can be found in a very large number of architectures, we load only the first fifty architectures by default. If you want to see more architectures, click the button at the bottom of the page to load the next set.
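The textual architecture description mentioned in the list above (for example "Gla, EGF x 2, Trypsin") can be reproduced with a short Python sketch. The function name is hypothetical and this is not taken from the Pfam code base; it only illustrates how consecutive repeats of a domain collapse into a single "Name x N" entry.

```python
from itertools import groupby

def describe_architecture(domains):
    """Collapse runs of identical consecutive domain names into 'Name x N'."""
    parts = []
    for name, run in groupby(domains):
        count = sum(1 for _ in run)
        parts.append(name if count == 1 else f"{name} x {count}")
    return ", ".join(parts)

# Domains listed in N- to C-terminal order.
print(describe_architecture(["Gla", "EGF", "EGF", "Trypsin"]))
# prints "Gla, EGF x 2, Trypsin"
```

Only adjacent repeats are merged, so an architecture such as EGF, Gla, EGF keeps its three separate entries.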
We store a range of different sequence alignments for families. As well as the seed alignment from which the family is built, we provide the full alignment, generated by searching the sequence database (reference proteomes) using the family HMM. We also generate alignments using four representative proteomes (RP) sets, the UniProtKB sequence database, the NCBI sequence database, and our metagenomics sequence database.
There are various ways to view or download the sequence alignments that we store. We provide several sequence viewers and a plain-text Stockholm-format file for download.
We make a range of alignments for each Pfam-A family:
- the curated alignment from which the HMM for the family is built
- the alignment generated by searching the sequence database using the HMM
- Representative Proteomes (RPs) at 15%, 35%, 55% and 75% co-membership thresholds
- alignment generated by searching the UniProtKB sequence database using the family HMM
- alignment generated by searching the NCBI sequence database using the family HMM
- alignment generated by searching the metagenomics sequence database using the family HMM
You can see the alignments as HTML or in three different sequence viewers:
- a Java applet developed at the University of Dundee. You will need Java installed before running jalview
- an HTML page showing the whole alignment. Please note: full Pfam alignments can be very large. These HTML views are extremely large and often cause problems for browsers. Please use either jalview or the Pfam viewer if you have trouble viewing the HTML version
- an HTML-based representation of the alignment, coloured according to the posterior-probability (PP) values from the HMM. As for the standard HTML view, heatmap alignments can also be very large and slow to render.
You can download (or view in your browser) a text representation of a Pfam alignment in various formats:
You can also change the order in which sequences are listed in the alignment, change how insertions are represented, alter the characters that are used to represent gaps in sequences and, finally, choose whether to download the alignment or to view it in your browser directly.
You may find that large alignments cause problems for the viewers and the reformatting tool, so we also provide all alignments in Stockholm format. You can download either the plain text alignment, or a gzipped version of it.
We make a range of alignments for each Pfam-A family. You can see a description of each above. You can view these alignments in various ways but please note that some types of alignment are never generated while others may not be available for all families, most commonly because the alignments are too large to handle.
Format an alignment
We make all of our alignments available in Stockholm format. You can download them here as raw, plain text files or as gzip-compressed files.
You can also download a FASTA format file containing the full-length sequences for all sequences in the full alignment.
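A downloaded plain-text Stockholm file can be read with a few lines of Python. The sketch below is a minimal reader written for illustration, not the parser Pfam uses: it skips the `# STOCKHOLM 1.0` header and all `#=GF`, `#=GS`, `#=GR` and `#=GC` markup lines, stops at the `//` terminator, and joins the chunks of interleaved alignments. The sequence names and residues in the example are invented.

```python
def parse_stockholm(text):
    """Return {sequence_name: aligned_sequence} for one Stockholm alignment."""
    alignment = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # header and markup lines
        if line == "//":
            break                         # end of the alignment
        parts = line.split()
        if len(parts) < 2:
            continue                      # ignore malformed lines
        name, chunk = parts[0], "".join(parts[1:])
        # Interleaved files repeat each name; concatenate the pieces.
        alignment[name] = alignment.get(name, "") + chunk
    return alignment

example = """# STOCKHOLM 1.0
#=GF ID Example
seqA CC.AKELR
seqB CCXA-ELR

seqA GGG
seqB GG-
//
"""
for name, seq in parse_stockholm(example).items():
    print(name, seq)
```

For the gzip-compressed download, the same function could be fed the text read through Python's standard `gzip.open(path, "rt")`.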
HMM logos are one way of visualising profile HMMs. Logos provide a quick overview of the properties of an HMM in a graphical form. You can see a more detailed description of HMM logos and find out how you can interpret them here.
If you find these logos useful in your own work, please consider citing the following article:
This page displays the phylogenetic tree for this family's seed alignment. We use FastTree to calculate neighbour-joining trees with a local bootstrap based on 100 resamples (shown next to the tree nodes). FastTree calculates approximately-maximum-likelihood phylogenetic trees from our seed alignment.
Note: You can also download the data file for the tree.
Curation and family details
This section shows the detailed information about the Pfam family. You can see the definitions of many of the terms in this section in the glossary and a fuller explanation of the scoring system that we use in the scores section of the help pages.
|Seed source:||Overington enriched|
|Number in seed:||237|
|Number in full:||3063|
|Average length of the domain:||58.30 aa|
|Average identity of full alignment:||28 %|
|Average coverage of the sequence by the domain:||52.10 %|
|HMM build commands:||build method: hmmbuild -o /dev/null HMM SEED; search method: hmmsearch -Z 45638612 -E 1000 --cpu 4 HMM pfamseq|
|Family (HMM) version:||20|
|Download:||download the raw HMM for this family|
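The build and search commands listed in the table above could be assembled from Python as follows. This is a hedged sketch: the helper names are hypothetical, the file names `HMM`, `SEED` and `pfamseq` are the placeholders used on this page, and actually running the commands assumes that HMMER is installed and that those files exist locally.

```python
import shlex

def hmmbuild_cmd(hmm_path="HMM", seed_path="SEED"):
    """Argument list for building the profile HMM from the seed alignment."""
    return ["hmmbuild", "-o", "/dev/null", hmm_path, seed_path]

def hmmsearch_cmd(hmm_path="HMM", seqdb="pfamseq",
                  z=45638612, evalue=1000, cpus=4):
    """Argument list for searching a sequence database with the HMM.

    -Z fixes the database size used in E-value calculations, -E is the
    reporting E-value threshold and --cpu the number of worker threads.
    """
    return ["hmmsearch", "-Z", str(z), "-E", str(evalue),
            "--cpu", str(cpus), hmm_path, seqdb]

print(shlex.join(hmmbuild_cmd()))
print(shlex.join(hmmsearch_cmd()))

# To execute (requires HMMER on the PATH and the input files):
# import subprocess
# subprocess.run(hmmbuild_cmd(), check=True)
# subprocess.run(hmmsearch_cmd(), check=True)
```

Keeping the commands as argument lists rather than shell strings avoids quoting problems if the file paths contain spaces.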
This visualisation provides a simple graphical representation of the distribution of this family across species.
This chart is a modified "sunburst" visualisation of the species tree for this family. It shows each node in the tree as a separate arc, arranged radially with the superkingdoms at the centre and the species arrayed around the outermost ring.
How the sunburst is generated
The tree is built by considering the taxonomic lineage of each sequence that has a match to this family. For each node in the resulting tree, we draw an arc in the sunburst. The radius of the arc, its distance from the root node at the centre of the sunburst, shows the taxonomic level ("superkingdom", "kingdom", etc). The length of the arc represents either the number of sequences represented at a given level, or the number of species that are found beneath the node in the tree. The weighting scheme can be changed using the sunburst controls.
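The arc calculation described above can be sketched in Python. This is an illustrative reconstruction under stated assumptions, not the site's actual code: the function name is hypothetical, each sequence is represented by a tuple giving its lineage from superkingdom down to species, and the lineages in the example are shortened and partly invented. Each node receives a depth (its ring in the sunburst) and an angular length in degrees proportional to either the number of sequences or the number of species beneath it.

```python
from collections import defaultdict

def sunburst_arcs(lineages, weight_by="sequences"):
    """Map each tree node to (depth, arc length in degrees).

    lineages: one tuple per matching sequence, ordered from the
    superkingdom to the species.
    weight_by: "sequences" or "species".
    """
    seq_counts = defaultdict(int)
    species_sets = defaultdict(set)
    for lineage in lineages:
        lineage = tuple(lineage)
        for depth in range(1, len(lineage) + 1):
            node = lineage[:depth]
            seq_counts[node] += 1           # one more sequence under this node
            species_sets[node].add(lineage) # full lineage identifies the species
    if weight_by == "sequences":
        weights = dict(seq_counts)
    else:
        weights = {node: len(s) for node, s in species_sets.items()}
    # The innermost ring (depth 1) must fill the whole circle.
    total = sum(w for node, w in weights.items() if len(node) == 1)
    if total == 0:
        return {}
    return {node: (len(node), 360.0 * w / total) for node, w in weights.items()}

# Hypothetical, shortened lineages: two human, one mouse, one viral sequence.
example = [
    ("Eukaryota", "Chordata", "Homo sapiens"),
    ("Eukaryota", "Chordata", "Homo sapiens"),
    ("Eukaryota", "Chordata", "Mus musculus"),
    ("Viruses", "Unassigned", "Example virus"),
]
print(sunburst_arcs(example)[("Eukaryota",)])             # (1, 270.0)
print(sunburst_arcs(example, "species")[("Eukaryota",)])  # (1, 240.0)
```

Switching `weight_by` mirrors the choice offered by the sunburst controls: the same tree is drawn, but the eukaryotic arc shrinks from three quarters of the circle to two thirds when species are counted instead of sequences.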
In order to reduce the complexity of the representation, we reduce the number of taxonomic levels that we show. We consider only the following eight major taxonomic levels:
Colouring and labels
Segments of the tree are coloured approximately according to their superkingdom. For example, archaeal branches are coloured with shades of orange, eukaryotes in shades of purple, etc. The colour assignments are shown under the sunburst controls. Where space allows, the name of the taxonomic level will be written on the arc itself.
As you move your mouse across the sunburst, the current node will be highlighted. In the top section of the controls panel we show a summary of the lineage of the currently highlighted node. If you pause over an arc, a tooltip will be shown, giving the name of the taxonomic level in the title and a summary of the number of sequences and species below that node in the tree.
Anomalies in the taxonomy tree
There are some situations that the sunburst tree cannot easily handle and for which we have work-arounds in place.
Missing taxonomic levels
Some species in the taxonomic tree may not have one or more of the main eight levels that we display. For example, Bos taurus is not assigned an order in the NCBI taxonomic tree. In such cases we mark the omitted level with, for example, "No order", in both the tooltip and the lineage summary.
Unmapped species names
The tree is built by looking at each sequence in the full alignment for the family. We take the name of the species given by UniProt and try to map that to the full taxonomic tree from NCBI. In some cases, the name chosen by UniProt does not map to any node in the NCBI tree, perhaps because the chosen name is listed as a synonym or a misspelling in the NCBI taxonomy.
So that these nodes are not simply omitted from the sunburst tree, we group them together in a separate branch (or segment of the sunburst tree). Since we cannot determine the lineage for these unmapped species, we show all levels between the superkingdom and the species as "uncategorised".
Since we reduce the species tree to only the eight main taxonomic levels, sequences that are mapped to the sub-species level in the tree would not normally be shown. Rather than leave out these species, we map them instead to their parent species. So, for example, for sequences belonging to one of the Vibrio cholerae sub-species in the NCBI taxonomy, we show them instead as belonging to the species Vibrio cholerae.
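The two work-arounds above (unmapped names and sub-species) might be sketched as follows. The toy taxonomy table, the function names and the idea that the superkingdom of an unmapped species is supplied by the caller are all assumptions for illustration; the real NCBI taxonomy is far larger and is not queried here.

```python
# Levels between superkingdom and species, assuming eight levels in total.
INTERMEDIATE_LEVELS = 6


def resolve_species(name, taxonomy):
    """Map a name to its species-level node, or None if it is unmapped.

    `taxonomy` maps a taxon name to a (rank, parent_name) pair. Names
    below species level are walked upwards to their parent species.
    """
    if name not in taxonomy:
        return None
    rank, parent = taxonomy[name]
    while rank != "species":
        if parent is None or parent not in taxonomy:
            return None  # reached the top without finding a species
        name = parent
        rank, parent = taxonomy[name]
    return name


def unmapped_lineage(superkingdom, species_name):
    """Lineage used for a species that cannot be placed in the tree."""
    return [superkingdom] + ["uncategorised"] * INTERMEDIATE_LEVELS + [species_name]
```

A caller would first try `resolve_species` and fall back to `unmapped_lineage` only when it returns `None`, so that no sequence is dropped from the sunburst.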
Too many species/sequences
For large species trees, you may see blank regions in the outer layers of the sunburst. These occur when there are large numbers of arcs to be drawn in a small space. If an arc is less than approximately one pixel wide, it will not be drawn and the space will be left blank. You may still be able to get some information about the species in that region by moving your mouse across the area, but since each arc will be very small, it will be difficult to accurately locate a particular species.
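The one-pixel cut-off can be illustrated with a short calculation. This is a sketch under assumptions: the arc width is taken to be its length along the circle (angle in radians multiplied by the radius in pixels), and the function name and the exact threshold of 1.0 are hypothetical.

```python
import math


def arc_is_drawn(angle_deg, radius_px, min_px=1.0):
    """Return True if an arc is wide enough to be drawn.

    The width is approximated as the arc length at the given radius.
    """
    arc_len = math.radians(angle_deg) * radius_px
    return arc_len >= min_px
```

At a radius of 200 pixels, an arc spanning 1 degree is about 3.5 pixels wide and would be drawn, whereas an arc spanning 0.1 degrees is about 0.35 pixels wide and would be left blank.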
The tree shows the occurrence of this domain across different species.
We show the species tree in one of two ways. For smaller trees we try to show an interactive representation, which allows you to select specific nodes in the tree and view them as an alignment or as a set of Pfam domain graphics.
Unfortunately we have found that there are problems viewing the interactive tree when it becomes larger than a certain limit. Furthermore, we have found that Internet Explorer can become unresponsive when viewing some trees, regardless of their size. We therefore show a text representation of the species tree when the size is above a certain limit or if you are using Internet Explorer to view the site.
If you are using IE you can still load the interactive tree by clicking the "Generate interactive tree" button, but please be aware of the potential problems that the interactive species tree can cause.
For all of the domain matches in a full alignment, we count the number that are found on all sequences in the alignment. This total is shown in the purple box.
We also count the number of unique sequences on which each domain is found, which is shown in green. Note that a domain may appear multiple times on the same sequence, leading to the difference between these two numbers.
Finally, we group sequences from the same organism according to the NCBI code that is assigned by UniProt, allowing us to count the number of distinct species in which the domain is found. This value is shown in the pink boxes.
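The three summary counts can be sketched as below. The input format (one pair of sequence accession and NCBI code per domain match) and the function name are assumptions, and the third value is taken to be the number of distinct organisms, which is what grouping by NCBI code yields.

```python
def summarise(matches):
    """Return (domain matches, unique sequences, distinct organisms).

    `matches` holds one (sequence_accession, ncbi_code) pair per domain
    match, so a sequence with two copies of the domain appears twice.
    """
    total_matches = len(matches)
    unique_sequences = {accession for accession, _ in matches}
    organisms = {ncbi_code for _, ncbi_code in matches}
    return total_matches, len(unique_sequences), len(organisms)
```

For example, four matches spread over three sequences from two organisms give the counts 4, 3 and 2; the first two differ because one sequence carries the domain twice.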
We use the NCBI species tree to group organisms according to their taxonomy and this forms the structure of the displayed tree. Note that in some cases the trees are too large (have too many nodes) to allow us to build an interactive tree, but in most cases you can still view the tree in a plain text, non-interactive representation. Those species which are represented in the seed alignment for this domain are highlighted.
You can use the tree controls to manipulate how the interactive tree is displayed:
- show/hide the summary boxes
- highlight species that are represented in the seed alignment
- expand/collapse the tree or expand it to a given depth
- select a sub-tree or a set of species within the tree and view them graphically or as an alignment
- save a plain text representation of the tree
Please note: for large trees this can take some time. While the tree is loading, you can safely switch away from this tab but if you browse away from the family page entirely, the tree will not be loaded.
There are 9 interactions for this family.
We determine these interactions using iPfam, which considers the interactions between residues in three-dimensional protein structures and maps those interactions back to Pfam families. You can find more information about the iPfam algorithm in the journal article that accompanies the website.
For those sequences which have a structure in the Protein Data Bank, we use the mapping between UniProt, PDB and Pfam coordinate systems from the PDBe group to map Pfam domains onto UniProt sequences and three-dimensional protein structures. The table below shows the structures on which the IL8 domain has been found. There are 403 instances of this domain found in the PDB. Note that there may be multiple copies of the domain in a single PDB structure, since many structures contain multiple copies of the same protein sequence.