Summary: Phospholipase D Active site motif
Pfam includes annotations and additional family information from a range of different sources. These sources can be accessed via the tabs below.
This is the Wikipedia entry entitled "Phospholipase D".
The Wikipedia text that you see displayed here is a download from Wikipedia. This means that the information we display is a copy of the information from the Wikipedia database. The button next to the article title ("Edit Wikipedia article") takes you to the edit page for the article directly within Wikipedia. You should be aware you are not editing our local copy of this information. Any changes that you make to the Wikipedia article will not be displayed here until we next download the article from Wikipedia. We currently download new content on a nightly basis.
Does Pfam agree with the content of the Wikipedia entry?
Pfam has chosen to link families to Wikipedia articles. In some cases we have created or edited these articles, but in many other cases we have not made any direct contribution to the content of the article. The Wikipedia community does monitor edits to try to ensure that (a) the quality of article annotation increases, and (b) vandalism is very quickly dealt with. However, we would like to emphasise that Pfam does not curate the Wikipedia entries and we cannot guarantee the accuracy of the information on the Wikipedia page.
Editing Wikipedia articles
Before you edit for the first time
Wikipedia is a free, online encyclopedia. Although anyone can edit or contribute to an article, Wikipedia has some strong editing guidelines and policies, which promote the Wikipedia standard of style and etiquette. Your edits and contributions are more likely to be accepted (and remain) if they are in accordance with this policy.
You should take a few minutes to view the following pages:
How your contribution will be recorded
Anyone can edit a Wikipedia entry. You can do this either as a new user or you can register with Wikipedia and log on. When you click on the "Edit Wikipedia article" button, your browser will direct you to the edit page for this entry in Wikipedia. If you are a registered user and currently logged in, your changes will be recorded under your Wikipedia user name. However, if you are not a registered user or are not logged on, your changes will be logged under your computer's IP address. This has two main implications. Firstly, as a registered Wikipedia user your edits are more likely to be seen as a valuable contribution (although all edits are open to community scrutiny regardless). Secondly, if you edit under an IP address you may be sharing this IP address with other users. If your IP address has previously been blocked (due to being flagged as a source of 'vandalism') your edits will also be blocked. You can find more information on this and on creating a user account at Wikipedia.
If you have problems editing a particular page, contact us at email@example.com and we will try to help.
The community annotation is a new facility of the Pfam web site. If you have problems editing or experience problems with these pages please contact us.
Phospholipase D
|PDB structures||RCSB PDB PDBe PDBsum|
|Gene Ontology||AmiGO / QuickGO|
Phospholipase D (EC 3.1.4.4, lipophosphodiesterase II, lecithinase D, choline phosphatase) (PLD) is an enzyme of the phospholipase superfamily. Phospholipases are found in a wide range of organisms, including bacteria, yeast, plants, animals, and viruses. Phospholipase D’s principal substrate is phosphatidylcholine, which it hydrolyzes to produce the signal molecule phosphatidic acid (PA) and soluble choline. Plants contain numerous genes that encode various PLD isoenzymes, with molecular weights ranging from 90 to 125 kDa. Mammalian cells encode two isoforms of phospholipase D: PLD1 and PLD2. Phospholipase D is an important player in many physiological processes, including membrane trafficking, cytoskeletal reorganization, receptor-mediated endocytosis, exocytosis, and cell migration. Through these processes, it has been further implicated in the pathophysiology of multiple diseases, in particular the progression of Parkinson’s and Alzheimer’s diseases, as well as various cancers.
- 1 Discovery
- 2 Function
- 3 Structure
- 4 Regulation
- 5 Physiological & pathophysiological roles
- 6 Gallery
- 7 References
- 8 External links
PLD-type activity was first reported in 1947 by Donald J. Hanahan and I.L. Chaikoff. It was not until 1975, however, that the hydrolytic mechanism of action was elucidated in mammalian cells. Plant isoforms of PLD were first purified from cabbage and castor bean; PLDα was ultimately cloned and characterized from a variety of plants, including rice, corn, and tomato. Plant PLDs have been cloned in three isoforms: PLDα, PLDβ, and PLDγ. More than half a century of biochemical studies have implicated phospholipase D and PA activity in a wide range of physiological processes and diseases, including inflammation, diabetes, phagocytosis, neuronal & cardiac signaling, and oncogenesis.
Strictly speaking, phospholipase D is a transphosphatidylase: it mediates the exchange of polar headgroups covalently attached to membrane-bound lipids. Utilizing water as a nucleophile, this enzyme catalyzes the cleavage of the phosphodiester bond in structural phospholipids such as phosphatidylcholine and phosphatidylethanolamine. The products of this hydrolysis are the membrane-bound lipid phosphatidic acid (PA), and choline, which diffuses into the cytosol. As choline has little second messenger activity, PLD activity is mostly transduced by the production of PA. PA is heavily involved in intracellular signal transduction. In addition, some members of the PLD superfamily may employ primary alcohols such as ethanol or 1-butanol in the cleavage of the phospholipid, effectively catalyzing the exchange of the polar lipid headgroup. Other members of this family are able to hydrolyze other phospholipid substrates, such as cardiolipin, or even the phosphodiester bond constituting the backbone of DNA.
Many of phospholipase D’s cellular functions are mediated by its principal product, phosphatidic acid (PA). PA is a negatively charged phospholipid, whose small head group promotes membrane curvature. It is thus thought to facilitate membrane-vesicle fusion and fission in a manner analogous to clathrin-mediated endocytosis. PA may also recruit proteins that contain its corresponding binding domain, a region characterized by basic amino acid-rich regions. Additionally, PA can be converted into a number of other lipids, such as lysophosphatidic acid (lyso-PA) or diacylglycerol, signal molecules which have a multitude of effects on downstream cellular pathways. PA and its lipid derivatives are implicated in myriad processes that include intracellular vesicle trafficking, endocytosis, exocytosis, actin cytoskeleton dynamics, cell proliferation, differentiation, and migration.
Mammalian PLD interacts directly with kinases such as PKC, ERK, and TYK and controls their signalling, which suggests that PLD is activated by these kinases. As choline is very abundant in the cell, PLD activity does not significantly affect choline levels, and choline is unlikely to play any role in signalling.
Phosphatidic acid is a signal molecule and acts to recruit SK1 to membranes. PA is extremely short lived and is rapidly hydrolysed by the enzyme phosphatidate phosphatase to form diacylglycerol (DAG). DAG may also be converted to PA by DAG kinase. Although PA and DAG are interconvertible, they do not act in the same pathways. Stimuli that activate PLD do not activate enzymes downstream of DAG and vice versa.
It is possible that, though PA and DAG are interconvertible, separate pools of signalling and non-signalling lipids may be maintained. Studies have suggested that DAG signalling is mediated by polyunsaturated DAG while PLD derived PA is monounsaturated or saturated. Thus functional saturated/monounsaturated PA can be degraded by hydrolysing it to form non-functional saturated/monounsaturated DAG while functional polyunsaturated DAG can be degraded by converting it into non-functional polyunsaturated PA.
Plant and animal PLDs have a consistent molecular structure, characterized by sites of catalysis surrounded by an assortment of regulatory sequences. The active site of PLDs consists of four highly conserved amino acid sequences (I-IV), of which motifs II and IV are particularly conserved. These structural domains contain the distinguishing catalytic sequence HxKxxxxD (HKD), where H, K, and D are the amino acids histidine (H), lysine (K), and aspartic acid (D), while x represents nonconservative amino acids. The two HKD motifs confer hydrolytic activity to PLD, and are critical for its enzymatic activity both in vitro and in vivo. Hydrolysis of the phosphodiester bond occurs when these HKD sequences are in the correct proximity.
Human proteins containing this motif include:
PC-hydrolyzing PLD is a homologue of cardiolipin synthase, phosphatidylserine synthase, bacterial PLDs, and viral proteins. Each of these appears to possess a domain duplication which is apparent by the presence of two HKD motifs containing well-conserved histidine, lysine, and asparagine residues which may contribute to the active site aspartic acid. An Escherichia coli endonuclease (nuc) and similar proteins appear to be PLD homologues but possess only one of these motifs.
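As a rough illustration of the duplication described above, the sketch below scans a protein sequence for HKD-like sites with a regular expression. It assumes the commonly cited HxKxxxxD form of the motif (written `H.K.{4}D`). The function name and both toy sequences are invented for the example and are not real PLD or nuc sequences.

```python
import re

# Assumed form of the HKD signature: H, any residue, K, any four residues, D.
# The lookahead lets overlapping candidate sites be reported as well.
HKD_PATTERN = re.compile(r"(?=(H.K.{4}D))")


def find_hkd_motifs(sequence):
    """Return (1-based start, matched text) for every HKD-like site."""
    sequence = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in HKD_PATTERN.finditer(sequence)]


# Toy sequences, invented for illustration only.
two_motif_protein = "MSTHWKAAAADGGSLLPQHSKLIVVDAN"
one_motif_protein = "MKKHAKFMVIDGGS"

print(find_hkd_motifs(two_motif_protein))
print(find_hkd_motifs(one_motif_protein))
```

A simple pattern match like this only flags candidate sites. The family itself is defined by a profile HMM, which tolerates substitutions that a fixed regular expression would miss.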
PLD genes additionally encode highly conserved regulatory domains: the phox consensus sequence (PX), the pleckstrin homology domain (PH), and a binding site for phosphatidylinositol 4,5-bisphosphate (PIP2).
Mechanism of action
PLD-catalyzed hydrolysis has been proposed to occur in two stages via a "ping-pong" mechanism. In this scheme, the histidine residues of each HKD motif successively attack the phospholipid substrate. Functioning as nucleophiles, the constituent imidazole moieties of the histidines form transient covalent bonds with the phospholipid, producing a short-lived intermediate that can be easily hydrolyzed by water in a subsequent step.
Two major isoforms of phospholipase D have been identified in mammalian cells: PLD1 and PLD2 (53% sequence homology), each encoded by a distinct gene. PLD activity appears to be present in most cell types, with the possible exceptions of peripheral leukocytes and other lymphocytes. Both PLD isoforms require PIP2 as a cofactor for activity. PLD1 and PLD2 exhibit different subcellular localizations that dynamically change in the course of signal transduction. PLD activity has been observed within the plasma membrane, cytosol, ER, and Golgi complex.
PLD1 is a 120 kDa protein that is mainly located on the inner membranes of cells. It is primarily present at the Golgi complex, endosomes, lysosomes, and secretory granules. Upon the binding of an extracellular stimulus, PLD1 is transported to the plasma membrane. Basal PLD1 activity is low, however, and in order to transduce the extracellular signal, it must first be activated by proteins such as Arf, Rho, Rac, and protein kinase C.
In contrast, PLD2 is a 106 kDa protein that primarily localizes to the plasma membrane, residing in light membrane lipid rafts. It has high intrinsic catalytic activity, and is only weakly activated by the above molecules.
The activity of phospholipase D is extensively regulated by hormones, neurotransmitters, lipids, small monomeric GTPases, and other small molecules that bind to their corresponding domains on the enzyme. In most cases, signal transduction is mediated through production of phosphatidic acid, which functions as a second messenger.
Specific phospholipids are regulators of PLD activity in plant and animal cells. Most PLDs require phosphatidylinositol 4,5-bisphosphate (PIP2) as a cofactor for activity. PIP2 and other phosphoinositides are important modifiers of cytoskeletal dynamics and membrane transport. PLDs regulated by these phospholipids are commonly involved in intracellular signal transduction. Their activity is dependent upon the binding of these phosphoinositides near the active site. In plants and animals, this binding site is characterized by the presence of a conserved sequence of basic and aromatic amino acids. In plants such as Arabidopsis thaliana, this sequence is constituted by a RxxxxxKxR motif together with its inverted repeat, where R is arginine and K is lysine. Its proximity to the active site ensures a high level of PLD1 and PLD2 activity, and promotes the translocation of PLD1 to target membranes in response to extracellular signals.
Substrate presentation controls PLD activity. The enzyme resides inactive in lipid micro-domains rich in sphingomyelin and depleted of PC substrate. Activation of PLD causes the enzyme to translocate to PIP2 micro domains near its substrate PC. Hence PLD can be activated by localization within the membrane rather than a protein conformational change.
Calcium acts as a cofactor in PLD isoforms that contain the C2 domain. Binding of Ca2+ to the C2 domain leads to conformational changes in the enzyme that strengthen enzyme-substrate binding, while weakening the association with phosphoinositides. In some plant isoenzymes, such as PLDβ, Ca2+ may bind directly to the active site, indirectly increasing its affinity for the substrate by strengthening the binding of the activator PIP2.
The phox consensus sequence (PX) is thought to mediate the binding of additional phosphatidylinositol phosphates, in particular phosphatidylinositol 5-phosphate (PtdIns5P). This lipid is thought to be required for endocytosis and may help facilitate the reinternalization of PLD1 from the plasma membrane.
The highly conserved pleckstrin homology (PH) domain is a structural domain approximately 120 amino acids in length. It binds phosphoinositides such as phosphatidylinositol (3,4,5)-trisphosphate (PIP3) and phosphatidylinositol (4,5)-bisphosphate (PIP2). It may also bind heterotrimeric G proteins via their βγ-subunit. Binding to this domain is also thought to facilitate the re-internalization of the protein by increasing its affinity for endocytotic lipid rafts.
Interactions with small GTPases
In animal cells, small protein factors are important additional regulators of PLD activity. These small monomeric GTPases are members of the Rho and ARF families of the Ras superfamily. Some of these proteins, such as Rac1, Cdc42, and RhoA, allosterically activate mammalian PLD1, directly increasing its activity. In particular, the translocation of cytosolic ADP-ribosylation factor (ARF) to the plasma membrane is essential for PLD activation.
Physiological & pathophysiological roles
Phospholipase D is a regulator of several critical cellular processes, including vesicle transport, endocytosis, exocytosis, cell migration, and mitosis. Dysregulation of these processes is commonplace in carcinogenesis, and in turn, abnormalities in PLD expression have been implicated in the progression of several types of cancer. A driver mutation conferring elevated PLD2 activity has been observed in several malignant breast cancers. Elevated PLD expression has also been correlated with tumor size in colorectal carcinoma, gastric carcinoma, and renal cancer. However, the molecular pathways through which PLD drives cancer progression remain unclear. One hypothesis assigns phospholipase D a critical role in the activation of mTOR, a suppressor of cancer cell apoptosis. The ability of PLD to suppress apoptosis in cells with elevated tyrosine kinase activity makes it a candidate oncogene in cancers where such expression is typical.
In neurodegenerative diseases
Phospholipase D may also play an important pathophysiological role in the progression of neurodegenerative diseases, primarily through its capacity as a signal transducer in indispensable cellular processes like cytoskeletal reorganization and vesicle trafficking. Dysregulation of PLD by the protein α-synuclein has been shown to lead to the specific loss of dopaminergic neurons in mammals. α-synuclein is the primary structural component of Lewy bodies, protein aggregates that are the hallmarks of Parkinson's disease. Disinhibition of PLD by α-synuclein may contribute to Parkinson's deleterious phenotype.
Abnormal PLD activity has also been suspected in Alzheimer's disease, where it has been observed to interact with presenilin 1 (PS-1), the principal component of the γ-secretase complex responsible for the enzymatic cleavage of amyloid precursor protein (APP). Extracellular plaques of the product β-amyloid are a defining feature of Alzheimer's diseased brains. Action of PLD1 on PS-1 has been shown to affect the intracellular trafficking of the amyloid precursor to this complex. Phospholipase D3 (PLD3), a non-classical and poorly characterized member of the PLD superfamily, has also been associated with the pathogenesis of this disease.
- Jenkins GM, Frohman MA (October 2005). "Phospholipase D: a lipid centric review". Cell Mol Life Sci. 62 (19–20): 2305–16. doi:10.1007/s00018-005-5195-z. PMID 16143829.
- Exton JH (2002). "Phospholipase D-structure, regulation and function". Rev Physiol Biochem Pharmacol. 144: 1–94. doi:10.1007/BFb0116585. PMID 11987824.
- Kolesnikov YS, Nokhrina KP, Kretynin SV, Volotovski ID, Martinec J, Romanov GA, Kravets VS (January 2012). "Molecular structure of phospholipase D and regulatory mechanisms of its activity in plant and animal cells". Biochemistry (Mosc). 77 (1): 1–14. doi:10.1134/S0006297912010014. PMID 22339628.
- Peng X, Frohman MA (February 2012). "Mammalian Phospholipase D Physiological and Pathological Roles". Acta Physiologica. 204 (2): 219–226. doi:10.1111/j.1748-1716.2011.02298.x. PMID 21447092.
- Foster DA (September 2003). "Phospholipase D in cell proliferation and cancer". Mol Cancer Res. 1 (11): 789–800. doi:10.2174/157436206778226941. PMID 14517341.
- Banno Y (2002). "Regulation and Possible Role of Mammalian Phospholipase D in Cellular Functions". Journal of Biochemistry. 131 (3): 301–306. doi:10.1093/oxfordjournals.jbchem.a003103. ISSN 0021-924X.
- McDermott M, Wakelam MJ, Morris AJ (February 2004). "Phospholipase D". Biochem Cell Biol. 82 (1): 225–53. doi:10.1139/o03-079. PMID 15052340.
- Balboa MA, Firestein BL, Godson C, Bell KS, Insel PA (April 1994). "Protein kinase C alpha mediates phospholipase D activation by nucleotides and phorbol ester in Madin-Darby canine kidney cells. Stimulation of phospholipase D is independent of activation of polyphosphoinositide-specific phospholipase C and phospholipase A2". J Biol Chem. 269 (14): 10511–6. PMID 8144636.
- Leiros I, Secundo F, Zambonelli C, Servi S, Hough E (June 2000). "The first crystal structure of a phospholipase D". Structure. 8: 655–67. doi:10.1016/S0969-2126(00)00150-7. PMID 10873862.
- Paruch S, El-Benna J, Djerdjouri B, Marullo S, Périanin A (January 2006). "A role of p44/42 mitogen-activated protein kinases in formyl-peptide receptor-mediated phospholipase D activity and oxidant production". FASEB J. 20 (1): 142–4. doi:10.1096/fj.05-3881fje. PMID 16253958.
- Bocckino S, Blackmore P, Wilson P, Exton J (1987). "Phosphatidate accumulation in hormone-treated hepatocytes via a phospholipase D mechanism". J Biol Chem. 262 (31): 15309–15. PMID 3117799.
- Bocckino S, Wilson P, Exton J (1987). "Ca2+-mobilizing hormones elicit phosphatidylethanol accumulation via phospholipase D activation". FEBS Lett. 225 (1–2): 201–4. doi:10.1016/0014-5793(87)81157-2. PMID 3319693.
- Hodgkin M, Pettitt T, Martin A, Michell R, Pemberton A, Wakelam M (1998). "Diacylglycerols and phosphatidates: which molecular species are intracellular messengers?". Trends Biochem Sci. 23 (6): 200–4. doi:10.1016/S0968-0004(98)01200-6. PMID 9644971.
- Nowicki M, Frentzen M (2005). "Cardiolipin synthase of Arabidopsis thaliana". FEBS Letters. 579 (10): 2161–2165. doi:10.1016/j.febslet.2005.03.007. PMID 15811335.
- Nowicki M (2006). "Characterization of the Cardiolipin Synthase from Arabidopsis thaliana". Ph.D. thesis, RWTH-Aachen University.
- Ponting CP, Kerr ID (1996). "A novel family of phospholipase D homologues that includes phospholipid synthases and putative endonucleases: identification of duplicated repeats and potential active site residues". Protein Sci. 5 (5): 914–922. doi:10.1002/pro.5560050513. PMID 8732763.
- Koonin EV (1996). "A duplicated catalytic motif in a new superfamily of phosphohydrolases and phospholipid synthases that includes poxvirus envelope proteins". Trends Biochem. Sci. 21 (7): 242–243. doi:10.1016/0968-0004(96)30024-8. PMID 8755242.
- Wang X, Xu L, Zheng L (1994). "Cloning and expression of phosphatidylcholine-hydrolyzing phospholipase D from Ricinus communis L". J. Biol. Chem. 269 (32): 20312–20317. PMID 8051126.
- Singer WD, Brown HA, Sternweis PC (1997). "Regulation of eukaryotic phosphatidylinositol-specific phospholipase C and phospholipase D". Annu. Rev. Biochem. 66: 475–509. doi:10.1146/annurev.biochem.66.1.475. PMID 9242915.
- Lindsley CW, Brown HA (January 2012). "Phospholipase D as a therapeutic target in brain disorders". Neuropsychopharmacology. 37 (1): 301–2. doi:10.1038/npp.2011.178. PMID 22157867.
- Petersen EN (2016). "Kinetic disruption of lipid rafts is a mechanosensor for phospholipase D". Nat Commun. 7: 13873. doi:10.1038/ncomms13873. PMID 27976674.
- Cruchaga; et al. (2013). "Rare coding variants in the phospholipase D3 gene confer risk for Alzheimer's disease". Nature. 505: 550–554. doi:10.1038/nature12825. PMID 24336208.
This tab holds the annotation information that is stored in the Pfam database. As we move to using Wikipedia as our main source of annotation, the contents of this tab will be gradually replaced by the Wikipedia tab.
Phospholipase D Active site motif
Phosphatidylcholine-hydrolysing phospholipase D (PLD) isoforms are activated by ADP-ribosylation factors (ARFs). PLD produces phosphatidic acid from phosphatidylcholine, which may be essential for the formation of certain types of transport vesicles or may link constitutive vesicular transport to signal transduction pathways. PC-hydrolysing PLD is a homologue of cardiolipin synthase, phosphatidylserine synthase, bacterial PLDs, and viral proteins. Each of these appears to possess a domain duplication which is apparent by the presence of two motifs containing well-conserved histidine, lysine, and/or asparagine residues which may contribute to the active site aspartic acid. An E. coli endonuclease (nuc) and similar proteins appear to be PLD homologues but possess only one of these motifs. The profile contained here represents only the putative active site regions, since an accurate multiple alignment of the repeat units has not been achieved.
Ponting CP, Kerr ID. Protein Sci 1996;5:914-922. A novel family of phospholipase D homologues that includes phospholipid synthases and putative endonucleases: identification of duplicated repeats and potential active site residues. PUBMED:8732763 EPMC:8732763
Koonin EV. Trends Biochem Sci 1996;21:242-243. A duplicated catalytic motif in a new superfamily of phosphohydrolases and phospholipid synthases that includes poxvirus envelope proteins. PUBMED:8755242 EPMC:8755242
Internal database links
|SCOOP:||DUF1669 PLDc_2 PP_kinase_C RE_NgoFVII Regulator_TrmB|
|Similarity to PfamA using HHSearch:||PLDc_2|
External database links
This tab holds annotation information from the InterPro database.
InterPro entry IPR001736
Phosphatidylcholine-hydrolysing phospholipase D (PLD) isoforms are activated by ADP-ribosylation factors (ARFs). PLD produces phosphatidic acid from phosphatidylcholine, which may be essential for the formation of certain types of transport vesicles or may link constitutive vesicular transport to signal transduction pathways. PC-hydrolysing PLD is a homologue of cardiolipin synthase, phosphatidylserine synthase, bacterial PLDs, and viral proteins. Each of these appears to possess a domain duplication which is apparent by the presence of two motifs containing well-conserved histidine, lysine, and/or asparagine residues which may contribute to the active site aspartic acid. An Escherichia coli endonuclease (nuc) and similar proteins appear to be PLD homologues but possess only one of these motifs [PUBMED:8732763, PUBMED:8755242, PUBMED:8051126, PUBMED:9242915].
The mapping between Pfam and Gene Ontology is provided by InterPro. If you use this data please cite InterPro.
|Molecular function||catalytic activity (GO:0003824)|
Below is a listing of the unique domain organisations or architectures in which this domain is found.
The graphic that is shown by default represents the longest sequence with a given architecture. Each row contains the following information:
- the number of sequences which exhibit this architecture
- a textual description of the architecture, e.g. Gla, EGF x 2, Trypsin. This example describes an architecture with one Gla domain, followed by two consecutive EGF domains, and finally a single Trypsin domain
- a link to the page in the Pfam site showing information about the sequence that the graphic describes
- the UniProt description of the protein sequence
- the number of residues in the sequence
- the Pfam graphic itself.
Note that you can see the family page for a particular domain by clicking on the graphic. You can also choose to see all sequences which have a given architecture by clicking on the Show link in each row.
Finally, because some families can be found in a very large number of architectures, we load only the first fifty architectures by default. If you want to see more architectures, click the button at the bottom of the page to load the next set.
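The textual architecture description can be expanded into an ordered list of domains. The following is a minimal sketch, assuming the description is a comma-separated list in which `NAME x N` means N consecutive copies of a domain. The function name is hypothetical and this is not the site's own code.

```python
import re


def expand_architecture(description):
    """Expand e.g. 'Gla, EGF x 2, Trypsin' into an ordered domain list."""
    domains = []
    for part in description.split(","):
        part = part.strip()
        if not part:
            continue
        # 'EGF x 2' -> name 'EGF', count 2; a bare name counts once.
        match = re.fullmatch(r"(.+?)\s+x\s+(\d+)", part)
        if match:
            domains.extend([match.group(1)] * int(match.group(2)))
        else:
            domains.append(part)
    return domains


print(expand_architecture("Gla, EGF x 2, Trypsin"))
# ['Gla', 'EGF', 'EGF', 'Trypsin']
```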
We store a range of different sequence alignments for families. As well as the seed alignment from which the family is built, we provide the full alignment, generated by searching the sequence database (reference proteomes) using the family HMM. We also generate alignments using four representative proteomes (RP) sets, the UniProtKB sequence database, the NCBI sequence database, and our metagenomics sequence database.
There are various ways to view or download the sequence alignments that we store. We provide several sequence viewers and a plain-text Stockholm-format file for download.
We make a range of alignments for each Pfam-A family:
- Seed: the curated alignment from which the HMM for the family is built
- Full: the alignment generated by searching the sequence database using the HMM
- Representative proteomes: Representative Proteomes (RPs) at 15%, 35%, 55% and 75% co-membership thresholds
- UniProt: alignment generated by searching the UniProtKB sequence database using the family HMM
- NCBI: alignment generated by searching the NCBI sequence database using the family HMM
- Meta: alignment generated by searching the metagenomics sequence database using the family HMM
You can see the alignments as HTML or in three different sequence viewers:
- jalview: a Java applet developed at the University of Dundee. You will need Java installed before running jalview
- HTML: an HTML page showing the whole alignment. Please note: full Pfam alignments can be very large. These HTML views are extremely large and often cause problems for browsers. Please use either jalview or the Pfam viewer if you have trouble viewing the HTML version
- heatmap: an HTML-based representation of the alignment, coloured according to the posterior-probability (PP) values from the HMM. As for the standard HTML view, heatmap alignments can also be very large and slow to render.
You can download (or view in your browser) a text representation of a Pfam alignment in various formats:
You can also change the order in which sequences are listed in the alignment, change how insertions are represented, alter the characters that are used to represent gaps in sequences and, finally, choose whether to download the alignment or to view it in your browser directly.
You may find that large alignments cause problems for the viewers and the reformatting tool, so we also provide all alignments in Stockholm format. You can download either the plain text alignment, or a gzipped version of it.
We make a range of alignments for each Pfam-A family. You can see a description of each above. You can view these alignments in various ways but please note that some types of alignment are never generated while others may not be available for all families, most commonly because the alignments are too large to handle.
Note: PP/heatmap alignments cannot be generated for seed alignments, because no PP data are available.
We make all of our alignments available in Stockholm format. You can download them here as raw, plain text files or as gzip-compressed files.
You can also download a FASTA format file containing the full-length sequences for all sequences in the full alignment.
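For readers who want to work with a downloaded Stockholm file directly, here is a minimal parsing sketch. It handles only the basics of the format (the `# STOCKHOLM 1.0` header, `#=` annotation lines, name and sequence columns, and the closing `//`). The function names and the tiny alignment are invented for the example. Note that the FASTA it produces contains only the aligned regions with gaps removed, not the full-length sequences offered in the FASTA download.

```python
def parse_stockholm(text):
    """Return {sequence name: aligned sequence} from Stockholm-format text."""
    sequences = {}
    for line in text.splitlines():
        line = line.rstrip()
        if not line or line.startswith("#"):
            continue  # blank lines, the header and '#=G?' annotation lines
        if line == "//":
            break  # end of the alignment
        name, chunk = line.split(None, 1)
        # Interleaved files repeat names; concatenate their chunks in order.
        sequences[name] = sequences.get(name, "") + chunk.strip()
    return sequences


def to_unaligned_fasta(sequences):
    """Strip gap characters ('-' and '.') and format the result as FASTA."""
    records = []
    for name, aligned in sequences.items():
        residues = aligned.replace("-", "").replace(".", "").upper()
        records.append(f">{name}\n{residues}")
    return "\n".join(records)


# Tiny made-up alignment, not taken from the real seed alignment.
example = """# STOCKHOLM 1.0
#=GF ID example
seqA/1-8  HWKAAAAD
seqB/1-7  HSK-AAAD
//
"""
alignment = parse_stockholm(example)
print(to_unaligned_fasta(alignment))
```

For real work, an established parser such as the one in Biopython is likely a better choice than this hand-rolled sketch, since Stockholm files carry several kinds of per-file, per-sequence and per-column annotation that are simply skipped here.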
HMM logos are one way of visualising profile HMMs. Logos provide a quick overview of the properties of an HMM in a graphical form. You can see a more detailed description of HMM logos and find out how you can interpret them here.
If you find these logos useful in your own work, please consider citing the following article:
This page displays the phylogenetic tree for this family's seed alignment. We use FastTree to calculate neighbour-joining trees with a local bootstrap based on 100 resamples (shown next to the tree nodes). FastTree calculates approximately-maximum-likelihood phylogenetic trees from our seed alignment.
Note: You can also download the data file for the tree.
Curation and family details
This section shows the detailed information about the Pfam family. You can see the definitions of many of the terms in this section in the glossary and a fuller explanation of the scoring system that we use in the scores section of the help pages.
|Seed source:||Alignment kindly provided by SMART|
|Author:||Ponting C, Schultz J, Bork P|
|Number in seed:||30|
|Number in full:||3452|
|Average length of the domain:||29.70 aa|
|Average identity of full alignment:||43 %|
|Average coverage of the sequence by the domain:||3.99 %|
|HMM build commands:||
build method: hmmbuild -o /dev/null HMM SEED
search method: hmmsearch -Z 26740544 -E 1000 --cpu 4 HMM pfamseq
|Family (HMM) version:||21|
|Download:||download the raw HMM for this family|
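The two options in the search command above interact. In HMMER, a per-sequence E-value is the hit's P-value multiplied by the number of target sequences, which `-Z` fixes at 26740544 here, and `-E 1000` sets the E-value cutoff for reporting hits. The sketch below only illustrates that arithmetic; the P-values are invented and the function names are hypothetical.

```python
Z = 26740544       # value passed to hmmsearch -Z in the search command
E_CUTOFF = 1000.0  # value passed to hmmsearch -E


def e_value(p_value, database_size=Z):
    """Expected number of hits at least this good in a search of Z targets."""
    return p_value * database_size


def is_reported(p_value):
    """True if a hit with this P-value passes the -E reporting cutoff."""
    return e_value(p_value) <= E_CUTOFF


for p in (1e-12, 1e-6, 1e-3):  # invented P-values
    print(f"P={p:g}  E={e_value(p):.3g}  reported={is_reported(p)}")
```

Fixing Z keeps E-values comparable between searches even when the number of sequences actually searched differs.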
This visualisation provides a simple graphical representation of the distribution of this family across species.
This chart is a modified "sunburst" visualisation of the species tree for this family. It shows each node in the tree as a separate arc, arranged radially with the superkingdoms at the centre and the species arrayed around the outermost ring.
How the sunburst is generated
The tree is built by considering the taxonomic lineage of each sequence that has a match to this family. For each node in the resulting tree, we draw an arc in the sunburst. The radius of the arc, its distance from the root node at the centre of the sunburst, shows the taxonomic level ("superkingdom", "kingdom", etc). The length of the arc represents either the number of sequences represented at a given level, or the number of species that are found beneath the node in the tree. The weighting scheme can be changed using the sunburst controls.
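The weighting described above can be sketched in a few lines. This is an illustrative reconstruction, not the code behind the page: each node is identified by its lineage prefix, the ring it sits on is the length of that prefix, and its arc span is its share of 360 degrees under either weighting scheme. The function name is hypothetical, and the example lineages are shortened to three levels for brevity.

```python
from collections import defaultdict


def arc_spans(lineages, weight_by="sequences"):
    """Map each tree node (a lineage prefix) to its arc span in degrees.

    `lineages` holds one tuple per sequence, running from superkingdom
    down to species. A node's weight is either the number of sequences
    or the number of distinct species beneath it.
    """
    sequences = defaultdict(int)
    species = defaultdict(set)
    for lineage in lineages:
        for depth in range(1, len(lineage) + 1):
            node = lineage[:depth]
            sequences[node] += 1
            species[node].add(lineage)
    if weight_by == "sequences":
        weights = dict(sequences)
        total = len(lineages)
    else:
        weights = {node: len(members) for node, members in species.items()}
        total = len(set(lineages))
    return {node: 360.0 * w / total for node, w in weights.items()}


# Four sequences from three species; lineages shortened for the example.
example_lineages = [
    ("Bacteria", "Proteobacteria", "Escherichia coli"),
    ("Bacteria", "Proteobacteria", "Escherichia coli"),
    ("Bacteria", "Proteobacteria", "Vibrio cholerae"),
    ("Eukaryota", "Chordata", "Homo sapiens"),
]
by_sequences = arc_spans(example_lineages)
by_species = arc_spans(example_lineages, weight_by="species")
print(by_sequences[("Bacteria",)], by_species[("Bacteria",)])
```

With sequence weighting the bacterial arc covers three quarters of the circle (three of four sequences). With species weighting it covers two thirds (two of three species).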
In order to reduce the complexity of the representation, we reduce the number of taxonomic levels that we show. We consider only the following eight major taxonomic levels:
Colouring and labels
Segments of the tree are coloured approximately according to their superkingdom. For example, archaeal branches are coloured with shades of orange, eukaryotes in shades of purple, etc. The colour assignments are shown under the sunburst controls. Where space allows, the name of the taxonomic level will be written on the arc itself.
As you move your mouse across the sunburst, the current node will be highlighted. In the top section of the controls panel we show a summary of the lineage of the currently highlighted node. If you pause over an arc, a tooltip will be shown, giving the name of the taxonomic level in the title and a summary of the number of sequences and species below that node in the tree.
Anomalies in the taxonomy tree
There are some situations that the sunburst tree cannot easily handle and for which we have work-arounds in place.
Missing taxonomic levels
Some species in the taxonomic tree may not have one or more of the main eight levels that we display. For example, Bos taurus is not assigned an order in the NCBI taxonomic tree. In such cases we mark the omitted level with, for example, "No order", in both the tooltip and the lineage summary.
Unmapped species names
The tree is built by looking at each sequence in the full alignment for the family. We take the name of the species given by UniProt and try to map that to the full taxonomic tree from NCBI. In some cases, the name chosen by UniProt does not map to any node in the NCBI tree, perhaps because the chosen name is listed as a synonym or a misspelling in the NCBI taxonomy.
So that these nodes are not simply omitted from the sunburst tree, we group them together in a separate branch (or segment of the sunburst tree). Since we cannot determine the lineage for these unmapped species, we show all levels between the superkingdom and the species as "uncategorised".
Since we reduce the species tree to only the eight main taxonomic levels, sequences that are mapped to the sub-species level in the tree would not normally be shown. Rather than leave out these species, we map them instead to their parent species. So, for example, for sequences belonging to one of the Vibrio cholerae sub-species in the NCBI taxonomy, we show them instead as belonging to the species Vibrio cholerae.
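The three work-arounds above (missing levels, unmapped names and sub-species) can be illustrated with a short Python sketch. It is an assumption-laden example rather than the site's implementation: the rank names in `LEVELS` are assumed to be the eight standard ranks, and the helper names and the dictionary input format are hypothetical.

```python
# Assumed names for the eight displayed levels.
LEVELS = ("superkingdom", "kingdom", "phylum", "class",
          "order", "family", "genus", "species")


def normalise_lineage(ranked):
    """Reduce a mapped lineage to the eight displayed levels.

    `ranked` is a hypothetical dict of rank name -> taxon name. Ranks that
    are absent are labelled "No <rank>"; ranks outside LEVELS, such as a
    sub-species, are ignored, so the entry appears under its parent species.
    """
    return [ranked.get(level) or f"No {level}" for level in LEVELS]


def unmapped_lineage(superkingdom, species):
    """Lineage for a species name that could not be mapped to the taxonomy."""
    middle = ["uncategorised"] * (len(LEVELS) - 2)
    return [superkingdom, *middle, species]
```

For instance, a lineage dictionary for Bos taurus without an `"order"` key would yield `"No order"` at that position, matching the label described above.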
Too many species/sequences
For large species trees, you may see blank regions in the outer layers of the sunburst. These occur when there are large numbers of arcs to be drawn in a small space. If an arc is less than approximately one pixel wide, it will not be drawn and the space will be left blank. You may still be able to get some information about the species in that region by moving your mouse across the area, but since each arc will be very small, it will be difficult to accurately locate a particular species.
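To see why arcs disappear in large trees, consider the following rough Python sketch. It assumes that an arc's angle is proportional to its weight and that its on-screen width is the arc length at a given radius; the function names, the radius values and the exact one-pixel threshold are illustrative assumptions, not details taken from the site's code.

```python
import math


def arc_width_px(weight, total_weight, radius_px):
    """Arc length in pixels for an arc whose angle is proportional to its weight."""
    angle = 2 * math.pi * weight / total_weight  # angle in radians
    return angle * radius_px


def is_drawn(weight, total_weight, radius_px, min_px=1.0):
    """An arc narrower than roughly one pixel is skipped, leaving a blank."""
    return arc_width_px(weight, total_weight, radius_px) >= min_px
```

With these assumptions, a single species among 10,000 on a ring of radius 300 pixels gets an arc of about 0.19 pixels and is left blank, whereas one species among 100 gets about 19 pixels and is drawn.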
The tree shows the occurrence of this domain across different species.
We show the species tree in one of two ways. For smaller trees we try to show an interactive representation, which allows you to select specific nodes in the tree and view them as an alignment or as a set of Pfam domain graphics.
Unfortunately we have found that there are problems viewing the interactive tree when it becomes larger than a certain limit. Furthermore, we have found that Internet Explorer can become unresponsive when viewing some trees, regardless of their size. We therefore show a text representation of the species tree when the size is above a certain limit or if you are using Internet Explorer to view the site.
If you are using IE you can still load the interactive tree by clicking the "Generate interactive tree" button, but please be aware of the potential problems that the interactive species tree can cause.
For all of the domain matches in a full alignment, we count the number that are found on all sequences in the alignment. This total is shown in the purple box.
We also count the number of unique sequences on which each domain is found, which is shown in green. Note that a domain may appear multiple times on the same sequence, leading to the difference between these two numbers.
Finally, we group sequences from the same organism according to the NCBI code that is assigned by UniProt, allowing us to count the number of distinct species in which the domain is found. This value is shown in the pink boxes.
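The three summary counts can be illustrated with a small Python sketch. The input format (one `(sequence_id, ncbi_code)` pair per domain match) and the function name `summarise` are assumptions made for the example; they are not taken from the site's code.

```python
def summarise(matches):
    """Return (domain matches, unique sequences, distinct NCBI codes).

    `matches` is a hypothetical list with one (sequence_id, ncbi_code)
    pair for every domain match in the full alignment.
    """
    domains = len(matches)                        # purple box: every match
    sequences = len({seq for seq, _ in matches})  # green box: unique sequences
    organisms = len({tax for _, tax in matches})  # pink box: grouped by NCBI code
    return domains, sequences, organisms
```

A sequence that carries the domain twice contributes two to the first count but only one to the second, which is the source of the difference between the purple and green numbers.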
We use the NCBI species tree to group organisms according to their taxonomy and this forms the structure of the displayed tree. Note that in some cases the trees are too large (have too many nodes) to allow us to build an interactive tree, but in most cases you can still view the tree in a plain text, non-interactive representation. Those species which are represented in the seed alignment for this domain are highlighted.
You can use the tree controls to manipulate how the interactive tree is displayed:
- show/hide the summary boxes
- highlight species that are represented in the seed alignment
- expand/collapse the tree or expand it to a given depth
- select a sub-tree or a set of species within the tree and view them graphically or as an alignment
- save a plain text representation of the tree
Please note: for large trees this can take some time. While the tree is loading, you can safely switch away from this tab but if you browse away from the family page entirely, the tree will not be loaded.
For those sequences which have a structure in the Protein Data Bank, we use the mapping between UniProt, PDB and Pfam coordinate systems from the PDBe group, to allow us to map Pfam domains onto UniProt sequences and three-dimensional protein structures. The table below shows the structures on which the PLDc domain has been found. There are 3 instances of this domain found in the PDB. Note that there may be multiple copies of the domain in a single PDB structure, since many structures contain multiple copies of the same protein sequence.