0  structures 52  species 0  interactions 1043  sequences 22  architectures

Family: Olduvai (PF06758)

Summary: Olduvai domain

Pfam includes annotations and additional family information from a range of different sources. These sources can be accessed via the tabs below.

This is the Wikipedia entry entitled "DUF1220".

DUF1220

Domain of unknown function (DUF1220)
Symbol DUF1220
Pfam PF06758
InterPro IPR010630

DUF1220 is a protein domain that shows a striking human lineage-specific (HLS) increase in copy number and may be important to human brain evolution.[1] The DUF1220 domain name has been changed to the Olduvai domain, based on data obtained since the initial discovery of the domain.[2] The copy number of DUF1220 domains generally increases with a species' evolutionary proximity to humans. DUF1220 copy number is highest in humans (~289, with some person-to-person variation)[3] and shows the largest HLS increase in copy number (an additional 160 copies) of any protein-coding region in the human genome. DUF1220 copy number is reduced in African great apes (an estimated 125 copies in chimpanzees), further reduced in orangutan (92) and Old World monkeys (35), single- or low-copy in non-primate mammals and absent in non-mammals.[3] DUF1220 domains are approximately 65 amino acids in length and are encoded by a two-exon doublet. In the human genome, DUF1220 sequences are located primarily on chromosome 1 in region 1q21.1-q21.2, with several copies also found at 1p36, 1p13.3, and 1p12. Sequences encoding DUF1220 domains show rhythmicity, resonance[4] and signs of positive selection, especially in primates, and are expressed in several human tissues including brain, where their expression is restricted to neurons.[1]


The gene showing a human-specific increase in DUF1220 copy number was first identified in a genome-wide array CGH study of lineage-specific copy number differences between human and great ape species.[5] The study found 134 genes that showed human lineage-specific increases in copy number, one of which, MGC8902 (also known as NBPF15, cDNA IMAGE:843276), encoded six DUF1220 domains.[1] DUF1220 protein domains are found almost exclusively in the NBPF gene family (which includes the MGC8902 gene), which was independently identified when the first member of this family was found to be disrupted in an individual with neuroblastoma.[6] It was subsequently found that the exceptional increase in human DUF1220 copy number was the result of intragenic domain hyper-amplification, primarily involving a three-domain unit called the HLS DUF1220 triplet.[3] Hyper-amplification of the triplet resulted in the addition of ~149 copies of DUF1220 specifically to the human lineage since its divergence from the Pan species, chimpanzee and bonobo, approximately 6 million years ago.[3] The ancestral DUF1220 domain is not part of the NBPF family but rather is found as a single copy within the PDE4DIP (Myomegalin) gene. PDE4DIP encodes a centrosomal protein and is a homolog of CDK5RAP2, a gene that lacks DUF1220 sequences and, when mutated, has been implicated in microcephaly.[7][8]

Association with brain size and evolutionary adaptation

An increasingly large number of disease-associated copy number variations (CNVs) have been reported in the 1q21.1 region, and these CNVs either encompass or directly flank DUF1220 domain sequences.[9] Two independent reports[10][11] have linked reciprocal 1q21.1 deletions and duplications with microcephaly and macrocephaly, respectively, raising the possibility that DUF1220 copy number may influence human brain size. A targeted 1q21 array CGH investigation of the potential association between DUF1220 and brain size found that a decrease in DUF1220 copy number is associated with microcephaly in individuals with 1q21 CNVs.[12] Of all 1q21 sequences tested, DUF1220 sequences were the only ones to show a consistent correlation between copy number and brain size in both disease (micro/macrocephaly) and non-disease populations. In addition, in primates there is a significant correlation between DUF1220 copy number and both brain size and brain cortical neuron number.[12]

More recent research using MRI measurements of brain surface areas and volumes in healthy individuals has better localized associations with DUF1220 copy number. This work has implicated DUF1220 copy number in multiple brain volume and surface area measurements.[13]

Improved characterization of the genomic architecture of chromosome 1 in a new genomic assembly has allowed for more refined analysis of the location and sequence of DUF1220 domains. Included among the findings was the identification of 20 additional DUF1220 domains in the genome that were added via a duplication from 1q21.2 to 1p11.2. This in turn may have mediated the HLS pericentric inversion on chromosome 1, an important evolutionary event.

For the above reasons, and because DUF1220 sequences at 1q21.1 have undergone a dramatic and evolutionarily rapid increase in copy number in humans, a model[9][14] has been developed that proposes that:

1) increasing DUF1220 domain dosage is a driving force behind the evolutionary expansion of the primate (and human) brain,

2) the instability of the 1q21.1 region has facilitated the rapid increase in DUF1220 copy number in humans, and

3) the evolutionary advantage of rapidly increasing DUF1220 copy number in the human genome has resulted in favoring retention of the high genomic instability of the 1q21.1 region, which, in turn, has precipitated a spectrum of recurrent human brain and developmental disorders. These include autism and schizophrenia (as discussed below) and other disorders resulting from 1q21.1 duplication syndrome and 1q21.1 deletion syndrome.[9]

From this perspective, disease-associated 1q21.1 CNVs may be the price the human species paid, and continues to pay, for the adaptive benefit of having large numbers of DUF1220 copies in its genome.[9][14]

Associations with autism, schizophrenia and cognitive function

DUF1220 copy number variation has more recently been investigated in autism and schizophrenia, as both disorders are associated with deletions and duplications of 1q21 yet the causative loci within such regions have not previously been identified. Such research has found that copy number of DUF1220 subtype CON1 is linearly associated with increasing severity of social impairment in autism[15][16] and severity of negative symptoms in schizophrenia.[17] In contrast, copy number increase of DUF1220 subtypes CON1 and HLS1 is associated with reduced severity of positive symptoms in schizophrenia.[17] This evidence is relevant for current theories proposing that the two disorders are fundamentally related. The precise nature of this relationship is currently under debate, with alternative lines of argument suggesting that the two are diametrically opposed diseases, exist on a continuum or exhibit a more nuanced relationship.[18]

Cognitive dysfunction is a feature of multiple neuropsychiatric diseases, and many individuals with 1q21 deletion and duplication syndromes have developmental delay. Given this, the role of DUF1220 in cognitive function has been investigated. Results of this research indicate that DUF1220 copy number is linearly associated with increased cognitive function as measured by total IQ and mathematical aptitude scores, a finding identified in two independent populations.[13][19] This association has important implications for understanding the interplay between cognitive function and autism phenotypes.[20] These findings also provide additional support for the involvement of DUF1220 in a genomic trade-off model involving the human brain: the same key genes that have been major contributors to the evolutionary expansion of the human brain and human cognitive capacity may also, in different combinations, underlie psychiatric disorders such as autism and schizophrenia.[14]


  1. Popesco MC, Maclaren EJ, Hopkins J, Dumas L, Cox M, Meltesen L, McGavran L, Wyckoff GJ, Sikela JM (September 2006). "Human lineage-specific amplification, selection, and neuronal expression of DUF1220 domains". Science. 313 (5791): 1304–7. doi:10.1126/science.1127980. PMID 16946073.
  2. Sikela JM, van Roy F (2018). "A proposal to change the name of the NBPF/DUF1220 domain to the Olduvai domain". F1000Research. 6 (2185): 2185. doi:10.12688/f1000research.13586.1. PMID 29399325.
  3. O'Bleness MS, Dickens CM, Dumas LJ, Kehrer-Sawatzki H, Wyckoff GJ, Sikela JM (September 2012). "Evolutionary history and genome organization of DUF1220 protein domains". G3. 2 (9): 977–86. doi:10.1534/g3.112.003061. PMC 3429928. PMID 22973535.
  4. Perez, J. C.: DUF1220 Homo sapiens and Neanderthal fractal periods architectures breakthrough. SDRP Journal of Cellular and Molecular Physiology 1 (2017) 1-34.
  5. Fortna A, Kim Y, MacLaren E, Marshall K, Hahn G, Meltesen L, Brenton M, Hink R, Burgers S, Hernandez-Boussard T, Karimpour-Fard A, Glueck D, McGavran L, Berry R, Pollack J, Sikela JM (July 2004). "Lineage-specific gene duplication and loss in human and great ape evolution". PLoS Biology. 2 (7): E207. doi:10.1371/journal.pbio.0020207. PMC 449870. PMID 15252450.
  6. Vandepoele K, Van Roy N, Staes K, Speleman F, van Roy F (November 2005). "A novel gene family NBPF: intricate structure generated by gene duplications during primate evolution". Molecular Biology and Evolution. 22 (11): 2265–74. doi:10.1093/molbev/msi222. PMID 16079250.
  7. Bond J, Woods CG (February 2006). "Cytoskeletal genes regulating brain size". Current Opinion in Cell Biology. 18 (1): 95–101. doi:10.1016/ PMID 16337370.
  8. Dumas L, Kim YH, Karimpour-Fard A, Cox M, Hopkins J, Pollack JR, Sikela JM (September 2007). "Gene copy number variation spanning 60 million years of human and primate evolution". Genome Research. 17 (9): 1266–77. doi:10.1101/gr.6557307. PMC 1950895. PMID 17666543.
  9. Dumas L, Sikela JM (2009). "DUF1220 domains, cognitive disease, and human brain evolution". Cold Spring Harbor Symposia on Quantitative Biology. 74: 375–82. doi:10.1101/sqb.2009.74.025. PMC 2902282. PMID 19850849.
  10. Brunetti-Pierri N, Berg JS, Scaglia F, Belmont J, Bacino CA, Sahoo T, et al. (December 2008). "Recurrent reciprocal 1q21.1 deletions and duplications associated with microcephaly or macrocephaly and developmental and behavioral abnormalities". Nature Genetics. 40 (12): 1466–71. doi:10.1038/ng.279. PMC 2680128. PMID 19029900.
  11. Mefford HC, Sharp AJ, Baker C, Itsara A, Jiang Z, Buysse K, et al. (October 2008). "Recurrent rearrangements of chromosome 1q21.1 and variable pediatric phenotypes". The New England Journal of Medicine. 359 (16): 1685–99. doi:10.1056/NEJMoa0805384. PMC 2703742. PMID 18784092.
  12. Dumas LJ, O'Bleness MS, Davis JM, Dickens CM, Anderson N, Keeney JG, Jackson J, Sikela M, Raznahan A, Giedd J, Rapoport J, Nagamani SS, Erez A, Brunetti-Pierri N, Sugalski R, Lupski JR, Fingerlin T, Cheung SW, Sikela JM (September 2012). "DUF1220-domain copy number implicated in human brain-size pathology and evolution". American Journal of Human Genetics. 91 (3): 444–54. doi:10.1016/j.ajhg.2012.07.016. PMC 3511999. PMID 22901949.
  13. Davis JM, Searles VB, Anderson N, Keeney J, Raznahan A, Horwood LJ, Fergusson DM, Kennedy MA, Giedd J, Sikela JM (January 2015). "DUF1220 copy number is linearly associated with increased cognitive function as measured by total IQ and mathematical aptitude scores". Human Genetics. 134 (1): 67–75. doi:10.1007/s00439-014-1489-2. PMID 25287832.
  14. Sikela JM, Searles Quick VB (January 2018). "Genomic trade-offs: are autism and schizophrenia the steep price of the human brain?". Human Genetics. 137 (1): 1–13. doi:10.1007/s00439-017-1865-9. PMID 29335774.
  15. Davis JM, Searles VB, Anderson N, Keeney J, Dumas L, Sikela JM (March 2014). "DUF1220 dosage is linearly associated with increasing severity of the three primary symptoms of autism". PLoS Genetics. 10 (3): e1004241. doi:10.1371/journal.pgen.1004241. PMC 3961203. PMID 24651471.
  16. Davis JM, Searles Quick VB, Sikela JM (June 2015). "Replicated linear association between DUF1220 copy number and severity of social impairment in autism". Human Genetics. 134 (6): 569–75. doi:10.1007/s00439-015-1537-6. PMID 25758905.
  17. Searles Quick VB, Davis JM, Olincy A, Sikela JM (December 2015). "DUF1220 copy number is associated with schizophrenia risk and severity: implications for understanding autism and schizophrenia as related diseases". Translational Psychiatry. 5: e697. doi:10.1038/tp.2015.192. PMC 5068589. PMID 26670282.
  18. Crespi B, Badcock C (June 2008). "Psychosis and autism as diametrical disorders of the social brain". The Behavioral and Brain Sciences. 31 (3): 241–61; discussion 261–320. doi:10.1017/S0140525X08004214. PMID 18578904.
  19. Weiss, Volkmar: Das IQ-Gen - verleugnet seit 2015: Eine bahnbrechende Entdeckung und ihre Feinde. Ares Verlag, Graz 2017, ISBN 978-3-902732-87-3
  20. Crespi BJ (2016-01-01). "Autism As a Disorder of High Intelligence". Frontiers in Neuroscience. 10: 300. doi:10.3389/fnins.2016.00300. PMC 4927579. PMID 27445671.

This article incorporates text from the public domain Pfam and InterPro IPR010630

This page is based on a Wikipedia article. The text is available under the Creative Commons Attribution/Share-Alike License.

This tab holds the annotation information that is stored in the Pfam database. As we move to using Wikipedia as our main source of annotation, the contents of this tab will be gradually replaced by the Wikipedia tab.

Olduvai domain

This domain, formerly known as the DUF1220 or NBPF domain, has been renamed the Olduvai domain. It is highly duplicated in the human lineage.

Literature references

  1. Dumas L, Sikela JM; Cold Spring Harb Symp Quant Biol. 2009; [Epub ahead of print]: DUF1220 Domains, Cognitive Disease, and Human Brain Evolution. PUBMED:19850849 EPMC:19850849

This tab holds annotation information from the InterPro database.

InterPro entry IPR010630

Proteins of the neuroblastoma breakpoint family (NBPF) contain a highly conserved domain of unknown function, which is known as NBPF [PUBMED:16079250] or DUF1220 [PUBMED:19850849]. The NBPF/DUF1220 domain is present in multiple copies in NBPF proteins and once, with lower homology, in mammalian myomegalin, a protein localised in the Golgi/centrosomal area which functions as an anchor to localise components of the cyclic adenosine monophosphate-dependent pathway to this region. The implications of the resemblance of NBPF proteins to myomegalin remain obscure.

NBPF domains are typically built of two exons [PUBMED:16079250, PUBMED:16946073]. The number of NBPF repeat copies is highly expanded in humans, reduced in African great apes, further reduced in orangutan and Old World monkeys, single-copy in nonprimate mammals, and absent in nonmammalian species. The NBPF domain that is found as a single copy in nonprimate mammals is the likely ancestral domain. Studies suggest an association between NBPF/DUF1220 copy number and brain size, and more specifically neocortex volume [PUBMED:26112965]. An association has been established between DUF1220 subtype CON1 copy number and autism severity [PUBMED:25758905], and between subtype CON2 copy number and cognitive function [PUBMED:25287832].

Domain organisation

Below is a listing of the unique domain organisations or architectures in which this domain is found.


We store a range of different sequence alignments for families. As well as the seed alignment from which the family is built, we provide the full alignment, generated by searching the sequence database (reference proteomes) using the family HMM. We also generate alignments using four representative proteome (RP) sets, the UniProtKB sequence database, the NCBI sequence database, and our metagenomics sequence database.

View options

We make a range of alignments for each Pfam-A family. You can see a description of each above. You can view these alignments in various ways but please note that some types of alignment are never generated while others may not be available for all families, most commonly because the alignments are too large to handle.


Download options

We make all of our alignments available in Stockholm format. You can download them here as raw, plain text files or as gzip-compressed files.


You can also download a FASTA format file containing the full-length sequences for all sequences in the full alignment.
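
For readers who download one of the Stockholm files mentioned above, the following is a minimal sketch of how such a file could be read in Python. It assumes a single alignment per file and handles only the basics of the format: markup lines starting with "#", a terminating "//" line, and name/sequence pairs that may be split over several blocks. The function name parse_stockholm and the toy alignment are made up for illustration and are not part of Pfam.

```python
def parse_stockholm(text):
    """Return {sequence_name: aligned_sequence} for one Stockholm alignment.

    Assumption: only the first alignment in the text is wanted, and all
    markup lines (those starting with '#') can be ignored.
    """
    sequences = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, header, or #=GF/#=GC/#=GS/#=GR markup
        if line == "//":
            break  # end of the alignment
        name, chunk = line.split(None, 1)
        # Interleaved files repeat each name once per block, so append.
        sequences[name] = sequences.get(name, "") + chunk.replace(" ", "")
    return sequences


# Toy alignment with hypothetical names and residues, for illustration only.
example = """# STOCKHOLM 1.0
#=GF ID Toy
toyA/1-7   ACDE.FGH
toyB/1-6   ACDE-FG.
#=GC SS_cons  ........
//
"""

alignment = parse_stockholm(example)
for name, aligned in alignment.items():
    print(name, aligned)
```

A full-featured reader would also keep the markup lines; this sketch discards them to stay short.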

HMM logo

HMM logos are one way of visualising profile HMMs. Logos provide a quick overview of the properties of an HMM in a graphical form.


This page displays the phylogenetic tree for this family's seed alignment. We use FastTree to calculate neighbour-joining trees with a local bootstrap based on 100 resamples (shown next to the tree nodes). FastTree calculates approximately-maximum-likelihood phylogenetic trees from our seed alignment.

Note: You can also download the data file for the tree.

Curation and family details

This section shows the detailed information about the Pfam family. You can see the definitions of many of the terms in this section in the glossary and a fuller explanation of the scoring system that we use in the scores section of the help pages.

Curation

Seed source: Pfam-B_6292 (release 10.0)
Previous IDs: DUF1220;
Type: Domain
Sequence Ontology: SO:0000417
Author: Moxon SJ, Bateman A
Number in seed: 32
Number in full: 1043
Average length of the domain: 64.80 aa
Average identity of full alignment: 46 %
Average coverage of the sequence by the domain: 19.79 %

HMM information

HMM build commands:
build method: hmmbuild -o /dev/null HMM SEED
search method: hmmsearch -Z 45638612 -E 1000 --cpu 4 HMM pfamseq
Model details:
Parameter Sequence Domain
Gathering cut-off 23.0 23.0
Trusted cut-off 23.1 23.1
Noise cut-off 22.8 22.7
Model length: 66
Family (HMM) version: 13
Download: download the raw HMM for this family
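
As a rough illustration of how the cut-offs listed above might be applied, the sketch below checks a pair of bit scores against this family's gathering cut-offs. It assumes the usual Pfam convention that a match is accepted only if both its sequence score and its domain score reach the gathering cut-off. The function name passes_gathering and the example scores are hypothetical.

```python
# Cut-offs (bit scores) copied from the "Model details" table above.
GATHERING = {"sequence": 23.0, "domain": 23.0}
TRUSTED = {"sequence": 23.1, "domain": 23.1}
NOISE = {"sequence": 22.8, "domain": 22.7}


def passes_gathering(sequence_score, domain_score):
    """Return True if both bit scores reach the gathering cut-offs.

    Assumption: a match counts only when the sequence score and the
    domain score are each at or above their gathering cut-off.
    """
    return (sequence_score >= GATHERING["sequence"]
            and domain_score >= GATHERING["domain"])


# Hypothetical scores, for illustration only.
print(passes_gathering(TRUSTED["sequence"], TRUSTED["domain"]))
print(passes_gathering(NOISE["sequence"], NOISE["domain"]))
```

The trusted and noise cut-offs sit on either side of the gathering cut-off, so scores equal to the trusted values pass the check and scores equal to the noise values do not.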

Species distribution

[Interactive sunburst and species tree: distribution of this family across species, with segments coloured by taxonomic group (Archaea, Bacteria, Eukaryota, Viruses, Viroids, other sequences, unclassified).]