Nucleases of the GIY-YIG family are involved in many cellular processes, including DNA repair and recombination, transfer of mobile genetic elements, and restriction of incoming foreign DNA. The GIY-YIG superfamily groups together nucleases characterised by the presence of a domain of typically ~100 amino acids, with two short motifs "GIY" and "YIG" in the N-terminal part, followed by an Arg residue in the centre and a Glu residue in the C-terminal part [ (PUBMED:10219084) (PUBMED:12379841) (PUBMED:15692561) (PUBMED:16646971) (PUBMED:19361436) ].
The GIY-YIG domain forms a compact structural domain, which serves as a scaffold for the coordination of a divalent metal ion required for catalysis of the phosphodiester bond cleavage. The GIY-YIG domain has an alpha/beta-sandwich architecture with a central three-stranded antiparallel beta-sheet flanked by three helices. The three-stranded antiparallel beta-sheet contains the GIY-YIG sequence elements. The most conserved and putative catalytic residues are located on a shallow, concave surface and include a metal coordination site [ (PUBMED:12379841) (PUBMED:15692561) (PUBMED:16646971) (PUBMED:19361436) ].
The GIY-YIG domain has been implicated in a variety of cellular processes involving DNA cleavage, from self-propagation with or without introns, to restriction of foreign DNA, to DNA repair and maintenance of genome stability [ (PUBMED:16646971) ].
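The motif arrangement described above (a "GIY" element, then a "YIG" element, then an Arg, then a Glu, all within roughly 100 residues) can be illustrated with a small sequence scan. This is only a rough sketch: real family members carry degenerate variants of both motifs, and the spacer lengths, the `max_len` cutoff and the function name `find_candidates` are assumptions made for illustration, not curated values. Domain assignment in practice relies on profile-based methods rather than literal pattern matching.

```python
import re

# Hypothetical, deliberately loose pattern. It requires the literal
# tripeptides "GIY" and "YIG"; the spacer ranges below are illustrative
# assumptions and are not taken from the domain definition.
GIY_YIG_SKETCH = re.compile(
    r"GIY"         # first motif
    r".{5,20}?"    # assumed spacer between the two motifs
    r"YIG"         # second motif
    r".{1,60}?R"   # an Arg somewhere downstream
    r".{1,80}?E"   # a Glu further downstream
)


def find_candidates(seq, max_len=120):
    """Return (start, end) spans that fit the loose GIY ... YIG ... R ... E layout."""
    seq = seq.upper()
    hits = []
    for match in GIY_YIG_SKETCH.finditer(seq):
        # Discard spans far longer than a domain of about 100 residues.
        if match.end() - match.start() <= max_len:
            hits.append((match.start(), match.end()))
    return hits


if __name__ == "__main__":
    # Toy sequence built by hand for illustration; it is not a real protein.
    toy = "MK" + "GIY" + "A" * 10 + "YIG" + "A" * 5 + "R" + "A" * 20 + "E" + "KK"
    print(find_candidates(toy))
```

A hit from such a scan is at most a candidate region; many true GIY-YIG domains would be missed because their motifs deviate from the literal tripeptides.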
Some proteins known to contain a GIY-YIG domain include:
Eukaryotic Slx-1 proteins, involved in the maintenance of the rDNA copy number. They have a C-terminal RING finger Zn-binding domain.
Mammalian ankyrin repeat and LEM domain-containing protein 1 (ANKLE1).
Bacterial and archaeal UvrC subunits of (A)BC excinucleases, which remove damaged nucleotides by incising the damaged strand on both sides of the lesion.
Paramecium bursaria Chlorella virus 1 (PBCV-1).
Phage T4 endonucleases SegA to E, probably involved in the movement of the endonuclease-encoding DNA.
Phage T4 intron-associated endonuclease 1 (I-TevI), specific to the thymidylate synthase (td) gene splice junction and involved in intron homing.
There are 34061 GIYc domains in 34044 proteins in SMART's nrdb database.
Evolution (species in which this domain is found)
Taxonomic distribution of proteins containing GIYc domain.
This tree includes only a few representative species. The complete taxonomic breakdown of all proteins with GIYc domain is also available.
Literature (relevant references for this domain)
Primary literature is listed below; automatically derived secondary literature is also available.
Conserved domains in DNA repair proteins and evolution of repair systems.
Nucleic Acids Res. 1999; 27: 1223-42
A detailed analysis of protein domains involved in DNA repair was performed by comparing the sequences of the repair proteins from two well-studied model organisms, the bacterium Escherichia coli and yeast Saccharomyces cerevisiae, to the entire sets of protein sequences encoded in completely sequenced genomes of bacteria, archaea and eukaryotes. Previously uncharacterized conserved domains involved in repair were identified, namely four families of nucleases and a family of eukaryotic repair proteins related to the proliferating cell nuclear antigen. In addition, a number of previously undetected occurrences of known conserved domains were detected; for example, a modified helix-hairpin-helix nucleic acid-binding domain in archaeal and eukaryotic RecA homologs. There is a limited repertoire of conserved domains, primarily ATPases and nucleases, nucleic acid-binding domains and adaptor (protein-protein interaction) domains that comprise the repair machinery in all cells, but very few of the repair proteins are represented by orthologs with conserved domain architecture across the three superkingdoms of life. Both the external environment of an organism and the internal environment of the cell, such as the chromatin superstructure in eukaryotes, seem to have a profound effect on the layout of the repair systems. Another factor that apparently has made a major contribution to the composition of the repair machinery is horizontal gene transfer, particularly the invasion of eukaryotic genomes by organellar genes, but also a number of likely transfer events between bacteria and archaea. Several additional general trends in the evolution of repair proteins were noticed; in particular, multiple, independent fusions of helicase and nuclease domains, and independent inactivation of enzymatic domains that apparently retain adaptor or regulatory functions.
Configuration of the catalytic GIY-YIG domain of intron endonuclease I-TevI: coincidence of computational and molecular findings.
Nucleic Acids Res. 1999; 27: 2115-25
I-TevI is a member of the GIY-YIG family of homing endonucleases. It is folded into two structural and functional domains, an N-terminal catalytic domain and a C-terminal DNA-binding domain, separated by a flexible linker. In this study we have used genetic analyses, computational sequence analysis and NMR spectroscopy to define the configuration of the N-terminal domain and its relationship to the flexible linker. The catalytic domain is an alpha/beta structure contained within the first 92 amino acids of the 245-amino acid protein followed by an unstructured linker. Remarkably, this structured domain corresponds precisely to the GIY-YIG module defined by sequence comparisons of 57 proteins including more than 30 newly reported members of the family. Although much of the unstructured linker is not essential for activity, residues 93-116 are required, raising the possibility that this region may adopt an alternate conformation upon DNA binding. Two invariant residues of the GIY-YIG module, Arg27 and Glu75, located in alpha-helices, have properties of catalytic residues. Furthermore, the GIY-YIG sequence elements for which the module is named form part of a three-stranded antiparallel beta-sheet that is important for I-TevI structure and function.
Two-domain structure of the td intron-encoded endonuclease I-TevI correlates with the two-domain configuration of the homing site.
J Mol Biol. 1997; 265: 494-506
I-TevI, the T4 td intron-encoded endonuclease, catalyzes the first step in intron homing by making a double-strand break in the intronless allele within a sequence designated the homing site. The 28 kDa enzyme, which interacts with the homing site over a span of 37 bp, binds as a monomer, contacting two domains of the substrate. In this study, limited proteolysis experiments indicate that I-TevI consists of two domains that behave as discrete physical entities as judged by a number of functional and structural criteria. Overexpression clones for each domain were constructed and the proteins were purified. The carboxy-terminal domain has DNA-binding activity coincident with the primary binding region of the homing site and binds with the same affinity as the full-length enzyme. The isolated amino-terminal domain contains the conserved GIY-YIG motif, consistent with its being the catalytic domain. Furthermore, site-directed mutagenesis of a conserved arginine residue within the extended motif rendered the full-length protein catalytically inactive, although DNA-binding was maintained. This is the first evidence that the GIY-YIG motif is important for catalytic activity. An enzyme with an N-terminal catalytic domain and a C-terminal DNA-binding domain connected by a flexible linker is in accord with the bipartite structure of the homing site.
Metabolism (metabolic pathways involving proteins which contain this domain)
This information is based on mapping of the SMART genomic protein database to KEGG orthologous groups. Percentages are calculated relative to the number of GIYc-containing proteins that could be assigned to a KEGG orthologous group, not to all proteins containing a GIYc domain. Please note that a protein can be included in multiple pathways, i.e. the numbers will not always add up to 100%.
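A small worked example may help show why such percentages can exceed 100% in total. The sketch below is illustrative only: the protein identifiers, pathway names and the function `pathway_percentages` are hypothetical and do not reflect how SMART computes its figures internally. It only follows the rule stated above, that the denominator counts proteins with at least one KEGG assignment.

```python
from collections import Counter


def pathway_percentages(assignments):
    """Percentage of KEGG-assigned proteins that fall into each pathway.

    assignments maps a protein id to the set of pathways it was mapped to;
    an empty set means the protein has no KEGG orthologous group.
    """
    # Only proteins with at least one assignment enter the denominator.
    assigned = {pid: paths for pid, paths in assignments.items() if paths}
    if not assigned:
        return {}
    counts = Counter(path for paths in assigned.values() for path in paths)
    return {path: 100.0 * n / len(assigned) for path, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical toy data: P3 has no KEGG assignment, P2 is in two pathways.
    toy = {
        "P1": {"pathway A"},
        "P2": {"pathway A", "pathway B"},
        "P3": set(),
        "P4": {"pathway B"},
    }
    for path, pct in sorted(pathway_percentages(toy).items()):
        print(f"{path}: {pct:.1f}%")
```

In this toy case three of the four proteins have an assignment, each pathway contains two of them, and the two percentages therefore sum to more than 100%.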
Crystal structure of the GIY-YIG N-terminal endonuclease domain of UvrC from Thermotoga maritima: Point mutant Y19F bound to the catalytic divalent cation
Crystal structure of the GIY-YIG N-terminal endonuclease domain of UvrC from Thermotoga maritima: Point mutant Y43F bound to its catalytic divalent cation
Crystal structure of the GIY-YIG N-terminal endonuclease domain of UvrC from Thermotoga maritima: Point mutant Y29F bound to its catalytic divalent cation
Crystal structure of the GIY-YIG N-terminal endonuclease domain of UvrC from Thermotoga maritima: Point mutant N88A bound to its catalytic divalent cation