The WRKY domain is a DNA binding domain found in one or two copies in a superfamily of plant transcription factors. These transcription factors are involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. The domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. It binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core is essential for function and WRKY binding.
The WRKY domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. The WRKY domain is found in one or two copies in a superfamily of plant transcription factors involved in the regulation of various physiological programs that are unique to plants, including pathogen defence, senescence, trichome development and the biosynthesis of secondary metabolites. The WRKY domain binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core of the W box is essential for function and WRKY binding [ (PUBMED:10785665) ]. Proteins known to contain a WRKY domain include Arabidopsis thaliana ZAP1 (Zinc-dependent Activator Protein-1); Arabidopsis thaliana AtWRKY44/TTG2, a protein involved in trichome development and anthocyanin pigmentation; and wild oat ABF1 and ABF2, two proteins involved in the gibberellic acid-induced expression of the alpha-Amy2 gene.
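The W box definition above can be turned into a simple sequence scan. The following is a minimal sketch, not a SMART or InterPro tool: it assumes that (T)(T)TGAC(C/T) means up to two optional upstream T residues, the invariant TGAC core, and a final C or T, and that both DNA strands should be searched. The function names `reverse_complement` and `find_w_boxes` are hypothetical.

```python
import re

# Invariant TGAC core followed by C or T. A lookahead is used so that
# overlapping cores (e.g. TGACTGACC) are all reported.
CORE = re.compile(r"(?=TGAC[CT])")
CORE_LENGTH = 5

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_w_boxes(seq):
    """Return (strand, start, match) tuples for W box candidates.

    Assumption: '-' strand coordinates are 0-based positions in the
    reverse-complemented sequence, not in the input sequence.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in CORE.finditer(s):
            core_start = m.start()
            start = core_start
            # Extend over at most two optional upstream T residues.
            while start > 0 and core_start - start < 2 and s[start - 1] == "T":
                start -= 1
            hits.append((strand, start, s[start:core_start + CORE_LENGTH]))
    return hits
```

For example, `find_w_boxes("AATTTGACCGG")` reports one forward-strand hit, `("+", 2, "TTTGACC")`. A match only indicates a candidate site; it says nothing about whether a WRKY protein binds there in vivo.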
Structural studies indicate that this domain is a four-stranded beta-sheet with a zinc binding pocket, forming a novel zinc and DNA binding structure [ (PUBMED:15705956) ]. The WRKYGQK residues correspond to the most N-terminal beta-strand, which enables extensive hydrophobic interactions, contributing to the structural stability of the beta-sheet.
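As a rough illustration of how the sequence features described above might be located in a protein, the sketch below looks for the WRKYGQK heptapeptide and then for the zinc binding motif CX4-5CX22-23HXH (the spacing given in the Plant Cell abstract listed under the literature entries) within a downstream window. The window size of 60 residues starting at the heptapeptide is an assumption based on the stated domain length, and the function name `find_wrky_candidates` is hypothetical. Profile-based methods such as those behind SMART are far more sensitive than this pattern match.

```python
import re

WRKY_HEPTAPEPTIDE = "WRKYGQK"

# C-X(4-5)-C-X(22-23)-H-X-H, written as a regular expression.
ZINC_MOTIF = re.compile(r"C.{4,5}C.{22,23}H.H")

# Assumption: search for the zinc motif within 60 residues of the start
# of the heptapeptide, since the domain is described as ~60 residues.
WINDOW = 60


def find_wrky_candidates(protein):
    """Return (start, end) spans running from each WRKYGQK occurrence
    to the end of a zinc motif found in the downstream window."""
    protein = protein.upper()
    candidates = []
    start = protein.find(WRKY_HEPTAPEPTIDE)
    while start != -1:
        window = protein[start:start + WINDOW]
        zinc = ZINC_MOTIF.search(window)
        if zinc:
            candidates.append((start, start + zinc.end()))
        start = protein.find(WRKY_HEPTAPEPTIDE, start + 1)
    return candidates


# Synthetic toy sequence (not a real protein), built only to exercise
# the pattern: heptapeptide, spacer, then C-X4-C-X22-H-X-H.
toy = "MA" + "WRKYGQK" + "A" * 10 + "C" + "A" * 4 + "C" + "A" * 22 + "HAH" + "AA"
print(find_wrky_candidates(toy))
```

Proteins with two WRKY domains would yield two spans, one per heptapeptide, provided each is followed by its own zinc motif inside the window.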
GO process:
regulation of transcription, DNA-templated (GO:0006355)
GO function:
sequence-specific DNA binding (GO:0043565), DNA-binding transcription factor activity (GO:0003700)
Family alignment:
There are 14494 WRKY domains in 12257 proteins in SMART's nrdb database.
Crystallization and preliminary X-ray analysis of the C-terminal WRKY domain of Arabidopsis thaliana WRKY1 transcription factor.
Biochim Biophys Acta. 2005; 1750: 14-6
The C-terminal WRKY domain of Arabidopsis thaliana WRKY1 protein, a transcription factor, was cloned and expressed. The expressed protein was then purified and crystallized. The preliminary X-ray analysis was undertaken. The crystal diffracted to 2.50 Å resolution in-house and belongs to space group P2(1) with unit-cell parameters a=64.10 Å, b=34.88 Å, c=114.72 Å, beta=90.49 degrees.
Solution structure of an Arabidopsis WRKY DNA binding domain.
Plant Cell. 2005; 17: 944-56
The WRKY proteins comprise a major family of transcription factors that are essential in pathogen and salicylic acid responses of higher plants as well as a variety of plant-specific reactions. They share a DNA binding domain, designated as the WRKY domain, which contains an invariant WRKYGQK sequence and a CX4-5CX22-23HXH zinc binding motif. Herein, we report the NMR solution structure of the C-terminal WRKY domain of the Arabidopsis thaliana WRKY4 protein. The structure consists of a four-stranded beta-sheet, with a zinc binding pocket formed by the conserved Cys/His residues located at one end of the beta-sheet, revealing a novel zinc and DNA binding structure. The WRKYGQK residues correspond to the most N-terminal beta-strand, kinked in the middle of the sequence by the Gly residue, which enables extensive hydrophobic interactions involving the Trp residue and contributes to the structural stability of the beta-sheet. Based on a profile of NMR chemical shift perturbations, we propose that the same strand enters the DNA groove and forms contacts with the DNA bases.
The WRKY transcription factor superfamily: its origin in eukaryotes and expansion in plants.
BMC Evol Biol. 2005; 5: 1-1
BACKGROUND: WRKY proteins are newly identified transcription factors involved in many plant processes including plant responses to biotic and abiotic stresses. To date, genes encoding WRKY proteins have been identified only from plants. Comprehensive search for WRKY genes in non-plant organisms and phylogenetic analysis would provide invaluable information about the origin and expansion of the WRKY family. RESULTS: We searched all publicly available sequence data for WRKY genes. A single copy of the WRKY gene encoding two WRKY domains was identified from Giardia lamblia, a primitive eukaryote, Dictyostelium discoideum, a slime mold closely related to the lineage of animals and fungi, and the green alga Chlamydomonas reinhardtii, an early branching of plants. This ancestral WRKY gene seems to have duplicated many times during the evolution of plants, resulting in a large family in evolutionarily advanced flowering plants. In rice, the WRKY gene family consists of over 100 members. Analyses suggest that the C-terminal domain of the two-WRKY-domain encoding gene appears to be the ancestor of the single-WRKY-domain encoding genes, and that the WRKY domains may be phylogenetically classified into five groups. We propose a model to explain the WRKY family's origin in eukaryotes and expansion in plants. CONCLUSIONS: WRKY genes seem to have originated in early eukaryotes and greatly expanded in plants. The elucidation of the evolution and duplicative expansion of the WRKY genes should provide valuable information on their functions.