Many bacterial transcription regulation proteins bind DNA through a 'helix-turn-helix' (HTH) motif. One major subfamily of these proteins [ (PUBMED:8451183) (PUBMED:2314271) ] is related to the arabinose operon regulatory protein AraC [ (PUBMED:8451183) (PUBMED:2314271) ]. Except for celD [ (PUBMED:2179047) ], all of these proteins seem to be positive transcriptional factors.
Although the sequences belonging to this family differ somewhat in length, the HTH motif almost always lies towards the C terminus, typically in the third quarter of the sequence. The minimal DNA binding domain spans roughly 100 residues and comprises two HTH subdomains: the classical HTH domain, and a second HTH subdomain that resembles the classical one but has an insertion of one residue in the turn region. The N-terminal and central regions of these proteins are presumed to interact with effector molecules and may be involved in dimerisation [ (PUBMED:8516313) ].
The known structure of MarA ( P27246 ) shows that the AraC domain is alpha-helical and that both HTH subdomains bind the major groove of the DNA. The two HTH subdomains are separated by only 27 angstroms, which causes the cognate DNA to bend.
This entry represents the full AraC domain containing the two HTH subdomains.
GO process:
regulation of transcription, DNA-templated (GO:0006355)
GO function:
DNA-binding transcription factor activity (GO:0003700), sequence-specific DNA binding (GO:0043565)
Family alignment:
There are 344050 HTH_ARAC domains in 343647 proteins in SMART's nrdb database.
Evolution (species in which this domain is found)
Taxonomic distribution of proteins containing HTH_ARAC domain.
This tree includes only several representative species. The complete taxonomic breakdown of all proteins with HTH_ARAC domain is also available.
Literature (relevant references for this domain)
Primary literature is listed below; automatically derived secondary literature is also available.
Crystal structure of the cyanobacterial metallothionein repressor SmtB: a model for metalloregulatory proteins.
J Mol Biol. 1998; 275: 337-46
SmtB from Synechococcus PCC7942 is a trans-acting dimeric repressor that is required for Zn(2+)-responsive expression of the metallothionein SmtA. The structure of SmtB was solved using multiple isomorphous replacement techniques and refined at 2.2 A resolution by simulated annealing to an R-factor of 0.218. SmtB displays the classical helix-turn-helix motif found in many DNA-binding proteins. It has an alpha + beta topology, and the arrangement of the three core helices and the beta hairpin is similar to the HNF-3/fork head, CAP and diphtheria toxin repressor proteins. Although there is no zinc in the crystal structure, analysis of a mercuric acetate derivative suggests a total of four Zn2+ binding sites in the dimer. Two of these putative sites are at the opposite ends of the dimer, while the other two are at the dimer interface and are formed by residues contributed from each monomer. The structure of the dimer is such that simultaneous binding for both recognition helices to DNA would require either a bend in the DNA helix or a conformational change in the dimer. The structure of Synechococcus SmtB is the first in this family of metal-binding DNA repressors.
A novel DNA-binding motif in MarA: the first structure for an AraC family transcriptional activator.
Proc Natl Acad Sci U S A. 1998; 95: 10413-8
A crystal structure for a member of the AraC prokaryotic transcriptional activator family, MarA, in complex with its cognate DNA-binding site is described. MarA consists of two similar subdomains, each containing a helix-turn-helix DNA-binding motif. The two recognition helices of the motifs are inserted into adjacent major groove segments on the same face of the DNA but are separated by only 27 A thereby bending the DNA by approximately 35 degrees. Extensive interactions between the recognition helices and the DNA major groove provide the sequence specificity.
The AraC/XylS family of prokaryotic positive transcriptional regulators includes more than 100 proteins and polypeptides derived from open reading frames translated from DNA sequences. Members of this family are widely distributed and have been found in the gamma subgroup of the proteobacteria, low- and high-G + C-content gram-positive bacteria, and cyanobacteria. These proteins are defined by a profile that can be accessed from PROSITE PS01124. Members of the family are about 300 amino acids long and have three main regulatory functions in common: carbon metabolism, stress response, and pathogenesis. Multiple alignments of the proteins of the family define a conserved stretch of 99 amino acids usually located at the C-terminal region of the regulator and connected to a nonconserved region via a linker. The conserved stretch contains all the elements required to bind DNA target sequences and to activate transcription from cognate promoters. Secondary analysis of the conserved region suggests that it contains two potential alpha-helix-turn-alpha-helix DNA binding motifs. The first, and better-fitting motif is supported by biochemical data, whereas existing biochemical data neither support nor refute the proposal that the second region possesses this structure. The phylogenetic relationship suggests that members of the family have recruited the nonconserved domain(s) into a series of existing domains involved in DNA recognition and transcription stimulation and that this recruited domain governs the role that the regulator carries out. For some regulators, it has been demonstrated that the nonconserved region contains the dimerization domain. For the regulators involved in carbon metabolism, the effector binding determinants are also in this region. Most regulators belonging to the AraC/XylS family recognize multiple binding sites in the regulated promoters. One of the motifs usually overlaps or is adjacent to the -35 region of the cognate promoters. Footprinting assays have suggested that these regulators protect a stretch of up to 20 bp in the target promoters, and multiple alignments of binding sites for a number of regulators have shown that the proteins recognize short motifs within the protected region.
The AraC protein, which regulates the L-arabinose operons in Escherichia coli, was dissected into two domains that function in chimeric proteins. One provides a dimerization capability and binds the ligand arabinose, and the other provides a site-specific DNA-binding capability and activates transcription. In vivo and in vitro experiments showed that a fusion protein consisting of the N-terminal half of the AraC protein and the DNA-binding domain of the LexA repressor dimerizes, binds well to a LexA operator, and represses expression of a LexA operator-beta-galactosidase fusion gene in an arabinose-responsive manner. In vivo and in vitro experiments also showed that a fusion protein consisting of the C-terminal half of the AraC protein and the leucine zipper dimerization domain from the C/EBP transcriptional activator binds to araI and activates transcription from a PBAD promoter-beta-galactosidase fusion gene. Dimerization was necessary for occupancy and activation of the wild-type AraC binding site.
At least twenty-seven proteins belong to the XylS/AraC family of prokaryote transcriptional regulators. All members of this family except CelD and TetD are positive transcriptional factors. Three subgroups were distinguished within the family in accordance with the Needleman and Wunsch algorithm. Multiple alignment of these proteins revealed that they shared a high degree of sequence homology at their C-terminal end, where a characteristic conserved motif, whose consensus sequence is I-DIA--GF-S--YF--F---G-TPS--R (where - means any amino acid), was found. Within the homologous C-terminal region, but outside the above consensus motif, a putative DNA-binding domain organized as a helix-turn-helix motif was located in all regulators. For regulators recognizing chemical signals, the non-homologous N-terminal region of these regulators is presumed to contain binding sites for activator molecules that confer specificity.
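The consensus sequence quoted in the abstract above, I-DIA--GF-S--YF--F---G-TPS--R, can be turned into a simple search pattern. The following is only a minimal sketch, not the method used by the cited authors or by SMART (which relies on profile-based domain detection). It shows a strict regular-expression match and a more tolerant scan that counts how many fixed consensus positions a window reproduces. The function names and the toy sequence are hypothetical and exist purely for illustration.

```python
import re

# Consensus from the abstract; '-' stands for any amino acid.
CONSENSUS = "I-DIA--GF-S--YF--F---G-TPS--R"


def consensus_to_regex(consensus=CONSENSUS):
    """Translate the consensus into a regex ('-' becomes the wildcard '.')."""
    return re.compile("".join("." if c == "-" else c for c in consensus))


def best_window(sequence, consensus=CONSENSUS):
    """Slide the consensus along the sequence and return
    (start, matches, n_fixed) for the window that reproduces the most
    fixed (non '-') consensus positions; start is -1 if nothing matches."""
    fixed = [(i, c) for i, c in enumerate(consensus) if c != "-"]
    best_start, best_matches = -1, 0
    for start in range(len(sequence) - len(consensus) + 1):
        matches = sum(sequence[start + i] == c for i, c in fixed)
        if matches > best_matches:
            best_start, best_matches = start, matches
    return best_start, best_matches, len(fixed)


# Hypothetical toy sequence: the consensus with every '-' filled by alanine,
# flanked by a few arbitrary residues. It is NOT a real protein.
toy_hit = CONSENSUS.replace("-", "A")
toy_sequence = "MKT" + toy_hit + "LE"

strict = consensus_to_regex().search(toy_sequence)
print(strict.start())              # offset of the strict match
print(best_window(toy_sequence))   # (start, matched fixed positions, total fixed)

# A single substitution at a fixed position defeats the strict regex,
# but the tolerant scan still locates the motif.
mutated = toy_sequence[:3] + "V" + toy_sequence[4:]
print(consensus_to_regex().search(mutated))
print(best_window(mutated))
```

Real family members rarely match a consensus at every fixed position, which is why the tolerant scan reports a match count rather than a yes/no answer; for actual annotation work a profile or HMM search remains the appropriate tool.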
Metabolism (metabolic pathways involving proteins which contain this domain)
This information is based on a mapping of the SMART genomic protein database to KEGG orthologous groups. Percentages are calculated relative to the number of proteins with an HTH_ARAC domain that could be assigned to a KEGG orthologous group, not relative to all proteins containing the HTH_ARAC domain. Please note that a protein can be included in multiple pathways, i.e. the percentages will not always add up to 100%.
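The arithmetic behind this note can be illustrated with a small sketch. The protein identifiers and pathway names below are invented for illustration and are not taken from SMART or KEGG. The sketch assumes each percentage is the share of assigned proteins that map to a given pathway, as the note describes.

```python
def pathway_percentages(assignments):
    """assignments maps a protein ID to the set of pathways it was assigned to.
    Only proteins that received an assignment are included, so they form
    the denominator for every percentage."""
    total = len(assignments)
    counts = {}
    for pathways in assignments.values():
        for pathway in pathways:
            counts[pathway] = counts.get(pathway, 0) + 1
    return {pathway: 100.0 * n / total for pathway, n in counts.items()}


# Hypothetical data: four assigned proteins, two of them in both pathways.
toy_assignments = {
    "protein_1": {"pathway_A"},
    "protein_2": {"pathway_A", "pathway_B"},
    "protein_3": {"pathway_B"},
    "protein_4": {"pathway_A", "pathway_B"},
}

percentages = pathway_percentages(toy_assignments)
print(percentages)                  # per-pathway share of assigned proteins
print(sum(percentages.values()))    # exceeds 100 because of shared proteins
```

In this toy case each pathway covers three of the four assigned proteins, so the two percentages sum to more than 100.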