Secondary literature sources for FerI
The following references were automatically generated.
- Haynie DT, Xue B
- Superdomains in the protein structure hierarchy: The case of PTP-C2.
- Protein Sci. 2015; 24: 874-82
Superdomain is uniquely defined in this work as a conserved combination of different globular domains in different proteins. The amino acid sequences of 25 structurally and functionally diverse proteins from fungi, plants, and animals have been analyzed in a test of the superdomain hypothesis. Each of the proteins contains a protein tyrosine phosphatase (PTP) domain followed by a C2 domain. Four novel conserved sequence motifs have been identified, one in the PTP domain and three in the C2 domain. All contribute to the PTP-C2 domain interface in PTEN, a tumor suppressor, and all are more conserved than the PTP signature motif, HCX3 (K/R)XR, in the 25 sequences. We show that PTP-C2 was formed prior to the fungi, plant, and animal kingdom divergence. A superdomain as defined here does not fit the usual protein structure classification system. The demonstrated existence of one superdomain suggests the existence of others.
- Nomiyama H, Yoshie O
- Functional roles of evolutionary conserved motifs and residues in vertebrate chemokine receptors.
- J Leukoc Biol. 2015; 97: 39-47
Chemokine receptors regulate cell migration and homing. They belong to the rhodopsin-like family of GPCRs. Their ancestor genes emerged in the early stages of vertebrate evolution. Since then, the family has been greatly expanded through whole and segmental genome duplication events. During evolution, many amino acid changes have been introduced in individual chemokine receptors, but certain motifs and residues are highly conserved. Previously, we proposed a nomenclature system of the vertebrate chemokine receptors based on their evolutionary history and phylogenetic analyses. With the use of this classification system, we are now able to confidently assign the species orthologs of vertebrate chemokine receptors. Here, we systematically analyze conserved motifs and residues of each group of orthologous chemokine receptors that may play important roles in their signaling and biologic functions. Our present analysis may provide useful information on how individual chemokine receptors are activated upon ligand binding.
- Chemes LB, de Prat-Gay G, Sanchez IE
- Convergent evolution and mimicry of protein linear motifs in host-pathogen interactions.
- Curr Opin Struct Biol. 2015; 32: 91-101
Pathogen linear motif mimics are highly evolvable elements that facilitate rewiring of host protein interaction networks. Host linear motifs and pathogen mimics differ in sequence, leading to thermodynamic and structural differences in the resulting protein-protein interactions. Moreover, the functional output of a mimic depends on the motif and domain repertoire of the pathogen protein. Regulatory evolution mediated by linear motifs can be understood by measuring evolutionary rates, quantifying positive and negative selection and performing phylogenetic reconstructions of linear motif natural history. Convergent evolution of linear motif mimics is widespread among unrelated proteins from viral, prokaryotic and eukaryotic pathogens and can also take place within individual protein phylogenies. Statistics, biochemistry and laboratory models of infection link pathogen linear motifs to phenotypic traits such as tropism, virulence and oncogenicity. In vitro evolution experiments and analysis of natural sequences suggest that changes in linear motif composition underlie pathogen adaptation to a changing environment.
- Rao XJ et al.
- Structural features, evolutionary relationships, and transcriptional regulation of C-type lectin-domain proteins in Manduca sexta.
- Insect Biochem Mol Biol. 2015; 62: 75-85
C-type lectins (CTLs) are a large family of Ca(2+)-dependent carbohydrate-binding proteins recognizing various glycoconjugates and functioning primarily in immunity and cell adhesion. We have identified 34 CTLDP (for CTL-domain protein) genes in the Manduca sexta genome, which encode proteins with one to three CTL domains. CTL-S1 through S9 (S for simple) have one or three CTL domains; immulectin-1 through 19 have two CTL domains; CTL-X1 through X6 (X for complex) have one or two CTL domains along with other structural modules. Nine simple CTLs and seventeen immulectins have a signal peptide and are likely extracellular. Five complex CTLs have both an N-terminal signal peptide and a C-terminal transmembrane region, indicating that they are membrane anchored. Immulectins exist broadly in Lepidoptera and lineage-specific gene duplications have generated three clusters of fourteen genes in the M. sexta genome, thirteen of which have similar expression patterns. In contrast to the family expansion, CTL-S1 through S6, S8, and X1 through X6 have 1:1 orthologs in at least four lepidopteran/dipteran/coleopteran species, suggestive of conserved functions in a wide range of holometabolous insects. Structural modeling suggests the key residues for Ca(2+)-dependent or independent binding of certain carbohydrates by CTL domains. Promoter analysis identified putative kappaB motifs in eighteen of the CTL genes, which did not have a strong correlation with immune inducibility in the mRNA or protein levels. Together, the gene identification, sequence comparisons, structure modeling, phylogenetic analysis, and expression profiling establish a solid foundation for future studies of M. sexta CTL-domain proteins.
- Cerrudo CS, Mengual Gomez DL, Gomez DE, Ghiringhelli PD
- Novel insights into the evolution and structural characterization of dyskerin using comprehensive bioinformatics analysis.
- J Proteome Res. 2015; 14: 874-87
Dyskerin is a conserved nucleolar protein. Several related genetic diseases are caused by defects in dyskerin. We hypothesized that having a comprehensive bioinformatic analysis of dyskerin will help to develop new drugs for these diseases. We predicted protein domains and compared sequences and structures to detect the universe of dyskerin-like proteins. We identified conserved features of shared domains in the three superkingdoms. We analyzed the phylogenetic diversity, confirming that there is a strong structural conservation. Also, we studied the relationship of dyskerin-like proteins with other proteins through an integrative protein-protein interaction approach. Most of them are conserved among homologous eukaryotic and archaeal proteins. Our results highlighted the preservation of proteins interacting with dyskerin. We identified conserved dyskerin interactor proteins between the different eukaryotic organisms. Furthermore, we studied the existence of dyskerin-like proteins in different species. Also, we compared and analyzed the secondary structure with the hydrophobic profile, confirming that all have hydrophilic properties highly conserved among proteins. The greatest difference was observed in the NTE and CTE regions. Another aspect studied was the comparison and analysis of tertiary structures. To our knowledge, this is the first time that these analyses were performed in such a comprehensive manner.
- Zhang S et al.
- Nmf9 Encodes a Highly Conserved Protein Important to Neurological Function in Mice and Flies.
- PLoS Genet. 2015; 11: e1005344
Many protein-coding genes identified by genome sequencing remain without functional annotation or biological context. Here we define a novel protein-coding gene, Nmf9, based on a forward genetic screen for neurological function. ENU-induced and genome-edited null mutations in mice produce deficits in vestibular function, fear learning and circadian behavior, which correlated with Nmf9 expression in inner ear, amygdala, and suprachiasmatic nuclei. Homologous genes from unicellular organisms and invertebrate animals predict interactions with small GTPases, but the corresponding domains are absent in mammalian Nmf9. Intriguingly, homozygotes for null mutations in the Drosophila homolog, CG45058, show profound locomotor defects and premature death, while heterozygotes show striking effects on sleep and activity phenotypes. These results link a novel gene orthology group to discrete neurological functions, and show conserved requirement across wide phylogenetic distance and domain level structural changes.
- Kilaparty SP, Singh A, Baltosser WH, Ali N
- Computational analysis reveals a successive adaptation of multiple inositol polyphosphate phosphatase 1 in higher organisms through evolution.
- Evol Bioinform Online. 2014; 10: 239-50
Multiple inositol polyphosphate phosphatase 1 (Minpp1) in higher organisms dephosphorylates InsP6, the most abundant inositol phosphate. It also dephosphorylates less phosphorylated InsP5 and InsP4 and more phosphorylated InsP7 or InsP8. Minpp1 is classified as a member of the histidine acid phosphatase super family of proteins with functional resemblance to phytases found in lower organisms. This study took a bioinformatics approach to explore the extent of evolutionary diversification in Minpp1 structure and function in order to understand its physiological relevance in higher organisms. The human Minpp1 amino acid (AA) sequence was BLAST searched against available national protein databases. Phylogenetic analysis revealed that Minpp1 was widely distributed from lower to higher organisms. Further, we have identified that there exist four isoforms of Minpp1. Multiple computational tools were used to identify key functional motifs and their conservation among various species. Analyses showed that certain motifs predominant in higher organisms were absent in lower organisms. Variation in AA sequences within motifs was also analyzed. We found that there is diversification of key motifs and thus their functions present in Minpp1 from lower organisms to higher organisms. Another interesting result of this analysis was the presence of a glucose-1-phosphate interaction site in Minpp1; the functional significance of which has yet to be determined experimentally. The overall findings of our study point to an evolutionary adaptability of Minpp1 functions from lower to higher life forms.
- Saad II, Saha SB, Thomas G
- The RAS subfamily Evolution - tracing evolution for its utmost exploitation.
- Bioinformation. 2014; 10: 293-8
In the development of multicellularity, signaling proteins have played a very important role. Among them, the RAS family is one of the most widely studied protein families. However, evolutionary analysis has been carried out mainly at the superfamily level, leaving subfamily information scanty. Thus, a subfamily evolutionary study of RAS evolutionary expansion is imperative, as it will aid in better drug design against dreadful diseases like cancer and other developmental diseases. The present study aimed to understand RAS evolution at both a holistic and a reductive level. All human RAS family genes and proteins were subjected to BLAST tools to find orthologs and paralogs with different parameters, followed by phylogenetic tree generation. Our results clearly showed that H-RAS is the most primitive RAS in higher eukaryotes and then diverged into other RAS family members due to different gene modification events. Furthermore, a site-specific selection pressure analysis was carried out using the SELECTON server, which showed that H-RAS, M-RAS and N-RAS are evolving faster than K-RAS and R-RAS. Thus, the results give cancer biologists new ground to exploit negatively selected K-RAS and R-RAS as potent drug targets in cancer therapeutics.
- Desalle R, Chicote JU, Sun TT, Garcia-Espana A
- Generation of divergent uroplakin tetraspanins and their partners during vertebrate evolution: identification of novel uroplakins.
- BMC Evol Biol. 2014; 14: 13
BACKGROUND: The recent availability of sequenced genomes from a broad array of chordates (cephalochordates, urochordates and vertebrates) has allowed us to systematically analyze the evolution of uroplakins: tetraspanins (UPK1a and UPK1b families) and their respective partner proteins (UPK2 and UPK3 families). RESULTS: We report here: (1) the origin of uroplakins in the common ancestor of vertebrates, (2) the appearance of several residues that have statistically significantly positive dN/dS ratios in the duplicated paralogs of uroplakin genes, and (3) the existence of strong coevolutionary relationships between UPK1a/1b tetraspanins and their respective UPK2/UPK3-related partner proteins. Moreover, we report the existence of three new UPK2/3 family members we named UPK2b, 3c and 3d, which will help clarify the evolutionary relationships between fish, amphibian and mammalian uroplakins that may perform divergent functions specific to these different and physiologically distinct groups of vertebrates. CONCLUSIONS: Since our analyses cover species of all major chordate groups this work provides an extremely clear overall picture of how the uroplakin families and their partner proteins have evolved in parallel. We also highlight several novel features of uroplakin evolution including the appearance of UPK2b and 3d in fish and UPK3c in the common ancestor of reptiles and mammals. Additional studies of these novel uroplakins should lead to new insights into uroplakin structure and function.
- Fukasawa Y, Leung RK, Tsui SK, Horton P
- Plus ca change - evolutionary sequence divergence predicts protein subcellular localization signals.
- BMC Genomics. 2014; 15: 46
BACKGROUND: Protein subcellular localization is a central problem in understanding cell biology and has been the focus of intense research. In order to predict localization from amino acid sequence a myriad of features have been tried: including amino acid composition, sequence similarity, the presence of certain motifs or domains, and many others. Surprisingly, sequence conservation of sorting motifs has not yet been employed, despite its extensive use for tasks such as the prediction of transcription factor binding sites. RESULTS: Here, we flip the problem around, and present a proof of concept for the idea that the lack of sequence conservation can be a novel feature for localization prediction. We show that for yeast, mammal and plant datasets, evolutionary sequence divergence alone has significant power to identify sequences with N-terminal sorting sequences. Moreover sequence divergence is nearly as effective when computed on automatically defined ortholog sets as on hand curated ones. Unfortunately, sequence divergence did not necessarily increase classification performance when combined with some traditional sequence features such as amino acid composition. However a post-hoc analysis of the proteins in which sequence divergence changes the prediction yielded some proteins with atypical (i.e. not MPP-cleaved) matrix targeting signals as well as a few misannotations. CONCLUSION: We report the results of the first quantitative study of the effectiveness of evolutionary sequence divergence as a feature for protein subcellular localization prediction. We show that divergence is indeed useful for prediction, but it is not trivial to improve overall accuracy simply by adding this feature to classical sequence features. Nevertheless we argue that sequence divergence is a promising feature and show anecdotal examples in which it succeeds where other features fail.
- Wang QY et al.
- Genome Sequence of Tumebacillus flagellatus GST4, the First Genome Sequence of a Species in the Genus Tumebacillus.
- Genome Announc. 2014; 2: 0-0
We present here the first genome sequence of a species in the genus Tumebacillus. The draft genome sequence of Tumebacillus flagellatus GST4 provides a genetic basis for future studies addressing the origins, evolution, and ecological role of Tumebacillus organisms, as well as a source of acid-resistant amylase-encoding genes for further studies.
- Oulhen N, Onorato TM, Ramos I, Wessel GM
- Dysferlin is essential for endocytosis in the sea star oocyte.
- Dev Biol. 2014; 388: 94-102
Dysferlin is a calcium-binding transmembrane protein involved in membrane fusion and membrane repair. In humans, mutations in the dysferlin gene are associated with muscular dystrophy. In this study, we isolated plasma membrane-enriched fractions from full-grown immature oocytes of the sea star, and identified dysferlin by mass spectrometry analysis. The full-length dysferlin sequence is highly conserved between human and the sea star. We learned that in the sea star Patiria miniata, dysferlin RNA and protein are expressed from oogenesis to gastrulation. Interestingly, the protein is highly enriched in the plasma membrane of oocytes. Injection of a morpholino against dysferlin leads to a decrease of endocytosis in oocytes, and to a developmental arrest during gastrulation. These results suggest that dysferlin is critical for normal endocytosis during oogenesis and for embryogenesis in the sea star and that this animal may be a useful model for studying the relationship of dysferlin structure as it relates to its function.
- Maynard KB, Smith SA, Davis AC, Trivette A, Seipelt-Thiemann RL
- Evolutionary analysis of the mammalian M1 aminopeptidases reveals conserved exon structure and gene death.
- Gene. 2014; 552: 126-32
The members of the M1 aminopeptidase family share conserved domains, yet show functional divergence within the family as a whole. In order to better understand this family, this study analyzed the mammalian members in depth at exon, gene, and protein levels. The twelve human members, eleven rat members, and eleven mouse members were first analyzed in multiple alignments to visualize both reported and unreported conserved domains. Phylogenetic trees were then generated for humans, rats, mice, and all mammals to determine how closely related the homologs were and to gain insight into the divergence in the family members. This produced three groups with similarity within the family. Next, a synteny study was completed to determine the present locations of the genes and changes that had occurred. It became apparent that gene death likely resulted in the lack of one member in mouse and rat. Finally, an in-depth analysis of the exon structure revealed that nine members of the human family and eight in mouse are highly conserved within the exon structure. Taken together, these results indicate that the M1 aminopeptidase family is a divergent family with three subgroups and that genetic evidence mirrors categorization of the family by enzymatic function.
- Khafif M, Cottret L, Balague C, Raffaele S
- Identification and phylogenetic analyses of VASt, an uncharacterized protein domain associated with lipid-binding domains in Eukaryotes.
- BMC Bioinformatics. 2014; 15: 222
BACKGROUND: Several regulators of programmed cell death (PCD) in plants encode proteins with putative lipid-binding domains. Among them, VAD1 is a regulator of PCD propagation harboring a GRAM putative lipid-binding domain. However, the function of VAD1 at the subcellular level is unknown and the domain architecture of VAD1 has not been analyzed in detail. RESULTS: We analyzed sequence conservation across the plant kingdom in the VAD1 protein and identified an uncharacterized VASt (VAD1 Analog of StAR-related lipid transfer) domain. Using profile hidden Markov models (profile HMMs) and phylogenetic analysis we found that this domain is conserved among eukaryotes and generally associates with various lipid-binding domains. Proteins containing both a GRAM and a VASt domain include notably the yeast Ysp2 cell death regulator and numerous uncharacterized proteins. Using structure-based phylogeny, we found that the VASt domain is structurally related to Bet v1-like domains. CONCLUSION: We identified a novel protein domain ubiquitous in Eukaryotic genomes and belonging to the Bet v1-like superfamily. Our findings open perspectives for the functional analysis of VASt-containing proteins and the characterization of novel mechanisms regulating PCD.
- Kaushik S, Sowdhamini R
- Distribution, classification, domain architectures and evolution of prolyl oligopeptidases in prokaryotic lineages.
- BMC Genomics. 2014; 15: 985
BACKGROUND: Prolyl oligopeptidases (POPs) are proteolytic enzymes, widely distributed in all the kingdoms of life. Bacterial POPs are pharmaceutically important enzymes, yet their functional and evolutionary details are not fully explored. Therefore, current analysis is aimed at understanding the distribution, domain architecture, probable biological functions and gene family expansion of POPs in bacterial and archaeal lineages. RESULTS: Exhaustive sequence analysis of 1,202 bacterial and 91 archaeal genomes revealed ~3,000 POP homologs, with only 638 annotated POPs. We observed wide distribution of POPs in all the analysed bacterial lineages. Phylogenetic analysis and co-clustering of POPs of different phyla suggested their common functions in all the prokaryotic species. Further, on the basis of unique sequence motifs we could classify bacterial POPs into eight subtypes. Analysis of coexisting domains in POPs highlighted their involvement in protein-protein interactions and cellular signaling. We proposed significant extension of this gene family by characterizing 39 new POPs and 158 new alpha/beta hydrolase members. CONCLUSIONS: Our study reflects diversity and functional importance of POPs in bacterial species. Many genomes with multiple POPs were identified with high sequence variations and different cellular localizations. Such anomalous distribution of POP genes in different bacterial genomes shows differential expansion of POP gene family primarily by multiple horizontal gene transfer events.
- Redpath GM et al.
- Calpain cleavage within dysferlin exon 40a releases a synaptotagmin-like module for membrane repair.
- Mol Biol Cell. 2014; 25: 3037-48
Dysferlin and calpain are important mediators of the emergency response to repair plasma membrane injury. Our previous research revealed that membrane injury induces cleavage of dysferlin to release a synaptotagmin-like C-terminal module we termed mini-dysferlinC72. Here we show that injury-activated cleavage of dysferlin is mediated by the ubiquitous calpains via a cleavage motif encoded by alternately spliced exon 40a. An exon 40a-specific antibody recognizing cleaved mini-dysferlinC72 intensely labels the circumference of injury sites, supporting a key role for dysferlinExon40a isoforms in membrane repair and consistent with our evidence suggesting that the calpain-cleaved C-terminal module is the form specifically recruited to injury sites. Calpain cleavage of dysferlin is a ubiquitous response to membrane injury in multiple cell lineages and occurs independently of the membrane repair protein MG53. Our study links calpain and dysferlin in the calcium-activated vesicle fusion of membrane repair, placing calpains as upstream mediators of a membrane repair cascade that elicits cleaved dysferlin as an effector. Of importance, we reveal that myoferlin and otoferlin are also cleaved enzymatically to release similar C-terminal modules, bearing two C2 domains and a transmembrane domain. Evolutionary preservation of this feature highlights its functional importance and suggests that this highly conserved C-terminal region of ferlins represents a functionally specialized vesicle fusion module.
- Sula A, Cole AR, Yeats C, Orengo C, Keep NH
- Crystal structures of the human Dysferlin inner DysF domain.
- BMC Struct Biol. 2014; 14: 3
BACKGROUND: Mutations in dysferlin, the first protein linked with the cell membrane repair mechanism, cause a group of muscular dystrophies called dysferlinopathies. Dysferlin is a type two-anchored membrane protein, with a single C terminal trans-membrane helix, and most of the protein lying in cytoplasm. Dysferlin contains several C2 domains and two DysF domains which are nested one inside the other. Many pathogenic point mutations fall in the DysF domain region. RESULTS: We describe the crystal structure of the human dysferlin inner DysF domain with a resolution of 1.9 Angstroms. Most of the pathogenic mutations are part of aromatic/arginine stacks that hold the domain in a folded conformation. The high resolution of the structure shows that these interactions are a mixture of parallel ring/guanidinium stacking, perpendicular H bond stacking and aliphatic chain packing. CONCLUSIONS: The high resolution structure of the Dysferlin DysF domain gives a template on which to interpret in detail the pathogenic mutations that lead to disease.
- Chakraborty C, George Priya Doss C, Sharma R, Sahana S, Nair TS
- Does computational biology help us to understand the molecular phylogenetics and evolution of cluster of differentiation (CD) proteins?
- Protein J. 2013; 32: 143-54
Cluster of differentiation (CD) proteins are a group of proteins of high immunological and medical importance, and some are established therapeutics. These membrane proteins are used to investigate cell surface molecules of blood cells, especially WBC. We selected the fifteen members of greatest medical importance, namely CD2, CD4, CD5, CD6, CD7, CD9, CD14, CD16, CD19, CD22, CD28, CD33, CD36, CD38, and CD44, and performed in silico analysis using algorithm analysis and mathematical models. The results suggest that LEU (L) is well aligned. CD16 is rooted with CD22 and, likewise, CD4 is closely related to CD44. Notably, the highest number of highly conserved amino acids is recorded in CD22. WebLogos were generated up to amino acid position 350, and Met (M) is found to be the tallest logo. Our results should be useful for researchers seeking a fundamental idea of the particular regions of CD proteins that have structural and functional significance related to evolutionary biology.
- Panneton WM
- The mammalian diving response: an enigmatic reflex to preserve life?
- Physiology (Bethesda). 2013; 28: 284-97
The mammalian diving response is a remarkable behavior that overrides basic homeostatic reflexes. It is most studied in large aquatic mammals but is seen in all vertebrates. Pelagic mammals have developed several physiological adaptations to conserve intrinsic oxygen stores, but the apnea, bradycardia, and vasoconstriction are shared with terrestrial mammals and are neurally mediated. The adaptations of aquatic mammals are reviewed here as well as the neural control of cardiorespiratory physiology during diving in rodents.
- Venkatesh B, Ravi V, Lee AP, Warren WC, Brenner S
- Basal vertebrates clarify the evolutionary history of ciliopathy-associated genes Tmem138 and Tmem216.
- Mol Biol Evol. 2013; 30: 62-5
Recently, Lee et al. (Lee JH, Silhavy JL, Lee JE, et al. (30 co-authors). 2012. Evolutionarily assembled cis-regulatory module at a human ciliopathy locus. Science 335:966-969) demonstrated that mutation in either of the transmembrane protein encoding genes, TMEM138 or TMEM216, causes phenotypically indistinguishable ciliopathy. Furthermore, on the basis of the observation that their orthologs are linked in a head-to-tail configuration in other mammals and Anolis, but present on different scaffolds or chromosomes in Xenopus tropicalis and zebrafish, the authors concluded that the two genes were joined by chromosomal rearrangement at the evolutionary amphibian-to-reptile transition to form a functional module. We have sequenced these gene loci in a cartilaginous fish, the elephant shark, and found that the two genes together with a related gene (Tmem80) constitute a tandem cluster. This suggests that the two genes were already linked in the vertebrate ancestor and then rearranged independently in Xenopus and zebrafish. Analyses of the coelacanth and lamprey genomes support this hypothesis. Our study highlights the importance of basal vertebrates as critical reference genomes.
- Mukherjee K, Brocchieri L
- Ancient Origin of Chaperonin Gene Paralogs Involved in Ciliopathies.
- J Phylogenetics Evol Biol. 2013; 1: 0-0
The Bardet-Biedl Syndrome (BBS) is a human developmental disorder that has been associated with fourteen BBS genes affecting the development of cilia. Three BBS genes are distant relatives of chaperonin proteins, a family of chaperones well known for the protein-folding role of their double-ringed complexes. Chaperonin-like BBS genes were originally thought to be vertebrate-specific, but related genes from different metazoan species have been identified as chaperonin-like BBS genes based on sequence similarity. Our phylogenetic analyses confirmed the classification of these genes in the chaperonin-like BBS gene family, and set the origin of the gene family earlier than the time of separation of Bilateria, Cnidaria, and Placozoa. By extensive searches of chaperonin-like genes in complete genomes representing several eukaryotic lineages, we discovered the presence of chaperonin-like BBS genes also in the genomes of Phytophthora and Pythium, belonging to the group of Oomycetes. This finding suggests that the chaperonin-like BBS gene family had already evolved before the origin of Metazoa, as early in eukaryote evolution as before separation of the lineages of Unikonts and Chromalveolates. The analysis of coding sequences indicated that chaperonin-like BBS proteins have evolved in all lineages under constraining selection. Furthermore, analysis of the predicted structural features suggested that, despite their high rate of divergence, chaperonin-like BBS proteins mostly conserve a typical chaperonin-like three-dimensional structure, but question their ability to assemble and function as chaperonin-like double-ringed complexes.
- Cazares-Garcia SV, Vazquez-Garciduenas S, Vazquez-Marrufo G
- Structural and phylogenetic analysis of laccases from Trichoderma: a bioinformatic approach.
- PLoS One. 2013; 8: e55295
The genus Trichoderma includes species of great biotechnological value, both for their mycoparasitic activities and for their ability to produce extracellular hydrolytic enzymes. Although activity of extracellular laccase has previously been reported in Trichoderma spp., the possible number of isoenzymes is still unknown, as are the structural and functional characteristics of both the genes and the putative proteins. In this study, the system of laccases sensu stricto in the Trichoderma species, the genomes of which are publicly available, were analyzed using bioinformatic tools. The intron/exon structure of the genes and the identification of specific motifs in the sequence of amino acids of the proteins generated in silico allow for clear differentiation between extracellular and intracellular enzymes. Phylogenetic analysis suggests that the common ancestor of the genus possessed a functional gene for each one of these enzymes, which is a characteristic preserved in T. atroviride and T. virens. This analysis also reveals that T. harzianum and T. reesei only retained the intracellular activity, whereas T. asperellum added an extracellular isoenzyme acquired through horizontal gene transfer during the mycoparasitic process. The evolutionary analysis shows that in general, extracellular laccases are subjected to purifying selection, and intracellular laccases show neutral evolution. The data provided by the present study will enable the generation of experimental approximations to better understand the physiological role of laccases in the genus Trichoderma and to increase their biotechnological potential.
- Doerrler WT, Sikdar R, Kumar S, Boughner LA
- New functions for the ancient DedA membrane protein family.
- J Bacteriol. 2013; 195: 3-11
The DedA protein family is a highly conserved and ancient family of membrane proteins with representatives in most sequenced genomes, including those of bacteria, archaea, and eukarya. The functions of the DedA family proteins remain obscure. However, recent genetic approaches have revealed important roles for certain bacterial DedA family members in membrane homeostasis. Bacterial DedA family mutants display such intriguing phenotypes as cell division defects, temperature sensitivity, altered membrane lipid composition, elevated envelope-related stress responses, and loss of proton motive force. The DedA family is also essential in at least two species of bacteria: Borrelia burgdorferi and Escherichia coli. Here, we describe the phylogenetic distribution of the family and summarize recent progress toward understanding the functions of the DedA membrane protein family.
- Krajacic P, Pistilli EE, Tanis JE, Khurana TS, Lamitina ST
- FER-1/Dysferlin promotes cholinergic signaling at the neuromuscular junction in C. elegans and mice.
- Biol Open. 2013; 2: 1245-52
Dysferlin is a member of the evolutionarily conserved ferlin gene family. Mutations in Dysferlin lead to Limb Girdle Muscular Dystrophy 2B (LGMD2B), an inherited, progressive and incurable muscle disorder. However, the molecular mechanisms underlying disease pathogenesis are not fully understood. We found that both loss-of-function mutations and muscle-specific overexpression of C. elegans fer-1, the founding member of the Dysferlin gene family, caused defects in muscle cholinergic signaling. To determine if Dysferlin-dependent regulation of cholinergic signaling is evolutionarily conserved, we examined the in vivo physiological properties of skeletal muscle synaptic signaling in a mouse model of Dysferlin-deficiency. In addition to a loss in muscle strength, Dysferlin -/- mice also exhibited a cholinergic deficit manifested by a progressive, frequency-dependent decrement in their compound muscle action potentials following repetitive nerve stimulation, which was observed in another Dysferlin mouse model but not in a Dysferlin-independent mouse model of muscular dystrophy. Oral administration of Pyridostigmine bromide, a clinically used acetylcholinesterase inhibitor (AchE.I) known to increase synaptic efficacy, reversed the action potential defect and restored in vivo muscle strength to Dysferlin -/- mice without altering muscle pathophysiology. Our data demonstrate a previously unappreciated role for Dysferlin in the regulation of cholinergic signaling and suggest that such regulation may play a significant pathophysiological role in LGMD2B disease.
- Marty NJ, Holman CL, Abdullah N, Johnson CP
- The C2 domains of otoferlin, dysferlin, and myoferlin alter the packing of lipid bilayers.
- Biochemistry. 2013; 52: 5585-92
Ferlins are large multi-C2 domain membrane proteins involved in membrane fusion and fission events. In this study, we investigate the effects of binding of the C2 domains of otoferlin, dysferlin, and myoferlin on the structure of lipid bilayers. Fluorescence measurements indicate that multi-C2 domain constructs of myoferlin, dysferlin, and otoferlin change the lipid packing of both small unilamellar vesicles and giant plasma membrane vesicles. The activities of these proteins were enhanced in the presence of calcium and required negatively charged lipids like phosphatidylserine or phosphatidylglycerol for activity. Experiments with individual domains uncovered functional differences between the C2A domain of otoferlin and those of dysferlin and myoferlin, and truncation studies suggest that the effects of each subsequent C2 domain on lipid ordering appear to be additive. Finally, we demonstrate that the activities of these proteins on membranes are insensitive to high salt concentrations, suggesting a nonelectrostatic component to the interaction between ferlin C2 domains and lipid bilayers. Together, the data indicate that dysferlin, otoferlin, and myoferlin do not merely passively adsorb to membranes but actively sculpt lipid bilayers, which would result in highly curved or distorted membrane regions that could facilitate membrane fusion, membrane fission, or recruitment of other membrane-trafficking proteins.
- Johnson G, Moore SW
- The Leu-Arg-Glu (LRE) adhesion motif in proteins of the neuromuscular junction with special reference to proteins of the carboxylesterase/cholinesterase family.
- Comp Biochem Physiol Part D Genomics Proteomics. 2013; 8: 231-43
Short linear motifs confer evolutionary flexibility on proteins as they can be added with relative ease allowing the acquisition of new functions. Such motifs may mediate a variety of signalling functions. The adhesion-mediating Leu-Arg-Glu (LRE) motif is enriched in laminin beta 2, and has been observed in other proteins, including members of the carboxylesterase/cholinesterase family. It acts as a stop signal for growing axons in the developing neuromuscular junction, binding to the voltage-gated calcium channel. In this bioinformatic analysis, we have investigated the presence of the motif in proteins of the neuromuscular junction, and have also examined its structural position and potential for ligand interaction, as well as phylogenetic conservation, in the carboxylesterase/cholinesterase family. The motif was observed to occur with a significantly higher frequency than expected in the UniProt/Swiss-Prot database, as well as in four individual species (human, mouse, Caenorhabditis elegans and Drosophila melanogaster). Examination of its presence in neuromuscular junction proteins showed it to be enriched in certain proteins of the synaptic basement membrane, including laminin, agrin, acetylcholinesterase and tenascin. A highly significant enrichment was observed in cytoskeletal proteins, particularly intermediate filament proteins and members of the spectrin family. In the carboxylesterase/cholinesterase family, the motif was observed in four conserved positions in the protein structure. It is present in the majority of mammalian acetylcholinesterases, as well as acetylcholinesterases from electric fish and a number of invertebrates. In insects, it is present in the ace-2, rather than in the synaptic ace-1, enzyme. It is also observed in the cholinesterase-like adhesion molecules (neuroligins, neurotactin and glutactin). It is never seen in butyrylcholinesterases, which do not mediate cell adhesion. 
In conclusion, the significant enrichment of the motif in certain classes of protein, as well as its conserved presence and structural positioning in one protein family, suggests that it has specific functions both in cell adhesion in the neuromuscular junction and in maintaining the structural integrity of the cytoskeleton.
- Costantini S, Sharma A, Raucci R, Costantini M, Autiero I, Colonna G
- Genealogy of an ancient protein family: the Sirtuins, a family of disordered members.
- BMC Evol Biol. 2013; 13: 60
BACKGROUND: Sirtuin genes are widely distributed across evolution and have been found in eubacteria, archaea and eukaryotes. While prokaryotic and archaeal species usually have one or two sirtuin homologs, in humans, as in other eukaryotes, we found multiple versions; in mammals this family comprises seven different homologous proteins, all of which are NAD-dependent de-acylases. 3D structures of human SIRT2, SIRT3, and SIRT5 revealed the overall conformation of the conserved core domain but provided no structural information about the presence of very flexible and dynamically disordered regions, the role of which is still structurally and functionally unclear. Recently, we modeled the 3D structure of human SIRT1, the most studied member of this family, which unexpectedly emerged as a member of the intrinsically disordered proteins with its long disordered terminal arms. Despite clear similarities in catalytic cores between the human sirtuins, little is known of the general structural characteristics of these proteins. The presence of disorder in human SIRT1 and the propensity of these proteins to promote molecular interactions make it important to understand the underlying mechanisms of molecular recognition, which reasonably should involve terminal segments. The mechanism of recognition, in turn, is a prerequisite for the understanding of any functional activity. The aim of this work is to understand what structural properties are shared among members of this family in humans as well as in other organisms. RESULTS: We have studied the distribution of the structural features of N- and C-terminal segments of sirtuins in all known organisms to draw their evolutionary histories by taking into account average length of terminal segments, amino acid composition, intrinsic disorder, presence of charged stretches, presence of putative phosphorylation sites, flexibility, and GC content of genes.
Finally, we have carried out a comprehensive analysis of the putative phosphorylation sites in human sirtuins, confirming those sites already known experimentally for human SIRT1 and SIRT2 as well as extending their topology to the whole family to gain insight into their physiological functions and cellular localization. CONCLUSIONS: Our results highlight that the terminal segments of the majority of sirtuins possess a number of structural features and chemical and physical properties that strongly support their involvement in activities of recognition and interaction with other protein molecules. We also suggest how multisite phosphorylation provides a possible mechanism by which flexible and intrinsically disordered segments of a sirtuin, supported by the presence of positively or negatively charged stretches, might enhance the strength and specificity of interaction with a particular molecular partner.
- Lenart A, Dudkiewicz M, Grynberg M, Pawlowski K
- CLCAs - a family of metalloproteases of intriguing phylogenetic distribution and with cases of substituted catalytic sites.
- PLoS One. 2013; 8: e62272
The zinc-dependent metalloproteases with the His-Glu-x-x-His (HExxH) active site motif, zincins, are a broad group of proteins involved in many metabolic and regulatory functions, and found in all forms of life. The human genome contains more than 100 genes encoding proteins with known zincin-like domains. A survey of all proteins containing the HExxH motif shows that approximately 52% of HExxH occurrences fall within known protein structural domains (as defined in the Pfam database). Domain families with a majority of members possessing a conserved HExxH motif include, not surprisingly, many known and putative metalloproteases. Furthermore, several HExxH-containing protein domains thus identified can be confidently predicted to be putative peptidases of zincin fold. Thus, we predict a zincin-like fold for eight uncharacterised Pfam families. Besides the domains with the HExxH motif strictly conserved, and those with sporadic occurrences, intermediate families are identified that contain some members with a conserved HExxH motif, but also many homologues with substitutions at the conserved positions. Such substitutions can be evolutionarily conserved and non-random, yet functional roles of these inactive zincins are not known. The CLCAs are a novel zincin-like protease family with many cases of substituted active sites. We show that this allegedly metazoan family has a number of bacterial and archaeal members. An extremely patchy phylogenetic distribution of CLCAs in prokaryotes and their conserved protein domain composition strongly suggest an evolutionary scenario of horizontal gene transfer (HGT) from multicellular eukaryotes to bacteria, providing an example of eukaryote-derived xenologues in bacterial genomes. Additionally, in a protein family identified here as closely homologous to CLCA, the CLCA_X (CLCA-like) family, a number of proteins are found in phages and plasmids, supporting the HGT scenario.
- Turtoi A et al.
- Myoferlin is a key regulator of EGFR activity in breast cancer.
- Cancer Res. 2013; 73: 5438-48
Myoferlin is a member of the ferlin family of proteins that participate in plasma membrane fusion, repair, and endocytosis. While some reports have implicated myoferlin in cancer, the extent of its expression in and contributions to cancer are not well established. In this study, we show that myoferlin is overexpressed in human breast cancers and that it has a critical role in controlling degradation of the epidermal growth factor (EGF) receptor (EGFR) after its activation and internalization in breast cancer cells. Myoferlin depletion blocked EGF-induced cell migration and epithelial-to-mesenchymal transition. Both effects were induced as a result of impaired degradation of phosphorylated EGFR via dysfunctional plasma membrane caveolae and alteration of caveolin homo-oligomerization. In parallel, myoferlin depletion reduced tumor development in a chicken chorioallantoic membrane xenograft model of human breast cancer. Considering the therapeutic significance of EGFR targeting, our findings identify myoferlin as a novel candidate function to target for future drug development.
- Zhang YB et al.
- Identification of a novel Gig2 gene family specific to non-amniote vertebrates.
- PLoS One. 2013; 8: e60588
Gig2 (grass carp reovirus (GCRV)-induced gene 2) was first identified as a novel fish interferon (IFN)-stimulated gene (ISG). Overexpression of a zebrafish Gig2 gene can protect cultured fish cells from virus infection. In the present study, we identify a novel gene family that is comprised of genes homologous to the previously characterized Gig2. EST/GSS search and in silico cloning identify 190 Gig2 homologous genes in 51 vertebrate species ranging from lampreys to amphibians. A further large-scale search of vertebrate and invertebrate genome databases indicates that the Gig2 gene family is specific to non-amniotes including lampreys, sharks/rays, ray-finned fishes and amphibians. Phylogenetic analysis and synteny analysis reveal lineage-specific expansion of the Gig2 gene family and also provide valuable evidence for the fish-specific genome duplication (FSGD) hypothesis. Although Gig2 family proteins exhibit no significant sequence similarity to any known proteins, a typical Gig2 protein appears to consist of two conserved parts: an N-terminus that bears very low homology to the catalytic domains of poly(ADP-ribose) polymerases (PARPs), and a novel C-terminal domain that is unique to this gene family. Expression profiling of zebrafish Gig2 family genes shows that some duplicate pairs have diverged in function via acquisition of novel spatial and/or temporal expression under stresses. The specificity of this gene family to non-amniotes might contribute to a large extent to distinct physiology in non-amniote vertebrates.
- Wang Z, Zarlenga D, Martin J, Abubucker S, Mitreva M
- Exploring metazoan evolution through dynamic and holistic changes in protein families and domains.
- BMC Evol Biol. 2012; 12: 138
BACKGROUND: Proteins convey the majority of biochemical and cellular activities in organisms. Over the course of evolution, proteins undergo normal sequence mutations as well as large scale mutations involving domain duplication and/or domain shuffling. These events result in the generation of new proteins and protein families. Processes that affect proteome evolution drive species diversity and adaptation. Herein, change over the course of metazoan evolution, as defined by birth/death and duplication/deletion events within protein families and domains, was examined using the proteomes of 9 metazoan and two outgroup species. RESULTS: In studying members of the three major metazoan groups, the vertebrates, arthropods, and nematodes, we found that the number of protein families increased at the majority of lineages over the course of metazoan evolution where the magnitude of these increases was greatest at the lineages leading to mammals. In contrast, the number of protein domains decreased at most lineages and at all terminal lineages. This resulted in a weak correlation between protein family birth and domain birth; however, the correlation between domain birth and domain member duplication was quite strong. These data suggest that domain birth and protein family birth occur via different mechanisms, and that domain shuffling plays a role in the formation of protein families. The ratio of protein family birth to protein domain birth (domain shuffling index) suggests that shuffling had a more demonstrable effect on protein families in nematodes and arthropods than in vertebrates. Through the contrast of high and low domain shuffling indices at the lineages of Trichinella spiralis and Gallus gallus, we propose a link between protein redundancy and evolutionary changes controlled by domain shuffling; however, the speed of adaptation among the different lineages was relatively invariant. 
Evaluating the functions of protein families that appeared or disappeared at the last common ancestors (LCAs) of the three metazoan clades supports a correlation with organism adaptation. Furthermore, bursts of new protein families and domains in the LCAs of metazoans and vertebrates are consistent with whole genome duplications. CONCLUSION: Metazoan speciation and adaptation were explored by birth/death and duplication/deletion events among protein families and domains. Our results provide insights into protein evolution and its bearing on metazoan evolution.
- Lawson CB et al.
- The salmonid myostatin gene family: a novel model for investigating mechanisms that influence duplicate gene fate.
- BMC Evol Biol. 2012; 12: 202
BACKGROUND: Most fishes possess two paralogs for myostatin, a muscle growth inhibitor, while salmonids are presumed to have four: mstn1a, mstn1b, mstn2a and mstn2b, a pseudogene. The mechanisms responsible for preserving these duplicates as well as the depth of mstn2b nonfunctionalization within the family remain unknown. We therefore characterized several genomic clones in order to better define species and gene phylogenies. RESULTS: Gene organization and sequence conservation were particularly evident among paralog groupings and within salmonid subfamilies. All mstn2b sequences included in-frame stop codons, confirming its nonfunctionalization across taxa, although the indels and polymorphisms responsible often differed. For example, the specific indels within the Oncorhynchus tshawytscha and O. nerka genes were remarkably similar and differed equally from other mstn2b orthologs. A phylogenetic analysis weakly established a mstn2b clade including only these species, which coupled with a shared 51 base pair deletion might suggest a history involving hybridization or a shared phylogenetic history. Furthermore, mstn2 introns all lacked conserved splice site motifs, suggesting that the tissue-specific processing of mstn2a transcripts, but not those of mstn2b, is due to alternative cis regulation and is likely a common feature in salmonids. It also suggests that limited transcript processing may have contributed to mstn2b nonfunctionalization. CONCLUSIONS: Previous studies revealed divergence within gene promoters while the current studies provide evidence for relaxed or positive selection in some coding sequence lineages. These results together suggest that the salmonid myostatin gene family is a novel resource for investigating mechanisms that regulate duplicate gene fate as paralog specific differences in gene expression, transcript processing and protein structure are all suggestive of active divergence.
- Hickford D, Frankenberg S, Shaw G, Renfree MB
- Evolution of vertebrate interferon inducible transmembrane proteins.
- BMC Genomics. 2012; 13: 155
BACKGROUND: Interferon inducible transmembrane proteins (IFITMs) have diverse roles, including the control of cell proliferation, promotion of homotypic cell adhesion, protection against viral infection, promotion of bone matrix maturation and mineralisation, and mediating germ cell development. Most IFITMs have been well characterised in human and mouse but little published data exists for other animals. This study characterised IFITMs in two distantly related marsupial species, the Australian tammar wallaby and the South American grey short-tailed opossum, and analysed the phylogeny of the IFITM family in vertebrates. RESULTS: Five IFITM paralogues were identified in both the tammar and opossum. As in eutherians, most marsupial IFITM genes exist within a cluster, contain two exons and encode proteins with two transmembrane domains. Only two IFITM genes, IFITM5 and IFITM10, have orthologues in both marsupials and eutherians. IFITM5 arose in bony fish and IFITM10 in tetrapods. The bone-specific expression of IFITM5 appears to be restricted to therian mammals, suggesting that its specialised role in bone production is a recent adaptation specific to mammals. IFITM10 is the most highly conserved IFITM, sharing at least 85% amino acid identity between birds, reptiles and mammals and suggesting an important role for this presently uncharacterised protein. CONCLUSIONS: Like eutherians, marsupials also have multiple IFITM genes that exist in a gene cluster. The differing expression patterns for many of the paralogues, together with poor sequence conservation between species, suggests that IFITM genes have acquired many different roles during vertebrate evolution.
- Leclere L, Rentzsch F
- Repeated evolution of identical domain architecture in metazoan netrin domain-containing proteins.
- Genome Biol Evol. 2012; 4: 883-99
The majority of proteins in eukaryotes are composed of multiple domains, and the number and order of these domains is an important determinant of protein function. Although multidomain proteins with a particular domain architecture were initially considered to have a common evolutionary origin, recent comparative studies of protein families or whole genomes have reported that a minority of multidomain proteins could have appeared multiple times independently. Here, we test this scenario in detail for the signaling molecules netrin and secreted frizzled-related proteins (sFRPs), two groups of netrin domain-containing proteins with essential roles in animal development. Our primary phylogenetic analyses suggest that the particular domain architectures of each of these proteins were present in the eumetazoan ancestor and evolved a second time independently within the metazoan lineage from laminin and frizzled proteins, respectively. Using an array of phylogenetic methods, statistical tests, and character sorting analyses, we show that the polyphyly of netrin and sFRP is well supported and cannot be explained by classical phylogenetic reconstruction artifacts. Despite their independent origins, the two groups of netrins and of sFRPs have the same protein interaction partners (Deleted in Colorectal Cancer/neogenin and Unc5 for netrins and Wnts for sFRPs) and similar developmental functions. Thus, these cases of convergent evolution emphasize the importance of domain architecture for protein function by uncoupling shared domain architecture from shared evolutionary history. Therefore, we propose the term merology to describe the repeated evolution of proteins with similar domain architecture and discuss the potential of merologous proteins to help in understanding protein evolution.
- Rybarczyk-Mydlowska K et al.
- Rather than by direct acquisition via lateral gene transfer, GHF5 cellulases were passed on from early Pratylenchidae to root-knot and cyst nematodes.
- BMC Evol Biol. 2012; 12: 221
BACKGROUND: Plant parasitic nematodes are unusual metazoans as they are equipped with genes that allow for symbiont-independent degradation of plant cell walls. Among the cell wall-degrading enzymes, glycoside hydrolase family 5 (GHF5) cellulases are relatively well characterized, especially for high impact parasites such as root-knot and cyst nematodes. Interestingly, ancestors of extant nematodes most likely acquired these GHF5 cellulases from a prokaryote donor by one or multiple lateral gene transfer events. To obtain insight into the origin of GHF5 cellulases among evolutionarily advanced members of the order Tylenchida, cellulase biodiversity data from less distal family members were collected and analyzed. RESULTS: Single nematodes were used to obtain (partial) genomic sequences of cellulases from representatives of the genera Meloidogyne, Pratylenchus, Hirschmanniella and Globodera. Combined Bayesian analysis of approximately 100 cellulase sequences revealed three types of catalytic domains (A, B, and C). Represented by 84 sequences, type B is numerically dominant, and the overall topology of the catalytic domain type shows remarkable resemblance with trees based on neutral (= pathogenicity-unrelated) small subunit ribosomal DNA sequences. Bayesian analysis further suggested a sister relationship between the lesion nematode Pratylenchus thornei and all type B cellulases from root-knot nematodes. Yet, the relationship between the three catalytic domain types remained unclear. Superposition of intron data onto the cellulase tree suggests that types B and C are related, and together distinct from type A that is characterized by two unique introns. CONCLUSIONS: All Tylenchida members investigated here harbored one or multiple GHF5 cellulases. Three types of catalytic domains are distinguished, and the presence of at least two types is relatively common among plant parasitic Tylenchida.
Analysis of coding sequences of cellulases suggests that root-knot and cyst nematodes did not acquire this gene directly by lateral gene transfer. More likely, these genes were passed on by ancestors of a family nowadays known as the Pratylenchidae.
- Chemes LB, Glavina J, Faivovich J, de Prat-Gay G, Sanchez IE
- Evolution of linear motifs within the papillomavirus E7 oncoprotein.
- J Mol Biol. 2012; 422: 336-46
Many protein functions can be traced to linear sequence motifs of less than five residues, which are often found within intrinsically disordered domains. In spite of their prevalence, their role in protein evolution is only beginning to be understood. The study of papillomaviruses has provided many insights on the evolution of protein structure and function. We have chosen the papillomavirus E7 oncoprotein as a model system for the evolution of functional linear motifs. The multiple functions of E7 proteins from paradigmatic papillomavirus types can be explained to a large extent in terms of five linear motifs within the intrinsically disordered N-terminal domain and two linear motifs within the globular homodimeric C-terminal domain. We examined the motif inventory of E7 proteins from over 200 known papillomavirus types and found that the motifs reported for paradigmatic papillomavirus types are absent from many uncharacterized E7 proteins. Several motif pairs occur more often than expected, suggesting that linear motifs may evolve and function in a cooperative manner. The E7 linear motifs have appeared or disappeared multiple times during papillomavirus evolution, confirming the evolutionary plasticity of short functional sequences. Four of the motifs appeared several times during papillomavirus evolution, providing direct evidence for convergent evolution. Interestingly, the evolution pattern of a motif is independent of its location in a globular or disordered domain. The correlation between the presence of some motifs and virus host specificity and tissue tropism suggests that linear motifs play a role in the adaptive evolution of papillomaviruses.
- Fortunato S et al.
- Genome-wide analysis of the sox family in the calcareous sponge Sycon ciliatum: multiple genes with unique expression patterns.
- Evodevo. 2012; 3: 14
BACKGROUND: Sox genes are HMG-domain containing transcription factors with important roles in developmental processes in animals; many of them appear to have conserved functions among eumetazoans. Demosponges have fewer Sox genes than eumetazoans, but their roles remain unclear. The aim of this study is to gain insight into the early evolutionary history of the Sox gene family by identification and expression analysis of Sox genes in the calcareous sponge Sycon ciliatum. METHODS: Calcaronean Sox-related sequences were retrieved by searching recently generated genomic and transcriptome sequence resources and analyzed using a variety of phylogenetic methods and identification of conserved motifs. Expression was studied by whole mount in situ hybridization. RESULTS: We have identified seven Sox genes and four Sox-related genes in the complete genome of Sycon ciliatum. Phylogenetic and conserved motif analyses showed that five of the Sycon Sox genes represent groups B, C, E, and F present in cnidarians and bilaterians. Two additional genes are classified as Sox genes but cannot be assigned to specific subfamilies, and four genes are more similar to Sox genes than to other HMG-containing genes. Thus, the repertoire of Sox genes is larger in this representative of calcareous sponges than in the demosponge Amphimedon queenslandica. It remains unclear whether this is due to the expansion of the gene family in Sycon or a secondary reduction in the Amphimedon genome. In situ hybridization of Sycon Sox genes revealed a variety of expression patterns during embryogenesis and in specific cell types of adult sponges. CONCLUSIONS: In this study, we describe a large family of Sox genes in Sycon ciliatum with dynamic expression patterns, indicating that Sox genes are regulators in development and cell type determination in sponges, as observed in higher animals.
The differences revealed between the demosponge and calcisponge Sox gene repertoires highlight the need to utilize models representing different sponge lineages to describe sponge development, a prerequisite for deciphering the evolution of metazoan developmental mechanisms.
- Pei J, Grishin NV
- Unexpected diversity in Shisa-like proteins suggests the importance of their roles as transmembrane adaptors.
- Cell Signal. 2012; 24: 758-69
The Shisa family of single-transmembrane proteins is characterized by an N-terminal cysteine-rich domain and a proline-rich C-terminal region. Its founding member, Xenopus Shisa, promotes head development by antagonizing Wnt and FGF signaling. Recently, a mouse brain-specific Shisa protein CKAMP44 (Shisa9) was shown to play an important role in AMPA receptor desensitization. We used sequence similarity searches against protein, genome and EST databases to study the evolutionary origin and phylogenetic distribution of Shisa homologs. In addition to nine Shisa subfamilies in vertebrates, we detected distantly related Shisa homologs that possess an N-terminal domain with six conserved cysteines. These Shisa-like proteins include FAM159 and KIAA1644 mainly from vertebrates, and members from various bilaterian invertebrates and Porifera, suggesting their presence in the last common ancestor of Metazoa. Shisa-like genes have undergone large expansions in Branchiostoma floridae and Saccoglossus kowalevskii, and appear to have been lost in certain insects. Pattern-based searches against eukaryotic proteomes also uncovered several other families of predicted single-transmembrane proteins with a similar cysteine-rich domain. We refer to these proteins (Shisa/Shisa-like, WBP1/VOPP1, CX, DUF2650, TMEM92, and CYYR1) as STMC6 proteins (single-transmembrane proteins with conserved 6 cysteines). STMC6 genes are widespread in Metazoa, with the human genome containing 17 members. Frequent occurrences of PY motifs in STMC6 proteins suggest that most of them could interact with WW-domain-containing proteins, such as the NEDD4 family E3 ubiquitin ligases, and could play critical roles in protein degradation and sorting. STMC6 proteins are likely transmembrane adaptors that regulate membrane proteins such as cell surface receptors.
- Sallman Almen M, Bringeland N, Fredriksson R, Schioth HB
- The dispanins: a novel gene family of ancient origin that contains 14 human members.
- PLoS One. 2012; 7: e31961
The Interferon induced transmembrane proteins (IFITM) are a family of transmembrane proteins that is known to inhibit cell invasion of viruses such as HIV-1 and influenza. We show that the IFITM genes are a subfamily in a larger family of transmembrane (TM) proteins that we call Dispanins, which refers to a common 2TM structure. We mined the Dispanins in 36 eukaryotic species, covering all major eukaryotic groups, and investigated their evolutionary history using Bayesian and maximum likelihood approaches to infer a phylogenetic tree. We identified ten human genes that together with the known IFITM genes form the Dispanin family. We show that the Dispanins first emerged in eukaryotes in a common ancestor of choanoflagellates and metazoa, and that the family later expanded in vertebrates where it forms four subfamilies (A-D). Interestingly, we also find that the family is found in several different phyla of bacteria and propose that it was horizontally transferred to eukaryotes from bacteria in the common ancestor of choanoflagellates and metazoa. The bacterial and eukaryotic sequences have a considerably conserved protein structure. In conclusion, we introduce a novel family, the Dispanins, together with a nomenclature based on the evolutionary origin.
- Dayraud C et al.
- Independent specialisation of myosin II paralogues in muscle vs. non-muscle functions during early animal evolution: a ctenophore perspective.
- BMC Evol Biol. 2012; 12: 107
BACKGROUND: Myosin II (or Myosin Heavy Chain II, MHCII) is a family of molecular motors involved in the contractile activity of animal muscle cells but also in various other cellular processes in non-muscle cells. Previous phylogenetic analyses of bilaterian MHCII genes identified two main clades associated respectively with smooth/non-muscle cells (MHCIIa) and striated muscle cells (MHCIIb). Muscle cells are generally thought to have originated only once in ancient animal history, and decisive insights about their early evolution are expected to come from expression studies of Myosin II genes in the two non-bilaterian phyla that possess muscles, the Cnidaria and Ctenophora. RESULTS: We have uncovered three MHCII paralogues in the ctenophore species Pleurobrachia pileus. Phylogenetic analyses indicate that the MHCIIa / MHCIIb duplication is more ancient than the divergence between extant metazoan lineages. The ctenophore MHCIIa gene (PpiMHCIIa) has an expression pattern akin to that of "stem cell markers" (Piwi, Vasa...) and is expressed in proliferating cells. We identified two MHCIIb genes that originated from a ctenophore-specific duplication. PpiMHCIIb1 represents the exclusively muscular form of myosin II in ctenophore, while PpiMHCIIb2 is expressed in non-muscle cells of various types. In parallel, our phalloidin staining and TEM observations highlight the structural complexity of ctenophore musculature and emphasize the experimental interest of the ctenophore tentacle root, in which myogenesis is spatially ordered and strikingly similar to striated muscle formation in vertebrates. CONCLUSION: MHCIIa expression in putative stem cells/proliferating cells probably represents an ancestral trait, while specific involvement of some MHCIIa genes in smooth muscle fibres is a uniquely derived feature of the vertebrates. 
That one ctenophore MHCIIb paralogue (PpiMHCIIb2) has retained MHCIIa-like expression features furthermore suggests that muscular expression of the other paralogue, PpiMHCIIb1, was the result of neofunctionalisation within the ctenophore lineage, making independent origin of ctenophore muscle cells a likely option.
- Caputi L, Malnoy M, Goremykin V, Nikiforova S, Martens S
- A genome-wide phylogenetic reconstruction of family 1 UDP-glycosyltransferases revealed the expansion of the family during the adaptation of plants to life on land.
- Plant J. 2012; 69: 1030-42
For almost a decade, our knowledge on the organisation of the family 1 UDP-glycosyltransferases (UGTs) has been limited to the model plant A. thaliana. The availability of other plant genomes represents an opportunity to obtain a broader view of the family in terms of evolution and organisation. Family 1 UGTs are known to glycosylate several classes of plant secondary metabolites. A phylogeny reconstruction study was performed to get an insight into the evolution of this multigene family during the adaptation of plants to life on land. The organisation of the UGTs in the different organisms was also investigated. More than 1500 putative UGTs were identified in 12 fully sequenced and assembled plant genomes based on the highly conserved PSPG motif. Analyses by maximum likelihood (ML) method were performed to reconstruct the phylogenetic relationships existing between the sequences. The results of this study clearly show that the UGT family expanded during the transition from algae to vascular plants and that in higher plants the clustering of UGTs into phylogenetic groups appears to be conserved, although gene loss and gene gain events seem to have occurred in certain lineages. Interestingly, two new phylogenetic groups, named O and P, that are not present in A. thaliana were discovered.
- Wu J, Li H, Zhang S
- Regulator of complement activation (RCA) group 2 gene cluster in zebrafish: identification, expression, and evolution.
- Funct Integr Genomics. 2012; 12: 367-77
The activation of the complement system is tightly regulated by a group of plasma and cell membrane-associated proteins for host cell protection. In humans, these regulatory protein genes are clustered in a region named the regulator of complement activation (RCA) gene locus and can be categorized into two groups. The group 1 gene cluster has been reported in zebrafish, but information regarding the RCA locus remains scarce in fish. Here we identified two closely linked RCA group 2 genes in zebrafish, ZRC1 and ZRC2, which had all the features characteristic of known RCA group 2 genes. Both ZRC1 and ZRC2 were closely linked to the PFKFB1 gene and located 17 Mb downstream of the PFKFB2 gene; in contrast, RCA group 2 genes are closely linked to PFKFB2 in frogs, chickens, and humans. However, both the direction of the RCA group 2 genes relative to PFKFB2 and the order of the RCA group 2 gene-encoded proteins in zebrafish were comparable to those in frogs, chickens, and humans. ZRC1 and ZRC2 shared 71.1% identity to each other, implying that they might have originated by gene duplication after the split of the fish/mammalian common ancestor. Moreover, ZRC1 and ZRC2 encoded a membrane-associated protein and a soluble protein, respectively, and displayed different expression patterns, suggesting that functional divergence has already occurred. This is the first report showing the presence of the RCA group 2 cluster as well as the membrane-associated complement regulatory protein in zebrafish, providing a better understanding of the origin and evolution of RCA proteins.
- Wang Y, Deng D, Zhang R, Wang S, Bian Y, Yin Z
- Systematic analysis of plant-specific B3 domain-containing proteins based on the genome resources of 11 sequenced species.
- Mol Biol Rep. 2012; 39: 6267-82
B3 domain-containing proteins constitute a large transcription factor superfamily. The plant-specific B3 superfamily consists of four family members, i.e., LAV (LEC2 [LEAFY COTYLEDON 2]/ABI3 [ABSCISIC ACID INSENSITIVE 3] - VAL [VP1/ABI3-LIKE]), RAV (RELATED to ABI3/VP1), ARF (AUXIN RESPONSE FACTOR) and REM (REPRODUCTIVE MERISTEM) families. The B3 superfamily plays a central role in plant life, from embryogenesis to seed maturation and dormancy. In previous research, we characterized the ARF family, a member of the B3 superfamily, in silico (Wang et al., Mol Biol Rep, 2011, doi:10.1007/s11033-011-0991-z). In this study, we systematically analyzed the diversity, phylogeny and evolution of B3 domain-containing proteins based on genomic resources of 11 sequenced species. A total of 865 B3 domain-containing genes were identified from 11 sequenced species through an iterative strategy. The number of B3 domain-containing genes varies not only between species but also between gene families. B3 domain-containing genes are unevenly distributed in chromosomes and tend to cluster in the genome. Numerous combinations of B3 domains and their partner domains contribute to the sequence and structural diversification of the B3 superfamily. Phylogenetic results showed that moss VAL proteins are related to LEC2/ABI3 instead of VAL proteins from higher plants. Lineage-specific expansion of ARF and REM proteins was observed. The REM family is the most diversified member among the B3 superfamily and experiences a rapid divergence during selective sweep. Based on structural and phylogenetic analysis results, two possible evolutionary modes of the B3 superfamily were presented. Results presented here provide a resource for further characterization of the B3 superfamily.
- Gosu V, Basith S, Durai P, Choi S
- Molecular evolution and structural features of IRAK family members.
- PLoS One. 2012; 7: e49771
The interleukin-1 receptor-associated kinase (IRAK) family comprises critical signaling mediators of the TLR/IL-1R signaling pathways. IRAKs are Ser/Thr kinases. There are 4 members in the vertebrate genome (IRAK1, IRAK2, IRAKM, and IRAK4) and an IRAK homolog, Pelle, in insects. IRAK family members are highly conserved in vertebrates, but the evolutionary relationship between IRAKs in vertebrates and insects is not clear. To investigate the evolutionary history and functional divergence of IRAK members, we performed extensive bioinformatics analysis. The phylogenetic relationship between IRAK sequences suggests that gene duplication events occurred in the evolutionary lineage, leading to early vertebrates. A comparative phylogenetic analysis with insect homologs of IRAKs suggests that the Tube protein is a homolog of IRAK4, unlike the anticipated protein, Pelle. Furthermore, the analysis supports that an IRAK4-like kinase is an ancestral protein in the metazoan lineage of the IRAK family. Through functional analysis, several potentially diverged sites were identified in the common death domain and kinase domain. These sites have been constrained during evolution by strong purifying selection, suggesting their functional importance within IRAKs. In summary, our study highlighted the molecular evolution of the IRAK family, predicted the amino acids that contributed to functional divergence, and identified structural variations among the IRAK paralogs that may provide a starting point for further experimental investigations.
- Fanali G, Ascenzi P, Bernardi G, Fasano M
- Sequence analysis of serum albumins reveals the molecular evolution of ligand recognition properties.
- J Biomol Struct Dyn. 2012; 29: 691-701
Serum albumin (SA) is a circulating protein providing a depot and carrier for many endogenous and exogenous compounds. At least seven major binding sites have been identified by structural and functional investigations mainly in human SA. SA is conserved in vertebrates, with at least 49 entries in protein sequence databases. The multiple sequence analysis of this set of entries leads to the definition of a cladistic tree for the molecular evolution of SA orthologs in vertebrates, thus showing the clustering of the considered species, with lamprey SAs (Lethenteron japonicum and Petromyzon marinus) in a separate outgroup. Sequence analysis aimed at searching conserved domains revealed that most SA sequences are made up of three repeated domains (about 600 residues), as extensively characterized for human SA. On the contrary, lamprey SAs are giant proteins (about 1400 residues) comprising seven repeated domains. The phylogenetic analysis of the SA family reveals a stringent correlation with the taxonomic classification of the species available in sequence databases. A focused inspection of the sequences of ligand binding sites in SA revealed that in all sites most residues involved in ligand binding are conserved, although the versatility towards different ligands could be peculiar to higher organisms. Moreover, the analysis of molecular links between the different sites suggests that allosteric modulation mechanisms could be restricted to higher vertebrates.
- Kallberg M, Bhardwaj N, Langlois R, Lu H
- A structure-based protocol for learning the family-specific mechanisms of membrane-binding domains.
- Bioinformatics. 2012; 28: i431-7
MOTIVATION: Peripheral membrane-targeting domain (MTD) families, such as C1-, C2- and PH domains, play a key role in signal transduction and membrane trafficking by dynamically translocating their parent proteins to specific plasma membranes when changes in lipid composition occur. It is, however, difficult to determine the subset of domains within families displaying this property, as sequence motifs signifying the membrane binding properties are not well defined. For this reason, procedures based on sequence similarity alone are often insufficient in computational identification of MTDs within families (yielding less than 65% accuracy even with a sequence identity of 70%). RESULTS: We present a machine learning protocol for determining membrane-targeting properties achieving 85-90% accuracy in separating binding and non-binding domains within families. Our model is based on features from both sequence and structure, thereby incorporating statistics obtained from the entire domain family and domain-specific physical quantities such as surface electrostatics. In addition, by using the enriched rules in alternating decision tree classifiers, we are able to determine the meaning of the assigned function labels in terms of biological mechanisms. CONCLUSIONS: The high accuracy of the learned models and good agreement between the rules discovered using the ADtree classifier and mechanisms reported in the literature reflect the value of machine learning protocols in both prediction and biological knowledge discovery. Our protocol can thus potentially be used as a general function annotation and knowledge mining tool for other protein domains. AVAILABILITY: metador.bioengr.uic.edu CONTACT: email@example.com.
- Ahn SJ, Vogel H, Heckel DG
- Comparative analysis of the UDP-glycosyltransferase multigene family in insects.
- Insect Biochem Mol Biol. 2012; 42: 133-47
UDP-glycosyltransferases (UGT) catalyze the conjugation of a range of diverse small lipophilic compounds with sugars to produce glycosides, playing an important role in the detoxification of xenobiotics and in the regulation of endobiotics in insects. Recent progress in genome sequencing has enabled an assessment of the extent of the UGT multigene family in insects. Here we report over 310 putative UGT genes identified from genomic databases of eight different insect species together with a transcript database from the lepidopteran Helicoverpa armigera. Phylogenetic analysis of the insect UGTs showed Order-specific gene diversification and inter-species conservation of this multigene family. Only one family (UGT50) is found in all insect species surveyed (except the pea aphid) and may be homologous to mammalian UGT8. Three families (UGT31, UGT32, and UGT305) related to Lepidopteran UGTs are unique to baculoviruses. A lepidopteran sub-tree constructed with 40 H. armigera UGTs and 44 Bombyx mori UGTs revealed that lineage-specific expansions of some families in both species appear to be driven by diversification in the N-terminal substrate binding domain, increasing the range of compounds that could be detoxified or regulated by glycosylation. By comparison of the deduced protein sequences, several important domains were predicted, including the N-terminal signal peptide, UGT signature motif, and C-terminal transmembrane domain. Furthermore, several conserved residues putatively involved in sugar donor binding and catalytic mechanism were also identified by comparison with human UGTs. Many UGTs were expressed in fat body, midgut, and Malpighian tubules, consistent with functions in detoxification, and some were expressed in antennae, suggesting a role in pheromone deactivation. Transcript variants derived from alternative splicing, exon skipping, or intron retention produced additional UGT diversity. 
These findings from this comparative study of two lepidopteran UGTs as well as other insects reveal a diversity comparable to this gene family in vertebrates, plants and fungi and show the magnitude of the task ahead, to determine biochemical function and physiological relevance of each UGT enzyme.
- Fahey B, Degnan BM
- Origin and evolution of laminin gene family diversity.
- Mol Biol Evol. 2012; 29: 1823-36
Laminins are a family of multidomain glycoproteins that are important contributors to the structure of metazoan extracellular matrices. To investigate the origin and evolution of the laminin family, we characterized the full complement of laminin-related genes in the genome of the sponge, Amphimedon queenslandica. As a representative of the Demospongiae, a group consistently placed within the earliest diverging branch of animals by molecular phylogenies, Amphimedon is uniquely placed to provide insight into early steps in the evolution of metazoan gene families. Five Amphimedon laminin-related genes possess the conserved molecular features, and most of the domains found in bilaterian laminins, but all display domain architectures distinct from those of the canonical laminin chain types known from model bilaterians. This finding prompted us to perform a comparative genomic analysis of laminins and related genes from a choanoflagellate and diverse metazoans and to conduct phylogenetic analyses using the conserved Laminin N-terminal domain in order to explore the relationships between genes with distinct architectures. Laminin-like genes appear to have originated in the holozoan lineage (choanoflagellates + metazoans + several other unicellular opisthokont taxa), with several laminin domains originating later and appearing only in metazoan (animal) or eumetazoan (placozoans + ctenophores + cnidarians + bilaterians) laminins. Typical bilaterian alpha, beta, and gamma laminin chain forms arose in the eumetazoan stem and another chain type that is conserved in Amphimedon, the cnidarian, Nematostella vectensis, and the echinoderm, Strongylocentrotus purpuratus, appears to have been lost independently from the placozoan, Trichoplax adhaerens, and from multiple bilaterians. 
Phylogenetic analysis did not clearly reconstruct relationships between the distinct laminin chain types (with the exception of the alpha chains) but did reveal how several members of the netrin family were generated independently from within the laminin family by duplication and domain shuffling and by domain loss. Together, our results suggest that gene duplication and loss and domain shuffling and loss all played a role in the evolution of the laminin family and contributed to the generation of lineage-specific diversity in the laminin gene complements of extant metazoans.
- Leung C, Shaheen F, Bernatchez P, Hackett TL
- Expression of myoferlin in human airway epithelium and its role in cell adhesion and zonula occludens-1 expression.
- PLoS One. 2012; 7: e40478
BACKGROUND: Normal airway epithelial barrier function is maintained by cell-cell contacts which require the translocation of adhesion proteins at the cell surface, through membrane vesicle trafficking and fusion events. Myoferlin and dysferlin, members of the multiple-C2-domain Ferlin superfamily, have been implicated in membrane fusion processes through the induction of membrane curvature. The objectives of this study were to examine the expression of dysferlin and myoferlin within the human airway and determine the roles of these proteins in airway epithelial homeostasis. METHODS: The expression of dysferlin and myoferlin was evaluated in normal human airway sections by immunohistochemistry, and in primary human airway epithelial cells and fibroblasts by immunoblot. Localization of dysferlin and myoferlin in epithelial cells was determined using confocal microscopy. Functional outcomes analyzed included cell adhesion, protein expression, and cell detachment following dysferlin and myoferlin siRNA knock-down, using the human bronchial epithelial cell line, 16HBE. RESULTS: Primary human airway epithelial cells express both dysferlin and myoferlin whereas fibroblasts isolated from bronchi and the parenchyma only express myoferlin. Expression of dysferlin and myoferlin was further localized within the Golgi, cell cytoplasm and plasma membrane of 16HBE cells using confocal microscopy. Treatment of 16HBE cells with myoferlin siRNA, but not dysferlin siRNA, resulted in a rounded cell morphology and loss of cell adhesion. This cell shedding following myoferlin knockdown was associated with decreased expression of the tight junction molecule zonula occludens-1 (ZO-1) and an increased number of cells positive for the apoptotic markers Annexin V and propidium iodide. Cell shedding was not associated with release of the innate inflammatory cytokines IL-6 and IL-8. 
CONCLUSIONS/SIGNIFICANCE: This study demonstrates the heterogeneous expression of myoferlin within epithelial cells and fibroblasts of the respiratory airway. The effect of myoferlin on the expression of ZO-1 in airway epithelial cells indicates its role in membrane fusion events that regulate cell detachment and apoptosis within the airway epithelium.
- Piha-Gossack A, Sossin W, Reinhardt DP
- The evolution of extracellular fibrillins and their functional domains.
- PLoS One. 2012; 7: e33560
Fibrillins constitute the major backbone of multifunctional microfibrils in elastic and non-elastic extracellular matrices, and are known to interact with several binding partners including tropoelastin and integrins. Here, we study the evolution of fibrillin proteins. Following sequence collection from 39 organisms representative of the major evolutionary groups, molecular evolutionary genetics and phylogeny inference software were used to generate a series of evolutionary trees using distance-based and maximum likelihood methods. The resulting trees support the concept of gene duplication as a means of generating the three vertebrate fibrillins. Beginning with a single fibrillin sequence found in invertebrates and jawless fish, a gene duplication event, which coincides with the appearance of elastin, led to the creation of two genes. One of the genes significantly evolved to become the gene for present-day fibrillin-1, while the other underwent evolutionary changes, including a second duplication, to produce present-day fibrillin-2 and fibrillin-3. Detailed analysis of several sequences and domains within the fibrillins reveals distinct similarities and differences across various species. The RGD integrin-binding site in TB4 of all fibrillins is conserved in cephalochordates and vertebrates, while the integrin-binding site within cbEGF18 of fibrillin-3 is a recent evolutionary change. The proline-rich domain in fibrillin-1, glycine-rich domain in fibrillin-2 and proline-/glycine-rich domain in fibrillin-3 are found in all analyzed tetrapod species, whereas it is completely replaced with an EGF-like domain in cnidarians, arthropods, molluscs and urochordates. All collected sequences contain the first 9-cysteine hybrid domain, and the second 8-cysteine hybrid domain with exception of arthropods containing an atypical 10-cysteine hybrid domain 2. 
Furin cleavage sites within the N- and C-terminal unique domains were found for all analyzed fibrillin sequences, indicating an essential role for processing of the fibrillin pro-proteins. The four cysteines in the unique N-terminus and the two cysteines in the unique C-terminus are also highly conserved.
- Van Hiel MB, Vandersmissen HP, Van Loy T, Vanden Broeck J
- An evolutionary comparison of leucine-rich repeat containing G protein-coupled receptors reveals a novel LGR subtype.
- Peptides. 2012; 34: 193-200
Leucine-rich repeat containing G protein-coupled receptors or LGRs are receptors with important functions in development and reproduction. Belonging to this evolutionarily conserved group of receptors are the well-studied glycoprotein hormone receptors and relaxin receptors in mammals, as well as the bursicon receptor, which triggers cuticle hardening and tanning in freshly eclosed insects. In this study, the numerous LGR sequences in different animal phyla are analyzed and compared. Based on these data a phylogenetic tree was generated. This information sheds new light on structural and evolutionary aspects regarding this receptor group. Apart from vertebrates and insects, LGRs are also present in early chordates (Urochordata, Cephalochordata and Hyperoartia) and other arthropods (Arachnida and Branchiopoda) as well as in Mollusca, Echinodermata, Hemichordata, Nematoda, and even in ancient animal life forms, such as Cnidaria and Placozoa. Three distinct types of LGR exist, distinguishable by their number of leucine-rich repeats (LRRs), their type-specific hinge region and the presence or absence of an LDLa motif. Type C LGRs containing only one LDLa (C1 subtype) appear to be present in nearly all animal phyla. We here describe a second subtype, C2, containing multiple LDLa motifs, which was discovered in echinoderms, mollusks and in one insect species (Pediculus humanus corporis). In addition, eight putative LGRs can be predicted from the genome data of the placozoan species Trichoplax adhaerens. They may represent an ancient form of the LGRs; however, more genomic data will be required to confirm this hypothesis.
- Ocana KA, Davila AM
- Phylogenomics-based reconstruction of protozoan species tree.
- Evol Bioinform Online. 2011; 7: 107-21
We have developed a semi-automatic methodology to reconstruct the phylogenetic species tree in Protozoa, integrating different phylogenetic algorithms and programs, and demonstrating the utility of a supermatrix approach to construct phylogenomics-based trees using 31 universal orthologs (UO). The species tree obtained was formed by three major clades that were related to three groups of data: (i) species containing at least 80% of UO (25/31) in the concatenated multiple alignment or supermatrix, a clade called C1; (ii) species containing 50%-79% of UO (15-24/31), called C2; and (iii) species containing less than 50% of UO (1-14/31), called C3. C1 was composed only of protozoan species, C2 was composed of species related to Protozoa, and C3 was composed of some species of C1 (Protozoa) and C2 (related to Protozoa). Our phylogenomics-based methodology using a supermatrix approach proved to be reliable with protozoan genome data and using at least 25 UO, suggesting that (a) the more UO used the better, and (b) using the entire UO sequence or just a conserved block of it for the supermatrix produced similar phylogenomic trees.
- Daza DO, Sundstrom G, Bergqvist CA, Duan C, Larhammar D
- Evolution of the insulin-like growth factor binding protein (IGFBP) family.
- Endocrinology. 2011; 152: 2278-89
The evolution of the IGF binding protein (IGFBP) gene family has been difficult to resolve. Both chromosomal and serial duplications have been suggested as mechanisms for the expansion of this gene family. We have identified and annotated IGFBP sequences from a wide selection of vertebrate species as well as Branchiostoma floridae and Ciona intestinalis. By combining detailed sequence analysis with sequence-based phylogenies and chromosome information, we arrive at the following scenario: the ancestral chordate IGFBP gene underwent a local gene duplication, resulting in a gene pair adjacent to a HOX cluster. Subsequently, the gene family expanded in the two basal vertebrate tetraploidizations (2R), resulting in the six IGFBP types that are presently found in placental mammals. The teleost fish ancestor underwent a third tetraploidization (3R) that further expanded the IGFBP repertoire. The five sequenced teleost fish genomes retain 9-11 IGFBP genes. This scenario is supported by the phylogenies of three adjacent gene families in the HOX gene regions, namely the epidermal growth factor receptors (EGFR) and the Ikaros and distal-less (DLX) transcription factors. Our sequence comparisons show that several important structural components in the IGFBPs are ancestral vertebrate features that have been maintained in all orthologs, for instance the integrin interaction motif Arg-Gly-Asp in IGFBP-2. In contrast, the Arg-Gly-Asp motif in IGFBP-1 has arisen independently in mammals. The large degree of retention of IGFBP genes after the ancient expansion of the gene family strongly suggests that each gene evolved distinct and important functions early in vertebrate evolution.
- Ravisankar V, Singh TP, Manoj N
- Molecular evolution of the EGF-CFC protein family.
- Gene. 2011; 482: 43-50
The epidermal growth factor-Cripto-1/FRL-1/Cryptic (EGF-CFC) proteins, characterized by the highly conserved EGF and CFC domains, are extracellular membrane associated growth factor-like glycoproteins. These proteins are essential components of the Nodal signaling pathway during early vertebrate embryogenesis. Homologs of the EGF-CFC family have also been implicated in tumorigenesis in humans. Yet, little is known about the mode of molecular evolution in this family. Here we investigate the origin, extent of conservation and evolutionary relationships of EGF-CFC proteins across the metazoa. The results suggest that the first appearance of the EGF-CFC gene occurred in the ancestor of the deuterostomes. Phylogenetic analysis supports the classification of the family into distinct subfamilies that appear to have evolved through lineage-specific duplication and divergence. Site-specific analyses of evolutionary rate shifts between the two major mammalian paralogous subfamilies, Cripto and Cryptic, reveal critical amino acid sites that may account for the observed functional divergence. Furthermore, estimates of functional divergence suggest that rapid change of evolutionary rates at sites located mainly in the CFC domain may contribute towards distinct functional properties of the two paralogs.
- Cortese MS, Etxebeste O, Garzia A, Espeso EA, Ugalde U
- Elucidation of functional markers from Aspergillus nidulans developmental regulator FlbB and their phylogenetic distribution.
- PLoS One. 2011; 6: e17505
Aspergillus nidulans is a filamentous fungus widely used as a model for biotechnological and clinical research. It is also used as a platform for the study of basic eukaryotic developmental processes. Previous studies identified and partially characterized a set of proteins controlling cellular transformations in this ascomycete. Among these proteins, the bZip type transcription factor FlbB is a key regulator of reproduction, stress responses and cell-death. Our aim here was the prediction, through various bioinformatic methods, of key functional residues and motifs within FlbB in order to inform the design of future laboratory experiments and further the understanding of the molecular mechanisms that control fungal development. A dataset of FlbB orthologs and those of its key interaction partner FlbE was assembled from 40 members of the Pezizomycotina. Unique features were identified in each of the three structural domains of FlbB. The N-terminal region encoded a bZip transcription factor domain with a novel histidine-containing DNA binding motif while the dimerization determinants exhibited two distinct profiles that segregated by class. The C-terminal region of FlbB showed high similarity with the AP-1 family of stress response regulators but with variable patterns of conserved cysteines that segregated by class and order. Motif conservation analysis revealed that nine FlbB orthologs belonging to the Eurotiales order contained a motif in the central region that could mediate interaction with FlbE. The key residues and motifs identified here provide a basis for the design of follow-up experimental investigations. Additionally, the presence or absence of these residues and motifs among the FlbB orthologs could help explain the differences in the developmental programs among fungal species as well as define putative complementation groups that could serve to extend known functional characterizations to other species.
- Foret S et al.
- Phylogenomics reveals an anomalous distribution of USP genes in metazoans.
- Mol Biol Evol. 2011; 28: 153-61
Members of the universal stress protein (USP) family were originally identified in stressed bacteria on the basis of a shared domain, which has since been reported in a phylogenetically diverse range of prokaryotes, fungi, protists, and plants. Although not previously characterized in metazoans, here we report that USP genes are distributed in animal genomes in a unique pattern that reflects frequent independent losses and independent expansions. Multiple USP loci are present in urochordates as well as all Cnidaria and Lophotrochozoa examined, but none were detected in any of the available ecdysozoan or non-urochordate deuterostome genome data. The vast majority of the metazoan USPs are short, single-domain proteins and are phylogenetically distinct from the prokaryotic, plant, protist, and fungal members of the protein family. Whereas most of the metazoan USP genes contain introns, with few exceptions those in the cnidarian Hydra are intronless and cluster together in phylogenetic analyses. Expression patterns were determined for several cnidarian USPs, including two genes belonging to the intronless clade, and these imply diverse functions. The apparent paradox of implied diversity of roles despite high overall levels of sequence (and implied structural) similarity parallels the situation in bacteria. The absence of USP genes in ecdysozoans and most deuterostomes may be a consequence of functional redundancy or specialization in taxon-specific roles.
- Heffer A, Pick L
- Rapid isolation of gene homologs across taxa: Efficient identification and isolation of gene orthologs from non-model organism genomes, a technical report.
- Evodevo. 2011; 2: 7-7
BACKGROUND: Tremendous progress has been made in the field of evo-devo through comparisons of related genes from diverse taxa. While the vast number of species in nature precludes a complete analysis of the molecular evolution of even one single gene family, this would not be necessary to understand fundamental mechanisms underlying gene evolution if experiments could be designed to systematically sample representative points along the path of established phylogenies to trace changes in regulatory and coding gene sequence. This isolation of homologous genes from phylogenetically diverse, representative species can be challenging, especially if the gene is under weak selective pressure and evolving rapidly. RESULTS: Here we present an approach - Rapid Isolation of Gene Homologs across Taxa (RIGHT) - to efficiently isolate specific members of gene families. RIGHT is based upon modification and a combination of degenerate polymerase chain reaction (PCR) and gene-specific amplified fragment length polymorphism (AFLP). It allows targeted isolation of specific gene family members from any organism, only requiring genomic DNA. We describe this approach and how we used it to isolate members of several different gene families from diverse arthropods spanning millions of years of evolution. CONCLUSIONS: RIGHT facilitates systematic isolation of one gene from large gene families. It allows for efficient gene isolation without whole genome sequencing, RNA extraction, or culturing of non-model organisms. RIGHT will be a generally useful method for isolation of orthologs from both distant and closely related species, increasing sample size and facilitating the tracking of molecular evolution of gene families and regulatory networks across the tree of life.
- Eisenberg MC, Kim Y, Li R, Ackerman WE, Kniss DA, Friedman A
- Mechanistic modeling of the effects of myoferlin on tumor cell invasion.
- Proc Natl Acad Sci U S A. 2011; 108: 20078-83
Myoferlin (MYOF) is a member of the evolutionarily conserved ferlin family of proteins, noted for their role in a variety of membrane processes, including endocytosis, repair, and vesicular transport. Notably, ferlins are implicated in Caenorhabditis elegans sperm motility (Fer-1), mammalian skeletal muscle development and repair (MYOF and dysferlin), and presynaptic transmission in the auditory system (otoferlin). In this paper, we demonstrate that MYOF plays a previously unrecognized role in cancer cell invasion, using a combination of mathematical modeling and in vitro experiments. Using a real-time impedance-based invasion assay (xCELLigence), we have shown that lentiviral-based knockdown of MYOF significantly reduced invasion of MDA-MB-231 breast cancer cells in Matrigel bioassays. Based on these experimental data, we developed a partial differential equation model of MYOF effects on cancer cell invasion, which we used to generate mechanistic hypotheses. The mathematical model predictions revealed that matrix metalloproteinases (MMPs) may play a key role in modulating this invasive property, which was supported by experimental data using quantitative RT-PCR screens. These results suggest that MYOF may be a promising target for biomarkers or drug target for metastatic cancer diagnosis and therapy, perhaps mediated through MMPs.
- Kojima KK, Jurka J
- Crypton transposons: identification of new diverse families and ancient domestication events.
- Mob DNA. 2011; 2: 12-12
BACKGROUND: "Domestication" of transposable elements (TEs) led to evolutionary breakthroughs such as the origin of telomerase and the vertebrate adaptive immune system. These breakthroughs were accomplished by the adaptation of molecular functions essential for TEs, such as reverse transcription, DNA cutting and ligation or DNA binding. Cryptons represent a unique class of DNA transposons using tyrosine recombinase (YR) to cut and rejoin the recombining DNA molecules. Cryptons were originally identified in fungi and later in the sea anemone, sea urchin and insects. RESULTS: Herein we report new Cryptons from animals, fungi, oomycetes and diatom, as well as widely conserved genes derived from ancient Crypton domestication events. Phylogenetic analysis based on the YR sequences supports four deep divisions of Crypton elements. We found that the domain of unknown function 3504 (DUF3504) in eukaryotes is derived from Crypton YR. DUF3504 is similar to YR but lacks most of the residues of the catalytic tetrad (R-H-R-Y). Genes containing the DUF3504 domain are potassium channel tetramerization domain containing 1 (KCTD1), KIAA1958, zinc finger MYM type 2 (ZMYM2), ZMYM3, ZMYM4, glutamine-rich protein 1 (QRICH1) and "without children" (WOC). The DUF3504 genes are highly conserved and are found in almost all jawed vertebrates. The sequence, domain structure, intron positions and synteny blocks support the view that ZMYM2, ZMYM3, ZMYM4, and possibly QRICH1, were derived from WOC through two rounds of genome duplication in early vertebrate evolution. WOC is observed widely among bilaterians. There could be four independent events of Crypton domestication, and one of them, generating WOC/ZMYM, predated the birth of bilaterian animals. This is the third-oldest domestication event known to date, following the domestication generating telomerase reverse transcriptase (TERT) and Prp8. 
Many Crypton-derived genes are transcriptional regulators with additional DNA-binding domains, and the acquisition of the DUF3504 domain could have added new regulatory pathways via protein-DNA or protein-protein interactions. CONCLUSIONS: Cryptons have contributed to animal evolution through domestication of their YR sequences. The DUF3504 domains are domesticated YRs of animal Crypton elements.
- Kong X, Wang X, He S
- Molecular variation and evolution of the tyrosine kinase domains of insulin receptor IRa and IRb genes in Cyprinidae.
- Sci China Life Sci. 2011; 54: 626-33
The insulin receptor (IR) gene plays an important role in regulating cell growth, differentiation and development. In the present study, DNA sequences of insulin receptor genes, IRa and IRb, were amplified and sequenced from 37 representative species of the Cyprinidae and from five outgroup species from non-cyprinid Cypriniformes. Based on coding sequences (CDS) of tyrosine kinase regions of IRa and IRb, molecular evolution and phylogenetic relationships were analyzed to better understand the characteristics of IR gene divergence in the family Cyprinidae. IRa and IRb were clustered into one lineage in the gene tree of the IR gene family, reconstructed using the unweighted pair group method with arithmetic mean (UPGMA). IRa and IRb have evolved into distinct genes after IR gene duplication in Cyprinidae. For each gene, molecular evolution analyses showed that there was no significant difference among different groups in the reconstructed maximum parsimony (MP) tree of Cyprinidae; IRa and IRb have been subjected to similar evolutionary pressure among different lineages. Although the amino acid sequences of IRa and IRb tyrosine kinase regions were highly conserved, our analyses showed that there were clear sequence variations between the tyrosine kinase regions of IRa and IRb proteins. This indicates that IRa and IRb proteins might play different roles in the insulin signaling pathway.
- Sreedharan S, Stephansson O, Schioth HB, Fredriksson R
- Long evolutionary conservation and considerable tissue specificity of several atypical solute carrier transporters.
- Gene. 2011; 478: 11-8
The superfamily of Solute Carriers (SLCs) has around 384 members in the human genome grouped into at least 48 families. While many of these transporters have been well characterized with established important biological functions, there are a few recently identified genes that have not been studied with regard to tissue distribution or evolutionary origin. Here we study 14 of these recently discovered SLC genes (HIAT1, HIATL1, MFSD1, MFSD5, MFSD6, MFSD9, MFSD10, SLC7A14, SLC7A15, SLC10A6, SLC15A5, SLC16A12, SLC30A10 and SLC21A21) with the aim of giving a much better picture of the sequence relationships and tissue expression of the diverse SLC gene family. We used a range of bioinformatic methods to classify each of these genes into the different SLC gene families. We found that 9 of the 14 atypical SLCs are distant members of the Major Facilitator Superfamily (MFS) clan while the others belong to the APC clan, the DMT clan, the CPA_AT clan and the IT clan. We found most of the genes to be highly evolutionarily conserved and likely to be present in most bilateral species, except for SLC21A21, which we found only in mammals. Several of these transporter genes have highly specific tissue expression profiles, while it is notable that most are expressed in the CNS, with the exception of SLC21A21 and SLC15A5. This work provides fundamental information on 14 transporters that previously have not received much attention, enabling a more comprehensive view of the SLC superfamily.
- Elias M
- Patterns and processes in the evolution of the eukaryotic endomembrane system.
- Mol Membr Biol. 2010; 27: 469-89
The eukaryotic endomembrane system (ES) is served by hundreds of dedicated proteins. Experimental characterization of the ES-associated molecular machinery in several model eukaryotes complemented by a recent progress in phylogenomics and comparative genomics have revealed a conserved complex core of the machinery that appears to have been established before the last eukaryotic common ancestor (LECA). At the same time, modern eukaryotes exhibit a huge variation in the ES resulting from a multitude of evolutionary processes operating along the ever-branching paths from the LECA to its descendants. The most important source of evolutionary novelty in the ES functioning has undoubtedly been gene duplication followed by divergence of the gene copies, responsible not only for the pre-LECA establishment of many multi-paralog families of proteins in the very core of the ES-associated machinery, but also for post-LECA lineage-specific elaborations via family expansions and the origin of novel components. Extreme sequence divergence has obscured actual homologous relationships between potentially many components of the machinery, even between orthologous proteins, as illustrated by the yeast Vps51 subunit of the vesicle tethering complex GARP hypothesized here to be a highly modified ortholog of a conserved eukaryotic family typified by the zebrafish Fat-free (Ffr) protein. A dynamic evolution of many ES-associated proteins, especially those centred around RAB and ARF GTPases, seems to take place at the level of their domain architectures. Finally, reductive evolution and recurrent gene loss are emerging as pervasive factors shaping the ES in all phylogenetic lineages.
- Reitzel AM, Tarrant AM
- Correlated evolution of androgen receptor and aromatase revisited.
- Mol Biol Evol. 2010; 27: 2211-5
Conserved interactions among proteins or other molecules can provide strong evidence for coevolution across their evolutionary history. Diverse phylogenetic methods have been applied to identify potential coevolutionary relationships. In most cases, these methods minimally require comparisons of orthologous sequences and appropriate controls to separate effects of selection from the overall evolutionary relationships. In vertebrates, androgen receptor (AR) and cytochrome p450 aromatase (CYP19) share an affinity for androgenic steroids, which serve as receptor ligands and enzyme substrates. In a recent study, Tiwary and Li (Tiwary BK, Li W-H. 2009. Parallel evolution between aromatase and androgen receptor in the animal kingdom. Mol Biol Evol. 26:123-129) reported that AR and CYP19 displayed a signature of ancient and conserved interactions throughout all the Eumetazoa (i.e., cnidarians, protostomes, and deuterostomes). Because these findings conflicted with a number of previous studies, we reanalyzed the data set used by Tiwary and Li. First, our analyses demonstrate that the invertebrate genes used in the previous analysis are not orthologous sequences but instead represent a diverse set of nuclear receptors and CYP enzymes with no confirmed or hypothesized relationships with androgens. Second, we show that 1) their analytical approach, which measures correlations in evolutionary distances between proteins, potentially led to spurious significant relationships due simply to conserved domains and 2) control comparisons provide positive evidence for a strong influence of evolutionary history. We discuss how corrections to this method and analysis of key taxa (e.g., duplications in the teleost fish and suiform lineages) can inform investigations of the coevolutionary relationships between AR and aromatase.
- Arenas AF, Osorio-Mendez JF, Gutierrez AJ, Gomez-Marin JE
- Genome-wide survey and evolutionary analysis of trypsin proteases in apicomplexan parasites.
- Genomics Proteomics Bioinformatics. 2010; 8: 103-12
Apicomplexa are an extremely diverse group of unicellular organisms that infect humans and other animals. Despite the great advances in combating infectious diseases over the past century, these parasites still have a tremendous social and economic burden on human societies, particularly in tropical and subtropical regions of the world. Proteases from apicomplexa have been characterized at the molecular and cellular levels, and central roles have been proposed for proteases in diverse processes. In this work, 16 new genes encoding for trypsin proteases are identified in 8 apicomplexan genomes by a genome-wide survey. Phylogenetic analysis suggests that these genes were gained through both intracellular gene transfer and vertical gene transfer. Identification, characterization and understanding of the evolutionary origin of protease-mediated processes are crucial to increase the knowledge and improve the strategies for the development of novel chemotherapeutic agents and vaccines.
- Liu Y et al.
- Parathyroid hormone gene family in a cartilaginous fish, the elephant shark (Callorhinchus milii).
- J Bone Miner Res. 2010; 25: 2613-23
The development of bone was a major step in the evolution of vertebrates. A bony skeleton provided structural support and a calcium reservoir essential for the movement from an aquatic to a terrestrial environment. Cartilaginous fishes are the oldest living group of jawed vertebrates. In this study we have identified three members of the parathyroid hormone (Pth) gene family in a cartilaginous fish, the elephant shark (Callorhinchus milii). The three genes include two Pth genes, designated as Pth1 and Pth2, and a Pthrp gene. Phylogenetic analysis suggested that elephant shark Pth2 is an ancient gene whose orthologue is lost in bony vertebrates. The Pth1 and Pth2 genes have the same structure as the Pth gene in bony vertebrates, whereas the structure of the Pthrp gene is more complex in tetrapods compared with elephant shark. The three elephant shark genes showed distinct patterns of expression, with Pth2 being expressed only in the brain and spleen. This contrasts with localization of the corresponding proteins, which showed considerable overlap in their distribution. There were conserved sites of localization for Pthrp between elephant shark and mammals, including tissues such as kidney, skin, skeletal and cardiac muscle, pancreas, and cartilage. The elephant shark Pth1(1-34) and Pthrp(1-34) peptides were able to stimulate cAMP accumulation in mammalian UMR106.01 cells. However, Pth2(1-34) peptide did not show such PTH-like biologic activity. The presence of Pth and Pthrp genes in the elephant shark indicates that these genes played fundamental roles before their recruitment to bone development in bony jawed vertebrates.
- Zhang D, Aravind L
- Identification of novel families and classification of the C2 domain superfamily elucidate the origin and evolution of membrane targeting activities in eukaryotes.
- Gene. 2010; 469: 18-30
Eukaryotes contain an elaborate membrane system, which bounds the cell itself, nuclei, organelles and transient intracellular structures, such as vesicles. The emergence of this system was marked by an expansion of a number of structurally distinct classes of lipid-binding domains that could throw light on the early evolution of eukaryotic membranes. The C2 domain is a useful model to understand these events because it is one of the most prevalent eukaryotic lipid-binding domains deployed in diverse functional contexts. Most studies have concentrated on C2 domains prototyped by those in protein kinase C (PKC-C2) isoforms that bind lipid in a calcium-dependent manner. While two other distinct families of C2 domains, namely those in PI3K-C2 and PTEN-C2 are also recognized, a complete picture of evolutionary relationships within the C2 domain superfamily is lacking. We systematically studied this superfamily using sequence profile searches, phylogenetic and phyletic-pattern analysis and structure-prediction. Consequently, we identified several distinct families of C2 domains including those respectively typified by C2 domains in the Aida (axin interactor, dorsalization associated) proteins, B9 proteins (e.g. Mks1 (Xbx-7), Stumpy (Tza-1) and Tza-2) involved in centrosome migration and ciliogenesis, Dock180/Zizimin proteins which are Rac/CDC42 GDP exchange factors, the EEIG1/Sym-3, EHBP1 and plant RPG/PMI1 proteins involved in endocytotic recycling and organellar positioning and an apicomplexan family. We present evidence that the last eukaryotic common ancestor (LECA) contained at least 10 C2 domains belonging to 6 well-defined families. 
Further, we suggest that this pre-LECA diversification was linked to the emergence of several quintessentially eukaryotic structures, such as membrane repair and vesicular trafficking system, anchoring of the actin and tubulin cytoskeleton to the plasma and vesicular membranes, localization of small GTPases to membranes and lipid-based signal transduction. Subsequent lineage-specific expansions of Zizimin-type C2 domains and functionally linked CDC42/Rac GTPases occurred independently in eukaryotes that evolved active amoeboid motility. While two lipid-binding regions are likely to be shared by majority of C2 domains, the actual constellation of lipid-binding residues (predominantly basic) are distinct in each family potentially reflective of the functional and biochemical diversity of these domains. Importantly, we show that the calcium-dependent membrane interaction is a derived feature limited to the PKC-C2 domains. Our identification of novel C2 domains offers new insights into interaction between both the microtubular and microfilament cytoskeleton and cellular membranes.
- Johnson KR, Nicodemus-Johnson J, Danziger RS
- An evolutionary analysis of cAMP-specific Phosphodiesterase 4 alternative splicing.
- BMC Evol Biol. 2010; 10: 247-247
BACKGROUND: Cyclic nucleotide phosphodiesterases (PDEs) hydrolyze the intracellular second messengers: cyclic adenosine monophosphate (cAMP) and cyclic guanosine monophosphate (cGMP). The cAMP-specific PDE family 4 (PDE4) is widely expressed in vertebrates. Each of the four PDE4 gene isoforms (PDE4 A-D) undergoes extensive alternative splicing via alternative transcription initiation sites, producing unique amino termini and yielding multiple splice variant forms from each gene isoform termed long, short, super-short and truncated super-short. Many species across the vertebrate lineage contain multiple splice variants of each gene type, which are characterized by length and amino termini. RESULTS: A phylogenetic approach was used to visualize splice variant form genesis and identify conserved splice variants (genome conservation with EST support) across the vertebrate taxa. Bayesian and maximum likelihood phylogenetic inference indicated PDE4 gene duplication occurred at the base of the vertebrate lineage and revealed additional gene duplications specific to the teleost lineage. Phylogenetic inference and PDE4 splice variant presence, or absence as determined by EST screens, were further supported by the genomic analysis of select vertebrate taxa. Two conserved PDE4 long form splice variants were found in each of the PDE4A, PDE4B, and PDE4C genes, and eight conserved long forms from the PDE4D gene. Conserved short and super-short splice variants were found from each of the PDE4A, PDE4B, and PDE4D genes, while truncated super-short variants were found from the PDE4C and PDE4D genes. PDE4 long form splice variants were found in all taxa sampled (invertebrate through mammals); short, super-short, and truncated super-short are detected primarily in tetrapods and mammals, indicating an increasing complexity in both alternative splicing and cAMP metabolism through vertebrate evolution.
CONCLUSIONS: There was a progressive independent incorporation of multiple PDE4 splice variant forms and amino termini, increasing PDE4 proteome complexity from primitive vertebrates to humans. While PDE4 gene isoform duplicates with limited alternative splicing were found in teleosts, an expansion of both PDE4 splice variant forms, and alternatively spliced amino termini predominantly occurs in mammals. Since amino termini have been linked to intracellular targeting of the PDE4 enzymes, the conservation of amino termini in PDE4 splice variants in evolution highlights the importance of compartmentalization of PDE4-mediated cAMP hydrolysis.
- Jaspers P et al.
- The RST and PARP-like domain containing SRO protein family: analysis of protein structure, function and conservation in land plants.
- BMC Genomics. 2010; 11: 170-170
BACKGROUND: The SROs (SIMILAR TO RCD-ONE) are a group of plant-specific proteins which have important functions in stress adaptation and development. They contain the catalytic core of the poly(ADP-ribose) polymerase (PARP) domain and a C-terminal RST (RCD-SRO-TAF4) domain. In addition to these domains, several, but not all, SROs contain an N-terminal WWE domain. RESULTS: SROs are present in all analyzed land plants and sequence analysis differentiates between two structurally distinct groups; cryptogams and monocots possess only group I SROs whereas eudicots also contain group II. Group I SROs possess an N-terminal WWE domain (PS50918) but the WWE domain is lacking in group II SROs. Group I domain structure is widely represented in organisms as distant as humans (for example, HsPARP11). We propose a unified nomenclature for the SRO family. The SROs are able to interact with transcription factors through the C-terminal RST domain but themselves are generally not regulated at the transcriptional level. The most conserved feature of the SROs is the catalytic core of the poly(ADP-ribose) polymerase (PS51059) domain. However, bioinformatic analysis of the SRO PARP domain fold-structure and biochemical assays of AtRCD1 suggested that SROs do not possess ADP-ribosyl transferase activity. CONCLUSIONS: The SROs are a highly conserved family of plant specific proteins. Sequence analysis of the RST domain implicates a highly preserved protein structure in that region. This might have implications for functional conservation. We suggest that, despite the presence of the catalytic core of the PARP domain, the SROs do not possess ADP-ribosyl transferase activity. Nevertheless, the function of SROs is critical for plants and might be related to transcription factor regulation and complex formation.
- Pruvot B, Laurens V, Salvadori F, Solary E, Pichon L, Chluba J
- Comparative analysis of nonaspanin protein sequences and expression studies in zebrafish.
- Immunogenetics. 2010; 62: 681-99
Nonaspanins constitute a family of proteins, also called TM9SF, characterized by a large non-cytoplasmic domain and nine putative transmembrane domains. This family is highly conserved through evolution and comprises three members in Saccharomyces cerevisiae, Dictyostelium discoideum, and Drosophila melanogaster, and four members are reported in mammals (TM9SF1-TM9SF4). Genetic studies in Dictyostelium and Drosophila have shown that TM9SF members are required for adhesion and phagocytosis in innate immune response, furthermore, human TM9SF1 plays a role in the regulation of autophagy and human TM9SF4 in tumor cannibalism. Here we report that the zebrafish genome encodes five members of this family, TM9SF1-TM9SF5, which show high level of sequence conservation with the previously reported members. Expression analysis in zebrafish showed that all members are maternally expressed and continue to be present throughout embryogenesis to adults. Gene expression could not be regulated by pathogen-associated molecular patterns such as LPS, CpG, or Poly I:C. By bioinformatic analyses of 80 TM9SF protein sequences from yeast, plants, and animals, we confirmed a very conserved protein structure. An evolutionary conserved immunoreceptor tyrosine-based inhibition motif has been detected in the cytoplasmic domain between transmembrane domain (TM) 7 and TM8 in TM9SF1, TM9SF2, TM9SF4 and TM9SF5, and at the extreme C-terminal end of TM9SF4. Finally, a conserved TRAF2 binding domain could also be predicted in the cytoplasmic regions of TM9SF2, TM9SF3, TM9SF4, and TM9SF5. This confirms the hypothesis that TM9SF proteins may play a regulatory role in a specific and ancient cellular mechanism that is involved in innate immunity.
- Perrin E et al.
- Exploring the HME and HAE1 efflux systems in the genus Burkholderia.
- BMC Evol Biol. 2010; 10: 164-164
BACKGROUND: The genus Burkholderia includes a variety of species with opportunistic human pathogenic strains, whose increasing global resistance to antibiotics has become a public health problem. In this context a major role could be played by multidrug efflux pumps belonging to Resistance Nodulation Cell-Division (RND) family, which allow bacterial cells to extrude a wide range of different substrates, including antibiotics. This study aims to i) identify rnd genes in the 21 available completely sequenced Burkholderia genomes, ii) analyze their phylogenetic distribution, iii) define the putative function(s) that RND proteins perform within the Burkholderia genus and iv) try tracing the evolutionary history of some of these genes in Burkholderia. RESULTS: BLAST analysis of the 21 Burkholderia sequenced genomes, using experimentally characterized ceoB sequence (one of the RND family counterpart in the genus Burkholderia) as probe, allowed the assembly of a dataset comprising 254 putative RND proteins. An extensive phylogenetic analysis revealed the occurrence of several independent events of gene loss and duplication across the different lineages of the genus Burkholderia, leading to notable differences in the number of paralogs between different genomes. A putative substrate [antibiotics (HAE1 proteins)/heavy-metal (HME proteins)] was also assigned to the majority of these proteins. No correlation was found between the ecological niche and the lifestyle of Burkholderia strains and the number/type of efflux pumps they possessed, while a relation can be found with genome size and taxonomy. Remarkably, we observed that only HAE1 proteins are mainly responsible for the different number of proteins observed in strains of the same species. 
Data concerning both the distribution and the phylogenetic analysis of the HAE1 and HME in the Burkholderia genus allowed depicting a likely evolutionary model accounting for the evolution and spreading of HME and HAE1 systems in the Burkholderia genus. CONCLUSION: A complete knowledge of the presence and distribution of RND proteins in Burkholderia species was obtained and an evolutionary model was depicted. Data presented in this work may serve as a basis for future experimental tests, focused especially on HAE1 proteins, aimed at the identification of novel targets in antimicrobial therapy against Burkholderia species.
- Pawlowski K, Muszewska A, Lenart A, Szczepinska T, Godzik A, Grynberg M
- A widespread peroxiredoxin-like domain present in tumor suppression- and progression-implicated proteins.
- BMC Genomics. 2010; 11: 590-590
BACKGROUND: Peroxide turnover and signalling are involved in many biological phenomena relevant to human diseases. Yet not all the players and mechanisms involved in peroxide perception are known. Elucidating very remote evolutionary relationships between proteins is an approach that allows the discovery of novel protein functions. Here, we start with three human proteins, SRPX, SRPX2 and CCDC80, involved in tumor suppression and progression, which possess a conserved region of similarity. Structure and function prediction allowed the definition of P-DUDES, a phylogenetically widespread, possibly ancient protein structural domain, common to vertebrates and many bacterial species. RESULTS: We show, using bioinformatics approaches, that the P-DUDES domain, surprisingly, adopts the thioredoxin-like (Thx-like) fold. A tentative, more detailed prediction of function is made, namely, that of a 2-Cys peroxiredoxin. Incidentally, consistent overexpression of all three human P-DUDES genes in two public glioblastoma microarray gene expression datasets was discovered. This finding is discussed in the context of the tumor suppressor role that has been ascribed to P-DUDES proteins in several studies. The majority of non-redundant P-DUDES proteins are found in the marine metagenome, and among the bacterial species possessing this domain a trend for a higher proportion of aquatic species is observed. CONCLUSIONS: The new protein structural domain, now with a broad enzymatic function predicted, may become a drug target once its molecular mechanism of action is understood in detail.
- Herrmann H, Strelkov SV, Burkhard P, Aebi U
- Intermediate filaments: primary determinants of cell architecture and plasticity.
- J Clin Invest. 2009; 119: 1772-83
Intermediate filaments (IFs) are major constituents of the cytoskeleton and nuclear boundary in animal cells. They are of prime importance for the functional organization of structural elements. Depending on the cell type, morphologically similar but biochemically distinct proteins form highly viscoelastic filament networks with multiple nanomechanical functions. Besides their primary role in cell plasticity and their established function as cellular stress absorbers, recently discovered gene defects have elucidated that structural alterations of IFs can affect their involvement both in signaling and in controlling gene regulatory networks. Here, we highlight the basic structural and functional properties of IFs and derive a concept of how mutations may affect cellular architecture and thereby tissue construction and physiology.
- Green JB, Lower RP, Young JP
- The NfeD protein family and its conserved gene neighbours throughout prokaryotes: functional implications for stomatin-like proteins.
- J Mol Evol. 2009; 69: 657-67
NfeD-like proteins are widely distributed throughout prokaryotes and are frequently associated with genes encoding stomatin-like proteins (slipins). Here, we reveal that the NfeD family is ancient and comprises three major groups: NfeD1a, NfeD1b and truncated NfeD1b. Members of each group are associated with one of four conserved gene partners, three of which have eukaryotic homologues that are membrane raft associated, namely stomatin, paraslipin (previously SLP-2) and flotillin. The first NfeD group (NfeD1b), comprises proteins of approximately 460-aa long that have three functional domains: an N-terminal protease, a middle membrane-spanning region and a soluble C-terminal region rich in beta-strands. The nfeD1b gene is adjacent to eoslipin in prokaryotic genomes except in Firmicutes and Deinococci, where yqfA replaces eoslipin. Proteins in the second major group (NfeD1a) are homologous to the C-terminus of NfeD1b which forms a beta-barrel-like domain, and their genes are associated with paraslipin. Using OrthoMCL clustering, we show that nfeD1b genes have become truncated on many independent occasions giving rise to the third major group. These short NfeD homologues frequently remain associated with their ancestral gene neighbour, resembling NfeD1a in structure, yet are much more related to full-length NfeD1b; we term these "truncated NfeD1b". These conserved associations suggest that NfeD proteins are dependent on gene partners for their function and that the site of interaction may lie within the C-terminal portion that is common to all NfeD homologues. Although NfeD homologues are confined to prokaryotes, this conserved association could represent an excellent system to study slipin and flotillin proteins.
- Rawlings ND, Bateman A
- Pepsin homologues in bacteria.
- BMC Genomics. 2009; 10: 437-437
BACKGROUND: Peptidase family A1, to which pepsin belongs, had been assumed to be restricted to eukaryotes. The tertiary structure of pepsin shows two lobes with similar folds and it has been suggested that the gene has arisen from an ancient duplication and fusion event. The only sequence similarity between the lobes is restricted to the motif around the active site aspartate and a hydrophobic-hydrophobic-Gly motif. Together, these contribute to an essential structural feature known as a psi-loop. There is one such psi-loop in each lobe, and so each lobe presents an active Asp. The human immunodeficiency virus peptidase, retropepsin, from peptidase family A2 also has a similar fold but consists of one lobe only and has to dimerize to be active. All known members of family A1 show the bilobed structure, but it is unclear if the ancestor of family A1 was similar to an A2 peptidase, or if the ancestral retropepsin was derived from a half-pepsin gene. The presence of a pepsin homologue in a prokaryote might give insights into the evolution of the pepsin family. RESULTS: Homologues of the aspartic peptidase pepsin have been found in the completed genomic sequences from seven species of bacteria. The bacterial homologues, unlike those from eukaryotes, do not possess signal peptides, and would therefore be intracellular acting at neutral pH. The bacterial homologues have Thr218 replaced by Asp, a change which in renin has been shown to confer activity at neutral pH. No pepsin homologues could be detected in any archaean genome. CONCLUSION: The peptidase family A1 is found in some species of bacteria as well as eukaryotes. The bacterial homologues fall into two groups, one from oceanic bacteria and one from plant symbionts. The bacterial homologues are all predicted to be intracellular proteins, unlike the eukaryotic enzymes. 
The bacterial homologues are bilobed like pepsin, implying that if no horizontal gene transfer has occurred the duplication and fusion event might be very ancient indeed, preceding the divergence of bacteria and eukaryotes. It is unclear whether all the bacterial homologues are derived from horizontal gene transfer, but those from the plant symbionts probably are. The homologues from oceanic bacteria are most closely related to memapsins (or BACE-1 and BACE-2), but are so divergent that they are close to the root of the phylogenetic tree and to the division of the A1 family into two subfamilies.
- Cannon JP
- Plasticity of the immunoglobulin domain in the evolution of immunity.
- Integr Comp Biol. 2009; 49: 187-96
Immune receptors are omnipresent in multicellular organisms and comprise a vast array of molecular structures that serve to detect and eliminate pathogenic threats. The immunoglobulin (Ig) domain, a central structural feature of the antigen binding receptors that mediate adaptive immunity in jawed vertebrates, appears to play a particularly widespread role in metazoan immunity. Recent reports also have implicated Ig domains in the immune responses of protostomes such as flies and snails. Our research has focused on understanding the utilization of the Ig domain in the immunity of chordates and has identified numerous multigene families of Ig domain-containing receptors that appear to serve roles distinct from the adaptive antigen-binding receptors. Three families have received particular focus: novel immune-type receptors (NITRs) of bony fish, modular domain immune-type receptors (MDIRs) of cartilaginous fish and variable region-containing chitin-binding proteins (VCBPs) of amphioxus. NITRs and MDIRs are encoded in large multigene families of highly diversified forms and exhibit a striking dichotomy of an apparently ubiquitous presence but extensive diversification of sequence both within and among the particular taxonomic groups in which they are found. Crystal structures of VCBPs and NITRs demonstrate significant similarity to those of antigen-binding receptors but at the same time exhibit key differences that imply acquisition of separate and distinct ligand-binding functions. The tremendous plasticity of the Ig domain makes it a strong focus for studies of evolutionary events that have shaped modern integrated immune systems. Current data are consistent with a model of extremely rapid emergence and divergence of immune receptors, perhaps specific to individual species, as organisms contend with environments in which pathogens are continually selected for variation of their own molecular signatures.
- Viiri KM, Heinonen TY, Maki M, Lohi O
- Phylogenetic analysis of the SAP30 family of transcriptional regulators reveals functional divergence in the domain that binds the nuclear matrix.
- BMC Evol Biol. 2009; 9: 149-149
BACKGROUND: Deacetylation of histones plays a fundamental role in gene silencing, and this is mediated by a corepressor complex containing Sin3 as an essential scaffold protein. In this report we examine the evolution of two proteins in this complex, the Sin3-associated proteins SAP30L and SAP30, by using an archive of protein sequences from 62 species. RESULTS: Our analysis indicates that in tetrapods SAP30L is more similar than SAP30 to the ancestral protein, and the two copies in this group originated by gene duplication which occurred after the divergence of Actinopterygii and Sarcopterygii about 450 million years ago (Mya). The phylogenetic analysis and biochemical experiments suggest that SAP30 has diverged functionally from the ancestral SAP30L by accumulating mutations that have caused attenuation of one of the original functions, association with the nuclear matrix. This function is mediated by a nuclear matrix association sequence, which consists of a conserved motif in the C-terminus and the adjacent nucleolar localization signal (NoLS). CONCLUSION: These results add further insight into the evolution and function of proteins of the SAP30 family, which share many characteristics with nuclear scaffolding proteins that are intimately involved in regulation of gene expression. Furthermore, SAP30L seems essential to eukaryotic biology, as it is found in animals, plants, fungi, as well as some taxa of unicellular eukaryotes.
- Rajagopalan L, Pereira FA, Lichtarge O, Brownell WE
- Identification of functionally important residues/domains in membrane proteins using an evolutionary approach coupled with systematic mutational analysis.
- Methods Mol Biol. 2009; 493: 287-97
Structure-function studies of membrane proteins present a unique challenge to researchers due to the numerous technical difficulties associated with their expression, purification and structural characterization. In the absence of structural information, rational identification of putative functionally important residues/regions is difficult. Phylogenetic relationships could provide valuable information about the functional significance of a particular residue or region of a membrane protein. Evolutionary Trace (ET) analysis is a method developed to utilize this phylogenetic information to predict functional sites in proteins. In this method, residues are ranked according to conservation or divergence through evolution, based on the hypothesis that mutations at key positions should coincide with functional evolutionary divergences. This information can be used as the basis for a systematic mutational analysis of identified residues, leading to the identification of functionally important residues and/or domains in membrane proteins, in the absence of structural information apart from the primary amino acid sequence. This approach is potentially useful in the context of the auditory system, as several key processes in audition involve the action of membrane proteins, many of which are novel and not well characterized structurally or functionally to date.
- McTaggart SJ, Conlon C, Colbourne JK, Blaxter ML, Little TJ
- The components of the Daphnia pulex immune system as revealed by complete genome sequencing.
- BMC Genomics. 2009; 10: 175-175
BACKGROUND: Branchiopod crustaceans in the genus Daphnia are key model organisms for investigating interactions between genes and the environment. One major theme of research on Daphnia species has been the evolution of resistance to pathogens and parasites, but lack of knowledge of the Daphnia immune system has limited the study of immune responses. Here we provide a survey of the immune-related genome of D. pulex, derived from the newly completed genome sequence. Genes likely to be involved in innate immune responses were identified by comparison to homologues from other arthropods. For each candidate, the gene model was refined, and we conducted an analysis of sequence divergence from homologues from other taxa. RESULTS AND CONCLUSION: We found that some immune pathways, in particular the TOLL pathway, are fairly well conserved between insects and Daphnia, while other elements, in particular antimicrobial peptides, could not be recovered from the genome sequence. We also found considerable variation in gene family copy number when comparing Daphnia to insects and present phylogenetic analyses to shed light on the evolution of a range of conserved immune gene families.
- Muthusamy N et al.
- Phylogenetic analysis of the NEEP21/calcyon/P19 family of endocytic proteins: evidence for functional evolution in the vertebrate CNS.
- J Mol Evol. 2009; 69: 319-32
Endocytosis and vesicle trafficking are required for optimal neural transmission. Yet, little is currently known about the evolution of neuronal proteins regulating these processes. Here, we report the first phylogenetic study of NEEP21, calcyon, and P19, a family of neuronal proteins implicated in synaptic receptor endocytosis and recycling, as well as in membrane protein trafficking in the somatodendritic and axonal compartments of differentiated neurons. Database searches identified orthologs for P19 and NEEP21 in bony fish, but not urochordate or invertebrate phyla. Calcyon orthologs were only retrieved from mammalian databases and distant relatives from teleost fish. In situ localization of the P19 zebrafish ortholog, and extant progenitor of the gene family, revealed a CNS specific expression pattern. Based on non-synonymous nucleotide substitution rates, the calcyon genes appear to be under less intense negative selective pressure. Indeed, a functional group II WW domain binding motif was detected in primate and human calcyon, but not in non-primate orthologs. Sequencing of the calcyon gene from 80 human subjects revealed a non-synonymous single nucleotide polymorphism that abrogated group II WW domain protein binding. Altogether, our data indicate the NEEP21/calcyon/P19 gene family emerged, and underwent two rounds of gene duplication relatively late in metazoan evolution (but early in vertebrate evolution at the latest). As functional studies suggest NEEP21 and calcyon play related, but distinct roles in regulating vesicle trafficking at synapses, and in neurons in general, we propose the family arose in chordates to support a more diverse range of synaptic and behavioral responses.
- Idone V, Tam C, Goss JW, Toomre D, Pypaert M, Andrews NW
- Repair of injured plasma membrane by rapid Ca2+-dependent endocytosis.
- J Cell Biol. 2008; 180: 905-14
Ca2+ influx through plasma membrane lesions triggers a rapid repair process that was previously shown to require the exocytosis of lysosomal organelles (Reddy, A., E. Caler, and N. Andrews. 2001. Cell. 106:157-169). However, how exocytosis leads to membrane resealing has remained obscure, particularly for stable lesions caused by pore-forming proteins. In this study, we show that Ca2+-dependent resealing after permeabilization with the bacterial toxin streptolysin O (SLO) requires endocytosis via a novel pathway that removes SLO-containing pores from the plasma membrane. We also find that endocytosis is similarly required to repair lesions formed in mechanically wounded cells. Inhibition of lesion endocytosis (by sterol depletion) inhibits repair, whereas enhancement of endocytosis through disruption of the actin cytoskeleton facilitates resealing. Thus, endocytosis promotes wound resealing by removing lesions from the plasma membrane. These findings provide an important new insight into how cells protect themselves not only from mechanical injury but also from microbial toxins and pore-forming proteins produced by the immune system.
- Intra J, Pavesi G, Horner DS
- Phylogenetic analyses suggest multiple changes of substrate specificity within the glycosyl hydrolase 20 family.
- BMC Evol Biol. 2008; 8: 214-214
BACKGROUND: Beta-N-acetylhexosaminidases belonging to the glycosyl hydrolase 20 (GH20) family are involved in the removal of terminal beta-glycosidically linked N-acetylhexosamine residues. These enzymes, widely distributed in microorganisms, animals and plants, are involved in many important physiological and pathological processes, such as cell structural integrity, energy storage, pathogen defence, viral penetration, cellular signalling, fertilization, development of carcinomas, inflammatory events and lysosomal storage diseases. Nevertheless, only limited analyses of phylogenetic relationships between GH20 genes have been performed until now. RESULTS: Careful phylogenetic analyses of 233 inferred protein sequences from eukaryotes and prokaryotes reveal a complex history for the GH20 family. In bacteria, multiple gene duplications and lineage specific gene loss (and/or horizontal gene transfer) are required to explain the observed taxonomic distribution. The last common ancestor of extant eukaryotes is likely to have possessed at least one GH20 family member. At least one gene duplication before the divergence of animals, plants and fungi as well as other lineage specific duplication events have given rise to multiple paralogous subfamilies in eukaryotes. Phylogenetic analyses also suggest that a second, divergent subfamily of GH20 family genes present in animals derives from an independent prokaryotic source. Our data suggest multiple convergent changes of functional roles of GH20 family members in eukaryotes. CONCLUSION: This study represents the first detailed evolutionary analysis of the glycosyl hydrolase GH20 family. Mapping of data concerning physiological function of GH20 family members onto the phylogenetic tree reveals that apparently convergent and highly lineage specific changes in substrate specificity have occurred in multiple GH20 subfamilies.
- Ayme-Southgate AJ, Southgate RJ, Philipp RA, Sotka EE, Kramp C
- The myofibrillar protein, projectin, is highly conserved across insect evolution except for its PEVK domain.
- J Mol Evol. 2008; 67: 653-69
All striated muscles respond to stretch by a delayed increase in tension. This physiological response, known as stretch activation, is, however, predominantly found in vertebrate cardiac muscle and insect asynchronous flight muscles. Stretch activation relies on an elastic third filament system composed of giant proteins known as titin in vertebrates or kettin and projectin in insects. The projectin insect protein functions jointly as a "scaffold and ruler" system during myofibril assembly and as an elastic protein during stretch activation. An evolutionary analysis of the projectin molecule could potentially provide insight into how distinct protein regions may have evolved in response to different evolutionary constraints. We mined candidate genes in representative insect species from Hemiptera to Diptera, from published and novel genome sequence data, and carried out a detailed molecular and phylogenetic analysis. The general domain organization of projectin is highly conserved, as are the protein sequences of its two repeated regions, the immunoglobulin type C and fibronectin type III domains. The conservation in structure and sequence is consistent with the proposed function of projectin as a scaffold and ruler. In contrast, the amino acid sequences of the elastic PEVK domains are noticeably divergent, although their length and overall unusual amino acid makeup are conserved. These patterns suggest that the PEVK region working as an unstructured domain can still maintain its dynamic, and even its three-dimensional, properties, without the need for strict amino acid conservation. Phylogenetic analysis of the projectin proteins also supports a reclassification of the Hymenoptera in relation to Diptera and Coleoptera.
- Ryan TJ, Emes RD, Grant SG, Komiyama NH
- Evolution of NMDA receptor cytoplasmic interaction domains: implications for organisation of synaptic signalling complexes.
- BMC Neurosci. 2008; 9: 6-6
BACKGROUND: Glutamate gated postsynaptic receptors in the central nervous system (CNS) are essential for environmentally stimulated behaviours including learning and memory in both invertebrates and vertebrates. Though their genetics, biochemistry, physiology, and role in behaviour have been intensely studied in vitro and in vivo, their molecular evolution and structural aspects remain poorly understood. To understand how these receptors have evolved different physiological requirements we have investigated the molecular evolution of glutamate gated receptors and ion channels, in particular the N-methyl-D-aspartate (NMDA) receptor, which is essential for higher cognitive function. Studies of rodent NMDA receptors show that the C-terminal intracellular domain forms a signalling complex with enzymes and scaffold proteins, which is important for neuronal and behavioural plasticity. RESULTS: The vertebrate NMDA receptor was found to have subunits with C-terminal domains up to 500 amino acids longer than invertebrates. This extension was specific to the NR2 subunit and occurred before the duplication and subsequent divergence of NR2 in the vertebrate lineage. The shorter invertebrate C-terminus lacked vertebrate protein interaction motifs involved with forming a signalling complex although the terminal PDZ interaction domain was conserved. The vertebrate NR2 C-terminal domain was predicted to be intrinsically disordered but with a conserved secondary structure. CONCLUSION: We highlight an evolutionary adaptation specific to vertebrate NMDA receptor NR2 subunits. Using in silico methods we find that evolution has shaped the NMDA receptor C-terminus into an unstructured but modular intracellular domain that parallels the expansion in complexity of an NMDA receptor signalling complex in the vertebrate lineage. We propose the NR2 C-terminus has evolved to be a natively unstructured yet flexible hub organising postsynaptic signalling.
The evolution of the NR2 C-terminus and its associated signalling complex may contribute to species differences in behaviour and in particular cognitive function.
- Brocchieri L, Conway de Macario E, Macario AJ
- hsp70 genes in the human genome: Conservation and differentiation patterns predict a wide array of overlapping and specialized functions.
- BMC Evol Biol. 2008; 8: 19-19
BACKGROUND: Hsp70 chaperones are required for key cellular processes and response to environmental changes and survival but they have not been fully characterized yet. The human hsp70-gene family has an unknown number of members (eleven counted over ten years ago); some have been described but the information is incomplete and inconsistent. A coherent body of knowledge encompassing all family components that would facilitate their study individually and as a group is lacking. Nowadays, the study of chaperone genes benefits from the availability of genome sequences and a new protocol, chaperonomics, which we applied to elucidate the human hsp70 family. RESULTS: We identified 47 hsp70 sequences, 17 genes and 30 pseudogenes. The genes distributed into seven evolutionarily distinct groups with distinguishable subgroups according to phylogenetic and other data, such as exon-intron and protein features. The N-terminal ATP-binding domain (ABD) was conserved at least partially in the majority of the proteins but the C-terminal substrate-binding domain (SBD) was not. Nine proteins were typical Hsp70s (65-80 kDa) with ABD and SBD, two were lighter lacking partly or totally the SBD, and six were heavier (>80 kDa) with divergent C-terminal domains. We also analyzed exon-intron features, transcriptional variants and protein structure and isoforms, and modality and patterns of expression in various tissues and developmental stages. Evolutionary analyses, including human hsp70 genes and pseudogenes, and other eukaryotic hsp70 genes, showed that six human genes encoding cytosolic Hsp70s and 27 pseudogenes originated from retro-transposition of HSPA8, a gene highly expressed in most tissues and developmental stages. CONCLUSION: The human hsp70-gene family is characterized by a remarkable evolutionary diversity that mainly resulted from multiple duplications and retrotranspositions of a highly expressed gene, HSPA8. 
Human Hsp70 proteins are clustered into seven evolutionary Groups, with divergent C-terminal domains likely defining their distinctive functions. These functions may also be further defined by the observed differences in the N-terminal domain.
- Burglin TR
- Evolution of hedgehog and hedgehog-related genes, their origin from Hog proteins in ancestral eukaryotes and discovery of a novel Hint motif.
- BMC Genomics. 2008; 9: 127-127
BACKGROUND: The Hedgehog (Hh) signaling pathway plays important roles in human and animal development as well as in carcinogenesis. Hh molecules have been found in both protostomes and deuterostomes, but curiously the nematode Caenorhabditis elegans lacks a bona-fide Hh. Instead a series of Hh-related proteins are found, which share the Hint/Hog domain with Hh, but have distinct N-termini. RESULTS: We performed extensive searches of genomes, such as those of the cnidarian Nematostella vectensis and several nematodes, to gain further insights into Hh evolution. We found six genes in N. vectensis with a relationship to Hh: two Hh genes, one gene with a Hh N-terminal domain fused to a von Willebrand factor type A domain (VWA), and three genes containing Hint/Hog domains with distinct novel N-termini. In the nematode Brugia malayi we find the same types of hh-related genes as in C. elegans. In the more distantly related Enoplea nematodes Xiphinema and Trichinella spiralis we find a bona-fide Hh. In addition, T. spiralis also has a quahog gene like C. elegans, and there are several additional hh-related genes, some of which have secreted N-terminal domains of only 15 to 25 residues. Examination of other Hh pathway components revealed that T. spiralis, like C. elegans, lacks some of these components. Extending our search to all eukaryotes, we recovered genes containing a Hog domain similar to Hh from many different groups of protists. In addition, we identified a novel Hint gene family present in many eukaryote groups that encodes a VWA domain fused to a distinct Hint domain we call Vint. Further members of a poorly characterized Hint family were also retrieved from bacteria. CONCLUSION: In Cnidaria and nematodes the evolution of hh genes occurred in parallel to the evolution of other genes that contain a Hog domain but have different N-termini.
The fact that Hog genes comprising a secreted N-terminus and a Hog domain are found in many protists indicates that this gene family must have arisen in very early eukaryotic evolution, and gave rise eventually to hh and hh-related genes in animals. The results indicate a hitherto unsuspected ability of Hog domain encoding genes to evolve new N-termini. In one instance in Cnidaria, the Hh N-terminal signaling domain is associated with a VWA domain and lacks a Hog domain, suggesting a modular mode of evolution also for the N-terminal domain. The Hog domain proteins, the inteins and VWA-Vint proteins are three families of Hint domain proteins that evolved in parallel in eukaryotes.
- Vandre DD et al.
- Dysferlin is expressed in human placenta but does not associate with caveolin.
- Biol Reprod. 2007; 77: 533-42
A proteomics screen of human placental microvillous syncytiotrophoblasts (STBs) revealed the expression of dysferlin (DYSF), a plasma membrane repair protein associated with certain muscular dystrophies. This was unexpected given that previous studies of DYSF have been restricted to skeletal muscle. Within the placenta, DYSF localized to the STB and, with the exception of variable labeling in the fetal placental endothelium, none of the other cell types expressed detectable levels of DYSF. Such restricted expression was recapitulated using primary trophoblast cell cultures, because the syncytia expressed DYSF, but not the prefusion mononuclear cells. The apical plasma membrane of the STB contained approximately 4-fold more DYSF than the basal membrane, suggesting polarized trafficking. Unlike skeletal muscle, DYSF in the STB is localized to the plasma membrane in the absence of caveolin. DYSF expression in the STB was developmentally regulated, because first-trimester placentas expressed approximately 3-fold more DYSF than term placentas. As the current literature indicates that few cell types express DYSF, it is of interest that the two major syncytial structures in the human body, skeletal muscle and the STB, express this protein.
- Jimenez JL, Davletov B
- Beta-strand recombination in tricalbin evolution and the origin of synaptotagmin-like C2 domains.
- Proteins. 2007; 68: 770-8
Two protein families involved in membrane traffic, tricalbins and synaptotagmins, contain several copies of C2 domains and are related based on their sequence and domain architecture. Paradoxically, tricalbin and synaptotagmin C2 domains belong to different structural types with apparent circular permutation of terminal beta-strands. To understand whether a topological switch took place, we analyzed tricalbin and synaptotagmin-like C2 domains using two-dimensional structural analysis. We found that yeast tricalbins contain five to six C2 domains. One of these C2 domains possesses many features of synaptotagmin-like C2 domains and also carries a conserved C-terminal strand that is similar to its structural equivalent in synaptotagmin-like C2 domains, suggesting a structural permutation event. Indeed, among higher eukaryotes, animal tricalbins have evolved a C2 domain with synaptotagmin-like topology indicating that the structural conversion has taken place. Investigation of plant synaptotagmins, however, proves that they are direct tricalbin orthologs. Our analysis shows that beta-strand recombination is a possible evolutionary mechanism to generate new structural topologies with altered functional properties.
- Munfus DL, Haga CL, Burrows PD, Cooper MD
- A conserved gene family encodes transmembrane proteins with fibronectin, immunoglobulin and leucine-rich repeat domains (FIGLER).
- BMC Biol. 2007; 5: 36-36
BACKGROUND: In mouse the cytokine interleukin-7 (IL-7) is required for generation of B lymphocytes, but human IL-7 does not appear to have this function. A bioinformatics approach was therefore used to identify IL-7 receptor related genes in the hope of identifying the elusive human cytokine. RESULTS: Our database search identified a family of nine gene candidates, which we have provisionally named fibronectin immunoglobulin leucine-rich repeat (FIGLER). The FIGLER 1-9 genes are predicted to encode type I transmembrane glycoproteins with 6-12 leucine-rich repeats (LRR), a C2 type Ig domain, a fibronectin type III domain, a hydrophobic transmembrane domain, and a cytoplasmic domain containing one to four tyrosine residues. Members of this multichromosomal gene family possess 20-47% overall amino acid identity and are differentially expressed in cell lines and primary hematopoietic lineage cells. Genes for FIGLER homologs were identified in macaque, orangutan, chimpanzee, mouse, rat, dog, chicken, toad, and puffer fish databases. The non-human FIGLER homologs share 38-99% overall amino acid identity with their human counterpart. CONCLUSION: The extracellular domain structure and absence of recognizable cytoplasmic signaling motifs in members of the highly conserved FIGLER gene family suggest a trophic or cell adhesion function for these molecules.
- Min SW, Chang WP, Sudhof TC
- E-Syts, a family of membranous Ca2+-sensor proteins with multiple C2 domains.
- Proc Natl Acad Sci U S A. 2007; 104: 3823-8
C2 domains are autonomously folded protein modules that generally act as Ca2+- and phospholipid-binding domains and/or as protein-protein interaction domains. We now report the primary structures and biochemical properties of a family of evolutionarily conserved mammalian proteins, referred to as E-Syts, for extended synaptotagmin-like proteins. E-Syts contain an N-terminal transmembrane region, a central juxtamembranous domain that is conserved from yeast to human, and five (E-Syt1) or three (E-Syt2 and E-Syt3) C-terminal C2 domains. Only the first E-Syt C2 domain, the C2A domain, includes the complete sequence motif that is required for Ca2+ binding in C2 domains. Recombinant protein fragments of E-Syt2 that include the first C2 domain are capable of Ca2+-dependent phospholipid binding at micromolar concentrations of free Ca2+, suggesting that E-Syts bind Ca2+ through their first C2 domain in a phospholipid complex. E-Syts are ubiquitously expressed, but enriched in brain. Expression of myc-tagged E-Syt proteins in transfected cells demonstrated localization to intracellular membranes for E-Syt1 and to plasma membranes for E-Syt2 and E-Syt3. Structure/function studies showed that the plasma-membrane localization of E-Syt2 and E-Syt3 was directed by their C-terminal C2C domains. This result reveals an unexpected mechanism by which the C2C domains of E-Syt2 and E-Syt3 function as a targeting motif that localizes these proteins into the plasma membrane independent of their transmembrane region. Viewed together, our findings suggest that E-Syts function as Ca2+-regulated intrinsic membrane proteins with multiple C2 domains, expanding the repertoire of such proteins to a fourth class beyond synaptotagmins, ferlins, and MCTPs (multiple C2 domain and transmembrane region proteins).
- Larkin MA et al.
- Clustal W and Clustal X version 2.0.
- Bioinformatics. 2007; 23: 2947-8
SUMMARY: The Clustal W and Clustal X multiple sequence alignment programs have been completely rewritten in C++. This will facilitate the further development of the alignment algorithms in the future and has allowed proper porting of the programs to the latest versions of Linux, Macintosh and Windows operating systems. AVAILABILITY: The programs can be run on-line from the EBI web server: http://www.ebi.ac.uk/tools/clustalw2. The source code and executables for Windows, Linux and Macintosh computers are available from the EBI ftp site ftp://ftp.ebi.ac.uk/pub/software/clustalw2/
- Uda K et al.
- Evolution of the arginine kinase gene family.
- Comp Biochem Physiol Part D Genomics Proteomics. 2006; 1: 209-18
Arginine kinase (AK), catalyzing the reversible transfer of phosphate from MgATP to arginine yielding phosphoarginine and MgADP, is widely distributed throughout the invertebrates and is also present in certain protozoa. Typically, these proteins are found as monomers targeted to the cytoplasm, but true dimeric and contiguous dimeric AKs as well as mitochondrial AK activities have been observed. In the present study, we have obtained the sequences of the genes for AKs from two distantly related molluscs, the cephalopod Nautilus pompilius and the bivalve Crassostrea gigas. These new data were combined with available gene structure data (exon/intron organization) extracted from EST and genome sequencing project databases. These data, comprised of 23 sequences and gene structures from Protozoa, Cnidaria, Platyhelminthes, Mollusca, Arthropoda and Nematoda, provide great insight into the evolution and divergence of the AK family. Sequence and phylogenetic analyses clearly show that the AKs are homologous, having arisen from some common ancestor. However, AK gene organization is highly divergent and variable. Molluscan AK genes typically have a highly conserved six-exon/five-intron organization, a structure that is very similar to that of the platyhelminth Schistosoma mansoni. Arthropod and nematode AK genes have fewer introns, while the cnidarian and protozoan genes each display unique exon/intron organization when compared to the other AK genes. The non-conservative nature of the AK genes is in sharp contrast to the relatively high degree of conservation of intron positions seen in a homologous enzyme creatine kinase (CK). The present results also show that gene duplication and subsequent fusion events forming unusual two-domain AKs occurred independently at least four times as these contiguous dimers are present in Protozoa, Cnidaria, Platyhelminthes and Mollusca.
Detailed analyses of the amino acid sequences indicate that two AKs (one each from Drosophila and Caenorhabditis) have what appear to be N-terminal mitochondrial targeting sequences, providing the first evidence for true mitochondrial AK genes. The AK gene family is ancient and the lineage has undergone considerable divergence as well as multiple duplication and fusion events.
- Rodenburg KW, Smolenaars MM, Van Hoof D, Van der Horst DJ
- Sequence analysis of the non-recurring C-terminal domains shows that insect lipoprotein receptors constitute a distinct group of LDL receptor family members.
- Insect Biochem Mol Biol. 2006; 36: 250-63
Lipoprotein-mediated delivery of lipids in mammals involves endocytic receptors of the low density lipoprotein (LDL) receptor (LDLR) family. In contrast, in insects, the lipoprotein, lipophorin (Lp), functions as a reusable lipid shuttle in lipid delivery, and these animals, therefore, were not supposed to use endocytic receptors. However, recent data indicate additional endocytic uptake of Lp, mediated by a Lp receptor (LpR) of the LDLR family. The two N-terminal domains of LDLR family members are involved in ligand binding and dissociation, respectively, and are composed of a mosaic of multiple repeats. The three C-terminal domains, viz., the optional O-linked glycosylation domain, the transmembrane domain, and the intracellular domain, are of a non-repetitive sequence. The present classification of newly discovered LDLR family members, including the LpRs, bears no relevance to physiological function. Therefore, as a novel approach, the C-terminal domains of LDLR family members across the entire animal kingdom were used to perform a sequence comparison analysis in combination with a phylogenetic tree analysis. The LpRs appeared to segregate into a specific group distinct from the groups encompassing the other family members, and each of the three C-terminal domains of the insect receptors is composed of a unique set of sequence motifs. Based on conservation of sequence motifs and organization of these motifs in the domains, LpR most closely resembles the groups of the LDLRs, very low density lipoprotein (VLDL) receptors, and vitellogenin receptors. However, in sequence aspects in which LpR deviates from these three receptor groups, it most notably resembles LDLR-related protein-2, or megalin. These features might explain the functional differences disclosed between insect and mammalian lipoprotein receptors.
- Cho W, Stahelin RV
- Membrane binding and subcellular targeting of C2 domains.
- Biochim Biophys Acta. 2006; 1761: 838-49
C2 domains are a ubiquitous structural module and many of them function in Ca2+-dependent membrane binding and thereby serve as Ca2+ effectors for divergent Ca2+-mediated cellular processes. Extensive structural, biochemical, biophysical, and cellular studies of C2 domains and host proteins in the past decade have shown that due to their structural diversity C2 domains have disparate Ca2+ sensitivity, lipid selectivity and membrane binding mechanisms. This review summarizes the basic structural and functional properties of C2 domains as well as recent findings on Ca2+ and membrane binding, lipid selectivity, and subcellular localization of C2 domains and their host proteins.
- Harlin-Cognato A, Hoffman EA, Jones AG
- Gene cooption without duplication during the evolution of a male-pregnancy gene in pipefish.
- Proc Natl Acad Sci U S A. 2006; 103: 19407-12
Comparative studies of developmental processes suggest that novel traits usually evolve through the cooption of preexisting genes and proteins, mainly via gene duplication and functional specialization of paralogs. However, an alternative hypothesis is that novel protein function can evolve without gene duplication, through changes in the spatiotemporal patterns of gene expression (e.g., via cis-regulatory elements), or functional modifications (e.g., addition of functional domains) of the proteins they encode, or both. Here we present an astacin metalloprotease, dubbed patristacin, which has been coopted without duplication, via alteration in the expression of a preexisting gene from the kidney and liver of bony fishes, for a novel role in the brood pouch of pregnant male pipefish. We examined the molecular evolution of patristacin and found conservation of astacin-specific motifs but also several positively selected amino acids that may represent functional modifications for male pregnancy. Overall, our results pinpoint a clear case in which gene cooption occurred without gene duplication during the genesis of an evolutionarily significant novel structure, the male brood pouch. These findings contribute to a growing understanding of morphological innovation, a critically important but poorly understood process in evolutionary biology.
- Cannon JP et al.
- Ancient divergence of a complex family of immune-type receptor genes.
- Immunogenetics. 2006; 58: 362-73
Multigene families of activating/inhibitory receptors belonging to the immunoglobulin superfamily (IgSF) regulate immunological and other cell-cell interactions. A new family of such genes, termed modular domain immune-type receptors (MDIRs), has been identified in the clearnose skate (Raja eglanteria), a phylogenetically ancient vertebrate. At least five different major forms of predicted MDIR proteins are comprised of four different subfamilies of IgSF ectodomains of the intermediate (I)- or C2-set. The predicted number of individual IgSF ectodomains in MDIRs varies from one to six. MDIR1 contains a positively charged transmembrane residue and MDIR2 and MDIR3 each possesses at least one immunoreceptor tyrosine-based inhibitory motif in their cytoplasmic regions. MDIR4 and MDIR5 lack characteristic activating/inhibitory signalling motifs. MDIRs are encoded in a particularly large and complex multigene family. MDIR domains exhibit distant sequence similarity to mammalian CMRF-35-like molecules, polymeric immunoglobulin receptors, triggering receptors expressed on myeloid cells (TREMs), TREM-like transcripts, NKp44 and FcR homologs, as well as to sequences identified in several different vertebrate genomes. Phylogenetic analyses suggest that MDIRs are representative members of an extended family of IgSF genes that diverged before or very early in evolution of the vertebrates and subsequently came to occupy multiple, fully independent distributions in the present day.
- Snell EA, Brooke NM, Taylor WR, Casane D, Philippe H, Holland PW
- An unusual choanoflagellate protein released by Hedgehog autocatalytic processing.
- Proc Biol Sci. 2006; 273: 401-7
Hedgehog proteins are important cell-cell signalling proteins utilized during the development of multicellular animals. Members of the hedgehog gene family have not been detected outside the Metazoa, raising unanswered questions about their evolutionary origin. Here we report a highly unusual hedgehog-related gene from a choanoflagellate, a close unicellular relative of the animals. The deduced C-terminal domain, Hoglet-C, is homologous to the autocatalytic domain of Hedgehog proteins and is predicted to function in autocatalytic cleavage of the precursor peptide. In contrast, the N-terminal Hoglet-N peptide has no similarity to the signalling peptide of Hedgehog (Hh-N). Instead, Hoglet-N is deduced to be a secreted protein with an enormous threonine-rich domain of unprecedented size and purity (over 200 threonine residues) and two polysaccharide-binding domains. Structural modelling reveals that these domains have a novel combination of features found in cellulose-binding domains (CBD) of types IIa and IIb, and are expected to bind cellulose. We propose that the two CBD domains enable Hoglet-N to bind to plant matter, tethering an amorphous nucleophilic anchor, facilitating transient adhesion of the choanoflagellate cell. Since Hh-C and Hoglet-C are homologous, but Hh-N and Hoglet-N are not, we argue that metazoan hedgehog genes evolved by fusion of two distinct genes.
- Nikolskaya AN, Arighi CN, Huang H, Barker WC, Wu CH
- PIRSF family classification system for protein functional and evolutionary analysis.
- Evol Bioinform Online. 2006; 2: 197-209
The PIRSF protein classification system (http://pir.georgetown.edu/pirsf/) reflects evolutionary relationships of full-length proteins and domains. The primary PIRSF classification unit is the homeomorphic family, whose members are both homologous (evolved from a common ancestor) and homeomorphic (sharing full-length sequence similarity and a common domain architecture). PIRSF families are curated systematically based on literature review and integrative sequence and functional analysis, including sequence and structure similarity, domain architecture, functional association, genome context, and phyletic pattern. The results of classification and expert annotation are summarized in PIRSF family reports with graphical viewers for taxonomic distribution, domain architecture, family hierarchy, and multiple alignment and phylogenetic tree. The PIRSF system provides a comprehensive resource for bioinformatics analysis and comparative studies of protein function and evolution. Domain or fold-based searches allow identification of evolutionarily related protein families sharing domains or structural folds. Functional convergence and functional divergence are revealed by the relationships between protein classification and curated family functions. The taxonomic distribution allows the identification of lineage-specific or broadly conserved protein families and can reveal horizontal gene transfer. Here we demonstrate, with illustrative examples, how to use the web-based PIRSF system as a tool for functional and evolutionary studies of protein families.
- Reiner O et al.
- The evolving doublecortin (DCX) superfamily.
- BMC Genomics. 2006; 7: 188-188
BACKGROUND: Doublecortin (DCX) domains serve as protein-interaction platforms. Mutations in members of this protein superfamily are linked to several genetic diseases. Mutations in the human DCX gene result in abnormal neuronal migration, epilepsy, and mental retardation; mutations in RP1 are associated with a form of inherited blindness, and DCDC2 has been associated with dyslectic reading disabilities. RESULTS: The DCX-repeat gene family is composed of eleven paralogs in human and in mouse. Its evolution was followed across vertebrates, invertebrates, and was traced to unicellular organisms, thus enabling following evolutionary additions and losses of genes or domains. The N-terminal and C-terminal DCX domains have undergone sub-specialization and divergence. Developmental in situ hybridization data for nine genes were generated. In addition, a novel co-expression analysis for most human and mouse DCX superfamily-genes was performed using high-throughput expression data extracted from Unigene. We performed an in-depth study of a complete gene superfamily using several complementary methods. CONCLUSION: This study reveals the existence and conservation of multiple members of the DCX superfamily in different species. Sequence analysis combined with expression analysis is likely to be a useful tool to predict correlations between human disease and mouse models. The sub-specialization of some members due to restricted expression patterns and sequence divergence may explain the successful addition of genes to this family throughout evolution.
- McLellan AS, Zimmermann W, Moore T
- Conservation of pregnancy-specific glycoprotein (PSG) N domains following independent expansions of the gene families in rodents and primates.
- BMC Evol Biol. 2005; 5: 39-39
BACKGROUND: Rodent and primate pregnancy-specific glycoprotein (PSG) gene families have expanded independently from a common ancestor and are expressed virtually exclusively in placental trophoblasts. However, within each species, it is unknown whether multiple paralogs have been selected for diversification of function, or for increased dosage of monofunctional PSG. We analysed the evolution of the mouse PSG sequences, and compared them to rat, human and baboon PSGs to attempt to understand the evolution of this complex gene family. RESULTS: Phylogenetic tree analyses indicate that the primate N domains and the rodent N1 domains exhibit a higher degree of conservation than that observed in a comparison of the mouse N1 and N2 domains, or mouse N1 and N3 domains. Compared to human and baboon PSG N domain exons, mouse and rat PSG N domain exons have undergone less sequence homogenisation. The high non-synonymous substitution rates observed in the CFG face of the mouse N1 domain, within a context of overall conservation, suggests divergence of function of mouse PSGs. The rat PSG family appears to have undergone less expansion than the mouse, exhibits lower divergence rates and increased sequence homogenisation in the CFG face of the N1 domain. In contrast to most primate PSG N domains, rodent PSG N1 domains do not contain an RGD tri-peptide motif, but do contain RGD-like sequences, which are not conserved in rodent N2 and N3 domains. CONCLUSION: Relative conservation of primate N domains and rodent N1 domains suggests that, despite independent gene family expansions and structural diversification, mouse and human PSGs retain conserved functions. Human PSG gene family expansion and homogenisation suggests that evolution occurred in a concerted manner that maintains similar functions of PSGs, whilst increasing gene dosage of the family as a whole. 
In the mouse, gene family expansion, coupled with local diversification of the CFG face, suggests selection both for increased gene dosage and diversification of function. Partial conservation of RGD and RGD-like tri-peptides in primate and rodent N and N1 domains, respectively, supports a role for these motifs in PSG function.
- Meijer HJ, Latijnhouwers M, Ligterink W, Govers F
- A transmembrane phospholipase D in Phytophthora; a novel PLD subfamily.
- Gene. 2005; 350: 173-82
Phospholipase D (PLD) is a ubiquitous enzyme in eukaryotes that participates in various cellular processes. Its catalytic domain is characterized by two HKD motifs in the C-terminal part. Until now, two subfamilies were recognized based on their N-terminal domain structure. The first has a PX domain in combination with a PH domain and is designated as PXPH-PLD. Members of the second subfamily, named C2-PLD, have a C2 domain and have, so far, only been found in plants. Here we describe a novel PLD subfamily that we identified in Phytophthora, a genus belonging to the class oomycetes and comprising many important plant pathogens. We cloned Pipld1 from Phytophthora infestans and retrieved full-length sequences of its homologues from Phytophthora sojae and Phytophthora ramorum genome databases. Their promoters contain two putative regulatory elements, one of which is highly conserved in all three genes. The three Phytophthora pld1 genes encode nearly identical proteins of around 1807 amino acids, with the two characteristic HKD motifs in the C-terminal part. Homology of the predicted proteins with known PLDs however is restricted to the two catalytic HKD motifs and adjacent domains. In the N-terminal part Phytophthora PLD1 has a PX-like domain, but it lacks a PH domain. Instead the N-terminal region contains five putative membrane spanning domains suggesting that Phytophthora PLD1 is a transmembrane protein. Since Phytophthora PLD1 cannot be categorized in one of the two existing subfamilies we propose to create a novel subfamily named PXTM-PLD.
- Wu KL, Guo ZJ, Wang HH, Li J
- The WRKY family of transcription factors in rice and Arabidopsis and their origins.
- DNA Res. 2005; 12: 9-26
WRKY transcription factors, originally isolated from plants, contain one or two conserved WRKY domains, about 60 amino acid residues with the WRKYGQK sequence followed by a C2H2 or C2HC zinc finger motif. Evidence is accumulating to suggest that the WRKY proteins play significant roles in responses to biotic and abiotic stresses, and in development. In this research, we identified 102 putative WRKY genes from the rice genome and compared them with those from Arabidopsis. The WRKY genes from rice and Arabidopsis were divided into three groups with several subgroups on the basis of phylogenies and the basic structure of the WRKY domains (WDs). The phylogenetic trees generated from the WDs and the genes indicate that the WRKY gene family arose during evolution through duplication and that the dramatic amplification of rice WRKY genes in group III is due to tandem and segmental gene duplication compared with those of Arabidopsis. The result suggests that some of the rice WRKY genes in group III are evolutionarily more active than those in Arabidopsis, and may have specific roles in monocotyledonous plants. Further, it was possible to identify the presence of WRKY-like genes in protists (Giardia lamblia and Dictyostelium discoideum) and green algae Chlamydomonas reinhardtii through database research, demonstrating the ancient origin of the gene family. The results obtained by alignments of the WDs from different species and other analysis imply that domain gain and loss is a divergent force for expansion of the WRKY gene family, and that a rapid amplification of the WRKY genes predates the divergence of monocots and dicots. On the basis of these results, we believe that genes encoding a single WD may have been derived from the C-terminal WD of the genes harboring two WDs. The conserved intron splicing positions in the WDs of higher plants offer clues about WRKY gene evolution, annotation, and classification.
- Huang CH, Peng J
- Evolutionary conservation and diversification of Rh family genes and proteins.
- Proc Natl Acad Sci U S A. 2005; 102: 15512-7
Rhesus (Rh) proteins were first identified in human erythroid cells and recently in other tissues. Like ammonia transporter (Amt) proteins, their only homologues, Rh proteins have the 12 transmembrane-spanning segments characteristic of transporters. Many think Rh and Amt proteins transport the same substrate, NH(3)/NH(4)(+), whereas others think that Rh proteins transport CO(2) and Amt proteins NH(3). In the latter view, Rh and Amt are different biological gas channels. To reconstruct the phylogeny of the Rh family and study its coexistence with and relationship to Amt in depth, we analyzed 111 Rh genes and 260 Amt genes. Although Rh and Amt are found together in organisms as diverse as unicellular eukaryotes and sea squirts, Rh genes apparently arose later, because they are rare in prokaryotes. However, Rh genes are prominent in vertebrates, in which Amt genes disappear. In organisms with both types of genes, Rh had apparently diverged away from Amt rapidly and then evolved slowly over a long period. Functionally divergent amino acid sites are clustered in transmembrane segments and around the gas-conducting lumen recently identified in Escherichia coli AmtB, in agreement with Rh proteins having new substrate specificity. Despite gene duplications and mutations, the Rh paralogous groups all have apparently been subject to strong purifying selection indicating functional conservation. Genes encoding the classical Rh proteins in mammalian red cells show higher nucleotide substitution rates at nonsynonymous codon positions than other Rh genes, a finding that suggests a possible role for these proteins in red cell morphogenetic evolution.
- Nikolaidis N, Makalowska I, Chalkia D, Makalowski W, Klein J, Nei M
- Origin and evolution of the chicken leukocyte receptor complex.
- Proc Natl Acad Sci U S A. 2005; 102: 4057-62
In mammals, the cell surface receptors encoded by the leukocyte receptor complex (LRC) regulate the activity of T lymphocytes and B lymphocytes, as well as that of natural killer cells, and thus provide protection against pathogens and parasites. The chicken genome encodes many Ig-like receptors that are homologous to the LRC receptors. The chicken Ig-like receptor (CHIR) genes are members of a large monophyletic gene family and are organized into genomic clusters, which are in conserved synteny with the mammalian LRC. One-third of CHIR genes encode polypeptide molecules that contain both activating and inhibitory motifs. These genes are present in different phylogenetic groups, suggesting that the primordial CHIR gene could have encoded both types of motifs in a single molecule. In contrast to the mammalian LRC genes, the CHIR genes with similar function (inhibition or activation) are evolutionarily closely related. We propose that, in addition to recombination, single nucleotide substitutions played an important role in the generation of receptors with different functions. Structural models and amino acid analyses of the CHIR proteins reveal the presence of different types of Ig-like domains in the same phylogenetic groups, as well as sharing of conserved residues and conserved changes of residues between different CHIR groups and between CHIRs and LRCs. Our data support the notion that the CHIR gene clusters are regions homologous to the mammalian LRC gene cluster and favor a model of evolution by repeated processes of birth and death (expansion-contraction) of the Ig-like receptor genes.
- Premzl M, Gready JE, Jermiin LS, Simonic T, Marshall Graves JA
- Evolution of vertebrate genes related to prion and Shadoo proteins--clues from comparative genomic analysis.
- Mol Biol Evol. 2004; 21: 2210-31
Recent findings of new genes in fish related to the prion protein (PrP) gene PRNP, including our recent report of SPRN coding for Shadoo (Sho) protein found also in mammals, raise issues of their function and evolution. Here we report additional novel fish genes found in public databases, including a duplicated SPRN gene, SPRNB, in Fugu, Tetraodon, carp, and zebrafish encoding the Sho2 protein, and we use comparative genomic analysis to analyze the evolutionary relationships and to infer evolutionary trajectories of the complete data set. Phylogenetic footprinting performed on aligned human, mouse, and Fugu SPRN genes to define candidate regulatory promoter regions, detected 16 conserved motifs, three of which are known transcription factor-binding sites for a receptor and transcription factors specific to or associated with expression in brain. This result and other homology-based (VISTA global genomic alignment; protein sequence alignment and phylogenetics) and context-dependent (genomic context; relative gene order and orientation) criteria indicate fish and mammalian SPRN genes are orthologous and suggest a strongly conserved basic function in brain. Whereas tetrapod PRNPs share context with the analogous stPrP-2-coding gene in fish, their sequences are diverged, suggesting that the tetrapod and fish genes are likely to have significantly different functions. Phylogenetic analysis predicts the SPRN/SPRNB duplication occurred before divergence of fish from tetrapods, whereas that of stPrP-1 and stPrP-2 occurred in fish. Whereas Sho appears to have a conserved function in vertebrate brain, PrP seems to have an adaptive role fine-tuned in a lineage-specific fashion. An evolutionary model consistent with our findings and literature knowledge is proposed that has an ancestral prevertebrate SPRN-like gene leading to all vertebrate PrP-related and Sho-related genes. 
This provides a new framework for exploring the evolution of this unusual family of proteins and for searching for members in other fish branches and intermediate vertebrate groups.
- Parker N, Porter AC
- Identification of a novel gene family that includes the interferon-inducible human genes 6-16 and ISG12.
- BMC Genomics. 2004; 5: 8-8
BACKGROUND: The human 6-16 and ISG12 genes are transcriptionally upregulated in a variety of cell types in response to type I interferon (IFN). The predicted products of these genes are small (12.9 and 11.5 kDa respectively), hydrophobic proteins that share 36% overall amino acid identity. Gene disruption and over-expression studies have so far failed to reveal any biochemical or cellular roles for these proteins. RESULTS: We have used in silico analyses to identify a novel family of genes (the ISG12 gene family) related to both the human 6-16 and ISG12 genes. Each ISG12 family member codes for a small hydrophobic protein containing a conserved ~80 amino-acid motif (the ISG12 motif). So far we have detected 46 family members in 25 organisms, ranging from unicellular eukaryotes to humans. Humans have four ISG12 genes: the 6-16 gene at chromosome 1p35 and three genes (ISG12(a), ISG12(b) and ISG12(c)) clustered at chromosome 14q32. Mice have three family members (ISG12(a), ISG12(b1) and ISG12(b2)) clustered at chromosome 12F1 (syntenic with human chromosome 14q32). There does not appear to be a murine 6-16 gene. On the basis of phylogenetic analyses, genomic organisation and intron-alignments we suggest that this family has arisen through divergent inter- and intra-chromosomal gene duplication events. The transcripts from human and mouse genes are detectable, all but two (human ISG12(b) and ISG12(c)) being upregulated in response to type I IFN in the cell lines tested. CONCLUSIONS: Members of the eukaryotic ISG12 gene family encode a small hydrophobic protein with at least one copy of a newly defined motif of approximately 80 amino-acids (the ISG12 motif). In higher eukaryotes, many of the genes have acquired a responsiveness to type I IFN during evolution suggesting that a role in resisting cellular or environmental stress may be a unifying property of all family members. 
Analysis of gene-function in higher eukaryotes is complicated by the possibility of functional redundancy between family-members. Genetic studies in organisms (e.g. Dictyostelium discoideum) with just one family member so far identified may be particularly helpful in this respect.
- Anantharaman V, Aravind L
- Novel conserved domains in proteins with predicted roles in eukaryotic cell-cycle regulation, decapping and RNA stability.
- BMC Genomics. 2004; 5: 45-45
BACKGROUND: The emergence of eukaryotes was characterized by the expansion and diversification of several ancient RNA-binding domains and the apparent de novo innovation of new RNA-binding domains. The identification of these RNA-binding domains may throw light on the emergence of eukaryote-specific systems of RNA metabolism. RESULTS: Using sensitive sequence profile searches, homology-based fold recognition and sequence-structure superpositions, we identified novel, divergent versions of the Sm domain in the Scd6p family of proteins. This family of Sm-related domains shares certain features of conventional Sm domains, which are required for binding RNA, in addition to possessing some unique conserved features. We also show that these proteins contain a second previously uncharacterized C-terminal domain, termed the FDF domain (after a conserved sequence motif in this domain). The FDF domain is also found in the fungal Dcp3p-like and the animal FLJ22128-like proteins, where it is fused to a C-terminal domain of the YjeF-N domain family. In addition to the FDF domains, the FLJ22128-like proteins contain yet another divergent version of the Sm domain at their extreme N-terminus. We show that the YjeF-N domains represent a novel version of the Rossmann fold that has acquired a set of catalytic residues and structural features that distinguish them from the conventional dehydrogenases. CONCLUSIONS: Several lines of contextual information suggest that the Scd6p family and the Dcp3p-like proteins are conserved components of the eukaryotic RNA metabolism system. We propose that the novel domains reported here, namely the divergent versions of the Sm domain and the FDF domain, may mediate specific RNA-protein and protein-protein interactions in cytoplasmic ribonucleoprotein complexes.
More specifically, the protein complexes containing Sm-like domains of the Scd6p family are predicted to regulate the stability of mRNA encoding proteins involved in cell cycle progression and vesicular assembly. The Dcp3p and FLJ22128 proteins may localize to the cytoplasmic processing bodies and possibly catalyze a specific processing step in the decapping pathway. The explosive diversification of Sm domains appears to have played a role in the emergence of several uniquely eukaryotic ribonucleoprotein complexes, including those involved in decapping and mRNA stability.
- Lau WL, Scholnick SB
- Identification of two new members of the CSMD gene family.
- Genomics. 2003; 82: 412-5
CSMD1 is a putative suppressor of squamous cell carcinomas mapping to human chromosomal region 8p23. We have cloned two new members of this gene family, CSMD2 and CSMD3. The three CSMD proteins have very similar structures, each consisting of 14 CUB domains separated from one another by a sushi domain, an additional uninterrupted array of sushi domains, a single transmembrane domain, and a short cytoplasmic tail. CUB and sushi domains are thought to be sites of protein-protein or protein-ligand interactions, suggesting that CSMD proteins are either transmembrane receptors or adhesion proteins. The cytoplasmic tail sequences are highly conserved within the vertebrate lineage. CSMD2 maps to a chromosomal region that may contain a suppressor of oligodendrogliomas, yet its expression is elevated in some head and neck cancer cell lines. Functional overlap between the CSMD1 and the CSMD2 proteins may modify the phenotype resulting from the loss of either protein in tumors.
- Oertle T, Klinger M, Stuermer CA, Schwab ME
- A reticular rhapsody: phylogenic evolution and nomenclature of the RTN/Nogo gene family.
- FASEB J. 2003; 17: 1238-47
Reticulon (RTN) genes code for a family of proteins relatively recently described in higher vertebrates. The four known mammalian paralogues (RTN1, -2, -3, and -4/Nogo) have homologous carboxyl termini with two characteristic large hydrophobic regions. Except for RTN4-A/Nogo-A, thought to be an inhibitor for neurite outgrowth, restricting the regenerative capabilities of the mammalian CNS after injury, the functions of other family members are largely unknown. The overall occurrence of RTNs in different phyla and the evolution of the RTN gene family have hitherto not been analyzed. Here we expound data showing that the RTN family has arisen during early eukaryotic evolution potentially concerted to the establishment of the endomembrane system. Over 250 reticulon-like (RTNL) genes were identified in deeply diverging eukaryotes, fungi, plants, and animals. A systematic nomenclature for all identified family members is introduced. The analysis of exon-intron arrangements and of protein homologies allowed us to isolate key steps in the history of these genes. Our data corroborate the hypothesis that present RTNs evolved from an intron-rich reticulon ancestor mainly by the loss of different introns in diverse phyla. We also present evidence that the exceptionally large RTN4-A-specific exon 3, which harbors a potent neurite growth inhibitory region, may have arisen de novo approximately 350 MYA during transition to land vertebrates. These data emphasize on the one hand the universal role of reticulons in the eukaryotic system and on the other hand the acquisition of putative new functions through acquirement of novel amino-terminal exons.
- Koonin EV, Makarova KS, Rogozin IB, Davidovic L, Letellier MC, Pellegrini L
- The rhomboids: a nearly ubiquitous family of intramembrane serine proteases that probably evolved by multiple ancient horizontal gene transfers.
- Genome Biol. 2003; 4: 19-19
BACKGROUND: The rhomboid family of polytopic membrane proteins shows a level of evolutionary conservation unique among membrane proteins. They are present in nearly all the sequenced genomes of archaea, bacteria and eukaryotes, with the exception of several species with small genomes. On the basis of experimental studies with the developmental regulator rhomboid from Drosophila and the AarA protein from the bacterium Providencia stuartii, the rhomboids are thought to be intramembrane serine proteases whose signaling function is conserved in eukaryotes and prokaryotes. RESULTS: Phylogenetic tree analysis carried out using several independent methods for tree constructions and the corresponding statistical tests suggests that, despite its broad distribution in all three superkingdoms, the rhomboid family was not present in the last universal common ancestor of extant life forms. Instead, we propose that rhomboids evolved in bacteria and have been acquired by archaea and eukaryotes through several independent horizontal gene transfers. In eukaryotes, two distinct, ancient acquisitions apparently gave rise to the two major subfamilies, typified by rhomboid and PARL (presenilins-associated rhomboid-like protein), respectively. Subsequent evolution of the rhomboid family in eukaryotes proceeded by multiple duplications and functional diversification through the addition of extra transmembrane helices and other domains in different orientations relative to the conserved core that harbors the protease activity. CONCLUSIONS: Although the near-universal presence of the rhomboid family in bacteria, archaea and eukaryotes appears to suggest that this protein is part of the heritage of the last universal common ancestor, phylogenetic tree analysis indicates a likely bacterial origin with subsequent dissemination by horizontal gene transfer. This emphasizes the importance of explicit phylogenetic analysis for the reconstruction of ancestral life forms. 
A hypothetical scenario for the origin of intracellular membrane proteases from membrane transporters is proposed.
- Brancaccio M et al.
- Chp-1 and melusin, two CHORD containing proteins in vertebrates.
- FEBS Lett. 2003; 551: 47-52
Melusin is a muscle specific protein required for heart hypertrophy in response to mechanical overload. Here we describe a protein 63% homologous to melusin, named chp-1, expressed in all tissues tested, including muscles, and highly conserved from invertebrates to human. Both proteins are characterized in their N-terminal half by a tandemly repeated zinc binding 60 amino acid domain with a motif of uniquely spaced cysteine and histidine residues. These motifs are highly conserved from plants to mammals and have been recently named CHORD (for cysteine and histidine rich domain) domains. At the C-terminal end melusin contains a calcium binding stretch of 30 acidic amino acid residues which is absent in chp-1. While invertebrate genomes contain only one gene coding for a chp-1 homolog, two genes coding for CHORD containing proteins (chp-1 and melusin) are present in vertebrates. Sequence analysis suggests that the muscle specific CHORD containing protein melusin originated by a gene duplication event during early chordate evolution.
- Guindon S, Gascuel O
- A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood.
- Syst Biol. 2003; 52: 696-704
The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitate fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximum-likelihood principle, which clearly satisfies these requirements. The core of this method is a simple hill-climbing algorithm that adjusts tree topology and branch lengths simultaneously. This algorithm starts from an initial tree built by a fast distance-based method and modifies this tree to improve its likelihood at each iteration. Due to this simultaneous adjustment of the topology and branch lengths, only a few iterations are sufficient to reach an optimum. We used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches. The reduction of computing time is dramatic in comparison with other maximum-likelihood packages, while the likelihood maximization ability tends to be higher. For example, only 12 min were required on a standard personal computer to analyze a data set consisting of 500 rbcL sequences with 1,428 base pairs from plant plastids, thus reaching a speed of the same order as some popular distance-based and parsimony algorithms. This new method is implemented in the PHYML program, which is freely available on our web page: http://www.lirmm.fr/w3ifa/MAAS/.
- Anderson MK, Sun X, Miracle AL, Litman GW, Rothenberg EV
- Evolution of hematopoiesis: Three members of the PU.1 transcription factor family in a cartilaginous fish, Raja eglanteria.
- Proc Natl Acad Sci U S A. 2001; 98: 553-8
T lymphocytes and B lymphocytes are present in jawed vertebrates, including cartilaginous fishes, but not in jawless vertebrates or invertebrates. The origins of these lineages may be understood in terms of evolutionary changes in the structure and regulation of transcription factors that control lymphocyte development, such as PU.1. The identification and characterization of three members of the PU.1 family of transcription factors in a cartilaginous fish, Raja eglanteria, are described here. Two of these genes are orthologs of mammalian PU.1 and Spi-C, respectively, whereas the third gene, Spi-D, is a different family member. In addition, a PU.1-like gene has been identified in a jawless vertebrate, Petromyzon marinus (sea lamprey). Both DNA-binding and transactivation domains are highly conserved between mammalian and skate PU.1, in marked contrast to lamprey Spi, in which similarity is evident only in the DNA-binding domain. Phylogenetic analysis of sequence data suggests that the appearance of Spi-C may predate the divergence of the jawed and jawless vertebrates and that Spi-D arose before the divergence of the cartilaginous fish from the lineage leading to the mammals. The tissue-specific expression patterns of skate PU.1 and Spi-C suggest that these genes share regulatory as well as structural properties with their mammalian orthologs.
- Davis DB, Delmonte AJ, Ly CT, McNally EM
- Myoferlin, a candidate gene and potential modifier of muscular dystrophy.
- Hum Mol Genet. 2000; 9: 217-26
Dysferlin, the gene product of the limb girdle muscular dystrophy (LGMD) 2B locus, encodes a membrane-associated protein with homology to Caenorhabditis elegans fer-1. Humans with mutations in dysferlin (DYSF) develop muscle weakness that affects both proximal and distal muscles. Strikingly, the phenotype in LGMD 2B patients is highly variable, but the type of mutation in DYSF cannot explain this phenotypic variability. Through electronic database searching, we identified a protein highly homologous to dysferlin that we have named myoferlin. Myoferlin mRNA was highly expressed in cardiac muscle and to a lesser degree in skeletal muscle. However, antibodies raised to myoferlin showed abundant expression of myoferlin in both cardiac and skeletal muscle. Within the cell, myoferlin was associated with the plasma membrane but, unlike dysferlin, myoferlin was also associated with the nuclear membrane. Ferlin family members contain C2 domains, and these domains play a role in calcium-mediated membrane fusion events. To investigate this, we studied the expression of myoferlin in the mdx mouse, which lacks dystrophin and whose muscles undergo repeated rounds of degeneration and regeneration. We found upregulation of myoferlin at the membrane in mdx skeletal muscle. Thus, myoferlin (MYOF) is a candidate gene for muscular dystrophy and cardiomyopathy, or possibly a modifier of the muscular dystrophy phenotype.
- Thornton JW, DeSalle R
- Gene family evolution and homology: genomics meets phylogenetics.
- Annu Rev Genomics Hum Genet. 2000; 1: 41-73
With the advent of high-throughput DNA sequencing and whole-genome analysis, it has become clear that the coding portions of the genome are organized hierarchically in gene families and superfamilies. Because the hierarchy of genes, like that of living organisms, reflects an ancient and continuing process of gene duplication and divergence, many of the conceptual and analytical tools used in phylogenetic systematics can and should be used in comparative genomics. Phylogenetic principles and techniques for assessing homology, inferring relationships among genes, and reconstructing evolutionary events provide a powerful way to interpret the ever increasing body of sequence data. In this review, we outline the application of phylogenetic approaches to comparative genomics, beginning with the inference of phylogeny and the assessment of gene orthology and paralogy. We also show how the phylogenetic approach makes possible novel kinds of comparative analysis, including detection of domain shuffling and lateral gene transfer, reconstruction of the evolutionary diversification of gene families, tracing of evolutionary change in protein function at the amino acid level, and prediction of structure-function relationships. A marriage of the principles of phylogenetic systematics with the copious data generated by genomics promises unprecedented insights into the nature of biological organization and the historical processes that created it.
- Lowry JA, Atchley WR
- Molecular evolution of the GATA family of transcription factors: conservation within the DNA-binding domain.
- J Mol Evol. 2000; 50: 103-15
The GATA-binding transcription factors comprise a protein family whose members contain either one or two highly conserved zinc finger DNA-binding domains. Members of this group have been identified in organisms ranging from cellular slime mold to vertebrates, including plants, fungi, nematodes, insects, and echinoderms. While much work has been done describing the expression patterns, functional aspects, and target genes for many of these proteins, an evolutionary analysis of the entire family has been lacking. Herein we show that only the C-terminal zinc finger (Cf) and basic domain, which together constitute the GATA-binding domain, are conserved throughout this protein family. Phylogenetic analyses of amino acid sequences demonstrate distinct evolutionary pathways. Analysis of GATA factors isolated from vertebrates suggests that the six distinct vertebrate GATAs are descended from a common ancestral sequence, while those isolated from nonvertebrates (with the exception of the fungal AREA orthologues and Arabidopsis paralogues) appear to be related only within the DNA-binding domain and otherwise provide little insight into their evolutionary history. These results suggest multiple modes of evolution, including gene duplication and modular evolution of GATA factors based upon inclusion of a class IV zinc finger motif. As such, GATA transcription factors represent a group of proteins related solely by their homologous DNA-binding domains. Further analysis of this domain examines the degree of conservation at each amino acid site using the Boltzmann entropy measure, thereby identifying residues critical to preservation of structure and function. Finally, we construct a predictive motif that can accurately identify potential GATA proteins.
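The per-site conservation measure mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not state the formula, so this assumes the usual entropy form H = -sum(p * ln p) over the residue frequencies in each alignment column; the function names and the toy alignment are hypothetical and are not taken from the paper. Low entropy marks a conserved site.

```python
import math
from collections import Counter


def column_entropy(column):
    """Entropy (in nats) of the residue frequencies in one alignment column.

    Assumption: H = -sum(p * ln p); gap characters, if present, are
    counted like any other symbol.
    """
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def site_entropies(alignment):
    """Return one entropy value per column of equal-length aligned sequences."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        column_entropy([seq[i] for seq in alignment]) for i in range(length)
    ]


# Hypothetical four-sequence toy alignment, not real GATA data.
toy_alignment = ["CNAC", "CNAC", "CSAC", "CTGC"]
entropies = site_entropies(toy_alignment)
```

In this toy input the first and last columns are invariant and score zero, while the second column, with three different residues, scores highest.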
- Britton S et al.
- The third human FER-1-like protein is highly similar to dysferlin.
- Genomics. 2000; 68: 313-21
Dysferlin, the protein product of the gene mutated in patients with an autosomal recessive limb-girdle muscular dystrophy type 2B (LGMD2B) and a distal muscular dystrophy, Miyoshi myopathy, is homologous to a Caenorhabditis elegans spermatogenesis factor, FER-1. Analysis of fer-1 mutants and of sequence predictions of the FER-1 and dysferlin ORFs has predicted a role in membrane fusion. Otoferlin, another human FER-1-like protein (ferlin), has recently been shown to be responsible for autosomal recessive nonsyndromic deafness (DFNB9). In this report we describe the third human ferlin gene, FER1L3, which maps to chromosome 10q23.3. Expression analysis of the orthologous mouse gene shows ubiquitous expression but predominant expression in the eye, esophagus, and salivary gland. All the ferlins are characterized by sequences corresponding to multiple C2 domains that share the highest level of homology with the C2A domain of rat synaptotagmin III. They are predicted to be Type II transmembrane proteins, with the majority of the protein facing the cytoplasm anchored by the C-terminal transmembrane domain. Sequence and predicted structural comparisons have highlighted the high degree of similarity of dysferlin and FER1L3, which have sequences corresponding to six C2 domains and which share more than 60% amino acid sequence identity.
- Wang W, Shakes DC
- Molecular evolution of the 14-3-3 protein family.
- J Mol Evol. 1996; 43: 384-98
Members of the highly conserved and ubiquitous 14-3-3 protein family modulate a wide variety of cellular processes. To determine the evolutionary relationships among specific 14-3-3 proteins in different plant, animal, and fungal species and to initiate a predictive analysis of isoform-specific differences in light of the latest functional and structural studies of 14-3-3, multiple alignments were constructed from forty-six 14-3-3 sequences retrieved from the GenBank and SwissProt databases and a newly identified second 14-3-3 gene from Caenorhabditis elegans. The alignment revealed five highly conserved sequence blocks. Blocks 2-5 correlate well with the alpha helices 3, 5, 7, and 9 which form the proposed internal binding domain in the three-dimensional structure model of the functioning dimer. Amino acid differences within the functional and structural domains of plant and animal 14-3-3 proteins were identified which may account for functional diversity amongst isoforms. Protein phylogenic trees were constructed using both the maximum parsimony and neighbor joining methods of the PHYLIP(3.5c) package; 14-3-3 proteins from Entamoeba histolytica, an amitochondrial protozoa, were employed as an outgroup in our analysis. Epsilon isoforms from the animal lineage form a distinct grouping in both trees, which suggests an early divergence from the other animal isoforms. Epsilons were found to be more similar to yeast and plant isoforms than other animal isoforms at numerous amino acid positions, and thus epsilon may have retained functional characteristics of the ancestral protein. The known invertebrate proteins group with the nonepsilon mammalian isoforms. Most of the current 14-3-3 isoform diversity probably arose through independent duplication events after the divergence of the major eukaryotic kingdoms. 
Divergence of the seven mammalian isoforms beta, zeta, gamma, eta, epsilon, tau, and sigma (stratifin/HME1) occurred before the divergence of mammalian and perhaps before the divergence of vertebrate species. A possible ancestral 14-3-3 sequence is proposed.
- Theissen G, Kim JT, Saedler H
- Classification and phylogeny of the MADS-box multigene family suggest defined roles of MADS-box gene subfamilies in the morphological evolution of eukaryotes.
- J Mol Evol. 1996; 43: 484-516
The MADS-box encodes a novel type of DNA-binding domain found so far in a diverse group of transcription factors from yeast, animals, and seed plants. Here, our first aim was to evaluate the primary structure of the MADS-box. Compilation of the 107 currently available MADS-domain sequences resulted in a signature which can strictly discriminate between genes possessing or lacking a MADS-domain and allowed a classification of MADS-domain proteins into several distinct subfamilies. A comprehensive phylogenetic analysis of known eukaryotic MADS-box genes, which is the first comprising animal as well as fungal and plant homologs, showed that the vast majority of subfamily members appear on distinct subtrees of phylogenetic trees, suggesting that subfamilies represent monophyletic gene clades and providing the proposed classification scheme with a sound evolutionary basis. A reconstruction of the history of the MADS-box gene subfamilies based on the taxonomic distribution of contemporary subfamily members revealed that each subfamily comprises highly conserved putative orthologs and recent paralogs. Some subfamilies must be very old (1,000 MY or more), while others are more recent. In general, subfamily members tend to share highly similar sequences, expression patterns, and related functions. The defined species distribution, specific function, and strong evolutionary conservation of the members of most subfamilies suggest that the establishment of different subfamilies was followed by rapid fixation and was thus highly advantageous during eukaryotic evolution. These gene subfamilies may have been essential prerequisites for the establishment of several complex eukaryotic body structures, such as muscles in animals and certain reproductive structures in higher plants, and of some signal transduction pathways. 
Phylogenetic trees indicate that after establishment of different subfamilies, additional gene duplications led to a further increase in the number of MADS-box genes. However, several molecular mechanisms of MADS-box gene diversification were used to a quite different extent during animal and plant evolution. Known plant MADS-domain sequences diverged much faster than those of animals, and gene duplication and sequence diversification were extensively used for the creation of new genes during plant evolution, resulting in a relatively large number of interacting genes. In contrast, the available data on animal genes suggest that increase in gene number was only moderate in the lineage leading to mammals, but in the case of MEF2-like gene products, heterodimerization between different splice variants may have increased the combinatorial possibilities of interactions considerably. These observations demonstrate that in metazoan and plant evolution, increased combinatorial possibilities of MADS-box gene product interactions correlated with the evolution of increasingly complex body plans.
- Okagaki T, Weber FE, Fischman DA, Vaughan KT, Mikawa T, Reinach FC
- The major myosin-binding domain of skeletal muscle MyBP-C (C protein) resides in the COOH-terminal, immunoglobulin C2 motif.
- J Cell Biol. 1993; 123: 619-26
A common feature shared by myosin-binding proteins from a wide variety of species is the presence of a variable number of related internal motifs homologous to either the Ig C2 or the fibronectin (Fn) type III repeats. Despite interest in the potential function of these motifs, no group has clearly demonstrated a function for these sequences in muscle, either intra- or extracellularly. We have completed the nucleotide sequence of the fast type isoform of MyBP-C (C protein) from chicken skeletal muscle. The deduced amino acid sequence reveals seven Ig C2 sets and three Fn type III motifs in MyBP-C. alpha-chymotryptic digestion of purified MyBP-C gives rise to four peptides. NH2-terminal sequencing of these peptides allowed us to map the position of each along the primary structure of the protein. The 28-kD peptide contains the NH2-terminal sequence of MyBP-C, including the first C2 repeat. It is followed by two internal peptides, one of 5 kD containing exclusively spacer sequences between the first and second C2 motifs, and a 95-kD fragment containing five C2 domains and three fibronectin type III motifs. The C-terminal sequence of MyBP-C is present in a 14-kD peptide which contains only the last C2 repeat. We examined the binding properties of these fragments to reconstituted (synthetic) myosin filaments. Only the COOH-terminal 14-kD peptide is capable of binding myosin with high affinity. The NH2-terminal 28-kD fragment has no myosin-binding, while the long internal 100-kD peptide shows very weak binding to myosin. We have expressed and purified the 14-kD peptide in Escherichia coli. The recombinant protein exhibits saturable binding to myosin with an affinity comparable to that of the 14-kD fragment obtained by proteolytic digestion (1/2 max binding at approximately 0.5 microM). These results indicate that the binding to myosin filaments is mainly restricted to the last 102 amino acids of MyBP-C. 
The remainder of the molecule (1,032 amino acids) could interact with titin, MyBP-H (H protein) or thin filament components. A comparison of the highly conserved Ig C2 domains present at the COOH-terminus of five MyBPs thus far sequenced (human slow and fast MyBP-C, human and chicken MyBP-H, and chicken MyBP-C) was used to identify residues unique to these myosin-binding Ig C2 repeats.
- de Jong WW, Leunissen JA, Voorter CE
- Evolution of the alpha-crystallin/small heat-shock protein family.
- Mol Biol Evol. 1993; 10: 103-26
The common characteristic of the alpha-crystallin/small heat-shock protein family is the presence of a conserved homologous sequence of 90-100 residues. Apart from the vertebrate lens proteins--alpha A- and alpha B-crystallin--and the ubiquitous group of 15-30-kDa heat-shock proteins, this family also includes two mycobacterial surface antigens and a major egg antigen of Schistosoma mansoni. Multiple small heat-shock proteins are especially present in higher plants, where they can be distinguished in at least two classes of cytoplasmic proteins and a chloroplast-located class. The alpha-crystallins have recently been found in many tissues outside the lens, and alpha B-crystallin, in particular, behaves in many respects like a small heat-shock protein. The homologous sequences constitute the C-terminal halves of the proteins and probably represent a structural domain with a more variable C-terminal extension. These domains must be responsible for the common structural and functional properties of this protein family. Analysis of the phylogenetic tree and comparison of the biological properties of the various proteins in this family suggest the following scenario for its evolution: The primordial role of the small heat-shock protein family must have been to cope with the destabilizing effects of stressful conditions on cellular integrity. The alpha-crystallin-like domain appears to be very stable, which makes it suitable both as a surface antigen in parasitic organisms and as a long-living lens protein in vertebrates. It has recently been demonstrated that, like the other heat-shock proteins, the alpha-crystallins and small heat-shock proteins function as molecular chaperones, preventing undesired protein-protein interactions and assisting in refolding of denatured proteins. 
Many of the small heat-shock proteins are differentially expressed during normal development, and there is good evidence that they are involved in cytomorphological reorganizations and in degenerative diseases. In conjunction with the stabilizing, thermoprotective role of alpha-crystallins and small heat-shock proteins, they may also be involved in signal transduction. The reversible phosphorylation of these proteins appears to be important in this respect.
- Paulev PE et al.
- Facial cold receptors and the survival reflex "diving bradycardia" in man.
- Jpn J Physiol. 1990; 40: 701-12
We measured heart rate (HR), stroke volume (SV), systemic arterial blood pressure (BP), and mean arterial pressure (MAP) in 7 healthy volunteers in response to face immersion in water with concomitant breath-holding at different lung volumes. The subjects were at rest in the prone position. During breath-holding at total lung capacity (TLC), baseline HR (70 to 75 beats/min) fell by 10% within fractions of a second, both in the control preimmersion state when the head was surrounded by room air, and when it was immersed in water of 33 degrees C. This response was associated with rises in MAP and in SV. Immersion of the face in 10 degrees C water while breath-holding, was associated with a strong, negative chronotropic effect (22% fall in HR), which developed within 10 s. Breath-holding at functional residual capacity (FRC) reduced HR substantially only in 10 degrees C water, and in contrast to that at TLC, the response was slowly developing with a latency of 10-15 s. All these reductions in HR were significant and accompanied by increases in BP and MAP. The strong, negative chronotropic effect of cold water was typically linked to a rise in SV. The study identified two temporal components of HR reduction to face immersion: a fast parasympathetic response dependent on the input from the high pressure baroreceptors, and a late response mediated, in all likelihood, by sympathetic efferent activity. Facial receptors sensitive to cold seem to be vital in the largest responses observed. The fast response to breath-holding with the face in water of neutral temperature was equal to that in air. Thus "diving bradycardia" is in fact a basic survival response independent of water.
- Altschul SF, Lipman DJ
- Protein database searches for multiple alignments.
- Proc Natl Acad Sci U S A. 1990; 87: 5509-13
Protein database searches frequently can reveal biologically significant sequence relationships useful in understanding structure and function. Weak but meaningful sequence patterns can be obscured, however, by other similarities due only to chance. By searching a database for multiple as opposed to pairwise alignments, distant relationships are much more easily distinguished from background noise. Recent statistical results permit the power of this approach to be analyzed. Given a typical query sequence, an algorithm described here permits the current protein database to be searched for three-sequence alignments in less than 4 min. Such searches have revealed a variety of subtle relationships that pairwise search methods would be unable to detect.
- Wilson KL, Newport J
- A trypsin-sensitive receptor on membrane vesicles is required for nuclear envelope formation in vitro.
- J Cell Biol. 1988; 107: 57-68
The reformation of functioning organelles at the end of mitosis presents a problem in vesicle targeting. Using extracts made from Xenopus laevis frog eggs, we have studied in vitro the vesicles that reform the nuclear envelope. In the in vitro assay, nuclear envelope growth is linear with time. Furthermore, the final surface area of the nuclear envelopes formed is directly dependent upon the amount of membrane vesicles added to the assay. Egg membrane vesicles could be fractionated into two populations, only one of which was competent for nuclear envelope assembly. We found that vesicles active in nuclear envelope assembly contained markers (BiP and alpha-glucosidase II) characteristic of the endoplasmic reticulum (ER), but that the majority of ER-derived vesicles do not contribute to nuclear envelope size. This functional distinction between nuclear vesicles and ER-derived vesicles implies that nuclear vesicles are unique and possess at least one factor required for envelope assembly that is lacking in other vesicles. Consistent with this, treatment of vesicles with trypsin destroyed their ability to form a nuclear envelope; electron microscopic studies indicate that the trypsin-sensitive protein(s) is required for vesicles to bind to chromatin. However, the protease-sensitive component(s) is resistant to treatments that disrupt protein-protein interactions, such as high salt, EDTA, or low ionic strength solutions. We propose that an integral membrane protein, or protein tightly associated with the membrane, is critical for nuclear vesicle targeting or function.