Biol. Rev. (2007), 82, pp. 551–572. doi:10.1111/j.1469-185X.2007.00024.x

Conceptual bases for quantifying the role of the environment on gene evolution: the participation of positive selection and neutral evolution

Anthony Levasseur1*, Ludovic Orlando2, Xavier Bailly3, Michel C. Milinkovitch4, Etienne G. J. Danchin5 and Pierre Pontarotti1*

1 Phylogenomics Laboratory, EA 3781 Evolution Biologique Université de Provence, Case 19, Pl. V. Hugo, 13331 Marseille Cedex 03, France
2 Paléogénétique et Evolution moléculaire, Ecole Normale Supérieure de Lyon, Université de Lyon, Institut de Génomique Fonctionnelle de Lyon, CNRS UMR 5262 - INRA, 46 Allée d'Italie, 69364 Lyon Cedex 07, France
3 Station Biologique de Roscoff, Place Georges Teissier 29680 Roscoff, France
4 Laboratory of Evolutionary Genetics, Institute for Molecular Biology & Medicine, Université Libre de Bruxelles (ULB), 12 rue Jeener & Brachet, 6041 Gosselies, Belgium
5 Glycogenomics and Biomedical Structural Biology, AFMB UMR 6098 - CNRS - Aix-Marseille I and II, 163 Av. de Luminy, Case 932, 13288 Marseille Cedex 09, France

(Received 30 August 2006; revised 6 July 2007; accepted 9 July 2007)

ABSTRACT

To demonstrate that a given change in the environment has contributed to the emergence of a given genotypic and phenotypic shift during the course of evolution, one should ask to what extent such shifts would have occurred without environmental change. Of course, such tests are rarely practical but phenotypic novelties can still be correlated to genomic shifts in response to environmental changes if enough information is available. We surveyed and re-evaluated the published data in order to estimate the role of environmental changes on the course of species and genomic evolution. Only a few published examples clearly demonstrate a causal link between a given environmental change and the fixation of a genomic variant resulting in functional modification (gain, loss or alteration of function).
Many others suggested a link between a given phenotypic shift and a given environmental change but failed to identify the underlying genomic determinant(s) and/or the associated functional consequence(s). The proportion of genotypic and phenotypic variation that is fixed concomitantly with environmental changes is often considered adaptive and hence, the result of positive selection, even though alternative causes, such as genetic drift, are rarely investigated. Therefore, the second aim herein is to review evidence for the mechanisms leading to fixation.

Key words: genome evolution, environmental changes, positive selection, adaptation, evolutionary shift.

* Address for correspondence: Tel: +33 491 106 489; E-mail: Anthony.Levasseur@up.univ-mrs.fr; Pierre.Pontarotti@up.univ-mrs.fr

Biological Reviews 82 (2007) 551–572 © 2007 The Authors Journal compilation © 2007 Cambridge Philosophical Society

CONTENTS

I. Introduction
  (1) Defining biological function
  (2) The molecular mechanisms of gene co-option
    (a) Co-option without shift of the original function
    (b) Co-option with shift of the original function
II. Functional losses
  (1) Convergent loss with a suggested relationship between gene function and environmental change
    (a) Gene losses that occurred faster than under neutrality (positive selection)
      (i) FR1
      (ii) CCR5
      (iii) G6PD
      (iv) Riftia pachyptila haemoglobin
    (b) When positive selection cannot be detected
      (i) Sws1
      (ii) Ectodysplasin (EDA)
    (c) Mass gene losses
    (d) Loss not directly linked to an environmental change but to the acquisition of a new function
  (2) Suggested relationship between gene function and environmental change but absence of convergent losses
    (a) Cases with positive selection
      (i) The selfing locus in Arabidopsis thaliana
      (ii) The Duffy blood group locus (FY)
      (iii) CASP12
    (b) No positive selection detected
      (i) Olfactory receptors in Stenella coeruleoalba
      (ii) Human bitter taste receptor genes
  (3) Convergent loss with no clear relationship between gene function and environment
    (a) Galactose pathway
    (b) Class I vomeronasal receptor
  (4) No convergence and no clear relationship between gene function and environment
III. Functional gains
  (1) Cases in which all criteria are met
    (a) Artificial cases
      (i) Insecticides
      (ii) Antibiotics
    (b) Natural cases
      (i) Somatic evolution
      (ii) RNAseI from ruminants and colobine monkeys
  (2) Cases where only some criteria are met
    (a) Case A: All criteria are met except that positive selection is not shown to occur for sites involved in the function
      (i) Lysozyme
      (ii) Major histocompatibility complex
    (b) Case B: convergence, with no demonstrated positive selection for sites involved in the function, and no clear correlation with the environment
      (i) Aldehyde oxidase (AOX) and xanthine dehydrogenase (XDH)
      (ii) Semenogelin
    (c) Case C: convergent evolution at the molecular level, with demonstrated positive selection, but no demonstrated functional shift or relation with the environment
    (d) Case D: convergent evolution at the molecular and functional levels, existence of correlation with an environmental change, but no detection of positive selection
      (i) Evolution of orthologous genes
        (α) Vertebrate rhodopsins
        (β) TTX-resistant sodium channel
      (ii) Evolution of non-orthologous genes
        (α) Antifreeze glycoproteins (AFGP)
        (β) Crystallins
      (iii) Evolution of genomic repertoire
    (e) Case E: demonstrated positive selection on sites involved in the functional shift, with functional convergence
    (f) Case F: demonstrated positive selection on sites involved in the functional shift, but with no observed convergence among lineages
      (i) Probable link with an environmental change
        (α) Proteorhodopsin
        (β) TRIM5α
        (γ) ECP and EDN
        (δ) Lipase/feruloyl esterase A
      (ii) No clear link with an environmental shift
        (α) Glutamate dehydrogenase 2 (GLUD2)
        (β) Glutathione transferases (GSTs)
    (g) Case G: functional shift linked to an environmental change but with no evidence for positive selection or convergence
      (i) Iota crystallin
      (ii) Melanocortin-1 receptor (Mc1r)
    (h) Case H: functional shift but no evidence for positive selection
    (i) Case I: demonstrated positive selection but with no evidence for functional shift
IV. Co-option through transcriptional shifts
  (1) IL4
  (2) Coagulation factor VII
  (3) MMP3
  (4) LCT
V. Subcellular localisation shift driven by environmental change
VI. Evidence for a role of environment on phenotypic shifts without information at the genomic level
VII. Conclusions
VIII. Acknowledgments
IX. References

I. INTRODUCTION

(1) Defining biological function

During the evolutionary history of species, genomic events become fixed, first at the population then at the species level due to selection or to genetic drift. These changes can have different impacts at different functional levels (e.g. modifications of protein-coding or regulatory sequences). We first need to clarify the notions of gene and protein function. Indeed, depending on the author, the word "function" refers to either (i) the general biochemical activity of a given gene product, or (ii) the cellular process in which the gene product is involved, (iii) the detailed mechanisms of catalysis or recognition, or (iv) a generalized phenotype (e.g. "olfaction"). The term "function" refers to the specific results of specific experiments, and for that reason, a "function" can be defined at different organizational levels of organisms. The first level describes molecular functions such as catalytic or binding activities. As two proteins with similar molecular activities can generate different phenotypes according to their subcellular locations, a second level refers to the cell compartments where the molecular activity takes place.
The next functional level is the cell as a whole and refers to cellular pathways, cascades or processes in which a given gene is involved. In multicellular organisms, functions can go beyond cellular functions: tissue distribution of biological processes, cellular interactions and communications, etc. These notions could even be further developed towards yet higher levels of integration such as population or social levels.

The function of a given gene and its product(s) can change due to mutations that alter coding or regulatory sequences, resulting in a shift at the biochemical level, subcellular localisation and/or transcriptional level which, in turn, may lead to functional shifts at higher levels of organization. Novelty in evolution is mainly the result of functional shifts, also called "gene co-option" (Ganfornina & Sanchez, 1999), whereas truly new genes (genes arising via overprinting, for example; see Vernet et al., 1993) have not been identified (Long et al., 2003). Reported cases of co-option are described in the following section.

(2) The molecular mechanisms of gene co-option

(a) Co-option without shift of the original function

As demonstrated by the complexity of the immune system in vertebrates, a given enzymatic product (e.g. the products of the proteasome holoenzyme) may be recruited to a new pathway (e.g. antigen presentation to the major histocompatibility complex) without any change in the basic biochemical function of the enzymes (Danchin et al., 2004). Thus, a mutation is not always necessary for a novel biochemical function to appear.

(b) Co-option with shift of the original function

Different types of mutations can generate co-option: (1) Micromutations (substitutions, small indels).
If located in the coding sequence, such mutations can lead to biochemical or subcellular localisation shifts whereas they can lead to transcriptional shifts when they affect promoters. (2) Gene shuffling between coding sequences or between a coding sequence and promoter. (3) Gene duplications by regional or whole-genome duplications (i.e. polyploidization) followed by micromutations and gene shuffling. All these events alter the genomic context, as illustrated by single gene (promoter plus coding sequence) duplication: either the two copies are subfunctionalized at the biochemical or transcriptional level (for reviews see Prince & Pickett, 2002, and van Hoof, 2005) or one copy is neofunctionalized, i.e. one of the duplicates maintains the ancestral function while the other accumulates substitutions and evolves towards a new function (e.g. see Rodriguez-Trelles, Tarrio & Ayala, 2001, and Bos, 2005). Neofunctionalization can be associated with the asymmetric distribution of type I substitution sites (conserved in one subfamily but not in the other), indicating relaxed selection followed by positive selection (not detectable if the event is ancient, due to saturation of synonymous substitutions). Van Hoof (2005) designed a test to discriminate between neo- and subfunctionalization. He analysed a pair of duplicates in yeast that evolved in an asymmetric manner and the corresponding non-duplicated orthologue in the most closely related species. It appeared that the non-duplicated orthologue was able to complement both copies, while the copies were not able to complement each other, suggesting that the copies were subfunctionalized rather than neofunctionalized.
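The reasoning behind this complementation test can be summarised as a simple decision rule. The sketch below is only an illustration under simplifying assumptions: the function name, its boolean arguments and the returned labels are hypothetical and are not taken from van Hoof (2005), and real complementation assays yield quantitative phenotypes rather than yes/no outcomes.

```python
def classify_duplicate_pair(ortholog_rescues_a: bool,
                            ortholog_rescues_b: bool,
                            a_rescues_b: bool,
                            b_rescues_a: bool) -> str:
    """Interpret complementation results for two duplicates, A and B.

    ortholog_rescues_x: the non-duplicated orthologue complements a
                        deletion of copy x.
    x_rescues_y:        copy x complements a deletion of copy y.
    All labels returned here are hypothetical shorthand.
    """
    if ortholog_rescues_a and ortholog_rescues_b:
        # The unduplicated gene carries out everything both copies do.
        if not a_rescues_b and not b_rescues_a:
            # Each copy kept only part of the ancestral function.
            return "subfunctionalization"
        if a_rescues_b and b_rescues_a:
            # Both copies still perform the full ancestral function.
            return "redundant copies"
        return "inconclusive"
    if ortholog_rescues_a != ortholog_rescues_b:
        # One copy does something the unduplicated gene cannot replace,
        # which is what a newly acquired function would look like.
        return "neofunctionalization"
    return "inconclusive"


if __name__ == "__main__":
    # Pattern reported in the text: orthologue rescues both copies,
    # the copies do not rescue each other.
    print(classify_duplicate_pair(True, True, False, False))
```

With the pattern described in the text, the call in the example returns "subfunctionalization"; any mixed or partial result is deliberately left as "inconclusive" rather than forced into one of the two models.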
Another possibility to explain maintenance of duplicates under neutrality is to consider that the pre-duplication gene was once expressed under two conditions that were distributed between the two copies after duplication. Several lines of experimental evidence support this model. For instance, the en1 gene is expressed in the pectoral appendage bud and in some neurones in mice and chicks, while in zebrafish Brachydanio rerio two paralogues are found, one being expressed in the pectoral appendage bud and a second in neurones (Force et al., 1999).

The best examples of co-option are found among the crystallin genes. The ocular lens in vertebrates and some invertebrates is a transparent cellular tissue involved in light refraction. The necessary refraction index is achieved by the accumulation of soluble proteins: the crystallins. The genes coding for crystallins have been recruited in a recurrent manner from genes having non-lens functions (see Section III.2.d). Because in some organisms the crystallin function and the enzymatic function are encoded by two related but distinct genes, Piatigorsky & Wistow (1989) have suggested that the acquisition of the new, additional, molecular or transcriptional behaviour occurred first, was then followed by a duplication event, and finally each duplicate lost a part of its ancestral "dual-behaviour". In the text we do not distinguish direct co-option followed by gene duplication from gene duplication followed by co-option.

Shifts in molecular, transcriptional, and subcellular localization parameters could have an impact at different functional levels of the organism. For example, neo-expression of a "master" regulator gene can modify the expression of several genes from the same cascade in a new cellular environment. One of the most famous examples is that of the Dll gene and butterfly eyespots (True & Carroll, 2002).
The main questions addressed herein are: how many co-option events have been fixed in response to environmental changes and what is the role of positive selection in this process? Positive selection is indeed implicated in many studies but this conclusion is often based on observed sequence changes without evidence for a possible link between these evolutionary shifts and functional shifts or even for the existence of an environmental change. These events are rarely related to a precise environmental shift. We will focus here mainly on genomic changes that have easily detectable functional impacts, such as functional losses or biochemical, transcriptional, and cellular localization shifts. Many phenotypes have been suggested to have been selected due to environmental changes but the genomic loci involved have generally not been characterized. They will be briefly discussed in the final part of this review. Extensive reviews explaining how to detect positive selection at the population, species or sequence levels are already available (see, for example, Nielsen, 2005; Ponting & Lunter, 2006; Yang & Bielawski, 2000).

II. FUNCTIONAL LOSSES

Functional loss can be seen as an extreme case of co-option and can occur at the level of functional sites (subfunctionalization), genes (pseudogenization) or of whole cascades. Under selective constraints, the role of the environment on genomic evolution is not always obvious unless the same functional loss has occurred independently several times in similar environments (convergence). Consider a species subjected to a particular environmental change: some of its original functions are no longer essential for survival and may be lost (together with the associated genes) through genetic drift. Here, genomic evolution would have been driven by environmental change despite the lack of any form of selection (or because it allowed relaxation of the selective pressure).
Therefore, loss of function can either be fixed by positive selection (if the maintenance of the function is deleterious) or by genetic drift (if the loss/maintenance of function is neutral) in response to environmental change. Ideally, statistical tests should be applied to determine how often a given functional loss has occurred for a given environmental shift. Table 1 provides examples from the literature of losses possibly associated with environmental changes, classified according to the presence or absence of evidence for convergence, positive selection or a functional link.

(1) Convergent loss with a suggested relationship between gene function and environmental change

(a) Gene losses that occurred faster than under neutrality (positive selection)

(i) FR1

The FRIGIDA (FR1) gene has been shown to be a major determinant of flowering time in Arabidopsis thaliana. A majority of early-flowering ecotypes shows one or two deletions that generate a frameshift in the FR1 open reading frame (ORF), suggesting that this phenotype has arisen at least twice. Le Corre, Roux & Reboud (2002) performed a population analysis on different ecotypes and confirmed that the loss-of-function mutations were associated with an early-flowering phenotype and that these gene inactivations systematically evolved in a non-neutral fashion. Moreover, they confirmed that the gene inactivation was phenotypically linked to an early-flowering ecotype adapted to cold environments (Johanson et al., 2000; Le Corre et al., 2002). This represents a strong indication that environmental change has driven this genetic change.

(ii) CCR5

This primate transmembrane receptor is a cellular gateway for the entry of HIV-1 and all strains of SIV. Human homozygotes for the CCR5 null allele, which has a 32 base pair (bp) deletion, are highly resistant to HIV-1 infection.
Another null allele (24 bp deletion) of CCR5 has convergently evolved in sooty mangabeys (Cercocebus atys), a natural host of SIV. The occurrence of the mangabey null allele at an appreciable frequency (around 4%) could be explained by positive selection; null homozygotes are protected from SIV infection because the encoded protein is not transported to the cell surface (Palacios et al., 1998). The null allele has been shown to be positively selected in humans (Galvani & Novembre, 2005). However, the exact nature of the selective pressure involved in the origin of the CCR5-Δ32 allele and its high prevalence in European populations (approximately 10%) is unclear, as the HIV epidemic in humans is much more recent than the age of the null allele (about 700 years). Since both HIV and poxviruses enter leukocytes using chemokine receptors, it is plausible that the loss of the CCR5 chemokine receptor originally conferred resistance against smallpox. This hypothesis is supported by a correlation between historical smallpox epidemics and allele geographic distribution (Galvani & Slatkin, 2003).

(iii) G6PD

The frequencies of the low-activity coding alleles of glucose-6-phosphate dehydrogenase (G6PD) in humans are highly correlated with the prevalence of malaria. The low-activity coding alleles are thought to reduce the risk of infection by Plasmodium falciparum and are maintained at high frequencies despite the haemopathologies they cause (anaemia). Haplotype analysis of the low-activity coding alleles (8–20% for G6PD A- and < 5% for G6PD Med) indicated that the mutations at this locus evolved independently (Tishkoff et al., 2001) and that their frequencies have increased at a rate too rapid to be explained by genetic drift (Tishkoff et al., 2001; Sabeti et al., 2002).
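To give a sense of what "too rapid to be explained by genetic drift" means, the sketch below contrasts two textbook quantities for an idealised Wright-Fisher population: the standard deviation of allele frequency expected after t generations of pure drift, and the number of generations that a deterministic model of genic selection needs to produce a given rise in frequency. This is not the haplotype-based analysis used in the studies cited above; the function names are hypothetical, and all numerical values (population size, starting frequency, selection coefficient) are assumptions chosen only for illustration.

```python
import math


def drift_sd(p0: float, n_diploid: int, generations: int) -> float:
    """Standard deviation of allele frequency after pure drift.

    Uses Var(p_t) = p0 * (1 - p0) * (1 - (1 - 1/(2N))**t), the classical
    result for an idealised Wright-Fisher population of N diploids.
    """
    if not 0.0 <= p0 <= 1.0 or n_diploid <= 0 or generations < 0:
        raise ValueError("invalid parameters")
    decay = (1.0 - 1.0 / (2.0 * n_diploid)) ** generations
    return math.sqrt(p0 * (1.0 - p0) * (1.0 - decay))


def generations_under_selection(p0: float, target: float, s: float) -> int:
    """Generations for a deterministic rise from p0 to target.

    Genic selection with relative fitness 1 + s for the favoured allele:
    p' = p * (1 + s) / (1 + p * s).
    """
    if not (0.0 < p0 < target < 1.0) or s <= 0.0:
        raise ValueError("need 0 < p0 < target < 1 and s > 0")
    p, t = p0, 0
    while p < target:
        p = p * (1.0 + s) / (1.0 + p * s)
        t += 1
    return t


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    p0, p1, s, n = 0.01, 0.10, 0.05, 10_000
    t = generations_under_selection(p0, p1, s)
    sd = drift_sd(p0, n, t)
    print(f"selection needs {t} generations for {p0} -> {p1}")
    print(f"drift SD over the same time: {sd:.4f}")
    print(f"observed change in SD units: {(p1 - p0) / sd:.1f}")
```

Under these assumed values the frequency change achieved by selection is many drift standard deviations away from the starting frequency, which is the intuition behind rejecting drift; an actual test would have to account for demography, recombination and ascertainment.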
Therefore, though a functional link between low activity of G6PD and protection from Plasmodium falciparum has not been established, the parasite could have driven genome evolution at this locus. Such a mechanism might also be involved in the sickle cell haemoglobin allele (given as a classical example of positive selection driven by the environment by Haldane, 1949), although new data are needed to attest formally for fixation under positive selection (Pagnier et al., 1984; Currat et al., 2002).

(iv) Riftia pachyptila haemoglobin

Haemoglobin of the deep-sea hydrothermal vent vestimentiferan Riftia pachyptila (Annelida) is able to bind hydrogen sulphide (H2S) to free-cysteine residues and transport it to fuel endosymbiotic sulphide-oxidizing bacteria. Cysteine residues are key amino acids (aa) conserved in globins of annelids living in sulphide-rich environments, but not in globins from annelids living in sulphide-free environments. Analyses of synonymous and non-synonymous substitutions in two different sets of orthologous annelid globin genes revealed that free-cysteine residues in annelids living in sulphide-free environments were lost during the course of evolution due to positive selection (free cysteines are disadvantageous in H2S-free environments because they interact with blood components, disturbing homeostasis and reducing fitness; Bailly et al., 2003). The ability to bind hydrogen sulphide has been lost in several worms living in H2S-free environments. These worms form polyphyletic groups (McHugh, 1997), suggesting, according to Bailly et al. (2003), that binding to H2S could represent the ancestral state, and that loss of binding capacity occurred independently many times. However, positive selection has been shown to occur in only one case; it would be interesting to investigate other annelids living in H2S-free environments.

Table 1. Functional losses of genes classified according to the presence of convergence, positive selection or a functional link

Columns: FR1, CCR5, G6PD, Haemoglobin, Trichromacy, SWS1, EDA, SCR, FY, CASP12, OR, TAS2R, Galactose pathway, V1RL, TRP2, V1R

Row entries, in reading order (cells span groups of columns; the column alignment is not recoverable from this text):
Functional explanation for selective environmental advantage associated with gene loss: + + + – –
New environment should lead to relaxed functional constraint: NA + NA + NA NA
Positive selection: + – + – – –
Convergence (same gene loss in similar environmental changes): + + – – + –

NA, not applicable. –: criteria not fulfilled or not tested.

(b) When positive selection cannot be detected

(i) Sws1

The sws1 (short wavelength sensitive type I pigment group) gene mediates ultraviolet (UV) and violet vision in vertebrates except dolphins and coelacanths Latimeria chalumnae. Since sister species of both dolphins and coelacanths have functional sws1, it has been deduced that the sws1 gene became independently non-functional in these two groups. Such loss of function is unsurprising for coelacanths, which live at depths exceeding 80 m, where UV and violet light are not available, but is less straightforward for dolphins, which spend a significant portion of their time at the surface. It is possible that UV and violet vision may have been replaced by other communication means in these species (Shi & Yokoyama, 2003).

(ii) Ectodysplasin (EDA)

Morphological characters of several groups of teleost fish changed following the colonization of new freshwater environments; many show a reduction of the bony armour found in their oceanic ancestors (Bell & Foster, 1994). Marine and freshwater stickleback populations have been studied with reference to the presence ("complete" morph) or absence ("low" morph) of armour plates.
Mapping, sequencing, and transgenic studies demonstrated that ectodysplasin (EDA) (a member of the tumour necrosis factor family of secreted signalling molecules) played a key role in these evolutionary changes in natural populations and that parallel evolution of freshwater stickleback low-plated phenotypes has occurred repeatedly by selection of Eda alleles derived from an ancestral haplotype (Colosimo et al., 2005). The selective advantage of armour plate reduction after freshwater colonization could be due to increased body flexibility and manoeuvrability, changes in swimming performance and predation regime in the freshwater environment.

(c) Mass gene losses

In the endosymbiotic bacterium Buchnera aphidicola the genome contains 580 genes whereas the closely related species Escherichia coli has 4,300 genes. It seems that B. aphidicola lost 85% of its genes 220–250 million years ago, during adaptation to an endosymbiotic life-style. Many other examples of multiple gene losses are found among bacterial symbionts (Moran, 2003) or eukaryotic intracellular parasites (e.g. the microsporidian Encephalitozoon cuniculi; Katinka et al., 2001). Such genome shrinkage could be explained by the fact that host tissues supply many metabolic intermediates and cofactors, making these symbiont genes redundant. Furthermore, as host-associated bacteria have small genetic population sizes relative to free-living relatives (Funk, Wernegreen & Moran, 2001), genetic drift would accelerate the loss of non-essential genes. Pathways for the synthesis of vitamins and amino acids present in Bacteria, Archaea, fungi, plants, etc., but absent in Metazoa, provide good examples of this class of gene losses (Danchin, Gouret & Pontarotti, 2006). Using data from complete animal genomes, Friedman & Hughes (2004) showed that the same gene families have been lost independently in different lineages and that this has occurred more often than expected if gene loss occurred randomly.
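One simple way to ask whether two lineages share more lost gene families than expected by chance is a hypergeometric tail probability. The sketch below is an illustration under strong assumptions and is not the procedure of Friedman & Hughes (2004): it treats every ancestral family as equally likely to be lost, it considers only two lineages, and the function and parameter names are hypothetical.

```python
from math import comb


def shared_loss_pvalue(n_families: int, lost_a: int,
                       lost_b: int, shared: int) -> float:
    """P(overlap >= shared) under random, independent gene-family loss.

    n_families: families present in the common ancestor
    lost_a:     families lost in lineage A
    lost_b:     families lost in lineage B
    shared:     families lost in both lineages
    """
    if not (0 <= lost_a <= n_families and 0 <= lost_b <= n_families):
        raise ValueError("loss counts must lie between 0 and n_families")
    if not 0 <= shared <= min(lost_a, lost_b):
        raise ValueError("shared cannot exceed either loss count")
    total = comb(n_families, lost_b)
    # Sum the hypergeometric probabilities of every overlap >= shared.
    tail = sum(comb(lost_a, k) * comb(n_families - lost_a, lost_b - k)
               for k in range(shared, min(lost_a, lost_b) + 1))
    return tail / total


if __name__ == "__main__":
    # Toy case: 4 ancestral families, 2 lost in each lineage, both shared.
    print(shared_loss_pvalue(4, 2, 2, 2))
```

In the toy case the probability is 1/6, since only one of the six equally likely pairs of families lost in lineage B coincides with the pair lost in lineage A. A small tail probability would indicate more shared losses than random loss predicts, although unequal loss rates among families could produce the same signal without any environmental cause.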
(d) Loss not directly linked to an environmental change but to the acquisition of a new function
The acquisition of obligate trichromacy occurred independently in the Catarrhini (apes and old-world monkeys) and in a new-world monkey (the howler monkey Alouatta seniculus). In these trichromatic groups, a significantly higher proportion of olfactory receptor pseudogenes is found compared with their closest relatives, suggesting that the deterioration of the olfactory repertoire occurred concomitantly with the acquisition of full trichromatic colour vision in primates (Gilad et al., 2004). However, a link with environmental conditions is difficult to demonstrate formally.

(2) Suggested relationship between gene function and environmental change but absence of convergent losses

(a) Cases with positive selection

(i) The selfing locus in Arabidopsis thaliana
Shimizu et al. (2004) provided very strong evidence that inactivation of the SCR gene, which encodes a pollen coat protein required for reproductive self-incompatibility, has been selected during the evolution of A. thaliana. This may have been a key event enabling A. thaliana to expand its range from glacial regions into Eurasia post-Pleistocene. This phenomenon has not been observed for other species under the same environmental shift.

(ii) The Duffy blood group locus (FY)
A single cis-regulatory single nucleotide mutation has been demonstrated to shut down the expression of the FY locus in humans and to confer resistance to malaria. This null expression allele is over-represented in sub-Saharan Africa whereas it is present at very low frequencies in other populations. Moreover, fixation index (FST) values based on that locus are higher than those based on surrounding sequences or 10 non-coding and non-functional regions scattered throughout the genome.
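The FST comparison described above can be sketched as follows. This is a minimal illustration that assumes a single biallelic locus, equally weighted subpopulations and Wright's definition FST = (HT − HS)/HT. It is not the estimator used by Hamblin & Di Rienzo (2000), and all allele frequencies in the usage lines are hypothetical.

```python
def fst_biallelic(freqs):
    """Wright's F_ST = (H_T - H_S) / H_T for one biallelic locus.

    `freqs` holds the frequency of one allele in each subpopulation;
    subpopulations are given equal weight (an assumption of this sketch).
    """
    if not freqs:
        raise ValueError("at least one subpopulation is required")
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity of the pooled population
    if h_t == 0:
        return 0.0  # allele fixed or absent everywhere: no differentiation
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)  # mean within-population value
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical frequencies: a strongly differentiated focal locus ...
    focal = fst_biallelic([0.95, 0.02])
    # ... compared with hypothetical reference loci of similar frequency in both groups.
    reference = [fst_biallelic(pair) for pair in ([0.40, 0.50], [0.30, 0.35], [0.60, 0.55])]
    print(focal, max(reference))
```

A focal locus whose FST clearly exceeds that of presumably neutral reference regions is the pattern interpreted in the text as a signature of local selection.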
There is thus good evidence for local adaptation of human subpopulations to the malarial selective agent through selection on this mutation despite the absence of convergence (Hamblin & Di Rienzo, 2000).

(iii) CASP12
The gene encoding the CASPASE12 protein is non-functional in the human lineage. This null allele appeared shortly before modern humans migrated out of Africa (Wang, Grus & Zhang, 2006). Wang et al. (2006) show that nearly complete fixation of the null allele has been driven by positive selection, and propose that the functional loss was a consequence of exposure to new antigens during colonization of new areas. Indeed, CASPASE12 is involved in the regulation of inflammatory and immune responses to endotoxins and protection against severe sepsis. In this case, the role and nature of the environmental change are not clear and it remains to be tested whether this gene has also been convergently lost in other lineages, although initial analysis shows that the null allele is rare in other mammals (Saleh et al., 2004). However, in contrast with the CCR5 and Duffy genes, whose null alleles are specifically present in limited geographic areas (Europe and Africa, respectively), the loss of CASPASE12 is characteristic of the whole human lineage.

(b) No positive selection detected

(i) Olfactory receptors in Stenella coeruleoalba
The olfactory receptor repertoire of the dolphin S. coeruleoalba consists only of non-functional class II olfactory receptor genes (pseudogenes; Freitag et al., 1998). Class II olfactory receptors, which recognize volatile odorant molecules, were duplicated en masse when tetrapods colonized land. The terrestrial ancestor of dolphins would therefore have had a set of class II genes like most mammals for which sequences are available.
As these class II receptors presumably did not function in an aquatic environment, their pseudogenization was probably not counter-selected. It would be interesting to examine convergent gene losses in other mammals that colonized the aquatic environment independently.

(ii) Human bitter taste receptor genes
Bitter taste perception prevents mammals from ingesting poisonous substances because many toxins taste bitter. Wang, Thomas & Zhang (2004) hypothesized that selective constraints on human bitter taste receptor (TAS2R) genes might have been relaxed because of changes in diet, use of fire and reliance on other means of toxin avoidance that emerged during human evolution. They looked at intraspecific variations of all 25 genes of the human TAS2R repertoire and found hallmarks of neutral evolution including: (1) similar rates of synonymous (dS) and nonsynonymous (dN) nucleotide changes among rare polymorphisms; (2) no variation in dN/dS among functional domains; and (3) segregation of pseudogene alleles within species and fixation of loss-of-function mutations.

(3) Convergent loss with no clear relationship between gene function and environment
In most studies investigating functional losses or gains, a clear causal relationship between an environmental change and a genotypic/phenotypic shift has not been demonstrated. However, establishing such links is extremely difficult since the effect of the environment is possibly indirect and palaeontological and palaeoenvironmental data are often incomplete.

(a) Galactose pathway
Repeated losses of functionally linked genes have been described for seven genes involved in the galactose pathway (Hittinger, Rokas & Carroll, 2004). These genes were lost at least three times independently in the yeasts Eremothecium gossypii, Candida glabrata and Saccharomyces kudriavzevii and Hittinger et al. (2004) linked this to a change in ecological niche.
An alternative view is that loss of these genes prevented the collapse of the gene network, rather than representing an adaptation to new environmental parameters (Sliwa & Korona, 2005).

(b) Class I vomeronasal receptor
Another example of recurrent gene losses can be found for the class I vomeronasal receptor-like genes (V1RL) (Mundy & Cook, 2003), which were lost seven times independently. No obvious environmental shift can be linked to these gene losses.

(4) No convergence and no clear relationship between gene function and environment
Genes encoding the TRP2 ion channel and V1R pheromone receptors are two components of the vomeronasal pheromone transduction pathway and have been pseudogenized during the evolution of old world monkeys (OWM) (Zhang & Webb, 2003). It is, however, difficult to link this loss to a particular environmental change. The phylogenetic distribution of vomeronasal pheromone insensitivity is concordant with conspicuous female sexual swelling and male trichromatic colour vision, suggesting that vision-based signalling may have replaced a vomeronasal-mediated chemical-based system in hominoids and OWM. Trichromacy may have arisen or become fixed because it aided detection of young leaves and ripe fruits against dappled foliage. Once acquired, trichromatic vision allowed perception of subtle colour changes, which may have provided the selective force for sexual swelling. However, Webb, Cortes-Ortiz & Zhang (2004) showed that pheromone perception and full trichromatic vision coexist in howler monkeys. Consequently, it is difficult to determine the validity of this scenario.

III. FUNCTIONAL GAINS
Like functional losses, functional gains can be positively selected or not. To show that a gain of function has occurred due to an environmental change, the following criteria should be met: (1) there is a clear correlation between the environmental change and the functional gain.
(2) The sequence (or part of the sequence) has undergone substitutions characterized by positive selection. (3) The corresponding protein has undergone a functional shift (co-option) and the shift is associated with sites that have been shown to be under positive selection. (4) There is evidence for convergence at the functional and, possibly, at the molecular level (substitutions at relevant amino-acid sites occurring in several lineages, and a statistical test demonstrating significant convergent or parallel substitutions). (5) The functional gain is correlated with an environmental shift. (6) The novelty confers a selective advantage in the new environment.

Alternatively, a gain of function could be driven by an environmental change without positive selection, i.e. by genetic drift. Note that an apparent lack of positive selection can be due to low statistical power preventing detection; in this case, the role of the environment can still be detected in the presence of convergent gains. Many studies have focused on specific genes where a functional shift can be detected, but adaptation to new environments could involve a combination of several genes producing a particular phenotypic trait.

(1) Cases in which all criteria are met

(a) Artificial cases
To our knowledge, most cases reported in the literature that fulfil all the above criteria are associated with artificial selection imposed by humans (most notably, insecticide, herbicide and antibiotic-resistance genes: Palumbi, 2001; Wright et al., 2005).

(i) Insecticides
Three main types of resistance mechanisms have been described to date, two involving enhanced insecticide detoxification and one rendering the target site for the insecticide insensitive to its effects.
This latter case has been found repeatedly for a large range of species and types of insecticide in proteins including acetylcholinesterase, γ-aminobutyric acid (GABA) receptors and voltage-gated Na+ channels, which are the targets of organophosphates, cyclodienes, and synthetic pyrethroids (as well as DDT), respectively. One of the remarkable aspects of insecticide resistance is the recurrence of exactly the same amino acid changes in orthologous proteins across different species (Ffrench-Constant, Daborn & Le Goff, 2004; Hartley et al., 2006). In the GABA receptor case, insecticide resistance is associated with replacement of alanine at position 302 with either a serine or a glycine residue. Alanine 302 is thought to lie in the narrowest part of the chloride ion channel and replacement of this crucial residue plays a dual role, both reducing insecticide binding and destabilizing the insecticide-bound conformation of the receptor. Population studies have documented increases in resistant allele frequencies in response to insecticide application and have shown that the corresponding loci are under positive selection (see reviews by Roush & McKenzie, 1987; Guillemaud et al., 1998; and Scott, Diwell & McKenzie, 2000).

Resistance to organophosphates is particularly interesting: molecular analysis of preserved specimens indicates that insecticide-resistant haplotypes of the esterase 3 gene present in two Australasian sibling blowfly species (Lucilia cuprina and Lucilia sericata) spread during the resistance outbreak associated with the first use of these insecticides in Australia (around 1955) (Hartley et al., 2006).

(ii) Antibiotics
β-lactam antibiotics, including penicillin, ampicillin, cephalosporins and monobactams (and their derivatives), account for 50% of global antibiotic consumption.
The integrity of the β-lactam ring is necessary for the activity of the antibiotic, which exerts its effect through inactivation of transpeptidases that catalyse key cross-linking reactions in peptidoglycan synthesis. Resistance to β-lactams is the result of expression of β-lactamases, enzymes that degrade and inactivate β-lactams. The most common β-lactamases are the TEM β-lactamases encoded by the TEM-1 gene and its relatives. TEM-1 is taxonomically widely distributed (Medeiros, 1997) and exists at high frequencies in diverse antibiotic-resistant bacterial species (Chanal et al., 2000; Yan et al., 2000). Antibiotic resistance in bacteria can occur by many diverse mechanisms including nucleotide changes in the gene coding for the antibiotic target. In the case of TEM β-lactamase resistance, nine amino acid substitutions have been demonstrated to have occurred more than once; most of the observed substitutions are non-synonymous, suggesting that positive selection is occurring (Barlow & Hall, 2002). Similar examples can be found for herbicide resistance (see, for example, Tranel & Wright, 2002).

(b) Natural cases
The only natural cases where all the above criteria are met have been described in somatic evolution concerning antibodies and for the pancreatic ribonuclease in leaf-eating monkeys.

(i) Somatic evolution
Concerning antibodies, it has been shown that B cells undergo positive selection for mutations that increase their affinity for the antigen. The mutations that occur during the process of somatic recombination have been shown to be functional (by promoting antigen binding shifts; see, for example, Wellmann et al., 2005) and under positive selection (Manser, 1989). Parallel evolution has also been described (Wysocki, Gefter & Margolies, 1990).

(ii) RNaseI from ruminants and colobine monkeys
The ancestral function of RNaseI is the degradation of double-stranded RNA. A new biochemical function (i.e.
the ability to digest single-stranded RNAs at low pH) appeared independently in both colobines (primates) and ruminants (artiodactyls) in relation to their capacity to perform fermentation in the pre-stomach. Recently, Zhang (2006) showed that the gene encoding RNaseI was independently duplicated in Asian and African leaf-eating monkeys and that those new genes acquired enhanced digestive efficiencies through parallel amino acid replacements driven by positive selection.

In ruminants, positive selection has not been demonstrated for RNaseI (Golding & Dean, 1998). Furthermore, the amino acids involved in the functional shift are different for ruminant and colobine RNaseI: five amino acid substitutions in ruminant RNaseI that are known to affect its catalytic activity against double-stranded RNA (Jermann et al., 1995) are not present in colobines.

The relationship between digestive behaviour and environment is not straightforward. However, according to palaeomolecular work (Jermann et al., 1995), the co-option events in ruminants occurred around 35 million years (Myr) ago, which is also the age of the earliest known ruminant fossil. Therefore it appears that the biochemical functional shift occurred in response to acquisition of herbivory. The Oligocene global cooling that followed allowed tough grasses to expand and possibly gave a selective advantage to tough-grass eaters. For colobine monkeys, fossil evidence suggests that a shift to foregut fermentation occurred at least 10 Myr ago, predating the duplications of RNaseI (Delson, 1994). Therefore a functional shift in the optimal pH of RNaseI was not necessary for the change in diet; rather, the change in diet provided a selective pressure for enhanced performance of digestive RNases in acidified environments. To our knowledge, no other example in this category has been described.
(2) Cases where only some criteria are met
Other natural cases reported in the literature do not fulfil all the criteria required to demonstrate a role of the environment on gene evolution via positive selection. These different cases are listed in Table 2 and shown in a simplified overview in Fig. 1.

(a) Case A: All criteria are met except that positive selection is not shown to occur for sites involved in the function

(i) Lysozyme
Lysozyme is an enzyme which disrupts bacterial peptidoglycans and in tetrapods is normally expressed in tears, macrophages, saliva, avian egg white and mammalian milk; however, lysozyme has been recruited independently in the foregut of ruminants, colobine monkeys, and the hoatzin Opisthocomus hoazin (a leaf-eating bird). These stomach lysozymes have similar convergent biochemical properties: lytic activity is optimal at approximately pH 5 for ruminant, colobine, and hoatzin lysozymes compared to pH 5–7 in other species. Ruminant and colobine digestive lysozymes also show increased resistance to inactivation by pepsin compared to other lysozymes.

Positive selection has been detected in primate lysozyme (Messier & Stewart, 1997) by inference of ancestral sequences and dN/dS analysis on each branch of the tree. Most adaptive substitutions in lysozyme seem to have occurred at the origin of the colobine group. The same result was obtained by Yang & Nielsen (2002) using codon-substitution models at individual sites along specific lineages. Stomach lysozyme from ruminants also underwent positive selection (Jolles et al., 1990; Yu & Irwin, 1996; Zhang & Kumar, 1997): sites 75 and 87 underwent parallel evolution; however, these two positions need to be tested functionally by site-directed mutagenesis for a definitive demonstration of their involvement in the gain of function. In the hoatzin, the lack of available sequence prevented testing the possible occurrence of positive selection.
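The dN/dS ratio on which such branch analyses rest can be illustrated with a minimal sketch. It assumes that the numbers of synonymous and nonsynonymous sites and differences between two sequences have already been counted (for instance with a Nei–Gojobori-style procedure), and it applies the Jukes–Cantor correction to each proportion. It is a simplification, not the ancestral-reconstruction analysis of Messier & Stewart (1997) nor the codon-substitution models of Yang & Nielsen (2002), and the counts in the usage lines are hypothetical.

```python
from math import log


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected number of substitutions per site
    for an observed proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must lie in [0, 0.75)")
    return -0.75 * log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs: float, nonsyn_sites: float,
          syn_diffs: float, syn_sites: float):
    """Return (dN, dS, dN/dS) from pre-computed counts of differences and sites."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    return dn, ds, dn / ds


if __name__ == "__main__":
    # Hypothetical counts for one branch: 20 nonsynonymous differences over
    # 300 nonsynonymous sites, 3 synonymous differences over 100 synonymous sites.
    dn, ds, omega = dn_ds(20, 300, 3, 100)
    print(f"dN={dn:.4f} dS={ds:.4f} dN/dS={omega:.2f}")
```

A ratio significantly above 1 on a branch is read as evidence of positive selection and a ratio near 1 as neutral evolution. In practice a statistical test is needed, because ratios estimated from few substitutions are noisy.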
The relationship between digestive behaviour and environmental changes in ruminants remains hypothetical. However, according to ancestral sequence reconstruction (Messier & Stewart, 1997), the co-option events seem to have occurred, as in the case of ruminant digestive RNaseI, around 35 million years ago. Additional palaeontological data and phylogenetic analyses are necessary to confirm the nature of the environmental shift responsible for lysozyme recruitment in the stomach.

Table 2. Functional gains of genes in the literature classified according to the presence of the criteria listed in section III

Criteria:
1. Existence of a clear correlation between the environmental change and the gene function
2. Functional shift
3. Evidence for positive selection
4. Evidence for evolutionary shift
5. Sites under positive selection are associated to functional shift
6. Convergence at functional level
7. Convergence at molecular level
8. Convergent sites are involved in the functional shift
9. Functional shift is linked to environmental change

Text section (III)  1      2a     2b i   2b ii  2c     2d     2e     2f i   2f ii  2g i   2g ii  2h     2i
Case                       A      B1     B2     C      D      E      F1     F2     G1     G2     H      I
Criterion 1         +      +      –      –      –      +      –      +      –      +      +      +/–    +/–
Criterion 2         +      +      +      +      –      +      +      +      +      +      +      +      –
Criterion 3         +      +      +      +      +      –      +      +      +      –      –      –      +
Criterion 4         +      +      +      +      +      –      +      +      +      –      –      +      +
Criterion 5         +      –      –      –      –      –      +      +      +      –      –      –      –
Criterion 6         +      +      +      +      –      +      +      –      –      –      +      –      –
Criterion 7         +      +      +      –      +      +      –      –      –      –      –      –      –
Criterion 8         +      –      –      –      –      +      –      –      –      –      –      –      –
Criterion 9         +      +      +      –      –      +      –      –      –      +      +      –      –

Examples (left to right): Artificial cases, RNASEs, Lysozyme, MHC, AOX, SEMG2, FOXP2, RH1, LWS, SWS1, TTX, AFGP, Crystallin, GFP-like proteins, Proteo-rhodopsin, TRIM5α, ECP, FAEA, GLUD2, GST, Iota-crystallin, MC1R, Most of the published examples.
(–): Criteria not fulfilled or not tested. Note that the link between the gain of function and the environmental change is most often speculative. Column headings refer to the text, Section III, in which these gains are discussed.

(ii) Major histocompatibility complex
In the major histocompatibility complex (MHC), adaptive evolution has promoted diversity of the antigen recognition site (Hughes & Nei, 1988). Further analyses identified that the majority of residues located in the antigen recognition site are involved in antigen-binding (Yang & Swanson, 2002) and are under positive selection. However, no functional study has yet been undertaken to test whether positively selected residues are indeed directly involved in the peptide-binding shift. Convergent changes in the MHC have been demonstrated (Andersson et al., 1991; Kriener et al., 2000).

(b) Case B: convergence, with no demonstrated positive selection for sites involved in the function, and no clear correlation with the environment

(i) Aldehyde oxidase (AOX) and xanthine dehydrogenase (XDH)
Aldehyde oxidase (AOX) and xanthine dehydrogenase (XDH) are two members of the xanthine oxidase family of molybdo-flavoenzymes with different functions. AOX and XDH are homodimers (290 kDa) but each monomer acts independently in catalysis. XDH is involved in catabolism of purines by oxidizing hypoxanthine into xanthine, and xanthine into uric acid, whereas AOX catalyses the oxidation of aldehydes into acids and does not show reactivity with hypoxanthine. AOX and XDH originated from duplication events and thus provide an interesting case of neofunctionalization. Rodriguez-Trelles et al.
(2001) demonstrated that Aox evolved independently twice from two different Xdh paralogues whose duplicates were subjected to positive selection after each round of duplication. Moreover, in both cases, the same amino acids (located in the flavin adenine dinucleotide and substrate-binding pockets) have been positively selected. Although the link with an environmental change is difficult to demonstrate, convergence at functional and molecular levels strongly argues in favour of an adaptive event.

(ii) Semenogelin
In primates, semenogelin is the main protein of the seminal fluid produced by seminal vesicles. After ejaculation, semenogelin undergoes covalent cross-linking to become the principal structural component of semen coagulum in the reproductive tract of the recipient female. Over time, the coagulum is liquefied through the cleavage of semenogelin and this process leads to the release of sperm from the coagulum. This process is crucial in preventing fertilization of a recently inseminated female by rival males in subsequent copulations and thus is subject to different selective regimes under different mating systems. Dorus et al. (2004) showed a correlation between the rate of evolution of SEMG2 and three behavioural and physiological reproductive parameters in primates: mean number of male partners per female periovulatory period; the residual testis size within the species; and semen coagulation rates. Occurrence of convergent or parallel evolution at the molecular level would strengthen the case for selection, although ultimate demonstration requires functional tests (using site-directed mutagenesis to quantify the impact of the incriminated substitutions on semen coagulation). The link between an environmental change and this behavioural shift remains unclear.

Fig. 1. Simplified overview of cases of functional gain. Cases are classified according to the presence (blue arrows) or absence (red crosses) of simplified criteria (left box). Cases are defined and discussed in the text; section numbers are as in Table 2.

(c) Case C: convergent evolution at the molecular level, with demonstrated positive selection, but no demonstrated functional shift or relation with the environment
FOXP2 (forkhead box P2) is a gene probably involved in cerebral and cognitive processes such as speech acquisition. Positive selection in humans has been shown at the population level (Zhang, Webb & Podlaha, 2002; Enard, 2002) but neither a functional shift nor a correlation with an environmental change has been demonstrated. Convergent or parallel evolution at the molecular level has been documented between humans and carnivores although statistically significant positive selection was not detected in carnivores.

(d) Case D: convergent evolution at the molecular and functional levels, existence of correlation with an environmental change, but no detection of positive selection
When positive selection is not detected, gain of function could also be the consequence of a relaxed functional constraint allowing the new function to appear under neutrality. Moreover, many instances of selection are probably not detectable by any currently available method. The presence of convergence provides significant proof of a role of the environment on the evolutionary change. Therefore evidence of positive selection is not strictly required to substantiate an environmental influence.

(i) Evolution of orthologous genes
(a) Vertebrate rhodopsins. Retinal photoreceptors consist of a light-absorbing component (the chromophore) and a protein moiety (the visual pigment or opsin; Wald, 1968).
In vertebrates, two types of chromophore coexist, 11-cis-retinal and 11-cis-3,4-dehydroretinal. There are five evolutionary groups of opsins, one in rods and four in cones. The rod opsin (rhodopsin 1, RH1) facilitates formation of black and white images in dim light whereas the cone opsins mediate colour vision in bright light. The four colour opsins differ in their light sensitivity: short-wavelength ultraviolet (UV)-sensitive 1 (SWS1, maximum absorbance λmax = 360–430 nm), short-wavelength sensitive 2 (SWS2, λmax = 440–460 nm), rhodopsin-like 2 (RH2, λmax = 470–510 nm) and middle (green) and long (red) wavelength sensitive (MWS/LWS, λmax = 510–560 nm). The light sensitivity of a visual pigment is determined by the chromophore and its interaction with the amino acid residues lining the pocket of the opsin in which the chromophore is embedded.

RH1
Most RH1 pigments tested so far have λmax values around 500 nm; however, the marine conger eel (Conger myriaster), bottlenose (Tursiops truncatus) and saddleback (Delphinus delphis) dolphins, and coelacanths (Latimeria chalumnae) have RH1 pigments that show a 10–20 nm blue-shift in λmax value (Yokoyama, 2000; Yokoyama, personal communication), possibly because they inhabit aquatic environments dominated by blue light. Although several pigments are known to have a blue-shifted λmax, the molecular basis of this shift has been analysed only for coelacanths (Yokoyama et al., 1999). While positive selection could not be detected with statistical significance, their analysis suggested that the same amino acid substitution occurred several times independently.

In squirrel fish rhodopsins (Yokoyama & Takenaka, 2004), RH1 λmax ranges from 481 nm to 501 nm. Phylogenetic and mutagenesis analyses suggest that the common ancestor of these pigments had a λmax value of 493 nm and that extant values were generated largely by three amino acid replacements: E122M, F261Y and A292S.
The probability of simultaneous substitution of these three amino acids occurring by chance is only 2.5 × 10⁻⁹. The close correlation between the λmax values of these pigments and the wavelengths of light available to these species suggests that this functional shift can be associated with an identified environmental change.

LWS/MWS
Yokoyama & Yokoyama (1990) reported that red pigments in humans and fish independently evolved from green pigments by identical amino acid substitutions at key functional positions. Indeed, three amino acid sites underwent parallel substitutions in bony fish and primate opsins, generating a similar change in λmax. Zhang (2003), using the statistical test developed by Zhang & Kumar (1997), rejected the null model of parallel substitutions due to chance alone. The same parallel substitutions were later found to have occurred in other species, and in all cases, the events were correlated with changes in λmax (Boissinot et al., 1998; Yokoyama & Radlwimmer, 2001). The environmental changes that selected for these functional shifts are not yet known.

SWS1/SWS2
Shi & Yokoyama (2003) showed that reconstructed SWS pigments of the ancestor of bony vertebrates had a λmax of 360 nm, enabling UV vision. A shift towards the violet spectrum occurred independently in different vertebrate lineages, partly due to a single amino-acid substitution. Shi & Yokoyama (2003) identified amino-acid substitutions that affected the λmax of SWS1 opsins. As noted by Zhang (2003), from such information, it should be possible to predict the λmax of any SWS1 opsin simply from its sequence. Odeen & Hastad (2003) obtained the SWS1 sequence for 41 birds and predicted that UV vision was regained several times independently.
The difference in peak sensitivity between the UV and violet spectrum (at least 23 nm) is quite dramatic and changes not only the perception of objects that reflect light solely in the UV or violet ranges but also the perception of objects reflecting longer wavelengths. All of the six raptors examined by Odeen & Hastad (2003) possessed violet-sensitive SWS1, unlike many of their Passeriform prey. This could mean that prey birds use colours for signalling that are conspicuous to members of their own species but cryptic to raptors (Odeen & Hastad, 2003). Selection should favour stronger signals in the wavelengths to which predators (including human hunters) are insensitive: either higher plumage reflectance in the SWS1 and SWS2 ranges or higher sensitivity to those parts of the spectrum (Odeen & Hastad, 2003).

(b) TTX-resistant sodium channel. Coevolution between the garter snake Thamnophis sirtalis and its toxic prey, the newt Taricha granulosa, has resulted in geographic variability in a physiological trait in this snake: resistance to tetrodotoxin (TTX) (Brodie, 2002). Tetrodotoxin causes paralysis and death by binding to the outer pore of voltage-gated Na+ channels [tsNa(V)1.4], blocking nerve and muscle fibre activity. Some populations of T. granulosa have high levels of TTX in their skin, providing a defence against predators. Sequence analyses of the TTX Na+ channel gene showed that TTX resistance evolved at least twice during the radiation of T. sirtalis and in vitro experiments revealed the amino acid substitutions involved in the functional shift (Geffeney et al., 2005). One substitution may represent parallel evolution at that site among resistant populations.
(ii) Evolution of non-orthologous genes
The above examples involve orthologous genes driven by the environment towards a similar new function, but convergence can occur among paralogues or even non-homologous sequences. Two examples are found: the antifreeze glycoproteins (AFGP) and the crystallins.

(a) Antifreeze glycoproteins (AFGP). Antarctic notothenioid fish and Arctic cod (Chen, DeVries & Cheng, 1997) show similar AFGP gene structures that did not arise by descent from a common progenitor. Instead, they arose from the tendency for short repetitive sequences to undergo expansion through slippage replication and unequal crossing-over, which gave rise to similar mature glycotripeptide gene products capable of ice binding. The underlying environmental change (glaciation) can be linked to the same physiological adaptation to life at low temperatures in these two groups, although a test for positive selection is difficult here since it would require identifying the sequence of the protein before it was recruited to become an antifreeze.

(b) Crystallins. The ocular lens in vertebrates and some invertebrates is a transparent cellular tissue whose principal function is light refraction. The refractive index is achieved by the accumulation of soluble proteins: the crystallins. The genes coding for crystallins were recruited repeatedly from genes with non-lens functions. The oldest event involved a crystallin that was co-opted in the common ancestor of vertebrates from small heat shock proteins (Ingolia & Craig, 1982). Genes co-opted to form crystallins are often easily identifiable as many of their products still function as enzymes in tissues outside the lens. Wistow (1993) suggested that these co-option events played a major role in the rise and radiation of land vertebrates. The vertebrate eye evolved in an optically dense medium (water) requiring a high refractive index. γ-crystallins have a high refractive index and predominate in both fish and mammalian lenses.
During the tetrapod radiation in the less optically dense medium of air, the refractive index of the eye lens was reduced to allow accommodation and focusing at large distances. This reduction occurred via either the elimination of γ-crystallins or their dilution with other, unrelated, crystallin proteins. Birds with high diurnal visual acuity have completely replaced γ-crystallins with crystallins of low refractive index, co-opted from the argininosuccinate lyase (ASL), lactate dehydrogenase B (LDHB), and enolase enzymes, giving rise to δ-, ε-, and τ-crystallins, respectively. Co-option events in mammals can be understood given their evolutionary history: the few mammals that survived the Cretaceous-Tertiary extinctions are thought to have been nocturnal. During this nocturnal episode in mammalian evolution, γ-crystallin genes underwent amplification and any lens-softening crystallins were lost. The explosive radiation of diurnal mammals in the Tertiary led to the loss of some γ-crystallins and the independent recruitment of several enzymes to reduce the refractive index once more. Less well-studied crystallins found in invertebrate eye lenses have been similarly co-opted from a variety of genes (see Tomarev & Piatigorsky, 1996). Therefore, shifts in lens refractive index can be linked to environmental changes requiring aquatic versus aerial and nocturnal versus diurnal vision, which caused shifts in crystallin composition through the co-option of different proteins. However, there is no indication that the co-opted genes were fixed more rapidly than expected under neutrality. Analyses of crystallins from amphibious species, such as hippopotami and pinnipeds, might shed light on this.
(iii) Evolution of genomic repertoire
The genomic repertoire of a community is specific to its biotope, suggesting that genomic architecture is shaped by the environment.
For instance, distributional patterns of genes in microbial planktonic communities between the surface of the ocean and the sea floor allowed the identification of depth-variable trends in gene content and metabolic pathway components (DeLong et al., 2006). Likewise, Rocap et al. (2003) compared the genomes of two Prochlorococcus ecotypes (MED4 and MIT9313) that exhibit different light sensitivity for growth. Although the two ecotypes have 1350 genes in common, a significant number (~1300 non-orthologous genes between both ecotypes) were not shared but were either differentially retained from a common ancestor or acquired through lateral transfer. Some of these genes are likely to help determine the relative fitness of the ecotypes in response to key environmental variables and hence contribute directly to their distribution and reproductive success in the oceans.
(e) Case E: demonstrated positive selection on sites involved in the functional shift, with functional convergence
The great star coral Montastraea cavernosa possesses several genes coding for fluorescent GFP-like proteins with cyan, shortwave green, longwave green, and red emission colours (Ugalde, Chang & Matz, 2004). Phylogenetic analysis suggested convergent evolution within this gene family. Field et al. (2006) suggested that the increase in fluorescent colour diversity is adaptive and hypothesized that multi-coloured fluorescent proteins could have evolved as part of a mechanism regulating the relationship between the coral and its algal endosymbionts (zooxanthellae). Field et al.
(2006) recreated the ancestral proteins to establish where in the evolutionary lineages the phenotypic transition happened, searched for and found evidence of episodic positive selection in these lineages, and used mutagenesis of extant and ancestral proteins to confirm that the predicted positively selected mutations were involved in the colour change. Mutagenesis experiments showed that the positively selected sites were both essential and sufficient to generate the cyan colour from the ancestral green. For the red colour, however, mutagenesis showed that the positively selected sites were essential but not sufficient for the phenotypic change, raising the possibility that neutral evolution played a role in addition to positive selection (Field et al., 2006).
(f) Case F: demonstrated positive selection on sites involved in the functional shift, but with no observed convergence among lineages
Two kinds of cases without molecular convergence fall into this category: those in which different genes are involved in the same functional gain, and those in which different sites within a single gene are linked to the gain in function.
(i) Probable link with an environmental change
(a) Proteorhodopsin. Bielawski et al. (2004) detected positively selected amino acid sites in proteorhodopsin, a retinal-binding membrane protein in marine bacteria that functions as a light-driven proton pump. Site-directed mutagenesis (Man-Aharonovich et al., 2004) showed that two out of four positively selected amino acid sites could account for the spectral difference between the two major proteorhodopsin families found in marine bacterial populations. Members of the two related proteorhodopsin families absorb light with different λmax (525 nm, green; 490 nm, blue), and their distribution in the water column was shown to be stratified according to the available wavelengths.
(b) TRIM5α.
The primate genome encodes a variety of genes involved in immune strategies against retroviruses. One of these gene products, TRIM5α, probably involved in an antagonistic conflict with proteins from the viral capsid, can restrict diverse retroviruses in a species-specific manner (Sawyer et al., 2005): whereas rhesus monkey TRIM5α can strongly restrict HIV-1, human TRIM5α exhibits only weak HIV-1 restriction. Sawyer et al. (2005) found strong evidence for ancient positive selection of TRIM5α in the primate lineage and suggested that TRIM5α evolution was driven by antagonistic interactions with a wide variety of viruses that pre-dated the origin of primate lentiviruses. A 13-amino-acid patch in the B30.2 functional domain bears multiple positively selected residues, potentially acting at the viral interface. Experiments with recombinant proteins have since shown that this patch is generally essential for retroviral restriction. The antiquity of the detected positive selection rules out the emergence of primate lentiviruses (like HIV-1) as the major cause. However, TRIM5α from humans and Old World monkeys is active against murine leukaemia virus (a gamma-retrovirus closely related to human endogenous retroviruses) that has episodically invaded primate genomes and still continues to be active. This suggests that TRIM5α evolution may have been strongly influenced by episodes of endogenous retrovirus infection and subsequent retroposition events. HIV-1 and other primate lentiviruses are likely to be newcomers in this conflict, with the TRIM5α restriction against HIV-1 in Old World monkeys being just an evolutionary coincidence.
(c) ECP and EDN.
The eosinophil cationic protein (ECP/RNase3) and eosinophil-derived neurotoxin (EDN/RNase2) are paralogues that emerged from a duplication event around 31 million years ago in the Old World monkey lineage; the orthologues found among New World monkeys and prosimians are named EDNs by convention (Zhang & Rosenberg, 2002; Bielawski & Yang, 2004). In humans, EDN and ECP proteins are found in large granules in eosinophilic leukocytes. In vitro studies showed that human EDN reduces the infectivity of certain RNA viruses through an RNase-dependent process. This antiviral activity is also found in Old World monkey EDN. Human ECP, however, shows only weak antiviral activity (even at relatively high concentrations) but exhibits a cell membrane-disruptive function that is probably responsible for its toxicity against bacteria and parasites. New World monkey EDNs lack antibacterial and antiviral activities. As (i) both EDN and ECP can digest RNA and (ii) EDN RNases from Old World monkeys are catalytically more efficient than both ECP and New World monkey EDNs, a significant enhancement of RNase activity most probably occurred in the EDN lineage after gene duplication. Zhang & Rosenberg (2002) were able to determine that, after duplication, nine amino acid substitutions occurred in the EDN of the hominoid ancestor. Site-directed mutagenesis analysis shows that two of these substitutions, located at two interacting sites (positions 64 and 132), resulted in a 13-fold enhancement of EDN ribonucleolytic activity. Since the temporal order of these substitutions is unknown, two scenarios are possible. Either the R64S replacement first decreased RNase activity by 46% and the T132R substitution then raised it 24-fold (0.54 × 24 ≈ 13), or the T132R substitution occurred first, reducing RNase activity by 21%, with the second substitution (R64S) then allowing a 17-fold increase (0.79 × 17 ≈ 13).
(d) Lipase/feruloyl esterase A.
Despite strong structural and sequence similarities, two distinct enzymatic activities, i.e. lipase and type-A feruloyl esterase (FAEA), are encoded by different members of this fungal gene family. Evolutionary analyses suggested that the lipase function was co-opted after gene duplication, leading to a subsequent enzymatic novelty (FAEA) involved in the lignocellulolysis of plant cell walls. This functional shift was detected, and the corresponding positively selected amino acids were identified, using the branch-site model for testing positive selection on individual codons along specific lineages. Furthermore, site-directed mutagenesis experiments confirmed that three of the amino acids under positive selection were involved in the functional shift. It could be argued that environmental changes such as the colonization of land by terrestrial plants might have driven adaptation by functional diversification (Levasseur et al., 2006).
(ii) No clear link with an environmental shift
(a) Glutamate dehydrogenase 2 (GLUD2). The ancestral GLUD gene was duplicated in the hominoid lineage, after the split from Old World monkeys, giving rise to two paralogues: GLUD1 and GLUD2. GLUD1 is an important housekeeping gene and is expressed in many tissues, whereas GLUD2 is specifically and highly expressed in nerve tissues (brain and retina) and in testis. Burki & Kaessmann (2004) have shown that the amino acid changes responsible for GLUD2 brain-specificity occurred during a period of positive selection after the duplication event. Maximum likelihood analyses that test for selection at certain sites on the phylogenetic tree identified a subset of sites with a dN/dS value significantly greater than 1; only two of them (443 and 456) explain the high brain-specific activity.
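Maximum likelihood tests of this kind are commonly framed as a likelihood ratio test between two nested codon models, one that forbids and one that allows dN/dS > 1 at some sites. The text does not give the statistics used in the studies cited, so the following is only a minimal sketch under stated assumptions: the log-likelihood values are hypothetical, the function names are illustrative, and the test statistic is assumed to follow a chi-square distribution with one degree of freedom.

```python
import math
from typing import Tuple


def chi2_sf_df1(x: float) -> float:
    """P(X > x) for a chi-square variable with one degree of freedom."""
    if x <= 0:
        return 1.0
    # For df = 1, X is the square of a standard normal variable.
    return math.erfc(math.sqrt(x / 2.0))


def likelihood_ratio_test(lnl_null: float, lnl_alt: float) -> Tuple[float, float]:
    """Return (2 * delta lnL, p-value) for two nested models.

    Assumes the models differ by one free parameter (here, dN/dS
    fixed at 1 in the null model versus estimated in the alternative).
    """
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return statistic, chi2_sf_df1(statistic)


# Hypothetical log-likelihoods, not taken from any study cited here.
lnl_null = -2345.67  # null model: no site class with dN/dS > 1
lnl_alt = -2340.12   # alternative model: one site class with dN/dS > 1

stat, p_value = likelihood_ratio_test(lnl_null, lnl_alt)
print(f"2*delta_lnL = {stat:.2f}, p = {p_value:.4g}")
```

A small p-value would favour the model that allows positive selection, but it says nothing about which sites matter functionally; that still requires experiments such as the site-directed mutagenesis described in this section.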
Even if the origin and adaptive phase of GLUD2 are approximately concomitant with a period of increased size and structural and functional complexity of the brain, it is difficult to identify an environmental change that could have driven this evolution. We do not claim that such an environmental change must have existed: conceptually, a new function can emerge even in a stable environment.
(b) Glutathione transferases (GSTs). Ivarsson et al. (2003) identified positively selected amino acid residues in GSTs, which are multifunctional enzymes providing cellular defence against toxic electrophiles of both exogenous and endogenous origins. Site-directed mutagenesis confirmed that those substitutions can drive functional diversification in substrate specificities. It is, however, probably impossible to design experiments that could test whether mutations have been fixed in response to the emergence of a specific xenobiotic or a new catalytic pathway.
(g) Case G: functional shift linked to an environmental change but with no evidence for positive selection or convergence
(i) Iota crystallin
The crystallins are not only involved in modifying the refractive index of the eye lens (see Section III.2.dii). They also, in association with a chromophore,