ANALYSIS

Systematic meta-analyses and field synopsis of genetic association studies in schizophrenia: the SzGene database

Nicole C Allen1, Sachin Bagade1, Matthew B McQueen2, John P A Ioannidis3–5, Fotini K Kavvoura3, Muin J Khoury6, Rudolph E Tanzi1 & Lars Bertram1

In an effort to pinpoint potential genetic risk factors for schizophrenia, research groups worldwide have published over 1,000 genetic association studies with largely inconsistent results. To facilitate the interpretation of these findings, we have created a regularly updated online database of all published genetic association studies for schizophrenia (‘SzGene’). For all polymorphisms having genotype data available in at least four independent case-control samples, we systematically carried out random-effects meta-analyses using allelic contrasts. Across 118 meta-analyses, a total of 24 genetic variants in 16 different genes (APOE, COMT, DAO, DRD1, DRD2, DRD4, DTNBP1, GABRB2, GRIN2B, HP, IL1B, MTHFR, PLXNA2, SLC6A4, TP53 and TPH1) showed nominally significant effects with average summary odds ratios of ~1.23. Seven of these variants had not been previously meta-analyzed. According to recently proposed criteria for the assessment of cumulative evidence in genetic association studies, four of the significant results can be characterized as showing ‘strong’ epidemiological credibility. Our project represents the first comprehensive online resource for systematically synthesized and graded evidence of genetic association studies in schizophrenia. As such, it could serve as a model for field synopses of genetic associations in other common and genetically complex disorders.

Schizophrenia is a common disorder caused by the interaction of multiple genetic and environmental factors, but its etiology has proved difficult to determine1.
In particular, genetic research has been hindered by the largely nonmendelian patterns of familial transmission and the lack of disease-specific neuropathological features or biomarkers2. Although the heritability of schizophrenia is high (~80%), nongenetic factors likely also considerably modify disease risk3, further compli- cating the identification of susceptibility genes. Genome-wide linkage analyses have identified several chromosomal regions thought to harbor schizophrenia genes, but only a few overlap across studies4,5. To identify the potential loci underlying these signals, well over 1,000 studies have been published claiming or refuting genetic association between puta- tive schizophrenia genes and affection status, onset age and/or certain endophenotypes. Currently, about 150 genetic association studies are published each year, at increasing pace (Supplementary Fig. 1 online). Despite these efforts, no single gene or genetic variant has been estab- lished as a bona fide schizophrenia susceptibility gene, at least not with the confidence accorded to other genes associated with susceptibility to complex disease, such as APOE in Alzheimer’s disease6 or CFH in macular degeneration7. For health care providers, researchers and the general public, the accumulating information is increasingly difficult to follow, evaluate and interpret. To address this problem, we have collected and comprehensively catalogued all genetic association studies published in the field of schizophrenia. Furthermore, we subjected all polymorphisms with data available from at least four independent case-control samples to systematic meta-analyses. Detailed summaries of all association studies and meta-analyses have been posted on a regularly updated and publicly available online database, ‘SchizophreniaGene’ (‘SzGene’). 
In addition, we have applied interim guidelines developed by the Human Genome Epidemiology Network (HuGENet) to assess the epidemiological credibility of these associations. Our study is the first of its kind in schizophrenia, and substantially facilitates the interpretation of findings in the quest for genuine genetic susceptibility factors of this disorder.

1Genetics and Aging Research Unit, MassGeneral Institute for Neurodegenerative Disease (MIND), Department of Neurology, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA. 2Institute for Behavioral Genetics, University of Colorado, Boulder, Colorado 80309, USA. 3Clinical and Molecular Epidemiology Unit, Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina 45110, Greece. 4Biomedical Research Institute, Foundation for Research and Technology Hellas, Ioannina 45110, Greece. 5Department of Medicine, Tufts University School of Medicine, Boston, Massachusetts 02110, USA. 6National Office of Public Health Genomics, Centers for Disease Control and Prevention, Atlanta, Georgia 30341, USA. Correspondence should be addressed to L.B. (bertram@helix.mgh.harvard.edu).

Published online 26 June 2008; doi:10.1038/ng.171
Nature Genetics, volume 40, number 7, July 2008, page 827. © 2008 Nature Publishing Group, http://www.nature.com/naturegenetics

RESULTS

Literature searches

On 30 April 2007, the database content was frozen for the current analyses. At that time, 1,179 individual publications reporting on 3,608 genetic variants in 516 different genes were included in SzGene (after screening approximately 15,000 titles and abstracts). More than 90% (1,093) of these studies were published after 1995, and about half of those during the past three years (Supplementary Fig. 1).
Both the average number of poly- morphisms and the combined sample sizes studied per publication have steadily increased over the past 10 years, averaging 6 and 660, respectively, from 2003 to 2006, as compared to 2 and 380 between 1997 and 2002. To determine the completeness of our search strategies, we compared the number of studies included in SzGene to those available in other literature databases (HuGENet, Genetic Association Database and EMBASE) on ten randomly chosen genes. For these, HuGENet listed 27 studies, GAD 16, and EMBASE 33, whereas SzGene included 47 pub- lications (Supplementary Table 1 online). This supports our literature search strategies as both comprehensive and specific. Meta-analyses Of the 3,608 polymorphisms included in the analyses herein, 118 variants in 52 genes had sufficient data to warrant meta-analysis (Supplementary Table 2 online). On average, these were based on 3,589 subjects (median; interquartile range (IQR) = 2,335–5,669) originating from 6 (median; IQR = 4–9) case–control samples. Twenty four (20%) of the meta- analyzed variants showed nominally significant (P ≤ 0.05) summary ORs, and for convenience will be referred to as ‘positive’ SNPs (those with P > 0.05 are designated as ‘negative’). Details of the meta-analyses for all polymorphisms showing significant summary ORs in either the ‘all ethnicities’ or the ‘Caucasian only’ (that is, of self-reported European ancestry) paradigms are summarized in Table 1, as well as in Supplementary Table 3 and Supplementary Fig. 2 online). A total of 24 variants within 16 genes (APOE, COMT, DAO, DRD1, DRD2, DRD4, DTNBP1, GABRB2, GRIN2B, HP, IL1B, MTHFR, PLXNA2, SLC6A4, TP53 and TPH1) yielded summary ORs suggesting a nominally significant increase or decrease in risk for schizophrenia. Seven meta-analyses showed nominal significance only across samples of European ancestry and not across the combination of samples of all ancestries. 
These results should be interpreted with caution, as ancestry-specific effects are possible but probably uncommon8, and differences in statistical significance do not constitute proof of true differences in genetic effects across populations of diverse ancestral descent. Note that since the 30 April 2007 data freeze, new genotype data have been published for 8 of the 24 positive variants (in COMT, DRD2, GABRB2 and IL1B; see SzGene website for details). However, none of the eight updated meta-analyses was changed appreciably, and all variants continue to show nominally significant results.

Table 1 Random-effects meta-analyses using allelic contrasts for polymorphisms showing significant summary ORs (as of 30 April 2007)

| Gene | Polymorphism | Model | Cases vs. controls (number of independent samples) | OR (95% CI) [a] | P value | Heterogeneity P value [b] | I2 | Grade [c] |
|---|---|---|---|---|---|---|---|---|
| APOE | APOE (ε2/3/4) | E4 vs. E3, Caucasian [d] | 1,500 vs. 2,702 (15) | 1.16 (1.00–1.34) | 0.043 | 0.60 | 0 | B |
| COMT | rs165599 | G vs. A, all ancestries | 2,628 vs. 7,340 (6) | 1.11 (1.02–1.21) | 0.019 | 0.24 | 25 | C |
| COMT | rs737865 | C vs. T, Caucasian [d] | 1,605 vs. 4,021 (3) | 1.13 (1.01–1.28) | 0.039 | 0.22 | 34 | C |
| DAO | rs4623951 | C vs. T, all ancestries | 1,509 vs. 1,521 (4) | 0.88 (0.79–0.98) | 0.026 | 0.85 | 0 | C |
| DRD1 | rs4532 (DRD1_48A/G) | G vs. A, all ancestries | 725 vs. 1,075 (5) | 1.18 (1.01–1.38) | 0.037 | 0.56 | 0 | A |
| DRD2 | rs1801028 (S311C) | G vs. C, Caucasian [d] | 2,299 vs. 3,777 (15) | 1.52 (1.09–2.12) | 0.013 | 0.27 | 16 | B |
| DRD2 | rs6277 (P319P) | C vs. T, Caucasian [d] | 473 vs. 896 (3) | 1.45 (1.21–1.73) | <0.00004 | 0.31 | 15 | C |
| DRD4 | rs1800955 (521T/C) | C vs. T, all ancestries | 2,002 vs. 1,986 (6) | 1.15 (1.05–1.26) | 0.003 | 0.77 | 0 | C |
| DRD4 | 120-bp TR | S vs. L, all ancestries | 1,236 vs. 1,199 (4) | 0.81 (0.70–0.94) | 0.005 | 0.36 | 7 | C |
| DTNBP1 | rs1011313 (P1325) | T vs. C, Caucasian [d] | 2,696 vs. 2,849 (8) | 1.23 (1.07–1.40) | 0.003 | 0.59 | 0 | A |
| GABRB2 | rs1816072 | C vs. T, Caucasian [d] | 1,129 vs. 995 (4) | 0.82 (0.72–0.93) | 0.002 | 0.54 | 0 | C |
| GABRB2 | rs1816071 | G vs. A, Caucasian [d] | 1,133 vs. 993 (4) | 0.82 (0.72–0.93) | 0.002 | 0.69 | 0 | C |
| GABRB2 | rs194072 | C vs. T, Caucasian [d] | 1,137 vs. 991 (4) | 0.83 (0.69–1.00) | 0.048 | 0.36 | 7 | B |
| GABRB2 | rs6556547 | T vs. G, Caucasian [d] | 774 vs. 620 (3) | 0.70 (0.52–0.95) | 0.022 | 0.96 | 0 | B |
| GRIN2B | rs7301328 (366G/C) | G vs. C, all ancestries | 903 vs. 810 (4) | 1.16 (1.01–1.33) | 0.034 | 0.43 | 27 | C |
| GRIN2B | rs1019385 (200T/G) | G vs. T, all ancestries | 502 vs. 466 (4) | 1.45 (1.14–1.85) | 0.003 | 0.15 | 44 | C |
| HP | Hp1/2 | 1 vs. 2, all ancestries | 1,346 vs. 2,018 (6) | 0.88 (0.80–0.98) | 0.016 | 0.67 | 0 | C |
| IL1B | rs16944 (C511T) | T vs. C, Caucasian [e] | 819 vs. 1,302 (5) | 0.78 (0.65–0.93) | 0.006 | 0.25 | 26 | C |
| MTHFR | rs1801133 (C677T) | T vs. C, all ancestries | 3,327 vs. 4,093 (14) | 1.16 (1.05–1.30) | 0.005 | 0.006 | 56 | C |
| MTHFR | rs1801131 (A1298C) | C vs. A, Caucasian [e] | 1,211 vs. 1,729 (5) | 1.19 (1.07–1.34) | 0.002 | 0.73 | 0 | A |
| PLXNA2 | rs752016 | C vs. T, all ancestries | 1,122 vs. 1,211 (6) | 0.82 (0.69–0.99) | 0.037 | 0.19 | 33 | C |
| SLC6A4 | 5-HTTVNTR | 10 vs. 12, all ancestries | 2,335 vs. 2,688 (11) | 0.86 (0.74–0.99) | 0.036 | 0.03 | 50 | C |
| TP53 | rs1042522 | C vs. G, all ancestries | 1,418 vs. 1,410 (5) | 1.13 (1.01–1.26) | 0.029 | 0.50 | 0 | C |
| TPH1 | rs1800532 (218A/C) | A vs. C, all ancestries | 829 vs. 1,268 (5) | 1.31 (1.15–1.51) | <0.00008 | 0.33 | 13 | A |

Note that this table lists only the polymorphisms found to be nominally significantly associated with schizophrenia in SzGene. For a more detailed presentation of these positive polymorphisms, including genotypic analyses and results after the exclusion of HWE-deviating samples, see Supplementary Table 3. For a complete list of meta-analyses done in SzGene, see Supplementary Table 1. When nominally statistically significant results are obtained both in the analysis including all samples and in the analysis including only samples of European descent (usually referred to as ‘Caucasian’ in the original publications), only the analysis that has the largest genetic effect size (OR deviating the most from 1.00) is reported here.
[a] Summary ORs are based on random-effects allelic contrasts comparing minor and major alleles (based on frequencies in the control samples).
[b] Based on the Q statistic across crude ORs calculated for each study. P < 0.1 is considered to indicate significant evidence of between-study heterogeneity (see Methods).
[c] Degree of ‘epidemiological credibility’ based on the interim Venice guidelines (A, strong; B, modest; C, weak; see text and Supplementary Table 6 for more details).
[d] Nominally significant only in analyses restricted to samples of European descent.
[e] Nominally significant in analyses when combining samples of all ancestries, but showing a larger genetic effect size in analyses restricted to individuals of European descent, which is listed here.

The average combined sample size (across studies for cases and controls) in the positive meta-analyses was 3,378 subjects (median; IQR = 2,410–5,419), drawn from an average of 6 independent case–control samples (median; IQR = 5–8; Supplementary Table 2). The average significant allelic summary OR for positive analyses was 1.23 (range: 1.11–1.52; range of P values from 0.048 to <0.0001), with a mean risk OR of 1.24 (range: 1.11–1.52) and a mean protective OR of 0.82 (range: 0.70–0.88). The corresponding average genotypic summary OR using either recessive or dominant models was slightly more pronounced for risk genotypes (1.37 (1.12–1.75)) and for protective genotypes (0.77 (0.57–0.87)) as compared to the allele-based analyses.

Five of the 16 implicated genes were located in one of the linkage areas suggested in the genome-wide linkage meta-analyses5: DRD2 (11q23.1), GABRB2 (5q34), DTNBP1 (6p22.3), IL1B (2q13) and COMT (22q11.21). With the exception of one gene (COMT), these were also located in the ‘top five’ linkage regions as suggested by Lewis et al. (the only top-five region that did not contain a SzGene positive variant was 3p25–p22).
Note, however, that many of the candidate genes were originally chosen for an association assessment because of their location in or near link- age regions, which needs to be taken into account when judging the positional candidacy of any of the positive findings. Ninety-four (80%) SNPs in 45 genes showed no significant asso- ciation with schizophrenia after all published case–control samples were meta-analyzed, either in the analyses combining all samples of all ancestries or across samples of European-only ancestry. The average sample size of the negative meta-analyses was 3,928 subjects (median; IQR = 2,335–56,695) drawn from an average of 6 samples (median; IQR = 4–9). These numbers were not significantly different from those of genes with ‘positive’ outcomes. However, the sample size needed to detect an allelic OR of ~1.23 (that is, the average observed here across all meta-analyses) at α = 0.05 and a disease prevalence of 1% with a power of 80% ranges from ~2,000 to 4,000 for allele frequencies between 0.5 and 0.1 (estimated using PBAT9 v3.5). This sample size requirement was met for 76 and 44 of the 94 negative meta-analyses, respectively. Thus, the lack of a nominally significant finding for some polymorphisms may reflect insufficient power. In addition, as only ~2 SNPs were studied on average per negative gene, more variants need to be assessed in order to definitely exclude these genes as schizophrenia risk factors. Forty-nine (42%) of the meta-analyzed variants showed significant between-study heterogeneity (P value <0.1 in the Q statistic) in the analyses combining all ancestral groups, and forty (34.5%) showed large between-study heterogeneity on the basis of the I2 metric (>50%; see Supplementary Table 4 online). However, with the typically limited numbers of studies, heterogeneity estimates and inferences have consid- erable uncertainty and should be interpreted cautiously. 
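The Q statistic and I2 metric discussed above can be illustrated with a short computation. The following is a minimal sketch, assuming the widely used DerSimonian-Laird estimator for the between-study variance (this section does not name the estimator actually used); the function name `random_effects_meta` and its inputs are hypothetical, with per-study log odds ratios and their variances supplied by the caller.

```python
import math

def random_effects_meta(log_ors, variances):
    """Pool per-study log odds ratios with a DerSimonian-Laird random-effects model.

    Hypothetical helper for illustration; returns the summary OR, its 95% CI,
    Cochran's Q, its degrees of freedom, I2 (in percent) and tau^2.
    """
    k = len(log_ors)
    # Fixed-effect (inverse-variance) weights and pooled estimate
    w = [1.0 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    df = k - 1
    # Method-of-moments estimate of the between-study variance tau^2
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I2: share of total variation attributed to heterogeneity, floored at 0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * y for wi, y in zip(w_star, log_ors)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "q": q,
        "df": df,
        "i2": i2,
        "tau2": tau2,
    }
```

A heterogeneity P value would then be the upper tail of a chi-square distribution with `df` degrees of freedom evaluated at `q` (for example via `scipy.stats.chi2.sf(q, df)`, if SciPy is available). With only four to nine samples per meta-analysis, as is typical here, both Q and I2 are imprecise, which is why the text urges caution.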
Only two of the positive associations had large estimates of between-study heterogeneity (MTHFR and SLC6A4; Table 1). Eight of the twenty-four positive meta-analyses included control populations showing violation of Hardy-Weinberg equilibrium (HWE), but only one result (in SLC6A4) became nonsignificant after exclusion of HWE-violating studies (Supplementary Table 3). Of the positive SNPs in the ‘all ancestries’ paradigm, seven became nonsignificant after the initial study was removed (Table 1). Of the 11 SNPs showing the strongest effects in the population of European ancestry, three were initially identified in a study limited to individuals of European ancestry; two of these (MTHFR rs1801131 and APOE ε2/3/4) remained significant when the initial study was excluded from only the European-ancestry studies, whereas the third (COMT rs737865) did not. The modified regression procedure modeling the logOR as a function of precision suggested that small studies yielded significantly (P < 0.1) larger effects than larger studies (a possible indication of publication bias or related biases) for DRD2 rs6277, GABRB2 rs1816071, GABRB2 rs1816072 and IL1B rs16944. Of all 118 meta-analyses, 16 had evidence of inclusion of a significant (P < 0.10) excess of studies with nominally statistically significant results (single studies with P < 0.05). However, none of these belonged to the group of positive findings, that is, those with a nominally significant summary OR. All 16 of these meta-analyses pertained to associations in which single studies presented significant results in one direction, followed by other studies showing significant results in the opposite direction, a situation suggestive of the Proteus phenomenon10. Among schizophrenia genetic association studies as a whole, there was a clear excess of studies with a statistically significant result (146 observed, versus 84.65 expected, P < 10−6).
However, this was limited to meta-analyses with nonstatistically significant results (O = 118, E = 59.44, P ≤ 10−6, compared with O = 28, E = 25.3, P = 0.55 for positive meta-analyses) and meta-analyses with large between-study het- erogeneity (O = 100, E = 33.4, P ≤ 10−6, compared with O = 46, E = 51.3, P = 0.80 for meta-analyses without large estimates of heterogeneity). A total of 56 meta-analyses on various putative schizophrenia sus- ceptibility loci had been published as of 30 April 2007 (Table 2 and Supplementary Table 5 online), altogether analyzing 75 SNPs in 30 dif- ferent genes. Ten of these variants were found to be positive in the analy- ses done for SzGene, 41 were negative, and 24 were not meta-analyzed here because of insufficient data (for example, fewer than four indepen- dent case–control samples, or more than two alleles; see Methods). Of the 16 genes found positive in SzGene, nine had previously been meta- analyzed, although in one (COMT), only negative variants had been analyzed. The most recent meta-analyses on the remaining eight genes (APOE, DRD2, DRD4, GRIN2B, IL1B, MTHFR, SLC6A4 and TPH1) all reported a positive association with schizophrenia (Table 2a). Five other genes (BDNF, DRD3, NRG1, DAOA and COMT) had been implicated by previous meta-analyses, but were found to have no significant asso- ciation in the default allelic contrasts in SzGene (Table 2b; note that although NRG1 did not show positive results in the data freeze used for the analyses in this paper, a more recent update of SzGene now lists one SNP (rs10503929) in NRG1 suggesting nominally significant asso- ciation; see the SzGene website for more details). Finally, seven of the positive loci in SzGene (DAO, DRD1, DTNBP1, GABRB2, HP, PLXNA2 and TP53) were, to the best of our knowledge, never analyzed in previous meta-analyses published before 30 April 2007. Meta-analyses published after that date were not considered for this comparison. 
Differences between the results of published meta-analyses and those obtained in SzGene can be ascribed to several potential causes. First, given the more up-to-date approach in SzGene, meta-analyses, on aver- age, were based on nearly six more individual case–control samples (on average representing ~2,400 more combined cases and controls) com- pared to published meta-analyses. Second, some observed differences can be attributed to different in- and exclusion criteria. For instance, several of the previous meta-analyses included family-based studies, non-English articles and, occasionally, samples containing a substantial proportion of individuals with other psychiatric illnesses. Less often, we observed discrepancies in the extraction and/or interpretation of study-level data (for example, for 366G/C in GRIN2B) in our analy- ses and those published by Li and He11. Note that despite the different results obtained for this particular variant, the association evidence for GRIN2B—based on results obtained in meta-analyses of other polymor- phisms—was also judged as ‘significant’ by Li and He, consistent with the conclusions reached here. Up until 30 April 2007, two GWA studies had been published12,13. The first report genotyped over 25,000 SNPs in 14,000 genes in 320 cases and 325 controls and found significant association with schizophrenia in individuals of European ancestry with variants in plexin A2 (PLXNA2), located on chromosome 1q32. A follow-up study14 in a Japanese sample found no association between the same PLXNA2 variants and schizo- phrenia. In our meta-analysis combining data from both studies, the C allele of SNP rs752016 showed a nominally significant protective effect across samples of all ancestries (OR = 0.82, 95% CI = 0.69–0.99), and a second SNP, rs841865, approached significance (OR = 0.84, 95% CI = 0.69–1.01). 
Despite these promising results, the combined sample nature genetics | volume 40 | number 7 | july 2008 829 © 20 08 N at ur e Pu bl is hi ng G ro u p h ttp :// w w w. n at ur e. co m /n at ur eg en et ic s analys is sizes for both meta-analyses were relatively small (2,333 and 2,344, respectively), and more data are needed to confirm these results. The second GWA study13, which tested over 400,000 SNPs in 178 cases and 144 controls, found that several variants in CSF2RA and neighboring IL3RA on chromosome Xp22.33 were associated with schizophrenia. However, the lack of enough independent case–control samples pre- cluded meta-analysis of these variants for SzGene. Table 1 and Supplementary Table 6 online show the results of apply- ing the Venice interim criteria15 to all 24 of the associations with nomi- nally statistically significant summary ORs. For criterion 1 (‘amount of evidence’), 19 were graded as ‘A’ and 5 as ‘B’; for criterion 2 (‘con- sistency of replication’), 17 were graded as ‘A’, 5 as ‘B’, and 2 as ‘C’; and for criterion 3 (‘protection from bias’) 9 were graded as ‘A’ and 15 as ‘C’. The main reasons for low grades in the last criterion were (i) the presence of a small summary OR (<1.15) that can easily be dissipated even by relatively small biases in meta-analyses of published data (n = 6), and/or (ii) loss of significance after excluding the initial study (n = 6). Overall, four associations (DRD1 rs4532, DTNBP1 rs1011313, MTHFR rs1801131 and TPH1 rs1800532) graded ‘A’ across all three cri- teria, and—on the basis of these guidelines—can be considered to have ‘strong’ epidemiological credibility. The remaining 20 showed either a ‘modest’ (n = 4), or only a ‘weak’ (n = 16) degree of credibility (Table 1 and Supplementary Table 6). Conclusions This is the first systematic, comprehensive field synopsis of genetic association studies in schizophrenia assembled to date following cri- teria suggested by the HuGENet Road Map16. 
Our study combines unique and novel features that will generalize to knowledge-synthesis efforts for genetic associations in other common diseases. First, it uses a systematic and efficient search strategy of the applicable published literature. Up until 30 April 2007, we collected and extracted data from 1,179 publications reporting on 3,608 polymorphisms in 516 different genes. Second, it uses quantitative methods to derive summary effect estimates by means of meta-analysis, which we carried out for a total of 118 polymorphisms across 52 different genes. This nearly doubles the number of meta-analyses thus far published in the field. Here, 24 polymorphisms in 16 genes yielded nominally significant results, with average allelic risk ORs of ~1.24 and average protective ORs of ~0.82. Third, it systematically examines sources of biases and assigns a score for the epidemiologic credibility of the findings. According to the Venice guidelines for the assessment of cumulative evidence in genetic association studies15, meta-analyses in at least four ‘positive’ genes showed a ‘strong’ degree of epidemiological credibility (DRD1, DTNBP1, MTHFR and TPH1). Thus, on the basis of the current data, these genes seem to be the best contenders to contain genuine susceptibility alleles modifying disease risk within the whole domain of genetic epidemiology in schizophrenia. Fourth, our study combines information from published studies with recent results of genome-wide association analyses. To accommodate the expected avalanche of data to emerge from upcoming GWA analyses in schizophrenia, we have devised a step-wise protocol that makes efficient use of this vast body of ‘unbiased’ genotype data. Once publicly available, genotype data will be merged with the data already available from candidate gene studies included in SzGene.
Finally, the results of this comprehensive field synopsis (that is, all of the study-level data as well as the meta-analyses) are accessible in a searchable online database, SzGene, which is embedded in the Schizophrenia Research Forum.

Table 2 Previously published meta-analysis results compared to meta-analyses in SzGene

(a) Genes showing significant summary ORs (‘positive’ genes) in SzGene

| Gene | Study | Polymorphism | Prior meta sample size (number of samples) | SzGene sample size (number of samples) | Model | Meta OR (95% CI) | Het. Meta | SzGene OR (95% CI) | Het. SzGene |
|---|---|---|---|---|---|---|---|---|---|
| APOE | Xu, 2006 | APOE ε2/3/4 | 5,223 (11) | 4,202 (15) | ε4 vs. non-ε4, Caucasian | 1.23 (1.04–1.44) | 0.59 | 1.17 (1.01–1.35) | 0.615 |
| DRD2 | Glatt, 2006 | rs1801028 | 9,070 (27) | 9,335 (28) | Cys vs. Ser | 1.36 (1.09–1.70) | 0.345 | 1.41 (1.16–1.72) | 0.47 |
| DRD4 | Jonsson, 2003 | rs1800955 | 1,459 (3) | 3,988 (6) | C vs. T [a] | 1.22 (1.04–1.42) | n.g. | 1.15 (1.05–1.26) | 0.77 |
| GRIN2B | Li, 2007 | rs1019385 | n.g. (2) | 968 (4) | T vs. G [b] | 0.71 (0.56–0.9) | 0.38 | 0.69 (0.54–0.88) | 0.15 |
| IL1B | Shirts, 2006 | rs16944 | 2,111 (5) | 2,121 (5) | C vs. T, Caucasian [b] | 1.24 (1.09–1.41) | n.g. | 1.28 (1.08–1.54) | 0.25 |
| MTHFR | Gilbody, 2007 | rs1801133 | 6,125 (12) | 7,420 (14) | T vs. C | 1.18 (1.06–1.32) | n.g. | 1.16 (1.05–1.30) | 0.0061 |
| SLC6A4 | Fan, 2005 | 5-HTTVNTR | 4,546 (12) [d] | 5,023 (11) | 12 vs. non-12 [c] | 1.24 (1.11–1.38) | 0.28 | 1.16 (1.01–1.35) | 0.028 |
| TPH1 | Li, 2006 | rs1800532 | n.g. (5) | 2,097 (5) | A vs. C | 1.24 (1.1–1.41) | 0.34 | 1.31 (1.15–1.51) | 0.33 |

(b) Genes reported to show significant effects in previous meta-analyses, but not in SzGene using the default analyses

| Gene | Study | Polymorphism | Prior meta sample size (number of samples) | SzGene sample size (number of samples) | Model | Meta OR (95% CI) | Het. Meta | SzGene OR (95% CI) | Het. SzGene |
|---|---|---|---|---|---|---|---|---|---|
| BDNF | Zintzaras, 2007 | 270C/T | 1,866 (5) | 4,091 (8) | T vs. C | 1.63 (1.01–2.65) | 0.07 | 1.29 (0.94–1.78) | 0.11 |
| DRD3 | Jonsson, 2004 | rs6280 | 11,066 (44) | 13,693 (55) | (GG + AA) vs. AG [a] | 1.08 (1.00–1.16) | n.g. | 0.96 (0.91–1.03) [e] | 0.01 |
| NRG1 [f] | Munafo, 2006 | Haplotype | 10,595 (14) | 12,154 (18) | Haplotype | P value = 0.02 | 0.2 | 1.06 (0.95–1.19) | 0.0009 |
| DAOA | Li, 2006 | rs1421292 | n.g. (3) | 5,591 (6) | A vs. T | 0.77 (0.66–0.89) | 0.9 | 0.98 (0.85–1.12) | 0.02 |
| COMT | Sand, 2006 | rs4680 | n.a. (5) | 19,558 (40) | G vs. A [b] | 1.19 (1.0–1.4) | n.g. | 0.99 (0.94–1.05) | 0.059 |

Only the most recently published meta-analyses are presented here. For a more detailed presentation of all meta-analyses published before 30 April 2007, see Supplementary Table 4. The random-effects model was the default model used. Note that although 16 positive genes were found in SzGene, only 8 of these have had the same positive polymorphisms analyzed previously. Het. Meta, heterogeneity P value reported in published meta-analysis. Het. SzGene, heterogeneity based on the Q statistic across crude ORs calculated for each study. P < 0.1 is considered to indicate significant evidence of between-study heterogeneity (see Methods). n.g., number of subjects not given in meta-analysis; n.a., number of subjects not applicable, as only family-based studies were used in this meta-analysis.
[a] Fixed-effects model used in prior meta-analysis; random-effects model used for SzGene analyses.
[b] Model was not specified; random-effects model was used for SzGene analyses.
[c] SzGene meta-analysis compares the 12 allele frequency to the 10 allele frequency; minor allele frequencies of other alleles in this polymorphism are negligible.
[d] Fan, 2005 includes two studies not published in English, hence the larger number of samples.
[e] SzGene default analysis using allele contrasts produced the nonsignificant result for the DRD3 rs6280 polymorphism listed in the table. When using the same genotype contrast as described previously (dominant model), we observe a nominally significant finding (OR = 1.09 (95% CI = 1.01–1.17)), in agreement with the meta-analysis by Jonsson, 2004.
[f] The significant findings of this study were based on haplotype comparisons, whereas the SzGene analyses are based on allelic contrasts (C vs. T) of a single variant (SNP8NRG221533). The summary OR of this SNP was reported as 1.04 (0.99–1.19) by Munafo et al., and is therefore very similar to the OR calculated on the substantially larger combined sample in SzGene, although the latter does not reach significance. Note that although none of the variants located in NRG1 showed nominally significant results with data published up until 30 April 2007, a more current update of SzGene now lists one SNP (rs10503929) as ‘positive’; see the online database for details.

The summary ORs obtained here for schizophrenia agree well with those found by our group in a similar project on genetic association studies of Alzheimer’s disease6, as well as with those found in large-scale meta-analyses of other genetically complex diseases17,18, and, most recently, in high-density GWA analyses carried out on several common diseases19. Notably, we identified significant risk-modifying effects in seven genes (DAO, DRD1, DTNBP1, GABRB2, HP, PLXNA2 and TP53) for which, to the best of our knowledge, no previous meta-analyses had been published.

On the basis of the overall degree of epidemiologic credibility, the most notable findings to emerge from our systematic meta-analyses were with genetic variants located in four genes (DRD1, DTNBP1, MTHFR and TPH1). DRD1, which maps to chromosome 5q35, encodes dopamine receptor 1, the most abundant dopamine receptor in the central nervous system. This receptor is thought to have a role in regulation of cognitive functions in the prefrontal cortex, possibly through interaction with NMDA-mediated neurotransmission20, and to be involved in the action of clozapine, one of the atypical antipsychotic drugs used for the treatment of schizophrenia21.
The potential functional role of the associated SNP rs4532 remains elusive, although its location in the 5′ UTR 48 bp upstream of the transcription start site suggests that it may be involved in the regulation of gene or protein expression. DTNBP1 maps to chromosome 6p22 and encodes dystrobrevin binding protein 1 (also known as dysbindin), which is expressed in many tissues including the brain (reviewed in ref. 22). The functional role of the associated SNP rs1011313 (P1325) remains elusive. However, because individuals with schizophrenia often have lower DTNBP1 mRNA expression in some brain regions, it is possible that potential risk-modifying DTNBP1 variants may be involved in gene or protein expression23. MTHFR maps to chromosome 1p36.3, a chromosomal region that has not been linked to schizophrenia. It encodes 5,10-methylenetetrahydrofolate reductase, which catalyzes the reduction of 5,10-methylenetetrahydrofolate to 5-methyltetrahydrofolate. In turn, 5-methyltetrahydrofolate serves as a carbon donor for homocysteine metabolism and is involved in other intracellular methylation processes, which have long been suggested to be involved in schizophrenia pathogenesis on a number of levels (reviewed in ref. 24). In the SzGene meta-analyses, two MTHFR variants are associated with schizophrenia risk, rs1801131 (encoding A1298C) and rs1801133 (encoding C677T), but only the former showed strong epidemiological credibility, whereas the latter showed large between-study heterogeneity (I2 = 55%). These two variants show a low degree of linkage disequilibrium (r2 = 0.18 based on current CEU HapMap data). Both result in amino-acid substitutions that have been suggested to reduce MTHFR enzyme activity25,26, and were recently linked to the occurrence of negative and positive symptoms in individuals with schizophrenia, effects correlated with serum folate levels24.
Finally, TPH1 maps to chromosome 11p15–14 and encodes tryptophan hydroxylase 1, the rate-limiting enzyme in the biosynthesis of serotonin. However, the associated SNP rs1800532 (218A/C) is located in intron 7 of the gene and has no previously described functional effect.

Despite our comprehensive and systematic approach to the schizophrenia literature, the outcomes of our study should be evaluated with certain limitations in mind. First, although we carried out a thorough search using a number of different strategies to identify published studies eligible for inclusion in SzGene, we cannot exclude the possibility that some studies were overlooked. Second, our project explicitly excluded results from association studies existing only as abstracts, or those not published in English. This may have caused a disproportionate exclusion of negative data resulting in publication bias, although we did not detect any evidence for such a bias in most meta-analyses with a positive outcome. Third, default SzGene meta-analyses (for example, those presented online) are based on allelic contrasts only. We chose this model because a large number of publications do not provide genotype distributions. In addition, it allowed us to condense the genetic association data into one statistic rather than test all possible transmission models. As the underlying mode of inheritance is unknown for most complex disease genes, we consider our approach to be a reasonable compromise between loss in power and practicality. Fourth, the use of study-level allele and genotype distributions precludes more sophisticated analyses incorporating key covariates (such as age, gender, or potential gene–gene and gene–environment interactions), for which raw genotype data would be required. However, unless the same alleles confer opposite effects depending on the covariate, failure to account for interactions is not expected to mask any true underlying association, but will merely reduce power.
Fifth, it is not possible at present to process haplotype-based genotype data in the routine SzGene meta-analyses. This may lead to missing risk effects conferred by haplotypes that are only poorly tagged by individual SNPs (as has been suggested for NRG1 (ref. 27)). Sixth, there likely exists considerable and difficult-to-detect heterogeneity across cases classified as schizophrenia, as there is as yet no ‘definite’ diagnosis (neuropathologically defined, for example, as for Alzheimer’s disease or Parkinson’s disease) or specific laboratory test (as there is for hypertension or diabetes) for the disease. Furthermore, inter-rater variability of a clinical schizophrenia diagnosis is relatively high (ref. 28). This situation is potentially aggravated by the fact that many studies (~6% of all studies included here) included individuals with schizoaffective disorder in their case samples, possibly further increasing heterogeneity and decreasing power. Seventh, we emphasize that the number of ‘true’ associations may be smaller than the number of nominally significant findings identified here (refs. 29,30). This may have a number of causes, including multiple testing, linkage disequilibrium among associated variants, undetected publication or other reporting biases, and study-level technical artifacts that may have gone unnoticed. Most of the positive variants did not reach very high levels of statistical significance, and those with modest P values should be regarded cautiously even if there is no obvious between-study heterogeneity and no demonstrable potential for bias. Finally, the epidemiological grading is based on interim criteria that have been created by consensus among a large number of experts, but which need further prospective validation of their performance. Protection from bias in particular is very difficult to rate, as latent bias is always possible and no test can have very high sensitivity and specificity for all types of possible bias.
Thus, all results in this paper and in the ongoing analyses presented on the SzGene website should be interpreted conservatively until more studies are undertaken and possible molecular mechanisms underlying the putative risk-modifying effects have been evaluated and confirmed.

Despite these caveats, our project represents the first and only comprehensive and systematic assessment of the current status of genetic epidemiology research in schizophrenia, substantially extending the existing literature. The putative genetic risk factors emerging from our meta-analyses (summarized on a publicly available and regularly updated website) provide an extensive and quantitative summary of the most promising schizophrenia candidate susceptibility genes known to date. The approach presented here can be easily adapted to genetic association studies of other common diseases of public health significance. With the emergence of GWA studies for various diseases, systematic approaches are urgently needed to develop a credible knowledge base for the genetic architecture of human diseases, an essential prerequisite for using genetic information to improve health and prevent disease in the coming decades.

METHODS

Literature searches. Inclusion criteria. Studies included in SzGene must satisfy three criteria. First, they must evaluate the association between a polymorphic genetic variant (one with a minor allele frequency ≥0.01 in the general population) and schizophrenia. Although we included association studies using a family-based approach in the qualitative gene summary pages, their genotype distributions are not listed nor are they included in any of the statistical analyses, because family-based studies often do not report sufficient data to reconstruct crude association odds ratios and their confidence intervals.
We excluded studies examining phenotypic variables among schizophrenic subjects only (that is, those without ‘healthy’ controls) and included studies of microsatellite markers only if the markers were located within or near coding regions. We considered studies on polymorphisms with three or more alleles for meta-analysis if the studies consistently reported genotype distributions using the same nomenclature (for example, the APOE ε2/3/4 polymorphism), or if generally only one allele was compared to other alleles or if only two alleles showed frequencies ≥0.01 (for example, the SLC6A3 3′-UTR VNTR alleles 9 and 10). We included genetic variants with a complex allelic architecture (for example, within the NAT1 locus on chromosome 8p22) in the qualitative gene summaries but generally did not consider them for meta-analysis.

Second, studies must be published in a peer-reviewed journal. Information on whether or not articles are peer-reviewed before publication can generally be obtained from the website of the journal or directly from the publisher. This criterion specifically excludes studies reported in abstracts; for example, those presented at scientific meetings.

Third, studies must be published in English. Our literature searches through 30 April 2007 showed that no more than 5% of all PubMed-indexed schizophrenia association studies are published in a language other than English (and the same or largely overlapping sets of data from many of these papers actually later appear in an English language journal), suggesting that exclusion of these studies probably did not have a significant impact on the meta-analysis results.

Search strategies. To identify potential SzGene association studies, we searched PubMed using the search term ‘schizophreni* AND associat*’. This search yielded approximately 12,000 articles published before August 2006, which we then screened for eligibility using the title, abstract or the full text, as necessary.
Beginning in August 2006, we have conducted weekly searches using the keyword ‘schizophreni*’. These updates yield ~300 articles per month and result in the addition of up to five new studies to SzGene per week. In addition, we regularly search the bibliographies of included publications and the tables of contents of journals in genetics and psychiatry for eligible studies. For the purpose of the analyses presented herein, the database content was ‘frozen’ on 30 April 2007, when it included 1,178 independent articles. Because of the regular and ongoing updates of SzGene online, papers published after that date are included on the database website. Thus, some results in this manuscript may differ from those available online.

Data management. Demographic variables. Full-length copies of all papers eligible for inclusion in SzGene are saved in an offline database located at Massachusetts General Hospital (MGH). Each study entry consists of the name of the first author, year of publication and PubMed identification number, along with key population-specific details extracted from each study, such as ascertainment design (family-based or case-control), ancestral background and population (country) of origin, sample source (clinic-, population-, or community-based), the number of cases with gender ratio, age at onset, age at examination, method of diagnosis (see below), the number of controls with gender ratio and age at examination, and the reported study results. The criteria used to arrive at a clinical diagnosis of schizophrenia in the 30 April 2007 data freeze were DSM-III, DSM-IIIR and DSM-IV (Diagnostic and Statistical Manual of Mental Disorders), RDC (Research Diagnostic Criteria), and ICD9 and ICD10 (International Statistical Classification of Diseases and Related Health Problems), although ~5% of all studies made no reference to their diagnostic criteria.
We excluded studies if more than 10% of the diagnoses were something other than schizophrenia or schizoaffective disorder. Whenever a sample contained individuals with both schizophrenia and schizoaffective disorder, we used only the population and genotype information from the individuals with schizophrenia, unless the data could not be separated. We first entered all data into the offline database, and then double-checked all entries against the original publications before meta-analysis and upload to the SzGene website.

Genotype and allele distributions. Whenever possible, we used NCBI dbSNP identifiers (‘rs numbers’) to designate polymorphism identities throughout the database. If these were not specified in the articles, we attempted to resolve rs numbers using bioinformatics tools such as NCBI BLAST. If an rs number could not be unequivocally determined, we generally adopted the most commonly used nomenclature of the primary publications. Genotype distributions are listed for each polymorphism, with the minor allele (based on estimates in healthy controls) first, as given in the original publication. Whenever genotype distributions were not presented in the publication, we calculated them from reported allele frequencies and sample sizes (assuming no deviations from HWE unless reported otherwise in the original paper). We generally contacted first and last authors of studies with missing genotype data twice via e-mail to ask them to directly supply genotype and allele distributions before labeling studies as ‘no data provided’. Although this substantially increased the genotype data included in SzGene and the meta-analyses, approximately 8% of all genotypes remained unavailable.

Duplicate publications. In many cases, authors reported the results of association analyses of a polymorphism in the same or largely overlapping samples in separately published articles.
Where such overlap was specified by the authors or suspected overlap was confirmed by the authors, we typically designated the smaller sample as ‘overlaps with [Author, Year]’ and entered the result as ‘n.a.’. No genotype numbers were entered for the overlapping sample and the smaller sample was not used in the meta-analysis. When the smaller sample was also the initial sample tested for a particular meta-analyzed polymorphism, the genotype numbers from the initial report were retained as such (to allow meta-analyses after exclusion of the initial study, see below), and only the remainder of the genotype numbers were used for the largest follow-up study on the same or overlapping sample.

Large-scale association studies. Recent advances in genotyping technology have allowed genome-wide association (GWA) testing (ref. 19). The sheer scale of these studies makes their inclusion in SzGene a daunting and computationally demanding task. We devised the step-wise protocol described below to capture the most relevant genetic information from GWA studies without having to include each data point. Stage I has already been implemented and stages II and III will be implemented as soon as the genotype data from individual GWA studies become publicly available. Note that stage II has already been implemented into a related database project on Alzheimer’s disease genetics, where genotype data from one of three published GWA studies are publicly available. The results of these analyses can be found on the AlzGene website.

Stage I represents the inclusion of genes and polymorphisms featured by the authors of the GWA study, usually because they show some degree of genetic association after completion of all analyses, for example, correction for multiple comparisons and/or replication in multiple independent samples.
These data represent the core findings of each GWA study and their inclusion is straightforward, as the genotype distributions of these genes or markers are usually readily available in the original publication. Although this stage has already been implemented in SzGene, only two GWA studies of schizophrenia had been published by 30 April 2007 (refs. 12,13).

In stage II, we will also make use of ‘nonfeatured’ genotype distributions, that is, polymorphisms not identified to be associated with schizophrenia in the original publications. We will add large-scale (GWA) genotype data for polymorphisms already included in SzGene and recalculate the meta-analyses for SNPs with genotype data in at least four independent case–control samples. Note that if previously proposed candidate gene effects are not identified among the top GWA hits, this does not necessarily mean that such effects do not exist. Rather, it could reflect insufficient power of the individual GWA studies (refs. 12,13) given their rather modest sample sizes (645 and 322 combined cases and controls in the studies published until 30 April 2007, respectively) and the generally more stringent thresholds for reaching experiment-wide statistical significance in the context of GWA testing.

Stage III applies to GWA studies only. When genotype distributions become publicly available for multiple GWA scans, we will carry out systematic meta-analyses for all variants that overlap in at least four independent case–control samples. Only variants showing significant summary ORs will be shown on the SzGene website. The threshold for declaring nominal statistical significance in this context will be more stringent than for meta-analyses of individual candidate polymorphisms, because of the large number of tests performed. Procedures for implementing this stage and the definition of appropriate threshold criteria are currently being developed by our group and by others (ref. 31).

Statistical analyses. Statistical analyses for this manuscript were done in Statistical Analysis Software (SAS), version 9.1.3, and Intercooled STATA, version 8.2.

Meta-analyses. For all variants with minor allele frequencies ≥0.01 in ‘healthy’ controls and with case–control genotype data available from four or more samples, we calculated crude study-specific ORs and 95% confidence intervals (CIs) for each study using allelic contrasts (minor versus major allele). We chose a minimum of four samples, so as to have at least three data sets for meta-analysis after exclusion of the initial study for each polymorphism. In addition, we feel that with fewer data sets, the replication process is probably still too preliminary to allow any further-reaching conclusions. We obtained summary ORs and 95% CIs with the DerSimonian and Laird random-effects model (ref. 32), which utilizes weights that incorporate both the within-study and between-study variance. This procedure was first done on all studies regardless of the ancestry of the individuals studied (‘All studies’ on the meta-analysis graphs; Supplementary Fig. 2). Summary ORs and 95% CIs were also calculated for studies of individuals of European ancestry if three or more such studies existed (‘All Caucasian studies’). Generally, too few samples of non-European ancestry existed to allow meaningful meta-analyses on populations of non-European ancestry. For the analyses of this manuscript, we also carried out meta-analyses on genotype contrasts using recessive and dominant models. Note that these analyses are not available online for SzGene, for which we only consider allelic contrasts.
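The allelic-contrast meta-analysis described above can be illustrated with a minimal Python sketch. It is not the SAS or STATA code used for SzGene; the function names and the allele counts are hypothetical, the variance of the log OR is the usual sum of reciprocal cell counts, and all four allele counts are assumed to be non-zero.

```python
import math

def allelic_or(case_minor, case_major, ctrl_minor, ctrl_major):
    """Return (log OR, variance of log OR) for minor vs. major allele counts."""
    log_or = math.log((case_minor * ctrl_major) / (case_major * ctrl_minor))
    variance = 1 / case_minor + 1 / case_major + 1 / ctrl_minor + 1 / ctrl_major
    return log_or, variance

def dersimonian_laird(log_ors, variances):
    """Random-effects summary OR, its 95% CI, and the between-study variance."""
    k = len(log_ors)
    if k < 2:
        raise ValueError("need at least two studies")
    w = [1 / v for v in variances]                    # within-study (fixed-effect) weights
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                # between-study variance, floored at 0
    w_re = [1 / (v + tau2) for v in variances]        # weights using both variance components
    pooled = sum(wi * y for wi, y in zip(w_re, log_ors)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return (math.exp(pooled), math.exp(pooled - 1.96 * se),
            math.exp(pooled + 1.96 * se), tau2)

# Hypothetical allele counts per study:
# (case minor, case major, control minor, control major)
studies = [(120, 280, 100, 300), (90, 310, 95, 305),
           (150, 350, 120, 380), (60, 140, 55, 145)]
log_ors, variances = zip(*(allelic_or(*s) for s in studies))
summary_or, ci_low, ci_high, tau2 = dersimonian_laird(log_ors, variances)
print(f"summary OR = {summary_or:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effect weights, so both models give the same summary OR.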
We assessed between-study heterogeneity by calculating the Q statistic, a test for heterogeneity among the study-specific ORs, which is distributed approximately as χ² with k – 1 degrees of freedom (for k studies). Furthermore, we also calculated the I² heterogeneity metric. In contrast to Q, which cannot be compared across meta-analyses with different numbers of studies, I² is comparable regardless of the number of studies meta-analyzed. It is estimated as ((Q – (k – 1))/Q) × 100 and takes values from 0% to 100% that show the extent of the heterogeneity that is beyond chance. For Q < k – 1, I² is set to 0%. Generally, values above 50% are considered to represent large heterogeneity (refs. 33,34). When there are only a few studies, both Q and I² carry considerable uncertainty and should be interpreted cautiously (ref. 35).

Irrespective of the presence or absence of heterogeneity, meta-analyses for SzGene were only calculated using random-effects models (with the exception of analyses conducted for the comparison with previously published meta-analyses, see below). Although random-effects analyses typically have less power than those based on fixed-effects models, they yield more realistic CIs. Fixed effects would, by definition, be inappropriate in the presence of heterogeneity, whereas fixed and random effects coincide in the absence of between-study heterogeneity.

Sensitivity analyses and bias assessment. Sensitivity analyses included calculation of summary ORs and 95% CIs for all studies excluding the initial report (‘All excluding initial’) and after excluding studies violating HWE according to a χ² test at P ≤ 0.05 (‘All excluding HWE deviations’). We also routinely constructed funnel plots, which depict the ORs (on a logarithmic scale) against their standard error for each study (ref. 36); these are available online for all variants.
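The Q and I² heterogeneity statistics defined in this section can be sketched in a few lines of Python. This is an illustrative sketch only: the function name is hypothetical, and the inputs are assumed to be study-specific log ORs with their variances.

```python
def heterogeneity(log_ors, variances):
    """Return (Q, degrees of freedom, I2 in percent) for k study estimates."""
    k = len(log_ors)
    w = [1 / v for v in variances]                    # inverse-variance weights
    mean = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    q = sum(wi * (y - mean) ** 2 for wi, y in zip(w, log_ors))
    df = k - 1
    # I2 = ((Q - (k - 1)) / Q) x 100, set to 0% whenever Q does not exceed k - 1
    i2 = 0.0 if q <= df else (q - df) / q * 100
    return q, df, i2
```

Because I² is a ratio, it can be compared between meta-analyses with different numbers of studies, whereas Q has to be judged against a χ² distribution with the matching degrees of freedom.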
For the analyses in this manuscript, we also carried out a modified regression procedure recently proposed (ref. 37), which models the log OR as a function of study precision, for all variants showing significant summary ORs in the random-effects meta-analyses. The modified regression test has an appropriate type I error at the P = 0.10 threshold. When larger studies show more conservative effects than smaller studies, this may be an indication that publication or other reporting biases are operating, although genuine diversity in the results of small versus larger studies is also possible (ref. 38). Therefore, we carried out an additional diagnostic test that aims to detect whether there was an excess of statistically significant single studies (ref. 39). This test evaluated whether the number of studies with statistically significant results at a given threshold (traditionally P < 0.05) is beyond what would be expected on the basis of power calculations informed by plausible effect sizes (for example, the summary effect seen in each meta-analysis) and the sample size and allele frequency of each study. The test was applied to each of the meta-analyses and also across the whole domain of all meta-analyses as well as subgroups thereof (meta-analyses with significant results versus nonsignificant results and meta-analyses with I² > 50% versus those without large estimates of heterogeneity).

Grading the epidemiological credibility of significant associations. The online version of SzGene maintains a continuously updated list of associations that have been evaluated in meta-analyses and yield statistically significant results (P < 0.05) in the main analysis of all studies or including only studies on individuals of European descent (‘Top results’). For all ‘top results’, we applied a grading system for the strength of the epidemiological evidence that has been recently developed by the Human Genome Epidemiology Network (HuGENet).
The grading was done here independently by two investigators. Details of the grading system are published elsewhere (ref. 15). Briefly, each meta-analyzed association is graded on the basis of the amount of evidence, consistency of replication, and protection from bias. For amount of evidence, we assign the grade ‘A’ when the total number of minor alleles of cases and controls combined in the meta-analyses exceeds 1,000, ‘B’ when it is between 100 and 1,000, and ‘C’ when it is less than 100. For replication and consistency, we assign the grade ‘A’ for I² point estimates <25%, ‘B’ for I² values of 25–50%, and ‘C’ for I² values >50%.

For protection from bias, the guidelines propose consideration of various potential sources of bias, including errors in phenotypes, genotypes, confounding (population stratification) and errors or biases at the meta-analysis level (publication and other selection biases). A grade A implies that there is probably no bias that can affect the presence of the association, grade B that there is no demonstrable bias but important information is missing for its appraisal, and grade C that there is evidence for potential or clear bias that can invalidate the association. Errors and biases are also considered in the framework of the observed summary OR. Whenever the summary OR deviates less than 1.15-fold from the null in meta-analyses based on published data, we acknowledge that occult publication and selective reporting biases alone may invalidate the association, regardless of the presence or absence of other biases, and therefore assign a grade of C. When the summary OR deviates more than 1.15-fold from the null, we assign a grade of C when the modified regression test or excess test suggest the possibility of publication bias or significance-chasing bias or when the association is no longer nominally statistically significant upon exclusion of the initial study or studies violating HWE.
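The numerical parts of these grading rules, together with the overall rating rule stated at the end of this subsection (three A grades are ‘strong’, any C grade is ‘weak’, otherwise ‘moderate’), can be written down as a short Python sketch. All function names are hypothetical, the treatment of the exact boundary values (1,000 minor alleles, I² of exactly 25% or 50%) is an assumption, and the bias check covers only the conditions that force a grade of C; choosing between A and B for protection from bias remains a matter of expert judgment.

```python
def amount_grade(total_minor_alleles):
    """Amount of evidence: A above 1,000 minor alleles, B for 100-1,000, C below 100."""
    if total_minor_alleles > 1000:
        return "A"
    if total_minor_alleles >= 100:
        return "B"
    return "C"

def consistency_grade(i2_percent):
    """Replication consistency: A for I2 < 25%, B for 25-50%, C for > 50%."""
    if i2_percent < 25:
        return "A"
    if i2_percent <= 50:
        return "B"
    return "C"

def bias_grade_forced_c(summary_or, bias_test_flagged, loses_significance):
    """True when the rules above force a C for protection from bias."""
    fold_from_null = max(summary_or, 1 / summary_or)
    return fold_from_null < 1.15 or bias_test_flagged or loses_significance

def overall_credibility(amount, consistency, bias):
    """Combine the three letter grades into strong / moderate / weak."""
    grades = (amount, consistency, bias)
    if "C" in grades:
        return "weak"
    if grades == ("A", "A", "A"):
        return "strong"
    return "moderate"
```

Under these rules an I² point estimate of 55%, for example, maps to a consistency grade of C and therefore to ‘weak’ overall credibility, whatever the other two grades are.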
Other areas that were considered but that did not result in a change of grading for protection from bias were potential errors in phenotyping and genotyping, which may affect the magnitude but not the presence of an effect in this field (if anything, nondifferential misclassification would tend to dilute the strength of an association). Potential confounding from population stratification was also considered to have a similar ‘diluting’ impact provided that at least self-reported ancestral descent was taken into account in the analyses. We rated overall epidemiological credibility as ‘strong’ if associations received three A grades, ‘moderate’ if they received at least one B grade but no C grades, and ‘weak’ if they received a C grade in any of the three assessment criteria (ref. 15).

Comparison with previous meta-analyses. In order to compare the results of SzGene with previous meta-analyses on schizophrenia genetic association studies, we identified and collected all applicable meta-analysis papers published before 30 April 2007 using a search strategy similar to that used for the individual studies (see above). For each meta-analysis study, we determined the genetic contrasts and models used (allelic, dominant or recessive; random or fixed effects). These analyses were then repeated using the more current SzGene data. In cases where details of the analysis methods were not specified in the published study, we used default SzGene analyses for comparison (allelic contrasts, random-effects model).

Online database structure. After completing the data entry, processing and analyses described above, we posted all study-specific variables, genotype data, and meta-analysis plots to a publicly available, online adaptation of the SzGene database using the same software and code as our recently developed database for Alzheimer’s disease (ref. 6).
The online database is hosted by the Schizophrenia Research Forum, a nonprofit, internet-based community portal dedicated to furthering collaboration among researchers to help in the search for causes, treatments, and understanding of schizophrenia.

URLs. SchizophreniaGene, http://www.szgene.org/; HuGENet, http://www.cdc.gov/genomics/hugenet/; Genetic Association Database, http://geneticassociationdb.nih.gov/; EMBASE, http://www.embase.com/; AlzGene, http://www.alzgene.org/.

Note: Supplementary information is available on the Nature Genetics website.

ACKNOWLEDGMENTS
Funding for this study was provided by the National Alliance on Research in Schizophrenia and Depression (to L.B.). F.K.K. was supported by a PENED training grant co-financed by E.U. European Social Fund and the Greek Ministry of Development, General Secretariat for Research and Technology. We are grateful to the Schizophrenia Research Forum for hosting SzGene on their website. In particular, we would like to thank A. Bumstead, H. Heimer, C. Knep and P. Noyes for the online adaptation of SzGene and many helpful discussions. We would further like to thank the members of the Scientific Advisory Board (currently including W. Byerly, G.D. Smith, J. Kennedy, D.F. Levinson and M. Owen) for their repeated review of the database and their helpful comments and suggestions during the development of this project.

AUTHOR CONTRIBUTIONS
This study was designed by L.B. (principal investigator). Literature searches, data entry and online curation of data were done by N.C.A. and S.B., with help from L.B. Analysis scripts were developed and written by M.B.M. and F.K.K., and analyses were done by N.C.A., F.K.K., J.P.A.I. and L.B. The manuscript was written by N.C.A. and L.B., with contributions from J.P.A.I., F.K.K., M.J.K. and R.E.T.
Published online at http://www.nature.com/naturegenetics/
Reprints and permissions information is available online at http://npg.nature.com/reprintsandpermissions/

1. Mueser, K.T. & McGurk, S.R. Schizophrenia. Lancet 363, 2063–2072 (2004).
2. Owen, M.J., Craddock, N. & O’Donovan, M.C. Schizophrenia: genes at last? Trends Genet. 21, 518–525 (2005).
3. Sullivan, P.F., Kendler, K.S. & Neale, M.C. Schizophrenia as a complex trait: evidence from a meta-analysis of twin studies. Arch. Gen. Psychiatry 60, 1187–1192 (2003).
4. Badner, J.A. & Gershon, E.S. Meta-analysis of whole-genome linkage scans of bipolar disorder and schizophrenia. Mol. Psychiatry 7, 405–411 (2002).
5. Lewis, C.M. et al. Genome scan meta-analysis of schizophrenia and bipolar disorder, part II: schizophrenia. Am. J. Hum. Genet. 73, 34–48 (2003).
6. Bertram, L., McQueen, M.B., Mullin, K., Blacker, D. & Tanzi, R.E. Systematic meta-analyses of Alzheimer disease genetic association studies: the AlzGene database. Nat. Genet. 39, 17–23 (2007).
7. Thakkinstian, A. et al. Systematic review and meta-analysis of the association between complement factor H Y402H polymorphisms and age-related macular degeneration. Hum. Mol. Genet. 15, 2784–2790 (2006).
8. Ioannidis, J.P., Ntzani, E.E. & Trikalinos, T.A. ‘Racial’ differences in genetic effects for complex diseases. Nat. Genet. 36, 1312–1318 (2004).
9. Lange, C., DeMeo, D., Silverman, E.K., Weiss, S.T. & Laird, N.M. PBAT: tools for family-based association studies. Am. J. Hum. Genet. 74, 367–369 (2004).
10. Ioannidis, J.P. & Trikalinos, T.A. Early extreme contradictory estimates may appear in published research: the Proteus phenomenon in molecular genetics research and randomized trials. J. Clin. Epidemiol. 58, 543–549 (2005).
11. Li, D. & He, L. Association study between the NMDA receptor 2B subunit gene (GRIN2B) and schizophrenia: a HuGE review and meta-analysis. Genet. Med. 9, 4–8 (2007).
12. Mah, S. et al. Identification of the semaphorin receptor PLXNA2 as a candidate for susceptibility to schizophrenia. Mol. Psychiatry 11, 471–478 (2006).
13. Lencz, T. et al. Converging evidence for a pseudoautosomal cytokine receptor gene locus in schizophrenia. Mol. Psychiatry 12, 572–580 (2007).
14. Fujii, T. et al. Failure to confirm an association between the PLXNA2 gene and schizophrenia in a Japanese population. Prog. Neuropsychopharmacol. Biol. Psychiatry 31, 873–877 (2007).
15. Ioannidis, J.P. et al. Assessment of cumulative evidence on genetic associations: interim guidelines. Int. J. Epidemiol. 37, 120–132 (2008); published online 26 September 2007.
16. Ioannidis, J.P. et al. A road map for efficient and reliable human genome epidemiology. Nat. Genet. 38, 3–5 (2006).
17. Ioannidis, J.P., Ntzani, E.E., Trikalinos, T.A. & Contopoulos-Ioannidis, D.G. Replication validity of genetic association studies. Nat. Genet. 29, 306–309 (2001).
18. Lohmueller, K.E., Pearce, C.L., Pike, M., Lander, E.S. & Hirschhorn, J.N. Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nat. Genet. 33, 177–182 (2003).
19. Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661–678 (2007).
20. Missale, C., Fiorentini, C., Busi, C., Collo, G. & Spano, P.F. The NMDA/D1 receptor complex as a new target in drug development. Curr. Top. Med. Chem. 6, 801–808 (2006).
21. Tauscher, J. et al. Equivalent occupancy of dopamine D1 and D2 receptors with clozapine: differentiation from other atypical antipsychotics. Am. J. Psychiatry 161, 1620–1625 (2004).
22. Williams, N.M., O’Donovan, M.C. & Owen, M.J. Is the dysbindin gene (DTNBP1) a susceptibility gene for schizophrenia? Schizophr. Bull. 31, 800–805 (2005).
23. Weickert, C.S. et al. Human dysbindin (DTNBP1) gene expression in normal brain and in schizophrenic prefrontal cortex and midbrain. Arch. Gen. Psychiatry 61, 544–555 (2004).
24. Roffman, J.L. et al. Contribution of methylenetetrahydrofolate reductase (MTHFR) polymorphisms to negative symptoms in schizophrenia. Biol. Psychiatry 63, 42–48 (2008).
25. Frosst, P. et al. A candidate genetic risk factor for vascular disease: a common mutation in methylenetetrahydrofolate reductase. Nat. Genet. 10, 111–113 (1995).
26. van der Put, N.M. et al. A second common mutation in the methylenetetrahydrofolate reductase gene: an additional risk factor for neural-tube defects? Am. J. Hum. Genet. 62, 1044–1051 (1998).
27. Munafo, M.R., Thiselton, D.L., Clark, T.G. & Flint, J. Association of the NRG1 gene and schizophrenia: a meta-analysis. Mol. Psychiatry 11, 539–546 (2006).
28. McGorry, P.D. et al. Spurious precision: procedural validity of diagnostic assessment in psychotic disorders. Am. J. Psychiatry 152, 220–223 (1995).
29. Wacholder, S., Chanock, S., Garcia-Closas, M., El Ghormli, L. & Rothman, N. Assessing the probability that a positive report is false: an approach for molecular epidemiology studies. J. Natl. Cancer Inst. 96, 434–442 (2004).
30. Ioannidis, J.P. Why most published research findings are false. PLoS Med. 2, e124 (2005).
31. Evangelou, E., Maraganore, D.M. & Ioannidis, J.P. Meta-analysis in genome-wide association datasets: strategies and application in Parkinson disease. PLoS ONE 2, e196 (2007).
32. DerSimonian, R. & Laird, N. Meta-analysis in clinical trials. Control. Clin. Trials 7, 177–188 (1986).
33. Higgins, J.P. & Thompson, S.G. Quantifying heterogeneity in a meta-analysis. Stat. Med. 21, 1539–1558 (2002).
34. Higgins, J.P., Thompson, S.G., Deeks, J.J. & Altman, D.G. Measuring inconsistency in meta-analyses. Br. Med. J. 327, 557–560 (2003).
35. Ioannidis, J.P., Patsopoulos, N. & Evangelou, E. Uncertainty in estimates of heterogeneity in meta-analysis. Br. Med. J. 335, 914–916 (2007).
36. Egger, M., Davey Smith, G., Schneider, M. & Minder, C. Bias in meta-analysis detected by a simple, graphical test. Br. Med. J. 315, 629–634 (1997).
37. Harbord, R.M., Egger, M. & Sterne, J.A. A modified test for small-study effects in meta-analyses of controlled trials with binary endpoints. Stat. Med. 25, 3443–3457 (2006).
38. Lau, J., Ioannidis, J.P., Terrin, N., Schmid, C.H. & Olkin, I. The case of the misleading funnel plot. Br. Med. J. 333, 597–600 (2006).
39. Ioannidis, J.P. & Trikalinos, T.A. An exploratory test for an excess of significant findings. Clin. Trials 4, 245–253 (2007).

Nature Genetics | volume 40 | number 7 | July 2008. © 2008 Nature Publishing Group, http://www.nature.com/naturegenetics