ANALYSIS
Systematic meta-analyses and field synopsis of 
genetic association studies in schizophrenia: the 
SzGene database
Nicole C Allen1, Sachin Bagade1, Matthew B McQueen2, John P A Ioannidis3–5, Fotini K Kavvoura3, 
Muin J Khoury6, Rudolph E Tanzi1 & Lars Bertram1
In an effort to pinpoint potential genetic risk factors for 
schizophrenia, research groups worldwide have published 
over 1,000 genetic association studies with largely 
inconsistent results. To facilitate the interpretation of these 
findings, we have created a regularly updated online database 
of all published genetic association studies for schizophrenia 
(‘SzGene’). For all polymorphisms having genotype data 
available in at least four independent case-control samples, 
we systematically carried out random-effects meta-analyses 
using allelic contrasts. Across 118 meta-analyses, a total 
of 24 genetic variants in 16 different genes (APOE, COMT, 
DAO, DRD1, DRD2, DRD4, DTNBP1, GABRB2, GRIN2B, 
HP, IL1B, MTHFR, PLXNA2, SLC6A4, TP53 and TPH1) 
showed nominally significant effects with average summary 
odds ratios of ~1.23. Seven of these variants had not been 
previously meta-analyzed. According to recently proposed 
criteria for the assessment of cumulative evidence in 
genetic association studies, four of the significant results 
can be characterized as showing ‘strong’ epidemiological 
credibility. Our project represents the first comprehensive 
online resource for systematically synthesized and graded 
evidence of genetic association studies in schizophrenia. As 
such, it could serve as a model for field synopses of genetic 
associations in other common and genetically complex 
disorders.
Schizophrenia is a common disorder caused by the interaction of mul-
tiple genetic and environmental factors, but its etiology has proved dif-
ficult to determine1. In particular, genetic research has been hindered 
by the largely nonmendelian patterns of familial transmission and the 
lack of disease-specific neuropathological features or biomarkers2. 
Although the heritability of schizophrenia is high (~80%), nongenetic 
factors likely also considerably modify disease risk3, further compli-
cating the identification of susceptibility genes. Genome-wide linkage 
analyses have identified several chromosomal regions thought to harbor 
schizophrenia genes, but only a few overlap across studies4,5. To identify 
the potential loci underlying these signals, well over 1,000 studies have 
been published claiming or refuting genetic association between puta-
tive schizophrenia genes and affection status, onset age and/or certain 
endophenotypes. Currently, about 150 genetic association studies are 
published each year, at increasing pace (Supplementary Fig. 1 online). 
Despite these efforts, no single gene or genetic variant has been estab-
lished as a bona fide schizophrenia susceptibility gene, at least not with 
the confidence accorded to other genes associated with susceptibility 
to complex disease, such as APOE in Alzheimer’s disease6 or CFH in 
macular degeneration7. For health care providers, researchers and the 
general public, the accumulating information is increasingly difficult to 
follow, evaluate and interpret.
To address this problem, we have collected and comprehensively 
catalogued all genetic association studies published in the field of 
schizophrenia. Furthermore, we subjected all polymorphisms with 
data available from at least four independent case-control samples to 
systematic meta-analyses. Detailed summaries of all association studies 
and meta-analyses have been posted on a regularly updated and publicly 
available online database, ‘SchizophreniaGene’ (‘SzGene’). In addition, 
we have applied interim guidelines developed by the Human Genome 
Epidemiology Network (HuGENet) to assess the epidemiological cred-
ibility of these associations. Our study is the first of its kind in schizo-
phrenia, and substantially facilitates the interpretation of findings in the 
quest for genuine genetic susceptibility factors of this disorder.
RESULTS
Literature searches
On 30 April 2007, the database content was frozen for the current analyses. 
At that time, 1,179 individual publications reporting on 3,608 genetic 
variants in 516 different genes were included in SzGene (after screening 
approximately 15,000 titles and abstracts). More than 90% (1,093) of these 
studies were published after 1995, and about half of those during the past 
three years (Supplementary Fig. 1). Both the average number of 
polymorphisms and the combined sample sizes studied per publication have 
steadily increased over the past 10 years, averaging 6 and 660, respectively, 
from 2003 to 2006, as compared to 2 and 380 between 1997 and 2002.

1Genetics and Aging Research Unit, MassGeneral Institute for 
Neurodegenerative Disease (MIND), Department of Neurology, 
Massachusetts General Hospital, Charlestown, Massachusetts 02129, 
USA. 2Institute for Behavioral Genetics, University of Colorado, Boulder, 
Colorado 80309, USA. 3Clinical and Molecular Epidemiology Unit, 
Department of Hygiene and Epidemiology, University of Ioannina School 
of Medicine, Ioannina 45110, Greece. 4Biomedical Research Institute, 
Foundation for Research and Technology Hellas, Ioannina 45110, Greece. 
5Department of Medicine, Tufts University School of Medicine, Boston, 
Massachusetts 02110, USA. 6National Office of Public Health Genomics, 
Centers for Disease Control and Prevention, Atlanta, Georgia 30341, USA. 
Correspondence should be addressed to L.B. 
(bertram@helix.mgh.harvard.edu). 
Published online 26 June 2008; doi:10.1038/ng.171
Nature Genetics | volume 40 | number 7 | July 2008 | 827
© 2008 Nature Publishing Group http://www.nature.com/naturegenetics

To determine the completeness of our search strategies, we compared 
the number of studies included in SzGene to those available in other 
literature databases (HuGENet, Genetic Association Database and 
EMBASE) on ten randomly chosen genes. For these, HuGENet listed 
27 studies, GAD 16, and EMBASE 33, whereas SzGene included 47 pub-
lications (Supplementary Table 1 online). This supports our literature 
search strategies as both comprehensive and specific.
Meta-analyses
Of the 3,608 polymorphisms included in the analyses herein, 118 variants 
in 52 genes had sufficient data to warrant meta-analysis (Supplementary 
Table 2 online). On average, these were based on 3,589 subjects (median; 
interquartile range (IQR) = 2,335–5,669) originating from 6 (median; 
IQR = 4–9) case–control samples. Twenty-four (20%) of the meta-analyzed 
variants showed nominally significant (P ≤ 0.05) summary 
ORs, and for convenience will be referred to as ‘positive’ SNPs (those 
with P > 0.05 are designated as ‘negative’).
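
The random-effects pooling of allelic contrasts described above can be illustrated with a minimal sketch. It assumes the DerSimonian-Laird estimator of between-study variance and Woolf's variance for the log OR; neither choice is stated in this section, so both should be read as assumptions. The function names and the allele counts are hypothetical and serve only to make the example runnable.

```python
import math
from statistics import NormalDist


def allelic_log_or(case_minor, case_major, ctrl_minor, ctrl_major):
    """Return (log OR, variance) for one study from its allele counts."""
    log_or = math.log((case_minor * ctrl_major) / (case_major * ctrl_minor))
    # Woolf's variance estimate of the log odds ratio (an assumption here).
    var = 1 / case_minor + 1 / case_major + 1 / ctrl_minor + 1 / ctrl_major
    return log_or, var


def random_effects_summary(studies):
    """Pool (log OR, variance) pairs with DerSimonian-Laird weights."""
    y = [s[0] for s in studies]
    v = [s[1] for s in studies]
    w = [1 / vi for vi in v]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)  # between-study variance
    w_re = [1 / (vi + tau2) for vi in v]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    p = 2 * (1 - NormalDist().cdf(abs(pooled / se)))
    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "p": p,
        "tau2": tau2,
    }


# Hypothetical counts: (case minor, case major, control minor, control major).
HYPOTHETICAL_COUNTS = [
    (220, 380, 180, 420),
    (150, 250, 140, 300),
    (310, 490, 280, 560),
    (95, 205, 90, 230),
]
summary = random_effects_summary(
    [allelic_log_or(*counts) for counts in HYPOTHETICAL_COUNTS]
)
print(summary)
```

Under this sketch, a variant would count as ‘positive’ when the returned P value is at most 0.05.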
Details of the meta-analyses for all polymorphisms showing significant 
summary ORs in either the ‘all ethnicities’ or the ‘Caucasian only’ (that 
is, of self-reported European ancestry) paradigms are summarized in 
Table 1 (as well as in Supplementary Table 3 and Supplementary Fig. 2 
online). A total of 24 variants within 16 genes (APOE, COMT, DAO, 
DRD1, DRD2, DRD4, DTNBP1, GABRB2, GRIN2B, HP, IL1B, MTHFR, 
PLXNA2, SLC6A4, TP53 and TPH1) yielded summary ORs suggesting 
a nominally significant increase or decrease in risk for schizophrenia. 
Seven meta-analyses showed nominal significance only across samples 
of European ancestry and not across the combination of samples of all 
ancestries. These results should be interpreted with caution, as ancestry-
specific effects are possible but probably uncommon8, and differences 
in statistical significance do not constitute proof of true differences in 
genetic effects across populations of diverse ancestral descent. Note 
that since the 30 April 2007 data freeze, new genotype data have been 
published for 8 of the 24 positive variants (in COMT, DRD2, GABRB2 
and IL1B; see SzGene website for details). However, none of the eight 
updated meta-analyses was changed appreciably, and all variants con-
tinue to show nominally significant results.
The average combined sample size (across studies for cases and 
controls) in the positive meta-analyses was 3,378 subjects (median; 
IQR = 2,410–5,419), drawn from an average of 6 independent case–
control samples (median; IQR = 5–8; Supplementary Table 2). The 
average significant allelic summary OR for positive analyses was 1.23 
(range: 1.11–1.52; range of P values from 0.048 to <0.0001), with a mean 
risk OR of 1.24 (range: 1.11–1.52) and a mean protective OR of 0.82 
(range: 0.70–0.88). The corresponding average genotypic summary OR 
using either recessive or dominant models was slightly more pronounced 
for risk genotypes (1.37 (1.12–1.75)) and for protective genotypes (0.77 
(0.57–0.87)) as compared to the allele-based analyses.

Table 1 Random-effects meta-analyses using allelic contrasts for polymorphisms showing significant summary ORs (as of 30 April 2007)

| Gene | Polymorphism | Model | Cases vs. controls (number of independent samples) | OR (95% CI)ᵃ | P value | Heterogeneity P valueᵇ | I² | Gradeᶜ |
|---|---|---|---|---|---|---|---|---|
| APOE | APOE (ε2/3/4) E4 vs. E3 | E4 vs. E3, Caucasianᵈ | 1,500 vs. 2,702 (15) | 1.16 (1.00–1.34) | 0.043 | 0.60 | 0 | B |
| COMT | rs165599 | G vs. A, all ancestries | 2,628 vs. 7,340 (6) | 1.11 (1.02–1.21) | 0.019 | 0.24 | 25 | C |
| COMT | rs737865 | C vs. T, Caucasianᵈ | 1,605 vs. 4,021 (3) | 1.13 (1.01–1.28) | 0.039 | 0.22 | 34 | C |
| DAO | rs4623951 | C vs. T, all ancestries | 1,509 vs. 1,521 (4) | 0.88 (0.79–0.98) | 0.026 | 0.85 | 0 | C |
| DRD1 | rs4532 (DRD1_48A/G) | G vs. A, all ancestries | 725 vs. 1,075 (5) | 1.18 (1.01–1.38) | 0.037 | 0.56 | 0 | A |
| DRD2 | rs1801028 (S311C) | G vs. C, Caucasianᵈ | 2,299 vs. 3,777 (15) | 1.52 (1.09–2.12) | 0.013 | 0.27 | 16 | B |
| DRD2 | rs6277 (P319P) | C vs. T, Caucasianᵈ | 473 vs. 896 (3) | 1.45 (1.21–1.73) | <0.00004 | 0.31 | 15 | C |
| DRD4 | rs1800955 (521T/C) | C vs. T, all ancestries | 2,002 vs. 1,986 (6) | 1.15 (1.05–1.26) | 0.003 | 0.77 | 0 | C |
| DRD4 | 120-bp TR | S vs. L, all ancestries | 1,236 vs. 1,199 (4) | 0.81 (0.70–0.94) | 0.005 | 0.36 | 7 | C |
| DTNBP1 | rs1011313 (P1325) | T vs. C, Caucasianᵈ | 2,696 vs. 2,849 (8) | 1.23 (1.07–1.40) | 0.003 | 0.59 | 0 | A |
| GABRB2 | rs1816072 | C vs. T, Caucasianᵈ | 1,129 vs. 995 (4) | 0.82 (0.72–0.93) | 0.002 | 0.54 | 0 | C |
| GABRB2 | rs1816071 | G vs. A, Caucasianᵈ | 1,133 vs. 993 (4) | 0.82 (0.72–0.93) | 0.002 | 0.69 | 0 | C |
| GABRB2 | rs194072 | C vs. T, Caucasianᵈ | 1,137 vs. 991 (4) | 0.83 (0.69–1.00) | 0.048 | 0.36 | 7 | B |
| GABRB2 | rs6556547 | T vs. G, Caucasianᵈ | 774 vs. 620 (3) | 0.70 (0.52–0.95) | 0.022 | 0.96 | 0 | B |
| GRIN2B | rs7301328 (366G/C) | G vs. C, all ancestries | 903 vs. 810 (4) | 1.16 (1.01–1.33) | 0.034 | 0.43 | 27 | C |
| GRIN2B | rs1019385 (200T/G) | G vs. T, all ancestries | 502 vs. 466 (4) | 1.45 (1.14–1.85) | 0.003 | 0.15 | 44 | C |
| HP | Hp1/2 | 1 vs. 2, all ancestries | 1,346 vs. 2,018 (6) | 0.88 (0.80–0.98) | 0.016 | 0.67 | 0 | C |
| IL1B | rs16944 (C511T) | T vs. C, Caucasianᵉ | 819 vs. 1,302 (5) | 0.78 (0.65–0.93) | 0.006 | 0.25 | 26 | C |
| MTHFR | rs1801133 (C677T) | T vs. C, all ancestries | 3,327 vs. 4,093 (14) | 1.16 (1.05–1.30) | 0.005 | 0.006 | 56 | C |
| MTHFR | rs1801131 (A1298C) | C vs. A, Caucasianᵉ | 1,211 vs. 1,729 (5) | 1.19 (1.07–1.34) | 0.002 | 0.73 | 0 | A |
| PLXNA2 | rs752016 | C vs. T, all ancestries | 1,122 vs. 1,211 (6) | 0.82 (0.69–0.99) | 0.037 | 0.19 | 33 | C |
| SLC6A4 | 5-HTTVNTR | 10 vs. 12, all ancestries | 2,335 vs. 2,688 (11) | 0.86 (0.74–0.99) | 0.036 | 0.03 | 50 | C |
| TP53 | rs1042522 | C vs. G, all ancestries | 1,418 vs. 1,410 (5) | 1.13 (1.01–1.26) | 0.029 | 0.50 | 0 | C |
| TPH1 | rs1800532 (218A/C) | A vs. C, all ancestries | 829 vs. 1,268 (5) | 1.31 (1.15–1.51) | <0.00008 | 0.33 | 13 | A |

Note that this table lists only the polymorphisms found to be nominally significantly associated with schizophrenia in SzGene. For a more detailed presentation of these positive polymorphisms, including genotypic analyses and results after the exclusion of HWE-deviating samples, see Supplementary Table 3. For a complete list of meta-analyses done in SzGene, see Supplementary Table 1. When nominally statistically significant results are obtained both in the analysis including all samples and in the analysis including only samples of European descent (usually referred to as ‘Caucasian’ in the original publications), only the analysis that has the largest genetic effect size (OR deviating the most from 1.00) is reported here.
ᵃ Summary ORs are based on random-effects allelic contrasts comparing minor and major alleles (based on frequencies in the control samples).
ᵇ Based on the Q statistic across crude ORs calculated for each study. P < 0.1 is considered to indicate significant evidence of between-study heterogeneity (see Methods).
ᶜ Degree of ‘epidemiological credibility’ based on the interim Venice guidelines (A, strong; B, modest; C, weak; see text and Supplementary Table 6 for more details).
ᵈ Nominally significant only in analyses restricted to samples of European descent.
ᵉ Nominally significant in analyses when combining samples of all ancestries, but showing a larger genetic effect size in analyses restricted to individuals of European descent, which is listed here.

Five of the 16 implicated genes were located in one of the linkage areas 
suggested in the genome-wide linkage meta-analyses5: DRD2 (11q23.1), 
GABRB2 (5q34), DTNBP1 (6p22.3), IL1B (2q13) and COMT (22q11.21). 
With the exception of one gene (COMT), these were also located in the 
‘top five’ linkage regions as suggested by Lewis et al. (the only top-five 
region that did not contain a SzGene positive variant was 3p25–p22). 
Note, however, that many of the candidate genes were originally chosen 
for an association assessment because of their location in or near link-
age regions, which needs to be taken into account when judging the 
positional candidacy of any of the positive findings.
Ninety-four (80%) SNPs in 45 genes showed no significant asso-
ciation with schizophrenia after all published case–control samples 
were meta-analyzed, either in the analyses combining all samples of 
all ancestries or across samples of European-only ancestry. The average 
sample size of the negative meta-analyses was 3,928 subjects (median; 
IQR = 2,335–56,695) drawn from an average of 6 samples (median; 
IQR = 4–9). These numbers were not significantly different from those 
of genes with ‘positive’ outcomes. However, the sample size needed to 
detect an allelic OR of ~1.23 (that is, the average observed here across all 
meta-analyses) at α = 0.05 and a disease prevalence of 1% with a power 
of 80% ranges from ~2,000 to 4,000 for allele frequencies between 0.5 
and 0.1 (estimated using PBAT9 v3.5). This sample size requirement was 
met for 76 and 44 of the 94 negative meta-analyses, respectively. Thus, 
the lack of a nominally significant finding for some polymorphisms may 
reflect insufficient power. In addition, as only ~2 SNPs were studied on 
average per negative gene, more variants need to be assessed in order to 
definitely exclude these genes as schizophrenia risk factors.
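
The dependence of the required sample size on allele frequency can be sketched with a standard normal-approximation formula for comparing two proportions, applied to allele counts with equal numbers of cases and controls. This is only an illustration under stated assumptions: it is not the PBAT calculation cited above, it ignores disease prevalence, and it should not be expected to reproduce the quoted figures exactly. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Allele frequency in cases implied by the control frequency and OR."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)


def total_subjects_needed(control_freq, odds_ratio, alpha=0.05, power=0.80):
    """Approximate total subjects (cases + controls, 1:1) for an allelic test."""
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    alleles_per_group = numerator / (p1 - p0) ** 2
    # Each subject contributes two alleles.
    subjects_per_group = math.ceil(alleles_per_group / 2)
    return 2 * subjects_per_group


for freq in (0.5, 0.1):
    print(freq, total_subjects_needed(freq, 1.23))
```

As in the text, the requirement grows as the allele becomes rarer, because the absolute difference in allele frequency between cases and controls shrinks for a fixed OR.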
Forty-nine (42%) of the meta-analyzed variants showed significant 
between-study heterogeneity (P value <0.1 in the Q statistic) in the 
analyses combining all ancestral groups, and forty (34.5%) showed 
large between-study heterogeneity on the basis of the I2 metric (>50%; 
see Supplementary Table 4 online). However, with the typically limited 
numbers of studies, heterogeneity estimates and inferences have consid-
erable uncertainty and should be interpreted cautiously. Only two of the 
positive associations had large estimates of between-study heterogeneity 
(MTHFR and SLC6A4; Table 1).
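
The two heterogeneity measures used above, the Q statistic (P < 0.1 taken as significant) and the I2 metric (>50% taken as large), can be computed from per-study log ORs and their variances. The sketch below assumes inverse-variance weights and uses a closed-form chi-square tail probability for integer degrees of freedom so that it needs only the standard library. The function names are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_i (x/2)^(i - 1/2) / Gamma(i + 1/2)
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        if i > 1:
            term *= half / (i - 0.5)
        total += math.exp(-half) * term
    return total


def heterogeneity(log_ors, variances):
    """Return Cochran's Q, its P value, and I2 (in percent)."""
    if len(log_ors) < 2:
        raise ValueError("at least two studies are required")
    w = [1 / v for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum(w)
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, log_ors))
    df = len(log_ors) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"Q": q, "df": df, "p": chi2_sf(q, df), "I2": i2}


# Hypothetical per-study log ORs and variances.
result = heterogeneity([0.10, 0.35, -0.05, 0.22], [0.010, 0.020, 0.015, 0.030])
print(result)
```

With few studies per meta-analysis, both quantities are estimated imprecisely, which is the reason for the caution expressed above.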
Eight of the twenty-four positive meta-analyses included control 
populations showing violation of Hardy-Weinberg equilibrium (HWE), 
but only one result (in SLC6A4) became insignificant after exclusion 
of HWE-violating studies (Supplementary Table 3). Of the positive 
SNPs in the ‘all ancestries’ paradigm, seven became insignificant after the 
initial study was removed (Table 1). Of the 11 SNPs showing the stron-
gest effects in the population of European ancestry, three were initially 
identified in a study limited to individuals of European ancestry; two of 
these (MTHFR rs1801131 and APOE ε2/3/4) remained significant when 
the initial study was excluded from only the European-ancestry studies, 
whereas the third (COMT rs737865) did not.
The modified regression procedure modeling the logOR as a function 
of precision suggested that small studies yielded significantly (P < 0.1) 
larger effects than larger studies (a possible indication of publication 
bias or related biases) for DRD2 rs6277, GABRB2 rs1816071, GABRB2 
rs1816072 and IL1B rs16944. Of all 118 meta-analyses, 16 had evidence 
of inclusion of a significant (P < 0.10) excess of studies with nominally 
statistically significant results (single studies with P < 0.05). However, 
none of these belonged to the group of positive findings, that is, those 
with a nominally significant summary OR. All 16 of these meta-analyses 
pertained to associations in which single studies presented significant 
results in one direction, followed by other studies showing significant 
results in the opposite direction, a situation suggestive of the Proteus 
phenomenon10. Among schizophrenia genetic association studies as a 
whole, there was a clear excess of studies with a statistically significant 
result (146 observed, versus 84.65 expected, P < 10−6). However, this 
was limited to meta-analyses with nonstatistically significant results 
(O = 118, E = 59.44, P ≤ 10−6, compared with O = 28, E = 25.3, P = 0.55 for 
positive meta-analyses) and meta-analyses with large between-study het-
erogeneity (O = 100, E = 33.4, P ≤ 10−6, compared with O = 46, E = 51.3, 
P = 0.80 for meta-analyses without large estimates of heterogeneity).
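
A comparison of observed (O) and expected (E) numbers of nominally significant studies, such as those quoted above, can be sketched as a chi-square test with one degree of freedom. The statistic used here, (O − E)²/E + (O − E)²/(n − E), is one common form of an excess-significance test and is an assumption of this example, as is the total number of studies n, which is not reported in this section. The value of n below is therefore hypothetical and only makes the example runnable.

```python
import math


def excess_significance(observed, expected, n_studies):
    """Chi-square (1 df) test for an excess of significant studies."""
    if not 0 < expected < n_studies:
        raise ValueError("expected must lie strictly between 0 and n_studies")
    diff2 = (observed - expected) ** 2
    statistic = diff2 / expected + diff2 / (n_studies - expected)
    # Upper tail of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# O and E are taken from the text; n_studies is a hypothetical placeholder.
HYPOTHETICAL_N_STUDIES = 1000
statistic, p_value = excess_significance(146, 84.65, HYPOTHETICAL_N_STUDIES)
print(statistic, p_value)
```

The expected count E is itself the sum, over studies, of each study's power to detect the summary effect, so the test result depends on how that power is estimated.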
A total of 56 meta-analyses on various putative schizophrenia sus-
ceptibility loci had been published as of 30 April 2007 (Table 2 and 
Supplementary Table 5 online), altogether analyzing 75 SNPs in 30 dif-
ferent genes. Ten of these variants were found to be positive in the analy-
ses done for SzGene, 41 were negative, and 24 were not meta-analyzed 
here because of insufficient data (for example, fewer than four indepen-
dent case–control samples, or more than two alleles; see Methods). Of 
the 16 genes found positive in SzGene, nine had previously been meta-
analyzed, although in one (COMT), only negative variants had been 
analyzed. The most recent meta-analyses on the remaining eight genes 
(APOE, DRD2, DRD4, GRIN2B, IL1B, MTHFR, SLC6A4 and TPH1) all 
reported a positive association with schizophrenia (Table 2a). Five other 
genes (BDNF, DRD3, NRG1, DAOA and COMT) had been implicated 
by previous meta-analyses, but were found to have no significant asso-
ciation in the default allelic contrasts in SzGene (Table 2b; note that 
although NRG1 did not show positive results in the data freeze used 
for the analyses in this paper, a more recent update of SzGene now lists 
one SNP (rs10503929) in NRG1 suggesting nominally significant asso-
ciation; see the SzGene website for more details). Finally, seven of the 
positive loci in SzGene (DAO, DRD1, DTNBP1, GABRB2, HP, PLXNA2 
and TP53) were, to the best of our knowledge, never analyzed in previous 
meta-analyses published before 30 April 2007. Meta-analyses published 
after that date were not considered for this comparison.
Differences between the results of published meta-analyses and those 
obtained in SzGene can be ascribed to several potential causes. First, 
given the more up-to-date approach in SzGene, meta-analyses, on aver-
age, were based on nearly six more individual case–control samples (on 
average representing ~2,400 more combined cases and controls) com-
pared to published meta-analyses. Second, some observed differences 
can be attributed to different in- and exclusion criteria. For instance, 
several of the previous meta-analyses included family-based studies, 
non-English articles and, occasionally, samples containing a substantial 
proportion of individuals with other psychiatric illnesses. Less often, 
we observed discrepancies in the extraction and/or interpretation of 
study-level data (for example, for 366G/C in GRIN2B) in our analy-
ses and those published by Li and He11. Note that despite the different 
results obtained for this particular variant, the association evidence for 
GRIN2B—based on results obtained in meta-analyses of other polymor-
phisms—was also judged as ‘significant’ by Li and He, consistent with 
the conclusions reached here.
Up until 30 April 2007, two GWA studies had been published12,13. The 
first report genotyped over 25,000 SNPs in 14,000 genes in 320 cases and 
325 controls and found significant association with schizophrenia in 
individuals of European ancestry with variants in plexin A2 (PLXNA2), 
located on chromosome 1q32. A follow-up study14 in a Japanese sample 
found no association between the same PLXNA2 variants and schizo-
phrenia. In our meta-analysis combining data from both studies, the 
C allele of SNP rs752016 showed a nominally significant protective 
effect across samples of all ancestries (OR = 0.82, 95% CI = 0.69–0.99), 
and a second SNP, rs841865, approached significance (OR = 0.84, 95% 
CI = 0.69–1.01). Despite these promising results, the combined sample 
sizes for both meta-analyses were relatively small (2,333 and 2,344, 
respectively), and more data are needed to confirm these results. The 
second GWA study13, which tested over 400,000 SNPs in 178 cases and 
144 controls, found that several variants in CSF2RA and neighboring 
IL3RA on chromosome Xp22.33 were associated with schizophrenia. 
However, the lack of enough independent case–control samples pre-
cluded meta-analysis of these variants for SzGene.
Table 1 and Supplementary Table 6 online show the results of apply-
ing the Venice interim criteria15 to all 24 of the associations with nomi-
nally statistically significant summary ORs. For criterion 1 (‘amount 
of evidence’), 19 were graded as ‘A’ and 5 as ‘B’; for criterion 2 (‘con-
sistency of replication’), 17 were graded as ‘A’, 5 as ‘B’, and 2 as ‘C’; and 
for criterion 3 (‘protection from bias’) 9 were graded as ‘A’ and 15 as 
‘C’. The main reasons for low grades in the last criterion were (i) the 
presence of a small summary OR (<1.15) that can easily be dissipated 
even by relatively small biases in meta-analyses of published data 
(n = 6), and/or (ii) loss of significance after excluding the initial study 
(n = 6). Overall, four associations (DRD1 rs4532, DTNBP1 rs1011313, 
MTHFR rs1801131 and TPH1 rs1800532) graded ‘A’ across all three cri-
teria, and—on the basis of these guidelines—can be considered to have 
‘strong’ epidemiological credibility. The remaining 20 showed either a 
‘modest’ (n = 4), or only a ‘weak’ (n = 16) degree of credibility (Table 1 
and Supplementary Table 6).
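
The way the three criterion grades combine into an overall credibility level can be expressed as a short rule. The text states only that ‘A’ on all three criteria gives ‘strong’ credibility; the sketch below additionally assumes that any ‘C’ gives ‘weak’ and every other combination gives ‘modest’. That assumption is consistent with the counts reported above but is not spelled out in this section. The function name is hypothetical.

```python
VALID_GRADES = {"A", "B", "C"}


def venice_credibility(amount, replication, bias):
    """Combine three criterion grades into an overall credibility level."""
    grades = (amount, replication, bias)
    if any(g not in VALID_GRADES for g in grades):
        raise ValueError("each grade must be 'A', 'B' or 'C'")
    if all(g == "A" for g in grades):
        return "strong"
    if "C" in grades:
        return "weak"  # assumed rule: any 'C' caps credibility at weak
    return "modest"  # assumed rule: no 'C', but at least one 'B'


print(venice_credibility("A", "A", "A"))
print(venice_credibility("A", "B", "A"))
print(venice_credibility("A", "A", "C"))
```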
Conclusions
This is the first systematic, comprehensive field synopsis of genetic 
association studies in schizophrenia assembled to date following cri-
teria suggested by the HuGENet Road Map16. Our study combines 
unique and novel features that will generalize to knowledge-synthesis 
efforts for genetic associations in other common diseases. First, it uses 
a systematic and efficient search strategy of the applicable published 
literature. Up until 30 April 2007, we collected and extracted data from 
1,179 publications reporting on 3,608 polymorphisms in 516 different 
genes. Second, it uses quantitative methods to derive summary effect 
estimates by means of meta-analysis, which we carried out for a total 
of 118 polymorphisms across 52 different genes. This nearly doubles 
the number of meta-analyses thus far published in the field. Here, 
24 polymorphisms in 16 genes yielded nominally significant results, 
with average allelic risk ORs of ~1.24 and average protective ORs of 
~0.82. Third, it systematically examines sources of biases and assigns 
a score for the epidemiologic credibility of the findings. According to 
the Venice guidelines for the assessment of cumulative evidence in 
genetic association studies15, meta-analyses in at least four ‘positive’ 
genes showed a ‘strong’ degree of epidemiological credibility (DRD1, 
DTNBP1, MTHFR and TPH1). Thus, on the basis of the current data, 
these genes seem to be the best contenders to contain genuine sus-
ceptibility alleles modifying disease risk within the whole domain of 
genetic epidemiology in schizophrenia. Fourth, our study combines 
information from published studies with recent results of genome-
wide association analyses. To accommodate the expected avalanche of 
data to emerge from upcoming GWA analyses in schizophrenia, we 
have devised a step-wise protocol that makes efficient use of this vast 
body of ‘unbiased’ genotype data. Once publicly available, genotype 
data will be merged with the data already available from candidate 
gene studies included in SzGene. Finally, the results of this 
comprehensive field synopsis (that is, all of the study-level data as well as the 
meta-analyses) are accessible in a searchable online database, SzGene, 
which is embedded in the Schizophrenia Research Forum.

Table 2 Previously published meta-analysis results compared to meta-analyses in SzGene

Genes showing significant summary ORs (‘positive’ genes) in SzGene

| Gene | Study | Polymorphism | Prior meta sample size (number of samples) | SzGene sample size (number of samples) | Model | Meta OR (95% CI) | Het. Meta | SzGene OR (95% CI) | Het. SzGene |
|---|---|---|---|---|---|---|---|---|---|
| APOE | Xu, 2006 | APOE ε2/3/4 | 5,223 (11) | 4,202 (15) | ε4 vs. non-ε4, Caucasian | 1.23 (1.04–1.44) | 0.59 | 1.17 (1.01–1.35) | 0.615 |
| DRD2 | Glatt, 2006 | rs1801028 | 9,070 (27) | 9,335 (28) | Cys vs. Ser | 1.36 (1.09–1.70) | 0.345 | 1.41 (1.16–1.72) | 0.47 |
| DRD4 | Jonsson, 2003 | rs1800955 | 1,459 (3) | 3,988 (6) | C vs. Tᵃ | 1.22 (1.04–1.42) | n.g. | 1.15 (1.05–1.26) | 0.77 |
| GRIN2B | Li, 2007 | rs1019385 | n.g. (2) | 968 (4) | T vs. Gᵇ | 0.71 (0.56–0.9) | 0.38 | 0.69 (0.54–0.88) | 0.15 |
| IL1B | Shirts, 2006 | rs16944 | 2,111 (5) | 2,121 (5) | C vs. T, Caucasianᵇ | 1.24 (1.09–1.41) | n.g. | 1.28 (1.08–1.54) | 0.25 |
| MTHFR | Gilbody, 2007 | rs1801133 | 6,125 (12) | 7,420 (14) | T vs. C | 1.18 (1.06–1.32) | n.g. | 1.16 (1.05–1.30) | 0.0061 |
| SLC6A4 | Fan, 2005 | 5-HTTVNTR | 4,546 (12)ᵈ | 5,023 (11) | 12 vs. non-12ᶜ | 1.24 (1.11–1.38) | 0.28 | 1.16 (1.01–1.35) | 0.028 |
| TPH1 | Li, 2006 | rs1800532 | n.g. (5) | 2,097 (5) | A vs. C | 1.24 (1.1–1.41) | 0.34 | 1.31 (1.15–1.51) | 0.33 |

Genes reported to show significant effects in previous meta-analyses, but not in SzGene using the default analyses

| Gene | Study | Polymorphism | Prior meta sample size (number of samples) | SzGene sample size (number of samples) | Model | Meta OR (95% CI) | Het. Meta | SzGene OR (95% CI) | Het. SzGene |
|---|---|---|---|---|---|---|---|---|---|
| BDNF | Zintzaras, 2007 | 270C/T | 1,866 (5) | 4,091 (8) | T vs. C | 1.63 (1.01–2.65) | 0.07 | 1.29 (0.94–1.78) | 0.11 |
| DRD3 | Jonsson, 2004 | rs6280 | 11,066 (44) | 13,693 (55) | (GG + AA) vs. AGᵃ | 1.08 (1.00–1.16) | n.g. | 0.96 (0.91–1.03)ᵉ | 0.01 |
| NRG1ᶠ | Munafo, 2006 | Haplotype | 10,595 (14) | 12,154 (18) | Haplotype | P value = 0.02 | 0.2 | 1.06 (0.95–1.19) | 0.0009 |
| DAOA | Li, 2006 | rs1421292 | n.g. (3) | 5,591 (6) | A vs. T | 0.77 (0.66–0.89) | 0.9 | 0.98 (0.85–1.12) | 0.02 |
| COMT | Sand, 2006 | rs4680 | n.a. (5) | 19,558 (40) | G vs. Aᵇ | 1.19 (1.0–1.4) | n.g. | 0.99 (0.94–1.05) | 0.059 |

Only the most recently published meta-analyses are presented here. For a more detailed presentation of all meta-analyses published before 30 April 2007, see Supplementary Table 4. The random-effects model was the default model used. Note that although 16 positive genes were found in SzGene, only 8 of these have had the same positive polymorphisms analyzed previously. Het. Meta, heterogeneity P value reported in published meta-analysis. Het. SzGene, heterogeneity based on the Q statistic across crude ORs calculated for each study. P < 0.1 is considered to indicate significant evidence of between-study heterogeneity (see Methods). n.g., number of subjects not given in meta-analysis; n.a., number of subjects not applicable, as only family-based studies were used in this meta-analysis.
ᵃ Fixed-effects model used in prior meta-analysis; random-effects model used for SzGene analyses.
ᵇ Model was not specified; random-effects model was used for SzGene analyses.
ᶜ SzGene meta-analysis compares the 12 allele frequency to the 10 allele frequency; minor allele frequencies of other alleles in this polymorphism are negligible.
ᵈ Fan, 2005 includes two studies not published in English, hence the larger number of samples.
ᵉ SzGene default analysis using allele contrasts produced the nonsignificant result for the DRD3 rs6280 polymorphism listed in the table. When using the same genotype contrast as described previously (dominant model), we observe a nominally significant finding (OR = 1.09 (95% CI = 1.01–1.17)), in agreement with the meta-analysis by Jonsson, 2004.
ᶠ The significant findings of this study were based on haplotype comparisons, whereas the SzGene analyses are based on allelic contrasts (C vs. T) of a single variant (SNP8NRG221533). The summary OR of this SNP was reported as 1.04 (0.99–1.19) by Munafo et al., and is therefore very similar to the OR calculated on the substantially larger combined sample in SzGene, although the latter does not reach significance. Note that although none of the variants located in NRG1 showed nominally significant results with data published up until 30 April 2007, a more current update of SzGene now lists one SNP (rs10503929) as ‘positive’; see the online database for details.

The summary ORs obtained here for schizophrenia agree well with 
those found by our group in a similar project on genetic association 
studies of Alzheimer’s disease6, as well as with those found in large-
scale meta-analyses of other genetically complex diseases17,18, and, most 
recently, in high-density GWA analyses carried out on several common 
diseases19. Notably, we identified significant risk-modifying effects in 
seven genes (DAO, DRD1, DTNBP1, GABRB2, HP, PLXNA2 and TP53) 
for which, to the best of our knowledge, no previous meta-analyses had 
been published.
On the basis of the overall degree of epidemiologic credibility, the 
most notable findings to emerge from our systematic meta-analyses 
were with genetic variants located in four genes (DRD1, DTNBP1, 
MTHFR and TPH1). DRD1, which maps to chromosome 5q35, encodes 
dopamine receptor 1, the most abundant dopamine receptor in the cen-
tral nervous system. This receptor is thought to have a role in regulation 
of cognitive functions in the prefrontal cortex, possibly through interac-
tion with NMDA-mediated neurotransmission20, and to be involved in 
the action of clozapine, one of the atypical antipsychotic drugs used for 
the treatment of schizophrenia21. The potential functional role of the 
associated SNP rs4532 remains elusive, although its location in the 5′ 
UTR 48 bp upstream of the transcription start site suggests that it may 
be involved in the regulation of gene or protein expression. DTNBP1 
maps to chromosome 6p22 and encodes dystrobrevin binding protein 
1 (also known as dysbindin), which is expressed in many tissues including the brain (reviewed in ref. 22). The functional role of the associated 
SNP rs1011313 (P1325) remains elusive. However, because individuals 
with schizophrenia often have lower DTNBP1 mRNA expression in 
some brain regions, it is possible that potential risk-modifying DTNBP1 
variants may be involved in gene or protein expression23. MTHFR maps 
to chromosome 1p36.3, a chromosomal region that has not been linked 
to schizophrenia. It encodes 5,10-methylenetetrahydrofolate reductase, 
which catalyzes the reduction of 5,10-methylenetetrahydrofolate to 
5-methyltetrahydrofolate. In turn, 5-methyltetrahydrofolate serves 
as a carbon donor for homocysteine metabolism and is involved in 
other intracellular methylation processes, which have long been sug-
gested to be involved in schizophrenia pathogenesis on a number of 
levels (reviewed in ref. 24). In the SzGene meta-analyses, two MTHFR 
variants are associated with schizophrenia risk, rs1801131 (encod-
ing A1298C) and rs1801133 (encoding C677T), but only the former 
showed strong epidemiological credibility, whereas the latter showed 
large between-study heterogeneity (I2 = 55%). These two variants show 
a low degree of linkage disequilibrium (r2 = 0.18 based on current CEU 
HapMap data). Both result in amino-acid substitutions that have been 
suggested to reduce MTHFR enzyme activity25,26, and were recently 
linked to the occurrence of negative and positive symptoms in indi-
viduals with schizophrenia, effects correlated with serum folate levels24. 
Finally, TPH1 maps to chromosome 11p15–14 and encodes tryptophan 
hydroxylase 1, the rate-limiting enzyme in the biosynthesis of serotonin. 
However, the associated SNP rs1800532 (218A/C) is located in intron 7 
of the gene and has no previously described functional effect.
Despite our comprehensive and systematic approach to the schizo-
phrenia literature, the outcomes of our study should be evaluated with 
certain limitations in mind. First, although we carried out a thorough 
search using a number of different strategies to identify published stud-
ies eligible for inclusion in SzGene, we cannot exclude the possibil-
ity that some studies were overlooked. Second, our project explicitly 
excluded results from association studies existing only as abstracts, or 
those not published in English. This may have caused a disproportion-
ate exclusion of negative data resulting in publication bias, although we 
did not detect any evidence for such a bias in most meta-analyses with 
a positive outcome. Third, default SzGene meta-analyses (for example, 
those presented online) are based on allelic contrasts only. We chose this 
model because a large number of publications do not provide genotype 
distributions. In addition, it allowed us to condense the genetic asso-
ciation data into one statistic rather than test all possible transmission 
models. As the underlying mode of inheritance is unknown for most 
complex disease genes, we consider our approach to be a reasonable 
compromise between loss in power and practicality. Fourth, the use of 
study-level allele and genotype distributions precludes more sophis-
ticated analyses incorporating key covariates (such as age, gender, or 
potential gene–gene and gene–environment interactions), for which 
raw genotype data would be required. However, unless the same alleles 
confer opposite effects depending on the covariate, failure to account 
for interactions is not expected to mask any true underlying associa-
tion, but will merely reduce power. Fifth, it is not possible at present to 
process haplotype-based genotype data in the routine SzGene meta-
analyses. This may lead to missing risk effects conferred by haplotypes 
that are only poorly tagged by individual SNPs (as has been suggested 
for NRG1 (ref. 27)). Sixth, there likely exists considerable and difficult-
to-detect heterogeneity across cases classified as schizophrenia, as there 
is as of now no ‘definite’ diagnosis (neuropathologically defined, for 
example, as for Alzheimer’s disease or Parkinson’s disease) or specific 
laboratory test (as there is for hypertension or diabetes) for the disease. 
Furthermore, inter-rater variability of a clinical schizophrenia diagnosis 
is relatively high28. This situation is potentially aggravated by the fact 
that many studies (~6% of all studies included here) included individu-
als with schizoaffective disorder in their case samples, possibly further 
increasing heterogeneity and decreasing power. Seventh, we emphasize 
that the number of ‘true’ associations may be smaller than the number 
of nominally significant findings identified here29,30. This may have a 
number of causes, including multiple testing, linkage disequilibrium 
among associated variants, undetected publication or other reporting 
biases, and study-level technical artifacts that may have gone unnoticed. 
Most of the positive variants did not reach very high levels of statistical 
significance, and those with modest P values should be regarded cau-
tiously even if there is no obvious between-study heterogeneity and no 
demonstrable potential for bias. Finally, the epidemiological grading is 
based on interim criteria that have been created by consensus among a 
large number of experts, but which need further prospective validation 
of their performance. Protection from bias in particular is very difficult 
to rate, as latent bias is always possible and no test can have very high 
sensitivity and specificity for all types of possible bias. Thus, all results in 
this paper and in the ongoing analyses presented on the SzGene website 
should be interpreted conservatively until more studies are undertaken 
and possible molecular mechanisms underlying the putative risk- 
modifying effects have been evaluated and confirmed.
Despite these caveats, our project represents the first and only com-
prehensive and systematic assessment of the current status of genetic 
epidemiology research in schizophrenia, substantially extending the 
existing literature. The putative genetic risk factors emerging from 
our meta-analyses—summarized on a publicly available and regularly 
updated website—provide an extensive and quantitative summary of the 
most promising schizophrenia candidate susceptibility genes known to 
date. The approach presented here can be easily adapted to genetic asso-
ciation studies of other common diseases of public health significance. 
With the emergence of GWA studies for various diseases, systematic 
approaches are urgently needed to develop a credible knowledge base 
for the genetic architecture of human diseases, an essential prerequisite 
for using genetic information to improve health and prevent disease in 
the coming decades.
nature genetics | volume 40 | number 7 | july 2008 831
METHODS
Literature searches. Inclusion criteria. Studies included in SzGene must satisfy 
three criteria. First, they must evaluate the association between a polymorphic 
genetic variant (one with a minor allele frequency ≥0.01 in the general popu-
lation) and schizophrenia. Although we included association studies using a 
family-based approach in the qualitative gene summary pages, their genotype 
distributions are not listed nor are they included in any of the statistical analyses, 
because family-based studies often do not report sufficient data to reconstruct 
crude association odds ratios and their confidence intervals. We excluded stud-
ies examining phenotypic variables among schizophrenic subjects only (that is, 
those without ‘healthy’ controls) and included studies of microsatellite markers 
only if the markers were located within or near coding regions. We considered 
studies on polymorphisms with three or more alleles for meta-analysis if the 
studies consistently reported genotype distributions using the same nomencla-
ture (for example, the APOE ε2/3/4 polymorphism), or if generally only one 
allele was compared to other alleles or if only two alleles showed frequencies 
≥0.01 (for example, the SLC6A3 3′-UTR VNTR alleles 9 and 10). We included 
genetic variants with a complex allelic architecture (for example, within the NAT1 
locus on chromosome 8p22) in the qualitative gene summaries but generally 
did not consider them for meta-analysis. Second, studies must be published in a 
peer-reviewed journal. Information on whether or not articles are peer-reviewed 
before publication can generally be obtained from the website of the journal or 
directly from the publisher. This criterion specifically excludes studies reported in 
abstracts; for example, those presented at scientific meetings. Third, studies must 
be published in English. Our literature searches through 30 April 2007 showed 
that no more than 5% of all PubMed-indexed schizophrenia association studies 
are published in a language other than English (and the same or largely overlap-
ping sets of data from many of these papers actually later appear in an English 
language journal), suggesting that exclusion of these studies probably did not 
have a significant impact on the meta-analysis results.
Search strategies. To identify potential SzGene association studies, we searched 
PubMed using the search term ‘schizophreni* AND associat*’. This search 
yielded approximately 12,000 articles published before August 2006, which we 
then screened for eligibility using the title, abstract or the full text, as necessary. 
Beginning in August 2006, we have conducted weekly searches using the keyword 
‘schizophreni*’. These updates yield ~300 articles per month and result in the 
addition of up to five new studies to SzGene per week. In addition, we regularly 
search the bibliographies of included publications and the tables of contents of 
journals in genetics and psychiatry for eligible studies. For the purpose of the 
analyses presented herein, the database content was ‘frozen’ on 30 April 2007, 
when it included 1,178 independent articles. Because of the regular and ongoing 
updates of SzGene online, papers published after that date are included on the 
database website. Thus, some results in this manuscript may differ from those 
available online.
Data management. Demographic variables. Full-length copies of all papers eligible 
for inclusion in SzGene are saved in an offline database located at Massachusetts 
General Hospital (MGH). Each study entry consists of the name of the first 
author, year of publication and PubMed identification number, along with key 
population-specific details extracted from each study, such as ascertainment 
design (family-based or case-control), ancestral background and population 
(country) of origin, sample source (clinic-, population-, or community-based), 
the number of cases with gender ratio, age at onset, age at examination, method 
of diagnosis (see below), the number of controls with gender ratio and age at 
examination, and the reported study results.
The criteria used to arrive at a clinical diagnosis of schizophrenia in the 
30 April 2007 data freeze were DSM-III, DSM-IIIR and DSM-IV (Diagnostic and 
Statistical Manual of Mental Disorders), RDC (Research Diagnostic Criteria), 
and ICD9 and ICD10 (International Statistical Classification of Diseases and 
Related Health Problems), although ~5% of all studies made no reference to their 
diagnostic criteria. We excluded studies if more than 10% of the diagnoses were neither schizophrenia nor schizoaffective disorder. Whenever a sample contained 
individuals with both schizophrenia and schizoaffective disorder, we used only the 
population and genotype information from the individuals with schizophrenia, 
unless the data could not be separated. We first entered all data into the offline 
database, and then double-checked all entries against the original publications 
before meta-analysis and upload to the SzGene website.
Genotype and allele distributions. Whenever possible, we used NCBI dbSNP 
identifiers (‘rs numbers’) to designate polymorphism identities throughout the 
database. If these were not specified in the articles, we attempted to resolve 
rs numbers using bioinformatics tools such as NCBI BLAST. If an rs number 
could not be unequivocally determined, we generally adopted the most com-
monly used nomenclature of the primary publications. Genotype distributions 
are listed for each polymorphism, with the minor allele (based on estimates in 
healthy controls) first, as given in the original publication. Whenever genotype 
distributions were not presented in the publication, we calculated them from 
reported allele frequencies and sample sizes (assuming no deviations from HWE 
unless reported otherwise in the original paper). We generally contacted first 
and last authors of studies with missing genotype data twice via e-mail to ask 
them to directly supply genotype and allele distributions before labeling studies 
as ‘no data provided’. Although this substantially increased the genotype data 
included in SzGene and the meta-analyses, approximately 8% of all genotypes 
remained unavailable.
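The reconstruction of genotype distributions from allele frequencies can be illustrated with a minimal Python sketch. It assumes Hardy-Weinberg proportions (q², 2pq, p²), as stated above; the function name, the rounding scheme and the example numbers are illustrative assumptions and are not taken from the SzGene analysis scripts.

```python
def genotype_counts_from_allele_freq(minor_allele_freq, n_subjects):
    """Expected genotype counts under Hardy-Weinberg equilibrium.

    Returns (minor homozygotes, heterozygotes, major homozygotes),
    i.e. with the minor allele listed first.
    """
    if not 0.0 <= minor_allele_freq <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    if n_subjects < 0:
        raise ValueError("sample size must be non-negative")
    q = minor_allele_freq          # minor allele frequency
    p = 1.0 - q                    # major allele frequency
    # Round two classes and assign the remainder to the third so that
    # the counts always add up to the reported sample size
    # (this rounding rule is an assumption made for illustration).
    minor_hom = round(q * q * n_subjects)
    het = round(2 * p * q * n_subjects)
    major_hom = n_subjects - minor_hom - het
    return minor_hom, het, major_hom


# Hypothetical example: minor allele frequency 0.2 in 100 controls.
print(genotype_counts_from_allele_freq(0.2, 100))
```

Fixing the total in this way keeps the reconstructed counts consistent with the published sample size, at the cost of a small rounding error in one genotype class.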
Duplicate publications. In many cases, authors reported the results of associa-
tion analyses of a polymorphism in the same or largely overlapping samples in 
separately published articles. Where such overlap was specified by the authors 
or suspected overlap was confirmed by the authors, we typically designated the 
smaller sample as ‘overlaps with [Author, Year]’ and entered the result as ‘n.a.’. 
No genotype numbers were entered for the overlapping sample and the smaller 
sample was not used in the meta-analysis. When the smaller sample was also the 
initial sample tested for a particular meta-analyzed polymorphism, the genotype 
numbers from the initial report were retained as such (to allow meta-analyses 
after exclusion of the initial study, see below), and only the remainder of the 
genotype numbers were used for the largest follow-up study on the same or 
overlapping sample.
Large-scale association studies. Recent advances in genotyping technology have 
allowed genome-wide association (GWA) testing19. The sheer scale of these stud-
ies makes their inclusion in SzGene a daunting and computationally demanding 
task. We devised the step-wise protocol described below to capture the most 
relevant genetic information from GWA studies without having to include each 
data point. Stage I has already been implemented and stages II and III will be 
implemented as soon as the genotype data from individual GWA studies become 
publicly available. Note that stage II has already been implemented into a related 
database project on Alzheimer’s disease genetics, where genotype data from 
one of three published GWA studies are publicly available. The results of these 
analyses can be found on the AlzGene website.
Stage I represents the inclusion of genes and polymorphisms featured by the 
authors of the GWA study, usually because they show some degree of genetic 
association after completion of all analyses, for example, correction for multiple 
comparisons and/or replication in multiple independent samples. These data 
represent the core findings of each GWA study and their inclusion is straight-
forward, as the genotype distributions of these genes or markers are usually 
readily available in the original publication. Although this stage has already been 
implemented in SzGene, only two GWA studies of schizophrenia had been published 
by 30 April 2007 (refs. 12,13).
In stage II, we will also make use of ‘nonfeatured’ genotype distributions, 
that is, polymorphisms not identified to be associated with schizophrenia in the 
original publications. We will add large-scale (GWA) genotype data for polymorphisms 
already included in SzGene and recalculate the meta-analyses for 
SNPs with genotype data in at least four independent case–control samples. 
Note that if previously proposed candidate gene effects are not identified among 
the top GWA hits, this does not necessarily mean that such effects do not exist. 
Rather, it could reflect insufficient power of the individual GWA studies12,13 
given their rather modest sample sizes (645 and 322 combined cases and controls 
in the studies published until 30 April 2007, respectively) and the generally more 
stringent thresholds for reaching experiment-wide statistical significance in the 
context of GWA testing.
Stage III applies to GWA studies only. When genotype distributions become 
publicly available for multiple GWA scans, we will carry out systematic meta-
analyses for all variants that overlap in at least four independent case–control 
samples. Only variants showing significant summary ORs will be shown on the 
SzGene website. The threshold for declaring nominal statistical significance in 
this context will be more stringent than for meta-analyses of individual candidate 
polymorphisms, because of the large number of tests performed. Procedures for 
implementing this stage and the definition of appropriate threshold criteria are 
currently being developed by our group and by others31.
Statistical analyses. Statistical analyses for this manuscript were done in 
Statistical Analysis Software (SAS), version 9.1.3, and Intercooled STATA, ver-
sion 8.2.
Meta-analyses. For all variants with minor allele frequencies ≥0.01 in ‘healthy’ 
controls and with case–control genotype data available from four or more sam-
ples, we calculated crude study-specific ORs and 95% confidence intervals (CIs) 
for each study using allelic contrasts (minor versus major allele). We chose a 
minimum of four samples, so as to have at least three data sets for meta-analysis 
after exclusion of the initial study for each polymorphism. In addition, we feel 
that with fewer data sets, the replication process is probably still too preliminary 
to allow any further-reaching conclusions. We obtained summary ORs and 95% 
CIs with the DerSimonian and Laird random-effects model32, which utilizes 
weights that incorporate both the within-study and between-study variance. 
This procedure was first done on all studies regardless of the ancestry of the 
individuals studied (‘All studies’ on the meta-analysis graphs; Supplementary 
Fig. 2). Summary ORs and 95% CIs were also calculated for studies of individu-
als of European ancestry if three or more such studies existed (‘All Caucasian 
studies’). Generally, too few samples of non-European ancestry existed to allow 
meaningful meta-analyses on populations of non-European ancestry. For the 
analyses of this manuscript, we also carried out meta-analyses on genotype 
contrasts using recessive and dominant models. Note that these analyses are not 
available online for SzGene, for which we only consider allelic contrasts.
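The calculation of crude allelic ORs and their random-effects summary can be sketched as follows. This is a minimal illustration rather than the SAS or STATA code used for the analyses: it assumes the standard Woolf variance for the log OR and the usual moment estimator of the between-study variance (called `tau2` here), and all function names and counts are hypothetical.

```python
import math


def allelic_log_or(case_minor, case_major, ctrl_minor, ctrl_major):
    """Crude log odds ratio (minor vs. major allele) and its variance.

    No continuity correction is applied, so all four counts must be > 0.
    """
    log_or = math.log((case_minor * ctrl_major) / (case_major * ctrl_minor))
    variance = 1 / case_minor + 1 / case_major + 1 / ctrl_minor + 1 / ctrl_major
    return log_or, variance


def dersimonian_laird(log_ors, variances):
    """Pool log ORs using DerSimonian-Laird random-effects weights."""
    k = len(log_ors)
    w = [1 / v for v in variances]                  # within-study weights
    sum_w = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum_w
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Weights now incorporate within- and between-study variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * y for wi, y in zip(w_star, log_ors)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, tau2


def summary_or(allele_counts, z=1.96):
    """Return (summary OR, lower 95% CI, upper 95% CI, tau2)."""
    pairs = [allelic_log_or(*counts) for counts in allele_counts]
    log_ors = [pair[0] for pair in pairs]
    variances = [pair[1] for pair in pairs]
    pooled, se, tau2 = dersimonian_laird(log_ors, variances)
    return (math.exp(pooled), math.exp(pooled - z * se),
            math.exp(pooled + z * se), tau2)


# Hypothetical allele counts per study:
# (case minor, case major, control minor, control major)
studies = [(120, 380, 100, 400), (90, 310, 95, 305),
           (200, 600, 170, 630), (60, 240, 50, 250)]
print(summary_or(studies))
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effects weights, so the two models give the same summary OR.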
We assessed between-study heterogeneity by calculating the Q statistic, a test 
for heterogeneity among the study-specific ORs, which is distributed approxi-
mately as χ2 with k – 1 degrees of freedom (for k studies). Furthermore, we 
also calculated the I2 heterogeneity metric. In contrast to Q, which cannot be 
compared across meta-analyses with different numbers of studies, I2 is com-
parable regardless of the number of studies meta-analyzed. It is estimated as 
((Q – (k – 1))/Q) × 100 and takes values from 0% to 100% that show the 
extent of the heterogeneity that is beyond chance. For Q < k – 1, I2 is set to 0%. 
Generally values above 50% are considered to represent large heterogeneity33,34. 
When there are only few studies, both Q and I2 carry considerable uncertainty 
and should be interpreted cautiously35.
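The two heterogeneity measures can be computed as in the following minimal sketch. It assumes inverse-variance weights for the Q statistic and takes study-level log ORs and their variances as input; the function names and example values are hypothetical.

```python
def q_statistic(log_ors, variances):
    """Cochran's Q: weighted squared deviations from the fixed-effects mean."""
    weights = [1 / v for v in variances]
    mean = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    return sum(w * (y - mean) ** 2 for w, y in zip(weights, log_ors))


def i_squared(q, k):
    """I2 in percent for k studies; set to 0 when Q does not exceed k - 1."""
    if q <= k - 1:
        return 0.0
    return (q - (k - 1)) / q * 100


# Hypothetical log ORs and variances for five studies.
log_ors = [0.10, 0.35, -0.05, 0.40, 0.15]
variances = [0.02, 0.03, 0.01, 0.04, 0.02]
q = q_statistic(log_ors, variances)
print(q, i_squared(q, len(log_ors)))
```

A P value for Q would come from the χ2 distribution with k – 1 degrees of freedom; that step is omitted here because it needs a statistics library.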
Irrespective of the presence or absence of heterogeneity, meta-analyses for 
SzGene were only calculated using random-effects models (with the exception 
of analyses conducted for the comparison with previously published meta-
analyses, see below). Although random-effects analyses typically have less power 
than those based on fixed-effects models, they yield more realistic CIs. Fixed 
effects would, by definition, be inappropriate in the presence of heterogeneity, 
whereas fixed and random effects coincide in the absence of between-study 
heterogeneity.
Sensitivity analyses and bias assessment. Sensitivity analyses included calcula-
tion of summary ORs and 95% CIs for all studies excluding the initial report 
(‘All excluding initial’) and after excluding studies violating HWE according 
to a χ2 test at P ≤ 0.05 (‘All excluding HWE deviations’). We also routinely 
constructed funnel plots, which depict the ORs (on a logarithmic scale) against 
their standard error for each study36; these are available online for all variants. 
For the analyses in this manuscript, we also carried out a modified regression 
procedure recently proposed37 (which models the log OR as a function of study 
precision) for all variants showing significant summary ORs in the random-
effects meta-analyses. The modified regression test has an appropriate type I 
error at the P = 0.10 threshold.
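The HWE exclusion step can be illustrated with a short sketch of a one-degree-of-freedom χ2 goodness-of-fit test on genotype counts. Which group the test is applied to is not specified above; applying it to the control genotypes is an assumption made here, and the function names are hypothetical. The P value uses the identity that the upper tail of a χ2 distribution with one degree of freedom equals erfc(√(χ2/2)).

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 d.f.) for deviation from Hardy-Weinberg equilibrium.

    Takes observed genotype counts and returns (chi-square, P value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)        # frequency of allele a
    q = 1 - p                              # frequency of allele b
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Classes with zero expectation (monomorphic samples) are skipped.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def violates_hwe(n_aa, n_ab, n_bb, alpha=0.05):
    """True if the study would be dropped in the HWE sensitivity analysis."""
    return hwe_chi_square(n_aa, n_ab, n_bb)[1] <= alpha


# Hypothetical control genotype counts.
print(hwe_chi_square(30, 40, 30), violates_hwe(30, 40, 30))
```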
When larger studies show more conservative effects than smaller studies, this 
may be an indication that publication or other reporting biases are operating, 
although genuine diversity in the results of small versus larger studies is also 
possible38. Therefore, we carried out an additional diagnostic test that aims to 
detect whether there was an excess of statistically significant single studies39. 
This test evaluated whether the number of studies with statistically significant 
results at a given threshold (traditionally P < 0.05) is beyond what would be 
expected on the basis of power calculations informed by plausible effect sizes 
(for example, the summary effect seen in each meta-analysis) and the sample 
size and allele frequency of each study. The test was applied to each of the 
meta-analyses and also across the whole domain of all meta-analyses as well as 
subgroups thereof (meta-analyses with significant results versus nonsignificant 
results and meta-analyses with I2 > 50% versus those without large estimates 
of heterogeneity).
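The logic of the excess-significance test can be sketched as a comparison of the observed number of nominally significant studies (O) with the number expected from the study-specific power (E). The sketch below is one common formulation under stated assumptions, not the exact procedure of ref. 39: power is approximated with a two-sided normal test on the log OR, using the summary effect as the plausible effect size, and O is compared with E by a χ2 statistic with one degree of freedom. All function names and inputs are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def approx_power(plausible_log_or, se, z_crit=1.959964):
    """Approximate power of one study to reach two-sided P < 0.05.

    Normal approximation (an assumption): the study estimate is taken to be
    distributed as N(plausible_log_or, se**2).
    """
    shift = abs(plausible_log_or) / se
    return normal_cdf(shift - z_crit) + normal_cdf(-shift - z_crit)


def excess_significance(observed_significant, powers):
    """Compare observed (O) with expected (E) numbers of significant studies.

    Returns (E, chi-square statistic, P value) for n = len(powers) studies.
    """
    n = len(powers)
    expected = sum(powers)
    if not 0 < expected < n:
        raise ValueError("expected count must lie strictly between 0 and n")
    diff = observed_significant - expected
    stat = diff ** 2 / expected + diff ** 2 / (n - expected)
    p_value = math.erfc(math.sqrt(stat / 2))    # chi-square, 1 d.f.
    return expected, stat, p_value


# Hypothetical meta-analysis: summary log OR 0.10, six study standard errors.
standard_errors = [0.08, 0.12, 0.15, 0.10, 0.20, 0.09]
powers = [approx_power(0.10, se) for se in standard_errors]
print(excess_significance(4, powers))
```

An excess is suggested only when O is larger than E; the statistic itself is symmetric, so the direction of the difference should be checked separately.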
Grading the epidemiological credibility of significant associations. The online ver-
sion of SzGene maintains a continuously updated list of associations that have 
been evaluated in meta-analyses and yield statistically significant results (P < 0.05) 
either in the main analysis of all studies or in the analysis restricted to studies of individuals of 
European descent (‘Top results’). For all ‘top results’, we applied a grading system 
for the strength of the epidemiological evidence that has been recently developed 
by the Human Genome Epidemiology Network (HuGENet). The grading was done 
here independently by two investigators. Details of the grading system are pub-
lished elsewhere15. Briefly, each meta-analyzed association is graded on the basis 
of the amount of evidence, consistency of replication, and protection from bias. 
For amount of evidence, we assign the grade ‘A’ when the total number of minor 
alleles of cases and controls combined in the meta-analyses exceeds 1,000, ‘B’ when 
it is between 100 and 1,000, and ‘C’ when it is less than 100. For replication and 
consistency, we assign the grade ‘A’ for I2 point estimates <25%, ‘B’ for I2 values of 
25–50%, and ‘C’ for I2 values >50%.
For protection from bias, the guidelines propose consideration of various potential 
sources of bias, including errors in phenotypes, genotypes, confounding (population 
stratification) and errors or biases at the meta-analysis level (publication 
and other selection biases). A grade A implies that there is probably no bias that 
can affect the presence of the association, grade B that there is no demonstrable 
bias but important information is missing for its appraisal, and grade C that there 
is evidence for potential or clear bias that can invalidate the association. Errors 
and biases are also considered in the framework of the observed summary OR. 
Whenever the summary OR deviates less than 1.15-fold from the null in meta-
analyses based on published data, we acknowledge that occult publication and selec-
tive reporting biases alone may invalidate the association, regardless of the presence 
or absence of other biases, and therefore assign a grade of C. When the summary 
OR deviates more than 1.15-fold from the null, we assign a grade of C when the 
modified regression test or the excess test suggests the possibility of publication bias 
or significance-chasing bias, or when the association is no longer nominally statistically 
significant upon exclusion of the initial study or of studies violating HWE. 
Other areas that were considered but that did not result in a change of grading for 
protection from bias were potential errors in phenotyping and genotyping, which 
may affect the magnitude but not the presence of an effect in this field (if anything, 
nondifferential misclassification would tend to dilute the strength of an associa-
tion). Potential confounding from population stratification was also considered to 
have a similar ‘diluting’ impact provided that at least self-reported ancestral descent 
was taken into account in the analyses. We rated overall epidemiological credibility 
as ‘strong’ if associations received three A grades, ‘moderate’ if they received at least 
one B grade but no C grades, and ‘weak’ if they received a C grade in any of the 
three assessment criteria15.
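The grading rules described above can be summarized in a short sketch. The thresholds are those given in the text; the grade for protection from bias involves judgment and is therefore passed in as an input rather than computed. The function names and the treatment of the exact boundary values (1,000 and 100 minor alleles) are assumptions made for illustration.

```python
def grade_amount_of_evidence(total_minor_alleles):
    """'A' above 1,000 minor alleles, 'B' for 100 to 1,000, 'C' below 100."""
    if total_minor_alleles > 1000:
        return "A"
    if total_minor_alleles >= 100:
        return "B"
    return "C"


def grade_consistency(i2_percent):
    """'A' for I2 < 25%, 'B' for 25-50%, 'C' for I2 > 50%."""
    if i2_percent < 25:
        return "A"
    if i2_percent <= 50:
        return "B"
    return "C"


def overall_credibility(amount, consistency, bias):
    """Combine the three grades into strong, moderate or weak credibility."""
    grades = (amount, consistency, bias)
    if "C" in grades:
        return "weak"
    if all(grade == "A" for grade in grades):
        return "strong"
    return "moderate"


# Hypothetical association: 2,400 minor alleles, I2 = 12%, bias grade 'A'.
amount = grade_amount_of_evidence(2400)
consistency = grade_consistency(12.0)
print(amount, consistency, overall_credibility(amount, consistency, "A"))
```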
Comparison with previous meta-analyses. In order to compare the results of 
SzGene with previous meta-analyses on schizophrenia genetic association studies, 
we identified and collected all applicable meta-analysis papers published before 
30 April 2007 using a search strategy similar to that used for the individual studies 
(see above). For each meta-analysis study, we determined the genetic contrasts and 
models used (allelic, dominant or recessive; random or fixed effects). These analyses 
were then repeated using the more current SzGene data. In cases where details of the 
analysis methods were not specified in the published study, we used default SzGene 
analyses for comparison (allelic contrasts, random-effects model).
Online database structure. After completing the data entry, processing and analy-
ses described above, we posted all study-specific variables, genotype data, and meta-
analysis plots to a publicly available, online adaptation of the SzGene database using 
the same software and code as our recently developed database for Alzheimer’s 
disease6. The online database is hosted by the Schizophrenia Research Forum, a 
nonprofit, internet-based community portal dedicated to furthering collaboration 
among researchers to help in the search for causes, treatments, and understanding 
of schizophrenia.
URLs. SchizophreniaGene, http://www.szgene.org/; HuGENet, http://www.cdc.gov/genomics/hugenet/; Genetic Association Database, http://geneticassociationdb.nih.gov/; EMBASE, http://www.embase.com/; AlzGene, http://www.alzgene.org/.
Note: Supplementary information is available on the Nature Genetics website.
ACKNOWLEDGMENTS
Funding for this study was provided by the National Alliance on Research in 
Schizophrenia and Depression (to L.B.). F.K.K. was supported by a PENED 
training grant co-financed by E.U. European Social Fund and the Greek Ministry 
of Development, General Secretariat for Research and Technology. We are grateful 
to the Schizophrenia Research Forum for hosting SzGene on their website. In 
particular, we would like to thank A. Bumstead, H. Heimer, C. Knep and P. Noyes 
for the online adaptation of SzGene and many helpful discussions. We would 
further like to thank the members of the Scientific Advisory Board (currently 
including W. Byerly, G.D. Smith, J. Kennedy, D.F. Levinson and M. Owen) for 
their repeated review of the database and their helpful comments and suggestions 
during the development of this project.
AUTHOR CONTRIBUTIONS
This study was designed by L.B. (principal investigator). Literature searches, 
data entry and online curation of data were done by N.C.A. and S.B., with help 
from L.B. Analysis scripts were developed and written by M.B.M. and F.K.K., and 
analyses were done by N.C.A., F.K.K., J.P.A.I. and L.B. The manuscript was written 
by N.C.A. and L.B., with contributions from J.P.A.I., F.K.K., M.J.K. and R.E.T.
Published online at http://www.nature.com/naturegenetics/
Reprints and permissions information is available online at http://npg.nature.com/reprintsandpermissions/
1. Mueser, K.T. & McGurk, S.R. Schizophrenia. Lancet 363, 2063–2072 (2004).
2. Owen, M.J., Craddock, N. & O’Donovan, M.C. Schizophrenia: genes at last? Trends Genet. 
21, 518–525 (2005).
3. Sullivan, P.F., Kendler, K.S. & Neale, M.C. Schizophrenia as a complex trait: evidence 
from a meta-analysis of twin studies. Arch. Gen. Psychiatry 60, 1187–1192 (2003).
4. Badner, J.A. & Gershon, E.S. Meta-analysis of whole-genome linkage scans of bipolar 
disorder and schizophrenia. Mol. Psychiatry 7, 405–411 (2002).
5. Lewis, C.M. et al. Genome scan meta-analysis of schizophrenia and bipolar disorder, part 
II: schizophrenia. Am. J. Hum. Genet. 73, 34–48 (2003).
6. Bertram, L., McQueen, M.B., Mullin, K., Blacker, D. & Tanzi, R.E. Systematic meta-
analyses of Alzheimer disease genetic association studies: the AlzGene database. Nat. 
Genet. 39, 17–23 (2007).
7. Thakkinstian, A. et al. Systematic review and meta-analysis of the association between 
complement factor H Y402H polymorphisms and age-related macular degeneration. 
Hum. Mol. Genet. 15, 2784–2790 (2006).
8. Ioannidis, J.P., Ntzani, E.E. & Trikalinos, T.A. ‘Racial’ differences in genetic effects for 
complex diseases. Nat. Genet. 36, 1312–1318 (2004).
9. Lange, C., DeMeo, D., Silverman, E.K., Weiss, S.T. & Laird, N.M. PBAT: tools for 
family-based association studies. Am. J. Hum. Genet. 74, 367–369 (2004).
10. Ioannidis, J.P. & Trikalinos, T.A. Early extreme contradictory estimates may appear in 
published research: the Proteus phenomenon in molecular genetics research and 
randomized trials. J. Clin. Epidemiol. 58, 543–549 (2005).
11. Li, D. & He, L. Association study between the NMDA receptor 2B subunit gene (GRIN2B) 
and schizophrenia: a HuGE review and meta-analysis. Genet. Med. 9, 4–8 (2007).
12. Mah, S. et al. Identification of the semaphorin receptor PLXNA2 as a candidate for 
susceptibility to schizophrenia. Mol. Psychiatry 11, 471–478 (2006).
13. Lencz, T. et al. Converging evidence for a pseudoautosomal cytokine receptor gene locus 
in schizophrenia. Mol. Psychiatry 12, 572–580 (2007).
14. Fujii, T. et al. Failure to confirm an association between the PLXNA2 gene and 
schizophrenia in a Japanese population. Prog. Neuropsychopharmacol. Biol. Psychiatry 31, 
873–877 (2007).
15. Ioannidis, J.P. et al. Assessment of cumulative evidence on genetic associations: interim 
guidelines. Int. J. Epidemiol. 37, 120–132 (2008); published online 26 September 
2007.
16. Ioannidis, J.P. et al. A road map for efficient and reliable human genome epidemiology. 
Nat. Genet. 38, 3–5 (2006).
17. Ioannidis, J.P., Ntzani, E.E., Trikalinos, T.A. & Contopoulos-Ioannidis, D.G. Replication 
validity of genetic association studies. Nat. Genet. 29, 306–309 (2001).
18. Lohmueller, K.E., Pearce, C.L., Pike, M., Lander, E.S. & Hirschhorn, J.N. Meta-analysis of 
genetic association studies supports a contribution of common variants to susceptibility 
to common disease. Nat. Genet. 33, 177–182 (2003).
19. Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 
cases of seven common diseases and 3,000 shared controls. Nature 447, 661–678 
(2007).
20. Missale, C., Fiorentini, C., Busi, C., Collo, G. & Spano, P.F. The NMDA/D1 receptor 
complex as a new target in drug development. Curr. Top. Med. Chem. 6, 801–808 (2006).
21. Tauscher, J. et al. Equivalent occupancy of dopamine D1 and D2 receptors with clozapine: 
differentiation from other atypical antipsychotics. Am. J. Psychiatry 161, 1620–1625 
(2004).
22. Williams, N.M., O’Donovan, M.C. & Owen, M.J. Is the dysbindin gene (DTNBP1) a 
susceptibility gene for schizophrenia? Schizophr. Bull. 31, 800–805 (2005).
23. Weickert, C.S. et al. Human dysbindin (DTNBP1) gene expression in normal brain and 
in schizophrenic prefrontal cortex and midbrain. Arch. Gen. Psychiatry 61, 544–555 
(2004).
24. Roffman, J.L. et al. Contribution of methylenetetrahydrofolate reductase (MTHFR) 
polymorphisms to negative symptoms in schizophrenia. Biol. Psychiatry 63, 42–48 
(2008).
25. Frosst, P. et al. A candidate genetic risk factor for vascular disease: a common mutation 
in methylenetetrahydrofolate reductase. Nat. Genet. 10, 111–113 (1995).
26. van der Put, N.M. et al. A second common mutation in the methylenetetrahydrofolate 
reductase gene: an additional risk factor for neural-tube defects? Am. J. Hum. Genet. 
62, 1044–1051 (1998).
27. Munafo, M.R., Thiselton, D.L., Clark, T.G. & Flint, J. Association of the NRG1 gene and 
schizophrenia: a meta-analysis. Mol. Psychiatry 11, 539–546 (2006).
28. McGorry, P.D. et al. Spurious precision: procedural validity of diagnostic assessment in 
psychotic disorders. Am. J. Psychiatry 152, 220–223 (1995).
29. Wacholder, S., Chanock, S., Garcia-Closas, M., El Ghormli, L. & Rothman, N. Assessing 
the probability that a positive report is false: an approach for molecular epidemiology 
studies. J. Natl. Cancer Inst. 96, 434–442 (2004).
30. Ioannidis, J.P. Why most published research findings are false. PLoS Med. 2, e124 
(2005).
31. Evangelou, E., Maraganore, D.M. & Ioannidis, J.P. Meta-analysis in genome-wide 
association datasets: strategies and application in Parkinson disease. PLoS ONE 2, e196 
(2007).
32. DerSimonian, R. & Laird, N. Meta-analysis in clinical trials. Control. Clin. Trials 7, 
177–188 (1986).
33. Higgins, J.P. & Thompson, S.G. Quantifying heterogeneity in a meta-analysis. Stat. Med. 
21, 1539–1558 (2002).
34. Higgins, J.P., Thompson, S.G., Deeks, J.J. & Altman, D.G. Measuring inconsistency in 
meta-analyses. Br. Med. J. 327, 557–560 (2003).
35. Ioannidis, J.P., Patsopoulos, N. & Evangelou, E. Uncertainty in estimates of heterogeneity 
in meta-analysis. Br. Med. J. 335, 914–916 (2007).
36. Egger, M., Davey Smith, G., Schneider, M. & Minder, C. Bias in meta-analysis detected 
by a simple, graphical test. Br. Med. J. 315, 629–634 (1997).
37. Harbord, R.M., Egger, M. & Sterne, J.A. A modified test for small-study effects in 
meta-analyses of controlled trials with binary endpoints. Stat. Med. 25, 3443–3457 
(2006).
38. Lau, J., Ioannidis, J.P., Terrin, N., Schmid, C.H. & Olkin, I. The case of the misleading 
funnel plot. Br. Med. J. 333, 597–600 (2006).
39. Ioannidis, J.P. & Trikalinos, T.A. An exploratory test for an excess of significant findings. 
Clin. Trials 4, 245–253 (2007).
834 volume 40 | number 7 | july 2008 | nature genetics
© 2008 Nature Publishing Group http://www.nature.com/naturegenetics
