Gender Differences in Mathematics Performance A Meta Analysis

Psychological Bulletin
1990, Vol. 107, No. 2, 139-155
Copyright 1990 by the American Psychological Association, Inc.
0033-2909/90/$00.75
Gender Differences in Mathematics Performance: A Meta-Analysis
Janet Shibley Hyde, Elizabeth Fennema, and Susan J. Lamon
University of Wisconsin—Madison
Reviewers have consistently concluded that males perform better on mathematics tests than females
do. To make a refined assessment of the magnitude of gender differences in mathematics perfor-
mance, we performed a meta-analysis of 100 studies. They yielded 254 independent effect sizes,
representing the testing of 3,175,188 Ss. Averaged over all effect sizes based on samples of the general
population, d was -0.05, indicating that females outperformed males by only a negligible amount.
For computation, d was -0.14 (the negative value indicating superior performance by females). For
understanding of mathematical concepts, d was -0.03; for complex problem solving, d was 0.08. An
examination of age trends indicated that girls showed a slight superiority in computation in elemen-
tary school and middle school. There were no gender differences in problem solving in elementary
or middle school; differences favoring men emerged in high school (d = 0.29) and in college (d =
0.32). Gender differences were smallest and actually favored females in samples of the general popu-
lation, grew larger with increasingly selective samples, and were largest for highly selected samples
and samples of highly precocious persons. The magnitude of the gender difference has declined over
the years; for studies published in 1973 or earlier d was 0.31, whereas it was 0.14 for studies published
in 1974 or later. We conclude that gender differences in mathematics performance are small. None-
theless, the lower performance of women in problem solving that is evident in high school requires
attention.
During the past 15 years, there has been much concern about
women and mathematics. Since Lucy Sells (1973) identified
mathematics as the "critical filter" that prevented many women
from having access to higher paying, prestigious occupations,
there has been much rhetoric and many investigations focused
on gender differences in mathematics performance.
Particularly within the fields of psychology and education,
gender differences in mathematics performance have been stud-
ied intensively, and there has been some consensus on the pat-
tern of differences. Anastasi (1958), in her classic differential
psychology text, stated that although differences in numerical
aptitude favored boys, these differences did not appear until
well into the elementary school years. Furthermore, she stated
that if gender differences in computation did appear, they fa-
vored females, whereas males excelled on tests of numerical
reasoning. Concurring with this, Maccoby and Jacklin (1974)
concluded that one of four sex differences that "were fairly well
established" was that "boys excel in mathematical ability" (p.
352). They also noted that there were few sex differences until
about ages 12-13, when boys' "mathematical skills increase
faster than girls' " (p. 352).
This research was supported by National Science Foundation Grant
MDR 8709533. The opinions expressed are our own and not those of
the National Science Foundation.
We thank Marilyn Ryan for her assistance in conducting the meta-
analysis. We thank researchers at the Educational Testing Service, espe-
cially Carol Dwyer and Eldon Park, for their help in providing Educa-
tional Testing Service data.
Correspondence concerning this article should be addressed to Janet
Shibley Hyde, Department of Psychology, Brogden Psychology Build-
ing, University of Wisconsin, Madison, Wisconsin 53706.
Most recently, Halpern (1986) concluded that "the finding
that males outperform females in tests of quantitative or mathe-
matical ability is robust" (p. 57). She stated that the differences
emerge reliably between 13-16 years of age.
The literature in education has reported conclusions that are
basically in agreement with the psychological literature. In
1974, Fennema reviewed published studies and concluded that
No significant differences between boys' and girls' mathematics
achievement were found before boys and girls entered elementary
school or during early elementary years. In upper elementary and
early high school years significant differences were not always ap-
parent. However, when significant differences did appear they were
more apt to be in the boys' favor when higher-level cognitive tasks
were being measured and in the girls' favor when lower-level cogni-
tive tasks were being measured. (Fennema, 1974, pp. 136-137)
In the Fennema review, no conclusions were made about high
school learners because of the scarcity of studies of subjects of
that age. However, a few years later, Fennema and Carpenter
(1981) reported that the National Assessment of Educational
Progress showed that there were gender differences in high
school, with males outperforming females, particularly in high
cognitive-level tasks. This conclusion has been reported by each
succeeding National Assessment (Meyer, in press).
Stage, Kreinberg, Eccles, and Becker (1985), in a thorough
review of the major studies that had been reported up to 1985,
concluded that
The following results are fairly consistent across studies using a
variety of achievement tests: 1) high school boys perform a little
better than high school girls on tests of mathematical reasoning
(primarily solving word problems); 2) boys and girls perform sim-
ilarly on tests of algebra and basic mathematical knowledge; and 3)
girls occasionally outperform boys on tests of computational skills.
. . . Among normal populations, achievement differences favoring
boys do not emerge with any consistency prior to the 10th grade,
are typically not very large, and are not universally found, even in
advanced high school populations. There is some evidence, how-
ever, that the general pattern of sex differences may emerge somewhat
earlier among gifted and talented students. (p. 240)
Thus, although there are some variations, there is a consensus
that, overall, gender differences in mathematics performance
have existed in the past and are still present. Global conclusions
tend to assert simply that males outperform females on mathe-
matics tests. More refined discussions generally conclude that
the overall differences in mathematics performance are not ap-
parent in early childhood; they appear in adolescence and usu-
ally favor boys in tasks involving high cognitive complexity
(problem solving) and favor girls in tasks of less complexity
(computation).
Theoretical Models of Gender and
Mathematics Performance
Theoretical models concerning gender and mathematics per-
formance generally begin with the assumption that males out-
perform females in mathematics. The models are designed to
explain the causes of that phenomenon. For example, Eccles
and her colleagues (e.g., Eccles, 1987; Meece, [Eccles] Parsons,
Kaczala, Goff, & Futterman, 1982) have built an Expectation
× Value model to explain differential selection of mathematics
courses in high school. Fennema and Peterson (1985) proposed
an autonomous learning behavior model that suggested that
failure to participate in independent learning in mathematics
contributes to the development of gender differences in mathe-
matics performance. Others have proposed biological theories
focusing, for example, on brain lateralization (reviewed by
Halpern, 1986).
This model building may be premature because the basic
phenomenon that the models seek to explain—the gender
difference in mathematics performance—is in need of reassess-
ment, using the modern tools of meta-analysis.
Meta-Analysis and Psychological Gender Differences
The reviews cited previously have all used the method of narrative
review. That is, the reviewers located studies of gender
differences, organized them in some fashion, and reported their
conclusions in narrative form. The narrative review, however,
has been criticized on several grounds: It is nonquantitative, un-
systematic, and subjective, and the task of reviewing 100 or
more studies simply exceeds the human mind's information-
processing capacity (Hunter, Schmidt, & Jackson, 1982).
Meta-analysis has been defined as the application of "quantitative
methods to combining evidence from different studies"
(Hedges & Olkin, 1985, p. 13). In the 1980s, meta-analysis be-
gan to make important contributions to the literature on psy-
chological gender differences (e.g., Hyde & Linn, 1986). Hyde
(1981) performed a meta-analysis on the 16 studies of quantita-
tive ability of subjects aged 12 or older that were included in
Maccoby and Jacklin's (1974) review (12 being the age at which
Maccoby and Jacklin concluded that the sexes begin to diverge
in mathematics performance). Hyde found a median effect size
of .43 and noted that this difference was not as large as one
might have expected given the widely held view that the differ-
ence is well established.
The Hyde (1981) meta-analysis included only studies re-
ported through 1973, and thus there is a need to update it with
recent research. Furthermore, the median value of d was computed
on the basis of only seven values. In addition, statistical
methods have advanced considerably since the time of the Hyde
review. Hedges and his colleagues have developed homogeneity
statistics that allow one to determine whether a group of studies
is uniform in its outcomes (Hedges & Olkin, 1985; Rosenthal
& Rubin, 1982a). Applied to the topic of gender differences in
mathematics performance, these statistical techniques allow
one to determine whether the magnitude of the gender differ-
ence varies according to the cognitive level of the task, the age
group, and so on. Thus, modern techniques of meta-analysis
can answer considerably more sophisticated questions than
could the earlier meta-analyses and certainly more than could
earlier narrative reviews.
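
As a rough illustration of the homogeneity idea described above, the following Python sketch computes an inverse-variance weighted mean effect size and a homogeneity statistic Q for a set of d values. It assumes the standard large-sample variance of d given by Hedges and Olkin (1985); the function names and the sample data are hypothetical and are not taken from the article.

```python
from math import fsum


def d_variance(d, n_males, n_females):
    """Large-sample variance of a standardized mean difference d."""
    total = n_males + n_females
    return total / (n_males * n_females) + d * d / (2 * total)


def weighted_mean_and_q(effects):
    """Return (weighted mean d, homogeneity statistic Q).

    effects is a list of (d, n_males, n_females) tuples, one per
    independent sample. Each d is weighted by the inverse of its variance.
    """
    weights = [1.0 / d_variance(d, nm, nf) for d, nm, nf in effects]
    ds = [d for d, _, _ in effects]
    d_bar = fsum(w * d for w, d in zip(weights, ds)) / fsum(weights)
    q = fsum(w * (d - d_bar) ** 2 for w, d in zip(weights, ds))
    return d_bar, q


# Hypothetical effect sizes for illustration only (not values from Table 1).
samples = [(0.30, 200, 200), (0.10, 150, 160), (-0.05, 400, 380)]
d_bar, q = weighted_mean_and_q(samples)
print(f"weighted mean d = {d_bar:.3f}, Q = {q:.2f}, df = {len(samples) - 1}")
```

Under the hypothesis that all samples share one population effect size, Q is approximately chi-square distributed with k - 1 degrees of freedom for k samples. A large Q suggests that the studies are not uniform, which is what motivates splitting them by cognitive level, age group, and so on.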
Current Study
We performed a meta-analysis of studies of gender differences
in mathematics performance. Our goal was to provide answers
to the following questions:
1. What is the magnitude of gender differences in mathemat-
ics performance, using the d metric? We were chiefly interested
in answering this question for the general population. However,
we also provide analyses for selective samples.
2. Does the magnitude or direction of the gender difference
vary as a function of the cognitive level of the task?
3. Does the magnitude or direction of the gender difference
vary as a function of the mathematics content of the test (arith-
metic, geometry, algebra, and so on)?
4. Developmentally, at what ages do gender differences ap-
pear or disappear, and for what cognitive levels?
5. Are there variations across ethnic groups in the magnitude
or direction of the gender difference?
6. Does the magnitude of the gender difference vary depend-
ing on the selectivity of the sample, whether the sample is of the
general population or of a population that is selected for high
performance?
7. Has the magnitude of gender differences in mathematics
performance increased or declined over the years?
Method
Sample of Studies
The sample of studies came from seven sources: (a) a computerized
data base search of PsycINFO for the years 1967-1987, using the key
terms human-sex-differences crossed with (mathematics or mathematics-concepts
or mathematics-achievement or standardized tests), which
yielded 198 citations; (b) a computerized data base search of ERIC,
using the key terms sex-differences crossed with (mathematics or mathematics
achievement or mathematics-tests), which yielded 435 citations;
(c) inspection of all articles in Journal for Research in Mathematics
Education and Educational Studies in Mathematics; (d) the bibliography
of Maccoby and Jacklin (1974); (e) the bibliography of Fennema
(1974); (f) norming data from widely used standardized tests; and (g)
state assessments of mathematics performance.
In the case of the computerized literature searches, abstracts were
printed for each citation. The abstracts were inspected, and citations
that did not promise to yield relevant data (e.g., review articles or non-
empirical articles) were excluded. All relevant articles were photocop-
ied. Doctoral dissertations were obtained through interlibrary loan and
were then inspected for the data necessary to compute effect sizes.
Only studies reporting psychometrically developed mathematics tests
were included. Specifically, we excluded studies using Piagetian mea-
sures (e.g., the concept of conservation of number) because they assess
a much different construct than do standardized tests. Grades, too, were
excluded because they may measure a different construct, and because
they are assigned more subjectively and may therefore be more subject
to bias than are standardized tests. (See Kimball, 1989, for a review
of gender differences in classroom grades; girls consistently outperform
boys in mathematics grades.)
If an article appeared to have relevant data but the data were not
presented in a form that permitted computation of an effect size, a letter
was sent to the author at the address specified for reprints or at a more
recent address found in the American Psychological Association Mem-
bership Register or the American Educational Research Association Di-
rectory.
Large-sample, normative data were obtained for the following widely
used tests: American College Testing Program test (ACT), Graduate
Management Admissions Test (GMAT), Scholastic Aptitude Test (SAT-Q),
SAT Mathematics Level 1 and Level 2, Differential Aptitude Test
(DAT), Graduate Record Examination (GRE-Q), GRE-Mathematics,
California Achievement Test, and the Iowa Test of Basic Skills (ITBS).1
Data from the National Assessment of Educational Progress (NAEP;
Dossey, Mullis, Lindquist, & Chambers, 1988) were also included.
To obtain data from additional large-scale assessments, a letter was
sent to one official of each state department of education and of the
departments of education of the District of Columbia and the Canadian
provinces of Manitoba, Nova Scotia, Ontario, and Saskatchewan (based
on the 1987-1988 membership list of the Association of State Supervi-
sors of Mathematics), for a total of 55 letters. There were 29 responses,
and nine states provided usable data: Alabama, Connecticut, Michigan,
North Carolina, Oregon, Pennsylvania, South Carolina, Texas, and
Wisconsin.
It is possible to obtain several independent effect sizes from a single
article if, for example, data from several age groups (in a cross-sectional
design) or several ethnic groups are reported. These groups can essen-
tially be regarded as separate samples (Hedges, 1987, personal commu-
nication).
The result was 100 usable sources, yielding 259 independent effect
sizes. This represents the testing of 3,985,682 subjects (1,968,846 males
and 2,016,836 females). When data from the SATs were excluded (for
reasons discussed later), there were 254 effect sizes, representing the
testing of 3,175,188 subjects (1,585,712 males and 1,589,476 females).
Coding the Studies
For each study, the following information was recorded: (a) all statistics
on gender differences in mathematics performance measure(s), including
means and standard deviations or t, F, and df; (b) the number
of female and male subjects; (c) the cognitive level of the measure (computation,2
concepts, problem solving, and general-mixed); (d) the
mathematics content of the test (arithmetic, algebra, geometry, calculus,
and mixed-unreported); (e) the age(s) of the subjects (if the article
reported no age but reported "undergraduates" or students in an introductory
college course, the age was set equal to 19; if a grade level was
reported, 5 years was added to that level to yield the age: e.g., third
graders were recorded as 8-year-olds); (f) the ethnicity of the sample
(Black, Hispanic, Asian American, American Indian, White, Austra-
lian, Canadian, or mixed-unreported); (g) the selectivity of the sample
(general samples, such as national samples or classrooms; moderately
selected samples, such as college students or college-bound students;
highly selected samples, such as students at highly selective colleges;
samples selected for extreme precocity, such as the Study of Mathemati-
cally Precocious Youth; samples selected for poor performance, such
as Headstart samples, low socioeconomic status samples, or remedial
college samples; and adult nonstudent samples); and (h) the year of pub-
lication.
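
The age-coding rule in item (e) can be sketched in a few lines of Python. The function name and parameters are hypothetical. The source states only that the undergraduate and grade-level rules apply when no age is reported; the order in which those two fallbacks are tried is an assumption.

```python
def coded_age(reported_age=None, grade=None, undergraduate=False):
    """Return the age recorded for a sample under the coding rule in (e).

    A reported age is used as is. Otherwise undergraduates (or students in
    an introductory college course) are coded as 19, and a grade level is
    converted by adding 5 years. Returns None if nothing usable is given.
    """
    if reported_age is not None:
        return reported_age
    if undergraduate:  # assumption: checked before grade level
        return 19
    if grade is not None:
        return grade + 5
    return None


print(coded_age(grade=3))             # third graders
print(coded_age(undergraduate=True))  # introductory college course
```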
Interrater Reliability
Interrater agreement was computed for ratings of ethnicity, sample
selectivity, cognitive level of the test, and mathematics content of the
test. The formula used was Scott's (1955) pi coefficient, as recom-
mended by Zwick (1988).
Pi was 1.00 for ethnicity, .90 for sample selectivity, .88 for cognitive
level, and 1.00 for mathematics content. Thus, these categories were
coded with high reliability.
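
For readers unfamiliar with Scott's pi, here is a minimal Python sketch of the coefficient for two raters. Pi is observed agreement corrected for the agreement expected by chance, where chance agreement is computed from the category proportions pooled over both raters. The function name and the example ratings are hypothetical; the article does not report its raw ratings.

```python
from collections import Counter


def scotts_pi(rater_a, rater_b):
    """Scott's pi for two raters assigning nominal categories to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from category proportions pooled over both raters.
    pooled = Counter(rater_a) + Counter(rater_b)
    expected = sum((count / (2 * n)) ** 2 for count in pooled.values())
    if expected == 1.0:
        raise ValueError("pi is undefined when only one category is ever used")
    return (observed - expected) / (1.0 - expected)


# Hypothetical selectivity codes assigned by two coders to four studies.
print(scotts_pi([1, 1, 2, 2], [1, 2, 2, 2]))
```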
Statistical Analysis
The effect size computed was d, defined as the mean for males minus
the mean for females, divided by the mean within-sexes standard devia-
tion. Thus, positive values of d represent superior male performance
and negative values represent superior female performance. Depending
on the statistics available for a given study, formulas provided by Hedges
and Becker (1986) were used for the computation of d and the homoge-
neity statistics. All effect sizes were computed independently by two
researchers, Janet Shibley Hyde and an advanced graduate student.
There were discrepancies in fewer than 4% of the d values; these were
resolved. All values of d were corrected for bias in estimation of the
population effect size, using the formula provided by Hedges (1981).
The complete listing of all studies, with effect sizes, is provided in Ta-
ble 1.
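
The computation of d can be illustrated with a short Python sketch. It assumes the pooled within-group standard deviation as the denominator and the common approximation to the Hedges (1981) small-sample correction, 1 - 3/(4(n_m + n_f - 2) - 1). The exact formulas the authors took from Hedges and Becker (1986) are not reproduced in this section, so treat the function names and details below as assumptions.

```python
from math import sqrt


def bias_correction(n_males, n_females):
    """Approximate Hedges (1981) correction factor for small-sample bias."""
    return 1.0 - 3.0 / (4.0 * (n_males + n_females - 2) - 1.0)


def d_from_means(mean_m, sd_m, n_m, mean_f, sd_f, n_f):
    """Bias-corrected d: (male mean - female mean) / pooled within-sex SD."""
    pooled_var = ((n_m - 1) * sd_m ** 2 + (n_f - 1) * sd_f ** 2) / (n_m + n_f - 2)
    g = (mean_m - mean_f) / sqrt(pooled_var)
    return bias_correction(n_m, n_f) * g


def d_from_t(t, n_m, n_f):
    """Bias-corrected d from an independent-samples t (positive = males higher)."""
    g = t * sqrt(1.0 / n_m + 1.0 / n_f)
    return bias_correction(n_m, n_f) * g


# Hypothetical study: males 52 (SD 10, n 50), females 50 (SD 10, n 50).
print(round(d_from_means(52, 10, 50, 50, 10, 50), 3))
```

With this sign convention a positive d means higher male scores and a negative d means higher female scores, matching the definition above.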
Results
Magnitude of Gender Differences in
Mathematics Performance
Averaged over 259 values, the weighted mean effect size was
0.20. When data from the SATs (Ramist & Arbeiter, 1986) were
1 Although we tried to sample broadly over the major standardized
tests, the number of these tests is great and it was not feasible to report
data for all. In some cases, the test publisher was not able to provide the
needed data. In other cases, we did not wish to include too many tests
by the same publisher with the same format, thereby weighting those
tests too greatly. For example, we include the GMAT but not the Law
School Admission Test (LSAT) or the Medical College Admission Test
(MCAT). All are published by Educational Testing Service and are sim-
ilar, in the quantitative portion, in content and format. Furthermore, all
include selective samples, although it is difficult to assess the degree of
selection for mathematics performance. Therefore, we included the
GMAT but not the LSAT or MCAT. Because our major interest was in
assessing the magnitude of gender differences in mathematics perfor-
mance in the general population, inclusion of data from tests (e.g., the
MCAT) based on very selective samples was counterproductive.
2 The definitions of the cognitive levels were as follows: Computation
refers to a test that requires the use of only algorithmic procedures to
find a single numerical answer. Conceptual refers to a test that involves
analysis or comprehension of mathematical ideas. Problem solving re-
fers to a test that involves extending knowledge or applying it to new
situations. Mixed tests include a combination of items from these cate-
gories.
(text continues on page 146)
Table 1
Studies of Gender Differences in Mathematics Performance (in Alphabetical Order)
Study | Mean age | Male subjects (N) | Female subjects (N) | d^a | Ethnic group^b | Selectivity of sample^c | Cognitive level^d | Mathematics content^e
Advanced Placement Calculus, 1988 (personal communication, Carol Dwyer, January 20, 1989) | 18 | 31,280 | 22,115 | 0.20 | 6 | 5 | 4 | 4
Alabama Department of Education, 1986-1987 | 6 | 34,250 | 31,336 | -0.02 | 6 | 1 | 4 | 5
Alabama Department of Education, 1986-1987 | 7 | 30,419 | 28,573 | -0.03 | 6 | 1 | 4 | 5
Alabama Department of Education, 1986-1987 | 9 | 27,307 | 26,872 | -0.06 | 6 | 1 | 4 | 5
Alabama Department of Education, 1986-1987 | 10 | 25,845 | 25,095 | -0.07 | 6 | 1 | 4 | 5
Alabama Department of Education, 1986-1987 | 13 | 26,657 | 25,889 | -0.02 | 6 | 1 | 4 | 5
Alabama Department of Education, 1986-1987 | 15 | 24,427 | 25,388 | 0.00 | 6 | 1 | 4 | 5
American College Testing Program, 1970 (American College Testing Program, 1987) | 18 | 11,994 | 11,664 | 0.36 | 6 | 2 | 4 | 5
American College Testing Program, 1987 | 18 | 356,704 | 420,740 | 0.32 | 6 | 2 | 4 | 5
Backman, 1972 | 18 | 1,406 | 1,519 | 0.92 | 6 | 4 | 4 | 5
Behrens & Verron, 1978 | 12 | 155 | 137 | -0.12 | 8 | 1 | 4 | 5
Bell & Ward, 1980 | 12 | 31 | 41 | -0.10 | 6 | 1 | 4 | 5
Benbow & Stanley, 1980 | 12 | 90 | 77 | 0.41 | 6 | 4 | 4 | 5
Benbow & Stanley, 1980 | 14 | 133 | 96 | 0.76 | 6 | 4 | 4 | 5
Benbow & Stanley, 1980 | 12 | 135 | 88 | 0.73 | 6 | 4 | 4 | 5
Benbow & Stanley, 1980 | 14 | 286 | 158 | 0.54 | 6 | 4 | 4 | 5
Benbow & Stanley, 1980 | 12 | 372 | 222 | 0.43 | 6 | 4 | 4 | 5
Benbow & Stanley, 1980 | 14 | 556 | 369 | 0.48 | 6 | 4 | 4 | 5
Benbow & Stanley, 1980 | 12 | 495 | 356 | 0.46 | 6 | 4 | 4 | 5
Benbow & Stanley, 1980 | 13 | 1,549 | ,249 | 0.44 | 6 | 4 | 4 | 5
Benbow & Stanley, 1980 | 13 | 2,046 | ,628 | 0.39 | 6 | 4 | 4 | 5
Benbow & Stanley, 1983 | 12 | 19,883 | 19,937 | 0.37 | 6 | 4 | 4 | 5
Boli, Allen, & Payne, 1985 | 18 | 689 | 465 | 0.55 | 6 | 3 | 4 | 5
Brandon et al., 1985 | 9 | 1,237 | ,207 | -0.10 | 5 | 1 | 3 | 5
Brandon et al., 1985 | 11 | 1,259 | ,176 | 0.02 |  |  |  | 5
Brandon et al., 1985 | 13 | 1,137 | ,107 | -0.06 |  |  |  | 5
Brandon et al., 1985 | 9 | 891 | 857 | -0.07 |  |  |  | 5
Brandon et al., 1985 | 11 | 1,000 | 953 | -0.11 |  |  |  | 5
Brandon et al., 1985 | 13 | 1,122 | 1,087 | -0.15 |  |  |  | 5
California Achievement Test (Green, 1987) | 6 | 959 | 858 | -0.02 |  |  |  | 5
California Achievement Test (Green, 1987) | 5 | 377 | 419 | -0.13 |  |  |  | 5
California Achievement Test (Green, 1987) | 7 | 1,953 | 2,001 | -0.11 |  |  |  | 5
California Achievement Test (Green, 1987) | 8 | 476 | 529 | -0.32 |  |  |  | 5
California Achievement Test (Green, 1987) | 9 | 304 | 331 | -0.16 |  |  |  | 5
California Achievement Test (Green, 1987) | 10 | 351 | 389 | -0.28 |  |  |  | 5
California Achievement Test (Green, 1987) | 11 | 369 | 378 | -0.09 |  |  |  | 5
California Achievement Test (Green, 1987) | 12 | 472 | 465 | -0.38 |  |  |  | 5
California Achievement Test (Green, 1987) | 13 | 411 | 402 | -0.07 |  |  |  | 5
California Achievement Test (Green, 1987) | 14 | 283 | 329 | -0.14 |  |  |  | 5
California Achievement Test (Green, 1987) | 15 | 224 | 275 | -0.08 |  |  |  | 5
California Achievement Test (Green, 1987) | 5 | 374 | 367 | -0.18 |  |  |  | 5
California Achievement Test (Green, 1987) | 6 | 553 | 540 | 0.09 |  |  |  | 5
California Achievement Test (Green, 1987) | 7 | 1,316 | 1,228 | 0.04 |  |  |  | 5
California Achievement Test (Green, 1987) | 8 | 280 | 277 | -0.12 |  |  |  | 5
California Achievement Test (Green, 1987) | 9 | 229 | 228 | -0.23 |  |  |  | 5
California Achievement Test (Green, 1987) | 10 | 224 | 212 | -0.45 |  |  |  | 5
California Achievement Test (Green, 1987) | 11 | 217 | 207 | -0.07 |  |  |  | 5
California Achievement Test (Green, 1987) | 12 | 379 | 332 | -0.15 |  |  |  | 5
California Achievement Test (Green, 1987) | 13 | 278 | 314 | -0.10 |  |  |  | 5
California Achievement Test (Green, 1987) | 14 | 188 | 227 | -0.30 |  |  |  | 5
California Achievement Test (Green, 1987) | 15 | 112 | 132 | 0.04 |  |  |  | 5
California Achievement Test (Green, 1987) | 5 | 2,507 | 2,425 | -0.09 |  |  |  | 5
California Achievement Test (Green, 1987) | 6 | 3,649 | 3,377 | -0.03 |  |  |  | 5
California Achievement Test (Green, 1987) | 7 | 7,486 | 7,353 | 0.01 |  |  |  | 5
Ethnic group, selectivity, and cognitive level codes for the rows after the first Brandon et al. (1985) entry are not aligned in the extracted text. The surviving values, in extraction order: 5 5 3 3 3; 2 (×11); 6 6 6; 3 (×5); 1 2 2 1 2 1 2 1 2 1 2 2 1 2 1 2 1 2 1 2 1 2 2 1 2.
Table 1 (continued)
Study | Mean age | Male subjects (N) | Female subjects (N) | d^a | Ethnic group^b | Selectivity of sample^c | Cognitive level^d | Mathematics content^e
California Achievement Test (Green, 1987) | 8 | 2,035 | ,925 | -0.11 |  |  |  | 5
California Achievement Test (Green, 1987) | 9 | 1,266 | ,175 | -0.06 |  |  |  | 5
California Achievement Test (Green, 1987) | 10 | 1,402 | ,279 | -0.32 |  |  |  | 5
California Achievement Test (Green, 1987) | 11 | 1,547 | ,429 | -0.08 |  |  |  | 5
California Achievement Test (Green, 1987) | 12 | 2,178 | ,967 | -0.35 |  |  |  | 5
California Achievement Test (Green, 1987) | 13 | 2,010 | ,947 | -0.08 |  |  |  | 5
California Achievement Test (Green, 1987) | 14 | 1,646 | ,748 | -0.32 |  |  |  | 5
California Achievement Test (Green, 1987) | 15 | 1,121 | ,170 | 0.05 |  |  |  | 5
Carrier, Post, & Heck, 1985 | 9 | 65 | 79 | -0.43 |  |  |  | 1
Connecticut Department of Education, 1987 | 9 | 15,465 | 15,462 | 0.02 |  |  |  | 5
Connecticut Department of Education, 1987 | 11 | 14,504 | 14,722 | -0.02 |  |  |  | 5
Connecticut Department of Education, 1987 | 13 | 15,009 | 14,919 | 0.06 |  |  |  | 5
Connor & Serbin, 1980 | 12 | 71 | 63 | 0.04 |  |  |  | 3
Connor & Serbin, 1980 | 15 | 108 | 97 | 0.23 |  |  |  | 2
Differential Aptitude Test (Bennett, Seashore, & Wesman, 1979) | 13 | 7,000 | 6,900 | -0.11 |  |  |  | 1
Differential Aptitude Test (Bennett et al., 1979) | 14 | 7,000 | 7,350 | -0.08 |  |  |  | 1
Differential Aptitude Test (Bennett et al., 1979) | 15 | 6,400 | 6,750 | 0.00 |  |  |  | 1
Differential Aptitude Test (Bennett et al., 1979) | 16 | 5,350 | 5,800 | 0.03 |  |  |  | 1
Differential Aptitude Test (Bennett et al., 1979) | 17 | 5,000 | 5,350 | 0.13 |  |  |  | 1
D'Augustine, 1966 | 10 | 29 | 31 | -0.59 |  |  |  | 3
D'Augustine, 1966 | 11 | 33 | 27 | 0.15 |  |  |  | 3
D'Augustine, 1966 | 12 | 34 | 26 | -0.09 |  |  |  | 3
Davis, 1973 | 13 | 45 | 45 | 0.81 |  |  |  | 3
Dees, 1982 | 15 | 1,053 | 962 | 0.14 |  |  |  | 3
deWolf, 1981 | 16 | 962 | 1,131 | 0.38 |  |  |  | 5
Dick & Balomenos, 1984 | 19 | 72 | 62 | 0.09 |  |  |  | 2
Edge & Friedberg, 1984 | 19 | 74 | 51 | -0.15 |  |  |  | 2
Edge & Friedberg, 1984 | 19 | 158 | 207 | -0.05 |  |  |  | 2
Engle & Lerch, 1971 | 6 | 67 | 63 | -0.35 |  |  |  | 1
Ethington & Wolfle, 1986 | 15 | 3,610 | 4,226 | 0.21 |  |  |  | 5
Ethington & Wolfle, 1984 | 18 | 2,306 | 2,807 | 0.27 |  |  |  | 5
Exezidis, 1982 | 11 | 80 | 80 | 0.06 |  |  |  | 5
Exezidis, 1982 | 11 | 80 | 80 | -0.11 |  |  |  | 5
Exezidis, 1982 | 11 | 80 | 80 | -0.18 |  |  |  | 5
Exezidis, 1982 | 12 | 80 | 80 | -0.09 |  |  |  | 5
Exezidis, 1982 | 12 | 80 | 80 | -0.15 |  |  |  | 5
Exezidis, 1982 | 12 | 80 | 80 | -0.21 |  |  |  | 5
Exezidis, 1982 | 13 | 80 | 80 | 0.00 |  |  |  | 5
Exezidis, 1982 | 13 | 80 | 80 | -0.66 |  |  |  | 5
Exezidis, 1982 | 13 | 80 | 80 | -0.03 |  |  |  | 5
Fendrich-Salowey, Buchanan, & Drew, 1982 | 11 | 12 | 12 | 0.18 |  |  |  | 1
Fennema & Sherman, 1978 | 11 | 203 | 203 | 0.30 |  |  |  | 5
Fennema & Sherman, 1978 | 12 | 206 | 225 | -0.05 |  |  |  | 5
Fennema & Sherman, 1978 | 13 | 223 | 260 | -0.11 |  |  |  | 5
Fennema & Sherman, 1977 | 14 | 194 | 219 | 0.23 |  |  |  | 5
Fennema & Sherman, 1977 | 15 | 181 | 169 | 0.35 |  |  |  | 5
Fennema & Sherman, 1977 | 16 | 199 | 167 | 0.41 |  |  |  | 5
Fennema & Sherman, 1977 | 17 | 70 | 34 | 0.22 |  |  |  | 5
Ferrini-Mundy, 1987 | 19 | 127 | 122 | 0.06 |  |  |  | 2
Flaugher, 1971 | 16 | 1,211 | 1,923 | 0.29 |  |  |  | 5
Flaugher, 1971 | 16 | 155 | 151 | 0.28 |  |  |  | 5
Flaugher, 1971 | 16 | 207 | 200 | 0.27 |  |  |  | 5
Flaugher, 1971 | 16 | 512 | 562 | 0.49 |  |  |  | 5
Flaugher, 1971 | 16 | 1,120 | 1,614 | 0.18 |  |  |  | 5
Flaugher, 1971 | 16 | 864 | 950 | 0.33 |  |  |  | 5
Flexer, 1984 | 13 | 61 | 63 | 0.18 | 6 | 3 | 1 | 5
Graduate Management Admission Council, 1987 | 21 | 2,952 | 2,392 | 0.45 | 6 | 3 | 4 | 5
Graduate Management Admission Council, 1987 | 23 | 25,048 | 17,687 | 0.43 | 6 | 3 | 4 | 5
Ethnic group, selectivity, and cognitive level codes for the rows before Flexer (1984) are not aligned in the extracted text. The surviving values, in extraction order: 6 (×31); 5 1 2 5 1 2 5 1 2; 6 (×8); 6; 5 3 2 1 5 1 2 1 2 1 2 1 2 1; 4 (×8); 3 4 3 4 1 1 4 4; 3 (×6); 2 2 2 1 3 2 1; 4 (×11).
144
Table 1 (continued)
J. HYDE, E. FENNEMA, AND S. LAMON
N
Study
Graduate Management Admission Council,
1987
Graduate Management Admission Council,
1987
Graduate Management Admission Council,
1987
Graduate Management Admission Council,
1987
Graduate Management Admission Council,
1987
Graduate Management Admission Council,
1987
Graduate Management Admission Council,
1987
GRE-Mathematics, 1978 (personal
communication, Eldon Park, January 9,
1989)
GRE-Q (Educational Testing Service, 1 987)
Hancock, 1975
Hanna, 1986
Harnisch & Ryan, 1983
Harris & Romberg, 1974
Hawnetal., 1981
Hawnetal., 1981
Henderson, Landesman, & Kachuck, 1985
Hilton &Berglund, 1974
Hilton & Berglund, 1974
Howe, 1982
Iowa Test of Basic Skills, 1 984 (Lewis &
Hoover, 1987)
Iowa Test of Basic Skills, 1 984 (Lewis &
Hoover, 1 987)
Iowa Test of Basic Skills, 1984 (Lewis &
Hoover, 1987)
Iowa Test of Basic Skills, 1 978 (Lewis &
Hoover, 1987)
Jacobs, 1973
Jacobs, 1973
Jarvis, 1964
Jerman, 1973
Johnson, 1984
Johnson, 1984
Johnson, 1984
Johnson, 1984
Johnson, 1984
Johnson, 1984
Kaczala, 1983
Kaczala, 1983
Kaczala, 1983
Kaczala, 1983
Kaczala, 1983
Kaplan & Flake, 1982
Kissane, 1986
Kissane, 1986
Kloosterman, 1985
Koffman & Lips, 1980
Lee&Coflrnan, 1974
Lee&Coflman, 1974
Leinhardt, Scewald, & Engel, 1979
Lewis & Hoover, 1983
Lloyd, 1983
Marjoribanks, 1987
Marsh, Smith, & Barnes, 1985
Mean
age
25
27
29
33
37
45
55
27
27
14
13
17
10
7
8
15
16
16
13
7
10
13
8
12
17
11
10
19
19
19
19
19
19
10
11
12
13
14
19
13
16
15
30
13
10
7
11
10
11
10
Male
subjects
25,855
19,246
19,233
14,088
8,967
4,445
954
1,813
92,722
65
1,773
4,791
195
324
324
45
632
249
40
4,623
5,088
5,085
4,497
40
40
366
107
97
99
58
42
46
49
50
36
52
46
48
18
52
50
63
35
76
93
372
223
497
472
422
Female
subjects
15,681
10,078
8,704
6,633
4,570
2,419
397
734
104,922
54
1,750
4,791
196
272
301
36
688
290
40
4,712
5,152
5,148
4,875
40
40
347
133
97
104
67
44
42
58
46
43
53
52
45
76
46
20
61
35
74
61
354
234
466
456
137
d"
0.41
0.42
0.44
0.42
0.39
0.51
0.47
0.77
0.67
0.20
0.17
0.06
-0.25
-0.12
-0.13
-0.28
0.40
0.33
-0.01
0.00
0.00
0.00
-0.04
0.20
0.67
0.10
-0.06
0.36
0.66
0.37
0.81
0.56
0.88
-0.06
-0.47
-0.21
0.20
-0.27
0.74
0.66
0.49
0.24
0.10
0.09
0.13
-0.12
0.14
0.10
0.11
-0.30
Ethnic
group"
6
6
6
6
6
6
6
6
6
6
8
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
7
7
6
6
6
6
6
6
6
7
7
Selectivity
of sample"
3
3
3
3
3
3
3
9
3
1
|
1
1
I
1
I
3
0
1
1
1
1
1
1
1
1
1
2
2
2
2
2
2
1
1
I
1
1
2
3
3
1
NA
I
1
1
1
1
1
I
Cognitive levelᵈ
4
4
4
4
4
4
4
3
4
4
4
4
2
4
4
4
3
3
4
1
2
3
3
4
4
3
1
3
3
3
3
3
3
4
4
4
4
4
4
3
3
2
4
4
4
4
4
4
4
4
Mathematics contentᵉ
5
5
5
5
5
5
5
5
5
5
3
5
3
5
5
1
5
5
5
5
5
5
5
5
5
1
1
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
3
5
5
(Table continues)
GENDER DIFFERENCES IN MATHEMATICS PERFORMANCE 145
Table 1 (continued)
N
Study
Marshall & Smith, 1987
Meyer, 1978
Michigan Department of Education, 1987
Michigan Department of Education, 1987
Michigan Department of Education, 1987
Mills, 1981
Moore & Smith, 1987
Moore & Smith, 1987
Moore & Smith, 1987
Moore & Smith, 1987
Moore & Smith, 1987
Moore & Smith, 1987
Moore & Smith, 1987
Moore & Smith, 1987
Moore & Smith, 1987
Muscio, 1962
National Assessment of Educational Progress
[NAEP], 1978 (Dossey, Mullis, Lindquist,
& Chambers, 1988)
NAEP, 1978 (Dossey et al., 1988)
NAEP, 1978 (Dossey et al., 1988)
NAEP, 1986 (Dossey et al., 1988)
NAEP, 1986 (Dossey et al., 1988)
NAEP, 1986 (Dossey et al., 1988)
Newman, 1984
North Carolina Department of Public
Instruction, 1987
North Carolina Department of Public
Instruction, 1987
North Carolina Department of Public
Instruction, 1987
Oregon Department of Education, 1987
Parsley, Powell, O'Connor, & Deutsch, 1963
Parsley et al., 1963
Parsley et al., 1963
Parsley et al., 1963
Parsley et al., 1963
Parsley et al., 1963
Parsley et al., 1963
Pattison & Grieve, 1984
Pattison & Grieve, 1984
Pattison & Grieve, 1984
Pederson, Shinedling, & Johnson, 1968
Pennsylvania Department of Education, 1987
Pennsylvania Department of Education, 1987
Pennsylvania Department of Education, 1987
Plake, Ansorge, Parker, & Lowry, 1982
Powell & Steelman, 1983
Randhawa & Hunt, 1987
Randhawa & Hunt, 1987
Randhawa & Hunt, 1987
Rosenberg & Sutton-Smith, 1969
Saltzen, 1982
Saltzen, 1982
Saltzen, 1982
Saltzen, 1982
SAT Mathematics
Level 1 (Ramist & Arbeiter, 1986)
SAT Mathematics
Level 2 (Ramist & Arbeiter, 1986)
Schonberger, 1981
Schratz, 1978
Schratz, 1978
Mean
age
11
9
9
12
15
13
19
19
19
19
19
19
19
19
19
11
9
13
17
9
13
17
7
8
11
13
13
7
8
9
10
11
12
13
15
18
18
8
8
10
13
19
21
9
12
15
20
7
10
7
10
18
18
19
9
9
Male
subjects
3,750
97
2,486
2,391
2,435
42
316
668
212
314
971
118
95
454
57
206
3,688
6,052
6,689
1,733
1,550
967
82
41,053
41,279
42,817
1,027
379
379
379
379
379
379
379
192
31
91
12
52,228
49,851
55,384
26
30
675
790
859
355
92
104
76
122
71,881
28,890
34
20
20
Female
subjects
3,650
82
2,479
2,563
2,520
73
247
553
207
365
1078
137
161
532
62
207
3,688
6,052
6,689
1,733
1,550
967
61
38,439
38,855
40,938
1,028
338
338
383
383
383
338
338
156
11
95
12
52,150
50,184
54,309
31
21
654
706
900
658
75
80
77
144
76,373
17,000
23
20
20
dᵃ
-0.12
-0.14
-0.15
-0.09
-0.15
0.18
-0.04
0.11
0.08
0.31
0.41
0.45
0.28
0.48
0.76
0.21
-0.08
-0.03
0.22
0.00
0.07
0.18
-0.20
-0.12
-0.24
-0.26
0.06
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.29
0.05
0.33
-0.70
-0.02
-0.06
0.00
0.53
0.83
0.16
0.06
-0.06
0.16
-0.13
-0.39
0.08
0.15
0.40
0.38
0.48
-0.34
0.03
Ethnic groupᵇ
6
6
6
6
6
6
1
5
2
1
5
2
1
5
2
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
7
7
7
6
6
6
6
6
6
8
8
8
6
6
6
6
6
6
6
6
2
[
Selectivity of sampleᶜ
1
1
1
1
1
1
0
0
0
1
1
1
2
2
2
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
2
3
1
1
1
2
1
1
1
1
2
2
0
0
0
Cognitive levelᵈ
1
3
1
1
1
4
3
3
3
3
3
3
3
3
3
4
4
4
4
4
4
4
1
4
4
4
4
4
4
4
4
4
4
4
3
3
3
1
4
4
4
4
4
2
3
1
4
1
1
2
2
4
4
2
2
2
Mathematics contentᵉ
1
5
5
5
5
5
5
5
5
5
5
5
5
1
5
5
5
5
1
1
1
1
1
1
1
3
3
3
1
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
2
5
5
(Table continues)
146
Table 1 (continued)
J. HYDE, E. FENNEMA, AND S. LAMON
N
Study
Schratz, 1978
Schratz, 1978
Schratz, 1978
Schratz, 1978
Senk & Usiskin, 1983
Senk, 1982
Senk, 1982
Senk, 1982
Sheehan, 1968
South Carolina Department of Education,
1987
South Carolina Department of Education,
1987
South Carolina Department of Education,
1987
South Carolina Department of Education,
1987
South Carolina Department of Education,
1987
Steel, 1978
Swafford, 1980
Texas Education Agency, 1987
Todd, 1985
Usiskin, 1972
Usiskin, 1972
Verbeke, 1982
Verbeke, 1982
Verbeke, 1982
Webb, 1984
Weiner, 1983
Whigham, 1985
Whigham, 1985
Whigham, 1985
Whigham, 1985
Wisconsin Department of Public Instruction,
1984
Wisconsin Department of Public Instruction,
1984
Wisconsin Department of Public Instruction,
1984
Wozencraft, 1963
Wozencraft, 1963
Wrabel, 1985
Yawkey, 1981
Zahn, 1966
Mean
age
9
14
14
14
16
16
16
16
14
9
10
12
14
16
18
14
NA
9
15
15
13
15
16
13
13
20
20
20
20
9
13
17
8
11
15
5
13
Male
subjects
20
20
20
20
674
266
268
245
57
22,531
21,622
23,390
25,559
18,778
546
294
95,168
63
87
74
17
14
10
44
43
63
123
88
20
871
783
691
282
301
99
48
14
Female
subjects
20
20
20
20
690
240
240
261
50
22,313
21,076
22,513
24,370
19,627
621
329
97,366
60
67
75
23
12
14
33
27
54
115
89
26
867
761
750
282
302
103
48
13
dᵃ
0.08
-0.89
-0.05
0.45
0.05
0.04
0.12
-0.03
-0.04
-0.26
-0.01
-0.29
-0.04
-0.02
0.01
-0.09
0.03
-0.21
0.33
0.30
-0.49
0.46
0.00
0.15
0.32
-0.19
-0.05
-0.09
0.08
-0.10
0.06
-0.17
-0.23
-0.15
-0.06
-0.42
0.86
Ethnic groupᵇ
5
2
1
5
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
Selectivity of sampleᶜ
0
0
0
0
1
1
1
1
1
1
1
1
1
1
4
4
4
2
4
2
2
2
2
1
1
1
1
1
1
1
Cognitive levelᵈ
2
2
2
2
4
3
3
3
3
1
2
1
2
1
4
1
4
4
2
2
4
4
4
4
4
4
4
4
4
1
2
1
3
1
4
4
2
Mathematics contentᵉ
5
5
5
5
3
3
3
3
2
5
5
5
5
5
5
1
5
5
3
3
5
5
5
5
5
2
2
3
4
5
5
5
1
1
5
1
1
ᵃ Positive values reflect better performance by males; negative values reflect better performance by females.
ᵇ 1 = Black, 2 = Hispanic, 3 = Asian American, 5 = White, 6 = mixed or unreported, 7 = Australian, 8 = Canadian, 9 = American Indian.
ᶜ 0 = selected for low performance, 1 = general samples, 2 = moderately selected, 3 = highly selected, 4 = highly precocious samples.
ᵈ 1 = computation, 2 = understanding of concepts, 3 = problem solving, 4 = mixed or unreported.
ᵉ 1 = arithmetic, 2 = algebra, 3 = geometry, 4 = calculus, 5 = mixed or unreported.
excluded, the remaining 254 effect sizes yielded a weighted
mean d of 0.15. In both cases, this small positive value indicates
that, overall, males outperformed females by a small amount.
When one looks just at samples of the general population, d was
-0.05, reflecting a superiority in female performance, but of
negligible magnitude.
We excluded the SAT data from the remainder of the meta-
analysis for the following reason. The number of subjects in this
group was so enormous (810,494) that they accounted for 20%
of all subjects and, in a weighted means analysis, they exerted
a disproportionate effect. We reserve a separate section of the
discussion for the SAT data.
Overall, 131 (51%) of the 259 effect sizes were positive, reflecting
superior male performance; 17 (6%) were exactly zero; and 111 (43%)
were negative, reflecting superior female performance.

Table 2
Magnitude of Gender Differences as a Function of the Cognitive Level of the Test

Cognitive level        k      d        95% confidence interval for d    H
Computation            45     -0.14    -0.14 to -0.13                   1,144*
Concepts               41     -0.03    -0.04 to -0.02                   118*
Problem solving        48     0.08     0.07 to 0.10                     703*
Mixed or unreported    120    0.19     0.18 to 0.19                     39,557*

Note. k represents the number of effect sizes; H is the within-groups
homogeneity statistic (Hedges & Becker, 1986).
* Significant nonhomogeneity at p < .05, according to chi-square test.
All other categories are homogeneous.

Homogeneity analyses using procedures specified by Hedges
and Becker (1986) indicated that the set of 254 effect sizes was
significantly nonhomogeneous, H = 49,001.09, compared with
a critical value of χ²(253) = 300 (approximation), p < .0001.
Therefore, we concluded that the set of effect sizes is heteroge-
neous and we sought to partition the set of studies into more
homogeneous subgroups, using factors that we hypothesized
would predict effect size. These factors are ones that have pre-
viously been shown to be important moderators of gender
differences in mathematics performance (e.g., Fennema, 1974;
Stage et al., 1985). Subsequently, we performed regression anal-
yses to determine which variables are the best predictors of
variations in d.
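
As a minimal sketch of the homogeneity computation described above, the following Python code forms a weighted mean d, its 95% confidence interval, and the homogeneity statistic H for a handful of effect sizes. The variance formula is the standard large-sample approximation for a standardized mean difference and is an assumption here; the text cites Hedges and Becker (1986) without restating their formulas. The function names are illustrative, and the four input rows are the Schratz (1978) entries as they appear to line up in the flattened Table 1.

```python
import math

def d_variance(d, n_m, n_f):
    """Large-sample variance of a standardized mean difference (assumed formula)."""
    return (n_m + n_f) / (n_m * n_f) + d ** 2 / (2 * (n_m + n_f))

def weighted_mean_d(effects):
    """effects: list of (d, n_males, n_females). Returns (weighted mean, sum of weights)."""
    weights = [1.0 / d_variance(d, nm, nf) for d, nm, nf in effects]
    total_w = sum(weights)
    mean = sum(w * e[0] for w, e in zip(weights, effects)) / total_w
    return mean, total_w

def homogeneity_h(effects):
    """Weighted sum of squared deviations of each d from the weighted mean."""
    mean, _ = weighted_mean_d(effects)
    return sum((d - mean) ** 2 / d_variance(d, nm, nf) for d, nm, nf in effects)

def confidence_interval(effects, z=1.96):
    """Approximate 95% confidence interval for the weighted mean d."""
    mean, total_w = weighted_mean_d(effects)
    half_width = z / math.sqrt(total_w)
    return mean - half_width, mean + half_width

# (d, male n, female n); positive d means males scored higher.
rows = [(0.08, 20, 20), (-0.89, 20, 20), (-0.05, 20, 20), (0.45, 20, 20)]

mean_d, _ = weighted_mean_d(rows)
print("weighted mean d:", round(mean_d, 3))
print("95% CI:", tuple(round(x, 3) for x in confidence_interval(rows)))
print("H:", round(homogeneity_h(rows), 2), "on", len(rows) - 1, "degrees of freedom")
```

Under homogeneity, H is approximately chi-square distributed with k - 1 degrees of freedom, so a set of k effect sizes is judged nonhomogeneous when H exceeds the corresponding critical value.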
Cognitive Level
The results of the analysis of effect sizes, arranged according
to the cognitive level of the test, are shown in Table 2. As in the
overall analysis, the effect sizes are small. There is a slight fe-
male superiority in computation, no gender difference in under-
standing of concepts, and a slight male superiority in problem
solving. Oddly, the gender difference for tests with a mixture of
cognitive levels (or no report of cognitive level) is largest, al-
though still less than 0.25 standard deviation.
Homogeneity analyses indicate that there are significant
differences between the four effect sizes shown in Table 2; the
between-groups homogeneity statistic (HB) was 7,479, compared
with a critical χ²(3) = 7.81. However, it should be noted
that the number of subjects and the number of effect sizes in
this analysis are so great that small differences can be significant.
In the succeeding analyses, the HB values can be compared to see which
between-groups effects are strongest. The cognitive-level effect
is a large one compared with the others.
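
The relation between the overall, within-groups, and between-groups homogeneity statistics can be sketched as follows. This is an illustration under assumptions, not the authors' procedure: the variance formula is the standard large-sample approximation, the function names are invented for the example, and the two groups of effect sizes are hypothetical numbers, not rows of Table 1.

```python
def variance(d, n_m, n_f):
    """Large-sample variance of a standardized mean difference (assumed formula)."""
    return (n_m + n_f) / (n_m * n_f) + d ** 2 / (2 * (n_m + n_f))

def weights(effects):
    return [1.0 / variance(d, nm, nf) for d, nm, nf in effects]

def weighted_mean(effects):
    w = weights(effects)
    return sum(wi * e[0] for wi, e in zip(w, effects)) / sum(w)

def h_statistic(effects):
    """Homogeneity statistic for one set of effect sizes."""
    mean = weighted_mean(effects)
    return sum(wi * (e[0] - mean) ** 2 for wi, e in zip(weights(effects), effects))

def partition(groups):
    """Split the total statistic into between-groups and within-groups parts."""
    pooled = [e for g in groups.values() for e in g]
    grand_mean = weighted_mean(pooled)
    h_total = h_statistic(pooled)
    h_within = {name: h_statistic(g) for name, g in groups.items()}
    h_between = sum(
        sum(weights(g)) * (weighted_mean(g) - grand_mean) ** 2
        for g in groups.values()
    )
    return h_total, h_between, h_within

# Hypothetical effect sizes (d, male n, female n) for two cognitive levels.
groups = {
    "computation": [(-0.20, 300, 310), (-0.10, 150, 140)],
    "problem_solving": [(0.25, 200, 190), (0.35, 120, 130)],
}

h_total, h_between, h_within = partition(groups)
print("total:", round(h_total, 2))
print("between groups:", round(h_between, 2))
print("within groups:", {k: round(v, 2) for k, v in h_within.items()})
```

With the weights held fixed, the total statistic equals the between-groups statistic plus the sum of the within-groups statistics, which is why a large between-groups value signals that the grouping variable accounts for much of the heterogeneity.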
Mathematics Content of the Tests
The analysis according to the mathematics content of the
tests was less successful because so many studies failed to report
the mathematics content or used tests with a mixture of con-
tent. The results of the analysis are shown in Table 3. They indi-
cate that there was no gender difference in arithmetic or algebra
performance. The male superiority in geometry was small
(0.13), and the tests with mixed content showed the largest gen-
der difference.
Homogeneity analyses indicated that there was a significant
difference between the effect sizes for the different types of math
content, HB = 548 compared against a critical χ²(4) = 9.49.
This between-groups difference was smaller than most of the
others.
Age Differences
The ages were divided into five subgroups: (a) 5- to 10-year-
olds, (b) 11- to 14-year-olds, (c) 15- to 18-year-olds, (d) 19- to
25-year-olds, and (e) those 26 and older. These age groupings
were chosen for two reasons. First, they correspond roughly to
elementary school, middle or junior high school, high school,
college, and adulthood. Second, some reviewers have asserted
that there is no gender difference in mathematics performance
until the age of 12, when it begins to emerge (e.g., Maccoby &
Jacklin, 1974). Other reviewers believe that the difference does
not emerge until the last 2 or 3 years of high school (e.g., Meece
et al., 1982; Stage et al., 1985). Thus, it was important to have
age categories reflecting these two hypotheses.
The results of the analysis for age categories are shown in
Table 4. Overall, there was a small female superiority in the
elementary and middle school years. There was a more substan-
tial male superiority in the high school years, the college years,
and beyond, although this last finding is based on relatively few
effect sizes, most of them from the GRE.
Homogeneity analyses indicate that there are significant
differences in the magnitude of the gender difference as a function
of age group, HB = 37,669 compared with a critical χ²(4) =
9.49. The age effect is strong.
The results of the analysis of the Age × Cognitive Level of the
Test interaction are also shown in Table 4. Females were superior
in computation in elementary school and middle school,
although all differences were small. There was essentially no
gender difference at any age level in understanding of mathematical
concepts. Problem solving, on the other hand, presents
a different picture. There was a slight female superiority or no
gender difference in the elementary and middle school groups;
however, a moderate gender difference favoring males was found
in the high school and college groups.

Table 3
Magnitude of Gender Differences as a Function of the Mathematics Content of the Test

Mathematics content    k      d       95% confidence interval for d    H
Arithmetic             35     0.00    -0.02 to 0.01                    368*
Algebra                9      0.02    -0.08 to 0.11                    8
Geometry               19     0.13    0.09 to 0.16                     47*
Calculus               2      0.20    0.18 to 0.22                     0.17
Mixed or unreported    190    0.15    0.15 to 0.15                     48,064*

Note. k represents the number of effect sizes; H is the within-groups
homogeneity statistic (Hedges & Becker, 1986).
* Significant nonhomogeneity at p < .05, according to chi-square test.
All other categories are homogeneous.

Table 4
Magnitude of Gender Differences as a Function of Age and Cognitive Level of the Test

                                Cognitive level
Age group       All studies     Computation    Concepts      Problem solving
5-10            -0.06 (6?)      -0.20 (30)     -0.02 (33)    0.00 (11)
11-14           -0.07 (93)      -0.22 (38)     -0.06 (28)    -0.02 (21)
15-18           0.29ᵃ (53)      0.00 (12)      0.07 (9)      0.29 (10)
19-25           0.41 (?)        NA             NA            0.32 (15)
26 and older    0.59 (9)        NA             NA            NA

Note. NA = not available: there were two or fewer effect sizes, so a mean
could not be computed. k is shown in parentheses, where k = number
of effect sizes on which the computation of the mean was based.
ᵃ Data for the Scholastic Aptitude Test were excluded in the computation
of this effect size.

Table 6
Magnitude of the Gender Difference as a Function of the Selectivity of the Sample

Sample                          k      d        95% confidence interval for d    H
General                         184    -0.05    -0.06 to -0.05                   5,461*
Moderately selective            24     0.33     0.33 to 0.34                     290*
Highly selective                18     0.54     0.53 to 0.54                     1,674*
Precocious                      15     0.41     0.39 to 0.43                     211*
Selected for low performance    12     0.11     0.04 to 0.18                     24*

Note. k represents the number of effect sizes; H is the within-groups
homogeneity statistic (Hedges & Becker, 1986).
* Significant nonhomogeneity at p < .05, according to chi-square test.

Ethnicity
The results for the analysis of gender differences as a function
of ethnicity are shown in Table 5. Data for the SAT are provided
by ethnic group and were coded in that manner for the present
meta-analysis. Two effect sizes are provided: d1 is the mean of
all effect sizes including the SAT, and d2 is the mean of effect
sizes excluding the SAT.
When the SAT data were excluded, there was essentially no
gender difference in mathematics performance for Blacks, Hispanics,
and Asian Americans. Indeed, the 95% confidence interval
for d covers 0 for both Blacks and Hispanics. The slight
difference for Asian Americans favored females. Only for White
Americans was there evidence of superior male performance,
and the difference was still small. The mean effect size for American
Indians should not be taken too seriously because it is
based on a single value.

Table 5
Magnitude of Gender Differences as a Function of Ethnicity

Ethnic group           d1            d2            H
Black                  0.23 (22)     -0.02 (21)    219*
Hispanic               0.30 (21)     0.00 (20)     157*
Asian American         0.29 (5)      -0.09 (4)     15*
White                  0.41 (14)     0.13 (13)     152*
Australian             0.11 (7)      0.11 (7)      31*
Canadian               0.09 (5)      0.09 (5)      21*
American Indian        0.44 (1)      NA
Mixed or unreported    0.15 (184)    0.15 (184)    48,114*

Note. NA = not available; no effect size was available in this category.
d1 = the mean for all effect sizes, d2 = the mean effect size excluding
Scholastic Aptitude Test (SAT) data, H = homogeneity statistic based on
data excluding the SAT. All samples are from the United States unless
otherwise indicated. k, the number of effect sizes on which each mean
is based, is shown in parentheses.
* Significant nonhomogeneity at p < .05 according to chi-square test.

Homogeneity analyses, using the data set excluding the SAT,
indicated that there were significant differences between ethnic
groups in the magnitude of the gender difference, HB = 293
compared with a critical χ²(6) = 12.59. Ethnicity was not one
of the stronger effects.
Selectivity of the Sample
The analysis for the magnitude of the gender difference as a
function of the selectivity of the sample is shown in Table 6.
Notice that the gender difference was close to zero (favoring fe-
males slightly) for general samples; a larger gender difference
favoring males was found for each successive level of selection
for higher ability. The gender difference was moderate to large
for highly selected samples (d = 0.54) and for samples selected
for extreme precocity (d = 0.41). Also note that the great majority
of samples (184) in this meta-analysis were general and unselected.
Not surprisingly, the greatest heterogeneity of effect sizes
was for the general samples.
Homogeneity analyses indicated that there were significant
differences in effect size depending on how selective the sample
was, HB = 41,341 compared with a critical χ²(4) = 9.49. Sample
selectivity was one of the large effects.
When the interaction of sample selectivity and cognitive level
was examined, it was apparent that the effects of sample selec-
tivity were found most strongly for problem solving. For such
measures, the magnitude of the gender difference varied from
0.02 for general samples to 0.43 for highly selected samples.
Year of Publication
Studies were divided into two subgroups depending on the
year of publication: those published in 1973 or earlier and those
published after 1973. We chose 1973 as a divider between older
studies and more recent ones because it marked the last year
that was included in the Maccoby and Jacklin (1974) and Fennema
(1974) reviews.
For studies published in 1973 and earlier, d was 0.31, based
on 37 effect sizes. For studies published in 1974 or later, d was
0.14, based on 217 effect sizes. Thus, the data show both the
increase in research on gender and mathematics and a substan-
tial trend for smaller gender differences in more recent studies.
Regression Analysis
In view of the fact that the first homogeneity analysis indi-
cated that, overall, the set of effect sizes was nonhomogeneous,
multiple regression analysis was used to construct a model of
the sources of variation in effect sizes (Hedges & Becker, 1986).
The effect size was the criterion variable. On the basis of the
results of the categorical analyses reported previously, we per-
formed an initial regression analysis using the following predic-
tors: age of subjects, year of publication, ethnicity of sample,
selectivity of sample, cognitive level of the test, mathematics
content of the test, and the Age × Cognitive Level interaction.
The regression analyses were conducted by using the GLM pro-
cedure in the SAS statistics program. Repeated regression anal-
yses indicated that the SAT data were having a disproportionate
effect on the results, particularly in terms of the strength of the
ethnicity variable, because of the large sample size. Thus, the
SAT data were deleted in the final multiple regression analysis.
In addition, those few studies in which the sample had been
selected for poor performance were also deleted, because they
did not fit conceptually with the ratings of samples for increas-
ingly greater selectivity for high performance. For the final re-
gression analysis, predictors that were nonsignificant in previ-
ous analyses were deleted.
The result was a simple, well-defined equation in which 87%
of the variance in d was predicted by three variables: subjects'
age, selectivity of the sample, and cognitive level of the test. All
three were significant predictors; age was the strongest predictor,
F(1, 232) = 1,171.04, p < .0001, followed by sample selectivity,
F(3, 232) = 113.22, p < .0001, which was followed by
cognitive level, F(3, 232) = 7.88, p < .0001. (Sample selectivity
and cognitive level were coded as class variables.)
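
The idea of regressing effect size on a moderator can be sketched with a one-predictor weighted least squares fit. This is a simplified illustration only: the analysis reported above used the GLM procedure in SAS with three predictors, two of them coded as class variables, and none of that is reproduced here. The representative ages and the equal weights below are assumptions chosen for the example; the d values are the "All studies" means for the first four age groups in Table 4.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared), where r_squared is the
    weighted proportion of variance in y accounted for by x.
    """
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    ss_total = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    ss_resid = sum(
        wi * (yi - (intercept + slope * xi)) ** 2 for wi, xi, yi in zip(w, x, y)
    )
    return slope, intercept, 1.0 - ss_resid / ss_total

# Hypothetical representative ages for the 5-10, 11-14, 15-18, and 19-25 groups.
ages = [8, 13, 17, 22]
mean_d = [-0.06, -0.07, 0.29, 0.41]   # "All studies" column of Table 4
equal_weights = [1.0, 1.0, 1.0, 1.0]  # assumption; real weights would be 1 / variance

slope, intercept, r_squared = weighted_linear_fit(ages, mean_d, equal_weights)
print("change in d per year of age:", round(slope, 3))
print("proportion of variance explained:", round(r_squared, 2))
```

In a full analysis each effect size would be weighted by the inverse of its sampling variance, so that large studies count for more than small ones.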
Discussion
Averaged over all studies, the mean magnitude of the gender
difference in mathematics performance was 0.20. When SAT
data were excluded, d was 0.15. The positive value indicates
better performance by males on the average, but the magnitude
of the effect size is small. Figure 1 shows two normal distributions
that are 0.15 standard deviation apart. If one looks only
at samples of the general population (excluding selective samples),
d was -0.05, indicating a female superiority in performance,
but one of negligible magnitude. We can place considerable
confidence in these results because they are based on testing
literally millions of subjects, on more than 200 effect sizes,
and on many well-sampled, large studies such as the state assessments.
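
To give a rough sense of what a separation of 0.15 standard deviation means, the sketch below computes how much two such normal distributions overlap and what share of the higher-scoring group falls above the other group's mean. It assumes normal distributions with equal variances; the function names are illustrative.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def overlap(d):
    """Overlapping area of two unit-variance normal curves whose means differ by d."""
    return 2.0 * normal_cdf(-abs(d) / 2.0)

def share_above_other_mean(d):
    """Share of the higher-mean group scoring above the lower-mean group's mean."""
    return normal_cdf(d)

d = 0.15
print("overlap:", round(overlap(d), 3))                        # roughly 0.94
print("share above other mean:", round(share_above_other_mean(d), 3))  # roughly 0.56
```

With d = 0.15 the two curves share about 94% of their area, which is why the distributions in Figure 1 are nearly indistinguishable.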
Figure 1. Two normal distributions that are 0.15 standard deviations
apart (i.e., d = 0.15). This is the approximate magnitude of the gender
difference in mathematics performance, averaging over all samples.
The horizontal axis shows z scores from -4 to 4.

These findings are in contrast to the results of Hyde's (1981)
earlier meta-analysis, in which she reported a d of 0.43 for
quantitative ability. The discrepancy may be accounted for in
two ways. First, her computation was based on a small sample
of studies taken from the Maccoby and Jacklin (1974) review;
sufficient information was available for the computation of only
seven values of d. In addition, to test Maccoby and Jacklin's
hypothesis that gender differences in mathematics performance
emerge around the age of 12 or 13, only studies with subjects 12
years old or older were included. Using only that set of studies
probably produced a larger gender difference than if studies
with younger subjects had also been included. Second, the present
meta-analysis provides evidence that the magnitude of gender
differences has declined over the past three decades. We
found that d was 0.31 for studies published in 1973 or earlier
and 0.14 for studies published in 1974 or later. Thus, there probably
has been a decline in the gender difference since 1973.
These findings are consistent with those of Feingold (1988), who
documented a decline in the magnitude of gender differences in
abilities as measured by several standardized tests.
It is important to recognize that the set of effect sizes is not
homogeneous. It is therefore essential to consider variations in
the magnitude of the gender difference as a function of the three
variables that were significant predictors in the multiple regres-
sion analyses: age, selectivity of the sample, and cognitive level
of the test.
Age Trends and Cognitive Level
Age trends in the magnitude of the gender difference in math-
ematics performance are important. Averaging over all studies,
there was a slight female superiority in performance in the ele-
mentary and middle school years. A moderate male superiority
emerged in the high school years (d = 0.29) and continued in
the college years (d = 0.41), as well as in adulthood (d = 0.59).
However, the age trends were a function of the cognitive level
tapped by the test. Females were superior in computation in
elementary and middle school, and the difference was essen-
tially zero in the high school years. The gender difference was
essentially zero for understanding of mathematical concepts at
all ages for which data were available. It was in problem solving
that dramatic age trends emerged. The gender difference in
problem solving favored females slightly (effect size essentially
zero) in the elementary and middle school years, but in the high
school and college years there was a moderate effect size favor-
ing males. These are precisely the years when students are per-
mitted to select their own courses, and females elect somewhat
fewer mathematics courses than do males (Meece et al., 1982).
Differences in course selection appear to account for some but
not all of the gender difference in performance on standardized
tests in the high school and college years (Kimball, 1989).
We are puzzled by the fact that tests with mixed or unre-
ported cognitive levels had a slightly larger gender difference
(0.19) than tests of problem solving (0.08). One possible expla-
nation is that there may be some feature of the format or admin-
istration of these tests, about which we lacked information, that
produced a male advantage on the tests. For example, the con-
tent of problem-solving items on those tests may have heavy
representation of masculine-stereotyped content, which has
been shown to produce better performance by males in some
studies, although results on the issue are mixed (e.g., Donlon,
1973; Selkow, 1984).
Sample Selectivity
Sample selectivity was one of the three most powerful predic-
tors of effect size in the multiple regression analysis. When all
effect sizes (excluding the SAT) were averaged, d was 0.15. Yet
when only those 184 effect sizes based on general, unselected
populations were averaged, d was -0.05. That is, there was a
shift to a slight female advantage, although the difference was
essentially zero. The magnitude of the gender difference favor-
ing males grew larger as the sample was more highly selected: d
was 0.33 for moderately selected samples (such as college stu-
dents), 0.54 for highly selected samples (such as students at
highly selective colleges, or graduate students), and 0.41 for
samples selected for exceptional mathematical precocity.
These findings are very helpful in interpreting the results of
Benbow and Stanley's (1980, 1983) study of mathematically
precocious youth. Their research has found large gender differ-
ences favoring males in mathematics performance, and the re-
sults have been widely publicized. Often the secondary reports
fail to acknowledge the specialized sampling in the study, im-
plying that the large gender differences are true of the general
population. The results of the present meta-analysis demon-
strate empirically exactly what would be expected from a con-
sideration of normal distributions (Hyde, 1981): Large gender
differences can be found at the extreme tails of distributions
even though the gender difference for the entire population is
small. Certainly it is important to study gifted populations, but
it is essential to remember that results from studies like Benbow
and Stanley's do not generalize to the rest of the population.
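
The point about the tails of distributions can be illustrated numerically. The sketch below assumes two normal distributions with equal variances whose means differ by d, and computes the ratio of the two groups' shares above a cutoff. The cutoffs are arbitrary illustrative values, and the function names are not from the original study.

```python
import math

def upper_tail(z):
    """Share of a standard normal distribution lying above z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def tail_ratio(d, cutoff):
    """Ratio of the higher-mean group's share above the cutoff to the
    lower-mean group's share. The cutoff is in standard deviation units
    measured from the lower-mean group's mean."""
    return upper_tail(cutoff - d) / upper_tail(cutoff)

d = 0.15  # overall effect size excluding the SAT
for cutoff in (0.0, 1.0, 2.0, 3.0):
    print("cutoff", cutoff, "ratio", round(tail_ratio(d, cutoff), 2))
```

Under these assumptions the ratio is close to 1 near the middle of the distribution and grows as the cutoff moves further into the upper tail, so a small mean difference is compatible with a visibly uneven ratio in highly selected samples.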
We must raise one caveat about studies that were coded as
unselected samples of the general population. In high school,
males have a higher dropout rate than females (Ekstrom,
Goertz, Pollack, & Rock, 1986). Dropouts tend to be low scor-
ers, and they are not included in data based on the testing of
high school students. Thus, male advantages in performance in
high school and later may in part result from the selective loss
of low-scoring males from the samples.
The SAT-Math
A recent meta-analysis of gender differences in verbal ability
(Hyde & Linn, 1988) indicated that the SAT-Verbal produced
idiosyncratic results. The average of all effect sizes yielded a d
of 0.11, indicating a slight female superiority in performance,
although the authors concluded that the gender difference had
essentially become zero. Yet the SAT-Verbal produced a d of
-.11 (the negative sign reflecting superior male performance
in that meta-analysis). That is, the SAT yielded superior male
performance when the pattern over all other tests was a slight
female superiority in performance.
The SAT-Math also yielded discrepant results in the present
analysis. The overall effect size, excluding the SAT, was 0.15.
Yet, according to the data from the 1985 administration of the
SAT (Ramist & Arbeiter, 1986), for males the mean was 499
(SD = 121), and for females the mean was 452 (SD = 112),
resulting in a d of 0.40. That is, the SAT produced a considerably
larger gender difference than our overall meta-analysis found.
The larger gender difference favoring males on the SAT may be
due to several factors:
1. The SAT data are based on a moderately selected sample,
those who are college-bound. As we indicated earlier, sample
selectivity increases the magnitude of the gender difference. For
moderately selected samples excluding the SAT, d was 0.33.
2. As Hyde and Linn (1988) pointed out, a larger number of
females take the SAT, and the males appear to be a somewhat
more advantaged sample in terms of parental income, father's
education, and attendance at private schools (Ramist & Ar-
beiter, 1986). In short, the male SAT sample may be more highly
selected than the female sample.
3. There may be features of the content of the test itself or of
its administration that enlarge the difference between males and
females. For example, the present meta-analysis indicates that
gender differences are larger in the high school years for mea-
sures of problem solving but not for computation. Although the
SAT includes many items that tap problem solving, there also
are some purely computational items.3 The SAT was coded as
"mixed" in our cognitive-level analysis. The mixture of prob-
lem solving and computational items should produce a gender
difference favoring males, but it should be smaller than 0.40.
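
The d of 0.40 quoted above for the 1985 SAT can be checked from the reported means and standard deviations. The sketch below pools the two standard deviations with equal weight, which is an assumption made because the group sizes for that administration are not given in this passage; the function name is illustrative.

```python
import math

def standardized_difference(mean_a, sd_a, mean_b, sd_b):
    """Difference between two means divided by the pooled standard deviation,
    with the two groups weighted equally (an assumption)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2.0)
    return (mean_a - mean_b) / pooled_sd

# 1985 SAT-Math figures cited from Ramist & Arbeiter (1986).
d_sat = standardized_difference(499, 121, 452, 112)
print(round(d_sat, 2))
```

Weighting the standard deviations by the actual numbers of male and female test takers would change the pooled value only slightly, because the two standard deviations are similar.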
How Large Are the Gender Differences in
Mathematics Performance?
The interpretation of the magnitude of effect sizes has been
debated. Cohen (1969) considered a d of 0.20 small, a d of 0.50
medium, and a d of 0.80 large. On the other hand, Rosenthal
and Rubin (1982b) have introduced the binomial effect size display
as a means of translating effect sizes into practical significance.
For example, an effect size for success in curing
cancer, reported as a correlation of .20, translates into increasing
the cure rate from 40% to 60%, surely an important practical
effect. Our overall value for samples of the general population,
a d of -0.05, translates into a correlation of -.025, which
yields only a 3% increase in success rate (from 48.5% to 51.5%).
Applied to the analysis of gender differences, it means that
approximately 51.5% of females score above the mean for the general
population, whereas 48.5% of males score above the mean.
Thus, the overall effect size is so small that even the binomial
effect size display indicates little practical significance.

3 An example of a computational item from the SAT is the following:
The test taker is asked to tell which of the following quantities is greater
or whether the two are equal: (1/3 - [second fraction illegible]) and 2/15
(College Entrance Examination Board, 1986).

The effect size of 0.29 for problem solving in high school-
aged students translates into 43% of females and 57% of males
falling above the mean of the overall distribution, using the bi-
nomial effect size display.
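
The conversions used in this section can be sketched as follows. The d-to-r formula assumes two groups of equal size, and the binomial effect size display is taken to be 50% plus or minus half the correlation; both function names are illustrative rather than taken from Rosenthal and Rubin (1982b).

```python
import math

def d_to_r(d):
    """Convert a standardized mean difference to a correlation,
    assuming two groups of equal size."""
    return d / math.sqrt(d * d + 4.0)

def binomial_effect_size_display(r):
    """Return the two 'success rates' implied by a correlation r."""
    return 0.5 - r / 2.0, 0.5 + r / 2.0

# The cancer example: r = .20 corresponds to 40% versus 60%.
print(binomial_effect_size_display(0.20))

# Problem solving in high school: d = 0.29.
low, high = binomial_effect_size_display(d_to_r(0.29))
print(round(low * 100), round(high * 100))

# General samples: d = -0.05.
print(round(d_to_r(-0.05), 3))
```

For d = 0.29 the display gives about 43% and 57%, matching the figures in the text. For d = -0.05 the correlation is about -.025, which under this formula corresponds to rates of 48.75% and 51.25%; the 48.5% and 51.5% reported in the text appear to reflect rounding the gap to 3 percentage points.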
Some idea of the magnitude of the overall effect size of -0.05
for general populations or the effect size of 0.29 for problem
solving in high school students can also be gained by comparing
them with effect sizes found in other meta-analyses. For exam-
ple, a meta-analysis of gender differences in verbal ability found
d to be 0.11, and the authors concluded that the value was so
small as to indicate no difference (Hyde & Linn, 1988). A meta-analysis
of gender differences in spatial ability indicated that the
magnitude of the gender difference depended considerably on
the type of spatial ability tested (Linn & Petersen, 1985). For
measures of spatial perception (e.g., the rod-and-frame test), d
was 0.44. For measures of spatial visualization (e.g., Hidden
Figures Test), d was 0.13. For measures of mental rotation (e.g.,
PMA Space or the Vandenberg), d was 0.73. In all cases the
differences favored males. Linn and Petersen concluded that the
only substantial gender difference was in measures of mental
rotation.
Meta-analyses in the realm of social behavior have indicated
that d was .50 for gender differences in aggression, including
studies with subjects of all ages (Hyde, 1984). For social-psychological
studies of aggression by adult subjects, d was .40 (Eagly
& Steffen, 1986). For gender differences in helping behavior, d
was .13, although the effect sizes were extremely heterogeneous
and d varied, for example, from -0.18 for studies conducted in
the laboratory to 0.50 for studies conducted off campus (Eagly
& Crowley, 1986).
One can also compare the magnitude of the gender difference
with effects that have been obtained outside the realm of gender
differences. For example, the average effect of psychotherapy,
comparing treated with control groups, is .68 (Smith & Glass,
1977).
Thus, the overall effect size of 0.15 (or -0.05 for samples of
only the general population) for gender differences in mathe-
matics performance can surely be called small. The largest
effect sizes we obtained were 0.29 and 0.32 for problem solving
in the high school and college years, respectively. These are
moderate differences that are comparable, for example, to the
gender difference in aggressive behavior, yet they are smaller
than the effects of psychotherapy.
Implications
This meta-analysis provided little support for the global con-
clusions that "boys excel in mathematical ability" (Maccoby &
Jacklin, 1974, p. 352) or "the finding that males outperform
females in tests of quantitative or mathematical ability is ro-
bust" (Halpern, 1986, p. 57). The overall gender difference is
small at most (d = 0.15 for all samples or -0.05 for general
samples). Furthermore, a general statement about gender
differences is misleading because it masks the complexity of the
pattern. For example, females are superior in computation,
there are no gender differences in understanding of mathemati-
cal concepts, and gender differences favoring males in problem
solving do not emerge until the high school years.
However, where gender differences do exist, they are in criti-
cal areas. It is important for us to know that females begin in
high school to perform less well than males on mathematical
problem-solving tasks. Problem solving is critical for success in
many mathematics-related fields, such as engineering and phys-
ics. In this sense, mathematics skills may continue to be a criti-
cal filter. The curriculum in mathematics, beginning well before
high school, should emphasize problem solving for all students
(National Council of Teachers of Mathematics, 1988). Cur-
rently, it emphasizes computation, and girls seem to learn that
very well. The schools must take more responsibility in the
teaching of problem solving, both because it is an important
area of mathematics and because it is an issue of gender equity.
Boys may have more access to problem-solving experiences
outside the mathematics classroom than do girls, creating boys'
pattern of better performance (Kimball, 1989). For example,
data from California high schools from 1983 to 1987 indicate
that girls made up only about 38% of physics students, 34% of
advanced physics students, and 42% of chemistry students
(Linn & Hyde, in press). These science courses are likely to pro-
vide extensive experience with problem solving, and fewer girls
than boys gain that experience.
The gender difference that was found on the SAT-Math also
has significant implications. Scores on the SAT are used as cri-
teria for college admission and for selection of scholarship re-
cipients. Thus, lower SAT-Math scores may influence these crit-
ical decisions about female students. The format and items of
the SAT-Math should continue to be inspected for two
purposes: (a) to determine whether some items are gender-
biased and should be eliminated from the test, and (b) to deter-
mine whether certain items tap important problem-solving
skills that are not taught adequately in the mathematics curric-
ulum of the schools. Then schools will be able to take positive
steps to improve the teaching of the mathematics required to
solve such problems.
One frustration that occurred in the process of conducting
this meta-analysis was the difficulty of analyzing the results ac-
cording to the mathematics content of the test. Few authors
specified the content clearly, probably because the content was
mixed. We must know if there are large gender gaps for certain
types of content. That can be determined only when researchers
construct tests and report results that assess the various kinds
of mathematics content separately.
Nonetheless, the gender differences in mathematics perfor-
mance, even among college students or college-bound students,
are at most moderate. Thus, in explaining the lesser presence
of women in college-level mathematics courses and in mathe-
matics-related occupations, we must look to other factors, such
as internalized belief systems about mathematics, external fac-
tors such as sex discrimination in education and in employ-
ment (Kimball, 1989), and the mathematics curriculum at the
precollege level.
References
Anastasi, A. (1958). Differential psychology (3rd ed.). New York: Macmillan.
Benbow, C. P., & Stanley, J. C. (1980). Sex differences in mathematical
ability: Fact or artifact? Science, 210, 1262-1264.
Benbow, C. P., & Stanley, J. C. (1983). Sex differences in mathematical
reasoning ability: More facts. Science, 222, 1029-1031.
Cohen, J. (1969). Statistical power analysis for the behavioral sciences.
New York: Academic Press.
College Entrance Examination Board. (1986). 10 SATs. New York: Au-
thor.
Donlon, T. F. (1973). Content factors in sex differences on test questions
(ETS RB 73-28). Princeton, NJ: Educational Testing Service.
Dossey, J. A., Mullis, I. V. S., Lindquist, M. M., & Chambers, D. L.
(1988). The mathematics report card: Are we measuring up? (1986
National Assessment of Educational Progress Report No. 17-M-01).
Princeton, NJ: Educational Testing Service.
Eagly, A. H., & Crowley, M. (1986). Gender and helping behavior: A
meta-analytic review of the social psychological literature. Psycholog-
ical Bulletin, 100, 283-308.
Eagly, A. H., & Steffen, V. J. (1986). Gender and aggressive behavior: A
meta-analytic review of the social psychological literature. Psychological
Bulletin, 100, 309-330.
Eccles, J. S. (1987). Gender roles and women's achievement-related de-
cisions. Psychology of Women Quarterly, 11, 135-172.
Ekstrom, R., Goertz, M. E., Pollack, J. M., & Rock, D. A. (1986). Who
drops out of high school and why? Findings from a national study.
Teachers College Record, 87, 356-373.
Feingold, A. (1988). Cognitive gender differences are disappearing.
American Psychologist, 43, 95-103.
Fennema, E. (1974). Mathematics learning and the sexes. Journal for
Research in Mathematics Education, 5, 126-129.
Fennema, E., & Carpenter, T. P. (1981). Sex-related differences in math-
ematics: Results from the National Assessment. Mathematics
Teacher, 74, 554-559.
Fennema, E., & Peterson, P. (1985). Autonomous learning behavior: A
possible explanation of gender-related differences in mathematics. In
L. S. Wilkinson & C. B. Marrett (Eds.), Gender influences in class-
room interaction (pp. 17-36). New York: Academic Press.
Halpern, D. F. (1986). Sex differences in cognitive abilities. Hillsdale,
NJ: Erlbaum.
Hedges, L. V. (1981). Distribution theory for Glass's estimator of effect
size and related estimators. Journal of Educational Statistics, 7, 119-
137.
Hedges, L. V., & Becker, B. J. (1986). Statistical methods in the meta-
analysis of research on gender differences. In J. S. Hyde & M. C. Linn
(Eds.), The psychology of gender: Advances through meta-analysis
(pp. 14-50). Baltimore: Johns Hopkins University Press.
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis.
New York: Academic Press.
Hunter, J. E., Schmidt, F. L., & Jackson, G. B. (1982). Meta-analysis:
Cumulating research findings across studies. Beverly Hills, CA: Sage.
Hyde, J. S. (1981). How large are cognitive gender differences? A meta-analysis
using ω² and d. American Psychologist, 36, 892-901.
Hyde, J. S. (1984). How large are gender differences in aggression? A
developmental meta-analysis. Developmental Psychology, 20, 722-
736.
Hyde, J. S., & Linn, M. C. (Eds.). (1986). The psychology of gender:
Advances through meta-analysis. Baltimore: Johns Hopkins Univer-
sity Press.
Hyde, J. S., & Linn, M. C. (1988). Gender differences in verbal ability:
A meta-analysis. Psychological Bulletin, 104, 53-69.
Kimball, M. M. (1989). A new perspective on women's math achieve-
ment. Psychological Bulletin, 105, 198-214.
Linn, M. C., & Hyde, J. S. (in press). Trends in cognitive and psychosocial
gender differences. In R. M. Lerner, A. C. Petersen, & J. Brooks-Gunn
(Eds.), The encyclopedia of adolescence. New York: Garland
Publishing.
Linn, M. C., & Petersen, A. C. (1985). Emergence and characterization
of sex differences in spatial ability: A meta-analysis. Child Develop-
ment, 56, 1479-1498.
Maccoby, E. E., & Jacklin, C. N. (1974). The psychology of sex differ-
ences. Stanford, CA: Stanford University Press.
Meece, J. L., (Eccles) Parsons, J., Kaczala, C. M., Goff, S. B., & Futter-
man, R. (1982). Sex differences in math achievement: Toward a
model of academic choice. Psychological Bulletin, 91, 324-348.
Meyer, M. R. (in press). Gender differences in mathematics. In M. M.
Lindquist (Ed.), Results from the fourth mathematics assessment of
the National Assessment of Educational Progress. Reston, VA: Na-
tional Council of Teachers of Mathematics.
National Council of Teachers of Mathematics. (1988). Curriculum and
evaluation standards for school mathematics. Reston, VA: Author.
Ramist, L., & Arbeiter, S. (1986). Profiles, college-bound seniors, 1985.
New York: College Entrance Examination Board.
Rosenthal, R., & Rubin, D. B. (1982a). Comparing effect sizes of inde-
pendent studies. Psychological Bulletin, 92, 500-504.
Rosenthal, R., & Rubin, D. B. (1982b). A simple, general purpose dis-
play of magnitude of experimental effect. Journal of Educational Psy-
chology, 74, 166-169.
Scott, W. A. (1955). Reliability of content analysis: The case of nominal
scale coding. Public Opinion Quarterly, 19, 321-325.
Selkow, P. (1984). Assessing sex bias in testing. Westport, CT: Green-
wood Press.
Sells, L. W. (1973). High school mathematics as the critical filter in the
job market. In R. T. Thomas (Ed.), Developing opportunities for mi-
norities in graduate education (pp. 37-39). Berkeley: University of
California Press.
Smith, M. L., & Glass, G. V. (1977). Meta-analysis of psychotherapy
outcome studies. American Psychologist, 32, 752-760.
Stage, E. K., Kreinberg, N., Eccles, J. R., & Becker, J. R. (1985). In-
creasing the participation and achievement of girls and women in
mathematics, science, and engineering. In S. S. Klein (Ed.), Hand-
book for achieving sex equity through education (pp. 237-269). Balti-
more: Johns Hopkins University Press.
Zwick, R. (1988). Another look at interrater agreement. Psychological
Bulletin, 103, 374-378.
Appendix
Studies Used in the Meta-Analysis
Alabama Department of Education. (1986-1987). [State mathematics
assessment]. Personal communication: Rex C. Jones.
American College Testing Program. (1987). State and national trend
data for students who take the ACT assessment. Iowa City, Iowa: Author.
Backman, M. E. (1972). Patterns of mental abilities: Ethnic, socioeco-
nomic, and sex differences. American Educational Research Journal,
9, 1-12.
Behrens, L. T., & Vernon, P. E. (1978). Personality correlates of over-
achievement and under-achievement. British Journal of Educational
Psychology, 48, 290-297.
Bell, C., & Ward, G. R. (1980). An investigation of the relationship
between dimensions of self concept (DOSC) and achievement in
mathematics. Adolescence, 15, 895-901.
Benbow, C. P., & Stanley, J. C. (1980). Sex differences in mathematical
ability: Fact or artifact? Science, 210, 1262-1264.
Benbow, C. P., & Stanley, J. C. (1983). Sex differences in mathematical
reasoning ability: More facts. Science, 222, 1029-1031.
Bennett, G. K., Seashore, H. G., & Wesman, A. G. (1979). Differential
aptitude tests: Fifth edition manual. New York: Psychological Corporation.
Boli, J., Allen, M. L., & Payne, A. (1985). High-ability women and men
in undergraduate mathematics and chemistry courses. American Educational
Research Journal, 22, 605-626.
Brandon, P. R. (1985, April). The superiority of girls over boys in mathe-
matics achievement in Hawaii. Paper presented at the 69th annual
meeting of the American Educational Research Association, Chi-
cago, IL. (ERIC Document Reproduction Service No. 260 906)
Carrier, C., Post, T. R., & Heck, W. (1985). Using microcomputers with
fourth-grade students to reinforce arithmetic skills. Journal for Re-
search in Mathematics Education, 16, 45-51.
Connecticut Department of Education. (1987). [State mathematics as-
sessment data]. Unpublished raw data.
Connor, J. M., & Serbin, L. A. (1980). Mathematics, visual-spatial abil-
ity, and sex roles (Final Report). Washington, DC: National Institute
of Education. (ERIC Document Reproduction Service No. 205 385)
D'Augustine, C. H. (1966). Factors relating to achievement with se-
lected topics in geometry and topology. The Arithmetic Teacher, 13,
192-197.
Davis, E. J. (1973). A study of the ability of school pupils to perceive
and identify the plane sections of selected solid figures. Journal for
Research in Mathematics Education, 4, 132-140.
Dees, R. L. (1982). Sex differences in geometry achievement. Paper pre-
sented at the annual meeting of the American Educational Research
Association, New York, NY. (ERIC Document Reproduction Service
No. 215 873)
De Wolf, V. A. (1981). High school mathematics preparation and sex
differences in quantitative abilities. Psychology of Women Quarterly,
5, 555-567.
Dick, T. P., & Balomenos, R. H. (1984). An investigation of calculus
learning using factorial modeling. Paper presented at the 68th annual
meeting of the American Educational Research Association, New Or-
leans, LA. (ERIC Document Reproduction Service No. 245 033)
Dossey, J. A., Mullis, I. V. S., Lindquist, M. M., & Chambers, D. L.
(1988). The mathematics report card: Are we measuring up? (1986
National Assessment of Educational Progress Report No. 17-M-01).
Princeton, NJ: Educational Testing Service.
Edge, O. P., & Friedberg, S. H. (1984). Factors affecting achievement in
the first course in calculus. Journal of Experimental Education, 52,
136-140.
Educational Testing Service. (1987). A summary of data collected from
Graduate Record Examinations test-takers during 1985-86.
Princeton, NJ: Author.
Engle, C. D., & Lerch, H. H. (1971). A comparison of first-grade chil-
dren's abilities on two types of arithmetical practice exercises. School
Science and Mathematics, 71, 327-334.
Ethington, C. A., & Wolfle, L. M. (1984). Sex differences in a causal
model of mathematics achievement. Journal for Research