Learning and Instruction 16 (2006) 526–537
www.elsevier.com/locate/learninstruc

Multimedia learning: Working memory and the learning of word and picture diagrams

Stephan Dutke a,*, Mike Rinck b

a University of Kaiserslautern, Department of Psychology, Pfaffenbergstr. 95, D-67663 Kaiserslautern, Germany
b Radboud University Nijmegen, Clinical Psychology and Behavioural Science Institute, PO Box 9104, 6500 HE Nijmegen, The Netherlands

Abstract

From the cognitive model of multimedia learning proposed by [Schnotz, W., & Bannert, M. (2003). Construction and interference in learning from multiple representation. Learning and Instruction, 13, 141–156], two hypotheses regarding the learning of spatial arrangements of objects were derived: the integration hypothesis and the multiple source hypothesis. In the experiment, ninety-six participants first studied spatial arrangements of five objects each. The complete arrangements had to be inferred from pairs of objects, because participants were shown either word pairs or picture pairs depicting adjacent objects. Afterwards, they were tested using either object pairs or complete arrangements, and the test items consisted either of words or of pictures. In addition, the participants were divided into four groups according to their verbal and visuospatial working memory capacity. The results showed (a) that integrating pairs of objects into complete spatial arrangements required more working memory resources than evaluating the pairs, irrespective of the objects represented by words or pictures, (b) that integration of elements from different sources (verbal descriptions and pictorial depictions) required more working memory resources than integrating only depictive elements. The results yield evidence for the proposed internal structure of Schnotz and Bannert’s model. The results are discussed with regard to individual differences in working memory capacity, cognitive load and the design of multimedia-supported learning tasks.
© 2006 Elsevier Ltd. All rights reserved.

Keywords: Multimedia learning; Working memory; Comprehension of text and graphics

Recently, Schnotz and Bannert (2003) proposed a cognitive model of multimedia learning, which integrates a considerable amount of empirical findings from the text and picture comprehension literature. So far, the model has been evaluated primarily using learning performance data (Schnotz & Bannert, 1999, 2003). Designing multimedia learning environments, however, is aimed not only at enhancing learning results but also at optimizing learning efficiency (e.g., Mayer & Moreno, 2003; Paas, Renkl, & Sweller, 2003). Optimizing efficiency requires data about learning results and the cognitive resources that have to be invested to achieve these results. The amount of working memory resources required to accomplish a learning task is especially critical because working memory resources are assumed to be strictly limited (e.g., Baddeley, 1986; Engle, Cantor, & Carullo, 1992; Mayer, 2003). For reasons explained below, Schnotz and Bannert’s model is especially suitable for deriving hypotheses on the working memory demands of learning from verbal and pictorial materials.

* Corresponding author. Tel.: +49 631 205 2721; fax: +49 631 205 3910. E-mail address: dutke@rhrk.uni-kl.de (S. Dutke).
0959-4752/$ - see front matter © 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.learninstruc.2006.10.002

Schnotz and Bannert’s (2003) model of multimedia learning is aimed at explaining how information from different external representations is integrated. It consists of a descriptive and a depictive branch of processes (see Fig. 1).
The descriptive branch comprises processes of symbol analysis that first construct a surface representation and then a propositional representation of the externally represented text. The depictive branch comprises analog structure mapping processes that first construct a visual image and then a mental model of the externally represented picture or diagram. As in pure (multi-level) text comprehension theories (e.g., van Dijk & Kintsch, 1983; Gernsbacher, 1990; Graesser, Millis, & Zwaan, 1997; Graesser, Singer, & Trabasso, 1994; Johnson-Laird, 1983; Kintsch, 1998; Zwaan, Langston, & Graesser, 1995), the mental model is assumed to be constructed on the basis of the surface representation and the propositional representation of the text. In contrast to these representations, the mental model does not represent features of the text itself, but of the entities the text is referring to (Glenberg, Meyer, & Lindem, 1987). Going beyond the scope of pure text comprehension models, Schnotz and Bannert assume a third source of information for the construction of the mental model: the visual image generated from the external picture or diagram. In line with Johnson-Laird’s theory (1983, 1996), the mental model differs from the visual image in that (a) the mental model is not bound to specific sensory modalities, (b) not all graphical or pictorial elements in the visual image are mapped onto the mental model, but only task-relevant elements, and (c) the mental model is enriched by general knowledge. To summarize, the mental model is the representation that integrates propositions from the text base, pictorial elements from the visual image, and general world knowledge into a new, coherent structure representing the entities that text and picture are jointly referring to. This integration process is assumed to require working memory resources.
Correlations between reading comprehension and different working memory span measures have been reported for children (De Jonge & De Jonge, 1996) as well as for adults (Hacker & Osterland, 1995). Individuals with higher reading abilities were shown to perform working memory updating processes more reliably than low ability readers (Palladino, Cornoldi, De Beni, & Pazzaglia, 2001). Moreover, with regard to mental model construction, Friedman and Miyake (2000) demonstrated that the efficiency of evaluating text probes requiring spatial inferences showed higher correlations with spatial working memory span than with verbal working memory span, whereas the opposite was shown for text probes requiring nonspatial inferences. To summarize, the construction, updating, and use of mental models seem to require quite different types of working memory resources.

Fig. 1. A cognitive model of multimedia learning (Schnotz & Bannert, 2003, p. 145). Copyright Pergamon Press, reprinted with permission.

From Schnotz and Bannert’s model, at least two hypotheses about the amount of working memory resources required in multimedia comprehension may be derived. We refer to the first one as the integration hypothesis. It states that a learning task involving the integration of several propositions into a coherent mental model requires more verbal working memory resources than a task involving recognition of single propositions. A corresponding prediction applies to pictorial information: a task involving the integration of several pictorial elements into a coherent mental model requires more visuospatial working memory resources than a task involving recognition of single pictorial elements.
The integration hypothesis follows from Schnotz and Bannert’s model for two reasons: First, Schnotz and Bannert (2003) assume that comprehension is a continuous process in which mental structures are constructed step by step in the learning process and are updated by currently processed information (verbal or pictorial). This integration process requires old and new information to be simultaneously available, which taxes working memory resources. Availability of the integrated mental model requires not only retrieval of a particular proposition or pictorial element, but also retrieving its (inferred) relation(s) to other elements. Although this argument clearly corresponds to Schnotz and Bannert’s model, it also fits several other mental model theories of comprehension (e.g., van Dijk & Kintsch, 1983; Gernsbacher, 1990; Graesser et al., 1994, 1997; Johnson-Laird, 1983; Kintsch, 1998; Zwaan et al., 1995).

Another argument, however, is unique to Schnotz and Bannert’s model. Their model consists of two different processing branches specialized to processing of different types of representations. Based on a concept of working memory which differentiates subsystems suitable for manipulating verbal (symbolic) and visuospatial (analog) representations (e.g., Baddeley, 1986, 1992), it is concluded that integration processes in the descriptive branch primarily tax verbal working memory resources and that integration processes in the depictive branch primarily tax visuospatial working memory resources. This argument is unique to Schnotz and Bannert’s model because pure text comprehension theories as well as multimedia-oriented theories such as Sweller’s (1994) cognitive load theory lack the differentiation of representational formats in processing verbal and pictorial materials.
Mayer’s cognitive theory of multimedia learning (Mayer & Moreno, 2002, 2003) has an architecture similar to that of Schnotz and Bannert’s model in that it distinguishes two processing “channels”, one for words or text and one for pictures (Fig. 2). However, the distinction between the “verbal-auditory channel” and the “visual-pictorial channel” is based on a combination of modality (how the information is perceived) and representational format (how the perceived information is externally represented). In contrast, Schnotz and Bannert (2003) explicitly stated that the descriptive and depictive branch of their model are specialized for processing information represented in specific formats irrespective of the modality in which this information is perceived. This conception corresponds more closely to the distinction of verbal and visuospatial working memory resources than Mayer’s theory.

Fig. 2. A cognitive theory of multimedia learning (Mayer & Moreno, 2002, p. 111). Copyright Pergamon Press, reprinted with permission.

The second hypothesis (multiple source hypothesis) refers to the interaction of the descriptive and the depictive branch in the Schnotz and Bannert model. This feature can be best explained by comparing Schnotz and Bannert’s model to Mayer and Moreno’s (2002) cognitive theory of multimedia learning (cf. Fig. 2). Mayer and Moreno assume that processing in both channels results in two mental models, a “verbal mental model” and a “visual mental model”. In both channels, information is processed independently until the two mental models are established. Referential connections between the models are constructed only at this level of processing. In contrast, Schnotz and Bannert (2003) assume that comprehension results in only one mental model constructed from elements of the visual image and the propositional representation.
Even the text surface representation may contribute to the construction of a referential mental model (see Fig. 1). Thus, although the Schnotz and Bannert model assumes two processing branches, both branches work independently only at their most basic level, that is, in subsemantic processing of verbal materials and perceptual processing of pictorial materials. At the higher levels, however, Schnotz and Bannert expect the depictive and descriptive branches to interact, most intensively in creating a common modality-unspecific mental model. Interrelating the two processing branches, however, produces coordination costs absorbing working memory resources (Baddeley, 1986; Hagendorf & Sá, 1996; Oberauer, 1993). This assumption leads to the hypothesis that integrating elements from different sources (verbal descriptions and pictorial depictions) requires more working memory resources than integrating elements from one source.

The multiple source hypothesis can be studied with diagrams in which either words or other textual elements are depicted in a specific spatial relation to each other (“word diagrams”, see Fig. 3) or in which pictures are depicted in a specific spatial relation to each other (“picture diagrams”, see Fig. 4). Learning word diagrams requires (a) the analysis of symbol structures (descriptive branch) to construct a surface representation and (b) structure mapping processes (depictive branch) to map the spatial relation between the objects onto the mental model. Processing in both branches needs to be highly interrelated, because mapping the spatial relations depicted in the diagram onto the mental model requires a representation of the elements (denoted by words) that constitute the spatial relation. Because of this interrelatedness, working memory demands are assumed to be high. In learning picture diagrams, processing is restricted to the depictive branch.
Because no costs of interrelating descriptive and depictive processing emerge, working memory demands are predicted to be lower than for word diagrams. Mayer and Moreno’s (2002) theory would not predict this difference. In learning word diagrams, processing would stop at the level of the word base. According to their theory, a verbal mental model cannot be constructed because the words alone (in a word diagram) provide no information about the spatial relation between the denoted objects. The spatial relation is processed in the visual-pictorial channel and is finally (and solely) represented in the visual mental model. Thus, no coordination costs can emerge between the processing channels, neither with word diagrams nor with picture diagrams. Note that corresponding word and picture diagrams have the same informational content and involve the same modality. They differ only with regard to the type of internal representations required to build a meaningful mental model.

The integration hypothesis can also be tested with these diagrams. In this study, participants learned (a) elements of word diagrams (two words denoting objects in a specific spatial relation) and (b) elements of picture diagrams (a color drawing depicting two objects in a specific spatial relation), whereas the complete object arrangement consisted of five words or pictures. However, during the learning phase the complete arrangement was never shown to the participants. After learning all elements of an object arrangement separately, participants were tested for recognition of (a) the presented elements and (b) the complete, integrated object arrangement. Both elements and integrated arrangements were presented as word diagrams or as picture diagrams. According to the integration hypothesis, testing of integrated object arrangements will yield longer recognition times than testing of object pairs, irrespective of whether they are tested with word or picture diagrams.
According to the multiple source hypothesis, testing of word diagrams will cause longer recognition times than testing of picture diagrams, irrespective of whether they are tested with elements or with complete object arrangements.

The hypothesis that increased recognition times are due to increased demands on working memory was tested by differentiating between individuals with higher and lower verbal or visuospatial working memory capacity. It was expected that the hypothesized differences between recognizing (a) elements vs. integrated arrangements and (b) word diagrams vs. picture diagrams should be greater for participants with lower working memory resources. Thus, performance differences in individuals differing with regard to their working memory resources will be interpreted as indicating different working memory demands.

[Word diagram showing the words Strawberries, Apple, Banana, Pear, and Pineapple]
Fig. 3. A sample arrangement of objects denoted by words (word diagram). Note: Complete arrangements were used for testing only; they were not studied by the participants. Original words were in German.

1. Method

1.1. Participants

Ninety-six students (37 men, 59 women) from various departments of Dresden University of Technology participated in the experiment, either for course credit or for a small monetary compensation. Depending on their verbal working memory capacity and their visuospatial working memory capacity, the participants were divided into four groups of comparable size (between n = 23 and n = 26, low vs. high verbal capacity combined with low vs. high visuospatial capacity).

1.2. Study materials

A total of seven spatial arrangements were created for the experiment (a practice arrangement and six experimental ones). Each spatial arrangement consisted of five objects located at five of the six possible positions created by a matrix of two rows and three columns. Fig. 3 shows a sample experimental arrangement of fruit items denoted by words. The other experimental arrangements contained desk items, toys, animals, musical instruments, or tools. Across the six experimental arrangements, each of the six possible positions remained unfilled exactly once. There were two versions of each arrangement, yielding word diagrams such as the one shown in Fig. 3 and structurally equivalent picture diagrams, in which the words were replaced by simple color drawings (see Fig. 4). It is important to note, however, that during the study phase of the experiment, participants never saw the complete arrangements. Instead, for each arrangement, four pairs of adjacent objects were shown to them, and they had to infer the complete spatial arrangement from these four pairs. For half of the arrangements, all pairs contained words; for the rest, all pairs showed colored pictures.

Fig. 4. A sample arrangement of objects denoted by pictures (picture diagram), equivalent to the word diagram shown in Fig. 3. Note: Complete arrangements were used for testing only; they were not studied by the participants. Original pictures were in color.

1.3. Test materials

Directly after learning of each arrangement, 24 test items were presented to assess memory for the arrangement just studied. Six test items each consisted of word pairs, picture pairs, complete word arrangements (Fig. 3), and complete picture arrangements (Fig. 4), irrespective of whether the participant had just studied pictures or words. Of the six picture pairs shown, three were correct and three were incorrect. Of the six complete picture arrangements, one was the correct one, and the others were incorrect. The same was true for word pairs and complete word arrangements, respectively.

1.4. Working memory tests

The reading span test (RST) employed in this study is similar to the sentence span test introduced by Daneman and Carpenter (1980). Unlike the sentence span test, however, the RST ensures that both the processing function and the memory function of verbal working memory are tapped by the task. In the RST, single sentences are presented to the participants. Each sentence is shown for 5 s, and the participant has to understand the sentence as well as memorize its final word. The RST starts with what is called level 2 of difficulty: two sentences are presented successively, and the participant first reads both sentences. Then he or she writes down the meanings and the final words on a sheet of paper. For the meaning, a few keywords are sufficient, whereas the final word has to be recalled literally. The RST starts with five trials at level 2 and becomes increasingly difficult, until five trials each at levels 2, 3, 4, 5, and 6 are completed (if performance breaks down earlier, the test may be aborted). Each sentence for which the participant gets both the meaning and the final word correct earns a point, yielding a maximum of 100 points in the RST (5 trials at each level, i.e., 5 × (2 + 3 + 4 + 5 + 6) = 100 sentences).

The spatial span test (SST) followed the procedure introduced by Shah and Miyake (1996). It is formally equivalent to the RST in that it assesses both processing and retention of information, in this case visuospatial information. In the SST, participants receive sequences of mental rotation tasks. In its easiest version, a single letter is presented for 3 s. The letter is either correct or mirror imaged, and it is rotated around its vertical axis by 0, 45, 90, 135, 180, 225, 270, or 315°. Within the 3-s limit, the participant has to indicate by pressing a key whether the letter is correctly printed, and simultaneously memorize the rotation angle of the letter.
For instance, if the correct letter F is presented rotated by 180°, the participant has to respond by pressing the “correct” key on the computer keyboard and by memorizing that the “head” of the letter (the upper horizontal line of the F) points exactly downwards. This direction is then written down on an answer sheet. At level 2 of the SST, two letters are presented successively, and the participant first judges whether they are correctly printed and then writes down the directions of the letters’ heads. At levels 3, 4, and 5, the number of letters presented on each trial is increased correspondingly. The test starts with level 2 and becomes increasingly difficult. In the SST used here, each participant received a maximum of 20 trials (levels 2, 3, 4, and 5 five times each). Each time the participant responds correctly to the mirror question and indicates the direction of this letter correctly, he or she receives a point, yielding a maximum of 70 points in the SST (5 × (2 + 3 + 4 + 5) = 70 letters).

1.5. Procedure

The study consisted of two separate sessions on consecutive days. During the first session, the experiment was conducted. The participants were informed that they would learn five spatial arrangements of objects (the practice arrangement followed by four experimental ones selected randomly from the six existing ones). The practice arrangement was identical for all participants, whereas the experimental arrangements differed according to a rotation scheme which ensured that each arrangement was presented equally often across participants. As mentioned above, participants only saw four adjacent pairs of objects for each arrangement, and they had to infer the complete spatial arrangement from these four pairs. The pairs contained either two words or two colored pictures, and each pair was presented in the center of the computer screen for 3 s. Thus, the complete arrangement could not be inferred from the absolute positions of objects on the screen.
The order of the four pairs varied between arrangements, but for each arrangement, the order was identical for all participants. The orders were chosen such that argument overlap between consecutive pairs was maximized (e.g., for the sample arrangement shown in Figs. 3 and 4, the order was strawberries–apple, apple–banana, banana–pineapple, pineapple–pear). Pilot tests had shown that other orders were too difficult for many potential participants. For each of the five arrangements shown to each participant, the study phase was followed directly by the test phase. In this phase, the 24 test items were presented in random order. For each test item, participants had to indicate correctly and as quickly as possible whether the depicted spatial relations were correct. No feedback was given. After completion of the test phase, participants were free to either take a short break or continue with the next arrangement. It took participants about 60 min to complete the experiment.

During the second session on the next day, participants took the reading span test and the spatial span test, designed to measure verbal working memory capacity and visuospatial working memory capacity, respectively. The order of the two tests was counterbalanced across participants. It took participants about 45 min to complete both working memory tests. Afterwards, they were debriefed, thanked, and compensated for their participation.

1.6. Design

The experiment followed a mixed design with the within-subjects factors ‘study materials’ (pictures vs. words), ‘test materials’ (pictures vs. words), and ‘test complexity’ (object pairs vs. complete arrangements).
‘Verbal working memory capacity’ and ‘visuospatial working memory capacity’ were used as between-subjects factors by creating four groups (verbal capacity low or high and visuospatial capacity low or high, defined by median splits according to the participants’ scores in the RST and the SST). The study materials were varied across arrangements, such that each participant studied two arrangements of pictures and two arrangements of words. Test materials and test complexity were varied within arrangements, such that each arrangement was tested with picture pairs, word pairs, complete picture arrangements, and complete word arrangements. Reaction times to test probes in the test phase of the experiment and error rates of these reactions were recorded as dependent variables.

2. Results

2.1. Preliminary analyses

Both the reading span test (RST) and the spatial span test (SST) yielded considerable variance in the participants’ scores, allowing for the separation of low vs. high-capacity participants (ranges 4–67 on the RST and 0–53 on the SST). For the median splits, the limits were set between 22 and 23 for the RST, and between 9 and 10 for the SST, in order to yield four groups of comparable sample size. The mean scores and sample sizes for each of the four groups are shown in Table 1. RST scores and SST scores were only marginally correlated (r = .17, p = .09).

Reaction times (RTs) to test probes were subjected to analyses of variance (ANOVA) after outlier RTs and RTs of incorrect responses had been excluded from the analyses. Outliers were defined as the lower and upper 2% of the correct RTs in each group, respectively. The RTs were analyzed according to a mixed-factors 2 × 2 × 2 × 2 × 2 ANOVA with the factors reading span group, spatial span group, study materials, test materials, and test complexity. Mean RTs and standard deviations for this analysis are shown in Table 2. Corresponding analyses were computed for the error rates depicted in Table 3.
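As a minimal sketch of the two preprocessing steps just described, the following Python code assigns a participant to one of the four capacity groups and trims the fastest and slowest 2% of correct RTs. The cutoffs come from the text. The function names, the rounding rule at the 2% boundary, and the question of which RTs are pooled before trimming are assumptions, since the article does not specify them.

```python
RST_CUTOFF = 22  # limit set between 22 and 23: scores of 22 or less are "low"
SST_CUTOFF = 9   # limit set between 9 and 10: scores of 9 or less are "low"


def span_group(rst_score, sst_score):
    """Return the median-split group label for one participant."""
    verbal = "high" if rst_score > RST_CUTOFF else "low"
    spatial = "high" if sst_score > SST_CUTOFF else "low"
    return f"verbal {verbal}, spatial {spatial}"


def trim_outliers(correct_rts, percent=2):
    """Drop the lowest and highest `percent` of the given correct RTs.

    Assumption: the number of RTs cut at each end is rounded down.
    """
    ordered = sorted(correct_rts)
    cut = len(ordered) * percent // 100
    return ordered[cut:len(ordered) - cut]


print(span_group(22, 10))
print(len(trim_outliers(range(100))))
```

With 100 correct RTs, the sketch removes two values at each end and keeps 96. In an actual analysis the trimming would be applied separately to whatever set of RTs "each group" denotes.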
The error-rate analyses yielded very similar results, although as expected, the observed effects were generally smaller for error rates than for RTs. Moreover, there was no indication of any speed-accuracy trade-offs in the data. Therefore, we only report the results for RTs.

Verbal and visuospatial capacities yielded the expected main effects on RTs: participants with higher verbal capacity responded more quickly than those with lower capacity [3182 vs. 3826 ms; F(1,92) = 13.99, p < .001], and the same was true for participants with higher visuospatial capacity compared to those with lower capacity [3158 vs. 3913 ms; F(1,92) = 16.59, p < .001]. These beneficial effects were additive, as there was no interaction of verbal and visuospatial capacity, F(1,92) < 1. A marginally significant main effect of study materials on RTs in the test phase [F(1,92) = 3.72, p = .057] suggested that studying pictures was easier than studying words. This was true for all participant groups, as there was no interaction of study materials with either reading span [F(1,92) < 1] or spatial span [F(1,92) = 2.67, p = .106].

Table 1
Mean scores in the reading span test and the spatial span test for the four participant groups defined by median splits

| Test | Verbal low, spatial low (n = 26) | Verbal low, spatial high (n = 24) | Verbal high, spatial low (n = 23) | Verbal high, spatial high (n = 23) |
|---|---|---|---|---|
| Reading span test | 13.5 (6.0) | 17.4 (4.7) | 33.6 (8.9) | 35.9 (11.0) |
| Spatial span test | 3.7 (3.0) | 20.6 (9.9) | 3.4 (2.9) | 23.0 (10.4) |

2.2. Hypothesis testing

The integration hypothesis predicts that complete arrangements should be more demanding to evaluate than pairs. Indeed, the main effect of test complexity was highly significant [F(1,92) = 94.20, p < .001]. Participants were faster to judge object pairs than complete arrangements.
As expected, this effect interacted with both verbal span [F(1,92) = 11.80, p < .01] and spatial span [F(1,92) = 5.25, p < .05]: The disadvantage for complete arrangements decreased with increasing verbal and visuospatial resources, and it almost vanished for participants who scored high in both working memory tests (see Fig. 5). Consequently, these latter participants showed only a marginally significant disadvantage for tests of complete arrangements [F(1,22) = 4.24, p = .051], whereas the other three groups showed highly significant disadvantages [all F > 22; all p < .001].

The multiple source hypothesis predicts that evaluating word diagrams should be more demanding than picture diagrams. Indeed, participants responded more quickly to pictures than to words, yielding a significant main effect of test materials [F(1,92) = 27.49, p < .001]. We also observed a marginally significant interaction of test materials and verbal span [F(1,92) = 3.79, p = .055]. The disadvantage for words was smaller for participants with higher verbal working memory resources. However, no interaction of test materials and spatial span was observed [F(1,92) < 1]. This pattern of interactions is illustrated by Fig. 6, and it was corroborated by post hoc tests: both groups with low verbal span showed a highly significant disadvantage for words [both F > 8.7, both p < .01], whereas the two groups with high verbal span did not [both F < 2.87, n.s.]. Finally, a study-test compatibility effect occurred: RTs were shorter when study materials and test materials (pictures vs. words) were identical, yielding a significant interaction [F(1,92) = 54.73, p < .001].

Table 2
Mean RTs in ms in the test phase (with standard deviations), broken down by reading span group, spatial span group, test complexity, study materials, and test materials

| Group | Test complexity | Study pictures, test pictures | Study pictures, test words | Study words, test pictures | Study words, test words |
|---|---|---|---|---|---|
| Verbal low, spatial low | Test object pairs | 3012 (944) | 3898 (1316) | 3535 (1308) | 3788 (1353) |
| Verbal low, spatial low | Test arrangements | 4111 (1934) | 5270 (2370) | 5184 (2452) | 5234 (2408) |
| Verbal low, spatial high | Test object pairs | 2546 (813) | 3222 (1170) | 2852 (806) | 3224 (1269) |
| Verbal low, spatial high | Test arrangements | 3406 (1444) | 4418 (2181) | 4110 (1721) | 3952 (1754) |
| Verbal high, spatial low | Test object pairs | 2618 (1024) | 3305 (1052) | 3223 (1383) | 3357 (1210) |
| Verbal high, spatial low | Test arrangements | 3593 (1627) | 3912 (1333) | 4164 (2087) | 4040 (2256) |
| Verbal high, spatial high | Test object pairs | 2472 (1005) | 2825 (969) | 2579 (954) | 2783 (961) |
| Verbal high, spatial high | Test arrangements | 2791 (1125) | 3418 (1715) | 3039 (1659) | 2790 (1455) |

Table 3
Mean percent error in the test phase (with standard deviations), broken down by reading span group, spatial span group, test complexity, study materials, and test materials

| Group | Test complexity | Study pictures, test pictures | Study pictures, test words | Study words, test pictures | Study words, test words |
|---|---|---|---|---|---|
| Verbal low, spatial low | Test object pairs | 5.9 (9.5) | 5.0 (8.5) | 10.1 (12.5) | 9.5 (12.0) |
| Verbal low, spatial low | Test arrangements | 4.8 (8.1) | 6.6 (8.8) | 7.1 (8.1) | 6.9 (7.9) |
| Verbal low, spatial high | Test object pairs | 2.3 (5.9) | 3.1 (6.6) | 4.3 (8.4) | 4.3 (8.6) |
| Verbal low, spatial high | Test arrangements | 2.4 (5.7) | 4.3 (6.7) | 4.2 (6.2) | 3.0 (5.6) |
| Verbal high, spatial low | Test object pairs | 2.0 (5.3) | 2.5 (5.5) | 6.0 (10.2) | 4.9 (9.1) |
| Verbal high, spatial low | Test arrangements | 2.2 (5.7) | 2.5 (7.0) | 4.4 (7.6) | 4.2 (5.8) |
| Verbal high, spatial high | Test object pairs | 1.8 (5.8) | 2.7 (5.8) | 3.8 (7.8) | 3.4 (7.8) |
| Verbal high, spatial high | Test arrangements | 1.3 (3.5) | 1.1 (3.3) | 2.5 (5.2) | 3.1 (5.7) |

3. Discussion

The present experiment was designed to test hypotheses about the demands on working memory resources in integrating verbal and/or pictorial information into a coherent mental model. As predicted in the integration hypothesis, evaluating complete spatial arrangements of words or pictures denoting objects required more working memory resources than evaluating pairs of words or pictures.
Whereas a main effect with longer reaction times for complete arrangements may be predicted from other theories of human memory as well, we also found that test complexity interacted with both reading span and spatial span: integrating five objects into a single representation was most difficult for participants with lower verbal capacity and lower visuospatial capacity. This result corresponds specifically to Schnotz and Bannert’s (2003) model of multimedia learning because the interaction between test complexity and working memory capacity held for both the descriptive and the depictive processing branch: study and test materials involved words and pictures, and test complexity interacted with verbal and visuospatial working memory capacity. As far as we know, this result is new, because it is based on a full combination of verbal vs. pictorial elements in the study and the test phase. As assumed by Schnotz and Bannert (2003), learning descriptive and depictive external representations of spatial arrangements of objects seems to require a stepwise and capacity-consuming construction process.

Fig. 5. Mean reaction times in ms depending on test complexity (complete arrangements vs. object pairs), verbal span, and spatial span.

Fig. 6. Mean reaction times in ms depending on test materials (pictures vs. words), verbal span, and spatial span.

The results also provide evidence for the multiple source hypothesis. Participants were faster to judge diagrams (pairs or arrangements) consisting of pictures rather than words.
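The picture advantage, and its dependence on verbal rather than spatial span, can likewise be recomputed from the cell means in Table 2. This is a hedged descriptive sketch under the same assumption of equal cell sizes (unweighted averaging); `RT_BY_GROUP`, `word_cost`, and `mean_cost` are illustrative names that do not appear in the original analysis, and the numbers do not replace the reported inferential tests.

```python
# Cell means (ms) copied from Table 2, keyed by (verbal span, spatial span).
# Each row is (study pictures/test pictures, study pictures/test words,
#              study words/test pictures,   study words/test words);
# the first row holds object pairs, the second complete arrangements.
RT_BY_GROUP = {
    ("low", "low"): [(3012, 3898, 3535, 3788), (4111, 5270, 5184, 5234)],
    ("low", "high"): [(2546, 3222, 2852, 3224), (3406, 4418, 4110, 3952)],
    ("high", "low"): [(2618, 3305, 3223, 3357), (3593, 3912, 4164, 4040)],
    ("high", "high"): [(2472, 2825, 2579, 2783), (2791, 3418, 3039, 2790)],
}

PICTURE_COLUMNS = (0, 2)  # cells tested with pictures
WORD_COLUMNS = (1, 3)     # cells tested with words


def word_cost(rows):
    """Unweighted mean RT difference in ms: word test items minus picture test items."""
    words = [row[i] for row in rows for i in WORD_COLUMNS]
    pictures = [row[i] for row in rows for i in PICTURE_COLUMNS]
    return (sum(words) - sum(pictures)) / len(words)


word_costs = {group: word_cost(rows) for group, rows in RT_BY_GROUP.items()}


def mean_cost(factor_index, level):
    """Average the word cost over the two groups sharing one span level."""
    values = [c for g, c in word_costs.items() if g[factor_index] == level]
    return sum(values) / len(values)


verbal_gap = mean_cost(0, "low") - mean_cost(0, "high")    # effect of verbal span
spatial_gap = mean_cost(1, "low") - mean_cost(1, "high")   # effect of spatial span

print(word_costs)
print(f"verbal span gap: {verbal_gap:.3f} ms, spatial span gap: {spatial_gap:.3f} ms")
```

With the Table 2 values, the word cost is 587.0 ms and 475.5 ms in the two low verbal span groups but only 254.0 ms and 233.75 ms in the two high verbal span groups. Averaged by factor, the cost drops by 287.375 ms from low to high verbal span and by only 65.875 ms from low to high spatial span, in line with the reported pattern of interactions.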
We argued that this effect is specific to Schnotz and Bannert’s model, because processing word diagrams would require the coordination of the depictive and the descriptive processing branch, whereas picture diagrams only tax the depictive branch. Even more corroborative is the finding that the effect of test materials interacted with verbal span, but not with spatial span. This is quite plausible because visuospatial resources are needed for processing both word diagrams and picture diagrams, whereas verbal resources are needed only with word diagrams. Thus, low spatial resources will reduce performance with both materials, whereas low verbal resources can reduce performance only with word diagrams. This pattern fits the architecture of the model proposed by Schnotz and Bannert better than the multimedia learning theory by Mayer and Moreno, because the latter did not specify whether and how the auditory/verbal processing channel and the visual/pictorial processing channel may interact.

In summary, the results reported here supply evidence for two hypotheses derived from the internal structure of Schnotz and Bannert’s cognitive model of multimedia learning. Both the integration hypothesis and the multiple source hypothesis were supported by main effects of experimental variations and by interactions of the experimental factors with verbal and visuospatial working memory capacity. However, the present experiment also provides information about the relation between Schnotz and Bannert’s model and the competing theory of multimedia learning (Mayer & Moreno, 2002, 2003). While in the latter model the auditory/verbal processing channel and the visual/pictorial processing channel are defined by a combination of perceptual modality and representational format, the distinction of the descriptive and depictive processing branch in Schnotz and Bannert’s model is exclusively defined by representational format.
In the present experiment, modality was kept constant: verbal and visuospatial materials were presented visually, in the study phase as well as in the test phase. As all effects are compatible with Schnotz and Bannert’s model although only representational format but not modality was varied, Mayer and Moreno’s theory seems to require some additional specifications with regard to the role of differences in modality.

The theories by Schnotz and Bannert, Mayer and Moreno, and many others are used to derive recommendations for the design of multimedia learning systems, in experimental as well as real-world settings. One may argue that the simple spatial arrangements employed here do not adequately reflect learning from texts and pictures in multimedia environments because they are not complex enough. However, picture diagrams represent the central dimension, spatiality, which nearly all sorts of graphical representations are based on, and word diagrams are prototypical instances of a large class of graphical representations incorporating verbal labels or explanations, such as the ones used in science, education, business communication, public information, and many other fields. Moreover, we believe that it is exactly the simplicity of the spatial stimuli employed here which allows for a strict test of the model of text and picture comprehension proposed by Schnotz and Bannert (2003). First, picture and word diagrams represented exactly the same spatial scenes and differed only in the way the objects were denoted. Second, we were able to avoid complications introduced by differences in prior knowledge between participants. Earlier research has shown that large knowledge-dependent effects in mental model construction may occur during learning of spatial scenes (e.g., Dutke, 1993, 1996). Thus, simple experimental materials that nevertheless allow theoretically relevant manipulations may enhance the power of empirical tests.
The more rigorously such theories are empirically tested (and eventually modified), the more reliable a basis they provide for design decisions.

Beyond theory evaluation, the present work has some implications for research on educational and instructional design. The first is related to Sweller’s (1994) concept of element interactivity. Material that can be understood and learned only by considering several elements simultaneously (high element interactivity) is hypothesized to contribute to intrinsic cognitive load. Usually, effects of the degree of element interactivity are investigated by varying the learning materials (e.g., Pollock, Chandler, & Sweller, 2002). In the present experiment, element interactivity was not varied in the learning materials, but in the way learning performance was tested. During the learning phase, only pairs of words or pictures (isolated elements) were presented. However, during the learning test, participants evaluated elements (pairs) and complete spatial arrangements. Even in this setting, the element interactivity effect was replicated: testing complete arrangements (high element interactivity) required more working memory capacity than testing elements alone (low element interactivity). The practical conclusion is that high element interactivity (and thus, increased load) can be established by the way the learning task is designed and by the way the learning test is designed. Moreover, it was demonstrated that the element interactivity effect also held for pure picture diagrams involving no other processes than identifying pictures of objects and relating them spatially in a way that represents all pair relations correctly. This corroborates the assumption that element interactivity is a quite basic feature of learning materials to be attended to in designing learning tasks.
This interpretation is also in line with working memory research identifying the coordination of formerly independent elements as a specific class of demands on working memory (Dutke, 2005; Hagendorf & Sá, 1996; Halford, Wilson, & Philipps, 1998; Mayr & Kliegl, 1993; Oberauer, 1993; Oberauer, Süß, Schulze, Wilhelm, & Wittmann, 2000).

Second, the result that evaluating word diagrams required more working memory resources than evaluating picture diagrams is of high practical relevance. As mentioned above, word diagrams, that is, depictions of spatial relations combined with verbal descriptions of the elements constituting such a diagram, are a frequently used form of diagram (e.g., see Pollock et al., 2002). Based on the Schnotz and Bannert model, this effect can be interpreted as costs emerging from the coordination of the depictive and the descriptive processing branch. However, the present study was not designed to maximize this effect because the presentation time of 3 s per pair was chosen to ensure sufficient learning even with word diagrams, even for participants with lower working memory spans. Rather, a long presentation time was chosen here because the present study focused mainly on performance in the test phase of the experiment. Future studies should try to complement the present results by employing self-paced presentation times. According to the multiple source hypothesis, in this case, presentation times should be longer for word diagrams than for picture diagrams, especially in participants with low reading span.

Third, with regard to research strategies, the present experiment demonstrated a fruitful way to test hypotheses involving assumptions about the cognitive load of learning tasks. Particularly in the context of cognitive load theory (Sweller, 1994), it has become more and more important to assess the load caused by differently designed learning tasks or learning environments (e.g., Mayer & Moreno, 2003; Paas et al., 2003).
Usually, subjective ratings of learners are used as indicators of cognitive load (e.g., van Merriënboer, Schuurman, de Crook, & Paas, 2002). As an alternative, Brünken, Plass, and Leutner (2003) proposed a dual task technique. In the present experiment, however, a third strategy was demonstrated: we measured individual differences in cognitive resources and investigated how this variation affected learning performance, in conjunction with experimental manipulations of the learning task and the test situation. Although this combined approach is costly in terms of number of participants, it has at least two advantages. Compared to the subjective rating approach, assessing individual differences with evaluated instruments enhances the construct validity and allows more precise conclusions about the type of load generated by a learning task. In the present study, for example, we could differentiate between task demands tapping the verbal and the visuospatial working memory capacity, respectively. Compared with the dual task approach, assessing individual differences does not encounter problems arising from task-specific (and hard to predict) interactions between the primary and the secondary task.

Acknowledgements

We are grateful to Ulrich Herzberg, Saskia Schanz, and Karin Wolf for their invaluable help in preparing and conducting the experiment. We would also like to thank two anonymous reviewers for helpful comments on an earlier version of this article. This research was supported by grants Du 312/1-1 and Ri 600/9-1 from the German Research Foundation (DFG) to the authors.

References

Baddeley, A. D. (1986). Working memory. Oxford, UK: Clarendon Press.
Baddeley, A. D. (1992). Is working memory working? The fifteenth Bartlett lecture. The Quarterly Journal of Experimental Psychology, 44A, 1–31.
Brünken, R., Plass, J. L., & Leutner, D. (2003). Direct measurement of cognitive load in multimedia learning. Educational Psychologist, 38, 53–61.
Daneman, M., & Carpenter, P. A. (1980). Individual differences in working memory. Journal of Verbal Learning and Verbal Behavior, 19, 450–466.
De Jonge, P., & De Jonge, P. F. (1996). Working memory, intelligence and reading ability in children. Personality and Individual Differences, 21, 1007–1020.
van Dijk, T. A., & Kintsch, W. (1983). Strategies of discourse comprehension. New York: Academic Press.
Dutke, S. (1993). Mentale Modelle beim Erinnern sprachlich beschriebener räumlicher Anordnungen: Zur Interaktion von Gedächtnisschemata und Textrepräsentation. [Mental models in remembering verbally described spatial arrangements: Towards the interaction of memory schemata and text representation]. Zeitschrift für experimentelle und angewandte Psychologie, 40, 44–71.
Dutke, S. (1996). Generic and generative knowledge: memory schemata in the construction of mental models. In W. Battmann, & S. Dutke (Eds.), Processes of the molar regulation of behavior (pp. 35–54). Lengerich: Pabst Science Publishers.
Dutke, S. (2005). Remembered duration: working memory and the reproduction of intervals. Perception & Psychophysics, 67, 1404–1422.
Engle, R. W., Cantor, J., & Carullo, J. (1992). Individual differences in working memory and comprehension: a test of four hypotheses. Journal of Experimental Psychology: Learning, Memory and Cognition, 18, 972–992.
Friedman, N. P., & Miyake, A. (2000). Differential roles for visuospatial and verbal working memory in situation model construction. Journal of Experimental Psychology: General, 129, 61–83.
Gernsbacher, M. A. (1990). Language comprehension as structure building. Hillsdale, NJ: Erlbaum.
Glenberg, A. M., Meyer, M., & Lindem, K. (1987). Mental models contribute to foregrounding during text comprehension. Journal of Memory and Language, 26, 69–83.
Graesser, A. C., Millis, K. K., & Zwaan, R. A. (1997). Discourse comprehension. Annual Review of Psychology, 48, 163–189.
Graesser, A. C., Singer, M., & Trabasso, T. (1994). Constructing inferences during narrative text comprehension. Psychological Review, 101, 371–395.
Hacker, W., & Osterland, D. (1995). Mentale Koordinationskapazität: Einfluß von Text- und Arbeitsgedächtnismerkmalen auf das Verstehen von Instruktionstexten. [Mental capacity for coordination: Effects of text features and working memory resources on the comprehension of instructional texts]. Zeitschrift für Experimentelle Psychologie, 42, 646–671.
Hagendorf, H., & Sá, B. (1996). Coordination in visual working memory. Psychological Research, 58, 294–306.
Halford, G. S., Wilson, W. H., & Philipps, S. (1998). Processing capacity defined by relational complexity: implications for comparative, developmental, and cognitive psychology. Behavioral and Brain Sciences, 21, 803–864.
Johnson-Laird, P. N. (1983). Mental models. Cambridge, Great Britain: Cambridge University Press.
Johnson-Laird, P. N. (1996). Images, models, and propositional representations. In M. de Vega, M. J. Intons-Peterson, P. N. Johnson-Laird, M. Denis, & M. Marschark (Eds.), Models of visuospatial cognition (pp. 90–127). Oxford: Oxford University Press.
Kintsch, W. (1998). Comprehension. New York: Cambridge University Press.
Mayer, R. E. (2003). The promise of multimedia learning: using the same instructional design methods across different media. Learning and Instruction, 13, 125–139.
Mayer, R. E., & Moreno, R. (2002). Aids to computer-based multimedia learning. Learning and Instruction, 12, 107–119.
Mayer, R. E., & Moreno, R. (2003). Nine ways to reduce cognitive load in multimedia learning. Educational Psychologist, 38, 43–52.
Mayr, U., & Kliegl, R. (1993). Sequential and coordinative complexity: age-based processing limitations in figural transformations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 1297–1320.
van Merriënboer, J. J. G., Schuurman, J. G., de Crook, M. B. M., & Paas, F. G. W. C. (2002). Redirecting learner’s attention during training: effects on cognitive load, transfer test performance and training efficiency. Learning and Instruction, 12, 11–37.
Oberauer, K. (1993). Die Koordination kognitiver Operationen – eine Studie über die Beziehung zwischen Intelligenz und “working memory”. [Coordination of cognitive operations – a study on the relation of intelligence and working memory]. Zeitschrift für Psychologie, 201, 57–84.
Oberauer, K., Süß, H.-M., Schulze, R., Wilhelm, O., & Wittmann, W. W. (2000). Working memory capacity – facets of a cognitive ability construct. Personality and Individual Differences, 29, 1017–1045.
Paas, F., Renkl, A., & Sweller, J. (2003). Cognitive load theory and instructional design: recent developments. Educational Psychologist, 38, 1–4.
Palladino, P., Cornoldi, C., De Beni, R., & Pazzaglia, F. (2001). Working memory and updating processes in reading comprehension. Memory and Cognition, 29, 344–354.
Pollock, E., Chandler, P., & Sweller, J. (2002). Assimilating complex information. Learning and Instruction, 12, 61–86.
Schnotz, W., & Bannert, M. (1999). Einflüsse der Visualisierungsform auf die Konstruktion mentaler Modelle beim Text- und Bildverstehen. [Influence of the type of visualization on the construction of mental models during picture and text comprehension]. Zeitschrift für Experimentelle Psychologie, 46, 217–236.
Schnotz, W., & Bannert, M. (2003). Construction and interference in learning from multiple representation. Learning and Instruction, 13, 141–156.
Shah, P., & Miyake, A. (1996). The separability of working memory resources for spatial thinking and language processing: an individual differences approach. Journal of Experimental Psychology: General, 125, 4–27.
Sweller, J. (1994). Cognitive load theory, learning difficulty, and instructional design. Learning and Instruction, 4, 295–312.
Zwaan, R. A., Langston, M. C., & Graesser, A. C. (1995). The construction of situation models in narrative comprehension: an event-indexing model. Psychological Science, 6, 292–297.