Introduction

Small-group collaboration has been widely recognized as a factor that is conducive to learning (Cohen 1994; Johnson et al. 2007). Several review studies have shown that working together in small groups has a positive effect on learning performance (Roseth et al. 2008; Slavin 1983; Springer et al. 1999). Springer et al. (1999) noted that differing theoretical assumptions have been put forward to explain this effect. Indeed, widely differing explanations of the effectiveness of collaborative learning have been proposed (Slavin et al. 2003). O’Donnell (2006) distinguishes two major theoretical perspectives on collaborative learning: a socio-behavioral and a cognitive one. From the socio-behavioral perspective, motivation, social cohesion and positive socio-cultural values are considered conducive to learning, whereas the cognitive perspective stresses elaboration and recourse to prior knowledge and experiences. The idea underpinning the cognitive elaborative perspective on collaborative learning is that students process information at deeper levels when they learn collaboratively (O’Donnell 2006). The current study tested the assumptions underlying this perspective.

Elaboration is a form of higher-order thinking in which new ideas are generated by connecting new information with knowledge already present in memory and by combining new ideas. Elaboration leads to deep levels of information processing (Craik and Lockhart 1972) and is assumed to inhibit forgetting, because it produces a richer, more redundant memory structure (Reder 1980).

In collaborative settings, several learning strategies that promote elaboration have been shown to enhance academic achievement. Some of these strategies involved measures to stimulate elaboration when collaborating students were communicating. That this can be effective appeared from a sequence of intervention studies which showed that scripting collaborative talk enhanced learning (see Dansereau 1988, for an overview). In scripted cooperation (O’Donnell 1996) a predetermined structure guided the way dyads studied a text. The text was broken up into sections and pairs of students studied the sections one by one, switching the roles of recaller and listener for each section. After the students had read one section, they put it away and the recaller reproduced as much information from the text as possible while the listener concentrated on detecting errors and omissions. Next, the two students produced ideas that might help them memorize the material, such as analogies. This strategy resulted in better performance than did either collaborative learning without a strategy or individual learning (McDonald et al. 1985; O’Donnell et al. 1985). One study found similar results in small study groups of four to five students (Yager et al. 1985) but Rewey et al. (1989) found no significant effect of scripted cooperation on performance in small groups, although the trend was positive. The broad trend that seems to emerge from the above findings on collaborative learning is that learning can benefit from elaborative communication during cooperative discourse.

Another series of intervention studies used a procedure called guided peer questioning to enhance comprehension of new material (King 2007). In one of these studies, students were trained to ask thought-provoking questions before engaging in small group discussions. The questions were specifically designed to elicit answers that related (old and) new information. For example: “Why are … and … similar,” or “How does … relate to …?” (King 1990). In a later study, students were also trained to provide elaborated explanations in response to these questions (King et al. 1998). These interventions generally increased participating students’ academic performance.

Finally, there are indications that the amount of exploratory talk used by children enhances their ability to solve problems on the Raven test (Wegerif et al. 1999). Exploratory talk is a constructive form of collaborative communication, in which participants reach consensus by presenting and listening to carefully balanced arguments and counterarguments (Mercer 1996). Training children to use this form of talk has been found to have a positive effect on their learning (Mercer and Sams 2006; Mercer et al. 2004).

In most of the studies that involved manipulation of elaborative communication, group discussions either took place while people were processing new information, or after they had done so (King 1991; Larson et al. 1985; McDonald et al. 1985; O’Donnell et al. 1985; O’Donnell 1996; Webb and Farivar 1999; Webb et al. 1995; Yager et al. 1985). Interestingly, elaborative discourse has also been shown to be beneficial when it precedes learning (De Grave et al. 2001; O’Donnell et al. 1990; Schmidt et al. 1989). In a study by Schmidt et al. (1989), small groups of students discussed a problem illustrating the osmosis process in cells before studying a text about osmosis and diffusion. These students recalled more information from the text than students who had discussed a non-relevant problem, about airplanes taking off, before studying the same text. De Grave et al. (2001) reported similar results. Apparently, discussing the problem facilitated learning from the text that was studied immediately afterwards. In summary, it seems that learning is facilitated by elaborative discourse regardless of whether this occurs before, during or after the actual learning task.

Based on the literature discussed so far it seems safe to assume that attempts to stimulate elaborative communication in small study groups or dyads are beneficial to individual group members’ academic performance, whether elaborative communication precedes, follows or coincides with studying new material.

Unfortunately, studies have provided scant information about why students benefit from group discussion and what cognitive activities play a role. For instance, it could be the act of listening carefully to what others say that aids learning the most. On the other hand, the act of explaining something to someone else might also be productive for one’s own learning. This is in agreement with Slavin’s (1996) argument that explaining something to others is a cognitive process that evokes elaboration during small group work. Explaining requires structuring of knowledge as well as restructuring of knowledge when inconsistencies in one’s reasoning are revealed (Webb 1989). So, providing explanations during small group discussions stimulates elaboration, which, in turn, is expected to foster learning.

Studies have indicated that providing explanations with the aim to elaborate (i.e., creating connections between newly studied information) can increase retention of information. For instance, individuals who had to guess the second word of word-pairings remembered more words than individuals who studied the same words without having to guess (Slamecka and Graf 1978). Also, generating self-explanations while studying a text increased performance on various post-test measures (Chi et al. 1994; Roscoe and Chi 2007). However, these results were found in individuals who were not studying in a collaborative setting and consequently were unable to learn by listening to the input of others.

Still, a review study of small-group learning of mathematics and computer sciences also showed that the relationship between engaging in explaining things to others and learning outcomes was predominantly positive (Webb 1989). Here, it needs to be mentioned that some positive relationships were found between listening to explanations from others and learning performance. For instance, Peterson and Swing (1985) found that both listening and explaining were positively related to final group performance when students cooperatively tried to solve math problems in small groups. The more mixed results from these small-group studies may be attributable to the more dynamic circumstances under which these studies took place. The small groups were observed in classrooms, which allowed for a variety of influences on learning. By contrast, some studies with cooperative dyads showed that dyad members who provided oral summaries recalled more information than their listening counterparts (Ross and DiVesta 1976; Spurlin et al. 1984). Thus, the general impression from prior studies is that, when students collaborate, explaining yields a stronger positive effect than listening.

The preceding has shown that providing explanations may be a pivotal process for the effectiveness of learning in small groups. Therefore, it is well worth investigating, but researchers have to face the problem that explaining and its role and effects cannot always be disentangled from the myriad of other factors that impact on learning in group processes. First, group members’ differing levels of prior knowledge may affect the ease of learning new information. This was shown by a study where one of the major predictors of final math performance of collaborating school children proved to be their general math performance before the study (Webb and Farivar 1999; Webb et al. 1995). Perhaps the children with superior prior performance had more relevant prior knowledge, which they could relate to new information, which led to better results. Second, it cannot be ruled out that the quality of small group discussions affects individual learning outcomes. That is, some group discussions may be more productive than others, perhaps because they contain more exploratory talk. Other group discussions may be less productive, for instance when group members engage in off-task behavior. This was illustrated by Dansereau (1988), who noted in studies of scripted cooperation that some dyads talked about unnecessary details and irrelevant subjects, which may well have impaired their learning. However, this was not taken into account as a confounding variable, which is not surprising seeing that this was probably impossible due to random group dynamics. From an example like this, it is easy to understand why some authors call for more controlled, experimental research in order to improve understanding of collaborative learning (Springer et al. 1999). Controlled conditions will lend greater robustness to research into cognitive processes that may contribute to learning in small groups.

To summarize: studies have usually found positive effects of elaborative communication on individual learning performance and there are strong indications that explaining things to others is an important cognitive process that accounts for this effect, although this notion has never been tested in a controlled, experimental study. Studying effects of explaining in a laboratory setting gives researchers greater control over individual differences in prior knowledge and the quality of the group discussion. Hence it enables evaluation of the differential effects of explaining on learning compared to that of other forms of participation in group discussions. In order to study explanation in isolation, we developed a new research method which aimed to let participants enter a group discussion with the same level of prior knowledge. Also, they were exposed to the same simulated group process. This was done by presenting the group discussion on a video. The effects of explanation on learning could be studied by stopping the video at various points in time and instructing the participants to respond to the discussion during these ‘gaps’. This way, an important contribution could be made to the existing literature about small-group learning, for the research method allowed us to study one cognitive activity (i.e., providing explanations) and to compare the learning-effects of this activity with those of other cognitive activities (i.e., listening to relevant or irrelevant information).

While the main aim of this study was to examine the effects of explaining during small-group discussion while controlling for confounding effects of individual prior knowledge and random group processes, the specific aim was to determine whether students who actively explained issues had better recall of a text that was studied after the group discussion. As mentioned earlier, the results of some studies suggested that collective elaborative activities can improve the processing of new information, so information that is studied after the elaborative activities took place (O’Donnell et al. 1990; Schmidt et al. 1989; De Grave et al. 2001). This study aimed to replicate these findings, while zooming in on the role of providing explanations versus listening. Therefore, the main hypothesis was that providing explanations during a small group discussion would lead to a better recall of subsequently studied information than merely listening to a group discussion. Since the important role of listening during a group discussion has not been ruled out (see Peterson and Swing 1985) and group discussion in general has been found to facilitate subsequent learning (Schmidt et al. 1989; De Grave et al. 2001), a second hypothesis was included as well. This one stated that participation in a relevant group discussion, regardless of providing explanations or listening, would increase the recall of subsequently studied information. In order to test this second hypothesis, the learning results of students who participated in a relevant discussion were compared with the results of students who participated in an irrelevant group discussion.

Method

Overview

Undergraduate students took part in simulated group discussions after receiving instruction on a topic. They were assigned to one of three conditions. In the explanation condition, they were encouraged to verbally participate in the group discussion, using knowledge from the pre-experiment instruction. The discussion was based on a problem, which provided a contextual cue for elaboration (Schmidt 1993). Participants in the listening condition took part in the same group discussion but they were instructed not to contribute actively but only listen to the other group members, while keeping in mind the information from the instruction. In the control condition, participants attended a group discussion about an unrelated problem, which precluded activating and elaborating on relevant prior knowledge. This condition was included to compare the effects of attending and of not-attending a discussion about a relevant problem. After participating in the simulated discussion, all the participating students studied the same text about a topic related to the problem discussed in the group viewed by the students in the explaining and listening condition. The participants’ recall of this text was tested by 13 open-ended questions both immediately after the experiment and 1 month later.

Participants

The participants were 70 students at the Faculties of Health, Medicine and Life Sciences, Psychology, Cultural Sciences, and Economics and Business Administration, Maastricht University, The Netherlands. They were recruited with advertisements and received financial compensation for their participation. At Maastricht University, group discussions are the main educational format, so all participants in this study were used to taking part in such discussions. Their average age was 21.04 years (SD = 2.21). None of the participants reported having taken their final secondary school examination in physics. This implied that they had not studied physics in school after the third grade of secondary education, that is, after approximately the age of 15 years.

Materials

Based on Mayer and Cook (1981; also see Mayer 1985) and a Dutch physics textbook used in secondary education (Middelink 1991), two texts were written aimed at providing a very basic understanding of waves and radar systems, respectively. English translations of excerpts from these texts are provided in “Appendix A”. The first text explained the physical principles of waves and was aimed at creating a basic understanding of waves, which could be used during the simulated group discussion. It also explained how sound waves can be used to measure distance. The text was tailored to the highest level of mandatory physics education in the Netherlands, which is third grade in secondary education. The text was reviewed by a secondary education physics teacher. The second text explained how radar works, the parts of a radar system, the types of wave it uses (electromagnetic waves) and why this is so. It also explained attenuation of electromagnetic waves and how this can be prevented. The contents were reviewed by a radar specialist of the Royal Dutch Meteorological Institute.

The first text consisted of nine sections. Three multiple choice questions were constructed for each section to assess participants’ conceptual knowledge of the contents. This resulted in a test consisting of 27 multiple choice questions. For the post-experiment test 13 open-ended questions about radar were constructed. These questions generally introduced one topic from the text and asked participants to give an in-depth explanation about this topic. For example, the first question gave the names of various radar parts and asked participants to explain how these parts constitute a radar system. The open-ended questions can be found in “Appendix B”.

For the explanation and listening conditions one problem was written, describing an air traffic controller who managed the radar system of Amsterdam Airport Schiphol, the Netherlands. The problem for the control condition was taken from the Maastricht medical curriculum and described neurological symptoms after drinking alcohol. An English translation of the radar problem can be found in “Appendix C”.

Before the actual experiment started, two group discussions were digitally recorded in a studio: one about the radar problem and one about the alcohol problem. Four students, all members of a drama club, participated in both discussions. In several rehearsal sessions, they first familiarized themselves with the contents of the text for the pre-experiment instruction and they then discussed the radar problem based on their prior knowledge. They also discussed the alcohol problem during these sessions. Based on the videotaped rehearsal sessions two written scripts were produced, one for the radar problem and one for the alcohol problem. In a studio, these scripts were performed by the same student actors and recorded.

In both simulated discussions one actor did most of the explaining. The other actors asked him for explanations, for he was obviously the most knowledgeable student of the group. Of the radar discussion, two versions were prepared: an unedited version showing the integral discussion and an edited version from which the part of the actor who did most of the explaining was removed. The unedited version was used for the listening condition and the edited version for the explanation condition. For the control condition the integral recording of the alcohol discussion was used. Figure 1 shows a screen shot of the radar discussion.

Fig. 1 Screen shot from the group discussion

The students who participated in the experimental study viewed one of the video recorded group discussions on a computer screen. The first text and the nine sets of three multiple choice questions relating to this text were available on a webpage designed by Maastricht University Department of Educational Development and Research, using RP’s httpserver©. The two problems, the videotaped discussions and the second text with its accompanying open-ended questions were presented on a website constructed with Macromedia Flash 8© by the first author (FvB).

Procedure

During the pre-experimental instruction phase all the participants studied the first text, which provided them with relevant prior knowledge about waves. In a large computer room the participants individually studied the first text for 10 min and then answered nine multiple choice questions about the contents of each of the nine paragraphs. Based on a procedure by Chi et al. (1994), the students who did not answer all the questions correctly the first time were given 7 min to reread the text and then took a retest consisting of new multiple choice questions covering the contents of the paragraphs that had yielded wrong answers. This procedure was repeated and after the third round of questions, 89% of the participants answered all questions correctly. All participants continued with the next stages of the experiment.

One week after the pre-experimental instruction, all participants returned for the experimental phase. They were randomly assigned to the explanation (N = 24), listening (N = 24), or control condition (N = 22). Participants from different faculties and different years of study were equally distributed over the three conditions. Because the participants in the explanation condition had to speak out loud, the experimental task was performed in isolated computer rooms.

The participants in the explanation condition sat at a desk in front of a computer screen that was approximately 70 cm away from them. They were told that they were going to participate in a simulated group discussion about radar and that they could use their knowledge from the pre-experimental session to explain what they knew about radar to the other group members, just as if they were a member of the simulated group. They were also instructed to explain as much as possible. After these instructions, the participants read the radar problem and then observed a small part of the edited recording of the radar discussion, in order to familiarize themselves with the setting and the actors. They could hear the simulated group members via two speakers on either side of the computer screen and they were encouraged to answer a question from the group during this practice phase. After again being instructed to explain as much as possible, the edited video was started. At different points during the simulated discussion, the actors asked questions to which the participants could respond. The actors asked about factual knowledge from the pre-experimental instruction session, such as “How does a bat know that a wall is nearby?”, but they also asked for more elaborate explanations (“Wait, I don’t understand. Why not?”). Translations of these questions can be found in “Appendix D”.

One participant’s response to the question how a bat knows that a wall is nearby is presented below. This participant correctly explained, after a few tries, how sound waves are used to estimate distance:

That’s because it emits waves that go to the walls, and the number of seconds, so the time it takes for the waves to go to the wall and back, if it multiplies this by the…hey… Ah! So if you multiply time by the… distance? No. Yes, if it multiplies the speed of the sound by the time. What was it again? Yes, if it multiplies speed by the time it takes, then… you get the distance and the bat can estimate the distance to the wall.

When the simulated group discussion about radar was finished, the participants were given 20 min to read the text about radar, which answered the questions that had been raised during the discussion. Immediately after that, they answered 13 open-ended questions. One month later they returned and answered the same sequence of 13 open-ended questions.

The procedure for the participants in the listening condition was identical, except that they watched the integral group discussion, in which the questions were answered by one of the actors. They were instructed to listen carefully and keep in mind what they had learned during the pre-experimental session. The participants in the control condition read the alcohol problem and watched the simulated discussion about this problem. They were only instructed to listen carefully.

Analyses

Following Mayer (1985), FvB parsed the text about radar into propositions, i.e., units that constitute a subject-verb clause. The answers to the open-ended questions were parsed into propositions by four judges, i.e., FvB and three student assistants. Two pairs of judges parsed the answers given by six randomly selected participants, i.e., each pair parsed the answers from three students. Combined average inter-judge agreement was 89%. Next, the two pairs of judges matched the parsed answers of the three participants to propositions from the radar text. Combined average inter-judge agreement on these matches was 79%. Differences were resolved by discussion, resulting in some additional agreements (e.g., not to analyze text that concerned paraphrasing questions or repeating answers).

After this, the judges used the same procedure to individually analyze the remaining answers. Each judge generated propositions and matched these to the propositions from the radar text. A match yielded one point and the summed points for all questions were a participant’s cumulative recall score. Since there were two measurements, there were two cumulative recall scores, one for immediate recall and one for delayed recall.
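The scoring rule described above can be sketched in a few lines. This is a minimal illustration, not the judges' actual procedure or tooling: the function name, the proposition labels (`P1`, `P2`, ...) and the choice to count a proposition only once per answer are assumptions made for the example.

```python
def cumulative_recall_score(matched_by_question):
    """Sum one point per matched text proposition over all questions.

    matched_by_question: one list per open-ended question, holding the
    (hypothetical) labels of radar-text propositions matched in the answer.
    """
    # Assumption: a proposition repeated within one answer earns a single point.
    return sum(len(set(matched)) for matched in matched_by_question)


# Hypothetical answers to three questions (labels invented for illustration).
example = [["P1", "P2", "P2"], ["P7"], []]
print(cumulative_recall_score(example))  # prints 3
```

Applied once to the immediate answers and once to the delayed answers, such a function would yield the two cumulative recall scores per participant.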

A one-way analysis of variance (ANOVA) with planned contrasts was performed with condition as the independent variable and immediate and delayed recall as the dependent variables. One planned contrast tested the hypothesis of better recall of participants in the explanation condition compared to those in the listening condition. The second planned contrast tested the hypothesis of better recall by participants in the explanation and listening condition compared to those in the control condition. Effect sizes r for the contrast analysis were calculated according to Rosnow and Rosenthal (2008, pp. 325–326). That is, the t-statistics of the contrast analyses were squared and then divided by this same squared t-value, plus the sum of the within-group degrees of freedom. Then, the square root of this result was taken. In general, an effect size of .10 is considered a small effect, an effect size of .30 medium and an effect size of .50 large (Rosnow and Rosenthal 2008, p. 278).
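The verbal description of the effect size corresponds to r = sqrt(t² / (t² + df)). The sketch below applies it to two of the reported t-values. The function name is hypothetical, and the degrees of freedom passed in are assumptions: 67 (23 + 23 + 21) when all three groups are involved, and 46 (23 + 23) when only the explanation and listening groups are counted. These choices appear to reproduce the reported effect sizes, but the text does not state them explicitly.

```python
import math


def contrast_effect_size_r(t: float, df_within: int) -> float:
    """Effect size r for a contrast: sqrt(t^2 / (t^2 + df_within))."""
    t_squared = t * t
    return math.sqrt(t_squared / (t_squared + df_within))


# Relevant discussion vs. control, immediate recall (assumed df = 67).
r_relevant_vs_control = contrast_effect_size_r(2.31, 67)

# Explanation vs. listening, delayed recall (assumed df = 46).
r_explain_vs_listen = contrast_effect_size_r(2.22, 46)

print(round(r_relevant_vs_control, 2), round(r_explain_vs_listen, 2))  # prints 0.27 0.31
```

Against the benchmarks cited above, the first value falls just below a medium effect and the second just above it.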

Results

The mean scores per condition are presented in Table 1 and depicted in Fig. 2. There was no significant difference between the three conditions on immediate recall, F(2, 67) = 2.80, p = .068. Planned contrasts revealed that, directly after the group discussion, the participants who had given explanations (M = 57.50, SD = 22.72) or listened (M = 54.63, SD = 15.01) had higher recall scores than the participants in the control condition (M = 44.86, SD = 17.99), t(67) = 2.31, p < .05 (two-tailed), r = 0.27. However, participants who had given explanations did not recall more than those who had listened, t(67) = .53, p = .60, r = 0.08.

Table 1 Immediate and delayed recall as a function of condition

Fig. 2 Mean recall immediately and one month after the group discussion

After 1 month, there was a significant between-conditions difference in recall scores, F(2, 67) = 3.57, p < .05. Planned contrasts indicated that, during this second measurement interval, the participants who had provided explanations (M = 49.29, SD = 19.71) or listened during the radar discussion (M = 39.04, SD = 13.56) had no higher recall scores than the participants in the control condition (M = 38.05, SD = 13.72), t(67) = 1.49, p = .14, r = 0.18. In contrast, participants who had explained during the discussion 1 month before recalled more than the participants who had listened, t(67) = 2.22, p < .05 (two-tailed), r = 0.31.
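The reported test statistics can be checked against the group means, standard deviations and sample sizes given in the text. The sketch below is one way to do this and is not necessarily how the original analysis was run. It assumes a pooled within-group error term and the contrast weights (1, 1, -2) and (1, -1, 0) for the explanation, listening and control conditions; neither is stated explicitly in the text, and the function names are hypothetical.

```python
import math


def one_way_anova_summary(means, sds, ns):
    """Return (F, MS_within, df_between, df_within) from group summaries."""
    n_total = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    df_between = len(means) - 1
    df_within = n_total - len(means)
    ms_within = ss_within / df_within
    f_value = (ss_between / df_between) / ms_within
    return f_value, ms_within, df_between, df_within


def contrast_t(means, ns, weights, ms_within):
    """t statistic for a planned contrast using the pooled error term."""
    estimate = sum(w * m for w, m in zip(weights, means))
    std_error = math.sqrt(ms_within * sum(w ** 2 / n for w, n in zip(weights, ns)))
    return estimate / std_error


# Order: explanation, listening, control (values taken from the Results text).
ns = [24, 24, 22]
immediate = ([57.50, 54.63, 44.86], [22.72, 15.01, 17.99])
delayed = ([49.29, 39.04, 38.05], [19.71, 13.56, 13.72])

for label, (means, sds) in (("immediate", immediate), ("delayed", delayed)):
    f_value, ms_within, df_b, df_w = one_way_anova_summary(means, sds, ns)
    t_relevant = contrast_t(means, ns, [1, 1, -2], ms_within)  # assumed weights
    t_explain = contrast_t(means, ns, [1, -1, 0], ms_within)   # assumed weights
    print(f"{label}: F({df_b}, {df_w}) = {f_value:.2f}, "
          f"t(relevant vs control) = {t_relevant:.2f}, "
          f"t(explanation vs listening) = {t_explain:.2f}")
```

Under these assumptions the sketch gives F(2, 67) = 2.80 with t = 2.31 and 0.53 for immediate recall, and F(2, 67) = 3.57 with t = 1.49 and 2.22 for delayed recall, in line with the values reported above.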

Discussion

The aim of this study was to explore the effects of explaining on learning during collaborative small group discussions, while controlling for the effects of random group processes. To meet this objective, the group process was controlled by using a video-recorded, simulated group discussion. The conditions for the two experimental groups and the control group differed in that participants in the experimental conditions were asked to either participate in the group discussion or to listen only, while the participants in the control group watched a discussion on a topic with no relevance to the topic of the post-experimental text. The hypothesis that providing explanations during a relevant discussion would lead to better recall of a related text than only listening to a relevant discussion appears to be confirmed by the measurement after 1 month but not by the measurement immediately following the discussion. The hypothesis that participation in a relevant group discussion would lead to increased recall compared to participation in an irrelevant discussion was only confirmed immediately after the group discussion and not after 1 month. Our results suggest a delayed positive effect of providing explanations and an immediate positive effect of participation in a relevant discussion. In other words, taking part in a relevant group discussion had a direct positive impact on recall, whilst this positive effect persisted over a longer period only for those who had given explanations during the discussion.

These findings fit theoretical assumptions about elaboration during small group learning, such as Slavin’s (1996) view that learning in small groups is successful in part because students are able to elaborate by explaining to each other. More specifically, explaining may increase learning because it requires students to structure or restructure their knowledge before they can verbalize it (Webb 1989). The need to resolve inconsistencies during this process may trigger new ideas, which may lead to a more elaborate knowledge structure, which may militate against forgetting and facilitate retention of information (Reder 1980).

Furthermore, our findings are consistent with earlier empirical evidence. One review study found that giving elaborate explanations during cooperative group work had a consistent beneficial effect on learning performance (Webb 1989). Other studies found that although students who engaged in problem-based discussions did not outperform students who learned from lectures, they did retain more knowledge over longer periods of time (Capon and Kuhn 2004; Dochy et al. 2003; Eisenstaedt et al. 1990; Tans et al. 1986). This is in line with the expectations that verbal explanations increase elaboration and elaboration increases long-term recall. More specifically, Tans et al. (1986) hypothesized that students who discussed problems did not acquire more information than those listening to lectures, but stored what they learned in a well-structured manner, which facilitated retrieval after longer periods of time. The current experiment cannot reject this hypothesis. Moreover, it specifies that especially producing explanations is beneficial to post-discussion learning.

The findings of this study are unique because the learning of individual participants was not affected by arbitrary effects from varying group processes. As far as we know, no other studies have controlled for this factor. Moreover, other studies did not allow a clear distinction to be made between the effects of explaining and listening (Peterson and Swing 1985). By standardizing the group process, we were able to compare the effects of providing explanations with those of only listening to others during group work.

This study has some limitations. First, in the explanation condition there was more variance in the recalled propositions, indicating that in this condition a larger part of the variance remained unexplained than in the other conditions. This may be associated with factors that were not analyzed, such as motivation and students’ verbal ability. However, it seems even more striking that a long-term effect of producing explanations was found, despite the fact that other factors were not included in the analysis.

Second, it is debatable whether processes induced by watching a video recording of a simulated group discussion are equivalent to processes during a real group discussion. Real discussions may be more dynamic. For instance, during real discussions, students are more likely to engage in irrelevant, off topic talk. Also, the artificial environment of the current experiment may have confused some of the participants, because it was entirely new to them. So, despite their familiarity with group discussions, the participants may have behaved differently in the experiment than they would have in a real group. However, this potential bias appears to be outweighed by the advantage of stable independent variables for the group discussion. Moreover, it is commonly acknowledged that laboratory experiments are conducted at the cost of at least some ecological validity. In all, the current approach seems fit for the purposes of this study. Nevertheless, future experiments should always aim to minimize discrepancies between artificial and real group discussions.

In sum, the findings indicate that both providing explanations and listening to others during a small group discussion benefit short-term recall, while producing explanations increases retention of knowledge over a period of 1 month. These findings are supported by the theoretical assumptions about elaboration during small group learning and consistent with earlier empirical findings. Moreover, the results suggest that providing explanations is effective for long-term retention of information, regardless of the group process in which one participates.

The approach to studying small group discussions used in this study has some implications for future research. Based on the methods of this experiment, other processes can be studied as well. For instance, one could vary the quality of the group discussion, resulting in a condition with a high-quality discussion and one with a low-quality discussion, and look for effects on short-term and long-term recall. Earlier studies have shown that certain types of discourse are probably more productive for individual learning than other types of talk (Wegerif et al. 1999). These successful types of discourse might be simulated in an experimental design in order to investigate their effects on individual learning. The method we developed appears to be promising for empirical studies to test other theoretical assumptions about the success of collaborative, small group learning.