Using proportional reasoning to think relationally about quantities is highly relevant in everyday life. For example, if one pound of pasta serves four people, you may think proportionally to approximate how many pounds serve six people. Proportional reasoning is also valuable in developing and understanding math concepts. Previous studies have found proportional reasoning with continuous, part-whole representations to be correlated with fraction knowledge and later math achievement (Möhring, Newcombe, Levine, & Frick, 2016). According to the Common Core State Standards for mathematics, children in the United States should be introduced to fractions around 3rd grade and continue developing fraction concepts (e.g., equivalence, adding, and subtracting) through 5th grade. By 6th and 7th grade, children should be able to analyze ratios and proportional relationships, using realistic examples to solve problems (e.g., constant speed, simple interest; National Governors Association Center for Best Practices & Council of Chief State School Officers, 2010). Because proportional reasoning is an important mathematical foundation that students often find challenging, it has been extensively researched, and classroom interventions have been proposed (Berk, Taber, Gorowara, & Poetzl, 2009; Fujimura, 2001; Jitendra, Star, Rodriguez, Lindell, & Someki, 2011). Some neurocognitive building blocks, such as the approximate number system, have been identified as providing an early foundation for developing numerical concepts (Halberda & Feigenson, 2008; vanMarle et al., 2018). Likewise, a non-count-based ratio-processing system (RPS), which involves nonsymbolic processing of ratios, can serve as an intuitive basis for understanding proportions. In fact, performance on four RPS tasks using visual quantities such as line lengths or numbers of dots was found to predict later symbolic fraction understanding (Matthews, Lewis, & Hubbard, 2016).
Similarly, when presented with continuous representations, infants and 4-year-olds are able to encode extent (height) in the presence of a standard, demonstrating this early, intuitive sense of proportions (Duffy, Huttenlocher, & Levine, 2005; Huttenlocher, Duffy, & Levine, 2002).
Despite this early-developing competency, school-age children struggle to focus on proportions when matching based on whole-number quantities is an available strategy (Boyer, Levine, & Huttenlocher, 2008). For example, 6-year-olds can match equivalent proportions when they are presented in a continuous format; however, 10-year-olds continue to struggle to match discrete proportions, which elicit an inclination to count and match rather than compare ratios (Boyer & Levine, 2015; Jeong, Levine, & Huttenlocher, 2007; Singer-Freeman & Goswami, 2001). Boyer and colleagues (2008) compared performance on a proportional equivalence choice task in which participants matched a target (continuous or discrete proportion) to one of two choices (both either continuous or discrete proportions). Children from kindergarten to 4th grade were least likely to choose the correct proportional match in the discrete-target, discrete-choices condition (Boyer et al., 2008). When directly comparing countable, discrete units, children implemented erroneous count-and-match strategies (e.g., matching 2/6 to 5/6 because both have equal total amounts, see Figure 1a; Boyer et al., 2008; Jeong et al., 2007).
There is also some evidence that erroneous count-and-match strategies are more prevalent when children can match the numerical quantities of parts of the proportion, versus the whole proportion – indicative of a part-matching bias. Boyer and colleagues (2008) presented elementary-aged children with a target proportion (e.g., 2/6) and two answer choices; one choice was a proportional equivalent (e.g., 3/9), and the other choice matched based on either numerical parts (e.g., 2/3, a “part foil”) or wholes (e.g., 4/6, a “whole foil”). Participants incorrectly matched based on parts more frequently than wholes (Boyer et al., 2008), corroborating the presence of a part-matching bias in young children. Children’s focus on parts of proportions, and their focus on part-part relations over part-whole relations, has been noted in other studies (Singer & Resnick, 1992; Spinillo, 2002). Given these findings, we looked for potential differences in children’s performance when part-matching was possible (part-foil trials, e.g., Figure 1b) and when it was not (whole-foil trials, e.g., Figure 1a).
Notably, the count-and-match error children often make when reasoning about discrete proportions is similar to the whole-number bias in formal fraction learning, i.e., the tendency to interpret a fraction’s numerator and denominator as separate whole-number quantities instead of as a ratio (Ni & Zhou, 2005). In both cases, children compare proportions (or fractions) based on the numerical similarity of their components (i.e., their parts or the whole), rather than the relations between those components (Boyer et al., 2008; Braithwaite & Siegler, 2018); the primary difference is whether the parts and whole are represented as number symbols (in a fraction) or as visual countable units (in a discrete proportion). Given their conceptual similarity, research on how to reduce children’s count-and-match bias in discrete proportional reasoning may have implications for reducing children’s whole-number bias in fraction learning.
Here, we propose that methods proven to support analogical reasoning should also be effective in improving children’s proportional reasoning. Analogical and proportional reasoning may evoke similar cognitive processes because both require reasoning about relational similarities, rather than perceptual similarities. For example, in a typical analogical reasoning task, children must determine whether a bicycle is more like a skateboard or eyeglasses; choosing the skateboard requires focusing on categorical commonalities rather than perceptual ones (Gentner & Namy, 1999). Determining whether a discrete visual representation of 3/4 is equivalent to 6/8 or 3/9 could require similar skills (Boyer et al., 2008; Möhring et al., 2016). Like the bicycle and eyeglasses, 3/4 is perceptually similar to 3/9 (i.e., both have a part quantity of 3 units), but the two are proportionally distinct. In contrast, 3/4 and 6/8 differ in their whole-number quantities, although they are somewhat perceptually similar, and are relationally equivalent.
Some prior work has shown links between children’s analogical reasoning ability and their performance on proportional reasoning tasks (Boyer & Levine, 2015; Hurst & Cordes, 2018; Kotovsky & Gentner, 1996). In one study, 3- and 6-year-olds’ performance on a pattern analogy task (e.g., yellow diamond:yellow circle::red diamond:?) was correlated with their performance on a continuous proportional reasoning task (a spinner comparison task in which children were asked to choose which of two spinners would increase their likelihood of winning stickers; Hurst & Cordes, 2018). Further, two studies have found that completing trials using continuous proportions helped children to reason proportionally on subsequent trials using discrete proportions, although this was only the case for older children in each study (Boyer & Levine [2015]: 4th graders, but not kindergarteners or 2nd graders, on a proportional equivalence choice task; Hurst & Cordes [2018]: 5- to 6-year-olds, but not 3- to 4-year-olds, on a spinner comparison task). The sequencing of trials from continuous to discrete is similar to another support for analogical reasoning, progressive alignment, in which identifying concrete relations makes it easier for children to identify subsequent abstract relations (Kotovsky & Gentner, 1996). Thus, these few existing studies provide tentative support for an empirical link between analogical and proportional reasoning.
Based on these theoretical and empirical considerations, we test the novel hypothesis that incorporating multiple exemplars and common labeling – evidence-based supports for analogical reasoning – will facilitate children’s ability to reason proportionally about discrete units.
Prior work has identified two key supports for analogical reasoning: multiple exemplars and common labels. Studies have found that children who were shown multiple exemplars of a relational category were more successful in choosing the relational match than children who were shown only one exemplar (Gentner & Namy, 1999; Namy & Gentner, 2002). For example, Gentner and Namy (1999) asked 4-year-olds to choose another “blicket” after being shown a bicycle and/or a tricycle as exemplars. With only one exemplar (either the bicycle or tricycle), children selected the perceptually similar, but conceptually different match (eyeglasses). In contrast, when shown two exemplars (bicycle and tricycle), children selected the conceptually similar, but perceptually different match (skateboard). Further, two exemplars only helped children find relational commonalities when the exemplars shared relational similarities (Namy & Gentner, 2002; Waxman & Klibanoff, 2000). Based on this evidence, we hypothesize that aligning two proportionally equivalent, but perceptually different exemplars will help children find the relational commonality and select the proportional match.
In addition to multiple exemplars, common labeling has been shown to support analogical reasoning. Sharing labels between two objects encourages children to compare their commonalities, even when using novel labels such as “blicket” (Christie & Gentner, 2014; Gentner & Namy, 2006; Namy & Gentner, 2002; Waxman & Klibanoff, 2000). For example, 4-year-olds who were shown two objects with the same label (i.e., “This is a blicket and this is also a blicket”) chose the category or relational match (Namy & Gentner, 2002). In contrast, participants who were given conflicting labels (i.e., “This is a blicket and this is a daxen”) chose the perceptual match (Namy & Gentner, 2002). These results indicate that young children can understand deeper relational commonalities when objects are given common labels. In this regard, we expect that providing common labels will prompt children to compare commonalities between discrete proportions, encouraging them to find a relation beyond numbers of units and helping them recognize relations of proportional equivalence.
Moreover, labeling a relation using a novel adjective (as opposed to a novel noun) may help offset a whole-object word-learning bias – the erroneous tendency to assume a novel label refers to an entire object as opposed to its properties or parts (Hollich et al., 2000). This word-learning bias can impact learning of mathematical concepts. For example, when children are shown two lines meeting at a single vertex (a “V” shape) and are told, “this is an angle”, children assume that the word “angle” refers to the whole object (i.e., the V shape), rather than a particular property of the object (i.e., the degree of rotation), which can lead to misconceptions about the meaning of “larger angles” and “smaller angles” (Gibson, Congdon, & Levine, 2015). Similarly, using an adjective rather than a noun may reduce the ambiguity of the novel word’s referent and help children associate the novel word with a property of the object (in this case, the proportional relation) rather than with the whole object. Adjectives can also work in conjunction with exemplars to facilitate comparison of properties across different objects (Waxman & Klibanoff, 2000). Waxman and Klibanoff (2000) found that when 3-year-olds were shown two different exemplars with a shared property (e.g., a transparent plate and a transparent toothbrush), both labeled by a novel adjective (i.e., “blickish”), they were able to correctly extend the novel adjective to other transparent objects. In contrast, children failed to do so if the exemplars were the same type of object or if the property was not labeled (Waxman & Klibanoff, 2000). Here, we expected novel adjective labels to facilitate children’s ability to match proportions, when compared to no labels.
We were also interested in children’s response to labeling proportions in a narrative context of mixing juice and water, a relatively complex referent often used in prior research (Boyer & Levine, 2015; Boyer et al., 2008; Möhring et al., 2016). One purpose for including the juice mixing script in the present study was to replicate and extend these findings. Another aim was to address the utility of the juice mixing narrative, that is, to determine whether this descriptive script prompts children to reason proportionally more so than less descriptive scripts. The juice mixing script may help children avoid whole-object word-learning biases by providing different labels for the whole object (“glass of juice”) and for both parts of the proportion (“juice” and “water”). Moreover, the conceptual relation between the terms “juice”, “water”, and “glass of juice” may present similar advantages to relational labels (e.g., “Daddy”, “Mommy”, “Baby”) and invite children to compare relations rather than numerical quantities (Ratterman & Gentner, 1998). Encouraging children to think about a discrete representation continuously (i.e., mixing juice) may activate strategies based on continuous proportion estimation and help children avoid using incorrect count-and-match strategies. A conceptual referent, such as mixing juice and water, can be quite powerful in influencing how children think about proportions. For example, two different food analogies had disparate outcomes on a proportional matching task (Singer-Freeman & Goswami, 2001). Though both foods were depicted in a discrete manner, 3- and 4-year-olds had greater success reasoning proportionally about a pizza (a continuous conceptual referent) than about a box of chocolates (a discrete referent) (Singer-Freeman & Goswami, 2001). For this reason, a continuous paradigm such as the juice mixing narrative may provide an advantageous context for reasoning about proportionality.
Therefore, we expected the juice mixing narrative to be more advantageous for proportional reasoning than no labels, which provide very little information about the task at hand.
Because children struggle with discrete quantities (Boyer & Levine, 2015; Jeong et al., 2007; Singer-Freeman & Goswami, 2001) and because discrete representations are commonly associated with symbolic fractions (Rapp, Bassok, DeWolf, & Holyoak, 2015; Singer-Freeman & Goswami, 2001), the current study focused solely on proportional reasoning about discrete units. Further, we recruited 4th and 5th graders based on prior research showing that children this age continue to struggle on a discrete proportion matching task (Boyer & Levine, 2015). Drawing on a theoretical analysis of proportional reasoning and analogical reasoning, we asked: do multiple exemplars and common labeling encourage children to reason proportionally when comparing discrete quantities? Based on previous studies utilizing the juice and water mixing paradigm in proportional reasoning contexts, we also asked: does the juice mixing narrative help children to match equivalent proportions, over and above common labeling or no labels?
To address these questions, we compared proportional reasoning performance across two exemplar conditions (one-exemplar versus two-exemplars) and three script conditions (juice mixing, novel adjectives, and no labels), a 2 × 3 between-subjects design. We also manipulated foil type (part-foil, whole-foil) as a within-subjects factor. As noted previously, prior studies using the proportional equivalence choice task have shown that young children typically perform worse on part-foil than whole-foil items, indicating a part-matching bias (Boyer et al., 2008).
We hypothesized that children would make more proportional matches with two exemplars than one exemplar on the basis that aligning two exemplars encourages deep, relational thinking beyond perceptual commonalities. We also hypothesized that a juice mixing narrative (Boyer et al., 2008) and a novel adjectives condition, which both implement forms of common labeling, would yield more proportional matches than a no label condition. Our hypothesis regarding the relation between the juice narrative and the novel adjectives conditions was less clear. On the one hand, children could perform better in the juice narrative than the novel adjectives condition because the juice narrative is more elaborated, highlights continuity (i.e., mixing juice), and draws on prior knowledge by using commonplace, related terms (i.e., juice, water, mixing, taste). On the other hand, prior research shows that common labeling alone – even with novel adjectives – may be sufficient to direct attention to relational commonalities (Gentner & Namy, 1999, 2006; Namy & Gentner, 2002), potentially leading to no significant differences between the novel adjectives and juice mixing scripts.
It is also worth noting that, in the condition with no labels and a single exemplar, the task is ambiguous – a child could reasonably interpret the instruction, “Which one of these is just like this one?” as requesting a match based on proportion or based on numerical values. Thus, this condition serves as a baseline for children’s tendency to match based on proportions versus numerical values of the part or whole. We can therefore compare performance in the other conditions to this baseline to establish the impact of additional supports (novel labels, juice narrative, and/or multiple exemplars) on children’s tendency to match based on proportions.
We expect the results of the present study to elucidate which, if any, analogical reasoning supports (multiple exemplars and labeling) are most beneficial in helping children overcome incorrect count-and-match strategies to reason about discrete proportions.
Method
Participants
Children who received parental consent and assented to the study were enrolled. Participants were 119 children in 4th and 5th grades ($M_{\text{age}}$ = 10.04 years, $SD_{\text{age}}$ = 0.59; 64 girls; 78 4th graders) from eight schools in a large city in the Eastern United States. The average family income was $44,049 (SD = $30,400, n = 71) and average parental education (maximum of both parents) was 14.0 years (SD = 2.47, n = 83). Of the participants whose parents reported their race/ethnicity (n = 86), 38.4% were Black or African American, 18.6% Hispanic, 17.4% White, 15.1% multiple race/ethnicities, 7.0% Asian/Asian American, 2.3% American Indian/Alaskan Native, and 1.2% Other.
Measure
Children completed a 16-item, computerized proportional equivalence choice task adapted from Boyer and colleagues (2008). The task displayed one or two exemplar proportions on the left side of the laptop screen and two choice proportions on the right side (see Figure 1).
The task presented the exemplar(s) first and the choices second, in order to encourage processing of the exemplar(s) before viewing the choices. All stimuli remained on the screen while the child chose an answer. The exemplar and choice proportions were represented as columns of discrete units, demarcated by black lines. Consistent with prior research using the juice mixing paradigm, the lower part units (colored units) were colored yellow, red, purple, green, or orange, and the upper part units were always light blue. A small picture of a teddy bear named “Wallybear” was displayed in the top left corner of the screen. Stimuli were identical across all script conditions. The one-exemplar and two-exemplars conditions differed only in whether one or two targets were displayed.
Each trial had two choices: the correct proportional match (e.g., 2 purple and 4 blue units matched 3 purple and 6 blue units, Figure 1a) and either a part-foil or whole-foil choice (see Table A1). On part-foil trials (8 trials, see Figure 1b for an example), the incorrect choice matched the target in terms of the number of colored units, whereas on whole-foil trials (8 trials, see Figure 1a for an example), the incorrect choice matched the target in terms of the total number of units (colored units plus light blue units). Targets A and B (in two-exemplars trials) and the correct proportional choice were counterbalanced for side of presentation (i.e., left or right). Trials were presented in a random order. Scores were calculated as percent correct.
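The trial structure described above can be sketched in code. The following minimal Python sketch is illustrative only and is not the task software used in the study; the function name `classify_choice` and the `(colored_units, total_units)` encoding are assumptions introduced here. It labels a choice as the proportional match, a part foil, or a whole foil relative to a target, using the example proportions mentioned in the text (target 2/6, match 3/9, part foil 2/3, whole foil 4/6).

```python
from fractions import Fraction


def classify_choice(target, choice):
    """Classify a choice relative to a target proportion.

    Both arguments are (colored_units, total_units) tuples; this encoding
    is a hypothetical convention chosen for the sketch.
    """
    t_part, t_whole = target
    c_part, c_whole = choice
    # Equivalent ratios count as the proportional match.
    if Fraction(t_part, t_whole) == Fraction(c_part, c_whole):
        return "proportional match"
    # Same number of colored units, different ratio: part foil.
    if c_part == t_part:
        return "part foil"
    # Same total number of units, different ratio: whole foil.
    if c_whole == t_whole:
        return "whole foil"
    return "other"


target = (2, 6)  # 2 colored units out of 6 total
for choice in [(3, 9), (2, 3), (4, 6)]:
    print(choice, classify_choice(target, choice))
```

A count-and-match strategy corresponds to picking the choice that shares a raw count with the target, which is why the two foil types pull children toward different wrong answers.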
Figure 1
Design and Procedure
Participants were randomly assigned, within classroom, to one of six between-subjects conditions in a 2 (number of exemplars) × 3 (script type) design. Foil type was manipulated within subjects. In the one-exemplar conditions (Figure 1a), participants were shown one exemplar and two choices; in the two-exemplars conditions (Figure 1b), they were shown two exemplars of the same proportion and two choices (see Appendix 1 for all trials).
The three scripts (juice mixing, novel adjectives, and no label) differed in how the task was introduced and how the proportions were verbally labeled on each trial (see Appendix 2 for scripts). In all conditions, the experimenter introduced participants to “Wallybear” (Boyer et al., 2008). In the juice mixing conditions, Wallybear was described as a character who mixed various amounts of juice with water. On each trial, participants were asked (two-exemplars script in parentheses): “Wally says this glass of juice tastes just right. (He says this one also tastes just right). Which glass of juice has just the right amount of juice and just the right amount of water so that it tastes just like this one (these)?” In the novel adjectives condition, participants were asked (two-exemplars script in parentheses): “Wally says this is a very dakish one. (He says this one is also very dakish). Now Wally wants another one that is dakish. Can you give him another one that is dakish?” A different novel adjective was used for every trial. In the no label condition, participants were asked (two-exemplars script in parentheses): “Wally likes this one. (He also likes this one). Now Wally wants another one. Which one of these is just like this one (these)?”
Children were tested individually at their school by a trained experimenter. The proportional equivalence choice task was administered on a laptop while the experimenter read the script corresponding to the assigned condition. No feedback was given.
Results
Preliminary Analyses
Across conditions, performance was above chance, M = .63, SD = .25, t(118) = 5.60, p < .001. Fifth graders (M = .70, SD = .23) performed significantly better than 4th graders (M = .59, SD = .25), t(117) = 2.38, p = .019, d = .46. Both grade levels performed above chance, ps < .01. Performance did not differ as a function of gender (girls: M = .62, SD = .26; boys: M = .64, SD = .25, t(117) = .45, p = .656, d = .08). In addition, a preliminary ANOVA including gender and grade as factors found no significant interactions of these factors with our manipulated variables, 0.01 < Fs < 2.80, ps > .05, .001 < ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ < .029; therefore, we did not include grade or gender in our main analyses.
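As a rough cross-check of the comparison to chance, a one-sample t statistic can be recomputed from the summary statistics above. This is a minimal sketch, assuming chance is .50 (two choices per trial) and using the rounded M, SD, and N reported in the text; the function name `one_sample_t` is hypothetical. Because the inputs are rounded, the result (about 5.67) is close to, but not identical to, the reported t(118) = 5.60.

```python
import math


def one_sample_t(mean, sd, n, mu0=0.5):
    """One-sample t statistic from summary statistics.

    mu0 is the value tested against; .50 is assumed here as chance.
    """
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error


# Rounded values reported in the text: M = .63, SD = .25, N = 119.
t_value = one_sample_t(0.63, 0.25, 119)
print(f"t(118) is approximately {t_value:.2f}")
```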
Main Analyses
We conducted a mixed-effects ANOVA on accuracy, with exemplar (one, two) and script type (juice mixing, novel adjectives, no labels) as between-subjects factors and foil type (part-foil, whole-foil) as a within-subjects factor (Figure 2; Table 1).
There were no significant main effects of exemplar, F(1, 113) = 1.89, p = .172, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .02, or script type, F(2, 113) = 2.35, p = .100, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .04. However, there was a significant three-way interaction between exemplar, script type, and foil type, F(2, 113) = 4.28, p = .016, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .07. There was also a significant main effect of foil type, F(1, 113) = 5.67, p = .019, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .05, and a significant interaction between foil type and exemplar, F(1, 113) = 6.78, p = .010, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .06. The other two-way interactions were not significant (exemplar × script, F(2, 113) = .77, p = .467, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .01; foil type × script, F(2, 113) = 1.94, p = .148, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .03).
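The reported effect sizes can be checked against the F statistics, because partial eta squared for an effect equals F × df_effect / (F × df_effect + df_error). The sketch below applies this identity to three of the effects reported above; the function name `partial_eta_squared` is hypothetical and the inputs are the rounded values from the text, so small rounding differences are possible for other effects.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error) as reported in the text.
effects = [
    ("exemplar x script x foil type", 4.28, 2, 113),
    ("foil type", 5.67, 1, 113),
    ("foil type x exemplar", 6.78, 1, 113),
]
for label, f_value, df_effect, df_error in effects:
    eta = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: partial eta squared = {eta:.2f}")
```

Under these inputs the recomputed values round to .07, .05, and .06, matching the reported effect sizes.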
Effects Within One-Exemplar Conditions
To further understand this three-way interaction, we first examined performance within the one-exemplar conditions (Figure 2a). Within the one-exemplar conditions, we conducted a 3 (script type) × 2 (foil type) mixed-effects ANOVA. There was a significant main effect of foil type, F(1, 55) = 9.73, p = .003, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .15, but not script type, F(2, 55) = 2.67, p = .078, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .09, and a significant interaction between foil type and script type, F(2, 55) = 4.40, p = .017, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .14.
Figure 2
Table 1
| Measure | Juice narrative, one exemplar (n = 20) | Juice narrative, two exemplars (n = 20) | Novel adjectives, one exemplar (n = 18) | Novel adjectives, two exemplars (n = 20) | No label, one exemplar (n = 20) | No label, two exemplars (n = 21) |
| --- | --- | --- | --- | --- | --- | --- |
| Demographics | | | | | | |
| $n_{\text{girls}}$, $n_{\text{boys}}$ | 9, 11 | 12, 8 | 10, 8 | 11, 9 | 10, 10 | 12, 9 |
| $n_{\text{4th}}$, $n_{\text{5th}}$ grade | 14, 6 | 14, 6 | 11, 7 | 12, 8 | 13, 7 | 14, 7 |
| $M_{\text{age}}$ in years (SD) | 9.99 (.72) | 9.94 (.43) | 10.08 (.61) | 10.31 (.61) | 9.92 (.54) | 10.00 (.55) |
| Proportional reasoning task performance, proportion correct (SD) | | | | | | |
| Total accuracy | .67 (.26)** | .68 (.20)*** | .64 (.24)* | .68 (.25)** | .49 (.27) | .63 (.28)* |
| Part-foil trials accuracy | .68 (.27)** | .67 (.22)** | .56 (.30) | .66 (.32)* | .38 (.36) | .66 (.30)* |
| Whole-foil trials accuracy | .65 (.28)* | .68 (.19)*** | .72 (.25)** | .69 (.20)*** | .61 (.29) | .60 (.33) |

*p < .05. **p < .01. ***p < .001. Asterisks indicate p-values from a one-sample t-test comparing accuracy to chance (50%).
Next, we examined the effect of foil type within the one-exemplar conditions, separately for each script type. In the one-exemplar, no label condition (a baseline with ambiguous instructions), children performed significantly worse on part-foil than whole-foil trials, t(19) = 2.85, p = .010, d = .64. Children also had worse performance on part-foil than whole-foil trials in the one-exemplar, novel adjectives condition, t(17) = 2.35, p = .031, d = .55. In contrast, there was no significant effect of foil type in the one-exemplar, juice mixing condition, t(19) = .72, p = .480, d = .16.
We also directly compared performance in the baseline one-exemplar, no label condition to the other one-exemplar conditions, to assess whether the juice narrative or novel adjectives led to greater proportional matching. Children’s overall accuracy was significantly higher in the one-exemplar, juice mixing narrative condition than in the baseline one-exemplar, no labels condition, t(38) = 2.10, p = .042, d = .68; this difference was driven by significantly better performance on part-foil trials, t(38) = 3.04, p = .004, d = .99. In addition, overall accuracy was somewhat higher in the one-exemplar, novel adjectives condition than in the baseline one-exemplar, no labels condition, though this difference was not statistically significant, t(36) = 1.78, p = .083, d = .59.
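To make the between-condition effect sizes concrete, the sketch below recomputes Cohen's d for the juice narrative versus no label comparison (one-exemplar conditions) from the rounded means and standard deviations in Table 1. It assumes the pooled-standard-deviation form of d, which the text does not specify, and the function name `cohens_d` is hypothetical. With these rounded inputs, total accuracy gives d of about .68, in line with the reported value; the part-foil comparison gives about .94 rather than the reported .99, a gap that may reflect rounding of the table entries or a different formula.

```python
import math


def cohens_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Cohen's d for two independent groups using the pooled standard deviation.

    The pooled-SD form is an assumption; the source does not state its formula.
    """
    pooled_variance = ((n_1 - 1) * sd_1**2 + (n_2 - 1) * sd_2**2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / math.sqrt(pooled_variance)


# One-exemplar juice narrative vs. one-exemplar no label (values from Table 1).
d_total = cohens_d(0.67, 0.26, 20, 0.49, 0.27, 20)
d_part_foil = cohens_d(0.68, 0.27, 20, 0.38, 0.36, 20)
print(f"total accuracy: d = {d_total:.2f}")
print(f"part-foil accuracy: d = {d_part_foil:.2f}")
```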
Effects Within Two-Exemplar Conditions
In contrast to the one-exemplar conditions, within the two-exemplars conditions (Figure 2b), there were no significant main effects of foil type, F(1, 58) = .03, p = .856, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .001, or script type, F(2, 58) = .25, p = .784, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .01, and no significant interaction between script type and foil type, F(2, 58) = .93, p = .400, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .03. Thus, script type and foil type impacted performance when one exemplar was shown, but did not significantly impact performance when two exemplars were provided.
Effects Within Part-Foil Trials
In order to understand the effects of the between-subjects manipulations on the more difficult part-foil trials, we also conducted follow-up ANOVAs within foil type. On part-foil trials, a 2 (number of exemplars) × 3 (script type) between-subjects ANOVA revealed a significant main effect of exemplar, F(1, 113) = 5.16, p = .025, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .04, no significant main effect of script type, F(2, 113) = 2.85, p = .062, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .05, and no significant interaction between script type and exemplar, F(2, 113) = 2.58, p = .080, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .04. The main effect of exemplar indicated that on part-foil trials, performance was significantly better with two exemplars (M = .66, SD = .28) than with one exemplar (M = .54, SD = .33), t(117) = 2.22, p = .028, d = .41.
Effects Within Whole-Foil Trials
On whole-foil trials, a 2 (number of exemplars) × 3 (script type) between-subjects ANOVA found no significant main effects of script type, F(2, 113) = 1.50, p = .228, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .026, or exemplar, F(1, 113) = .001, p = .974, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ < .001, and no significant interaction between script type and exemplar, F(2, 113) = .10, p = .901, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .002.
In sum, the presence of two exemplars led to higher performance specifically on part-foil trials, so that no foil-type effect emerged within any of the two-exemplars conditions. Notably, foil type did not significantly impact performance in any of the juice mixing conditions (even with only one exemplar). However, children performed worse on part-foil than whole-foil trials in the two conditions in which there was neither a juice mixing narrative nor multiple exemplars (i.e., one-exemplar with novel adjectives, and one-exemplar with no labels).
Discussion
Our results show that multiple exemplars and common labeling – supports that have been shown to aid analogical reasoning – can also support proportional reasoning in late elementary school. Although we had predicted overall benefits (i.e., main effects) of multiple exemplars and scripts that included common labeling (juice narrative and novel labels), our results were more nuanced and depended on trial type (a three-way interaction between exemplar, script type, and foil type with a small-to-medium effect, ${\mathrm{\eta}}_{\mathrm{p}}^{2}$ = .07). Specifically, a strong part-matching bias (i.e., a medium-to-large difference in accuracy between part-foil and whole-foil trials, d = .64) appeared in ambiguous contexts (i.e., one exemplar with no labels) when children were simply asked which images were alike and could match based on numerical quantity or proportional equivalence. However, this part-matching bias was eliminated when two exemplars were provided. This may be because showing children two examples of a proportion, which always differed in the numerical quantities of the part and whole (e.g., 4/16 and 3/12), helped children focus on relations rather than countable quantities, and therefore helped children avoid choosing based on the perceptual similarity between the exemplar and the part-foil choice (i.e., the number of countable, colored units).
In addition, the juice mixing narrative, which incorporates common labeling (“juice”, “water”) and also draws on conceptual knowledge about mixing continuous quantities, was effective in promoting proportional reasoning over and above baseline performance with a medium-to-large effect on total accuracy (one-exemplar juice narrative versus no-label conditions, d = .68). The juice mixing narrative also buffered against the part-matching bias that was elicited in ambiguous contexts. This provides support for the theory that the narrative script impacts proportional reasoning. Surprisingly, providing two exemplars and the juice mixing narrative together did not have an additive effect on performance. In other words, either showing two exemplars or applying the juice paradigm helped children think proportionally; however, providing both did not lead to additional benefits.
We also tested whether common labeling alone would improve proportional reasoning via the novel adjectives conditions (e.g., “this one is dakish”). With only one exemplar, novel adjectives led to performance that was marginally better than no labels, a medium-to-large effect (d = .59). In addition, the part-matching bias was still present in the one-exemplar, novel adjectives condition with a medium-to-large effect (d = .55). These results indicate that novel adjectives alone were not sufficient to eliminate the part-matching bias. It is possible that the one-exemplar, novel adjectives condition was ambiguous in the same way that the baseline, no labels condition was – although we expected children to interpret the novel adjective (e.g., “dakish”) as describing the property of proportionality, some children may have interpreted it as relating to the numerical quantity of the parts or wholes. Nevertheless, the marginal improvement over the no labels condition was consistent with our prediction that novel adjectives would encourage proportional thinking. Future research replicating these findings could lead to greater clarity regarding the impact of novel adjectives alone on proportional reasoning.
The part-matching bias we observed in this study is consistent with previous findings that young children struggled more on part-foil trials than whole-foil trials (Boyer et al., 2008). As discussed by Boyer and colleagues (2008), this could be due to the perceptual salience of the numerator, as it was the only color that changed from trial to trial. Additionally, children may have a more general bias to reason part-to-part rather than part-to-whole when thinking relationally (Singer & Resnick, 1992). However, whereas prior work indicated that this part-matching bias disappeared by 4^{th} grade (Boyer et al., 2008), the current study observed poorer performance on part-foil items than whole-foil items in 4^{th} and 5^{th} graders, but only in those conditions that did not provide multiple exemplars or a juice mixing narrative. Without either support, this part-matching bias, previously observed in younger children with the juice narrative, emerges at this older age. This finding raises the possibility that the juice mixing narrative may be as uninformative to 1^{st} and 2^{nd} graders as the no label condition is to 4^{th} and 5^{th} graders, if younger children have less familiarity than older children with the juice mixing context, or have more difficulty applying the notion of continuity (i.e., mixing juice) to reason proportionally. In fact, the juice mixing script is similar in complexity to a word problem, requiring the participant to first understand the grammatical rules and semantics of English in order to highlight key information and implement a strategy to find the solution (Jitendra et al., 2011), potentially making it difficult for young children to benefit from the juice mixing narrative. Future research could test younger children to determine whether 1^{st} and 2^{nd} graders benefit from the juice mixing narrative and presentation of multiple exemplars in the same way that 4^{th} and 5^{th} graders did in the present study.
Prior research suggests that younger children may not benefit from such supports; for example, 4^{th} graders benefitted from viewing continuous proportion matching trials before discrete ones, but kindergarteners and 2^{nd} graders did not (Boyer & Levine, 2015). It may not be until 3^{rd} or 4^{th} grade that children can understand how to use such supports for analogical reasoning to overcome maladaptive count-and-match strategies.
These results are consistent with prior research showing that generally, children perform poorly on the proportional equivalence choice task when presented with a discrete target and discrete choices. On the same version of the task (i.e., one exemplar, juice mixing narrative), 4^{th} graders in Boyer and colleagues (2008) performed with 66% accuracy on both part- and whole-foil trials, while our 4^{th} and 5^{th} graders performed with 68% accuracy on part-foil trials and 65% accuracy on whole-foil trials. While supports from analogical reasoning may improve understanding of matching discrete proportions, no manipulation in this study resulted in performance that was as high as performance involving continuous proportions in prior studies. For example, 4^{th} graders shown a continuous target and discrete choices (Boyer et al., 2008) or eight continuous trials before eight discrete trials (Boyer & Levine, 2015) obtained accuracy levels of 80% or higher.
To our knowledge, this is the first study to show that multiple exemplars and script context (i.e., the juice narrative) improve children’s ability to match discrete proportions. Moreover, our participants are older than children in previous analogical reasoning studies, extending evidence for the benefits of multiple exemplars and common labeling to a novel domain (proportional reasoning) and age group. Taken together, the implications of this research can inform classroom instruction and improve the teaching and learning of proportions and fractions, showing that young children with strong whole-number biases may benefit from analogical reasoning supports to overcome those biases and reason relationally.