Rational number knowledge is an important component of mathematical learning and predicts later mathematical achievement (Bailey, Hoard, Nugent, & Geary, 2012; Siegler, Fazio, Bailey, & Zhou, 2013). Yet, across age and nationality, individuals struggle with understanding and using rational numbers in different mathematical contexts (e.g., Behr, Harel, Post, & Lesh, 1994; Gómez, Jiménez, Bobadilla, Reyes, & Dartnell, 2014; Iuculano & Butterworth, 2011; Mazzocco & Devlin, 2008; Vamvakoussi, Christou, Mertens, & Van Dooren, 2011). Difficulties may appear because prior knowledge of and experience with natural numbers often do not support rational number learning. Natural numbers are based on different principles and properties than rational numbers, such that applying natural number properties and rules when reasoning with rational numbers may lead to misconceptions and errors (Carpenter, Fennema, & Romberg, 1993; Ni & Zhou, 2005; Smith, Solomon, & Carey, 2005; Vamvakoussi & Vosniadou, 2010). The current study investigates how prior natural number knowledge may result in a well-documented misconception in the mathematics education community: the tendency for students to associate arithmetic operations with specific result sizes, i.e., bigger numbers in multiplication and smaller numbers in division (Fischbein, Deri, Nello, & Marino, 1985; Greer, 1987; Izsák & Beckmann, 2018; Onslow, 1990; Prediger, 2008).
The Natural Number Bias Phenomenon
The fact that prior number knowledge interferes with learning more advanced number concepts has been well established in mathematics education for decades (e.g., Hart, 1981; Rees & Barr, 1984). From a developmental perspective, this phenomenon is called the whole number bias (Ni & Zhou, 2005) or, more recently, the natural number bias (NNB) (Vamvakoussi, Van Dooren, & Verschaffel, 2012; Van Dooren, Lehtinen, & Verschaffel, 2015), and refers to students’ tendency to use natural number knowledge when reasoning about rational numbers. The NNB can explain student difficulties and erroneous behaviors in different number task domains. For example, when reasoning about rational number magnitudes, students may erroneously think that – as with natural numbers – decimal numbers with more digits are larger, e.g., that 2.367 is larger than 2.6 (Moutier & Houdé, 2003; Nesher & Peled, 1986; Resnick et al., 1989; Roell, Viarouge, Houdé, & Borst, 2017, 2019). Similarly, students tend to think that the bigger the numerator and denominator of a fraction, the bigger the fraction value, which results in mistakes in ordering fractions, e.g., judging that 249/1000 is larger than 1/4 (DeWolf & Vosniadou, 2011; Hartnett & Gelman, 1998; Moss, 2005). As another example, the NNB may make it difficult for students to understand that the set of rational numbers is dense (i.e., that there are infinitely many numbers between any two rational numbers). Because the set of natural numbers is discrete (i.e., between two successive natural numbers there is no other natural number), the NNB may lead students to think that between two pseudo-consecutive numbers (e.g., 0.5 and 0.6), there is no other number (Desmet, Grégoire, & Mussolin, 2010; Vamvakoussi & Vosniadou, 2010).
Most often associated with the NNB is students’ reasoning about the size of the result of rational number arithmetic. Students tend to think that multiplication always results in a larger number while division always results in a smaller number. Well known to teachers, this misconception has been reported in the literature since the 1920s (Thorndike, 1922, as mentioned by Krueger & Hallford, 1984). Empirical research has shown that this misconception systematically appears when students solve word problems (Bell, Swan, & Taylor, 1981; Dixon, Deets, & Bangert, 2001; Fischbein et al., 1985; Graeber, Tirosh, & Glover, 1989; Greer, 1987, 1994; Harel & Confrey, 1994; Hart, 1981) and occurs across schooling. As examples, a majority of second graders have incorrectly answered whether 4.6 ÷ 0.6 is more or less than 4.6 (Greer, 1987), secondary students have responded that x > x × 2 cannot be true (Van Hoof, Vandewalle, Verschaffel, & Van Dooren, 2015), and college students have responded that z × 7 cannot be smaller than 7 (Vamvakoussi, Van Dooren, & Verschaffel, 2013).
Fischbein et al. (1985), who offered one of the first insights into this phenomenon, suggested that students hold implicit intuitive models of arithmetic, which shape their expectations about the effect of operations. These intuitive models associate addition with putting together, subtraction with taking away, multiplication with repeated addition, and division with equal sharing. From an NNB perspective, Vamvakoussi and colleagues (2013) argued that these intuitive models are compatible with, and based on, natural number operations, and relate to the operation and not the numbers involved. Specifically, addition and multiplication between natural numbers (excluding 0 and 1) always result in larger numbers. Similarly, the result of subtraction or division between two natural numbers is always a smaller number. This is, however, not true for non-natural numbers, for which the effects of operations depend on the numbers involved. Multiplication with rational numbers less than 1 results in answers that are smaller than at least one operand (e.g., 4 × 0.2 = 0.8) and division with rational numbers less than 1 results in answers that are larger than the operands (e.g., 8 ÷ 0.4 = 20).
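The dependence of the result size on the numbers involved can be checked with exact rational arithmetic. The following is a minimal Python sketch; the helper name `effect_of_operation` is introduced here for illustration only and is not part of the studies cited. It reproduces the two examples given above and contrasts them with a natural-number case.

```python
from fractions import Fraction

def effect_of_operation(a, op, b):
    """Return the exact result of `a op b` and how it compares with a.

    Illustrative helper (an assumption of this sketch, not taken from
    the study). Pass decimals as strings so they stay exact.
    """
    a, b = Fraction(a), Fraction(b)
    if op == "*":
        result = a * b
    elif op == "/":
        result = a / b
    else:
        raise ValueError("op must be '*' or '/'")
    if result > a:
        relation = "larger"
    elif result < a:
        relation = "smaller"
    else:
        relation = "equal"
    return result, relation

print(effect_of_operation(4, "*", "0.2"))  # multiplying by a number below 1
print(effect_of_operation(8, "/", "0.4"))  # dividing by a number below 1
print(effect_of_operation(4, "*", 3))      # natural-number multiplication
```

Only the last call matches the intuition that multiplication makes bigger; the first two yield 4/5 (smaller than 4) and 20 (larger than 8), respectively.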
Students’ years of experience working with natural numbers may create a strong intuition about the results that can be expected from natural-number arithmetic, such as that multiplication makes larger (Greer, 1994) or that addition makes larger (Dixon et al., 2001). From a framework theory approach to conceptual change, students from very early on construct an initial understanding of the number concept, which, based on their experience with counting and the sequence of the number words, acquires the characteristics of the mathematical concept of natural numbers (Gelman, 2000; Smith et al., 2005; Vamvakoussi, Christou, & Vosniadou, 2018; Vosniadou, Vamvakoussi, & Skopeliti, 2008). This initial number concept, which is organized into a framework theory for number, may form students’ beliefs, interpretations, and anticipations about the properties of numbers. From this perspective, the influence of the initial number concept when reasoning with more advanced numbers such as the rational and the real numbers may result in the NNB phenomenon and lead to systematic errors in rational number tasks where the properties of natural numbers do not hold (Vamvakoussi et al., 2018).
Importantly, the initial number conception remains even after learners have acquired the correct mathematical knowledge about rational numbers and, in principle, are able to solve rational number tasks. Thus, natural number knowledge may still interfere after students have learned to reason with rational numbers (see also Shtulman & Valcarcel, 2012; Vosniadou et al., 2008). Support for this idea, and specifically for the NNB with arithmetic operations, comes from studies using accuracy and reaction time measurements. If properties of natural numbers come to mind first, even in learners who have the correct rational number knowledge, responses to arithmetic equations that are aligned with natural number knowledge (called congruent items) would be quicker and more accurate than responses to incongruent items, in which natural number knowledge leads to a false conclusion. This is because a learner has to overcome this initial idea using analytical reasoning that may not always be successful, which in turn produces longer response times and fewer accurate responses. These response patterns align with dual process theories that rely on the distinction and interaction between intuitive and analytical reasoning processes (Gillard, Van Dooren, Schaeken, & Verschaffel, 2009; Vamvakoussi et al., 2013). Several reaction time studies with student and adult populations support the theories of coexistence and intuitive interference of natural number knowledge by showing statistically significant differences in accuracy rates and response times between tasks that were in line with intuitions about the results of arithmetic operations and tasks that falsified these intuitions (Obersteiner, Van Hoof, Verschaffel, & Van Dooren, 2016; Vamvakoussi et al., 2012; Van Hoof et al., 2015).
The Dual Effect of NNB in Arithmetic Operations With Missing Numbers
Students’ initial conception of numbers, coupled with their intuitions for the results of arithmetic operations, suggests the NNB may affect how students solve arithmetic problems with missing operands. To test this, Christou administered paper-and-pencil tasks to primary (Christou, 2015a, 2015b) and secondary school students (Christou, 2017). The tasks included arithmetic equations with operations between a given number and a missing number (e.g., 9 ÷ _ = 4). Different categories of tasks captured the potential duality of the NNB effect. The three categories included: a) congruent tasks, in which the results of the operations were in line with students’ intuitions (i.e., bigger results for multiplication and smaller for division), and the missing numbers were natural numbers (e.g., 3 × _ = 12); b) tasks which were again in line with students’ intuitions about the results of the operations, but the missing numbers were non-natural rational numbers larger than 1 (e.g., 3 × _ = 11); and c) tasks in which the magnitude of the results of the operations falsified students’ intuitions that multiplication makes bigger and division makes smaller (e.g., 3 × _ = 2), and in which the missing number was a rational number smaller than 1. In all those studies (Christou, 2015a, 2015b, 2017), students had statistically significantly higher accuracy rates for the items aligned with their intuitions about the size of the results of each operation than for items that falsified these intuitions. Additional studies have shown similar performance differences (Obersteiner et al., 2016; Vamvakoussi et al., 2013; Van Hoof et al., 2015).
Importantly, these results also indicated that students tend to think of the missing numbers in the items as natural numbers. Students performed significantly better when the missing number was a natural number than when it was a non-natural rational number, even when their intuitions about the results of the operations were not violated (Christou, 2015a, 2015b, 2017). Students also treat unknowns as natural numbers when evaluating algebraic expressions that contain literal symbols. Christou and colleagues found that students tend to substitute mostly natural numbers for literal symbols when evaluating algebraic expressions with operations between numbers and literal symbols (Christou & Vosniadou, 2012; Christou, Vosniadou, & Vamvakoussi, 2007). For example, students thought that k + 3 represented only natural numbers larger than three. Additionally, in an interview study with tenth graders, the majority of the students claimed that 5d is always bigger than 4/d because multiplication makes numbers bigger than division does, and most students supported this claim by substituting natural numbers for the literal symbols, despite the hints provided by the interviewer to also try other kinds of numbers (Christou & Vosniadou, 2012). Similarly, Van Hoof and colleagues (2015) found that interviewed students explicitly referred to general rules about the results of the operations that are valid only for natural numbers, or substituted literal symbols with natural numbers to come to an answer.
Strictly speaking, thinking of missing numbers or literal symbols as natural numbers only would not fall under the definition of the NNB as relying on the properties of natural numbers when reasoning about rational numbers. However, it would indicate the dominance of natural numbers in students’ thinking in numerical situations, and as such could be seen as another instance of the NNB. Thus, these studies suggest a dual effect of the NNB: intuitions about results of arithmetic operations and intuitions that missing numbers or literal symbols represent natural numbers.
The Present Study
The present study seeks to further examine and disentangle the dual effect of the NNB by measuring participants’ accuracy and reaction times as they make evaluations about whether arithmetic equations with a missing operand could be true or not (e.g., Is there a number such that 3 ÷ _ = 12?). In these tasks, the NNB may affect participants’ evaluations about the validity of such equations in two main ways. First, it would affect their strategy to check the validity of the equations by mentally substituting specific numbers for the missing number symbols. Under the influence of the NNB, participants may disproportionately substitute natural numbers. Second, participants may make evaluations based on intuitive expectations about the size of the results of operations (e.g., that multiplication should provide a larger outcome than the operand numbers), as shown in prior studies with literal symbols as missing numbers (Obersteiner et al., 2016; Vamvakoussi et al., 2012, 2013; Van Hoof et al., 2015).
The study has three aims:

1. As reviewed above, prior studies have focused on interviews and paper-and-pencil tasks that measure student accuracy. This study tests the dual effect assumption by measuring a different aspect of student reasoning: timed evaluations about missing number operands in arithmetic equations. This allows us to judge whether the NNB effect is still present, even when participants give correct responses.

2. It disentangles the two aspects of the NNB that may appear in tasks for which finding the missing number is more or less difficult.

3. It seeks to clarify whether participants’ tendencies to correctly evaluate the validity of statements like 3 × _ = 12 relate to a trial-and-error mental process with specific natural numbers, or rather relate to retrieval of multiplication facts from long-term memory.
Based on the dual aspect of the NNB in operations with missing numbers, participants may evaluate the validity of equations such as 3 ÷ _ = 12 based on two main strategies. First, participants who are affected by the NNB may tend to evaluate equations with larger results in multiplication and smaller results in division as possible, and those with larger results in division and smaller results in multiplication as impossible. Second, some participants may try to find the specific missing number that makes the expression possible in a trial-and-error process. Thus, participants may indicate that there is a missing number that would make the equation true when the missing natural number is easy to find, and may incorrectly respond that there is no number that makes the equation true when they cannot immediately find a natural number that would make the equation hold, either because the missing number is not a natural number, or because finding it is difficult. For this reason, we included three groups of equations with varying difficulty for finding the missing number:

a) the Group of Small Numbers, with numbers smaller than one hundred, where the missing number is easy to find when the missing number is a natural number,

b) the Group of Decimal Numbers, with equations using decimal numbers, and

c) the Group of Large Numbers, with numbers larger than one hundred.
In the last two groups of items, the missing number is more difficult to find than in the first group, even when in all cases the missing number is a natural number. As we explain below, differences in participants’ accuracy and reaction time among the three groups of items would provide further insights about the underlying cognitive processes when reasoning about the results of such arithmetic operations.
First, an important reason to include a set of items involving decimal numbers is to investigate a context effect. If students tend to think of an unknown number as a natural number in the first place, this tendency may be stronger when all numbers that are given in the equation are natural too. Students may be reminded to think of the unknown number as possibly non-natural when the equations contain decimals. Indeed, recent findings from an empirical study that used the same task design in a paper-and-pencil condition showed that students are affected by the types of numbers (either natural or decimal) that appear in arithmetic operations (Christou, 2017).
Second, the three sets of items are all needed to achieve the third aim: to clarify whether participants’ tendency to correctly evaluate the validity of statements like 3 × _ = 12 arises from trial and error or from familiarity with multiplication facts. To achieve this, this study included arithmetic combinations that would be familiar to participants because specific arithmetic facts come from the multiplication table or because the arithmetic facts were easily confused with familiar arithmetic combinations (a false familiarity, e.g., 3 ÷ _ = 12). Krueger (1986) showed that when there is very good verbatim memory for particular multiplication equations, participants rely not only on a plausibility evaluation of the given statements, but also on direct fact retrieval from memory (see also Stazyk, Ashcraft, & Hamann, 1982). To test for plausibility evaluation and retrieval mechanisms, the current study included arithmetic equations with small number combinations that appear in the multiplication table and thus draw on strong verbatim memory, and equations with decimals and large natural numbers where there is less or no verbatim memory to support specific evaluations. For the Group of Decimal Numbers and the Group of Large Numbers, participants can rely only on exact calculations in a mental trial-and-error process, or on plausibility evaluations, which are based on operational patterns (see Prather & Alibali, 2008) that may determine the expected results from each operation independently of the numbers involved. Regarding the first strategy, calculations with large numbers in particular are more difficult than with small numbers (Ashcraft, 1992); regarding the second strategy, plausibility evaluations may be affected by the NNB.
Finally, as part of the third aim, this study tested the effect of false familiarity on participants’ evaluations of equations that appeared familiar, but were not. In these cases, a false result may be difficult to reject because the equation is similar to a true equation but with a different operation. For example, in a cross-operation confusion condition (Krueger & Hallford, 1984), people find it difficult to reject such equations as 7 + 1 = 6, for which the answer would be true if the same pair of numbers appeared under subtraction (Ashcraft & Battaglia, 1978), and 3 ÷ 3 = 9, for which the answer would be true with multiplication (Winkelman & Schmidt, 1974). Similar research has shown that participants tend to think that 16 ÷ 32 = 2 is correct (Bell et al., 1981). Additionally, Stazyk and colleagues (1982) reported slower response times and increased error rates in confusing problems such as 7 × 4 = 21 or 7 × 4 = 35 than in non-confusing problems (e.g., 7 × 4 = 18). Confusion problems were characterized as those with answers adjacent or near the correct answer in the multiplication table, that is, answers obtained by changing one of the operands by ±1. Both confusion effects stemmed from relatedness among the numbers in the memory representation of multiplication facts. To test for the effect of false familiarity, the current study included a category of items in which the missing numbers were unit fractions. These equations falsified participants’ intuitions about the size of the results of multiplication and division with the missing number. For example, in the equation 7 ÷ _ = 42, the missing number is 1/6, which makes the equation resemble the familiar 7 × _ = 42 (where the missing number is 6), which appears in the multiplication table.
In sum, the current study examined accuracy and reaction time for evaluations about arithmetic equations with missing operands. Items were in line with or violated participants’ intuitions about the size of the results of the given operations (operation-congruent/incongruent items) and also about whether only natural numbers could be substituted for the missing number symbols (number-congruent/incongruent items). Also, differences in the effect of each aspect of the NNB were tested by comparing participants’ performance across three groups of items that were designed with the above characteristics: the Group of Small Numbers, the Group of Decimal Numbers, and the Group of Large Numbers. In order to account for the effect of false familiarity, participants’ performance was examined with number-incongruent and operation-incongruent items in which the missing number was a unit fraction.
Method
Sample
Seventy-seven college students participated in this experiment. The participants were mostly first-year bachelor students of Educational Sciences, or were graduate students in Education or Art. Participants’ age ranged from 18 to 29 years (M = 19.16 years), and 64 defined themselves as female. The students participated in the experiment as a prerequisite for taking the exams in one of their main courses.
Stimuli
There were 108 stimuli, consisting of 72 experimental and 36 distractor items. Each item consisted of a multiplication or division equation with a given number and a missing number (e.g., 3 × _ = 12). Missing number symbols were used instead of literal symbols (e.g., x) to discourage participants from applying a general insight about the solvability of linear equations, such as the general solution x = b/a for any equation of the form a × x = b, a ≠ 0. For each item, participants were asked to evaluate whether it is possible for the equation to be true or not, i.e., whether it would be possible to find a missing number that would provide the given result, without the need to actually find the missing number.
Experimental Items
The experimental items were generated using a 3 (group type: Small, Decimal, Large Numbers) × 4 (number-congruent/incongruent × operation-congruent/incongruent) × 2 (operation: multiplication or division) design. Examples of each type are presented in Table 1.
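Table 1

| Group | Item Type | Multiplication | Division |
|---|---|---|---|
| Small Numbers | CC | 3 × _ = 12 | 56 ÷ _ = 8 |
| Small Numbers | IC | 4 × _ = 31 | 26 ÷ _ = 9 |
| Small Numbers | II1 | 30 × _ = 6 | 7 ÷ _ = 42 |
| Small Numbers | II2 | 55 × _ = 8 | 3 ÷ _ = 11 |
| Small Numbers | Distractors | 0 × _ = 42 | 12 ÷ _ = 0 |
| Decimal Numbers | CC | 3.1 × _ = 15.5 | 12.8 ÷ _ = 3.2 |
| Decimal Numbers | IC | 6.1 × _ = 17.2 | 7.5 ÷ _ = 4.3 |
| Decimal Numbers | II1 | 18.3 × _ = 6.1 | 4.3 ÷ _ = 8.6 |
| Decimal Numbers | II2 | 14.4 × _ = 3.1 | 3.2 ÷ _ = 11.7 |
| Decimal Numbers | Distractors | 0 × _ = 10.8 | 8.6 ÷ _ = 0 |
| Large Numbers | CC | 6 × _ = 498 | 292 ÷ _ = 4 |
| Large Numbers | IC | 7 × _ = 384 | 735 ÷ _ = 8 |
| Large Numbers | II1 | 438 × _ = 3 | 9 ÷ _ = 657 |
| Large Numbers | II2 | 291 × _ = 4 | 6 ÷ _ = 497 |
| Large Numbers | Distractors | 0 × _ = 438 | 657 ÷ _ = 0 |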
The Three Groups of Items
Items were organized into three Groups, based on whether small numbers, decimal numbers, or large numbers were used as operands and outcomes. The Group of Small Numbers included items with small natural number combinations (smaller than 100). Participants are more familiar with these numbers since they are used in the classroom and in everyday life. Multiplication and division between numbers smaller than 100 also appear in the multiplication table, which participants have used extensively throughout schooling. The Group of Decimal Numbers consisted of multiplication and division equations in which the given operand and the outcome were decimal numbers. Lastly, the Group of Large Numbers included number combinations greater than those in the multiplication table. Verbatim memory could support participants’ evaluations with small numbers, but not with decimals or large numbers.
The Four Types of Items
Number-congruent items were equations in which a missing natural number would make the equation true. Number-incongruent items were equations in which only a non-natural rational number would make the equation true. Operation-congruent items were aligned with properties of natural number arithmetic: multiplication equations with a larger outcome or division equations with a smaller outcome. Operation-incongruent items were either multiplication equations with a smaller outcome or division equations with a larger outcome. Both types of equations could be made true.
Combinations of the item characteristics produced four types of items per group, which differed in their number-congruency and operation-congruency. First, there were items that were number-congruent and operation-congruent (e.g., 3 × _ = 12); these items were called CC (abbreviation for Congruent/Congruent). In the Group of Small Numbers, the CC items were all part of the multiplication table, and would be familiar to the participants. The remaining item types were number-incongruent because substituting missing number symbols with a natural number would not make the equation true; thus, substituting only natural numbers would lead participants to the incorrect conclusion that the equation could not be true. Second, some number-incongruent items were operation-congruent, in which multiplication produced bigger outcomes and division smaller outcomes; for example, 4 × _ = 31. These items are called IC (abbreviation for Incongruent/Congruent).
The third and fourth categories of items were both number-incongruent and operation-incongruent. For these items, the missing number was a non-natural number and the size of the results of the operations was not in line with natural number arithmetic. Thus, both item types are II (abbreviation for Incongruent/Incongruent). In the first category of II items (named II1), the missing number was a unit fraction. The II1 item number combinations may have appeared familiar to participants because they evoke number combinations that appear in the multiplication table, but these items had an inversion of the operand and the outcome. Thus, these items were false familiar because, for example, 7 ÷ _ = 42 evokes ‘6’ as the missing number, but the correct answer is 1/6; the item appears familiar because 42 ÷ 6 = 7. In the second category of II items (named II2), there was a missing rational number smaller than one, which resulted in an incongruent outcome size for each operation, i.e., smaller than the given operand in cases of multiplication and larger in cases of division (e.g., 55 × _ = 8, in which case the unknown number is 8/55, or approximately 0.1455).
There were 36 items for each Group: 18 items per operation, with three items of each Type, plus 6 distractor items. Note that there were no items that were number-congruent and operation-incongruent. Such items would require the opposite outcome pattern (multiplication makes smaller; division makes bigger) with a natural number missing, and are therefore mathematically impossible to create.
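The item typology described above can be expressed as a small classification routine. The following Python sketch is an illustration only: the function names `missing_operand` and `classify_item` are hypothetical, and the rules are our reading of the definitions in this section (natural missing number = number-congruent; larger outcome in multiplication or smaller outcome in division = operation-congruent; unit-fraction missing number = II1). It assumes positive given numbers, as in Table 1.

```python
from fractions import Fraction

def missing_operand(given, op, result):
    """Solve `given op _ = result` exactly; return None if no number works."""
    given, result = Fraction(str(given)), Fraction(str(result))
    if op == "*":
        # 0 x _ = nonzero has no solution (distractor pattern)
        return None if given == 0 else result / given
    if op == "/":
        # nonzero / _ = 0 has no solution (distractor pattern)
        return None if result == 0 else given / result
    raise ValueError("op must be '*' or '/'")

def classify_item(given, op, result):
    """Label an item as CC, IC, II1, II2, or distractor (hypothetical rules)."""
    x = missing_operand(given, op, result)
    if x is None:
        return "distractor"
    g, r = Fraction(str(given)), Fraction(str(result))
    number_congruent = x.denominator == 1 and x > 0
    operation_congruent = r > g if op == "*" else r < g
    if operation_congruent:
        return "CC" if number_congruent else "IC"
    # Operation-incongruent: unit fractions are the false familiar items.
    return "II1" if x.numerator == 1 else "II2"

# Examples taken from Table 1
for item in [(3, "*", 12), (4, "*", 31), (7, "/", 42), (55, "*", 8),
             ("3.1", "*", "15.5"), ("4.3", "/", "8.6"), (0, "*", 42)]:
    print(item, classify_item(*item))
```

Decimals are passed as strings so that `Fraction` keeps them exact; with floats, 3.1 × 5 would not compare cleanly with 15.5.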
Distractor Items
Since all experimental items had a number that would make the equation true, there were also distractor items that had no solution, and thus, for these items the correct response was that the equation could not be true. Distractors were arithmetic expressions with zero as the given operand or as the outcome (see Table 1), since multiplication by zero always results in zero and division of a nonzero number can never have zero as a quotient. Each group had 12 distractor items.
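The item counts reported for the design can be cross-checked by enumerating the cells. The sketch below is a bookkeeping aid under the counts stated in the text (three items per type and six distractors per group and operation); the constant and function names are ours.

```python
from itertools import product

GROUPS = ("Small", "Decimal", "Large")
OPERATIONS = ("multiplication", "division")
ITEM_TYPES = ("CC", "IC", "II1", "II2")
ITEMS_PER_TYPE = 3        # per group and operation, as stated in the text
DISTRACTORS_PER_CELL = 6  # per group and operation, as stated in the text

def build_design():
    """Return one (group, operation, type) tuple per stimulus."""
    rows = []
    for group, operation in product(GROUPS, OPERATIONS):
        for item_type in ITEM_TYPES:
            rows += [(group, operation, item_type)] * ITEMS_PER_TYPE
        rows += [(group, operation, "distractor")] * DISTRACTORS_PER_CELL
    return rows

design = build_design()
experimental = [row for row in design if row[2] != "distractor"]
distractors = [row for row in design if row[2] == "distractor"]
print(len(design), len(experimental), len(distractors))  # 108 72 36
```

The totals agree with the 108 stimuli, 72 experimental items, and 36 distractors reported under Stimuli, and each group receives 36 items, 12 of them distractors.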
Procedure
Participants completed the experiment in one 30-minute session in a classroom on a computer. Participants provided demographic information (i.e., age, gender) and then completed the experimental task. Stimuli were presented using E-Prime. The items were presented in three blocks, with a break between each block. Each block included the four item types for one group. The order of the blocks was randomized.
Each equation was displayed with the question: Can this be true? Y/N. Participants responded by pressing Y for Yes and N for No. Participants were told that they did not need to find the missing number, only to respond whether a number exists that would make the equation true, as quickly and accurately as possible. Items were displayed until response and were preceded by a 500 ms fixation cross. Response time and accuracy were recorded for each trial.
Predictions
We had several predictions for participants’ response patterns to the four item types. If the participants’ responses were based on mentally substituting natural numbers, there would be greater accuracy (Prediction 1a) and shorter reaction time (Prediction 1b) on the number-congruent/operation-congruent items (i.e., CC) than on the number-incongruent/operation-congruent items (i.e., IC). Additionally, if participants based their answers on the effect of operation, then there would be higher accuracy (Prediction 2a) and shorter reaction time (Prediction 2b) on operation-congruent items (i.e., the CC and IC items) than on operation-incongruent items (i.e., the II1 and II2 items). If participants were affected by the familiarity of the number combinations from the multiplication table, there would be higher accuracy (Prediction 3a) and shorter reaction time (Prediction 3b) on false familiar items (II1 items) than on the other number-incongruent/operation-incongruent items (II2 items), even though they are both number- and operation-incongruent types of items.
Predictions 1 to 3 as described so far apply to all three Groups of items. They are the core predictions of our study, as they specifically refer to the way the items are solved: by mentally substituting (specific) natural numbers or by judging the effect of the given operation. However, we further wanted to explore differential effects for the three Groups of items, which may further corroborate the underlying mechanisms. For instance, for the Group of Decimal Numbers and the Group of Large Numbers, we predict the effect of number-congruency will be less strong than in the Group of Small Numbers, because small number combinations are more easily memorized (Prediction 4). Especially for missing natural numbers (CC items), it is easier to determine the missing number in the Group of Small Numbers, and in some cases in the Group of Decimal Numbers (e.g., the missing number 5 for 3.1 × _ = 15.5), than in the Group of Large Numbers (e.g., 6 × _ = 498). Therefore, participants may tend to think less of a specific number for equations with decimal numbers, and even less for equations with large numbers, than for equations with small numbers. Instead, for decimal and large number combinations, participants may rely more on the operation-congruency effect (i.e., whether there are larger outcomes for multiplication and smaller for division). Thus, a stronger effect of operation-congruency in the Group of Large Numbers and the Group of Decimal Numbers would be expected, compared to the Group of Small Numbers (Prediction 5). For the same reasons, it would be expected that the effect of false familiarity would be stronger in the Group of Small Numbers than in the other groups (Prediction 6).
Data Analysis
Distractor items were excluded from all analyses. Responses for multiplication and division items were analyzed separately, since the notions that multiplication makes bigger and division makes smaller may have differential effects. A separate analysis therefore provides a clearer picture of the phenomenon at hand for each operation. For each of the three groups, and again separately for multiplication and division items, we calculated outliers as trials that were more than three standard deviations above or below the average reaction time. Excluding distractor items, 18 of 3327 correct trials (0.5%) were excluded in total.
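The three-standard-deviation exclusion rule can be sketched as follows. This is an illustration rather than the authors' script: the function name `exclude_outliers` is hypothetical, the use of the sample standard deviation is an assumption (the text does not say which estimator was used), and the reaction times in the example are made-up values in milliseconds.

```python
from statistics import mean, stdev

def exclude_outliers(reaction_times, n_sd=3.0):
    """Split reaction times into (kept, removed) using a mean +/- n_sd * SD band.

    Intended to be applied separately to each group and operation.
    Uses the sample SD (an assumption of this sketch).
    """
    centre = mean(reaction_times)
    spread = stdev(reaction_times)
    lower, upper = centre - n_sd * spread, centre + n_sd * spread
    kept = [rt for rt in reaction_times if lower <= rt <= upper]
    removed = [rt for rt in reaction_times if rt < lower or rt > upper]
    return kept, removed

# Hypothetical reaction times (ms) for one group and operation
rts = [900, 1000, 1100] * 6 + [950, 1050] + [9000]
kept, removed = exclude_outliers(rts)
print(len(kept), removed)
```

With very small cells a single extreme value inflates the standard deviation enough that it can never exceed three standard deviations, so the rule only bites when each cell contains a reasonable number of trials.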
The Generalized Estimating Equations (GEE) module in SPSS was used, with logistic regression to model accuracy (i.e., correct = 1 and incorrect = 0) and linear regression to model reaction time. The GEE approach accounts for the dependence of repeated measurements within subjects. In order to test the effect of number- and operation-congruency, the GEE analyses were applied to the accuracy rates, and also to the reaction times, for the four types of items, separately for each of the three Groups of items and each operation type. For both outcomes, Bonferroni-adjusted pairwise comparisons were applied in order to test for statistically significant differences between item types. For accuracy rates, odds ratios were calculated for these pairwise comparisons, and for reaction times Cohen’s d was used. To test for differences within each item type across the three groups, GEE analyses were also applied with group as a factor for each item type. In this case, odds ratios were compared to account for differences in the effect of number- and operation-congruency between the different groups of number combinations.
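For readers less familiar with the two effect sizes, the sketch below shows how an odds ratio and Cohen's d are computed from raw data. It is not the GEE procedure itself: the odds ratios reported in the paper are model-based estimates, whereas this sketch uses simple counts, and the pooled-standard-deviation form of Cohen's d is an assumption because the text does not specify the variant. Function names and all numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def odds_ratio(correct_a, total_a, correct_b, total_b):
    """Odds of a correct answer in condition A relative to condition B."""
    wrong_a, wrong_b = total_a - correct_a, total_b - correct_b
    if wrong_a == 0 or wrong_b == 0 or correct_b == 0:
        raise ValueError("odds ratio undefined for these counts")
    return (correct_a / wrong_a) / (correct_b / wrong_b)

def cohens_d(sample_a, sample_b):
    """Standardized mean difference using the pooled sample SD."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled_var = ((n_a - 1) * variance(sample_a) +
                  (n_b - 1) * variance(sample_b)) / (n_a + n_b - 2)
    return (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var)

# Hypothetical data: 90/100 correct vs. 50/100 correct
print(odds_ratio(90, 100, 50, 100))
# Hypothetical reaction times in seconds
print(cohens_d([2, 4, 6], [1, 3, 5]))
```

An odds ratio above 1 means the first condition has higher odds of a correct response; a positive d means the first sample has the larger mean.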
Results
Number- and Operation-Congruency Effects on Accuracy Rates
The Effect of Multiplication for Each Group of Items
Table 2 shows participants’ performance on all multiplication items, by item type and group. GEE analysis of mean performance showed that in the Group of Small Numbers there was a significant effect of Type, χ^{2}(3, N = 924) = 82.059, p < .001. Post-hoc Bonferroni-adjusted pairwise comparisons showed a statistically significant difference between CC and IC, p < .001, OR = 28.17, 95% CI [9.62, 82.53], CC and II1, p < .001, OR = 8.00, 95% CI [2.67, 23.98], and CC and II2, p < .001, OR = 29.33, 95% CI [10.01, 85.95], in which CC items had the highest accuracy. Estimated odds ratios suggest that the odds of answering a CC item correctly were about 28 times the odds of answering an IC item correctly and about 29 times the odds of answering an II2 item correctly. Both of these odds ratios are higher than the estimated odds ratio of answering a CC item correctly versus an II1 item. Taken together, these results indicate a large effect of number-congruency (in line with Prediction 1a), which is stronger between tasks that appear unfamiliar. Accuracy was higher on IC than on II2 items, a direction in line with Prediction 2a, but this difference was not statistically significant, p = 1.000, OR = 1.04, 95% CI [0.6, 1.82]. Not in line with Prediction 2a, accuracy was statistically significantly higher on II1 items than on IC items, p < .001, OR = 3.52, 95% CI [1.93, 6.41]. Finally, accuracy on II1 items was statistically significantly higher than on II2 items, p < .001, OR = 3.67, 95% CI [2.01, 6.68], which indicates that the familiarity of the number combinations in the II1 items affected participants’ responses (Prediction 3a). For small numbers, the effect of number-congruency is the largest, followed by the effect of familiarity. However, these results do not support a strong effect of operation-congruency for small numbers.
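To make the odds-ratio interpretation concrete, the following sketch recomputes odds ratios directly from the rounded mean accuracies reported in Table 2 for the Group of Small Numbers. This is only an arithmetic check under the assumption that the reported odds ratios correspond to ratios of the odds of the estimated means; it is not the GEE model itself, and the function names are illustrative.

```python
def odds(p):
    """Convert a proportion correct into odds of a correct answer."""
    return p / (1.0 - p)


def odds_ratio(p_a, p_b):
    """Odds of a correct answer on item type a relative to item type b."""
    return odds(p_a) / odds(p_b)


# Mean accuracies for multiplication, Group of Small Numbers (Table 2).
small_mult = {"CC": 0.96, "IC": 0.46, "II1": 0.75, "II2": 0.45}

for other in ("IC", "II1", "II2"):
    print("CC vs", other, round(odds_ratio(small_mult["CC"], small_mult[other]), 2))
```

With these inputs the three ratios come out at approximately 28.17, 8.00, and 29.33, which agrees with the odds ratios reported for the CC comparisons in this group.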
Table 2
Group / Item Type  M  SE  95% Wald CI LL  95% Wald CI UL
Group of Small Numbers  
CC (e.g., 3 × _ = 12)  .96  .012  .93  .98 
IC (e.g., 4 × _ = 31)  .46  .064  .34  .58 
II1 (e.g., 30 × _ = 6)  .75  .044  .65  .82 
II2 (e.g., 55 × _ = 8)  .45  .059  .34  .57 
Group of Decimal Numbers  
CC (e.g., 3.1 × _ = 15.5)  .95  .015  .91  .97 
IC (e.g., 6.1 × _ = 17.2)  .56  .046  .47  .65 
II1 (e.g., 18.3 × _ = 6.1)  .74  .041  .65  .81 
II2 (e.g., 14.4 × _ = 3.1)  .46  .049  .37  .56 
Group of Large Numbers  
CC (e.g., 6 × _ = 498)  .83  .026  .78  .88 
IC (e.g., 7 × _ = 384)  .69  .040  .60  .76 
II1 (e.g., 438 × _ = 3)  .60  .047  .51  .69 
II2 (e.g., 291 × _ = 4)  .45  .047  .36  .54 
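The Wald confidence limits in Table 2 are not symmetric around the means, which would be expected if the intervals were computed on the logit scale and then back-transformed. The sketch below illustrates that calculation under this assumption, using a delta-method conversion of the standard error; the function name and the choice of z = 1.96 are illustrative, and the approach is inferred from the table rather than stated in the text.

```python
import math


def logit_wald_ci(p, se, z=1.96):
    """Approximate Wald CI for a proportion, built on the logit scale."""
    logit = math.log(p / (1.0 - p))
    # Delta method: SE on the logit scale from SE on the probability scale.
    se_logit = se / (p * (1.0 - p))
    lo, hi = logit - z * se_logit, logit + z * se_logit
    # Back-transform both limits to the probability scale.
    return 1.0 / (1.0 + math.exp(-lo)), 1.0 / (1.0 + math.exp(-hi))


# CC multiplication items, Group of Small Numbers (Table 2): M = .96, SE = .012.
lower, upper = logit_wald_ci(0.96, 0.012)
print(round(lower, 2), round(upper, 2))
```

For the CC row this gives limits of about .93 and .98, as in the table. Because the tabled means and standard errors are themselves rounded, other rows may differ from the reported limits by .01.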
For the Group of Decimal Numbers, there was a statistically significant effect of Type, χ^{2}(3, N = 924) = 90.477, p < .001. Pairwise comparisons showed a statistically significant difference between Types CC and IC, p < .001, OR = 14.93, 95% CI [5.59, 39.86], CC and II1, p < .001, OR = 6.68, 95% CI [2.45, 18.22], and CC and II2, p < .001, OR = 22.3, 95% CI [8.36, 59.52], in which CC items always had the highest accuracy. Again, the estimated odds ratios suggest that the odds of answering a CC item correctly were about 15 times the odds of answering an IC item correctly and about 22 times the odds of answering an II2 item correctly. Both of these odds ratios are higher than the estimated odds ratio of answering a CC item correctly versus an II1 item. Taken together, these results show an effect of number-congruency for decimal numbers (in line with Prediction 1a), which is stronger between tasks that do not appear familiar. Additionally, there was a statistically significant difference between IC and II2, p = .040, OR = 1.49, 95% CI [0.86, 2.61], which indicates an effect of operation-congruency (Prediction 2a). There was also a statistically significant difference between IC and II1 in which, counter to Prediction 2a, performance on II1 items was higher than on IC items, p = .010, OR = 2.24, 95% CI [1.23, 4.06]. Accuracy on II1 items was statistically significantly higher than on II2 items, p < .001, OR = 3.34, 95% CI [1.84, 6.06], which suggests a clear effect of familiarity for II1 items (Prediction 3a). In sum, for decimal numbers, number-congruency appears to have the largest effect, followed by familiarity, whereas operation-congruency appears to have the smallest effect.
For the Group of Large Numbers, there was a statistically significant effect of Type, χ^{2}(3, N = 924) = 85.235, p < .001. There was a statistically significant difference between CC and IC, p < .001, OR = 2.19, 95% CI [1.12, 4.3], CC and II1, p < .001, OR = 3.25, 95% CI [1.69, 6.28], and CC and II2, p < .001, OR = 5.97, 95% CI [3.1, 11.47], in which CC items always had the highest accuracy. As above, these results suggest an effect of number-congruency (Prediction 1a); however, analysis of the odds ratios showed weaker effects than in the previous groups. The estimated odds ratios suggest that the odds of answering a CC item correctly were about 6 times the odds of answering an II2 item correctly and about 3 times the odds of answering an II1 item correctly, and both of these odds ratios are higher than the estimated odds ratio of answering a CC item correctly versus an IC item. Finally, in line with Prediction 2a, participants’ performance on IC items was higher than on II1 and II2 items; the difference between IC and II2 was statistically significant, p < .001, OR = 2.72, 95% CI [1.53, 4.85], but the difference between IC and II1 was not statistically significant, p = .762, OR = 1.48, 95% CI [0.83, 2.66]. Additionally, accuracy on II1 items was higher than on II2 items, p < .001, OR = 1.83, 95% CI [1.05, 3.21], which suggests an effect of familiarity (Prediction 3a). Accuracy rates for multiplication items by type and group are visually presented in the Appendix, in Panel (a) of Figure A1. In sum, for large numbers, the effect of number-congruency was largest, followed by the effect of operation-congruency, and the effect of familiarity was the smallest. However, the effect of number-congruency appeared smaller for large numbers than for small numbers and decimal numbers.
The Effect of Division for Each Group of Items
Table 3 shows participants’ performance on division items, by item type and group. For Small Numbers, there was a statistically significant effect of Type, χ^{2}(3, N = 924) = 201.732, p < .001. Bonferroni-adjusted pairwise comparisons showed a statistically significant difference between CC and IC, p < .001, OR = 25.19, 95% CI [9.43, 67.28], CC and II1, p < .001, OR = 15.55, 95% CI [5.82, 41.49], and CC and II2, p < .001, OR = 51.37, 95% CI [18.86, 139.89], in which CC items always had the highest accuracy. Estimated odds ratios suggest that the odds of answering a CC item correctly were about 25 times the odds of answering an IC item correctly and about 51 times the odds of answering an II2 item correctly. Both of these odds ratios are higher than the estimated odds ratio of answering a CC item correctly versus an II1 item. These results indicate an effect of number-congruency (in line with Prediction 1a), which is again stronger between tasks that appear unfamiliar. There was a statistically significant difference between IC and II2 items, p < .001, OR = 2.04, 95% CI [1.13, 3.69], with higher performance on IC items, which is partially in line with Prediction 2a and indicates an effect of operation-congruency. On the other hand, there was no statistically significant difference between IC and II1 items, p = .284, OR = 0.62, 95% CI [0.35, 1.08]. Finally, accuracy on II1 items was higher than on II2 items, and this difference was statistically significant, p < .001, OR = 3.3, 95% CI [1.83, 5.97] (Prediction 3a). These results suggest that for small numbers, the effect of number-congruency was larger than the effect of familiarity, which was in turn larger than the effect of operation-congruency.
Table 3
Group / Item Type  M  SE  95% Wald CI LL  95% Wald CI UL
Group of Small Numbers  
CC (e.g., 56 ÷ _ = 8)  .95  .014  .91  .97 
IC (e.g., 26 ÷ _ = 9)  .43  .047  .35  .53 
II1 (e.g., 7 ÷ _ = 42)  .55  .050  .45  .64 
II2 (e.g., 3 ÷ _ = 11)  .27  .047  .19  .37 
Group of Decimal Numbers  
CC (e.g., 12.8 ÷ _ = 3.2)  .93  .017  .89  .96 
IC (e.g., 7.5 ÷ _ = 4.3)  .49  .050  .39  .58 
II1 (e.g., 4.3 ÷ _ = 8.6)  .64  .046  .54  .72 
II2 (e.g., 3.2 ÷ _ = 11.7)  .39  .046  .30  .48 
Group of Large Numbers  
CC (e.g., 292 ÷ _ = 4)  .93  .028  .72  .83
IC (e.g., 735 ÷ _ = 8)  .49  .046  .44  .62 
II1 (e.g., 9 ÷ _ = 657)  .64  .045  .36  .53
II2 (e.g., 6 ÷ _ = 497)  .32  .045  .24  .42 
For Decimal Numbers, there was a statistically significant effect of Type, χ^{2}(3, N = 924) = 132.866, p < .001. In line with Predictions 1a and 2a, there was a statistically significant difference between CC and IC, p < .001, OR = 13.83, 95% CI [5.84, 32.76], CC and II1, p < .001, OR = 7.47, 95% CI [3.13, 17.84], and CC and II2, p < .001, OR = 20.78, 95% CI [8.73, 49.45], in which CC items had the highest accuracy. Estimated odds ratios suggest that the odds of answering a CC item correctly were about 14 times the odds of answering an IC item correctly and about 21 times the odds of answering an II2 item correctly. Both of these odds ratios are higher than the estimated odds ratio of answering a CC item correctly versus an II1 item. These results show an effect of number-congruency (in line with Prediction 1a), which is stronger between unfamiliar tasks. The statistically significant difference between IC and II2 items, p = .008, OR = 1.50, 95% CI [0.86, 2.63], reflects an effect of operation-congruency, but the difference between IC and II1 did not reach statistical significance, p = .075, OR = 0.54, 95% CI [0.31, 0.95], providing only partial support for Prediction 2a. Finally, accuracy on II1 items was higher than on II2 items, and this difference was statistically significant, p < .001, OR = 2.78, 95% CI [1.57, 4.93] (Prediction 3a). For decimal numbers, the effect of number-congruency was the largest, followed by the effect of familiarity, with the smallest effect and only limited support for operation-congruency.
For the Group of Large Numbers, there was a statistically significant effect of Type, χ^{2}(3, N = 924) = 101.611, p < .001, for division items. There was a statistically significant difference between CC and IC, p < .001, OR = 13.83, 95% CI [5.84, 32.76], CC and II1, p < .001, OR = 9.55, 95% CI [4.05, 22.49], and CC and II2, p < .001, OR = 28.23, 95% CI [11.76, 67.76], in which CC items had the highest accuracy. Estimated odds ratios suggest that the odds of answering a CC item correctly were about 14 times the odds of answering an IC item correctly and about 28 times the odds of answering an II2 item correctly. Both of these odds ratios are higher than the estimated odds ratio of answering a CC item correctly versus an II1 item. These results show an effect of number-congruency (in line with Prediction 1a), especially for items that appear unfamiliar. Additionally, there was a significant difference between IC and II2, p < .001, OR = 2.04, 95% CI [1.15, 3.63], with higher performance on the IC items, which reflects an effect of operation-congruency (in line with Prediction 2a). There was no statistically significant difference between IC and II1, p = .724, OR = 1.38, 95% CI [0.79, 2.40]. Finally, accuracy on II1 items was higher than on II2 items, and this difference was statistically significant, p = .003, OR = 3.78, 95% CI [2.10, 6.79] (Prediction 3a). For large numbers, number-congruency had the largest effect, with smaller effects for familiarity and operation-congruency. Accuracy rates for division items by type and group are visually presented in the Appendix, in Panel (b) of Figure A1.
In sum, the above results supported Prediction 1a about the effect of number-congruency on participants’ accuracy rates, since accuracy was statistically significantly higher on the number-congruent items (i.e., CC) than on the number-incongruent items (i.e., IC, II1, II2), in all groups of items, in multiplication and also in division. Results provide mixed support for Prediction 2a, that participants would tend to base their answers on the effect of the operation and that accuracy would be higher for operation-congruent than for operation-incongruent items. For all groups of items, accuracy rates were higher for IC items than for II2 items in both operations, which supports Prediction 2a about the effect of operation-congruency. However, according to Prediction 2a, IC items should also have higher accuracy than II1 items, but this was only the case in the Group of Large Numbers in multiplication. Lower accuracy on the operation-congruent IC items than on the operation-incongruent II1 items suggests a familiarity effect on participants’ evaluations of the validity of the multiplication and division equations. In line with this interpretation, results support Prediction 3a, that there would be higher accuracy for false familiar items: accuracy was higher on II1 items than on II2 items, both of which are operation-incongruent. The comparison of odds ratios showed that, for both operations, the number-congruency effect was consistently larger than the operation-congruency effect or the effect of familiarity, and these effects were larger for the groups of small and decimal numbers than for the group of large numbers.
To further explore the above results, a GEE model was applied to accuracy rates across the three Groups for selected pairs of item types (i.e., comparing CC and IC; IC and II1; IC and II2; and II1 and II2). Comparing the odds ratios indicates possible differences in the number-congruency and operation-congruency effects between the Groups of items.
The NNB Effect Between the Three Groups of Items
The Number-Congruency Effect Between the Groups
To test the main effect of number-congruency and the interaction of number-congruency and type, a GEE analysis was applied to the CC and IC items, for the three Groups. For multiplication, there was a statistically significant interaction effect of group and number-congruency, χ^{2}(5, N = 1386) = 98.767, p < .001. Performance was higher on CC than on IC items in the Group of Small Numbers, p < .001, OR = 55.26, 95% CI [12.91, 236.51], in the Group of Decimal Numbers, p < .001, OR = 18.11, 95% CI [6.18, 53.08], and in the Group of Large Numbers, p < .001, OR = 2.40, 95% CI [1.23, 4.69]. The odds ratios showed that, in line with Prediction 4, the effect of number-congruency was larger in the Group of Small Numbers than in the other groups.
Results for division were similar: there was also a statistically significant interaction effect of group and number-congruency, χ^{2}(5, N = 1386) = 113.682, p < .001. Again, performance on CC items was higher than on IC items in the Group of Small Numbers, p < .001, OR = 26.00, 95% CI [8.88, 76.13], in the Group of Decimal Numbers, p < .001, OR = 9.80, 95% CI [4.30, 22.30], and in the Group of Large Numbers, p < .001, OR = 3.73, 95% CI [1.96, 7.10]. The odds ratios showed again that the effect of number-congruency was larger in the Group of Small Numbers than in the other groups (in line with Prediction 4).
The Operation-Congruency Effect Between the Groups of Items
The same analysis was applied to the IC and II1 items to test the main effect of operation-congruency and the interaction of operation-congruency and type, for the three Groups. For multiplication, there was a statistically significant interaction effect of group and operation-congruency, χ^{2}(5, N = 1386) = 43.436, p < .001. Accuracy on II1 items was higher than on IC items in the Group of Small Numbers, p < .001, OR = 3.34, 95% CI [1.84, 6.06], and in the Group of Decimal Numbers, although the latter difference was not statistically significant, p = .061, OR = 2.11, 95% CI [1.15, 3.88]. In the Group of Large Numbers, accuracy on IC items was higher than on II1 items, but again this difference was not statistically significant, p = 1.000, OR = 1.19, 95% CI [0.67, 2.10]. The odds ratios showed that, in line with Prediction 5, the effect of operation-congruency was largest in the Group of Small Numbers, in which performance on II1 items was higher than on IC items. For division, participants’ performance on II1 items was always higher than on IC items, and the odds ratios showed the same pattern as in multiplication; however, the interaction effect of group and operation-congruency was not statistically significant, χ^{2}(5, N = 1386) = 11.036, p = .051.
Additionally, the main effect of operation-congruency and the interaction of operation-congruency and type were tested in the IC and II2 items, for the three Groups. For multiplication, there was a statistically significant interaction effect of group and operation-congruency, χ^{2}(5, N = 1386) = 36.075, p < .001. Performance was higher on IC items than on II2 items in all three groups, but the difference was statistically significant only in the Group of Large Numbers, p < .001, OR = 2.57, 95% CI [1.45, 4.56], and not in the Group of Decimal Numbers, p = 1.000, OR = 1.38, 95% CI [0.79, 2.42], or the Group of Small Numbers, p = 1.000, OR = 1.04, 95% CI [0.60, 1.81]. The odds ratios showed that the effect of operation-congruency is larger in the Group of Large Numbers than in the other groups (in line with Prediction 5).
For division, there was also a statistically significant interaction effect of group and operation-congruency, χ^{2}(5, N = 1386) = 55.173, p < .001, with higher performance on IC items than on II2 items in the Group of Small Numbers, p < .001, OR = 3.14, 95% CI [1.70, 5.81], in the Group of Decimal Numbers, p < .001, OR = 2.43, 95% CI [1.36, 4.34], and in the Group of Large Numbers, p = .022, OR = 2.07, 95% CI [1.11, 3.80]. However, counter to Prediction 5, the odds ratios showed that the effect of operation-congruency is larger in the Group of Small Numbers than in the other groups.
The Effect of Item Familiarity Between the Groups of Items
The same analysis was also conducted on the II1 and II2 items, to test the effect of familiarity with the items that appear in the multiplication table, for the three Groups. For multiplication, there was a statistically significant interaction effect of group and familiarity, χ^{2}(5, N = 2079) = 123.435, p < .001. Accuracy on II1 items was higher than on II2 items in the Group of Small Numbers, p < .001, OR = 5.88, 95% CI [3.18, 10.89], in the Group of Decimal Numbers, p < .001, OR = 4.1, 95% CI [2.25, 7.45], and also in the Group of Large Numbers, p < .001, OR = 2.26, 95% CI [1.28, 3.99]. The odds ratios showed that, in line with Prediction 6, the effect of familiarity is larger in the Group of Small Numbers than in the Group of Decimal Numbers and is smallest in the Group of Large Numbers.
For division, the odds ratios showed almost the same pattern. There was also a statistically significant interaction effect of group and familiarity, χ^{2}(5, N = 2079) = 32.196, p < .001. Accuracy on II1 items was higher than on II2 items in the Group of Small Numbers, p = .004, OR = 2.01, 95% CI [1.14, 3.55], and in the Group of Decimal Numbers, p < .001, OR = 2.25, 95% CI [1.28, 3.96]. It was also higher in the Group of Large Numbers, p = .194, OR = 1.51, 95% CI [0.86, 2.67], but there the difference was not statistically significant. The odds ratios showed that the effect of familiarity was similar in magnitude in the Group of Small Numbers and the Group of Decimal Numbers and smaller in the Group of Large Numbers.
In summary, for both multiplication and division, the effect of number-congruency was larger in the Group of Small Numbers than in the Group of Decimal Numbers and the Group of Large Numbers. These results support Prediction 4, that the effect of number-congruency would be less strong in the Group of Decimal Numbers and the Group of Large Numbers than in the Group of Small Numbers, because small number combinations are more easily memorized. Higher accuracy rates on the IC than on the II2 items showed an effect of operation-congruency, which was larger in the Group of Large Numbers than in the Group of Decimal Numbers and the Group of Small Numbers in multiplication, but not in division. For multiplication, these results support Prediction 5, that a stronger effect of operation-congruency would be expected in the Group of Large Numbers and the Group of Decimal Numbers compared to the Group of Small Numbers. Interestingly, higher accuracy on IC than on II1 items in multiplication also showed an operation-congruency effect that was larger in the Group of Large Numbers than in the Group of Decimal Numbers and the Group of Small Numbers, where accuracy was higher on the II1 than on the IC items. Again, these results may indicate that the effect of familiarity (i.e., the II1 items) is larger in small number combinations than in decimal and large number combinations. In support of this interpretation, and in line with Prediction 6, the effect of familiarity was larger in the Group of Small Numbers than in the Group of Decimal Numbers and the Group of Large Numbers, as shown by the higher accuracy rates on II1 than on II2 items for both operations.
Analysis of the Reaction Times
Results for Reaction Time for Multiplication Items
Table 4 shows participants’ mean reaction times for multiplication items, for each group and type. In the Group of Small Numbers, there was a statistically significant effect of Type, χ^{2}(3, N = 603) = 24.007, p < .001. In line with Prediction 1b, participants responded to CC items faster than to IC items, p = .023, d = 0.39, II1 items, p = .001, d = 0.45, and II2 items, p < .001, d = 0.47. There was no statistically significant difference between IC and II1 items, p = 1.000, d = 0.057, IC and II2 items, p = 1.000, d = 0.072, or II1 and II2 items, p = 1.000, d = 0.015 (Prediction 3b). These results suggest a medium effect of number-congruency for small numbers.
Table 4
Group / Item Type  M  SE  95% Wald CI LL  95% Wald CI UL
Group of Small Numbers  
CC (e.g., 3 × _ = 12)  1901.98  139.811  1627.96  2176.00
IC (e.g., 4 × _ = 31)  2699.41  237.433  2234.05  3164.77 
II1 (e.g., 30 × _ = 6)  2814.38  208.210  2406.30  3222.46 
II2 (e.g., 55 × _ = 8)  2844.59  237.221  2379.64  3309.53 
Group of Decimal Numbers  
CC (e.g., 3.1 × _ = 15.5)  2823.10  204.423  2422.44  3223.76 
IC (e.g., 6.1 × _ = 17.2)  2752.19  261.682  2239.30  3265.08 
II1 (e.g., 18.3 × _ = 6.1)  3385.88  268.082  2860.45  3911.32 
II2 (e.g., 14.4 × _ = 3.1)  3783.57  323.882  3148.77  4418.36 
Group of Large Numbers  
CC (e.g., 6 × _ = 498)  3238.68  257.427  2734.13  3743.23 
IC (e.g., 7 × _ = 384)  2911.23  236.297  2448.10  3374.37 
II1 (e.g., 438 × _ = 3)  3950.17  303.911  3354.51  4545.82
II2 (e.g., 291 × _ = 4)  3663.03  296.369  3082.16  4243.91 
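The reaction-time comparisons are summarized with Cohen’s d. The text does not state which variant of d was computed, so the sketch below assumes the common pooled-standard-deviation form for two independent samples; the function name and the reaction times are hypothetical and serve only to show the arithmetic.

```python
import math
import statistics


def cohens_d(sample_a, sample_b):
    """Cohen's d for two samples, using the pooled standard deviation."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    # Positive d means sample_b has the larger mean (slower responses here).
    return (statistics.mean(sample_b) - statistics.mean(sample_a)) / pooled_sd


# Hypothetical reaction times (ms); these are not data from the study.
cc_rts = [1700.0, 1850.0, 1900.0, 2000.0, 2050.0]
ic_rts = [2300.0, 2600.0, 2700.0, 2800.0, 3100.0]
print(cohens_d(cc_rts, ic_rts))
```

A repeated-measures variant of d would use the standard deviation of within-participant differences instead, which can give noticeably different values for the same means.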
In the Group of Decimal Numbers, there was a statistically significant effect of Type, χ^{2}(3, N = 616) = 12.756, p = .005. In line with Prediction 1b, there were statistically significant differences between mean reaction times on the CC and II1 items, p = .007, d = 0.22, and between the CC and II2 items, p = .010, d = 0.38. There was no statistically significant difference in reaction time between the CC and IC items, p = 1.000, d = 0.057. Additionally, mean reaction time was statistically significantly faster on IC items than on II2 items, p = .029, d = 0.41, but there were no statistically significant differences between mean reaction times on the IC and II1 items, p = .253, d = 0.25 (Prediction 2b), or between the II1 and II2 items, p = 1.000, d = 0.16 (Prediction 3b). Together, these results suggest small to medium effects of number-congruency and support medium effects of operation-congruency for decimal numbers.
In the Group of Large Numbers, there was a statistically significant effect of Type, χ^{2}(3, N = 553) = 18.481, p < .001. However, results only partially supported Prediction 1b. There were statistically significant differences in mean reaction times between CC and II1 items, with faster responses on CC items, p = .007, d = 0.25 (Prediction 1b). There was no statistically significant difference in mean reaction time between the CC and IC items, p = 1.000, d = 0.11. Additionally, there were statistically significant differences in mean reaction times between IC and II1 items, p = .001, d = 0.37, and also between IC and II2 items, p = .030, d = 0.26, with faster responses on IC items (Prediction 2b). As with the other two groups of numbers, differences in mean reaction time between II1 and II2 items were not statistically significant, p = 1.000, d = 0.10 (Prediction 3b). Together, these results suggest limited support for a small effect of number-congruency for large numbers and small to medium effects of operation-congruency. Notably, there is no evidence of a familiarity effect; none was expected, since the numbers are outside of the multiplication table. Mean reaction times for multiplication items by type and group are presented in the Appendix, in Panel (a) of Figure A2.
Results for Reaction Time for Division Items
Table 5 shows participants’ mean reaction times for division items, for each group and type. There was no statistically significant effect of Type in the Group of Small Numbers, χ^{2}(3, N = 490) = 2.921, p = .404, or in the Group of Decimal Numbers, χ^{2}(3, N = 546) = 0.914, p = .822. However, there was a statistically significant effect of Type in the Group of Large Numbers, χ^{2}(3, N = 458) = 9.565, p = .023. In this group, the only statistically significant difference was between mean reaction times on IC and II1 items (p = .013, d = 0.32), with faster reaction times on the IC items than on the II1 items. This result partly supports Prediction 2b. Mean reaction times for division items by type and group are presented in the Appendix, in Panel (b) of Figure A2.
Table 5
Group / Item Type  M  SE  95% Wald CI LL  95% Wald CI UL
Group of Small Numbers  
CC (e.g., 56 ÷ _ = 8)  2390.05  128.475  2138.24  2641.85 
IC (e.g., 26 ÷ _ = 9)  2386.81  233.145  1929.85  2843.77 
II1 (e.g., 7 ÷ _ = 42)  2702.49  199.940  2310.62  3094.37 
II2 (e.g., 3 ÷ _ = 11)  2877.80  393.923  2105.72  3649.87 
Group of Decimal Numbers  
CC (e.g., 12.8 ÷ _ = 3.2)  2963.79  189.703  2591.98  3335.60 
IC (e.g., 7.5 ÷ _ = 4.3)  3054.95  228.722  2606.66  3503.24 
II1 (e.g., 4.3 ÷ _ = 8.6)  3050.05  230.782  2597.72  3502.37 
II2 (e.g., 3.2 ÷ _ = 11.7)  3161.36  267.176  2637.70  3685.01 
Group of Large Numbers  
CC (e.g., 292 ÷ _ = 4)  3587.94  272.445  3053.95  4121.92 
IC (e.g., 735 ÷ _ = 8)  2889.53  249.229  2401.05  3378.01 
II1 (e.g., 9 ÷ _ = 657)  3841.23  285.222  3282.20  4400.25 
II2 (e.g., 6 ÷ _ = 497)  3560.87  323.871  2926.10  4195.65 
In sum, in most cases, in line with Prediction 1b that responses on the number-congruent/operation-congruent items would be quicker, participants responded statistically significantly faster on the CC items than on II1 and II2 items. Also, in line with Prediction 2b that reaction times would be shorter on operation-congruent items, participants responded faster on IC items than on II1 and II2 items. In some cases, participants responded faster on IC than on CC items; however, these differences were not statistically significant. Also, participants’ mean reaction time was in most cases faster on the false familiar II1 items than on II2 items (in line with Prediction 3b); however, these differences were again not statistically significant. The patterns presented above were clear only for multiplication items in the Group of Small Numbers, were less clear in the Group of Decimal Numbers and the Group of Large Numbers, and were almost absent in all groups of division items.
Discussion
This study examined the effect of the NNB on operations between given and missing numbers, using accuracy and response time measurement to extend previous findings that suggest a dual effect of the NNB in arithmetic operations between missing numbers (Christou, 2015a, 2015b). The main hypothesis of the study was that the NNB affects evaluations about the validity of equations that present arithmetic operations between given and missing numbers in two main ways: a) a tendency to think that missing numbers are natural numbers; and b) a tendency to associate each operation with specific results independently of the numbers involved in the operations, i.e., larger results than the operand in multiplication and smaller results in division.
Based on the above hypotheses, the participants were expected to evaluate the validity of given equations (e.g., 7 ÷ _ = 42) using two main strategies: a) to mentally substitute different numbers to test the validity of the equations and b) to compare the operand and result. To test these assumptions, four types of equations were designed to be congruent or incongruent with the intuitive beliefs about the missing numbers and the size of the results of each operation, and were tested using three groups of numbers (i.e., small natural numbers, decimal numbers, large natural numbers). Within the different number groups, the familiarity effect was tested, using number combinations that appear in the multiplication table or seem to appear in the multiplication table (i.e., false familiarity).
Results supported the first aspect of the NNB, showing a large effect of number-congruency on participants’ evaluations for all groups of items, with higher accuracy rates on those items with missing natural numbers (i.e., the number-congruent, CC items), compared to those items that falsified this belief (i.e., all other items). These results support the main prediction of the study. The effect of number-congruency was larger in the Group of Small Numbers than in the Group of Decimal Numbers and the Group of Large Numbers. This suggests that participants may mentally test the effect of operations by substituting specific numbers for missing numbers, which under the influence of the NNB are mostly natural numbers; a finding that further supports previous research (Christou, 2015a, 2015b; Van Hoof et al., 2015). This strategy is also easier to apply effectively in the case of small number combinations than in other categories of tasks. Importantly, it may appear that these items preclude our ability to disentangle the mechanisms of arithmetic fact retrieval from the NNB, if students try to find the missing number (presumably by substituting natural numbers). From our perspective, however, these two mechanisms may not be separable. Multiplication fact retrieval relies on memorization of the multiplication table, which comprises natural number arithmetic. The emphasis on developing fact fluency for natural number arithmetic throughout the primary school years is among the factors that may strengthen the NNB phenomenon.
Results supported the second aspect of the NNB, an effect of operation-congruency on participants’ evaluations, more clearly for multiplication than for division. Specifically, the results showed higher accuracy rates for items that were in line with participants’ intuitive beliefs about the size of the results of multiplication and division (i.e., the operation-congruent, IC items), compared with items that falsified these beliefs and were not familiar (i.e., the operation-incongruent, II2 items). This is in line with the prediction regarding the second part of the dual effect hypothesis, and with previous research that reported the multiplication makes bigger misconception (Fischbein et al., 1985; Greer, 1994; Onslow, 1990; Vamvakoussi et al., 2012, 2013; Van Hoof et al., 2015). For accuracy rates, results for multiplication items supported our prediction that the operation-congruency effect would be larger in the Group of Large Numbers and the Group of Decimal Numbers than in the Group of Small Numbers. This shows that when strategies such as finding the missing number or recognizing number facts from the multiplication table are more difficult to apply, participants base their evaluations on their intuitions about the size of the results expected from each arithmetic operation. However, for division items, the operation-congruency effect was larger in the Group of Small Numbers than in the Group of Decimal Numbers and the Group of Large Numbers. This result requires further investigation, such as by coupling accuracy and reaction time data with participant interviews about their problem-solving strategies.
Further support for the notion that participants may base their evaluations on trying to find the missing number, when this is possible, came from higher accuracy rates on operation-incongruent items that appeared familiar because the missing number was a unit fraction (i.e., II1 items) than on operation-congruent items that were in line with intuitive beliefs about the size of the results of operations (i.e., IC items). This held in all cases except large number multiplication items, a result counter to our predictions. Most often these differences were not statistically significant; however, they suggest that when the tasks were operation-incongruent but appeared familiar, participants tended to respond with even higher accuracy than on tasks where the size of the results of the operations was in line with their intuitive beliefs. In those cases, the effect of operation-congruency was stronger in the Group of Small Numbers than in the Groups of Decimal and Large Numbers, in both multiplication and division. These results further support the above interpretation that participants may base their evaluations on number substitution or task familiarity, rather than on the effect of the operation, and that this is a more effective strategy in the case of small numbers.
Statistically significant differences between accuracy rates on items that falsified intuitions about number- and operation-congruency (i.e., the II1 and the II2 items) further support this interpretation. As predicted, accuracy was higher on items that appeared familiar (i.e., the II1 items) than on items that did not (i.e., the II2 items), for both multiplication and division. Again, this effect of false familiarity was stronger in the Group of Small Numbers than in the other two groups, for both operations; however, the effect sizes were smaller than that of the number-congruency effect.
For reaction time, only multiplication items with small numbers supported the above interpretations; results were less clear for other multiplication items and groups, and for all division items and groups. In most cases, the participants responded statistically significantly faster on number-congruent and operation-congruent (i.e., CC) items than on the number-incongruent and operation-incongruent (i.e., the II1 and the II2) items. Also, as predicted, participants responded faster on items that were aligned with their intuitions about the size of the result of the operations (i.e., the IC items) and slower on items that falsified these intuitions (i.e., the II1 and the II2 items). There was also a familiarity effect for items that falsified intuitions about the effect of operations. In line with our predictions, participants’ responses were most often faster when items appeared familiar (i.e., II1 items) than when they did not (i.e., II2 items). Interestingly, for items that aligned with intuitions about the size of the results of operations, in most cases participants responded more slowly when a natural number was missing (i.e., on the CC items) than when a non-natural number was missing (i.e., the IC items). These differences were not statistically significant; however, it is possible that there was not sufficient statistical power to detect these effects, due to large variation in reaction times overall. Speculatively, the differences may suggest that participants spend time confirming that a specific natural number is missing, yet do not do so when the missing number is not natural. This hypothesis could be tested in subsequent studies.
Overall, the results of the study support the dual effect of the NNB in arithmetic operations between missing numbers, providing further empirical support to previous findings (Christou, 2015a, 2015b; Obersteiner et al., 2016; Vamvakoussi et al., 2012, 2013; Van Hoof et al., 2015). Results also suggest that when small natural number combinations are presented with missing numbers, participants are inclined to mentally substitute natural numbers to decide whether the expression can be true, and thus they are more susceptible to number-congruency than operation-congruency effects, and to familiarity effects, especially in multiplication. However, when it is more difficult to trace the missing natural number, and the effect of familiarity is less strong, such as with decimal number or large number combinations, students base their evaluations on intuitions about the size of results from multiplication and division, as a more effective strategy. This tendency appeared when natural numbers were missing and when unit fractions were missing. The latter may have created an effect of false familiarity with multiplication facts, suggesting that students may directly retrieve facts from memory, as suggested in previous studies (Krueger, 1986; Krueger & Hallford, 1984; Stazyk et al., 1982).
Limitations
To focus on testing the dual effect of the NNB, this study maximized experimental items to examine number- and operation-congruency effects, and had relatively fewer distractor items (since distractor items must contain zero as an operand or result). As a result, to answer correctly, participants needed to indicate more often that the expression was possible than impossible. However, given the observed accuracy rates, participants evaluated many of the expressions to be impossible, suggesting item imbalance was not an issue.
Another potential limitation was that students could still respond using general insights about the solvability of linear equations. There may be contexts in which this strategy of applying a general insight (understanding that all expressions are possible as long as they do not involve a 0) is frequently relied on. For instance, Obersteiner et al. (2016) observed this in mathematical experts. In our study, missing number symbols rather than literal symbols were used specifically to discourage this strategy. Use of the general insight strategy in our current study seems unlikely; there were no ceiling effects, no systematically fast reaction times on all items, and no indications that participants figured out this strategy during the course of the experiment. Specifically, only one student gave the correct response for all items, and only 10% of participants scored higher than 95. From our perspective, these results show that even if some students used either of the two strategies (i.e., either solved the equations, or evoked knowledge of equation solvability), they did not do so systematically. This implies that although they presumably recognized that knowledge of equations was relevant to the tasks at hand, they failed to use this knowledge in all items. This could be interpreted as an instance where the intuitive, “biased” response (Leron & Hazzan, 2006) overrode the analytic one, perhaps because students felt more confident about it. Future studies that incorporate interviews could shed more light on the actual strategies that participants tend to use in each category of items, as the present study did not collect strategy information.
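The general insight described above can be made concrete with a minimal sketch. It assumes items of the form a × _ = b or a ÷ _ = b; the function name `missing_number` and the example values are hypothetical illustrations and are not items or code from the study. Exact rational arithmetic shows that a missing number exists whenever zero does not block the equation, although it is often not a natural number.

```python
from fractions import Fraction

def missing_number(a, b, op):
    """Solve 'a op _ = b' for the missing number (hypothetical helper).

    Returns the unique rational solution, or None when there is no
    unique solution because of a zero.
    """
    a, b = Fraction(a), Fraction(b)
    if op == "*":
        if a == 0:
            return None   # 0 * x = b: no solution, or every x if b is 0
        return b / a      # a * x = b  ->  x = b / a
    if op == "/":
        if a == 0 or b == 0:
            return None   # a / x = b: no solution, or no unique one
        return a / b      # a / x = b  ->  x = a / b
    raise ValueError("op must be '*' or '/'")

# Illustrative values only (not taken from the study's item set).
print(missing_number(3, 12, "*"))  # natural missing number
print(missing_number(8, 2, "*"))   # unit fraction as missing number
print(missing_number(2, 8, "/"))   # division that "makes bigger"
print(missing_number(0, 5, "*"))   # impossible because of the zero
```

A participant relying on this insight would accept every item that does not involve zero without computing anything, which is the response pattern the accuracy data argue against.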
Finally, a limitation of this research approach generally is that it involves rating equations only in a purely symbolic form. Performance on mathematical tasks can be highly context dependent (e.g., Saxe, 2015), and so can knowledge of mathematical principles (Prather & Alibali, 2008). Along this line, someone could argue that thinking of the unknown number as a decimal number, for example, is more difficult in a context where all the other numbers in the equation are natural than when decimal numbers appear in the same equation. Indeed, in the present study, the accuracy rates on the tasks that were number-incongruent but operation-congruent were slightly higher in the Group of Decimal Numbers than in the Group of Small Numbers. However, since participants are exposed to natural and decimal numbers throughout the experiment, this potential difference in difficulty should be eliminated or greatly reduced for any specific equation. Additionally, from the NNB perspective that we endorse, such an effect of context would further support our main claim about an intuitive/analytic distinction between the cognitive processes that underlie reasoning with arithmetic operations. In other words, no context effect should appear if participants’ responses were not affected by the NNB, considering that the participants have been exposed to decimal numbers and fractions for many years throughout schooling. Support for this position comes from previous studies which have shown that the NNB may affect how students apply different properties to the different symbolic representations of rational numbers, i.e., when they appear as decimals or as fractions (for a detailed discussion see Vamvakoussi & Vosniadou, 2010). However, participants may still perform differently if tasks are presented in verbal or problem-solving form. As such, the potentially context-dependent nature of performance on these tasks requires further examination.
Also, further studies should test whether the reported effects could be education dependent, by testing how performance on the tasks might differ between adults with more and less formal education or with younger students.
Implications
The results of the present study provide valuable information about the cognitive processes that underlie reasoning with arithmetic operations and the role of prior natural number knowledge in these processes. Results further support that the NNB contributes to the effect of number size on students’ evaluations of the validity of multiplication and division equations, which further suggests that classroom instructors will need to address the NNB. However, addressing the NNB in instruction is a complex endeavor that requires remedial and anticipation approaches (Vamvakoussi et al., 2018; Van Dooren et al., 2015).
As an initial step, teachers should learn that students hold intuitive ideas about various mathematical topics, including the size and the type of the results of operations, since preservice teachers are insufficiently aware of such issues (Depaepe et al., 2015; Depaepe et al., 2018). Students should also learn about their intuitions, since these beliefs are often implicit and not under their conscious control (Fischbein, 1987). However, merely challenging students’ erroneous beliefs, without interventions that raise awareness about the discrepancy between incorrect beliefs and the mathematically correct perspective, is not enough for students to abandon their alternative conceptions and accept the mathematically correct knowledge (Merenluoto & Lehtinen, 2004).
Further, merely raising awareness may not be enough. For example, students who can verbalize the fact that non-natural numbers can be substituted for missing number symbols or literal symbols may still substitute natural numbers only, even after hints from interviewers (Christou & Vosniadou, 2012; Van Hoof et al., 2015). Along the same line, merely changing the missing number symbol to x in the given equations (rather than the ‘_’ that was used) would be unlikely to solve the NNB issue. That is because by following an equation-solving process (mentally or not), the participants may determine the non-natural missing number (i.e., the solution). This, however, would draw on cognitive processes different from students’ intuitive responses that are affected by the NNB. Therefore, as has been argued before (Christou, 2015b; Dimitrakopoulou & Christou, 2018; Vamvakoussi et al., 2013), for students to overcome their intuitions about the effects of arithmetic operations, they need explicit strategies to inhibit natural number knowledge interference (Moutier & Houdé, 2003; Roell et al., 2017, 2019; Van Dooren & Inglis, 2015). One inhibition strategy in missing number tasks is to always try at least one non-natural number (a negative number or a positive rational number smaller than 1); another is to always recall that multiplication does not always make numbers bigger.
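As a small illustration of the first inhibition strategy, the sketch below tests the intuition that “multiplication makes bigger” against a set of candidate numbers that deliberately includes a positive rational smaller than 1 and a negative number. The function name `falsifying_candidates` and the chosen values are assumptions made for this example only; they do not come from the study’s materials.

```python
from fractions import Fraction

def falsifying_candidates(a, candidates):
    """Return the candidates x for which a * x is NOT bigger than a.

    Hypothetical helper: any returned value is a counterexample to the
    intuition that multiplying a always gives a bigger result.
    """
    a = Fraction(a)
    return [x for x in map(Fraction, candidates) if a * x <= a]

# Try one natural number, one positive rational below 1, one negative number.
candidates = [3, Fraction(1, 2), -2]
for x in falsifying_candidates(6, candidates):
    print(f"6 * {x} = {6 * x}, which is not bigger than 6")
```

Trying only the natural number 3 would leave the intuition unchallenged; the two non-natural candidates are the ones that falsify it.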
Lastly, the design of the items presented in the study may be fruitfully used and empirically tested as educational materials that could illuminate and falsify students’ intuitive beliefs about the size of the results of operations. These items could be used in constructivist teaching environments to raise students’ awareness about the familiarity of multiplication facts due to natural number calculations. Students can learn that the tendency to think that calculations between missing numbers hold only between natural numbers could create further constraints on successfully completing mathematical tasks, such as solving equations.