Natural Number Bias in Arithmetic Operations With Missing Numbers – A Reaction Time Study

When reasoning about numbers, students are susceptible to a natural number bias (NNB): when reasoning about non-natural numbers, they apply properties of natural numbers that do not hold. The present study examined the NNB when students are asked to evaluate the validity of algebraic equations involving multiplication and division, with an unknown, a given operand, and a given result; numbers were either small or large natural numbers, or decimal numbers (e.g., 3 × _ = 12, 6 × _ = 498, 6.1 × _ = 17.2). Equations varied on number congruency (unknown operands were either natural or rational numbers), and operation congruency (operations were either consistent – e.g., a product is larger than its operand – or inconsistent with natural number arithmetic). In a response-time paradigm, 77 adults viewed equations and determined whether a number could be found that would make the equation true. The results showed that the NNB affects evaluations in two main ways: a) the tendency to think that missing numbers are natural numbers; and b) the tendency to associate each operation with a specific size of result, i.e., that multiplication makes bigger and division makes smaller. The effect was larger for items with small numbers, likely because these number combinations appear in the multiplication table, which is automatized during primary education. This suggests that students may rely on direct fact retrieval from memory when possible. Overall, the findings suggest that the NNB led to decreased student performance on problems requiring rational number reasoning.

than 1 results in answers that are smaller than at least one operand (e.g., 4 × 0.2 = 0.8) and division with rational numbers less than 1 results in answers that are larger than the operands (e.g., 8 ÷ 0.4 = 20).
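The size reversal described here follows from a one-line ordering argument, sketched below for a positive given number a and a missing number x between 0 and 1 (the symbols a and x are introduced only for this illustration). The second line restates the two numerical examples from the text.

```latex
\[
a > 0,\; 0 < x < 1
\;\Longrightarrow\;
a \times x < a
\quad\text{and}\quad
a \div x = a \cdot \tfrac{1}{x} > a .
\]
\[
4 \times 0.2 = 0.8 < 4, \qquad 8 \div 0.4 = 20 > 8 .
\]
```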
Students' years of experience working with natural numbers may create strong intuitions about the results that can be expected from arithmetic operations, such as that multiplication makes larger (Greer, 1994) or that addition makes larger (Dixon et al., 2001). From a framework theory approach to conceptual change, students construct an initial understanding of the number concept very early on. Based on their experience with counting and the sequence of number words, this understanding acquires the characteristics of the mathematical concept of natural numbers (Gelman, 2000; Smith et al., 2005; Vamvakoussi, Christou, & Vosniadou, 2018; Vosniadou, Vamvakoussi, & Skopeliti, 2008). This initial number concept, which is organized into a framework theory for number, may shape students' beliefs, interpretations, and anticipations about the properties of numbers. From this perspective, the influence of the initial number concept when reasoning with more advanced numbers such as the rational and the real numbers may result in the NNB phenomenon and lead to systematic errors in rational number tasks where the properties of natural numbers do not hold (Vamvakoussi et al., 2018). Importantly, the initial number conception remains even after learners have acquired the correct mathematical knowledge about rational numbers and are, in principle, able to solve rational number tasks, so natural number knowledge may still interfere (see also Shtulman & Valcarcel, 2012; Vosniadou et al., 2008). Support for this idea, and specifically for the NNB with arithmetic operations, comes from studies using accuracy and reaction time measurements.
If properties of natural numbers come to mind first, even in learners who have the correct rational number knowledge, responses to arithmetic equations that are aligned with natural number knowledge (called congruent items) would be quicker and more accurate than responses to incongruent items, in which natural number knowledge leads to a false conclusion. This is because a learner has to overcome the initial idea using analytical reasoning, which may not always succeed and which in turn produces longer response times and fewer accurate responses. These response patterns align with dual process theories that rely on the distinction and interaction between intuitive and analytical reasoning processes (Gillard, Van Dooren, Schaeken, & Verschaffel, 2009; Vamvakoussi et al., 2013). Several reaction time studies with student and adult populations support the theories of co-existence and intuitive interference of natural number knowledge by showing statistically significant differences in accuracy rates and response times between tasks that were in-line with intuitions about the results of arithmetic operations and tasks that falsified these intuitions (Obersteiner, Van Hoof, Verschaffel, & Van Dooren, 2016; Vamvakoussi et al., 2012; Van Hoof et al., 2015).
the operations, but the missing numbers were rational numbers larger than 1 instead of natural numbers (e.g., 3 × _ = 11); and c) tasks in which the magnitude of the results of the operations falsified students' intuitions that multiplication makes bigger and division makes smaller (e.g., 3 × _ = 2), and in which the missing number was a rational number smaller than 1. In all those studies (Christou, 2015a, 2015b, 2017), students had statistically significantly higher accuracy rates for the items aligned with their intuitions about the size of the results of each operation than for items that falsified these intuitions.
Additional studies have shown similar performance differences (Obersteiner et al., 2016;Vamvakoussi et al., 2013;Van Hoof et al., 2015).
Importantly, these results also indicated that students tend to think of the missing numbers in the items as natural numbers. Students performed significantly better when the missing number was a natural number than when it was a rational number, even when their intuitions about the results of the operations were not violated (Christou, 2015a, 2015b, 2017). Students also treat missing numbers as natural numbers when evaluating algebraic expressions that contain literal symbols. Christou and colleagues found that students tend to substitute mostly natural numbers for literal symbols when evaluating algebraic expressions with operations between numbers and literal symbols (Christou & Vosniadou, 2012; Christou, Vosniadou, & Vamvakoussi, 2007). For example, students thought that k + 3 represented only natural numbers larger than three. Additionally, in an interview study with tenth graders, the majority of the students claimed that 5d is always bigger than 4/d because multiplication makes numbers bigger than division does. Most students supported this claim by substituting natural numbers for the literal symbols, despite hints from the interviewer to also try other kinds of numbers (Christou & Vosniadou, 2012). Similarly, Van Hoof and colleagues (2015) found that interviewed students explicitly referred to general rules about the results of the operations that are valid only for natural numbers, or substituted natural numbers for literal symbols to arrive at an answer.
Strictly speaking, thinking of missing numbers or literal symbols as natural numbers only would not fall under the definition of the NNB as relying on the properties of natural numbers when reasoning about rational numbers. However, it would indicate the dominance of natural numbers in students' thinking in numerical situations, and as such could be seen as another instance of the NNB. Thus, these studies suggest a dual effect of the NNB: intuitions about results of arithmetic operations and intuitions that missing numbers or literal symbols represent natural numbers.

The Present Study
The present study seeks to further examine and disentangle the dual effect of the NNB by measuring participants' accuracy and reaction times as they make evaluations about whether arithmetic equations with a missing operand could be true or not (e.g., Is there a number such that 3 ÷ _ = 12?). In these tasks, the NNB may affect participants' evaluations about the validity of such equations in two main ways. First, it would affect their strategy to check the validity of the equations by mentally substituting specific numbers for the missing number symbols. Under the influence of the NNB, participants may disproportionately substitute natural numbers. Second, participants may make evaluations based on intuitive expectations about the size of the results of operations (e.g., that multiplication should provide a larger outcome than the operand numbers), as shown in prior studies with literal symbols as missing numbers (Obersteiner et al., 2016).
The study has three aims.
1. As reviewed above, prior studies have focused on interviews and paper and pencil tasks that measure student accuracy. This study tests the dual effect assumption by measuring a different aspect of student reasoning: timed evaluations about missing number operands in arithmetic equations. This allows us to judge whether the NNB effect is still present, even when participants give correct responses.
2. It disentangles the two aspects of the NNB that may appear in tasks for which finding the missing number is more or less difficult.
3. It seeks to clarify whether participants' tendencies to correctly evaluate the validity of statements like 3 × _ = 12 relate to a trial-and-error mental process with specific natural numbers, or rather relate to retrieval of multiplication facts from long term memory.
Based on the dual aspect of the NNB in operations with missing numbers, participants may evaluate the validity of equations such as 3 ÷ _ = 12 using two main strategies. First, participants who are affected by the NNB may tend to evaluate equations with larger results in multiplication and smaller results in division as possible, and equations with larger results in division and smaller results in multiplication as impossible. Second, some participants may try to find the specific missing number that makes the expression possible in a trial-and-error process. In the last two groups of items, the missing number is more difficult to find than in the first group, even when in all cases the missing number is a natural number. As we explain below, differences in participants' accuracy and reaction time among the three groups of items would provide further insights about the underlying cognitive processes when reasoning about the results of such arithmetic operations.
First, an important reason to include a set of items involving decimal numbers is to investigate a context effect.
If students tend to think of an unknown number as a natural number in the first place, this tendency may be stronger when all numbers that are given in the equation are natural too. Students may be reminded to think of the unknown number as possibly non-natural when the equations contain decimals. Indeed, recent findings from an empirical study which used the same design of tasks in a paper and pencil condition showed that students are affected by the types of numbers (either natural or decimals) that appear in arithmetic operations (Christou, 2017).
Second, the three sets of items are all needed to achieve the third aim: to clarify whether participants' tendency to correctly evaluate the validity of statements like 3 × _ = 12 reflects trial-and-error or familiarity with multiplication facts. To achieve this, this study included arithmetic combinations that would be familiar to participants because specific arithmetic facts come from the multiplication table or because the arithmetic facts were easily confused with familiar arithmetic combinations (a false familiarity, e.g., 3 ÷ _ = 12). Krueger (1986) showed that when there is very good verbatim memory for particular multiplication equations, participants rely not only on a plausibility evaluation of the given statements, but also on direct fact retrieval from memory (see also Stazyk, Ashcraft, & Hamann, 1982). To test for plausibility evaluation and retrieval mechanisms, the current study included arithmetic equations with small number combinations that appear in the multiplication table and thus draw on strong verbatim memory, and equations with decimals and large natural numbers where there is less or no verbatim memory to support specific evaluations. For the Group of Decimal Numbers and the Group of Large Numbers, participants may rely only on exact calculations in a mental trial-and-error process, or on plausibility evaluations, which are based on operational patterns (see Prather & Alibali, 2008) that may determine the expected results from each operation independently of the numbers involved. Regarding the first strategy, calculations are more difficult with large numbers than with small numbers (Ashcraft, 1992); the second strategy may be affected by the NNB.
Finally, as part of the third aim, this study tested the effect of false familiarity on participants' evaluations of equations that appeared familiar, but were not. In these cases, a false result may be difficult to reject because the equation is similar to a true equation with a different operation. For example, in a cross-operation confusion condition (Krueger & Hallford, 1984), people find it difficult to reject equations such as 7 + 1 = 6, which would be true if the same numbers appeared under subtraction (Ashcraft & Battaglia, 1978), and 3 ÷ 3 = 9, which would be true with multiplication (Winkelman & Schmidt, 1974). Similar research has shown that participants tend to think that 16 ÷ 32 = 2 is correct (Bell et al., 1981). Additionally, Stazyk and colleagues (1982) reported slower response times and increased error rates in confusing problems such as 7 × 4 = 21 or 7 × 4 = 35 than in non-confusing problems (e.g., 7 × 4 = 18). Confusion problems were characterized as those with answers adjacent or near the correct answer in the multiplication table, that is, answers differing ± 1 in one of the operands. Both confusion effects stemmed from relatedness among the numbers in the memory representation of multiplication facts. To test for the effect of false familiarity, the current study included a category of items in which the missing numbers were unit fractions. These equations falsified participants' intuitions about the size of the results of multiplication and division with the missing number. For example, in the equation 7 ÷ _ = 42, the missing number is 1/6, which makes the equation resemble 7 × _ = 42 (where the missing number is 6), an equation that appears in the multiplication table.
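The construction of a false-familiar item from a multiplication-table fact can be sketched as follows. This is a minimal illustration, not the authors' item-generation procedure: the function name `false_familiar_items` is hypothetical, and the multiplication variant (e.g., 42 × _ = 7) is an assumption extrapolated from the division example in the text.

```python
from fractions import Fraction


def false_familiar_items(a, c):
    """Build unit-fraction items from the table fact a × c = b.

    Returns two tuples (given, operation, result, missing_number).
    The division item mirrors the text's example 7 ÷ _ = 42; the
    multiplication item is an assumed counterpart, not taken from the source.
    """
    b = a * c
    missing = Fraction(1, c)  # the unit fraction that makes both items true
    division_item = (a, "÷", b, missing)        # a ÷ (1/c) = a × c = b
    multiplication_item = (b, "×", a, missing)  # b × (1/c) = a
    return division_item, multiplication_item


div_item, mul_item = false_familiar_items(7, 6)
```

For the fact 7 × 6 = 42, the division item is 7 ÷ _ = 42 with missing number 1/6, which is the example discussed above.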
In sum, the current study examined accuracy and reaction time for evaluations about arithmetic equations with missing operands. Items were in-line with or violated participants' intuitions about the size of the results of the given operations (operation-congruent/incongruent items) and also about whether only natural numbers could be substituted for the missing number symbols (number-congruent/incongruent items). Also, differences in the effect of each aspect of the NNB were tested by comparing participants' performance across three groups of items that were designed with the above characteristics: The Group of Small numbers, the Group of Decimal Numbers, and the Group of Large Numbers. In order to account for the effect of false familiarity, participants' performance was examined with number-incongruent and operation-incongruent items in which the missing number was a unit fraction.

Stimuli
There were 108 stimuli, consisting of 72 experimental and 36 distractor items. Each item consisted of a multiplication or division equation with a given number and a missing number (e.g., 3 × _ = 12). Missing number symbols were used instead of literal symbols (e.g., x) to discourage participants from applying a general insight about the solvability of linear equations, such as the general solution x = b/a for any equation of the form a × x = b, a ≠ 0. For each item, participants were asked to evaluate whether it is possible for the equation to be true or not, i.e., whether it would be possible to find a missing number that would provide the given result, without the need to actually find the missing number.
Examples of each type are presented in Table 1.

The Three Groups of Items
Items were organized into three Groups, based on whether small numbers, decimal numbers, or large numbers were operands and outcomes. The Group of Small Numbers included items with small natural number combinations (smaller than 100). Participants are more familiar with these numbers since they are used in the classroom and in everyday life. Multiplication and division between numbers smaller than 100 also appear in the multiplication table, which participants have used extensively throughout schooling. The Group of Decimal Numbers consisted of multiplication and division equations in which the operand and the outcome were decimal numbers. Lastly, the Group of Large Numbers included number combinations greater than those in the multiplication table. Verbatim memory could support participants' evaluations with small numbers, but not with decimals or large numbers.

The Four Types of Items
Number-congruent items were equations in which a missing natural number would make the equation true.
Number-incongruent items were equations in which a missing rational number would make the equations true. The operation-congruent items were aligned with properties of natural number arithmetic: multiplication equations with a larger outcome or division equations with a smaller outcome. Operation-incongruent items were either multiplication equations with a smaller outcome or division equations with a larger outcome. Both types of equations were true.

Combinations of the item characteristics produced four types of items, per group, that differed in their number-congruency and operation-congruency. First, there were items that were number-congruent and operation-congruent (e.g., 3 × _ = 12); these items were called CC (abbreviation for Congruent/Congruent). In the Group of Small Numbers, the CC items were all part of the multiplication table, and would be familiar to the participants. The remaining item types were number-incongruent because substituting missing number symbols with a natural number would not make the equation true; thus, substituting only natural numbers would lead participants to the incorrect conclusion that the expression was not true. Second, some number-incongruent items were operation-congruent, in which multiplication produced bigger outcomes and division smaller outcomes; for example: 4 × _ = 31. These items are called IC (abbreviation for Incongruent/Congruent).
The third and fourth categories of items were both number-incongruent and operation-incongruent. For these items, the missing number was a non-natural number and the size of the results of the operations was not in-line with natural number arithmetic. Thus, both item types are II (abbreviation for Incongruent/Incongruent). In the first category of II items (named: II1) the missing number was a unit fraction. The II1 item number combinations may have appeared familiar to participants because they evoke number combinations that appear in the multiplication table, but these items had an inversion in the operand and the output. Thus, these items were false familiar because, for example, 7 ÷ _ = 42 evokes '6' as the missing number, but the correct answer is 1/6; the equation appears familiar because 42 ÷ 6 = 7. In the second category of II items (named: II2), there was a missing rational number smaller than one, which resulted in an incongruent outcome size for each operation, i.e., smaller than the given operand in cases of multiplication and larger in cases of division (e.g., 55 × _ = 8, in which case the unknown number is 8/55, or approximately 0.1455). There were 36 items for each Group: 18 items per operation, with three items of each Type, plus 6 distractor items. Note that there were no items that were number-congruent and operation-incongruent. In this case, the operation would need to have the opposite effect (multiplication makes smaller; division makes bigger), but with a natural number missing, such items are mathematically impossible to create.
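The four item types can be expressed as a short classification rule. The sketch below is an illustration under stated assumptions rather than the authors' stimulus-construction code: the function names are hypothetical, the given number and result are assumed to be positive, and decimals should be passed as strings (e.g., "3.1") so that they convert exactly.

```python
from fractions import Fraction


def missing_operand(given, op, result):
    """Solve given × x = result or given ÷ x = result for x (exact arithmetic)."""
    given, result = Fraction(given), Fraction(result)
    if op == "×":
        return result / given
    if op == "÷":
        return given / result
    raise ValueError(f"unsupported operation: {op!r}")


def item_type(given, op, result):
    """Classify an equation as 'CC', 'IC', 'II1' or 'II2'.

    Returns None for number-congruent but operation-incongruent cases,
    which the text notes do not occur in the design.
    """
    g, r = Fraction(given), Fraction(result)
    x = missing_operand(g, op, r)
    is_natural = x.denominator == 1 and x >= 1
    # Operation-congruent: multiplication makes bigger, division makes smaller.
    op_congruent = r > g if op == "×" else r < g
    if is_natural:
        return "CC" if op_congruent else None
    if op_congruent:
        return "IC"
    # Both incongruent: unit fractions are the false-familiar II1 items.
    return "II1" if x.numerator == 1 else "II2"
```

Applied to the examples in the text, `item_type(3, "×", 12)` gives CC, `item_type(4, "×", 31)` gives IC, `item_type(7, "÷", 42)` gives II1, and `item_type(55, "×", 8)` gives II2.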

Distractor Items
Since all experimental items had a number that would make the equation true, there were also distractor items that had no solution, and thus, for these items the correct response was that the equation could not be true.
Distractors were arithmetic equations with zero as an operand or outcome (see Table 1), since multiplication by zero always results in zero and division of a nonzero number can never have zero as a quotient. Each group had 12 distractor items.
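The solvability rule behind the distractors can be sketched as a small check. The function name `can_be_true` is hypothetical, and the example equations mentioned afterwards are illustrative assumptions, since the actual distractor items appear only in Table 1.

```python
def can_be_true(given, op, result):
    """Return True if some number in the blank makes the equation true."""
    if op == "×":
        # given × x = result has the solution x = result / given unless
        # given is 0, in which case the product is always 0.
        return given != 0 or result == 0
    if op == "÷":
        # given ÷ x = result requires a nonzero x.
        if given == 0:
            return result == 0  # 0 ÷ x is 0 for every nonzero x
        return result != 0      # a nonzero dividend never yields quotient 0
    raise ValueError(f"unsupported operation: {op!r}")
```

Under this rule, an equation such as 0 × _ = 5 or 5 ÷ _ = 0 has no solution, whereas every experimental item (e.g., 7 ÷ _ = 42) does.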

Procedure
Participants completed the experiment in one 30-minute session in a classroom on a computer. Participants provided demographic information (i.e., age, gender) and then completed the experimental task. Stimuli were presented using E-prime. The items were presented in three blocks, with a break between each block. Each block included the four item types for one group. The order of the blocks was randomized.
Each equation was displayed with the question: Can this be true? Y/N. Participants responded by pressing Y for Yes and N for No. Participants were told that they did not need to find the missing number, only to respond whether a number exists that would make the equation true, as quickly and accurately as possible. Items were displayed until response and were preceded by a 500 ms fixation cross. Response time and accuracy were recorded for each trial.

Predictions
We had several predictions for participants' response patterns to the four item types. If the participants' responses were based on mentally substituting natural numbers, there would be greater accuracy (Prediction 1a) and shorter reaction time (Prediction 1b) on the number-congruent/operation-congruent items (i.e., CC) than on the number-incongruent/operation-congruent items (i.e., IC). Additionally, if participants based their answers on the effect of operation, then there would be higher accuracy (Prediction 2a) and shorter reaction time (Prediction 2b) on operation-congruent items (i.e., the CC and IC items) than on operation-incongruent items (i.e., the II1 and II2 items). If participants were affected by the familiarity of the number combinations from the multiplication table, there would be higher accuracy (Prediction 3a) and shorter reaction time (Prediction 3b) on false familiar items (II1 items) than on the other number-incongruent/operation-incongruent items (II2 items), even though they are both number and operation-incongruent types of items.
Predictions 1 to 3 as described so far apply to all three Groups of items. They are the core predictions of our study, as they specifically refer to the way the items are solved: by mentally substituting (specific) natural numbers or by judging the effect of the given operation. However, we further wanted to explore differential effects for the three Groups of items, which may further corroborate the underlying mechanisms. For instance, for the Group of Decimal Numbers and the Group of Large Numbers, we predict that the effect of number-congruency will be weaker than in the Group of Small Numbers, because small number combinations are more easily memorized (Prediction 4). Especially for missing natural numbers (CC items), it is easier to determine the missing number in the Group of Small Numbers, and in some cases in the Group of Decimal Numbers (e.g., the missing number 5 for 3.1 × _ = 15.5), than in the Group of Large Numbers (e.g., 6 × _ = 498). Therefore, participants may tend to think less of a specific number for equations with decimal numbers, and even less for equations with large numbers, than for equations with small numbers. Instead, for decimal and large number combinations, participants may rely more on the operation-congruency effect (i.e., whether there are larger outcomes for multiplication and smaller for division). Thus, a stronger effect of operation-congruency in the Group of Large Numbers and the Group of Decimal Numbers would be expected, compared to the Group of Small Numbers (Prediction 5). For the same reasons, it would be expected that the effect of false familiarity would be stronger in the Group of Small Numbers than in the other groups (Prediction 6).

Data Analysis
Distractor items were excluded from all analyses. Responses for multiplication and division items were analyzed separately, since the notions that multiplication makes bigger and division makes smaller may have differential effects. A separate analysis therefore provides a clearer picture of the phenomenon at hand for each operation. For each of the three groups, and again separately for multiplication and division items, we calculated outliers as trials that were more than three standard deviations above or below the average reaction time. Excluding distractor items, 18 of 3327 correct trials (0.5%) were excluded in total.
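The three-standard-deviation exclusion rule can be sketched as follows. This is an illustration only: the function name `trim_outliers` is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used. In the design described above, the function would be applied separately to the reaction times of each group and operation.

```python
from statistics import mean, stdev


def trim_outliers(reaction_times, n_sd=3.0):
    """Keep reaction times within n_sd standard deviations of the mean.

    Assumes at least two observations and uses the sample standard
    deviation (an assumption; the source does not specify the estimator).
    """
    centre = mean(reaction_times)
    spread = stdev(reaction_times)
    lower, upper = centre - n_sd * spread, centre + n_sd * spread
    return [rt for rt in reaction_times if lower <= rt <= upper]
```

Note that with very few trials in a cell no value can lie more than three sample standard deviations from the mean, so the rule only has an effect when cells contain enough observations.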
The Generalized Estimating Equations (GEE) module in SPSS was used, with logistic regression to model accuracy (i.e., correct = 1 and incorrect = 0) and linear regression to model reaction time. The GEE approach accounts for the dependence of repeated measurements within subjects. To test the effects of number-congruency and operation-congruency, the GEE analyses were applied to accuracy rates and to reaction times across the four types of items, separately for each of the three Groups of items and each operation. For both outcomes, Bonferroni-adjusted pairwise comparisons were used to test for statistically significant differences between item types. For accuracy rates, odds ratios were calculated for these pairwise comparisons; for reaction times, Cohen's d was used. To test for differences within each item type across the three groups, GEE analyses were also applied across the groups for each type. In this case, odds ratios were compared to account for differences in the effect of number and operation-congruency between the different groups of number combinations.
which suggests an effect of familiarity (Prediction 3a). Accuracy rates for multiplication items by type and group are visually presented in the Appendix, in Panel (a) of Figure A1. In sum, for large numbers, the effect of number-congruency was largest, followed by the effect of operation-congruency, and the effect of familiarity was the smallest. However, the effect of number-congruency appeared smaller for large numbers than for small numbers and decimal numbers.
smaller effects for familiarity and operation-congruency. Accuracy rates for division items by type and group are visually presented in the Appendix, in Panel (b) of Figure A1.

The Effect of Division for Each Group of Items
In sum, the above results supported Prediction 1a about the effect of number-congruency on participants' accuracy rates, since accuracy was statistically significantly higher on the number-congruent items (i.e., CC) than on the number-incongruent items (i.e., IC, II1, II2), in all groups of items, in multiplication and also in division. Results suggest mixed support for Prediction 2a that participants would tend to base their answers on the effect of operation and there would be higher accuracy for operation-congruent than operation-incongruent items. For all groups of items there were higher accuracy rates for IC items than II2 items in both operations, which supports Prediction 2a about the effect of operation-congruency. However, according to Prediction 2a, IC items should also have higher accuracy than II1 items, but this was only the case in the Group of Large Numbers in multiplication. Lower accuracy on operation-congruent IC items than on operation-incongruent II1 items suggests a familiarity effect on participants' evaluations about the validity of the multiplication and division equations. In-line with this interpretation, results support Prediction 3a, that there would be higher accuracy for false familiar items: accuracy was higher on II1 items than on II2 items, although both item types are operation-incongruent. The comparison of odds ratios showed that, for both operations, the number-congruency effect was consistently the largest, relative to the operation-congruency effect and the effect of familiarity, and these effects were larger for the groups of small and decimal numbers than for the group of large numbers.
To further explore the above results, a GEE model was applied to accuracy rates across the three Groups for selected pairs of item types (i.e., comparing CC and IC; IC and II1; IC and II2; and II1 and II2). Comparing the odds ratios indicates possible differences between the number-congruency and operation-congruency effects in each Group of items.
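For orientation, the two effect-size measures used in these comparisons can be sketched in their textbook forms. This is not the procedure reported in the study: the odds ratios there come from GEE models fitted in SPSS, and the exact formula used for Cohen's d with repeated measures is not stated, so the pooled-standard-deviation version below is an assumption. Both function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def odds_ratio(correct_a, total_a, correct_b, total_b):
    """Odds of a correct response in condition A relative to condition B.

    Raw-count version; assumes each condition has at least one error.
    """
    odds_a = correct_a / (total_a - correct_a)
    odds_b = correct_b / (total_b - correct_b)
    return odds_a / odds_b


def cohens_d(sample_a, sample_b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / (n_a + n_b - 2)
    return (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var)
```

An odds ratio above 1 means that correct responses were more likely in the first condition (e.g., CC) than in the second (e.g., IC); a larger absolute d means a larger standardized difference in mean reaction time.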

The NNB Effect Between the Three Groups of Items

The Number-Congruency Effect Between the Groups
To test the main effect of number-congruency and the interaction of number-congruency and type, a GEE The odds ratios showed again that the effect of number-congruency was larger in the Group of Small than in the other groups (in-line with Prediction 4).

The Operation-Congruency Effect Between the Groups of Items
The same analysis was applied to the IC and II1 items to test the main effect of operation-congruency and the interaction of operation-congruency and type, for the three Groups. For multiplication, there was a statistically significant interaction effect of group and operation-congruency, χ 2 (5, N = 1386 performance on II1 items was always higher than for IC items, and the odds ratios showed the same pattern as in multiplication; however, the interaction effect of group and operation-congruency was not statistically significant, χ 2 (5, N = 1386) = 11.036, p = .051.
Additionally, the main effect of operation-congruency and the interaction of operation-congruency and type was tested in the IC and II2 items, for the three Groups. For multiplication, there was a statistically significant inter-  1.11, 3.80]. However, counter to Prediction 5, the odds ratios showed that the effect of operation-congruency is larger in the Group of Small Numbers than in the other groups.

The Effect of Item Familiarity Between the Groups of Items
The same analysis was also conducted on the II1 and II2 items, to test the effect of familiarity with the items that appear in the multiplication
In summary, for both multiplication and division, the effect of number-congruency was larger in the Group of Small Numbers than in the Group of Decimal Numbers and the Group of Large Numbers. These results support Prediction 4, that the effect of number-congruency would be weaker in the latter two groups because small number combinations are more easily memorized. Higher accuracy rates on the IC than on the II2 items showed an effect of operation-congruency which was larger in the Group of Large Numbers than in the Group of Decimal Numbers and in the Group of Small Numbers, in multiplication, but not in division.
These results support Prediction 5 that a stronger effect of operation-congruency would be expected in the Group of Large Numbers and the Group of Decimal Numbers compared to the Group of Small Numbers.
Interestingly, higher accuracy on IC than II1 items in multiplication also showed an operation-congruency effect that was larger in the Group of Large Numbers than in the Group of Decimal Numbers and the Group of Small Numbers, where accuracy was higher in II1 than in the IC items. Again, these results may indicate that the effect of familiarity (i.e., the II1 items) is larger in small number combinations than in decimal and large number combinations. In support of this interpretation, in-line with Prediction 6, the effect of familiarity was larger in the Group of Small Numbers than the Group of Decimal Numbers and the Group of Large Numbers as shown in the higher accuracy rates in II1 than in II2 items for both operations.

Results for Reaction Time for Multiplication Items
Table 4 shows participants' mean reaction times for multiplication items, for each group and type. In the Group of Small Numbers, there was a statistically significant effect of Type, χ 2 (3, N = 603) = 24.007, p < .001. In-line with Prediction 1b the participants responded to CC items faster than to IC items, p = .023, d = 0.39, II1 items, p = .001, d = 0.45, and II2 items, p < .001, d = 0.47. There was no statistically significant difference between IC and II1 items, p = 1.000, d = 0.057, IC and II2 items, p = 1.000, d = 0.072, or II1 and II2 items, p = 1.000, d = 0.015 (Prediction 3b). These results suggest a medium effect of number-congruency for small numbers.
In the Group of Decimal Numbers, there was a statistically significant effect of Type, χ²(3, N = 616) = 12.756, p = .005. In line with Prediction 1b, there were statistically significant differences between mean response time on the CC and II1 items, p = .007, d = 0.22, and between the CC and II2 items, p = .010, d = 0.38. There was not a statistically significant difference in reaction time between the CC and IC items, p = 1.000, d = 0.057.
Additionally, mean reaction time was statistically significantly faster on IC items than on II2 items, p = .029, d = 0.41, but there were no statistically significant differences between mean reaction time on the IC and II1 items, p = .253, d = 0.25 (Prediction 2b), or between the II1 and II2 items, p = 1.000, d = 0.16 (Prediction 3b).
Together, these results suggest small to medium effects for number-congruency and support for medium effects of operation-congruency for decimal numbers.
In the Group of Large Numbers, there was a statistically significant effect of Type, χ²(3, N = 553) = 18.481, p < .001. However, results only partially supported Prediction 1b. There were statistically significant differences in mean reaction times between CC and II1 items, with faster responses on CC items, p = .007, d = 0.25 (Prediction 1b). There was no statistically significant difference in mean reaction time between the CC and IC items, p = 1.000, d = 0.11. Additionally, there were statistically significant differences in mean reaction times between IC and II1 items, p = .001, d = 0.37, and also between IC and II2 items, p = .030, d = 0.26, with faster responses on IC items (Prediction 2b). As with the other two groups of numbers, differences in mean reaction time between II1 and II2 items were not statistically significant, p = 1.000, d = 0.10 (Prediction 3b).
Together, these results suggest limited support for a small effect of number-congruency for large numbers and small to medium effects of operation-congruency. Notably, there is no evidence of a familiarity effect; none was expected, since these numbers fall outside the multiplication table. Mean reaction times for multiplication items by type and group are presented in the Appendix, in Panel (a) of Figure A2.
In sum, in most cases, in line with Prediction 1b that responses on the number-congruent/operation-congruent items would be quicker, the participants responded statistically significantly faster on the CC items than on the II1 and II2 items. Also, in line with Prediction 2b that reaction times would be shorter on operation-congruent items, participants responded faster on IC items than on II1 and II2 items. In some cases, the participants responded faster on IC than on CC items; however, these differences were not statistically significant. Also, the participants' mean reaction time was in most cases faster on the false-familiar II1 items than on the II2 items (in line with Prediction 3); however, these differences were again not statistically significant. The patterns presented above were clear only in multiplication items in the Group of Small Numbers, were less clear in the Group of Decimal Numbers and the Group of Large Numbers, and were almost absent in all groups of division items.

Discussion
This study examined the effect of the NNB on operations between given and missing numbers, using accuracy and response time measurement to extend previous findings that suggest a dual effect of the NNB in arithmetic operations between missing numbers (Christou, 2015a, 2015b). The main hypothesis of the study was that the NNB affects evaluations about the validity of equations that present arithmetic operations between given and missing numbers in two main ways: a) a tendency to think that missing numbers are natural numbers; and b) a tendency to associate each operation with specific results independently of the numbers involved in the operations, i.e., larger results than the operand in multiplication and smaller results in division.
Based on the above hypotheses, the participants were expected to evaluate the validity of given equations (e.g., 7 ÷ _ = 42) using two main strategies: a) to mentally substitute different numbers to test the validity of the equations and b) to compare the operand and result. To test these assumptions, four types of equations were designed to be congruent or incongruent with the intuitive beliefs about the missing numbers and the size of the results of each operation, and were tested using three groups of numbers (i.e., small natural numbers, decimal numbers, large natural numbers). Within the different number groups, the familiarity effect was tested, using number combinations that appear in the multiplication table or seem to appear in the multiplication table (i.e., false familiarity).
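The four item types can be made concrete with a short sketch that solves for the missing number and labels the equation. This is only an illustration of the design as described in the text, not the authors' item-generation procedure: the function name `classify_item` and the "distractor" and "other" labels are hypothetical, and the sketch assumes positive operands and results.

```python
from fractions import Fraction


def classify_item(op, given, result):
    """Label the equation `given op _ = result` as CC, IC, II1, or II2.

    op is "*" or "/"; given and result are assumed to be positive
    (items containing zero are treated here as distractors).
    """
    if op not in ("*", "/"):
        raise ValueError("op must be '*' or '/'")
    g, r = Fraction(str(given)), Fraction(str(result))
    if g == 0 or r == 0:
        return "distractor"
    # Solve for the blank: g * x = r  or  g / x = r.
    x = r / g if op == "*" else g / r
    # Number-congruent: the missing number is a natural number.
    number_congruent = x.denominator == 1
    # Operation-congruent: multiplication makes bigger, division makes smaller.
    operation_congruent = r > g if op == "*" else r < g
    if number_congruent and operation_congruent:
        return "CC"
    if operation_congruent:
        return "IC"
    if number_congruent:
        return "other"  # the blank is 1, so the result equals the operand
    # Both incongruent: a unit-fraction blank looks like a table fact (II1).
    return "II1" if x.numerator == 1 else "II2"
```

Under these assumptions, 3 × _ = 12 is a CC item (the blank is 4), 6.1 × _ = 17.2 is an IC item (the blank is not natural, but the product is larger than the operand), and 7 ÷ _ = 42 is an II1 item, because the blank is the unit fraction 1/6 and the quotient is larger than the dividend.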
Results supported the first aspect of the NNB, showing a large effect of number-congruency on participants' evaluations for all groups of items, with higher accuracy rates on those items with missing natural numbers (i.e., the number-congruent, CC items), compared to those items that falsified this belief (i.e., all other items).
These results support the main prediction of the study. The effect of number-congruency was larger in the Group of Small Numbers than in the Group of Decimal Numbers and the Group of Large Numbers.
This suggests that participants may mentally test the effect of operations by substituting specific numbers for missing numbers, which, under the influence of the NNB, are mostly natural numbers; a finding that further supports previous research (Christou, 2015a, 2015b; Van Hoof et al., 2015). This strategy is also easier to apply effectively in the case of small number combinations than in other categories of tasks. Importantly, it may appear that these items preclude our ability to disentangle the mechanisms of arithmetic fact retrieval from the NNB, if students try to find the missing number (presumably by substituting natural numbers). From our perspective, however, these two mechanisms may not be separable. Multiplication fact retrieval relies on memorization of the multiplication table, which comprises natural number arithmetic. The emphasis on developing fact fluency for natural number arithmetic throughout the primary school years is among the factors that may strengthen the NNB phenomenon.
Results supported the second aspect of the NNB, an effect of operation-congruency on participants' evaluations, more clearly for multiplication than division. Specifically, the results showed higher accuracy rates for items that were in line with participants' intuitive beliefs about the size of the results of multiplication and division (i.e., the operation-congruent, IC items), compared with items that falsified these beliefs and were not familiar (i.e., the operation-incongruent, II2 items). This is in line with the prediction regarding the second part of the dual effect hypothesis, and with previous research that reported the "multiplication makes bigger" misconception (Fischbein et al., 1985; Greer, 1994; Onslow, 1990; Vamvakoussi et al., 2012, 2013; Van Hoof et al., 2015). For accuracy rates, results for multiplication items
Further support for the notion that participants may base their evaluations on trying to find the missing number when this is possible came from higher accuracy rates on operation-incongruent items that appeared familiar because the missing number was a unit fraction (i.e., II1 items) than on operation-congruent items that were in line with intuitive beliefs about the size of the results of operations (i.e., IC items), in all cases except in large number multiplication items; a result counter to our predictions. Most often these differences were not statistically significant; however, they suggest that in cases where the tasks are operation-incongruent and appear familiar, participants tended to respond even more accurately than in those tasks where the size of the results of the operations was in line with participants' intuitive beliefs. In those cases, the effect of operation-congruency was stronger in the Group of Small Numbers than in the Group of Decimal Numbers and the Group of Large Numbers, both in multiplication and division.
These results further support the above interpretation, that participants may base their evaluations on number substitution or task familiarity, rather than on the effect of the operation, and this is a more effective strategy in the case of small numbers.
Statistically significant differences between accuracy rates on items that falsified intuitions about number- and operation-congruency (i.e., the II1 and the II2 items) further support this interpretation. As predicted, accuracy was higher on items that appeared familiar (i.e., the II1 items) than on items that did not (i.e., the II2 items), for both multiplication and division. Again, this effect of false familiarity was stronger in the Group of Small Numbers than in the other two groups, for both operations; however, the effect sizes were smaller than those of the number-congruency effect.
For reaction time, only multiplication items with small numbers supported the above interpretations; results were less clear for other multiplication items and groups, and for all division items and groups. In most cases, the participants responded statistically significantly faster on number-congruent and operation-congruent (i.e., CC) items than on the number-incongruent and operation-incongruent (i.e., the II1 and the II2) items. Also, as predicted, participants responded faster on items that were aligned with their intuitions about the size of the result of the operations (i.e., the IC items) and slower on items that falsified these intuitions (i.e., the II1 and the II2 items). There was also a familiarity effect for items that falsified intuitions about the effect of operations. In line with our predictions, participants' responses were most often faster when items appeared familiar (i.e., II1 items) than when they did not (i.e., II2 items). Interestingly, for items that aligned with intuitions about the size of the results of operations, in most cases participants responded more slowly when a natural number was missing (i.e., on the CC items) than when a non-natural number was missing (i.e., the IC items). These differences were not statistically significant; however, it is possible that there was not sufficient statistical power to detect these effects, due to large variation in reaction times overall. Speculatively, these differences may suggest that participants spend time confirming that a specific natural number is missing, yet do not do so when the missing number is not natural. This hypothesis could be tested in subsequent studies.
Overall, the results of the study support the dual effect of the NNB in arithmetic operations between missing numbers, providing further empirical support to previous findings (Christou, 2015a, 2015b; Obersteiner et al., 2016; Vamvakoussi et al., 2012, 2013; Van Hoof et al., 2015). Results also suggest that when small natural number combinations are presented with missing numbers, participants are inclined to mentally substitute natural numbers to decide whether the expression can be true, and thus they are more susceptible to number-congruency than operation-congruency effects, and to familiarity effects, especially in multiplication.
However, when it is more difficult to trace the missing natural number, and the effect of familiarity is less strong, such as with decimal number or large number combinations, students base their evaluations on intuitions about the size of results from multiplication and division, as a more effective strategy. This tendency appeared when natural numbers were missing and when unit fractions were missing. The latter may have created an effect of false familiarity with multiplication facts, suggesting that students may directly retrieve facts from memory, as suggested in previous studies (Krueger, 1986; Krueger & Hallford, 1984; Stazyk et al., 1982).

Limitations
To focus on testing the dual effect of the NNB, this study maximized experimental items to examine number- and operation-congruency effects, and had relatively fewer distractor items (since distractor items must contain zero as an operand or result). As a result, to answer correctly, participants needed to indicate more often that the expression was possible than impossible. However, given the observed accuracy rates, participants evaluated many of the expressions to be impossible, suggesting item imbalance was not an issue.
Another potential limitation was that students could still respond using general insights about the solvability of linear equations. There may be contexts in which this strategy of applying a general insight (understanding that all expressions are possible as long as they don't involve a 0) is frequently relied on. For instance, Obersteiner et al. (2016) observed this in mathematical experts. In our study, missing number symbols rather than literal symbols were used specifically to discourage this strategy. Use of the general insight strategy in our current study seems unlikely; there were no ceiling effects, no systematically fast reaction times on all items, and no indications that participants figured out this strategy during the course of the experiment. Specifically, only one student gave the correct response for all items, and only 10% of participants scored higher than 95. From our perspective, these results show that even if some students used either of the two strategies (i.e., either solved the equations, or evoked knowledge on equation solvability), they did not do so systematically. This implies that although they presumably recognized that knowledge of equations was relevant to the tasks at hand, they failed to use this knowledge in all items. This could be interpreted as an instance where the intuitive, "biased" response (Leron & Hazzan, 2006) overrode the analytic one, perhaps because students felt more confident about it. Future studies that incorporate interviews could shed more light on the actual strategies that the participants tend to use in each category of items, as the present study did not collect strategy information.
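The general insight mentioned above can be written out as a simple rule. The sketch below is a hypothetical illustration over the rational numbers, not a description of the study's materials: the function name `has_solution` is an assumption, and the handling of the zero cases follows ordinary arithmetic rather than anything the study specifies about its distractor items.

```python
def has_solution(op, given, result):
    """Return True if some number makes `given op _ = result` true.

    op is "*" or "/". When neither the operand nor the result is zero,
    a solution always exists, which is the general insight itself.
    """
    if op not in ("*", "/"):
        raise ValueError("op must be '*' or '/'")
    if given != 0 and result != 0:
        return True  # blank is result / given, or given / result
    if op == "*":
        # 0 * _ = r works only if r is 0; g * _ = 0 is solved by 0.
        return result == 0
    # 0 / _ = 0 holds for any nonzero blank; otherwise no blank works.
    return given == 0 and result == 0
```

A respondent applying this rule would accept 7 ÷ _ = 42 immediately, without finding the missing number 1/6, whereas the accuracy and reaction time patterns reported here suggest that participants did not respond this way systematically.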
Finally, a limitation of this research approach generally is that it involves rating equations only in a purely symbolic form. Performance on mathematical tasks can be highly context dependent (e.g., Saxe, 2015), and so can knowledge of mathematical principles (Prather & Alibali, 2008). Along this line, someone could argue that thinking of the unknown number as a decimal number, for example, is more difficult in a context where all the other numbers in the equation are natural than when decimal numbers appear in the same equation. Indeed, in the present study, the accuracy rates on the tasks that were number-incongruent but operation-congruent were slightly higher in the group of decimal numbers than in the group of small natural numbers. However, since participants are exposed to natural and decimal numbers throughout the experiment, this potential difference in difficulty should be eliminated or greatly reduced for any specific equation. Additionally, from the NNB perspective that we endorse, such an effect of context would further support our main claim about an intuitive/analytic distinction between the cognitive processes that underlie reasoning with arithmetic operations. In other words, no context effect should appear if participants' responses were not affected by the NNB, considering that the participants have been exposed to decimal numbers and fractions for many years throughout schooling. Support for this position comes from previous studies which have shown that the NNB may affect how students apply different properties to the different symbolic representations of rational numbers, i.e., when they appear as decimals or as fractions (for a detailed discussion see Vamvakoussi & Vosniadou, 2010). However, participants may still perform differently if tasks are presented in verbal or problem-solving form. As such, the potentially context-dependent nature of performance on these tasks requires further examination.
Also, further studies should test whether the reported effects could be education dependent, by testing how performance on the tasks might differ between adults with more and less formal education or with younger students.

Implications
The results of the present study provide valuable information about the cognitive processes that underlie reasoning with arithmetic operations and the role of prior natural number knowledge in these processes.
Results further support that the NNB contributes to the effect of number size on students' evaluations of the validity of multiplication and division equations, which further suggests that classroom instructors will need to address the NNB. However, addressing the NNB in instruction is a complex endeavor that requires remedial and anticipation approaches (Vamvakoussi et al., 2018). As an initial step, teachers should learn that students hold intuitive ideas about various mathematical topics, including the size and the type of the results of operations, since preservice teachers are insufficiently aware of such issues (Depaepe et al., 2015; Depaepe et al., 2018). Students should also learn about their intuitions, since these beliefs are often implicit and not under their conscious control (Fischbein, 1987). However, merely challenging students' erroneous beliefs, without interventions that raise awareness about the discrepancy between incorrect beliefs and the mathematically correct perspective, is not enough for students to abandon their alternative conceptions and accept the mathematically correct knowledge (Merenluoto & Lehtinen, 2004).
Further, merely raising awareness may not be enough. For example, students who can verbalize the fact that non-natural numbers can be substituted for missing number symbols or literal symbols may still substitute natural numbers only, even after hints from interviewers (Christou & Vosniadou, 2012; Van Hoof et al., 2015).
Along the same line, merely changing the missing number symbol to x in the given equations (rather than the '_' that was used), would be unlikely to solve the NNB issue. That is because by following an equation-solving process (mentally or not), the participants may determine the non-natural missing number (i.e., the solution).
This, however, would draw on cognitive processes different from students' intuitive responses that are affected by the NNB. Therefore, as has been argued before (Christou, 2015b; Dimitrakopoulou & Christou, 2018; Vamvakoussi et al., 2013), for students to overcome their intuitions about the effects of arithmetic operations, they need explicit strategies to inhibit natural number knowledge interference (Moutier & Houdé, 2003; Roell et al., 2017, 2019; Van Dooren & Inglis, 2015). One inhibition strategy is to always try at least one non-natural number (a negative number or a positive rational number smaller than 1) in missing number tasks, or to always recall that multiplication does not always make bigger.
Lastly, the design of the items presented in the study may be fruitfully used and empirically tested as educational materials that could illuminate and falsify students' intuitive beliefs about the size of the results of operations.
These items could be used in constructivist teaching environments to raise students' awareness about the familiarity of multiplication facts due to natural number calculations. Students can learn that the tendency to think that calculations between missing numbers hold only between natural numbers could create further constraints on successfully completing mathematical tasks, such as solving equations.