Considering the importance of mathematical knowledge for STEM careers, we aimed to better understand the cognitive mechanisms underlying the commonly observed relation between number line estimations (NLEs) and arithmetics. We used a within-subject design to model NLEs in an unbounded and bounded task and to assess their relations to arithmetics in second to fourth grades. Our results mostly agree with previous findings, indicating that unbounded and bounded NLEs likely index different cognitive constructs at this age. Bounded NLEs were best described by cyclic power models including the subtraction bias model, likely indicating proportional reasoning. Conversely, mixed log-linear and single scalloped power models provided better fits for unbounded NLEs, suggesting direct estimation. Moreover, only bounded but not unbounded NLEs related to addition and subtraction skills. This thus suggests that proportional reasoning probably accounts for the relation between NLEs and arithmetics, at least in second to fourth graders. This was further confirmed by moderation analysis, showing that relations between bounded NLEs and subtraction skills were only observed in children whose estimates were best described by the cyclic power models. Depending on the aim of future studies, our results suggest measuring estimations on unbounded number lines if one is interested in directly assessing numerical magnitude representations. Conversely, if one aims to predict arithmetic skills, one should assess bounded NLEs, probably indexing proportional reasoning, at least in second to fourth graders. The present outcomes also further highlight the potential usefulness of training the positioning of target numbers on bounded number lines for arithmetic development.

To prepare students for success in STEM careers, it is important to better understand the cognitive processes underlying mathematical development. Several basic numerical tasks, such as magnitude comparison (e.g.,

Studies consistently report strong relations between children’s number line estimations (NLEs) and arithmetics. Better estimations in terms of average error have been related to higher achievements on various more complex and advanced mathematical skill measures in elementary school children (e.g.,

In NLE tasks, participants need to place a target number on a visually presented number line (

Since NLE tasks were typically considered to allow for a direct assessment of the underlying mental number line (MNL) representation (

However, recent studies assessing both the classical bounded NLE task and the relatively new unbounded version found that associations between NLEs and arithmetics were only observed for the former but not the latter task (

Interestingly, bounded NLEs seem to be best described by a variety of cyclic power models (

More concretely, since bounded number lines have a clearly defined origin and endpoint, individuals can scale the line length to a proportion by estimating the distances of the target number from the left and right boundaries of the line. For example, if individuals need to place the number 60 on a line from 0 to 100, they pick a position on the line and then look back and forth between that position and the boundaries of the line, adjusting the position until its distances to the lower and upper boundaries seem to be 60 and 40 respectively. When participants use this “proportional judgement” strategy to estimate the position of numbers on bounded lines, their estimates form an S-shaped ogival curve around the accuracy line, which can be described by Spence’s power model (

Even though the question of whether proportional reasoning is adequately modelled by CPMs remains a matter of debate, the application of such a calculation strategy on bounded tasks can be further supported by studies focussing on participants’ error rates (e.g.,

As opposed to the bounded task, strategies on unbounded number lines were shown to consist of direct estimation and dead-reckoning (

Direct estimation on unbounded number lines is also further confirmed by participants’ error data. Namely, error variability in the unbounded task linearly increases with target number and is not characterized by the M-shaped pattern reflecting proportional judgement on bounded number lines (

Considering that 1) mainly bounded but not unbounded estimates correlate with arithmetics and that 2) direct estimation is likely only indexed by unbounded NLEs, mechanisms other than numerical magnitude representations such as proportional judgement probably underlie the relation between NLEs and arithmetics. In other terms, NLEs correlate with mathematical skills, because proportional reasoning is a key proficiency component in mathematics (cf.,

Whether or not the MNL can account for the commonly observed relation between NLEs and arithmetics is a matter of ongoing debate. Although abundant evidence suggests that bounded estimations rather reflect calculation strategies (

How can the findings of

In this paper, we generally aimed to further unravel the cognitive mechanisms underlying the commonly observed relation between NLEs and arithmetics. To shed further light on the aforementioned inconsistencies and to account for the potential shortcomings of previous designs, the present study used a

The current

We also performed

We intended to contrast the MNL with proportional reasoning as possible constructs underlying unbounded and bounded NLEs and their relations to arithmetics. Since the likelihood of strategy application and the reliable use of reference points was shown to be increased by employing a

Considering the logarithmic-like (as opposed to linear-like) estimation patterns usually observed in younger children or with less familiar number ranges, we would like to point out that different conclusions regarding the constructs underlying (bounded and unbounded) NLEs and their relations to arithmetics might be drawn in those cases. It was, however, beyond the scope of the present study to examine whether the shift from seemingly logarithmic to more linear NLEs with age and experience reflects 1) changes in the disposition of numerical magnitude representations (e.g.,

A cross-sectional sample of 69 elementary school children (20 second graders [11 boys, mean age = 8.17 years,

Post-hoc power analysis using the G*Power 3 software (

All tasks were administered in group settings, starting with the arithmetic task.

To assess children’s arithmetic skills, the TTR (Tempo Test Rekenen;

The study consisted of a bounded and unbounded version of the NLE task, requiring participants to indicate the correct position of a presented target number on a number line. All number lines had a constant length of 20 cm and were presented on paper in landscape format with one item per DIN A4 sheet.

In the bounded task, number lines were labelled below the origin and endpoint with the numbers 0 and 100, respectively. Target numbers [2, 3, 6, 7, 9, 11, 13, 17, 18, 27, 35, 47, 53, 64, 75, 82, 95, 99] were placed above the origin with one target number per number line. Children were told that they would see a number line that only has an origin and an endpoint. The task was then explained (in Luxembourgish) as follows: “Look at the number above the number line. Where do you think this number goes between 0 and 100? Please mark your estimate on the number line.” The number 50 was used as a practice trial.

In the unbounded task, number lines were only labelled below the origin with 0 and did not comprise a clearly defined endpoint. A unit indicating the distance between 0 and 1 was depicted below the origin. Target numbers were presented above the origin. The numerical length of the unbounded number line was 29. However, only items up to 20 [2, 3, 4, 6, 7, 8, 9, 12, 13, 14, 15, 16, 17, 18, 19] were used to keep sufficient space between the largest target number and the physical endpoint. Children were told that there is no end to the number line but that they can see how long the distance from 0 to 1 is. The number 10 was used as a practice trial.

All children first completed the unbounded task followed by the bounded one. This order prevented the endpoint ‘100’ given in the latter task from biasing estimations in the former. We particularly wanted to prevent the children from mistakenly assuming that unbounded number lines might also cover the range from 0 to 100. Participants were not informed about the number range covered by the unbounded task prior to completing it. In other words, the task that explicitly defined a number range (i.e., the bounded task) had to be administered last to prevent participants from building up expectations about the number range used in the unbounded task. This procedure has also been previously used in studies assessing unbounded and bounded NLEs using a within-subject design (e.g.,

We also calculated mean PAEs as a function of target number in both the unbounded and bounded tasks across all children separately for each grade.
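The per-target averaging described above can be sketched as follows. This is a minimal illustration, assuming the conventional definition of the percent absolute error (absolute deviation between estimate and target, divided by the numerical scale of the line, times 100). The function names and the `scale` argument are hypothetical; the scale is 100 for the bounded task, whereas the divisor used for the unbounded task is not stated in this passage and would have to be supplied by the reader.

```python
def percent_absolute_error(estimate, target, scale):
    """PAE = |estimate - target| / scale * 100, all in number-line units.

    Assumption: `scale` is the numerical range used to normalise errors
    (e.g., 100 for a 0-100 line).
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    return abs(estimate - target) * 100.0 / scale


def mean_pae_by_target(trials, scale):
    """Average PAE per target number.

    trials: iterable of (target, estimate) pairs pooled across children.
    Returns a dict mapping each target to its mean PAE, sorted by target.
    """
    grouped = {}
    for target, estimate in trials:
        grouped.setdefault(target, []).append(
            percent_absolute_error(estimate, target, scale)
        )
    return {t: sum(v) / len(v) for t, v in sorted(grouped.items())}


# Hypothetical estimates from two children on a 0-100 line.
example = mean_pae_by_target([(60, 55), (60, 65), (27, 30)], scale=100)
```

Plotting the resulting means against target number gives the kind of error profile that is compared across tasks and grades.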

We distinguished between models that either index direct estimation and dead-reckoning or reflect calculation strategies, such as proportional judgement. Direct estimation is indicated by the superior fits of either the mixed log-linear model (MLLM) or the scalloped power models (SPMs), with the latter also reflecting some extent of dead-reckoning depending on the variant of the model. Conversely, proportional judgment can be revealed by the superior fits of the cyclic power models (CPMs), including the subtraction bias cyclic model (SBCM).

The MLLM predicts estimates as a weighted sum of logarithmic and linear transforms of the target number (

where

The SPM

Dual- (2SPM) and multi-scalloped power models (multi-SPM) reflect the estimation of a particular working window of numbers (e.g., 5) before using multiples of this working window to estimate the position of higher target numbers. The 2SPM identifies participants who estimate the working window once before positioning their estimates:

while the multi-SPM indexes multiple applications of the working window:

In these models,

Finally, CPMs index calculation strategies, most likely proportional judgment. These models suggest that participants use at least two reference points (i.e., the origin and the endpoint) to guide their estimations. Number placement thus occurs by estimating a target number’s distance from the lower and upper bounds of the number line until the two distances are consistent with each other. While the one-cycle power model (1CPM) reflects the use of two reference points (i.e., the origin and the endpoint):

the two-cycle power model (2CPM) indexes the reliance on an additional central reference point:

1CPM and 2CPM were fitted with one free parameter, the exponent β describing the numerical bias and the shape of the power function.
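For readers who want to experiment with these functions, the following is a minimal sketch of the two cyclic power models. It assumes the commonly used cyclical power form, in which the estimate is the power-transformed distance from the lower anchor divided by the sum of the power-transformed distances to both anchors, rescaled to the anchor interval. The function names and the `upper` argument (the upper bound, 100 in the bounded task) are illustrative choices and not taken from the original analysis scripts.

```python
def one_cycle_power(x, beta, upper):
    """1CPM: estimate anchored on the origin (0) and the endpoint (upper)."""
    if not 0 <= x <= upper:
        raise ValueError("target must lie on the number line")
    left = x ** beta             # transformed distance from the origin
    right = (upper - x) ** beta  # transformed distance from the endpoint
    return upper * left / (left + right)


def two_cycle_power(x, beta, upper):
    """2CPM: the same form applied within each half of the line,
    which adds the midpoint as a third reference point."""
    half = upper / 2.0
    if x <= half:
        return one_cycle_power(x, beta, half)
    return half + one_cycle_power(x - half, beta, half)


# With beta = 1 both models reduce to accurate (linear) responding.
accurate = one_cycle_power(60, 1.0, 100)
# With beta < 1 targets below the midpoint are overestimated.
biased = one_cycle_power(25, 0.6, 100)
```

Under this form, the reference points themselves (0, the endpoint and, for the 2CPM, the midpoint) are always estimated without bias, whatever the value of β.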

In addition to these models,

Importantly, CPMs require definition of an upper bound

In both the SPMs and CPMs, accurate responding is indicated by β = 1, while β < 1 indicates a negatively accelerating bias (i.e., logarithmic), and β > 1 indicates a positively accelerating bias (i.e., exponential). β was set to be greater than 0.

The models were fitted both to individual estimates and to median estimates separately for each grade. Models were compared in terms of goodness of fit by calculating AICc^{1}^{2}), it thus considers both goodness of fit and model complexity in terms of the number of parameters (

Firstly, we determined whether estimation errors depended on grade and/or the version of the number line task. We therefore conducted a linear mixed effect model using the lme4 package (^{2}(1) = 8.53, ^{2}(1) = 6.10, ^{2}(1) = 4.26, ^{nd} grade: PAE = 9.25, ^{rd} grade: PAE = 8.45, ^{th} grade: PAE = 5.97, ^{nd} grade: PAE = 11.12, ^{rd} grade: PAE = 14.96, ^{th} grade: PAE = 13.12,

Next, we correlated unbounded and bounded PAEs with the size of target numbers separately for each grade. Unbounded PAEs significantly increased with increasing target number in every grade (2^{nd} grade: ^{rd} grade: ^{th} grade:

To further confirm the use of different strategies on unbounded and bounded tasks, we fitted a series of models to children’s unbounded and bounded NLEs. Since relying only on mean or median estimates across individuals from each grade for each task might obscure individual differences in estimation patterns and trajectories (see also

First, children were grouped based on the model that best described their individual estimates. ^{2}

Grade | MLLM | 1SPM | 2SPM | Multi-SPM | SBCM | 1CPM | 2CPM |
---|---|---|---|---|---|---|---|
Unbounded Task | | | | | | | |
2^{nd} | 30 | 55 | 0 | 5 | 5 | 5 | – |
3^{rd} | 33.33 | 50 | 0 | 5.56 | 11.11 | 0 | – |
4^{th} | 38.71 | 48.39 | 0 | 6.44 | 3.23 | 3.23 | – |
All | 34.78 | 50.72 | 0 | 5.80 | 5.80 | 2.90 | |
Bounded Task | | | | | | | |
2^{nd} | 10 | 0 | 0 | 0 | 40 | 25 | 25 |
3^{rd} | 11.11 | 0 | 0 | 0 | 44.44 | 27.78 | 16.67 |
4^{th} | 9.68 | 6.45 | 0 | 0 | 25.81 | 29.03 | 29.03 |
All | 10.14 | 2.90 | 0 | 0 | 34.78 | 27.54 | 24.64 |
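The grouping of children by best-fit model can be illustrated with a short sketch. It assumes the standard least-squares form of the small-sample corrected Akaike information criterion, AICc = n ln(RSS/n) + 2k + 2k(k + 1)/(n − k − 1), where n is the number of estimates and k the number of free parameters. Whether the original analysis counted the error variance as an additional parameter is not stated here, so the absolute values produced by this sketch need not match the reported ones. All function names are hypothetical.

```python
import math
from collections import Counter


def aicc(rss, n, k):
    """Small-sample corrected AIC for a least-squares fit.

    rss: residual sum of squares, n: number of estimates,
    k: number of free parameters (assumption: model parameters only).
    """
    if rss <= 0 or n - k - 1 <= 0:
        raise ValueError("need rss > 0 and n > k + 1")
    aic = n * math.log(rss / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)


def best_fit_model(aicc_by_model):
    """Return the name of the model with the lowest AICc for one child."""
    return min(aicc_by_model, key=aicc_by_model.get)


def percent_best_fit(best_models):
    """Percentage of children assigned to each best-fit model."""
    counts = Counter(best_models)
    total = len(best_models)
    return {model: 100.0 * c / total for model, c in counts.items()}


# Hypothetical AICc values for one child in the bounded task.
child = {"MLLM": 72.9, "SBCM": 70.7, "1CPM": 74.0, "2CPM": 79.3}
winner = best_fit_model(child)
```

Applying `best_fit_model` to every child and passing the resulting labels to `percent_best_fit` yields percentages of the kind tabulated above.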

To determine whether NLEs in terms of PAEs depended on the strategy used to position target numbers on either the unbounded or bounded number line (i.e., best-fit model), we performed two one-way ANOVAs on either unbounded or bounded PAEs including best-fit unbounded or bounded model respectively as between-subject factor. For these analyses, individuals whose unbounded NLEs were not described by either the MLLM or 1SPM were pooled to avoid very small sample sizes. For the same reason, those participants whose bounded NLEs were not fit by a variant of the CPMs were combined. Analysis revealed a main effect of best-fit model for bounded PAEs, in that children using three reference points outperformed children whose estimates were best described by either the MLLM or 1SPM (2CPM: PAE = 5.47, MLLM-1SPM: PAE = 9.32,

Apart from this, we also considered the average goodness of fit, as indexed by AICc, across all participants per grade for each model used to fit either unbounded or bounded NLEs. Mean unbounded and bounded AICc are displayed in ^{nd} grade: ^{rd} grade: ^{th} grade: ^{nd} grade = 0.69, 3^{rd} grade = 0.71, 4^{th} grade = 0.89; 1CPM: 2^{nd} grade = 0.66, 3^{rd} grade = 0.68, 4^{th} grade = 0.89). Importantly, unbounded and bounded model parameters did not correlate (see

Grade | MLLM | 1SPM | 2SPM | Multi-SPM | SBCM | 1CPM | 2CPM |
---|---|---|---|---|---|---|---|
Unbounded Task | | | | | | | |
2^{nd} | 28.20 | 27.43 | 29.75 | 29.49 | 31.18 | 42.12 | – |
3^{rd} | 28.44 | 28.77 | 30.33 | 29.98 | 34.57 | 51.21 | – |
4^{th} | 23.66 | 23.75 | 25.99 | 24.96 | 29.33 | 46.87 | – |
All | 26.77 | 26.65 | 28.69 | 28.15 | 31.70 | 46.74 | |
ΔAICc | 0.12 | 0.00 | 2.04 | 1.50 | 5.05 | 20.09 | |
Bounded Task | | | | | | | |
2^{nd} | 72.85 | 85.60 | 88.48 | 88.62 | 70.67 | 74.01 | 79.28 |
3^{rd} | 71.05 | 83.84 | 86.71 | 86.70 | 67.73 | 70.39 | 74.67 |
4^{th} | 67.67 | 71.69 | 74.55 | 74.47 | 62.71 | 64.46 | 67.10 |
All | 70.53 | 80.38 | 83.25 | 83.27 | 67.04 | 69.62 | 73.68 |
ΔAICc | 3.49 | 13.34 | 16.21 | 16.23 | 0.00 | 2.58 | 6.64 |
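The ΔAICc rows can be reproduced from the "All" rows by subtracting the lowest mean AICc within each task from every model's mean AICc. The short sketch below does this for the tabulated values; the function name is illustrative.

```python
def delta_aicc(mean_aicc):
    """Difference between each model's mean AICc and the best (lowest) one."""
    best = min(mean_aicc.values())
    return {model: round(value - best, 2) for model, value in mean_aicc.items()}


# Mean AICc across all children, taken from the "All" rows of the table.
unbounded_all = {"MLLM": 26.77, "1SPM": 26.65, "2SPM": 28.69,
                 "Multi-SPM": 28.15, "SBCM": 31.70, "1CPM": 46.74}
bounded_all = {"MLLM": 70.53, "1SPM": 80.38, "2SPM": 83.25,
               "Multi-SPM": 83.27, "SBCM": 67.04, "1CPM": 69.62, "2CPM": 73.68}

unbounded_delta = delta_aicc(unbounded_all)
bounded_delta = delta_aicc(bounded_all)
```

A ΔAICc of 0 marks the best-fitting model on average, which is the 1SPM in the unbounded task and the SBCM in the bounded task.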

Parameter | Correlation Coefficient |
---|---|
MLLM_{λ} | -.03 |
1SPM_{β} | .11 |
2SPM_{β} | .12 |
Multi-SPM_{β} | .07 |
SBCM_{s} | .14 |
SBCM_{β} | .19 |
1CPM_{β} | .14 |

Alongside these analyses at the individual level, we also fitted the different models to the children’s median estimates separately for each task and grade (see

To determine whether arithmetic performances depended on grade and/or task (addition vs. subtraction), we also fitted a linear mixed-effects model. Task and grade with interaction term were entered as fixed effects, while we included intercepts for participants as a random effect. Model comparisons were done using chi-square tests on the log-likelihood values. Since the full model did not provide a better fit than the reduced model without the interaction term, χ^{2}(1) = 2.53, ^{2}(1) = 30.61, ^{2}(1) = 43.47, ^{nd} grade = 31.70, 3^{rd} grade = 37.56, 4^{th} grade = 44.19). Performances were significantly correlated in both tasks (

Better addition and subtraction skills were related to significantly fewer and less variable estimation errors, as indexed by individual mean PAEs and SDs of mean PAEs, in the bounded but not the unbounded task (see

Measure | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
1. Addition | – | .702*** | .001 | -.008 | -.422*** | -.437*** |
2. Subtraction | .578*** | – | .144 | .138 | -.368** | -.384** |
3. Unbounded PAE ( | -.099 | .087 | – | .812*** | -.041 | -.107 |
4. Unbounded PAE ( | -.069 | .113 | .812*** | – | .006 | -.004 |
5. Bounded PAE ( | -.240* | -.198 | .017 | .043 | – | .856*** |
6. Bounded PAE ( | -.219^{†} | -.184 | -.048 | .039 | .823*** | – |

^{†}

Since most of the children relied on calculation strategies, such as probably proportional reasoning, when placing target numbers on bounded number lines, we additionally assessed correlations with arithmetic skills when focussing only on those individuals whose bounded NLEs were best described by one of the variants of the CPMs (

Measure | 1 | 2 | 3 | 4 |
---|---|---|---|---|
1. Addition | – | .735*** | -.442*** | -.429** |
2. Subtraction | .611*** | – | -.431** | -.444*** |
3. Bounded PAE ( | -.319* | -.315* | – | .825*** |
4. Bounded PAE ( | -.249^{†} | -.289* | .800*** | – |

^{†}

To further assess the importance of calculation strategies, we determined whether the relation between arithmetic skills and bounded NLEs was conditional upon strategy use by performing moderation analysis using Hayes’ PROCESS macro for SPSS. In two separate analyses, we assessed the effect of bounded mean PAEs on addition and subtraction skills respectively, including best-fit model as moderator. Those children whose estimates were best fit by either the MLLM or the 1SPM were categorized as not applying any calculation strategy and compared to those individuals classified as either SBCM, 1CPM or 2CPM. This resulted in a multi-categorical moderator with four levels. In this model, moderation is indicated by a significant effect of the interaction term between bounded mean PAEs and best-fit model on addition and/or subtraction skills, while controlling for the effects of the factors included in the interaction term. A bootstrapping approach with 5,000 bootstrap samples was used. Significance was determined using 95% bias-corrected confidence intervals. To avoid multicollinearity issues, all variables were mean-centred prior to analyses.
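To make the logic of this moderation concrete, the sketch below computes the conditional (simple) slope of arithmetic score on bounded mean PAE separately for each level of a categorical moderator. When the moderator is categorical and the model contains all main effects and interaction terms, these per-group least-squares slopes coincide with the conditional effects of the full interaction model. The sketch is not the PROCESS macro: it omits the significance test of the interaction and the bias-corrected bootstrap, and the data and names are hypothetical.

```python
def ols_slope(xs, ys):
    """Least-squares slope of y on x for one group."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor has no variance in this group")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def simple_slopes(records):
    """Slope of arithmetic score on PAE at each moderator level.

    records: iterable of (best_fit_model, mean_pae, arithmetic_score).
    """
    by_group = {}
    for group, pae, score in records:
        xs, ys = by_group.setdefault(group, ([], []))
        xs.append(pae)
        ys.append(score)
    return {group: ols_slope(xs, ys) for group, (xs, ys) in by_group.items()}


# Hypothetical data: scores fall with PAE in the "SBCM" group only.
records = [("SBCM", pae, 40 - 2 * pae) for pae in (3, 5, 7, 9)]
records += [("MLLM-1SPM", pae, 30.0) for pae in (4, 8, 12)]
slopes = simple_slopes(records)
```

Moderation corresponds to these slopes differing across levels of the moderator. Mean-centring the predictor shifts the intercepts but leaves the slopes unchanged.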

While the interaction between bounded mean PAEs and best-fit model did not account for a significant proportion of the variance in addition skills (Δ^{2} = .06, ^{2} = .08,

We also assessed correlations between addition and subtraction skills and best-fit model parameters (see

Parameter | Addition | Subtraction |
---|---|---|
Unbounded MLLM (λ) | .088 | -.072 |
Unbounded 1SPM (β) | .012 | .179 |
Bounded SBCM (s) | -.307* | -.170 |
Bounded SBCM (β) | -.274* | -.272* |
Bounded 1CPM (β) | -.263* | -.220^{†} |
Bounded 2CPM (β) | -.330** | -.308** |

^{†}

Finally, we fitted a linear mixed model to determine the effects of best-fit CPM (SBCM vs. 1CPM vs. 2CPM) and/or task (addition vs. subtraction) on arithmetic skills to assess whether the latter performances were affected by the specific type of proportional judgement strategy applied by the children. Best-fit CPM and task with interaction term were entered as fixed effects and intercepts for participants as a random effect. Model comparisons were done using chi-square tests on the log-likelihood values. Since the full model did not provide a better fit than the reduced model without the interaction term, χ^{2}(1) = 0.54, ^{2}(1) = 40.24, ^{2}(1) = 1.19,

In this paper, we aimed to shed further light on the cognitive mechanisms underlying the commonly observed relation between NLEs and arithmetics in older elementary school children. Below we will first discuss the cognitive constructs indexed by the two different NLE tasks and then consider how those mechanisms could explain the differential relations of unbounded and bounded NLEs to arithmetic performances.

Unbounded and bounded NLEs were unrelated in terms of both estimation errors and model parameters. Moreover, only bounded NLEs improved with grade and were overall better than unbounded performances. In addition, while unbounded NLEs were best described by models reflecting direct estimation, functions supposedly capturing proportional judgment provided better fits for bounded NLEs. The two tasks thus probably elicit different estimation strategies with only the unbounded version providing an accurate measure of children’s numerical magnitude representations in the current sample.

More concretely, unbounded NLEs of about 85% of the children were best described by either the MLLM or the 1SPM, reflecting direct estimation. In addition, the goodness of fit (as indexed by AICc) provided by these models was better than that of the CPMs, which probably index proportional reasoning strategies. Unbounded estimation errors also significantly increased as a function of target number. This pattern reflects a signature of the approximate number system (e.g.,

Conversely, bounded NLEs of the majority of the children were best fit by the CPMs, commonly reported to index proportional judgement (

Bounded NLEs, likely indexing proportional reasoning, related to addition and subtraction skills. Importantly, these relations remained significant when controlling for grade in children whose estimates were best described by the CPMs. This generally agrees with previous findings, consistently reporting strong relations between bounded NLEs and arithmetics (e.g.,

It is important to also comment on the more practical implications of the present outcomes. What does the absence of a relation between arithmetics and unbounded NLEs, probably providing a purer measure of numerical magnitude representations, tell us about the importance of the latter for arithmetic development? The present findings suggest that the scaling of numerical magnitudes on the MNL does not relate to arithmetics in second to fourth graders. This agrees with studies reporting the absence of a relation between the SNARC effect, another important marker of the MNL, and arithmetic skills in older children attending fourth (

Despite the outcome of the present study providing no evidence for the importance of numerical magnitude representations for arithmetic learning, the relation of bounded NLEs to addition and subtraction skills further suggests that bounded number lines are a valuable and robust tool for predicting mathematical competence. Since number line tasks are easily applicable, relatively short, and very cost-effective, they could also be used to assist the diagnosis of mathematical learning difficulties. The present study also further highlights the potential usefulness of training the positioning of target numbers on bounded number lines for arithmetic development.

First of all, it should be emphasized that the present outcomes might not be generalizable to different age groups and/or number ranges. We tested older elementary school children to 1) complement previous within-subject designs assessing both model fits and relations to arithmetics in younger children (

Since relatively older children (and adults) usually produce seemingly linear estimation patterns on familiar number lines, while negatively accelerating logarithmic-like responses are commonly observed in younger children or with less familiar number ranges, it is probable that different conclusions regarding the constructs underlying bounded and unbounded NLEs as well as their relations to arithmetics might be drawn depending on age and/or number range. Since the shift from seemingly logarithmic to more linear (unbounded and bounded) NLEs with age and experience was suggested to reflect 1) changes in the disposition of numerical magnitude representations (e.g.,

The idea that different cognitive constructs might underlie NLEs at different developmental stages could be supported by the findings of

Apart from the constructs underlying NLEs, their relations to arithmetics might also vary depending on age and/or number range. For instance, ^{3}

Another important point worth mentioning is that due to the linear-like estimation patterns usually observed in older children on familiar number lines, the present study did not include any additional models, such as the bilinear (also known as decomposed linear or segmented linear) account. This model was suggested to provide good fits for logarithmic-like NLEs and is thereby yet another alternative explanation for the logarithmic-to-linear shift hypothesis (see

Finally, apart from using different models with unfamiliar number ranges at earlier developmental stages, future studies should also complement the present findings by considering potential domain-specific and/or domain-general covariates. In the current study, we employed a hybrid design, where we used a modelling procedure to better understand the constructs underlying (bounded and unbounded) NLEs and consequently individual differences in addition and subtraction skills (see e.g.,

In second to fourth graders, unbounded and bounded NLEs index different cognitive constructs. While unbounded estimates reflect direct estimation, thereby providing an appropriate measure of the scaling of numerical magnitude representations, bounded estimates rather index calculation strategies, such as proportional reasoning. These calculation strategies then likely account for the relation of bounded but not unbounded NLEs to addition and subtraction skills. Although the present findings do not provide any evidence for the involvement of numerical magnitude representations for arithmetic learning, we cannot rule out their importance at earlier developmental stages. Depending on the aim of future studies, the present outcomes suggest measuring estimations on unbounded number lines if one is interested in directly assessing numerical magnitude representations in second to fourth graders. Conversely, if one aims to predict arithmetic skills at this age, one should rather assess estimations on bounded number lines, likely indexing proportional reasoning.

The current research was supported by the National Research Fund Luxembourg (FNR,

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The authors would like to thank Charlotte Sosson for her help with data collection.

The same results were obtained when using the Bayesian information criterion (BIC) instead of AICc as a measure of goodness of fit (see also

It could be argued that the better fits of the CPMs compared to the MLLM in the bounded task might be explained by a decline in attentional processes on this task rather than by its boundedness. Namely, task order was fixed in the present study with the bounded task always being administered last, and attention was previously shown to generally increase the linearity of NLEs (see e.g.,

Considering that