Mathematical competencies are pivotal for the educational and professional success of individuals and for societal functioning and welfare (Hanushek & Woessmann, 2008). Thus, learning what to do in situations that involve mathematics is a major goal of schooling. At all grade levels, however, performance in mathematics varies tremendously between individuals. This variation can be traced back to a dynamic interaction between personal characteristics of the individuals and their access to supportive formal and informal learning opportunities (e.g., Watts, Duncan, Siegler, & Davis-Kean, 2014). Even though powerful learning environments can boost the mean achievement of classrooms, learners of all ages differ markedly in the extent to which they gain from practice and explanations, and in the extent to which they can construct meaningful concepts that facilitate transfer to as yet unfamiliar problems (e.g., Nokes-Malach & Mestre, 2013; Schwartz, Bransford, & Sears, 2005). While researchers agree that humans differ in their cognitive preconditions for learning mathematics beyond environmental impact, the domain-specific and domain-general sources of these differences are still under discussion. We contribute to this discussion by presenting results from a longitudinal study. We investigated the predictive value of kindergarten children’s informally acquired understanding of relational quantitative reasoning and seriation skills (i.e., the concept of order) for their later mathematical achievement in third grade. In particular, we administered an initial test of relational quantitative reasoning that does not require knowledge of Arabic numerals, and we assessed mathematical competencies in third grade while controlling for children’s domain-general reasoning abilities measured at the same time.
Domain-Specific Sources of Mathematical Achievement
Researchers have amassed evidence that the human brain is hardwired for processing quantitative information. For example, preferential looking studies indicate that infants only a few months old can discriminate between sets with different numbers of elements (Feigenson, Dehaene, & Spelke, 2004; Wynn, 1992; Xu & Spelke, 2000). This domain-specific ability is not only part of the universal roots of mathematical achievement, but also seems to be an early source of individual differences: 6-month-old infants’ ability to discriminate numerical changes predicts their numerical discrimination abilities three months later (Libertus & Brannon, 2010). In the following years, children build on these foundations to develop their mathematical proficiency in informal learning opportunities.
In several longitudinal studies starting at kindergarten age, researchers have identified particular numerical precursors of later mathematical achievement. These precursors comprise quantity-to-number-word linkage (Krajewski & Schneider, 2009a, 2009b), numeracy-specific attention (Hannula, Lepola, & Lehtinen, 2010), counting ability (Aunola, Leskinen, & Nurmi, 2006), and numerical magnitude comparison (De Smedt, Verschaffel, & Ghesquière, 2009). Children who enter school unable to count sets of different sizes, to distinguish correct from incorrect counting, or to make numerical magnitude judgments run the risk of being disadvantaged in mathematics from the very beginning of their school career (e.g., Geary, 2011; Jordan, Kaplan, Ramineni, & Locuniak, 2009; Siegler, 2009).
Siegler and Lortie-Forgues (2014) describe numerical development as “a process of broadening the set of numbers whose magnitudes, individually or in arithmetic combination, can be accurately represented” (p. 144). The importance of this process is reflected in the majority of studies on early precursors of mathematical achievement. Accordingly, researchers have focused on the precision with which children can differentiate symbolic and nonsymbolic quantities, for instance when comparing two numbers or sets of objects. De Smedt et al. (2009), for example, showed that the impact of acuity in magnitude representations was weaker for nonsymbolic than for symbolic tasks (i.e., tasks that require knowledge of Arabic digits). Their finding thus highlights the importance of learning number words and Arabic numerals for promoting numerical development and early mathematical competencies.
Although the strong impact of symbolic magnitude representation on later mathematical achievement is uncontested, it is not the only domain-specific precursor. Another important source of mathematical achievement is the spontaneous representation of order. In preferential looking tasks, children as young as 11 months (but not as young as 9 months) can discriminate between an increasing and a decreasing row of quantities independent of the magnitude of the sets (Brannon, 2002). Thus, the concept of order, which is independent of the size of concrete sets, begins to emerge at the end of the first year of life.
The concept of order lays the foundation for relational quantitative reasoning, or seriation as it is labeled in the Piagetian tradition. Seriation skills help to integrate features across quantities and thereby allow inferences to be drawn from this integration. In this sense, 6, for instance, is not just understood as a term for naming a set of 6 items, but is also represented as the predecessor of 7, the successor of 5, or the double of 3. Researchers typically assess this concept with so-called nonsymbolic seriation tasks: children have to rearrange a disordered collection of sticks differing in length to construct an ordered series. While even 2- to 3-year-olds master this task when presented with up to three sticks, many 5-year-olds resort to trial and error when presented with six sticks (Aunio & Räsänen, 2015; Piaget, 1952). In a modified version of this task, children are presented with an ordered series of sticks and have to insert an additional stick. The simultaneous consideration of the descending and the ascending order that is necessary to solve the task overtaxes most 5-year-olds. Aunio and Niemivirta (2010) showed that seriation and counting abilities in 4-year-olds are equally predictive of mathematical achievement in elementary school. Individual differences in seriation task performance found at preschool age remained quite stable over the following years. Thus, the vast majority of poorly performing preschoolers were not able to compensate for their deficits after entering school: the level reached by these children at age 8 was equivalent to the level reached by the best-performing group at age 4. The importance of early seriation skills for later mathematical achievement was also demonstrated for children from Belgium, the Netherlands, and Germany (Van de Rijt et al., 2003). Furthermore, the ability to represent relative order remains an important indicator of mathematics achievement throughout development.
Cross-sectional studies indicate that the specific predictive value of understanding the order of numerical symbols for mathematical performance increases from grade 1 to 6 (Lyons, Price, Vaessen, Blomert, & Ansari, 2014), and that university students who were best in processing ordinal information also performed best in mental arithmetic tasks (Lyons & Beilock, 2011). Given these findings, when looking for early precursors of mathematical achievement, it seems worthwhile to go beyond counting and symbolic magnitude representation by also focusing on the concept of order.
The Impact of Domain-General Reasoning Abilities
Besides domain-specific abilities, domain-general reasoning abilities also contribute to mathematics learning (Nunes, Bryant, Barros, & Sylva, 2012). There is sound evidence for a strong relationship between school performance in mathematics and domain-general reasoning abilities as measured with, for example, intelligence tests. This relationship may partly result from the design of intelligence tests, which often contain subtests requiring numerical reasoning, arithmetic operations, and mathematical word-problem solving. However, the relationship cannot be attributed solely to this design aspect, because spatial and verbal subtests also show substantial correlations with mathematics. For example, Deary et al. (2007) showed that general intelligence accounted for 59% of the achievement variance in mathematics, which was higher than for any other school subject.
Apparently, an individual’s domain-general reasoning abilities guide the acquisition of knowledge in terms of quality and speed, which will subsequently impact further learning. Nevertheless, several studies with elementary (e.g., Staub & Stern, 2002) or secondary school students (e.g., Stern, 2009; Watts et al., 2014) showed that the unique variance explained by intelligence shrank markedly or even disappeared when measures of prior mathematical knowledge were included. Thus, it seems likely that controlling for mathematical knowledge in kindergarten children would decrease the predictive value of domain-general reasoning abilities like intelligence.
One might object that most of the studies have controlled for domain-general reasoning abilities in kindergarten children (e.g., Nunes et al., 2012). It is, however, worth mentioning that the psychometric qualities of these measures are relatively weak at this age level. Although significant correlations exist between measures of domain-general cognitive abilities in infancy or early childhood and later measures of domain-general reasoning abilities (Rose, Feldman, Jankowski, & Van Rossem, 2012), they are substantially lower than the long-term stability of intelligence measures later in development. For instance, in the German longitudinal study LOGIK, the correlation between the verbal intelligence measures obtained at age 5 (before entering school) and at age 7 was r = .51, while the correlation between ages 7 and 9 was r = .81 (Schneider & Bullock, 2009; Schneider, Niklas, & Schmiedeler, 2014). Other studies also confirmed that the psychometric properties of intelligence tests for kindergarten children are weaker than those of tests for older children and adults (Flanagan & Harrison, 2012).
Consequently, the unique predictive validity of early numerical precursors for later mathematical achievement could have been overestimated. Most studies control for domain-general reasoning ability with instruments such as intelligence tests, whose relatively low stability and reliability in assessing young children’s performance may be problematic. As a consequence, the impact of domain-general reasoning abilities on mathematics achievement may have been underestimated in studies that assessed these abilities in preschool or kindergarten. Furthermore, in addition to domain-general reasoning abilities, several researchers have shown that reading comprehension explains variance in mathematical achievement because it is related to the conceptual understanding and application of mathematics (e.g., Grimm, 2008; Krajewski & Schneider, 2009b). Consequently, a study is needed that controls for domain-general reasoning abilities and reading comprehension later in development (e.g., in elementary school) by administering these tests concurrently with the outcome measures.
Open Questions and Goals of the Current Study
Despite the large number of studies on early precursors of mathematical achievement, research has been biased in three ways. First, a major focus has been on children at risk and on deficiencies in the lower performance range (Siegler, 2009). In contrast, relatively little is known about individual differences and their long-term stability in normally developing children. Second, while tests of symbolic magnitude representation and counting are widely used, measures of seriation skills have been neither considered nor advanced to the same degree. Third, the impact of domain-general reasoning abilities on differences in mathematics achievement might have been underestimated, given that measures of these abilities suffer from low reliability if administered too early in development.
In a longitudinal study, we focused on assessing the predictive value of the concept of order, that is, relational reasoning about quantities and seriation skills, in kindergarten children growing up under normal conditions (i.e., no special focus on children at risk). Will relational reasoning about quantities still explain unique variance in third grade mathematical achievement after simultaneously controlling for intelligence and reading ability? If this is the case, it would strengthen the domain-specific view on learning mathematics.
Addressing this research question requires an assessment of relational reasoning about quantities that captures individual differences among normally developing 5- to 6-year-olds. We decided not to include number symbols like Arabic digits, because they were not systematically taught in the kindergartens from which we recruited participants (even though most German 6-year-olds can be expected to know the Arabic digits from 1 to 6; Knudsen, Fischer, Henning, & Aschersleben, 2015). The Quantity Sequence Test (QST; German: Mengenfolgen-Test) developed by Guthke (1983) meets this objective. We planned to compare the predictive value of the QST with the impact of intelligence and reading ability as measured in third grade of elementary school concurrently with mathematical achievement. If QST performance explained unique variance in third graders’ mathematics performance when intelligence is controlled for, this would provide less biased evidence for the importance of unique domain-specific roots of mathematical achievement.
Material and Methods
Participants
Fifty-one children (21 girls) from kindergartens located in a metropolitan middle-class area participated. Children were initially tested in kindergarten a few months before entering formal education (M_age = 6.5 years, SD = 3.9 months). German kindergarten programs did not contain systematic instruction in mathematics at the time of data collection. When retested 2.5 years later, participants were in third grade of elementary school (M_age = 8.9 years).
Measures
Relational Reasoning About Quantities Assessment in Kindergarten
To assess kindergarten children’s relational reasoning about quantities, we administered the Quantity Sequence Test (QST; Guthke, 1983). This standardized test was revalidated by Gühne (2003), who reported a Cronbach’s alpha of .82. The QST was developed to assess learning potential in early mathematics, but it has rarely been used in research. It is, however, recommended as a test for assessing dyscalculia in elementary school children in Germany (Warnke, 2008). In every trial, the child is shown a sequence of three cards depicting increasing or decreasing numbers of bears and has to continue the sequence by selecting three more cards, one at a time, from a pool of six cards. After each selection, the child receives feedback on whether he or she chose the correct card. If the child chose the correct card, he or she is encouraged to select the next one. If the child makes a wrong selection, the experimenter turns the card over (i.e., takes it out of the pool) and asks the child to make another selection. This procedure continues until the child selects the correct card.
However, a trial is scored as correct only when the child selects all three correct cards on the first try. For example, Trial 1 (see Figure 1A) requires children to complete the sequence of 2, 3, and 4 bears (positions A, B, and C, respectively) with three more quantities (to be placed sequentially on positions D, E, and F). The instructor prompts the child: “There are three cards missing [pointing to positions D, E, and F]. You have to choose the right cards out of these cards [pointing to the pool of cards to choose from for positions D-F]. You will find the correct card when you look at these three cards [pointing to the cards on positions A, B, and C], count the bears that are depicted on each card, and compare the numbers.” (The instruction is translated from German.) Thus, to complete Trial 1 correctly, children had to choose the card with 5 bears, then the one with 6, and finally the one with 7. Trial 4 requires children to count backwards and to leave gaps, with a blank card symbolizing the number 0 (see Figure 1B). Children see the sequence of cards representing 0, 2, and 0 bears and choose from six cards representing 0, 2, 3, 4, 5, and 10 bears. The correct response is the successive choice of the cards representing 3, 0, and 4 bears. Altogether, there are 10 trials (including a practice trial) of increasing difficulty (see Table 1).
Table 1
Trial | Positions Presented (A, B, C) | Positions to be Completed (D, E, F) | Cards to Choose From for Positions D-F
Practice | 1, 2, 3 | 4, 5, 6 | 3, 6, 0, 4, 9, 5
1 | 2, 3, 4 | 5, 6, 7 | 8, 7, 9, 5, 6, 10
2 | 1, 0, 2 | 0, 3, 0 | 0, 4, 9, 0, 3, 1
3 | 0, 1, 0 | 2, 0, 3 | 3, 0, 9, 8, 4, 2
4 | 0, 2, 0 | 3, 0, 4 | 5, 4, 3, 0, 2, 10
5 | 4, 0, 3 | 0, 2, 0 | 5, 2, 0, 10, 0, 6
6 | 6, 5, 4 | 3, 2, 1 | 1, 6, 3, 0, 2, 8
7 | 10, 9, 8 | 7, 6, 5 | 7, 4, 2, 6, 1, 5
8 | 0, 2, 4 | 6, 8, 10 | 7, 10, 6, 2, 0, 8
9 | 10, 8, 6 | 4, 2, 0 | 3, 7, 8, 4, 0, 2
Note. Numbers in the cells for the positions represent numbers of bears shown on the test cards.
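The scoring rule described above (a trial counts only if all three cards are chosen correctly on the first try) can be illustrated with a minimal Python sketch. The target sequences are taken from Table 1. Everything else is a hypothetical illustration rather than part of the published test materials: the function names, the protocol format (for each position, the list of cards a child picked in order), and the assumption that the practice trial is not scored.

```python
# Correct continuations (positions D, E, F) for the nine test trials in Table 1.
# Assumption: the practice trial is excluded from scoring.
CORRECT = {
    1: (5, 6, 7),
    2: (0, 3, 0),
    3: (2, 0, 3),
    4: (3, 0, 4),
    5: (0, 2, 0),
    6: (3, 2, 1),
    7: (7, 6, 5),
    8: (6, 8, 10),
    9: (4, 2, 0),
}


def trial_scored(correct, picks_per_position):
    """Return True only if the first pick at each of D, E, F was the correct card.

    picks_per_position holds three lists (hypothetical format); each list
    contains the cards chosen for one position, in the order they were picked.
    """
    return all(
        picks[0] == target
        for picks, target in zip(picks_per_position, correct)
    )


def qst_score(protocol):
    """Sum of trials solved entirely on the first try."""
    return sum(
        trial_scored(CORRECT[trial], picks) for trial, picks in protocol.items()
    )


# Invented example protocol for three trials of one child.
example_protocol = {
    1: [[5], [6], [7]],     # all first picks correct: scored
    2: [[0], [4, 3], [0]],  # wrong first pick at position E: not scored
    4: [[3], [0], [4]],     # all first picks correct: scored
}
example_score = qst_score(example_protocol)
print(example_score)
```

In this sketch a corrected second pick lets the child continue the trial, as in the procedure described above, but it does not earn the point.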
Mathematics Performance Assessment in Third Grade
We measured mathematical achievement in third grade with a 36-item test (henceforth referred to as MAT). Items belong to two categories: (1) word problems (16 items), and (2) arithmetic multiplication, addition, and subtraction problems (20 items). We scored each item with one point if answered correctly, and with zero if answered incorrectly.
We used word problems from a set of empirically validated word problems for elementary school students developed for the Munich Longitudinal Study (Stern, 1999, 2009). These problems require children to use all arithmetic operations they have learned up to third grade: addition, subtraction, multiplication, and division. The set contains different types of problems, including so-called exchange and comparison problems (see Stern, 1993; Stern & Lehrndorfer, 1992, for more details). In exchange problems, a certain number of objects changes hands. For example: “Bernd has got some balloons. Werner gave him three more balloons. Now Bernd has 7 balloons. How many balloons did Bernd initially have?” In comparison problems, students have to consider the difference between two quantities. These problems are the most difficult ones for students because, in addition to the concept of cardinal numbers, they require an understanding of relational numbers (Stern & Mevarech, 1996). For example: “Annette keeps 15 rabbits in two kinds of stables. There are 3 more rabbits in the big stable than in the little stable. How many rabbits are in the little stable?” The arithmetic multiplication, addition, and subtraction problems were taken from a standardized mathematics achievement test (DEMAT 3+; Roick, Gölitz, & Hasselhorn, 2004). The problems presented two-digit numbers up to 99 which had to be added, subtracted, or multiplied. The children had to solve the problems within a time limit of 45 minutes.
Reading Comprehension Assessment in Third Grade
We assessed reading comprehension (RC) with a standardized test, the Salzburger Lese-Screening (Mayringer & Wimmer, 2003). In this test, children have to read a series of easy sentences as quickly as possible and judge whether each sentence is sensible (e.g., “Apples grow on doors.” or “A zebra has black and white stripes.”).
General Reasoning Ability Assessment in Third Grade
We used the Culture-Fair Test for Intelligence (CFT 20; Weiss, 1998) to measure general reasoning abilities (i.e., intelligence). In this test, children have to complete sequences of patterns by extracting the relation among them.
Procedure
At the first measurement point, we administered only the QST. A trained experimenter tested the children individually in a quiet room of their kindergarten. The mean testing time was 19 minutes (SD = 8 minutes). At the second measurement point 2.5 years later, we tested the children in small groups of four to six at a university laboratory. In this session, we assessed mathematical achievement (MAT, approximately 30 minutes), reading comprehension (RC, 3 minutes), and, after a short 5-minute break, intellectual ability (CFT, 20 minutes). We presented all tests following the instructions in the respective manuals.
Results
Table 2 compiles the descriptive statistics for the raw scores of the different tests. The MAT comprised both (1) word problems and (2) arithmetic multiplication, addition, and subtraction problems. However, the internal consistency of all 36 MAT items was high (Cronbach’s alpha = .90, 95% CI [.85, .93]), and we therefore use only the sum score of all items in further analyses.
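For readers less familiar with the internal-consistency coefficient reported above, the following minimal Python sketch shows how Cronbach’s alpha is computed from a children × items matrix of 0/1 item scores, using the standard formula alpha = k / (k − 1) · (1 − sum of item variances / variance of sum scores). The toy matrix is invented purely for illustration and is not the study’s data; the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per child, one entry per item).

    Uses sample variances throughout; the sketch does not guard against
    the degenerate case in which all children have the same sum score.
    """
    k = len(scores[0])  # number of items
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented toy data: 5 children, 4 items scored 1 (correct) or 0 (incorrect).
toy_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
toy_alpha = cronbach_alpha(toy_scores)
print(round(toy_alpha, 2))
```

A high alpha indicates that the items covary strongly, which is the rationale given above for collapsing the two item categories into a single sum score.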
First, we computed Pearson correlation coefficients between the test scores (see Table 3). We found the expected statistically significant correlations of QST with MAT, RC, and CFT. Furthermore, MAT correlated significantly with RC and CFT, but there was no statistically significant relation between RC and CFT. Thus, RC and CFT were not reliably related to each other, but both were related to QST and MAT.
To examine whether QST performance in kindergarten explains unique variance in third grade mathematics achievement, we computed a hierarchical regression analysis (see Table 4). In Step 1, we entered CFT and RC into the model to predict MAT. Both predictors were statistically significant (both ps < .01). In Step 2, we entered QST performance as an additional predictor. Importantly, this predictor was statistically significant (p = .03). Furthermore, the β weights decreased from Step 1 to Step 2, from .40 to .24 for CFT and from .36 to .25 for RC. The model fit increased from Step 1 (total R² = .305, 95% CI [.11, .50]) to Step 2 (total R² = .370, 95% CI [.18, .56]). Thus, in Step 2, at least 18% of the variance in third grade math performance (i.e., the lower bound of the confidence interval) could plausibly be explained by the regression model. The QST explained an additional 6.5% of the variance in MAT performance (ΔR² = .065).
Table 4
Step | Variable | Unstandardized B [95% CI] | β | t | p | ΔR² | Total R² [95% CI]
1 | | | | | | | .305 [.11, .50]
 | CFT | .017 [.007, .027] | .401 | 3.329 | .002 | |
 | RC | .008 [.003, .013] | .362 | 3.006 | .004 | |
2 | | | | | | | .370 [.18, .56]
 | CFT | .010 [.001, .021] | .241 | 1.762 | .085 | |
 | RC | .005 [.000, .011] | .251 | 1.984 | .053 | |
 | QST | .037 [.003, .070] | .324 | 2.211 | .032 | .065 |
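As a rough consistency check on the hierarchical regression, the change in explained variance and the corresponding F statistic can be recomputed from the rounded R² values in Table 4 with the standard formula F = (ΔR² / number of added predictors) / ((1 − R² of the full model) / residual degrees of freedom). The sketch below assumes that all 51 children entered the regression, which the text does not state explicitly; the function name is hypothetical.

```python
import math


def r2_change_f(r2_reduced, r2_full, n, k_full, k_added):
    """F test for the R-squared increase when k_added predictors are added.

    n is the sample size and k_full the number of predictors in the full model.
    """
    df_residual = n - k_full - 1
    delta_r2 = r2_full - r2_reduced
    f_change = (delta_r2 / k_added) / ((1 - r2_full) / df_residual)
    return delta_r2, f_change, df_residual


# Rounded values from Table 4; n = 51 is an assumption (no attrition reported).
delta_r2, f_change, df_residual = r2_change_f(
    r2_reduced=0.305, r2_full=0.370, n=51, k_full=3, k_added=1
)
print(round(delta_r2, 3))             # increase in explained variance
print(round(f_change, 2))             # F for the R-squared change
print(round(math.sqrt(f_change), 2))  # comparable to |t| of the added predictor
```

Under these assumptions the F change is about 4.85 on 1 and 47 degrees of freedom, and its square root (about 2.20) is close to the t value of 2.211 reported for the QST; the small discrepancy is what one would expect from recomputing with rounded R² values.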
Discussion
In the introduction, we emphasized that previous research on domain-specific numerical precursors of later mathematical achievement is limited in three ways. First, the majority of studies have focused on children at risk and on deficiencies in the lower performance range. Second, measures of relational quantitative reasoning (i.e., the concept of order and seriation skills) have not been considered in the same way as tests of symbolic magnitude representation and counting. Third, domain-general reasoning abilities have often been assessed together with the precursors in longitudinal studies. These early measures suffer from low reliability, which might lead to an underestimation of the impact of domain-general abilities. In a longitudinal study with normally developing children, we examined the relation between relational quantitative reasoning assessed in kindergarten and mathematical performance in the third grade of elementary school.
Our results provide evidence that relational quantitative reasoning in kindergarten (as assessed with the Quantity Sequence Test, QST; Guthke, 1983) predicts mathematical performance in elementary school. Thus, our results complement existing findings from cross-sectional studies on the relevance of relational quantitative reasoning and the concept of order for mathematics performance (Lyons et al., 2014). In our regression analysis, we controlled for two competencies which have been shown to be strongly related to elementary school children’s mathematical performance: general reasoning ability and reading skills (e.g., Grimm, 2008; Krajewski & Schneider, 2009b). We assessed these competencies concurrently with mathematical performance in third grade. The predictive relation between QST performance in kindergarten and third graders’ mathematical performance remained significant, albeit small, given the confidence interval for this predictor. Thus, children’s understanding of the concept of order can be considered a domain-specific precursor of later mathematical ability.
In contrast to the widely used number comparison tasks, which address only the question “What is more [or less]?”, the QST used in the present research requires a deeper understanding of ordinality, namely, understanding the sequence of quantities, which refers to the question “What comes before or after?” Lyons and Beilock (2011) proposed that representing ordinal associations is an important intermediate step in the development from a basic approximate number system to complex mental arithmetic. In an experimental study with university students, they showed that speed and accuracy in number ordering (i.e., deciding whether three Arabic numerals are in increasing order) fully mediated the correlation between approximate number acuity and mental arithmetic. Correspondingly, our finding indicates that the ability to understand sequences of quantities predicts math achievement in elementary school. Another important characteristic of the QST lies in its nonsymbolic nature (i.e., no use of Arabic digits). As outlined above, there is a current debate (e.g., De Smedt & Gilmore, 2011; Rousselle & Noël, 2007; Sasanguie, Van den Bussche, & Reynvoet, 2012) on whether the relation between performance in the number comparison task and later mathematical achievement is due to the accuracy of magnitude representation per se or to the processing of number symbols (i.e., the mapping of symbols onto magnitudes). Our findings contribute to this discussion by showing that kindergarten children’s understanding of nonsymbolic quantity relations predicts mathematical skills in elementary school.
One could object that a more direct test of the role of symbol processing in understanding the concept of order would require the inclusion of a symbolic version of the QST. However, as there was no compulsory kindergarten curriculum in Germany at the time of data collection, not all children could be expected to be familiar with digits (i.e., Arabic numerals) required in a symbolic version. Nevertheless, future studies should include several other measures of mathematical precursor skills to allow assessment of the unique contributions of the different skills. Our main aim was to show that early relational quantitative reasoning does indeed predict later mathematical achievement.
Another criticism might be that we did not control for social background or general reasoning abilities in kindergarten, which might account for differences in QST performance. It is beyond doubt that a low-stimulating home environment affects mathematical development and that children from low-income families fall short of their potential (e.g., Siegler, 2009). However, we have no reason to assume strong effects of social background in our study, because all children were recruited from kindergartens in areas mainly populated by middle-class families, where educational toys that boost quantitative development (e.g., board games) are readily available. Therefore, variance in QST performance in our sample was presumably not systematically affected by social background.
Unlike several other studies, we administered the test of general reasoning abilities together with the outcome variable, not together with the predictor. As the reliability and stability of intelligence measures have been shown to increase markedly after children enter school, our kindergarten measure of domain-specific quantitative abilities faced a stronger competitor than in many other studies. Even though domain-general reasoning abilities were measured together with the outcome variable, their predictive value became smaller when we included quantitative relational reasoning as a domain-specific predictor in the regression model. Although our study can thus be considered a rigorous test of the impact of a domain-specific precursor, an additional measure of general reasoning abilities in kindergarten would have strengthened its explanatory power. Incorporating such a measure would have enabled us to test empirically, within our own study, the claim that the stability of such measures increases with age. Nevertheless, the unique proportion of variance in third grade mathematical achievement explained by informally acquired seriation skills indicates that relational quantitative reasoning is a sound domain-specific source of mathematical competencies.
The present findings raise questions for future research concerning early education: To what extent can pedagogical interventions in kindergarten improve a child’s understanding of the concept of order? Would such an improvement facilitate children’s acquisition of mathematical concepts later in school? How should interventions be designed in order to provide effective learning? What are the educational implications and effects of relational reasoning about quantities compared to those of other domain-specific precursors of mathematical skills (e.g., counting ability, quantity-to-number-word linkage, and numeracy-specific attention)? By demonstrating that early relational quantitative reasoning is an important domain-specific precursor of later elementary school mathematics achievement, our study complements other findings on precursor skills of later mathematical achievement (e.g., Aunola et al., 2006; De Smedt et al., 2009; Hannula et al., 2010; Krajewski & Schneider, 2009a, 2009b; Watts et al., 2014). In addition, it contributes to our understanding of the developmental trajectory of mathematical achievement, and might provide a starting point for designing early education programs.