Researchers have used a variety of measures to assess children’s verbal-based cardinal-number knowledge, that is, understanding that a number word represents a specific number of items. Such measures include the how-many task (stating a set’s total in response to the question, “How many?”) and the give-n task (creating a set from a larger set to comply with a verbal request such as “Give me three chips”). Children typically develop both competencies for small numbers (i.e., sets of fewer than four) before they can use a counting procedure to either label a set’s total or create a specified set. Small-n recognition (children’s initial means of stating a set’s total number of items) likely involves subitizing, originally defined by Kaufman et al. (1949) as immediately recognizing the total number of items in a set and associating it with an appropriate number word. Small-n creation (children’s initial means of creating a collection of a verbally specified size) likely also entails using subitizing to identify when a requested number of items has been put out. There is considerable, but not unanimous, agreement that this first subitizing-based phase of cardinality development provides a foundation for the second counting-based phase (Baroody et al., 2006, 2017; Benoit et al., 2004; Carey & Barner, 2019; Fischer, 1992; Klahr & Wallace, 1976; von Glasersfeld, 1982; but cf. Cordes & Gelman, 2005; Gallistel & Gelman, 2000; Nieder, 2017). What is unclear is whether small-n recognition and creation emerge simultaneously or in succession during the pre-counting phase. A secondary analysis of preschool pre-counting data (Mix et al., 2012) provided an opportunity to directly test this question, which has significant theoretical and methodological implications.
Theoretical Background
On many accounts, small-n recognition and small-n creation are assumed to emerge simultaneously. In their seminal research on number development, Schaeffer et al. (1974) found that children who could subitize collections of two and perhaps as many as four could, for example, also create sets of two or three objects upon request without counting. They concluded that small-n recognition allows children to create sets of 2 or 3 but did not specify whether the latter emerges simultaneously with or later than the former. In effect, Schaeffer et al. did not clearly specify whether small-n recognition was a necessary and sufficient condition or a necessary condition for small-n creation.
More recently, theorists have argued that small-n recognition and creation unfold simultaneously in a stepwise manner based on the order of magnitude—commonly called n-knower levels (Condry & Spelke, 2008; Le Corre & Carey, 2007, 2008; Le Corre et al., 2006; Sarnecka & Carey, 2008; Wynn, 1990, 1992). Specifically, small-n recognition entails constructing an exact cardinal representation of small numbers by associating a category of sets with number words (e.g., any pair of items can be labeled “two”). Initially, such a representation may be inexact, and subitizing a total number may be unreliable (e.g., “two” may simply be understood as “many” or “not ‘one’” and overapplied). However, as the limits of a number-word category are constructed, subitizing becomes a reliable tool for labeling the total of a small set. Although reliable subitizing of “two” may develop simultaneously with that of “one” (Palmer & Baroody, 2011) or even before that of “one” (Beilin, 1975; Durkin et al., 1986; Mix, 2009; Wagner & Walters, 1982), small-n recognition is generally thought to occur first for “one,” then for “two,” next for “three,” and later for somewhat larger numbers. As subitizing skill expands, it permits creating successively larger sets. A “2-knower,” for example, is a child who can reliably recognize or create sets of one and two but not three. (A “subset knower” is a child who is a 1-, 2-, or 3-knower.) On this view, there is no reason to predict that—within each set size or knower-level—performance should differ on small-n recognition and creation tasks.1
Evidence indicating concordant development of small-n recognition and creation has been taken as support of the discontinuity hypothesis over the continuity hypothesis (Sarnecka & Carey, 2008; Wynn, 1990, 1992)—that an understanding of verbal-based cardinality unfolds in a series of conceptual steps rather than building on an innate understanding of cardinal numbers and counting. As Le Corre et al. (2006, p. 148) noted:
“Such consistency supports the discontinuity hypothesis; it suggests that ‘one’-knowers truly only know the numerical meaning of ‘one,’ ‘two’-knowers truly only know the numerical meaning of ‘one’ and ‘two,’ and so on, across tasks with distinct processing demands … Finding that children’s [give-n] knower-level is the same as their [how-many] knower-level … would provide strong evidence in favor of the discontinuity hypothesis.”
Supporters of the continuity hypothesis point to evidence that the give-n task is relatively difficult even when it involves small numbers. For instance, Cordes and Gelman (2005, p. 129) noted that a “child has to create a set of objects, one by one, until she has created a set whose numerical value corresponds to one memory” and that the “combined competence requirements exceed those of a beginning language user.”
A Different Version of the Discontinuity Hypothesis
Addressed first is an argument for a possible succession of small-n skills in the preverbal phase and then some of the methodological implications of this view.
Theoretical Argument
Whether the how-many and give-n tasks yield concordant results is not a critical test of the discontinuity (or continuity) hypothesis, and equating non-concordant results with support for the continuity hypothesis and concordant results with support for the discontinuity hypothesis is a false dichotomy. Cordes and Gelman (2005) may have overstated the difficulty of successfully creating a small set, because they assumed subitizing is not a real phenomenon and that children must count out a requested number of items. They were correct, however, that proponents of the discontinuity hypothesis overlook how the give-n task may be more challenging than the how-many task even with small numbers. In the present study, we tested a version of the discontinuity hypothesis that subitizing-based small-n recognition and creation may develop in a non-concordant fashion.
Specifically, although small-n recognition and creation may have a common conceptual basis (verbal-based cardinal concepts of these numbers), performance and conceptual differences in task demands may cause successful small-n creation to emerge later than small-n recognition (if only briefly)—and the latter difference may justify viewing small-n recognition as a necessary (but not a sufficient) condition for small-n creation. These differential task demands include:
- Small-n creation (the give-n task) encompasses performance factors not required by number recognition (e.g., the how-many task). Specifically, successively putting out items requires a child to register the requested number in working memory, put out items one at a time, subitize the amount put out, and mentally compare the results of subitizing with the remembered number, a process that needs to be repeated for “give 2” and again for “give 3.” Although putting out the requested number of items simultaneously places fewer demands on working memory, it does require a child to isolate the subset within the larger set and carefully remove only the subset.
- Conceptually, Benoit et al. (2013) made the theoretical distinction between mapping from a set to a number word (set-to-word mapping) and the reverse (word-to-set mapping).2 Small-n recognition entails set-to-word mapping: starting with a specific example of a number and relating it to a number word and its associated general concept (e.g., relating ■■ or ✌ to the number word “two” and a cardinal concept of two: any pair of like items, any duality, or even more exactly as one more than one). In contrast, small-n creation requires a word-to-set mapping: starting with a number word and its associated general concept and creating a specific example of the number. These differences in mapping may mean that reliable small-n recognition and creation have different conceptual demands. For example, the how-many task may entail understanding that a set of items can be viewed as a total (whole) as well as individual elements (parts), whereas the give-n task requires applying this knowledge, that is, understanding that a set of the specified total needs to be created. Indeed, reliably subitizing small numbers (e.g., accurately, consistently, and selectively labeling sets of 3 as “three”), a set-to-word mapping, seems necessary for creating a requested number of items via subitizing, a word-to-set mapping.
Based on these considerations and others detailed in Baroody et al. (2017) and Baroody and Lai (2022), we propose an alternative course of verbal cardinality development summarized in Table 1. This model diverges from the conventional wisdom regarding Phase 1 in that it posits successive development of small-n recognition and creation—a proposition that is tested by the present analysis. For evidence supporting the sequential development of analogous subphases for Phase 2, see Baroody et al. (2022) and Baroody and Lai (2022).
Table 1
Possible Phases of Verbal-Based Cardinality Development, Their Conceptual Basis, Type of Mapping, and Measures (Baroody & Lai, 2022; Baroody et al., 2017)
Aspect of Cardinal Number | Conceptual Basis | Mapping | Direct Measure |
---|---|---|---|
First (Pre-Counting) Phase of Cardinality Development: Before Meaningful 1-1 Counting (i.e., before understanding of the CP) | | | |
Subphase 1.1. Small-n recognition: subitizing-based number recognition (commonly called n-knower levels) | An exact cardinal representation of a small number underlies the reliable ability to subitize (immediately recognize) 1, 2, or 3 | Set-to-Word (via subitizing) | How-many task without counting |
Subphase 1.2. Small-n creation: subitizing-based set creation (also commonly called n-knower levels) | An exact cardinal representation of small numbers can be used to subitize when 1, 2, or 3 have been put out (i.e., to reliably stop the set-creation process) | Word-to-Set (via subitizing) | Give-n task without counting |
Second (Counting) Phase of Cardinality Development: Meaningful 1-1 Counting (i.e., with understanding of the CP) | | | |
Subphase 2.1. Counting-based number identification (commonly called CP-knower level) | Cardinality Principle (CP) or what Fuson (1988) calls the count-cardinal concept: the last number word used in counting a collection represents its total number^a | Set-to-Word (via counting) | Counting-based how-many task |
Subphase 2.2. Counting-based number creation (also commonly called CP-knower level) | What Fuson (1988) calls the cardinal-count concept: a cardinal number would be the last number word if a collection is counted | Word-to-Set (via counting) | Counting-based give-n task |
Note. Unlike stages in which a successive stage replaces a prior stage, the second phase of verbal cardinality development does not replace the first but (greatly) supplements it. CP indicates the cardinality principle.
^a Fuson (1988) noted that prior to the CP, some children learn to respond to how-many questions with a last-word rule (repeating the last count word by rote, without realizing it represents the total number of items).
Methodological Considerations
According to the model outlined in Table 1, although direct measures of Phase-2 competencies would involve counting, direct measures of Phase-1 competencies would involve subitizing, not counting. For example, for their pioneering research, Schaeffer et al. (1974) assessed the Phase-1 competence of small-n recognition with a how-many task that did not involve counting. Participants were shown a pictured array of one to four men and asked, “How many men are there?” If a participant counted, an experimenter requested the child not count or point but simply tell how many men there were. Researchers now often avoid using a how-many task with counting to assess Phase-2 knowledge, because (transitional) children may use the last-word rule learned by rote to be successful, thus overestimating knowledge of the cardinality principle (CP) or achievement of Subphase 2.1 in Table 1 (Sarnecka & Carey, 2008). However, the same concern does not apply to assessing Phase-1 small-n recognition with the how-many task via subitizing (without counting or need to apply the CP).
Some versions of the give-n task do not involve counting but some do, and such variations yield different results (Sella et al., 2021). Krajcsi (2021) found that prompting counting on the give-n task can minimize performance errors when assessing Phase-2 knowledge. Specifically, among CP knowers, this version of the give-n task indeed resulted in more success than not prompting counting. However, prompting counting on the give-n task when assessing Phase-1 pre-counting small-n creation skill may not be helpful and may even be counterproductive. Asking Phase-1 children to use developmentally more advanced Phase-2 concepts and skills to check small-n creation efforts is likely to be incomprehensible or even confusing, be interpreted as challenging a child’s initial response, and promote disengagement from the task—all of which may lead to underestimating competence of small-n creation ability.
Although the previously stated conjecture needs systematic examination, two recent findings are consistent with this proposition. Krajcsi (2021) concluded that a counting follow-up did not benefit subset-knowers (Phase-1 children). Marchand et al. (2022) used two versions of a give-n task with a counting prompt and found that, for both versions, the reliability “of individual knower levels varied considerably, such that non-knowers, 1-knowers, 2-knowers, and CP-knowers exhibited fairly high [reliability], while 3-, 4-, and 5-knowers did not” (p. 12).
Empirical Evidence Regarding Developmental Order
The surprisingly few comparisons of young children’s small-n recognition and creation provide no clear evidence about their developmental relation. Schaeffer et al.’s (1974) data are not presented in sufficient detail to determine whether small-n recognition and creation emerged in tandem or sequentially. More recently, Mou et al. (2021) used how-many and give-n tasks that did not instruct children to count initially or follow up with a request to check via counting. Using latent modeling of 3- and 4-year-olds’ performance on the how-many and give-n tasks with sets of up to eight items, they found that the best-fitting model was a bi-factor model indicating that the two tasks, though related, reflect distinct conceptual knowledge. Moreover, their analyses ruled out general cognitive or linguistic demands as a source of performance differences. Mou et al. concluded that their results are inconsistent with the common assumption that the how-many and give-n tasks gauge interchangeable concepts and are consistent with multiple dimensions of cardinal-number knowledge acquisition.
Neither Schaeffer et al.’s (1974) nor Mou et al.’s (2021) study addressed the discontinuity hypothesis we have proposed, because data were not analyzed separately for each small number. The latter’s analysis also included data for both the how-many and give-n tasks beyond the subitizing range (i.e., sets that required counting to quantify). Moreover, as Mou et al.’s participants included 4-year-olds and had a mean age of 3 years and 11 months, it seems likely that the vast majority exhibited (near) ceiling performance for small-n recognition and creation. Finally, and most importantly, Mou et al.’s “how-many” task did not involve a how-many question, and success was defined as counting a set correctly, despite research indicating that children can accurately count one-to-one before accurately labeling the cardinality of sets (Schaeffer et al., 1974).
Wynn (1990, p. 155) concluded that a comparison of twenty-four 2- and 3-year-olds’ “performance across the ‘how-many’ and ‘give-a-number’ tasks shows strong within-child consistency” regarding when the CP develops. Because she focused on when the CP (Subphase 2.1 in Table 1) emerges, children were asked to count even on small-n trials of both the how-many and give-n tasks. With the latter task, this occurred if a child did not spontaneously count, whether the initial response was correct or not. Wynn’s (1990, 1992) results, then, do not bear on the developmental relation between small-n recognition and creation in Phase 1.
Le Corre et al. (2006) used the “what’s on this card” (WOC) task to assess small-n recognition and Wynn’s (1990, 1992) give-n task with a counting follow-up to gauge small-n creation. They concluded from their analysis using the Wilcoxon Signed-Ranks test:
“The two tasks were highly consistent … While more children had higher knower-levels on WOC than [give-n] (n = 12) than the other way around (n = 5), this was not significant, Z = 0.79, p = 0.4. Thus, there was no evidence that children’s knower-levels were systematically higher on WOC than on give-n” (p. 150).
The non-significant result, though, may simply have been due to a lack of power. As ties (33 of their 50 cases) are not considered in the Wilcoxon test, the actual or redefined n was only 17. A power analysis using G*Power indicated that the probability of correctly rejecting the null hypothesis was either 0.22 (one-tailed) or 0.13 (two-tailed). Conversely, the probability of a Type II error would have been 0.78 and 0.87, respectively. Moreover, if the give-n 0- and 1-knower levels listed in Le Corre et al.’s Table 3 are combined into a single category, only 58% (18 of 31) of the n-knowers produced concordant results. Even differences of one level represent an appreciable difference in the estimation of a child’s conceptual understanding of small numbers.
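The arithmetic behind this reanalysis can be checked with a short script. The following is only an illustrative sketch: it recomputes the effective sample size once ties are dropped, the two-tailed p value implied by the reported Z under a standard normal approximation, and the Type II error rates. The power values are copied from the text rather than recomputed, and all variable names are our own.

```python
import math

# Counts reported for Le Corre et al. (2006), as cited in the text.
total_cases = 50
ties = 33
woc_higher = 12
give_n_higher = 5

# The Wilcoxon signed-ranks test ignores ties, so the effective n shrinks.
effective_n = total_cases - ties

# Two-tailed p value for the reported Z (standard normal approximation).
z = 0.79
p_two_tailed = math.erfc(z / math.sqrt(2))

# Type II error rate is the complement of power (power values copied from the text).
power = {"one-tailed": 0.22, "two-tailed": 0.13}
type_ii = {tail: 1 - value for tail, value in power.items()}

# Concordance among n-knowers after merging the give-n 0- and 1-knower levels.
concordance = 18 / 31

print(effective_n, round(p_two_tailed, 2), round(concordance, 2))
print({tail: round(beta, 2) for tail, beta in type_ii.items()})
```

The effective n equals the number of non-tied cases (12 + 5), which is why the test had so little power to detect a difference between tasks.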
Rationale for the Present Study
A post-hoc analysis of the Mix et al. (2012) intervention study provided the first opportunity to directly address the issue of whether small-n recognition and creation develop simultaneously or successively using tasks that do not involve counting (i.e., confound Phase-1 and Phase-2 competencies) and analyzing the data of each small number separately. Specifically, the analysis compared 3-year-olds’ performance on how-many and give-n tasks that disallowed one-to-one counting and did so separately for sets of one, two, and three.
Testing at one time point may miss the transition from small-n recognition (possible Subphase 1.1 in Table 1) to small-n creation (possible Subphase 1.2 in Table 1), especially if the two subphases develop in rapid succession. Put differently, a one-shot assessment is more likely than multiple assessments to test children before achieving either subphase or after achieving both subphases. For this reason, children were tested three times on each task to check for possible transitions that indicate prior success on small-n recognition (or vice versa).
Method
The intervention study reported by Mix et al. (2012) focused on different methods of modeling the CP and provided no training on either small-n recognition or creation.
Participants
The Mix et al. (2012) study involved 60 participants (M = 3 years; 7 months, SD = 0;3, range = 3;1–4;7) recruited from preschool programs serving predominantly Caucasian middle-class communities in two small cities in Indiana and Michigan. Informed consent was obtained for experimentation with human subjects.
Procedure
Children were tested three times about 3 weeks apart—originally intended as the pretest (Time 1 or T1), immediate posttest (T2), and delayed posttest (T3).
Tasks
For each trial of the how-many task, children were asked to tell how many objects were displayed on a 5- × 8-inch index card. For each of the three time points, there were two cards for each collection size 1 to 3, resulting in a total of six trials per number. Each set consisted of identical, photographic images arranged in a random array. On each trial, the experimenter held up a card and asked, “How many are there?” As 3-year-olds typically do not count unless they can touch the objects with a finger, the cards were held out of a child’s reach to eliminate counting. If a child was close enough to a card to touch it, the tester pulled the card out of reach. No feedback was provided. A response to a how-many trial of two or three was scored as correct if a child correctly indicated the cardinal value of a collection without behaviors consistent with one-to-one counting, namely counting from “one” to the cardinal value (e.g., for three items, counting: “One, two, three”) or successive pointing to each item. It is not possible to distinguish between subitizing and counting with collections of one. Therefore, these trials were scored as correct if a child labeled a single item “one,” whether or not the child pointed.
For the give-n task, children were given a pile of 15 objects and asked to create a set (e.g., “Give me three pigs”). Sets of one, two, and three were each requested twice in a random order that was interspersed with requests for five and six. Participants were not given instructions on counting, because Mix et al. (2012) allowed children to choose a strategy that was appropriate for either small numbers or larger ones: “Although children can produce small sets on demand without understanding [the CP], the ability to produce sets greater than 4 is taken as evidence for [CP] understanding because these larger sets must be counted (i.e., they cannot be subitized) (Wynn, 1990)” (p. 277). As children tend to choose a strategy that ensures success but requires the least effort, the assumption that young children would use subitizing on sets of 1 to 3 on the give-n task seems reasonable. However, even if a child had to rely on counting for success with small numbers, this would work against our hypothesis that successful performance on the how-many task emerges before success on the give-n task.
For purposes of the present re-analysis, performance on only small-n sets was considered. A child was scored as correct on a trial if the number created matched the number requested. Unlike typical n-knower scoring, producing a non-requested number for a trial was not penalized (e.g., producing three for a give-2 trial did not count against a child’s give-3 total score). However, overestimating “knower level” works against the authors’ hypothesis that the ability to recognize three precedes the ability to produce three.
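The scoring rule just described can be sketched in a few lines. This is a hypothetical illustration rather than the scoring script used in the study: the function name and the example trials are invented, and the only rule encoded is the one stated above (a small-n trial is correct when the number produced equals the number requested, with no penalty for producing that number on other trials).

```python
def score_give_n(trials):
    """Return the number correct (0, 1, or 2) for set sizes 1 to 3.

    trials: list of (requested, produced) pairs from one session.
    Requests outside the small-number range (e.g., 5 or 6) are ignored.
    """
    scores = {1: 0, 2: 0, 3: 0}
    for requested, produced in trials:
        if requested in scores and produced == requested:
            scores[requested] += 1
    return scores

# Hypothetical session: each small set requested twice, interspersed with 5 and 6.
session = [(1, 1), (2, 3), (3, 3), (5, 4), (2, 2), (1, 1), (3, 2), (6, 6)]
session_scores = score_give_n(session)
print(session_scores)
```

Note that producing three on the give-2 trial lowers only the give-2 score; the give-3 score is unaffected, as in the lenient scoring described above.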
Analyses
A Kolmogorov-Smirnov test confirmed that the data were not normally distributed. Thus, a non-parametric regression test was used to check whether session (T1, T2, T3), task (how-many vs. give-n), and set size (n = 1, 2, 3 using 2 as the reference set) had a significant impact on the outcome or dependent variable (number correct: 0, 1, or 2). Specifically, a proportional odds model was used because the dependent variable was ordered into three categories. Let $Y$ be an ordinal outcome with $J$ categories; the model can be defined as:
$$\log\frac{P(Y \le j)}{P(Y > j)} = \beta_{j0} + \beta_{j1}x_1 + \cdots + \beta_{jp}x_p$$
where $\beta_{j0}, \beta_{j1}, \ldots, \beta_{jp}$ are model coefficient parameters (i.e., intercepts and slopes), with $p$ predictors for $j = 1, \ldots, J-1$. Intercepts can differ, but slopes are constant across categories due to the proportional odds assumption. Hence, the proportional odds model can be simplified as:
$$\log\frac{P(Y \le j)}{P(Y > j)} = \beta_{j0} + \beta_1x_1 + \cdots + \beta_px_p$$
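To make the shared-slopes structure of a proportional odds model concrete, the sketch below converts one intercept per cut point and a single set of slopes into category probabilities. It assumes the cumulative-logit convention logit P(Y ≤ j) = intercept_j + slope · x; statistical packages differ in the sign they attach to the slopes, so this convention is an assumption. The numeric values are hypothetical and are not the fitted estimates from this study.

```python
import math

def category_probabilities(intercepts, slopes, x):
    """Return P(Y = 0), ..., P(Y = J-1) under a proportional odds model.

    Assumed convention: logit P(Y <= j) = intercepts[j] + sum(slopes[k] * x[k]),
    with one intercept per cut point and one set of slopes shared by all cut points.
    """
    eta = sum(b * v for b, v in zip(slopes, x))
    # Cumulative probabilities P(Y <= j) for each of the J-1 cut points.
    cumulative = [1.0 / (1.0 + math.exp(-(a + eta))) for a in intercepts]
    cumulative.append(1.0)  # the top category closes the distribution
    # Successive differences of cumulative probabilities give category probabilities.
    probs = [cumulative[0]]
    for lower, upper in zip(cumulative, cumulative[1:]):
        probs.append(upper - lower)
    return probs

# Hypothetical values: two cut points (scores 0|1 and 1|2) and two binary predictors.
example = category_probabilities(intercepts=[-1.0, 0.0], slopes=[-0.5, 0.3], x=[1, 0])
print([round(p, 3) for p in example])
```

Because the same slopes enter every cut point, changing a predictor shifts all cumulative logits by the same amount, which is what the proportional odds assumption requires.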
Because Dixon and Moore (2000) argued that it is not enough to corroborate a hypothesis of developmental order and that alternative hypotheses also need to be disconfirmed, a follow-up analysis compared three possible developmental hypotheses: (a) the synchronous-development hypothesis (simultaneous development of how-many and give-n competence), (b) the how-many-priority hypothesis (earlier development of how-many competence), and (c) the give-n-priority hypothesis (earlier development of give-n competence). In a 3 × 3 table, perfect support for the how-many-priority hypothesis over the synchronous-development hypothesis would occur if all the data were distributed in three cells: partially successful on the how-many task but unsuccessful on the give-n task, and successful on the how-many task but partially successful or unsuccessful on the give-n task (Dixon & Moore, 2000). Note that two cells (completely successful on both tasks and unsuccessful on both tasks) are consistent with all three hypotheses and, thus, are not useful in discerning developmental order.
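The logic of sorting the nine cells of the 3 × 3 table by hypothesis can be expressed as a small classification function. This is a sketch of the reasoning only, with invented names and hypothetical score pairs; it is not the authors' analysis code and performs no significance test.

```python
from collections import Counter

def classify(how_many, give_n):
    """Map a pair of scores (each 0, 1, or 2 correct) to the hypothesis it fits."""
    if how_many == give_n:
        # Partial success on both tasks fits synchronous development;
        # floor or ceiling on both tasks fits every hypothesis.
        return "synchronous" if how_many == 1 else "uninformative"
    return "how-many-priority" if how_many > give_n else "give-n-priority"

# Hypothetical (how-many, give-n) score pairs for one set size at one time point.
pairs = [(2, 0), (2, 1), (1, 0), (1, 1), (2, 2), (0, 0), (0, 1), (1, 2)]
tally = Counter(classify(h, g) for h, g in pairs)
print(tally)
```

Tallies of this kind, computed separately for each set size and time point, are what a test of developmental order would then compare across hypotheses.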
Results
The number correct by time, task, and set size is summarized in Figure 1. The non-parametric regression analysis was performed with R software. The proportional odds assumption was checked first. As Table 2 indicates, the test was nonsignificant. Because the null hypothesis cannot be rejected (i.e., the proportional odds assumption holds), the proportional odds model is suitable for the data.
Figure 1
Students’ Scores by Task, Set Size, and Session
Table 2
Test Results for the Proportional Odds Assumption
Test for | χ2 | df | probability |
---|---|---|---|
Omnibus | 7.28 | 5 | 0.20 |
Set Size 2 | 1.08 | 1 | 0.30 |
Set Size 3 | 3.16 | 1 | 0.08 |
Task – How Many | 1.54 | 1 | 0.21 |
Time 2 | 1.69 | 1 | 0.19 |
Time 3 | 0.09 | 1 | 0.76 |
Table 3 shows the model estimates. All variables are significant except for time. The odds ratios and confidence intervals in Table 4 confirm that time had no effect on students’ responses (Time 2: OR = 0.94, 95% CI [0.67, 1.31]; Time 3: OR = 1.06, 95% CI [0.75, 1.50]). As indicated by the odds ratio of 1.952 (95% CI [1.47, 2.60]), the odds of getting a higher score on the how-many task than on the give-n task (e.g., 2 or 1 on how many versus 0 on give n) are almost twice that of the reverse, holding all other variables constant. ORs of 1.68, 3.47, and 6.71 are equivalent to small, medium, and large effect sizes (Cohen’s d), respectively (Chen, Cohen, & Chen, 2010). Holding all other variables constant, the odds of getting a higher score on Set Size 1 are about three times those on Set Size 2 (OR = 3.165; 95% CI [2.10, 4.86]); the odds of getting a higher score on Set Size 3 are about 0.44 times those on Set Size 2 (OR = 0.443; 95% CI [0.32, 0.59]). The interaction between task and set size was not significant.
Table 3
Summary of the Proportional Odds Model
Variable | Value | SE | t | p |
---|---|---|---|---|
Set Size 1 | -1.152 | 0.214 | -5.391 | < .001 |
Set Size 3 | -0.837 | 0.158 | -5.279 | < .001 |
Task – How Many | 0.669 | 0.145 | 4.617 | < .001 |
Time 2 | -0.066 | 0.173 | -0.384 | .701 |
Time 3 | 0.058 | 0.177 | 0.328 | .743 |
0|1 | -1.381 | 0.174 | -7.938 | < .001 |
1|2 | -0.747 | 0.169 | -4.434 | < .001 |
Table 4
The Odds Ratios and Confidence Intervals
Variable | OR | 2.5% | 97.5% |
---|---|---|---|
Set Size 1 | 3.165 | 2.099 | 4.863 |
Set Size 3 | 0.443 | 0.317 | 0.589 |
Task – How Many | 1.952 | 1.472 | 2.598 |
Time 2 | 0.936 | 0.666 | 1.313 |
Time 3 | 1.060 | 0.749 | 1.500 |
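The link between the coefficient estimates in Table 3 and the odds ratios in Table 4 is exponentiation. The sketch below illustrates this for the Task and Time rows only, using the estimates and standard errors copied from Table 3. The Wald interval shown is an assumption on our part: it is a common approximation and may differ slightly from whatever interval method produced Table 4.

```python
import math

# Estimates and standard errors copied from Table 3 (Task and Time rows only).
estimates = {
    "Task - How Many": (0.669, 0.145),
    "Time 2": (-0.066, 0.173),
    "Time 3": (0.058, 0.177),
}

odds_ratios = {}
wald_intervals = {}
for name, (coef, se) in estimates.items():
    # An odds ratio is the exponentiated log-odds coefficient.
    odds_ratios[name] = math.exp(coef)
    # Approximate 95% Wald interval on the odds-ratio scale.
    wald_intervals[name] = (math.exp(coef - 1.96 * se), math.exp(coef + 1.96 * se))

for name in estimates:
    low, high = wald_intervals[name]
    print(f"{name}: OR = {odds_ratios[name]:.3f}, approx. 95% CI [{low:.2f}, {high:.2f}]")
```

An interval that includes 1, as for both Time rows, indicates that the corresponding effect cannot be distinguished from no change in the odds.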
As the omnibus analysis was significant for task and set size, a follow-up analysis was conducted to examine further the developmental relation between the how-many and give-n tasks by each set size. This analysis was done by time point to maintain independent observations. The participants’ performance on the how-many and give-n tasks by collection size and time point are summarized in Table 5. A comparison of the data consistent with the how-many-priority hypothesis indicated by the green-shaded cells in Table 5—Cell A (successful on the how-many task but unsuccessful on the give-n task), Cell B (successful on the how-many task but partially successful on the give-n task), and Cell D (partially successful on the how-many task but unsuccessful on the give-n task) and that consistent with the synchronous-development hypothesis (Cell E; partially successful on both tasks)—revealed a significant difference in favor of the former hypothesis in seven of the nine cases. For the set size of three, the how-many-priority hypothesis was significantly superior to both the simultaneous-development hypothesis and the give-n-priority hypothesis (the data in the red-shaded Cells F, H, and I).
Table 5
Number of Correct Responses on the How-Many and Give-n Tasks by Set Size and Time Point
Note. Excluding Cells C and G, which are consistent with all three hypotheses, the data in unshaded Cell E are consistent with the simultaneous-development hypothesis (simultaneous development of how-many and give-n competence); those in the green-shaded Cells A, B, and D, with the how-many-priority hypothesis (earlier development of how-many competence); and those in the red-shaded Cells F, H, and I, with the give-n-priority hypothesis (earlier development of give-n competence).
*p < .05. **p < .01. ***p < .001.
Discussion
The results of the omnibus analysis indicate that performance on each task was relatively stable over the three testing sessions, significantly higher on the how-many task than on the give-n task, and significantly different by set size (1 > 2 and 2 > 3). As the follow-up analysis clarifies, the omnibus analysis does not support a strong version of the how-many-priority hypothesis, namely that children succeed on the how-many task with 1, 2, and 3 before they do so on the give-n task with 1, 2, and 3. Instead, consistent with the authors’ alternative discontinuity view and contrary to the conventional wisdom (simultaneous-development hypothesis), the follow-up analysis generally supported a weak version of the how-many-priority hypothesis. Specifically, it indicated that, for sets of 1 and 2, prior success on the how-many task generally occurred significantly more often than simultaneous success on both tasks but not significantly more often than prior success on the give-n task. In contrast to the inconclusive results for sets of 1 and 2, those for sets of 3 were clear-cut: the how-many-priority hypothesis was significantly superior to both the simultaneous and give-n-priority hypotheses.
The lack of conclusive results for sets of one and two is likely due to a ceiling effect, that is, too few non-concordant cases to overcome measurement error. The present results are consistent with Marchand et al.’s (2022) finding of higher reliability for 1- and 2-knowers than for 3-knowers. Children often construct verbal-based number concepts in a step-like fashion: an understanding of “one,” then “two,” and finally “three” or, in some cases, “one” and “two” before “three” (Mix, 2009; Palmer & Baroody, 2011). As most participants were 3.5 years of age or older and children this age can typically recognize and create sets of one and two, it makes sense that at least 60% of the participants in the present study were successful on both tasks with sets of one and two.
Further research is needed with 2-year-olds (children who are just constructing verbally based concepts of “one” and “two”) to evaluate whether competence with the how-many task emerges simultaneously or successively with that for the give-n task for sets of one and two. In brief, although further research with younger and less developmentally advanced children is needed, it should not be taken for granted that how-many and give-n tasks will yield equivalent results (e.g., knower levels) with small collections, particularly those involving three items.
It could be argued that the scoring procedure of the give-n task used in the present research, unlike that for Wynn’s (1990, 1992) give-n task, did not check for overapplication of a number word and, thus, overestimated small-n creation competence. However, ignoring such possible overapplications is not a threat to internal validity. Overestimating give-n competence works against both the omnibus finding that performance on the how-many task was significantly greater than that on the give-n task and the follow-up analysis supporting the how-many-priority hypothesis over the simultaneous-development hypothesis for all small sets and over the give-n-priority hypothesis for sets of three. However, not checking for overapplications on the give-n task does limit the external validity of the present results. That is, caution should be exercised in generalizing these results to cases that involved checking for overapplications. Moreover, if a give-n task is needed to accurately gauge, for example, a child’s n-creator level, scoring should account for number-word overapplications.
Implications and Conclusions
Theoretical Implications
Researchers have focused on whether performance on small-n recognition and creation tasks is concordant because concordant results have been interpreted as supporting the discontinuity hypothesis (e.g., Le Corre et al., 2006), whereas non-concordant results have been regarded as support for the continuity hypothesis (e.g., Cordes & Gelman, 2005). The present results are a first step toward supporting a version of the discontinuity hypothesis that entails postulating non-concordant development of subitizing-based small-n recognition and creation.
Marchand et al. (2022) offered two reasons for the instability of higher subset levels: (a) misclassification of CP-knowers and (b) noisy associative mappings between number words and approximate magnitudes (see also Krajcsi & Fintor, 2022; Wagner & Johnson, 2011). A third reason, children’s progressive construction of verbal-based number concepts, could explain why the present results were inconclusive for sets of one and two (owing to a ceiling effect) but clear-cut for sets of three. This third reason could operate either in tandem with or independently of the second reason just discussed. As with other verbal-based concepts, children may initially overgeneralize the associated word and only gradually apply the word accurately and reliably (Mix, 2009; Palmer & Baroody, 2011). If our alternative discontinuity hypothesis outlined in Table 1 is correct and a child has already constructed exact verbal concepts for “one” and “two” but not for “three,” then significant non-concordant results can be expected only between the recognition of three and the creation of three, whether or not exact verbal small-number concepts build on an approximate-number system. Specifically, if children have an inexact concept of “three” as “many,” a fragile concept of “three,” or a newly emerged exact concept of “three,” then there is a greater chance they will perform (more) successfully on the recognition-of-three task than on the create-three task, whether or not associations between number words and the approximate-number system are a factor.
If further research confirms that small-n recognition emerges before small-n creation for some or all three of the smallest whole numbers (i.e., corroborates that Subphase 1.1 and Subphase 1.2 in Table 1 are distinct), it would be inappropriate to refer to both competencies as n-knower levels. More accurate labels for these competencies might be “n-recognizer levels” and “n-creator levels,” respectively (cf. Clements & Sarama, 2021). Another reason for using the more specific terms n-recognizer and n-creator levels (instead of the broader term n-knower levels) was adduced by Barner and Bachrach (2010). They observed that specifying a particular n-knower level could be misleading, because it implies that a child does not have knowledge of numbers beyond that level. Their evidence and that of others (Gunderson et al., 2015; Krajcsi & Fintor, 2022; O’Rear et al., 2020; Sarnecka & Gelman, 2004; Wagner et al., 2019) indicate that children have some understanding of numbers beyond their n-knower level (e.g., knowledge of approximate magnitude).
Methodological Implications
As indicated in Table 1, caution should be exercised if the give-n task without counting is used to gauge the first phase of cardinality knowledge generally (i.e., n-knower knowledge that encompasses both small-n recognition or Sublevel 1.1 and small-n creation or Sublevel 1.2). The present results indicate that this task may underestimate the three-recognizer step of Sublevel 1.1 with 3-year-olds. Furthermore, in an intensive and dense case study of a toddler from 18 to 49 months of age, Palmer and Baroody (2011) found that, at 29 months, the child had difficulty responding to requests of “give me two” even after achieving reliable identification of sets of two. Further research is needed to examine whether the give-n task may underestimate the two-recognizer (or even one-recognizer) step of Sublevel 1.1 with 2-year-olds. The give-n task without counting is useful if the goal is a conservative estimate of 3-year-olds’ Phase-1 cardinality knowledge of three (or possibly of two or even one if testing 2-year-olds).
The common practice in cognitive, developmental, and educational psychology of using the give-n task with counting to assess subset knowers needs careful reconsideration. For example, for Wynn’s (1990) version of the task, “any child who did not spontaneously count the objects was prompted to count … (e.g., ‘Can you count and make sure there are two?’)” (p. 171). However, repeatedly prompting children who have not constructed the CP, and who do not understand the purpose of one-to-one counting, to check their initial subitizing-based effort by counting could be viewed by the children as challenging their initial answers and could undermine their confidence in those answers. Although research is needed to confirm the implication, asking subset knowers to count may be confusing to them, may render the task more taxing for no apparent reason, and may result in underestimating competence because of disinterest (avoidance behaviors) or acting out (uncooperative behaviors).