Why do some children develop more fluency with numerical concepts than others? One possibility is that children differ in the extent to which they attend to numerical aspects of the environment. Because those children who spontaneously focus on numerical features of their environment will experience more self-initiated practice with number concepts than those who do not, this may lead to faster development of numerical skills. The idea that children differ in the extent to which they spontaneously focus on numerosity was empirically examined by Hannula and Lehtinen (2005) by explicitly measuring children’s Spontaneous Focusing on Numerosity (SFON) tendencies. They demonstrated that children’s SFON tendencies correlate with their numerical skills, suggesting that SFON may be an important factor in predicting numerical development. Hannula and Lehtinen’s finding has been subsequently replicated a number of times (e.g., Batchelor et al., 2015; Gray & Reeve, 2016; Hannula et al., 2010).
There are two broad ways in which researchers have attempted to measure children’s SFON tendencies. One is to use an imitation task (Hannula et al., 2010; Hannula & Lehtinen, 2005; Nanu et al., 2018; Rathé et al., 2016; Rathé et al., 2018; Savelkouls et al., 2020). In such tasks a researcher may, while a child observes, feed plastic berries to a bird or post letters through a toy letterbox, and then ask the child to do exactly the same. If the child uses the same number of berries or letters then it is assumed that they have attended to the numerical aspects of the researcher’s actions. Equally, if they refer to exact number verbally or nonverbally, it is inferred that they have focused on numerosity. A second method has been to use a picture task. The researcher shows a child a picture that contains a variety of features, both numerical and non-numerical, and asks them to describe it (Batchelor et al., 2015; DePascale et al., 2021; Rathé et al., 2018, 2019, 2020, 2022; Savelkouls et al., 2020). If the child mentions any numerical aspects of the scene they are said to have spontaneously focused on numerosity.
Given that these two types of task aim to measure the same construct, it is perhaps surprising that the SFON measurements they produce do not always correlate. Batchelor et al. (2015) measured the SFON tendencies of 119 children using both a picture task and an imitation task, finding that the results correlated at only rs = .06. Similarly, both Rathé et al. (2018) and Savelkouls et al. (2020) found correlations close to zero when using both types of task. Moreover, Nanu et al. (2020) found that the trials from imitation tasks and picture tasks cleanly loaded onto two different factors in their exploratory factor analysis. In sum, there appears to be robust evidence that imitation tasks and picture tasks measure different constructs which, following Rathé et al. (2018), we refer to as action SFON and verbal SFON. Action SFON is the construct measured during imitation tasks and verbal SFON is the construct measured during picture tasks. Despite this terminology, imitation SFON tasks clearly have a verbal component as well as an action component: the child must attend to the experimenter’s verbal instructions. In addition, children’s verbal utterances are often included in the scoring criteria for imitation tasks (e.g., Hannula & Lehtinen, 2005). Nevertheless, Gray and Reeve (2016) and Batchelor (2014) both found that action SFON scores derived from an imitation task mostly stemmed from children’s imitation acts rather than their verbal utterances.
One possible explanation for the surprising lack of a relationship between the two SFON constructs is that picture tasks measure children’s general verbal skills, at least in part, rather than solely the extent to which they verbally express their spontaneous focus on numerosity. Picture tasks rely heavily upon children’s verbal expression, and both Batchelor et al. (2015) and Rathé et al. (2020) found that children’s verbal SFON correlated with the total number of verbal utterances made during the picture task. In other words, rather than being a measure of verbal SFON, picture tasks may be primarily, or at least partially, a measure of general verbal skills. In contrast, imitation tasks require fewer verbal skills (and not necessarily any verbal production skills), so they may assess action SFON more directly.
The two SFON constructs also differ in the extent to which evidence suggests they predict children’s mathematics achievement. Measures of action SFON robustly correlate with measures of mathematics achievement (Batchelor et al., 2015; Gray & Reeve, 2016; Hannula et al., 2010; Hannula & Lehtinen, 2005; Hannula-Sormunen et al., 2015; Torbeyns et al., 2018) and not with non-mathematical outcomes such as reading (e.g., Nanu et al., 2018), but this is not always the case with verbal SFON. While both Batchelor et al. (2015) and DePascale et al. (2021) found strong relationships between verbal SFON and mathematics achievement, r = .47 and r = .38 respectively, Rathé et al. (2020, 2022) found close-to-zero relationships, all rs values < .1, and Savelkouls et al. (2020, Experiment 1) found a similarly weak relationship with a measure of numerical understanding (the give-N task), r = -.04. In Rathé et al.’s (2019) cross-sectional study, a borderline significant relationship was found between verbal SFON and mathematical competency in 5-year-olds (rs = .28), but not in 4-year-olds (rs = -.26) or 6-year-olds (rs = .08). To date these conflicting results have not been satisfactorily accounted for.
Moreover, recent work has found that children may differ in the extent to which they appear to be focusing on numerosity depending on whether or not this is assessed behaviourally or via verbal reports. Elliott et al. (2022) assessed SFON using an imitation task and two different scoring methods. One focused on children’s verbal utterances, the other on their imitation behaviour. They found higher SFON scores on the behavioural measure and that action SFON, but not verbal SFON, was associated with children’s prior mathematics achievement.
In sum, further work investigating the validity of verbal SFON, as measured by picture tasks, would be worthwhile (Inglis, 2020). More specifically, does asking children to describe a visual scene, and assessing whether or not they use number words in their description, give us insights into whether they will engage in more self-directed number practice in realistic situations? Addressing this question was the primary aim of the study reported in this paper.
We are not the first to ask this question. The ecological validity of SFON tasks has previously been assessed with mixed results (DePascale et al., 2021; Edens & Potter, 2013; Hannula et al., 2005; Rathé et al., 2016, 2018). Hannula et al. (2005) assessed children’s action SFON, and recorded their use of number words during day-care activities, both before and after an intervention. They found mixed results: at pre-test the relationship between action SFON and number-word use was nonsignificant, r = .36, p = .227 (n = 13), but at post-test it did reach significance, r = .55, p = .027 (n = 16). Conversely, Edens and Potter (2013) found no relationship between children’s self-selected number activity choices in the classroom and their action SFON tendencies (the nonsignificant r was not reported, n = 14).
In a much larger study, Rathé et al. (2016) found no significant association between 48 children’s action SFON and their spontaneous number-related utterances during a picture book reading activity, rs = -.14, p = .35. However, when Rathé et al. (2018) assessed 65 children’s verbal SFON they did find a significant association with number-related utterances during picture book reading, rs = .47, p < .01. Consistent with the earlier study, in the same sample no such relationship was found for action SFON, rs = .02, p = .88. Using a similar approach, DePascale et al. (2021) assessed children’s verbal SFON, finding no significant relationship with their ‘foundational number talk’ during a parent/child play session, r = -.21. However, DePascale et al.’s interest in the parent/child play session focused not only on children’s spontaneous number talk: to form their foundational number talk measure they also coded children’s responses to parents’ numerical prompting. Given this, their study does not directly address the validity of verbal SFON as measured by picture tasks (one perhaps would not expect children’s prompted number talk to be associated with their verbal SFON tendencies).
Finally, Trickett et al. (2022) assessed 164 young children’s verbal SFON and also conducted observations of parent/child play sessions. They found no significant relationship between the use of number words during these sessions and verbal SFON, r = .18. However, they noted that this could have been because of floor effects in their verbal SFON measure found in their relatively young sample (mean age 3.6 years): only 31.7% of children mentioned number in any of the three picture task trials used to form the verbal SFON measure, limiting the possibility of finding any relationship with number word use.
In sum, there is mixed evidence about the ecological validity of verbal SFON as measured by picture tasks, and at least some suspicion that the construct may, at least in part, index children’s verbal skills rather than solely their spontaneous focusing on numerosity tendencies. In the study reported in this paper we set out to directly assess the extent to which children’s verbal SFON, as assessed using a picture task, predicts their verbal SFON as assessed with a play task. Our play task measure involved assessing the extent to which children spontaneously used number words during a parent/child play session (e.g., Braham, Libertus, & McCrink, 2018). In addition, we aimed to assess the extent to which verbal SFON, as assessed by a picture task, is related to children’s general verbal skills, by correlating children’s picture task scores with a standardized assessment of their verbal skills.
Method
Participants
Fifty-six child-parent dyads were recruited through the University of Nottingham’s Summer Scientist Week, an event where children and parents visit the university to take part in a range of research studies. Children (30 boys) were aged 4 to 6 years (M = 4.9 years, SD = 0.6 years). Roughly half of them had just finished their first year of school (n = 27) and the other half were preschoolers soon to start school (n = 29).
Eight children were excluded from the analyses because they had missing data (n = 4), English was not their native language (n = 3) or they were identified by their parents as having learning difficulties (n = 1). The final sample comprised 48 complete datasets.
Materials and Procedure
Children and parents took part in a single testing session comprising two separate phases: an observational (play task) phase followed by an experimental (picture task) phase. As with the previous studies, they were not informed of the numerical nature of the research.
Play Task
Children and parents were observed (video-recorded) whilst they played three games together:
- Hungry Hippos. Participants were asked to play Hungry Hippos, a tabletop game in which players munch as many marbles as they can with their toy hippos. There were four hippos and 19 marbles (12 red, 6 silver and 1 gold).
- Lego Duplo. Participants were given a box of Lego Duplo and they were asked to construct one of three objects, either a boat, a car, or a house. Each object was presented on its own instruction card.
- Picture Printing. Participants were presented with a collection of pictures that had missing pieces (e.g. a cat with no facial features). They were asked to choose a picture and fill in the missing pieces using different geometric-shaped stamps and colored inks.
These games were chosen on the basis that they would be familiar and suitable for children and parents to play together. The researcher presented each game sequentially, in the same order for each child-parent dyad. Each game lasted approximately 3 minutes and ended when the activity came to a natural end, e.g., when the game finished or the model was complete (Hungry Hippos: M = 2.33 minutes, SD = 0.75, min = 1.12, max = 4.07; Lego Duplo: M = 3.39 minutes, SD = 1.30, min = 1.02, max = 7.27; Picture Printing: M = 3.44 minutes, SD = 1.32, min = 1.28, max = 6.83). Timings were allowed to vary to make the observations more naturalistic.
For each game children received a score of 0 or 1. They were scored as spontaneously focusing on numerosity if they initiated any symbolic number talk. Symbolic number talk was any utterance which included an exact number word (e.g. “I need two red pieces of lego”). For this to be categorized as spontaneous it had to be unprompted by the parent. If the parent first asked the child “how many red pieces of lego do you need?” and then the child responded “I need two red pieces of lego”, then the child would receive a score of 0. However if, on the same trial, a child subsequently independently initiated unrelated number talk then they received a score of 1. “One” was considered to be a numerical utterance only when it unambiguously referred to the number 1. The inter-rater reliability of two independent observers (who coded 10% of the videos) was 1.00. Children received a total play task score out of three.
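The play task scoring rule described above can be sketched in code. This is a minimal illustration only: it assumes that each utterance has already been hand-coded by an observer for whether it contains an exact number word and whether it was prompted by the parent. The `Utterance` class and the function names are hypothetical and were not part of the study.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str            # "child" or "parent"
    has_number_word: bool   # hand-coded: contains an exact number word
    prompted: bool = False  # hand-coded: child is answering a parent's numerical prompt


def score_game(utterances):
    """Return 1 if the child initiated any unprompted symbolic number talk, else 0."""
    for u in utterances:
        if u.speaker == "child" and u.has_number_word and not u.prompted:
            return 1
    return 0


def play_task_score(games):
    """Sum the binary game scores; with three games the total ranges from 0 to 3."""
    return sum(score_game(g) for g in games)


# Invented example: the child only answers a prompt in the first game, talks
# about number spontaneously in the second, and never mentions number in the third.
hippos = [Utterance("parent", True), Utterance("child", True, prompted=True)]
duplo = [Utterance("child", True)]
printing = [Utterance("child", False)]

total = play_task_score([hippos, duplo, printing])
print(total)
```

Because each game score is binary, a child who spontaneously mentions number many times within a game receives the same score as a child who does so once.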
Picture Task
The materials used in this task were three cartoon pictures each laminated on an A4 card. The researcher introduced the task by saying: “This game is all about pictures. I’m going to show you a picture, but I’m not going to see the picture. Only you get to see the picture. This means I need your help to tell me what’s in the picture.” On each of three trials, the researcher held up a picture in front of the child (with their left hand) and said: “What can you see in this picture?” With their right hand the researcher wrote down everything the child said. Pilot testing revealed that children typically did not offer extensive verbal descriptions, so we opted not to record these sessions; one drawback of this decision is that we cannot calculate a measure of inter-rater reliability. If the child was reluctant to speak, the researcher repeated their request: “Can you tell me what you can see?” There was no time limit for children to respond. When the child finished the researcher asked: “Is that everything?” When the child was ready to move on the researcher introduced the next trial: “Let’s look at another picture. Ready, steady…”
The pictures were presented in the same order for each child. All pictures contained several small arrays of 1 to 3 objects, people or animals that could be enumerated (see Figure 1 for an example). Due to the nature of the picture task, additional larger numerosities were included in all three trials (e.g. blades of grass).
Figure 1
For each trial children were scored as spontaneously focusing on numerosity if their description contained any explicit number word/s, regardless of whether they had enumerated the objects correctly. For example, in Picture 1 children would receive a score of 1 if they accurately described “three chicks”, or if they miscounted and inaccurately described “four chicks”, but not if they described “some chicks” and made no other reference to number in their description. Other related utterances (e.g., “I’ll count them”) did not result in a score unless accompanied by an explicit number word. “One” was considered to be a numerical utterance only when it unambiguously referred to the number 1. Scores for each trial were binary; therefore, a child who mentioned number several times and a child who mentioned number only once both received the same score of 1.
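The picture task scoring rule can be illustrated with a short sketch. In the study the descriptions were written down and scored by hand; the automatic matching below is an assumption made purely for illustration. The number-word list is a small hypothetical vocabulary, and "one" is deliberately left out because deciding whether it unambiguously refers to the number 1 requires a human coder's judgement (passed in here as the hypothetical `one_counts` flag).

```python
import re

# Assumption: a small illustrative vocabulary. "one" is excluded because it only
# counts when a human coder judges that it unambiguously refers to the number 1.
NUMBER_WORDS = ["two", "three", "four", "five", "six", "seven", "eight", "nine", "ten"]
NUMBER_PATTERN = re.compile(
    r"\b(?:" + "|".join(NUMBER_WORDS) + r"|\d+)\b", re.IGNORECASE
)


def score_trial(description, one_counts=False):
    """Return 1 if the description contains an explicit number word, else 0.

    Accuracy is ignored: "four chicks" scores 1 even if there were three.
    """
    if one_counts:
        return 1
    return 1 if NUMBER_PATTERN.search(description) else 0


def picture_task_score(descriptions):
    """Sum the binary trial scores; with three pictures the total ranges from 0 to 3."""
    return sum(score_trial(d) for d in descriptions)


# Invented example descriptions for three trials.
example = ["three chicks", "some chicks", "two ducks and two hens"]
print(picture_task_score(example))
```

Note that the third description mentions number twice but still contributes only 1 to the total, in line with the binary scoring described above.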
Verbal Abilities
As part of the wider Summer Scientist event, 39 of the children also completed the British Picture Vocabulary Scale (BPVS), a standardized measure of receptive vocabulary. On each trial the researcher reads a word and children are required to select which of four pictures matches the word. Testing continues until children have answered six consecutive items incorrectly. This was administered by a different researcher in a separate session. We interpreted BPVS scores to be measures of children’s verbal abilities.
Results
We first report descriptive statistics for the play and picture task measures of verbal SFON. Then we assess the relationship between the two scores using Spearman correlations, before finally looking at the relationship between the verbal SFON, as measured on the picture task, and verbal ability scores.
Figure 2 presents the frequency of children obtaining verbal SFON scores from 0 to 3 during child/parent play (play task) and on the picture task (picture task). In sum, children showed substantial individual differences in both tasks. The Appendix shows example responses from the picture task and example transcripts from the child/parent play sessions, together with the associated scores.
Figure 2
There was a strong positive correlation between verbal SFON scores from the picture task and the play task, rs = .638, 95% CI [.419, .802], as shown in Figure 3. The more children spontaneously focused on numerosity when asked to describe the pictures, the more self-initiated references they made to symbolic numbers during the child/parent play session. Controlling for age and verbal skills (BPVS raw score) did not reduce the strength of this relationship, prs = .668, 95% CI [.464, .833], suggesting that verbal SFON scores from the picture task are associated with verbal SFON scores from the more ecologically valid play task.
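To make the statistics reported above concrete, the following minimal sketch computes a Spearman correlation and a first-order partial Spearman correlation. Several points are assumptions: the data are invented toy values rather than the study's data, the helper names are hypothetical, and the partial correlation handles a single covariate (here age) using the standard first-order formula, whereas the analysis reported above controlled for both age and BPVS raw score. Confidence intervals are not computed.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def partial_spearman(x, y, z):
    """Spearman correlation between x and y, controlling for one covariate z."""
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented toy data (NOT the study's data): task scores out of 3 and age in years.
picture = [0, 0, 1, 1, 2, 2, 3, 3]
play = [0, 1, 0, 1, 2, 3, 2, 3]
age = [4.1, 5.0, 4.4, 5.6, 4.8, 5.9, 4.3, 5.2]

print(round(spearman(picture, play), 3))
print(round(partial_spearman(picture, play, age), 3))
```

Rank-based correlations suit these measures because scores out of three are ordinal and heavily tied, which is why ties are given averaged ranks.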
Figure 3
Finally, we assessed the extent to which picture task verbal SFON scores were related to verbal abilities, as indexed by raw BPVS scores, finding no significant relationship, rs = .187, 95% CI [-.147, .500].
Discussion
We explored the ecological validity of verbal SFON as measured by picture description tasks. Previous research on verbal SFON has found surprising and inconsistent results. Specifically, despite picture tasks and imitation tasks both being designed to assess SFON, it has been found that responses to each type of task are not positively correlated, suggesting that verbal SFON and action SFON are separable constructs. One particular concern about verbal SFON is that picture tasks typically place a high verbal requirement on children, as they must accurately describe the picture to the experimenter. Given this, we investigated whether individual differences in verbal SFON are driven by individual differences in general verbal abilities. We found no evidence for this hypothesis. Rather, in our study children’s BPVS scores were not significantly associated with their verbal SFON scores, as measured with a picture task.
SFON is of interest to numerical cognition researchers because children with high SFON tendencies are believed to engage in more number-based self-initiated practice. Specifically, those children who regularly focus on the numerical aspects of their environment are presumed to engage in numerical processing much more often than those who rarely focus on the numerical aspects of their environment. Given this, we assessed the ecological validity of our verbal SFON picture task by assessing children’s verbal SFON with a play task which captured the extent to which they spontaneously mentioned number during a realistic child/parent play session. We found that scores from the picture and play tasks were strongly related, suggesting that verbal SFON, as indexed by picture tasks, does indeed capture something important about the likelihood that children will spontaneously focus on numerosity during their day-to-day activities. This additional self-initiated numerical practice, in turn, is likely to explain why verbal SFON has previously been found to predict children’s early mathematics achievement (Batchelor et al., 2015; DePascale et al., 2021).
In short, we found that verbal SFON is a useful construct with which to explore individual differences in children’s early numerical development. This finding highlights two important questions that future research should address. First, why do measures of verbal SFON and action SFON not correlate? Second, why has previous research found inconsistent results when assessing the correlation between verbal SFON and mathematics achievement? We speculate on each issue in turn.
The Relationship Between Verbal SFON and Action SFON
Two important differences between picture tasks and imitation tasks may help to explain why measures of verbal SFON and action SFON do not correlate: the role of the experimenter and the mode of children’s responses. Each of these may contribute to differences in children’s SFON-like behaviour in the two tasks.
In the imitation task the experimenter plays a crucial role: to succeed, the child must pay attention to and imitate their behaviour. Research on the development of children’s imitation of others’ actions has identified individual differences in children’s ability to imitate in social situations (Fenstermacher & Saudino, 2007). Thus factors unrelated to children’s SFON tendencies may contribute to their scores on imitation tasks. In contrast, in picture tasks children can choose where to direct their attention, albeit within the constraints of the picture being described. Research on the closely-related attention-to-number task, where participants are asked to match stimuli on the basis of a range of characteristics, including number (Mazzocco et al., 2020), has also identified individual differences in children’s and adults’ tendencies to focus on different dimensions. The effect that these individual differences have in picture tasks designed to measure verbal SFON is likely to vary depending on the precise nature of the pictures used. In sum, factors specific to imitation and picture description tasks may influence participants’ SFON scores in different ways, reducing the correlation between the tasks (see also McMullen, Chan, Mazzocco, & Hannula-Sormunen, 2019).
The picture and imitation tasks also differ in the mode of children’s responses: unrestricted verbal utterances in the case of picture tasks and (usually) a single motor response in the case of imitation tasks (recall that although children may be scored as having SFONed in imitation tasks if they verbally mention number, this seems to be relatively unusual, Gray & Reeve, 2016; Batchelor, 2014). Consequently, even if children pay attention to number to a similar extent across both types of task, there may be differences in the extent to which this translates into verbal SFON and action SFON scores. Although we have demonstrated that verbal SFON is not strongly related to general verbal abilities, the tasks typically used to measure verbal and action SFON may place different demands on other general cognitive skills. For example, the imitation task plausibly requires children to hold in memory the number of actions made by the experimenter and to inhibit their own responses at the appropriate point (e.g., Batchelor, 2014). Indeed, consistent with this thought, Silver et al. (2020) found a borderline significant relationship between action SFON and inhibitory control. Individual differences in these general cognitive skills may therefore mask a relationship between the verbal SFON and action SFON scores obtained from the picture and imitation tasks. Future research could productively investigate how general cognitive skills contribute to SFON scores.
The Relationship Between Verbal SFON and Mathematics Achievement
If the picture task is an ecologically valid measure of verbal SFON, why does verbal SFON not consistently correlate with mathematics achievement, as indexed by standardized tasks? Recall that some researchers have found strong associations between verbal SFON and measures of mathematics achievement (e.g., Batchelor et al., 2015; DePascale et al., 2021) whereas others have found relationships that are close to zero (e.g., Rathé et al., 2019, 2020, 2022; Savelkouls et al., 2020). It is not straightforward to explain these differences by reference to study characteristics such as the age of participants or the nature of the mathematics measure used. Some studies involved older children than others, and some involved relatively more numerical and relatively less numerical mathematics measures, but these factors do not seem to clearly line up with the differing findings.
One possibility for these discrepancies relates to the extent to which the verbal SFON measure in each study exhibited floor effects. In Batchelor et al.’s (2015) study, 35% of participants received a verbal SFON score of 0 (on the 0 to 3 scale); in the current study (which used identical materials), 31% of participants obtained such a score. In contrast, in Rathé et al.’s (2020) and Savelkouls et al.’s (2020) studies the equivalent figures were around 50%. DePascale et al. (2021) used a different scoring system that increased the variation in their data (rather than score each trial as a binary yes/no outcome related to whether or not the child used number words, they enumerated every occasion that numbers were mentioned). Perhaps the differing results found in the literature could be the result of different data distributions: it is difficult to detect a relationship between two variables where one exhibits a floor effect.
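The general statistical point, that floor effects can attenuate an observed correlation, can be illustrated with a small simulation. Everything in this sketch is hypothetical: the latent "tendency" variable, the effect sizes, the thresholds, and the use of a Pearson correlation for simplicity. It does not reproduce any study's data, and it makes no claim about how large the attenuation would be at the floor rates reported in the literature.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simulate(threshold, n=5000, seed=1):
    """Simulate 0-3 picture task scores and a maths score for n hypothetical children.

    A higher threshold makes it harder to mention number on a trial, which
    pushes more children to a score of 0 (a floor effect).
    Returns (proportion of children scoring 0, correlation with maths).
    """
    rng = random.Random(seed)
    scores, maths = [], []
    for _ in range(n):
        tendency = rng.gauss(0, 1)  # hypothetical latent SFON tendency
        maths.append(0.6 * tendency + 0.8 * rng.gauss(0, 1))
        # Three binary trials: number is mentioned when tendency plus
        # trial-specific noise exceeds the threshold.
        score = sum(tendency + rng.gauss(0, 1) > threshold for _ in range(3))
        scores.append(score)
    floor = sum(s == 0 for s in scores) / n
    return floor, pearson(scores, maths)


for threshold in (0.0, 1.0, 2.5):
    floor, r = simulate(threshold)
    print(f"threshold={threshold}: {floor:.0%} at floor, r={r:.2f}")
```

Because the same seed is used for every threshold, the simulated children are identical across runs and only the difficulty of mentioning number changes, so any change in the correlation reflects the shift in the score distribution alone.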
Why might Rathé et al. (2020) and Savelkouls et al. (2020) have found floor effects in verbal SFON in their samples? One possibility, in the case of Rathé et al., concerns the relative complexity of the images used in their picture task. Our impression, which readers can verify by inspecting the stimuli reported by Batchelor et al. (2015, p. 83) and Rathé et al. (2020, p. 286), is that Rathé et al.’s stimuli are somewhat more complex than Batchelor et al.’s (2015). Chan and Mazzocco (2017) have shown that the extent to which participants attend to numerical properties depends on the salience of competing dimensions. For example, they found that participants attend to number less often when it is competing for attention with the shape and colour dimensions, than when it is competing for attention with the dimensions of pattern and location. Given this, one might expect lower verbal SFON scores when participants are asked to describe relatively more complex images than when they are asked to describe relatively less complex images. This, in turn, may lead to floor effects, making it more difficult to detect a relationship between verbal SFON and mathematics achievement. Similarly, Savelkouls et al.’s (2020) picture task stimuli contained “many different colors, shapes and animal characters, providing many other features to label and talk about aside from numbers” (p. 1882), so this factor might also explain the relatively low verbal SFON scores observed in their study. This account, while speculative, does seem to account for the pattern of results found in the literature. If future research corroborates this proposal then extreme care must be taken when designing picture tasks to assess verbal SFON: pictures must be complex enough to ensure ecological validity, but not so complex that they lead to unacceptable floor effects that may mask relationships with other variables.