A large body of research has shown that human adults are fast and accurate at enumerating arrays of ~1-4 items. This phenomenon has been called subitizing. Above this range, enumeration is slower and less accurate. The subitizing range has been related to individual differences in variables such as mathematical abilities and working memory. The two most common methods for calculating subitizing range today – bilinear fit and sigmoid fit – each have strengths and weaknesses. By combining these two methods, we overcome their biggest limitations and arrive at a novel way of calculating Individual Subitizing Range (ISR). This paper introduces the new method as well as empirical studies designed to test it. We replicated classic effects from the literature and obtained a high correlation with the sigmoid fit method. This paper includes Matlab code for easy calculation of ISR as well as a ready-to-use experimental file for testing ISR. We hope that these tools will be of use to researchers studying individual differences in the subitizing range.

When asked to enumerate a small number of items (usually up to 4), responses are very fast and accurate. When one plots response times (RT) as a function of the number of the to-be-enumerated items, the increase in RTs as a function of number is found to be very small for up to about 4 items, creating a shallow slope, or even a slope of zero (

While most studies have found that, on average, the subitizing range of adults is between 3 and 4, there is evidence for individual differences in the ISR. More specifically, a variety of factors have been found to affect individuals’ subitizing range. For example, children with Williams syndrome, a genetic developmental disorder that is associated with severe impairments in visuospatial cognition, were found to have a lower subitizing range than age-matched typically developed children (

While many studies attempt to measure differences in subitizing range at the group level, it is difficult to accurately measure the subitizing range of an individual in a way that generalizes well across participants. Evaluating the subitizing range of an individual is important for studying individual differences, for example when comparing a participant’s ISR to the group’s average, or when measuring the ISR of an individual at different time points. Moreover, having a measure of individual subitizing range will allow us to evaluate the distribution of subitizing ranges in a typically developed population, and to compare this distribution to that of special groups, such as individuals with dyscalculia. Several methods have been previously employed; we will discuss the two most common ones: bilinear fit (also known as piecewise regression, hockey stick and broken-stick fit) and sigmoid fit.

When plotting RT as a function of the to-be-estimated number, the resulting data can be generally described by two lines with different slopes: a slope of zero – or close to zero – in the subitizing range, and a non-zero positive slope above the subitizing range. These lines are then fit to the data and the intersection point of these two lines is taken to be the upper limit of the subitizing range (

The basic bilinear fit is essentially a piecewise regression model, which joins two straight lines with an abrupt transition at the intersection point that minimizes the fitting error. More generally, the data are segmented into two groups and a line is fit to each group, with the intersection of those lines being the upper limit of the subitizing range. This method requires the selection of the point at which the data are split along the x-axis into the two groups (subitizing and above) over which the two lines will be fitted (i.e., the split point). In the simplest implementation, the researchers select the split point themselves. To do that, the researcher first needs to establish a prior assumption about which points lie within the subitizing range. This assumption is then refined by fitting the two lines to the groups separated by this split point and taking their intersection point as the upper bound of the subitizing range.

To determine the split point, various methods are used; some simply eyeball the plot to select that point (

There are other, more accurate techniques for determining this split point for an individual, which do not face these limitations. Since the data range for subitizing calculations is small and the presented numerosities are natural numbers only (e.g., 2, 3, 4, etc.), one can simply iterate over each of the possible split points between natural numbers and, for each possible pair of lines, calculate the best fit and the total sum of squared errors. For example, one would start at a point “between” 2 and 3, fitting one line to the RTs at 1 and 2, and another line to the RTs at all values of 3 or larger. One would then do the same for a point between 3 and 4, and so on until the second-to-last point collected (since at least two points are needed to uniquely determine a line). Once this is done, one chooses the split point that produced the smallest total sum of squared errors across the two linear fits. The intersection point for this fit is the calculated upper bound of the subitizing range (
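The exhaustive split search described above can be sketched as follows. This is a minimal illustration, not the authors' Matlab code: the function names `fit_line` and `bilinear_upper_bound` are hypothetical, the RT values are made up for the example, and ordinary least squares is assumed for each segment.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, sum of squared errors)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def bilinear_upper_bound(numbers, mean_rts):
    """Try every split that leaves at least two points per segment and keep
    the one with the smallest total sum of squared errors."""
    best = None
    for k in range(2, len(numbers) - 1):
        s1, b1, sse1 = fit_line(numbers[:k], mean_rts[:k])
        s2, b2, sse2 = fit_line(numbers[k:], mean_rts[k:])
        if s1 == s2:
            continue  # parallel lines never intersect
        total_sse = sse1 + sse2
        if best is None or total_sse < best["sse"]:
            best = {
                "split_after": numbers[k - 1],
                "sse": total_sse,
                # x-coordinate where the two fitted lines cross
                "upper_bound": (b1 - b2) / (s2 - s1),
            }
    return best


# Illustrative (made-up) mean RTs in ms for numerosities 1-8.
numbers = [1, 2, 3, 4, 5, 6, 7, 8]
mean_rts = [510, 520, 530, 540, 850, 1150, 1450, 1750]
result = bilinear_upper_bound(numbers, mean_rts)
print(result["split_after"], round(result["upper_bound"], 2))
```

Note that the returned upper bound is the crossing point of the two lines, which need not coincide with the split point itself and is generally not a whole number.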

Bilinear fitting also presents several problems unique to ISR calculations. Due to the small number of RT data points used to fit the linear equations, individual deviations from expected response patterns can introduce large errors in the fit and skew the calculated subitizing range (see

Bilinear fit is also sensitive to experimental design. A methodology in which stimulus presentation is time-limited may introduce a “plateau” in RT above the subitizing range when the presented numerosity becomes sufficiently large, effectively introducing a third linear element into the data which is not accounted for in this method (see

As an alternative to the bilinear approach, a sigmoid fit is a very common way to determine the subitizing range (

_{1} and c_{2}) as commonly used for subitizing calculations (

A sigmoid function can describe the relationship between RT and the to-be-estimated number in some cases: in the subitizing range, the slope should be zero (or close to zero). Then, when counting is required, the slope becomes positive. This is the relationship described by bilinear fitting. However, when counting is impossible (due to the conditions of the experiment, such as very short exposure to the stimuli or a requirement to answer quickly), RTs to different numbers become very similar, resulting in a slope close to zero again. The same is often true for the relationship between error rates and the to-be-estimated numbers. In the subitizing range, there are virtually no errors. During counting, the error rate increases with number, and when estimating, the error rate is similar across quantities, resulting in a slope of zero.

The data for RT or error rates are used to fit a sigmoid function with unknown parameters (see _{1} = 2 with c_{1} = 0.5 and note that for c_{1} = 0.5 the inflection point lies much further from the initial region of shallow slope).

While the basic sigmoid fit and the new combined method will be highly correlated, because a sigmoid curve is fitted in an identical manner in both, the basic sigmoid fit will always overestimate the range, because it takes the inflection point of the curve as the upper ISR limit. The inflection point always lies above the point of first deviation from base RT. Depending on the particular pattern of RT, this gap may be large or small. The shift is therefore not perfectly consistent, but because it is always positive, differences between groups are preserved, and the inflection point has been used for inter-group comparisons. This kind of fit can be useful when comparing individual differences, as long as all the participants are tested using the same method. However, it holds little theoretical meaning on its own, since the reported inflection point does not necessarily reflect the subitizing range accurately.

In this paper, we present and empirically test a novel method for calculating individual subitizing range (ISR). This method combines the bilinear fit and the sigmoid fit methods, requiring no a-priori assumptions about the data and correcting for the sigmoid overestimation of subitizing range, thereby managing to overcome the previously discussed limitations of these methods when used separately.

In this “combination” method, the RT data is first fit with a sigmoid function (a logistic function) as with the basic sigmoid method using a non-linear error minimization algorithm (we used nlinfit in M

This finds the “elbow point”, or the point at which the slope begins to appreciably change. The intersection point of these two lines is thus a more accurate estimate of the upper bound of the subitizing range; this method corrects for the sigmoid overestimation of subitizing range and avoids the a-priori assumptions necessary to ensure consistent bilinear fitting in a purely bilinear model.
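To make the geometry of this step concrete, the sketch below computes the intersection of a zero-slope baseline with the tangent to a fitted sigmoid at its inflection point. It is an illustration under stated assumptions rather than the authors' implementation: the four-parameter logistic form (`base`, `amp`, `c1`, `c2`) is assumed for the example, the function names are hypothetical, and the non-linear fitting step itself is omitted, so the parameter values are simply made up.

```python
import math


def logistic(x, base, amp, c1, c2):
    """Assumed logistic form: base + amp / (1 + exp(-c1 * (x - c2)))."""
    return base + amp / (1.0 + math.exp(-c1 * (x - c2)))


def isr_from_sigmoid(base, amp, c1, c2):
    """Intersect a zero-slope line through the sigmoid's y-intercept with the
    tangent to the sigmoid at its inflection point (x = c2)."""
    y_inflect = logistic(c2, base, amp, c1, c2)   # equals base + amp / 2
    tangent_slope = amp * c1 / 4.0                # derivative at x = c2
    y_flat = logistic(0.0, base, amp, c1, c2)     # subitizing baseline
    # Solve y_flat = y_inflect + tangent_slope * (x - c2) for x.
    return c2 + (y_flat - y_inflect) / tangent_slope


# Made-up parameters: 500 ms baseline, 1500 ms rise, inflection at 4 items.
isr = isr_from_sigmoid(base=500.0, amp=1500.0, c1=2.0, c2=4.0)
print(round(isr, 3))
```

Under this assumed form, the intersection always falls below the inflection point `c2`, and approaches `c2 - 2 / c1` when the curve is essentially flat at x = 0, which illustrates why taking the inflection point alone overestimates the range.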

_{2} (see ^{-c1}^{*(x-c2)}) and a subitizing line of slope zero intersecting the y-axis at the same point as the sigmoid curve. The intersection point of these two lines defines the upper bound of the subitizing range.

This approach requires no assumptions of data structure and can accurately fit both bilinear data and sigmoid data, thus eliminating the need for a multitude of bilinear approaches with varying methodologies and assumptions, and overcoming the limitations of previous approaches that used a sigmoid function (i.e. using only the midpoint of the function).

In the current study, we aimed to empirically test the suggested method. To this end, we conducted several enumeration studies in different conditions and with different populations. In Experiment 1, adult participants were briefly exposed to groups of items arranged in random or canonical patterns (familiar arrangements of items in space, like dots on a game die) – see

Before calculating ISR, we trimmed outlier trials and prepared the data (using the ‘Prep_RT.m’ script). The user defines the range beyond which outliers are excluded. We elected to treat every trial that deviated by more than 2.5 standard deviations from the mean as an outlier. In addition, the ‘Prep_RT.m’ script averages RT across trials for each presented number per participant, preparing the data for the script that computes ISR.
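The trimming and averaging step can be sketched as follows. This is not the ‘Prep_RT.m’ script itself: the function name `trim_and_average` is hypothetical, the data are made up, and the sketch assumes that the mean and sample standard deviation are computed per presented number for one participant, with the outlier still included when they are computed.

```python
from collections import defaultdict
from statistics import mean, stdev


def trim_and_average(trials, cutoff=2.5):
    """trials: (number, rt) pairs from one participant and one condition.
    Returns {number: mean RT after removing trials beyond cutoff SDs}."""
    by_number = defaultdict(list)
    for number, rt in trials:
        by_number[number].append(rt)

    averaged = {}
    for number, rts in sorted(by_number.items()):
        if len(rts) > 2:
            m, sd = mean(rts), stdev(rts)
            kept = [rt for rt in rts if abs(rt - m) <= cutoff * sd]
        else:
            kept = rts  # too few trials to estimate a spread
        averaged[number] = mean(kept)
    return averaged


# Made-up RTs in ms: one very slow trial among twelve for numerosity 3.
trials = [(3, 500)] * 11 + [(3, 2000)] + [(1, 400), (1, 420)]
averaged = trim_and_average(trials)
print(averaged)
```

With few trials per cell, a single extreme value inflates the standard deviation enough that it may not exceed the cutoff, so the number of trials per numerosity matters for this kind of trimming.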

Thirty-eight participants (25 females), students at the University of Western Ontario, participated in the experiment for course credit. The mean age was 18.26 years (

The study was designed and executed in OpenSesame (

Groups of 1-9 cartoon frogs (11.43 x 12.7 mm) appeared on a white background. We used images of frogs and a frame with a “beach” theme to make this design suitable for children as well as adults (see

Participants were asked to say aloud how many frogs they saw on the screen, as quickly and as accurately as possible. No feedback was given. To practice vocal responses, participants first completed a practice block in which they read simple words aloud. This block was also used to adjust the sensitivity of the microphone for every participant. The practice was discontinued once the experimenter was satisfied with the performance of the participant. In the experimental task, a trial started with a red fixation point presented at the center of the screen for 1,000 ms. Five hundred ms after the offset of the fixation point, a target appeared for 350 ms, followed by a 100 ms mask. Then, a slide asking “how many?” was presented, and remained until the participant provided a vocal response. Participants were instructed to respond as soon as they could, whether the stimulus was still presented or had already been replaced by the mask or the “how many?” slide. RT was measured from the onset of the target stimulus until a vocal response was detected. Then, the experimenter typed the response on a keyboard. The next trial started 1,000 ms after the response was typed (see

We will first describe our preprocessing procedure. Then, we will describe an analysis confirming that our study replicates previous findings. Finally, we will demonstrate how the ISR predicted by our method is related to differences between RT to canonical and random patterns.

Trials in which participants made unrelated sounds that were registered as a response (e.g., started to say one number and changed it to another) were excluded. This procedure resulted in the exclusion of 3.1% of the trials across conditions and participants. Only correct trials were used in the analysis. RTs of correct trials were trimmed: trials that deviated by more than 2.5 standard deviations from the mean of a specific participant at a specific number and arrangement were removed, resulting in the trimming of 1.07% of the correct trials of the random condition and 1.49% of the correct trials of the canonical condition across all participants. ISR was calculated separately for the random and the canonical conditions.

RT and accuracy rates in random vs. canonical arrangements were analyzed to verify that our study replicates previous findings. In this analysis, we included all numbers, including 1 and 9, to get an overview of the pattern. ^{2}_{p} = .86; ^{2}_{p} = .91 for RT and accuracy, respectively. Performance was also faster and more accurate for canonical arrangements in comparison to random arrangements ^{2}_{p} = .75; ^{2}_{p} = .95 for RT and accuracy, respectively. Finally, we found an interaction between range and arrangement. Namely, the effect of arrangement, although significant across ranges, was stronger in the counting range than in the subitizing range ^{2}_{p} = .64; ^{2}_{p} = .91 for RT and accuracy, respectively.

ISR was calculated for each participant separately for canonical and random arrangements. For the calculation of ISR, we used quantities 1-8, since in our other experiments the maximal quantity was 8. In the canonical condition, the mean ^{2} value for the fit was 0.83. Out of 38 participants, 4 had an ^{2} value of 0.7 or less, which we considered unreliable. The correlation with the sigmoid fit method (^{2} value was 0.89. Two of the 38 participants had an ^{2} value of 0.7 or less. The correlation with sigmoid fit was 0.69. For random patterns, 56% of the participants had an ISR between 3 and 4 (including 4). However, 30% had an ISR of 3 or lower, and 14% had an ISR greater than 4. For canonical patterns, the proportions were different. Namely, only 6% of the participants had an ISR of 2 or lower, 37% had an ISR between 4 and 5, and 37% had an ISR greater than 5. The actual distribution of ISR per pattern is depicted in

As expected according to the literature, ISR was higher for canonical arrangements (mean = 4.73) compared with random arrangements (mean = 3.38).

Testing children poses unique challenges: their relatively short attention span limits the number of trials one can use, and the data usually contain more noise. In addition, a design that requires a vocal response is even more problematic: responding vocally is exhausting even for adults, and with children, fatigue means even more noise and a need for an even shorter experiment. Since many studies of subitizing are done with typically or atypically developed children and special populations (e.g.,

Twenty-four typically developed children (13 females) between the ages of 4 and 9 were recruited from the psychology department’s developmental participant pool. The mean age was 7 years (

Groups of 1-8 cartoon frogs (11.43 x 12.7 mm) appeared on a white background. We used images of frogs and a frame with a “beach” theme to make this design suitable for children as well as adults (see

The procedure was similar to that of Experiment 1. For 4-5-year-old children the practice included naming pictures instead of reading words.

In total, 5.45% of the trials across participants were excluded for technical reasons, such as the microphone failing to pick up the participant’s voice on the first try, a sound in the room being picked up as a response, or the participant starting to say one number and continuing to another (e.g., fou…five). Outliers (above and below 2.5 standard deviations) were also removed, resulting in the exclusion of 2.86% of all correct trials across participants and numerosities.

To demonstrate the expected pattern of enumeration task, we plotted error rate and RT as a function of numerosity (

We calculated ISR according to RT for correct responses only. Four participants had ^{2} value lower than 0.7 and they were excluded from further analysis. Mean ^{2} value for the rest of the participants was 0.92. The correlation with a sigmoid fit was 0.78. The distribution of ISR values across participants is depicted in

One critical difference between enumeration tasks in the literature is the time for which the stimulus is presented. In some studies, the stimuli are presented for a limited time. For example,

Thirty-one participants (26 females), students at the University of Western Ontario, participated in the experiment for course credit. The mean age was 18.19 years (

The stimuli were the same as those used in Experiment 2.

Participants were asked to say aloud how many frogs they saw on the screen, as quickly and as accurately as possible. We emphasized that being fast was as important as being accurate. No feedback was given. As described for the previous studies, a word-reading practice was given before the experimental task. In the experimental task, a trial started with a red fixation point presented at the center of the screen for 1,000 ms. Five hundred ms after the offset of the fixation point, a target appeared and remained on screen until the participant responded. Then, the experimenter typed the response on a keyboard. RT was measured from the onset of the target stimulus until a vocal response was detected. The next trial started 1,000 ms after the response was typed (see

Procedure of Experiment 3: Unlimited presentation time.

Due to technical problems, 2.95% of the trials across participants were removed. Outliers (above and below 2.5 standard deviations) were also removed, resulting in exclusion of 2.26% of all correct trials across participants and conditions.

As seen in

ISR was calculated based on RT for correct responses only. All participants had ^{2} value higher than 0.7. The mean ^{2} value was 0.96. The correlation with sigmoid fit was 0.75. The distribution of ISR in our sample is described in

The current experiment had more trials than the other experiments. This enabled us to test how consistent the ISR scores obtained by our method are. To this end, we first divided the data from Experiment 3 into two halves by separating odd and even trials. Namely, trials number 1, 3, 5, etc. of each participant were part of the “odd trials” data set, and trials number 2, 4, 6, etc. were part of the “even trials” data set. For each data set, we calculated ISR. Then we correlated the ISRs of the two data sets. This procedure yielded a Pearson value of 0.54 (
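The odd/even split and the correlation between the two halves can be sketched as follows. The function names are hypothetical, and the ISR values in the example are invented purely to show the call pattern; they are not data from the study. Computing the ISR for each half is left out here.

```python
import math


def split_odd_even(trials):
    """Trials 1, 3, 5, ... go to the odd half; trials 2, 4, 6, ... to the even half."""
    return trials[0::2], trials[1::2]


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Splitting one participant's trial list (trial labels only, for illustration).
odd_half, even_half = split_odd_even(["t1", "t2", "t3", "t4", "t5"])

# Invented ISR values per participant for the two halves.
isr_odd = [3.2, 3.8, 4.1, 2.9, 3.5]
isr_even = [3.0, 4.0, 3.9, 3.1, 3.6]
print(round(pearson_r(isr_odd, isr_even), 2))
```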

Comparing the new method to a sigmoid fit method, our method applies the same sigmoid fit as reported in previous studies (

To test this expected pattern, we calculated the ISR twice for every participant in Experiment 3: once according to our method and once according to the inflection point. The results were as expected; the two ISR estimates were highly correlated (Pearson's

In this paper, we present a novel method for calculating subitizing range at a single-subject level. The method combines sigmoid fit with bilinear fit, thereby overcoming some of the limitations these methods present separately. There is no need to assume an initial subitizing range or a normal distribution of data, no need to limit or truncate the range for fitting, the fit is less affected by outliers, and the ISR value does not over-estimate the subitizing range.

To validate this method, we used different common experimental designs and multiple populations. We replicated the effect of spatial arrangement: ISR was higher for canonical than for random arrangements. We also demonstrated that the correlation of ISR values using a traditional sigmoid fit and our new method is relatively high (between .7 and .83) in different populations (children and adults) and in different experimental designs (limited/unlimited stimuli presentation time and random/canonical arrangements).

Importantly, we used the exact same fit to sigmoid curve as previous studies (

Finally, we also conducted a reliability analysis to demonstrate that the ISR produces reliable results even with a procedure that is expected to take less than 30 minutes to complete. This is especially important when the ISR is tested with children or other populations with a short attention span. The task used in this study is available online in a child-friendly format.

The current method allows researchers to evaluate the subitizing range of individuals and to compare proportions of ISRs among different populations. Usually the subitizing range is reported to be between 3 and 4 or between 3 and 5. Our sample of typically developed adults and children, although small, suggests that there is variability in the subitizing range even in the general population. This variability depends not only on the population but also on the conditions of the task. For example, both Experiments 1 and 3 tested typically developed adults (university students). However, in Experiment 1, the stimuli were presented for a limited time and participants were encouraged to respond quickly. In Experiment 3, the stimuli remained on screen until participants responded, and speed was as important as accuracy. The proportions of different ISR values appear to differ between these two experiments: the proportion of ISRs greater than 4 was higher when stimuli were presented until response. This finding needs to be taken with caution given our relatively small sample size. Future studies that employ our method of analysis in different conditions and with larger sample sizes will be able to better evaluate how much the ISR is affected by different experimental conditions.

In addition, there is also variability in how much spatial arrangement affects the subitizing range in an individual (

The ISR values obtained here and with other methods (sigmoid and bilinear fits) are not necessarily whole numbers. What do ISR values represent? Is it accurate to say that an individual with a subitizing range of 4 will always, under various conditions, subitize 1-4 items and count more than 4 items? If so, then ISR values that are not whole numbers are meaningless. We, however, think that such an account of the meaning of ISR is less likely, given the multiple factors influencing enumeration. One such factor, as discussed above, is the time one has to see the stimuli. For example, compare trying to enumerate the number of legs of a housefly, flying very fast, vs. enumerating the number of petals of a flower you are holding. Even if both quantities are 4, it could be that with the fly, you will subitize (and maybe fail to reach a correct answer), while with the flower you will count, just because you have enough time and you want to be accurate.

Another example of a factor that strongly affects enumeration is the spatial arrangement of the to-be-enumerated items. A canonical pattern of 5 items will (most likely) be 'subitized' while a random pattern of 5 will (most likely) be counted. However, one must keep in mind that not all patterns are completely canonical or completely random. In other words, some random patterns may be less or more familiar than other random patterns. Some studies have demonstrated this point. For example,

Given the multiple factors that influence enumeration, we suggest that the ISR is not the actual subitizing range, but the

Our new approach combines the advantages of bilinear regression with a non-linear regression sigmoid fit to take advantage of the more robust nature of sigmoid fitting while correcting for the overestimation in subitizing range. There is now no need to make any prior assumptions about where the subitizing range may be, no data need to be truncated prior to fitting, and the approach is more robust to experimental variations (e.g. limited presentation time or an emphasis on speed) and to participant strategies, as well as being less affected by outliers in RT. The approach has been codified into easy-to-use Matlab scripts that can be run on both Windows and Mac systems and is generalizable across participants and experiments. The approach has been validated in several experiments with different populations (see results), replicating classic results from the literature.

A limitation of the method we report here is that it can only be used with response time but not accuracy data. Especially with adults, and in experiments where presentation time is not restricted or limited, some participants obtain consistently high accuracy measures, thus presenting no identifiable transition outside of the subitizing range. However, this limitation is the same for sigmoid fitting and bilinear fitting. We recommend using RT data for this reason, as it is more suitable to measure the difference between the subitizing and counting range across participants and experimental designs.

This study presented a new method and tools for calculating individual subitizing range and validated the method by demonstrating its correlation with sigmoid fit across different populations and experimental procedures. The experimental files, and scripts, as well as readme files are all available online at

The authors have no funding to report.

The authors have declared that no competing interests exist.

The authors have no support to report.