Revisiting and Refining Relations Between Nonsymbolic Ratio Processing and Symbolic Math Achievement

In their Psychological Science article, Matthews, Lewis, and Hubbard (2016, https://doi.org/10.1177/0956797615617799) leveled a challenge against the prevailing theory that fractions, as opposed to whole numbers, are incompatible with humans’ primitive nonsymbolic number sense. Their ratio processing system (RPS) account holds that humans possess a primitive system that confers the ability to process nonsymbolic ratio magnitudes. Perhaps the most striking finding from Matthews et al. was that ratio processing ability predicted symbolic fractions knowledge and algebraic competence. The purpose of the current study was to replicate Matthews et al.’s novel results and to extend the study by including a control measure of fluid intelligence and an additional nonsymbolic magnitude format as predictors of multiple symbolic math outcomes. Ninety-nine college students completed three comparison tasks in which they decided which of two nonsymbolic ratios was numerically larger, along with three simple magnitude comparison tasks in corresponding formats that served as controls. The formats included were lines, circles, and dots. We found that RPS acuity predicted fractions knowledge and scores on three university math placement exam subtests when controlling for simple magnitude acuities and inhibitory control. However, this predictive power of the RPS measure appeared to stem primarily from acuity of the line-ratio format, and that predictive power was attenuated with the inclusion of fluid intelligence. These findings may help refine theories positing the RPS as a domain-specific foundation for building fractional knowledge and related higher mathematics.

On this argument, because the approximate number system (ANS) is dedicated to processing numerosities (i.e., sets of discrete objects), it is well suited to serve as a foundation for the natural conceptual analogs to numerosities, namely the whole numbers (Dehaene, 2011; Feigenson et al., 2004). In contrast, these researchers have argued that innate constraints of the ANS make it inappropriate for processing fractions, concluding that fractions learning (in contrast to whole number learning) cannot be supported by phylogenetically ancient protonumerical abilities. Ni and Zhou (2005) referred to such theories as innate constraints accounts, and Matthews, Lewis, and Hubbard (2016) was framed largely as a response to such accounts with regard to fractions. Here, we replicated a novel finding of Matthews et al. (2016) and extended the study to put this controversial account regarding symbolic fraction acquisition to the test.

Processing Nonsymbolic Ratio Magnitudes
However, growing evidence suggests that there may be a primitive nonsymbolic number sense that is different from the ANS (Jacob, Vallentin, & Nieder, 2012; Lewis, Matthews, & Hubbard, 2015; McCrink & Wynn, 2007; Sidney, Thompson, Matthews, & Hubbard, 2017). This primitive ability is dedicated to processing nonsymbolic ratios (e.g., ratios instantiated by juxtaposing two line segments; Figure 1). Multiple studies have shown that human adults (Bonn & Cantlon, 2017; Jacob & Nieder, 2009a; Matthews & Chesney, 2015; Meng, Matthews, & Toomarian, 2019), infants (McCrink & Wynn, 2007), and even some nonhuman species (Bastos & Taylor, 2020; Drucker, Rossa, & Brannon, 2016; Vallentin & Nieder, 2010) are capable of representing ratio magnitudes when presented nonsymbolically. Against this backdrop, Matthews et al. (2016) leveled a challenge against the innate constraints account that fraction learning is hard because it goes beyond the limits of our basic representational capacities (e.g., Feigenson et al., 2004). They proposed that a nonsymbolic ratio processing system (RPS) might help children acquire symbolic fractions knowledge efficiently (see also Jacob et al., 2012; Lewis et al., 2015; Matthews & Chesney, 2015). They suggested that the RPS might serve as a neurocognitive foundation for learning symbolic fractions much as the ANS supports symbolic whole number learning. They further argued that the RPS may provide an additional perceptual route that can expand cognitive primitive accounts of numerical cognition to apply to fractions, and perhaps to all real numbers (Sidney et al., 2017; see also Gallistel & Gelman, 2000).

Figure 1
Example Stimuli for Simple Magnitude (Top) and Ratio Magnitude (Bottom) Comparison Tasks
Note. Line, circle, and dot formats were presented.
Several studies have provided evidence consistent with the hypothesis that the RPS and symbolic fraction representations are compatible. One line of studies has demonstrated that both adults and children are capable of rapidly translating magnitudes across nonsymbolic and symbolic formats (Binzak, Matthews, & Hubbard, 2019; Kalra, Binzak, Matthews, & Hubbard, 2020; Matthews & Chesney, 2015; Matthews & Lewis, 2017; Meert, Grégoire, Seron, & Noël, 2013). In these studies, adults and children completed cross-format comparison tasks, whereby they determined the larger of a nonsymbolic and a symbolic ratio. The nonsymbolic ratios were presented in several formats, including lines (Kalra et al., 2020) and circles or dots (Matthews & Chesney, 2015). Regardless of nonsymbolic format, participants' responses were rapid and generally accurate. In fact, cross-format comparisons are typically completed faster than within-format comparisons of symbolic fractions. Furthermore, these comparisons showed a distance effect whereby performance improves as the distance between two magnitudes increases (Moyer & Landauer, 1967), suggesting that both symbolic and nonsymbolic ratios were represented as analog magnitudes and that those analog magnitudes were compatible enough to facilitate rapid comparison.

The RPS and the Acquisition of Symbolic Fractions Knowledge
One important aspect of the RPS account is that it posits that RPS acuity can help support the acquisition of fractions knowledge and other downstream mathematics, such as algebra. Specifically, Matthews et al. (2016) hypothesized 1) that both formal and informal learning help generate links between nonsymbolic ratios and their corresponding symbolic fractions, 2) that individual differences in RPS acuity might moderate the effects of instruction, and 3) that the RPS, which has been shown to operate even when it serves as a task-irrelevant dimension (Jacob & Nieder, 2009a; Matthews & Lewis, 2017), exerts its effects on learning even when it is not an explicit pedagogical focus.
On this hypothesis, RPS ability should be associated with fractions knowledge, and perhaps even with higher mathematics such as algebra, which requires an understanding of relational magnitude. To test this hypothesis, Matthews et al. (2016) investigated possible relations between the RPS and symbolic math achievement scores. The authors used a series of ratio magnitude comparison tasks to construct a composite measure of RPS acuity and also used simple dot and line comparison tasks to measure acuities for number of dots and line lengths as controls. In a novel finding, they observed that RPS acuity predicted symbolic numerical measures, including fractions knowledge and algebra assessment scores measured at college entry. These relations were significant even after controlling for inhibitory control, number acuity, and line-length acuity.
To date, however, this novel result has not been replicated. Although two studies with children also showed similar relations between nonsymbolic ratio comparison performance and symbolic fractions ability (Hansen et al., 2015; Möhring, Newcombe, Levine, & Frick, 2015), even these closest existing reports were quite different from Matthews et al. (2016) in at least three important ways: 1) neither study measured RPS acuity explicitly, 2) they were conducted with child participants rather than adults, and 3) as a result, neither assessed relations between the RPS and advanced mathematics, such as algebra.
With the present study, we aimed to replicate Matthews et al.'s novel results using some identical tasks, a similar protocol, and a sample drawn from roughly the same population (i.e., students from the same introductory courses at the same university). At the same time, the current study aimed to refine and extend the results in three ways. First, we included the additional domain-general control measure of fluid intelligence along with the inhibitory control measure from the original study. Fluid intelligence, the ability to solve novel and abstract problems, is known to be related to mathematical attainment and higher order mathematics (Preusse, Elke, Deshpande, Krueger, & Wartenburger, 2011; Primi, Ferrão, & Almeida, 2010). Thus, including this measure imposes a tougher test of the extent to which RPS acuity can explain unique variance in higher order mathematics. Second, whereas Matthews et al. examined only algebra subtest scores from the university placement exam, we acquired additional subtests of higher order math achievement from the same exam. These were trigonometry and math fundamentals, the latter of which tested a combination of basic arithmetic, algebraic, and geometry skills. Even though the math fundamentals subtest featured some algebra items, they were confined to linear equations, whereas the algebra subtest consisted of more advanced materials such as non-linear equations and complex functions. Similar to the algebra subtest, both math fundamentals and trigonometry seem to be more distal outcomes when compared to fractions comparisons. Although there is no preexisting evidence to guide our hypotheses, our theory suggested the possibility that RPS acuity may be predictive of these two additional measures. Thus, we sought to explore how the RPS might predict these other tests of higher-order mathematics.
Third, we added ratio and simple magnitude comparisons for circle stimuli (see Matthews & Chesney, 2015; Meng et al., 2019) as a new format of nonsymbolic comparison tasks along with the line and dot formats from the original study. We added circles for three reasons: 1) unlike dot arrays, they have no obvious whole number analogs; 2) unlike lines, they are not easily partitioned such that count-based strategies are plausible; and 3) despite their use in other RPS studies (e.g., Park, Viegut, & Matthews, 2020; Matthews & Chesney, 2015; Meng et al., 2019), their relations to symbolic mathematics performance have yet to be investigated.
In an analytical extension, we investigated the comparative predictive power of ratio processing ability for each separate format. Prior work has shown that RPS acuities differ depending on format (Park et al., 2020). Moreover, other work suggests that the relations between acuity for magnitude in a specific format and math achievement may be specific to particular subdomains of mathematics (Lourenco, Bonny, Fernandez, & Rao, 2012; Odic et al., 2016; Park & Cho, 2017). However, Matthews et al. combined performance across different ratio formats to create a composite RPS measure that obscured potential differences by format. In the current study, we included acuity in each format as separate predictors to determine whether RPS acuity in each format differentially relates to mathematical achievement. In addition, prior work has shown that perceptual acuity for magnitudes can vary substantially by format (e.g., Odic, 2017; Odic, Libertus, Feigenson, & Halberda, 2013; Starr & Brannon, 2015). Acuities for continuous magnitudes are typically found to be more accurate than acuity for number of dots (Odic, 2017; Odic et al., 2013; Park & Cho, 2017; Starr & Brannon, 2015). For instance, Odic (2017) found that acuity for discriminating line segments was the most accurate, followed by acuities for discriminating area and numerosity. Based on this prior research, we hypothesized that our participants would demonstrate the highest acuity for our line-based stimuli, followed by that for circles, and then dots. Finally, we expected that ratio comparisons would be more difficult than simple magnitude comparisons, as previously found by Matthews et al. (2016).

Method

Participants
Ninety-nine undergraduate students from a large Midwestern university (85 Female; M age = 20.12, SD = 1.14) participated for course credit.

Measures
Because this study was a conceptual replication of Matthews et al. (2016), we used several of the same measures from Matthews et al. and added a few more as well. For the chief predictors, we used three nonsymbolic ratio comparison tasks: separated formats of the dot and line ratio tasks from Matthews et al. and circle ratios, adapted from Meng, Matthews, and Toomarian (2019). We also included other cognitive tasks as covariate predictors. First, we used simple magnitude comparison tasks in dot, line, and circle formats to control for the ability to process the absolute magnitudes of the components of ratios (in contrast to their relative magnitudes). Each of these tasks in simple formats has been studied for decades (e.g., Krueger, 1984; Stevens, 1957; Teghtsoonian, 1965). We also included a flanker task to account for differences in inhibitory control and Raven's Standard Progressive Matrices as a measure of general intelligence.
Three of the outcome measures were identical to those in Matthews et al. (2016): symbolic fractions comparisons, a fractions knowledge assessment (FKA), and Algebra scores from participants' university placement examinations. We also included two additional placement examination subtest scores, Trigonometry and Math fundamentals (all the measures are listed in Table 1). Each measure is described in more detail below. Task materials, data collected, and the R scripts are available online via the Open Science Framework (https://osf.io/c75xy). (Schneider, Eschman, & Zuccolotto, 2002).

Nonsymbolic Comparisons (RPS Acuity and Simple Magnitude Control Tasks)
Nonsymbolic comparison tasks were blocked by type (i.e., ratio or simple magnitude) and format (i.e., dot, line, or circle stimuli). For all nonsymbolic comparisons, participants were simultaneously presented with two stimuli and instructed to choose the larger one. Participants indicated their choices via key press, pressing "j" for right and "f" for left. Each trial began with a fixation cross for 200 ms, immediately followed by brief presentation of two comparison stimuli (Figure 1). Per Matthews et al. (2016), nonsymbolic ratio comparisons were presented for 1,500 ms before disappearing, and simple magnitude control trials were presented for only 750 ms. Trials did not advance until participants responded. Each ratio block started with 10 practice trials followed by 40 experimental trials, and each simple magnitude control block started with five practice trials followed by 40 experimental trials. Task difficulty varied from trial to trial and was operationalized as the ratio distance, or the ratio between compared stimuli in a trial. Difficulty increased as ratio distance approached 1:1. Note that for ratio comparison trials, ratio distance was a ratio of ratios. The ranges of line and dot ratio distances were varied by format per Matthews et al. (Table 2). For circle stimuli, we adopted the distances for lines from Matthews et al., as our prior piloting demonstrated that participants had similar discrimination abilities in that format.
Line-Ratio Comparisons - Line ratio stimuli were constructed by juxtaposing white and black line segments with jitter per Matthews et al. (2016). White segments ranged from approximately 24 to 128 pixels long, and black segments ranged from approximately 30 to 254 pixels. We followed Matthews et al.'s controls to minimize the likelihood that participants would choose based on irrelevant dimensions, such as the physical length of components or overall raw size, as opposed to ratio magnitude. Hence, we controlled stimuli such that the larger ratio had longer summed lengths (numerator plus denominator) in half of all trials, and the larger ratio had shorter summed lengths in the other half of trials.
Simple Line Comparisons - Individual black line segments appeared on each side of the screen. Segments ranged from approximately 64 to 162 pixels in length. The two lines were always jittered relative to each other so that participants would be encouraged to consider the entire length of each line as opposed to merely focusing on the tops of the lines, as would be possible if they were aligned at the bottom.
Circle-Ratio Comparisons - Stimuli were constructed of white circles in the numerator/top position and black circles in the denominator/bottom position. The size of white circles ranged from approximately 2,826 to 12,070 square pixels, and the size of black circles ranged from approximately 3,847 to 18,617 square pixels. We controlled summed areas such that the larger ratio had a larger summed area in half of all trials, and the larger ratio had a smaller summed area in the other half of trials.
Simple Circle Comparisons - Two black circles were presented, one on each side of the screen. The size of circles ranged from approximately 1,661 to 5,539 square pixels.

Dot-Ratio Comparisons - Ratio stimuli were constructed from juxtaposed pairs of white dot arrays against black backgrounds (numerators) and black dot arrays against white backgrounds (denominators). The number of dots in the numerators ranged from 11 to 67, and the number of dots in the denominators ranged from 30 to 118. We controlled the summed numerosities (i.e., the summed number of white and black dots) such that in half of all trials, the larger ratio featured a greater summed number of dots, and in the other half, the larger ratio had a smaller number of summed dots.
Simple Dot Comparisons - An array of black dots against a rectangular gray background appeared on each side of the screen. The number of dots in arrays ranged from 50 to 200 to preclude the possibility of counting given the rapid rate of response typical for such tasks (i.e., <1,000 ms). In half of the trials, the summed area of dots was constant across the two arrays, and in the other half, the dot size was constant across two arrays. Thus, in the first case, dot size was anticorrelated with numerosity, and in the other case, the cumulative area and density were correlated with numerosity.

Symbolic Fractions Comparisons
Participants selected the larger of two symbolic fractions via keypress. All fractions stimuli were irreducible and composed of single-digit numerators and denominators. We used the same 30 pairs used by Matthews et al. (2016), which excluded fractions pairs sharing common components (e.g., 3/5 vs. 3/6) to minimize dependency on componential strategies (i.e., judgments based solely on numerator or solely on denominator comparisons, rather than on overall fraction magnitude). On each trial, a fixation cross appeared for 200 ms followed by presentation of comparison stimuli until the participant responded or until the trial timed out at 5,000 ms. Symbolic comparison blocks started with five practice trials followed by 30 experimental trials. The side on which the larger fraction appeared (left/right) was counterbalanced across trials. Order of trial presentation was randomized.

Flanker Task
Our version of this measure of inhibitory control was identical to that from Matthews et al. (2016). Participants were asked to decide which direction the center arrow among five was pointing. On each trial, a fixation cross was presented for 500 ms followed by an array of five evenly spaced arrows, which appeared for up to 800 ms or until the participant's response. Participants first received 12 practice trials followed by 80 test trials. Half of the experimental trials were congruent, in which all stimuli pointed the same direction, and the other half of the trials were incongruent, in which the center stimulus pointed the opposite direction from the four flanking arrows. We used the congruity-based difference in RTs (RT_incongruent − RT_congruent) in our analyses.
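As an illustration of the congruity-based difference score described above, the following minimal Python sketch computes the mean RT for incongruent trials minus the mean RT for congruent trials. The function name, the (condition, RT) trial layout, and the example values are assumptions made for illustration; the text does not specify whether only correct trials entered the score, so the sketch simply uses every trial it is given.

```python
from statistics import mean


def flanker_effect(trials):
    """Return mean RT(incongruent) - mean RT(congruent), in ms.

    `trials` is an iterable of (condition, rt_ms) pairs. This layout is a
    hypothetical one chosen for the example, not the study's data format.
    """
    congruent = [rt for cond, rt in trials if cond == "congruent"]
    incongruent = [rt for cond, rt in trials if cond == "incongruent"]
    return mean(incongruent) - mean(congruent)


# Made-up RTs for one participant (ms).
example = [
    ("congruent", 420.0), ("congruent", 440.0),
    ("incongruent", 480.0), ("incongruent", 500.0),
]
effect = flanker_effect(example)  # 490.0 - 430.0
```

Larger positive values indicate a larger interference cost from the incongruent flankers.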

Fractions Knowledge Assessment
The FKA was a 38-item pencil-and-paper test constructed by Matthews et al. (2016). Items were culled from key national and international assessments (e.g., National Assessment of Educational Progress and the Trends in International Mathematics and Science Study), and from assessments developed by psychology and math education researchers (Carpenter, Corbitt, & National Council of Teachers of Mathematics, 1981; Hallett, Nunes, Bryant, & Thorpe, 2012). Items were intended to measure aspects of conceptual knowledge and of fraction arithmetic procedures. The conceptual knowledge items covered ordering of fractions, density, and how fractions operations affect magnitudes plotted on a number line (see sample items in Appendix). Participants had 20 minutes to complete the test. All participants were able to finish the test within the time limit.

College Mathematics Placement Exams
We obtained scores from three subtests of the math placement exam taken by all incoming freshmen: Advanced Algebra (AALG), Math fundamentals (MFND), and Trigonometry & analytic geometry (TAG). The exams were taken by all freshmen once admitted to the University for placement purposes and have been subject to years of validation work by the university testing services. As noted above, Math fundamentals tested a combination of basic arithmetic, algebraic, and geometry skills. The subtests were composed of 30, 25, and 20 items respectively. The internal consistency reliability of each test (Cronbach's α) was .89 for MFND, .88 for AALG, and .85 for TAG. The mean normalized assessment score for each test is 500 with a standard deviation of 100.

Raven's Progressive Matrices
Raven's is a widely used standardized test measuring fluid intelligence (Raven, Raven, & Court, 1998). The test is composed of five sets of 12 items each. Each item requires analyzing a pattern of figures and reasoning about what figure would complete the pattern if placed in the blank. As the test progresses, the items become more complex. Participants were given 20 minutes to finish the test. Each item of the test was worth 1 point, with a total possible raw score of 60. This test has been used across a wide range of populations, with test-retest reliabilities of .83-.93 (Raven, 2000; Raven et al., 1998). We used the raw score in analysis.

Experimental Procedure
The experiment was divided into two sessions, each on a different day (M gap = 5.25 days, SD = 2.8). In session 1, participants completed all comparison and flanker tasks on computers. First, participants completed the nonsymbolic comparison tasks. The order of format was always dots, lines, and then circles. In each format, participants completed the simple magnitude control tasks first followed by the ratio magnitude comparison task. Next, participants completed symbolic fractions comparison followed by the flanker test. In session 2, the participants completed the FKA followed by Raven's Progressive Matrices.

Missing Cases and Outlier Removal
Data for simple line comparisons from one participant, simple circle comparisons from another, and the FKA for another were not collected due to experimenter error. Also, FKA and Raven's scores were unavailable for six participants who failed to return for session 2. Additionally, we were unable to secure placement exam scores for one participant. These specific data elements were missing for individual participants whose data remained otherwise intact. Remaining data for these participants were included in the analyses whenever possible. However, because regressions were run using listwise deletion, participants with missing data elements were removed entirely from those regressions. We indicated the included sample size for each analysis in the corresponding tables for reference. For all computerized tasks, trials with reaction times (RTs) shorter than 250 ms and trials with RTs more than 3 SD from a participant's mean RT were trimmed. These steps resulted in the loss of 1.6 to 3.91% of the data for each computerized task. Additionally, for all comparison tasks, we excluded data from participants who scored beyond 3 standard deviations from the group mean for that task and from participants who scored below chance level. This step resulted in the exclusion of four participants' ratio comparison task data (2 from circle-ratio, 1 from dot-ratio, 1 from line-ratio) and three participants' symbolic fractions comparison data. This resulted in a trimmed analytic sample of 80 participants. Note that we used this trimmed sample in the hierarchical regressions, except those predicting symbolic fractions comparisons, so that the sample would not vary depending on the number of predictors in each step. For models predicting symbolic fractions comparison, an additional three participants were excluded for below chance level performance, resulting in an analytic sample of 77 participants.
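The trial-level trimming and participant-level exclusion rules described above can be sketched in Python as follows. This is a minimal illustration, not the authors' R script: the function names are hypothetical, the sample standard deviation is assumed, and the order of operations (dropping RTs under 250 ms before computing each participant's mean and SD) is an assumption, since the text does not state it.

```python
from statistics import mean, stdev


def trim_rts(rts_ms, floor_ms=250.0, n_sd=3.0):
    """Trim one participant's RTs for one task.

    Assumption: RTs below `floor_ms` are dropped first, and the mean and
    sample SD are then computed on the remaining trials.
    """
    kept = [rt for rt in rts_ms if rt >= floor_ms]
    if len(kept) < 2:
        return kept  # too few trials to estimate an SD
    m, sd = mean(kept), stdev(kept)
    return [rt for rt in kept if abs(rt - m) <= n_sd * sd]


def flag_participants(accuracy_by_id, chance=0.5, n_sd=3.0):
    """Return IDs scoring below chance or beyond n_sd SDs of the group mean."""
    scores = list(accuracy_by_id.values())
    m, sd = mean(scores), stdev(scores)
    return {pid for pid, acc in accuracy_by_id.items()
            if acc < chance or abs(acc - m) > n_sd * sd}


# Made-up data: twenty ordinary RTs, one very slow trial, one anticipation.
rts = [480.0, 520.0] * 10 + [5000.0, 100.0]
trimmed = trim_rts(rts)

# Made-up task accuracies; "low" falls below the 50% chance level.
accuracies = {"p%d" % i: 0.80 + 0.01 * (i % 3) for i in range(10)}
accuracies["low"] = 0.45
flagged = flag_participants(accuracies)
```

In a two-alternative comparison task, chance is 50%, which is why `chance` defaults to 0.5 here.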

Weber Fraction Analysis
We used task accuracy as the measure of acuity in most of our analyses. However, to compare acuity across different comparison tasks, we computed internal Weber fractions (hereafter, w) as opposed to accuracy as measures of acuity. In comparison tasks, w represents the smallest discriminable difference expressed as a ratio between the two magnitudes (e.g., Halberda, Mazzocco, & Feigenson, 2008; Pica, Lemer, Izard, & Dehaene, 2004), with smaller w indicating better discrimination acuity. Because the Weber fraction is invariant to differences in specific ratio difficulties used to measure it, w allows direct comparison across tasks. By contrast, it is inappropriate to compare discriminability across tasks using accuracy when the tasks use different ratio distances for comparisons.
The model for Weber's Law assumes that the internal representation for magnitude can be represented as a Gaussian function. For instance, in the case of dot comparisons, if we use $n_1$ and $n_2$ to indicate the number of dots in each array, the complementary Gaussian error function (erfc; Eq. 1) computes the degree of overlap between the two Gaussian functions. The overlap can be expressed as a new Gaussian with a mean of $(n_1 - n_2)$ and a standard deviation of $w\sqrt{n_1^2 + n_2^2}$. For each individual, w can be calculated using the Levenberg-Marquardt algorithm for nonlinear least squares fit on mean accuracy as a function of magnitude ratio (Eq. 2).

Eq. 2. Expected Percentage Correct
$$\text{Percentage correct} = 1 - \frac{1}{2}\,\operatorname{erfc}\left(\frac{n_1 - n_2}{\sqrt{2}\,w\,\sqrt{n_1^2 + n_2^2}}\right)$$
We calculated w for each individual participant in each nonsymbolic comparison task. Prior to analysis, we excluded three participants' ws from circle ratio comparisons and one participant's w from dot ratio comparison due to extremely large values (>10, which would imply difficulty discriminating between 10 and 110 dots). We took such extreme values as indicators of noncompliance (see Odic [2017] for a similar approach). We also excluded participants' w data that fell beyond 3 standard deviations from the group mean for each task. This step resulted in the exclusion of several w data points distributed across tasks (3 from dot ratio, 5 from circle ratio, 1 from line ratio, 1 from dot and 1 from line comparisons).
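To make the psychophysical model concrete, the sketch below implements the expected-accuracy function from the Gaussian overlap described above (difference distribution with mean n1 − n2 and standard deviation w·sqrt(n1² + n2²)) and estimates w by least squares. It is only a sketch under stated assumptions: the function names are hypothetical, and a golden-section search over w is swapped in for the Levenberg-Marquardt algorithm named in the text so that the example depends only on the Python standard library. Golden-section search assumes the squared-error curve has a single minimum over the searched interval.

```python
import math


def expected_accuracy(n1, n2, w):
    """Model proportion correct when comparing magnitudes n1 and n2."""
    big, small = max(n1, n2), min(n1, n2)
    z = (big - small) / (math.sqrt(2.0) * w * math.sqrt(big ** 2 + small ** 2))
    return 1.0 - 0.5 * math.erfc(z)


def fit_w(pairs, observed, lo=0.01, hi=10.0, tol=1e-6):
    """Least-squares estimate of w via golden-section search on [lo, hi].

    `pairs` holds (n1, n2) magnitude pairs and `observed` the mean
    accuracy at each pair. This is a stand-in for Levenberg-Marquardt.
    """
    def sse(w):
        return sum((expected_accuracy(a, b, w) - acc) ** 2
                   for (a, b), acc in zip(pairs, observed))

    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(c) < sse(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2.0


# Made-up example: accuracies generated from the model with w = 0.2.
pairs = [(10, 20), (10, 15), (10, 12), (10, 11)]
observed = [expected_accuracy(n1, n2, 0.2) for n1, n2 in pairs]
w_hat = fit_w(pairs, observed)
```

For ratio comparison tasks, the same function would apply with n1 and n2 replaced by the two ratio magnitudes being compared, which is an interpretation of the text rather than something it states explicitly.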

Statistical Power
We conducted two different types of analysis of our results: the first compared acuities across different tasks using w as the dependent variable, and the second replicated Matthews et al.'s (2016) regressions, using acuity indexed by accuracy as the independent variable. Because replication was our main goal, we powered the study for the analyses using accuracy as the chief independent variable. We used the "pwr" package in R to calculate statistical power based on the effect sizes of .18-.25 observed in the hierarchical models predicting symbolic fractions comparison (f² = .18), FKA (f² = .25), and Algebra (f² = .20) in Matthews et al. (2016) and α = .05 (Figure 2). Given these conditions, our initial recruitment of 99 participants would have resulted in power (1 − β) of .96 for FKA, .91 for Algebra, and .85 for symbolic fractions comparison. After cleaning, our regression framework with the trimmed samples using comparison task accuracy as predictors had power to detect the above listed effects of .90 (n = 80) for FKA, .82 (n = 80) for Algebra, and .72 (n = 77) for symbolic fractions. For the supplementary analysis with models using ws as predictors, power was .87 (n = 74) for FKA, .78 (n = 74) for Algebra, and .68 (n = 71) for symbolic fractions comparison (see Table S4, Supplementary Materials).

RPS and Simple Magnitude Acuity
To compare acuity across magnitude types and formats, we conducted linear mixed effects regression models to account for within-subject correlation using the "lmer" function of the lme4 package in R (Bates, Mächler, Bolker, & Walker, 2015). We regressed acuity (ws) against task (2 levels, ratio = 0, simple magnitude = 1) and format (3 levels, dot = 0, circle = 1, line = 2). We also included a task × format interaction term in the model to check whether observed format differences differed depending on whether participants were comparing ratios or making simple magnitude comparisons. To facilitate evaluation of our hypotheses for how ws would vary with format and task (i.e., the ratio > magnitude task and dot > circle > line), we used a backward difference coding scheme to compare adjacent levels of variables (i.e., coded so that the mean of a given level was compared with the mean of the immediately prior level). We estimated fixed effects for all predictors with random intercepts. Results from these regressions are presented in Table 3. Participants showed significantly higher acuity (lower ws) for simple magnitude comparisons than for ratio comparisons (β = −.21, p < .001; Figure 3, Table 3). Moreover, there was a significant format effect. Consistent with our hypotheses, acuity was higher for lines than for circles (β = −.02, p = .021) and higher for circles than for dot arrays (β = −.07, p < .001). There were no significant interactions between tasks and formats, which suggests that main effects were additive (see Figure 3).
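The backward difference coding scheme mentioned above can be illustrated with a short Python sketch that builds the standard contrast matrix for a factor with k ordered levels. The function name is hypothetical, and the level means in the example are made up. Each contrast column compares one level with the level immediately before it, so that with the format levels ordered dot, circle, line, the two columns correspond to circle versus dot and line versus circle.

```python
from fractions import Fraction


def backward_difference_contrasts(k):
    """Return a k x (k - 1) backward difference contrast matrix.

    Column j (1-based) compares level j + 1 with level j: levels up to j
    are coded -(k - j)/k and later levels are coded j/k.
    """
    rows = []
    for i in range(k):
        row = []
        for j in range(1, k):
            row.append(Fraction(-(k - j), k) if i < j else Fraction(j, k))
        rows.append(row)
    return rows


# Three format levels in the order dot, circle, line.
contrasts = backward_difference_contrasts(3)
# Rows: dot    -> [-2/3, -1/3]
#       circle -> [ 1/3, -1/3]
#       line   -> [ 1/3,  2/3]

# Check with made-up level means: grand mean plus contrasts times the
# adjacent differences reproduces each level mean exactly.
means = [Fraction(5), Fraction(3), Fraction(2)]
grand = sum(means) / len(means)
diffs = [means[1] - means[0], means[2] - means[1]]
rebuilt = [grand + sum(c * d for c, d in zip(row, diffs)) for row in contrasts]
```

Under this coding, each regression coefficient for a factor estimates the difference between adjacent level means, which matches the way the format effects are reported (lines versus circles, circles versus dots).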

Figure 3
Task and Format Differences in ws
Note. Lower Weber fractions indicate higher acuity. Simple magnitude acuity was sharper than that for ratio magnitudes, as all ws in the panel on the right are lower than those on the left. Acuity also varied by format with acuity increasing (i.e., w decreasing) from dot to circle to line format.

Revisiting Matthews et al. (2016)
We first replicated the analyses from Matthews et al. (2016). We calculated a composite RPS acuity by taking the mean of line ratio and dot ratio accuracies per Matthews et al. Zero-order correlations showed that RPS composite scores were significantly correlated with three of the five symbolic outcome measures (FKA: r = .26, p = .019, Math fundamentals: r = .28, p = .014, Trigonometry: r = .23, p = .038) but not with symbolic fraction comparison (r = .16, p = .16) and Algebra (r = .20, p = .069) (Table 4). Next, we conducted a series of two-stage hierarchical linear regressions (Table 5), one for each of the symbolic outcomes (i.e., symbolic fractions comparison, FKA, Algebra, Math fundamentals, Trigonometry). In the first stage, we entered the control variables simple dot acuity, simple line acuity, and flanker performance. In the second stage, we added composite RPS acuity. To facilitate the interpretation of effect size across variables, we reported standardized coefficients in all hierarchical regression models.
These results noted, use of an RPS composite does not allow insight into whether RPS acuity in different formats is differentially predictive of math outcomes. Thus, we expanded on Matthews et al. (2016) by disaggregating RPS acuity by format and controlling for fluid intelligence using Raven's Standard Progressive Matrices. We conducted analyses parallel to those above, but this time we included acuity in each format as separate predictors. Although we used accuracy as the indicator of acuity for the analyses reported below, we also conducted supplementary analyses using ws as an alternate measure of acuity. The results with ws were consistent with those reported using accuracy (see Supplementary Materials). Bivariate correlations showed that line ratio acuity was significantly correlated with four of five symbolic math outcomes (FKA: r = .32, p = .004, Algebra: r = .28, p = .013, Math fundamentals: r = .35, p = .001, Trig: r = .29, p = .009) and with Raven's scores (r = .32, p = .004). In contrast, dot and circle ratio acuities were not correlated with any of the five symbolic outcomes (Table 4). Raven's scores were correlated with all math outcomes (FKA: r = .32, p = .004, Algebra: r = .42, p < .001, Math fundamentals: r = .43, p < .001, Trig: r = .30, p = .007) except for symbolic fractions comparison (r = .16, p = .16). Because our regressions used trimmed samples, we also conducted supplementary bivariate correlation analysis without using listwise deletion (Table S2, Supplementary Materials), and the results showed similar correlations across variables.
Finally, we performed a new set of three-stage hierarchical linear regressions that extended Matthews et al. (2016) (Tables 6, 7, 8, 9, and 10). In Stage 1, we entered flanker scores and all simple magnitude acuities; in Stage 2, we entered ratio magnitude acuities; and in Stage 3, we added fluid intelligence. Below, we present results from regressions using comparison accuracies, but supplementary analyses conducted with Weber fractions yielded similar results (S4 Table a-e, Supplementary Materials).

Table 6
Results From the Hierarchical Regression Analyses Predicting Scores on Symbolic Fractions Comparison (n = 77)

Table 9
Results

Table 10
Results From the Hierarchical Regression Analyses Predicting Scores on Trigonometry (n = 80)
Dot acuity was the only significant predictor among the controls entered in Stage 1; it significantly predicted Algebra performance (β = .257, p = .036). Overall, inhibitory control and magnitude acuities in the models explained between 2 and 7% of the variance for math achievement in each of the three subdomains. When all ratio acuities were entered in Stage 2, they explained an additional 7-10% of the variance in the models, and dot acuity was no longer a significant predictor of symbolic math outcomes. However, line ratio acuity emerged as a significant predictor for all symbolic math outcomes except for symbolic fractions comparison: line ratio acuity significantly predicted FKA (β = .282, p = .034), Algebra (β = .305, p = .020), Math fundamentals (β = .344, p = .008), and Trigonometry (β = .293, p = .028) (Figure 4). No other ratio acuities were significant. Thus, it seems that the predictive power of the RPS composite was driven largely by the predictive power of the ratios presented in the line format.

Figure 4
Correlation Between Accuracies of Line Ratio Comparison Task and Different Math Achievement
When Raven's scores were added in the final step, they explained an additional 2-11% of the variance in the models. Fluid intelligence significantly predicted math achievement in three subdomains: FKA (β = .258, p = .039), Algebra (β = .375, p = .002), and Math fundamentals (β = .327, p = .007), but it was not predictive for symbolic fractions comparison (p = .168) or Trigonometry (p = .076). Because of the strong effect of intelligence and its shared variance with line ratio acuity, line ratio was rendered non-significant for most outcomes, but it remained a significant predictor for Math fundamentals (β = .273, p = .031).
Finally, we note the unexpected finding that symbolic fractions comparison was inversely correlated with acuity for simple dot comparisons. Upon further analysis, this correlation appears to be coincidental; indeed, we found that it was only present for the sample after it was trimmed by list-wise deletion for the regressions. When supplemental bivariate correlations were conducted without list-wise deletion, the correlation disappeared (r = −.04, n = 88). Similarly, when we conducted supplemental hierarchical regressions without Raven's (which allowed the inclusion of seven more participants from whom we failed to collect Raven's scores), the relation was once again nonsignificant (Table S5, Supplementary Materials).
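The distinction between list-wise and pairwise handling of missing data can be illustrated with a short sketch. The data below are synthetic, the helper names (`pearson_r`, `complete_case_mask`) are hypothetical, and the sample size and number of missing scores are arbitrary; the point is only that the same two variables can yield different correlations depending on whether cases missing a third variable are dropped.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson r over rows where both x and y are present (pairwise deletion)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = ~(np.isnan(x) | np.isnan(y))
    r = np.corrcoef(x[keep], y[keep])[0, 1]
    return r, int(keep.sum())

def complete_case_mask(*columns):
    """True for rows with no missing value in any column (list-wise deletion)."""
    stacked = np.column_stack([np.asarray(c, dtype=float) for c in columns])
    return ~np.isnan(stacked).any(axis=1)

# Synthetic stand-ins (assumed values, not study data).
rng = np.random.default_rng(1)
n = 88
dot_acuity = rng.normal(size=n)
fraction_comparison = rng.normal(size=n)
raven = rng.normal(size=n)
raven[:8] = np.nan  # arbitrary number of participants without a third score

# Pairwise: every participant with both focal measures contributes.
r_pairwise, n_pairwise = pearson_r(dot_acuity, fraction_comparison)

# List-wise: participants missing ANY model variable are dropped first.
mask = complete_case_mask(dot_acuity, fraction_comparison, raven)
r_listwise, n_listwise = pearson_r(dot_acuity[mask], fraction_comparison[mask])

print(f"pairwise:  r={r_pairwise:.3f}, n={n_pairwise}")
print(f"list-wise: r={r_listwise:.3f}, n={n_listwise}")
```

With small samples, removing even a handful of cases can shift a near-zero correlation enough to cross a significance threshold, which is one reason to report both versions.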

Discussion
The current research was a partial replication and extension of Matthews et al. (2016), which previously found an association between nonsymbolic ratio processing ability and symbolic numerical abilities including symbolic fractions comparisons, general fractions knowledge, and Algebra. We extended the prior work by including a new stimulus format (circle stimuli) and general intelligence as additional predictors and by including two additional symbolic math outcomes (Math fundamentals and Trigonometry). Our findings confirmed and refined some of the previously observed links between the RPS and symbolic math abilities, but also failed to replicate some of the original findings. We discuss the nuances and possible implications of these findings below.

The Links Between RPS Acuity and Symbolic Math Outcomes
Consistent with Matthews et al., when we operationalized RPS acuity as a composite of line and dot ratio performance, we found that composite acuity predicted symbolic fractions comparison and general fractions knowledge. This was true even when controlling for simple magnitude acuities and inhibitory control. On the other hand, the relation between composite RPS acuity and Algebra failed to replicate. However, the RPS composite was predictive of Math fundamentals, which also tested some basic algebra concepts.
When we disaggregated the composite to check the predictive power of each format, we found that effects of the RPS composite were largely driven by performance in the line ratio format. Prior to the addition of general intelligence in the third stage of our hierarchical regression, the line ratio format predicted performance on four of five outcome measures: the FKA, Algebra, Math fundamentals, and Trigonometry. Indeed, a standard deviation improvement on line ratio comparisons was associated with anywhere from one-fourth to one-third of a standard deviation improvement on these outcomes. The current findings both corroborate and refine Matthews et al.'s prior results showing that nonsymbolic ratio processing ability was predictive of symbolic math performance, with predictive power confined to the line ratio format.
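The effect-size reading above follows from the definition of a standardized coefficient. As a brief worked note (the symbols b, s_x, and s_y are introduced here for illustration: b is the unstandardized slope, and s_x and s_y are the standard deviations of the predictor and the outcome):

```latex
\beta = b \,\frac{s_x}{s_y},
\qquad
\frac{\Delta \hat{y}}{s_y} = \beta \,\frac{\Delta x}{s_x}
\quad\Longrightarrow\quad
\Delta x = s_x \;\Rightarrow\; \Delta \hat{y} = \beta \, s_y .
```

Holding the other predictors in the model constant, the Stage 2 line ratio coefficients reported in the Results (β = .282 to β = .344) therefore correspond to a predicted change of roughly 0.28 to 0.34 standard deviations in the outcome per standard deviation of line ratio accuracy.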
It is unclear why the line format was the most predictive. Although it was reasonable to expect that acuity would be higher for line ratios than for circles or dots based on prior research (e.g., Odic, 2017; Odic et al., 2013), we had no a priori expectations that the line format would prove more predictive than others. We speculate that it may have to do with the relative simplicity of the line format compared to the other two. The visual complexity of dot arrays is well documented (e.g., Gebuis & Reynvoet, 2012; Leibovich & Henik, 2013; Newcombe, Levine, & Mix, 2015). As a result of this complexity, non-numerical features add noise to the numerosity-based signal of ratio magnitude. Although line ratios are ostensibly simpler than circles, it has been demonstrated that participants can use either area or circumference as an index of circle size (Teghtsoonian, 1965). In contrast, by confining attention to a single dimension, the line format may allow participants relatively easy access to ratio information without unnecessary visual input. Hence, it may be the case that line ratio discrimination offers a cleaner measure of individual differences in participants' ratio processing acuity compared to other formats. More research is necessary to evaluate this speculative account.
It is striking that the low-level perceptual ability to discriminate line ratios, an ability that has been found even among rhesus macaques (Drucker et al., 2016; Vallentin & Nieder, 2008), was predictive of higher order symbolic math abilities. This is perhaps more noteworthy in light of the unexpected finding that line ratio comparison failed to predict symbolic fraction performance. After all, nonsymbolic ratio discrimination and symbolic fraction comparison shared several features: 1) both tasks were computerized, 2) both were alternative forced choice comparisons, and 3) both assessed analogous rational number magnitudes. Given this, it seems reasonable to expect that there should be more shared variance between line ratio comparison and symbolic fraction comparison than between ratio comparison and any other outcome. On the other hand, recent work by Bhatia et al. (2020) suggests that we should not expect a positive relation between RPS tasks and symbolic fraction comparison. Using match-to-sample tasks, the authors found that ratio matching tasks exhibited distance effects whereas fraction matching tasks did not. They interpreted this finding as underscoring the role that strategies (and strategy-inducing foils) can play in symbolic comparisons as opposed to nonsymbolic comparisons. Bhatia et al. hypothesized that the differential role played by explicit strategies in nonsymbolic and symbolic comparisons should render results from the two tasks largely independent. Although we found no correlation, as Bhatia et al.'s account predicts, more systematic work is necessary to adjudicate between these competing hypotheses.

Beyond Magnitude
Why would ratio processing ability predict higher order mathematics? RPS theorists (Jacob et al., 2012; Lewis et al., 2015; Matthews et al., 2016) have hypothesized that the ability to process nonsymbolic ratio magnitude might serve as a cognitive primitive that imbues symbolic fractions with meaning. According to Matthews and Chesney (2015) "…we might eventually come to teach what a fraction symbol like 1/3 represents in much the same way that we teach young children what the symbol 4 represents or what a 'dog' or a 'cat' is" (p. 52). However, we argue that there are two rather large problems with this account given the current evidence. First, according to the RPS-as-cognitive-primitive hypothesis, symbolic fraction comparison should have been predicted by RPS acuity. Second, the hypothesis fails to account for the complexity of the higher order mathematics abilities we measured. Even the simplest of them, the FKA, involves multiple arithmetic operations. In addition to the arithmetic operations, each college placement exam adds use of variables, math-specific vocabulary, and multi-step problems. Cognitive primitive accounts that focus on numerical magnitude do not explain why understanding the size of fractions (or whole numbers, for that matter) should confer proficiency with this added complexity. Thus, it is worth considering that more than magnitude per se is at play in the relations we found.
One possibility is that the most effective aspect of nonsymbolic ratio processing lies less in the ability to accurately map from a given nonsymbolic ratio to a specific symbolic fraction and more in the ability to focus on the relations between ratio components. That is, it may be that performance on RPS tasks effectively measures participants' abilities to attend to the fact that there is a multiplicative relation between components. If this is the case, then this sort of nonsymbolic relational reasoning may fuel the development of more general relational reasoning, even if some perceptual bias results in an inaccurate mapping between fraction symbols and their nonsymbolic analogs.
Two pieces of evidence are consistent with this account. First, Matthews and Chesney (2015) did find consistent biases in cross-format comparisons of symbolic fractions and nonsymbolic ratios, whereby participants overestimated the magnitudes of nonsymbolic stimuli. This emerged despite participants exhibiting extremely well-behaved sigmoid response patterns when considering how stimulus choice depended upon inter-stimulus distance. Notably, these biases were found with circle and dot ratios, which were not predictive in our study. However, recent work has found similar but smaller biases for comparisons of line ratios with symbolic fractions (Binzak et al., 2019). To the extent that these biases exist widely, they suggest that the predictive power of ratio processing may not necessarily lie in the ability to accurately ground symbolic fraction magnitudes in their nonsymbolic analogs. This potential individual difference in attending to the multiplicative relation between ratio components may parallel the construct of spontaneous focusing on relational information (SFOR), which has been described in recent work (McMullen, Hannula-Sormunen, Laakkonen, & Lehtinen, 2015; McMullen, Hannula-Sormunen, & Lehtinen, 2014). SFOR can be roughly defined as the unguided focusing of attention on relational quantitative aspects of the environment and the tendency to make use of these relations in action. It may be that individual differences in SFOR play a role in our ratio comparison tasks and contribute to the establishment of links between nonsymbolic ratios and their symbolic analogs. This speculative explanation is consistent with prior findings that SFOR uniquely predicts rational number knowledge and even algebra scores concurrently and longitudinally (McMullen, Hannula-Sormunen, & Lehtinen, 2017) and with findings that it is distinct from spontaneous attention to quantitative information per se (McMullen et al., 2014).
However, RPS acuity and SFOR have yet to be measured concurrently on the same participants. Given the potential connection, future studies should explore the relations between SFOR and RPS acuity.
Second, there appears to be important shared variance between line ratio processing ability and performance on our measure of fluid intelligence, Raven's Standard Progressive Matrices. As detailed above, we observed a significant effect of fluid intelligence on FKA, Math Fundamentals, and Algebra scores. Moreover, adding Raven's scores to our models rendered line ratio nonsignificant for all but Math Fundamentals. Here it is important to note both 1) that Raven's is often characterized as a test of relational reasoning (e.g., Carpenter, Just, & Shell, 1990; Crone et al., 2009; Waltz et al., 1999) and 2) that nonsymbolic ratio processing has been described as inherently relational (e.g., Bonn & Cantlon, 2017; Lewis et al., 2015; Matthews & Ellis, 2018; Matthews & Lewis, 2017). Thus, Raven's and line-ratio tasks may both serve as indices of a sort of relational reasoning that applies beyond consideration of magnitude and drives mathematical competence. Of course, the two are not identical, as they were correlated only at the .33 level, and line ratio remained significant for Math Fundamentals even after controlling for Raven's performance. Further study with diverse criterion measures of general intelligence and relational reasoning is needed to explore the dynamics between domain-general relational reasoning and the RPS.

Limitations and Future Directions
We would like to note two important limitations of our study. First, we failed to find the expected relations between multiple predictors and symbolic fraction comparisons: line, circle, and dot ratio performance failed to predict symbolic fractions performance, and performance on dot comparisons was actually negatively correlated with symbolic fractions comparisons. We have no strong explanations for these results. It is well known that there can be considerable variability in the strategies that people attempt to use when comparing symbolic fractions, and these vary depending upon attributes of the fractions compared (Morales, Dartnell, & Gómez, 2020; Obersteiner, Alibali, & Marupudi, 2020; Obersteiner, Van Dooren, Van Hoof, & Verschaffel, 2013). However, we used the same stimuli as Matthews et al., who seemed to encounter no problems on this end. Because we only included 30 symbolic fraction comparison trials, our data are not suited for exploring this result further. Use of a larger number of trials with more systematic attention to the composition of the fractions would allow for much more confidence in the results.
Second, although the current study confirms an association between perceptually based ratio processing abilities and symbolic math outcomes, our design was not adequate for testing proposed mechanisms connecting the two. One of the most interesting predictions of RPS-based theories is that ratio processing ability might be effectively leveraged to improve intuitions about symbolic fractions, thereby improving math performance. Our study cannot speak to this issue empirically, and, to our knowledge, neither can other existing RPS studies, beyond nods to the idea that number line estimation may in some way leverage RPS ability (e.g., Sidney et al., 2017). Therefore, future studies need to investigate these relations across age cohorts using cross-sectional and, ideally, longitudinal designs to answer such developmental questions. Recent evidence demonstrates that even preschoolers show individual differences in RPS acuity (Park et al., 2020), but researchers have yet to examine the relations between RPS acuity and general symbolic math ability among children. Additionally, if studies of the RPS are ever to yield information about its practical potential to enhance children's developing mathematical competence, then they must also go beyond exploration of individual differences to examine RPS-based interventions.

Conclusion
The current study replicated Matthews et al.'s (2016) findings that nonsymbolic ratio magnitude perception is associated with symbolic math abilities. We further refined Matthews et al.'s results by demonstrating that the predictive power of nonsymbolic ratio processing was specific to a particular format: line ratios. However, this novel association between primitive perceptual ability and higher-order mathematics still needs to be further unpacked. Beyond the magnitude feature of nonsymbolic ratio processing, we speculate that a substantial portion of its explanatory power stems from its links to relational reasoning. Future research should investigate the extent to which RPS ability is related to both domain-general and spatial- or math-specific relational reasoning and the extent to which each of these is responsible for explaining performance in higher-order mathematics. Overall, the present findings stand to help frame new questions