Mathematical thinking and executive functioning (EF) skills are thought to be tightly linked across the early childhood years. A large body of research has documented a positive association between EF and math achievement (e.g., Bull et al., 2008; Espy et al., 2004; Schmitt et al., 2017). Additionally, while EF is thought to be concurrently linked to a variety of academic skills such as reading, literacy and math (e.g., Best et al., 2011; Cameron et al., 2019; Shaul & Schwartz, 2014), its capacity to promote change in such skills appears to be the greatest for math (Fuhs et al., 2014; McKinnon & Blair, 2019; Schmitt et al., 2017). There is also a small but growing body of evidence that math skills can predict, and perhaps even promote, change in EF skills in return (Fuhs et al., 2014; McKinnon & Blair, 2019; Welsh et al., 2010). EF and math may thus be part of a positive feedback cycle that could be of particular relevance in early childhood when these skills are first forming.
EF and Math in Early Childhood
Early childhood is a time when children are first acquiring basic foundations of numerical skills on which later math learning will be built. Children who enter kindergarten with better basic math knowledge are more likely to succeed across a variety of academic domains including math, reading and science (Claessens & Engel, 2013; Duncan et al., 2007; Romano et al., 2010). Similarly, EF skills are undergoing rapid development in early childhood. Children who enter kindergarten better able to control or regulate their cognition and behavior tend to adapt more easily to the classroom environment (Clark et al., 2002; Nesbitt et al., 2019; Neuenschwander et al., 2012), and tend to display stronger academic achievement (e.g., Best et al., 2011; Cameron et al., 2019; Shaul & Schwartz, 2014). Furthermore, low-income children tend to underperform in terms of both EF and math skills (Hackman & Farah, 2009; Hair et al., 2015; Jordan & Levine, 2009). Research that clarifies the strength of the EF↔Math relation, and how it can be leveraged to support the simultaneous development of both skills, can inform intervention efforts that aim to close the school readiness gap.
While there are multiple existing meta-analyses on the relation between EF and math (Friso-van den Bos et al., 2013; Peng et al., 2016; Yeniad et al., 2013), none have yet focused specifically on early childhood. Some studies included preschool and kindergarten-aged children in their analyses (Friso-van den Bos et al., 2013; Jacob & Parkinson, 2015; Yeniad et al., 2013), but they did not isolate the EF↔Math relation specifically among children in this younger age group. Given the practical and theoretical import of EF and math in early childhood (preschool and kindergarten years), a meta-analysis providing a detailed examination of how these constructs relate to one another specifically in this age group is warranted. Thus, the current meta-analysis aims to fill this gap. First, we test whether the strength of the EF↔Math relation in early childhood depends on how EF and math are measured. Second, we test whether the early-childhood EF↔Math relation is stronger or weaker among children of lower vs higher socio-economic status (SES). Third, we examine the strength and direction of longitudinal relations between EF and math in early childhood.
1) Measurement
Both EF and math are, in truth, shorthand appellations for complex sets of cognitive processes. At a given point in development, not all researchers agree on what the most important subskills are in each domain. Even for a given subskill, the specific task by which it is measured may vary across research groups. It is thus important to systematically examine how variability in measurement may affect the EF↔Math relation. Although a complete treatment of all possible measurement variations is beyond the scope of a single paper, below we motivate the measurement categories considered here (see also Table 1, Section 1: ‘Measurement’).
Table 1
Number of Studies and Effect Sizes for Each of the Coding Categories Described in the Main Text
| Category | Studies | Effect-Sizes |
|---|---|---|
| Grand Total | 95 | 1022 |
| 1) Measurement | ||
| EF: Subcomponents vs Global Measures | ||
| Inhibitory Control | 39 | 235 |
| Working Memory | 50 | 389 |
| Cognitive Flexibility | 25 | 132 |
| Composite | 18 | 120 |
| Integrative | 36 | 146 |
| EF: Lab-Based vs Naturalistic | ||
| Lab-Based | 74 | 826 |
| Naturalistic | 40 | 159 |
| EF: Direct vs Observational | ||
| Direct | 89 | 937 |
| Observational | 16 | 65 |
| EF: Numerical vs Non-numerical Stimuli | ||
| Numerical | 32 | 170 |
| Non-Numerical | 84 | 852 |
| Math: Math Domain | ||
| Achievement Test | 67 | 484 |
| Arithmetic | 18 | 161 |
| Basic Numeracy | 32 | 377 |
| 2) Socioeconomic Status (SES) | ||
| Low | 16 | 177 |
| Mid-High | 8 | 58 |
| Mixed-SES | 73 | 787 |
| 3) Longitudinal Relations | ||
| Cross-Sectional vs Longitudinal | ||
| Cross-Sectional | 70 | 486 |
| Longitudinal | 69 | 536 |
| Unadjusted Longitudinal | ||
| EF→Math | 66 | 386 |
| Math→EF | 32 | 150 |
| Adjusted Longitudinal | ||
| EF→ΔMath | 23 | 89 |
| Math→ΔEF | 23 | 87 |
Measuring EF Skills
Broadly, EF refers to a set of cognitive processes that, taken together, allow for the volitional control of one’s thoughts, emotions and actions in order to identify and obtain goals (Blair, 2016; Zelazo, 2015; Zelazo et al., 2016). This set of control processes is often categorized into subprocesses, such as inhibitory control, working memory, and cognitive flexibility (Baddeley, 1996; Miyake et al., 2000; Nguyen & Duncan, 2019). In research examining early childhood, some studies measure one or more of these subprocesses separately (Espy et al., 2004; Harvey & Miller, 2017; Nguyen & Duncan, 2019), while other studies combine multiple measures into a composite EF measure (Fuhs et al., 2014; McKinnon & Blair, 2019; Welsh et al., 2010). Notably, it remains somewhat unclear when exactly in early childhood these subprocesses become functionally differentiated, and whether existing measures are capable of dissociating them (Best & Miller, 2010; Garon et al., 2008; Nguyen et al., 2019). In a recent meta-analysis of studies with preschoolers, Emslander and Scherer (2022) found little evidence that the math intelligence–EF association differs depending on whether EF is separated into subcomponents or treated as a latent factor (see also Table 1 of Emslander & Scherer for an excellent summary of prior meta-analyses examining the association between math and EF across childhood). At the level of the individual study, math and/or EF are often measured using a composite or overall achievement measure. Hence, it seems relevant to examine whether associations between EF and math depend on whether a study treats EF as a composite measure, or attempts to isolate specific subprocesses of EF.
Researchers also differ in whether they use lab-based EF tasks (e.g., backward-span tasks), versus what are sometimes referred to as ‘naturalistic’ EF tasks, such as standing still [“like a statue”] in the face of distractions (Morrison & Grammer, 2016). Both types of tasks, lab-based and naturalistic, are used frequently in the early childhood literature. Whether they relate differently to math skills remains unclear.
Each of the EF measures described above relies on an objective measure of performance in specific, predefined tasks (also known as ‘direct assessment’). Other researchers prefer to use observational measures that assess demonstration of EF skills in a potentially broader range of day-to-day activities. Such observational measures typically rely on parent or teacher reports (Clark et al., 2010; Fuchs et al., 2010; Swanson et al., 2014), which some have argued may be more valid assessments of early childhood EF abilities (Isquith et al., 2004; Morrison & Grammer, 2016). Here we test whether the association between EF and math depends on whether EF is measured via direct or observational assessment.
A final consideration with respect to measuring EF is that many early childhood EF measures involve numerical stimuli. For instance, one popular task involves encoding a short string of digits (e.g., “7, 3, 2, 8”) and then repeating the string back after a short delay, either in the same or in the reverse order (e.g., Hilbert et al., 2015). One possibility is that the use of numerical stimuli in one’s EF measure may inflate the association between EF and math skills – a possibility we test here.
Measuring Math Skills
The majority of research in the early childhood years utilizes standardized math assessments that are in effect composite scores across a range of (age-appropriate) disparate math skills (e.g., Clark et al., 2010; Espy et al., 2004; McKinnon & Blair, 2019). Other research has focused specifically on arithmetic calculation and fluency (Rasmussen & Bisanz, 2005). Still others have isolated basic numeracy skills, such as counting or comparing which of two numbers is greater (e.g., Harvey & Miller, 2017; LeFevre et al., 2010; Purpura et al., 2017). These different types of math assessments vary in complexity and may thus invoke EF skills to differing degrees (Friso-van den Bos et al., 2013; Fuchs et al., 2010; Jõgi & Kikas, 2016). Hence, here we also test whether the EF↔Math relation depends on the type of math assessments used.
2) Socioeconomic Status (SES)
Prior research suggests that low-income preschoolers tend to fall behind their higher-income peers in terms of both EF and math skills (Hair et al., 2015; Noble et al., 2007; Reardon, 2011). One possibility is that the lags seen among low SES children in EF and math may be linked, in which case one might expect the EF↔Math relation in early childhood to depend on SES. If the relation does depend on SES, this would (A) underscore the value of examining it in economically diverse samples, and (B) help adjust expectations about the potential for feedback between EF and math skills in early childhood. We thus test whether the EF↔Math relation is moderated by SES. For a summary of SES categories considered here, see Table 1, Section 2 (‘SES’).
3) Longitudinal Relations Between EF and Math
There is currently a relative lack of evidence regarding the direction of influence between EF and early math skills. To date, much of the research on EF↔Math relations has tended to assume – tacitly or explicitly – a unidirectional relation whereby early EF primarily predicts gains in math outcomes (EF→Math: e.g., Clark et al., 2010; Fitzpatrick & Pagani, 2012; Mulder et al., 2017). As such, a common occurrence in the literature is to collect an assessment of EF at an earlier time point (T1) and an assessment of math skills at a later time point (T2). An association between EF and math is then interpreted as evidence that EF influences math. However, only a few studies have collected both measures at both timepoints. Without doing so, a given study could not control for initial skill levels, and thus could not accurately estimate the relation between early EF and changes in math skills. It is thus likely that studies computing an EF(T1)↔Math(T2) association without controlling for Math(T1) are reporting inflated associations (Bailey et al., 2018; Jacob & Parkinson, 2015). Also left largely unexamined is the possibility of the reverse: that early math predicts changes in EF skills.
Only somewhat recently have researchers begun to implement repeated measures designs (sometimes referred to as cross-lagged longitudinal designs) to better estimate bidirectionality between EF and math (Chu et al., 2016; McKinnon & Blair, 2019; Welsh et al., 2010). Overall, this small, but growing, body of longitudinal work has converged to suggest that the longitudinal relation between EF and math in early childhood may well be bidirectional, with early math abilities predicting change in EF to a similar or even greater extent than initial EF abilities predict change in math (Fuhs et al., 2014; McKinnon & Blair, 2019; Welsh et al., 2010). To our knowledge, two papers, Jacob and Parkinson (2015) and Nguyen et al. (2019), have examined the prevalence and magnitude of longitudinal EF-Math relations across multiple studies, specifically in early childhood. When investigating the longitudinal relations between EF and math, Jacob and Parkinson relied more on a narrative approach, and did not use a formal meta-analytic framework. Nguyen et al. (2019), given their stated theoretical aims, were concerned primarily with decomposing (or not) EF subcomponents in their relation to math. Further, given the intervening years between 2019 and the present day, we are able to include roughly twice as many relevant studies and over 170 individual longitudinal effect-sizes in our meta-analytic dataset. That is not to say these earlier papers are not valuable; rather, we believe the current work provides an important update and extension of this prior work. In sum, we tested for longitudinal relations between EF and math in early childhood after controlling for initial performance, and for evidence of an asymmetry between potential directional relations. For a summary of longitudinal comparisons considered here, see Table 1, Section 3 (‘Longitudinal Relations’).
Method
Literature Search
We identified relevant articles in two ways. First, we conducted an online search of common databases (i.e., Google Scholar and PsycInfo) using the following search terms: math; math achievement; math performance; numeracy; calculation; arithmetic; problem-solving; executive function/ing; working memory; updating; response inhibition; inhibitory control; attention shifting; set shifting; cognitive flexibility; domain-general cognition; self-regulation; preschool; early childhood; early childhood education; kindergarten. Second, we scanned the reference lists of relevant reviews and previously identified studies. The search included papers published prior to August 2023.
We assessed the relevance of each study from the initial search by scanning titles and abstracts for keywords. Studies were deemed relevant if they included any of the key terms listed above. Relevant studies were then scanned more thoroughly and included in the meta-analysis if they met the following inclusion criteria: (a) reported at least one correlation (r-value) between EF and math; (b) had a sample consisting of preschool-age children (2-5 years) and/or kindergarten children (5-6 years)¹; and (c) did not focus exclusively on children with learning-related disabilities (e.g., dyscalculia). Ninety-five studies contributing a total of 1022 effect sizes met these inclusion criteria.
Coding Procedures
For a complete summary of the coded categories, as well as the relevant comparisons motivated in the Introduction above, see Table 1. All categories were double-coded; there were no discrepancies between coders (r = 1).
1) Measurement
The first aim of the current study was to identify whether the EF↔Math relation in early childhood depends on measurement factors – in particular, how EF and Math are measured.
EF Measurement: EF Subcomponents vs Global Measures
Here we coded for the task that was administered. For specific subcomponents, we examined inhibitory control, working memory, and cognitive flexibility, as these tend to be the most commonly used subcomponents in the literature on this age group (Baddeley, 1996; Miyake et al., 2000; Nguyen & Duncan, 2019). A task was coded as Inhibitory Control if it involved sustaining attention, inhibiting a predetermined response, maintaining regulated behavior (e.g., standing still or waiting patiently), or ignoring distractions (23.0% of effect sizes included here – see Table 1). A task was coded as Working Memory if it involved holding and/or manipulating multiple pieces of information in mind at once (38.1% of effect-sizes). A task was coded as Cognitive Flexibility if it involved shifting attention from one aspect of a problem to another, shifting between different strategies, or making comparisons (12.9% of effect-sizes). We also examined two forms of global EF measures. One form is a simple composite, which combines multiple measures of EF (typically via simple average or latent factor analysis); 11.7% of effect-sizes were coded as Composite. Another form of global EF assessment arises when a given task is expressly designed to rely heavily on more than one EF subcomponent; we refer to these as Integrative measures (14.3% of effect-sizes). An example is the Head, Shoulders, Knees and Toes (HSKT) task, which prior research suggests should be thought of as a broad measure of behavior regulation that requires both Working Memory and Inhibitory Control (Ponitz et al., 2008, 2009).
EF Measurement: Lab-Based vs Naturalistic
We followed Jacob and Parkinson (2015) in coding for whether the EF measure was lab-based or naturalistic. Lab-Based tasks measure the cognitive processes of EF typically through reaction time or accuracy (Morrison & Grammer, 2016). As can be seen in Table 1, 80.8% of effect-sizes were coded as Lab-Based. Naturalistic tasks (15.6% of effect-sizes) are contextualized assessments of ‘real-world’ or classroom-based self-regulatory behaviors related to EF (Morrison & Grammer, 2016).
EF Measurement: Direct vs Observational
Direct assessments estimate a child’s EF skills via objective performance on an EF task (91.7% of effect-sizes, see Table 1). Observational measures assess perceived EF abilities via an observer’s (e.g., parent or teacher’s) rating of a child’s self-regulatory behavior (6.4% of effect-sizes).
EF Measurement: Numerical vs Non-Numerical EF Tasks
We coded for whether EF assessments utilized numerical stimuli (coded as Numerical, 16.6% of effect-sizes, Table 1) or used exclusively non-numerical stimuli (coded as Non-Numerical, 83.4% of effect-sizes).
Math Measurement
In terms of measuring math skills, we coded for whether researchers employed standardized math achievement assessments; these typically comprise a battery of age-normed tasks that assess general math skills across a range of capacities (coded as Achievement, 47.4% of effect-sizes, see Table 1). Other researchers measure math skills in terms of more focused assessments of arithmetic calculation or fluency (coded as Arithmetic, 15.8% of effect-sizes). Still other researchers exclusively assessed basic numerical abilities, such as comparing which of two numbers is greater, counting, or number-line estimation (coded as Basic Numeracy, 36.9% of effect-sizes).
2) Socioeconomic Status (SES)
The second aim of the current study was to test whether the strength of the EF↔Math relation in early childhood depends on SES. Here we coded for SES based on how the sample was characterized in a given study. For instance, some studies expressly targeted participants with low income-to-need ratio or those who attended ECE programs targeted toward low-income children such as Head Start (headstart.gov). We were particularly interested in the EF↔Math relation among low SES children, so effect-sizes were coded as reflecting Low (17.3% of effect-sizes, see Table 1) or Mid-High (5.7% of effect-sizes) SES samples. In addition, the majority of studies did not isolate specific SES groups and instead included a diverse sample in terms of SES. We coded these as Mixed-SES (77.0% of effect-sizes).
3) Longitudinal Relations
The third aim of the current study was to examine longitudinal EF↔Math relations in early childhood.
Cross-Sectional vs Longitudinal Assessment
Here, Cross-Sectional means the EF↔Math relation was computed between measures of EF and Math that were collected at the same time-point (47.6% of effect-sizes). Longitudinal means the EF↔Math relation was computed between measures of EF and Math that were collected at different time-points (52.4% of effect-sizes).
Unadjusted vs Adjusted Longitudinal Relations
There is often an implied direction when researchers report longitudinal associations. As noted in the Introduction, simply relying on when data were collected to infer directionality is highly problematic, as it likely overestimates the magnitude of directional effects. More useful are estimates of the extent to which a given variable predicts change in another variable. Such estimates require that one adjust the observed relation for pre-existing levels of the outcome variable. Practically, this means one needs to measure both EF and Math at both time-points. Unfortunately, of the 536 longitudinal effect-sizes, only 176 (32.8%) came from studies that measured EF and Math at both time-points. This means the majority of longitudinal EF↔Math effect-sizes did not control for the outcome variable as measured at the earlier time-point, raising the possibility that these associations represent an inflated estimate of the extent to which EF predicts change in Math, and vice-versa. We refer to longitudinal associations as Unadjusted when they do not control for pre-existing levels of the outcome variable, and we denote them here as EF→Math and Math→EF.² Longitudinal associations that do include such controls we refer to as Adjusted longitudinal associations, and we denote them here as EF→ΔMath and Math→ΔEF, respectively.
Meta-Analytic Strategy
We used Pearson’s r-values to index EF↔Math relations. Because r-values are standardized (expressed as relative changes in units of standard-deviation), the standard-error of a given r-value ($se_r$) can be calculated, given the sample-size used to estimate that r-value ($N_r$): .
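As an illustration only, the sketch below computes a standard error for a correlation from its sample size using one widely used large-sample approximation, sqrt((1 − r²) / (N − 2)). That this particular expression matches the one used in the present analysis is an assumption; the function name `se_of_r` is hypothetical.

```python
import math


def se_of_r(r: float, n: int) -> float:
    """Approximate standard error of a Pearson correlation.

    ASSUMPTION: uses the common approximation sqrt((1 - r^2) / (n - 2)).
    The paper's own expression may differ (e.g., a Fisher-z based formula).
    """
    if n <= 2:
        raise ValueError("sample size must exceed 2")
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    return math.sqrt((1.0 - r * r) / (n - 2))


# Hypothetical example: r = .35 estimated from 100 children.
example_se = se_of_r(0.35, 100)
```

Under this approximation the standard error shrinks as the sample grows and as |r| approaches 1, which is why larger studies receive more weight in the pooled estimate.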
Meta-analytic values were estimated in SPSS [IBM: version 29.0.1.1 (244)]. Estimated average r-values and corresponding confidence-intervals were generated using a random-effects (RFX) model (effect-sizes nested within studies), restricted maximum likelihood (REML) estimator, and Knapp-Hartung adjusted standard-errors (Langan et al., 2019). We selected RFX models because heterogeneity tests (Q-statistics) were highly significant. Weights were computed using fully inverted variance, such that they included both within- and between-study variance (Pustejovsky & Tipton, 2022). Specific estimates (e.g., Low vs Mid-High vs Mixed SES) were calculated in a similar manner via stratified subgroup analyses (also via RFX, as specified above).
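The logic of random-effects pooling with a Knapp-Hartung standard error can be sketched in a few lines. This is a simplified illustration, not the SPSS procedure used here: it assumes one effect size per study (no nesting) and substitutes the closed-form DerSimonian-Laird estimator of between-study variance for REML, which requires iterative fitting. All names are hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """Pool effect sizes under a random-effects model.

    ASSUMPTIONS (not the paper's exact method): DerSimonian-Laird tau^2
    instead of REML, and independent effects (no nesting within studies).
    Returns (pooled mean, tau^2, Knapp-Hartung SE, Q statistic, df).
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two effects with matching variances")

    # Fixed-effect weights and the heterogeneity statistic Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(wi * wi for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights include within- and between-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    mean = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star

    # Knapp-Hartung: rescale the variance by the weighted residual mean square.
    kh_var = sum(wi * (yi - mean) ** 2 for wi, yi in zip(w_star, effects)) / (
        (k - 1) * sum_w_star
    )
    return mean, tau2, math.sqrt(kh_var), q, k - 1
```

With the Knapp-Hartung adjustment, a confidence interval is formed as the pooled mean plus or minus the returned standard error times a t-quantile with k − 1 degrees of freedom, rather than a normal quantile.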
Direct comparisons between pairs of estimated average r-values (e.g., comparing the average EF↔Math relation among Low SES samples vs the same among Mid-High SES samples) were computed via random-effects meta-regression, again using Knapp-Hartung adjusted REML estimates. Note that the number and breadth of available studies did not allow for a complete assessment of all combinations of potential sub-moderators (e.g., comparing Low vs Mid-High SES only for studies using a direct, lab-based, non-numerical, composite EF measure and a basic numeracy math measure). Hence, we isolated a single comparison category (e.g., Low vs Mid-High SES, or Lab-Based vs Naturalistic EF measures) at a time. In doing so, we sought to lean on the methodological strength of a meta-analytic approach, which seeks to identify broad trends across a range of studies.
In this regard, comparison categories were yoked to the theoretical questions outlined in the Introduction (see also Table 1 for a summary). We believe this approach to be more theoretically informative relative to an atheoretical consideration of all possible combinations. In addition, this approach maintains a more reasonable sample-size for each comparison, and it reduces the total number of comparisons being computed – both factors are important for increasing the generalizability and replicability of the results being presented here. Finally, despite our arguably more circumspect approach, we nevertheless tested a fair number of direct comparisons; hence, we report both the traditional threshold of p < .05, as well as Dunn-Šidák corrected thresholds (Šidák, 1967).
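The Dunn-Šidák corrected threshold follows from the standard formula 1 − (1 − α)^(1/m), where m is the number of contrasts in a family. A minimal sketch (the function name is hypothetical):

```python
def sidak_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-comparison alpha that keeps the family-wise error rate at `alpha`
    across `n_comparisons` independent tests (Dunn-Sidak correction)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if n_comparisons < 1:
        raise ValueError("need at least one comparison")
    return 1.0 - (1.0 - alpha) ** (1.0 / n_comparisons)


# Families reported in the Results: 16 measurement contrasts, 3 SES contrasts.
measurement_threshold = sidak_threshold(0.05, 16)
ses_threshold = sidak_threshold(0.05, 3)
```

Rounded to four decimals, these reproduce the corrected thresholds reported in the Results (.0032 for the 16 measurement contrasts and .0170 for the 3 SES contrasts).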
Computing Adjusted Longitudinal Effects
To compute an adjusted longitudinal effect, one must control for the outcome variable (EF or Math at Time-2) measured at an earlier time-point (EF or Math at Time-1, respectively). Doing so allows one to compute the equivalent of sample-adjusted change scores for the outcome. For instance, if one removes the linear association between a math measure collected at the end of preschool and that same measure collected again at the end of kindergarten, then the resulting residuals are by definition what changed across measurement time-points³. Further, so long as the predictor variable was measured at the same time as the first instance of the outcome variable (i.e., EF was measured at the end of preschool in the example above), adjusted effects partially⁴ account for pre-existing relations between the predictor and outcome (between EF and Math). In this way, adjusted effects allow one to estimate the extent to which one variable predicts change in another variable. However, in the current context, to do so we are limited to studies that collected EF and Math measures at both time-points (i.e., a repeated-measures design), and that reported the full matrix of correlations (between EF(Time1), EF(Time2), Math(Time1), and Math(Time2)).
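For the simplest case of three variables (EF at Time-1, Math at Time-1, and Math at Time-2), controlling for the Time-1 outcome corresponds to the standard first-order partial correlation. The symbols below are chosen here for illustration, with $E_1$ denoting EF at Time-1 and $M_1$, $M_2$ denoting Math at Time-1 and Time-2:

$$
r_{E_1 M_2 \cdot M_1} \;=\; \frac{r_{E_1 M_2} \;-\; r_{E_1 M_1}\, r_{M_1 M_2}}{\sqrt{\left(1 - r_{E_1 M_1}^{2}\right)\left(1 - r_{M_1 M_2}^{2}\right)}}
$$

As a purely hypothetical example, with $r_{E_1 M_2} = .50$, $r_{E_1 M_1} = .40$, and $r_{M_1 M_2} = .60$, the numerator is $.50 - .24 = .26$ and the denominator is $\sqrt{.84 \times .64} \approx .733$, giving an adjusted effect of about $.355$, noticeably smaller than the unadjusted $.50$.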
For a given study, EF→ΔMath is the partial correlation between EF(Time1) and Math(Time2), controlling for Math(Time1). Most studies do not report these partial correlations. Serendipitously however, because r-values are standardized covariance estimates, the requisite partial correlations can instead be calculated from the zero-order correlation matrix (Opgen-Rhein & Strimmer, 2007). This calculation involves a two-step, ‘pseudo-inverse’ procedure. In the first step, one inverts the input zero-order correlation matrix, which by definition orthogonalizes the resulting matrix elements with respect to the original matrix; in this way, the new elements represent covariances that are independent of one another. However, the new elements are no longer in standardized units. Hence, the second step of the procedure standardizes each element in the usual manner: $r_{xy} = \mathrm{Cov}(x,y) / \sqrt{\mathrm{Var}(x)\,\mathrm{Var}(y)}$, where here Cov(x,y) is a given non-diagonal element from the first step, and Var(x) and Var(y) are the diagonal elements corresponding to the constituent variables x and y. Because the non-diagonal elements of an inverted correlation matrix carry the opposite sign to the corresponding partial correlations, the sign of each standardized non-diagonal element is then reversed. The result is a new matrix in which non-diagonal elements represent partial correlations between two variables, controlling for the influence of all other variables in the matrix. As an illustration, consider the zero-order correlation matrix with three variables: EF1, Math1, Math2. Entering this matrix into the above procedure will result in a partial correlation matrix. More to the point, the element of the matrix at the intersection between EF1 and Math2 will have been ‘adjusted’ such that it now represents the partial correlation between EF1 and Math2, controlling for Math1. In plainer terms, it is the association between EF and change in Math, or EF→ΔMath. We thus used the above procedure to compute adjusted longitudinal effects (EF→ΔMath and Math→ΔEF) for each unadjusted longitudinal effect (EF→Math and Math→EF) reported in a study with a repeated-measures design.
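The two-step procedure described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the correlation values are hypothetical, the helper names are invented for this example, and the sign of each standardized off-diagonal element is reversed, as is required when converting elements of an inverse matrix to partial correlations.

```python
import math


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the input with the identity matrix.
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(corr):
    """Step 1: invert the zero-order matrix. Step 2: standardize each
    off-diagonal element by its diagonal elements and flip its sign."""
    inv = invert(corr)
    n = len(corr)
    return [
        [
            1.0 if i == j else -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
            for j in range(n)
        ]
        for i in range(n)
    ]


# HYPOTHETICAL zero-order correlations (not taken from any study).
# Variable order: EF1, Math1, Math2.
corr = [
    [1.0, 0.4, 0.5],
    [0.4, 1.0, 0.6],
    [0.5, 0.6, 1.0],
]
ef_to_delta_math = partial_correlations(corr)[0][2]  # EF1 with Math2, given Math1

# Cross-check against the textbook first-order partial correlation formula.
closed_form = (0.5 - 0.4 * 0.6) / math.sqrt((1 - 0.4**2) * (1 - 0.6**2))
```

Here `ef_to_delta_math` and `closed_form` agree to within floating-point error, which confirms that the matrix-inversion route yields the same adjusted effect as the closed-form partial correlation in the three-variable case.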
Comparing Unadjusted and Adjusted Effects
To test whether unadjusted effects inflate apparent longitudinal associations, we used meta-regression to contrast adjusted and unadjusted effects for a given direction (e.g., EF→Math vs EF→ΔMath). To equate adjusted and unadjusted effects as much as possible, when directly comparing the two, we used unadjusted values only from studies for which adjusted values were also available. In this way we are essentially testing whether, on average, adjusting a given longitudinal effect for the outcome measure at Time-Point 1 significantly reduces that effect.
Directionality
For tests of directionality, we limited analysis to adjusted effects (EF→ΔMath and Math→ΔEF), as only adjusted effects capture change in the outcome. First, to test for the presence of significant directional effects, we estimated the average effect and confidence-interval for each direction via the meta-analysis approach noted above. Second, to test for asymmetrical directional effects, we used meta-regression to directly contrast EF→ΔMath and Math→ΔEF.
Publication Bias
A Trim-and-Fill evaluation of publication bias indicated minimal evidence of publication bias (Shi & Lin, 2019). We used Egger’s regression test to determine imputation side (right). Imputation required only 14 additional effect-sizes (relative to 1022 observed effect-sizes). Trim-and-Fill results indicated a very slight underestimate of the overall effect (r = .357 instead of .350). Thus, we make no further adjustments for apparent publication bias.
Results
The overall average effect, based on 1022 effects across 95 studies, was r = .350, 95% CI [.338, .361]. This indicates a small but highly significant positive association between EF and Math in early childhood. This overall average effect is represented with a black dashed line in Figures 1, 2, and 3.
1) Measurement
Average effects for each measurement subcategory are shown in Figure 1. Exact values for reference purposes can be found in the Appendix (Table A1). Contrast results are given in Table 2. We conducted 16 measurement-related contrasts (Table 2), so the corrected threshold for this section was α = .0032.
Table 2
Contrast Effects Comparing Average EF↔Math Relations When Using Different Methods for Measuring EF and Math
| Contrast | Δ | 95% CI LL | 95% CI UL | t | df | p | d |
|---|---|---|---|---|---|---|---|
| EF: Subcomponents vs Global Measures | |||||||
| Composite > Inhibitory Control | .238 | .198 | .279 | 11.53 | 353 | 2.6E-26 | 1.23 |
| Composite > Working Memory | .162 | .129 | .195 | 9.55 | 507 | 5.3E-20 | 0.85 |
| Composite > Cognitive Flexibility | .190 | .157 | .222 | 11.49 | 250 | 8.4E-25 | 1.45 |
| Integrative > Inhibitory Control | .063 | .024 | .103 | 3.13 | 379 | .002 | 0.32 |
| Integrative > Working Memory | -.013 | -.046 | .020 | -0.77 | 533 | .444 | -0.07 |
| Integrative > Cognitive Flexibility | .011 | -.022 | .045 | 0.67 | 276 | .505 | 0.08 |
| Composite > Integrative | .176 | .140 | .213 | 9.43 | 264 | 2.2E-18 | 1.16 |
| Inhibitory Control > Working Memory | -.077 | -.108 | -.046 | -4.86 | 622 | 1.5E-06 | -0.39 |
| Inhibitory Control > Cognitive Flexibility | -.053 | -.091 | -.014 | -2.68 | 365 | .008 | -0.28 |
| Working Memory > Cognitive Flexibility | .026 | -.006 | .058 | 1.57 | 519 | .116 | 0.14 |
| EF: Lab-Based vs Naturalistic Measures | |||||||
| Lab-Based > Naturalistic | .018 | -.013 | .049 | 1.16 | 983 | .244 | 0.07 |
| EF: Direct vs Observational Measures | |||||||
| Direct > Observational | .061 | .015 | .106 | 2.61 | 976 | .009 | 0.17 |
| EF: Numerical vs Non-numerical Stimuli | |||||||
| Numerical > Non-Numerical | .129 | .100 | .159 | 8.72 | 1020 | 1.1E-17 | 0.55 |
| Math: Math Domain | |||||||
| Achievement > Arithmetic | .148 | .117 | .180 | 9.22 | 643 | 4.1E-19 | 0.73 |
| Achievement > Basic Numeracy | .069 | .045 | .094 | 5.49 | 859 | 6.5E-08 | 0.37 |
| Arithmetic > Basic Numeracy | -.079 | -.113 | -.045 | -4.55 | 536 | 6.5E-06 | -0.39 |
Note. Mean differences (Δ) are expressed as the difference between average correlation estimates for the two measurement types.
Figure 1
Average Effects for Each Subcategory in the Measurement Section
Note. The black dashed line is the overall average effect (r = .350). Error-bars are 95% confidence-intervals. Note that ‘Achievement’ refers to math achievement.
Measuring EF Skills
We first examined how the average observed EF↔Math relation in early childhood varies as a function of how EF is measured – as individual EF subprocesses, composite EF measures, or integrative EF measures. Average effects are shown in orange in Figure 1. In terms of individual EF subprocesses, the average effects for Working Memory (r = .352) and Cognitive Flexibility (r = .327) were similar to the overall average effect, and did not significantly differ from one another (p = .116; see Table 2 for full contrast details). Inhibitory Control, by contrast, showed an average effect (r = .276) significantly lower than either Working Memory (p < .001) or Cognitive Flexibility (p = .008, though note this latter effect does not pass the more stringent corrected threshold). Examining measures that combine multiple subprocesses, Composite measures yielded a significantly higher average association (r = .515) relative to Integrative measures (p < .001) or any of the individual subprocesses, even after correcting for multiple comparisons (all ps < .001). Conversely, Integrative EF measures showed an average effect (r = .338) similar to the overall mean effect. As such, Integrative measures did not differ significantly from Working Memory or Cognitive Flexibility measures (ps ≥ .444), but were significantly higher than Inhibitory Control measures (p = .002).
We next compared EF↔Math relations for Lab-Based vs Naturalistic EF measures (green bars in Figure 1). Both types of measures showed similar effects (Lab-Based: r = .347; Naturalistic: r = .328) to the overall average effect, and they did not significantly differ from one another (p = .244).
For EF↔Math relations using Direct vs Observational EF measures (purple bars in Figure 1), we found that Direct measures showed an effect (r = .347) similar to the overall average, though this is perhaps unsurprising as Direct measures comprised over 90% of the total effects included in the dataset. That said, the 65 Observational effects included in the dataset showed an average association with Math (r = .285) that was notably lower than the overall average (Figure 1), and significantly lower than that seen for Direct measures (p = .009). Caution is warranted, however, as this latter contrast effect did not pass the more conservative corrected threshold.
We next compared EF↔Math relations for EF measures that utilize Numerical vs Non-numerical stimuli (red bars in Figure 1). Numerical EF measures yielded an average association with Math (r = .459) substantially higher than the overall average effect, whereas Non-numerical measures yielded an average effect (r = .327) more in keeping with the overall average. Further, the difference between Numerical and Non-numerical effects was highly significant (p < .001).
Measuring Math Skills
When considering the impact of different types of math measures on the EF↔Math relation in early childhood (blue bars in Figure 1), we observed the highest average relation for math Achievement measures (r = .396), which was significantly higher (ps < .001; see Table 2 for complete contrast statistics) than those observed for Arithmetic (r = .246) or Basic Numeracy (r = .327). The lowest relation was seen for Arithmetic, which was significantly lower than Basic Numeracy as well (p < .001).
2) SES
Here we examined the effect of SES sample composition on EF↔Math relations. Average effects for each SES subcategory are shown in Figure 2. Exact values for reference purposes can be found in the Appendix (Table A2). Contrast results are given in Table 3. We conducted 3 SES-related contrasts (Table 3), so the corrected threshold for this section was α < .0170.
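The corrected thresholds quoted for this section and the next (.0170 for 3 contrasts; .0127 for 4 contrasts) are numerically consistent with a Šidák correction at a family-wise α of .05. The sketch below assumes that correction purely for illustration; the function name `sidak_threshold` is hypothetical, and the correction actually applied in the analysis may differ.

```python
def sidak_threshold(n_contrasts: int, family_alpha: float = 0.05) -> float:
    """Per-contrast alpha under a Sidak correction (an assumption, not confirmed by the text)."""
    if n_contrasts < 1:
        raise ValueError("n_contrasts must be at least 1")
    return 1.0 - (1.0 - family_alpha) ** (1.0 / n_contrasts)


# Three SES contrasts and four longitudinal contrasts, as counted in the text.
for m in (3, 4):
    print(m, round(sidak_threshold(m), 4))
```

A Bonferroni correction (α divided by the number of contrasts) would give slightly smaller thresholds, which is why the Šidák form is assumed here.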
Table 3
Contrast Effects Comparing Average EF↔Math Relations Across Samples of Differing SES Composition
| Contrast | Δ | 95% CI LL | 95% CI UL | t | df | p | d |
|---|---|---|---|---|---|---|---|
| Socioeconomic Status (SES) | |||||||
| Low SES > Mid-High SES | .115 | .048 | .183 | 3.37 | 233 | 8.8E-04 | 0.44 |
| Low SES > Mixed SES | .099 | .068 | .130 | 6.32 | 962 | 4.0E-10 | 0.41 |
| Mid-High SES > Mixed SES | -.027 | -.076 | .023 | -1.05 | 843 | .292 | -0.07 |
Note. Mean differences (Δ) are expressed as the difference between average correlation estimates for the two sample types.
Figure 2
Average Effects for Each Subcategory in the SES Section
Note. The black dashed line is the overall average effect (r = .350). Error-bars are 95% confidence-intervals.
The highest average effect was observed for Low SES samples (r = .431), which was significantly higher (ps < .001) than those observed for either Mid-High SES (r = .317) or Mixed SES (r = .336). There was no significant difference between Mid-High and Mixed SES (p = .292).
3) Longitudinal Effects
Average effects for each type of longitudinal effect are shown in Figure 3. Exact values for reference purposes can be found in the Appendix (Table A3). Contrast results are given in Table 4. We conducted 4 contrasts that each compared two effects (the first four contrasts in Table 4; the final two rows test single effects against zero), so the corrected threshold for this section was α < .0127.
Table 4
Contrast Effects Comparing Longitudinal Effects
| Contrast | Δ | 95% CI LL | 95% CI UL | t | df | p | d |
|---|---|---|---|---|---|---|---|
| Cross-Sectional vs Longitudinal Effects | |||||||
| Cross-Sectional > Longitudinal | 0.032 | 0.009 | 0.055 | 2.70 | 1020 | .007 | 0.17 |
| Unadjusted vs Adjusted Longitudinal Effects | |||||||
| EF→M > EF→ΔM | 0.230 | 0.182 | 0.279 | 9.34 | 176 | 4.3E-17 | 1.41 |
| M→EF > M→ΔEF | 0.180 | 0.139 | 0.221 | 8.65 | 172 | 3.5E-15 | 1.32 |
| Directional Effects | |||||||
| EF→ΔM > M→ΔEF | -0.050 | -0.089 | -0.012 | -2.59 | 174 | .010 | -0.39 |
| EF→ΔM > 0 | 0.190 | 0.159 | 0.220 | 12.44 | 88 | 4.4E-21 | 1.33 |
| M→ΔEF > 0 | 0.240 | 0.216 | 0.264 | 20.23 | 86 | 1.9E-34 | 2.18 |
Note. Mean differences (Δ) are expressed as the difference between average correlation estimates for the two effect types being contrasted. For the final two rows, Δ is the average adjusted effect tested against zero.
Figure 3
Average Effects for Different Longitudinal Effects
Note. The black dashed line is the overall average effect (r = .350). Error-bars are 95% confidence-intervals.
Cross-Sectional vs Longitudinal Effects
Here, cross-sectional effects are those where the EF and math measure were collected at the same time-point; longitudinal effects are those where the two measures were collected at different time-points (light pink bars in Figure 3). Note that for this analysis, longitudinal effects were averaged across temporal direction and considered all unadjusted effects (EF→Math and Math→EF), regardless of whether adjusted versions of those effects could be computed. Overall, Cross-Sectional effects (r = .367) were slightly higher than Longitudinal effects (r = .336). This difference was relatively small, but significant (p = .007, d = 0.17).
Unadjusted vs Adjusted Longitudinal Effects
Here, we tested whether adjusting longitudinal effects so that they reflect predictions of change in the outcome variable significantly alters one’s estimate of the relation between the two variables over time. To equate adjusted and unadjusted effects as much as possible when directly comparing the two, for this analysis we included unadjusted values only from studies for which adjusted values could also be computed. We computed unadjusted vs adjusted effects for each direction separately. When considering effects where EF predicted later math performance, we found that Unadjusted effects (EF→M: r = .419; upper medium pink bar in Figure 3) were significantly higher than Adjusted effects (EF→ΔM: r = .190; upper dark pink bar in Figure 3): p < .001, d = 1.41. Similarly, when considering effects where math predicted later EF performance, we found that Unadjusted effects (M→EF: r = .420; lower medium pink bar in Figure 3) were significantly higher than Adjusted effects (M→ΔEF: r = .240; lower dark pink bar in Figure 3): p < .001, d = 1.32.
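One common way to obtain an adjusted effect of this kind is a first-order partial correlation, which removes the Time-1 outcome from both the predictor and the later outcome. The sketch below assumes that approach for illustration only; the procedure used to compute the adjusted effects in this meta-analysis may differ, and the function name and the input correlations are hypothetical rather than taken from any included study.

```python
import math


def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation between x and y after partialling z out of both.

    x = predictor at Time 1 (e.g., EF), y = outcome at Time 2 (e.g., math),
    z = outcome at Time 1 (e.g., math).
    """
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("partial correlation is undefined when |r_xz| or |r_yz| equals 1")
    return (r_xy - r_xz * r_yz) / denom


# Hypothetical inputs: EF1-Math2 = .42, EF1-Math1 = .50, Math1-Math2 = .60.
unadjusted = 0.42
adjusted = partial_corr(unadjusted, r_xz=0.50, r_yz=0.60)
print(f"unadjusted = {unadjusted:.3f}, adjusted = {adjusted:.3f}")
```

With these made-up inputs the adjusted value falls well below the unadjusted one, which is the same qualitative pattern as the contrasts in Table 4.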
Directional Effects
We first established the presence of significant directional effects – namely, whether adjusted effects for each direction (EF→ΔMath and Math→ΔEF; dark pink bars in Figure 3) were significantly greater than zero. They were: EF→ΔMath: r = .190, 95% CI [.159, .220]; Math→ΔEF: r = .240, 95% CI [.216, .264]; ps < .001. Next, to test for directional asymmetry, we directly contrasted EF→ΔMath and Math→ΔEF adjusted effects. We found significant evidence for a stronger relation from Math to change in EF than the reverse (p = .010), though it is important to note that this equated to at best a moderate effect-size (d = -0.39).
Discussion
EF and math are both strong indicators of school readiness and appear to be tightly linked over the course of early childhood (e.g., Bull et al., 2008; Fuhs et al., 2014; Purpura et al., 2017). Numerous studies have examined the EF↔Math relation, and there are several meta-analyses establishing that this relation is statistically robust (Cortés Pascual et al., 2019; Peng et al., 2016; Yeniad et al., 2013). More recently, there has been increasing attention paid to the possibility that the EF↔Math relation might be leveraged to boost school readiness skills and ensure that all students enter kindergarten equipped with the skills necessary to succeed. However, there is as of yet no comprehensive meta-analysis to our knowledge that has examined the EF↔Math relation specifically in early childhood. The current study aimed to fill this gap. Further, we sought to probe three critical aspects of the (meta-analytic average) EF↔Math relation in early childhood: (1) whether this relation depends on how EF and Math are measured, (2) whether this relation depends on the SES of the children in the study, and (3) the direction of longitudinal relations between EF and Math.
We found that the average EF↔Math correlation in early childhood was r = .350. This result is highly consistent with the average relation of r = .34 between EF and math intelligence in preschoolers reported by Emslander and Scherer (2022). (1) In terms of measurement, we found that composite EF measures yielded significantly higher EF↔Math associations than integrative EF measures or any individual EF subprocess measured separately. Other EF measures yielded comparable associations with math, except for Inhibitory Control, which yielded a significantly lower association than did Working Memory or Cognitive Flexibility. Naturalistic and Lab-Based measures of EF yielded similar EF↔Math associations, and the association was marginally stronger for direct vs observational measures of EF. EF assessments that involve numerical stimuli led to substantially higher associations with math than EF measures that did not. In terms of math measurement, the strongest EF↔Math relations were found when using math achievement tests, followed by measures of basic numeracy skills, and then measures of arithmetic ability. (2) In terms of SES, the highest EF↔Math associations were found in samples comprising low SES children. (3) With respect to longitudinal associations, we found that associations that fail to adjust for Time-1 performance on the outcome measure (and hence effectively fail to predict change scores) are substantially inflated relative to associations that do adjust for Time-1 performance. Adjusted associations nevertheless revealed significant associations in both the EF→∆Math and Math→∆EF directions. Interestingly, the average adjusted Math→∆EF effect was significantly stronger than the average adjusted EF→∆Math effect.
1) EF and Math Measurement
How one measures various constructs can have a substantial impact on the correlations one observes between those constructs. When considering the correlation between EF and math in early childhood, we found that all operationalizations of EF and math considered here yielded a significant average correlation. This speaks to the overall robustness of this relation, even in relatively young children, for whom measurement can sometimes prove challenging. On the other hand, we also found that the strength of the association did vary significantly with some – though not all – choices with respect to how EF and math were measured.
Composite vs Integrative EF Skills
As noted in the introduction, substantial effort has been made to unpack EF into constituent subprocesses, such as working memory, inhibitory control, and cognitive flexibility (Lan et al., 2011; Miyake et al., 2000; Nguyen & Duncan, 2019). However, similar to Nguyen et al. (2019), we found the strongest EF↔Math association in studies using composite EF measures (r = .515, 95% CI [.489, .541]) – namely, those that combine multiple EF measures into a single average or latent variable. What remains unclear is whether this result is due primarily to measurement or developmental reasons, and indeed Nguyen et al. (2019) remain notably agnostic on this front. From a measurement perspective, it may simply be the case that combining multiple measures of partially overlapping constructs leads to a reduction in measurement error. In turn, this reduced error produces a more robust estimate of the strength of that construct’s (latent EF’s) association with another construct (math). From a theoretical perspective, it may be the case that the underlying neurocognitive mechanisms that subserve disparate EF subprocesses are largely undifferentiated at this stage in development (Garon et al., 2008; Wiebe et al., 2008; Wiebe et al., 2011). Our data may shed light on this distinction.
From a theoretical perspective, Integrative EF measures combine multiple aspects of EF within a single task. Hence, if the main reason that we see stronger average EF↔Math associations for Composite EF measures is that they also combine multiple aspects of EF, then the EF↔Math association seen for Integrative EF measures should be similar to that observed for Composite EF measures. It was not. The average Integrative effect was r = .338 (95% CI [.312, .365]), and the average Composite effect was r = .515 (95% CI [.489, .541]; see Figure 1; Table A1). An alternative view would be to take a measurement perspective: Integrative measures are typically drawn from a single measurement per child (at a given time point), and not the average or composite of multiple EF measures at that time point. Most measures of individual EF subprocesses also rely on a single measurement (per child per time point). Hence, in this regard, the average EF↔Math association for Integrative measures should be similar to the average EF↔Math association for individual EF subprocesses. This was primarily what we found: Integrative measures yielded average effects no different from those seen for Working Memory or Cognitive Flexibility measures (ps > .426, Table 2, Figure 1, orange). The lone exception was a higher effect for Integrative relative to Inhibitory Control measures, but this was also true when contrasting Working Memory and Cognitive Flexibility measures against Inhibitory Control measures. Furthermore, in a measurement-based interpretation of our and Nguyen et al.’s (2019) results, one would expect Composite EF measures to yield greater EF↔Math associations than Integrative EF measures (because, again, the former comprises multiple, combined measurement points, whereas the latter comprises a single measurement point). This is again precisely what we found (Composite > Integrative: p < .001, d = 1.16, Table 2).
In sum, our data are more consistent with the view that the increased EF↔Math associations seen for Composite EF measures tell us little about the underlying neurocognitive structure of EF subprocesses and how they relate to math processing at this stage of development; instead, they are more likely the result of measurement factors. Consistent with this view, in their recent meta-analysis, Emslander and Scherer (2022) also found little meaningful evidence for differences in the associations between math intelligence scores and EF subcomponents (inhibition, shifting and updating in their case) among preschoolers.
Numerical EF Measures
A popular measure of EF in early childhood is the (backward) digit span task, perhaps due to its ease of administration and simplicity of interpretation. However, digits are numerical stimuli, which may inadvertently increase the observed correlation with numerical tasks, especially at an age where there is still considerable individual variability in basic numeracy skills (e.g., Lyons et al., 2014; Siegler et al., 2011). Our data indicate that Numerical measures of EF indeed produce a significantly higher average association with math than Non-numerical measures of EF (p < .001, d = 0.55; Table 2). Whether or not the higher association for Numerical EF tasks is inflationary is perhaps a matter of perspective. From a theoretical point of view, using a Numerical EF task may be problematic because it muddies the waters, making it unclear whether domain-specific or domain-general factors account for the observed correlation. On a practical level, it is worth noting that the EF↔Math relation for Non-numerical EF measures (r = .327) was closer to the overall average (r = .350) than was the relation for Numerical EF measures (r = .459). Hence, our conclusion is that Numerical measures indeed inflate the apparent association with math – by our estimates, by over 40%. We thus suggest researchers avoid such stimuli in their EF tasks in cases where a key goal of the study is to relate said EF tasks to math skills. For example, in the case of the popular digit-span task, a simple alternative would be to use consonant letters instead of digits, perhaps taking care to avoid common acronyms. Note that an important caveat to this point is that one may take the view that there exists a domain-specific aspect of EF dedicated to numerical or mathematical processing (Wilkey, 2023). In that view, a Numerical EF measure may be preferable. However, if one’s intent is to identify how more domain-general aspects of EF relate to numerical or math processing, then a Non-Numerical EF measure may be better suited.
Another possibility is that the inflationary conclusion above is confounded with other measurement factors. For instance, as reported above, Direct measures of EF showed stronger associations with math than their Observational counterparts; if Numerical EF measures also happen to be disproportionately Direct, this overlap could account for the apparent inflationary effect of numerical stimuli. Unfortunately, an exhaustive treatment comparing Numerical vs Non-Numerical EF measures within each of the remaining subcategories is not possible due to insufficient data (e.g., there were only 2 effect-sizes where the EF task was coded as both Numerical and Inhibitory Control). However, we were able to compare Numerical vs Non-Numerical effects within the WM, Lab-Based and Direct categories. In all three cases, the results were similar: using a Numerical (relative to a Non-Numerical) version of a WM, Lab-Based or Direct measure of EF inflated the EF↔Math association by 50%, 49% and 33%, respectively. While a fully exhaustive treatment of all possible factors was not possible here, the results support an inflationary view. We thus reiterate our recommendation that researchers avoid using numerical EF measures when estimating the association between Math and EF, unless they have an explicit theoretical reason for isolating a math-specific aspect of EF (as suggested by Wilkey, 2023).
Math Measures
The EF↔Math correlation was significantly greater than 0 for all types of math measures, even relatively basic skills like Basic Numeracy and Arithmetic. This consistency speaks to the breadth and robustness of the relationship between EF and math. That said, the EF↔Math relation was strongest for math Achievement measures, relative to Arithmetic or Basic Numeracy measures (ps < .001). One explanation is developmental. Namely, Achievement measures often take the form of standardized tests that include a range of different math problems ostensibly tapping different types of math skills. These Achievement measures may thus include math items that require more complex problem solving, which rely more heavily on EF, in turn increasing the association with EF measures (Friso-van den Bos et al., 2013; Fuchs et al., 2010; Jõgi & Kikas, 2016). An alternative explanation is that stronger EF↔Math associations for Math Achievement primarily reflect measurement as opposed to developmental factors. Standardized Achievement tests are essentially composite measures, aggregating performance across multiple, interrelated math sub-skills. As we saw with EF, Composite measures, due to superior measurement qualities, tend to yield more robust estimates and hence stronger associations with other variables. Consistent with this notion, the effect for Achievement measures (r = .396, 95% CI [.381, .411]) was greater than the overall mean effect (r = .350). In this respect, the results seen for math measures are similar to what we saw for EF measures: At this early stage of math education, a more reliable measurement approach may be to combine measures of math to generate a robust estimate of overall math ability.
2) SES
In terms of SES, Mixed-SES samples, which contributed 77% of the effects included here, yielded an average effect (r = .336, 95% CI [.324, .348]) close to the overall average (r = .350). However, examining the subset of samples that isolated specific portions of the SES distribution revealed that this average effect camouflages important SES-related differences in the EF↔Math relation. In particular, the relation was significantly stronger for samples composed exclusively of low-income children (r = .431, 95% CI [.398, .464]) and weakest for those without such children (i.e., consisting of primarily middle-to-high income samples: r = .317, 95% CI [.255, .378]). One interpretation is that low-income children may rely upon their EF skills when performing math to a greater extent than their higher-income peers. This finding could be a reflection of differential access to resources that support math learning in the home. For instance, high-income parents tend to engage in more math talk with their children compared to lower-income parents (Vandermaas‐Peeler et al., 2009), which, in turn, is associated with stronger early math skills (Susperreguy, 2013; Susperreguy & Davis-Kean, 2016). As such, EF may play less of a role in the math performance of high-income children as they are more likely to receive additional supports for early math learning in the home. However, this is just a hypothesis and more empirical work is needed to further understand the moderating role of SES on the relation between EF and math in early childhood.
The stronger EF↔Math relation for low-SES children may also have implications for school-readiness. Specifically, low-income children demonstrate poorer early math knowledge than their higher-income peers, and the gaps in this knowledge at school-entry are thought to be mediated by income-based discrepancies in EF (Dilworth-Bart, 2012; Fitzpatrick et al., 2014; Lawson & Farah, 2017). Further, gains in low-income preschoolers’ early math skills are predictive of improved EF capacities upon kindergarten entry (McCoy et al., 2019). The current meta-analysis is the first to directly contrast the magnitude of the EF↔Math relation in early childhood across the SES spectrum. In doing so, we find support for the claim that fostering math and EF skills in early childhood education can be mutually beneficial. Moreover, efforts to integrate EF and math in early childhood education may yield the greatest benefits in schools serving children from low-income families, where perhaps not coincidentally, children often lag the furthest behind their peers in both types of skills. A more nuanced understanding of how the relation between EF and math differs as a function of SES can help to inform targeted interventions designed to improve school readiness skills among low-income children specifically, though more work is certainly needed to further test this idea.
3) Longitudinal Associations
The current paper sought to systematically document the evidence for (or against) directional relations between EF and math within a formal meta-analytic framework. ‘Directional’ here refers not to causality, but to capacity to predict change (see below for a more detailed note on causality). Overall, we observed that the average adjusted longitudinal relation from EF to math is less than half the size of the average unadjusted longitudinal relation between EF and subsequent math performance (EF→ΔMath: r = .190, EF→Math: r = .419, respectively; see Table A3), suggesting that unadjusted relations are inflated by about 120%. A similar pattern was seen for longitudinal relations between math and subsequent EF skills (Math→ΔEF: r = .240, Math→EF: r = .420; see Table A3), suggesting an inflation factor of about 75%. Together, these results indicate that failing to adjust longitudinal effects between EF and math leads to substantially inflated estimates of the extent to which these effects are truly directional – that is, the extent to which a given variable predicts change in the outcome.
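The inflation figures above follow directly from the ratio of the unadjusted to the adjusted average effect. The short sketch below reproduces that arithmetic from the values quoted in the text; the function name `percent_inflation` is a hypothetical helper, not part of the original analysis.

```python
def percent_inflation(unadjusted_r: float, adjusted_r: float) -> float:
    """Percent by which the unadjusted estimate exceeds the adjusted one."""
    if adjusted_r == 0:
        raise ValueError("adjusted_r must be non-zero")
    return (unadjusted_r / adjusted_r - 1.0) * 100.0


# Average effects quoted in the text above.
print(f"EF -> math: {percent_inflation(0.419, 0.190):.1f}%")
print(f"math -> EF: {percent_inflation(0.420, 0.240):.1f}%")
```

The same ratio, applied to the Numerical (r = .459) and Non-numerical (r = .327) EF averages, gives the "over 40%" figure reported for numerical stimuli.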
On the other hand, even after adjusting these effects for initial performance, we nevertheless found significant effects in both directions (ps < .001, ds > 1; see bottom two rows of Table 4, and dark pink bars in Figure 3). Together, these results support the notion that the relationship between EF and math is bidirectional (Clements et al., 2016). More broadly, our results support a view wherein both EF and math skills are dynamic, and the relationship between them is transactional (Miller-Cotto & Byrnes, 2020). That said, while average adjusted longitudinal effects were robustly positive, these average effects were relatively modest in magnitude. In terms of longitudinal effects in the literature, this means we would expect some portion of studies to find null or inconsistent results (e.g., Barnes et al., 2016). Further, we would expect only reasonably well-powered studies (i.e., sample-sizes in the hundreds or more) to be capable of detecting effects of the magnitude found here (e.g., Zhang et al., 2023); but even with this power, detection is not a foregone conclusion (Willoughby et al., 2019). With these points in mind, our results may thus help to contextualize some of the apparently inconsistent findings in the literature: null results, especially in smaller samples, are not entirely unexpected, and while unadjusted effects are likely to overstate true underlying longitudinal relations, we should nevertheless expect the average (adjusted) effect to be positive in both directions.
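To give a rough sense of why samples in the hundreds are needed, the sketch below applies the standard Fisher z approximation for the sample size required to detect a simple bivariate correlation. This is an illustrative assumption: the adjusted effects reported here are not simple correlations, so the output is only a ballpark guide, and the function name and the default α and power values are hypothetical choices.

```python
import math
from statistics import NormalDist


def n_for_correlation(r: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size to detect a correlation r (two-sided test, Fisher z)."""
    if not 0.0 < abs(r) < 1.0:
        raise ValueError("r must be non-zero and strictly between -1 and 1")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_power) / math.atanh(r)) ** 2 + 3.0)


# Average adjusted effects reported above.
for label, r in (("EF -> change in math", 0.190), ("math -> change in EF", 0.240)):
    print(label, n_for_correlation(r))
```

Under these assumptions, a single study would need well over a hundred participants to detect an effect of either size with 80% power, and more for the smaller of the two.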
Another point worth emphasizing is that the directional relation from math to change in EF skills was just as strong as, if not modestly stronger than (p = .010, d = 0.39), the generally more celebrated relation from EF to change in math skills. Specifically, many researchers describe EF as a set of skills that are foundational to the acquisition of math knowledge (e.g., Cragg & Gilmore, 2014; Nguyen et al., 2019; Passolunghi & Lanfranchi, 2012). However, findings from the current study suggest that rather than thinking about EF primarily as a precursor to math learning, we would do well to think about EF and math as complementary skills that each play a role in the development of the other. Reciprocity in the relation between EF and math may be a product of the fact that complex math problems require the integration of multiple EF skills. Moreover, acquiring math skills is a complex process requiring one to marshal multiple existing cognitive and neural subsystems to think about and manipulate the world in novel ways – a task that EF skills are uniquely suited to facilitate. Conversely, if one sees EF not as fixed but malleable given experience and input (Miller-Cotto & Byrnes, 2020), then learning math is in many ways an ideal context within which to practice and refine the efficient deployment of EF skills (Clements et al., 2016; Miller-Cotto & Byrnes, 2020).
On a more applied level, the fact that the longitudinal relation between EF and math, once adjusted for baseline performance, is weaker than what is commonly reported in the literature suggests that interventions designed to improve early math skills that focus exclusively on EF training may yield at best modest results. Indeed, this appears to be the case in several extant studies (Barnes et al., 2016; Fuchs et al., 2022; though see also DePascale et al., 2024). Our data indicate that a more accurate view of the developmental relation between EF and math is a bidirectional one, implying that math skills are just as – if not slightly more – predictive of EF development as EF skills are for math development. Based on this, we suggest that efforts to improve both math and EF performance prior to formal schooling should focus on how math activities can be leveraged to include more explicit EF support (Barnes, 2023; Clements et al., 2016; DePascale et al., 2024; Miller-Cotto & Byrnes, 2020).
Limitations
One important limitation of the current work is that, given the relative scarcity of studies assessing the relation between EF and changes in math (and vice versa), we were unable to look at moderators of the bidirectional longitudinal relations between EF and math. For instance, longitudinal links, and their potential implications for interventions in particular, may depend on overlap between the specific subprocesses. Such subprocesses may include the EF subcomponents examined here, or they may involve other cognitive domains relevant for both EF and math, such as spatial thinking.
In another vein, it would be informative to know if the reciprocity in the relation between EF and math is specific to preschool-aged children. Current research on the topic has reported inconsistent results in terms of whether this reciprocity persists as children get older. For example, Schmitt et al. (2017) found evidence for a bi-directional relation between EF and math in preschool, but not in kindergarten. The authors observed that once children started formal schooling, EF predicted gains in math but not the other way around. However, a more recent study observed bi-directional relations between EF and math across kindergarten and into first grade (McKinnon & Blair, 2019). Understanding how reciprocity in the relation between EF and math may change with age is important for informing theories on the developmental progression of EF and its relation to math. Further, it may also be important for understanding how best to design developmentally-appropriate curricula and interventions aimed at improving EF and math skills. As such, future research should investigate how the bidirectional relations between EF and math may change after the onset of formal schooling.
A Note About Causality
With respect to longitudinal associations – even for adjusted relations in which one predicts change in the outcome variable – it is important to note that ‘directional’ does not mean ‘causal’; it means ‘predictive of change in a given direction’ (Bailey et al., 2018). To draw an analogy, the amount of snowfall on Monday night predicts the change in student absences from Monday to Tuesday. Knowing the overnight snowfall is thus extremely useful. Further, it is directionally specific: we would not want to confuse the above with saying that student absences on Monday predict the change in snowfall from Monday to Tuesday. Nevertheless, we would also be careful to stop short of saying that the overnight snowfall directly caused Tuesday’s absences. This is because there are a large number of potential intervening factors that our simple analysis cannot account for, especially at the individual student level. As such, while the methods applied here bring us a step closer to causality in terms of understanding directionally specific contributions of EF and math to the development of one another, more work is needed to further delineate the causal relations between these two skills. Even with directional (adjusted) longitudinal associations, one cannot rule out the possibility that there is a third omitted variable operating on both the predictor and the outcome (Bailey et al., 2018). The most rigorous assessment of the causal relation between EF and math would entail a randomized controlled trial (RCT). Unfortunately, as noted in Jacob and Parkinson (2015), there are few, if any, RCT studies that have been designed specifically to answer this question. 
To conclude, one might best summarize the longitudinal results from the present work in the form of predictions about what future RCT studies are most likely to find: Based on our results, we predict that (1) an RCT that manipulates EF training in early childhood is likely to find a small but significant transfer to math; (2) an RCT that manipulates math training in early childhood is likely to find a small but significant transfer to EF; (3) the transfer effect from math training to EF is likely to be modestly larger than the reverse. Other researchers may disagree with these predictions or find them incomplete, but that is precisely why one would need more work on the topic. In sum, the current work is not proof of causality; instead, it makes evidence-based predictions about the likely outcome of experiments that do have the capacity to establish causality.
Conclusions
The present meta-analysis examined whether the strength of the EF-Math relation in preschool and kindergarten depends on EF and math measurement factors, socio-economic status (SES), and the nature and direction of longitudinal relations. Several results of note emerged. [1] Composite and Achievement measures of EF and Math led to higher average estimates of the EF-Math relation relative to measures that attempt to isolate EF subprocesses or specific math skills; however, more work may be needed to determine whether this is due to measurement or developmental factors. [2] We found that EF measures using numerical stimuli inflate estimates of the EF-Math association by roughly 40%, and thus recommend that researchers avoid using numerical stimuli in their EF measures when their goal is to estimate the magnitude of EF-Math associations. [3] In terms of SES, the strongest average EF-Math association was found for low SES samples, a finding with important implications for preschool programs targeted at this population, such as Head Start and many state pre-K programs. Considering longitudinal associations, [4] those that do not adjust for Time-1 measurement of the outcome variable lead to substantially (as much as 120%) inflated estimates of directional associations. After making these adjustments, we nevertheless found [5a] significant, albeit reduced, bidirectional relations between EF and math. We also found [5b] that math is a stronger predictor of future change in EF than the reverse, and thus recommend that math instruction serve as a foundation for embedding EF-building skills rather than isolating EF-fostering activities from the math curriculum.
In sum, we hope that the results of this work contribute to theoretical models of the interaction between EF and math in early childhood, to a better understanding of the interconnected developmental courses of math and EF capacities, and to practical attempts to foster growth in children’s EF and math skills, whether in the lab, classroom or living room.
This is an open access article distributed under the terms of the Creative Commons Attribution License (