
Mental Abacus (MA) is a popular arithmetic technique in which students learn to solve math problems by visualizing a physical abacus structure. Prior studies conducted in Asia have found that MA can lead to exceptional mathematics achievement in highly motivated individuals, and that extensive training over multiple years can also benefit students in standard classroom settings. Here we explored the benefits of shorter-term MA training to typical students in a US school. Specifically, we tested whether MA (1) improves arithmetic performance relative to a standard math curriculum, and (2) leads to changes in spatial working memory, as claimed by several recent reports. To address these questions, we conducted a one-year, classroom-randomized trial of MA instruction. We found that first-grade students struggled to achieve abacus expertise over the course of the year, while second-graders were more successful. Neither age group showed a significant advantage in cognitive abilities or mathematical computation relative to controls, although older children showed some hints of an advantage in learning place-value concepts. Overall, our results suggest caution in the adoption of MA as a short-term educational intervention.

Mental Abacus (MA) is a popular technique for supplementing elementary school mathematics education, especially in Asian countries, where there is a tradition of abacus use in schools and businesses (

A typical abacus consists of several vertical columns upon which beads move freely. A horizontal beam divides each column into an upper and a lower deck. Each column has 4 or 5 beads on the lower deck (“earthly beads”) and 1 or 2 beads on the upper deck (“heavenly beads”). On a column chosen by the user, each bead in the lower deck represents a cardinality of one, and each bead in the upper deck represents five. Lower deck beads on the next column to the left represent multiples of 10 (and 50 on top), and beads on the next column represent multiples of 100 (and 500 on top), etc. Beads are “in play” (contributing to a numerical representation) when moved towards the central beam.
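The column-and-bead encoding described above can be sketched in a few lines of Python. This is only an illustration, assuming a Soroban-style layout with one upper bead and four lower beads per column; the function names `abacus_state` and `abacus_value` are hypothetical and do not come from the study materials.

```python
def abacus_state(n: int, columns: int = 4) -> list[tuple[int, int]]:
    """Return (upper_beads, lower_beads) in play per column, leftmost column first.

    Assumes one upper bead (worth 5) and four lower beads (worth 1 each)
    per column, with each column one power of ten above its right neighbor.
    """
    if n < 0 or n >= 10 ** columns:
        raise ValueError("number does not fit on this abacus")
    state = []
    for power in range(columns - 1, -1, -1):
        digit = (n // 10 ** power) % 10
        upper, lower = divmod(digit, 5)  # e.g. 7 -> one upper bead, two lower beads
        state.append((upper, lower))
    return state


def abacus_value(state: list[tuple[int, int]]) -> int:
    """Read a bead configuration back into an integer."""
    value = 0
    for upper, lower in state:
        value = value * 10 + 5 * upper + lower
    return value


print(abacus_state(451))  # [(0, 0), (0, 4), (1, 0), (0, 1)]
```

Reading the output for 451 from left to right: no beads in the thousands column, four lower beads in the hundreds column, the upper bead alone in the tens column, and one lower bead in the ones column.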

Previous studies have argued that the unusual abilities of MA users stem from the format of MA computations, which are thought to be supported by gestural and visuospatial representations. Following early studies by Hatano and colleagues (

Both the structure of the abacus itself and MA users’ computational limits are consistent with known limits to visual working memory. Like other attested abacus systems found in the human historical record, the Soroban abacus represents number by chunking beads into small sets of 4 or 5, which corresponds to the hypothesized capacity limits described in the visual attention literature (e.g.,

The apparent reliance of MA expertise on visuospatial working memory raises questions about its potential utility as an educational tool for typical K-12 students (the main question of the present research). In most previous reports of MA expertise, students were self-selecting, highly motivated learners, who may have been especially predisposed to learning math in a visuo-spatial format (

In one recent study (

In addition to making claims about its efficacy as an educational intervention, some recent studies have claimed that MA training may result in changes to working memory, given its visuo-spatial format (

As MA grows in popularity internationally, it has begun to appear in the United States, and has been introduced to multiple public schools to supplement existing mathematics instruction. This turn of events raises the question of whether MA can be implemented effectively in the context of the US public school system, and whether it compares favorably to alternative enrichment programs. Our past work leaves this question open for a number of reasons. First, our previous study was conducted at a private charitable school, which was able to adjust its curriculum to accommodate intensive MA training and provide substantial and ongoing teacher training in how to accommodate MA in the curriculum. Thus, it remains unknown whether the schedule and resources of a typical US public school can accommodate a novel technique like MA. Second, although our previous study was conducted at a private school, it served a population of very low-income families, and featured very large class sizes by US standards – around 60 – 70 students per class. Therefore, while the school had more flexibility to introduce MA, the students may have been less prepared for training than typical US students and may have received relatively little teacher attention, raising the possibility that US students could benefit more, or with smaller amounts of training. Relevant to this suggestion, in that previous study, we found not only that our sample’s visual working memory capacity was predictive of their MA uptake, but also that their overall visual working memory capacity was quite low relative to comparison samples of Indian children from higher socio-economic backgrounds. Third, because of the large class size, a small number of teachers were responsible for MA instruction. Thus, we were not able to test for the generality of intervention benefits across classrooms.

In the present study, we investigated the potential effectiveness of MA in the US context by conducting a one-year, classroom-randomized trial of MA instruction in a Northeastern US public school. Classes were randomly assigned to either maintain their standard math curriculum (Common Core Singapore Math; Control) or to receive a mixture of their standard math plus MA training. We assessed performance during the first week of classes, before MA was administered, and near the final week of classes, after children in the treatment group had received a full academic year of training.

Our study had two main goals. First, we asked whether a group of US school children (

We conducted a classroom-randomized evaluation of MA training in Grades 1 and 2 in a large elementary school. Classes were randomly assigned to a Control group who received their standard math curriculum (Common Core Singapore Math) or an experimental group who received the same number of hours of instruction but divided between their standard curriculum and mental abacus. We assessed student performance during the first week of school, prior to MA instruction (pretest) and then at the end of the school year (posttest). Students were assessed at both time periods on measures of mathematics knowledge and on general cognitive measures (spatial working memory, executive function, and general reasoning). At posttest all students completed a measure of math anxiety. In addition, as a test of MA curriculum uptake, those students in the MA group were also tested on their ability to read an abacus.

We partnered with a large school in a Northeastern US state, located in a large metropolitan area. The school was ethnically and socioeconomically diverse, with 67.1% of students eligible for free or reduced lunch in 2012 and > 90% black or Hispanic students. Prior to study initiation, we received a list of classrooms (

Students in the Control group followed their standard, existing “Singapore Math” curriculum. Singapore Math focuses on a gradual transition between concrete and abstract representations, and often makes use of concrete visualizations of problems (e.g., using schematic diagrams). Standard math class was scheduled daily for one period per day (40 min). The MA training group received three periods per week (40 min each) of MA training from their own teachers, using external curriculum materials; the remaining two periods per week were used for standard Singapore Math instruction, using the same materials as the control group. The MA materials focused first on the introduction of the physical abacus, and then on the basics of the MA technique. Specifically, children in Grade 1 learned addition using the complement of 5 (to prepare them for MA addition, which requires this skill for the use of the top bead, which denotes 5). By the end of the year, some could do mental abacus computations using both bottom and top beads. Children in Grade 2 learned addition using the complement of 10 (to prepare for multi-column abacus problems) and some could do mental abacus addition using the complement of 5.
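As a rough illustration of what "addition using the complement of 5" means on a single abacus column, the sketch below adds a small digit either by pushing up lower beads or, when too few lower beads remain, by bringing down the top bead (+5) and removing the complement (5 minus the addend). The function name `add_on_column` is hypothetical, the layout assumes one upper and four lower beads, and carries into the next column (the complement of 10) are deliberately left out. It is not taken from the curriculum materials.

```python
def add_on_column(upper: int, lower: int, d: int) -> tuple[int, int]:
    """Add d (1-4) to one column currently showing 5 * upper + lower.

    Returns the new (upper, lower) bead counts. Carries are not handled.
    """
    if upper not in (0, 1) or not 0 <= lower <= 4 or not 1 <= d <= 4:
        raise ValueError("invalid column state or addend")
    if 5 * upper + lower + d > 9:
        raise ValueError("needs a carry (complement of 10), not covered here")
    if lower + d <= 4:
        # Enough lower beads are free: push d of them toward the beam.
        return upper, lower + d
    # Too few lower beads: add 5 with the top bead, then remove (5 - d) lower beads.
    return upper + 1, lower - (5 - d)


print(add_on_column(0, 3, 4))  # (1, 2): 3 + 4 = 7, done as +5 then -1
```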

We sent consent forms home with all students in Grades 1 and 2 and followed up with several school-wide announcements. All children from participating classrooms were enrolled in the study if they had valid consent forms, though a small number were sick or otherwise absent and were not tested in particular tasks or at both time points. Participants were only included in analyses if they had been tested at both time points. We also received some consent forms from children in excluded classes.

Group | Grade | Children Enrolled | Consent Form Received | Data Collected |
---|---|---|---|---|
Control | 1st | 103 | 49 (48%) | 26 |
Control | 2nd | 89 | 51 (57%) | 44 |
Mental Abacus | 1st | 95 | 46 (48%) | 35 |
Mental Abacus | 2nd | 114 | 70 (61%) | 59 |

Descriptive statistics | Control, Grade 1 | Control, Grade 2 | Mental Abacus, Grade 1 | Mental Abacus, Grade 2 |
---|---|---|---|---|
N Participants | 26 | 44 | 35 | 59 |
Mean Age | 6.52 | 7.56 | 6.37 | 7.46 |
Median Age | 6.49 | 7.51 | 6.32 | 7.41 |
Std. Dev. Age | 0.39 | 0.41 | 0.29 | 0.36 |
N Hispanic | 15 | 20 | 14 | 26 |
N African-American | 7 | 13 | 11 | 24 |
N Mixed Race/Other | 4 | 11 | 10 | 9 |

Students were tested twice: Once at the beginning of the school year and once at the end. In each case, all students were tested during a 4 to 5 day period. Testing was performed in the school library or unused classrooms, using portable laptop computers for the cognitive assessments. Students completed tasks in groups of 2 – 4 over the course of one or two sessions, totaling about 45 – 60 minutes. Cognitive assessments were given first, then mathematics assessments.

We administered a short battery of cognitive assessments to test for cognitive transfer (see

All students completed three short assessments. The first was the Woodcock-Johnson III standardized measure, a two-page battery ranging from simple arithmetic to much more advanced high-school topics (5 minutes). The second was an in-house assessment of conceptual understanding of place-value, a foundational math concept that is part of the Common Core curriculum for Grades 1 and 2. The task required filling in the missing quantity in a place-value decomposition (e.g., 400 + ___ + 1 = 451) (5 minutes, reported in

To test children’s uptake of MA, we administered an in-house assessment of abacus translation (measuring the accuracy and speed with which children can translate from an abacus state to Arabic numerals). Children had 5 minutes to complete 23 problems.

We assessed math anxiety using a questionnaire adapted from

Confirmatory analyses specified below were pre-registered at the Open Science Framework (see

All measures were normalized into the unit interval for ease of comparison and interpretation. In general, we normalized by the total number of questions on a measure. Since no child was able to answer questions on the second page of the Woodcock-Johnson III (which includes fractions, negative numbers, etc.), we included only the first page, containing 25 questions. For the Go/No Go task, we used accuracy on “no go” trials. For spatial working memory, we normalized spans arbitrarily by dividing by 10; this decision does not affect any statistical analyses and was made to simplify visualization.
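The normalization rules in this paragraph amount to simple divisions, as sketched below. The constants 25 and 10 come from the text; the function names and the representation of Go/No Go trials as a list of booleans are assumptions made for illustration only.

```python
WJ_FIRST_PAGE_ITEMS = 25  # from the text: only the first WJ III page was scored
SPAN_DIVISOR = 10         # from the text: arbitrary scaling for working memory spans


def normalize_count(n_correct: int, n_items: int) -> float:
    """Proportion correct on a measure with n_items questions."""
    if not 0 <= n_correct <= n_items:
        raise ValueError("n_correct must lie between 0 and n_items")
    return n_correct / n_items


def normalize_span(span: float) -> float:
    """Rescale a spatial working memory span by the fixed divisor."""
    return span / SPAN_DIVISOR


def no_go_accuracy(responded_on_no_go: list[bool]) -> float:
    """Share of 'no go' trials on which the child correctly withheld a response."""
    if not responded_on_no_go:
        raise ValueError("at least one no-go trial is required")
    correct = sum(1 for responded in responded_on_no_go if not responded)
    return correct / len(responded_on_no_go)


print(normalize_count(10, WJ_FIRST_PAGE_ITEMS))        # 0.4
print(normalize_span(3.5))                             # 0.35
print(no_go_accuracy([False, False, True, False]))     # 0.75
```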

As in our previous work (

outcome ~ grade + year * intervention + (1 | subid) + (1 + year | class)

The key coefficient in these models was the year-by-intervention interaction term, capturing the possibility of a greater increase in the outcome measure as a function of random assignment to intervention group. The random intercepts for each participant control for participant-level individual differences, and the random slopes and intercepts for classes capture different baseline levels and growth patterns across classes.
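To make the meaning of the year-by-intervention term concrete, the sketch below computes its simplest analogue: the gain in the MA group minus the gain in the Control group (a difference in differences of cell means). This is a simplification that ignores the grade covariate and the random effects for participants and classes, so it is not the fitted model itself. The scores are hypothetical values invented for the example, not data from the study.

```python
from statistics import mean

# Hypothetical normalized scores, NOT data from the study.
toy_scores = {
    ("control", "pre"): [0.20, 0.30, 0.25],
    ("control", "post"): [0.40, 0.45, 0.50],
    ("ma", "pre"): [0.22, 0.28, 0.25],
    ("ma", "post"): [0.50, 0.55, 0.60],
}


def gain(scores: dict, group: str) -> float:
    """Mean post-test score minus mean pre-test score for one group."""
    return mean(scores[(group, "post")]) - mean(scores[(group, "pre")])


def interaction_estimate(scores: dict) -> float:
    """Difference in gains between the MA and Control groups."""
    return gain(scores, "ma") - gain(scores, "control")


print(round(interaction_estimate(toy_scores), 3))  # 0.1
```

A positive value corresponds to greater growth in the intervention group; a value near zero corresponds to parallel growth in the two groups.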

We had six primary outcome variables, corresponding to our six tasks: Three mathematics measures (Arithmetic, Place Value, and the standardized WJ III assessment) and three cognitive measures (Matrix Reasoning, Go/No Go, and Spatial Working Memory). The distribution of outcome variables for each task is shown in

Histograms showing the distribution of scores from each task in our battery, split by grade level. Dashed lines show means. Upper panels show pre-test scores; lower panels show post-test scores.

All tasks showed evidence of modest test-retest reliability across the school year (range = .30 – .63, see

Task | Grade | Test-retest reliability | Lower 95% CI | Upper 95% CI | p |
---|---|---|---|---|---|
Arithmetic | 1st | .52 | 0.30 | 0.68 | <.0001 |
Arithmetic | 2nd | .49 | 0.33 | 0.63 | <.0001 |
Place Value | 1st | .31 | 0.06 | 0.52 | .0154 |
Place Value | 2nd | .63 | 0.49 | 0.73 | <.0001 |
WJ III | 1st | .32 | 0.07 | 0.53 | .0126 |
WJ III | 2nd | .38 | 0.20 | 0.53 | .0001 |
Matrix Reasoning | 1st | .33 | 0.09 | 0.54 | .0091 |
Matrix Reasoning | 2nd | .33 | 0.15 | 0.50 | .0005 |
Go/No Go | 1st | .56 | 0.35 | 0.71 | <.0001 |
Go/No Go | 2nd | .45 | 0.28 | 0.59 | <.0001 |
Spatial WM | 1st | .43 | 0.19 | 0.62 | .0008 |
Spatial WM | 2nd | .30 | 0.12 | 0.47 | .0019 |
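One common way to obtain a test-retest correlation and its 95% confidence interval is a Pearson correlation between pretest and posttest scores with a Fisher z-transformed interval. The sketch below assumes that approach; the text does not state which method was used. The sample size of 61 in the example is the number of first-graders with data collected (26 + 35), which may differ from the number who completed any particular task.

```python
import math
from statistics import NormalDist, mean


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between paired pretest and posttest scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r: float, n: int, level: float = 0.95) -> tuple[float, float]:
    """Confidence interval for a correlation via the Fisher z transform."""
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


low, high = fisher_ci(0.52, 61)
print(round(low, 2), round(high, 2))
```

Under these assumptions the interval for r = .52 lands close to the first-grade Arithmetic row of the table.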

We also examined intervention uptake at the end of the study (

Histogram of abacus familiarity scores, broken down by grade. Dashed lines show means.

We found a roughly bimodal distribution of children, with some children relatively proficient at decoding abacus representations and others quite poor and only able to do so for 1 - 2 digit displays (a skill which they nevertheless did not possess prior to the intervention). The relative balance of children in the two modes was different across grades, however, with a much larger population of second-graders gaining proficiency in the technique. These uptake findings are an important metric of the appropriateness of MA instruction. A relatively small proportion of first graders could accurately decode a multi-digit abacus by the end of one year of instruction (21%). Thus, MA may not have been an appropriate curriculum for these children, given their place value knowledge. We discuss this result in more depth below, but it qualifies the interpretation of all subsequent outcome measures for the intervention.

The primary question addressed by our confirmatory analyses was whether assignment to treatment condition (MA vs. Control) resulted in differential change in mathematical or cognitive measures. Due to model convergence issues, we deviated from our pre-registered plan by removing random slopes for individual classes (this decision is consistent with our standard operating procedures for how to deal with non-convergent mixed effects models).

Task | Predictor | β | Std Err | Test statistic | p |
---|---|---|---|---|---|
Arithmetic | Intercept | 0.040 | 0.015 | 2.63 | .0085 |
Arithmetic | Second Grade | 0.095 | 0.015 | 6.28 | <.0001 |
Arithmetic | Post-Test | 0.146 | 0.011 | 13.52 | <.0001 |
Arithmetic | Mental Abacus | 0.009 | 0.017 | 0.55 | .5838 |
Arithmetic | Post-Test x Mental Abacus | -0.013 | 0.014 | -0.90 | .3665 |
Place Value | Intercept | 0.038 | 0.039 | 0.98 | .3257 |
Place Value | Second Grade | 0.266 | 0.038 | 7.05 | <.0001 |
Place Value | Post-Test | 0.251 | 0.029 | 8.62 | <.0001 |
Place Value | Mental Abacus | 0.021 | 0.042 | 0.50 | .6154 |
Place Value | Post-Test x Mental Abacus | 0.074 | 0.038 | 1.94 | .0527 |
WJ III | Intercept | 0.230 | 0.013 | 17.93 | <.0001 |
WJ III | Second Grade | 0.154 | 0.012 | 12.98 | <.0001 |
WJ III | Post-Test | 0.186 | 0.012 | 15.67 | <.0001 |
WJ III | Mental Abacus | -0.001 | 0.014 | -0.10 | .9218 |
WJ III | Post-Test x Mental Abacus | 0.001 | 0.016 | 0.09 | .9303 |
Matrix Reasoning | Intercept | 0.201 | 0.026 | 7.75 | <.0001 |
Matrix Reasoning | Second Grade | 0.081 | 0.026 | 3.17 | .0015 |
Matrix Reasoning | Post-Test | 0.114 | 0.021 | 5.56 | <.0001 |
Matrix Reasoning | Mental Abacus | 0.005 | 0.029 | 0.17 | .8661 |
Matrix Reasoning | Post-Test x Mental Abacus | -0.021 | 0.027 | -0.78 | .4353 |
Go/No Go | Intercept | 0.730 | 0.017 | 42.03 | <.0001 |
Go/No Go | Second Grade | 0.066 | 0.017 | 3.98 | .0001 |
Go/No Go | Post-Test | 0.041 | 0.014 | 3.04 | .0024 |
Go/No Go | Mental Abacus | 0.012 | 0.019 | 0.64 | .5233 |
Go/No Go | Post-Test x Mental Abacus | -0.032 | 0.018 | -1.79 | .0742 |
Spatial WM | Intercept | 0.296 | 0.020 | 14.44 | <.0001 |
Spatial WM | Second Grade | 0.049 | 0.019 | 2.59 | .0095 |
Spatial WM | Post-Test | 0.075 | 0.019 | 3.94 | .0001 |
Spatial WM | Mental Abacus | 0.023 | 0.022 | 1.02 | .3098 |
Spatial WM | Post-Test x Mental Abacus | 0.014 | 0.025 | 0.53 | .5950 |

Performance on mathematics measures by time and grade. Error bars show 95% confidence intervals, computed by non-parametric bootstrap.

Performance on cognitive measures by time and grade. Error bars show 95% confidence intervals, computed by non-parametric bootstrap.
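The error bars in these figures are described as non-parametric bootstrap confidence intervals. A minimal percentile-bootstrap sketch for the mean of one cell (for example, one group at one time point) is shown below. The number of resamples, the seed, and the example scores are assumptions for illustration; the study's actual settings are not given in the text.

```python
import random
from statistics import mean


def bootstrap_ci(values: list[float], n_boot: int = 2000,
                 level: float = 0.95, seed: int = 0) -> tuple[float, float]:
    """Percentile bootstrap confidence interval for the mean of values."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement, recompute the mean, and sort the results.
    boot_means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_boot))
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return boot_means[lo_idx], boot_means[hi_idx]


# Hypothetical normalized scores, NOT data from the study.
toy_cell = [0.20, 0.35, 0.40, 0.25, 0.50, 0.30, 0.45, 0.15]
low, high = bootstrap_ci(toy_cell)
print(low, high)
```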

Beginning with the math measures, we did not see evidence of differential change in performance for either the in-house arithmetic or standardized WJ-III measures. This result differs from the findings of

In the cognitive measures, we did not see evidence of differential changes in performance for either matrix reasoning or spatial working memory. These results are consistent with our previous findings and suggest again that MA-related changes to spatial working memory do not result from even extensive MA training in typical groups of school children.

We did, however, find an unpredicted negative interaction of time and condition, such that students in the control group appeared to increase more in performance on the Go/No Go task. One possible explanation for this finding is a speed / accuracy tradeoff such that children in the MA group were less accurate but faster (perhaps stemming from the fact that MA training focuses on increasing computation speed, sometimes even at the expense of accuracy, such that computations are executed before the abacus “fades” from the mind’s eye). This explanation appears plausible given a visual inspection of the reaction times (

Reaction times for the go/no-go task. Plotting conventions are as above.

In sum, we saw limited evidence for the effectiveness of the MA intervention. In the math tasks, only the place value measure showed a hint of an intervention effect. And in the cognitive tasks, there were no intervention effects except for a possible shift in response criterion on the Go/No Go task.

In our previous study, we found that spatial working memory scores at study initiation moderated the effects of the intervention. Children who were above the median in spatial working memory tended to show the largest gains in arithmetic performance from studying MA. We pre-registered this confirmatory hypothesis and conducted it on all three of our math measures (

Median split results for outcome measures based on intervention group and spatial working memory scores. Plotting conventions are as above.

Of the three, only place value showed the predicted pattern, and only for the second graders – as might be expected based on the limited uptake among first-graders. Numerically, the pattern for place value was similar to what we observed in the arithmetic measure in our first study: Greater growth for high spatial WM children in the MA group. Nevertheless, in exploratory mixed-effects models, the three-way interaction of spatial working memory, time, and condition was not significant. Likely our study would have required considerably more power to detect such an effect.
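The median-split comparison can be sketched as follows: children are divided at the median of their pretest spatial working memory score, and the mean pre-to-post gain is computed within each half. The record layout (`wm_pre`, `pre`, `post`), the function name, the handling of ties (children exactly at the median go to the lower group), and the example values are all assumptions made for illustration.

```python
from statistics import mean, median


def median_split_gains(children: list[dict]) -> dict[str, float]:
    """Mean gain (post - pre) for children above vs. at or below the median WM score."""
    cutoff = median([c["wm_pre"] for c in children])
    high = [c["post"] - c["pre"] for c in children if c["wm_pre"] > cutoff]
    low = [c["post"] - c["pre"] for c in children if c["wm_pre"] <= cutoff]
    return {"high_wm": mean(high), "low_wm": mean(low)}


# Hypothetical records for one intervention group, NOT data from the study.
toy_children = [
    {"wm_pre": 0.2, "pre": 0.1, "post": 0.2},
    {"wm_pre": 0.3, "pre": 0.2, "post": 0.3},
    {"wm_pre": 0.4, "pre": 0.1, "post": 0.4},
    {"wm_pre": 0.5, "pre": 0.2, "post": 0.5},
]
print(median_split_gains(toy_children))
```

Running the same function separately for the MA and Control groups gives the four cells compared in the figure.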

We further assessed whether the MA intervention led to changes in math anxiety at the end of the study via an exploratory analysis. As shown in

Average math anxiety ratings (on a five-point Likert scale), split by group and grade level. Error bars show 95% confidence intervals computed by non-parametric bootstrap.

We investigated the potential effectiveness of mental abacus (MA) instruction in the US context by conducting a one-year, classroom-randomized trial of MA. The study had two main goals. First, we asked whether a large group of US school children, distributed across a number of classrooms, would experience greater benefit from MA training than from equivalent hours of standard math curriculum. Second, we explored the claim, made by several recent studies, that even small amounts of MA training might yield benefits to visuo-spatial working memory abilities. Rather than being a “best case” implementation of MA, our study reports a relatively realistic trial, in that – with the exception of classroom randomization – its implementation was planned and executed by the school and a private MA curriculum vendor with little researcher input.

Our study yielded two main results. First, we found that the first-grade children in our study generally failed to master the abacus as a format for numerical representation: Even after a year of weekly training, they struggled to translate numbers between abacus and numeral formats. Perhaps this difficulty was due to their very limited place-value understanding at study initiation (as shown by their floor-level performance on this measure). Regardless of its source, this limitation meant that they were missing one of the prerequisites for MA learning, thereby compromising their ability to experience other gains from the technique. In addition, they were not able to use the technique to construct an understanding of place value. Second, we found that, for the second-graders, one year was sufficient to acquire a basic understanding, but this level of mastery yielded only the beginnings of measurable benefits to mathematics achievement. We saw hints of improvement in place value understanding in the second grade MA group, but no effects on arithmetic performance more broadly. Overall, we did not find evidence that one year of training was sufficient to augment working memory or result in other measurable changes in cognitive tasks (e.g., reasoning or go/no-go).

These results have several implications. First, they suggest that there is prerequisite knowledge for students to learn MA. Although our study does not reveal precisely what these prerequisites are, basic numerical concepts, basic understanding of arithmetic operations, and some foundational understanding of place-value are all potential candidates. If these prerequisites are not satisfied, MA may be difficult for children to master in typical classroom environments. Thus, starting MA at an age before these prerequisites are in place may not be effective. Second, they suggest that although longer term interventions may benefit children, short-term MA interventions may be less likely to provide an advantage to children unless (1) they are more intensive than the intervention pursued here, or (2) they are administered to better-prepared and/or older students. While this result may be interpreted as a negative finding, it might also be taken as evidence that alternative approaches like MA – which children may potentially find more enjoyable than existing techniques – can be deployed without sacrifice to learning (since the MA group was not numerically or statistically

Finally, a third implication of this work relates to the potential non-mathematical benefits of MA training. One common claim made by developers of MA curricula is that the technique not only accelerates learning of mathematics, but also promotes the growth of domain-general cognitive capacities like memory and attention. Consistent with these claims, several recent studies have reported remarkable changes in spatial working memory capacity after only brief experience with MA technique (e.g.,

One question that remains unanswered by this study is whether MA might yield a significant advantage over alternative techniques in a more protracted or in-depth intervention. Past research found that the largest benefits emerged after three years of training – the amount of time it takes typical children to complete most existing MA curricula. Given that the first year of MA training focuses extensively on physical abacus training, and less so on mental computations, the possibility of greater gains after more training seems plausible. Data from the present study do not give us reason to believe that US children would perform differently from Indian children given additional training. In fact, the strong spatial working memory of children in the US sample suggests that advantages might emerge earlier if training continued (at least for the second graders). Still, additional evidence is required before MA can be recommended as superior to existing techniques.

In addition, because MA requires specialized teacher training, future studies should explore instructional components. One open question in this area is the amount of teacher training necessary for effective instruction – since teachers in our school appear to have found the technique challenging and required unplanned mid-year visits from the curriculum provider. A second question is how easily MA technique can be transmitted across teachers, such that instruction can persist within schools without incurring additional training costs each new year. Because MA is an unfamiliar technique in US schools, teacher training is a major challenge to implementing the technique in the classroom. Previous studies leveraged contexts in which MA instruction was already relatively common (

In summary, we found that MA instruction led to results comparable to our control group after one year of training. Although MA instruction has led to mathematical performance gains in past work, it may only produce such results after a longer intervention or in populations with stronger prerequisite mathematical or cognitive abilities. Further, consistent with past results, we find that MA training does not augment children’s pre-existing cognitive capacities.

All tasks are visible at

Pre-registration information for our study can be found at

All data and analytic code for the project can be found at

This work was supported by NSF BCS 1550667.

The authors have declared that no competing interests exist.

Our thanks to Kevin Kim, Woo Song, Daniela Small-Bailey, and all the teachers and students at Grieco Elementary School.