Why is it so hard for some people to learn the multiplication table – the single-digit multiplications up to 10 × 10? One account of such difficulties emphasizes the role of interference. The multiplication facts, which involve only 10 digits, are highly similar to each other, and this similarity may make them interfere with each other. This idea is central in the network interference model (Campbell, 1995; Campbell & Graham, 1985). According to this model, arithmetic facts are stored in memory as an associative network: each arithmetic fact, say 6 × 7 = 42, is associated not only with the correct solution, but also with the solutions of other facts (e.g., with 48, the product of 6 × 8). Such incorrect-solution association may occur, for example, if the two facts share an operand (e.g., 6 × 7 = 42 and 6 × 8 = 48). Incorrect-solution associations disrupt learning and can cause retrieval errors, because they create interference among facts: when presented with a particular fact, the person may follow the association network to the solution of another fact.
Several studies indicate that the representation underlying multiplication facts is verbal (Dehaene, 1992; Dehaene & Cohen, 1995; Dehaene, Piazza, Pinel, & Cohen, 2003). This conclusion is supported by neuropsychological studies showing that multiplication deficits are associated with verbal impairments (as opposed to subtraction deficits, which are associated with impaired quantity processing; Cohen & Dehaene, 2000; Cohen, Dehaene, Chochon, Lehéricy, & Naccache, 2000; Dagenbach & McCloskey, 1992; Dehaene & Cohen, 1997; Delazer & Benke, 1997; Lampl, Eshel, Gilad, & Sarova-Pinhas, 1994; Lochy, Domahs, Bartha, & Delazer, 2004; Pesenti, Seron, & van der Linden, 1994; van Harskamp & Cipolotti, 2001). It is also supported by brain imaging studies showing that multiplication tasks activate brain regions that are also activated by tasks of language, verbal short-term memory, and phonological processing (Dehaene, Spelke, Pinel, Stanescu, & Tsivkin, 1999; Simon, Mangin, Cohen, Le Bihan, & Dehaene, 2002). There is also evidence for nonverbal multiplication (McCrink & Spelke, 2010), but the common assumption is still that by and large, multiplication is driven mostly by verbal representations. Thus, the interference effects in multiplication may be related to general mechanisms of verbal memory. Indeed, verbal memory is susceptible to interference: similar verbal items interfere with one another more than dissimilar items in a variety of contexts – in verbal memory tasks (Baddeley, 1966, 2003; Hall, 1971; Nelson, Brooks, & Borden, 1974; Oberauer & Kliegl, 2006; Oberauer & Lange, 2008; Vallar, 2006), when learning words in a new language (Pajak, Creel, & Levy, 2016), and even when memorizing non-numeric arithmetic-like facts (Graham & Campbell, 1992).
Everybody experiences some difficulties arising from verbal interference, but for some people the difficulty is more severe than for others. De Visscher and Noël (2013, 2014a, 2014b) proposed that some people have hypersensitivity to interference – an extreme sensitivity to the interference arising from similarity between verbal items – and that these people may experience difficulty in handling the multiplication facts. In support of this hypothesis they reported DB, a woman with poor memory of the multiplication table, and showed that she also had hypersensitivity to interference: she performed poorly in tasks that were sensitive to interference, even when the stimuli were non-number words. In contrast, she performed well in tasks that assessed several other potential sources of difficulty in calculation, including verbal working memory capacity. De Visscher and Noël proposed that DB’s difficulty in memorizing the multiplication table was a reflection of a more general difficulty in verbal memory – her hypersensitivity to interference.
De Visscher and Noël’s series of studies showed a convincing correlation between hypersensitivity to interference and difficulty in arithmetic facts. Here, we wish to strengthen their point by providing causal evidence that hypersensitivity to interference disrupts the memorization of multiplication facts. We examined DL – a woman who, similar to the woman described by De Visscher and Noël, had poor memory of multiplication facts as well as hypersensitivity to interference. To show that DL’s hypersensitivity to interference not only correlates with her difficulty in multiplication facts but is also the reason for this difficulty, we designed an experiment in which we taught her multiplication facts in a low-interference condition. We predicted that in this condition, interference would play only a minor role, and consequently DL would succeed in learning the multiplication facts. In contrast, in a high-interference condition she should still have difficulty. As we shall see, this was indeed the case.
Our method to reduce interference is rooted in Campbell and Graham’s (1985) network interference model. The model assumes that multiplication facts are represented as a network of associations and that different facts may use intersecting network sections, so when the activation region of one problem intersects that of another problem, retrieving the answer to the first problem increases the probability that this answer would be incorrectly retrieved for the second problem. This idea is supported by the error priming phenomenon: Campbell (1987) asked participants to solve a given set of multiplication problems; when the experimental block included, on top of these problems, some additional, associated problems, the participants responded more slowly and made more errors than when the block did not include such associated problems.
Similar to Campbell (1987), here we manipulated the degree of interference by controlling the similarity between the multiplication facts presented during a particular experimental session. Based on previous findings (Campbell, 1987; De Visscher & Noël, 2013, 2014a, 2014b; Girelli, Delazer, Semenza, & Denes, 1996), we reasoned that even if hypersensitivity to interference impairs DL’s ability to memorize similar multiplication facts, she would still be able to memorize a set of dissimilar multiplication facts. Importantly, although the multiplication table as a whole has much similarity between the facts, some facts are dissimilar from each other (e.g., 9 × 9 = 81 and 7 × 4 = 28), and we hoped that DL would be able to learn such subsets of facts. Moreover, we assumed that fact A would interfere with a similar fact B if the two facts are presented within a short time from each other, but not if they are presented with sufficient temporal delay between them (this was essentially the finding of Campbell, 1987). This second assumption opens the door to teaching DL not only dissimilar multiplication facts but also similar facts: all we need to do is to present similar facts in different learning sessions, which are separated from each other by a sufficient delay.
These two foundations led to the following simple training method. We identified the multiplication facts that DL did not know, and grouped them into small sets of facts. Crucially, we constructed the different sets of facts such that they had different levels of between-item similarity within the set, i.e., different levels of induced interference. Each set was taught during one week, and importantly, during that week DL refrained from rehearsing facts from any other set. We predicted that DL would have difficulty learning the multiplication facts in high-similarity sets, but would succeed in learning low-similarity sets.
General Method
Participant
DL was a 40-year-old woman who arrived in our lab as a potential participant in an experiment about number reading difficulties. An initial examination showed that she did not have number reading difficulties, but her knowledge of the multiplication table was severely impaired. Put in her own words, she was "clueless in multiplication". She reported that her difficulties began in elementary school, and persisted in spite of several years of hard work and private tutoring in math.
Procedure
All tasks were administered in Hebrew. The assessment tasks were done in a quiet room in our lab or in DL’s home. The training was done over the telephone, while DL was in a quiet room in her home. There was no limit on the response duration in any task (but extremely slow responses were classified as errors, as detailed below).
Statistical Analyses
Comparing DL With Control Participants
Statistical comparisons of DL’s performance to control groups were done using Crawford and Garthwaite's (2002) one-tailed t-test, with effect size defined as zcc (CC stands for case-controls) – DL’s score normalized according to the control group distribution (Crawford, Garthwaite, & Porter, 2010). Control participants with outlier error rates (higher than the 75th percentile by more than 150% of the inter-quartile range) were excluded. When comparing DL’s performance versus the control group we typically report one-tailed p values, reflecting the assumption that DL may have some cognitive deficit underlying her difficulty in multiplication, and such deficit should be manifested in poorer performance than the control group.
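The case-versus-controls comparison can be sketched in a few lines. This is a minimal illustration, assuming the test takes the standard single-case form in which the control standard deviation is inflated by sqrt((n + 1) / n); the function name `crawford_howell` is ours, not the authors'. The example values (DL = 14 errors; controls M = 2.5, SD = 2.02, n = 12) are the ones reported for the full multiplication test in the Basic Arithmetic section. The p value would additionally require a t-distribution function, which is omitted here.

```python
import math

def crawford_howell(case_score, control_mean, control_sd, n_controls):
    """Return (t, df, z_cc) for one case compared with a small control sample."""
    # Assumed formula: the case is treated as a sample of size 1, so the
    # control SD is scaled by sqrt((n + 1) / n) in the denominator.
    t = (case_score - control_mean) / (
        control_sd * math.sqrt((n_controls + 1) / n_controls)
    )
    df = n_controls - 1
    # z_cc: the case's score standardized against the control distribution.
    z_cc = (case_score - control_mean) / control_sd
    return t, df, z_cc

t, df, z_cc = crawford_howell(14, 2.5, 2.02, 12)
print(round(t, 2), df, round(z_cc, 2))  # 5.47 11 5.69
```

Under these assumptions the sketch reproduces the reported t(11) = 5.47 and zcc = 5.69.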
Comparing Single-Subject Data Between Two Conditions
One-tailed comparisons of DL’s performance between two conditions were done using a bootstrapping method. The effect we examined was indexed as the difference between the two conditions in the mean score across items:
ScoreDiff = (mean scores of condition 1) – (mean scores of condition 2)
One-tailed p values were obtained by comparing the observed ScoreDiff value versus the random distribution of ScoreDiff under the null hypothesis that the two conditions do not differ. This random distribution was generated by arbitrarily classifying all scores into two conditions 10,000 times, and computing the value of ScoreDiff for each such random classification. For a paired analysis, in which each tested item has one score in condition 1 and one score in condition 2, each random classification was created by arbitrarily labeling each item’s two scores as “condition 1” and “condition 2”. For an unpaired analysis, each random classification was created by reshuffling the scores of the two conditions into two sets arbitrarily labeled “condition 1” and “condition 2”, while maintaining the original number of items in each condition. When the total number of items was too small such that there were fewer than 10,000 different possible random classifications, we computed the random distribution based on all possible classifications. As effect size we report z – the observed ScoreDiff value, standardized to the mean and standard deviation of the random distribution.
Initial Cognitive Assessment
Basic Arithmetic
In a screening test, DL was presented with 12 multiplication facts: 4 rule-based facts (N × 0 and N × 1) and 8 non-rule facts (both operands between 2 and 9). She had difficulty in both types of facts (Table 1). In contrast, she performed well in several other oral tests of arithmetic: single-digit additions, subtractions with single-digit second operand and single-digit result, and two-digit calculations. Thus, her difficulty in multiplication facts did not disrupt addition/subtraction facts or the execution of calculation algorithms. At this time, we taught her the rules N × 0 = 0 and N × 1 = N. Throughout the study, she hardly ever again erred in these rules.
Table 1
Task | No. of items | Errors | Hesitations |
---|---|---|---|
1-digit multiplication (rule-based) | 4 | 2 | 1 |
1-digit multiplication (non-rule) | 8 | 2 | 2 |
1-digit addition | 15 | 0 | 0 |
1-digit subtraction | 8 | 0 | 0 |
2-digit addition | 3 | 0 | 0 |
2-digit subtraction | 3 | 0 | 0 |
2-digit multiplication | 3 | 0 | 0 |
We then tested DL’s knowledge of all 55 multiplication facts (the larger operand always appeared first) – 19 rule-based facts and 36 non-rule facts. She was flawless in the rule-based facts, but she had 14/36 errors (39%) in the non-rule facts: she made 5 operator errors (adding instead of multiplying), 2 within-table errors (saying the result of another multiplication fact), 1 out-of-table error (saying a number that is not the product of any two digits), and 7 “don’t know” responses. Her error rate of 39% was significantly worse than the performance of 12 age-matched control participants (M = 2.5 errors, SD = 2.02, Crawford and Garthwaite’s (2002) t(11) = 5.47, p < .001, zcc = 5.69; control group mean age = 40;7, SD = 5;3), and worse than the worst-performing control participant, who had only 4/36 errors (using the paired bootstrap method described in General Method, one-tailed p = .006, z = 2.12). Thus, DL clearly had impaired knowledge of the multiplication table. We next examined potential origins for this impairment.
Cognitive Assessment
We assessed DL’s language abilities, number processing, and verbal memory, in order to examine whether her difficulty in multiplication could be explained by a deficit in any of these mechanisms. As we shall see, DL performed well in all tasks except one.
Reading, Lexical Retrieval, and Symbolic Number Processing
DL performed well in reading words, nonwords, and word pairs (TILTAN battery, Friedmann & Gvion, 2003a), as well as in picture naming (SHEMESH test, Biran & Friedmann, 2004) (not significantly different from controls, Crawford and Garthwaite's one-tailed t-test; Table 2). She also performed well in processing symbolic numbers (digit strings and number words): reading aloud from paper a list of multi-digit numbers with 3-6 digits, repeating the same numbers, and writing numbers (3-5 digits) to dictation as digit strings. Thus, DL had good reading, lexical retrieval, and processing of symbolic numbers. Her difficulty in memorizing the multiplication facts did not originate in any of these processes.
Table 2
Task | No. of items | DL | Control M | Control SD | Control n | t | df | p (one-tailed)ᵃ | zcc |
---|---|---|---|---|---|---|---|---|---|
Reading | | | | | | | | | |
Single words | 136 | 100% | 98.3% | 1.5% | 372 | – | | | |
Nonwords | 40 | 100% | 95.9% | 4.2% | 372 | – | | | |
Word pairs | 30×2 | 96.7% | 97.5% | 2.4% | 372 | 0.33 | 371 | .37 | 0.50 |
Lexical retrieval | | | | | | | | | |
Picture naming | 100 | 97.0% | 97.7% | 1.7% | 87 | 0.41 | 86 | .34 | 0.41 |
Symbolic number processing | | | | | | | | | |
Reading aloud | 120 | 96.7% | 97.2% | 1.3% | 21 | 0.38 | 20 | .36 | 0.38 |
Repetition | 120 | 91.0% | 95.4% | 3.6% | 20 | 1.25 | 19 | .11 | 1.22 |
Dictation | 68 | 97.1% | 97.5% | 2.3% | 20 | 0.21 | 19 | .42 | 0.17 |
Verbal-phonological short-term memory: phonological input buffer | | | | | | | | | |
Word matching spanᵇ | | 5 | 6.33 | 0.98 | 12 | 1.3 | 11 | .11 | 1.36 |
Digit matching spanᵇ | | 7 | 7 | 0 | 10 | – | | | |
Verbal-phonological short-term memory: phonological output buffer | | | | | | | | | |
Digit spanᵇ | | 7 | 7.05 | 1.28 | 29 | 0.04 | 28 | .49 | 0.04 |
Word spanᵇ | | 6 | 5.57 | 0.75 | 35 | – | | | |
Nonword spanᵇ | | 3 | 3.46 | 0.54 | 37 | 0.84 | 36 | .20 | 0.85 |
Nonword reading | 40 | 100% | 95.9% | 4.2% | 372 | – | | | |
Nonword repetition | 48 | 97.9% | 95.4% | 3.5% | 20 | – | | | |
Working memory | | | | | | | | | |
Digit span backwardᶜ | | Max. level = 8 digits | 93rd percentile | | | | | | |
Digit span forward + backᶜ | | Score = 24 | 91st-97th percentile | | | | | | |
2-back | 30 | 87% | 96.4% | 4.0% | 18 | 2.06 | 17 | .03 | 2.35 |
Note. DL = DL's performance, either span or percentage of correct responses. The columns t, df, p, and zcc report the comparison of DL vs. controls.
ᵃNo statistical comparison when DL’s score was identical with, or numerically higher than, the control group. ᵇControl data for these span tasks are taken from Gvion and Friedmann (2012). ᶜThese span tasks and their norms are from WAIS-III (Wechsler, 1997).
Verbal Memory
Short-term memory and working memory
We examined the two components of verbal-phonological short-term memory: the phonological input buffer, the short-term store that maintains auditory verbal information in memory during comprehension; and the phonological output buffer, the short-term store that maintains phonological elements during speech production (Butterworth, 1989, 1992; Dell, 1986, 1988; Franklin, Buerk, & Howard, 2002; Friedmann, Biran, & Dotan, 2013; Friedmann & Gvion, 2002; Garrett, 1976, 1992; Gvion & Friedmann, 2012; Kempen & Huijbers, 1983; Levelt, 1989, 1992; Martin, Shelton, & Yaffee, 1994; Monsell, 1987; Nickels, 1997; Nickels, Howard, & Best, 1997; Patterson & Shewell, 1987; Shallice, Rumiati, & Zadini, 2000; Shallice & Warrington, 1977). We also examined DL’s working memory – the mechanisms that maintain mental representations available for use in thought and action – i.e., situations where the information is both maintained and being used actively (Oberauer et al., 2018).
Span tasks
The phonological input buffer was examined using tasks that required memorizing an auditory input sequence but did not require producing it verbally: (1) Digit matching span – DL heard pairs of digit sequences in increasing lengths, and judged whether the two sequences in each pair have the same digit order (e.g., 3-7; 3-7) or not (3-7; 7-3). (2) Word matching span – same, but with words rather than digits. To examine the phonological output buffer, we used tasks that required both memorizing the auditory input and producing it verbally: (1) reading aloud 40 nonwords (from the TILTAN battery, Friedmann & Gvion, 2003a). (2) Repeating 48 nonwords, some of which were long, and phonologically or morphologically complex (from the BLIP battery, Friedmann, 2003). (3) Serial recall (span) – repeating sequences of digits, words, or nonwords in increasing lengths (from the FriGvi battery, Friedmann & Gvion, 2002; Gvion & Friedmann, 2012; in the words and nonwords tasks, the items in each sequence were phonologically and semantically dissimilar). DL’s performance in all these tasks was good (Table 2), indicating good verbal-phonological short-term memory – i.e., intact phonological input buffer and phonological output buffer.
Working memory was examined using a backward span task (from WAIS-III; Wechsler, 1997): DL heard sequences of digits in increasing length, and repeated each sequence in reverse order. As control, she also performed the WAIS-III forward span task (same, without reversal). She performed well in both tasks (Table 2).
2-back
Another task we used to examine working memory is the 2-back task (from the FriGvi battery, Friedmann & Gvion, 2002, 2003b): DL heard 99 one-syllable animal names, one item per second, and for each item she decided whether it was identical with the item that was 2 places back in the list (e.g., cat – donkey - cat; this was the case for 30 items). DL’s performance was worse than the control group (Table 2).
Note that DL showed poor performance in the 2-back task, but good performance in the backward span task. This discrepancy between the tasks indicates that, although both tasks are typically considered as indexing working memory, they require somewhat different cognitive abilities. We propose that the crucial difference between the two tasks is that the 2-back task, but not the backward span task, induces interference. The 2-back task requires attending to a target item (in our task, the item that appeared 2 positions before in the list), and ignoring temporarily-irrelevant items (i.e., the n-1 item). Crucially, the classification of each item as relevant or irrelevant keeps changing. This is parallel to interference when memorizing multiplication facts, which requires attending to a relevant fact and ignoring temporarily-irrelevant facts, while the classification of each fact as relevant or irrelevant keeps changing. Under this interpretation, DL has normal working memory capacity (demonstrated by her backward span), and her difficulty in the 2-back task resulted from a very specific deficit – hypersensitivity to interference. DL’s difficulty in the 2-back task may resemble the difficulty of DB (De Visscher & Noël, 2013) in the “recent probes” task, in which interference was induced by an item from the previous trial.
Long-term verbal memory
DL also performed two memorization tasks that examined her verbal long-term memory. In the first task she memorized a list of arbitrary words, and in the second task – a short story.
Memorizing a list of words
Rey AVLT test, Hebrew version (Vakil & Blachstein, 1997). The task included 10 sub-tests. The first 5 sub-tests were free recall of the same list of 15 nouns. The 6th sub-test was a free recall of a new list of 15 words, and in the 7th sub-test DL heard nothing and recalled the first list. The two latter sub-tests examine possible effects of interference between the two lists. The 8th sub-test, performed after a 20-minute retention interval that included no verbal tasks, required recalling list #1 again (without hearing it). The 9th sub-test required recognizing the 15 words of list #1 in a list of 50 nouns (the distracters being semantic, phonological, and list #2). Finally, DL was asked to sort the shuffled 15 words of list #1 to their original order. She performed well in all sub-tests (z scores: 0.57, 1.03, 0.25, 1.44, -0.05, -0.68, 0.56, 1.42, 0.73, 0.93), demonstrating good verbal short-term and long-term memory. We also examined 2 additional measures of the Rey AVLT test that specifically consider possible effects of interference. One measure, which aims to tap the effect of proactive interference, is the difference between the 1st and the 6th sub-tests (i.e., how well was list#2 memorized versus how well list#1 was first memorized). DL’s score was slightly low on this measure (z = -1.11, one-tailed p = .13), suggesting perhaps some sensitivity to proactive interference. A second measure, which aims to tap retroactive interference, is the difference between the 5th and 7th sub-test (i.e., the degradation in list#1 following the memorization of list #2). DL’s score in this measure was good (z = 0.77).
Memorizing a short story (Cohen, 1997)
The experimenter read aloud two short stories (about 100 words each), one after another. DL repeated each story immediately after its presentation, and again after 30 minutes (during which other tasks were administered). Her performance was on the 34th-40th percentile both in immediate recall and in delayed recall. Again, this result indicates good short-term and long-term verbal memory.
Interim Summary
DL performed well in the tasks that examined reading, lexical retrieval, verbal memory, and symbolic number processing. Thus, her difficulty in memorizing multiplication facts does not originate in a general deficit of language, number processing, or memory. These results replicate previously-reported dissociations between good memory functions and impaired knowledge of arithmetic facts (Butterworth, Cipolotti, & Warrington, 1996; Kaufmann, 2002).
DL had difficulty in only one memory task – the 2-back task. We proposed that this difficulty actually originated in her hypersensitivity to interference, because the 2-back task may induce a high degree of interference.
Sensitivity to Interference
To examine DL’s sensitivity to interference, we adapted to Hebrew De Visscher and Noël’s (2013) "first name – surname – country" memorization task. The task requires memorizing two lists of verbal, non-numeric facts: one list with high between-item similarity, and another list with low between-item similarity. If DL has hypersensitivity to interference, she should perform better on memorizing the low-similarity list than on memorizing the high-similarity list. This was the pattern exhibited by DB in De Visscher and Noël (2013), and as we shall now see – also by DL.
Method
The task included a list of 12 fictitious person names (first name + surname) and a country in Africa or Asia where each person allegedly lived. Unknown to DL, the 12 names were the mixture of two lists with 6 names in each: in the low-similarity list, each first name and each surname appeared only once. In the high-similarity list, there were only 3 first names and 3 surnames, and each repeated twice to create 6 different combinations. DL memorized the 12-item list in 5 successive and identical learning stages, each of which was administered as follows. First, the experimenter said aloud each item (name-surname-country) and DL repeated it. After completing the list of 12 items, the experimenter said each name-surname (in random order), DL answered where that person lived, and the experimenter corrected her errors. The 5 learning stages were followed by a final test stage: DL was presented with 24 name-surname-country combinations, and judged whether each combination was correct or not. In this final test, each name-surname appeared twice – once with the correct country, and once with the country of one of the 5 other persons in the same list. Both during learning and during testing, two subsequent items never had the same first name, surname, or country. DL's performance in this task was compared with 24 age-matched control participants (mean age = 40;3, SD = 5;2, range = 31;6 to 48;5) with no reported cognitive deficits and with a mean digit span of 7.17 (SD = 1.19). One additional control participant was excluded due to outlier performance (chance level) in the final test. The detailed results of this task (DL and controls) are available in Supplementary Online Material.
Results
Notably, in the final test (Table 3) even the control group was affected by the similarity between list items: their performance in low-similarity items was marginally better than in high-similarity items, paired t(23) = 1.64, one-tailed p = .06, Cohen’s d = 0.41. This effect could also be observed during learning – e.g., in the last learning stage they recalled 4.21 out of 6 low-similarity items (SD = 1.72), but only 2.79 out of 6 high-similarity items (SD = 1.28; paired t(23) = 4.30, one-tailed p = .0001, d = 0.92). These results agree with previous findings of similarity-induced interference in normal population (Corman & Wickens, 1968; Hall, 1971; Mark-Zigdon & Katzoff, 2015; Oberauer, Lewandowsky, Farrell, Jarrold, & Greaves, 2012; Oppenheim, Dell, & Schwartz, 2010; Posner & Konick, 1966; Runquist, 1970, 1971).
Table 3
Similarity between items | DL | Control M | Control SD | t | df | p (one-tailed)ᵃ | zcc |
---|---|---|---|---|---|---|---|
Low | 11 | 10.32 | 1.44 | – | – | – | – |
High | 7 | 9.88 | 1.26 | 2.24 | 23 | .02 | 2.29 |
Note. DL = DL's performance, number of correct responses (out of 12) in the Verbal Memorization Task (Name – Surname – Country). The columns t, df, p, and zcc report the comparison of DL vs. controls.
ᵃNo statistical comparison for low-similarity items, because DL’s score was numerically higher than the control group.
As for DL, she showed a dramatic effect of similarity in the final test: she performed almost at ceiling on the low-similarity list, having only a single error – fewer errors than the control group, and not significantly more than zero errors (Fisher's one-tailed p = .50). Conversely, her performance was poor on high-similarity items – significantly worse than the control group, not significantly different from the chance level of 50% (Fisher’s one-tailed p = .50), and significantly worse than her own performance in the low-similarity items (using the paired bootstrap method described in General Method, one-tailed p = .01, z = 1.83). This pattern meets the criteria for classical dissociation (Crawford, Garthwaite, & Gray, 2003). Crucially, increasing the similarity level (low-similarity list versus high-similarity list) disrupted DL’s performance significantly more than it disrupted the control group’s performance (dissociation analysis of Crawford, Garthwaite, & Porter, 2010: t(23) = 2.15, one-tailed p = .02). These results clearly show that DL was sensitive to the level of item similarity significantly more than the control group. Namely, she had hypersensitivity to verbal interference.
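The two Fisher comparisons above can be checked with a short hypergeometric calculation. The sketch below is illustrative only: the source does not spell out the contingency tables, so we assume that DL's 12 responses per list were compared with a hypothetical reference of 12 responses (0 errors for the low-similarity list; 6 correct and 6 incorrect, i.e. chance, for the high-similarity list). The function name is ours.

```python
from math import comb

def fisher_one_tailed(a, b, c, d):
    """One-tailed Fisher exact p for the table [[a, b], [c, d]].

    Returns the probability, with all margins fixed, that the top-left
    cell is at least as large as the observed value a.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    upper = min(row1, col1)
    return sum(
        comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1)
    ) / total

# Low-similarity list: DL made 1 error in 12 vs. an assumed 0 errors in 12.
print(fisher_one_tailed(1, 11, 0, 12))  # 0.5
# High-similarity list: DL had 7 correct in 12 vs. an assumed chance level of 6 in 12.
print(fisher_one_tailed(7, 5, 6, 6))    # 0.5
```

Under these assumed tables both one-tailed p values come out at .50, matching the values reported in the text.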
Summary of the Assessment Results
DL’s difficulty in solving multiplication facts is best explained as hypersensitivity to verbal interference: she demonstrated this hypersensitivity also in a memorization task that did not involve numbers. In contrast, she showed good general memory, language, and symbolic number processing abilities.
Multiplication Facts Training
We now turn to the main aim of this study – to examine whether our training method would allow DL to learn the multiplication table in spite of her hypersensitivity to interference.
Method
The training program was structured as a pre-training test, a training period, and several post-training tests (left column in Figure 1). The training was done on the 16 multiplication facts with the lowest pre-training scores, which were grouped into 4 sets with 4 facts in each. Each set of facts was trained during one week, and only during this week. After the 4-week training period, a post-training test evaluated DL's knowledge of all multiplication facts. Another test was run after 2 months, during which DL received no training. Throughout this 3-month period, DL was asked not to rehearse multiplication in her spare time, and she reported having followed this instruction. All training and test sessions were performed orally over the telephone, while DL was in a quiet room in her home. In all training and testing, the larger operand always appeared first.
Figure 1
To allow examining the effect of interference, the four sets of trained facts were constructed, unknown to DL, to have different degrees of within-set interference: there were three low-interference sets and one high-interference set. The degree of interference was manipulated by controlling the degree of similarity between the facts in a given set. We predicted that DL would show better memorization of the low-interference sets than of the high-interference set. As we shall see below, this prediction was confirmed, which means that at the end of the study DL knew some multiplication facts but not others. Thus, after completing the study (including the 2-month follow-up test), we taught DL the remaining facts properly (in the low-interference mode), so that by the time she left our lab she knew the multiplication table fully. An additional test of DL’s knowledge was run 3 years later.
Grouping the Trained Facts Into Sets
The 16 trained facts were grouped into 4 sets, with 4 facts per set. The sets differed from each other in the level of within-set similarity, which was computed using De Visscher and Noël’s (2014b) method: first, the similarity between each two multiplication facts was defined as the number of digit pairs that appeared in both facts, irrespectively of the digits’ position in the fact and of the relative order of the two digits. For example, the facts 8×7=56 and 8×3=24 have no common digit pair (only the digit 8 appears in both) so their similarity is 0. The facts 3×4=12 and 3×7=21 have three common digit pairs (1-2, 2-3, and 1-3) so their similarity is 3. Then, the similarity index for a set of 4 facts was computed by summing the pairwise similarities of all 6 fact pairs in the set. Below, in Appendix A, we consider alternative methods to compute similarity and their fit to the observed results.
Of the four sets of facts, one set had high similarity (7×4=28, 7×6=42, 8×4=32, 9×4=36, similarity=9). The three other sets had lower similarities (4×4=16, 8×3=24, 8×7=56, 5×3=15, similarity=0; 8×8=64, 9×7=63, 6×2=12, 8×6=48, similarity=3; 9×6=54, 6×5=30, 8×5=40, 7×5=35, similarity=4).
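The similarity index described above can be sketched in code. This is a minimal illustration, not De Visscher and Noël's implementation: we assume that each fact is reduced to the set of distinct digits appearing in its operands and product, so that two facts sharing k digits have k(k − 1)/2 common digit pairs. The function names are hypothetical. Under this assumption the sketch reproduces the similarity values reported for all four training sets.

```python
from itertools import combinations

def digits_of(fact):
    """Distinct digits in a fact (a, b), taken from both operands and the product."""
    a, b = fact
    return set(f"{a}{b}{a * b}")

def pair_similarity(fact1, fact2):
    """Number of digit pairs that appear in both facts, ignoring position and order."""
    k = len(digits_of(fact1) & digits_of(fact2))
    return k * (k - 1) // 2

def set_similarity(facts):
    """Sum of pairwise similarities over all pairs of facts in the set."""
    return sum(pair_similarity(f, g) for f, g in combinations(facts, 2))

# The four training sets, written as (larger operand, smaller operand).
training_sets = {
    "high": [(7, 4), (7, 6), (8, 4), (9, 4)],
    "low 1": [(4, 4), (8, 3), (8, 7), (5, 3)],
    "low 2": [(8, 8), (9, 7), (6, 2), (8, 6)],
    "low 3": [(9, 6), (6, 5), (8, 5), (7, 5)],
}
for name, facts in training_sets.items():
    print(name, set_similarity(facts))  # high 9, low 1 0, low 2 3, low 3 4
```

Note that counting repeated digits separately (e.g., the two 8s in 8×8=64) would give a different value for the third set, which is why the distinct-digit reading is assumed here.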
The Training Program
Each set of 4 facts was trained during one week, in four sessions of about 5 minutes each, held on four separate days. Each week started with a session that tested DL’s knowledge of the facts that she learned during the previous weeks, and continued with 3 identical training sessions (Figure 1). The high-similarity set was trained in the second week.
Training sessions. The session started with a pretest: the experimenter said each fact and DL said the answer. After responding to all 4 facts, her errors were corrected. Next, 3 memorization-and-recall phases were done. In each phase, the experimenter said the 4 facts (exercise and result, e.g., "four times five, twenty"), presented in the same order in each session, and DL repeated each fact immediately after it was said; then DL recalled the 4 facts in free recall. Errors, “I don’t know” responses, and omissions of facts were corrected immediately. A post-test, administered like the pretest, completed the session. The order of presenting the facts in each session, and DL’s full responses, are available as Supplementary Online Material.
Testing during training. In the first session in each week (except the first week), we tested DL's knowledge of all the facts she had learned since the beginning of the training program. Namely, 4 facts were tested at the beginning of the 2nd week, and all 16 facts were tested at the beginning of the 5th week. In each of these sessions, each fact was presented 3 times in a pseudo-random order, such that the same fact never appeared twice in a row, and the fact a×b was never followed by the facts a×(b±1) or (a±1)×b. Errors were not corrected, and there was no teaching during these test sessions. DL’s full list of responses in these tests is available as Supplementary Online Material.
Testing Before and After the Training
DL's knowledge of the multiplication facts was tested at 4 time points (hereafter, “testing times”): before training, immediately after training, two months after the training ended, and 3 years after the study ended. Each testing time consisted of 3 testing sessions, administered on 3 days of a single week. In each of these testing sessions, DL was asked to solve the 55 multiplication facts. Reaction times were defined as the delay between the time when the experimenter finished asking the question and the time when DL started saying the result. This delay was measured by inspecting the recordings with audio-processing software.
Three kinds of responses were classified as errors: (1) Incorrect responses, even if preceded or followed by a correct response. (2) “I don’t know” responses. (3) Extremely slow responses, which suggest that DL used calculation rather than retrieval: reaction times that exceeded the 75th percentile of the correct-response trials by more than 150% of the inter-quartile range. To compute the outlier threshold, items that were classified as errors by one of the two other criteria were not included. The outlier threshold was computed per testing time, but essentially the same results were obtained when using a common threshold for all testing times.
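The third criterion amounts to a threshold of Q3 + 1.5 × IQR over the reaction times of the trials that were not already errors by the first two criteria. A minimal sketch follows; the function names and the reaction times are hypothetical, and the quartile interpolation method (here the "inclusive" method of Python's standard library) is an assumption, as the text does not specify one.

```python
from statistics import quantiles


def outlier_threshold(correct_rts_ms):
    """Q3 + 1.5 * IQR of the reaction times of correct, non-'don't know' trials."""
    q1, _, q3 = quantiles(correct_rts_ms, n=4, method="inclusive")
    return q3 + 1.5 * (q3 - q1)


def is_error(correct, dont_know, rt_ms, threshold_ms):
    """A trial is an error if it is incorrect, 'I don't know', or too slow."""
    return (not correct) or dont_know or rt_ms > threshold_ms


rts = [1000, 1200, 1400, 1600, 1800]   # hypothetical reaction times (ms)
threshold = outlier_threshold(rts)      # 1600 + 1.5 * 400 = 2200.0
print(is_error(True, False, 2500, threshold))  # True: correct but too slow
```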
In each testing time, DL was tested 3 times on all facts, yielding an accuracy score between 0 and 3 for each fact. The 16 facts with lowest pre-training scores were selected for training (10, 3, and 3 facts with score = 0, 1, and 2, respectively).
In addition to these four testing times, we also examined DL’s performance in the testing sessions that were administered during the training period, in the first session in each week. In each of these sessions, DL was tested on the facts that she had already learned (i.e., on 4 facts at the beginning of the second week, and on 16 facts at the beginning of the fifth week). From each session we considered only the facts learned in the previous week. Below, we refer to these data as “testing during training”. [iii]
Results
Effectiveness of Training: Performance Over all Facts
In the rule-based facts (N×0 and N×1), DL made only a single error in the pre-training test and in the 3-year follow-up, and no errors in the other testing times. Table 4 shows her error rates in the non-rule facts. Outlier reaction times of the non-rule facts were classified as errors, as explained in the previous section (“Testing Before and After the Training”): responses slower than 2408 ms in the pre-training test, 3265 ms in the post-training test, 4010 ms in the 2-month follow-up test, and 2380 ms in the 3-year follow-up test. To examine the effect of training, we compared the pre-training performance with each of the other testing times using the paired bootstrapping method described in General Method, based on DL’s pre-training and post-training scores (a 0-3 score for each fact).
Table 4
Test | Before training: % correct | During training: % correct | During training: zbs | After training: % correct | After training: zbs | 2-month follow-up: % correct | 2-month follow-up: zbs | 3-year follow-up: % correct | 3-year follow-up: zbs
---|---|---|---|---|---|---|---|---|---
All 36 facts | 56 | | | 75*** | 2.87 | 81*** | 3.72 | 62 |
16 trained facts | 21 | 79*** | 3.41 | 56*** | 2.84 | 65*** | 3.18 | 35** | 1.94
20 untrained facts | 83 | | | 90 | | 95** | | 83 |
Note. zbs = Bootstrap effect size.
**p < .01 (one-tailed). ***p < .001 (one-tailed). Comparison with pre-training test.
DL’s performance in the non-rule facts (Table 4) significantly improved from the pre-training test to the post-training test, i.e., the training was effective. As predicted, this overall improvement was driven by a significant improvement in the trained facts, with no significant improvement in the untrained facts [iv]. Strikingly, DL’s relatively short training program succeeded where years of schooling did not: it gave rise to an improvement in the trained facts. This improvement persisted even after 3 years during which she received no additional training (Table 4), and – according to her – did not use the multiplication table very often. Still, unsurprisingly, her knowledge of the trained facts after 3 years was not as good as at the end of the training program (comparing the two using the bootstrapping method described above, one-tailed p = .03, z = 1.69).
Effect of Within-Set Interference on the Post-Training Knowledge
Our main question was whether we can detect differences in the training effect between the facts that were trained together with highly-similar facts and the facts that were trained with dissimilar facts. Figure 2 presents DL’s detailed performance for each exercise in each of 4 testing times – before training, during training (on the testing sessions at the beginning of each week), after training, and in the 2-month follow-up [v]. The figure clearly shows an effect of similarity: both during training and in the post-training tests, accuracy was better in the sets with lower within-set similarity than in the high-similarity set. To examine whether this effect was significant we considered, for each testing time, the within-set similarity values corresponding to each of DL’s answer attempts. We classified these values into two groups according to her answer (correct or incorrect), and compared the two groups using the Mann-Whitney U test (the effect size r was computed according to Fritz, Morris, & Richler, 2012). The correct answers had lower similarity values than the incorrect answers, significantly so during training and in the 2-month follow-up, indicating that lower within-set similarity in the training period improved subsequent accuracy (during training: U = 94, one-tailed p = .006, r = .36; after training: U = 217.5, one-tailed p = .08, r = .20; two-month follow-up: U = 161.5, one-tailed p = .01, r = .33; i.e., a medium effect size during training and in the 2-month follow-up). To eliminate any possible effect of prior knowledge, we also ran the same analysis only on the 9 exercises with pre-training score = 0. The results were essentially the same: for the tests administered during training, this analysis showed a slightly stronger effect than in the previous analysis (U = 35.5, one-tailed p = .01, r = .43). For the tests administered after training, the analysis now showed a slightly weaker effect than before, which was marginally significant (immediately after training: U = 58, one-tailed p = .08, r = .27; two-month follow-up: U = 64, one-tailed p = .09, r = .26).
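The comparison of similarity values between correct and incorrect answer attempts can be sketched as follows. This is an illustration under stated assumptions, not the analysis code of the study: the data are hypothetical, the p value uses the normal approximation with the standard correction for ties, and the effect size is taken to be r = z / √N, which is how we read the reference to Fritz et al. (2012).

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist


def mann_whitney(correct_sims, incorrect_sims):
    """One-tailed test that similarity is lower for correct answers."""
    n1, n2 = len(correct_sims), len(incorrect_sims)
    n = n1 + n2
    # U counts (correct, incorrect) pairs in which the correct answer has the
    # higher similarity value; tied pairs count as one half.
    u = sum((x > y) + 0.5 * (x == y)
            for x in correct_sims for y in incorrect_sims)
    # Normal approximation with the standard correction for tied values.
    ties = Counter(list(correct_sims) + list(incorrect_sims)).values()
    tie_term = sum(t ** 3 - t for t in ties) / (n * (n - 1))
    sd = sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    z = (n1 * n2 / 2 - u) / sd
    p_one_tailed = 1 - NormalDist().cdf(z)
    r = z / sqrt(n)  # assumed effect-size formula
    return u, z, p_one_tailed, r


# Hypothetical within-set similarity values, one per answer attempt.
correct = [0, 0, 0, 3, 3, 4, 4, 9]
incorrect = [3, 4, 9, 9, 9]
u, z, p, r = mann_whitney(correct, incorrect)
print(u, round(z, 2), round(p, 3), round(r, 2))
```

The tie correction matters here because the similarity index takes only four distinct values (0, 3, 4, and 9), so most pairs of answer attempts are tied.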
Figure 2
These results were not an artifact of problem size (better knowledge of facts with smaller operands, Zbrodoff & Logan, 2005). First, a fact’s average operand size did not correlate with the within-set similarity level (r = .20, p = .46). Second, when directly comparing operand size and similarity as explanations for DL’s performance during and after training, similarity stands out as a genuine predictor of accuracy. This was examined by submitting DL’s accuracy scores to a logistic regression with two predictors: the average operand size and the within-set similarity level – 1, 2, or 3 (weeks 3 and 4, whose similarities were nearly identical, were both assigned similarity level 2). Both during training and in the 2-month follow-up test, the within-set similarity had a significant effect on accuracy (during training: z = 2.59, one-tailed p = .005; follow-up: z = 2.23, one-tailed p = .01), whereas the problem size had a smaller or non-significant effect (during training: z = 0.95, one-tailed p = .17; follow-up: z = 1.73, one-tailed p = .04). Namely, the effect of similarity could not be reduced to a problem size effect. In the post-training test, however, the regression showed a significant effect only for problem size (z = 2.46, one-tailed p = .007), with no significant effect of similarity (z = 0.76, one-tailed p = .22).
A second analysis ignored the specific similarity values, and just compared the per-fact accuracy after training (correct or incorrect for each answer attempt) between the high-similarity set (H) and the three lower-similarity sets (L). The comparison used the unpaired bootstrapping method described in General Method, based on the 0-3 score of each fact (the random distribution for H0 was computed by considering all possible classifications of the 16 facts into arbitrary “H” and “L” groups containing 4 and 12 facts respectively). The effect of similarity was significant in the tests administered during training (H: 50%, L: 89%, one-tailed p = .003, z = 1.85), in the post-training test (H: 42%, L: 61%, p = .03, z = 0.81), and in the two-month follow-up test (H: 17%, L: 81%, p < .001, z = 2.81). These results survived the exclusion of the three facts with the highest pre-training scores (during training: one-tailed p = .003, z = 1.47; post-training: p = .02, z = 0.55; two-month follow-up: p < .001, z = 2.42).
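The null distribution described in the parentheses above, which enumerates every classification of the 16 facts into 4 "H" and 12 "L" facts, can be sketched as an exhaustive permutation test. The scores below are hypothetical, and the test statistic (the total score of the H facts) is an assumption chosen for illustration; the bootstrapping procedure of General Method, including the z effect size, is not reproduced here.

```python
from itertools import combinations


def exhaustive_split_test(scores, high_idx):
    """One-tailed p value for 'the H facts have lower scores than the L facts'."""
    observed = sum(scores[i] for i in high_idx)
    # Enumerate every way of labelling len(high_idx) of the facts as "H".
    null_totals = [sum(scores[i] for i in split)
                   for split in combinations(range(len(scores)), len(high_idx))]
    # The grand total is fixed, so a lower H total is equivalent to a larger
    # L-minus-H difference in mean score.
    p = sum(total <= observed for total in null_totals) / len(null_totals)
    return p, len(null_totals)


# Hypothetical 0-3 scores: the first 4 facts play the role of the H set.
scores = [1, 0, 2, 1, 3, 3, 2, 3, 3, 3, 2, 3, 3, 3, 1, 3]
p, n_splits = exhaustive_split_test(scores, high_idx=(0, 1, 2, 3))
print(n_splits)  # 1820 possible classifications, i.e., C(16, 4)
```

With only 1820 possible classifications, full enumeration is feasible and no random resampling is needed; the smallest attainable p value is 1/1820.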
A third analysis examined how quickly DL learned each fact until reaching ceiling performance. Each fact’s learning duration was defined as the last answer attempt, out of the fact’s 12 answer attempts (right columns in Figure 2), in which DL made an error (here we did not code slow answers as errors, because most answer attempts during training were made as free recall, in which the response time is undefined). The 4 facts in the high-similarity set were the slowest to be learned – they had the longest learning durations (p < .001).
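As a small illustration of this definition, the learning duration of a fact can be read off its sequence of answer attempts. The function name and the data are hypothetical, and assigning a duration of 0 to a fact with no errors is an assumption.

```python
def learning_duration(attempt_is_error):
    """1-based index of the last erroneous attempt; 0 if there was no error."""
    duration = 0
    for attempt, is_error in enumerate(attempt_is_error, start=1):
        if is_error:
            duration = attempt
    return duration


# Hypothetical fact with 12 answer attempts: errors on attempts 1, 2, and 5.
attempts = [True, True, False, False, True] + [False] * 7
print(learning_duration(attempts))  # 5
```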
Taken together, these results form a clear picture: DL’s learning of the multiplication facts was strongly affected by similarity. The effect of similarity was clearly observed during the 4-week training period, and critically – also in the test administered two months after DL’s training had ended. In the test administered immediately after the 4-week training period, the effect of similarity was observed only in one of the two analyses. The weak similarity effect immediately after the 4-week training period can be explained by the fact that in this test, the between-fact variance in the time elapsed since each fact was learned was the highest: DL had learned the facts of set #4 only a few days before the post-training test, but she had learned the facts of set #1 almost 4 weeks before the post-training test. The idea of time-elapsed-since-learning as a confounding factor is supported by comparing the results of the immediately-after-training test with those of the two other post-training tests (during training, 2-month follow-up): in the immediately-after-training test, DL performed more poorly for facts that were learned more recently. Whether this effect is random or has a genuine cognitive origin remains an open question, as the amount of data in the present study is insufficient to evaluate the effect reliably.
The effect of similarity did not go unnoticed by DL herself: during the 2nd week of training, when she learned the high-similarity set, she commented more than once that "it is hard for me to learn these exercises because of all these 4's that repeat over and over again" – an accurate description of her hypersensitivity to interference.
Pre-Training Scores
If DL’s learning is affected by the similarity between facts, this should also be reflected in her performance in the pre-training test, because even before we started the training program, DL’s knowledge of specific arithmetic facts was presumably affected by similarity-induced interference (De Visscher & Noël, 2014b). In particular, we predicted that she would show lower pre-training knowledge for multiplication facts that have higher similarity with the rest of the multiplication table.
To examine this prediction, we considered each of DL’s answer attempts in the pre-training test for all 36 non-rule facts. For each answer attempt, we computed the fact’s similarity with the rest of the multiplication table (hereafter denoted Sim) as the sum of that fact’s similarities with all other multiplication facts between 2×2 and 9×9 (the similarity between each pair of facts was defined as explained in the “Grouping the Trained Facts Into Sets” section). We predicted that DL’s pre-training knowledge would be better for facts with lower fact-table similarity (lower Sim). Namely, the Sim values should be larger for DL’s incorrect pre-training answers than for her correct answers. To assess whether this was the case, we used the unpaired bootstrapping method described in General Method. The difference between Sim of correct answers (15.25) and that of incorrect answers (18.19) was significant (one-tailed p = .03, z = 1.83). Essentially the same results were obtained when the fact-table similarity computation excluded the tie problems (Sim = 13.78 versus 16.21, one-tailed p < .05, z = 1.72). Namely, as predicted, DL’s pre-training knowledge was better for multiplication facts that were less similar to the rest of the multiplication table.
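The fact-table similarity Sim can be sketched by summing a fact's pairwise similarity with every other fact in the table. This is an illustration under assumptions that the text does not settle: the table is taken to contain the 36 non-rule facts with one entry per commutative pair (a ≤ b), and a repeated digit within a fact is counted once. The names are hypothetical, and the sketch is not claimed to reproduce the Sim values reported above.

```python
def fact_digits(a, b):
    """Distinct digits of the fact a x b = product (assumption: repeats count once)."""
    return set(f"{a}{b}{a * b}")


def pair_similarity(fact1, fact2):
    """Number of unordered digit pairs that appear in both facts."""
    shared = fact_digits(*fact1) & fact_digits(*fact2)
    return len(shared) * (len(shared) - 1) // 2


# Assumed table: the 36 non-rule facts between 2x2 and 9x9, with a <= b.
NON_RULE_FACTS = [(a, b) for a in range(2, 10) for b in range(a, 10)]


def table_similarity(fact, table=NON_RULE_FACTS):
    """Sim: sum of the fact's similarities with all other facts in the table."""
    return sum(pair_similarity(fact, other) for other in table if other != fact)


sim_by_fact = {fact: table_similarity(fact) for fact in NON_RULE_FACTS}
print(len(sim_by_fact))  # 36 facts
```

Excluding the tie problems (a = b), as in the second analysis, would only require filtering them out of the table passed to `table_similarity`.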
Discussion
Hypersensitivity to Interference as a Source for Difficulty in Memorizing Arithmetic Facts
We reported the case of DL, a 40-year-old woman with severe difficulties in memorizing the multiplication table. DL had several spared memory functions: her short-term memory spans were in the normal range, she performed well in nonword reading and repetition, and she showed good ability to remember arbitrary lists of words and the details of a story for several tens of minutes. These results indicate good verbal short-term and long-term memory abilities, i.e., DL's difficulties in memorizing multiplication facts did not stem from a general deficit in verbal memory. This pattern is consistent with the finding of a double dissociation between memory capacity and knowledge of arithmetic facts (Butterworth et al., 1996; Kaufmann, 2002), and with the finding that short-term memory capacity and arithmetic fact knowledge do not correlate (Temple & Sherwood, 2002).
In contrast, DL performed poorly in a task that taps hypersensitivity to interference: when asked to memorize verbal non-numeric items, she performed poorly only in items that were similar to each other. Moreover, among the tasks that assessed working memory, DL’s performance was poor only in the task that induced interference by repeating the same words over and over again (the 2-back task). Thus, her difficulty in multiplication is best explained as resulting from hypersensitivity to interference.
Our study aimed to confirm this conclusion by showing, for the first time, causal evidence for a relation between hypersensitivity to interference and difficulty in multiplication facts. Our second goal was clinical – to develop a method to help individuals who are hypersensitive to interference to learn the multiplication table. To accomplish these aims, we devised a training method that manipulated the degree of interference. The method was clearly successful: DL managed to memorize multiplication facts as long as in a given week, she only had to learn facts that were relatively dissimilar from each other. In this condition, her learning was virtually immediate: in the set with lowest similarity, she reached perfect performance after merely two (!) exposures to each fact. This good memorization of multiplication facts was exhibited during the training sessions, and even two months after DL’s training had ended. In marked contrast, she had difficulty in the set with high similarity between facts: she made many errors while learning, her scores in the weekly tests during training were hardly any better than the pre-training score, and this small improvement virtually disappeared two months later. We showed that this effect of similarity could not be explained as an artifact of problem size.
These results extend the findings of De Visscher and Noël (2013) in two ways. First, the hypersensitivity to interference of the woman they reported, DB, was manifested mostly in slow retrieval of multiplication facts, whereas DL showed not only slow RTs but actually erred in almost half of the multiplication facts (even when ignoring the RT outliers, DL’s error rate in non-rule facts before the training program was 39%). Second, whereas De Visscher and Noël’s evidence for interference as the source of multiplication difficulty was correlational, here we showed evidence for a causal relation: manipulating the amount of similarity-induced interference affected the memorization of multiplication facts.
The Cognitive Mechanisms Underlying Sensitivity to Interference
Our findings clearly show that DL’s difficulty in multiplication was the result of hypersensitivity to interference. Still, to understand the exact origin of interference, we need to understand the cognitive architecture that underlies the representation of arithmetic facts, and the process that is sensitive to interference.
The Processing Stage Sensitive to Interference
According to Campbell’s network interference model, interference arises because one multiplication fact is associated with another, so they activate overlapping representations (Campbell, 1987, 1995; Oberauer, 2009). When the facts are similar to each other, irrelevant associations between them are strengthened. Furthermore, processing different facts in temporal proximity may strengthen the irrelevant associations between these facts. In our experiment, manipulating similarity and temporal proximity during learning affected how well DL learned the facts. Importantly, the effect of similarity could reliably be attributed to processes that occur during learning time, because similarity and temporal proximity were not manipulated when DL’s knowledge was tested (but note that the learning sessions included not only encoding and storage but also retrieval). Additionally, Campbell (1987) showed that arithmetic performance was affected by manipulating the degree of interference during retrieval, for arithmetic facts that the participants already knew beforehand. It therefore seems that interference affects both learning-time and retrieval-time processes. Indeed, other studies have also shown that high levels of interference may take effect at different processing stages (Bartko, Cowell, Winters, Bussey, & Saksida, 2010; Farrell, 2006; Fernandes & Moscovitch, 2000; Kaufmann, Lochy, Drexler, & Semenza, 2004; Lochy, Domahs, & Delazer, 2004; Van Dyke & McElree, 2006; Wixted, 2004). Interference may disrupt the encoding and storage of data in memory while learning the facts (Farrell & Lewandowsky, 2002; Lewandowsky & Farrell, 2008), and it may disrupt the retrieval stage (Burgess & Hitch, 1999; Campbell, 1987; Henson, 1996, 1998).
A more specific question concerns the locus of DL’s deficit. Her hypersensitivity to interference may have resulted from an encoding/storage deficit in creating the network of associations in long-term memory – e.g., she may have been creating irrelevant associations that were too strong. Alternatively, her deficit may lie in impaired retrieval processes, which fail to retrieve the arithmetic facts from an intact storage. Sadly, it seems that our data cannot arbitrate between these possibilities. Within the framework of the network interference model, retrieving the correct answer to a multiplication problem requires that the representation of the fact is sufficiently distinct from other facts. When this requirement is not met, incorrect responses occur. This can happen because the storage of facts is corrupted such that the facts are not sufficiently distinguishable; but it can also happen because the retrieval process is impaired and requires higher distinctiveness for successful retrieval. Under either assumption, increasing the distinctiveness of facts (as our manipulation probably did) would increase the activation of the correct solution relative to incorrect solutions, and would therefore help achieve the required threshold of distinctiveness. In short, both possibilities (storage deficit and retrieval deficit) make similar predictions about DL’s performance. Note, however, that from a clinical/intervention point of view, the picture emerging from our study is clear: a learning-time intervention can help overcome similarity-induced interference.
Interference and Spacing
De Visscher and Noël (2014b) described interference as an effect of previously-learned items on the ability to learn a new item. Consequently, they defined an “interference parameter” for each multiplication fact as its degree of similarity versus all previously-learned facts. This formulation accords with the definition of proactive interference in working memory (Bennett, 1975). However, in the present study, it is impossible to define a clear order of learning the facts, because several multiplication facts were learned simultaneously during each given week (this is probably the case also in real-life situations of learning the multiplication table). The interference here does not arise from the similarity with previously-learned items, but from the similarity with simultaneously-learned items.
Our data suggest that the critical methodological factor is to create a sufficient temporal gap between the learning of similar multiplication facts. Future studies may examine the size of the temporal gap required to avoid interference. This idea of temporal separation accords with other studies showing that learning is improved by increasing the temporal delay between learning sessions that contain potentially-interfering items (Friedman & Korman, 2016).
The Type of Information Sensitive to Interference
The effect of similarity on the size of interference accords with the view of interference as arising from the amount of overlapping features between the items to be remembered (Oberauer & Lange, 2008). But what are these features? The representations sensitive to interference could be phonological (Baddeley, 1966, 1968; Farrell, 2006; Nelson et al., 1974; Runquist, 1970), semantic (Baddeley, 1966; Oppenheim et al., 2010), number-specific, or another representation.
In line with the possibility of phonological sensitivity-to-interference, the speed and accuracy of addition fact retrieval was shown to be affected by phonological similarity (Noël, Désert, Aubrun, & Seron, 2001). Further support for the role of phonological interference comes from studies of non-number words, which show that word memorization is affected by their phonological similarity to each other (Nelson et al., 1974; Pajak et al., 2016; Runquist, 1970). However, interpreting these findings as an explanation for difficulties in memorizing multiplication facts should be done with caution, because at least some phonological mechanisms treat words and numbers differently: e.g., the speech mechanisms handle words as sequences of phonemes, but number words as whole building blocks – in speech production (Bencini et al., 2011; Cohen, Verstichel, & Dehaene, 1997; Dotan & Friedmann, 2015, 2019; Shalev, Ophir, Gvion, Gil, & Friedmann, 2014), and apparently also in speech comprehension (Fischer-Baum, Mis, & Dial, 2018). Furthermore, the representation of multiplication facts in memory is apparently not purely phonological (Whalen, McCloskey, Lindemann, & Bouton, 2002).
In order to fully understand sensitivity to interference, and the effect of similarity, one must use a detailed cognitive model of arithmetic fact representation. Only a detailed model can specify the precise amount of representational overlap between some given facts, and the amount of interference between them. Consequently, each such model can predict how easy it should be to learn a given set of multiplication facts, so the model can be evaluated by comparing these predictions against the actual learnability of facts. In Appendix A we describe 7 detailed cognitive models of arithmetic fact representation, and we evaluate them based on their ability to predict several aspects of DL’s performance. This analysis lends support to the models that assume symbolic representations (digits / number words) of arithmetic facts, rather than magnitude representation – in accord with several studies indicating that multiplication facts are stored in verbal format (Dehaene, 1992; Dehaene & Cohen, 1995; Dehaene et al., 2003, 1999). Future studies may elaborate further on the precise representation of arithmetic facts.
Clinical and Pedagogical Implications
The clinical goal of this study was to examine whether a person can learn the multiplication table even when they have hypersensitivity to interference. Our results clearly indicate that they can: as long as we maintained a low level of interference, DL easily learned the multiplication facts. This is not trivial: conceivably, one could hypothesize that learning a sequence of facts such as 7×4, 8×4, and 9×4 would actually be easier – for example, it may more transparently lead to an addition-based strategy as scaffold for multiplication. The finding that such a set was harder to memorize, in spite of the opportunity for scaffold strategies, emphasizes even further the importance of similarity as a factor that determines the difficulty of memorization, at least for individuals with hypersensitivity to interference.
Our training method was effective for DL, a woman with hypersensitivity to interference, but its clinical implication may be most relevant for children who learn the multiplication table at school, many of whom may have normal sensitivity to interference. Will the same method be effective for all children, including children without hypersensitivity to interference? The findings of similarity-induced interference in Campbell (1987), whose participants were not screened for sensitivity to interference, suggest that the answer to this question is affirmative.
Additional support for our conclusions comes from another study that showed causal evidence for similarity-induced interference in typically-developing children (Mark-Zigdon & Katzoff, 2015). Similarly to our study, Mark-Zigdon and Katzoff examined how the memorization of multiplication facts is affected by manipulating the interference level. They taught a group of typically-developing 3rd grade children a set of 10 new multiplication facts, and showed that the children's memorization of these facts was disrupted if interference was induced by teaching a new set of multiplication facts immediately after the first set. Thus, like us, Mark-Zigdon and Katzoff showed that high-interference conditions disrupted memorization of multiplication facts. Still, each of the two studies highlights a slightly different aspect of interference: our study highlights the importance of low interference within a set of learned facts; Mark-Zigdon and Katzoff’s study highlights the importance of avoiding interference from out-of-set facts. Together, the two studies support what we described in the Introduction as the two foundations of an interference-reducing training method: grouping dissimilar facts when teaching, and teaching different sets of facts with a sufficient temporal gap between them.
Our findings directly bear on the recommended practices for teaching the multiplication table. At least for individuals with hypersensitivity to interference, it seems that we should teach dissimilar rather than similar facts simultaneously. This is almost the opposite of how multiplication is typically taught at school: very often, children learn the multiplication table in an ordered manner – first the products of 2, then the products of 3, etc. Although this ordered teaching method may have advantages, it implies that the children learn similar facts simultaneously, and this increases the degree of interference and may therefore create difficulty. Future studies may directly compare the traditional teaching method versus a low-similarity teaching method.