Notations play a crucial role in the development and advancement of mathematics and science. Some of the oldest records of human writing are notches on bones, a simple notation to keep track of quantities. Today, it is difficult to imagine a world without numerals: we use the decimal place-value notation based on the ten digits from ‘0’ to ‘9’ in our daily lives and they form the basis of our communication and commerce. But what features do good notations have? Here we investigate one possible answer to this question, by exploring whether symbols that share some of the properties of the mathematical concepts they represent are 'better' than those which do not.
The American philosopher, logician, and mathematician Charles Sanders Peirce (1894, 1902) famously distinguished between iconic signs, “which serve to convey ideas of the things they represent simply by imitating them” (1894, p. 5), and symbolic signs, in which the relation between the sign and its representation is purely arbitrary, stipulated by convention. For example, if we represent the fact that we have four oranges by ‘||||’, then we just need to count the number of strokes in the tally-notation to obtain the quantity it represents. In contrast, nothing in the symbol ‘4’ gives us a hint about the quantity it represents. Through education, we have learned to associate this particular sign with the number four, but in principle we could have used any other sign for this purpose (for an overview of the historical development of mathematical notation see Cajori, 1993 and Mazur, 2014).
Our current notations for natural numbers and basic arithmetic operations, such as addition, subtraction, multiplication and division, seem to use only symbolic signs for the operators (e.g., ‘+’, ‘−’, ‘×’, ‘/’). Could the use of more iconic signs for mathematical operators contribute to the effectiveness of notations? Our goal in this paper is to investigate this possibility. We begin by reviewing one early articulation of this hypothesis, due to Christine Ladd.
Christine Ladd (1847-1930) was the first woman to complete all requirements for a Ph.D. in mathematics and logic at Johns Hopkins University in 1883, where she studied with J.J. Sylvester and C.S. Peirce. However, since the university did not officially admit women at the time, she was not granted her Ph.D. until 1926. From 1884 onwards she was known as Christine Ladd-Franklin. In addition to mathematics and logic, she also worked on experimental psychology and the theory of color vision (Cadwallader & Cadwallader, 1990).
Our interest here concerns Ladd’s views on notational choices in mathematics. While the choice of individual signs in a symbolic notation (which we refer to as ‘symbols’) is purely conventional, the question arises whether some symbols are better suited than others for expressing certain contents. In other words, can we find some effect that the shape of a symbol has on the way it is used? If so, such cognitive considerations could be used to guide the choice of individual symbols.
Ladd's hypothesis regarding the relation between a symbol and its meaning was first articulated in the early literature on notations for logic. In a discussion of different systems of logic, she compared Peirce’s symbol for implication, ‘−<’, with that of Hugh MacColl, who used a colon, ‘:’, and remarked: "The copula ‘−<’ has an advantage over the colon ‘:’ in that it expresses an unsymmetrical relation by an unsymmetrical symbol" (Ladd, 1883, pp. 24–25). In this remark she was drawing attention to the fact that the implication relation is not symmetric, because ‘A implies B’ means something different from ‘B implies A’. From her comparison of the copula and the colon, we can further infer that Ladd understood ‘symmetric’ in regard to symbols as a reflectional symmetry along a vertical axis.
Although Ladd made her remark in the context of symbols for logical relationships such as implications, biconditionals, and so on, the issue also applies to binary operations, which are perhaps more pertinent to school-level mathematics education. Binary operations can be understood as mathematical rules for combining two elements of a given set to produce a third. For example, the addition of integers is a simple binary operation: two integers are combined to produce a third. Moreover, addition is a symmetric, or commutative, binary operation: for any two integers a and b, a + b = b + a. Not all binary operations are commutative. For instance, subtraction is not: in general a – b ≠ b – a.
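As a minimal illustration of the definition above, the following Python sketch checks commutativity over a small finite sample of integers. The helper name `is_commutative` and the sample range are illustrative choices and not part of the studies reported here. A finite check of this kind can refute commutativity, but it cannot prove it in general.

```python
import operator
from itertools import product


def is_commutative(op, values):
    """Return True if op(a, b) == op(b, a) for every pair drawn from values."""
    return all(op(a, b) == op(b, a) for a, b in product(values, repeat=2))


# A small finite sample of integers (an arbitrary, illustrative choice).
sample = range(-5, 6)

print(is_commutative(operator.add, sample))  # True
print(is_commutative(operator.mul, sample))  # True
print(is_commutative(operator.sub, sample))  # False, e.g. 1 - 2 != 2 - 1
```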
In view of Ladd's remark concerning logical notation, it seems reasonable to suppose that she would have applauded the symmetrical nature of the addition symbol, but criticised the symmetry of the symbol for subtraction. In sum, in the context of binary operations we can formulate the following principle that underlies Ladd’s assessment: Commutative operations should be expressed by symmetric symbols. Following this principle would render the operation symbol iconic of the mathematical property it represents. Note that symbols can be symmetric along one or more lines of symmetry. Here, like Ladd, we are particularly concerned with those that exhibit reflectional symmetry along the vertical axis. This is because binary operations in mathematics, like written English, are usually displayed horizontally from left to right (e.g., a + b = c). A symbol with a vertical line of symmetry might therefore plausibly convey the possibility of swapping the different sides of that line of symmetry (e.g., that the elements a and b could be switched in the equation above). In contrast, a symbol that is symmetric only along the horizontal line of symmetry (like the symbol for the less-than relation, ‘<’) does not, in European languages such as English at least, offer the same visual affordance, as the elements being combined by the binary operation are not written above and below the symbol.
Symbolising Commutative Operations
It is notable that symbols typically used to represent binary operations do not always follow Ladd’s principle. While the choice of our symbol for addition, ‘+’, does (the symbol is symmetric and the operation is commutative), the symbol for subtraction does not. Subtraction is not commutative but the symbol ‘−’ is nevertheless symmetric. Similarly, some of the symbols we use to represent the (non-commutative) division operation are asymmetric, whereas others are not (compare ‘10/2’ with ‘10÷2’). Analogous observations can be made about advanced mathematics. For instance, introductory abstract algebra textbooks typically represent an arbitrary group operation (which is not in general commutative) with a symmetric symbol such as ‘✭’ or ‘•’ (e.g., Dummit & Foote, 2004; Smith, 2015). Similarly, in linear algebra the non-commutative matrix multiplication operation is typically represented by a symmetric symbol such as ‘⊙’ (Allenby, 1995), or simply by (symmetric) juxtaposition.
Indeed, an analysis of mathematical typesetting practice reveals that the majority of symbols used to represent binary operations are symmetric. Of the 59 primary symbols used to represent binary operations in LaTeX, a common mathematical typesetting language, 45 (76%) are symmetric along their vertical axes (we considered those symbols available in LaTeX by default or within the American Mathematical Society’s amssymb package, Pakin, 2017, Tables 50-51). But binary operations are not in general commutative.
Ladd did not elaborate what advantages she hypothesised would be gained by following the principle that commutative operations should be expressed by symmetric symbols. Neither did Schröder (1890, p. 119), who similarly criticised “the unsymmetrical presentation of so many symmetric relations” in natural language. It seems plausible to assume that both Ladd’s and Schröder’s views were based upon a belief that an alignment between the visual properties of a symbol and the formal properties of the mathematical relation it expresses would have a cognitive effect of some sort, perhaps that such an alignment would make it easier to associate the symbol with the particular relation in question. This, in turn, might be expected to lead to more fluent mathematical performance. Earlier in the 19th century Charles Babbage (1827, p. 370) formulated a similar idea: “The advantage of selecting in our signs, those which have some resemblance to, or which from some circumstance are associated in the mind with the thing signified, has scarcely been stated with sufficient force: the fatigue, from which such an arrangement saves the reader, is very advantageous to the more complete devotion of his attention to the subject examined”.
To our knowledge, despite the long history and apparent plausibility of Ladd’s hypothesis, it has never been empirically tested. This is what we set out to do in the experiments reported here. We derived two research questions from Ladd’s hypothesis: (1) Are symmetric symbols intuitively associated with commutative binary operations? and (2) Is the use of iconic symbols advantageous for students’ mathematical engagement with binary operations?
In this first study, we investigated whether the use of vertically symmetric symbols in mathematical statements depicting binary operations would bias people to endorse the commutativity of the operation. By comparing how often participants endorsed commutativity for symbols with vertical symmetry (which follow Ladd’s hypothesis) and for symbols with horizontal symmetry (which do not), we were able to test whether vertically symmetric symbols are interpreted iconically, in the sense that they are intuitively associated with commutative binary operations.
We designed a simple task to test people’s intuitive judgement about the mathematical properties of a binary operation while we varied the visual properties of the symbol used. We used a range of arbitrary symbols to depict a binary operation (e.g., ‘3 ♠ 4’). Participants were asked whether they believed a statement about the commutativity of these operations to be true or false (e.g., ‘3 ♠ 4’ is equal to ‘4 ♠ 3’). We presented the operations once with the symbols having a vertical axis of symmetry (symmetrical as understood by Ladd) and once with the symbols rotated by 90° to change the symmetry axis to be horizontal.
The intended sample size and analysis plan were preregistered prior to data collection (see Supplementary Materials).
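The symmetry manipulation can be sketched on a toy bitmap. The glyph below is a made-up three-by-three pattern, not one of the symbols used in the study, and the function names are illustrative. The sketch only shows that a quarter-turn rotation turns a symbol with a vertical axis of symmetry into one with a horizontal axis of symmetry.

```python
def rotate_quarter_turn(rows):
    """Rotate a bitmap (list of equal-length strings) by 90 degrees clockwise."""
    return ["".join(column) for column in zip(*rows[::-1])]


def has_vertical_axis_symmetry(rows):
    """True if the bitmap equals its left-right mirror image."""
    return rows == [row[::-1] for row in rows]


def has_horizontal_axis_symmetry(rows):
    """True if the bitmap equals its top-bottom mirror image."""
    return rows == rows[::-1]


# Hypothetical glyph: '#' is ink, '.' is background.
GLYPH = [
    "#.#",
    "#.#",
    ".#.",
]
rotated = rotate_quarter_turn(GLYPH)

print(has_vertical_axis_symmetry(GLYPH), has_horizontal_axis_symmetry(GLYPH))      # True False
print(has_vertical_axis_symmetry(rotated), has_horizontal_axis_symmetry(rotated))  # False True
```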
Thirty undergraduate and postgraduate students (14 male, 16 female, mean age = 26.23) participated in this study. Participants were invited to the Cognition Laboratory at Loughborough University where they completed the computerized task individually. The full duration of each session was about 20 minutes and participants were compensated for their time with £4.
The study task was programmed using the PsychoPy 3 software (Peirce et al., 2019) and was presented on a 17-inch laptop. At the beginning of the task, participants were instructed that they would see mathematical statements using unfamiliar symbols and should judge whether they believed the statement on the screen to be true or false. Participants were explicitly told that no calculation was required and that they might not know the correct answer, but should follow their intuition when judging the statements. Participants used the left and right arrow keys to indicate whether they believed the statement in each trial to be true or false. The assignment of the arrow keys was counterbalanced between participants.
Participants judged 210 trials in an individually randomized order. A trial consisted of a binary operation, represented by a non-mathematical symbol, and a written statement about its relationship to the same operation with the two operands reversed. An example trial is shown in Figure 1.
Of the 210 trials, 140 asked participants for a judgement about the commutativity of the operation. Those trials stated that the operation ‘is equal to’ its reversed form. The statements about the relationship in the remaining 70 filler trials were either ‘is larger than’ or ‘is smaller than’, and responses to those trials were not analysed.
We constructed the trials from seven pairs of numbers between 1 and 99. Each pair was presented with each of ten different non-mathematical symbols (shown in Figure 2). One presentation was with the symbol symmetrical about the vertical axis, and one was with the symbol rotated by 90° so that it was symmetrical about the horizontal axis. This allowed us to compare the endorsement of commutativity statements on 70 trials where symbols had vertical symmetry with the endorsement of commutativity statements on 70 trials where symbols had horizontal symmetry. The raw data are freely available (see Supplementary Materials).
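The trial counts follow from fully crossing number pairs, symbols and orientations, as the following sketch shows. The index-based placeholders, the dictionary layout and the seeding scheme are assumptions made for illustration. The actual number pairs, symbols and filler statements are in the Supplementary Materials and are not reproduced here.

```python
import random
from itertools import product

N_NUMBER_PAIRS = 7
N_SYMBOLS = 10
ORIENTATIONS = ("vertical", "horizontal")
TOTAL_TRIALS = 210

# Fully crossed commutativity trials; pairs and symbols are placeholder indices.
commutativity_trials = [
    {"pair": pair, "symbol": symbol, "orientation": orientation}
    for pair, symbol, orientation in product(
        range(N_NUMBER_PAIRS), range(N_SYMBOLS), ORIENTATIONS
    )
]

per_orientation = {
    orientation: sum(t["orientation"] == orientation for t in commutativity_trials)
    for orientation in ORIENTATIONS
}
n_filler = TOTAL_TRIALS - len(commutativity_trials)


def randomised_order(trials, participant_seed):
    """Return a shuffled copy of the trials (hypothetical per-participant seeding)."""
    rng = random.Random(participant_seed)
    shuffled = list(trials)
    rng.shuffle(shuffled)
    return shuffled


print(len(commutativity_trials), per_orientation, n_filler)
# 140 {'vertical': 70, 'horizontal': 70} 70
```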
As shown in Figure 3, participants endorsed commutativity for symbols with vertical symmetry on average in 66.8% (SD = 26.1%) of the trials. Statements using symbols with horizontal symmetry were less frequently endorsed as commutative (M = 26.7%, SD = 25.7%). As preregistered, we conducted a one-sided paired t-test to compare the average frequency of commutativity endorsement between symbols with vertical and horizontal symmetry. In addition to the frequentist t-test, we also calculated a one-sided Bayes factor, using the default prior (a half Cauchy distribution with scale parameter 0.707) in favour of a difference. Participants endorsed commutativity significantly more frequently for statements that used symbols with vertical symmetry, which followed Ladd’s hypothesis, than for statements that used symbols with horizontal symmetry, which did not, t(29) = 5.57, p < .001, d = 1.02, BF10 = 7824.42. A robustness check revealed that this Bayes factor was not substantially affected by the choice of prior width.
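For readers who wish to see how the test statistic and the effect size relate, the sketch below computes a paired t statistic and a standardised mean difference from per-participant scores. It assumes that d is the mean of the paired differences divided by their standard deviation (often written d_z); the text does not state which variant was used. The five data points are made up for illustration and are not the study data.

```python
import math
from statistics import mean, stdev


def paired_t_and_dz(condition_a, condition_b):
    """Return (t, d_z, df) for paired observations."""
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    d_z = mean(diffs) / stdev(diffs)  # standardised mean difference
    t = d_z * math.sqrt(n)            # same as mean / (sd / sqrt(n))
    return t, d_z, n - 1


# Made-up endorsement proportions for five hypothetical participants.
vertical = [0.9, 0.7, 0.6, 0.8, 0.5]
horizontal = [0.3, 0.4, 0.2, 0.5, 0.1]
t, d_z, df = paired_t_and_dz(vertical, horizontal)

# Consistency check on the reported values: d_z = t / sqrt(n).
reported_t, reported_n = 5.57, 30
implied_d = reported_t / math.sqrt(reported_n)
print(round(implied_d, 2))  # 1.02
```

Under this assumption, the reported t(29) = 5.57 with 30 participants implies d ≈ 1.02, which agrees with the reported effect size.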
We followed up this main analysis with a generalized linear mixed model with a binomial response distribution, which accounted for the random effects of participant and symbol. The aim of this analysis was to model random intercepts for each participant and each symbol. This technique controls for the variation in the data that arises because participants differ in their general tendency to endorse a statement and because some symbols are generally endorsed more frequently than others, in both cases regardless of the symmetry axis of the symbol. The model was specified with an unstructured covariance structure, and parameters were estimated using the Laplace approximation. Using this model, we found that statements with vertically symmetric symbols had significantly higher odds of being endorsed as commutative than statements with horizontally symmetric symbols, OR = 2.01, 95% CI [1.86, 2.16], p < .001. The analysis of the modelled random intercepts also indicated that the general endorsement rate varied only slightly between symbols (SD = 0.30) and varied more considerably between participants (SD = 1.05). Descriptive analyses of the effect for every individual participant and for every individual symbol are part of the Supplementary Materials.
In sum, we found very strong evidence that the notation for binary operations influences how that operation is intuitively interpreted. Specifically, in line with Ladd’s hypothesis, using symbols with a vertical axis of symmetry seems to be intuitively associated with commutativity. In other words, the visual properties of symbols for binary operations influence the mathematical properties that are intuitively attributed to these operations, as if the symbols were indeed iconic.
Given the findings from Study 1, it is natural to ask whether using symbols consistent with Ladd’s hypothesis would be advantageous for students’ mathematical learning. To our knowledge, no research has directly investigated this issue. However, it is known that learners do often inappropriately assume that binary operations are commutative. For example, Mitchell (1983) reported that around 50% of first, second, and third grade children responded to mathematical problems in ways consistent with a belief that subtraction is commutative. These findings were consistent with Weaver’s (1972), who noted that a belief in the commutativity of single-digit subtraction can lead children to assert, for example, that 75 – 48 = 33 (since 7 – 4 = 3 and 8 – 5 = 3). The results from Study 1 suggest that the symmetry of the minus sign might contribute to these difficulties.
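The kind of error described here can be sketched as a digit-by-digit procedure that always takes the smaller digit from the larger one, as if single-digit subtraction were commutative. The function name and the procedure are illustrative assumptions about the error pattern, not a model proposed in the cited studies.

```python
def digitwise_commuted_subtraction(minuend, subtrahend):
    """Subtract column by column, always taking the smaller digit from the larger."""
    width = max(len(str(minuend)), len(str(subtrahend)))
    top = str(minuend).zfill(width)
    bottom = str(subtrahend).zfill(width)
    digits = [str(abs(int(a) - int(b))) for a, b in zip(top, bottom)]
    return int("".join(digits))


# A problem that needs borrowing: the buggy procedure and the true answer differ.
print(digitwise_commuted_subtraction(75, 48), 75 - 48)  # 33 27

# A problem without borrowing: the two happen to agree.
print(digitwise_commuted_subtraction(78, 45), 78 - 45)  # 33 33
```

The error therefore only surfaces when a column would require borrowing, which may make the underlying misconception harder to detect.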
The proposal that using symmetric symbols to represent non-commutative operations might cause problems for learners is consistent with recent work that has found that the visual properties of notations are cognitively relevant. As an example, Landy and Goldstone (2007) found that the spacing in simple arithmetic statements could be used to support or undermine students' understanding of the order of operations: they found that ‘a×b + c×d’ was more likely to be correctly understood than ‘a × b+c × d’, due to the differences in spacing around the operation symbols. Similarly, Kirshner and Awtry (2004) found that the 'visual salience' of invalid algebraic manipulations was a major cause of student errors. Might similar visual properties, of the type articulated in Ladd's hypothesis, support or undermine students' engagement with binary operations?
We sought to test Ladd's principle by using an artificial symbol learning paradigm. Specifically, we asked whether it is better to associate vertically symmetric symbols with commutative operations and vertically asymmetric symbols with non-commutative operations (the congruent condition) than it is to associate asymmetric symbols with commutative operations and symmetric symbols with non-commutative operations (the incongruent condition).
The intended sample size and analysis plan were preregistered prior to data collection (see Supplementary Materials).
Fifty-eight undergraduate mathematics students (34 men, 24 women, mean age = 19.9 years) participated during a statistics lecture at Loughborough University. Participants were randomly assigned to either the congruent or incongruent conditions based on the parity of their student ID numbers (which are assigned pseudo-randomly upon the students' enrolment at the university). Participants worked through booklets individually in silence. The experiment consisted of three parts. First, participants read an information sheet about the experiment, gave consent for their data to be used in the analysis, and self-reported their gender and age. In the second part, the learning phase, participants were given three minutes to read about, and learn, a set of novel symbols to represent addition, subtraction, multiplication, and division.
In the congruent condition participants were taught to associate symmetric symbols (◇, ◆) with addition and multiplication, and asymmetric symbols (▷, ▶) with subtraction and division. In the incongruent condition these symbols were reversed so that participants associated asymmetric symbols with the commutative operations and symmetric symbols with the non-commutative operations. The full text of this section, for those in the congruent condition, is shown in Figure 4. The text in the incongruent condition was identical, except that the ◇ and ◆ symbols were switched with the ▷ and ▶ symbols.
After reading these instructions for three minutes participants were asked to turn over the page. This revealed the test phase, which consisted of a simple fluency arithmetic task. Specifically, participants were asked to solve as many simple two-term arithmetic problems as they could in three minutes.
The booklet contained a total of 396 arithmetic problems, split equally between the four operations, which were presented in a different randomised order for each participant. The problems were designed so that participants could not infer the operation from the identity of the numbers in the problem (for instance, if we had asked a participant to solve 36 ◆ 41 they might reasonably have inferred that ◆ did not represent division).
For example, a participant may have seen:
Solve: 40 ▷ 2 =
Solve: 18 ◆ 2 =
Solve: 2 ◇ 2 =
Solve: 24 ▶ 1 =
And so on. All 396 problems, together with example test booklets for each of the conditions, are provided in the Supplementary Materials. Our dependent measure was the fluency score: the number of arithmetic problems each participant solved correctly in the three minutes of the test phase. The raw data are also available (see Supplementary Materials).
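One way to read the design constraint on the problems is that every number pair should yield a non-negative whole-number answer under all four operations, so that the numbers alone do not give the operation away. The filter below encodes that reading. It is an assumption made for illustration, since the exact construction rule is documented only in the Supplementary Materials.

```python
def operation_is_ambiguous(a, b):
    """True if a + b, a - b, a * b and a / b are all non-negative whole numbers.

    Assumes a and b are natural numbers; this rule is an illustrative guess.
    """
    return b != 0 and a >= b and a % b == 0


def candidate_answers(a, b):
    """All four possible answers for a pair that passes the filter."""
    return {"add": a + b, "subtract": a - b, "multiply": a * b, "divide": a // b}


example_pairs = [(40, 2), (18, 2), (2, 2), (24, 1)]
print([operation_is_ambiguous(a, b) for a, b in example_pairs])  # [True, True, True, True]
print(operation_is_ambiguous(36, 41))  # False: division and subtraction are ruled out
print(candidate_answers(40, 2))
# {'add': 42, 'subtract': 38, 'multiply': 80, 'divide': 20}
```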
Two participants failed to meet our preregistered inclusion criteria (their fluency scores were not within 3 SDs of the mean) and were excluded. This left 56 participants in the main analysis.
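An exclusion rule of this kind can be sketched as follows. The sketch assumes that the mean and the sample standard deviation are computed over all participants, including any outliers; the preregistration may specify a different procedure. The scores are made up for illustration.

```python
from statistics import mean, stdev


def split_by_sd_rule(scores, n_sds=3.0):
    """Split scores into (kept, excluded) using a mean +/- n_sds * SD criterion."""
    centre = mean(scores)
    spread = stdev(scores)  # sample SD over all scores (an assumption)
    kept = [s for s in scores if abs(s - centre) <= n_sds * spread]
    excluded = [s for s in scores if abs(s - centre) > n_sds * spread]
    return kept, excluded


# Made-up fluency scores: nineteen typical values and one extreme value.
scores = [48, 50, 52] * 6 + [50, 200]
kept, excluded = split_by_sd_rule(scores)
print(len(kept), excluded)  # 19 [200]
```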
The mean numbers of problems correctly answered in each condition are shown in Figure 5. In the congruent condition participants' mean fluency score was 64.3 (SD = 18.6), which was significantly higher than in the incongruent condition, 52.6 (SD = 10.6), t(53.92) = 3.013, p = .004, d = 0.777. A one-sided Bayesian t-test, using the default prior (a half Cauchy distribution with scale parameter 0.707) yielded a Bayes factor of 9.020 in favour of the alternative hypothesis, and a robustness check revealed that this result was not substantially affected by the choice of prior width.
In sum, in line with Ladd's principle, we found that participants in the congruent condition outperformed those in the incongruent condition. A caveat to note here is that we did not assess or control for potential arithmetic fluency differences between the congruent and incongruent conditions. However, those participants who were randomly assigned to learn to associate vertically symmetric symbols with commutative operations and vertically asymmetric symbols with non-commutative operations could perform arithmetic operations more fluently than those participants who learned to associate asymmetric symbols with commutative operations and symmetric symbols with non-commutative operations.
Christine Ladd formulated her hypothesis that commutative operations should be expressed by symmetric symbols almost 150 years ago. In two studies we have provided what we believe to be the first empirical test of her hypothesis. We demonstrated that binary operation symbols that have a vertical axis of symmetry are more likely to be intuitively associated with commutativity than those with a horizontal axis of symmetry. We further found that Ladd's hypothesis was supported in the context of basic arithmetic, where the use of iconic notation proved advantageous for students when solving problems.
These results suggest that despite the seemingly arbitrary nature of a symbol's visual appearance, it may nevertheless have some iconic features that influence the manner in which that symbol is processed. Specifically, our studies support Ladd’s hypothesis that it is advantageous if symbols have iconic aspects that connect in some way to the represented mathematical concept. This observation extends Landy and Goldstone's (2007) finding that perceptual grouping influences behaviour in order-of-operations contexts. Like Landy and Goldstone, we found that the processing of formal mathematical symbols has a non-trivial visual component.
But what mechanism underlies our results? Whereas Landy and Goldstone (2007) appealed to perceptual units to account for their findings (e.g., ‘4×2’ is more likely to be perceived as a unit than ‘4 × 2’, because of the affordances of the spacing around the operation symbol), a similar mechanism does not seem to apply in our context. Rather, our finding from Study 2, that arithmetic fluency on an immediate post-test is facilitated by notations that use congruent symbols, seems to involve memorability rather than perceptual units. What makes iconic symbols more memorable in the case of symmetry and commutativity? Ladd seemed to understand a commutative operation as symmetric because its two operands can be interchanged without changing the result. This concept of interchangeability across a vertical axis seems congruent with a notation that uses a vertically symmetrical symbol (i.e., one with identical left and right sides).
There are theoretical reasons to suppose that congruent symbols might be more memorable than incongruent symbols. Consider onomatopoeias: words such as ‘buzz’, ‘snap’ or ‘whack’ whose pronunciation is related to their semantic meaning. Phenomena of this kind have been studied under the labels of ‘sound-symbolism’ and ‘ideophones’ (Dingemanse, 2012; Perniss & Vigliocco, 2014). A congruent notation can be seen as a visual version of an onomatopoeia: rather than its sound being related to its semantics, its shape is. Since it is well established that onomatopoeias are more memorable than non-onomatopoeias (e.g., Inoue, 1991; Lowrey, Shrum, & Dubitsky, 2003), it seems plausible to suppose that the meaning of congruent symbols would be more memorable than the meaning of incongruent symbols.
But do linguistic shape-semantic associations, analogous to the sound-semantic associations involved in onomatopoeias, exist? Several sources of evidence suggest that they do. It has been known since the early work of gestalt psychologists (e.g., Köhler, 1929) that humans spontaneously associate shapes with sounds. Perhaps the best known demonstration of this is Ramachandran and Hubbard's (2001) observation that when participants are presented with two shapes, one zig-zagged and one curvy, and told that they are the Martian symbols ‘kiki’ and ‘bouba’, around 95% assume that the zig-zagged symbol is ‘kiki’ and the curved symbol is ‘bouba’. Although this result is typically reported as demonstrating a shape-sound association, it is worth noting that the sound and the visual word are confounded here: both the sound ‘kiki’ and the word ‘kiki’ share properties with the kiki symbol (specifically, there are sharp changes in the visual direction of the lines of the symbol, there are sharp changes in the visual direction of the letters in the word ‘kiki’, and there are sharp changes in the phonemic inflections of the sound ‘kiki’), and both the word ‘bouba’ and the sound ‘bouba’ share properties with the bouba symbol. Interestingly, the associations invoked by the word ‘kiki’ go beyond the visual modality. Gallace, Boschin, and Spence (2011) asked participants to taste various foods and rate them on a number of dimensions, one of which was kiki-bouba (i.e., one end of a Likert scale was anchored by ‘kiki’, the other by ‘bouba’). They found that cheddar cheese was rated as being more kiki than brie, and that regular chocolate was rated as being more bouba than mint chocolate.
More directly related to our work, the early gestalt psychologists also demonstrated that shapes can spontaneously generate semantic associations. For instance, Poffenberger and Barrows (1924) asked participants to match various adjectives to symbols formed of either curved or angular lines. They found large agreement that words such as ‘sad’, ‘quiet’ and ‘gentle’ were best represented by curves, whereas ‘agitating’, ‘furious’ and ‘powerful’ were best represented by angular lines. Lyman (1979) found similar results: 100% of his participants believed that ‘angry’ would be best represented by a zig-zagged shape, and 98% that ‘friendly’ would be best represented by a curved shape. In short, many symbols exhibit degrees of iconicity, which has been suggested as being a general property of language (Perniss, Thompson, & Vigliocco, 2010): their visual properties spontaneously convey associations that naturally fit better with certain meanings or sounds. It seems plausible to suppose that, as with onomatopoeias, these associations improve memorability when the symbol is paired with a congruent concept.
These classic findings that demonstrate relationships between visual properties and meaning converge with recent work on the association between shape, space and mathematical concepts. When thinking about numbers, people seem to generally associate smaller numbers with the left side in space and larger numbers with the right side as well as expansion in space (for overviews see Cipora, Patro, & Nuerk, 2018; Fischer & Shaki, 2014; Newcombe, Levine, & Mix, 2015). Similar associations seem to hold true for operation symbols in arithmetic. Participants were faster to press a button to their right than a button to their left in response to a presented addition symbol ‘+’, and vice versa for the subtraction symbol ‘−’ (Pinhas, Shaki, & Fischer, 2014).
The symbols of advanced mathematics are often introduced very consciously to satisfy certain desiderata, and so they provide a rich resource for the study of the cognitive and practical advantages of notations (e.g., see Schlimm, 2018). We therefore believe that our programmatic study opens up many avenues for future research. Experimental work could study the effects of other congruent and incongruent symbols, check whether children’s familiarity with ‘÷’ or ‘/’ for division makes a difference to their mathematical understanding, try to replicate this effect with other operations, and develop production-related tasks (e.g., where subjects pick an operation symbol from a list, or draw their own). Theoretical work could extend Palmer’s (1978) classification of ‘intrinsic’ representations to cover not only properties of relations, but also properties of the symbols used in representations.
Finally, it is natural to ask whether our findings have any practical implications for mathematics teaching and learning. An obvious suggestion would be that, all things being equal, we should follow Ladd's principle and favour choosing symmetric symbols for commutative operations and asymmetric symbols for non-commutative operations. This practice would seem to be entirely feasible in many contexts. In advanced mathematics, for instance, such notational choices are largely a matter of convention (there is no reason why an introductory group theory course could not favour a symbol like '▷' to represent an arbitrary binary operation). Similarly, it would seem possible for teachers to introduce division with the '/' symbol rather than '÷'. Using a notation that iconically represents commutativity might improve students’ learning and understanding of the concept. A possible benefit of supporting the understanding of commutativity could lie in the faster retrieval of arithmetic facts: if a student understands commutativity and remembers ‘5 + 6 = 11’, they also know ‘6 + 5 = 11’. These suggestions should be tested in a more ecologically valid context before strong conclusions are drawn: whereas the dependent variable in our Study 2 was fluency, this is rarely the outcome prioritised in more advanced settings, and it is unknown whether our results would generalise from fluency outcomes to measures of conceptual understanding. Nevertheless, there seems to be no obvious downside, and a plausible upside, to following Ladd's principle when choosing which symbols to use when teaching mathematics.