Empirical Research

An Input Lexicon for Familiar Numbers

Aliette Lochy*1,2, Christine Schiltz1

Journal of Numerical Cognition, 2022, Vol. 8(2), 244–258, https://doi.org/10.5964/jnc.7385

Received: 2021-08-25. Accepted: 2022-01-13. Published (VoR): 2022-07-28.

Handling Editor: Lieven Verschaffel, KU Leuven, Leuven, Belgium

*Corresponding author at: University of Luxembourg, Faculty of Humanities, Social and Educational Sciences, Department of Behavioral and Cognitive Sciences, Institute of Cognitive Science and Assessment, Campus Belval, Maison des Sciences Humaines, 11, Porte des Sciences, L-4366 Esch-sur-Alzette, Luxembourg. E-mail: aliette.lochy@uni.lu

This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Neuropsychological case-studies suggested that dates and encyclopedic numbers may be processed differently than unknown numbers. However, this issue was seldom investigated in healthy participants. Therefore, it is unclear whether known dates are read like words (as lexical items), or like numbers (each position strictly defines digits’ values in a base-10 system). Here, we compared dates to unknown numbers in an experiment using a paradigm from the word recognition literature. We assessed the word-superiority effect by testing experts (students/teachers in History) with dates. A 4-character stimulus (xxxx; letters or numbers, half known/unknown) was presented centrally, masked, and followed by 2 characters above and below the mask, at position 2 (xXxx) or 3 (xxXx) in a two-alternative forced-choice recognition task. Both accuracy and reaction times were better for dates than unknown numbers, similar to the results obtained with words compared to non-words. However, this effect was modulated by position in the string. These results show a “date-superiority effect” revealing that dates are processed differently than unknown numbers, and suggest that similar orthographical mechanisms might be used to process dates and words.

Keywords: transcoding, reading Arabic numbers, encyclopedic facts, dates, word superiority effect

When we talk about numbers, we mainly think about quantities. However, numbers may evoke different meanings: the cardinal value represents the size of a set, or an amount (e.g., “There are 900 places in the concert hall”), the ordinal value represents the place or the rank in a set (e.g., “The 900th participant will win a boat trip”). Numbers may also refer to a nominal value, and be used as simple labels (Peugeot 5008), known phone numbers (911), pin codes, or evoke a famous event (1945). Our stored knowledge about numbers is referred to as “number facts”: we all know some arithmetic facts (e.g., 12 x 12 = 144) as well as encyclopedic number facts, like those just mentioned. Several case-studies in neuropsychology have suggested that encyclopedic number knowledge might be processed differently than arithmetic facts or unknown numbers with a cardinal or ordinal value (Cappelletti et al., 2008; Cipolotti, 1995; Cohen et al., 1994; Delazer & Girelli, 1997). In the current study, we address this issue in healthy participants, by assessing how encyclopedic numbers are read when they are visually presented. More specifically, we investigate if they generate reading processes that are similar to reading words rather than reading unknown numbers, and thus if they could be stored in a lexicon for Arabic digits by analogy to the orthographic lexicon for words.

Reading multidigit Arabic numbers is a complex cognitive operation, involving specific syntactic parsing processes on the input form, that are radically different from what happens when we read words (Dotan & Friedmann, 2018, 2019). Indeed, a multidigit Arabic number consists of a chain of 10 different possible characters (the digits 1-9 and 0), the value of which is defined by the position they hold in the sequence, starting from the rightmost position and increasing by a power of ten at each step to the left. As an example, the digit “2” in 3472 means the quantity {2} x 10^0, while in 3275, it means the quantity {2} x 10^2. When a power of 10 is not associated with any base quantity, then the digit 0 is used as a place-holder, e.g., in 3072. Finally, Arabic numbers are parsed into triplets from the rightmost position, another striking difference from words, which are read from left to right. These triplets are often marked in the visual array by a dot, a comma, or simply a space depending on cultural conventions (e.g., “thirty-four thousand seven hundred” is written “34.700”, “34,700” or “34 700”). Removing this visual marker of triplets in very large numbers renders them almost impossible to read (e.g., 105975300 vs. 105.975.300). Reading aloud furthermore involves specific conversion processes in order to plan the corresponding verbal form, which corresponds to several words, rather than several phonemes as is the case in word reading. The appropriate number words, organized in lexical classes (units, decades or tens, and teens) and multiplier words (hundred, thousand, million, etc.), have to be retrieved in the correct order for them to represent the additive (e.g., “one hundred fifteen”: 100 + 15) or multiplicative (e.g., “fifteen hundred”: 15 x 100) relationships between lexical primitives.
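The positional rules described in this paragraph can be made concrete with a short sketch. The Python below is purely illustrative and is not part of the study; the function names `place_values` and `triplets` are hypothetical names chosen for this example.

```python
from typing import List, Tuple


def place_values(number: str) -> List[Tuple[int, int]]:
    """Return (digit, power of ten) pairs, leftmost digit first."""
    n = len(number)
    return [(int(digit), n - 1 - i) for i, digit in enumerate(number)]


def triplets(number: str) -> List[str]:
    """Group the digits into triplets, starting from the rightmost position."""
    groups: List[str] = []
    while number:
        groups.insert(0, number[-3:])  # take up to three digits from the right
        number = number[:-3]
    return groups


# The digit "2" carries a different power of ten in 3472 and in 3275.
print(place_values("3472"))
print(place_values("3275"))
# Triplet parsing of the large number used as an example in the text.
print(".".join(triplets("105975300")))
```

The same digit thus contributes a different quantity depending only on its distance from the right edge of the string, which is the property that a lexical (whole-form) reading route would not need to compute.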

Can this complex process be bypassed for encyclopedic numbers, so that they are recognized at a glance, as familiar words that are stored in an input orthographic lexicon? Some authors indeed suggested the existence of multiple routes for reading numbers (Cipolotti, 1995; Cohen et al., 1994), establishing parallels with the general cognitive architecture postulated in word reading (DRC model, Coltheart et al., 2001). The sublexical route, or grapheme-to-phoneme conversion route, would correspond to the transcoding process described above. Depending on the theoretical views, this route would mandatorily involve access to the representation of magnitude for numbers (McCloskey, 1992; McCloskey et al., 1985) or not (Barrouillet et al., 2004; Dehaene & Cohen, 1995; Power & Dal Martello, 1997; Seron & Deloche, 1984). The lexical route, if it exists, would imply the existence of an Arabic input lexicon where frequent and familiar number forms are stored (or “lexicalized”) and which addresses a phonological output lexicon, with or without access to semantics. Semantics in this case would be essentially non-quantity-related: it would hold properties and characteristics of these items at a conceptual level.

This multiple-route proposal for reading numbers is exactly what is suggested by several case-studies showing dissociations between the ability to read encyclopedic vs. unknown numbers. The patient reported by Cohen et al. (1994) suffered from deep dyslexia and could read half of the presented words (12/24), but almost no non-words (1/24). When reading multidigit numbers, he was also better at reading familiar, encyclopedic numbers (e.g., 1789, French Revolution) than unknown numbers (24/52 vs. 10/52). Similarly, Delazer and Girelli (1997) describe an aphasic patient who performed better in reading Arabic numbers in a semantic context (e.g., he could read “164” when preceded by “Alfa Romeo”) than without context or with a non-associated context. The opposite pattern has also been described in the case-study by Cappelletti et al. (2008). In this case, the patient suffered from semantic dementia and was impaired in reading encyclopedic numbers, that is, numbers corresponding to facts retrieved from stored knowledge. Although he could still process non-encyclopedic numbers (e.g., 3 x 3 = 9), he was unable to learn new encyclopedic facts when they included numbers. The patient’s impairment was presumably due to representational problems concerning the visual Arabic input levels, and not to the retrieval of verbal forms, as he did not benefit from priming or from multiple choice situations.

More recently, an fMRI study complemented these neuropsychological dissociations by revealing different brain activations for numbers when processed as quantities vs. dates of famous historical events (Gullick & Temple, 2011). In the study, the authors used both unknown numbers and known famous dates. Before the experiment, they verified that participants knew the dates and allowed them a short study session to refresh memory traces. Then, they proposed two different tasks. Either participants had to process numbers as quantities in a numerical comparison task, where a pair of numbers was presented and participants had to choose which was larger/smaller, or they had to process them as events and indicate the latest/earliest event of the pair. In the first task, numbers activated only the intraparietal sulcus (IPS) bilaterally, as expected from the known regions involved in quantity processing (Ansari, 2007; Nieder, 2004). In the second task, numbers activated not only bilateral IPS but also additional regions related to semantic processing and fact retrieval: the temporal pole and the superior frontal gyrus.

Other studies in healthy individuals on this issue are scarce. Using a priming design to investigate the existence of stored representations for encyclopedic numbers, Alameda et al. (2003) required participants to name multidigit numbers preceded by related or unrelated masked or unmasked primes. So, as an example, participants had to read aloud the Arabic number “747” and it was preceded by “Boeing” (related prime), or by “Porsche” (unrelated prime). Participants were faster with the related prime, which seems to suggest that there are representations for familiar numbers, that could be activated by associated verbal labels. However, the effect could also stem from pre-assembled expressions in phonological format (e.g., Noël & Seron, 1995), as the task required participants to read aloud the numbers. Therefore, in the second experiment, the authors designed a “number decision task” not requiring any verbal output: participants had to answer with keypress whether a presented item was a legal/illegal number. Legal numbers were made of digit strings, while “non-numbers” were a mixture of digits and letters. Again, numbers were preceded by related or unrelated verbal primes. In that case, facilitation effects strongly suggest the existence of stored representations for known familiar numbers, automatically activated by associated semantic (non-numerical) knowledge, thus entailing the existence of an input lexicon for Arabic numbers.

Our study goes a step further in assessing the existence of a mental store, or lexicon, for numbers, by using a design that does not involve any semantic or verbal associate, no priming and no explicit processing of the whole multi-digit number. Indeed, to test if some numbers are represented in an input Arabic lexicon as lexical items, we decided to compare known dates to unknown numbers, and to test experts with dates, thus historians. This choice was made in order to design a homogeneous set of stimuli, supposedly known by all participants. We used a paradigm from the reading literature that has revealed a word-superiority effect (WSE, Reicher, 1969; Wheeler, 1970). The WSE initially showed faster RT and better accuracy to identify a letter in a two-alternative-forced-choice (2AFC) task when it belongs to a word than when the letter is presented in isolation, suggesting an advantage due to lexical activation. This paradigm has been replicated numerous times in the literature, and it was also shown that a letter is better identified when belonging to a word than a pseudoword (PW) (Grainger et al., 2003; Grossi et al., 2009). The WSE has been shown to correlate with behavioral lexical decision performances (Hildebrandt et al., 1995) and to emerge with the building of an orthographic lexicon (Chase & Tallal, 1990; Coch et al., 2012; Grainger et al., 2003). It has generally been interpreted as reflecting top-down influences from lexical levels of representation on letter-identification processes in the general framework of the interactive activation model (McClelland & Rumelhart, 1981; Rumelhart & McClelland, 1982), or in cascaded models of visual word processing (Coltheart et al., 2001).

Following the logic of this paradigm, here a number that could be a known date or not was briefly presented, without participants being aware of this manipulation. It was followed by a mask and an alternative choice between a target digit and another digit (2AFC task), placed above and below the corresponding position in the string (Figure 1).

Our general hypothesis was that, in contrast to unknown numbers, dates are stored in an input lexicon and thus should be sensitive to the same manipulations as words compared to non-words. More specifically, we hypothesized better digit identification performance for dates than for unknown numbers.

Method

Participants

Twenty-five participants with expertise in History (8 females, mean age: 24 years old; range: 21-33 years old) were recruited among master students in History (N = 13), History teachers (N = 3) or recently graduated young historians (N = 9). They were tested after giving their written consent for this experiment, which was approved by the Ethical Committee of the University of Luxembourg.

Stimuli

Stimuli were all 4-character strings, half made of letters and half of digits. Within each context (letters/numbers), 18 stimuli were known and 18 were unknown.

Letters

In the letter context, known stimuli were constituted of frequent common French words with a CVCV structure (see Appendix for the full list of stimuli; e.g., ROBE, VASE, DATE) corresponding to three phonemes (in French, the last -e is soundless). Each target stimulus contained one letter that should be identified (e.g. ROBE, XX?X) and the proposed alternative letter belonged to one orthographic neighbor, or competitor hereafter (e.g. ROBE-ROSE, proposed letters: XX?X, B/S), in order to avoid a guessing strategy. Half of the items proposed the choice on position 2 (e.g., VASE-VISE) and half on position 3 (e.g. ROBE-ROSE). Target words and competitors (see Appendix) were matched on lexical frequency (words: 45.16, competitors: 46.30), number of neighbors (words: M = 3.86, range = 5-19; competitors: M = 4.41, range = 3-20), and bigram frequency (words: M = 7889, range = 2652-13649; competitors: M = 4529, range = 1325-15634).

Unknown stimuli (18 non-words) were also constituted of 4 letters arranged in a CVCV format containing 3 phonemes, and paired with competitors in the same manner as words. Thus, competitors were non-words that contained an alternative letter, half on position 2 and half on position 3. Non-word targets and competitors were matched in the number of orthographic neighbors (target non-words: M = 3.06, range = 6-18; competitor non-word: M = 3.36, range = 5-15) and in bigram frequency (target non-words: M = 8287, range = 2483-13855; competitor non-word: M = 8127, range = 4778-14364).

Numbers

Known numbers were 18 dates of 4 digits, hence starting with 1 (e.g., 1945, 1492) except for two stimuli starting with 2 (e.g., 2005 and 2011). Dates were chosen after collecting spontaneous reports of known dates from 4 different historians (not participating in the experiment) and selecting the ones reported by at least 2 of them. Each target stimulus contained one digit that had to be identified (e.g., 1914, XX?X) and the proposed alternative digit gave rise to a competitor that was also a known date (e.g., 1914-1934, proposed digits: XX?X, 1/3). Half of the items proposed the choice on position 2 (e.g., 1492-1892) and half on position 3 (e.g., 1914-1934).

Unknown numbers (N = 18) resembled dates because they also started with the digit 1 or 2, but they did not correspond to famous dates for our participants. They were paired with competitors, also unknown dates, by changing one digit in position 2 or 3.
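The pairing constraint shared by the letter and number stimuli (a target and its competitor differ by exactly one character, at position 2 or 3) can be checked mechanically. The following Python sketch is only an illustration using the example pairs given in the text; the helper name `differing_positions` is hypothetical.

```python
from typing import List


def differing_positions(target: str, competitor: str) -> List[int]:
    """Return the 1-based positions at which two equal-length strings differ."""
    if len(target) != len(competitor):
        raise ValueError("target and competitor must have the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(target, competitor)) if a != b]


# Example pairs taken from the stimulus descriptions.
pairs = [("VASE", "VISE"), ("ROBE", "ROSE"), ("1492", "1892"), ("1914", "1934")]
for target, competitor in pairs:
    positions = differing_positions(target, competitor)
    # A valid pair differs at a single internal position (2 or 3).
    print(target, competitor, positions, positions in ([2], [3]))
```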

Procedure

Participants performed the experimental task followed by a 4-alternative forced-choice (4AFC) matching task aimed at verifying their knowledge of the dates (Figure 1). Both tasks were run on E-Prime 2.0 in a quiet room at the University.

In the experimental task (Figure 1A), each trial started with a fixation cross of 500ms in the center of the screen followed by a 500ms forward mask (####). The target stimulus was then flashed for 50ms, followed by a backward mask (####) of 500ms. After that, the series of hashes stayed on the screen, and two characters were presented above and below the position where they had appeared in the stimulus string, at position 2 or 3. Participants were required to choose which of the two characters belonged to the target, by pressing a left or right key of the keyboard (left for the above character, right for the below character, and the reverse instruction for half of the participants). The next trial began after the participants’ response.
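The sequence of events in a trial can be summarized as a simple timeline. The sketch below is an illustration written in Python, not the E-Prime script used in the experiment; the function name `trial_timeline` and the text rendering of the choice display are assumptions made for this example.

```python
from typing import List, Optional, Tuple

MASK = "####"


def trial_timeline(
    target: str, position: int, above: str, below: str
) -> List[Tuple[str, str, Optional[int]]]:
    """Return (event, display, duration in ms); None means 'until response'."""
    pad = " " * (position - 1)  # aligns the two choices with the probed position
    choice_display = f"{pad}{above}\n{MASK}\n{pad}{below}"
    return [
        ("fixation", "+", 500),
        ("forward mask", MASK, 500),
        ("target", target, 50),
        ("backward mask", MASK, 500),
        ("choice", choice_display, None),  # hashes stay on screen with the two options
    ]


timeline = trial_timeline("1914", position=3, above="1", below="3")
for event, display, duration in timeline:
    print(event, duration)
print(timeline[-1][1])
```

Under these durations, 1550 ms elapse between the onset of the fixation cross and the onset of the choice display, of which the target itself occupies only 50 ms.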

Each item was presented twice, once with the correct answer presented above and once with the correct answer below. Trials were presented in blocks of letters and blocks of numbers in which known and unknown stimuli were randomly mixed. Blocks were presented in random order. There were two blocks of 36 numbers (dates/non-dates) and two blocks of 36 letters (words/non-words), for a total of 144 trials. The experiment lasted approximately 15 minutes.
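The trial counts follow directly from the design (2 contexts x 36 items x 2 presentations = 144). A possible way of building such a session is sketched below in Python. This is not the original experiment script: the item labels are placeholders, and the assumption that each block contains every item of its context once, with the location of the correct answer counterbalanced across the two blocks, is ours and is not stated in the text.

```python
import random
from typing import Dict, List, Tuple

Trial = Dict[str, str]


def build_blocks(known: List[str], unknown: List[str], rng: random.Random) -> List[List[Trial]]:
    """Build two blocks of 36 trials; each item appears once per block (assumption)."""
    items = [(s, "known") for s in known] + [(s, "unknown") for s in unknown]
    block_a: List[Trial] = []
    block_b: List[Trial] = []
    for stimulus, kind in items:
        first, second = rng.sample(["above", "below"], 2)  # counterbalance location
        block_a.append({"stimulus": stimulus, "type": kind, "correct_location": first})
        block_b.append({"stimulus": stimulus, "type": kind, "correct_location": second})
    rng.shuffle(block_a)  # known and unknown items randomly mixed within a block
    rng.shuffle(block_b)
    return [block_a, block_b]


def build_session(seed: int = 0) -> List[Tuple[str, List[Trial]]]:
    """Return four (context, block) pairs in random order."""
    rng = random.Random(seed)
    contexts = {
        # Placeholder labels standing in for the 18 known and 18 unknown items.
        "numbers": ([f"date{i:02d}" for i in range(18)], [f"nondate{i:02d}" for i in range(18)]),
        "letters": ([f"word{i:02d}" for i in range(18)], [f"nonword{i:02d}" for i in range(18)]),
    }
    session: List[Tuple[str, List[Trial]]] = []
    for context, (known, unknown) in contexts.items():
        for block in build_blocks(known, unknown, rng):
            session.append((context, block))
    rng.shuffle(session)  # random block order
    return session


session = build_session()
print(len(session), sum(len(block) for _, block in session))
```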

In the subsequent 4AFC matching task, each event was presented with 4 dates, and participants had to choose the correct associated date (Figure 1B). No time pressure was imposed in this task, as its purpose was only to verify participants’ knowledge.

Figure 1

Experimental Design

Note. A. Character identification task: participants have to decide which of two alternative characters has been presented in the stimulus that briefly appeared before. The example shows on the left, a number trial (known item) and on the right a word trial (known item). Trials were either known (as displayed here) or unknown, and position varied between position 2 or position 3. B. After the experimental task, participants did a 4AFC matching task to control for their knowledge of the dates.

Data Analysis

First, we analyzed accuracy in the matching task, both per item and per participant. The overall accuracy rate was 89.8%. One item gave rise to a high error rate (36%), as it was correctly matched to the event by only 16/25 participants; we therefore decided not to include it in the analyses of the experimental task (1948 – Date of the Declaration of Universal Human Rights).

When this item was removed, the accuracy rate was 91.2% (SD = 7.9%). Participants who scored more than 2 SD below the mean (i.e., below 75.2%) were discarded, leading to the exclusion of only one participant (mean score = 68.4%).
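The exclusion rule can be expressed in a few lines. The Python sketch below is illustrative only: the scores are invented, the function name `exclude_low_scorers` is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not specify which estimator was used.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def exclude_low_scorers(
    scores: Dict[str, float], n_sd: float = 2.0
) -> Tuple[float, List[str], List[str]]:
    """Return (cutoff, kept ids, excluded ids) for a 'mean - n_sd * SD' criterion."""
    values = list(scores.values())
    cutoff = mean(values) - n_sd * stdev(values)  # sample SD (assumption)
    kept = [pid for pid, score in scores.items() if score >= cutoff]
    excluded = [pid for pid, score in scores.items() if score < cutoff]
    return cutoff, kept, excluded


# Invented matching-task scores (% correct) for ten hypothetical participants.
scores = {f"p{i}": 94.0 for i in range(1, 10)}
scores["p10"] = 50.0
cutoff, kept, excluded = exclude_low_scorers(scores)
print(round(cutoff, 2), excluded)
```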

In the experimental task, we cleaned the data in several steps. First, we removed RT considered as technical failures (i.e., below 250ms or above 5000ms). Then we calculated the mean and SD per condition and removed RT that were more than 3 SD from the mean. In total, this represented 3.7% of the data. Finally, we checked the accuracy per item, and we removed items for which accuracy was below 50% correct: 2 dates and 1 non-date were removed.
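The first two cleaning steps can be sketched as follows. This Python example is a hedged illustration, not the analysis script of the study: the function name `clean_rts` and the trial format are hypothetical, and computing the mean and SD over all trials of a condition (rather than per participant) is an assumption.

```python
from collections import defaultdict
from statistics import mean, stdev
from typing import Dict, List, Tuple

Trial = Dict[str, object]


def clean_rts(trials: List[Trial], low: float = 250, high: float = 5000, n_sd: float = 3.0) -> List[Trial]:
    """Drop implausible RTs, then RTs more than n_sd SDs from their condition mean."""
    # Step 1: remove technical failures (below 250 ms or above 5000 ms).
    plausible = [t for t in trials if low <= t["rt"] <= high]

    # Step 2: compute mean and SD per condition on the remaining trials.
    by_condition: Dict[str, List[float]] = defaultdict(list)
    for t in plausible:
        by_condition[t["condition"]].append(t["rt"])
    bounds: Dict[str, Tuple[float, float]] = {}
    for condition, rts in by_condition.items():
        m = mean(rts)
        sd = stdev(rts) if len(rts) > 1 else 0.0
        bounds[condition] = (m - n_sd * sd, m + n_sd * sd)

    # Keep only trials within the bounds of their own condition.
    return [t for t in plausible if bounds[t["condition"]][0] <= t["rt"] <= bounds[t["condition"]][1]]


# Invented data: twenty regular trials, one slow outlier, two technical failures.
trials = [{"condition": "dates_P2", "rt": 1000} for _ in range(20)]
trials += [{"condition": "dates_P2", "rt": rt} for rt in (3000, 100, 6000)]
print(len(trials), len(clean_rts(trials)))
```

Note that with a 3 SD criterion computed on the sample SD, a single value can only be flagged as an outlier when a condition contains at least 11 observations, which is one reason why the level at which the criterion is applied matters.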

In total we thus removed one participant, and 4 numbers. The analysis was pursued on 24 participants, 18 words, 18 non-words, and on 15 dates and 17 non-dates.

Results

Reaction Times

Mean correct reaction times were analyzed with a 2 (Context: letters/numbers) x 2 (Type: known/unknown) x 2 (Position: P2/P3) ANOVA with repeated measures on all factors.

We found a main effect of Type, F(1,23) = 13.222; p < .001: RT were overall faster for known (1104ms) than unknown stimuli (1175ms) (Figure 2A). There was also a main effect of Position, F(1,23) = 31.689; p < .0001, with faster RT at position 2 (1038ms) than 3 (1240ms). There was an interaction between Position and Type, F(1,23) = 7.659; p = .011. At position 2, the 22ms advantage for known items (1027ms) by comparison to unknown items (1049ms) was not significant, t(23) = -1.088; p = .288, while at position 3, known items were responded to faster (1181ms) than unknown items (1299ms), t(23) = -3.841; p = .001. There was no main effect of Context (letters: 1102ms, numbers: 1177ms), F(1,23) = 2.064; p = .164, and no interaction with Context (Context x Type [F < 1]; Context x Position, F(1,23) = 1.510; p = .359; Context x Type x Position, F(1,23) = 1.275; p = .270), meaning that the advantage for known over unknown stimuli at Position 3 described above held both for letters (words over non-words) and for numbers (dates over non-dates).

Accuracy

Mean accuracy was analyzed with a 2 (Context: letters/numbers) x 2 (Type: known/unknown) x 2 (Position: P2/P3) ANOVA with repeated measures on all factors.

There was a main effect of Context, F(1,23) = 24.330; p < .0001: performance was overall better for letters (93%) than numbers (87%) (Figure 2B). There was a main effect of Position, F(1,23) = 38.614; p < .0001, with better scores at position 2 (92%) than 3 (86%). There was also an interaction between Context and Type, F(1,23) = 4.494; p < .04, qualified by a triple interaction between Context, Type, and Position, F(1,23) = 12.176; p < .002, showing that the effect of Type (known/unknown) was modulated by position differently for each context. Indeed, for letters, accuracy was high at position 2 (97% and 95% for words and non-words respectively), and the 2% better performance for words was not significant, t(23) = 1.238; p = .228, while at position 3 there was a significant benefit of 7% for words (92%) by comparison to non-words (85%), t(23) = 2.422; p = .024. For numbers, it was the reverse. At position 2, the benefit of 5% for known numbers was significant (dates: 90%, non-dates: 85%), t(23) = 2.933; p = .007, while in position 3 there was no significant difference in an overall lower accuracy (81% and 85% for dates and non-dates respectively), t(23) = -1.606; p = .122.

Inverse Efficiency Scores

For numbers, the profile of accuracy and reaction times at position 3 could suggest a speed-accuracy trade-off. Indeed, participants committed more errors on known numbers (81% correct) and they were also faster (1049ms) than on unknown numbers (85% correct, 1299ms). Therefore, we also ran analyses on Inverse Efficiency Scores (IES) combining the two measures (RT/ACC, Figure 2C).
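The logic of the Inverse Efficiency Score (RT divided by the proportion of correct responses) is illustrated in the minimal Python sketch below. The values are invented for illustration and are not data from the study; the function name `inverse_efficiency` is hypothetical.

```python
def inverse_efficiency(mean_rt_ms: float, accuracy: float) -> float:
    """Inverse Efficiency Score: mean correct RT divided by proportion correct."""
    if not 0 < accuracy <= 1:
        raise ValueError("accuracy must be a proportion in (0, 1]")
    return mean_rt_ms / accuracy


# Invented example of a speed-accuracy trade-off: the faster condition is also
# less accurate, so the two conditions end up equally efficient.
fast_but_error_prone = inverse_efficiency(900, 0.90)
slow_but_accurate = inverse_efficiency(1000, 1.00)
print(round(fast_but_error_prone), round(slow_but_accurate))
```

Because errors inflate the score, an apparent RT advantage that is obtained at the cost of accuracy is reduced or cancelled in the IES, which is why this measure is informative when a trade-off is suspected.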

The ANOVA Context (Letters/numbers) x Type (Known/unknown) x Position (P2/P3) also showed a significant triple interaction between all factors, F(1,23) = 9.459; p < .005. For letters, the difference between known (words) and unknown (non-words) was not significant at P2 (-41ms), t(23) = -1.335; p = .195, but was significant at P3 (-303ms), t(23) = -5.035; p < .001. For numbers, the reverse pattern emerged, with a significant difference between known (dates) and unknown (non-dates) at P2 (-114ms), t(23) = -2.668; p = .014, and not at P3 (+18ms), t(23) = .190; p = .851.

Figure 2

Digit and Letter Identification Scores

Note. Reaction times (A., top row left), Accuracy (B., top row right), and Inverse Efficiency Scores (C., bottom) for each stimulus context (numbers, letters) and each internal position (P2: X?XX; P3: XX?X) for known (dates and words respectively, in black) and unknown (non-dates and non-words, in grey) character strings. Known items are processed faster at P2 for numbers, and at P3 for words (Context x Type x Position, p < .005, see text).

Discussion

In the current study, we investigated the possible existence of an Arabic input lexicon for encyclopedic numbers, automatically accessed on the basis of the input form, in healthy individuals. Evidence for such an encyclopedic number lexicon was until now lacking, as previous findings could be explained by explicit semantic elaboration and processing (Gullick & Temple, 2011), or by activation of semantic and verbal associates (Alameda et al., 2003). Here, testing historians with high knowledge of dates, we compared the processing of dates to unknown numbers with the Reicher-Wheeler paradigm borrowed from the reading literature (word superiority effect, WSE, Reicher, 1969; Wheeler, 1970).

In this experiment, participants were not informed that there were two categories of numbers, dates and non-dates, or two categories of letter-strings. They had to identify which of two characters had been presented in the preceding stimulus. Overall, the results show a benefit for known items on character identification. On reaction times, participants were faster for known items, independently of these items being words or dates (no interaction with context). On accuracy, performance was better for known than unknown items, but this was modulated by position differently for words or dates. The same was found on IES: the benefit for known over unknown items was mainly occurring at position 2 for numbers, and at position 3 for words.

Before discussing the potential reasons for these position differences, we would like to highlight our main novel finding: dates are processed differently than non-dates, in the same way as words are by comparison to non-words, in a totally implicit task where participants were not required to judge items as dates, or to activate any semantic knowledge about dates. Since the WSE is classically interpreted as reflecting the impact of stored orthographic representations in the context of word processing (Grainger et al., 2003), the results of this experiment confirm that dates may be considered as stored items in an Arabic input lexicon. This idea was first put forward in neuropsychological studies, such as the one by Cohen et al. (1994), whose patient was better at reading familiar Arabic numbers (e.g., the dates 1789, 1918) than unfamiliar ones. Also, even for familiar numerals that he could not read, he made comments that indicated he had access to the (non-numerical) meaning of the item. The authors, by analogy with word reading, proposed the existence of a “surface” reading route and a “deep” (i.e., nominal semantic) reading route, that would involve an input lexicon for encyclopedic Arabic numbers. Other case-studies showed a similar dissociation, with better performance for encyclopedic than unknown numbers (Cipolotti, 1995; Delazer & Girelli, 1997), as well as the reverse (Cappelletti et al., 2008).

Besides patient studies, previous data on healthy individuals suggested different memory retrieval processes for personally familiar numbers (Dickson & Federmeier, 2018), distinct brain regions activated by famous dates when processed as events rather than as quantities (Gullick & Temple, 2011), as well as the existence of an input lexicon for encyclopedic numbers (Alameda et al., 2003). While the two former studies relied on explicit activation of the numbers’ semantics, either personal (Dickson & Federmeier, 2018) or encyclopedic (Gullick & Temple, 2011), the latter relied on implicit priming effects (Alameda et al., 2003). Interestingly, facilitation effects due to related primes were shown on multidigit Arabic numbers, both in reading aloud and in lexical decision tasks. However, the use of a verbally-related prime could not preclude that the priming benefits were due to the activation of a pre-assembled expression in phonological format, or to links that could exist in an orthographic input lexicon for words associated with the Arabic number. Here, this potential explanation does not hold, because we did not use any verbal primes or labels and no information or context. Thus, contrary to previous studies on healthy individuals (Alameda et al., 2003; Gullick & Temple, 2011), we show that dates triggered automatic activation of stored number forms. Our results clearly show that the link between known Arabic numbers and encyclopedic knowledge does not systematically require extensive and slow top-down elaboration of the stimulus, and does not require activation of an associated verbal label (either explicit, e.g., the patient of Delazer & Girelli, 1997, or implicit, e.g., Alameda et al., 2003). Our experiment shows that dates trigger a faster identification of constituent digits, just as words speed up the processing of constituent letters, and unlike unknown items. This can be understood in an interactive-activation or cascaded model of reading, where items that possess an entry in the lexicon feed back activation to the level of digit identification (McClelland & Rumelhart, 1981). ERP studies indeed showed that lexical factors, such as frequency (Assadollahi & Pulvermüller, 2003; Hauk & Pulvermüller, 2004; Sereno et al., 1998) or lexicality (Martin et al., 2006), may facilitate letter string processing between 160ms and 200ms.

In a general cognitive architecture of number reading, some authors have debated the existence of several routes for reading numbers (overview in Brysbaert, 2018): a purely asemantic route (Dehaene, 1992, 2011; Seron & Deloche, 1984), a semantic route activating cardinal/ordinal value of numbers (Brysbaert, 1995; McCloskey, 1992), or a lexical route for encyclopedic numbers (Cipolotti, 1995; Cohen et al., 1994).

Discrepancies in findings, and thus in theoretical proposals, might be reconciled by considering the co-existence of these multiple routes for reading numbers, triggered both by the task and the nature of the stimuli. Indeed, one proposal stated that Arabic numbers resemble pictures more than words (Fias et al., 2001), therefore accessing semantics (magnitude) in a direct and obligatory manner. Evidence in favor of this idea initially came from Stroop-like paradigms, on 1-digit Arabic numbers. When participants had to name number words, they were not affected by the (in)congruent presence of a digit; but when they had to name an Arabic number they were affected by the presence of a number word. This suggested that words could be read totally asemantically, using the sublexical conversion route (letters-sounds), while digits could not, and automatically activate their magnitude representation. This might indeed be true for reading 1-digit Arabic numbers. Since then, however, other experiments have provided evidence that 2-digit numbers are not processed holistically as a picture, but are decomposed (Nuerk et al., 2001), and also that they can be read by a direct route relating Arabic input and verbal output forms (Herrera & Macizo, 2012; Ratinckx et al., 2005). Finally, familiar numbers trigger a different route than unknown numbers: this was suggested by patient studies and we add here evidence that it might be the case in healthy individuals, even in an implicit task. In this lexical route, the coding of internal elements obeys the same law as letters in words: digit identity and position within the chain are processed in parallel, like the letters that constitute a word.

As concerns the position effects reported here, serial position effects are widely described in the literature on letter identification, where several constraints play a role in explaining better or worse performance for identifying/reporting letters depending on their position in the sequence. To examine position effects, many studies used 5-elements strings: the classical finding is a W-shape in accuracy and/or a M-shape on reaction times, both reflecting better performance on the central and the external letters, and worse performance at the second and fourth positions. This typical variation of performance according to position reflects the combination of several factors. First, the structural properties of our visual system induce better identification at the fovea (thus, the central position), and a decrease of acuity with eccentricity towards external positions (for instance, Nazir et al., 2004). Second, external positions benefit from a lack of lateral crowding as they are flanked by only one other character and not two (inter-letter interference; Bouma, 1970, 1973). Finally, letter identification is better on the left (especially the first letter) than on the right of central fixation in languages read from left to right, which would reflect an attentional bias (Aschenbrenner et al., 2017; Nazir et al., 2004; Pitchford et al., 2008) or an adaptative asymmetry in the receptive fields for retinotopic letter detectors, favoring the leftmost information and resulting from an optimization for processing initial letters in languages read from left to right (Chanceaux & Grainger, 2012; Grainger & Van Heuven, 2003; Tydgat & Grainger, 2009). There is a debate in the literature regarding the origins of these effects and also, their specificity to letters or stimuli typically processed in strings (letters and digits; Mason, 1982; Tydgat & Grainger, 2009), or to all types of visual objects (Gomez et al., 2008). 
Here, although we did not test external positions, we observed an overall advantage of position 2 over position 3, both for numbers and letters (faster RT, better accuracy). This is expected given the general pattern described above, because the central fixation was located between positions 2 and 3: it shows a slight advantage for the left over the right of the fovea. What differs between numbers and letters is the location of the benefit due to lexical status (known vs. unknown): for letters it is significant only at position 3, while for numbers it is significant only at position 2. For words, a WSE (better performance in identifying letters in words than in non-words) at position 3 only was reported in another study using 4-element stimuli with a CVCV structure, although in a shallow orthography (Italian; Ripamonti et al., 2018). For numbers, we can only speculate on a few reasons why they might have driven a specific pattern at position 2. First, because we used 4-digit dates, most of them started with the digit “1”; therefore, the first position was not very informative. Thus, the second position was the first character of the string to be informative, that is, to potentially activate items as distinct from one another in memory, which could induce better performance, similar to the first-letter advantage in words. In this perspective, the finding at position 2 would only reflect a list-context effect, and we should not observe similar findings if the first digit varied across items (which is not possible with 4-digit dates but would be possible using 3-digit dates). It would also mean that a similar advantage at position 2 could be found in words when using a list of items all starting with the same letter. A second potential explanation is linked to the specific processes deployed to read Arabic numbers. As stated in the introduction, multidigit numbers are parsed in triplets from the rightmost position; therefore, the second digit of a 4-character string was also the first digit of the triplet and, although this is speculative, the first element of a triplet might have a specific status. However, this would mean that both the lexical recognition process and the transcoding process are triggered in parallel. Third, several dates included the digit 9 in second position (8/18). Among the 6 items that were retained for analysis (see Methods), only two included the alternative choice on that position. To exclude the possibility that these two items could have induced a generally faster RT at position 2 (because the digit 9 would be expected at that position), we checked RT for numbers containing 9 at position 2. They were actually slower (1117 ms) than the RT for all other items not containing 9 at this position (1057 ms); thus, if anything, these results go against the idea of faster RT at position 2 for dates because of the repeated presence of the digit 9.
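
The positional argument in the second explanation can be made concrete with a minimal sketch. It assumes only what is stated above, namely that a multidigit string is grouped into triplets starting from the rightmost digit. The function names `parse_triplets` and `triplet_initial_position` are hypothetical; they are not part of the study's materials or of any published transcoding model.

```python
from typing import List


def parse_triplets(digits: str) -> List[str]:
    """Group a digit string into chunks of up to three digits,
    starting from the rightmost position (assumed parsing rule)."""
    groups = []
    end = len(digits)
    while end > 0:
        start = max(0, end - 3)
        groups.append(digits[start:end])
        end = start
    # Groups were collected right to left; restore reading order.
    return groups[::-1]


def triplet_initial_position(digits: str) -> int:
    """Return the 1-based position of the first digit of the
    rightmost triplet (position 1 for strings of three digits or fewer)."""
    return max(1, len(digits) - 2)


if __name__ == "__main__":
    print(parse_triplets("1789"))            # groups for a 4-digit date
    print(triplet_initial_position("1789"))  # position of the triplet's first digit
```

For a 4-digit date such as 1789, the sketch yields the groups "1" and "789", so the first element of the triplet falls at position 2, the position at which the date-superiority effect was observed.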

To sum up, in this study we found a significant effect of the status of numbers on character identification, with a “date superiority effect”, i.e., an increased performance (faster reaction times) for recognizing digits belonging to dates by comparison to non-dates. This effect was, however, present only at position 2 in the string, and several interpretations of this position effect have been put forward. There are some limitations to our study. First, we used a relatively small sample of participants (N = 24) and items (N = 18), due to the specific domain of knowledge that was tested (History). This limits the statistical power of our study (Brysbaert, 2019). Second, although competitors for dates were also known dates (see Appendix), we could not match them in frequency (as we did for words and their competitors) because such frequency lists do not exist, and we did not test participants' knowledge of these competitors in the matching task. Thus, it may be that some items were less familiar or frequent than others.

Further studies should investigate how our findings can be generalized to other types of encyclopedic numbers besides dates, like labels (Boeing 747), codes or phone numbers. Here we tested experts with dates, for the sake of experimental homogeneity, but in everyday life we all have personal number facts that are presumably stored in such a lexicon, and they may vary in length, chunking structure and type of verbal counterparts. For instance, stock managers label products by strings of 3 digits where each digit has a specific meaning, while PIN codes or phone numbers are chunked in strings of 2 or 3 digits. Further research should aim at understanding how such a lexicon is organized: what defines neighborhood between Arabic numbers? Could we find evidence of similar principles of competition and activation/inhibition as in word processing? How stable are these representations? How are they influenced by frequency and length? Are both reading routes (lexical and non-lexical) triggered in parallel? Is the magnitude information nevertheless activated for these numbers? These questions show that (re)opening this field of research goes beyond the simple conclusion that Arabic numbers might be stored in an Arabic input lexicon; they point to the broader issue of the generalizability of the principles that have been discovered in the domain of word reading and the orthographic lexicon.

Taken together, our results show the existence of a lexical route for reading encyclopedic numbers in healthy participants. Activation of this Arabic input lexicon can be observed without using any verbal priming, and it occurs during implicit processes (i.e., not entailing any decision-making), indicating that it does not require extensive top-down elaboration.

Funding

The first author was supported at the time of data collection by the Face perception INTER project (INTER/FNRS/15/11015111) funded by the Luxembourgish Fund for Scientific Research (FNR, Luxembourg) and by the Belgian Funds for Scientific Research (FNRS; Grant nr: PDR T.0207.16 FNRS).

Acknowledgments

The authors thank Michel Fayol, reviewer Marc Brysbaert and an anonymous reviewer for their helpful comments on previous versions of this manuscript. They also thank Fanny Golinvaux for her help in data collection.

Competing Interests

The authors have declared that no competing interests exist.

References

  • Alameda, J. R., Cuetos, F., & Brysbaert, M. (2003). The number 747 is named faster after seeing Boeing than after seeing Levi’s: Associative priming in the processing of multidigit Arabic numerals. Quarterly Journal of Experimental Psychology Section A: Human Experimental Psychology, 56(6), 1009-1019. https://doi.org/10.1080/02724980244000783

  • Ansari, D. (2007). Does the parietal cortex distinguish between “10,” “ten,” and ten dots? Neuron, 53(2), 165-167. https://doi.org/10.1016/j.neuron.2007.01.001

  • Aschenbrenner, A. J., Balota, D. A., Weigand, A. J., Scaltritti, M., & Besner, D. (2017). The first letter position effect in visual word recognition: The role of spatial attention. Journal of Experimental Psychology: Human Perception and Performance, 43(4), 700-718. https://doi.org/10.1037/xhp0000342

  • Assadollahi, R., & Pulvermüller, F. (2003). Early influences of word length and frequency: A group study using MEG. Neuroreport, 14(8), 1183-1187. https://doi.org/10.1097/00001756-200306110-00016

  • Barrouillet, P., Camos, V., Perruchet, P., & Seron, X. (2004). ADAPT: A developmental, asemantic, and procedural model for transcoding from verbal to Arabic numerals. Psychological Review, 111(2), 368-394. https://doi.org/10.1037/0033-295X.111.2.368

  • Bouma, H. (1970). Interaction effects in parafoveal letter recognition. Nature, 226, 177-178. https://doi.org/10.1038/226177a0

  • Bouma, H. (1973). Visual interference in the parafoveal recognition of initial and final letters of words. Vision Research, 13(4), 767-782. https://doi.org/10.1016/0042-6989(73)90041-2

  • Brysbaert, M. (1995). Arabic number reading: On the nature of the numerical scale and the origin of phonological recoding. Journal of Experimental Psychology: General, 124(4), 434-452. https://doi.org/10.1037/0096-3445.124.4.434

  • Brysbaert, M. (2018). Numbers and language: What’s new in the past 25 years? In A. Henik & W. Fias (Eds.), Heterogeneity of function in numerical cognition (pp. 3-26). Academic Press.

  • Brysbaert, M. (2019). How many participants do we have to include in properly powered experiments? A tutorial of power analysis with reference tables. Journal of Cognition, 2(1), Article 16. https://doi.org/10.5334/joc.72

  • Cappelletti, M., Jansari, A., Kopelman, M., & Butterworth, B. (2008). A case of selective impairment of encyclopaedic numerical knowledge or “when December 25th is no longer Christmas day, but ‘20 + 5’ is still 25.” Cortex, 44(3), 325-336. https://doi.org/10.1016/j.cortex.2006.07.005

  • Chanceaux, M., & Grainger, J. (2012). Serial position effects in the identification of letters, digits, symbols, and shapes in peripheral vision. Acta Psychologica, 141(2), 149-158. https://doi.org/10.1016/j.actpsy.2012.08.001

  • Chase, C. H., & Tallal, P. (1990). A developmental, interactive activation model of the word superiority effect. Journal of Experimental Child Psychology, 49(3), 448-487. https://doi.org/10.1016/0022-0965(90)90069-K

  • Cipolotti, L. (1995). Multiple routes for reading words, why not numbers? Evidence from a case of Arabic numeral dyslexia. Cognitive Neuropsychology, 12(3), 313-342. https://doi.org/10.1080/02643299508252001

  • Coch, D., Mitra, P., & George, E. (2012). Behavioral and ERP evidence of word and pseudoword superiority effects in 7- and 11-year-olds. Brain Research, 1486, 68-81. https://doi.org/10.1016/j.brainres.2012.09.041

  • Cohen, L., Dehaene, S., & Verstichel, P. (1994). Number words and number non-words: A case of deep dyslexia extending to Arabic numerals. Brain, 117(2), 267-279. https://doi.org/10.1093/brain/117.2.267

  • Coltheart, M., Rastle, K., Perry, C., Langdon, R., & Ziegler, J. (2001). DRC: A dual route cascaded model of visual word recognition and reading aloud. Psychological Review, 108(1), 204-256. https://doi.org/10.1037/0033-295X.108.1.204

  • Dehaene, S. (1992). Varieties of numerical abilities. Cognition, 44(1–2), 1-42. https://doi.org/10.1016/0010-0277(92)90049-N

  • Dehaene, S. (2011). The number sense: How the mind creates mathematics (Revised and updated ed.). Oxford University Press.

  • Dehaene, S., & Cohen, L. (1995). Towards an anatomical and functional model of number processing. Mathematical Cognition, 1, 83-120.

  • Delazer, M., & Girelli, L. (1997). When “Alfa Romeo” facilitates 164: Semantic effects in verbal number production. Neurocase, 3(6), 461-475. https://doi.org/10.1080/13554799708405022

  • Dickson, D. S., & Federmeier, K. D. (2018). Your favorite number is special (to you): Evidence for item-level differences in retrieval of information from numerals. Neuropsychologia, 117, 253-260. https://doi.org/10.1016/j.neuropsychologia.2018.05.018

  • Dotan, D., & Friedmann, N. (2018). A cognitive model for multidigit number reading: Inferences from individuals with selective impairments. Cortex, 101, 249-281. https://doi.org/10.1016/j.cortex.2017.10.025

  • Dotan, D., & Friedmann, N. (2019). Separate mechanisms for number reading and word reading: Evidence from selective impairments. Cortex, 114, 176-192. https://doi.org/10.1016/j.cortex.2018.05.010

  • Fias, W., Reynvoet, B., & Brysbaert, M. (2001). Are Arabic numerals processed as pictures in a Stroop interference task? Psychological Research, 65(4), 242-249. https://doi.org/10.1007/s004260100064

  • Gomez, P., Ratcliff, R., & Perea, M. (2008). The Overlap Model: A model of letter position coding. Psychological Review, 115(3), 577-600. https://doi.org/10.1037/a0012667

  • Grainger, J., Bouttevin, S., Truc, C., Bastien, M., & Ziegler, J. (2003). Word superiority, pseudoword superiority, and learning to read: A comparison of dyslexic and normal readers. Brain and Language, 87(3), 432-440. https://doi.org/10.1016/S0093-934X(03)00145-7

  • Grainger, J., & Van Heuven, W. J. B. (2003). Modeling letter position coding in printed word perception. In P. Bonin (Ed.), Mental lexicon: “Some words to talk about words” (pp. 1-23). Nova Science Publishers.

  • Grossi, G., Murphy, J., & Boggan, J. (2009). Word and pseudoword superiority effects in Italian-English bilinguals. Bilingualism, 12(1), 113-120. https://doi.org/10.1017/S1366728908003891

  • Gullick, M. M., & Temple, E. (2011). Are historic years understood as numbers or events? An fMRI study of numbers with semantic associations. Brain and Cognition, 77(3), 356-364. https://doi.org/10.1016/j.bandc.2011.09.004

  • Hauk, O., & Pulvermüller, F. (2004). Effects of word length and frequency on the human event-related potential. Clinical Neurophysiology, 115(5), 1090-1103. https://doi.org/10.1016/j.clinph.2003.12.020

  • Herrera, A., & Macizo, P. (2012). Semantic processing in the production of numerals across notations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 38(1), 40-51. https://doi.org/10.1037/a0024884

  • Hildebrandt, N., Caplan, D., Sokol, S., & Torreano, L. (1995). Lexical factors in the word-superiority effect. Memory & Cognition, 23(1), 23-33. https://doi.org/10.3758/BF03210554

  • Martin, C. D., Nazir, T., Thierry, G., Paulignan, Y., & Démonet, J. F. (2006). Perceptual and lexical effects in letter identification: An event-related potential study of the word superiority effect. Brain Research, 1098(1), 153-160. https://doi.org/10.1016/j.brainres.2006.04.097

  • Mason, M. (1982). Recognition time for letters and nonletters: Effects of serial position, array size, and processing order. Journal of Experimental Psychology: Human Perception and Performance, 8(5), 724-738. https://doi.org/10.1037/0096-1523.8.5.724

  • McClelland, J. L., & Rumelhart, D. E. (1981). An interactive activation model of context effects in letter perception: I. An account of basic findings. Psychological Review, 88(5), 375-407. https://doi.org/10.1037/0033-295X.88.5.375

  • McCloskey, M. (1992). Cognitive mechanisms in numerical processing: Evidence from acquired dyscalculia. Cognition, 44(1–2), 107-157. https://doi.org/10.1016/0010-0277(92)90052-J

  • McCloskey, M., Caramazza, A., & Basili, A. (1985). Cognitive mechanisms in number processing and calculation: Evidence from dyscalculia. Brain and Cognition, 4(2), 171-196. https://doi.org/10.1016/0278-2626(85)90069-7

  • Nazir, T. A., Ben-Boutayab, N., Decoppet, N., Deutsch, A., & Frost, R. (2004). Reading habits, perceptual learning, and recognition of printed words. Brain and Language, 88(3), 294-311. https://doi.org/10.1016/S0093-934X(03)00168-8

  • Nieder, A. (2004). The number domain – Can we count on parietal cortex? Neuron, 44(3), 407-409. https://doi.org/10.1016/j.neuron.2004.10.020

  • Noël, M. P., & Seron, X. (1995). Lexicalization errors in writing Arabic numerals – A single-case study. Brain and Cognition, 29(2), 151-179. https://doi.org/10.1006/brcg.1995.1274

  • Nuerk, H.-C., Weger, U., & Willmes, K. (2001). Decade breaks in the mental number line? Putting the tens and units back in different bins. Cognition, 82(1), B25-B33. https://doi.org/10.1016/S0010-0277(01)00142-1

  • Pitchford, N. J., Ledgeway, T., & Masterson, J. (2008). Effect of orthographic processes on letter position encoding. Journal of Research in Reading, 31(1), 97-116. https://doi.org/10.1111/j.1467-9817.2007.00363.x

  • Power, R., & Dal Martello, M. (1997). From 834 to eighty thirty four: The reading of Arabic numerals by seven-year-old children. Mathematical Cognition, 3(1), 63-85. https://doi.org/10.1080/135467997387489

  • Ratinckx, E., Brysbaert, M., & Fias, W. (2005). Naming two-digit Arabic numerals: Evidence from masked priming studies. Journal of Experimental Psychology: Human Perception and Performance, 31(5), 1150-1163. https://doi.org/10.1037/0096-1523.31.5.1150

  • Reicher, G. M. (1969). Perceptual recognition as a function of meaningfulness of stimulus material. Journal of Experimental Psychology, 81(2), 275-280. https://doi.org/10.1037/h0027768

  • Ripamonti, E., Luzzatti, C., Zoccolotti, P., & Traficante, D. (2018). Word and pseudoword superiority effects: Evidence from a shallow orthography language. Quarterly Journal of Experimental Psychology, 71(9), 1911-1920. https://doi.org/10.1080/17470218.2017.1363791

  • Rumelhart, D. E., & McClelland, J. L. (1982). An interactive activation model of context effects in letter perception: II. The contextual enhancement effect and some tests and extensions of the model. Psychological Review, 89(1), 60-94. https://doi.org/10.1037/0033-295X.89.1.60

  • Sereno, S. C., Rayner, K., & Posner, M. I. (1998). Establishing a time-line of word recognition: Evidence from eye movements and event-related potentials. Neuroreport, 9(10), 2195-2200. https://doi.org/10.1097/00001756-199807130-00009

  • Seron, X., & Deloche, G. (1984). From 2 to Two: An analysis of a transcoding process by means of neuropsychological evidence. Journal of Psycholinguistic Research, 13, 215-236. https://doi.org/10.1007/BF01068464

  • Tydgat, I., & Grainger, J. (2009). Serial position effects in the identification of letters, digits, and symbols. Journal of Experimental Psychology: Human Perception and Performance, 35(2), 480-498. https://doi.org/10.1037/a0013027

  • Wheeler, D. D. (1970). Processes in word recognition. Cognitive Psychology, 1(1), 59-85. https://doi.org/10.1016/0010-0285(70)90005-8

Appendix: Experimental Stimuli

Table A1

Known Letter-Strings (Words)

Position Word Lex.Freq Ortho. Neighb. Bigr.Freq W. competitor (not shown) Lex.Freq Ortho. Neighb. Bigr.Freq
2 BISE 8.11 11 10723 BUSE 2.09 9 5377
2 CAVE 42.09 16 4776 CUVE 2.09 5 2727
2 DIRE 28.92 13 11032 DURE 33.65 8 11380
2 LAME 25.81 12 9840 LIME 2.91 10 10108
2 PAGE 55.88 12 6168 PIGE 1.96 10 4676
2 PIRE 31.89 18 10168 PURE 34.19 9 11408
2 RIRE 112.57 19 13649 RARE 37.23 16 15634
2 RUSE 13.31 10 5651 RASE 6.62 20 10661
2 VASE 26.76 7 6309 VISE 7.50 13 10715
3 DATE 36.62 5 11022 DAME 106.15 12 7270
3 FINE 33.99 11 10129 FIXE 27.97 5 1325
3 JUGE 29.80 6 2652 JUPE 34.05 6 2652
3 LUNE 63.24 9 4589 LUXE 38.65 3 1551
3 MINE 48.18 13 10861 MISE 36.08 12 11406
3 NOTE 39.32 12 8709 NOCE 6.55 6 4261
3 RACE 28.72 13 8872 RARE 37.23 16 15634
3 ROBE 111.96 7 4749 ROSE 66.62 12 8475
3 VIDE 75.74 9 3906 VITE 351.89 9 9877
M 47.34 11.29 7828.35 48.90 10.12 8221.18
SD 29.39 3.86 3156.78 80.77 4.41 4529.39

Note. Position: position of character to identify; Word: item shown; Lex.Freq: lexical frequency; Ortho. Neighb.: number of orthographic neighbors; Bigr.Freq: Bigram Frequency; W. competitor: word competitor corresponding to the alternative letter presented in that pair.

Table A2

Unknown Letter-Strings

Position Non-word Ortho. Neighb. Bigr.Freq NW. competitor Ortho. Neighb. Bigr.Freq
2 PASE 14 7247 PUSE 8 5442
2 RINE 18 13855 RONE 12 14364
2 JUSE 6 5188 JISE 5 9529
2 LURE 9 11794 LORE 14 11723
2 MIGE 13 5163 MOGE 12 5073
2 NOBE 8 2483 NABE 6 4778
2 RABE 10 7749 RIBE 10 6165
2 ROGE 10 7092 RIGE 13 8157
2 VINE 11 10170 VONE 6 11268
3 BICE 9 4777 BINE 15 10178
3 CATE 16 12790 CACE 13 6180
3 DIVE 8 4492 DIGE 8 5540
3 LASE 13 8771 LAGE 15 7692
3 PIME 11 7583 PITE 14 10081
3 RURE 9 11617 RUME 7 6279
3 VARE 12 11282 VAME 10 7378
3 DADE 12 3777 DARE 14 11174
3 FITE 9 9836 FIME 7 7338
M 10.82 8142.29 10.65 8405.71
SD 3.01 3365.42 3.45 2739.81

Note. Position: position of character to identify; Non-word: item shown; Ortho. Neighb.: number of orthographic neighbors; Bigr.Freq: Bigram Frequency; NW. competitor: non-word competitor corresponding to the alternative letter presented in that pair.

Table A3

Known Numbers

Position | Dates | Competitors (not shown) | Events related to the dates | Events related to the competitors
2 | 1054 | 1254 | Great Schism between the Roman Catholic and the Eastern Orthodox churches | Treaty between Pope Innocent IV and Sicilia
2 | 1453 | 1853 | Fall of Constantinople | Crimean War
2 | 1492 | 1892 | Columbus discovers America | Cholera crisis in Europe
2 | 1715 | 1415 | Death of Louis XIV | Azincourt Battle (Hundred Years' War)
2 | 1789 | 1889 | French Revolution | Universal exhibition in Paris
2 | 1804 | 1204 | Napoléon Bonaparte is crowned Emperor | Siege of Constantinople
2 | 1815 | 1515 | Vienna Treaty | Marignan Battle
2 | 1901 | 1601 | President McKinley is assassinated | Lyon treaty (end of war between France and Savoie)
2 | 1918 | 1618 | End of World War I | Thirty Years' War
3 | 1830 | 1840 | Belgium’s independence | Frederic William IV is King of Prussia
3 | 1914 | 1934 | Sarajevo incident | Anti-parliamentarian demonstration in Paris (Stavisky affair)
3 | 1917 | 1987 | Russian revolution | Black Monday (stock market crash)
3 | 1929 | 1989 | Black Thursday (stock market crash) | Fall of the Berlin Wall
3 | 1939 | 1919 | Beginning of World War II | Peace Treaty after World War I
3 | 1945 | 1975 | End of World War II | Fall of Phnom Penh
3 | 1948 | 1968 | Universal Declaration of Human Rights | Civil unrest in France (May 68)
3 | 2005 | 2015 | Death of Pope John Paul II | Paris terrorist attack
3 | 2011 | 2001 | Arab Spring | New York terrorist attack

Note. Position: position of character to identify; Dates: items shown; Competitors: date competitors corresponding to the alternative digit presented in that pair (not shown); Event related to the date/competitor: historical event for dates and competitors (not shown).
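
The list-context account raised in the Discussion (most dates start with "1", and 8 of 18 have "9" in second position) can be checked directly against the dates listed in Table A3. The following sketch is illustrative only: the names `DATES` and `digit_counts_by_position` are hypothetical, and the code merely tallies digits per position; it does not reproduce any analysis reported in the article.

```python
from collections import Counter
from typing import List

# The 18 dates shown to participants, copied from Table A3.
DATES = [
    "1054", "1453", "1492", "1715", "1789", "1804", "1815", "1901", "1918",
    "1830", "1914", "1917", "1929", "1939", "1945", "1948", "2005", "2011",
]


def digit_counts_by_position(items: List[str]) -> List[Counter]:
    """Count how often each digit occurs at each serial position.
    Assumes all items have the same length."""
    length = len(items[0])
    return [Counter(item[pos] for item in items) for pos in range(length)]


counts = digit_counts_by_position(DATES)
for position, counter in enumerate(counts, start=1):
    # Fewer distinct digits at a position means that position
    # discriminates less between the items of the list.
    print(position, dict(counter.most_common()))
```

On this list, the digit "1" occupies the first position in 16 of the 18 dates, and the digit "9" occupies the second position in 8 of them, in line with the counts mentioned in the Discussion.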

Table A4

Unknown Numbers

Position Non-dates Competitors
2 1012 1112
2 1083 1983
2 1126 1726
2 1165 1265
2 1589 1689
2 1471 1371
2 1356 1556
2 1369 1169
2 1987 1387
3 2014 2004
3 1607 1647
3 1624 1634
3 1702 1742
3 1727 1797
3 1841 1861
3 1856 1876
3 1921 1911
3 1913 1953

Note. Position: position of character to identify; Non-date: item shown; Competitor: non-date competitor corresponding to the alternative digit presented in that pair.