The Effects of Auditory Numerosity and Magnitude on Visual Numerosity Representation: An ERP Study

Numerical representation is not restricted to sensory modalities. It remains unclear how numerosity processing in different modalities interacts within the brain. Moreover, the effect of continuous magnitudes presented in one modality on the representation of numerosity in another modality has not been well studied. By using event-related potential (ERP) and source localization analyses, the present study examined whether there was an interaction between auditory numerosity and continuous magnitude on visual numerosity representation. A visual dot array (visual standard stimulus) was preceded by sound in which numerosity (Multiple-tone vs. One-tone conditions) and magnitude (Loud-tone vs. Soft-tone conditions) information were manipulated. Then, another visual dot array (visual comparison stimulus) was presented, and participants were required to compare the numerosities of the visual dots. Behavioural results revealed that participants showed smaller just-noticeable differences (JNDs) when visual stimuli were preceded by multiple tones than those when visual stimuli were preceded by one tone. The subsequent ERP analysis of visual standard stimuli revealed that the peak amplitude of N1 was more negative under the Loud-tone condition than that under the Soft-tone condition, which could be related to better preparatory attention. Moreover, a significant interaction between auditory numerosity and magnitude was found within the P2p time window for the standard stimuli. Further source localization analysis identified the effect of N1 and P2p to be in the right middle frontal gyrus (MFG) and left inferior parietal lobule (IPL). The present study suggests that numerosity information presented in one sensory modality could spontaneously affect the numerical representation in another modality.

Previous studies have pointed out that numerosity and magnitude information are closely related (Gallistel & Gelman, 2000). However, they have not clarified the potential interaction of the magnitude information and numerosity information of stimuli across sensory modalities. For instance, the loudness of auditory tones or the area of visual dot arrays could be regarded as the magnitude information of stimuli (Gallistel & Gelman, 2000; Leibovich, Katzin, Harel, & Henik, 2017; Rugani, Castiello, Priftis, Spoto, & Sartori, 2017). Few studies have investigated the effect of non-numerical magnitudes, such as the loudness of sound, on the numerical representation across modalities. Therefore, the second aim of the present study was to investigate how continuous magnitudes such as loudness presented in the auditory modality affect the numerosity representation in the visual modality.
In summary, the present study aimed to explore whether the numerosity and magnitude information of an auditory stimulus can influence the representation of visual numerosity. By adopting brain event-related potentials (ERPs), we investigated the brain-electric correlates of cross-modal numerosity representation. A numerosity comparison paradigm was used in the present study. An auditory prime tone sequence was presented, followed by a visual dot array (i.e., the standard stimulus). Then, another visual dot array was presented (i.e., the comparison stimulus). Participants were asked to compare the numerosities of the two visual dot arrays. Critically, the numerosity and magnitude information of the auditory prime stimulus was manipulated by changing the number (Numerosity) and the loudness of the tones (Magnitude), respectively. Tones were presented either at 60 dB SPL (Soft tone) or 80 dB SPL (Loud tone). The tone sequence with a single tone was defined as the One-tone condition, and the sequence with five tones was defined as the Multiple-tone condition. Thus, four kinds of prime sound sequences were involved in the present study, i.e., One-soft, Multiple-soft, One-loud, and Multiple-loud sequences. We used the sound sequence without any tones (No-tone) as the control condition.
We hypothesized that the numerosity or the magnitude of auditory stimuli would influence the numerosity processing of the visual standard stimuli.
Previous studies have examined brain responses when participants process numerosities. A positive ERP component over the posterior parietal region that peaks at approximately 150-250 ms, the P2p, was found to be related to numerical processing (Dehaene, 1996; Dehaene & Brannon, 2011; Fornaciai, Brannon, Woldorff, & Park, 2017; Hyde & Spelke, 2009; Libertus, Woldorff, & Brannon, 2007). The P2p amplitude could reflect the perception of numerosity (Park, DeWind, Woldorff, & Brannon, 2015): a large number could elicit a larger P2p amplitude than a small number. In numerical comparison, the P2p amplitude could be modulated by the numerical distance between two numbers, with a larger amplitude being associated with a closer distance (Dehaene, 1996; Dehaene et al., 1998; Hyde & Spelke, 2009; Libertus et al., 2007). In the present study, if the auditory numerosity influences the visual numerosity representation, a change in the amplitude of P2p is expected. In particular, the P2p amplitude of visual standard stimuli is expected to be larger in the Multiple-tone condition than in the One-tone condition. In addition, given that numerosity and magnitude are closely related and could interact with each other (Cordes, Gelman, Gallistel, & Whalen, 2001; Heinemann et al., 2013; Leibovich et al., 2017; Rugani et al., 2017), we hypothesized that P2p may also be modulated by the magnitude of tones (Loud or Soft).
In the present study, it was hypothesized that the visual standard stimuli that were primed by loud tones could elicit a larger N1 amplitude than those primed by soft tones. Finally, previous neuroimaging studies have investigated the neural basis of numerosity cognition, such as neuropsychological studies (Dehaene, Piazza, Pinel, & Cohen, 2003), positron emission tomography (PET) studies (Fias, Lammertyn, Reynvoet, Dupont, & Orban, 2003), and functional magnetic resonance imaging (fMRI) studies (Dehaene et al., 2003; Pinel, Piazza, Le Bihan, & Dehaene, 2004). Consistent evidence from these studies suggests that the bilateral parietal regions, especially the intraparietal sulcus (IPS), are essential for representing and processing numerical information (Ansari, 2008; Brannon, 2006; Nieder, 2005; Sokolowski, Fias, Bosah Ononye, & Ansari, 2017). In addition, the intraparietal region might also be important for multisensory processing, especially in exchanging auditory and visual information (Regenbogen et al., 2018). Using source localization analysis, we further investigated whether the parietal regions were associated with cross-modal numerosity and magnitude cognition.

Apparatus and Stimuli
Visual stimuli (used as the standard stimulus and comparison stimulus) were arrays of white dots on a black background (non-symbolic number stimuli) presented on a 23-inch LCD monitor (HP ProDisplay 231; resolution: 1920 × 1080; refresh rate: 60 Hz) using E-Prime 2.0 (Psychology Software Tools, 2012). These dot arrays were generated with the method proposed by Gebuis and Reynvoet (2011), which could minimize the influence of some non-numerical properties of the dot arrays that are confounded with numerosity, such as the size of individual dots and the total surface area of the dot arrays. In the present study, the distribution of the dot arrays was limited to a circular region (radius = 5° of the visual angle) at the centre of the screen. The numerosity of the standard stimulus was fixed to 20 dots. For the comparison stimulus, the numerosity was controlled in an equally distributed Napierian-based log space as in previous studies (Dehaene, Izard, Spelke, & Pica, 2008; Nieder, 2018). The numerosity of the comparison stimulus was set to 15, 16, 18, 22, 24, and 27 dots (with a similar range of numerosity as in previous studies, e.g., Fornaciai & Park, 2018; Liu et al., 2013), which varied within a range of ± 0.3 log units relative to the standard numerosity (3.0 in Napierian-based log space).
For example, the distance between 15 and 20 is approximately the same as that between 27 and 20 on the log scale.
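The log-scale spacing described above can be checked with a few lines of arithmetic. The following is a minimal sketch; the function name `log_distance` is introduced here for illustration only and does not come from the original study.

```python
import math

STANDARD = 20                          # numerosity of the standard stimulus
COMPARISONS = [15, 16, 18, 22, 24, 27]  # numerosities of the comparison stimulus

def log_distance(n, standard=STANDARD):
    """Signed distance from the standard in natural-log (Napierian) units."""
    return math.log(n) - math.log(standard)

# Distance of every comparison numerosity from the standard, in log units.
distances = {n: round(log_distance(n), 3) for n in COMPARISONS}
for n, d in distances.items():
    print(f"{n:2d} dots: {d:+.3f} log units")
```

Because dot counts must be integers, the pairs are only approximately symmetric: 15 lies about 0.288 log units below the standard, whereas 27 lies about 0.300 log units above it.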
Auditory stimuli (used as the prime stimulus; pure sine-wave tone, 440 Hz) were presented through an in-ear monitor headphone (TORRAS H1). The magnitude information (loudness) and numerosity information (chunks, du or du-du-du-du-du) of the stimuli were manipulated. The magnitude information of the auditory stimuli is the loudness of the tones. Tones presented at 60 dB SPL were defined as Soft tones, and 80 dB SPL tones were defined as Loud tones. Here, the numerosity of the auditory stimuli was generated by inserting gaps (20 ms) into a continuous pure tone with a duration of 300 ms. Two such auditory stimuli were generated: one with no gaps, called One-tone (du; numerosity = 1), and the other with four gaps distributed equally within the continuous tone, called Multiple-tone (du-du-du-du-du; numerosity = 5). A 5 ms rise and fall time was applied to each gap-separated segment of the multiple tones.
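The construction of the two prime sequences can be sketched as follows. This is an illustration under stated assumptions, not the authors' stimulus code: the 44.1 kHz sampling rate, the linear ramp shape, and the application of ramps to the single tone as well are assumptions, and the name `make_prime` is hypothetical. Absolute level in dB SPL depends on hardware calibration, so only the relative amplitude of the Soft and Loud tones is represented (a 20 dB difference corresponds to a factor of 10 in amplitude).

```python
import math

SAMPLE_RATE = 44100   # Hz; assumed, not stated in the text
TONE_HZ = 440         # pure-tone frequency
TOTAL_MS = 300        # duration of the whole prime
GAP_MS = 20           # duration of each silent gap
RAMP_MS = 5           # rise and fall time (linear ramp assumed)

# 80 dB SPL vs. 60 dB SPL: a 20 dB difference is a tenfold amplitude ratio.
LOUD_TO_SOFT_RATIO = 10 ** (20 / 20)

def make_prime(n_tones, amplitude=1.0):
    """Return (samples, segment_ms) for a prime with n_tones equal segments."""
    n_gaps = n_tones - 1
    seg_ms = (TOTAL_MS - n_gaps * GAP_MS) / n_tones
    seg_len = round(seg_ms * SAMPLE_RATE / 1000)
    gap_len = round(GAP_MS * SAMPLE_RATE / 1000)
    ramp_len = round(RAMP_MS * SAMPLE_RATE / 1000)
    samples = []
    for k in range(n_tones):
        for i in range(seg_len):
            # Phase follows absolute time, as if gaps were cut into one tone.
            t = len(samples) / SAMPLE_RATE
            # Linear onset and offset ramps within each segment.
            env = min(1.0, i / ramp_len, (seg_len - 1 - i) / ramp_len)
            samples.append(amplitude * env * math.sin(2 * math.pi * TONE_HZ * t))
        if k < n_tones - 1:
            samples.extend([0.0] * gap_len)   # silent gap between segments
    return samples, seg_ms

one_loud, _ = make_prime(1, amplitude=1.0)
multiple_loud, segment_ms = make_prime(5, amplitude=1.0)
multiple_soft, _ = make_prime(5, amplitude=1.0 / LOUD_TO_SOFT_RATIO)
```

With four 20 ms gaps in a 300 ms tone, each of the five segments lasts 44 ms.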

Procedure and Experimental Design
Participants sat 60 cm away from the monitor in a dimly lit room and wore an in-ear monitor headphone. They were instructed to maintain a central fixation on the monitor throughout the experiment. At the beginning of each trial, a white fixation cross was presented to the participant for a random duration (from 900 to 1100 ms in steps of 25 ms). Then, an auditory prime was presented for 300 ms. There were four types of auditory primes: One-soft, Multiple-soft, One-loud, and Multiple-loud. We also set up a control condition (No-tone) in which the auditory prime was absent. After the prime (or no prime in the control condition), a blank screen (with a white fixation cross) was presented for a random duration (from 100 to 300 ms in steps of 25 ms). Then, a standard stimulus was presented for 300 ms. Afterwards, a blank screen was presented for a random duration (from 900 to 1100 ms in steps of 25 ms). Then, a comparison stimulus was presented for 300 ms (see Figure 1). Participants were instructed to compare the numerosity of the visual standard and the comparison stimulus with a joystick (BTP-2163X). If the number of visual dots in the comparison array was larger than that in the standard array, participants were required to press the left button; otherwise, they were asked to press the right button. This mapping between the two response keys was counterbalanced across participants (i.e., for other participants, the right button indicated the larger numerosity and the left button the smaller numerosity). There were 600 trials in total, and each prime condition (No-tone, One-soft, Multiple-soft, One-loud, and Multiple-loud) consisted of 120 trials.
The five types of trials were randomly presented. Participants completed eight practice trials before starting the formal experiment, and an accuracy of 75% was required to pass the practice. In the present study, all participants passed the practice within five minutes.
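The trial structure described above can be sketched as a randomized trial list. This is an illustrative reconstruction only: the experiment itself was run in E-Prime, the assumption that the six comparison numerosities were balanced within each prime condition (20 trials per cell) is not stated explicitly in the text, and the names `build_trials` and `jitter` are hypothetical.

```python
import random

PRIMES = ["No-tone", "One-soft", "Multiple-soft", "One-loud", "Multiple-loud"]
COMPARISONS = [15, 16, 18, 22, 24, 27]
TRIALS_PER_PRIME = 120

def jitter(lo_ms, hi_ms, rng, step_ms=25):
    """Draw a random duration from lo_ms to hi_ms (inclusive) in fixed steps."""
    return rng.choice(range(lo_ms, hi_ms + 1, step_ms))

def build_trials(seed=0):
    rng = random.Random(seed)
    # Assumed balancing: 120 trials / 6 comparison numerosities = 20 per cell.
    reps = TRIALS_PER_PRIME // len(COMPARISONS)
    trials = []
    for prime in PRIMES:
        for comparison in COMPARISONS:
            for _ in range(reps):
                trials.append({
                    "prime": prime,
                    "standard": 20,
                    "comparison": comparison,
                    "fixation_ms": jitter(900, 1100, rng),
                    "prime_to_standard_ms": jitter(100, 300, rng),
                    "standard_to_comparison_ms": jitter(900, 1100, rng),
                })
    rng.shuffle(trials)   # the five trial types are randomly intermixed
    return trials

trials = build_trials(seed=1)
print(len(trials), trials[0])
```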
Note. The sequence of stimuli in the experiment. The auditory prime stimulus was manipulated in terms of loudness (magnitude) and number (numerosity). There was also a control condition in which no tone was presented. Dot arrays were presented sequentially to construct the comparison task. In the present study, the numerosity of the standard stimulus was fixed to 20. The numerosity of the comparison stimuli was sampled from the sets of 15, 16, 18, 22, 24, and 27 with equal numbers of trials. Participants were instructed to judge the numerosity of the standard and the comparison stimulus with the left or right button of a joystick. The mapping between the two response keys was counterbalanced across participants.
[900:25:1100] means that the stimulus was presented with a random duration from 900 to 1100 ms in steps of 25 ms.

Behavioural Data Analysis
Participants compared the standard and the comparison dot arrays and chose the arrays with more dots.
Trials with reaction times shorter than 50 ms were excluded. Then, the resulting psychometric data were fitted with a Gaussian cumulative distribution function (CDF) for each participant using psignifit version 4.0 (Schütt, Harmeling, Macke, & Wichmann, 2016), a MATLAB toolbox for Bayesian inference for psychometric functions.
The points of subjective equality (PSEs) were estimated as the comparison numerosity at which the probability of a "more" response was 50%. Just-noticeable differences (JNDs) were defined as half of the difference between the comparison numerosities that yielded 25% and 75% "more" responses. The PSE and JND are two features of the psychometric function and reflect different aspects of the representation of input stimulus features. The psychometric function is an inferential model applied in detection and discrimination tasks. It models the relationship between a given feature of a physical stimulus and the forced-choice responses of a participant. The mean of the psychometric function (that is, the PSE, at which "more" and "less" responses are balanced) represents the accuracy of participants' perception of the stimulus. The standard deviation of the psychometric function, to which the JND is proportional, represents the precision of participants' perception of the stimulus.
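The two definitions above can be made concrete for a fitted cumulative Gaussian. The sketch below is not psignifit and does not perform any fitting; it only shows how the PSE and JND follow from the two parameters of an already fitted function. The parameter values are hypothetical, and the function name `pse_and_jnd` is introduced here for illustration.

```python
import math
from statistics import NormalDist

def pse_and_jnd(mu, sigma):
    """PSE and JND of a cumulative Gaussian with mean mu and SD sigma.

    Both values are expressed in the units of the stimulus axis
    (natural-log numerosity in the present study).
    """
    fit = NormalDist(mu, sigma)
    pse = fit.inv_cdf(0.50)                             # 50% "more" responses
    jnd = (fit.inv_cdf(0.75) - fit.inv_cdf(0.25)) / 2   # half the 25%-75% range
    return pse, jnd

# Hypothetical fit: unbiased observer (mean at ln 20) with sigma = 0.24.
pse, jnd = pse_and_jnd(mu=math.log(20), sigma=0.24)
print(f"PSE = {math.exp(pse):.1f} dots, JND = {jnd:.3f} log units")
```

For a cumulative Gaussian, the PSE equals the mean and the JND defined in this way equals roughly 0.6745 times the standard deviation, so a smaller JND indicates a steeper function and a more precise representation.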
For the statistical analysis of the behavioural data, we first examined whether the PSEs were different from the standard stimulus (20 dots) with a one-sample t-test in each prime condition separately. After that, we further compared the PSEs and JNDs between the auditory prime conditions (One-soft, Multiple-soft, One-loud, and Multiple-loud) and the No-tone condition separately via paired samples t-tests. Then, a 2 (auditory numerosity: One-tone vs. Multiple-tone) by 2 (auditory magnitude: Soft-tone vs. Loud-tone) repeated measures analysis of variance (ANOVA) was implemented for the auditory prime conditions. In the pairwise analysis, p-values were Bonferroni adjusted. In addition, Greenhouse-Geisser correction was applied in all statistical analyses when the sphericity was violated. Statistical differences were considered significant at p < .05. All statistical analyses were performed with JASP (JASP Team, 2018).

Electrophysiological Recording and Preprocessing
Scalp voltages were recorded (sampling rate 500 Hz, online bandpass filter 0.05-100 Hz) from 62 Ag-AgCl scalp electrodes mounted into an electric cap (Easy Cap; FMS, Herrsching-Breitbrunn, Germany) according to the standard international 10-20 system with a NeuroScan SynAmps2 Amplifier (Scan 4.5, Neurosoft Labs, Inc. Virginia, USA). External electrodes were used for the vertical and horizontal electrooculograms. All scalp electrodes were referenced online to an electrode attached to the right earlobe and were re-referenced offline to linked earlobes. The impedances of the electrodes were maintained below 5 kΩ during the recording.
The EEGLAB toolbox (Delorme & Makeig, 2004) was used to preprocess the EEG data. The data were high-pass filtered offline above 0.5 Hz with a one-pass, noncausal, zero-phase windowed sinc FIR filter. A Kaiser window was used in the present study. The maximum passband ripple of this window was set to 0.0015, and the transition bandwidth was set to 2 Hz. Ocular artefacts were corrected throughout the continuously collected data with a procedure based on independent component analysis (Jung et al., 2000). Specifically, the Infomax algorithm was used for the independent component analysis, and the SASICA plugin of EEGLAB (see https://github.com/dnacombo/SASICA) was further used to choose EOG-related components automatically.
The results of the EOG correction were visually confirmed via the comparison of waves before and after correction.
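The Kaiser-window filter specification given above (maximum passband ripple of 0.0015, transition bandwidth of 2 Hz, sampling rate of 500 Hz) determines the window shape parameter and the approximate filter order. The following sketch applies Kaiser's standard design formulas; it is an estimate under those textbook formulas and may not match the exact order that EEGLAB computed for the original analysis. The function name `kaiser_design` is hypothetical.

```python
import math

def kaiser_design(ripple, transition_hz, fs_hz):
    """Estimate attenuation (dB), Kaiser beta, and FIR order from the spec."""
    atten_db = -20 * math.log10(ripple)
    # Kaiser's empirical formula for the window shape parameter.
    if atten_db > 50:
        beta = 0.1102 * (atten_db - 8.7)
    elif atten_db >= 21:
        beta = 0.5842 * (atten_db - 21) ** 0.4 + 0.07886 * (atten_db - 21)
    else:
        beta = 0.0
    # Transition width in radians per sample.
    delta_omega = 2 * math.pi * transition_hz / fs_hz
    order = math.ceil((atten_db - 7.95) / (2.285 * delta_omega))
    return atten_db, beta, order

atten_db, beta, order = kaiser_design(ripple=0.0015, transition_hz=2.0, fs_hz=500.0)
print(f"attenuation ~ {atten_db:.1f} dB, beta ~ {beta:.2f}, order ~ {order}")
```

Under these formulas, the narrow 2 Hz transition band at a 500 Hz sampling rate implies a long filter, on the order of 850 taps, which is typical for a sharp high-pass cutoff below 1 Hz.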

ERP Analysis of Visual Standard Stimuli
For the ERP analysis, data were further low-pass filtered below 40 Hz with the same filter parameters as those used in the preprocessing stage. To determine the effect of the auditory prime stimuli on the visual standard stimuli, ERPs were locked to the visual standard stimulus in the epochs from 200 ms before stimulus onset to 800 ms following stimulus onset. The period from -200 ms to stimulus onset served as the prestimulus baseline in the ERP analysis. Epochs with voltages exceeding ± 70 μV during the baseline or poststimulus periods were excluded. ERPs were constructed by separately averaging the standard stimuli-locked ERPs at the five prime conditions (No-tone, One-soft, Multiple-soft, One-loud, and Multiple-loud). For the visual standard stimuli, the peak amplitudes of N1 (130 to 190 ms) and P2p (220 to 275 ms) were analysed at the P3 and P4 electrodes as in a previous study (Libertus et al., 2007). For the statistical analysis of ERPs elicited by the standard stimuli, repeated measures ANOVA was implemented to determine the interaction of auditory numerosity and auditory magnitude.
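The epoch-level steps described above (baseline correction, amplitude-based rejection, averaging, and peak measurement) can be sketched for a single electrode as follows. This is a simplified illustration, not the EEGLAB pipeline used in the study; the order of baseline correction and rejection is an assumption, and all function names are hypothetical.

```python
FS_HZ = 500            # sampling rate
EPOCH_START_MS = -200  # epoch runs from -200 ms to 800 ms around onset

def ms_to_index(t_ms):
    """Convert a latency relative to stimulus onset into a sample index."""
    return round((t_ms - EPOCH_START_MS) * FS_HZ / 1000)

def baseline_correct(epoch):
    """Subtract the mean of the prestimulus interval (-200 ms to onset)."""
    n_base = ms_to_index(0)
    mean = sum(epoch[:n_base]) / n_base
    return [v - mean for v in epoch]

def is_artifact(epoch, limit_uv=70.0):
    """Flag epochs with any voltage beyond +/- limit_uv."""
    return any(abs(v) > limit_uv for v in epoch)

def average_epochs(epochs):
    """Point-by-point average across accepted epochs of equal length."""
    return [sum(col) / len(epochs) for col in zip(*epochs)]

def peak_amplitude(erp, start_ms, end_ms, polarity):
    """Most negative or most positive value within a latency window."""
    window = erp[ms_to_index(start_ms): ms_to_index(end_ms) + 1]
    return min(window) if polarity == "negative" else max(window)

# Synthetic single-trial data: a dip at 160 ms and a bump at 250 ms.
epoch = [0.0] * 501
epoch[ms_to_index(160)] = -4.0
epoch[ms_to_index(250)] = 3.0
accepted = [e for e in [baseline_correct(epoch)] if not is_artifact(e)]
erp = average_epochs(accepted)
n1 = peak_amplitude(erp, 130, 190, "negative")
p2p = peak_amplitude(erp, 220, 275, "positive")
```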
For visual standard ERPs, correlation analysis and source localization analysis were conducted. The difference in the JNDs and the peak amplitudes of the ERPs between the Multiple-tone and One-tone conditions was calculated. Then, Pearson's correlation coefficient was computed to examine the relationship between the ERPs and behavioural performance. In addition, to localize the cortical regions that were sensitive to the interaction effect between the auditory numerosity and magnitude observed at the scalp level, we adopted the standard low-resolution brain electromagnetic tomography analysis (sLORETA; Pascual-Marqui, 2002).
Although solutions provided by EEG-based source-location algorithms should be interpreted with caution due to their potential risks for error, sLORETA solutions have shown significant correspondence with the results provided by haemodynamic procedures as suggested by a previous study (Hinojosa et al., 2015).
In the present study, the sLORETA software package (Pascual-Marqui, 2002) was implemented to achieve source localization based on the topographic map of scalp voltages. Specifically, the three-dimensional current density was estimated for each condition (No-tone, One-soft, Multiple-soft, One-loud, and Multiple-loud) and each participant. Subsequently, the voxel-based, whole-brain sLORETA images (6239 voxels) were compared between conditions of interest using the nonparametric mapping (SnPM) tool. As explained by Nichols and Holmes (2002), this nonparametric methodology inherently avoids multiple comparison-derived problems and does not require any assumption of normality. Voxels that showed significant differences between conditions (t-statistic on log-transformed data, one-tailed corrected p < .05) were localized to specific anatomical regions and Brodmann areas (BAs).
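The logic by which a permutation test controls for multiple comparisons across voxels can be sketched with a max-statistic sign-flip test on paired differences. This is a toy illustration of the general approach described by Nichols and Holmes (2002), not the implementation in the sLORETA software; it enumerates all sign flips exactly, which is feasible only for small samples, and the function names are hypothetical.

```python
import itertools
import math

def t_stat(diffs):
    """One-sample t statistic of paired differences against zero."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)

def max_t_permutation(diff_by_voxel):
    """Corrected one-tailed p value per voxel via the max-t null distribution.

    diff_by_voxel[v][s] is the condition difference for voxel v and
    participant s. The same sign flip is applied to all voxels of a
    participant, which preserves the spatial correlation between voxels.
    Assumes the differences within a voxel do not all share one absolute
    value, so that the variance is never zero.
    """
    n_sub = len(diff_by_voxel[0])
    observed = [t_stat(v) for v in diff_by_voxel]
    null_max = []
    for signs in itertools.product((1, -1), repeat=n_sub):
        null_max.append(max(
            t_stat([s * d for s, d in zip(signs, v)]) for v in diff_by_voxel
        ))
    return [sum(m >= t for m in null_max) / len(null_max) for t in observed]

# Hypothetical data: voxel 0 carries a consistent effect, voxel 1 does not.
data = [
    [1.1, 0.9, 1.3, 0.8, 1.2, 1.0, 1.4, 0.7],
    [0.5, -0.4, 0.3, -0.6, 0.2, -0.1, 0.45, -0.35],
]
p_corrected = max_t_permutation(data)
```

Because each observed statistic is compared against the distribution of the maximum across all voxels, a voxel is declared significant only if it exceeds what the most extreme voxel would show by chance, and no normality assumption is needed.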

ERP Analysis of Visual Comparison Stimuli
To investigate whether the peak amplitude of P2p is influenced by the ratio of the comparison stimuli, we analysed the peak amplitude of P2p (from 220 to 275 ms) at three ratios in epochs with a duration of 700 ms. We used the 200 ms before stimulus onset as the baseline. A 3 (ratio: 20 vs. 15/27; 20 vs. 16/24; 20 vs. 18/22) × 2 (hemisphere: electrode P3 vs. P4) repeated measures ANOVA was implemented on the peak amplitude of P2p. To explore the difference between the auditory prime conditions and the No-tone condition, one-way ANOVA (auditory prime conditions: No-tone, One-soft, Multiple-soft, One-loud, and Multiple-loud) was used.
The Greenhouse-Geisser correction was applied to all statistical analyses when sphericity was violated. For significant main effects or interactions of any factors, pairwise comparisons or simple effects analyses were conducted. Bonferroni correction was used for multiple comparisons. In the present study, all differences were considered significant at p < .05. All statistical analyses were performed with JASP (JASP Team, 2018).

Psychophysical Results
The fitted psychometric curve can be found in Figure 2a. The x-axis shows the numerosity of the comparison stimulus that has been transformed into the natural logarithm scale. The y-axis shows the percentage of trials in which the number of dots in the comparison stimuli was judged to be more than the number of dots in the standard stimulus (i.e., "more" responses).

PSEs
The PSE results are illustrated in Figure 2b. The one-sample t-tests did not reveal any significant drift of PSE away from the standard stimulus (ps > .05) among the auditory prime conditions presented. However, for the No-tone condition, the PSE showed a drift away from the standard stimulus with marginal significance (PSE lower than the standard stimulus, p = .051). Further planned paired t-tests between the auditory prime conditions (One-soft, Multiple-soft, One-loud, and Multiple-loud) and the control condition (No-tone) did not show any significant difference (ps > .05). A 2 × 2 repeated measures ANOVA (auditory numerosity: One, Multiple; auditory magnitude: Soft, Loud) did not reveal any significant main or interaction effects (ps > .05).
[... the JND of the Multiple-tone condition] was smaller than that of the One-tone condition (M = 0.165, SD = 0.039). Other main effects or interaction effects did not reach significance (ps > .05). The JND results indicated that auditory numerosity (One vs. Multiple), rather than auditory magnitude (Soft vs. Loud), affected participants' sensitivity to visual numerosity.
In summary, the behavioural results suggested that auditory numerosity influenced participants' representation of the subsequent visual numerosity.
[...] right) was applied. The results are shown in Figure 3 and Figure 4a. [...] SD = 3.15) than that in the left hemisphere (M = -3.10, SD = 3.12). The interaction effects were nonsignificant, Fs < 2.30, ps > .14.

ERPs Elicited by Visual Standard
P2p waveform for visual standard stimuli (Electrodes: P3, P4; 220-275 ms). First, to compare the difference between conditions with and without tones, a two-way ANOVA of auditory prime condition (No-tone, One-soft, Multiple-soft, One-loud, and Multiple-loud) and hemisphere (left vs. right) was applied. The results are shown in Figure 3 and Figure 4b. The main effect of auditory prime condition was significant, F(4,72) = 5.69, p < .001, ηp² = .24, indicating that the visual standard stimulus elicited a more positive peak amplitude for the

Correlation Analyses for Visual Standard Stimuli
For the significant results that we found in the behavioural analysis and the ERP analysis, further correlation analyses were conducted. Pearson's correlation coefficients were computed to assess the relationship between differences in behavioural performance (JNDs) and differences in ERP peak amplitudes between Multiple-tone and One-tone conditions. The difference between Multiple-tone and One-tone conditions could reflect the effect of auditory numerosity. Considering that the hemisphere effect was reported in the analysis of the N1 peak amplitude, we conducted the analysis at different electrodes (P3 and P4) separately. Scatterplots summarize the results obtained (Figure 4a and Figure 4b, bottom panels). There were significant correlations between behavioural performance and N1/P2p peak amplitudes, ps < .05. The difference in N1 peak amplitudes was negatively correlated with the differences in JNDs at electrode P3, r = -.49, p = .040. The difference in P2p peak amplitudes was negatively correlated with the differences in JNDs at electrode P3, r = -.78, p < .001. One participant was an outlier for the difference in N1 peak amplitude (-1.37 μV) and another participant was an outlier for that of P2p (4.67 μV); therefore, they were not included in the correlation analyses. We identified influential or outlying data points based on Cook's distance and centred leverage plots.
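The correlation computed above relates two difference scores per participant (Multiple-tone minus One-tone) and can be sketched as follows. The helper names are hypothetical, the sample size of 18 used in the usage line is an assumption (the text does not state the number of participants entering each correlation), and the outlier screening by Cook's distance is not reproduced here.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    """t statistic (df = n - 2) used to test r against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Hypothetical per-participant difference scores (Multiple-tone - One-tone).
jnd_diff = [-0.03, -0.01, -0.05, 0.01, -0.02, -0.04]
p2p_diff = [1.2, 0.4, 2.1, -0.3, 0.9, 1.5]
r = pearson_r(jnd_diff, p2p_diff)

# Converting a reported coefficient to t, assuming n = 18 participants.
t_n1 = r_to_t(-0.49, 18)
```

A negative coefficient here means that participants with a larger ERP difference between the two tone conditions tended to show a larger reduction in JND.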

Source Localization Results for Visual Standard Stimuli
The last analytic step consisted of localization of the cortical regions that were responsible for the interactions observed in P2p. To achieve this localization, we analysed P2 waveforms within the range of the P2p time windows (220-275 ms) for each participant and experimental condition (One-soft, Multiple-soft, One-loud, and Multiple-loud) with sLORETA. Then, the voxel-based whole-brain sLORETA images (6239 voxels) were compared with the contrast of (Multiple - One)_Soft - (Multiple - One)_Loud using the SnPM approach. The difference between Multiple-tone and One-tone conditions reflected the effect of auditory numerosity. The activity within the N1 time windows was also investigated for detecting potential responses with the same contrast. As illustrated in Table 1 and Figure 5, these voxels showed that under Soft-tone condition, stronger activity was localized to the left inferior parietal lobule (IPL; peak MNI coordinates: X = -35, Y = -40, Z = 40; BA 40) in the P2p window than that under Loud-tone condition. For the activity of N1, we found that under Soft-tone condition, lower activity was localized to the right middle frontal gyrus (MFG; peak MNI coordinates: X = 30, Y = 5/20, Z = 50/55; BA 6/8) compared to the activity for Loud-tone condition.
Figure 5. Results of sLORETA source localization.
Note. Top panel: Slice view of the source localization results within the time windows of the N1 (130-190 ms) and P2p (220-275 ms) waves for the standard stimulus-locked ERPs. Bottom panel: 3D visualization of the results of the source localization. Blue indicates that the difference between the Multiple-tone and One-tone conditions is larger for the Loud-tone condition than that for the Soft-tone condition. Red indicates that the difference between the Multiple-tone and One-tone conditions is larger for the Soft-tone condition than that for the Loud-tone condition. Colour bars represent the t value of the SnPM test.
[...] p = .024. We further analysed the linear trend of the ratio at electrode P4. Subsequent polynomial contrast analysis of the ratio suggested that there was a linear relationship between the ratio and the peak amplitude of P2p, p = .011; see Figure 6a (the zoomed-in part) and Figure 6c.

Discussion
The present study aimed to investigate whether numerosity and magnitude information presented in one sensory modality (audition) could influence non-symbolic numerosity representation in another modality (vision).
Moreover, by adopting the ERP technique, the neural correlates of the cross-modal influence on numerosity representation were also investigated. Behavioural results demonstrated that auditory numerosity information (One-tone vs. Multiple-tone) could affect the JND of responses to visual dot arrays, but magnitude information did not (Soft-tone vs. Loud-tone). However, visual ERPs revealed that both auditory numerosity and magnitude information could influence the processing of non-symbolic numerosity information in the visual modality. The visual standard stimulus under the Loud-tone condition elicited a more negative N1 peak amplitude than that under the Soft-tone condition. The visual standard stimulus under the One-tone condition elicited a more negative N1 peak amplitude than that under the Multiple-tone condition. Moreover, there was an interaction between auditory numerosity and auditory magnitude information for visual numerosity perception within the time window of the P2p. Finally, by using source localization analysis, we localized the brain activities involved in this interaction to the lateralized activity of the right frontal region (MFG) and the left inferior parietal lobule (IPL) within the time window of the N1 and P2p, respectively.
Our behavioural results showed that JNDs under the Multiple-tone condition were smaller than those under the One-tone condition, indicating that participants represented the non-symbolic numerosity information in the visual modality more reliably when the auditory prime was multiple tones relative to one tone. This finding is in line with the priming distance effect, which suggests that when a number is preceded by a prime number, participants could process numbers more efficiently when the prime-target numerical distance is smaller (Bahrami et al., 2010; Dehaene et al., 1998). In other words, participants may have a more reliable representation of the numerosity information of standard stimuli under the Multiple-tone condition than that under the One-tone condition. In addition, our behavioural results did not show a significant difference in PSE among all priming conditions, indicating that participants were able to perceive the numerosity information of the standard stimuli accurately regardless of priming conditions.
In the present study, an interaction between auditory numerosity and magnitude was observed on the P2p amplitude elicited by the cross-modally primed standard stimuli. This result is in line with previous unimodal studies, which suggest that the representation of numerosity is sensitive to continuous magnitudes (Leibovich & Henik, 2013; Leibovich et al., 2017). Under the Soft-tone condition, visual standard stimuli presented after multiple tones elicited larger P2p amplitudes than those presented after one tone. In contrast, this numerosity effect was not shown under the Loud-tone condition. Previous studies showed that the P2p amplitude was highly related to the processing of numerosity (Dehaene, 1996; Hyde & Spelke, 2009; Libertus et al., 2007).
However, the detailed function of P2p in numerosity cognition is still controversial. Some researchers have suggested that the P2p amplitude reflects the perception of numerosity (Park et al., 2015). That is, a large number elicited a higher P2p amplitude than that for a small number. However, other researchers suggested that the P2p amplitude might reflect the comparative distance effect (Hyde & Spelke, 2012; Libertus et al., 2007). When comparing the numerosity of two numbers, the closer two numbers are, the higher the amplitude of P2p is. In the present study, for the ERPs elicited by the visual standard stimulus, a larger P2p amplitude was observed for the Multiple-tone condition than that for the One-tone condition. These results could be attributed to the larger numerosity being perceived in the Multiple-tone condition than that in the One-tone condition. This processing was spontaneous, as no numerosity comparison between the auditory prime and visual standard stimuli was required. The absence of such an effect under the Loud-tone condition suggested that the perception of the numerosity of visual stimuli might be jointly affected by the numerosity and magnitude of the auditory stimuli. Previous studies have proposed that the amplitude of P2p may reflect the integration effect of numerosity and magnitude based on evidence from the visual modality (Gebuis & Reynvoet, 2012). This integration was also observed in the present cross-modal context. N1 has been reported for both symbolic and non-symbolic stimuli in previous studies on numerosity cognition (Gebuis & Reynvoet, 2013; Hyde & Spelke, 2012; Soltész & Szűcs, 2014). The N1 wave is believed to represent a sensory gain control mechanism because focusing on a visual area would facilitate further perceptual processing of stimuli presented in that area (Luck, Woodman, & Vogel, 2000).
We observed that a larger N1 amplitude was elicited by a visual standard stimulus under the One-tone condition than that under the Multiple-tone condition, indicating that participants allocated more attention resources to the dot array under the One-tone condition. By contrast, the processing of a large number could cost more attention resources than those for a small number (Pomè, Anobile, Cicchini, Scabia, & Burr, 2019). Moreover, a previous study suggested that the loudness of auditory stimuli could modulate participants' preparatory attention. Loud auditory stimuli could increase participants' alerting levels, speed up processing, and decrease the perception threshold for subsequent visual stimuli (Petersen, Petersen, Bundesen, Vangkilde, & Habekost, 2017). Therefore, the enhanced negative peak amplitude of N1 found under the Loud-tone condition in the present study could be attributed to the better preparatory attention elicited by the loud tone (van den Berg et al., 2016). This explanation was further supported by the source localization results of the N1 wave elicited by the visual standard stimulus. The right MFG is related to the allocation of attention resources (Small et al., 2003). Our results showed that more activity was observed in the right MFG for the Loud-tone condition than that for the Soft-tone condition.
The correlation results revealed a covariant relationship between the difference in the JNDs and the differences in the peak amplitudes of the visual standard stimulus ERPs between the Multiple-tone and One-tone conditions, which supports the consistency and reliability of our results. In the behavioural results, we found that participants showed smaller JNDs for the standard stimuli under the Multiple-tone condition than under the One-tone condition. For N1, the peak amplitude was independently related to the numerosity and magnitude of the auditory prime stimuli. Considering that the numerosity effect on JNDs and N1 was not influenced by the magnitude of the auditory prime, we combined the differences in the JNDs and in the peak amplitudes of N1 across the Soft-tone and Loud-tone conditions. A negative correlation between the difference in the JNDs and the difference in the peak amplitude of N1 was observed; that is, a less negative peak amplitude of N1 was found under the Multiple-tone condition than under the One-tone condition when participants showed a smaller JND under the Multiple-tone condition than under the One-tone condition. For P2p, an interaction between auditory numerosity and magnitude was found. The difference in the peak amplitude of P2p between the Multiple-tone and One-tone conditions was significant only under the Soft-tone condition. Therefore, we applied the correlation analysis for P2p only under the Soft-tone condition. Under the Soft-tone condition, a more positive peak amplitude of P2p was found under the Multiple-tone condition than under the One-tone condition. A negative correlation between the difference in JND and the difference in the peak amplitude of P2p was observed; that is, a more positive peak amplitude of P2p was found under the Multiple-tone condition than under the One-tone condition when participants showed a smaller JND under the Multiple-tone condition than under the One-tone condition.
The correlation analysis of N1 and P2p consistently showed that the individuals with obvious condition differences revealed by the group-level ERP analyses (less negative N1 or more positive P2p) also demonstrated obvious condition differences revealed by group-level behavioural analyses (smaller JND).
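The logic of the difference-score correlation described above can be illustrated with a minimal sketch. It rests on several assumptions that the text does not confirm: the correlation is taken to be a Pearson coefficient, "combining" across the Soft-tone and Loud-tone conditions is taken to mean averaging the two difference scores, and every number below is hypothetical data for five imaginary participants, not data from the present study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def multiple_minus_one(multiple, one):
    """Per-participant difference score: Multiple-tone minus One-tone."""
    return [m - o for m, o in zip(multiple, one)]


def average_over_loudness(soft, loud):
    """Assumed way of 'combining' Soft-tone and Loud-tone difference scores."""
    return [(s + l) / 2 for s, l in zip(soft, loud)]


# Hypothetical JNDs (one value per participant), NOT data from the study.
jnd = {
    ("multiple", "soft"): [0.20, 0.22, 0.18, 0.25, 0.21],
    ("one", "soft"): [0.26, 0.24, 0.27, 0.26, 0.25],
    ("multiple", "loud"): [0.21, 0.23, 0.19, 0.24, 0.20],
    ("one", "loud"): [0.27, 0.25, 0.26, 0.25, 0.26],
}

# Hypothetical N1 peak amplitudes in microvolts, NOT data from the study.
n1 = {
    ("multiple", "soft"): [-3.0, -4.0, -2.5, -4.4, -3.4],
    ("one", "soft"): [-4.5, -4.5, -4.6, -4.6, -4.6],
    ("multiple", "loud"): [-3.4, -4.6, -3.0, -4.9, -3.7],
    ("one", "loud"): [-5.0, -5.0, -5.0, -5.0, -5.0],
}


def combined_difference(measure):
    """Multiple-minus-One difference, averaged over Soft and Loud tones."""
    soft = multiple_minus_one(measure[("multiple", "soft")], measure[("one", "soft")])
    loud = multiple_minus_one(measure[("multiple", "loud")], measure[("one", "loud")])
    return average_over_loudness(soft, loud)


jnd_diff = combined_difference(jnd)  # negative = smaller JND after multiple tones
n1_diff = combined_difference(n1)    # positive = less negative N1 after multiple tones

r_n1 = pearson_r(jnd_diff, n1_diff)
print(f"r(JND difference, N1 difference) = {r_n1:.2f}")
```

With these made-up values the coefficient is negative: the participants whose JND shrinks most under the Multiple-tone condition are also those whose N1 becomes least negative, which is the pattern the paragraph describes. For P2p, the same two helper steps would be applied to the Soft-tone difference scores only, without averaging over loudness.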
In the analysis of the P2p amplitude for the visual comparison stimulus, we observed that the P2p amplitude varied as a function of ratio (see Figure 6), i.e., the smaller the ratio, the higher the amplitude of P2p. The ratio represents the distance of the two numbers: a ratio close to one means that the numbers are close in distance.
Therefore, the increase in P2p amplitude following the decrease in the ratio may be due to the change in the distance between two numbers, i.e., the distance effect (Moyer & Landauer, 1967). When comparing two numerosities, the closer the numbers, the higher the amplitude of P2p.
A recent meta-analysis used the activation likelihood estimation (ALE) method to analyse nearly one hundred neuroimaging studies that focused on numerical and non-numerical magnitude processing (Sokolowski et al., 2017). Their results indicated that the processing of symbolic, non-symbolic, and non-numerical magnitudes was related to overlapping activation in the frontal and parietal lobes. This coexistence of overlapping and segmentation of brain activation among symbolic, non-symbolic, and non-numerical magnitudes suggests that number cognition may use both a generalized brain magnitude system and some specialized brain regions for representing numerical magnitudes. Despite the relatively poor spatial resolution of source localization techniques, our source localization results might suggest that different brain areas, such as the left inferior parietal lobule and right middle frontal gyrus, were involved in different processing stages in visual numerosity processing when the perception of visual numerosity was cross-modally influenced by the preceding auditory numerosity.
In summary, the present study demonstrated that visual non-symbolic numerosity perception could be affected by auditory numerosity information. Most importantly, our results revealed an interaction between auditory numerosity and magnitude in the P2p wave elicited by the visual standard stimuli (visual P2p), which was further localized to the left inferior parietal lobule (IPL). Potential neural mechanisms underlying such a cross-modal influence on numerosity processing need to be investigated in the future.

Funding
The work was funded by a grant from the Natural Science Foundation of China (31470978) to Dr. Zhenzhu Yue.