For centuries, philosophers and neuroscientists have questioned whether the use of language and the ability to solve complex problems are related and, if so, what the nature of the relationship between language and thought is. Most of the attention, and controversy, has focused on the claim that the structure of language shapes non-linguistic thinking, the so-called linguistic relativity.
Human intelligence derives directly from brain activity and is closely linked to the natural languages that humans speak. Language, this complex system of sound-meaning connections, not only provides a comprehensive description of the world; its acquisition is also one of the most fundamental human traits, and it is the brain that undergoes the associated developmental changes.
Brain development seems to be non-linear, with sensitive periods of time in which the characteristics of experiences determine different possible outcomes. In fact, during development, the brain not only stores linguistic information but also adapts to the grammatical regularities of language.
Language acquisition might be oversimplified as the way in which the brain learns, perceives, represents and integrates complex sequences of verbal events. The temporal nature of sounds, structural integration, expectations, and cognitive sequencing allow the brain to construct progressively intricate representations of the environment and, with progressive maturity, even aspects of emotion or cognition that are not readily verbalized may be influenced by linguistically based thought processes.
Whether or not new material is verbal, once it appears it is processed by a group of co-acting, specialized neural subsystems that allow us to detect, encode, temporarily hold and compare incoming stimuli with previous material, and to decide what to do next. In this context, it is crucial to understand whether certain characteristics of the stimulus influence its processing and, if so, how these characteristics interact with cognitive processing.
1.1. Evaluating incoming information
Presently, it is generally accepted that incoming information is initially processed in working memory (WM), a theoretical construct that refers to the system or mechanism underlying the maintenance of task-relevant information during the performance of a cognitive task. WM is crucial for a wide range of complex cognitive activities but has a limited capacity [3-5]. Ample empirical evidence indicates that WM plays an important role in the recognition, encoding and manipulation of task-related and concurrent distractor stimuli, while WM load influences attention modulation. In fact, the working memory central executive system [6,7], a concept based on the “Supervisory Attentional System” proposed by Shallice [8,9], is critical for organizing a continuous “background monitoring” that searches for new relevant information, even though the information may be irrelevant to the ongoing act [10,11]. These “background-monitoring” mechanisms seem to be designed to eventually interrupt the current action and trigger an updating of working memory; thus, WM provides goal-directed control of visual selective attention and allows the minimization of interference caused by goal-irrelevant distractors.
The interaction between attention and working memory is bidirectional. It has been postulated that the maintenance of information in working memory is accomplished by directing attention to the neural representations of the information itself, whereas attentional orienting within working memory can retroactively influence maintenance-related activity in functionally specialized posterior areas by engaging selective retrieval functions [see reference 15 for a review]. Although flexible switching between goals may require maintaining higher sensitivity to possibly relevant information, distracting stimuli must be continuously evaluated and suppressed. Therefore, behavioral performance could be sensitive to the “on-line” appearance of environmental distractors, especially when they could be “relevant” to the subject.
1.2. Processing information with affective valence
Several theories posit that emotionally salient stimuli have privileged access during information processing [16,17], which implies that affective stimuli have the capacity to transcend task boundaries, disrupting ongoing processing regardless of whether they are relevant to the current task-set or not.
Numerous studies have addressed the effects of affective stimuli on cognitive processes such as attention, memory and executive functions [18-27]. Indeed, it seems that the appearance of an emotional stimulus might interfere with the processing of other stimuli emerging in its temporal vicinity, mainly because stimuli with emotional content attract attentional resources owing to their adaptive relevance [28-31].
Accepting the assumption that affective stimuli disrupt subsequent cognitive processing raises the question of whether there is an asymmetry between emotional and cognitive processing (i.e., emotional distractors disrupt cognitive processing, but not vice versa). Recently, Reeck and Egner studied this issue using a face-word Stroop protocol adapted to independently manipulate (a) the congruency between target and distractor stimulus features, (b) the affective salience of distractor features, and (c) the task-relevance of emotional compared to non-emotional target features. The authors concluded that task-irrelevant emotional distractors produced performance costs equivalent to those of task-relevant non-emotional distractors, whereas task-irrelevant non-emotional distractors did not produce performance costs comparable to those generated by task-relevant emotional distractors. In other words, this study documented the abovementioned asymmetry between affective and cognitive processing, supporting the notion that affective stimuli are prioritized in human information processing.
On the other hand, increased stimulus arousal has been associated with a more intense response of the defensive motivational system than of the appetitive one [33-35]. Accordingly, the arousal of unpleasant stimuli is comparatively higher, leading to what has been termed the emotional negativity bias [36-38]. In addition, an emotional positivity offset has been postulated for the processing of lower arousal stimuli, as can be inferred from the enhanced processing of pleasant compared to unpleasant stimuli when both are of low arousal [33,39,40].
Emotional words are consistently acknowledged as low arousal stimuli [41-44], particularly in comparison with emotional scenes or faces [45-47]. This effect has been explained by the fact that words depict emotional events less vividly. Interestingly, it has been proposed that verbal material is less capable of disrupting cognitive performance than pictures, particularly when negative words are used, which reinforces the notion that emotional verbal stimuli are associated with lower brain responsivity. However, it seems that arousing verbal stimuli can lead to amygdala activation similar to that induced by emotional faces, pictures, or conditioned stimuli.
1.3. Neural basis of emotional processing
Recent advances in neuroimaging techniques have demonstrated that the amygdala, ventromedial prefrontal cortex (VMPFC), anterior cingulate, insula, nucleus accumbens and basal ganglia are all involved in emotion processing and executive control in some capacity [49-53]. In fact, it has been found that left and right inferior frontal gyrus (IFG) regions differentiate between interference and non-interference trials across neutral and emotional stimuli; that a region of the left anterior insula and right orbital frontal cortex (OFC) can differentiate between interference and non-interference trials for emotional stimuli, regardless of valence; that the insula, OFC and ventral anterior cingulate cortex (ACC) seem to be sensitive to interference resolution for a select valence; and that the left amygdala differentiates emotional and neutral stimuli at encoding and response. Furthermore, the behavioral patterns observed in patients with either a left temporal lesion or a right OFC lesion suggest that the left amygdala and right OFC are both critical to the emotion facilitation effect.
In the last few years, the temporal course of the brain processing of emotional words has been studied through event-related brain potential (ERP) techniques, showing that earlier components such as P120, N170 and P200 (including a variant closely related to N170 and termed the vertex-positive potential, VPP) could be sensitive to the emotional content of the word and the subsequent attentional allocation process [42,45,56], while later ERP changes such as the Early Posterior Negativity, N400 and Late Positive Components could reflect semantic stages of processing [40,57,58].
Even though there is a general consensus that emotionally arousing faces or scenes capture a substantial amount of visual processing resources even when they appear as distractors for a concurrent cognitive task, few data are available on the effect of task-irrelevant emotional words.
A recent study evaluated the effect of written emotional words sharing the scene in which subjects had to perform a simultaneous visual perceptual task. The authors reported emotion effects of task-irrelevant words on the ERPs before 300 ms, but no interference with the visual foreground task was evidenced by task-related steady-state visual evoked potential amplitudes or behavioral data. The results were interpreted as suggesting a specificity of emotion effects on sensory processing that might depend on the information channel from which emotional significance is derived. However, these effects appeared when distractors and task-relevant stimuli shared the same sensory modality (visual) and a similar temporal appearance. Therefore, one could ask whether task-irrelevant emotional words have any effect when the nature of the relevant and task-irrelevant stimuli is equated, while the distractors are delivered through a different sensory modality, immediately preceding the task onset.
1.4. Evaluating the cross-modal influence of verbal affective stimuli on subsequent cognitive processing
Following the previous idea, we studied the effect of auditory emotional words on the ERPs and behavioral performance of a subsequent, highly demanding visual verbal working memory task. The general hypothesis was that the enhanced capture of lexico-semantic processing resources by emotional distractor words could last long enough to interfere with subsequent verbal processing, particularly in situations of high cognitive demand.
Next, various methodological considerations and results from the abovementioned study are detailed, as well as how they could be interpreted in the context of the previously discussed related literature.
Subjects. To explore our hypothesis, 18 healthy, right-handed female university subjects were recruited to participate voluntarily in the experiment (mean age = 26.1 years; SD = 4.1).
Experimental task. Behavioral data and ERPs were obtained during task performance. Subjects performed a dual working memory task. The first part of the task consisted of the serial presentation of two-syllable, four-letter words, each shown for 500 ms. Participants were given explicit instructions to first read the word silently and then, as soon as possible, pronounce aloud an arrangement of letters made up of the second syllable of the word followed by the inverted letters of the first syllable (e.g. BOTE – TEOB). Pronunciation times (RT) and the number of correct responses were measured for all trials. In the second part of the task, subjects were asked to decide, by pressing a key, whether or not a string of four letters (shown for 500 ms) represented the inverted order of the first word that was presented [e.g. BOTE – ETOB (inverse; 50% of the stimuli) or OTEB (not inverse; 50% of the stimuli)]. The time interval between the visual appearance of the word and the string of letters was 1000 ms. One hundred and fifty different high-frequency words were used as stimuli. Figure 1 shows the experimental flow chart.
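The two letter manipulations required by the task can be sketched in a few lines of Python. This is only an illustration, not the software used in the study: the function names are hypothetical, and the sketch assumes that every four-letter word splits into two two-letter syllables, as in the BOTE example.

```python
def syllable_swap(word: str) -> str:
    """Second syllable followed by the reversed first syllable (BOTE -> TEOB)."""
    # Assumption: the word has four letters and two two-letter syllables.
    if len(word) != 4:
        raise ValueError("expected a four-letter word")
    first, second = word[:2], word[2:]
    return second + first[::-1]


def is_full_inverse(word: str, probe: str) -> bool:
    """True when the probe is the whole word spelled backwards (BOTE -> ETOB)."""
    return probe == word[::-1]


print(syllable_swap("BOTE"))            # TEOB
print(is_full_inverse("BOTE", "ETOB"))  # True
print(is_full_inverse("BOTE", "OTEB"))  # False
```

The first function corresponds to the spoken response in the first part of the task, and the second to the key-press decision in the second part.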
Participants were seated comfortably in a quiet, dimly lit room. Visual stimuli were presented on an SVGA monitor (refresh rate: 100 Hz). Words were written in white capital letters (Arial) against a black background, subtending a visual angle of 0.80°. Preceding each one of the trials of the task, a context was presented to the subjects. Five blocks of 50 trials each (a total of two hundred and fifty trials) were configured by combining three randomly distributed main conditions. After each block, subjects had a brief rest period. The presentation order of the blocks was counterbalanced. The two conditions that constituted the trial blocks were:
A (reference trials): Fifty trials in which the WM trial was free from preceding stimuli.
B (auditory preceding stimuli): 200 trials in which the WM trials were preceded by words, delivered binaurally, showing different emotional content:
Ba - positive (50 trials)
Bb - negative (50 trials)
Bc - neutral (50 trials)
C - control (50 trials)
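As an illustration of the trial structure listed above, the following sketch builds five blocks of 50 trials from the five condition labels. The text does not state how the conditions were distributed within each block, so the even split of 10 trials per condition per block is an assumption, as are the function name and the seed. Counterbalancing of block order across subjects is not modelled.

```python
import random
from collections import Counter

CONDITIONS = ["A", "Ba", "Bb", "Bc", "C"]  # condition labels from the text
TRIALS_PER_CONDITION = 50
N_BLOCKS = 5


def build_blocks(seed=0):
    """Return 5 blocks of 50 trials (assumed: 10 trials per condition per block)."""
    rng = random.Random(seed)
    per_block = TRIALS_PER_CONDITION // N_BLOCKS
    blocks = []
    for _block in range(N_BLOCKS):
        block = [cond for cond in CONDITIONS for _rep in range(per_block)]
        rng.shuffle(block)  # random order of conditions inside the block
        blocks.append(block)
    return blocks


blocks = build_blocks(seed=1)
totals = Counter(trial for block in blocks for trial in block)
print(len(blocks), sum(len(block) for block in blocks))  # 5 250
print(totals["A"], totals["Ba"], totals["C"])            # 50 50 50
```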
Auditory stimuli. Auditory stimuli were designed based on the results of a verbal production paradigm performed by 50 female volunteers of similar age and educational level to the participants in the ensuing electrophysiological experiment. They were instructed to write words freely with three different emotional contents: positive, negative and neutral. Subsequently, the 50 most common words in each category were selected and randomly presented to another group of 50 subjects, also of similar age and educational level to the participants, with the instruction to classify them on a continuum from very negative (0) to very positive (10), with 5 as the neutral emotional content.
Later, twenty-five words with averaged scores below 1.5 were selected and labeled as “negative” (e.g. “TONTA”, “SILLY”). Another twenty-five exemplars with averaged scores above 8.5 were selected and labeled as “positive” (e.g. “BONITA”, “PRETTY”), while a further twenty-five words with scores ranging from 4 to 6 points were selected and labeled as “neutral” (e.g. “LADO”, “SIDE”). Both positive and negative words were female adjectives. The 75 resulting words were tape-recorded in a professional studio by a professional broadcaster. The length of the audio files was digitally restricted to 500 ms each. In addition, another 75 audio files were created to be used as controls by inverting the 75 files containing the spoken words, with the aim of keeping similar physical characteristics while avoiding semantic bias.
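The rating cut-offs described above can be expressed as a small labeling rule. The sketch below is illustrative only: the function name, the placeholder word names and the ratings are invented for the example and are not data from the study.

```python
from statistics import mean


def label_word(ratings):
    """Label a word from its ratings on the 0 (very negative) to 10 (very positive) scale."""
    score = mean(ratings)
    if score < 1.5:
        return "negative"
    if score > 8.5:
        return "positive"
    if 4 <= score <= 6:
        return "neutral"
    return None  # falls between the bands and would not be selected


# Invented ratings for hypothetical words, for illustration only.
example_ratings = {
    "word_low": [0, 1, 1, 2],
    "word_high": [9, 10, 9, 8],
    "word_mid": [5, 5, 4, 6],
    "word_between": [7, 7, 8],
}
for word, ratings in example_ratings.items():
    print(word, label_word(ratings))
```

Words whose mean score falls between the bands (for example between 6 and 8.5) receive no label under this rule, which matches the gaps left between the three selection criteria.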
Using the selected audio files, three semi-randomized lists with different emotional content (50 words each) were created; each word was presented twice in its corresponding list. In addition, another list of 50 inverted audio files was created to act as control (C), including 16 inverted positive, 16 inverted negative and 18 inverted neutral spoken words.
All the auditory stimuli were delivered binaurally via COBY (CV-200) headphones (COBY Electronics, Corp., U.S.A) controlled by the software MindTracer (Neuronic S.A., Cuba), at 85 dB SPL. Previous pilot studies were done to guarantee that the sound level used was not only audible but also comfortable.
ERP Acquisition. In all conditions, ERPs were obtained in a time window starting 500 ms after auditory stimulus onset, which corresponded to 300 ms before the onset of the dual WM task, and ending 750 ms after task onset. ERPs were recorded from the Fp1, Fp2, F7, F8, F3, F4, C3, C4, P3, P4, O1, O2, T3, T4, T5, T6, Fz, Cz, and Pz scalp electrode sites, according to the International 10-20 system. The electrooculogram (EOG) was recorded from the outer canthus and infraocular orbital ridge of the right eye.
Electrophysiological recordings were made using 10 mm diameter gold disk electrodes (Grass Type E5GH) and Grass electrode cream. All recording sites were referred to linked mastoids. Interelectrode impedances were below 5 kΩ. EEG and EOG signals were amplified at a bandpass of 0.5–30 Hz (3-dB cutoff points of 6 dB/octave rolloff curves) with a sampling period of 4 ms on the MEDICID-04 system. Single trial data were examined off-line for averaging and analysis.
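The timing figures given in the acquisition description can be turned into sample indices with a little arithmetic. The following is a minimal sketch, with hypothetical function names, that uses only values stated in the text (4 ms sampling period, epoch from 300 ms before to 750 ms after task onset, epoch start 500 ms after auditory onset).

```python
SAMPLING_PERIOD_MS = 4             # sampling period reported in the text
EPOCH_START_MS = -300              # epoch start relative to WM task onset (t0)
EPOCH_END_MS = 750                 # epoch end relative to t0
WINDOW_START_AFTER_AUDIO_MS = 500  # epoch start relative to auditory onset


def sampling_rate_hz(period_ms=SAMPLING_PERIOD_MS):
    """Sampling rate implied by a sampling period in milliseconds."""
    return 1000 / period_ms


def latency_to_index(latency_ms, epoch_start_ms=EPOCH_START_MS,
                     period_ms=SAMPLING_PERIOD_MS):
    """Index of the sample closest to a latency measured from t0."""
    return round((latency_ms - epoch_start_ms) / period_ms)


def auditory_onset_ms(epoch_start_ms=EPOCH_START_MS,
                      offset_ms=WINDOW_START_AFTER_AUDIO_MS):
    """Auditory stimulus onset expressed relative to t0."""
    return epoch_start_ms - offset_ms


print(sampling_rate_hz())     # 250.0
print(latency_to_index(0))    # 75
print(latency_to_index(400))  # 175
print(auditory_onset_ms())    # -800
```

Under these figures the auditory word begins 800 ms before the task stimulus, so a 500 ms audio file ends 300 ms before task onset, and the 250 Hz sampling rate is well above twice the 30 Hz upper limit of the recording bandpass.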
ERP Scoring. Prior to scoring, EEG data were visually corrected for artifacts due to eye movements. Epochs of data on all channels were excluded from averages when the voltage in a given recording epoch exceeded 100 µV on any EEG or EOG channel. In general, 3 to 7 epochs had to be rejected in each condition per subject. Thirty artifact-free correct trials were used to obtain the individual ERP in each condition, reaching a signal-to-noise ratio higher than 1.5 in all cases. Amplitude and latency for the ERP components of focal interest were measured relative to a 100-ms pre-stimulus baseline. All scoring was conducted baseline-to-peak through visual inspection.
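The rejection, baseline and averaging steps described above can be sketched as follows. This is a simplified illustration in plain Python, not the MEDICID-04 processing chain: the data layout (a list of channels per epoch), the function names and the reading of "exceeded 100 µV" as an absolute-value criterion are assumptions, as is taking the baseline as the 100 ms immediately before task onset.

```python
from statistics import mean

REJECT_UV = 100.0  # rejection threshold from the text, in microvolts


def is_clean(epoch, threshold=REJECT_UV):
    """epoch: list of channels, each a list of voltages in microvolts."""
    return all(abs(v) <= threshold for channel in epoch for v in channel)


def baseline_correct(signal, baseline):
    """Subtract the mean of signal[start:stop] from every sample."""
    start, stop = baseline
    base = mean(signal[start:stop])
    return [v - base for v in signal]


def average_erp(epochs, channel, baseline=(50, 75)):
    """Average one channel over artifact-free epochs after baseline correction.

    The default baseline (samples 50 to 74) assumes a 4 ms sampling period and
    an epoch starting 300 ms before task onset, i.e. the 100 ms before t0.
    """
    clean = [ep for ep in epochs if is_clean(ep)]
    corrected = [baseline_correct(ep[channel], baseline) for ep in clean]
    erp = [mean(samples) for samples in zip(*corrected)]
    return erp, len(clean)


# Tiny synthetic example: three one-channel epochs, the second one is rejected.
epochs = [
    [[1.0, 1.0, 3.0, 5.0]],
    [[2.0, 2.0, 4.0, 150.0]],
    [[0.0, 0.0, 4.0, 2.0]],
]
erp, n_used = average_erp(epochs, channel=0, baseline=(0, 2))
print(erp, n_used)  # [0.0, 0.0, 3.0, 3.0] 2
```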
Data Analysis. Repeated-measures analyses of variance (RM-ANOVAs) were used to study behavioral responses and reaction times. Electrophysiological data were analyzed using randomized-block analysis of variance [Conditions x Recording Sites; see reference 61] with average voltage across each time window as the dependent variable. The latency and amplitude of each ERP component were quantified by the highest peak within each respective latency window. Considering the appearance of the task-relevant stimuli as the initial time instant (t0), several time windows were used to examine averaged ERP waveforms. In addition, post-hoc Tukey’s HSD tests were carried out to explore the direction of the differences found.
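The two window-based measures mentioned above, the average voltage across a time window and the highest peak within a latency window, can be sketched as follows. The function names are hypothetical, the defaults assume the 4 ms sampling period and an epoch starting 300 ms before t0, and treating the "highest peak" as the extreme value of a given polarity is an interpretation rather than the authors' stated procedure.

```python
from statistics import mean


def window_slice(erp, start_ms, end_ms, epoch_start_ms=-300, period_ms=4):
    """First index and samples of `erp` between two latencies (inclusive)."""
    first = round((start_ms - epoch_start_ms) / period_ms)
    last = round((end_ms - epoch_start_ms) / period_ms)
    return first, erp[first:last + 1]


def mean_voltage(erp, start_ms, end_ms, epoch_start_ms=-300, period_ms=4):
    """Average voltage across a time window."""
    _, samples = window_slice(erp, start_ms, end_ms, epoch_start_ms, period_ms)
    return mean(samples)


def peak_in_window(erp, start_ms, end_ms, polarity=1,
                   epoch_start_ms=-300, period_ms=4):
    """Latency (ms from t0) and amplitude of the extreme value in a window.

    polarity=1 looks for the most positive sample, polarity=-1 for the most
    negative one.
    """
    first, samples = window_slice(erp, start_ms, end_ms, epoch_start_ms, period_ms)
    offset = max(range(len(samples)), key=lambda i: polarity * samples[i])
    latency_ms = epoch_start_ms + (first + offset) * period_ms
    return latency_ms, samples[offset]


# Synthetic waveform sampled every 4 ms, starting at t0.
erp = [0, 1, 4, 1, -3, -1]
print(peak_in_window(erp, 0, 20, polarity=1, epoch_start_ms=0))   # (8, 4)
print(peak_in_window(erp, 0, 20, polarity=-1, epoch_start_ms=0))  # (16, -3)
print(mean_voltage(erp, 4, 12, epoch_start_ms=0))                 # 2
```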
Behavioral results. The analysis of the correct responses showed significant differences between the experimental conditions (F(4,60)= 4.65, p<0.05). Post-hoc comparisons showed that when the WM task onset was preceded by positive words, the number of correct responses significantly decreased, as compared to negative (Bb) or control (C) auditory stimuli (p<0.05), and also when compared to neutral (Bc) or no (A) preceding stimuli (p<0.01). Although the comparison between negative and control stimuli did not reach statistical significance, the number of correct responses tended to decrease when negative stimuli preceded the task. Table 1 shows the behavioral performance in the experimental task.
Table 1. Experimental task performance by condition (column groups: no auditory stimuli; auditory words with emotional content; auditory control stimuli).
The pronunciation times (the time it took the subject to give the verbal response) were also significantly different across conditions (F(4,60)= 6.18, p<0.05). Post-hoc analyses showed that WM task performance preceded by positive words was significantly slower than that associated with control auditory stimuli or with the absence of any preceding stimulus (p<0.01). In addition, performance preceded by positive words was also slower than that preceded by neutral or negative words (p<0.05). See Table 1.
Electrophysiological results. On visual inspection of the resulting ERP waveforms, three main components were discernible over the fronto-central region when no auditory stimulus preceded the WM task: an early negativity peaking about 80 ms after the first visual stimulus (word) appeared, followed by a prominent P2 component (VPP) reaching its maximum at 170 ms, and a slow negativity with a maximum at about 400 ms at the vertex. Probably because the WM task involved mental manipulation of visual words, a left-lateralized N170 was discernible over the posterior regions. Figure 2 shows the grand-averaged ERPs that correspond to three experimental conditions: no auditory stimuli (A), neutral words (Bc) and reversed words (C: control) preceding the beginning of the dual WM task.
A first ERP analysis was performed with the aim of elucidating the effect of any auditory stimuli preceding the beginning of the WM task on task-related visual ERP waveforms. With this goal, the three time windows that best represented the main ERP changes were analyzed in the locations where they mainly occurred (-300-0, 0-300, and 300-750 ms, respectively). The presentation of the first task-relevant stimuli was taken as the initial time instant (t0).
Randomized-block ANOVAs using two factors [Condition (3: A, Bc and C); Recording Sites (8: Fp1, Fp2, F3, F4, C3, C4, Fz, and Cz)] were performed, showing significant differences for both factors (Condition: F(2,322)=17.94, p<0.0001, and recording sites: F(2,322)=5.10, p<0.0001), in the time window that preceded the beginning of the experimental task. No relevant interaction was found. This finding is compatible with the decrease observed in the slight early negative shift during conditions in which auditory stimuli were delivered.
The analysis of the time window in which N80 and P170 occurred showed significant differences between conditions [(F(2,322)= 39.65, p<0.0001)], recording sites [(F(7,322)= 41.11, p<0.0001)], and their interaction [(F(14,322)= 2.99, p<0.001)]. Post-hoc analysis demonstrated that at fronto-central locations, ERPs reached significantly lower voltage amplitudes when there were no preceding auditory stimuli (A), in comparison with the conditions in which they were presented (Bc and C; p<0.01).
Finally, the analysis of the N400 component also showed significant differences between conditions [(F(2,322)= 26.74, p<0.0001)], recording sites [(F(7,322)= 2.78, p<0.001)] and their interaction [(F(14,322)= 2.66, p<0.01)]. In this case, post-hoc tests showed that N400 was widely distributed, with significantly greater amplitude when no auditory stimuli were present than in the conditions in which they were (p<0.01).
Following an analogous procedure, randomized-block ANOVAs were performed to clarify the effect of the emotional content of the preceding auditory stimuli on task performance. Two factors were analyzed [Condition: positive (Ba), negative (Bb) and neutral words (Bc); and Recording sites (Pz, P3, P4, O1, O2, T5 and T6)] in the time windows that corresponded to the main ERP changes: -300-0 ms, 0-250 ms, 250-500 ms, and 500-750 ms. Figure 3 shows the grand-averaged ERPs that correspond to three experimental conditions preceded by the following auditory stimuli: positive words (Ba), negative words (Bb) and neutral words (Bc).
The ERP analysis of the time elapsed from the auditory stimuli to the task onset (-300-0 ms) showed significant differences between conditions [(F(2,238)= 8.64, p<0.001)] and recording sites [(F(6,238)= 6.55, p<0.0001)] without any relevant interaction. In this case, the experimental conditions preceded by positive and negative words showed significantly lower voltages than the condition preceded by neutral words. This result suggests that visual ERPs are capable of depicting a cross-modal effect of the emotional content of the auditory stimuli even before the beginning of the task-related cognitive effort.
Similarly, the analysis of the time window between the task onset and the subsequent 250 ms showed significant differences only between conditions [(F(2,238)= 9.03, p<0.001)] and recording sites [(F(6,238)= 43.83, p<0.0001)]. In this time window, the conditions preceded by positive and negative words also showed significantly lower voltages than the neutral ones.
The analysis of the time period between 250 and 500 ms after the task onset demonstrated significant differences between conditions [(F(2,238)= 14.43, p<0.0001)] and recording sites [(F(6,238)= 13.68, p<0.0001)] without relevant interactions. As in the previous time windows, lower voltages corresponded to trials preceded by positive and negative words, which reinforces the notion that neural effects caused by distracting affective stimuli might last longer than expected.
Finally, the analysis of the later period of time showed that conditions [(F(2,238)= 15.55, p<0.0001)] and recording sites [(F(6,238)= 3.95, p<0.01)] reached statistical significance, also without any significant interaction. In this case, there were also higher voltages in trials preceded by neutral words.
4.1. Unraveling behavioral results
The analysis of the correct responses achieved while performing the experimental task showed that when the onset of the WM task was preceded by positive words, the number of correct responses significantly decreased, as compared to the alternative conditions, while negative words tended to show the same effect without attaining statistical significance. In addition, the pronunciation times were significantly different across conditions, being particularly longer when positive words preceded the task onset.
Therefore, a possible first conclusion might be that auditory emotional words evoked a sustained interfering effect on task performance, even though they were task-irrelevant. This finding coincides with the report from Sakaki and colleagues, who recently studied the effect of emotional events on the cognitive processing of subsequent stimuli. Although not all types of later cognitive processes are impaired by preceding negative events, they found that the presentation of negative pictures interferes with subsequent semantic processing. In our experiment, the use of female adjectives might have enhanced the arousal elicited by the auditory verbal stimuli with emotional content, particularly considering that the participants in the experiment were female subjects.
It has been reported that when the subjects are instructed to perform a semantic categorization using emotional words as distractors, affective mismatches are detected automatically and modulate a binding of irrelevant information with responses. Furthermore, the notion that the effect of emotional words could be narrowed to certain aspects of cognitive processing is reinforced by recent findings pointing out that negative words interfere with the allocation of dimensional attention to different features of an attended object, but they do not capture spatial or object-based mechanisms of visual attention.
Whatever the effect of emotional stimuli might be, it should depend on their distinctive characteristics. In this regard, it has been repeatedly demonstrated that the arousal associated with the stimulus influences its subsequent processing. Despite the low arousal attributed to verbal stimuli, emotional words might be more arousing than neutral ones, probably due to their intrinsic relevance to individual social suitability. In fact, these differences in word arousal level could explain distinct processing outcomes, as when Guillet and Arndt demonstrated that memory for peripheral words was enhanced when they were encoded in the presence of emotionally arousing taboo words but not when they were encoded in the presence of words that were only negative in valence.
The effects of the emotional valence of the stimuli have been widely discussed in the literature. However, beyond the different neural subsystems underlying emotional recognition, the exact effect of emotional words on cognitive processing remains far from being elucidated. The previous paragraphs have documented some empirical evidence supporting the notion that negative stimuli interfere with subsequent cognitive processing. This could account for the tendency observed for some behavioral responses in the present study, but does not explain the predominant effect of positive distractors preceding the experimental task.
4.2. Emotional positivity bias
In order to clarify the effect of preceding positive words on later cognitive processing, at least two variables must be considered: a) positive emotional valence and b) dissimilarities in sensory activation when distractors are auditory stimuli while task-relevant targets are visually displayed.
Abundant empirical evidence supports the idea that stimuli with positive valence influence subsequent processing. Visual scenes such as smiling and attractive faces, appetizing foods and beautiful pictures can evoke strong emotions. People routinely employ such emotional imagery in the media and even during ordinary social interactions to attempt to bias the decisions of others. However, not all positive stimuli are equally influential. In fact, human smiling facial expressions and images of cute animals bias decisions more than food [66, 67]. Furthermore, there is a recognition bias for information consistent with the physical attractiveness stereotype.
Recent evidence from brain-damaged patients has been interpreted as suggesting that the proper recognition of both negative and positive facial expressions relies on the right hemisphere, and that the left hemisphere produces a default state resulting in a bias towards evaluating expressions as happy. The recent report that activation of the left dorso-lateral prefrontal cortex favors the memory retrieval of positive emotional information additionally supports this hemispheric account. These findings lead to the conclusion that the positive bias includes not only the recognition process but also memory and its retrieval, functionally involving several neural structures.
With respect to the latter topic to consider (the possible cross-modal effect that time-related auditory and visual stimuli have on cognitive processing), recent studies using different auditory-visual distraction paradigms have found that task-irrelevant novel sounds preceding visual targets cause behavioral distraction in adults, as reflected by increased reaction times to visual targets preceded by novel sounds when compared to those preceded by standard sounds.
Regarding this issue, San Miguel and colleagues have proposed that (together with other factors) attentional task demands and the temporal position of the novel sound relative to the encoding or retrieval of the task-related visual information influence whether a novel stimulus causes distraction or facilitation. In fact, they reported reduced distraction under high memory load. On the other hand, Muller-Gass & Schröger studied whether the distraction effect is modulated by the difficulty of the auditory task. They found that the distraction effect increased with rising memory load demands, but not with increasing perceptual difficulty. Interpreting these results together, it is possible to assume that channel separation between task-relevant information and task-irrelevant distracting information interacts with task demands in determining the magnitude of auditory distraction. Therefore, when channel separation is possible distraction increases, as occurred in the experiment conducted by San Miguel and colleagues, and distraction decreases when the processing of information in the auditory and visual channels concurs. In the present study, we used irrelevant auditory stimuli preceding the performance of a highly demanding visual WM task, which could strengthen the potential impact of the attentional capture elicited by the significant sounds and its consequences on subsequent behavioral task performance.
4.3. Event-related brain potentials
Two main effects can be inferred from the ERP data: 1) visual ERP components reach significantly lower voltage amplitudes when there are no preceding auditory stimuli than in conditions in which they are present; 2) when the irrelevant auditory stimuli had emotional content, there is a discernible decrease in the voltage amplitude of the ERP components, which appears very early in the processing stream.
Voltage increases are usually interpreted as signs of greater neural recruitment, which is commonly seen in more novel tasks or ones that are more difficult. Another possible explanation for the amplifying effect that task-preceding irrelevant distractors exert on ERP voltage magnitudes was postulated by Näätänen and colleagues in 1982. These authors proposed that deviant stimuli elicit two overlapping sequences of brain events: exogenous and endogenous. They described the former as an earlier, automatic and inflexible set of brain processes that might provide a central-level stimulus to the latter. In addition, they suggested that there is a subsequent endogenous set of brain waves regarded as a sign of stimulus deviance.
Following the same logic, we expected to observe the overlapping effects of the neural state triggered by the auditory stimuli and the neural state required to fulfill the task demands. Accordingly, the processing of the auditory stimuli could either divert neural resources from task performance, leading to poorer performance, or recruit additional resources, thus improving task performance. The present results favor the latter conjecture.
On the other hand, one important point to elucidate concerns the time at which distracting auditory stimuli influence the visual ERP. The simplest assumption might be that any stimulus preceding task onset should only influence the ERP corresponding to the beginning of the task, because processing of the former stimulus closes early owing to its task irrelevance. However, the present results suggest that the processing closure of the irrelevant auditory stimuli could take longer than expected, probably because task-irrelevant and task-relevant stimuli are delivered in different sensory modalities.
In general, the N170 component has been interpreted as a hallmark of visual orthographic specialization [77, 78] that may reflect increased visual processing expertise, most likely in pre-lexical orthographic processing. The present results correspond well to previously reported findings on the N170 component, where source localization and imaging studies have shown that this early stage of perceptual processing occurs in the fusiform gyrus and is lateralized depending on the nature of the stimuli (left side for words, right side for pictures) [81, 82]. Accordingly, the N170 component in the present study was left-lateralized, suggesting that the experimental manipulations of the visual words might operate at a sub-lexical level of perceptual processing.
4.3.2. VPP component
The vertex-positive potential (VPP) is an ERP waveform that has been described as the positive counterpart, at centro-frontal sites, of the N170 component. The entire N170/VPP complex has been accounted for by two dipolar sources located in the lateral inferior occipital cortex/posterior fusiform gyrus. These authors postulated that early processes in object recognition respond to category-specific visual information and are associated with strong lateralization and orientation bias. In addition, it is very likely that the differences between N170 and VPP effects observed in ERP studies could be accounted for by differences in reference methodology.
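The dependence on reference methodology can be illustrated with a minimal sketch. It assumes a toy montage of four electrodes (Cz, P8, M1, M2) with made-up voltages in microvolts; the function name `rereference` and all values are hypothetical and are not taken from the present recordings. The sketch only shows that the same scalp activity yields different N170-like and VPP-like amplitudes under a linked-mastoid versus an average reference, while the voltage difference between the two sites stays the same.

```python
def rereference(data, ref_channels):
    """Subtract the mean of the reference channels from every channel, sample by sample."""
    n_samples = len(next(iter(data.values())))
    ref = [
        sum(data[ch][t] for ch in ref_channels) / len(ref_channels)
        for t in range(n_samples)
    ]
    return {ch: [v - r for v, r in zip(sig, ref)] for ch, sig in data.items()}


# Hypothetical three-sample epochs (microvolts); the middle sample is the peak.
toy = {
    "Cz": [0.0, 4.0, 0.0],   # vertex site (VPP-like positivity)
    "P8": [0.0, -4.0, 0.0],  # occipito-temporal site (N170-like negativity)
    "M1": [0.0, -3.0, 0.0],  # left mastoid
    "M2": [0.0, -3.0, 0.0],  # right mastoid
}

mastoid = rereference(toy, ["M1", "M2"])  # linked-mastoid reference
average = rereference(toy, list(toy))     # average of all toy channels

for name, result in (("mastoid", mastoid), ("average", average)):
    print(name, "Cz peak:", result["Cz"][1], "P8 peak:", result["P8"][1])
```

In this toy case the mastoid reference enlarges the positivity at Cz and shrinks the negativity at P8, whereas the average reference distributes the amplitude more evenly between the two sites. The Cz minus P8 difference is identical under both schemes, which is consistent with treating the two deflections as two views of one complex.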
In the present experiment, the VPP waveform showed greater amplitude at fronto-central regions in the trials preceded by auditory stimuli, whereas its amplitude decreased when the distractors had emotional content. Although this component is usually related to configural information processing rather than to emotional processing, the voltage decrement observed in trials preceded by emotional words might reflect the amount of resources engaged in processing the irrelevant stimuli but needed for concurrent task performance.
4.3.3. Slow negativity
A component named N2b, which peaks later than 250 ms in adults, has been reported during the performance of category comparison tasks [84, 85]. Experimental evidence suggests that while the N400 component is a specific marker of semantic incongruity, the N2b represents a general correlate of the detection of inconsistencies, or “conflicts”, between representations of task-relevant stimulus features. Either component could account for the slight negativity following the P2-like component observed in the present experiment. However, given its latency, the N400 seems the more likely candidate under the present conditions.
In our experiment, the visual word had to be read and further manipulated in working memory to fulfill the task requirements. The N400-like component observed might reflect the link between the steps in which visual descriptive information about words is first encoded in semantic memory and subsequently visualized via the network for object working memory. Alternatively, the timing of the effect resembles brain responses linked to the engagement of working memory resources, as recently interpreted by Wlotko and Federmeier when evaluating the influence of contextual information on semantic processing. The differences between the conditions preceded by auditory stimuli (emotional versus neutral) additionally point to a contextual influence on working memory processing.
5. Conclusions and final statements
Taken together, the behavioral and electrophysiological results suggest that when verbal distractors precede the beginning of a highly demanding verbal WM task, task performance is influenced by the characteristics of the distractors, even when distractors and task stimuli appear in different sensory modalities.
In the present study, a “positivity offset” was confirmed, in which positive irrelevant stimuli interfered with task performance. It occurred despite the temporal shift between the appearance of the distractors and the task-relevant stimuli, as well as the different sensory modalities in which they were delivered. This could probably be explained by competition between irrelevant and relevant stimuli for common cross-modal processing resources.
Even though environmental influences on cognitive processing remain incompletely elucidated, we hope that this work contributes to the understanding of these important relationships. In fact, the increasing experimental evidence on the topic suggests that more attention will be paid in the future to the interaction between the contextual environment and cognitive processing demands, given the widespread assumption that positive verbal material helps to process concurrent information, whereas exactly the opposite appears to occur.
In a more general context, the present results should be interpreted within the extensive framework of emotion-cognition relationships. The continuous multisensory assessment of the environment carried out by the central executive systems is constantly challenged by environmental demands, while its response capacity is limited by the amount of available processing resources. Fortunately, this neurofunctional dynamic appears to be asymmetric, in that stimuli with intrinsic relevance (e.g. faces, words, and emotional stimuli) benefit from special cognitive treatment.
The authors are grateful to Psic. Vanessa Ruiz-Stovel for her revision of the text and useful comments.