## 1. Introduction

When MEG researchers measure an MEG time series in a well magnetically shielded environment and view their measurements on a real-time monitor, they see an amplitude- and frequency-modulated rhythmic wave and make no judgement about whether the rhythmic wave is a signal or noise. MEG-detectable magnetic fields by themselves reflect macroscopic electrophysiological collective phenomena in the brain, which are generated by a large number (ten thousand or more) of spatially (including directionally) and temporally localized neurons. That is, even rhythmic MEGs are already results of not a random spatiotemporal synchronization. Phase is more important than amplitude when it comes to the synchronization of rhythmic MEGs.

### 1.1. Phase-related phenomena

#### 1.1.1. Event-related desynchronization (ERD) and event-related synchronization (ERS)

Rhythmic MEGs are distinguished by frequency bands: e.g., the delta band (1-4 Hz), theta band (4-8 Hz), alpha band (8-13 Hz), beta band (13-26 Hz), and gamma band (26- Hz). There are occasionally slight differences in the above frequency ranges (e.g., separation of beta and gamma bands) and naming (e.g., mu and tau bands). In the arousal state, the alpha-band MEG is largest in amplitude. As is well-known, it increases when subjects are relaxed and has their eyes closed, and hence, it might relate to an inactive state. Alpha-band suppression is observed in event-related experiments; i.e., the alpha-band MEG decreases when subjects receive stimuli and/or executes motor tasks. During this suppression, gamma- and/or beta-band MEGs sometimes increase in amplitude. Decreases and increases in MEG activity relative to that in the resting state might be caused by populations of synchronizing neurons and are called event-related desynchronizations (ERDs) and synchronizations (ERSs), respectively. ERD is a kind of the suppression of MEG in a specific frequency band, and the ERD of the alpha band is often observed in MEG recordings. On the other hand, ERS is not as distinctly observed because of its short-lived and poor time-locked characteristics. The word ERS seems to be used in context with ERD. In fact, appearances of ERD and ERS would be much more complicated (Lopes da Silva, 2006, Pfurtscheller, 2006, Herrmann et al., 2005, Başar, 1998). ERD and ERS are often mentioned in the literature of EEG-based brain-computer interfaces (BCIs) (Dornhege et al., 2007). In particular, it has been found that well-trained BCI users can use imagery to control ERD/ERS.

The event-related field (ERF) is the MEG version of the event-related potential (ERP), the only difference being the sort of measurement. Of course, there is an intrinsic difference between EEG and MEG; i.e., only MEG can detect intracellular current flows. Many stimuli are typically presented to subjects in ERF experiments. MEG data clip out according to the triggers of stimuli.

In what follows, we denote *x* _{mq }[*n*] to be the ERF of the *m*th channel (*m*=0, 1, …, *M*−1), *n*th time sample (*n*=*N* _{pre}, *N* _{pre}+1, …, *N* _{post}−1, *N* _{post}; *n*=0 means the trigger timing), and *q*th epoch (*q*=0, 1, …, *Q*−1). This is the determination of ERF in this chapter. Do not confuse the ERFs with the following averages; *x* _{mq }[*n*] is averaged over epoch to enhance a signal-to-noise (S/N) ratio:

Equation (1), or conventional averaging, is related to the supposition that the stimulus-locked additive activity is the most important object of analysis, and this supposition has remained implicit in most of the ERP and ERF literature. Additive activities are similar to impulse responses in the system theory, and they are thought to appear only as a result of stimuli and/or motor executions. In fact, conventional averages are experiment-dependent and (mostly) subject-independent; most of the rhythmic activities are averaged out. It is thus natural to think that conventional averages would have important information and rhythmic activities would not. But is this really the case? For instance, do conventional averages converge to additive activities when the number of epochs becomes large? A much more elementary question arises as to whether additive activities actually exist or not. The conventional averages (1) are just mean values in the mathematical sense. Moreover, ‘stimulus-locked’ in the sense of conventional averaging means ‘phase-locked’ to stimuli, and additive activities are often referred as evoked activities.

ERD and ERS researchers are interested in rhythmic activities. Therefore, they often initially calculate the power-spectrograms of MEG epochs, *P* _{mq }[*n*,*k*] (*k*: discrete frequency), and then average the power-spectrograms over epoch:

where WT{ } means a continuous wavelet transform, the mother wavelet of which is the complex Morlet (modulated Gaussian) (Tallon-Baudry et al., 1999). The wavelet transform can be replaced by other time-frequency methods (e.g., the Wigner distribution or the short-time Fourier transform) and the Hilbert transform (frequency-wise). The power calculation of eq. (2) is a simple but nonlinear operation to eliminate stimulus-unlocked phase differences. Therefore, power-spectrogram averaging (2) can reflect ERD and ERS. Note that power-spectrogram averages are usually frequency-wise normalized by the mean power for a pre-trigger duration to enhance the ERD (ERS) changes. Power-spectrogram averaging indeed has played an important role in making researchers notice that a portion of the rhythmic activities are actually event-related. However, power-spectrogram averages may also contain additive activities, if they actually exist. Moreover, if the phases of ERFs are also event-related, another important piece of information regarding the phases in the brain processing is unfortunately lost. ‘Stimulus-locked’ in power-spectrogram averaging means only ‘time-locked to stimuli’ (not care for ‘phase-locked’), and time-locked rhythmic activities are often referred to as induced activities.

Averaging methods for analysing event-related phases have been also proposed. The phase-locking factor (PLF) for a single channel phase synchronization analysis is expressed as

where j is the imaginary unit (Tallon-Baudry et al., 1996). PLF is the absolute value of the average of a specific-frequency *k* _{0} ERF on the complex unit circle, exp(j*ψ* _{mq }[*n*,*k* _{0}]), avoiding the amplitude term *r* _{mq }[*n*,*k* _{0}]. PLF is one if the instantaneous phases are perfectly locked to the stimuli whereas it is zero if they are independent of the stimuli. PLF requires a complex-valued *X* _{mq }[*n*,*k*], whereas power-spectrogram averaging does not. Therefore, the calculation procedure to obtain *X* _{mq }[*n*,*k*] would be restricted to the wavelet transform with the complex Morlet and the Hilbert transform. Regarding the use of the wavelet transform, the frequency of interest in PLF spreads around *k* _{0} in the form of the Gaussian that is the Fourier transform of the Gaussian of the complex Morlet corresponding to *k* _{0}. Therefore, the frequency band around *k* _{0} is included in the PLF analysis. On the other hand, regarding the use of the Hilbert transform, *k* _{0} is only *k* _{0}, and hence it does not have a frequency spread. If one needs a frequency band in PLF when using a Hilbert transform, one should just add some *X* _{mq }[*n*,*k*] to construct a frequency band, of say, *X* _{mq }[*n*,*k* _{0}−1]+*X* _{mq }[*n*,*k* _{0}]+*X* _{mq }[*n*,*k* _{0}+1].

So far, we have explained three averaging methods (1)-(3) for a single channel and their related technical terms. In fact, their applicability to the problem at hand may still be somewhat confusing. In particular, the term ‘phase-locked activities’ includes in its meaning both additive activities and phase-locked rhythmic activities. Accordingly, the conventional averaging (1) picks up both of these activities, and PLF (3) also picks them up, even if the number of epochs is sufficiently large. Moreover, phase-locked activities are just subsets of time-locked activities, and hence, these two activities cannot be separated either. Power-spectrogram averages (2) include not only the above-mentioned activities but also phase-unlocked rhythmic activities.

#### 1.1.2. Long distance synchronization and phase resetting

Why is the phase of ERF modulated if it is actually modulated? It could be that an oscillator is unilaterally drawn to phase synchronization with another oscillator or two are bilaterally drawn to their halfway phase point or a third oscillator’s phase. Long distance synchronization refers to phase synchronization between distant brain areas (Rodriguez et al., 1999). The phase synchronization of two channels would be meaningful when the distance between two phase-synchronized brain areas is large in the sense of the MEG pattern (not only their locations but also their orientations are relevant). The phase-locking value (PLV) was proposed (Lachaux et al., 1999) as a way to analyze cross-channel (between two channels *m* and *l*) phase synchronization:

PLV is sometimes referred as the phase-locking index (PLI). PLV is apparently a cross-channel version of PLF and is also distributed from zero to one. Unlike PLF, the significance of PLV can be statistically tested with a surrogate method as follows. i) Make surrogate sets of phases by random permutation of *ψ* _{mq }[*n*,*k* _{0}] over *q*. ii) Calculate the PLV of *ψ* _{lq }[*n*,*k* _{0}] for each surrogate set. iii) Rank the actual PLV value in the surrogate PLVs. Such significance testing is very important in actual ERF analyses, e.g., comparison among conditions. One may find differences to suit one’s preference if one performs many kinds of analyses. Results are difficult to discuss without knowing their significances.

The single-trial phase-locking index (SPLI) For a cross-channel single epoch analysis is (Lachaux et al., 2000)

where *T* is the single period of *n* corresponding to the frequency *k* _{0}. While PLV is an average over epoch, SPLI is an average over time.

Phase resetting is another way of picturing phase modulation. Long-distance synchronization is based on the cross-channel phase relationship. Phase resetting refers to the phenomenon wherein the phase of an oscillator jumps to a specific value in a short period of time. Since the specific value is not determined by another running oscillator, phase resetting is not a cross-channel phenomenon. If stimuli produce phase resetting, nonzero activities will appear in the conventional averages. Probably because of this, phase-resetting has been discussed in the context of whether activities are additive or not. For instance, i) visual ERFs were divided into two bins by post-stimulus alpha amplitude and ERFs were averaged for each bin. ii) the amplitudes of the averages inherited the post-stimulus alpha-band amplitudes (Makeig et al., 2002). This simple fact could provide strong evidence for the existence of phase resetting.

There is a counterexample. The phase preservation index (PPI) is expressed by

for evaluating how much the phase of a specific pre-stimulus time *n* _{0} and frequency *k* _{0} are preserved in the post-stimulus duration (Mazaheri & Jensen, 2006). The results of a PPI analysis showed that the alpha-band phase was preserved, or not reset. This would be a very simple but strong evidence for the existence of additive activity. Taken together, these results could reveal the existence of a phase resetting component or an additive component in the conventional average without rejecting the existence of the other component. Therefore, there is a possibility of these components coexisting.

Finally, we would like to point out that i) indices (3)-(6) discard the amplitude terms, ii) they are not MEG activities, which can be further analyzed with subsequent methods, but rather indices, and iii) the cross-channel indices (4) and (5) are undirectional.

### 1.2. Simulation: Phase modulation

This section describes a simple simulation performed to investigate the influence of only phase-locked rhythmic activities on the three averages (1)-(3).

A 10-Hz stable sinusoid with an initial phase *θ* is expressed as cos(2π10*t*+*θ*), where *t* is continuous time. For channel 0, *θ* _{0q }(*q*=0, 1, …, *Q*−1, *Q*=200) is a random phase uniformly distributed from 0 to 2π and ERF is *x′* _{0q }(*t*)=cos(2π10*t*+*θ* _{0q }). The ERF for channel 1 is *x* _{1q }(*t*)=cos(2π10*t*+*θ* _{1q }), where *θ* _{1q }is a separately generated random phase. Note that the cosine can be replaced by the sine without any essential difference, since both *θ* _{0q }and *θ* _{1q }are uniformly distributed. Now, every pair of 2π10*t*+*θ* _{0q }(*t*0.1) and 2π10*t*+*θ* _{1q }(*t*0.3) is connected by a straight line for a duration of 0.1<*t*<0.3 (top-left, Fig. 1). The cosines of these connected phases produce a new ERF *x* _{0q }(*t*) (top-right, Fig. 1). ERF *x* _{0q }(*t*) is thus initially *x′* _{0q }(*t*) (*t*0.1) and finally *x* _{1q }(*t*) (*t*0.3), and it is unilaterally drawn to phase synchronization with *x* _{1q }(*t*) for the middle duration (0.1<*t*<0.3). Note that *x* _{0q }(*t*) has no additive activity.

One might be interested in the conventional average of *x* _{0q }(*t*) (middle-left, Fig. 1). Significant activity occurs in the middle duration, although activities are mostly averaged out in the other durations. Moreover, one repeatedly sees essentially the same activity for any uniformly distributed initial phase combination. This is a demonstration that the conventional average is not necessarily equal to, or rather, does not necessarily converge to, an additive activity (even if it actually exists). The power-spectrogram average shows ERD for 10 Hz and ERS for 5 and 15 Hz in the middle duration (bottom-left, Fig. 1). PLF shows a phase bias, or an amount of phase-locking, in the middle duration (bottom-right, Fig. 1).

Let us now investigate whether there is any difference between using the cosine and sine in the simulation setup of the initial *x* _{0q }(*t*). The fact is that the sine of the connected phases results in a significant activity in the conventional average (middle-right, Fig. 1), which is different from the cosine result. These two are π/2 rad apart. The shape of the sine is also reproducible for any uniformly distributed initial phase combination. This means there is an essential difference between cosine and sine for the simulation setup of the final *x* _{0q }(*t*) despite such a random phase connection. The reason is in the final quantity discarded when calculating PLF. PLF provides a phase just before absolute calculation (3). The phase at the extreme of PLF (*t*=0.2 in this case) (bottom-right, Fig. 1) is not a random value but highly likely to be a specific value (π in this case). Accordingly, the line segments consisting of the phases of *x* _{0q }(*t*) in the middle duration have similarly constricted envelopes for any uniformly distributed initial phase combination.

## 2. Epoch filter

ERF is a function of time, channel, and epoch (Fig. 2). There are huge variations for time, or temporal, filters (e.g., bandpass filter, wavelet transform, and Hilbert filter) and for channel, or spatial, filters (e.g., beamformer, Laplacian filter, principal component analysis (PCA), and independent component analysis (ICA)). These filters have their own benefits. Now let us turn to the question of epochs. As shown in the figure, we can consider a third filter for the epoch axis.

### 2.1. Concept

Let us begin with an interpretation of the three averaging methods (1)-(3) as possible epoch filters. PLF (3) is significantly different from conventional averaging (1) and power-spectrogram averaging (2). The most conspicuous difference from an epoch filter viewpoint is that PLF treats epochs as a series; i.e., instantaneous phases give epochs their own order. The instantaneous phases also serve as an order for a series for indices (4)-(6). PLF is a weighted average of ones (in the absolute value sense), to which the ERF amplitudes are normalized at all time sampling points. The weight is derived from the instantaneous phase of ERF. More precisely, it is the filter kernel of PLF (3), exp(j*ψ* _{mq }[*n*,*k* _{0}])/*Q*. Accordingly, the filter kernels of the other two averaging methods (1) and (2) (having non-normalized amplitudes) are 1/*Q*, which does not require the epochs to have an order. Thus, the other two methods can be interpreted as ‘averaging filters’.

The most essential reason for the temporal and spatial filters is that time and channel have own orders based on physical properties. Since time and channel have orders, they are not always assigned the same weights (filter kernels). Epoch filters would be possible if epochs can be given an order, even if this order does not have an unequivocal physical meaning. PLF has built a small bridge to epoch filters.

Filters, in general, have pass- and stop-bands. Although bands might fit in frequency very well (namely, frequency band), bands should be treated here as just subsets. Filters are designed by taking one of the two bands into account. Since the three averaging methods (1)-(3) are not originally designed as epoch filters, they do not handle pass- and stop-bands very well. There is no appropriate band concept corresponding to time-locked and phase-locked activities for epoch filters.

The question thus is can ERFs have bands? Rhythmic activities, which are sometimes referred to as ongoing spontaneous activities, always exist and are amplitude- and/or frequency-modulated. Since phase modulation can be done in place of frequency modulation, it may be fruitful to separate the rhythmic activities into amplitude-modulated (AM) and phase-modulated (PM) components. In addition to these components, additive activities (AD), which are template-like activities corresponding to individual inputs (stimuli) and outputs (motor executions), may exist. Accordingly, let us treat AD, AM, and PM as bands, and let one purpose of epoch filters be to pass one of these bands and to stop the others.

Another purpose of epoch filters is illustrated by the simulation described in section 1.2. In that simulation, the phase of *x* _{0q }(*t*) was unilaterally drawn to phase synchronization with *x* _{1q }(*t*) for the middle duration, and hence, *x* _{0q }(*t*) directionally phase-synchronized with *x* _{1q }(*t*). Such a directional phase synchronization should be quantitatively evaluated. In fact, undirectional and free-amplitude phase synchronizations have so far been evaluated with indices (5) and (6).

Let us introduce a quantity that gives epochs an order. For the above-mentioned two purposes, phase is surely important. Therefore, as the first attempt to design an epoch filter, let us use the instantaneous phase to give an order to epochs, and let us discuss independence of the instantaneous phase as an aside. An epoch is defined as the integer number of stimuli/motor executions that make up an ERF recording. Therefore, an epoch is definitely an independent variable. Although instantaneous-phase-sorted epochs are, however, distinguished by the original numbers, they differ by the other independent variables: time and channel.

Finally, we have a perhaps subjective feeling about filters. We wonder if the inputs and outputs of most of the existing filters might belong to the same category. This means the outputs of epoch filters should not be indices and should be spatiotemporal-structure-preserved ERFs. In other words, the outputs of epoch filters can be used as inputs to spatial and temporal filters.

### 2.2. Preliminary: Temporal filtering

Before explaining the methodologies of epoch filtering, we should mention frequency-selective temporal filtering.

The instantaneous phase will play an important role in ordering epoch filters. To determine the instantaneous phase, we often perform a discrete Hilbert transform (Oppenheim & Schafer, 2007), which retains the real part of the sequence as is and only generates the imaginary part of the sequence under the constraint that the total complex sequence consists of only positive frequency terms; namely, the total complex sequence is analytic. The discrete Hilbert transform consists of i) a discrete Fourier transform, ii) doubling of positive frequency terms (except the DC and Nyquist frequency terms) and zero-padding of negative frequency terms, and iii) an inverse discrete Fourier transform. Therefore, we usually embed temporal filtering in the Hilbert transform by padding zeros to stop-frequency-band terms. Even in the case that only temporal filtering is necessary and an analytic sequence is not, we should perform the pair-wise forward and inverse discrete Fourier transforms and halfway stop-band zero padding for consistency. The need for consistency relates to the fact that the discrete Fourier transform treats a sequence as periodic. In other words, we should not cross-process two sequences, one of which is generated by a convolution (not a periodic procedure) and the other of which is generated by a Hilbert transform (a periodic procedure). Instead of using a Hilbert transform, one can perform a wavelet transform and convolution-based temporal filtering and maintain consistency. However, in so doing, one may encounter difficulty in obtaining a wide-frequency-band-limited analytic sequence. In addition, the wavelet transform with the complex Morlet also generates an analytic sequence unless one takes care about the negative frequency terms for the complex Morlet.

### 2.3. Phase-interpolated averaging

Phase-interpolated averaging, which is an epoch filter, has been proposed for separately extracting AD, AM, and PM and for reducing noise in the averaging sense (Matani et al., 2011). It is originated from the phase-series analysis for echography (Matani et al., 2006).

#### 2.3.1. Methodology

Time and channel, which are independent variables of ERF, are discrete samples of continuous variables in a computer; needless to say about time, channels are actually input coils and hence are spatial samples. On the other hand, an epoch is ordered according to the instantaneous phase. Phase is continuous, and therefore, an epoch can be virtually treated as a sample. However, phase sampling intrinsically has unavoidable jitter. Another characteristic is that phase has a period of 2π, or more precisely, has an uncertainty of integral multiples of 2π. Phase-interpolated averaging consists of two stages: I) interpolation of ERF by eliminating phase sampling jitter and II) a discrete Fourier transform of the interpolated ERF over the epoch.

Stage I) Interpolation of ERFLet us recall ERF, and suppose it is first frequency-band *I* limited,*I* can be selected to be, say, wide band: 0.1-50 Hz, alpha band: 8-13 Hz, etc. Since phase-interpolation is a nonlinear procedure as a whole, frequency-selective filtering should be done first, if necessary. Now, consider the analytic signal of

*q*′ has a one-to-one correspondence with

*q*such that

*ϕ*to be jitter-sampled as

*ϕ*can be approximated by a Fourier series, as

where

The pair of +*k* and −*k* correspond to bipolar same-frequency terms and *k*=0 corresponds to the DC term. Therefore, the number of *k*s must be an odd number 2*K* +1.

Since phase is intrinsically periodic, *c* _{mn }[*k*] can be replaced by

The superscript ‘G’ means grid-sampling.

The periodic version of the sampling theory is expressed as

where

is the Dirichlet kernel of the 2*K*+1th order.

Therefore, the jittered and unjittered (grid-on) phase-sample interface can be formulated as

where

The superscript ‘J’ means jittered-sampling. Note that both *x* ^{J} _{mn }and *x* ^{G} _{mn }are real-valued and their independent variable is only phase (channel *m* and time *n* are treated as constants here). Now, 2*K*+1 is the number of interpolated epochs. By setting *K* such that 2*K*+1<*Q*, we can interpolate the unjittered phase samples by using the least squares solution:

The interpolated ERF is then discrete Fourier transformed. However, so far, we need only the 0th and 1st terms:

They are respectively called the 0th and 1st phase-interpolated averages.

As will be shown in the next simulation, the 0th and 1st phase-interpolated averages are to separately extract the AD activity and the AM component, respectively. The condition number of Ξ_{ mn }, which quantitatively indicate a difficulty of the interpolation, is to reflect the PM component. One may have doubts about the origin of the two 2s as multipliers in eq. (16). These multipliers were numerically determined by simulation (Matani et al., 2011).

Finally, we would like to stress that phase-interpolated averaging is performed for each time sampling point and each channel (not for cross-channel).

#### 2.3.2. Simulation: Separation ability

In the simulation in section 1.2, only PM occurred in the middle duration (0.1<*t*<0.3). On the other hand, in the simulation described here, AD and AM also simultaneously occurred in the middle duration. AD consisting of two bipolar Gaussians was added to the ERFs (top-left, Fig. 3). The envelope of AM was a Gaussian (middle-left, Fig. 3). The rhythmic activities were thus not only amplitude- but also phase-modulated (bottom-left, Fig. 3). Note that the instantaneous phases, which play a very important role in phase-interpolated averaging, were not given ones for this simulation but were actually calculated from the given ERF with the Hilbert transform.

The condition number monotonically increases with the number of interpolated epochs, 2*K*+1(=3, 5, 7, 9), in eqs. (8)-(16) (bottom-right, Fig. 3). This indicates that increasing *K* increases the difficulty of the interpolation. The condition number increases in the middle duration, and thereby, PM is successfully evaluated. The 0th (top-right, Fig. 3) and 1st (middle-right, Fig. 3) phase-interpolated averages mostly overlap for *K*=2 and 3. The overlapping averages successfully extract the AD and AM. This indicates that the stably extracted AD and AM overlap with a certain range of *K*. In practice, one should perform phase-interpolated averaging for some *K*s and then check such an overlap. On the other hand, the conventional average shows a PM-contaminated AD (gray line, top-right, Fig. 3).

The above-mentioned stability depends on not only the instantaneous phase bias but also the amplitude of AD. This simulation was of a difficult case in which the amplitudes of AD and the rhythmic activities (including AM) were comparable. As mentioned in section 3, also as the common experience, AD (if it actually exists) is usually smaller than the rhythmic activities. If this is not the case, no one tries to perform conventional averaging. Thus, we can say that actual phase interpolation is not so difficult.

One may wonder about strange surges at the ends of the 1st phase-interpolated averages and the condition numbers. This is a transient phenomenon caused by the discrete Hilbert transform, which treats the objective series as periodic. To avoid it, we can set the time duration of interest to the center before the analysis and cut off both ends after the analysis. (We would have done so if this article were for journal papers.) Although we think this treatment is sufficient, such transients can in any event be avoided with Hamming-windowing.

Consequently, phase-interpolated averaging works as an epoch filter to pass only one of AD, AM, and PM bands and to stop the others.

### 2.4. Phase-compensated averaging

Phase-compensated averaging, which is an epoch filter, has been proposed for visualizing directional phase synchronization between channels and for reducing noise in the averaging sense (Matani et al., 2010).

#### 2.4.1. Methodology

Phase-compensated averaging consists of filter kernel design and actual filtering.

Regarding the filter kernel design, let a narrow-frequency-band *N* limited analytic ERF of channel *m* be

The filter kernel of phase-compensated averaging is*n* _{0} is a reference time that one can specify freely. The other instantaneous phases and the residual terms in eq. (17) are not used.

Regarding the actual filtering, we can use a wide-frequency-band *W* limited analytic ERF of channel *l* :

The phase-compensated averaging is expressed as

Thus,*m* for the kernel filter design and channel *l* to be filtered can be different from each other. As names for the correlation function, the case of *m*=*l* is called the auto-average and that of *m* *l* is the cross-average. Note that the cross-averages are directional; i.e.,

Phase-compensated averages are complex valued. They should be represented by their absolute values and phases (if necessary) rather than their real and imaginary parts. This is because they usually appear as modulated complex sinusoids having a single center frequency (they look like a synchronization burst), and therefore, the absolute value representations contain the envelopes of the phase-compensated averages, which are the most useful for characterization.

Regarding the significance test, we shall fundamentally adhere to the statistical test of PLV (Lachaux et al., 1999). A surrogate method is used for evaluating the significance, as follows. i) Surrogates are generated by random permutation of*q*. ii) The phase-compensated average for each surrogate is calculated. iii) The absolute values of the averages are ranked for each time sampling point. iv) The *p*<0.05 levels are determined. For this significance measure, phase-compensated averages are represented by their absolute values. If one statistically tests the difference between two phase-compensated averages, the difference in the absolute values of two sets of surrogates should be both one-sided ranked. Thus, there is a hidden merit associated with phase-compensated averaging and the surrogate method: For a single run of ERF recording, which generates only one conventional average, phase-compensated averaging provides many auto- and cross-averages with significance measures.

Finally, we would like to point out a number of methodological differences between phase-interpolated averaging and phase-compensated averaging: i) The filter kernel of the former is calculated for each time sampling point, whereas that of the latter is calculated for only a reference time sampling point *n* _{0} ii) The former is performed for individual channels, whereas the latter is performed for auto- and cross-channels. iii) The former nominally requires the same frequency bands (indicated by *I*) for the filter kernel and the sequence to be filtered, whereas the latter does not care (indicated by *N* and *W*). The most important difference is, of course, that each averaging has its own purpose.

#### 2.4.2. Simulation: Directional synchronization

We will again refer to the ERFs *x* _{0q }(*t*) and *x* _{1q }(*t*) of the simulation described in section 1.2. We would like to confirm that they first oscillated independently (*t*0.1), then *x* _{0q }(*t*) was unilaterally drawn to phase synchronization with *x* _{1q }(*t*) (0.1<*t*<0.3 is called the middle duration), and they eventually became synchronized with each other (*t*0.3). That is, *x* _{0q }(*t*) changed and *x* _{1q }(*t*) remained unchanged.

Phase-compensated averaging of the given ERF is performed for reference times *t* _{0}=0 (top-row, Fig. 4), 200 (middle-row, Fig. 4), and 400 ms (bottom-row, Fig. 4). Since these averages are intrinsically complex-valued, they are represented as absolute values. Note that the instantaneous phases are not given ones for this simulation but were actually calculated from the given ERF with the Hilbert transform and frequency-band *N* and *W* are not specified because the given ERF is already frequency-band-limited.

The significance levels (*p*<0.05) of the phase-compensated averages are generated by 1000 surrogates. Two auto-averages and two cross-averages are represented by their absolute values. Each of average has its own significance level (dotted lines: channel 0, dashed and dotted lines: channel 1, Fig. 4). The following results are valid since each phase-compensated average exceeds the corresponding significance level.

The auto-averages of channel 1 are always a constant value because channel 1 has an independent oscillation (dashed lines in right column, Fig. 4). Note that the phases of channel 1 over *q* are essentially the same for any of the time sampling points (only circularly shifted). The cross-averages of channel 1|channel 0 increase as phase drawing to channel 1 increased (solid lines in right column, Fig. 4). On the other hand, the auto-averages of channel 0 are all short-lived. Each one has a peak only around its own reference time (solid lines in left column, Fig. 4). The cross-averages of channel 0|channel 1 have the same shape, which indicates that channel 0 is unilaterally drawn to phase synchronization with channel 1 (dashed lines in left column, Fig. 4). As an aside, transient phenomena occur at the ends of period (These were not included in the analysis).

These results show that phase-compensated averaging works as an epoch filter to pass a phase-synchronized band and to stop the other bands.

## 3. Actual ERF analysis with epoch filters

We shall explain the practical issues affecting the use of epoch filters by illustrating an analysis of the ERF of an actual semantic priming experiment (Ihara et al., 2007). In typical semantic priming experiments, a pair of words (prime and target) is sequentially presented to a subject and a task (e.g., button-pressing) is required to be performed only on the target word. Reaction time differs depending on whether or not there is a semantic relationship between the prime and target words. ERF sometimes is recorded during priming experiments as a physiological index. Now, although the objective of this priming experiment originally addressed a somewhat complicated issue, or selection of a meaning of a target word having semantic ambiguities through a context with the paired prime, it included a typical semantic priming experiment as a control experiment. Hence, for the sake of this discussion, we will illustrate only the epoch filtering of the ERF from the typical portion.

### 3.1. Semantic priming ERF

The main topic of this chapter is to introduce epoch filters. Here, we will briefly explain the procedures of a sematic priming experiment and the reconstruction of the signal source activities. Such a reconstruction is fraught with problems, due to the ill-posedness of the MEG inverse problem. Although the problems affect the overall performance of epoch filtering, we would believe that this matter should be dealt with as a separate problem from the one at hand.

All the prime and target words are Japanese nouns written with two morphograms (kanji) and three to five syllabograms (kana), respectively (left-panel, Fig. 5). The target words are either semantically related (related condition) or unrelated (unrelated condition) to the prime words. One hundred word pairs for each condition are visually presented in random order with a stimulus onset asynchrony of 1000 ms; the duration of each word is 300 ms. We ask subjects to judge whether the prime and target words are semantically related or not and to press a button after the delayed cue.

MEG in a frequency bandwidth of 0.1-200 Hz is recorded with a sampling frequency of 678 Hz using a 148-channel whole-head system (Magnes 2500WH, BTi). ERFs occasioned upon miss-pressed buttons and eye-blink artifacts are excluded from each condition. From our experience, eye-blinks often cause phase-resets, and therefore, these artifacts must be excluded when analyzing phase-related phenomena.

Dipole activities are tentatively estimated by means of the selective minimum-norm (SMN) method, which obtains minimum L1-norm solutions (Matsuura and Okabe, 1996). The SMN is applied to the conventional averages for a duration of 0-700 ms (target onset: 0 s) for all conditions. All dipoles, the amplitude of which exceeded the noise level of five times the root mean squares of the magnetic fields for a pre-trigger duration of −0.2-0 s, are gathered, neglecting the neighbors within 20 mm (Fujimaki et al., 2002). Thus, approximately 60 dipoles, which span most of the conventional average space, are selected.

The current distribution forms a vector field, and thereby, each discretized site has three orientations. The target time series of the epoch filters should have its related orientation fixed, e.g., parallel to the axons of pyramidal cells in the site. A simple way to fix the orientation with only ERF is to perform a PCA of the three dipole activities for each site. The projection vector corresponding to the first principal component is used for the orientation-fixed dipoles; i.e., three orientations of each site are projected onto the eigenvector corresponding to the largest eigenvalue of the covariance matrix of the three dipole activities in the site.

Up to this point in the experiment, only the dipoles to be fitted have been selected. In the methodological sense, a leadfield matrix, which is based on a real head shape with ASA (ANT Software B.V.), is determined. The epoch-wise (orientation-fixed) dipole activities, which are objects of the epoch filters, are subsequently estimated with least squares fitting.

We shall describe epoch filtering for a typical subject. Fifty-one orientation-fixed dipoles are used for building the leadfield matrix and 12 dipoles (Dipoles 1-12) out of the 51 are selected. Note that these 12 dipole activities showed conditional differences (Ihara et al., 2007). Each of these 12 dipoles belongs to one of six regions of interest (ROIs) (right-side, Fig. 5); the left anterior inferior frontal cortex (LaIFC: Dipoles 1 and 2), the left posterior inferior frontal cortex (LpIFC: Dipole 3), the anterior medial temporal lobe (LaMedT: Dipole 4), the left anterior middle and inferior temporal areas (LaT: Dipoles 5-7), and the left posterior superior temporal and the inferior parietal areas (LpST1: Dipoles 8-11, LpST2: Dipole 12).

The dipole activity version of the ERF is epoch filtered from the next subsection.

### 3.2. Phase-interpolated averages

First, we will show how phase interpolation works. As a typical example, we will choose the source activity of LaT (Dipole 6) for the related condition. The actual dipole activities for the first seven epochs show rhythmic activities, and the conventional average of the all 98 epochs seem to run through the approximate center (top row, Fig. 6). It is of interest to determine whether the wave (spike is not clear for LaT) of the conventional average come from AD or PM. As shown in the simulation in section 1.2, even only PM can generate nonzero activity in the conventional average. While the instantaneous phases at 0 ms are almost uniformly distributed, those at 200 and 400 ms are clearly biased (second row, Fig. 6). Therefore, the dipole activities seem to at least partially come from PM. The phase interpolation is performed for *K*=3. The seven (=2*K*+1) interpolated dipole activities are not rhythmic compared with the original activities (third row, Fig. 6). This is because of the phase interpolation that placed instantaneous phases on the seven-phase-grid at any time sampling point (bottom row, Fig. 6). The reason a rhythmic wave is rhythmic is that its instantaneous phase rotates. Conversely, if an instantaneous phase is fixed, its waveform will be a straight line with an occasional small fluctuation. If AD exists, the interpolated dipole activities shift in the same direction. Slight upward shifts are observed for the 200-450 ms duration. Thus, the 0th phase-interpolated average, which is the twice the average of the interpolated activities as shown in eq. (16), shows a slight AD (third row, Fig. 6). Note that while the instantaneous phases for the real dipole activities are calculated, those for the interpolated dipole activities are just phase-grids for *K* and are not recalculated from the interpolated dipole activities.

Next, we perform phase-interpolated averaging (*K*=2, 3, 4) on six dipoles for the related condition; one dipole is selected from each of the ROIs. The overlap of the 0th and 1st phase-interpolated averages implies that the analysis is stable, whereas spike-like changes in the condition numbers indicates that these averages at the corresponding time are somewhat unreliable. Each 0th phase-interpolated average looks like a low-amplitude replica of the corresponding conventional average (first and fourth columns, Fig. 7). The residual amplitude of the conventional average comes from PM, which is evaluated with condition numbers (third and sixth columns, Fig. 7). Such an amplitude relationship between two averages is often observed for other ERFs. Therefore, the conventional averages would be due to both AD and PM. Regarding the ratio of AD and PM, sometimes, the 0th phase-interpolated averages are very small. This indicates that PM might be dominant. However, we must point out that imprecise instantaneous phases are apt to yield very small 0th phase-interpolated averages. Phase-interpolated averaging, which is more or less fail-safe, does not generate phantom ADs. Therefore, the fact that the 0th phase-interpolated averages even very slightly inherit the trace of the conventional averages cannot deny the existence of AD. On the other hand, the 1st phase-interpolated averages do not show characteristics common to all ROIs.

The 1st phase-interpolated averages of the alpha-band-limited LpIFC and LpST1 dipole activities show alpha-band ERD (top-row, Fig 8). On the other hand, those of the gamma-band-limited LpIFC and LpST1 dipole activities show intermittent amplitude increments (bottom-row, Fig 8). These results might be gamma-band ERS. Since gamma-band ERS is generally not much more time-locked than alpha-band ERD, phase-interpolated averaging only vaguely visualizes gamma-band ERS. Thus, although 1st phase-interpolated averages of the wide-frequency-band limited ERF give almost no information, those of the narrow-frequency-band limited ERF are at least useful for extracting alpha-band ERD.

### 3.3. Phase-compensated averages

Unlike phase-interpolated averaging, phase-compensated averaging generates many auto- and cross-averages. Therefore, an overview of the all averages is extremely important and should be performed first. Furthermore, non-significant averages, in the surrogate test sense, should be masked in the overview. For this purpose, we would recommend using image-like matrix (*l*-by-*m*) plots of the root-mean-squares:*F* is a temporal neighbor duration of *n* (Fig. 9). Although the intensity (white means maximum intensity) is used for the sake of visibility, all the non-significant averages should be displayed in black (minimum intensity). Now, in typical control ERF experiments, the difference between two ERFs, which may be caused by the only different condition out of many common ones, is evaluated. Thus, two conventionally averaged ERFs for enhancing an S/N ratio should basically be similar to each other. In fact, in this ERF experiment, phase-compensated averages of the related and unrelated conditions for *N* of the alpha band (*W*: 1-52 Hz) are quite similar in duration *F* of 100-300 ms (top and middle rows and second column, Fig. 9). Those for the other durations are only slightly different (top and middle rows to the left of the separation line, Fig. 9). This implies that the averages for the *F* of 100-300 ms are reliable and those for the other durations would merit further investigation. The significances of the differences are checked after the reliabilities of the individual auto- and cross-averages (significance and similarity) are evaluated (bottom row to the left of the separation line, Fig. 9). The significance difference should be valid regardless of whether or not the significances of the two conditional auto- or cross-averages that produce the difference are valid, because these two conditional significance tests are independent of each other. Note that the numbers of surrogates are 1000 for the auto- and cross-averages and 1000 pairs (total 2000) for the differences. If one performs phase-compensate averaging on a single condition ERF recording, ERFs can be randomly divided into two bins for the similarity check.

We cannot deny that the above-mentioned procedure is somewhat paradoxical. Our approach ends up finding a small but significant difference in almost the same two averages. Only the significance of the differences may be important; the significances of the individuals are not so important. On the other hand, we would like to narrow down the number of significant averages. It might be difficult for researchers, in particular, those having a hypothesis bias, to fairly compare a huge number of data on the same basis. Individual researchers should thus decide whether the auto- and cross-averages for *N* of the beta band should be investigated or not (right-most column, Fig. 9). As an aside, none of the auto- and cross-averages for *N* of the gamma-band were significant to the individual auto- and cross-averages and their differences. Note that *W* is always set to 1-52 Hz despite that *N* changes.

*N*: alpha band (left of the dashed vertical line),

*N*: beta band (right of the dashed vertical line),

*W*: 1-52 Hz, reference time: 200 ms). Top-row) Related condition. Middle row) Unrelated condition. Bottom-row) Difference between related and unrelated conditions. Horizontal axis) Dipole for calculating instantaneous phases. Vertical axis) Dipoles to be filtered. Panel title) Duration

*F*for calculating the root mean squares. The gray-scale intensity of each crossover square indicates the root mean squares of its auto- or cross-average (black: not significant).

Next, let us investigate the phase-compensated averages (*N*: alpha-band, *W*: 1-52 Hz) of two dipole pairs: Dipole 1 at LaIFC and Dipole 6 at LaT, and Dipole 3 at LpIFC and Dipole 8 at LpST1 (Fig. 10). This is because these two combinations are directionally brighter and darker (almost in black) in the difference plot (third row and second column panel, Fig. 9). Note that we shall hereafter use the ROI names for dipole numbers.

First, regarding the LaIFC and LaT pair, the auto- and cross-averages (top-left four panels, Fig. 10) and their differences (bottom-left four panels, Fig. 10) for both conditions significantly appear around the reference time (vertical gray lines), but the cross-averages for the unrelated condition (thick lines) are very small. Therefore, the phase-synchronization between LaIFC and LaT might occur especially for the related condition (thin lines) although the local phase-synchronizations within LaIFC and LaT might occur for whatever the condition.

Let us discuss the related condition by inspecting the panels in a row-wise fashion. Suppose the current time is the reference time. The auto-average of LaIFC reaches a significant peak at the current time (no temporal lag), and the cross-average of LaIFC|LaT does so before the current time. Thus, the current phase of LaIFC is up-to-date whereas that of LaT is older. On the other hand, the auto-average of LaT reaches a significant peak at the reference time, and the inverted cross-average (LaT|LaIFC) does so after the current time. Thus, the current phase of LaT is up-to-date whereas that of LaIFC is younger. This row-wise comparison indicates that the phase of LaIFC precedes that of LaT. Therefore, LaT might be drawn to phase synchronization with LaIFC around the reference time. Such temporal comparisons of the peaks are relatively easy to perform when the auto- and cross-averages are all single-peaked.

We should point out that only the auto-average of LaIFC for the unrelated condition consists of significant intermittent bursts, which accordingly yields a significant conditional difference (top-left, bottom-left four panels, Fig. 10). For instance, if one is interested in which dipoles are drawn to phase synchronization with Dipole 1 (at LaIFC) for unrelated condition, the cross-averages between Dipole 1 and Dipole 8 should be subsequently evaluated by consulting the corresponding column of Dipole 1 (Unrelated panels, Fig. 9).

Regarding the LpIFC and LpST1 pair, the auto- and cross-averages (top-right four panels, Fig. 10) and their differences (bottom-right four panels, Fig. 10) for both conditions significantly appear around the reference time (vertical gray lines). There is a consistent tendency in their differences that the auto- and cross-averages of LpIFC and LpST1 for the related condition are greater than those for the unrelated condition just before the reference time and the amplitude relation is inverted just after the reference time (bottom-right four panels, Fig. 10).

Let us examine the results for the related condition in more detail. LpIFC might be drawn to phase synchronization with LpST1 for the same reason as the LaIFC and LaT pair. However, for the unrelated condition, the cross-average of LpST1|LpIFC is double-peaked across the reference time whereas the auto-average of LpST1 has a single peak at the reference time. Therefore, phase-drawing between LpIFC and LpST1 might be bidirectional for the unrelated condition. If one is interested in such double-peaked cross averages, one may wish to investigate the cross averages between Dipole 3 (at LaIFC) and Dipole 12 by consulting the corresponding columns of Dipole 3 (Unrelated and *F* of the 100-300 ms panel, Fig. 9).

### 3.4. Discussion

We will briefly discuss the actual ERF analyses using the two epoch filters.

In the typical sematic priming ERF, the conventional average for the unrelated condition is larger (in absolute value) than that of the related condition after a certain latency these two averages branch off. In fact, this original study concluded so (Ihara et al, 2007). However, the same is not necessarily true for the phase-compensated averages (Fig. 10). The phase-interpolated averages indicate that the conventional averages might consist of AD and PM and these two might cooperatively act when they have the same polarity (Fig. 7). If there is a gap between the conditional difference between the conventional averages and the phase-compensated averages, the activity that fills in the gap might be PM. In addition, directional relationships between ROIs can be discussed with these two epoch filters. This is the most conspicuously different point from the original study.

Are these two epoch filters able to analyze phase-related phenomena? Phase-interpolated averaging can be used to visualize ERD and ERS without additive activities (Fig. 8). Long-distance directional synchronization between LaIFC and LaT and between LpIFC and LpST1 can be visualized by phase-compensated averaging. In fact, these long-distance directional synchronizations likely happen since LaIFC and LaT are connected by the uncinate fasciculus and LpIFC and LpST1 by the arcuate fasciculus. In contrast, we can only speculate about phase-resetting. When the phase is reset, by which instantaneous phase jumps to a specific phase regardless of the phase just before phase-resetting, the condition number of phase-interpolated averaging could be very large at that moment. In fact, there are spike-like shapes in the condition numbers (Fig. 8). However, it must be noted that the spikes always appear if *K* (2*K*+1: the number of interpolated ERFs) increases and the condition number cannot detect phase-resetting if the neuronal population in charge of phase-resetting is small. Thus, we cannot say that this spike indicates phase-resetting. There is one other thing we should point out about phase-resetting: recall the difference between using the cosine and sine in the simulation described in section 1.2 (Fig. 1). Conventional averages of various ERF experiments have shapes that are experiment-dependent. According to the phase-interpolated averages (Fig. 9), AD and PM would have similar waveforms and be similar to conventional averages, and hence AD and PM would be time-locked. Therefore, cosine, sine, or their combination has to be determined by something in the case of the simulation. This determination is identical to setting the instantaneous phase to a specific value at a specific time. We wonder if this could be achieved only by resetting the phase. In the case of phase-resetting driven by stimuli, the first phase reset should occur at a very early latency, e.g., before N70 for visual stimuli if N70 is observed.

## 4. Conclusion

In this chapter, we explained the methodological and practical aspects of two epoch filters. The epoch filters are made possible by giving epochs an order. Since the order is of the instantaneous phases, these filters can be used to analyze phase-related phenomena. If another order is introduced, a different epoch filter can be designed and a different concept can be created accordingly. We hope that new epoch filters besides the ones presented here will be designed.