Open access peer-reviewed chapter

# Model Testing Based on Regression Spline

By Na Li

Submitted: May 3rd 2017; Reviewed: February 5th 2018; Published: June 6th 2018

DOI: 10.5772/intechopen.74858

## Abstract

Tests based on regression splines are developed in this chapter for testing nonparametric functions in nonparametric, partial linear, and varying-coefficient models. These models are more flexible than the linear regression model. However, an important question is whether it is really necessary to use such complex models, which contain nonparametric functions. For this purpose, p-values for testing the linearity and constancy of the nonparametric functions are established based on regression splines and the fiducial method. In applications of spline-based methods, the determination of knots is difficult but plays an important role in inferring the regression curve. In order to infer the nonparametric regression at different smoothing levels (scales) and locations, multi-scale smoothing methods based on regression splines are developed to test the structures of the regression curve and to compare multiple regression curves. These methods sidestep the determination of knots and give more reliable results when spline-based methods are used.

### Keywords

• fiducial method
• multi-scale smoothing method
• nonparametric regression model
• partial linear regression model
• regression spline
• varying-coefficient regression model

## 1. Introduction

It is well known that models containing nonparametric functions, such as the partial linear model and the varying-coefficient model, play an important role in applications due to their flexible structure. However, in practice, investigators often want to know whether it is really necessary to fit the data with such complex models rather than with a simpler one. This amounts to testing the linearity of the nonparametric functions in a regression model. In this chapter, we first consider the following three frequently used regression models.

Nonparametric regression model:

$$y = f(x) + \varepsilon. \tag{1}$$

Partial linear regression model:

$$y = Zb + f(x) + \varepsilon. \tag{2}$$

Varying-coefficient model:

$$y = z_1 f_1(x_1) + \cdots + z_p f_p(x_p) + \varepsilon. \tag{3}$$

In models (1) to (3), $y$ is the response variable, $Z=(z_1,\dots,z_p)$ is a $p$-dimensional regressor, $x$ and $x_1,\dots,x_p$ are covariates taking values in a finite interval, $\varepsilon$ is the error, $b$ is a parameter vector, and $f(x)$ and $f_j(x_j)$, $j=1,2,\dots,p$, are unknown smooth functions. Usually we suppose that $(Z,x)$ and $\varepsilon$ are independent and $\varepsilon \sim F(\cdot/\sigma)$, where $F$ is a known cumulative distribution function (cdf) with mean 0 and variance 1, and $\sigma$ is unknown. Without loss of generality, we can suppose that $x$ and $x_1,\dots,x_p$ take values in [0, 1]. We try to test the linearity of $f(x)$ in models (1) and (2) and the constancy of $f_j(x_j)$ in model (3) for some $j\in\{1,2,\dots,p\}$.

The hypothesis testing in nonparametric regression model was considered in many papers. Härdle and Mammen [1] developed the visible difference between a parametric and a nonparametric curve estimates. Based on smoothing techniques, many tests were constructed for testing the linearity in regression model; see Hart [2], Cox et al. [3], and Cox and Koh [4] for a review. Recently, Fan et al. [5] studied a generalized likelihood ratio statistic, which behaves well in large sample case. Tests based on penalized criterion were developed by Eubank and Hart [6] and Baraud [7].

The linearity of partial linear regression model (2) was studied by Bianco and Boente [8], Liang et al. [9], and Fan and Huang [10]. There are also many other papers concerning such testing problems (see [11, 12, 13, 14, 15, 16], among others). The constancy of the functional coefficient $f_j(x_j)$ in varying-coefficient model (3) was studied in Fan and Zhang [17], Cai et al. [18], Fan and Huang [19], You and Zhou [20], and Tang and Cheng [21]. Local polynomial and smoothing spline methods for estimating the coefficients in model (3) can be found in Hoover et al. [22], Wu et al. [23], and so on.

The critical values of most of the previous tests were obtained by the Wilks theorem or by the bootstrap method, so such tests only behave well when the sample size is relatively large. Section 2 of this chapter gives some testing procedures based on regression splines and the fiducial method [24]. These procedures perform well even when the sample size is small.

In using the regression spline, the key problem is the determination of the knots used in spline interpolation. For smoothing methods such as the kernel-based method and the smoothing spline, the smoothness is controlled by smoothing parameters. For the well-known kernel estimate, a bandwidth that is extremely large or extremely small leads to over-smoothing or under-smoothing, respectively. In order to avoid the selection of an optimal smoothing parameter, a multi-scale smoothing method based on kernel estimation was introduced by Chaudhuri and Marron [25, 26] for exploring structures in data. This multi-scale method is known as the significant zero crossings of derivatives (SiZer) methodology. The basic idea of SiZer is to infer a nonparametric model by using a wide range of smoothing parameter (bandwidth) values rather than only one value that is "optimal" in some sense.

There have been many versions of SiZer for various applications, such as the local likelihood version of SiZer in Li and Marron [27], the robust version of SiZer in Hannig and Lee [28], and the quantile version of SiZer in Park et al. [29]. In addition, Marron and de Uña-Álvarez [30] applied SiZer to estimate length-biased, censored density, and hazard functions; Kim and Marron [31] utilized SiZer for jump detection; and Park and Kang [32] applied SiZer to compare regression curves. The smoothing spline version of SiZer was proposed by Marron and Zhang [33]. It uses the tuning parameter (penalty parameter) that controls the size of the penalty as the smoothing parameter.

Compared with the bandwidth for kernel-based methods and the tuning parameter for smoothing splines, the number of knots and their positions are more difficult to determine. For this reason, a multi-scale smoothing method based on regression splines is proposed in Section 3 to test the structures of the nonparametric regression model. The proposed multi-scale method does not involve the determination of the "best" number of knots and can be extended easily to more general cases.

## 2. Tests for nonparametric function based on regression spline

In this section, the linearity of function $f(x)$ in model (1) is tested based on regression splines and the fiducial method. Then, the proposed test procedure for model (1) is extended to test the linearity of model (2) and the constancy of the functional coefficients in model (3), respectively.

### 2.1. Test the linearity of nonparametric regression model

Without loss of generality, we suppose that $x$ in model (1) takes values in [0, 1] and the set of knots is $T=\{0=t_1<t_2<\cdots<t_m=1\}$. In order to estimate model (1), the nonparametric function $f(x)$ is fitted by $k$th order splines with knots $T$. This means that

$$f(x) \approx \sum_{j=1}^{m+k-1}\beta_j g_j(x), \tag{4}$$

where $\beta_j$ is a coefficient and $g_j(x)$, $j=1,2,\dots,m+k-1$, is a basis function for order-$k$ splines over the knots $t_1,t_2,\dots,t_m$.

With $n$ independent observations $Y=(y_1,y_2,\dots,y_n)^{\top}$, the basis matrix $G_{n\times(m+k-1)}$ is defined by $G=(g_j(x_i))$, where $x_i$ is the design point, $i=1,2,\dots,n$; $j=1,2,\dots,m+k-1$. Hence, model (1) can be approximated as $Y=G\beta+\varepsilon$. The least squares estimator of the coefficients is

$$\hat\beta = (G^{\top}G)^{-1}G^{\top}Y, \tag{5}$$

and the estimator of $f(x_i)$ can be expressed as

$$\hat Y = \left(\hat f(x_1),\hat f(x_2),\dots,\hat f(x_n)\right)^{\top} = G(G^{\top}G)^{-1}G^{\top}Y. \tag{6}$$

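For testing the linearity of model (1), a linear spline is used to approximate $f(x)$. This means that each basis function $g_j(x)$ is piecewise linear:

$$
\begin{aligned}
g_1(x) &= -\frac{x-t_2}{t_2-t_1}\,I_2(x),\\
g_{k-1}(x) &= \frac{x-t_{k-2}}{t_{k-1}-t_{k-2}}\,I_{k-1}(x) - \frac{x-t_k}{t_k-t_{k-1}}\,I_k(x),\quad 3\le k\le m,\\
g_m(x) &= \frac{x-t_{m-1}}{t_m-t_{m-1}}\,I_m(x),
\end{aligned}
\tag{7}
$$

where $I_k(x)$ is read here as the indicator function of the interval $(t_{k-1},t_k]$ (closed at $t_1$ for $k=2$).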
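The linear-spline basis can be illustrated with a few lines of Python. This is only a sketch: the function names `hat_basis` and `interpolate` and the example knots are hypothetical, and the basis is taken to be the usual "hat" functions, which equal 1 at their own knot and 0 at all other knots.

```python
def hat_basis(x, knots):
    """Values g_1(x), ..., g_m(x) of the linear-spline (hat) basis at x.

    Assumes knots are strictly increasing and knots[0] <= x <= knots[-1].
    """
    m = len(knots)
    g = [0.0] * m
    for k in range(m - 1):
        lo, hi = knots[k], knots[k + 1]
        if lo <= x <= hi:
            w = (x - lo) / (hi - lo)
            g[k] = 1.0 - w   # falling piece of the basis function at the left knot
            g[k + 1] = w     # rising piece of the basis function at the right knot
            break
    return g


def interpolate(x, knots, beta):
    """Linear interpolant sum_j beta_j g_j(x)."""
    return sum(b * g for b, g in zip(beta, hat_basis(x, knots)))


knots = [0.0, 0.25, 0.5, 1.0]   # illustrative knot set T
beta = [1.0, 2.0, 0.0, 4.0]     # illustrative coefficients beta_j = f(t_j)
print(interpolate(0.75, knots, beta))
```

Because each basis function is 1 at its own knot, the coefficient attached to it is the value of the interpolant at that knot.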

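In this case, the approximated function in (4) is a linear interpolation with $k=1$. The true value is $\beta_j=f(t_j)$, $j=1,2,\dots,m$. The linearity of function $f(x)$ can be written as

$$H_0:\ \frac{\beta_2-\beta_1}{t_2-t_1} = \frac{\beta_3-\beta_2}{t_3-t_2} = \cdots = \frac{\beta_m-\beta_{m-1}}{t_m-t_{m-1}}.$$

Null hypothesis $H_0$ can be expressed in matrix form as $L\beta=0$,

where

$$
L=\begin{pmatrix}
h_2 & -(h_1+h_2) & h_1 & 0 & \cdots & 0\\
\vdots & \ddots & \ddots & \ddots & & \vdots\\
0 & \cdots & 0 & h_{m-1} & -(h_{m-1}+h_{m-2}) & h_{m-2}
\end{pmatrix},
$$

where $h_j=t_{j+1}-t_j$, $j=1,2,\dots,m-1$. Null hypothesis $H_0$ is equivalent to the following one:

$$H_0:\ L\beta=0. \tag{8}$$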
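A small numerical check can make the matrix form concrete. In this sketch, `linearity_matrix` and `matvec` are hypothetical helper names, and each row of $L$ is taken to be $(h_{k+1},\,-(h_k+h_{k+1}),\,h_k)$ placed at columns $k$, $k+1$, $k+2$, which encodes equality of adjacent slopes.

```python
def linearity_matrix(knots):
    """(m-2) x m matrix L such that L @ beta = 0 iff all adjacent slopes agree."""
    m = len(knots)
    h = [knots[j + 1] - knots[j] for j in range(m - 1)]  # h[j] holds h_{j+1}
    L = []
    for k in range(m - 2):
        row = [0.0] * m
        row[k] = h[k + 1]
        row[k + 1] = -(h[k] + h[k + 1])
        row[k + 2] = h[k]
        L.append(row)
    return L


def matvec(A, v):
    """Plain matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


knots = [0.0, 0.1, 0.4, 0.7, 1.0]               # unequally spaced, for illustration
L = linearity_matrix(knots)
beta_linear = [2.0 + 3.0 * t for t in knots]    # f(t) = 2 + 3t at the knots
beta_curved = [t ** 2 for t in knots]           # f(t) = t^2 at the knots
print(matvec(L, beta_linear))  # numerically zero
print(matvec(L, beta_curved))  # not zero
```

The linear coefficients are annihilated by $L$, whereas the quadratic ones are not, which is exactly what the hypothesis $L\beta=0$ is meant to detect.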

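The p-value for testing hypothesis $H_0$ is derived by the fiducial method in what follows. Assume that matrix $G$ has full rank, and let $\varepsilon\sim\sigma N(0,1)$. In model $Y=G\beta+\varepsilon$, the sufficient statistic of $(\beta,\sigma^2)$ is $(\hat\beta,S^2)$, where $\hat\beta$ is defined in (5) and

$$S^2 = Y^{\top}(I-P_G)Y,\qquad P_G = G(G^{\top}G)^{-1}G^{\top}.$$

By Dawid and Stone [34], the sufficient statistic can be represented as a functional model:

$$\hat\beta = \beta + \sigma(G^{\top}G)^{-1/2}E_1,\qquad S=\sigma E_2,\qquad E=(E_1,E_2)\sim Q, \tag{9}$$

where $Q$ is the probability measure of $E=(E_1,E_2)$, $E_1\sim N(0,I_m)$ and, independently, $E_2^2\sim\chi^2(n-m)$. From the linear regression model, the fiducial model of $\beta$ can be obtained:

$$\hat\beta = \beta + \frac{S}{E_2}(G^{\top}G)^{-1/2}E_1,\qquad E=(E_1,E_2)\sim Q. \tag{10}$$

Given $(\hat\beta,S^2)$, the distribution of the right side in the fiducial model is the fiducial distribution of $\beta$. That is, the fiducial distribution of $\beta$ is the conditional distribution of $R_E(\hat\beta,S^2)$ when $(\hat\beta,S^2)$ is given, where

$$R_E(\hat\beta,S^2) = \hat\beta - \frac{S}{E_2}(G^{\top}G)^{-1/2}E_1. \tag{11}$$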

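For testing hypothesis $H_0$, the p-value is defined as

$$p(\hat\beta,S^2) = Q\left(\left\|L\left\{R_E(\hat\beta,S^2)-E_Q R_E(\hat\beta,S^2)\right\}\right\|_{\Sigma}^2 \ge \left\|L\,E_Q R_E(\hat\beta,S^2)\right\|_{\Sigma}^2\right), \tag{12}$$

where $Q(\cdot)$ and $E_Q$ denote the probability of an event and the expectation of a random variable under $Q$, respectively; $\Sigma$ is the conditional covariance matrix of $L R_E(\hat\beta,S^2)$ given $(\hat\beta,S^2)$; and $\|v\|_{\Sigma}^2=v^{\top}\Sigma^{-1}v$ for a vector $v$.

According to the definition of generalized pivotal quantity in [35], $R_E(\hat\beta,S^2)$ is a generalized pivotal quantity and also a fiducial pivotal quantity about $\beta$. Naturally, $L R_E(\hat\beta,S^2)$ is the fiducial pivotal quantity about $L\beta$. With the definition of $Q$ in Eq. (10), we have that

$$p(\hat\beta,S^2) = 1 - F_{m-2,\,n-m}\left(\frac{(n-m)\,\hat\beta^{\top}L^{\top}\left[L(G^{\top}G)^{-1}L^{\top}\right]^{-1}L\hat\beta}{(m-2)\,S^2}\right), \tag{13}$$

where $F_{m-2,\,n-m}$ is the cdf of the F-distribution with $m-2$ and $n-m$ degrees of freedom.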
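The step from the definition (12) to the closed form (13) can be sketched as follows. This is only an outline under the stated normal-error setting; the constant $c=E_Q(E_2^{-2})$ and the matrix $A$ are notation introduced here for illustration, and $c$ is assumed to be finite (which requires $n-m>2$).

```latex
% Sketch only; c = E_Q(E_2^{-2}) and A are auxiliary symbols, not from the chapter.
\begin{aligned}
E_Q R_E(\hat\beta,S^2) &= \hat\beta, \qquad
L\{R_E - E_Q R_E\} = -\frac{S}{E_2}\,L(G^{\top}G)^{-1/2}E_1,\\
\Sigma &= c\,S^2\,L(G^{\top}G)^{-1}L^{\top},\\
\|L\{R_E - E_Q R_E\}\|_{\Sigma}^2 \ge \|L\,E_Q R_E\|_{\Sigma}^2
&\iff \frac{E_1^{\top} A E_1}{E_2^2} \ge
\frac{\hat\beta^{\top}L^{\top}[L(G^{\top}G)^{-1}L^{\top}]^{-1}L\hat\beta}{S^2},\\
A &= (G^{\top}G)^{-1/2}L^{\top}[L(G^{\top}G)^{-1}L^{\top}]^{-1}L(G^{\top}G)^{-1/2}.
\end{aligned}
```

Since $A$ is symmetric and idempotent with rank $m-2$, $E_1^{\top}AE_1\sim\chi^2(m-2)$ independently of $E_2^2\sim\chi^2(n-m)$. Multiplying both sides of the last inequality by $(n-m)/(m-2)$ turns the left side into an $F_{m-2,\,n-m}$ variable, which gives the tail probability in (13).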

Under model (1) and the hypothesis that $f(x)$ is a linear function, null hypothesis $H_0$ given in (8) is true. If the error is normally distributed, then the p-value given in Eq. (12) is uniformly distributed on the interval (0, 1). On the other hand, under some mild conditions, the test procedure based on $p(\hat\beta,S^2)$ is consistent, which means that $p(\hat\beta,S^2)$ tends to zero in probability if $H_0$ is false. The theoretical proof of the large-sample and finite-sample properties of $p(\hat\beta,S^2)$ is the same as the proof given in Li et al. [36].

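In applications, we need to check some hypotheses as follows:

$$
\begin{aligned}
H_{01}&:\ f(x)=C \iff \beta_1=\beta_2=\cdots=\beta_m,\\
H_{02}&:\ f(x)=Cx \iff \frac{\beta_2-\beta_1}{t_2-t_1}=\frac{\beta_3-\beta_2}{t_3-t_2}=\cdots=\frac{\beta_m-\beta_{m-1}}{t_m-t_{m-1}}\ \text{and}\ \beta_1=0.
\end{aligned}
$$

The p-values for testing $H_{01}$ and $H_{02}$ can be obtained by replacing $L$ in (12) by $L_{01}$ and $L_{02}$, respectively, where $L_{02}=(e_1,L^{\top})^{\top}$, $e_1=(1,0,\dots,0)^{\top}$, and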

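$$
L_{01}=\begin{pmatrix}
h_2 & -h_1 & 0 & \cdots & 0\\
\vdots & \ddots & \ddots & & \vdots\\
0 & \cdots & 0 & h_m & -h_{m-1}
\end{pmatrix}. \tag{14}
$$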

### 2.2. Test the linearity of partial linear model

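To test the linearity of model (2), a p-value can be established analogously. With $n$ independent observations $Y=(y_1,y_2,\dots,y_n)^{\top}$, model (2) can be represented as

$$y_i = Z_i b + f(x_i) + \varepsilon_i,\qquad i=1,2,\dots,n,$$

where $Z_i=(z_{i1},\dots,z_{ip})$, $b=(b_1,\dots,b_p)^{\top}$, and $x_i$, $i=1,2,\dots,n$, are fixed design points. With the approximation of $f(x)$ given in (4), model (2) can be approximated by $Y=X\theta+\varepsilon$, where $X=(Z,G)_{n\times(p+m+1)}$; $Z=(z_{ij})$, $i=1,2,\dots,n$, $j=1,2,\dots,p$; $G$ is the same as above; and $\theta=(b^{\top},\beta^{\top})^{\top}$. Then the p-value for testing the linearity of model (2) can be defined by replacing $G$ in (12) by $X$, $\beta$ by $\theta$, and $L$ by $L_{03}$, respectively, where $L_{03}=(0_{(m-2)\times p},L)$.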

The large sample and finite sample properties of the testing procedure for model (2) are the same as the test procedure for model (1).

### 2.3. Test the constancy of functional coefficient in varying-coefficient model

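For model (3), investigators often want to know whether the coefficients are really varying; this amounts to testing the constancy of the coefficient functions, that is, testing the hypotheses:

$$H_{31}:\ f_j(x)=C_j\ \text{for}\ j=1,2,\dots,p\ \text{and some constants}\ C_j, \tag{15}$$

$$H_{32}:\ f_{j_0}(x)=C_{j_0}\ \text{for some}\ j=j_0\ \text{and some constant}\ C_{j_0}. \tag{16}$$

With the set of knots $T=\{0=t_1<t_2<\cdots<t_m=1\}$, coefficient $f_j(x)$ can also be approximated by

$$f_j(x)\approx\sum_{k=1}^{m}\beta_{jk}\,g_k(x),\qquad j=1,2,\dots,p,$$

where the true value of $\beta_{jk}$ is $f_j(t_k)$. The basis functions $g_k$, $k=1,2,\dots,m$, were defined in (7). The varying-coefficient model (3) is approximately represented as

$$Y = X\beta+\varepsilon, \tag{17}$$

where $X=(F_1,\dots,F_p)$ is an $n\times mp$ matrix and $F_j=(z_{ji}\,g_k(x_i))$, $k=1,2,\dots,m$, $i=1,2,\dots,n$, $j=1,2,\dots,p$. $\beta=(\beta_1^{\top},\dots,\beta_p^{\top})^{\top}$ is an $mp$-dimensional parameter vector with $\beta_j=(f_j(t_1),\dots,f_j(t_m))^{\top}$.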

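It is worth noting that under null hypothesis $H_{31}$ defined in (15), regression model (3) is equivalent to model (17). However, this equivalence does not hold under null hypothesis $H_{32}$ defined in (16). Null hypotheses $H_{31}$ and $H_{32}$ can be expressed in matrix form as the following two, respectively:

$$H_{31}:\ L_1\beta=0, \tag{18}$$

$$H_{32}:\ L_2\beta=0, \tag{19}$$

where $L_1$ is a $p(m-1)\times mp$ matrix,

$$
L_1=\begin{pmatrix}L_{01} & & 0\\ & \ddots & \\ 0 & & L_{01}\end{pmatrix},\qquad
L_{01}=\begin{pmatrix}1 & -1 & 0 & \cdots & 0\\ \vdots & \ddots & \ddots & & \vdots\\ 0 & \cdots & 0 & 1 & -1\end{pmatrix},
$$

$$L_2=\left(0_{(m-1)\times(mj_0-m)},\ L_{01},\ 0_{(m-1)\times(mp-mj_0)}\right)_{(m-1)\times mp}.$$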
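The block structure of these restriction matrices can be sketched in Python. The helper names `diff_matrix`, `build_L1`, and `build_L2` are hypothetical, and the middle block of $L_2$ is assumed to be the same first-difference matrix $L_{01}$ that appears on the diagonal of $L_1$.

```python
def diff_matrix(m):
    """L01: (m-1) x m first-difference matrix with rows (..., 1, -1, ...)."""
    return [[1.0 if j == i else -1.0 if j == i + 1 else 0.0 for j in range(m)]
            for i in range(m - 1)]


def build_L1(m, p):
    """Block-diagonal matrix with p copies of L01: tests constancy of all f_j."""
    rows = []
    for block in range(p):
        for row in diff_matrix(m):
            rows.append([0.0] * (block * m) + row + [0.0] * ((p - 1 - block) * m))
    return rows


def build_L2(m, p, j0):
    """L01 placed in block j0 (1-based): tests constancy of f_{j0} only."""
    return [[0.0] * ((j0 - 1) * m) + row + [0.0] * ((p - j0) * m)
            for row in diff_matrix(m)]


L1 = build_L1(m=3, p=2)          # 4 x 6
L2 = build_L2(m=3, p=2, j0=2)    # 2 x 6
beta_const = [2.0, 2.0, 2.0, 7.0, 7.0, 7.0]  # both coefficient functions constant
print([sum(a * b for a, b in zip(row, beta_const)) for row in L1])
```

With both coefficient functions constant, every row of $L_1$ annihilates the stacked coefficient vector.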

In the same way as the p-value in (13) is defined, the p-values for testing hypotheses $H_{31}$ and $H_{32}$ can be defined as below if the error $\varepsilon$ is normally distributed:

$$p_{31}(\hat\beta,S^2)=1-F_{p(m-1),\,n-mp}\left(\frac{(n-mp)\,\hat\beta^{\top}L_1^{\top}\left[L_1(X^{\top}X)^{-1}L_1^{\top}\right]^{-1}L_1\hat\beta}{p(m-1)\,S^2}\right), \tag{20}$$

$$p_{32}(\hat\beta,S^2)=1-F_{m-1,\,n-mp}\left(\frac{(n-mp)\,\hat\beta^{\top}L_2^{\top}\left[L_2(X^{\top}X)^{-1}L_2^{\top}\right]^{-1}L_2\hat\beta}{(m-1)\,S^2}\right). \tag{21}$$

According to the above discussion, $p_{31}(\hat\beta,S^2)$ is uniformly distributed over (0, 1) under hypothesis $H_{31}$. However, under null hypothesis $H_{32}$, varying-coefficient model (3) is not linear. Hence, the distribution function of $p_{32}(\hat\beta,S^2)$ under $H_{32}$ differs from the uniform distribution. This difference has an exact expression, which can be found in Li et al. [37] (Theorem 3). On the other hand, under some mild conditions, $p_{31}(\hat\beta,S^2)$ and $p_{32}(\hat\beta,S^2)$ both tend to zero in probability as the sample size tends to infinity if the null hypotheses are false. The corresponding proof is also provided in Li et al. [37].

## 3. Multi-scale method based on regression spline

For the regression spline, the number of knots controls the smoothness of the estimator. The determination of knots is important and has a large influence on the inference results. The GCV method is usually used to choose an optimal number of knots. However, even after the number of knots is given, the determination of the optimal positions of the knots is difficult. Shi and Li [38] chose knots by placing an additional new knot to reduce the value of GCV, until it could not be reduced by placing any additional knots. Hence, once a knot was selected, it could not be removed from the knot set. Mao and Zhao [39] determined the locations of the knots conditioned on the number of knots $m$ first and chose $m$ later by the GCV criterion. In fact, the locations of knots can be considered as parameters that can be estimated from data. This is the free-knot spline; see DiMatteo et al. [40] and Sonderegger and Hannig [41]. However, the estimation of the optimal locations is computationally intractable, and replicate knots might appear in the estimated knot vectors [42].

On the other hand, many statisticians think that statistical inference based on a single smoothing level is not reliable, even if that level is the optimal one. Therefore, multi-scale methods have been developed to estimate and test nonparametric regression curves. Chaudhuri and Marron [25, 26] proposed a multi-scale method to explore the significant structures (local minima and maxima or global trend) in data, which is known as SiZer. Significant zero crossings of derivatives (SiZer) is a powerful visualization technique for exploratory data analysis. It applies a large range of smoothing parameter values to do statistical inference simultaneously and uses a 2D colored map (SiZer map) to summarize all of the results inferred at different smoothing levels (scales) and locations.

In this section, a regression spline version of SiZer is proposed for exploring structures of a curve and for comparing multiple regression curves, respectively. The proposed SiZer employs the number of knots as the smoothing parameter (scale). For the sake of simplicity, a linear spline is employed first to construct SiZer, which is denoted as SiZerLS. In addition, another version of SiZer, SiZerSS, which was proposed in Marron and Zhang [33], is introduced. In SiZerSS, a smoothing spline is used to infer the monotonicity of $f(x)$, and the tuning parameter (penalty parameter) that controls the size of the penalty is chosen as the smoothing parameter. Finally, SiZer-RS, a version of SiZer based on higher-order spline interpolation, is constructed to compare multiple regression curves at different scales and locations simultaneously.

In order to understand SiZerLS clearly, we first present an example in which SiZerLS is simulated. This example is modified from Hannig and Lee [28] with the same regression function:

$$f(x) = 5 + \frac{4.2}{1+\left(\dfrac{x-0.3}{0.03}\right)^{4}} + \frac{5.1}{1+\left(\dfrac{x-0.7}{0.01}\right)^{4}}.$$

The observations generated from model (1) with 200 equally spaced design points from (0, 1) and errors distributed as $N(0, 0.5)$ are plotted in Figure 1. Estimator $\hat f_m(x)$ denotes the linear spline smoother obtained from (6) using $m$ equally spaced knots chosen from (0, 1). The curves of $\hat f_m(x)$ with different values of $m$ are also plotted in Figure 1. The simulated SiZerLS map and SiZerSS map are shown in Figure 2.
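A data set of this kind can be generated with the sketch below. Several choices are assumptions rather than details from the chapter: the regression function is read as $f(x)=5+4.2/\{1+((x-0.3)/0.03)^4\}+5.1/\{1+((x-0.7)/0.01)^4\}$, the value 0.5 is treated as the error standard deviation, the design points are cell midpoints, the knot set includes the endpoints 0 and 1, and the names `simulate` and `equally_spaced_knots` are hypothetical.

```python
import random


def f(x):
    """Two-bump regression function (assumed reading of the formula above)."""
    return (5.0
            + 4.2 / (1.0 + ((x - 0.3) / 0.03) ** 4)
            + 5.1 / (1.0 + ((x - 0.7) / 0.01) ** 4))


def simulate(n=200, sd=0.5, seed=0):
    """n equally spaced design points in (0, 1) with Gaussian noise added to f."""
    rng = random.Random(seed)
    xs = [(i + 0.5) / n for i in range(n)]
    ys = [f(x) + rng.gauss(0.0, sd) for x in xs]
    return xs, ys


def equally_spaced_knots(m):
    """m equally spaced knots on [0, 1], endpoints included (requires m >= 2)."""
    return [k / (m - 1) for k in range(m)]


xs, ys = simulate()
print(len(xs), equally_spaced_knots(5))
```

Varying `m` in `equally_spaced_knots` corresponds to moving between the smoothing levels at which the linear spline smoother is computed.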

In Figure 2, BYP SiZerLS is the SiZerLS map based on the multiple testing procedure BYP, where BYP denotes the multiple testing procedure proposed in Benjamini and Yekutieli [43]. SiZerSS is the smoothing spline version of SiZer. The two SiZers are simulated under the same range of scales and nominal level 0.05. There are four colors in SiZer maps: red indicates that the estimated regression curve is significantly decreasing; blue indicates that the estimated regression curve is significantly increasing; purple indicates that the curve is neither significantly increasing nor decreasing; gray shows that there are insufficient data for conducting reasonable statistical inference. Figure 2 preliminarily shows that SiZer maps can locate peaks well. The theoretical foundation of SiZerLS and SiZerSS is discussed in more detail below.

### 3.1. Construction of SiZerLS map for exploring features of regression curve

The proposed SiZerLS map is constructed on the basis of p-values with multiple testing adjustment. The p-value for testing the monotonicity of the smoothed curve is defined first, based on linear spline approximation and the fiducial method, in the same way as the p-values in Section 2. Then, the multiple testing adjustment used to control the row-wise false discovery rate (FDR) of SiZerLS is discussed in detail.

From the viewpoint of SiZer, all of the useful information is included in the smoothed curve, which is defined below. Suppose we have observations $\{(x_i,y_i)\}_{i=1}^{n}$ from regression model (1). By linear spline estimation, estimator $\hat f_m(x)$ can be obtained:

$$\hat f_m(x) = g(x)(G^{\top}G)^{-1}G^{\top}Y, \tag{22}$$

where $g(x)=(g_1(x),g_2(x),\dots,g_m(x))$; $g_j(x)$, $j=1,\dots,m$, are the basis functions defined in (7) on the basis of $m$ knots; and $G$ is the matrix defined in Section 2. The smoothed curve at smoothing level $m$ is denoted as

$$f_m(x) = E\hat f_m(x) = g(x)(G^{\top}G)^{-1}G^{\top}f,$$

where $f=(f(x_1),f(x_2),\dots,f(x_n))^{\top}$. SiZer focuses on $f_m(x)$. Its monotonicity is determined completely by $(G^{\top}G)^{-1}G^{\top}f$. Hence, it is enough to test the following $m-1$ pairs of null hypotheses:

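$$
\begin{aligned}
H_{I_k}&:\ f_m(t_k)=e_k^{\top}(G^{\top}G)^{-1}G^{\top}f \le e_{k+1}^{\top}(G^{\top}G)^{-1}G^{\top}f=f_m(t_{k+1})\quad\text{and}\\
H_{D_k}&:\ f_m(t_k)=e_k^{\top}(G^{\top}G)^{-1}G^{\top}f \ge e_{k+1}^{\top}(G^{\top}G)^{-1}G^{\top}f=f_m(t_{k+1}),\quad k=1,2,\dots,m-1,
\end{aligned}
\tag{23}
$$

where $e_k$ is an $m$-dimensional column vector having 1 in the $k$th entry and zero elsewhere. Let $b$ denote $(G^{\top}G)^{-1}G^{\top}f$. Then, $H_{I_k}$ and $H_{D_k}$ can be written as

$$H_{I_k}:\ L_k b\le 0,\quad k=1,2,\dots,m-1;\qquad H_{D_k}:\ L_k b\ge 0,\quad k=1,2,\dots,m-1, \tag{24}$$

where $L_k=(e_k-e_{k+1})^{\top}$. The p-values for testing the hypotheses in (24) under linear model $Y=Gb+\varepsilon$ can be defined using a pivotal quantity about $b$. This pivotal quantity is $R_E(\hat\beta,S^2)$, which is defined in (11). The p-value for testing $H_{I_k}$ is the fiducial probability that the null hypothesis holds:

$$
\begin{aligned}
P_{I_k}(\hat\beta,S) &= P\left(L_k R_E(\hat\beta,S)\le 0\right) = P\left(L_k\left[\hat\beta-\frac{S}{E_2}(G^{\top}G)^{-1/2}E_1\right]\le 0\right)\\
&= P\left(\frac{\sqrt{n-m}\,L_k(G^{\top}G)^{-1/2}E_1}{\left[L_k(G^{\top}G)^{-1}L_k^{\top}\right]^{1/2}E_2}\ \ge\ \frac{\sqrt{n-m}\,L_k\hat\beta}{S\left[L_k(G^{\top}G)^{-1}L_k^{\top}\right]^{1/2}}\right),
\end{aligned}
\tag{25}
$$

where the subscript $I_k$ of $P_{I_k}$ represents the interval $(t_k,t_{k+1})$ in which we test monotonicity and $m$ represents the number of knots used in the linear interpolation. In addition, the p-value $P_{D_k}(\hat\beta,S)$ for testing $H_{D_k}$ satisfies $P_{I_k}(\hat\beta,S)+P_{D_k}(\hat\beta,S)=1$.

It is worth noting that the p-value $P_{I_k}(\hat\beta,S)$ is uniformly distributed on (0, 1) if all of the hypotheses $H_{I_k}$, $H_{D_k}$, $k=1,2,\dots,m-1$, are true (the regression function is a constant). In applications, the p-value $P_{I_k}(\hat\beta,S)$ for testing $H_{I_k}$ can be approximated as below when $n\to\infty$. This approximation is reasonable (see Theorem 1 in [44]):

$$P_{I_k,m}(\hat\beta,S)\approx 1-\Phi\left(\frac{\sqrt{n-m}\,L_k\hat\beta}{S\left[L_k(G^{\top}G)^{-1}L_k^{\top}\right]^{1/2}}\right). \tag{26}$$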
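The normal approximation to the p-value can be evaluated with the standard library alone. In this sketch the function and argument names are hypothetical: `contrast` stands for $L_k\hat\beta$ (assumed here to be the fitted value at knot $k$ minus the fitted value at knot $k+1$), `s` for $S$, and `scale` for $[L_k(G^{\top}G)^{-1}L_k^{\top}]^{1/2}$, all of which are assumed to have been computed beforehand.

```python
import math


def normal_cdf(z):
    """Standard normal cdf, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_increasing(contrast, s, scale, n, m):
    """Approximate p-value for H_Ik: 1 - Phi(sqrt(n-m) * L_k beta_hat / (S * scale))."""
    z = math.sqrt(n - m) * contrast / (s * scale)
    return 1.0 - normal_cdf(z)


def p_decreasing(contrast, s, scale, n, m):
    """The two p-values of a pair sum to one."""
    return 1.0 - p_increasing(contrast, s, scale, n, m)


# A strongly negative contrast (curve rising from t_k to t_{k+1}) gives a
# p-value near 1 for H_Ik and near 0 for H_Dk.
print(p_increasing(-0.8, s=5.0, scale=0.3, n=200, m=10))
```

A zero contrast gives a p-value of one half for both hypotheses, which matches the symmetry of the pair.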

The proposed SiZerLS map will be constructed on the basis of the above p-values with multiple testing adjustment. In fact, SiZer is a visual method for exploratory data analysis, and it focuses on exploring features that really exist in data instead of testing whether some assumed features are statistically significant in a strict way. FDR is the expected proportion of the false positives among all discoveries, and FDR can be either permissive or conservative according to the number of hypotheses. Considering that different numbers of hypotheses need to be tested for SiZerLS with respect to various smoothing parameters, the multiple testing adjustment to control FDR would be better if used to improve the exploratory property of SiZer. Hence, the well-known multiple testing procedure which was proposed in Benjamini and Yekutieli [43] (denoted as BYP) is applied to control the row-wise FDR of SiZerLS. The BYP was proved to control FDR under α for any dependent test statistics.

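#### 3.1.1. Benjamini-Yekutieli procedure to control FDR (BYP)

Suppose that we have obtained p-values $P_{I_k,m}(\hat\beta,S)$ for testing hypotheses $H_{I_k}$ in (23), $k=1,2,\dots,m-1$:

1. Order the p-values $P_{I_k,m}$ to get the ordered p-values $P_{I_{(1)},m}\le P_{I_{(2)},m}\le\cdots\le P_{I_{(m-1)},m}$.

2. For a given level $\alpha$, find the largest $i\in\{1,2,\dots,m-1\}$ for which $P_{I_{(i)},m}\le \dfrac{i\,\alpha}{(m-1)\sum_{j=1}^{m-1}\frac{1}{j}}$, and reject all $H_{I_{(k)},m}$ for $k=1,2,\dots,i$.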
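The two steps above can be sketched as a short function. The name `benjamini_yekutieli` and the example p-values are hypothetical; the rule implemented is the standard step-up rule with the harmonic-sum correction, applied to a family of $M$ p-values.

```python
def benjamini_yekutieli(pvalues, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected."""
    M = len(pvalues)
    c = sum(1.0 / j for j in range(1, M + 1))             # harmonic correction
    order = sorted(range(M), key=lambda i: pvalues[i])    # indices by ascending p
    largest = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / (M * c):
            largest = rank                                # keep the largest passing rank
    reject = [False] * M
    for idx in order[:largest]:
        reject[idx] = True
    return reject


print(benjamini_yekutieli([0.001, 0.8, 0.004, 0.5], alpha=0.05))
```

In a SiZerLS row with $m$ knots, `pvalues` would hold the $m-1$ p-values of that row, so that the adjustment is carried out separately for each smoothing level.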

The detailed steps to construct SiZerLS with BYP adjustment are given below:

Step 1. Construct the 2D grid map. Without loss of generality, we assume that the design points $x_i$, $i=1,2,\dots,n$, are chosen from [0, 1]. Then the 2D map is the rectangular area $[0,1]\times[\log_{10}(1/m_{\max}),\log_{10}(1/m_{\min})]$; see BYP SiZerLS displayed in Figure 2. The value of $m$ is determined by the following rule: $m=\mathrm{round}(1/10^{l})$, where round(·) is the nearest integer function and $l$ takes equally spaced values from the interval $[\log_{10}(1/m_{\max}),\log_{10}(1/m_{\min})]$. For a given $m$, the abscissa $x$ takes values at the corresponding knots $T_m=\{t_1,t_2,\dots,t_m\}$. On the basis of the different values of $m$ and $T_m$, the 2D map is divided into many pixels.

Step 2. Calculate p-values for each pixel. Each pixel in the 2D map constructed in step 1 is determined by two adjacent knots and a given $m$. For pixel $((t_k,t_{k+1}),m=m_0)$, we calculate the p-values $P_{I_k,m_0}$ and $P_{D_k,m_0}$ for testing hypotheses $H_{I_k,m_0}$ and $H_{D_k,m_0}$, respectively, with $m_0$ knots.

Step 3. Multiple testing adjustment. For a given value $m=m_0$, carry out the multiple testing procedure BYP using the p-values $P_{I_k,m_0}$ ($P_{D_k,m_0}$), $k=1,2,\dots,m_0-1$, obtained from step 2 to test the following family of hypotheses simultaneously:

$$\left\{H_{I_1,m_0},H_{I_2,m_0},\dots,H_{I_{m_0-1},m_0}\right\}\quad\left(\left\{H_{D_1,m_0},H_{D_2,m_0},\dots,H_{D_{m_0-1},m_0}\right\}\right).$$

Step 4. Color pixels. According to the multiple testing results at smoothing level $m_0$, if $H_{I_k,m_0}$ is rejected and $H_{D_k,m_0}$ is accepted, pixel $((t_k,t_{k+1}),m=m_0)$ is colored red to indicate a significant decrease. Conversely, if $H_{I_k,m_0}$ is accepted and $H_{D_k,m_0}$ is rejected, pixel $((t_k,t_{k+1}),m=m_0)$ is colored blue to show a significant increase; purple is used in all other cases, where there is no significant trend.
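The scale grid of step 1 and the coloring rule of step 4 can be sketched as follows. The function names `scale_grid` and `pixel_color` and the choice of five rows are hypothetical; the gray rule based on effective sample size is left out here.

```python
import math


def scale_grid(m_min, m_max, rows):
    """Numbers of knots m = round(1 / 10**l) for equally spaced l (rows >= 2)."""
    lo = math.log10(1.0 / m_max)
    hi = math.log10(1.0 / m_min)
    ls = [lo + r * (hi - lo) / (rows - 1) for r in range(rows)]
    return [round(1.0 / 10 ** l) for l in ls]


def pixel_color(reject_increasing, reject_decreasing):
    """Color of a pixel from the adjusted test decisions for H_I and H_D."""
    if reject_increasing and not reject_decreasing:
        return "red"      # 'increasing' rejected: significant decrease
    if reject_decreasing and not reject_increasing:
        return "blue"     # 'decreasing' rejected: significant increase
    return "purple"       # no significant trend


print(scale_grid(m_min=2, m_max=32, rows=5))
print(pixel_color(True, False))
```

Because the grid is equally spaced on the $\log_{10}$ scale, the numbers of knots grow geometrically from row to row; rounding can produce repeated values of $m$ when many rows are requested over a narrow range.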

In the SiZer map, gray indicates that there are insufficient data to test the monotonicity of the regression function at point $x$ with $m$ knots. Such sufficiency is quantified as the effective sample size (ESS). Noting that the number of nonzero elements in the $k$th column of $G$ has a demonstrable effect on the inference in interval $(t_k,t_{k+1})$, and that it is determined directly by how many observations are included in $(t_k,t_{k+1})$, we define $ESS(t_k,m)$ as

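$$\left(ESS(t_1,m),ESS(t_2,m),\dots,ESS(t_m,m)\right)^{\top}\triangleq G^{\top}G\,\mathbf{1},$$

where $\mathbf{1}$ denotes a vector of ones.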

In the SiZerLS map, pixel $((t_k,t_{k+1}),m=m_0)$ is colored gray if

$$\min\left\{ESS(t_k,m_0),ESS(t_{k+1},m_0)\right\}<5.$$

In order to avoid selecting knots, $m$ equally spaced knots or equal $x$-quantiles are used in the interpolation. The smoothing level of the regression spline estimate is controlled by $m$ together with the positions of the knots. The level of smoothness should be reduced to detect local fine features; however, the total number of knots should be limited to avoid excessive under-smoothing over a wide range. In applications of SiZerLS, the range of scales is recommended to include the coarsest smoothing level, $m=2$, and the finest smoothing level, at which $\mathrm{avg}_{x\in T_{m_{\max}}}ESS(x,m_{\max})<5$.

### 3.2. Construction of SiZerSS map for exploring features of regression curve

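SiZerSS, given in Marron and Zhang [33], employs the smoothing spline to construct the SiZer map for nonparametric model (1). Given $\{(x_i,y_i)\}_{i=1}^{n}$ and a smoothing parameter $\lambda$, the smoothing spline estimator is the function $\hat f_\lambda$ that minimizes the following regularization criterion over functions $f$:

$$\sum_{i=1}^{n}\omega_i\left(y_i-f(x_i)\right)^2+\lambda\int\left(f''(x)\right)^2dx. \tag{27}$$

By simple calculation, we can get the estimator vector:

$$\hat f_\lambda=\left(\hat f_\lambda(x_1),\hat f_\lambda(x_2),\dots,\hat f_\lambda(x_n)\right)^{\top}=(W+\lambda K)^{-1}WY=A_\lambda Y, \tag{28}$$

where the weight matrix is $W=\mathrm{diag}(\omega_1,\omega_2,\dots,\omega_n)$ and the hat matrix is $A_\lambda=(W+\lambda K)^{-1}W$.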

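In order to construct SiZerSS, the derivative of $f$ at any point $x$ needs to be estimated along with its variance. Let $s_i=x_{i+1}-x_i$ and let $Q=(q_{ij})$, $i=1,2,\dots,n$, $j=2,\dots,n-1$, be the $n\times(n-2)$ matrix with $q_{j-1,j}=s_{j-1}^{-1}$, $q_{jj}=-s_{j-1}^{-1}-s_j^{-1}$, $q_{j+1,j}=s_j^{-1}$, and $q_{ij}=0$ for $|i-j|\ge 2$. Let $(\gamma_1,\gamma_2,\dots,\gamma_n)=(f''(x_1),f''(x_2),\dots,f''(x_n))$. By the definition of the natural cubic spline, $f''(x_1)=f''(x_n)=0$. Let $\gamma=(\gamma_2,\dots,\gamma_{n-1})^{\top}$. According to Theorem 2.1 of Green and Silverman [45], the vectors $f$ and $\gamma$ specify a natural cubic spline $f$ if and only if

$$Q^{\top}f=R\gamma,$$

where $R$ is an $(n-2)\times(n-2)$ symmetric matrix with elements $r_{ij}$, $i=2,\dots,n-1$, $j=2,\dots,n-1$, given by $r_{ii}=\frac{1}{3}(s_{i-1}+s_i)$, $r_{i,i+1}=r_{i+1,i}=\frac{1}{6}s_i$, and $r_{ij}=0$ for $|i-j|\ge 2$. The estimator $\hat\gamma$ can be obtained from the equation $(R+\lambda Q^{\top}Q)\gamma=Q^{\top}Y$. Then the estimators $\hat f_\lambda(x)$ and $\hat f'_\lambda(x)$ can be written as linear combinations of $\hat f_\lambda$ and $\hat\gamma$. Let $h_i(x)=x-x_i$, $i=1,2,\dots,n$.

When $x<x_1$,

$$
\hat f_\lambda(x)=\hat f_\lambda(x_1)+h_1(x)\left[\frac{\hat f_\lambda(x_2)-\hat f_\lambda(x_1)}{s_1}-\frac{s_1}{6}\hat\gamma_2\right],\qquad
\hat f'_\lambda(x)=\frac{\hat f_\lambda(x_2)-\hat f_\lambda(x_1)}{s_1}-\frac{s_1}{6}\hat\gamma_2.
$$

When $x_i\le x\le x_{i+1}$, let $\delta_i(x)=\left(1+\dfrac{h_i(x)}{s_i}\right)\hat\gamma_{i+1}+\left(1-\dfrac{h_{i+1}(x)}{s_i}\right)\hat\gamma_i$ for $i=1,2,\dots,n-1$. Then

$$
\begin{aligned}
\hat f_\lambda(x)&=\frac{h_i(x)\hat f_\lambda(x_{i+1})-h_{i+1}(x)\hat f_\lambda(x_i)}{s_i}+\frac{h_i(x)h_{i+1}(x)\,\delta_i(x)}{6},\\
\hat f'_\lambda(x)&=\frac{\hat f_\lambda(x_{i+1})-\hat f_\lambda(x_i)}{s_i}+\frac{h_i(x)h_{i+1}(x)\left(\hat\gamma_{i+1}-\hat\gamma_i\right)}{6s_i}+\frac{h_i(x)+h_{i+1}(x)}{6}\,\delta_i(x).
\end{aligned}
$$

When $x>x_n$,

$$
\begin{aligned}
\hat f_\lambda(x)&=\hat f_\lambda(x_n)+h_n(x)\left[\frac{\hat f_\lambda(x_n)-\hat f_\lambda(x_{n-1})}{s_{n-1}}+\frac{s_{n-1}}{6}\hat\gamma_{n-1}\right],\\
\hat f'_\lambda(x)&=\frac{\hat f_\lambda(x_n)-\hat f_\lambda(x_{n-1})}{s_{n-1}}+\frac{s_{n-1}}{6}\hat\gamma_{n-1}.
\end{aligned}
$$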
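The band matrices $Q$ and $R$ that link the function values to the second derivatives at the design points can be assembled directly from the spacings. The sketch below uses the hypothetical name `spline_matrices` and an arbitrary set of design points; it follows the Green and Silverman construction, in which $Q$ has $n-2$ columns.

```python
def spline_matrices(x):
    """Return (Q, R) for strictly increasing design points x (len(x) >= 3).

    Q is n x (n-2) and R is (n-2) x (n-2); column c corresponds to the
    interior point with 1-based index j = c + 2.
    """
    n = len(x)
    s = [x[i + 1] - x[i] for i in range(n - 1)]   # s[c] is the spacing s_{c+1}
    Q = [[0.0] * (n - 2) for _ in range(n)]
    R = [[0.0] * (n - 2) for _ in range(n - 2)]
    for c in range(n - 2):
        Q[c][c] = 1.0 / s[c]                          # q_{j-1,j}
        Q[c + 1][c] = -1.0 / s[c] - 1.0 / s[c + 1]    # q_{j,j}
        Q[c + 2][c] = 1.0 / s[c + 1]                  # q_{j+1,j}
        R[c][c] = (s[c] + s[c + 1]) / 3.0             # r_{jj}
        if c + 1 < n - 2:
            R[c][c + 1] = R[c + 1][c] = s[c + 1] / 6.0
    return Q, R


x = [0.0, 0.1, 0.4, 0.5, 1.0]
Q, R = spline_matrices(x)
f_linear = [2.0 + 3.0 * xi for xi in x]
# Q^T f is a vector of scaled second differences, so it vanishes for a linear f.
Qt_f = [sum(Q[i][c] * f_linear[i] for i in range(len(x))) for c in range(len(x) - 2)]
print(Qt_f)
```

For a straight line the second differences are zero, which is consistent with $\gamma=0$ in the relation between $f$ and $\gamma$.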

The variance of $\hat f'_\lambda(x)$ can be calculated easily once an estimator of $\sigma^2$, the variance of the error in model (1), is obtained. $\sigma^2$ can be estimated from the sum of squared residuals $(y_i-\hat f_\lambda(x_i))^2$. If $\sigma^2$ is a function of $x$, $\sigma^2(x)$ can be estimated from the squared residuals $(y_i-\hat f_\lambda(x_i))^2$. The confidence intervals for $f'_\lambda(x)$ are of the form:

$$\hat f'_\lambda(x)\pm q\cdot\widehat{SD}\left(\hat f'_\lambda(x)\right), \tag{29}$$

where $q$ is based on the nominal level. For details, see Section 3 of Chaudhuri and Marron [25].

SiZerSS can be constructed in the same way as SiZerLS. For different values of $x$, if interval (29) contains zero, pixel $(x,\lambda)$ is colored purple; if the confidence interval lies to the right of zero, blue is used to indicate increasing; otherwise, red is used to indicate decreasing. Gray is used to indicate that there are insufficient data to do reliable inference. The definition of sufficiency can be found in Chaudhuri and Marron [25].

The simulated SiZerLS and SiZerSS maps are displayed in Figure 2, where the red and blue regions locate the bumps of regression curve accurately. This simulation illustrates the good behavior of SiZerLS and SiZerSS in exploring features in data.

### 3.3. Construction of SiZer-RS map for comparing multiple regression curves

The comparison of two or more populations is a common problem and is of great practical interest in statistics. In this subsection, the comparison of multiple regression curves in a general regression setting is developed based on regression splines. Suppose we have $n=\sum_{i=1}^{k}n_i$ independent observations from the following $k$ regression models:

$$y_{ij}=f_i(x_{ij})+\sigma_i(x_{ij})\varepsilon_{ij},\qquad i=1,2,\dots,k,\ j=1,2,\dots,n_i, \tag{30}$$

where the $x_{ij}$ are covariates, the errors $\varepsilon_{ij}\sim N(0,1)$ are independent and identically distributed, $f_i(\cdot)$ is the regression function, and $\sigma_i^2(\cdot)$ is the conditional variance function of the $i$th population. We are concerned with whether the $k$ populations in model (30) are equal and, if not, what the difference is. To this end, a multi-scale method based on regression splines, SiZer-RS, is proposed to compare the $f_i(\cdot)$ across multiple scales and locations.

As described in Park and Kang [32], the choice of the smoothing parameter is also important for comparing regression curves. They developed SiZer for the comparison of regression curves based on the local linear smoother. The SiZer map for comparing regression curves is a 2D color map, which consists of a large number of pixels. Each pixel is indexed by a scale (smoothing parameter) and a location; the color of a pixel indicates the result of testing the equality of two or more regression curves at the corresponding location and scale. SiZer provides more information about the locations of the differences among the regression curves if they do exist. Park et al. [46] developed an ANOVA-type test statistic and applied it in scale space for testing the equality of more than two regression curves.

The works mentioned above are kernel-based methods. Besides these, the regression spline is an important smoothing device and is widely used in applications. For a given smoothing parameter $m$ (the number of knots used in the regression spline), the p-value for testing the equality of $k$ regression curves at point $x$ is established. Then, SiZer-RS is constructed in the same way as SiZerLS for comparing multiple regression curves based on higher-order spline interpolation.

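For a given smoothing parameter $m$ (the number of knots used in the regression spline), the smoothed curve is defined as $f_{i,m}(x)=E(\hat f_{i,m}(x))$, where $\hat f_{i,m}(x)$ is the regression spline estimator. SiZer-RS for comparing multiple regression curves is based on the results of testing the null hypothesis

$$H_{m,x}:\ f_{1,m}(x)=f_{2,m}(x)=\cdots=f_{k,m}(x), \tag{31}$$

at point $x$ with smoothing parameter $m$. Without loss of generality, we still suppose that the explanatory variable $x$ takes values in [0, 1]. On the basis of a knot set $T_m=\{0=t_1<t_2<\cdots<t_m=1\}$, we have the approximation:

$$f_i(x)\approx\sum_{s=1}^{m+q-1}\beta_{i,s}\,g_{m,s}(x)\triangleq N_m(x)\beta_{im}, \tag{32}$$

where $\beta_{im}=(\beta_{i,1},\beta_{i,2},\dots,\beta_{i,m+q-1})^{\top}$. The estimator of $f_i(x)$ at smoothing level $m$ is $\hat f_{i,m}(x)=N_m(x)\hat\beta_{im}$, in which $N_m(x)=(g_{m,s}(x))_{s=1,2,\dots,m+q-1}$. If $q=3$, the elements of $N_m(x)$ are defined below:

$$N_{ml}(x)=(t_l-t_{l-4})\,[t_{l-4},t_{l-3},t_{l-2},t_{l-1},t_l](t-x)_+^3,\qquad l=2,3,\dots,m+3,$$

where $t_l=t_{\min(\max(l,1),m)}$ for $l=-2,-1,\dots,m+3$, and

$$(t-x)_+^3=\begin{cases}(t-x)^3, & t>x,\\ 0, & t\le x.\end{cases}$$

For a function $g(\cdot)$, $[t_{l-4},t_{l-3},t_{l-2},t_{l-1},t_l]g(\cdot)$ denotes the fourth-order divided difference of $g(\cdot)$, that is:

$$
[t_1,t_2]g=\begin{cases}g'(t), & \text{if } t_1=t_2=t,\\[4pt] \dfrac{g(t_2)-g(t_1)}{t_2-t_1}, & \text{otherwise},\end{cases}
\qquad
[t_1,t_2,\dots,t_k]g=\begin{cases}\dfrac{g^{(k-1)}(t)}{(k-1)!}, & \text{if } t_1=\cdots=t_k=t,\\[8pt] \dfrac{[t_2,t_3,\dots,t_k]g-[t_1,t_2,\dots,t_{k-1}]g}{t_k-t_1}, & \text{otherwise}.\end{cases}
$$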
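The divided-difference construction of a cubic B-spline can be checked numerically. The sketch below is restricted to pairwise distinct knots, so the coincident-knot (derivative) cases at the boundary are not handled; the function names are hypothetical.

```python
def divided_difference(points, g):
    """[t_1, ..., t_k]g by the recursive definition (distinct points only)."""
    if len(points) == 1:
        return g(points[0])
    upper = divided_difference(points[1:], g)
    lower = divided_difference(points[:-1], g)
    return (upper - lower) / (points[-1] - points[0])


def truncated_cubic(x):
    """The function t -> (t - x)_+^3 for a fixed x."""
    return lambda t: (t - x) ** 3 if t > x else 0.0


def cubic_bspline(x, knots5):
    """(t_l - t_{l-4}) [t_{l-4}, ..., t_l] (t - x)_+^3 for five distinct knots."""
    return (knots5[-1] - knots5[0]) * divided_difference(knots5, truncated_cubic(x))


# On the knots 0, 1, 2, 3, 4 the cubic B-spline peaks at x = 2.
print(cubic_bspline(2.0, [0.0, 1.0, 2.0, 3.0, 4.0]))
```

For equally spaced knots, the value at the central knot is 2/3, the familiar peak of the normalized cubic B-spline, and the function vanishes outside the span of its five knots.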

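Then model (30) can be approximately written as the following linear regression model:

$$Y_i=G_{im}\beta_{im}+\Sigma_i^{1/2}E_i, \tag{33}$$

where

$$Y_i=(y_{i1},y_{i2},\dots,y_{in_i})^{\top},\quad G_{im}=\left(N_{ml}(x_{ij})\right)_{n_i\times(m+2)},\quad \Sigma_i=\mathrm{diag}\left(\sigma_i^2(x_{ij})\right),\quad E_i=(\varepsilon_{i1},\varepsilon_{i2},\dots,\varepsilon_{in_i})^{\top}.$$

At first, we suppose that $\Sigma_i$ is known; it is later replaced by an available estimator.

From regression model (33), we can get the estimator $\hat\beta_{im}=(G_{im}^{\top}\Sigma_i^{-1}G_{im})^{-1}G_{im}^{\top}\Sigma_i^{-1}Y_i$. Let $b_{im}$ denote the expectation of $\hat\beta_{im}$:

$$b_{im}=E\hat\beta_{im}=(G_{im}^{\top}\Sigma_i^{-1}G_{im})^{-1}G_{im}^{\top}\Sigma_i^{-1}f_i,$$

where $f_i=(f_i(x_{i1}),\dots,f_i(x_{in_i}))^{\top}$. Therefore, the smoothed curve is

$$f_{i,m}(x)=E\hat f_{i,m}(x)=E\left[N_m(x)(G_{im}^{\top}\Sigma_i^{-1}G_{im})^{-1}G_{im}^{\top}\Sigma_i^{-1}Y_i\right]=N_m(x)b_{im}. \tag{34}$$

Denote $b_m=(b_{1m}^{\top},b_{2m}^{\top},\dots,b_{km}^{\top})^{\top}$ and, correspondingly, denote its estimator as $\hat\beta_m=(\hat\beta_{1m}^{\top},\hat\beta_{2m}^{\top},\dots,\hat\beta_{km}^{\top})^{\top}$. Hypothesis $H_{m,x}$ can be presented as

$$H_{m,x}:\ L_m(x)b_m=0_{k-1}, \tag{35}$$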

where

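$$
L_m(x)=\begin{pmatrix}
N_m(x) & -N_m(x) & 0 & \cdots & 0\\
N_m(x) & 0 & -N_m(x) & \cdots & 0\\
\vdots & \vdots & & \ddots & \vdots\\
N_m(x) & 0 & 0 & \cdots & -N_m(x)
\end{pmatrix}
$$

is a $(k-1)\times k(m+q-1)$ matrix.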
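One contrast matrix that encodes the equality of the $k$ smoothed curves at a point can be built as follows. This is a sketch under an assumption about the layout: each row compares the first curve with one of the others, and `contrast_matrix` and the numerical basis values are hypothetical.

```python
def contrast_matrix(N_row, k):
    """(k-1) x (k*d) matrix; row i encodes N(x) b_1 - N(x) b_{i+1} = 0."""
    d = len(N_row)
    L = []
    for i in range(1, k):
        row = list(N_row) + [0.0] * (d * (k - 1))    # N_m(x) in the first block
        row[i * d:(i + 1) * d] = [-v for v in N_row]  # -N_m(x) in block i+1
        L.append(row)
    return L


N_row = [0.2, 0.5, 0.3]            # illustrative basis values N_m(x) at one point x
L = contrast_matrix(N_row, k=3)    # 2 x 9
b_equal = [1.0, -2.0, 4.0] * 3     # identical coefficient vectors for all curves
print([sum(a * b for a, b in zip(row, b_equal)) for row in L])
```

When all coefficient blocks coincide, every row of the matrix returns zero, so the hypothesis of equal smoothed curves at $x$ holds; any other full-rank set of pairwise contrasts would define the same hypothesis.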

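The p-value for testing hypothesis $H_{m,x}$ in (35) can be defined as

$$
\begin{aligned}
p_{m,x}(\hat\beta_m,\hat\Sigma_m)=P\Big\{&T_m^{\top}L_m(x)^{\top}\left[L_m(x)\hat\Sigma_mL_m(x)^{\top}\right]^{-1}L_m(x)T_m\\
&\ge\hat\beta_m^{\top}L_m(x)^{\top}\left[L_m(x)\hat\Sigma_mL_m(x)^{\top}\right]^{-1}L_m(x)\hat\beta_m\Big\},
\end{aligned}
\tag{36}
$$

where $T_m$ is the vector obtained by stacking $(G_{im}^{\top}\hat\Sigma_{i,m}^{-1}G_{im})^{-1}G_{im}^{\top}\hat\Sigma_{i,m}^{-1/2}E_i$, $i=1,2,\dots,k$; $\hat\Sigma_{i,m}=\mathrm{diag}\left(\hat\sigma_i^2(x_{ij})\right)_{j=1,2,\dots,n_i}$ is an estimator of the variance matrix of the $i$th regression model; and

$$\hat\Sigma_m=\mathrm{diag}\left((G_{im}^{\top}\hat\Sigma_{i,m}^{-1}G_{im})^{-1}\right)_{i=1,2,\dots,k}$$

is an estimator of the variance matrix of $T_m$ given $\hat\beta_{im}$, $\hat\sigma_{im}^2$, $i=1,2,\dots,k$. The estimator of $\sigma_i(x_{ij})$ can be found in Li and Xu [36], where the smoothing parameter $m_p$ can be used as a pilot smoothing parameter, which is different from the $m$ used in $\hat f_{i,m}(x)$. The SiZer-RS map can be constructed for different values of $m_p$, which represent different trade-offs between the structure of the regression curve and the errors.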

The two SiZer maps given in Figure 4 are constructed using the data plotted in Figure 3 to compare three regression curves, $f_1(x)=f_2(x)=0$ and $f_3(x)=0.5\sin(2\pi x)$. Since the variance of the errors is constant, it can be estimated by the sum of squared residuals. In this case, the pilot smoothing parameter is avoided [47, 48]. The two blue regions in Figure 4 clearly show the difference between the curves across the interval (0, 1). The gray color indicates that there are insufficient data to obtain credible testing results at $x$ and nearby. The sufficiency is quantified as $ESS(x,m)$ for SiZer-RS, and pixel $(x,m)$ is colored gray if $ESS(x,m)<5$:

$$ESS(x,m)\triangleq\min_{i=1,2,\dots,k}N_m(x)\,G_{im}^{\top}G_{im}\,\mathbf{1},$$

where $\mathbf{1}$ denotes a vector of ones.

Figure 4 shows that SiZer-RS map can explore the differences between regression curves accurately.

It is worth noting that, for the SiZer-RS map, the coarsest smoothing level should be $m=q+1$ to ensure the effectiveness of the $q$th-order regression spline, and the finest smoothing level is recommended to be the one such that $\mathrm{avg}_{x\in\{x_1,x_2,\dots,x_g\}}ESS(x,m)<5$, where $x_1,x_2,\dots,x_g$ are the points at which hypothesis $H_{m,x}$ is tested and pixels are produced by combining different values of $m$. In applications, a wide range of values of $m_p$ can be used to generate a family of SiZer-RS maps. In particular, $m_p$ and $m$ can both be used as smoothing parameters simultaneously to construct a 3D SiZer-RS map [47, 48].

## 4. Conclusion

This chapter introduces the regression spline method for testing the parametric form of the nonparametric regression function in nonparametric, partial linear, and varying-coefficient models. The corresponding p-values are established based on the fiducial method and spline interpolation. The test procedures based on the proposed p-values are accurate in some cases and are consistent under some mild conditions, which means that the p-value tends to zero when the null hypothesis is false as the sample size and the number of knots used in spline interpolation tend to infinity. Hence, the proposed test procedures perform well, especially when the sample size is small.

The spline-based method is a frequently used smoothing method, which can easily be combined with other statistical methods. When using the spline-based method, the smoothing level is controlled by the number of knots and their positions. In order to sidestep the determination of knots and obtain more reliable results, multi-scale smoothing methods based on spline regression are proposed to infer the structures of the regression function. The multi-scale method is a visual method for doing inference at different locations and smoothing levels. In addition, the smoothing spline version of the multi-scale method is also introduced. The proposed multi-scale method can also be used for comparing multiple regression curves. Some real data examples illustrate the practicability of the proposed multi-scale method.

The MATLAB code of SiZerLL and other versions of SiZer based on kernel smoothers is available from the homepage of Professor J. S. Marron; the MATLAB code of SiZerLS can be downloaded from the following website:


© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## How to cite and reference

### Cite this chapter

Na Li (June 6th 2018). Model Testing Based on Regression Spline, Topics in Splines and Applications, Young Kinh-Nhue Truong and Muhammad Sarfraz, IntechOpen, DOI: 10.5772/intechopen.74858. Available from:
