## 1. Introduction

A large number of problems in production planning and scheduling, location, transportation, finance, and engineering design require that decisions be made in the presence of uncertainty. From the very beginning of the application of optimization to these problems, it was recognized that analysts of natural and technological systems are almost always confronted with uncertainty. Uncertainty, for instance, governs the prices of fuels, the availability of electricity, and the demand for chemicals. A key difficulty in optimization under uncertainty is in dealing with an uncertainty space that is huge and frequently leads to very large-scale optimization models. Decision-making under uncertainty is often further complicated by the presence of integer decision variables to model logical and other discrete decisions in a multi-period or multi-stage setting.

Approaches to optimization under uncertainty have followed a variety of modeling philosophies, including expectation minimization, minimization of deviations from goals, minimization of maximum costs, and optimization over soft constraints. The main approaches to optimization under uncertainty are stochastic programming (recourse models, robust stochastic programming, and probabilistic models), fuzzy programming (flexible and possibilistic programming), and stochastic dynamic programming.

This paper is devoted to the improvement of statistical decisions in revenue management systems. Revenue optimization – or revenue management as it is also called – is a relatively new field currently receiving much attention from researchers and practitioners. It focuses on how a firm should set and update pricing and product availability decisions across its various selling channels in order to maximize its profitability. The most familiar example probably comes from the airline industry, where tickets for the same flight may be sold at many different fares throughout the booking horizon depending on product restrictions as well as the remaining time until departure and the number of unsold seats. Since the tickets for a flight have to be sold before the plane takes off, the product is perishable and cannot be stored for future use. The use of the above strategies has transformed the transportation and hospitality industries, and has become increasingly important in retail, telecommunications, entertainment, financial services, health care and manufacturing. In parallel, pricing and revenue optimization has become a rapidly expanding practice in consulting services, and a growing area of software and IT development, where the revenue optimization systems are tightly integrated into existing supply chain management solutions.

Most stochastic models, which are used in revenue optimization systems, are developed under the assumptions that the parameter values of the models are known with certainty. When these models are applied to solve real-world problems, the parameters are estimated and then treated as if they were the true values. The risk associated with using estimates rather than the true parameters is called estimation risk and is often ignored. When data are limited and/or unreliable, estimation risk may be significant, and failure to incorporate it into the model design may lead to serious errors. Its explicit consideration is important since decision rules which are optimal in the absence of uncertainty need not even be approximately optimal in the presence of such uncertainty.

In this paper, we consider the cases where it is known that the underlying probability distributions belong to a parameterized family of distributions. However, unlike in the Bayesian approach, we do not assume any prior knowledge of the parameter values. The primary purpose of the paper is to introduce the idea of embedding sample statistics (say, sufficient statistics or maximum likelihood estimators) in the performance index of a revenue optimization problem. In this case, we find the optimal stochastic control policy directly. We demonstrate that the traditional approach, which separates the estimation and the optimization tasks in revenue optimization systems (i.e., uses the estimates as if they were the true parameters), can often lead to poor results. It will be noted that the optimal statistical decision rules depend on data availability.

For constructing the improved statistical decisions, a new technique of invariant embedding of sample statistics in a performance index is proposed (Nechval et al., 1999; 2004; 2008; 2010a; 2010b; 2010c; 2010d; 2010e; 2011a; 2011b). This technique represents a simple and computationally attractive statistical method based on the constructive use of the invariance principle in mathematical statistics. Unlike the Bayesian approach, the invariant embedding technique is independent of the choice of priors, i.e., the subjectivity of the investigator is eliminated from the problem. The technique allows one to eliminate unknown parameters from the problem and to find improved invariant statistical decision rules, which have smaller risk than any of the well-known traditional statistical decision rules.

In order to obtain improved statistical decisions for revenue management under parametric uncertainty, three prediction situations can be considered: “new-sample” prediction, “within-sample” prediction, and “new-within-sample” prediction. For the new-sample prediction situation, the data from a past sample are used to make predictions on a future unit or sample of units from the same process or population. For the within-sample prediction situation, the problem is to predict future events in a sample or process based on early data from that sample or process. For the new-within-sample prediction situation, the problem is to predict future events in a sample or process based on early data from that sample or process as well as on a past data sample from the same process or population. Some mathematical preliminaries for the within-sample prediction situation are given below.

## 2. Mathematical preliminaries for the within-sample prediction situation

*Theorem 1.* Let X_{1} ≤ ... ≤ X_{k} be the first k ordered observations (order statistics) in a sample of size m from a continuous distribution with probability density function f_{θ}(x) and distribution function F_{θ}(x), where θ is a parameter (in general, a vector). Then the joint probability density function of X_{1} ≤ ... ≤ X_{k} and the *l*th order statistic X_{l} (1 ≤ k < l ≤ m) is given by

where

(3) |

represents the conditional probability density function of X_{l} given X_{k}=x_{k}.

*Proof.* The joint density of *X*_{1} ... *X*_{k} and *X*_{l} is given by

(4) |

It follows from (4) that

i.e., the conditional distribution of X_{l}, given X_{i} = x_{i} for all i = 1, …, k, is the same as the conditional distribution of X_{l}, given only X_{k} = x_{k}. This ends the proof. □

*Corollary 1.1.* The conditional probability distribution function of *X*_{l} given *X*_{k}=*x*_{k} is

(6) |

*Corollary 1.2*. If l = k + 1,

(7) |

(8) |

*Corollary 1.3*. If l = m,

(9) |

(10) |
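As an illustration of the Markov property used in the proof of Theorem 1, the following minimal sketch evaluates the conditional distribution function of X_{l} given X_{k} = x_{k} numerically. It relies on the standard fact that, given X_{k} = x_{k}, the remaining m − k observations behave like an independent sample from the parent distribution truncated on the left at x_{k}; the function names and the exponential example are assumptions made for illustration and are not taken from the paper.

```python
from math import comb, exp


def conditional_cdf_order_stat(x, x_k, k, l, m, cdf):
    """P(X_l <= x | X_k = x_k) for order statistics of an i.i.d. sample of size m."""
    if not (1 <= k < l <= m):
        raise ValueError("need 1 <= k < l <= m")
    if x <= x_k:
        return 0.0
    # Chance that a single remaining observation falls in (x_k, x].
    p = (cdf(x) - cdf(x_k)) / (1.0 - cdf(x_k))
    n = m - k  # observations still to come after the k-th
    r = l - k  # X_l is the r-th smallest of those n observations
    # X_l <= x exactly when at least r of the n remaining observations are <= x.
    return sum(comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(r, n + 1))


def exp_cdf(x, mean=2.0):
    """Exponential distribution function (illustrative parent distribution)."""
    return 1.0 - exp(-x / mean)


print(conditional_cdf_order_stat(3.0, 1.0, k=2, l=3, m=5, cdf=exp_cdf))
```

For l = k + 1 the sum collapses to 1 − (1 − p)^{m−k}, and for l = m it collapses to p^{m−k}, which are the two special cases singled out in the corollaries.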

### 2.1. Exponential distribution

In order to use the results of Theorem 1, we consider, for illustration, the exponential distribution with the probability density function

and the probability distribution function

*Theorem 2.* Let X_{1} ≤ ... ≤ X_{k} be the first k ordered observations (order statistics) in a sample of size m from the exponential distribution (11). Then the conditional probability density function of the lth order statistic X_{l} (1 ≤ k < l ≤ m) given X_{k} = x_{k} is

(13) |

and the conditional probability distribution function of the lth order statistic X_{l} given X_{k} = x_{k} is

(14) |

*Proof.* It follows from (3) and (6), respectively. □

*Corollary 2.1.* If l = k + 1,

(15) |

(16) |

*Corollary 2.2*. If l = m,

(17) |

(18) |

*Theorem 3.* Let X_{1} ≤ ... ≤ X_{k} be the first k ordered observations (order statistics) in a sample of size m from the exponential distribution (11), where the parameter is unknown. Then the predictive probability density function of the lth order statistic X_{l} (1 ≤ k < l ≤ m) is given by

where

is the sufficient statistic for the unknown parameter, and the predictive probability distribution function of the *l*th order statistic X_{l} is given by

*Proof.* Using the technique of invariant embedding (Nechval et al., 1999; 2004; 2008; 2010a; 2010b; 2010c; 2010d; 2010e; 2011a; 2011b), we reduce (13) to

where

is the pivotal quantity, the probability density function of which is given by

Then

This ends the proof. □
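The way a pivotal quantity removes the unknown exponential scale can be checked numerically for the case l = k + 1. The sketch below is written under stated assumptions rather than as the paper's own formulas: it takes the sufficient statistic to be the total time on test, S_{k} = X_{1} + … + X_{k} + (m − k)X_{k}, and uses the classical result that P(X_{k+1} − X_{k} > t | S_{k} = s) = (1 + (m − k)t/s)^{−k}. All function names are hypothetical.

```python
import random


def predictive_cdf_next(t, s_k, k, m):
    """P(X_{k+1} - X_k <= t | S_k = s_k); the unknown scale does not appear."""
    if t <= 0:
        return 0.0
    return 1.0 - (1.0 + (m - k) * t / s_k) ** (-k)


def upper_prediction_limit(prob, s_k, k, m):
    """Value t at which predictive_cdf_next(t, s_k, k, m) equals prob."""
    return s_k * ((1.0 - prob) ** (-1.0 / k) - 1.0) / (m - k)


def coverage(prob, k, m, scale, runs=20000, seed=1):
    """Monte Carlo share of samples in which X_{k+1} - X_k stays below the limit."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        xs = sorted(rng.expovariate(1.0 / scale) for _ in range(m))
        # Assumed sufficient statistic: total time on test after k observations.
        s_k = sum(xs[:k]) + (m - k) * xs[k - 1]
        if xs[k] - xs[k - 1] <= upper_prediction_limit(prob, s_k, k, m):
            hits += 1
    return hits / runs


# The empirical coverage should be close to 0.9 whatever the true scale is.
print(coverage(0.9, k=3, m=10, scale=5.0), coverage(0.9, k=3, m=10, scale=50.0))
```

The limit is computed from the data alone, yet its frequentist coverage does not depend on the true scale, which is the property the invariant embedding argument exploits.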

*Corollary 3.1*. If l = k + 1,

*Corollary 3.2*. If l = m,

### 2.2. Cumulative customer demand

The primary purpose of this paper is to introduce the idea of cumulative customer demand in inventory control problems to deal with the order statistics from the underlying distribution. It allows one to use the above results to improve statistical decisions for inventory control problems under parametric uncertainty.

*Assumptions.* The customer demand at the *i*th period is a random variable Y_{i}, i ∈ {1, …, m}. It is assumed (for the cumulative customer demand) that the random variables

represent the order statistics (X_{1} … X_{m}) from the exponential distribution (11).

*Inferences.* For the above case, we have the following inferences.

Conditional probability density function of Y_{k+1}, k ∈ {1, …, m − 1}, is given by

Conditional probability distribution function of Y_{k+1}, k ∈ {1, …, m − 1}, is given by

Conditional probability density function of

Conditional probability distribution function of

Predictive probability density function of Y_{k+1}, k ∈ {1, …, m − 1}, is given by

Predictive probability distribution function of Y_{k+1}, k ∈ {1, …, m − 1}, is given by

Predictive probability density function of Z_{m} is given by

Predictive probability distribution function of Z_{m} is given by

## 3. Stochastic inventory control problem

Most of the inventory management literature assumes that demand distributions are specified explicitly. However, in many practical situations, the true demand distributions are not known, and the only information available may be a time-series of historic demand data. When the demand distribution is unknown, one may either use a parametric approach (where it is assumed that the demand distribution belongs to a parametric family of distributions) or a non-parametric approach (where no assumption regarding the parametric form of the unknown demand distribution is made).

Under the parametric approach, one may choose to estimate the unknown parameters or choose a prior distribution for the unknown parameters and apply the Bayesian approach to incorporating the demand data available. Scarf (1959) and Karlin (1960) consider a Bayesian framework for the unknown demand distribution. Specifically, assuming that the demand distribution belongs to the family of exponential distributions, the demand process is characterized by the prior distribution on the unknown parameter. Further extension of this approach is presented in (Azoury, 1985). Application of the Bayesian approach to the censored demand case is given in (Ding et al., 2002; Lariviere & Porteus, 1999). Parameter estimation is first considered in (Conrad, 1976) and recent developments are reported in (Agrawal & Smith, 1996; Nahmias, 1994). Liyanage & Shanthikumar (2005) propose the concept of operational statistics and apply it to a single period newsvendor inventory control problem.

This section deals with inventory items that are in stock during a single time period. At the end of the period, leftover units, if any, are disposed of, as in fashion items. Two models are considered. The difference between the two models is whether or not a setup cost is incurred for placing an order. The symbols used in the development of the models include:

c = setup cost per order,

c_{1}= holding cost per held unit during the period,

c_{2}= penalty cost per shortage unit during the period,

g_{θ}(y_{k+1}|k) = conditional probability density function of customer demand, Y_{k+1}, during the (k+1)th period,

θ = parameter (in general, vector),

u = order quantity,

q = inventory on hand before an order is placed.

### 3.1. No-setup model (Newsvendor model)

This model is known in the literature as the *newsvendor* model (the original classical name is the *newsboy* model). It deals with stocking and selling newspapers and periodicals. The assumptions of the model are:

Demand occurs instantaneously at the start of the period immediately after the order is received.

No setup cost is incurred.

The model determines the optimal value of u that minimizes the sum of the expected holding and shortage costs. Given optimal u (= u^{*}), the inventory policy calls for ordering u^{*} − q if q < u^{*}; otherwise, no order is placed.
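For a concrete feel of the no-setup policy, the following sketch treats the case of a known parameter under a simplifying assumption that is not taken from the paper: demand during the period is exponential with known mean `mean_demand`, a unit left over costs c_{1} and a unit short costs c_{2}. Under that assumption the optimal level satisfies the usual critical-ratio condition F(u^{*}) = c_{2}/(c_{1} + c_{2}). The function names are hypothetical.

```python
from math import exp, log


def expected_cost(u, mean_demand, c1, c2):
    """Expected holding plus shortage cost for stock level u, exponential demand."""
    tail = mean_demand * exp(-u / mean_demand)  # E[(Y - u)+], expected shortage
    held = u - mean_demand + tail               # E[(u - Y)+], expected leftover
    return c1 * held + c2 * tail


def optimal_stock_level(mean_demand, c1, c2):
    """u* solving F(u*) = c2 / (c1 + c2) when demand is exponential."""
    return mean_demand * log(1.0 + c2 / c1)


def order_quantity(q, u_star):
    """Order u* - q when the stock on hand q is below u*; otherwise order nothing."""
    return max(u_star - q, 0.0)


u_star = optimal_stock_level(mean_demand=10.0, c1=1.0, c2=9.0)
print(u_star, expected_cost(u_star, 10.0, 1.0, 9.0), order_quantity(5.0, u_star))
```

With these assumptions the minimum expected cost simplifies to c_{1}u^{*}, which gives a quick sanity check on any numerical implementation.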

If Y_{k+1} ≤ u, the quantity u − Y_{k+1} is held during the (k+1)th period. Otherwise, a shortage amount Y_{k+1} − u results. Thus, the cost for the (k+1)th period is

The expected cost for the (k+1)th period, E_{θ}{C(u)}, is expressed as

The function

It follows from (31), (32), (40), and (43) that

and

*Parametric uncertainty.* Consider the case when the parameter is unknown. To find the best invariant decision rule u^{BI}, the cost C(u) is transformed to a form that depends only on the pivotal quantities V, V_{1} and an ancillary factor. In statistics, a pivotal quantity or pivot is a function of observations and unobservable parameters whose probability distribution does not depend on unknown parameters. Note that a pivotal quantity need not be a statistic: the function and its value can depend on parameters of the model, but its distribution must not. If it is a statistic, then it is known as an ancillary statistic.

Transformation of C(u) based on the pivotal quantities V, V_{1} is given by

where

Then E{C^{(1)}} is expressed as

The function E{C^{(1)}} can be shown to be convex in the ancillary factor, thus having a unique minimum. Taking the first derivative of E{C^{(1)}} with respect to the ancillary factor and equating it to zero, we get

It follows from (47), (49), and (51) that the optimum value of the ancillary factor is given by

the best invariant decision rule is

and the expected cost, if we use u^{BI}, is given by

It will be noted that, on the other hand, the invariant embedding technique (Nechval et al., 1999; 2004; 2008; 2010a; 2010b; 2010c; 2010d; 2010e; 2011a; 2011b) allows one to transform equation (40) as follows:

(55) |

Then it follows from (55) that

where

represents the expected prediction cost for the (k+1)th period. It follows from (57) that the cost per the (k+1)th period is reduced to

and the predictive probability density function of Y_{k+1} (compatible with (40)) is given by

Minimizing the expected prediction cost for the (k+1)th period,

with respect to u, we obtain u^{BI} immediately, and

It should be remarked that the cost per the (k+1)th period,

(62) |

where the probability density function of the ancillary statistic W (compatible with (40)) is given by

Then the best invariant decision rule

*Comparison of statistical decision rules.* For comparison, consider the maximum likelihood decision rule that may be obtained from (44),

where

Since

it follows from the above that

Thus, in this case, the use of *
* and may be considerable.
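The gap between a plug-in rule and an invariant rule can be illustrated in a deliberately simplified setting, which is an assumption of this sketch and not the within-sample setting of the paper: n past demands are independent exponential observations with unknown mean, S is their sum, and the order quantity is restricted to the form u = ηS. Since S divided by the true mean is a pivotal quantity (gamma distributed with shape n), the expected cost per unit of the true mean depends on η only, namely c_{1}(ηn − 1) + (c_{1} + c_{2})(1 + η)^{−n}. The function names are hypothetical.

```python
from math import log


def risk(eta, n, c1, c2):
    """Expected cost per unit of the true mean demand when ordering u = eta * S."""
    return c1 * (eta * n - 1.0) + (c1 + c2) * (1.0 + eta) ** (-n)


def eta_plug_in(n, c1, c2):
    """Factor obtained by substituting the ML estimate S / n for the unknown mean."""
    return log(1.0 + c2 / c1) / n


def eta_invariant(n, c1, c2):
    """Factor that minimises risk(eta, n, c1, c2) over all rules of the form eta * S."""
    return (1.0 + c2 / c1) ** (1.0 / (n + 1)) - 1.0


n, c1, c2 = 3, 1.0, 20.0
r_ml = risk(eta_plug_in(n, c1, c2), n, c1, c2)
r_bi = risk(eta_invariant(n, c1, c2), n, c1, c2)
print(f"plug-in risk {r_ml:.3f}, invariant risk {r_bi:.3f}")
```

The difference between the two risks is largest for small n and strongly asymmetric costs, and it shrinks as n grows, because both factors then approach the known-parameter solution.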

### 3.2. Setup model (s-S policy)

The present model differs from the one in Section 3.1 in that a setup cost c is incurred. Using the same notation, the total expected cost per the (k+1)th period is

As shown in Section 3.1, the optimum value u^{*} must satisfy (43). Because c is constant, the minimum value of ^{*}. In Fig.1,

The equation yields another value s_{1} (> S), which is discarded.

Assume that q is the amount on hand before an order is placed. How much should be ordered? This question is answered under three conditions: 1) q < s; 2) s ≤ q ≤ S; 3) q > S.

Case 1 (q < s). Because q is already on hand, its equivalent cost is given by

Thus, the optimal inventory policy in this case is to order S − q units.

Case 2 (s ≤ q ≤ S). From Fig. 1, we have

Thus, it is not advantageous to order in this case and u^{*} = q.

Case 3 (q > S). From Fig. 1, we have for u > q,

This condition indicates that, as in Case 2, it is not advantageous to place an order; that is, u^{*} = q.

The optimal inventory policy, frequently referred to as the s-S policy, is summarized as

The optimality of the s-S policy is guaranteed because the associated cost function is convex.
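A numerical sketch of the three cases, again under the illustrative assumption (not made in the paper) that demand is exponential with a known mean: S is taken as the no-setup optimum u^{*}, and s is the point below S at which the expected cost of not ordering equals the setup cost plus the expected cost at S. Bisection is used because the expected cost is decreasing on [0, S]. All names are hypothetical.

```python
from math import exp, log


def expected_cost(u, mean_demand, c1, c2):
    """Expected holding plus shortage cost for stock level u, exponential demand."""
    tail = mean_demand * exp(-u / mean_demand)
    return c1 * (u - mean_demand + tail) + c2 * tail


def reorder_levels(mean_demand, c, c1, c2, tol=1e-10):
    """Return (s, S): S minimises the expected cost, s satisfies E C(s) = c + E C(S)."""
    big_s = mean_demand * log(1.0 + c2 / c1)
    target = c + expected_cost(big_s, mean_demand, c1, c2)
    if expected_cost(0.0, mean_demand, c1, c2) <= target:
        return 0.0, big_s  # setup cost too high: ordering never pays
    lo, hi = 0.0, big_s    # cost is above target at lo and below it at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_cost(mid, mean_demand, c1, c2) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi), big_s


def order_quantity(q, small_s, big_s):
    """Order up to S when q < s; otherwise (cases 2 and 3) order nothing."""
    return big_s - q if q < small_s else 0.0


s, S = reorder_levels(mean_demand=10.0, c=5.0, c1=1.0, c2=9.0)
print(s, S, order_quantity(2.0, s, S), order_quantity(S, s, S))
```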

Parametric uncertainty. In the case when the parameter is unknown, the total expected prediction cost for the (k+1)th period,

(75) |

is considered in the same manner as above.

## 4. Airline revenue management problem

The process of revenue management has become extremely important within the airline industry. It consists of setting fares, setting overbooking limits, and controlling seat inventory to increase revenues. It has allowed the airlines to survive deregulation by allowing them to respond to competitors' deep discount fares on a rational basis.

An airline, typically, offers tickets for many origin–destination itineraries in various fare classes. These fare classes not only include business and economy class, which are settled in separate parts of the plane, but also include fare classes for which the difference in fares is explained by different conditions regarding for example cancellation options or overnight stay arrangements. Therefore the seats on a flight are products, which can be offered to different customer segments for different prices. Since the tickets for a flight have to be sold before the plane takes off, the product is perishable and cannot be stored for future use. The same is true for most other service industries, such as hotels, hospitals and schools.

### 4.1. Airline seat inventory control

At the heart of airline revenue management lies the airline seat inventory control problem. It is common practice for airlines to sell a pool of identical seats at different prices according to different booking classes to improve revenues in a very competitive market. In other words, airlines sell the same seat at different prices according to different types of travelers (first class, business and economy) and other conditions. The question then arises whether to offer seats at a relatively low price at a given time with a given number of seats remaining or to wait for the possible arrival of a higher paying customer. Assigning seats in the same compartment to different fare classes of passengers in order to improve revenues is a major problem of airline seat inventory control. This problem has been considered in numerous papers. For details, the reader is referred to a review of yield management, as well as perishable asset revenue management, by Weatherford et al. (1993), and a review of relevant mathematical models by Belobaba (1987).

The problem of finding an optimal airline seat inventory control policy for a multi-leg flight with multiple fare classes, which allows one to maximize the expected profit of this flight, is one of the most difficult problems of air transport logistics. On the one hand, one must have reasonable assurance that the requirements of customers for reservations will be met under most circumstances. On the other hand, one is confronted with the limitation of the capacity of the cabin, as well as with a host of other less important constraints. The problem is normally solved by the application of judgment based on past experience. The question arises whether or not it is possible to construct a simple mathematical theory of the above problem, which will allow one to make better use of the available data based upon airline statistics. Two models (a dynamic model and a static one) of airline data may be considered. In the dynamic model, the problem is formulated as a sequential decision process. In this case, an optimal dynamic reservation policy is used at each stage prior to departure time for multi-leg flights with several classes of passenger service. The essence of determining the optimal dynamic reservation policy is maximization of the expected gain of the flight, which is carried out at each stage prior to departure time using the available data. The term "dynamic reservation policy" is used in this paper to mean a decision rule, based on the available data, for determining whether to accept a given reservation request made at a particular time for some future date. An optimal static reservation policy is based on the static model. The models proposed here contain a simple and natural treatment of the airline reservation process and may be appropriate in practice.

### 4.2. Expected marginal seat revenue model (EMSR)

Different approaches were developed for solving the airline seat inventory control problem. The most important and widely used model – EMSR – was originally proposed by Littlewood (1972). To explain the basic ideas of the Littlewood model, we will consider the seat allocation problem on a nonstop flight with two airfares. We denote by U the number of seats in the aircraft, by u_{L} the number of seats reserved for passengers with a lower fare and by p the probability that a passenger who would pay the higher fare cannot find a seat because it was sold to a passenger paying a lower fare. A rise in variable u_{L} means a rise in probability p. A rise in the number of seats for passengers with the lower fare, i.e., a rise in variable u_{L}, decreases the number of seats for passengers with the higher fare, i.e., variable U – u_{L} decreases. A reduction in variable U – u_{L} increases the probability that a passenger paying the higher fare cannot find a vacant seat on their desired flight because it has been sold to a passenger with a lower fare. Probability p is given by

where the _{L} to which reservations are accepted for passengers with the lower fare. We denote by c_{1} the revenue from passengers with the lower fare and by c_{2} the revenue from passengers with the higher fare. Let u_{L} seats be reserved for low fare passengers. The revenue per lower fare seat is c_{1}. Expected revenue from a potential high fare passenger is c_{2}p. Passengers with lower fares should be accepted until

i.e.,

Reservation level u_{L} is determined by solving the following equality:

where

Richter (1982) gave a marginal analysis, which proved that (77) gives an optimal allocation (assuming certain continuity conditions). Optimal policies for more than two classes have been presented independently by Curry (1990), Wollmer (1992), Brumelle & McGill (1993), and Nechval et al. (2006).
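The two-fare rule described above can be sketched for the case of known parameters. The sketch assumes, for illustration only, that the total demand for higher-fare seats follows a two-parameter Weibull distribution with known scale and shape and that seats may be treated as a continuous quantity; low-fare bookings are then accepted up to the level u_{L} at which c_{2}p equals c_{1}. The function and argument names are hypothetical.

```python
from math import exp, log


def weibull_survival(x, scale, shape):
    """P(X > x) for a two-parameter Weibull demand (equal to 1 for x <= 0)."""
    return exp(-((x / scale) ** shape)) if x > 0 else 1.0


def low_fare_booking_limit(capacity, low_fare, high_fare, scale, shape):
    """Largest u_L in [0, U] with high_fare * P(X > U - u_L) <= low_fare."""
    if not 0 < low_fare < high_fare:
        raise ValueError("need 0 < low_fare < high_fare")
    # Seats protected for the higher fare: survival(protected) = low_fare / high_fare.
    protected = scale * log(high_fare / low_fare) ** (1.0 / shape)
    return max(capacity - protected, 0.0)


# U = 100 seats, c1 = 100 (lower fare), c2 = 250 (higher fare).
u_low = low_fare_booking_limit(100.0, 100.0, 250.0, scale=40.0, shape=2.0)
print(u_low, 250.0 * weibull_survival(100.0 - u_low, 40.0, 2.0))
```

When the protected quantity exceeds the cabin capacity, the limit is zero and no low-fare bookings are accepted; in practice the result would also be rounded to a whole number of seats.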

Parametric uncertainty. In order to solve (79) under parametric uncertainty, the following results, for example, can be used.

Theorem 4. Let X_{1} ... X_{r} be the first r ordered past observations from a previous sample of size n from the two-parameter Weibull distribution

where both distribution parameters (β – scale, - shape) are positive. Then a lower one-sided conditional prediction limit h on the lth order statistic

where z_{h} satisfies the equation

(84) |

are ancillary statistics, any r − 2 of which form a functionally independent set (for notational convenience we include all of z_{1}, …, z_{r} in (85); z_{r-1} and z_{r} can be expressed as functions of z_{1}, …, z_{r-2} only), _{1} … X_{r}) from a sample of size n from the two-parameter Weibull distribution, which can be found from solution of

(Observe that an upper one-sided conditional prediction limit h on the lth order statistic

Proof. The joint density of X_{1} ... X_{r} is given by

_{1}… X

_{r}from a complete sample of size n, and let

Parameters and in (90) are scale and shape parameters, respectively, and it is well known that if _{1} and V_{2} are the pivotal quantities whose distributions depend only on n. Most, if not all, proposed estimates of and possess the necessary properties; these include the maximum likelihood estimates and various linear estimates.

Using (90) and the invariant embedding technique (Nechval et al., 1999; 2004; 2008; 2010a; 2010b; 2010c; 2010d; 2010e; 2011a; 2011b), we then find in a straightforward manner, that the joint density of V_{1}, V_{2}, conditional on fixed

where

is the normalizing constant. Writing

(93) |

where

we have from (91) and (93) that

Now v_{1} can be integrated out of (95) in a straightforward way to give

(96) |

This completes the proof. □

Remark 1. If l=m=1, then the result of Theorem 4 can be used to construct the static policy of airline seat inventory control.

Theorem 5. Let X_{1} ... X_{k} be the first k ordered early observations from a sample of size m from the two-parameter Weibull distribution (82). Then a lower one-sided conditional prediction limit h on the lth order statistic X_{l} (l > k) in the same sample is given by

where w_{h} satisfies the equation

(98) |

where _{1} ... X_{k} from a sample of size m from the two-parameter Weibull distribution (82), which can be found from solution of

(Observe that an upper one-sided conditional prediction limit h on the lth order statistic X_{l} based on the first k ordered early-failure observations X_{1} ... X_{k}, where l > k, from the same sample may be obtained from a lower one-sided conditional prediction limit by replacing by 1)

Proof. The joint density of X_{1} ... X_{k} and X_{l} is given by

(104) |

_{1}... X

_{k}from a complete sample of size m, and letand

_{1}, V

_{2}, W, conditional on fixed

(106) |

where

is the normalizing constant. Using (106), we have that

(108) |

and the proof is complete. □

Remark 2. If l=m, then the result of Theorem 5 can be used to construct the dynamic policy of airline seat inventory control.

## 5. Conclusions and directions for future research

In this paper, we develop a new frequentist approach to improve predictive statistical decisions for revenue optimization problems under parametric uncertainty of the underlying distributions for the customer demand. Frequentist probability interpretations of the methods considered are clear. Bayesian methods are not considered here. We note, however, that, although subjective Bayesian prediction has a clear personal probability interpretation, it is not generally clear how this should be applied to non-personal prediction or decisions. Objective Bayesian methods, on the other hand, do not have clear probability interpretations in finite samples.

For constructing the improved statistical decisions, a new technique of invariant embedding of sample statistics in a performance index is proposed. This technique represents a simple and computationally attractive statistical method based on the constructive use of the invariance principle in mathematical statistics. The method used is that of the invariant embedding of sample statistics in a performance index in order to form pivotal quantities, which make it possible to eliminate unknown parameters (i.e., parametric uncertainty) from the problem. It is especially efficient when we deal with asymmetric performance indexes and small data samples.

More work is needed, however, to obtain improved or optimal decision rules for the problems of unconstrained and constrained optimization under parameter uncertainty when: (i) the observations are from general continuous exponential families of distributions, (ii) the observations are from discrete exponential families of distributions, (iii) some of the observations are from continuous exponential families of distributions and some from discrete exponential families of distributions, (iv) the observations are from multiparametric or multidimensional distributions, (v) the observations are from truncated distributions, (vi) the observations are censored, (vii) the censored observations are from truncated distributions.