1. Introduction
The car rental industry plays a major role globally. It can act as a lubricant between production and consumption, easing their mutual restraints. It can also expand the automotive consumer market and evaluate the popularity of new cars before they come onto the market. At the same time, because the public transportation system is limited by operating time, departure frequency, accessible range, comfort, privacy, and other conditions, car rental, with its outstanding advantages such as high flexibility, ease of use, and privacy, has shown considerable growth over the years. The global car rental market was valued at $79.5 bn in 2018 and is expected to reach $125.4 bn by 2025, registering a CAGR of 5.1% from 2019 to 2025 [1]. In recent years, the Asia-Pacific region has become the largest and fastest growing segment of demand for car rental services, especially in China and India, where population density is high and demand is growing rapidly.
Mainstream car rental companies started to use revenue management (RM) as a crucial tool for reducing operating costs and improving competitiveness in the 1990s. RM is an important interface between enterprises’ supply and market demand, and its objective is to maximize profits. The most widely known RM application occurs in the airline industry. RM mainly comprises four aspects, namely forecasting, overbooking, capacity control, and pricing [2]. Among them, forecasting is the foundation and provides vital input for the others. A 12.5–25% reduction of forecast error could translate into 1–3% incremental revenue generated from the RM system [3,4]. Therefore, to achieve the desired level of accuracy, forecast methods need constant innovation [5]. In reality, because of booking and/or capacity limits, a customer’s present demand may not be satisfied, in which case it is either lost (spilled) or recaptured by a more expensive (buyup) or cheaper (buydown) available product. Sample booking curves with three grades of products (the fare order is A > B > C) are shown in Figure 1 [6,7]. Therefore, in order to repair the constrained demand, it is necessary to carry out an unconstrained estimation of the customer’s actual demand and forecast the future demand on this basis [8,9].
Research into unconstrained estimation (sometimes also called detruncation or uncensoring) in the RM system started more than 40 years ago and mainly focused on the airline industry. Unconstrained estimation uses observed “censored demand” to estimate the “uncensored demand”. The original methods were based on statistics, such as expectation maximization (EM), projection detruncation (PD), booking curve (BC), and mean imputation. Since the beginning of this century, discrete customer choice models have grown in popularity for estimating unconstrained demand [8,10,11]. Forecasting usually yields the demand and no-show rate by booking class, the latter of which will not be discussed here. The common methods include time series, linear regression, exponential smoothing, double exponential smoothing, pickup, neural networks, principal component analysis, and adaptive models [6,12,13]. Since air transportation serves travel demand between two points, the prediction of origin–destination (OD) demand is necessary [14,15,16]. Guo surveyed the history of research on unconstraining methods by reviewing over 130 references [17]. Weatherford provided two reviews of unconstrained estimation and forecast methods broadly used in different industries up to 2016 [18,19]. Recent discussions include forecast multipliers and hybrid forecasting [9], the effect of customers’ reference price on demand [20], Gaussian processes for unconstraining demand [21], and demand forecast accuracy [22]. Owing to the research and application value of revenue management theory, many advances in demand estimation and forecasting have been made in the hotel, car or truck rental, cargo/freight, internet service and retailing, cruise line, rail, and other industries [19].
Unlike the airline industry, the car rental industry is more concerned with the demand at each rental station, and OD forecasting does not need to be considered as much. Supply and demand are dynamically balanced, but demand may still be overestimated or underestimated. Since demand determines the size of the rental station inventory and fleet, demand estimation and forecasting techniques are worth studying [23]. The first reports of RM application to car rental were at Hertz and National Car Rental [24,25]. Jishan used a decomposition method to identify the actual latent demand from the total recorded turndowns [26]. Gordon simplified the problem by human–computer interaction, hoping to find the right level of detail at which to forecast and optimize [27]. Dhruval optimized the fleet with the help of a robust design methodology, considering fleet as a product and cars as an affecting variable [28]. Kourentzes addressed the frequently encountered situation of observing only a few sales events at the individual product level and proposed variants of small demand forecasting methods to be used for unconstraining [29]. In general, there is relatively little research on unconstrained estimation and prediction of car rental demand. Drawing on research methods from other fields, this topic merits further discussion in the car rental industry.
Given the relative lack of articles that deal with car rental demand forecasting, this work develops an optimization methodology to estimate and forecast the demand for a car rental station; it is then applied to a case study in China. The case study also enables a comparison of the effects of different methods, especially for cases of small demand. The remainder of this work is organized as follows. Section 2 presents the problem statement and hypotheses, including an improved spill model for unconstrained estimation. In Section 3, we propose a hybrid forecasting method based on the Holt–Winters model and backpropagation neural network. Subsequently, Section 4 introduces the case study with numerical experiments and a discussion. Section 5 concludes the work.
2. Unconstrained Estimation
2.1. Hypothesis and Variable Definitions
Before studying the forecasting method, the hypotheses and variable definitions should be explained. This work is mainly aimed at the car rental industry, but the method is also applicable to other industries such as hotels and aviation.
Hypothesis 1. Unconstrained demand of car type in presale lead time follows a normal distribution, and the lead time is seven days.
Hypothesis 2. Demands of various car types are random and independent, and the numbers of car rentals are counted by day; the deadline is midnight.
Hypothesis 3. Inventory quantity is constant without car dispatching among stations, taking no account of batch demand, overdue return, or oversold behaviors.
Hypothesis 4. The customer’s willingness to pay is from low to high.
The variables are defined as follows:
: index of car types, . Indexes of are listed in an increasing order of car rental price, corresponding to the gradually increased levels;
: index of rental demand recording times, ;
: review points of inventory control, ; presale system will open when , and it will be closed when .
: presale lead time interval between and ;
: order quantity limit of car type in in the documented circle, namely the quota ceiling for rent;
: cumulative order quantity of car type at review point in the documented circle, particularly ;
: presale state of car type in in the documented circle, “1” indicates that demand is not restricted, and “0” indicates that demand is limited.
: order quantity of car type in in the documented circle, particularly , ;
: spillage of car type in in the documented circle, particularly ;
: complete spillage of car type in in the documented circle, particularly ;
: cumulative spillage of car type 1 to at review point in the documented circle, ;
: unconstrained estimation of car type in in the documented circle, particularly ;
: real value of car type in in the documented circle.
2.2. Unconstrained Estimation Methods
For the revenue management system, the starting point of analysis is some set of assumptions regarding an underlying stochastic or deterministic demand process [
30]. In fact, when the real customer demand exceeds the preset order quantity limit, the recorded demand data in the presale system would be censored. Using these “cutoff demand data” for demand forecasting is bound to reduce the accuracy of forecasts and directly impact the revenue. Therefore, deducing the real historical customer demand from the recorded demand data is the first problem. Some common unconstrained estimation methods are as follows.
2.2.1. Expectation Maximization
EM is one of the most common statistical methods for restoring incomplete data by iteration. The EM algorithm is composed of two steps, E (expectation) and M (maximization). After the parameters to be estimated are initialized from the observable incomplete data, the conditional expectation of the missing information is calculated under the current parameter estimates, and the missing information is replaced accordingly. In step E, an expected log-likelihood function of the complete data is obtained. In step M, improved parameter values are obtained by maximizing this function through maximum likelihood estimation. The whole process is repeated until convergence [31].
Considering the unconstrained demand
in
with the expectation
and variance
, the initial values are then
When all data is constrained,
and
cannot be initialized properly, and the repairing process is abandoned. Let the unconstrained preliminary estimation value be equal to the increment of the observed value; then, the unconstrained estimation value after
iterations will be
When only one set of data is constrained, initialization is abandoned, and the convergence is tested directly by . The iteration is divided into two steps.
Step E: unconstrained estimation value is
where
is the normal probability density function, and
when
.
Step M: find the partial derivative of the log-likelihood function
versus
and
, and let them equal 0 to calculate the re-estimated
and
.
Convergence condition: the operation should be stopped when or , where is any small given number. Otherwise, go back to step E and continue iterating.
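The E and M steps above can be illustrated with a minimal Python sketch for right-censored, normally distributed demand. It uses the textbook EM updates for a censored normal sample, which may differ in detail from the formulas of this paper; the function names, the tolerance, and the sample data are assumptions made for illustration only.

```python
import math
from statistics import NormalDist, mean, pstdev

STD = NormalDist()  # standard normal, provides pdf() and cdf()


def inverse_mills(z):
    """phi(z) / (1 - Phi(z)); falls back to a large-z approximation in the far tail."""
    tail = 1.0 - STD.cdf(z)
    if tail < 1e-12:
        return z + 1.0 / z
    return STD.pdf(z) / tail


def em_unconstrain(observed, is_open, tol=1e-6, max_iter=500):
    """observed[i]: recorded bookings; is_open[i]: True if demand was NOT constrained."""
    free = [x for x, o in zip(observed, is_open) if o]
    if len(free) < 2:
        # With (almost) all data constrained the parameters cannot be initialised.
        raise ValueError("too few unconstrained observations to initialise")
    mu, sigma = mean(free), pstdev(free) or 1.0
    n = len(observed)
    for _ in range(max_iter):
        first, second = [], []  # E[X] and E[X^2] for every record
        for x, o in zip(observed, is_open):
            if o:
                first.append(float(x))
                second.append(float(x) ** 2)
            else:
                # Step E: conditional moments of X given X >= recorded limit x
                lam = inverse_mills((x - mu) / sigma)
                first.append(mu + sigma * lam)
                second.append(mu * mu + sigma * sigma + sigma * lam * (x + mu))
        # Step M: maximum likelihood re-estimates from the completed moments
        new_mu = sum(first) / n
        new_sigma = math.sqrt(max(sum(second) / n - new_mu ** 2, 1e-12))
        converged = abs(new_mu - mu) < tol and abs(new_sigma - sigma) < tol
        mu, sigma = new_mu, new_sigma
        if converged:
            break
    # 'first' holds the estimates from the final E step
    return first, mu, sigma
```

For example, calling `em_unconstrain([20, 22, 18, 25, 25, 19, 25], [True, True, True, False, False, True, False])` leaves the four unconstrained records unchanged and raises the three records censored at 25 above the limit.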
2.2.2. Projection Detruncation
The initial values, convergence time, and iteration number of the EM algorithm are sensitive, to varying degrees, to the data themselves. Based on the EM algorithm, Hopperstad proposed PD, which addresses the limitation of EM in large-scale unconstrained estimation [32,33]. Research has shown that the PD algorithm is more flexible in practical applications.
PD and EM have the same principle, both of which include step E and M. Both algorithms use the iterative idea of statistics to estimate the parameters of constraint demand data, but the difference is that the parameter-related heuristic method is used in PD at step E. In PD, the unconstrained estimation value is gained without calculating the conditional expectation of the constraint data. In step M, the convergence conditions are the same, namely
When
,
When
,
where
is the distribution function of normal distribution, and parameter
is any given constant (
) and stays the same throughout the whole process.
represents the probability that the observed constrained demand has a true value greater than
. Instead of calculating a conditional expectation, PD balances two areas in the probability distribution graph (Figure 2). The first is the area of the probability distribution between the order quantity limit (that is, the original constrained value) and the new estimate or projected value (area M), and the second is the area between the new estimate and infinity (area N). The horizontal axis value corresponding to region N represents the portion of the real demand that is underestimated by the unconstrained estimation [34]. When
, the PD algorithm can truly balance the values of M and N, yielding nearly similar results to the EM algorithm.
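The area-balancing idea can be sketched in Python as follows. This is a minimal illustration under the usual formulation of PD, in which the projected value leaves a fraction of the tail probability beyond the limit to its right; the name `tau` for the weighting constant, the function names, and the defaults are assumptions, not taken from the paper.

```python
from statistics import NormalDist, mean, pstdev

STD = NormalDist()  # standard normal


def pd_project(limit, mu, sigma, tau=0.5):
    """Projected demand p with P(X > p) = tau * P(X > limit) for X ~ N(mu, sigma^2)."""
    tail = max(1.0 - STD.cdf((limit - mu) / sigma), 1e-12)  # guard the far tail
    projected = mu + sigma * STD.inv_cdf(1.0 - tau * tail)
    return max(projected, float(limit))  # never project below the recorded limit


def pd_unconstrain(observed, is_open, tau=0.5, tol=1e-6, max_iter=500):
    """Iterate projection (in place of step E) and re-estimation (step M)."""
    free = [x for x, o in zip(observed, is_open) if o]
    if len(free) < 2:
        raise ValueError("too few unconstrained observations to initialise")
    mu, sigma = mean(free), pstdev(free) or 1.0
    for _ in range(max_iter):
        filled = [float(x) if o else pd_project(x, mu, sigma, tau)
                  for x, o in zip(observed, is_open)]
        new_mu, new_sigma = mean(filled), pstdev(filled) or 1.0
        converged = abs(new_mu - mu) < tol and abs(new_sigma - sigma) < tol
        mu, sigma = new_mu, new_sigma
        if converged:
            break
    return filled, mu, sigma
```

With `tau=0.5` the projection is the conditional median of demand above the limit; smaller values of `tau` push the estimate further into the tail.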
2.2.3. Multitype Spill Model
The multitype spill model (MSM) takes into account the vertical recapture of demand between price grades, where the buyer’s buyup and buydown behaviors occur across price levels. Therefore, the unconstrained estimation process based on MSM is closer to reality and can effectively avoid overestimating real demand, which arises when the same demand is recorded repeatedly by the revenue management system across car types at different prices. The spill model is divided into two parts: the calculation of spillage and the calculation of unconstrained demand [35]. The specific calculation process is as follows:
Check if the rental demand is constrained in
. When
, the rental demand is not constrained and no repair is performed; otherwise, the repairing process is performed, go to next step.
Parameters initialization. The distribution parameters of the demand in are initialized by the observable “unconstrained demand”, which is the same as the EM algorithm.
Calculation of spillage and cumulative spillage. According to the assumptions, the customer’s willingness to pay is characterized from low to high car type, so the spillage should be calculated from the lowest car type.
For a single car type, cumulative spillage can be determined by the following formula:
where
,
is the probability density function of unconstrained demand for car type
in
. When
, rental demand was constrained,
where
,
is the probability density function of the standard normal distribution, and
is the distribution function of the standard normal distribution [36].
For a multitype situation, when
, considering cumulative spillage
at review point
from car type 1 to
.
When
,
When
, calculate
by considering the cumulative spillage
at review point
from car type 1 to
.
When
,
Get the repaired result, namely unconstrained estimation.
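For a single car type, the spillage calculation can be sketched in Python using the standard closed form for the expected excess of normal demand over a limit. This is an illustration of the single-type case only; the multitype cumulative step is not reproduced, and the function and parameter names are assumptions.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal


def expected_spill(mu, sigma, limit):
    """E[max(D - limit, 0)] for D ~ N(mu, sigma^2): sigma * (phi(z) - z * (1 - Phi(z)))."""
    z = (limit - mu) / sigma
    return sigma * (STD.pdf(z) - z * (1.0 - STD.cdf(z)))


def unconstrain_with_spill(bookings, mu, sigma, limit, constrained):
    """Unconstrained estimate = recorded bookings + spillage (constrained records only)."""
    if not constrained:
        return float(bookings)
    return bookings + expected_spill(mu, sigma, limit)
```

The spillage shrinks as the order quantity limit rises above the mean demand and approaches `mu - limit` when the limit is far below the mean.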
2.3. Improved Multitype Spill Model
Generally, traditional revenue management assumes that customer selection behavior is short-sighted and ignores customer subjectivity. However, competition in the car rental market has intensified, and the “buyer’s market” has gradually expanded, so the subjective initiative of customers has come to play a fuller role. Customers can choose the target product according to their own preferences and subjective utility. The factors influencing this choice can be divided into three categories according to their sources, namely accidental factors, objective factors, and subjective factors.
Accidental factors are not expected and have some randomness, such as concerts, sports events, and weather changes. Objective factors are external factors, which are generally not changeable in a short period of time. Objective factors are not directly related to the customers themselves, including policy and regulations, regional economic level, and rental vehicle attributes. Subjective factors are the customer’s own determinants, which are expressed in terms of customer preferences, trip purposes, age, and literacy. Subjective factors can be grasped by the analysis of the customer’s choice intention, usually in the form of a questionnaire.
2.3.1. SP/RP Survey
The basic data on customer selection preference were obtained by stated preference (SP) and revealed preference (RP) surveys. The SP survey elicits customers’ subjective preferences when renting a car before the rental occurs. The RP survey analyzes the actual selection behavior of customers after the rental has taken place. SP data are more flexible, while RP data are more reliable.
The SP survey can obtain multiple data from a single interviewee, with the advantages of small sample and low survey cost. The basic principle of the SP survey is to predetermine the various attribute factors and their influence levels and then invite interviewees to score the plan in the set situation. The respondents’ choices may be different from their actual actions, but they can still find the main factors that can be calculated and have a large impact on the overall mean. Then, the secondary factors that are difficult to measure will be removed. The data obtained from the RP survey are the actual occurrence data, since the data records behaviors that have been taken by the respondent or the actual selection behavior observed. The basic principle of the RP survey is to describe the occurrence or existence of the situation with different attribute factors and influence levels and to invite the respondents to score the content.
Before the SP survey plan is designed, condition attributes, decision attributes, and corresponding level values should be determined. The survey content includes the importance of the brand effect, satisfaction with the car rental type, sensitivity to the rental price, returning location, satisfaction with the vehicle status, and satisfaction with special offers, service attitude, procedure convenience, and accident handling. The RP survey is aimed at customers who have made choices, and the influencing factors are rated as very satisfied, satisfied, general, not satisfied, and very dissatisfied. The survey includes the type of rental vehicle and satisfaction with the rental price, return location restrictions, vehicle status, service attitude, procedure convenience, and accident handling when the preferred rental vehicle is rejected. See Appendix A and Appendix B for the SP/RP questionnaire of the case study.
Customers are affected by many attributes when choosing among multiple car types. The analytic hierarchy process (AHP) is usually used to determine the degree of influence of each attribute. AHP is a simple way to treat fuzzy and complex subjective problems both quantitatively and qualitatively. Its most critical step is to obtain a judgment matrix by pairwise comparison and calculate the weight of each factor from it; the detailed calculation is omitted here. However, because expert experience is uncertain, a subjectively constructed judgment matrix may not be completely consistent. To reduce the inconsistency of subjective judgment, the variable precision rough set (VPRS) can be used to reduce the set of factors affecting the customer’s choice behavior, so that similar factors are represented by a few factors from which the judgment matrix is constructed.
VPRS extends the Pawlak rough set by introducing a confidence level $\beta$, so that a certain degree of classification error is admissible. It can improve the flexibility of decision rules and, at the same time, reflect the correlation in data analysis, which helps to discover relevant data among unrelated data; that is, the implicit patterns in the data can be expressed more clearly [37].
Let $S = (U, A, V, f)$ be an information system, where $U$ is a finite nonempty set of objects called the universe and $A$ is a finite nonempty set of attributes. If the attributes in $A$ can be further divided into two disjoint nonempty subsets, namely the conditional attribute set $C$ and the decision attribute set $D$, with $A = C \cup D$ and $C \cap D = \emptyset$, then $S$ is called a decision table. $V$ is the union of the attribute domains, i.e., $V = \bigcup V_a$ for $a \in A$, where $V_a$ denotes the domain of attribute $a$; $f: U \times A \to V$ is an information function which associates a unique value of each attribute with every object belonging to $U$.
For any subset
and
,
is an equivalence relation on
, then the corresponding equivalence class is represented as
. By replacing the inclusion relation with a majority inclusion relation in the original definition of lower approximation of a set, we obtain the following generalized notion
-lower approximation and
-upper approximation:
Then, the classification accuracy can be defined as
where
measures the percentage of knowledge that can be correctly classified in the existing knowledge given a certain
in the domain.
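The majority-inclusion idea can be sketched in Python. The sketch assumes the common convention in which the confidence level lies in (0.5, 1]: an equivalence class belongs to the lower approximation of a target set when at least that fraction of its objects lies in the target, and to the upper approximation when more than one minus that fraction does. The toy decision table and all names are hypothetical.

```python
from collections import defaultdict


def partition(table, attrs):
    """Equivalence classes of objects that agree on every attribute in attrs."""
    blocks = defaultdict(set)
    for obj, row in table.items():
        blocks[tuple(row[a] for a in attrs)].add(obj)
    return list(blocks.values())


def beta_lower(blocks, target, beta):
    """Union of classes whose share of objects inside target is at least beta."""
    return set().union(*[b for b in blocks if len(b & target) / len(b) >= beta])


def beta_upper(blocks, target, beta):
    """Union of classes whose share of objects inside target exceeds 1 - beta."""
    return set().union(*[b for b in blocks if len(b & target) / len(b) > 1 - beta])


def classification_quality(table, cond_attrs, dec_attr, beta):
    """Share of objects covered by the beta-lower approximation of some decision class."""
    cond_blocks = partition(table, cond_attrs)
    covered = set()
    for decision_class in partition(table, [dec_attr]):
        covered |= beta_lower(cond_blocks, decision_class, beta)
    return len(covered) / len(table)


# Hypothetical toy table: levels 1-3 for two attributes, decision R / B / G
toy = {
    1: {"price": 1, "service": 1, "decision": "R"},
    2: {"price": 1, "service": 1, "decision": "R"},
    3: {"price": 1, "service": 1, "decision": "B"},
    4: {"price": 3, "service": 2, "decision": "G"},
    5: {"price": 3, "service": 2, "decision": "G"},
    6: {"price": 2, "service": 2, "decision": "B"},
}
```

In this toy table, partitioning on `price` alone classifies every object at a confidence level of 0.6 but only half of them at 1.0, because the class {1, 2, 3} is only two-thirds consistent.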
2.3.2. Intention Analysis
Nine factors affecting the customer’s car rental behavior, identified by the Delphi method, are regarded as conditional attributes (Table 1), and the level values of the conditional attributes are defined by three grades: 1 = satisfaction, 2 = something in between, and 3 = dissatisfaction. The level values of the decision attribute are also defined by three grades: R = rent, B = book but not necessarily rent, and G = give up.
Two rental sites (International Convention & Exhibition Center and Jiangbei Airport) of China Auto Rental Inc. (Beijing, China) were selected in Chongqing, China. Interviewees were randomly selected near each site, and the investigators helped them to fill out the questionnaire. A total of 50 questionnaires were distributed and 40 were returned, of which 30 were valid. By sorting and analyzing the survey data, customer selection decision information can be obtained, as shown in Table 2.
The following equivalence classes are available:
, where , , , , , , , , .
, where , , .
, , .
Classification accuracy of VPRS is an evaluation of the ability to perform object classification with a confidence level. Let . Similarly, the classification quality of each attribute division can be determined according to Equations (20)–(22).
, , , , .
A classification quality of zero above indicates that some of the nine main attributes overlap with or can substitute for one another. Therefore, six main factors affecting customer choice remain, namely rental price
, service attitude
, returning location
, vehicle status
, procedure convenience
, and accident handling
. By comparing the six elements in pairs, the importance of each factor in the conditional attribute on the customer decision can be obtained. The judgment matrix
is
Obviously, the judgment matrix is a positive reciprocal matrix that fully satisfies the consistency condition, and its maximum eigenvalue is . The corresponding eigenvector is . After normalizing, the weight vector of is .
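The weight calculation from a pairwise-comparison matrix can be sketched in Python with power iteration. The 3 × 3 matrix below is a hypothetical, perfectly consistent example and is not the six-factor judgment matrix of the case study; the function name and iteration count are likewise assumptions.

```python
def ahp_weights(matrix, iters=100):
    """Principal eigenvector (normalised to sum 1), largest eigenvalue, consistency index."""
    n = len(matrix)
    v = [1.0 / n] * n
    lam = float(n)
    for _ in range(iters):
        w = [sum(a * x for a, x in zip(row, v)) for row in matrix]
        lam = sum(w)              # eigenvalue estimate, valid because v sums to 1
        v = [x / lam for x in w]  # renormalise so the weights sum to 1
    ci = (lam - n) / (n - 1)      # consistency index; 0 for a fully consistent matrix
    return v, lam, ci


# Hypothetical comparison of three factors with true weights 4 : 2 : 1
example = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights, lam_max, ci = ahp_weights(example)
```

For a fully consistent matrix of order n, the largest eigenvalue equals n and the consistency index is zero, which is the situation described above.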
2.3.3. Customer Choice Probability
The multitype spill model is based on traditional revenue management, which assumes that customers are myopic: when the product price is lower than the willingness to pay, the customer buys immediately. Obviously, this assumption ignores the initiative of the decision maker. In reality, individuals have various cognitive biases. Without starting from the customer’s choice behavior, we cannot identify the customer’s car rental patterns or the trend in demand, which hampers the car rental company’s later pricing and inventory decisions.
Therefore, in order to describe the customer’s choice behavior, the Logit model is used to solve the above problems. The utility maximization theory is assumed, and the utility size is used to measure the probability that a customer selects a certain type of vehicle. The various attributes of different types constitute a comprehensive utility value. It is generally believed that the greater the utility of a customer to select a particular type of vehicle, the greater the probability that such a vehicle will be selected. Customers choose a variety of types and make a choice after evaluating the utility based on various attributes. The customer’s choice behavior can be described by a utility function , where and are the average utility and the random utility error, respectively, corresponding to the customer’s selecting.
Assuming that the customer is rational, then each customer will choose the product that will maximize utility. The probability that the customer chooses car type
is
Assuming that
are mutually independent and follow the Gumbel distribution, the difference between two independent Gumbel-distributed variables follows the logistic distribution, which yields the general form of the multinomial logit model:
Since the RP survey is based on actual conditions, the survey respondents are customers who return cars to the rental sites. In the design of the questionnaire, the actual upgrade or downgrade has been fully reflected. Therefore, the corresponding utility when the customer’s rental car type is upgraded or downgraded is
If the customer did not lease type
and then rented the type
, the probability of upgrading or downgrading is
where
is the preference probability of buyup or buydown with car type
when the customer’s choice of car type
is rejected, and
is the leaving probability when car type
is rejected.
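The choice probabilities can be illustrated with a short Python sketch of the multinomial logit model. The recapture function is one plausible reading of the upgrade or downgrade probability: the rejected type is removed and a leaving option with its own utility is added. That leaving utility, the utility values, and all names are assumptions for illustration.

```python
import math


def mnl_probabilities(utilities):
    """Multinomial logit: P(i) = exp(V_i) / sum_j exp(V_j), computed stably."""
    top = max(utilities.values())
    expv = {k: math.exp(v - top) for k, v in utilities.items()}
    total = sum(expv.values())
    return {k: e / total for k, e in expv.items()}


def recapture_probabilities(utilities, rejected, leave_utility=0.0):
    """Choice among the remaining car types plus a hypothetical 'leave' option."""
    options = {k: v for k, v in utilities.items() if k != rejected}
    options["leave"] = leave_utility  # assumes no car type is itself named 'leave'
    return mnl_probabilities(options)


# Hypothetical average utilities of three car types
example_utilities = {"A": 1.0, "B": 0.5, "C": 0.2}
after_rejection = recapture_probabilities(example_utilities, rejected="B")
```

Here `after_rejection` gives the probability of moving to type A or C, or of leaving, once type B is unavailable; the three values sum to one.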
The improved multitype spill model (IMSM) first calculates the complete spillage of all constrained data and then calculates the demand transferred from other constrained car types according to the customer preference probability. The spillage is what remains after this transferred demand is excluded, and the unconstrained estimation is the sum of the cumulative order quantity and the spillage.
After excluding the transferred demand, the spillage is
Then, the unconstrained estimation is
3. Hybrid Forecasting
3.1. Selection of Forecast Method
For demand forecasting, there are three methods: quantitative analysis, qualitative analysis, and decision analysis. Quantitative analysis relies on a large amount of historical data and can be divided into time series forecasting, which uses time to organize data, and the causal analysis method, which uses relationships to organize data. The time series method is the most commonly used, in which the Holt–Winters model and moving average method are representative. Time series forecasting can predict future demand, but it cannot explain the reason. Causal analysis commonly uses regression analysis and simulation methods. The causal analysis method uses the causal relationship between data to look for changes and is generally used for macro prediction.
Qualitative analysis relies mainly on expert knowledge and experience and does not involve quantitative analysis. The Delphi method and expert judgment fall into this category. Although such methods are simple to apply, they are highly subjective and their results are often unsatisfactory. The decision analysis method combines quantitative and qualitative methods. At present, market survey and randomness methods are the more commonly used, but the research is not yet mature. Of the three, quantitative analysis is the most commonly used.
The Holt–Winters model is a time series forecasting model that avoids the deficiencies of the moving average method. It uses triple exponential smoothing to give different data different weights, and the predicted value is a weighted sum of the previous data sequence. By comparing the mean absolute percentage error, Weatherford et al. found a basic neural network to be more useful than traditional forecasting methods (moving averages, exponential smoothing, linear regression, etc.) [38].
The backpropagation (BP) neural network is a widely used nonlinear forecast method that imitates the neural structure of the human brain and can solve more complicated problems. The BP neural network can use its own nonlinear characteristics to model the development trend of the data without requiring an assumed functional form. Once the required training accuracy is reached, future demand can be predicted from what the network has learned. However, the BP neural network depends on the initial conditions and is prone to falling into a local optimum. A single prediction method therefore has inevitable defects, and its accuracy may not meet the requirements of practical use. This paper uses hybrid forecasting, which combines multiple forecast methods through weights, to improve the forecasting accuracy.
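As a minimal illustration of combining forecasts through weights, the Python sketch below weights each method in inverse proportion to its historical error. This weighting rule is an assumption chosen for illustration, since the passage does not specify how the weights are obtained; the function names and numbers are hypothetical.

```python
def inverse_error_weights(errors):
    """Weights proportional to 1 / error, normalised to sum to 1 (errors must be > 0)."""
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [w / total for w in inverse]


def combine(forecasts, weights):
    """Weighted sum of several forecast series of equal length."""
    horizon = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, forecasts)) for t in range(horizon)]


# Hypothetical example: method 1 had error 1.0, method 2 had error 3.0
weights = inverse_error_weights([1.0, 3.0])
hybrid = combine([[10.0, 20.0], [14.0, 16.0]], weights)
```

The method with the smaller past error receives the larger weight, so the combined forecast stays closer to it.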
3.2. Holt–Winters Model
The Holt–Winters forecasting model has been observed to outperform other techniques for time series with changing seasonality, mean, and growth rate [39]. It is an adaptive model that automatically recognizes changes in data patterns. For example, if the deviation is caused by internal interference, the new observation can be considered to have the same influence as the original data, and the data of different periods are given the same weight. If the deviation is caused by external interference, then new observations and the original data have different influences on the prediction results, and the new observations have a higher impact on the predicted events. To reflect the different influence that data from different periods have on the forecasting results, different weights can be given to different periods.
The demand for car rental is a nonstationary time series with seasonal and cyclical trends. The Holt–Winters model can forecast such regular time series data accurately, especially where trends and seasonal changes are present. It decomposes a time series into a linear trend, seasonal variation, and random variation, and properly filters out the impact of random fluctuations. The Holt–Winters model is an improvement and development of the moving average method. It does not need to store much historical data, yet it still takes every period into account, weighting each according to its importance.
First, smoothing equations can be obtained by iteration:
where
is the smoothed value,
is the trend value, and
is the seasonal index of car type
at
after iterations.
is the length of the seasonal period, and
,
,
are smoothing coefficients.
Then, the predicted value at the future time point
is
where
is the seasonal index of car type
, which is one cycle ahead of
,
.
The three smoothing coefficients , , and are determined by minimizing the error between the forecasted and actual values. To obtain more accurate and objective parameters, the traditional approach is to minimize the residual sum of squares. Each smoothing coefficient lies in the interval (0, 1) and is searched in steps of 0.1. The sum of squared prediction residuals is calculated for each combination, and the combination with the smallest sum is selected.
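The smoothing recursions and the grid search over the coefficients can be sketched in Python. The sketch assumes the multiplicative seasonal form of Holt–Winters, which matches the term "seasonal index" but is not confirmed by the surviving text, and a simple initialisation from the first two seasons; all names are illustrative, and the series must be strictly positive.

```python
from itertools import product


def holt_winters(series, period, alpha, beta, gamma, horizon):
    """Multiplicative Holt-Winters. Returns (one-step-ahead fits, forecasts)."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons")
    first, second = series[:period], series[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    season = [x / level for x in first]  # season[t] is the index for time t
    fitted = []
    for t in range(period, len(series)):
        x, s_old = series[t], season[t - period]
        fitted.append((level + trend) * s_old)  # forecast made before seeing x
        new_level = alpha * x / s_old + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        season.append(gamma * x / new_level + (1 - gamma) * s_old)
        level = new_level
    n = len(series)
    forecasts = [(level + m * trend) * season[n - period + (m - 1) % period]
                 for m in range(1, horizon + 1)]
    return fitted, forecasts


def fit_by_grid(series, period):
    """Search alpha, beta, gamma over 0.1, 0.2, ..., 0.9 for the smallest residual sum of squares."""
    grid = [k / 10 for k in range(1, 10)]
    best = None
    for a, b, g in product(grid, repeat=3):
        fitted, _ = holt_winters(series, period, a, b, g, horizon=0)
        sse = sum((f - x) ** 2 for f, x in zip(fitted, series[period:]))
        if best is None or sse < best[0]:
            best = (sse, a, b, g)
    return best
```

A finer grid or a numerical optimiser could replace the 0.1 step when higher precision is needed; for very small demand with zero counts, an additive seasonal form avoids the divisions.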
3.3. BP Neural Network
The neural network generally includes an input layer, a hidden layer, and an output layer. The input layer comes first, and no neurons feed into it. The neurons of the hidden layer and the output layer are connected to the preceding layer, and each connection has its own weight. The output of a neuron depends on its inputs, weights, threshold, and excitation function. Because of the large number of neurons, a large amount of information is stored, giving the neural network powerful data processing capabilities. The neural network has the advantages of strong parallel processing ability, strong nonlinear processing ability, strong self-adaptation and learning ability, and strong associative memory and fault tolerance.
The BP neural network has a simple structure, strong plasticity, clear learning steps, and a clear mathematical meaning. It has been proved that a three-layer BP neural network can approximate any continuous nonlinear mapping. The BP neural network is a feedforward network trained by error backpropagation. Neurons are connected only in the feedforward direction between adjacent layers, with no feedback, intralayer, or layer-skipping connections. Nonlinear sigmoid-type functions are generally used as excitation functions. Since the excitation function is differentiable everywhere, the regions divided by a BP network are bounded not by linear hyperplanes but by nonlinear hypersurfaces, which are relatively smooth. Compared with linear partitioning, this classification is more accurate and has greater fault tolerance. The BP neural network learns strictly by gradient descent, so the analytical formula for weight correction is also very clear.
3.3.1. Network Structure
Adjacent layers of the BP neural network are fully connected, and there are no connections between neurons within the same layer. Studies have shown that when the input and output layers use a linear activation function and the hidden layer uses the sigmoid activation function, a BP neural network with a single hidden layer can approximate any continuous function. Therefore, when constructing a BP neural network model, only one hidden layer is generally used, as shown in Figure 3.
3.3.2. Learning Process
The learning process of the BP neural network consists of two parts: forward propagation and backpropagation. During forward propagation, the sample is delivered to the input layer, processed by the hidden layer, and passed to the output layer. If the value obtained at the output layer does not meet the expectation, the process enters the backpropagation stage, in which the error is propagated back toward the input layer and the weights between layers are corrected continuously along the way. After repeated propagation, the learning process stops once the error is small enough to be acceptable (shown in Figure 4).
The specific learning steps are as follows:
- Step 1
Program initialization. Select sigmoid as the activation function, then determine the minimum error, learning rate, and momentum coefficient.
- Step 2
Calculate the output. Starting from the initial weights, calculate the output values of the processing units in the hidden layer and the output layer.
- Step 3
Calculate the error value. When the error is less than the given minimum error, go to Step 5; otherwise, go to Step 4.
- Step 4
Backpropagate the error to adjust the weights between layers; then return to Step 2.
- Step 5
Obtain the optimal output value and the weights of each layer, and end the algorithm.
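The steps above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions rather than the implementation used in the case study: it uses one sigmoid hidden layer and a linear output unit, updates the weights after every sample, and checks the stopping criterion once per pass over the data. The function name `train_bp` and all default parameter values are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_bp(samples, n_hidden=4, lr=0.1, momentum=0.8, min_error=1e-3,
             max_epochs=5000, seed=0):
    """Train a one-hidden-layer network on (inputs, target) pairs.

    Returns ((hidden_weights, output_weights), final_error, epochs_used).
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Step 1: initialization; the last entry of each weight row is the threshold.
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
           for _ in range(n_hidden)]
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    dw_h = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
    dw_o = [0.0] * (n_hidden + 1)
    error = float("inf")
    for epoch in range(1, max_epochs + 1):
        error = 0.0
        for x, target in samples:
            # Step 2: forward pass through the hidden and output layers.
            xb = list(x) + [1.0]
            h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_h]
            hb = h + [1.0]
            out = sum(w * v for w, v in zip(w_o, hb))
            # Step 3: accumulate the squared error.
            delta_o = out - target
            error += 0.5 * delta_o ** 2
            # Step 4: backpropagate and update weights with momentum.
            delta_h = [delta_o * w_o[j] * h[j] * (1 - h[j])
                       for j in range(n_hidden)]
            for j in range(n_hidden + 1):
                dw_o[j] = momentum * dw_o[j] - lr * delta_o * hb[j]
                w_o[j] += dw_o[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    dw_h[j][i] = momentum * dw_h[j][i] - lr * delta_h[j] * xb[i]
                    w_h[j][i] += dw_h[j][i]
        if error < min_error:
            break  # Step 5: error is acceptable, stop learning.
    return (w_h, w_o), error, epoch
```

The `max_epochs` cap is an addition for safety: with an unreachable `min_error` the loop in Steps 2 to 4 would otherwise never terminate.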
3.3.3. Sample Selection
After the model is established, the network needs to be trained on samples. In general, the training sample needs to meet four requirements. (1) There is a close functional relationship between the input and target variables, and the target variable changes noticeably when the input variables change. (2) The input variables are independent of each other, so that no component of the input can be calculated accurately from the other components. (3) The data to be predicted has a certain commonality with the sample data. (4) The sample is large enough that all samples taken together reflect the mapping relationship between the input variables and the target variables.
The BP neural network applies a data-driven idea: it uses a nonlinear function to approximate a time series and then expresses future values through this fitted relationship and the historical data. Given a time series with known historical values, forecasting the value at a future time requires finding a nonlinear functional relationship between that future value and the historical values. Forecasting methods generally include single-step prediction, multistep prediction, and rolling prediction. Among them, rolling prediction works with a given training sample and reflects the relationship between the output sample and the predicted data. Rolling prediction starts from a single-step prediction and then feeds the output value back to the input layer as part of the input for predicting subsequent values (see Table 3).
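The rolling scheme can be illustrated with a short sketch. The helper names `make_windows` and `rolling_forecast`, the window length `n_lags`, and the callable `predict_one` (standing in for any trained single-step model) are assumptions introduced here, not notation from this paper.

```python
def make_windows(series, n_lags):
    """Build (inputs, target) pairs: n_lags consecutive values -> next value."""
    return [(series[i:i + n_lags], series[i + n_lags])
            for i in range(len(series) - n_lags)]

def rolling_forecast(history, n_lags, horizon, predict_one):
    """Forecast `horizon` steps ahead by feeding each prediction back as input."""
    window = list(history[-n_lags:])
    forecasts = []
    for _ in range(horizon):
        y_hat = predict_one(window)       # single-step prediction
        forecasts.append(y_hat)
        window = window[1:] + [y_hat]     # drop the oldest value, append the forecast
    return forecasts
```

Because every forecast beyond the first step depends on earlier forecasts, errors can accumulate over the horizon, which is one reason rolling prediction is usually applied to short-term demand.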
3.4. Weight Determination
Hybrid forecasting assigns different weights to several single prediction methods to form a comprehensive forecasting model. It makes reasonable use of the valuable information in each single forecasting model, adapts better to future changes, and reduces forecasting risk.
The key to hybrid forecasting is to determine the weights of the individual forecasting methods properly; reasonable weights greatly improve prediction accuracy. Common weight determination methods include the arithmetic average method, variance reciprocal method, mean square reciprocal method, simple weighting method, and linear programming method. The arithmetic average method treats every individual model equally and is suitable when nothing is known about the relative performance of the models. It is usually not optimal, and its sum of squared errors is large. The variance reciprocal, mean square reciprocal, and simple weighting methods all achieve higher prediction accuracy than the arithmetic average method, but they require some prior understanding of the prediction requirements. These three methods share one feature: the weights are calculated from the variance. The variance measures how much the response variable fluctuates around its mean. When a model's predictions are poor but its errors fluctuate little, the model still receives a large weight, which is unreasonable.
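The weakness of variance-based weights can be seen in a small numerical sketch. It assumes the variance reciprocal weights are proportional to the reciprocal of each model's error variance; the error values below are hypothetical and chosen only to make the effect visible.

```python
from statistics import pvariance

def variance_reciprocal_weights(errors_by_model):
    """Weights proportional to 1 / (error variance) of each model (assumed form)."""
    inv = [1.0 / pvariance(errs) for errs in errors_by_model]
    total = sum(inv)
    return [v / total for v in inv]

def combine(forecasts, weights):
    """Weighted sum of the single-model forecasts for one period."""
    return sum(w * f for w, f in zip(weights, forecasts))

# Hypothetical forecast errors (actual minus forecast) for two models.
biased_but_steady = [5.0, 5.1, 4.9, 5.0]     # large errors, little fluctuation
unbiased_but_noisy = [1.0, -1.0, 1.0, -1.0]  # small errors, more fluctuation

weights = variance_reciprocal_weights([biased_but_steady, unbiased_but_noisy])
equal_weights = [0.5, 0.5]                   # arithmetic average method
```

Here the first model is off by about 5 units in every period, yet it receives roughly 99.5% of the weight because its errors barely fluctuate, while the arithmetic average method simply gives each model one half.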
Therefore, this paper adopts the linear programming method, which determines the optimal weighting coefficients by taking the minimum sum of absolute errors of the combined prediction as the objective function.
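One common way to write such a problem as a linear program is sketched below, reading the objective as the sum of absolute errors of the combined forecast. The notation is introduced here for illustration and is not taken from this paper: $y_t$ is the actual value in period $t$, $\hat{y}_{i,t}$ is the forecast of model $i$, $w_i$ is its weight, and $u_t$, $v_t$ are auxiliary variables for the positive and negative parts of the combined error. The nonnegativity of the weights is also an assumption.

```latex
\begin{aligned}
\min_{w,\,u,\,v}\quad & \sum_{t=1}^{T} (u_t + v_t) \\
\text{s.t.}\quad & y_t - \sum_{i=1}^{m} w_i \hat{y}_{i,t} = u_t - v_t, \quad t = 1,\dots,T, \\
& \sum_{i=1}^{m} w_i = 1, \\
& w_i \ge 0,\quad u_t \ge 0,\quad v_t \ge 0 .
\end{aligned}
```

At the optimum at most one of $u_t$ and $v_t$ is positive in each period, so $u_t + v_t$ equals the absolute error of the combined forecast, and the problem can be passed to any standard linear programming solver.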
5. Conclusions
To make full use of the information in historical car rental data, a two-stage joint approach was proposed for predicting multitype car rental demand. The method improves the spill model by considering customers’ choice behavior, so that unconstrained demand can be estimated effectively, and then uses a hybrid forecasting model to predict short-term future demand.
For historical car rental demand data, the repaired data reflects customer demand more accurately. To demonstrate the effectiveness of the method, EM, PD, and MSM were compared in the unconstrained demand estimation stage, and the Holt–Winters, BP neural network, and hybrid models were compared in the future demand prediction stage. For different car types, different models yield different prediction results in the two stages. The overall results show that the proposed method is superior to the other combinations in terms of stability and effectiveness of prediction.
Specifically, in the case study, IMSM has obvious advantages over the other three methods for repairing the demand data of each car type, with an average test error of 3.06%. For the prediction of future demand of each car type, based on the results of the different unconstrained estimation models, the performances of the Holt–Winters model, the BP neural network model, and the hybrid model differ considerably. The hybrid forecasting model performs best, and the relative error of its predictions is within the acceptable range both for individual car types and for overall demand. The case study therefore shows that the proposed method outperforms the other methods, and it can be regarded as a new way to help car rental companies predict customer demand. Many factors affect customers’ car rental behavior, and with the spread of mobile apps, the way cars are reserved is changing. In the future, more influencing factors, such as electric vehicles, urban traffic restriction policies, and road traffic conditions, should be taken into account. In addition, the different distribution characteristics of car rental demand and the combined optimization of forecasting methods will also be interesting research topics.