This paper presents a control framework to co-optimize the velocity and power-split operation of a plug-in hybrid vehicle (PHEV) online in the presence of traffic constraints. The principal challenge in online implementation lies in the conflict between the long control horizon required for global optimality and the limited computational power available. To resolve this conflict, we propose a receding-horizon strategy in which co-states are used to approximate the future cost, allowing the prediction horizon to be shortened. In particular, we update the co-state using a nominal trajectory and a temporal-difference (TD) error based on the co-state dynamics. Our simulation results demonstrate a 12% fuel economy improvement over a sequential/layered control strategy for a given driving scenario. Moreover, real-time practicality is evidenced by an average computation time of around 80 ms per model predictive controller (MPC) step with a 10 s prediction horizon.
The technical maturity of advanced driver-assistance systems (ADAS) encourages research into systematically designing a longitudinal velocity controller from an optimal control perspective. ADAS, combined with a powertrain-level controller, can achieve excellent overall system-wide efficiency. Among vehicles with different energy sources, a plug-in hybrid vehicle (PHEV) is of particular interest: with two energy sources (fuel and electricity), energy-efficient driving can further improve fuel economy beyond what an optimal power-split alone can achieve on top of ordinary human driving.
Over the years, there have been extensive efforts toward increasing the efficiency of (P)HEVs. The book of Sciarretta and Vahidi  presents comprehensive discussions covering almost every aspect of connected and automated vehicles with a variety of energy sources. The work of Bae et al.  presents an ecological adaptive cruise controller (ECO-ACC) with a two-level framework for a PHEV to minimize energy consumption while avoiding collisions and complying with traffic signals. However, its focus was solely on velocity planning and safety control rather than jointly optimizing the vehicle-following and powertrain dynamics. The work of Heppeler et al.  introduced a layered approach for the predictive control of the vehicle and powertrain dynamics of an HEV and demonstrated a 3–10% fuel consumption reduction compared to a baseline that follows the desired velocity obtained by respecting speed limits and then applies the equivalent consumption minimization strategy. However, their proposed controller did not account for car-following, i.e., it did not consider a lead vehicle (LV). It is therefore unclear how the controller would perform in terms of fuel economy and safety when traffic constraints induced by the LV must be considered in the short horizon but are ignored in the long-horizon SOC planning.
Most of the existing literature on energy-efficient control of a PHEV employs a layered control framework consisting of (1) a velocity planning (eco-driving) layer, sometimes with an additional ACC layer to guarantee safety in the presence of an LV, although information about the powertrain dynamics can be partially included to better estimate the energy consumption , and (2) a powertrain-control layer that coordinates the operation of the engine and electric motors to minimize fuel consumption and reach a desired battery state-of-charge (SOC) level at the end of the trip, given the velocity determined by the velocity planning layer. Unlike existing eco-driving approaches based on the layered control framework, which only indirectly minimize fuel consumption, this study proposes to co-optimize the velocity and powertrain operation to explicitly maximize system-wide efficiency while guaranteeing safety and the desired terminal SOC. Moreover, the proposed solution strategy is potentially implementable in real-time.
For PHEVs, the predominant roadblock to adopting a co-optimization control framework online is its numerical implementation. Concretely, to optimize the battery charge-depletion rate and the velocity of a PHEV, one needs to solve a trajectory optimization problem (TOP) for the entire trip with a specified SOC to be reached at the end of the trip. In the online implementation, the TOP is posed as an economic model predictive controller (EMPC) problem. To the authors' knowledge, the work of Huang et al.  is the first to succeed in solving the TOP directly in real-time rather than explicitly tracking a reference. However, the work in Ref.  only considered the powertrain dynamics with a single battery SOC state. For the co-optimization problem, with its additional controls and states, it is unclear whether real-time implementation remains feasible when the EMPC aims to cover the entire trip. Moreover, as shown later in the paper, co-optimization requires predicting an LV's driving trace, and such a prediction becomes less accurate as the prediction horizon increases.
The length of the prediction horizon is generally constrained to reduce the computational complexity of the TOP, which makes it difficult to achieve global optimality. To resolve this issue, we propose a receding-horizon strategy in which the co-states are used to approximate the future cost. In particular, the co-state is updated using a nominal trajectory and the temporal-difference (TD) error based on the co-state dynamics. The proposed receding-horizon control framework with TD-error-based co-state correction is shown in Fig. 1 and detailed in Sec. 3. In the proposed strategy, distance constraints are generated from the prediction of the LV's trajectory. A reference SOC from a nominal trajectory obtained offline is used to softly update the co-state at the end of each prediction horizon, which is then rolled out backward-in-time to obtain the control at time t = kΔt. This is the shooting iteration block, discussed in Sec. 3.1. A TD error based on the co-state dynamics is used to correct the co-state value sampled from a noisy nominal trajectory, which warm-starts the single shooting iteration. This is the future terminal co-state initialization block, detailed in Sec. 3.2.
The remainder of the paper is organized as follows: Sec. 2 briefly summarizes the control-oriented model and the centralized control framework as an EMPC problem with related constraints. An innovative strategy to solve the centralized control problem is then detailed in Sec. 3. Section 4 presents the simulation results and compares them against the offline sequential velocity smoothing and power-split optimization. Section 5 concludes the paper with a summary and future work.
2 Receding-Horizon Fuel-Efficient Controller Design
This section describes a control framework to co-optimize the velocity and powertrain operation of the PHEV online as an EMPC problem. A detailed discussion of the novel online implementation strategy based on a TD Bellman equation error will be presented in the next section.
2.1 Control-Oriented Model.
Two subsystems are considered for co-optimizing the ego PHEV's velocity and powertrain operation to achieve minimum fuel consumption in the presence of an LV: (i) the vehicle-following subsystem and (ii) the hybrid powertrain. Detailed descriptions of the dynamics and their corresponding constraints are omitted for space considerations. The two subsystems are coupled through the vehicle (longitudinal) velocity vk and the driver-demanded torque tpk. In this work, the ego vehicle is assumed to drive on a single lane without grades, and only its longitudinal dynamics are considered.
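The coupling described above can be sketched with a minimal illustrative discretization; the point-mass longitudinal dynamics, the Coulomb-counting SOC model, and all parameter values below are assumptions for illustration, not the paper's calibrated model.

```python
# Minimal illustrative discretization of the two coupled subsystems
# (placeholder parameters, not the paper's calibrated model): point-mass
# longitudinal dynamics and a Coulomb-counting battery SOC integrator.
DT = 1.0               # time-step [s]
PACK_VOLTAGE = 400.0   # nominal pack voltage [V] (assumed)
Q_BATT = 6.5 * 3600.0  # battery charge capacity [C] (assumed)

def step(state, accel, batt_power):
    """Propagate (position [m], velocity [m/s], SOC [-]) one step.

    accel      : commanded longitudinal acceleration [m/s^2]
    batt_power : electrical power drawn from the battery [W]
    """
    s, v, soc = state
    s_next = s + v * DT
    v_next = max(v + accel * DT, 0.0)          # no reverse driving
    soc_next = soc - batt_power * DT / (PACK_VOLTAGE * Q_BATT)
    return (s_next, v_next, soc_next)
```

The two subsystems interact only through the velocity and the demanded power/torque, which is what makes the co-optimization (rather than a layered split) meaningful.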
2.2 Economic Model Predictive Controller Formulation for Co-Optimization.
The objective of the EMPC design for our problem is not to penalize deviation from a pre-defined equilibrium  but rather to directly minimize the total fuel consumption (the economic cost) of the PHEV on a given trip by simultaneously optimizing its velocity and its powertrain operation. Despite the lack of explicit reference tracking, the battery SOC is required to be depleted to a specified low level at the end of the entire trip. In this study, the vehicle is assumed to be in traffic and hence is subject to tight time-varying upper and lower bounds on its position depending on its LV.
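As a sketch of this economic objective, the snippet below evaluates total fuel over a horizon with the LV-induced position bounds treated as hard constraints; `fuel_rate` is an illustrative surrogate, not the paper's engine map, and a terminal-SOC condition would be enforced separately.

```python
# Sketch of the economic cost in the EMPC formulation: total fuel over
# the horizon, with LV-induced position bounds treated as hard
# constraints. fuel_rate is an illustrative surrogate, not the paper's
# engine model; the terminal-SOC requirement is handled elsewhere.
def fuel_rate(torque, speed):
    """Illustrative fuel-rate surrogate [g/s] (zero when coasting/braking)."""
    return 1e-4 * max(torque, 0.0) * speed

def economic_cost(torques, speeds, positions, s_min, s_max, dt=1.0):
    """Total fuel [g] if the position trajectory is feasible, else inf."""
    for s, lo, hi in zip(positions, s_min, s_max):
        if not lo <= s <= hi:
            return float("inf")  # violates the LV-induced bounds
    return sum(fuel_rate(T, v) for T, v in zip(torques, speeds)) * dt
```

Note that nothing in this cost tracks a reference velocity or SOC trajectory; only the raw economic quantity (fuel) is penalized, which is the defining feature of the EMPC formulation.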
3 Approximation Strategy for Solving Economic Model Predictive Controller
In the original EMPC formulation (P) presented in the previous section, the problem horizon needs to cover the entire trip. However, as mentioned previously, this formulation could be too computationally demanding to solve in real-time without a specialized numerical strategy like the one proposed in Ref. . Inspired by the TD update rule  in reinforcement learning, an online implementation strategy is proposed in which only a short prediction horizon Np, instead of the entire trip horizon Nf, is considered. In the proposed strategy, intermediate SOC and associated co-state values from nominal trajectories obtained offline are used to warm-start the computation. Note that the nominal SOC trajectory is used only to softly update the co-state, and hence no explicit reference-tracking formulation is needed. Figure 1 presents the receding-horizon implementation strategy for approximately solving (P). Two critical blocks, (1) the shooting iteration Ⓐ and (2) the future terminal co-state initialization Ⓑ, are explained in detail in this section.
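The overall receding-horizon loop of Fig. 1 can be sketched as follows. All callables here (`solve_short_horizon`, `plant_step`, `predict_lv`, `nominal_soc`, `nominal_costate`, `td_correct`) are hypothetical placeholders standing in for the paper's blocks; only the loop structure is illustrated.

```python
# Skeleton of the receding-horizon strategy in Fig. 1. All callables are
# hypothetical placeholders for the paper's blocks; only the loop
# structure (short horizon Np rolled over the trip horizon Nf) is shown.
def receding_horizon(x0, Np, Nf, nominal_soc, nominal_costate,
                     predict_lv, solve_short_horizon, plant_step, td_correct):
    x = x0
    lam = nominal_costate(Np)  # initial terminal co-state guess
    controls = []
    for k in range(Nf - Np):
        bounds = predict_lv(k, Np)        # LV-induced position constraints
        soc_ref = nominal_soc(k + Np)     # nominal SOC at horizon end
        u, lam = solve_short_horizon(x, lam, soc_ref, bounds)  # block (A)
        controls.append(u)
        x = plant_step(x, u)              # apply the first control only
        lam = td_correct(lam, nominal_costate(k + Np + 1))     # block (B)
    return controls
```

The key point is that the short-horizon solve never sees the full trip; the terminal co-state, softly updated from the nominal trajectory, carries the missing long-horizon information.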
3.1 Shooting Iteration.
The numerical algorithm for solving the associated mixed-integer nonlinear optimal control problem for the entire trip offline with single shooting is a modified version of the algorithm proposed in Ref. . Since this algorithm is not the focus of this work, in the sequel the problem is considered solvable. Note that, unlike general single shooting based on the discretization of the continuous-time Pontryagin's minimum principle, single shooting is applied here with backward-in-time co-state propagation. To be concrete, at time t = kΔt, it requires
- initial guesses of the state and control trajectories within the prediction horizon Np (with xk and uk defined in Sec. 2.1), and
- an initial guess of the terminal co-state associated with the SOC at the end of the prediction horizon.
In reality, however, the exact trace of the ego PHEV's LV over the entire trip is not known in advance, making it impossible to accurately predict the exact SOC value at the end of each prediction horizon. Nevertheless, traffic monitoring systems, on-board GPS, and mobile apps make it possible to record velocity traces of the same route. The repeated traces can be averaged and used as the nominal LV velocity trace to form the position constraints for the ego PHEV. The minimum fuel consumption problem can then be (approximately) solved to obtain the nominal SOC and its corresponding co-state trajectories. As will be detailed next, the nominal SOC and its corresponding co-state trajectory are used in Ⓐ and Ⓑ in Fig. 1. As shown in Fig. 1 Ⓑ, at each time t = kΔt, the nominal SOC value at the end of the prediction horizon is obtained from the nominal SOC trajectory; this value is then used in Ⓐ to solve the short-horizon problem. However, due to the uncertainty in the future driving conditions, strict enforcement of the terminal SOC can result in infeasibility. To avoid this issue, single shooting is performed with only a fixed number of iterations, and the problem is solved approximately.
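The fixed-iteration idea can be sketched as below: the terminal co-state is nudged toward consistency with the nominal terminal SOC, and the loop stops after a fixed number of iterations so the terminal condition is only enforced softly. The `rollout` map and gain value are toy assumptions, not the paper's solver.

```python
import math

# Sketch of the fixed-iteration single-shooting loop in block (A): the
# terminal co-state lam_T is nudged proportionally to the mismatch between
# the predicted and nominal terminal SOC, and the loop is cut off after a
# fixed number of iterations so the problem is only solved approximately.
# The rollout and gain K are toy placeholders, not the paper's solver.
K = 50.0  # co-state adjustment gain ("K" in the nomenclature; value assumed)

def rollout(soc0, lam_T, horizon):
    """Toy forward rollout: a larger co-state depletes the battery faster."""
    depletion_per_step = 0.001 * (1.0 + math.tanh(lam_T))
    return soc0 - depletion_per_step * horizon  # predicted terminal SOC

def shooting_iterations(soc0, soc_ref_T, lam_T, horizon, n_iter=10):
    """Run a fixed number of shooting iterations (approximate solve)."""
    for _ in range(n_iter):
        soc_T = rollout(soc0, lam_T, horizon)
        lam_T += K * (soc_T - soc_ref_T)  # soft push toward the nominal SOC
    return lam_T, rollout(soc0, lam_T, horizon)
```

Capping the iteration count is what prevents infeasibility when the nominal terminal SOC cannot be met exactly under the actual traffic constraints.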
3.2 Future Terminal Co-State Initialization.
As mentioned previously, solving the problem numerically requires an initial guess of the state and control trajectories as well as an initial guess of the terminal co-state at the end of the prediction horizon. The initial guesses of the state and control trajectories are obtained from the shifted trajectories of the previous model predictive controller (MPC) step. The most critical part of the short-horizon implementation strategy is the proper initialization of the terminal co-state. Although it is possible to warm-start the terminal co-state using the value from the nominal co-state trajectory, it is observed that (1) this simple interpolation induces a systematic bias in the terminal SOC at the end of the trip when errors exist in the position constraints within the prediction horizon (caused by the prediction error of the LV's position) and (2) if the terminal co-state from the previous MPC step is used to warm-start the next MPC step, large oscillations are induced in the resulting co-state trajectory.
Since the battery SOC dynamics are slow, the dynamics of its corresponding co-state are also slow . This means that the updated co-state can be used to warm-start the next MPC iteration.
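A minimal sketch of such a TD-style correction is shown below; because the co-state evolves slowly, the one-step target is taken as the (noisy) nominal sample, and only a fraction of the TD error is applied. The learning rate `ALPHA` is an assumed tuning value, not the paper's.

```python
# Sketch of the TD-style co-state correction in block (B): the previous
# MPC step's co-state is blended with the noisy nominal sample, applying
# only a fraction ALPHA of the TD error. ALPHA is an assumed value.
ALPHA = 0.2  # TD learning rate (assumed)

def td_corrected_costate(lam_prev, lam_nominal_sample):
    """Blend the previous MPC step's co-state with the nominal sample."""
    td_error = lam_nominal_sample - lam_prev  # target minus current estimate
    return lam_prev + ALPHA * td_error
```

Applying only a fraction of the TD error low-pass filters the noise in the sampled nominal co-state, which addresses both the oscillations and the systematic bias observed with the naive warm-start choices.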
4 Simulation Results and Discussions
In simulation, each MPC step with prediction horizon Np is performed with 10 single shooting iterations (Ⓐ in Fig. 1). The computation time per MPC step is around 80 ms on average.1 For comparison, a single shooting iteration for the entire trip offline takes 17.7 s for this particular 2-h trip.
Previously, we demonstrated the effectiveness of velocity smoothing followed by power-split optimization (sequential optimization) in improving fuel economy . Minimizing the acceleration of the velocity profile yields a smoothed driving trace, in line with some of the work in eco-driving [10,11]. However, this sequential type of optimization requires a layered implementation in which each layer has its own objective function; such an implementation is, in essence, decentralized. By comparison, the direct fuel-efficient MPC strategy in this work adopts a centralized control framework. Since one of the objectives is to compare the performance of the centralized and decentralized control frameworks, the result obtained by offline sequential optimization is used as the baseline for comparison.
4.1 Position Constraints Under Uncertainty in Prediction.
4.2 Influence of the Penalty on Acceleration.
This section investigates the tradeoff between a driver's ride comfort and fuel economy. As can be seen from the dot-dash curves in Fig. 3(a) (and observed throughout the driving cycle when the speed is not too high), the acceleration obtained from the direct fuel-efficient EMPC is of a bang-bang type, and its abrupt changes may not be acceptable to the driver. The penalty on acceleration in (6) is set to [0, 0.01, 0.1], and the MPC results with these penalty values are presented in Fig. 2 and compared to those obtained from the offline sequential optimization. As seen in Fig. 3(a), the resulting acceleration becomes smoother as the penalty on acceleration increases. However, the total fuel consumption also increases, as shown in the second subplot of Fig. 2 and in Table 1, meaning that a smoothed driving trace is not the most fuel-efficient one for a PHEV.
This paper proposes a receding-horizon control framework to simultaneously determine the powertrain operation and velocity for fuel-efficient car-following of a PHEV. To resolve the conflict between the horizon length and the resulting computational complexity, we propose approximating the future cost with the co-state. The co-state sampled from a nominal trajectory is adjusted with a TD error to prevent oscillations and systematic bias in the co-state estimation. The proposed control strategy demonstrates an additional 12% fuel economy benefit in its online implementation compared to the offline solution of a typical layered approach, namely velocity smoothing followed by power-split optimization. Additionally, to accommodate drivability, a penalty on acceleration is considered, and the resulting degradation in fuel economy is quantified.
The computations are done on a Mac OS X machine with an Intel® Core i5 2.7 GHz processor and 8 GB RAM.
The work was funded in part by the Advanced Research Projects Agency-Energy (ARPA-E), U.S. Department of Energy, under Award Number DE-AR0000837, also known as NEXTCAR. The authors would like to thank Southwest Research Institute and Toyota Motor North America Research & Development for continuous feedback and support.
Conflict of Interest
There are no conflicts of interest.
- K =
the rate of adjustment of co-state based on the difference in the SOC, used in the single shooting
- = co-state sampled from the nominal trajectory
- = co-state corrected by the TD error
the MPC problem at time t = kΔt with a horizon length Nf equal to the length of the entire trip (prediction horizon Np); the resulting terminal state of charge (SOC) at the end of the trip should be equal (or very close) to the desired value SOCf
- , =
nominal SOC and its corresponding co-state trajectories, obtained by averaging offline simulation results