Container forecasting

Container flows are strongly linked to economic development. The more money people have, the more goods they are able to buy. Trade is at the core of the modern global economy and is the fuel for globalisation.

Predicting future container volumes over various time frames brings benefits to stakeholders across the industry. The stakeholders we work with range from large institutional investors who seek robust, long-term asset valuations to operational managers seeking to reduce short-term congestion and optimise their human and quayside resources.

We were recently asked about our forecasting methodology, specifically ‘how accurate’ we believe our outputs to be. Our inputs are dynamic, so our forecasts evolve month by month, and giving a one-off answer is difficult. Instead, we explain the relative strengths and weaknesses of the methods we use for long-term, short-term and medium-term forecasting, then present our outputs using data aggregated to some of the major trade lanes. In summary:

1. We can use a neural network for very accurate short-term forecasts (one, two or three months ahead), which can help with short-term operational planning. It is less useful beyond that horizon because it is difficult to extrapolate.

2. For the longer term, a GDP forecast helps us understand how global macroeconomic trends will affect container trades, but it is a less accurate projection.

3. To fill the gap, we use an ARIMA model to look up to a year ahead. This is less accurate than the neural network but more accurate than a GDP forecast, and is well suited to a medium-term outlook.

1. LONG-TERM FORECAST

Accurate long-term forecasting is crucial to investors and governments looking to invest in, secure financing for or build new port facilities. Port facilities require massive capital outlays, and we need some certainty that future throughput can pay that back. We have looked at the container flows (in TEU) of the major container trade lanes. A typical long-term forecasting methodology is to use GDP. We have used the aggregated GDP of the importing countries on each trade lane as the covariate to forecast the annual TEU volumes of a given trade. The relationship between GDP and annual container flow is quite strong and allows a linear model to fit past data (from 2013 to 2017) relatively well.


Asia-USEC container flow
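As a rough illustration of the GDP approach described above, the sketch below fits a straight line of annual TEU against aggregated importer GDP by ordinary least squares and projects one year ahead. The figures, the variable names and the `fit_linear` helper are hypothetical and chosen only for illustration; they are not our data or our production code.

```python
from statistics import mean

def fit_linear(gdp, teu):
    """Ordinary least squares fit of teu = intercept + slope * gdp."""
    gdp_mean, teu_mean = mean(gdp), mean(teu)
    sxy = sum((g - gdp_mean) * (t - teu_mean) for g, t in zip(gdp, teu))
    sxx = sum((g - gdp_mean) ** 2 for g in gdp)
    slope = sxy / sxx
    intercept = teu_mean - slope * gdp_mean
    return intercept, slope

# Hypothetical illustrative figures, NOT real trade data:
# aggregated importer GDP (USD trillion) and annual lane volume (million TEU).
gdp = [16.8, 17.5, 18.2, 18.7, 19.5]
teu = [6.1, 6.5, 6.9, 7.1, 7.6]

intercept, slope = fit_linear(gdp, teu)

# Projecting a future year needs an external GDP forecast (assumed value here).
gdp_forecast = 20.3
teu_forecast = intercept + slope * gdp_forecast
print(f"slope={slope:.3f}, projected volume={teu_forecast:.2f} million TEU")
```

The slope plays the role of the multiplier between GDP and container volume, so the projection is only as good as the GDP forecast fed into it and the assumption that the multiplier stays constant.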

There are some limitations to using GDP:

• Sound, consistent GDP forecasts covering all relevant countries are required. A major issue is that reliable GDP forecasts rarely extend beyond 5 years, so a decision needs to be made on the long-term trend growth. We usually use the IMF for short-term forecasts.

• GDP forecasts are usually annual, which means using a larger, monthly data set is not possible.

• Sectors performing differently can’t be captured: agriculture could be failing in a region, but the steel industry could be booming. This development will have a materially different effect on traffic flows from that indicated by GDP if that sector’s imports or exports constitute a significant proportion of overall cargo. Use of a sole GDP driver will always create a large weighting error with different kinds of maritime traffic.

• It relies on the assumption that the multiplier was, and will always remain, the same value. If there has been structural change during the period used to establish the relationship between port traffic and GDP, sectors and/or any other drivers, and this change is not recognised, the relationship used to project cargo volumes in the forecast period will be incorrect (adaptive models might adjust for it, but only slowly).


2. SHORT-TERM FORECAST

In the short term, very accurate monthly forecasting of the container flow on each trade lane can be useful for operational decisions relating to resource and asset allocation. It could also be useful for government bodies reviewing monthly economic growth outlooks.

Using monthly data gives us more observations and lets us implement a time series forecasting model. It also removes the reliance on GDP (or other) forecasts. Instead, we use a recurrent neural network.


Representation of a recurrent neural network

The diagram above shows how this works. For a given trade lane, the blue circles are the values of container flow for each month: x_0 is the value for the first month in the dataset, x_1 the second, and so on. The “A boxes” represent non-linear transformations and the magenta circles represent outputs. The horizontal arrows represent the fact that information “propagates through time”. This means that the algorithm uses information from past periods to help predict future values. In simple terms, we compare the magenta outputs h_0, h_1, ..., h_t to versions of the original blue actual values to simulate the future.

This represents one cycle in the algorithm. We then measure the difference between our predictions and the actual values. Finally, we use an optimisation algorithm to adjust what transformations take place in the “A boxes” to minimise the error metric.
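The cycle described above can be sketched in a few lines. This is a deliberately tiny, single-unit recurrent network, not our production model. It rests on several assumptions made only for illustration: the outputs h_t are compared with the following month's actual value, the error metric is mean squared error, and a simple random search stands in for the gradient-based optimiser a real implementation would normally use. The series is hypothetical.

```python
import math
import random

def rnn_forward(xs, w_in, w_rec, w_out, bias):
    """Run a one-unit recurrent network over xs; return one output h_t per step."""
    state = 0.0
    outputs = []
    for x in xs:
        # The "A box": a non-linear transformation of the current input
        # and the state carried over from the previous month.
        state = math.tanh(w_in * x + w_rec * state + bias)
        outputs.append(w_out * state)
    return outputs

def one_step_loss(series, params):
    """Mean squared error between each output and the NEXT month's actual value."""
    outputs = rnn_forward(series[:-1], *params)
    targets = series[1:]
    return sum((h - y) ** 2 for h, y in zip(outputs, targets)) / len(targets)

def train(series, start, steps=2000, scale=0.1, seed=0):
    """Random-search stand-in for the optimiser: keep a change only if error falls."""
    rng = random.Random(seed)
    params, best = list(start), one_step_loss(series, start)
    for _ in range(steps):
        candidate = [p + rng.gauss(0.0, scale) for p in params]
        loss = one_step_loss(series, candidate)
        if loss < best:
            params, best = candidate, loss
    return params, best

# Hypothetical monthly volumes scaled to roughly [0, 1]; not real data.
series = [0.42, 0.45, 0.47, 0.44, 0.50, 0.53, 0.55, 0.52, 0.58, 0.60, 0.63, 0.61]
start = [0.5, 0.5, 1.0, 0.0]  # w_in, w_rec, w_out, bias
initial_loss = one_step_loss(series, start)
params, final_loss = train(series, start)
print(f"error before: {initial_loss:.5f}, after: {final_loss:.5f}")
```

The `w_rec * state` term is what the horizontal arrows in the diagram stand for: it is the only route by which earlier months influence later outputs.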

For some of the trade lanes, we also added a trend component to improve performance. This means that we remove the trend from the data before making the predictions, and add it back afterwards.
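A minimal sketch of this detrend, predict, re-trend step is given below. It assumes a straight-line trend fitted by least squares, which is an illustrative choice and not necessarily the form of trend we use. The `mean_residual` function is a hypothetical placeholder standing in for the neural network, and the volumes are made up.

```python
def fit_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (intercept, slope)."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    slope = sxy / sxx
    return v_mean - slope * t_mean, slope

def forecast_with_trend(values, horizon, predict_residuals):
    """Remove the trend, forecast the detrended series, then add the trend back."""
    intercept, slope = fit_trend(values)
    residuals = [v - (intercept + slope * t) for t, v in enumerate(values)]
    future_residuals = predict_residuals(residuals, horizon)
    n = len(values)
    return [intercept + slope * (n + k) + r for k, r in enumerate(future_residuals)]

def mean_residual(residuals, horizon):
    """Placeholder model (hypothetical): repeats the mean of the detrended series."""
    level = sum(residuals) / len(residuals)
    return [level] * horizon

monthly = [100, 104, 103, 108, 111, 110, 115, 118]  # hypothetical volumes
forecast = forecast_with_trend(monthly, horizon=3, predict_residuals=mean_residual)
print([round(f, 1) for f in forecast])
```

Separating the trend this way means the model only has to learn the fluctuations around it, which is usually an easier problem than learning a steadily growing level.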

By implementing this methodology, we obtain robust forecasting accuracy for each trade lane. This is far superior to the 6.4% error we found in the GDP linear regression. We expect the mean error will fall over time as we expand our data set.


Example of fit on the Asia-Med trade lane

This gives excellent one-month and even three-month forecasts, although the accuracy decreases the further we go into the future. We update these models every month as new data comes in to improve our accuracy. This gives our stakeholders extremely accurate information on what is going to happen in the future, facilitating better decision making.

Short-term port-level forecasts can be used directly in daily port operations when decisions need to be made on the acquisition of additional equipment and material, or on the allocation and arrangement of workers and machines. Furthermore, unlike in manufacturing industries, the capacity of a container terminal cannot be increased in the short term in response to seasonal variation in demand by adopting such strategies as keeping inventory, outsourcing or overtime working. Forecasts on a short-term basis are therefore essential for the control and scheduling of a container port system, and for the terminal operator’s decision making and planning.

We can apply this method to forecast container flow at the level of seaports or trade lanes, or other particular flows if our clients make the data available.

Port of Singapore example

To illustrate the performance of our model at the port level, we apply it to forecast the monthly container throughput in the Port of Singapore.


As the plot above shows, the model gives a very accurate monthly forecast: the overall relative error (accuracy) is an excellent 2.79% and the R-squared (relationship) is an excellent 0.997, almost 1.
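For readers who want to reproduce these two metrics on their own data, the sketch below shows one common way to compute them. It assumes that "overall relative error" means the mean absolute percentage error and that R-squared is the coefficient of determination. Both are assumptions about the definitions, and the numbers are hypothetical, not Port of Singapore data.

```python
def mean_relative_error(actual, predicted):
    """Mean absolute percentage error, in percent."""
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100 * total / len(actual)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical monthly throughput (million TEU) and model output.
actual = [3.10, 2.95, 3.20, 3.05, 3.30, 3.15]
predicted = [3.05, 3.00, 3.15, 3.10, 3.25, 3.20]

print(f"relative error: {mean_relative_error(actual, predicted):.2f}%")
print(f"R-squared: {r_squared(actual, predicted):.3f}")
```

The two numbers answer different questions: the relative error says how far off a typical month is, while R-squared says how much of the month-to-month variation the model explains.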

The model therefore supports accurate monthly forecasts, and although it was not designed for longer horizons, we can extrapolate it for several months. Doing so yields the following 3-month forecast for container throughput:



3. MEDIUM-TERM FORECAST

We have also developed a methodology using an ARIMA model to plug the gap between long- and short-term forecasting. ARIMA stands for autoregressive integrated moving average. As with a neural network, it forecasts based on historical values, so it is not reliant on covariates like GDP, but it is less dependent on the amount of data available. It is therefore more useful in situations where the quantity of data is relatively small. An ARIMA model requires us to calibrate three parameters: the time lags involved, the extent to which we look at the difference in trade volumes between periods rather than just the total volumes, and the length of time to be considered in the moving average.
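To make the roles of these parameters concrete, the sketch below hand-rolls a deliberately simplified special case: one time lag, one round of differencing and no moving-average term, usually written ARIMA(1, 1, 0). It is an illustration under those assumptions, not the model we fit; in practice a statistics library would estimate all three parts, including the moving-average term omitted here. The function names and the series are hypothetical.

```python
def difference(series):
    """First difference: the month-on-month change in volume (the "I" part, d = 1)."""
    return [b - a for a, b in zip(series, series[1:])]

def fit_ar1(values):
    """Least-squares fit of values[t] = c + phi * values[t-1] (the "AR" part, p = 1)."""
    x, y = values[:-1], values[1:]
    x_mean, y_mean = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - x_mean) ** 2 for a in x)
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    phi = sxy / sxx if sxx else 0.0  # constant changes carry no lag information
    return y_mean - phi * x_mean, phi

def forecast_arima_110(series, horizon):
    """Forecast the changes with AR(1), then add them back up to get volumes."""
    diffs = difference(series)
    c, phi = fit_ar1(diffs)
    level, last_diff = series[-1], diffs[-1]
    forecast = []
    for _ in range(horizon):
        last_diff = c + phi * last_diff  # predicted change for the next month
        level += last_diff               # undo the differencing
        forecast.append(level)
    return forecast

# Hypothetical monthly lane volumes (thousand TEU); not real data.
series = [510, 522, 518, 530, 541, 537, 549, 560, 556, 568, 579, 575]
print([round(v, 1) for v in forecast_arima_110(series, horizon=12)])
```

Because only a constant and one lag coefficient are estimated, a model of this kind can be fitted on a short history, which is the sense in which ARIMA is less dependent on the amount of data available.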

We run this procedure for each trade lane, identifying the best model for each. As an example, the chart here shows our model for Asia-North Europe.

For Asia-North Europe, our ARIMA model gives a good medium-term monthly forecast. The overall relative error is 4.5%, lower than that of the GDP model but higher than that of the neural network. The advantage of this model is that it can be used for medium-term forecasting (i.e. up to a year) with a smaller loss of accuracy than a deep learning approach.