Rise of Artificial Intelligence: Chapter 5: Prediction

The Rise of Artificial Intelligence is available for purchase in hard copy from Amazon, Booktopia, Readings, Dymocks, and other booksellers, as well as in ebook form.

Chapter 5: Prediction

This supplementary video to Chapter 5 of The Rise of Artificial Intelligence discusses a few key issues related to predictive modeling. An overview of AI-based prediction models is provided together with a demo of a particular simulation that can be used as a predictive model. The video concludes with an overview of ensemble models for the ongoing example of distributing cars to auction sites.

Some of the material in this video is based on a complex business problem that's used as a running example. The following article provides a full explanation of this problem as well as its complexities:

Michalewicz, Z., Schmidt, M., Michalewicz, M., and Chiriac, C., A Decision-Support System based on Computational Intelligence: A Case Study, IEEE Intelligent Systems, Vol.20 (4), July-August 2005, pp.44 – 49

Transcript (reading time: 18:51 min)

[00:00:05]

Hi, this is Zbigniew Michalewicz, I'm one of the co-authors of The Rise of Artificial Intelligence, and this is a supplementary video to Chapter 5 of the book. The topic of this presentation is prediction. Some of the material presented in this chapter – like the previous one and the next one – is based on the complex business problem of distributing cars that I will use as a running example. And the recommended reading is the article which is displayed here, which is available from the same website as this video.

[00:00:44]

The outline of this talk is straightforward: We'll start with general remarks on predictive modeling, talk a little bit about the granularity of predictive models and then about Artificial Intelligence-based predictive models. Then we move to simulation, we will talk about agent-based systems and simulation as a predictive model, and conclude by returning to the car distribution example – and in particular, we'll pay close attention to ensemble predictive models for this car distribution problem. Predictive modeling from a high-level perspective can be visualized in the following way: The central question is "What will happen next?"

[00:01:26]

And we answer this question based on some historical data. So, for example, we have some sales values for a product in some part of the country. And we're looking at some number of weeks or months back and we have to predict what will happen next: how much we will sell next week, and the following week, and so on? We can build a variety of predictive models for this problem. These models may be linear, quadratic, or exponential – this is not important. What is important is that each curve represents some prediction: what would happen in the future? We are making bets that the distribution of dots or sales values will follow a particular curve. And it may happen that this doesn't hold true, that these models were wrong to some extent, and the longer the prediction horizon, the larger the magnitude of errors. But if we look at this question: what will happen next? We shouldn't think in terms of just historical data.
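As a rough illustration of the point about competing curves, the sketch below fits a straight line and an exponential curve to the same short sales history and compares their forecasts at increasing horizons. The sales figures, the function names, and the choice of these two curve types are assumptions made for the example, not data or models from the talk.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical weekly sales (units) for weeks 0..7; invented for illustration.
weeks = list(range(8))
sales = [100, 104, 109, 113, 119, 124, 130, 136]

# Model 1: a straight line through the sales values.
a_lin, b_lin = fit_line(weeks, sales)
# Model 2: an exponential curve, fitted as a straight line through log(sales).
a_log, b_log = fit_line(weeks, [math.log(s) for s in sales])

def predict_linear(week):
    return a_lin + b_lin * week

def predict_exponential(week):
    return math.exp(a_log + b_log * week)

# Both curves describe the history well, but they drift apart as we look further ahead.
for horizon in (1, 4, 12, 26):
    week = weeks[-1] + horizon
    lin = predict_linear(week)
    expo = predict_exponential(week)
    print(f"{horizon:>2} weeks ahead: linear={lin:8.1f}  exponential={expo:8.1f}  gap={expo - lin:8.1f}")
```

The two models are close one week ahead and far apart half a year ahead, which is one way to see why the error of any single bet tends to grow with the prediction horizon.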

[00:02:44]

When we are making all these predictions, very often, we have additional datasets that can be used to make the predictive model much better, much more stable, much more accurate. We may have patterns from the last few years. So this set of historical data can be very often extended by several years back, which will be very, very useful because then it will be easier to study seasonality. It will be easier to study special events like Easter, Christmas, school holidays, whatever they might be.

[00:03:25]

Weather patterns may also be very important. We may have available data from the Bureau of Meteorology for the weather around that time. Then we may have information about new products that were introduced at that time, and these new products may have influenced sales of the product in question. Or we may have information about past and current promotions, which is very important. These promotions might explain these movements of dots up and down: there was a promotion, there wasn't a promotion.
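To make the role of promotion data concrete, here is a minimal sketch that estimates a promotion uplift from past weeks and applies it to a planned week. The history, the multiplicative uplift, and the names `baseline`, `uplift`, and `predict` are all assumptions for illustration; a real model would also account for seasonality, weather, and the other sources mentioned above.

```python
# Hypothetical weekly history: (units sold, was a promotion running?). Invented data.
history = [
    (100, False), (150, True), (104, False), (98, False),
    (160, True), (101, False), (155, True), (103, False),
]

def mean(values):
    return sum(values) / len(values)

# Average sales in ordinary weeks and in promotion weeks.
baseline = mean([units for units, promo in history if not promo])
promo_level = mean([units for units, promo in history if promo])

# Assumed multiplicative effect of running a promotion.
uplift = promo_level / baseline

def predict(planned_promotion):
    """Forecast next week's sales given whether a promotion is planned."""
    return baseline * (uplift if planned_promotion else 1.0)

print(f"no promotion planned: {predict(False):.1f} units")
print(f"promotion planned:    {predict(True):.1f} units")
```

Without the promotion flag, the same history would look like unexplained jumps up and down.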

[00:04:04]

By having this information and knowing which promotions are planned for the future, it would be easier to predict the future with a higher degree of accuracy. Or we may have information about competitors and their products, which might be also very, very useful. And if we combine all these data sources and build a predictive model very carefully, then we can get some meaningful results. The prediction of our model might be quite remarkable. One additional observation is the following: any problem-solving activity – and in particular a problem of prediction – is a two-step process: we first build a model of the problem and then we use this model to make a prediction. So this first step, model building, is of the greatest importance, and clearly, if the model is wrong, then the prediction is wrong, so we should pay special attention to this first step, how to look at the problem, how to extract the most important information, and how to combine this information, knowledge, and data into a model that will perform well.

[00:05:28]

So building a predictive model is quite challenging. Of course, we should have historical data available. And as I have just indicated, this historical data is not only for the variable in question – like sales volume in the future – but all historical data, like seasonality and weather and promotions and competitors and new products; everything which is relevant, which is available. Then we should identify the key variables. And some people may think the more variables, the better. But that needn't be the case: if the model is too complex, very often the performance would be very, very poor for a variety of reasons.

[00:06:18]

The next step would be the extraction of useful knowledge from data (trends, seasonality, dependencies between variables, and so on), and then selection of an appropriate model, whether it will be a neural network, fuzzy rules, or even a simple statistical model. But sooner or later, we have to make a decision: which model would be the most appropriate for the problem at hand, before training, validation and testing of the model.
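One common way to organize training, validation, and testing for time-ordered data is sketched below: split the history chronologically, pick the candidate model with the lowest validation error, and report its error on the untouched test portion. The 60/20/20 split, the two naive candidate forecasters, and the sales series are assumptions for the example, not the procedure used in the talk.

```python
def chronological_split(series, train_frac=0.6, val_frac=0.2):
    """Split a time-ordered series into train, validation, and test parts."""
    n = len(series)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return series[:train_end], series[train_end:val_end], series[val_end:]

def last_value(history):
    """Candidate 1: next value equals the most recent value."""
    return history[-1]

def running_mean(history):
    """Candidate 2: next value equals the average of everything seen so far."""
    return sum(history) / len(history)

def one_step_mae(model, history, targets):
    """Mean absolute error of one-step-ahead forecasts over `targets`."""
    history = list(history)
    total = 0.0
    for actual in targets:
        total += abs(model(history) - actual)
        history.append(actual)  # the actual value is known before the next forecast
    return total / len(targets)

# Hypothetical weekly sales; invented for illustration.
sales = [100, 102, 105, 107, 110, 112, 115, 117, 120, 122]
train, val, test = chronological_split(sales)

candidates = {"last_value": last_value, "running_mean": running_mean}
val_errors = {name: one_step_mae(model, train, val) for name, model in candidates.items()}
best_name = min(val_errors, key=val_errors.get)

# The test set is used once, only for the selected model.
test_error = one_step_mae(candidates[best_name], train + val, test)
print(best_name, val_errors, test_error)
```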

[00:06:50]

And on top of everything, we should have a mechanism for regular updates because very often we operate in dynamic environments. We are getting feedback, we are getting new data coming in and we have to have a method: what to do about this, how to include feedback or arrival of new data to update the model so it stays current. Again, if the model is wrong, then the answer is wrong. This is why it's so important to take very good care with this modeling effort. A few comments about the granularity of the model: Let's assume that we are modeling some transportation effort connected with delivery from warehouses to customers and so on.

[00:07:48]

And clearly, we can look at a map of Australia. This is for air travel, so it will be perfect. We have cities, we have a destination, you can estimate time, you can extend this data by timetables of flights and so on. But we would like to also deliver by trucks or trains, and in that case it's necessary to go a little bit deeper and identify major highways or secondary roads and get additional information about major cities and distribution of their neighborhoods.

[00:08:28]

And we can keep going deeper and deeper, taking care of intersections and information about peak hours and the average times of getting from one point to another all the way down to traffic lights and one-way streets and traffic densities and so on. So if we're trying to build a very accurate model, we might think we need all this information, thousands of variables, if not more, and every street and traffic light and peak hours and so on.

[00:09:04]

And the question is where to stop? How deep should we go to solve the problem? And there are no easy answers for that. Everything is based on the problem: how will we apply this predictive model? What is an accepted accuracy of the model? Are we predicting delivery times in five-minute time windows or is a range of three hours sufficient? Perhaps just identifying the day of delivery would be okay?

[00:09:38]

Also, how will we update this model? If we go to this deep level, it will be very difficult to maintain this predictive model because any road work, road block, accident, and so on should be immediately introduced into the model, and this simply isn't feasible. How often will the model be used? And so on. There are many, many questions. So this decision of granularity relies on experience and expertise and is very often part of the "art" rather than the "science."

[00:10:09]

There are many Artificial Intelligence-based predictive models. One of the most popular is neural networks. Most people have heard the term and here we have a network of nodes and connections, where we are presenting some input. This is the lowest level of the neural network. Let's say we provide the product ID and the sales of the last week, and the promotion period and seasonality and competitor behaviour or whatever it may be. And this information is processed from one level to another, and the information is passed further with different weights, and the output would be, for example, the best price and the ideal promotion period.
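The layer-by-layer flow described above can be sketched as a tiny forward pass. The network size, the weights, the ReLU activation, and the meaning attached to the three inputs and two outputs are all invented for illustration; a real network would learn its weights from historical data.

```python
def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(weights . inputs + bias) per node."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def relu(value):
    return max(0.0, value)

def identity(value):
    return value

# Hypothetical, untrained weights: 3 inputs -> 2 hidden nodes -> 2 outputs.
HIDDEN_W = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
HIDDEN_B = [0.0, 0.1]
OUTPUT_W = [[1.0, 0.5], [-0.4, 0.9]]
OUTPUT_B = [0.2, 0.0]

def forward(features):
    """Pass the inputs from the lowest level up through the network."""
    hidden = dense(features, HIDDEN_W, HIDDEN_B, relu)
    return dense(hidden, OUTPUT_W, OUTPUT_B, identity)

# Assumed inputs: scaled last-week sales, promotion flag, seasonality index.
features = [1.2, 1.0, 0.8]
outputs = forward(features)
print(outputs)  # two numbers that a real model would map to, say, price and promotion length
```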

[00:11:04]

We can also use an agent-based system as a predictive model. We talked a little bit about this in the last video, and the general idea is that we can model a variety of variables and interactions of these variables to observe an "emergent behavior," which is a sort of prediction. If the variables interact in this and that way, then this is the most likely outcome. We'll have a look at such simulations in a few seconds.

[00:11:45]

Fuzzy logic is a very powerful tool that can serve as a predictive model here. The trick is that we can describe the problem as a set of rules, but the rules are expressed in fuzzy terms: high, low or medium, sharp, not sharp, high speed, low speed, medium speed, and so on. And this is how human beings reason: If we're driving too fast and approach a sharp turn, then we'll slow down rather than speed up. And if our speed is 70 km an hour and the terrain is 35 degrees, then we'll reduce the speed by 25 percent. There are some mechanisms, the fuzzy mechanism of processing information in a way where at the end we get a very precise recommendation, or a precise prediction.
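A minimal sketch of how fuzzy rules can turn vague terms into a precise number is shown below. The membership ranges (40 to 100 km/h for "fast", 10 to 60 degrees for "sharp"), the two rules, and the weighted-average defuzzification are assumptions, chosen here so that the 70 km/h and 35 degrees case from the talk comes out at a 25 percent reduction.

```python
def ramp_up(x, low, high):
    """Membership degree rising linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)

def speed_reduction(speed_kmh, angle_degrees):
    """Return the recommended speed reduction in percent."""
    fast = ramp_up(speed_kmh, 40, 100)      # how strongly "speed is fast" holds
    sharp = ramp_up(angle_degrees, 10, 60)  # how strongly "angle is sharp" holds

    # Rule 1: IF speed is fast AND angle is sharp THEN reduce speed a lot (50%).
    rule_strong = min(fast, sharp)
    # Rule 2: IF speed is not fast OR angle is not sharp THEN do not reduce (0%).
    rule_none = max(1.0 - fast, 1.0 - sharp)

    # Weighted average of the rule outputs gives one crisp recommendation.
    return (rule_strong * 50.0 + rule_none * 0.0) / (rule_strong + rule_none)

print(speed_reduction(70, 35))
```

Each rule fires to a degree between 0 and 1, and the final step blends the rule outputs into a single recommendation.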

[00:12:50]

We can also use random forest models. A decision tree is one of the most popular modeling tools, and all of us use decision trees one way or the other. If this happens, then turn left, if something else happens, turn right – and then after turning left or right, if this happens, then do something else, and so on. Very often, applications for getting credit in banks are organized as a decision tree. If the number of years of employment is less than 10, follow one branch. If it's more than 10 years, follow another branch. And then if the applicant's salary is lower or higher than such and such number, we follow the appropriate branches until we arrive at the final recommendation. Random forest combines many decision trees into one model to provide much better stability, efficiency, and accuracy.
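The credit example can be sketched as a small tree of if-then branches, and the forest idea as a majority vote over several such trees. The thresholds and the three hand-written trees below are hypothetical; an actual random forest learns its trees from random samples of the training data rather than having them written by hand.

```python
from collections import Counter

def make_tree(min_years, min_salary):
    """Build a two-level decision tree with the given split thresholds."""
    def tree(years_employed, salary):
        if years_employed < min_years:
            # Short employment history: require a much higher salary.
            return "approve" if salary >= 2 * min_salary else "decline"
        return "approve" if salary >= min_salary else "decline"
    return tree

# Hypothetical forest: each tree uses slightly different thresholds.
forest = [
    make_tree(10, 50_000),
    make_tree(8, 55_000),
    make_tree(12, 45_000),
]

def forest_predict(years_employed, salary):
    """Majority vote over all trees in the forest."""
    votes = Counter(tree(years_employed, salary) for tree in forest)
    return votes.most_common(1)[0][0]

print(forest_predict(15, 60_000))
print(forest_predict(11, 52_000))
```

A single tree can flip its answer when an applicant sits near one threshold; the vote over several trees is less sensitive to any one split.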

[00:13:53]

We can use genetic programming, which is similar to random forest. But the difference is that we're not dealing with trees, but with computer programs. So imagine that we create 100 computer programs randomly and these programs are responsible for giving us a prediction of some sort. And we evaluate these predictive models on historical data. Some of them predict better; some a bit worse. And we run an artificial evolutionary process.

[00:14:30]

We select the better programs by doing some genetic tricks, such as crossovers and mutation. We generate offspring, new programs, replace the weaker programs and we run it generation by generation. We will talk about evolutionary programming in the next supplementary video for Chapter 6. But this is the general idea. We evolve the best predictive model rather than build it from scratch. Also, we can take several models and combine them together into an ensemble model, and this ensemble model will consist of a few models that, for example, vote or calculate the average of the recommendations.
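The evolutionary loop described above can be sketched in a simplified form. Full genetic programming evolves program structures; as a stand-in, this example evolves only two numbers (the slope and intercept of a linear predictor), which is an assumption made to keep the sketch short. The population size of 100 follows the talk; the data, mutation size, and number of generations are invented.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical historical data generated by y = 3x + 10.
history = [(x, 3.0 * x + 10.0) for x in range(10)]

def error(candidate):
    """Mean absolute prediction error of a candidate on the historical data."""
    slope, intercept = candidate
    return sum(abs(slope * x + intercept - y) for x, y in history) / len(history)

def crossover(parent_a, parent_b):
    """Combine two parents: slope from one, intercept from the other."""
    return (parent_a[0], parent_b[1])

def mutate(candidate):
    """Apply a small random change to both parameters."""
    return (candidate[0] + rng.gauss(0, 0.5), candidate[1] + rng.gauss(0, 0.5))

# Start with 100 random candidates.
population = [(rng.uniform(-10, 10), rng.uniform(-20, 20)) for _ in range(100)]
initial_best = min(error(c) for c in population)

for generation in range(50):
    population.sort(key=error)
    survivors = population[:50]  # keep the better half unchanged
    offspring = [
        mutate(crossover(rng.choice(survivors), rng.choice(survivors)))
        for _ in range(50)
    ]
    population = survivors + offspring  # offspring replace the weaker half

best = min(population, key=error)
final_best = error(best)
print(f"best error before evolution: {initial_best:.3f}, after: {final_best:.3f}")
```

Because the better half always survives, the best error can never get worse from one generation to the next.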

[00:15:20]

So by playing with the different aspects of the problem, this ensemble approach may give us some very interesting, very good results. We'll return to ensemble models towards the end of this presentation. Let's now have a look at a particular simulation that can be used as a predictive model. We are trying to predict what would happen to the traffic during peak hours during particular circumstances. We have several variables like the rate of inflow of cars, the ramp flow, and even the politeness of drivers, because if it's very hot, they may be a little bit nervous, less polite, and so on.

[00:16:01]

So we start the simulation and watch the movement of cars, and suddenly we ask: what would happen if the inflow of cars is much higher, and the ramp flow is also much higher? We're approaching peak hour, more and more people are leaving work, and it's also very hot. So politeness goes down. We also increase the percentage of trucks entering the highway, and on top of that, we can introduce an accident blocking one lane and then observe what is happening, what would be the outcome of these circumstances. And we can look at this simulation as a predictive model. We can understand the delays and we can predict delays; we can do a variety of things. So let's return to the car distribution example. I introduced this problem last time, but this is a short repetition: GMAC Leasing Company, part of General Motors, has many cars coming back after completion of leases and rentals.

[00:17:17]

This number is around 1.2 million cars being returned annually, and each day a team of 23 analysts is making decisions on which auction site to send all these cars. So for each individual car, they have to make a decision: From this distribution center, this particular car should be shipped to this and that location. And to do it in a meaningful way, we have to have a predictive model. We have to know that if we send this particular car to this particular location, then the most likely price is, let's say, $12,373. Without this information, without any predictive model, we would be making decisions in the dark.

[00:18:11]

And there are many prediction issues. First of all, we have many variables to consider: Car-related variables and auction-related variables like make, model, odometer reading, year of production, or location of the auction site and its neighbourhood demographic, other information around this auction site, and possibly weather patterns, changes in market conditions, and seasonality. For example, if we send convertibles to Chicago in winter then we'll be regretting it for the next few months.

[00:18:51]

Also, we have to take into account the "volume effect" because if we send too many similar cars to a single location, then we'll depress the price. So this is the overall situation: We have several distribution centers, we have several hundred cars sitting at each distribution center, and every single day, the team in Detroit is making up to 7,000 individual decisions. This car from this particular distribution center should be sent to this particular auction site, car by car, distribution center by distribution center, keeping all this information in their heads: the price of each car, if it's sent here, it would sell for such and such. But if it's sent over there, the price will be slightly different, not to mention transportation costs and other variables, which the optimizer will be dealing with.

[00:19:52]

The predictive model is still key for the following reasons: First of all, if the predictive model is wrong, there is nothing to optimize: any decision we make might look OK, and we can concentrate on minimizing transport costs because everything else is like a random guess. And the price prediction would involve some key variables.

[00:20:17]

Of course, the make, model, year, and odometer reading would be absolutely key variables for any car. Then we have other variables like the VIN number, colour, features, damage level, and so on. We can buy a VIN decoder, which would give us additional information on each car, which is why the VIN is listed here. We have to know about the distribution of other cars that would be sold together with the car in question, because we have to know the volume effect. We have to know the date and auction site because we're making a prediction for the future and the date of auction may be important.

[00:20:58]

We have to know the current inventory information for this particular auction. And of course, the seasonality, predicted weather, and so on. We may also have access to external data, such as petrol prices and current trends. Suddenly, the color yellow is very popular in California – a couple of thousand dollars premium on yellow Corvettes. And the optimizer interacts with the predictive model all the time. It works like this: The optimizer is trying to find the best possible distribution by asking the predictive model the value of each possible distribution. And the predictive model, let's say, would respond: this particular distribution would give you an average lift of $213 per car. By the way, we need some benchmark, we need some number to evaluate the quality of the distribution. Here, average lift was selected as this measure and the benchmark for comparison – we are measuring against the distribution where we send cars to the closest auction site, which is what happened in the past very, very often.

[00:22:22]

So with respect to the benchmark distribution, this new distribution proposed by the optimizer is better by $213 per car. But the optimizer is trying to find the best distribution. So the optimizer would modify this distribution a little bit and then ask the predictive model: what is the value of this particular distribution? And the predictive model would say: in this case, the average lift is a little bit lower, $209 per car. And the optimizer would then generate another distribution and again ask the predictive model: what about that? And the predictive model may say this is very good, this is $213 per car.
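The question-and-answer loop between optimizer and predictive model can be sketched as a simple hill climber. Everything concrete here is an assumption for illustration: the three sites, the randomly generated base prices, the $20 volume penalty, and the hill-climbing strategy itself (the optimizer actually used is the subject of the Chapter 6 video). Transportation costs are ignored.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
SITES = ["A", "B", "C"]

# Hypothetical cars: a base price at each site and the geographically closest site.
CARS = [
    {"base": {s: 12000 + rng.randint(-500, 500) for s in SITES},
     "closest": rng.choice(SITES)}
    for _ in range(30)
]

def predicted_total(distribution):
    """Stand-in predictive model: total predicted sale price of a distribution."""
    counts = {s: distribution.count(s) for s in SITES}
    # Assumed volume effect: every extra car at a site lowers each price there by $20.
    return sum(
        car["base"][site] - 20 * (counts[site] - 1)
        for car, site in zip(CARS, distribution)
    )

# Benchmark: send every car to its closest auction site.
benchmark = [car["closest"] for car in CARS]
benchmark_total = predicted_total(benchmark)

def average_lift(distribution):
    """Average gain per car over the benchmark distribution."""
    return (predicted_total(distribution) - benchmark_total) / len(CARS)

current = list(benchmark)
for _ in range(5000):
    candidate = list(current)
    candidate[rng.randrange(len(CARS))] = rng.choice(SITES)  # modify the distribution a little
    if average_lift(candidate) >= average_lift(current):     # ask the predictive model
        current = candidate

print(f"average lift over benchmark: ${average_lift(current):.2f} per car")
```

The optimizer never looks inside the predictive model; it only asks for the value of each proposed distribution, which is why a wrong model leaves it with nothing meaningful to optimize.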

[00:23:10]

This process of interaction between the optimizer and the predictive model, which serves as an evaluation function, is repeated many, many, many times, possibly a million times, if we have enough time to run the optimizer through millions of iterations. And this predictive model, which is of the greatest importance, was based on an ensemble model, with the general idea as follows: We provide the VIN number of the car for which we'd like to get a price prediction, but this is extended by additional input variables. So apart from all other variables for this particular car, we are looking at the distribution of other cars that are heading towards one particular auction site and we know the date of the auction at this auction site, we know also the current inventories at all auction sites, including cars in transit. And then we process all this information.

[00:24:17]

Many different models evaluate the different aspects of this problem with different weights. And finally, the system will converge to the answer. If we take this car and we ship it to that auction site, we will get $24,507 – which is very, very important information for the optimizer to make the optimal distribution. And to build this ensemble model, we split the training data into two subsets: one and two.

[00:24:58]

The first subset was used to train a variety of models, and the second subset was used to model and tune their predictions, because on the second data subset, we know all the historical values, we know all the prices these cars were sold at. So we are using these predictions to tune the system to arrive at one meta-model, which, by the way, was a neural network. So initially, when we start with subset number one, we train a variety of models, we are talking about 12 or 13 different models. We create some basic model for the make and model. And the model is making these predictions for one-year-old cars and then the remaining models are making adjustments for volume effect, for mileage, for the year, seasonality factors, and so on, with all these models working together to make the final prediction. So model one would give us the base prediction and all the other models would give some adjustment. And then we are training this meta-model, this neural network, to come up with the final price prediction.

[00:26:31]

So when we are considering a new case – a particular car for which we would like to get a price prediction – then we look at all attributes of this car, the distribution of other cars and auction dates, and where we plan to send this car, the inventory levels there, everything. And let's say the base prediction is $12,500 and we make a variety of adjustments and then the neural network would tell us: the final prediction is $11,250.
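The base-plus-adjustments structure can be sketched as follows. The talk's meta-model was a neural network; as a much simpler stand-in, this example tunes a single weight on the adjustments using the second data subset. The base prices, adjustment formulas, and subset-two records are all hypothetical, chosen so that the new car reproduces the $12,500 base and $11,250 final figures mentioned above.

```python
def base_model(car):
    """Stand-in base model: price of a one-year-old car of this type."""
    return {"sedan": 12500.0, "suv": 18000.0}[car["body"]]

def mileage_adjustment(car):
    """Assumed rule: minus $1 for every 20 miles above 12,000."""
    return -max(0, car["miles"] - 12000) / 20.0

def volume_adjustment(car):
    """Assumed rule: minus $25 for every similar car at the same auction."""
    return -25.0 * car["similar_cars_at_site"]

ADJUSTMENTS = [mileage_adjustment, volume_adjustment]

def total_adjustment(car):
    return sum(adjust(car) for adjust in ADJUSTMENTS)

def fit_meta_weight(subset_two):
    """Least-squares weight w in: sold price ~ base + w * total adjustment."""
    numerator = denominator = 0.0
    for car, sold_price in subset_two:
        adj = total_adjustment(car)
        numerator += adj * (sold_price - base_model(car))
        denominator += adj * adj
    return numerator / denominator

def predict(car, weight):
    return base_model(car) + weight * total_adjustment(car)

# Hypothetical subset two: cars with known historical sale prices.
subset_two = [
    ({"body": "sedan", "miles": 22000, "similar_cars_at_site": 4}, 11900.0),
    ({"body": "suv", "miles": 32000, "similar_cars_at_site": 2}, 16950.0),
    ({"body": "sedan", "miles": 42000, "similar_cars_at_site": 8}, 10800.0),
]
weight = fit_meta_weight(subset_two)

new_car = {"body": "sedan", "miles": 32000, "similar_cars_at_site": 10}
print(f"base: ${base_model(new_car):,.0f}  final: ${predict(new_car, weight):,.0f}")
```

The component models here are fixed by hand, so only the tuning step on the second subset is actually shown; in the system described, the component models were first trained on subset one.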

[00:27:08]

With this explanation, let's conclude by looking at a particular demo of the car distribution system. And in this demo screen we can switch to the distribution tab, and see thousands of cars listed in this file. These are cars which are ready for shipment. They are sitting at different distribution center locations. And we have only a handful of variables that are displayed in this demo: make, model, year, and trim level.

[00:27:46]

And needless to say, there are many, many additional variables which are not displayed here. And of course, the question is which auction sites these cars should be directed to? If we switch to visualization and we start the optimizer, the optimizer would look at these millions of different possible distributions, and for each distribution, ask the predictive model: if we do it this way, what is the quality measure of this proposed distribution? So let's stop this for a second and let's return to the distribution screen.

[00:28:31]

And you can see this particular column is already filled in. So we look at a particular car, Honda Accord, trim level, year of production, 2003, and the recommendation of the system is that this car should be sent to Birmingham in Alabama. And if we do it together with other cars, we will get $15,302. This is the role of the predictive model: without these values, the optimizer can't do a decent job. This piece of information is essential.

[00:29:14]

So this is how it works: cooperation between the optimizer and the predictive model. And during the next video for Chapter 6, we will talk about optimization. So you will see how these components work together, what the optimizer is actually doing, in which way the new distributions are generated. Thank you.
