Revenue Optimization Engine’s New Machine Learning Model Improves Prediction Accuracy
In a previous blog post, we talked about how Recurly uses machine learning to optimize subscription billing for our customers and prevent involuntary churn. As part of our goal to help our customers maximize their subscription revenue, we introduced the Revenue Optimization Engine in 2018. When a recurring transaction fails, this technology creates a customized retry schedule so that subsequent retries of that transaction have a higher chance of succeeding. It is driven by machine learning models built on Recurly's incredible breadth of historical subscription data, which identify the factors most highly correlated with successful transaction processing.
Our machine learning model is the backbone of the Revenue Optimization Engine. The model's accuracy in predicting when a retry is most likely to succeed directly impacts our customers' retry success rates. Machine learning models are most effective when they are regularly updated with new data. After developing the last retry model, we made some changes that significantly improved the accuracy of the model's predictions.
Let’s explore these improvements in more detail.
Retraining the model
While a machine learning model should be robust enough to handle small changes in the data, as time goes on, subscriber behavior changes in ways that affect the model. For example, there may be changes in laws, policies, and rules regarding credit card payments. Or a subscription business may switch payment gateways. These and other factors affect the quality of our retry model's predictions. A model trained on transaction data gathered in 2017 might not be as effective on transactions from 2018, especially in a rapidly changing market like subscription commerce.
We asked ourselves: what can be done to mitigate this and improve the performance of our model over time? First, we built a system to continuously evaluate and retrain the model. Using this system, we closely monitored the model's performance and retrained the model when its prediction accuracy dropped. Retraining means keeping the same machine learning algorithm and features but training the model on more recent data, which allows it to learn newer patterns and structures.
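To make that loop concrete, here is a minimal sketch of an evaluate-and-retrain cycle, assuming a scikit-learn-style model and AUC as one possible quality metric. The threshold and function names are illustrative, not our production pipeline.

```python
# Minimal evaluate-and-retrain sketch (illustrative, not Recurly's actual pipeline).
from sklearn.base import clone
from sklearn.metrics import roc_auc_score

AUC_FLOOR = 0.80  # hypothetical threshold that triggers retraining

def maybe_retrain(model, X_recent, y_recent):
    """Keep the same algorithm and features; refit on newer data if quality drops."""
    auc = roc_auc_score(y_recent, model.predict_proba(X_recent)[:, 1])
    if auc < AUC_FLOOR:
        fresh = clone(model)           # same algorithm and hyperparameters
        fresh.fit(X_recent, y_recent)  # same features, more recent transactions
        return fresh
    return model
```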
The value of iteration
One of Recurly’s core values is to iterate everything: we want to continually improve our product, features, and processes. While retraining a model is a great first step, there are other ways to lift our retry model’s performance even higher. We recently developed a newer version of the model that includes new training data, additional model features, and a different machine learning algorithm.
Recurly’s data advantage plays a significant role here. Recurly has an incredible amount of information from having processed hundreds of millions of transactions over the past nine years. This information includes data on payment methods, credit card brands, and historical data about our customers’ subscribers. Our data also includes transactions from all over the globe, involving multiple currencies, gateways, and much more. And our data encompasses hundreds of subscription businesses from a wide variety of both B2B and B2C industries.
Creating new features from this data increases the accuracy of our model's predictions. We also calculated aggregated features, such as average invoice length, which further improve the model's predictions.
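As a hedged illustration of this kind of feature engineering, the sketch below computes a per-subscriber aggregate with pandas. The column names are hypothetical, and we interpret "average invoice length" here as the average number of days an invoice stays open; the real definition may differ.

```python
import pandas as pd

def add_avg_invoice_length(transactions: pd.DataFrame) -> pd.DataFrame:
    """Attach a per-subscriber aggregate feature (hypothetical column names)."""
    days_open = (transactions["closed_at"] - transactions["opened_at"]).dt.days
    transactions = transactions.assign(invoice_days_open=days_open)
    avg = (transactions.groupby("subscriber_id")["invoice_days_open"]
           .mean()
           .rename("avg_invoice_length")
           .reset_index())
    return transactions.merge(avg, on="subscriber_id", how="left")
```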
Gradient Boosting Machines
After using new data to build additional features, we began updating our model. In a previous blog, we discussed several machine learning algorithms like logistic regression and random forests. At Recurly, in addition to iterating, we love to innovate and try new ideas. So, we tried Gradient Boosting Machines (GBM) this time.
As you may remember from that earlier post, a random forest consists of several decision trees built in parallel, where each tree gets to vote on whether it thinks a transaction will succeed or fail. The final decision is an aggregate of the votes of the individual trees.
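To illustrate the voting idea, here is a short sketch using scikit-learn's RandomForestClassifier on placeholder data; the features and labels are random stand-ins, not our real model inputs.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

X_train = np.random.rand(1000, 5)        # placeholder transaction features
y_train = np.random.randint(0, 2, 1000)  # 1 = retry succeeded, 0 = failed

forest = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)

x = np.random.rand(1, 5)  # one incoming transaction
votes = [tree.predict(x)[0] for tree in forest.estimators_]  # one vote per tree
print(sum(votes) / len(votes))  # fraction of trees voting "success"
```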
A GBM also includes many decision trees, but with a key difference: they are organized in a series, one after the other. The first tree makes predictions; some are right and some are wrong. Each subsequent tree pays attention to the mistakes of the tree before it and tries to correct them. This continues until the process reaches the last tree, and the final prediction is an aggregate of the predictions of all the individual trees. On our data, GBM generated more accurate predictions, helping to maximize the chance of completing a transaction successfully.
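The equivalent sketch with scikit-learn's GradientBoostingClassifier is below; the hyperparameters are illustrative, not the ones we use in production.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

X_train = np.random.rand(1000, 5)        # placeholder transaction features
y_train = np.random.randint(0, 2, 1000)  # 1 = retry succeeded, 0 = failed

gbm = GradientBoostingClassifier(
    n_estimators=200,    # trees built in series, one after another
    learning_rate=0.05,  # how strongly each tree corrects its predecessors
    max_depth=3,         # shallow trees, each fixing residual mistakes
).fit(X_train, y_train)

print(gbm.predict_proba(X_train[:1])[0, 1])  # probability the retry succeeds
```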
A model based on machine learning is not as inscrutable as people often think. One very useful aspect of machine learning is that you can identify which features matter most to your model when it makes its predictions. There are also other metrics, such as precision and recall, that are useful in evaluating a model's performance.
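For example, scikit-learn exposes per-feature weights through feature_importances_. The snippet below inspects the GBM fitted in the previous sketch; the feature names are hypothetical, chosen only to make the output readable.

```python
# Assumes the fitted `gbm` from the sketch above; feature names are illustrative.
feature_names = ["card_brand", "gateway", "hour_of_day",
                 "days_past_due", "avg_invoice_length"]

for name, weight in sorted(zip(feature_names, gbm.feature_importances_),
                           key=lambda pair: pair[1], reverse=True):
    print(f"{name}: {weight:.3f}")
```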
What do these metrics mean? Recall measures the ratio of correctly predicted successes to all actual successes. Precision measures the ratio of correctly predicted successes to all predicted successes (both correct and incorrect). Together, these metrics provide a more holistic view of our model's performance than a single metric like accuracy can. A model with high precision and recall can capture more instances of successful retries.
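To make the definitions concrete, here is a quick example with made-up labels; the numbers are purely illustrative.

```python
from sklearn.metrics import precision_score, recall_score

y_true = [1, 1, 0, 1, 0, 0, 1, 0]  # actual retry outcomes (1 = success)
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]  # the model's predicted outcomes

print(precision_score(y_true, y_pred))  # 3 correct successes / 4 predicted = 0.75
print(recall_score(y_true, y_pred))     # 3 correct successes / 4 actual = 0.75
```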
Next steps
Where do we go from here? We have refined our model, validated it using historical data, and have seen more accurate predictions as a result. This is exciting! The next step is to deploy the machine learning model to production and run experiments to further evaluate our model. Once this is done, our customers should recover even more failed transactions than ever. Stay tuned to learn more!