How Credit Scoring Models Deal With a Pandemic

All models are based on the assumption that the future will be like the past. This is particularly true in credit scoring, where past behaviour is used as an input to estimate the future creditworthiness of a person or a company.

Data scientists have already observed how changes in our behaviour during the pandemic are disrupting AI models. In the context of credit risk, this creates many new challenges for risk and data science teams trying to avoid adverse consequences. In this post, we go behind the scenes on the changes credit risk and data science teams are making to keep credit assessment accurate.

When the future is not like the past

The behaviour of most people in the world changes quite a bit during a global pandemic. This creates major challenges for credit risk and data science teams - all production models have to be reviewed and the assumptions behind the models need to be reassessed. The typical questions that risk teams ask themselves during a time like this are:

  • How to accurately verify customer income, given the economic impact on certain industries;
  • Whether new rules around industry “whitelists” and “blacklists” should be created, and how to maintain those rules in the future;
  • Which changes in behaviour are captured or missed by existing data sources, such as credit bureaus;
  • How new government policies and support for SMBs, freelancers and individuals change their creditworthiness in the short and long term;
  • What additional information should be acquired from customers now and in the near future to verify their true creditworthiness.

“At times like these, models and processes are monitored very carefully. Extra validation processes are created before turning down or approving new customers,” says Agrita Garnizone, Head of Data at Nordigen. “If credit scoring is done with gray-box models or when model interpretability has been favored over complexity, risk teams look at the decision of the prediction. That allows them to validate or correct the probability of default.”

“Data points gathered during a pandemic are treated with extra caution to avoid training models with observations that are out of the ordinary. I wouldn't consider collected data during a pandemic as normal data and add it to training data sets for the future. I would treat it as outliers or at very least apply special treatment,” says Agrita.
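One way to picture the "special treatment" described above is a per-observation training weight. The sketch below is only an illustration: the window dates, the function name `sample_weight` and the default weight of 0.0 are assumptions for the example, not something described by Nordigen.

```python
from datetime import date


def sample_weight(observed_on, window_start, window_end, pandemic_weight=0.0):
    """Return a training weight for one observation.

    Observations inside the (hypothetical) pandemic window get
    `pandemic_weight`: 0.0 excludes them like outliers, a value between
    0 and 1 keeps them but reduces their influence. All other
    observations keep the full weight of 1.0.
    """
    in_window = window_start <= observed_on <= window_end
    return pandemic_weight if in_window else 1.0


# Hypothetical window, chosen only for illustration.
WINDOW_START = date(2020, 3, 1)
WINDOW_END = date(2020, 12, 31)

weights = [
    sample_weight(d, WINDOW_START, WINDOW_END, pandemic_weight=0.3)
    for d in (date(2019, 6, 1), date(2020, 4, 15))
]
```

Many model-fitting libraries accept per-row weights, so a list like `weights` could be passed along with the training data instead of deleting rows outright.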

How credit scoring models are adjusted

The adjustments that data teams have to make to credit scoring models to maintain their accuracy depend on the effects that the pandemic has had on the customer population.

When a customer population is affected equally in terms of the data points used in a scorecard, default rates can be expected to grow proportionally across score buckets.

Even if a scorecard is made for a specific product or there is doubt about the “proportional effect”, expert modelling can be used to evaluate new default rates and score buckets. Data teams then adjust cut-offs of existing models to align with an acceptable level of risk they want to take.
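The cut-off adjustment can be sketched with a small worked example. Everything in it is hypothetical: the bucket boundaries, applicant counts, default rates, the 1.5x stress multiplier and the 3% risk limit are made-up numbers, and the uniform multiplier is just the simplest reading of the "proportional effect".

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    min_score: int       # lower bound of the score bucket
    applicants: int      # number of applicants in the bucket
    default_rate: float  # expected default rate for the bucket


def stressed(buckets, multiplier):
    """Scale every bucket's default rate by the same factor (capped at 100%)."""
    return [
        Bucket(b.min_score, b.applicants, min(1.0, b.default_rate * multiplier))
        for b in buckets
    ]


def lowest_cutoff(buckets, max_default_rate):
    """Lowest score cut-off whose approved population stays within the limit.

    Buckets are added from the best score downwards; the search stops at
    the first bucket that would push the expected default rate of the
    approved population above `max_default_rate`. Returns None if even
    the best bucket exceeds the limit.
    """
    approved = 0
    expected_defaults = 0.0
    cutoff = None
    for b in sorted(buckets, key=lambda b: b.min_score, reverse=True):
        new_approved = approved + b.applicants
        new_defaults = expected_defaults + b.applicants * b.default_rate
        if new_approved == 0:
            continue  # empty bucket, nothing to evaluate yet
        if new_defaults / new_approved > max_default_rate:
            break
        approved, expected_defaults, cutoff = new_approved, new_defaults, b.min_score
    return cutoff


# Hypothetical pre-pandemic scorecard performance.
buckets = [
    Bucket(700, 1000, 0.01),
    Bucket(650, 1000, 0.02),
    Bucket(600, 1000, 0.04),
    Bucket(550, 1000, 0.08),
]

cutoff_before = lowest_cutoff(buckets, max_default_rate=0.03)
cutoff_after = lowest_cutoff(stressed(buckets, 1.5), max_default_rate=0.03)
```

With these made-up numbers, the same 3% risk limit moves the cut-off from 600 to 650 once default rates are scaled up, which is the kind of shift a data team would then review against its risk appetite.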

After adjustments to risk policies are done, data teams monitor models very closely. The metrics used in monitoring are not the standard default definitions (e.g. 90 days loan repayment overdue), but other proxies that capture short-term effects. The short-term metrics are used to verify assumptions and further adjust risk rules, policies or models, if necessary.
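As an illustration of such a short-term proxy, the sketch below tracks the share of loans in each origination cohort that have already reached 30 days past due. The choice of proxy, the field names `cohort` and `max_days_past_due`, and the 1.5x tolerance are assumptions made for the example, not metrics named in this post.

```python
from collections import defaultdict


def early_delinquency_by_cohort(loans, dpd_threshold=30):
    """Share of loans per origination cohort at or above a days-past-due threshold."""
    totals = defaultdict(int)
    late = defaultdict(int)
    for loan in loans:
        totals[loan["cohort"]] += 1
        if loan["max_days_past_due"] >= dpd_threshold:
            late[loan["cohort"]] += 1
    return {cohort: late[cohort] / totals[cohort] for cohort in totals}


def cohorts_to_review(rates, baseline_rate, tolerance=1.5):
    """Cohorts whose early delinquency exceeds the baseline by the given factor."""
    return sorted(c for c, r in rates.items() if r > baseline_rate * tolerance)


# Hypothetical loan records, for illustration only.
loans = [
    {"cohort": "2020-02", "max_days_past_due": 0},
    {"cohort": "2020-02", "max_days_past_due": 45},
    {"cohort": "2020-02", "max_days_past_due": 5},
    {"cohort": "2020-02", "max_days_past_due": 0},
    {"cohort": "2020-04", "max_days_past_due": 60},
    {"cohort": "2020-04", "max_days_past_due": 31},
    {"cohort": "2020-04", "max_days_past_due": 0},
    {"cohort": "2020-04", "max_days_past_due": 30},
]

rates = early_delinquency_by_cohort(loans)
flagged = cohorts_to_review(rates, baseline_rate=0.25)
```

A proxy like this becomes available within weeks of origination, whereas a 90-day default definition only matures months later.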

“There are no outcomes on covid19 loans just yet, but in the long term some risk teams will quite likely use ‘covid19 scorecards’ alongside scorecards created before the pandemic and combine the decisions before a final decision,” says Jekaterina Borodina Kletnaja, Senior Data Scientist at Nordigen. “The way risk teams will deal with it is to adjust risk policies, monitor them, adjust further, monitor again - and repeat this process continuously.”
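How two scorecards might be combined is not specified here. One conservative possibility, shown purely as an assumption, is to act automatically only when both scorecards agree and to send disagreements to manual review. The function name and the three outcome labels are hypothetical.

```python
def combine_decisions(pre_pandemic_approves, covid_approves):
    """Combine two scorecard decisions into one outcome.

    Assumed rule for illustration: agreement is acted on automatically,
    disagreement goes to a human reviewer.
    """
    if pre_pandemic_approves and covid_approves:
        return "approve"
    if not pre_pandemic_approves and not covid_approves:
        return "decline"
    return "manual_review"


outcome = combine_decisions(pre_pandemic_approves=True, covid_approves=False)
```

Other combinations, such as averaging the two predicted default probabilities, are equally plausible; the right choice depends on how much outcome data the newer scorecard has behind it.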

Going digital opens up new possibilities - and challenges

With the need to digitalise customer-facing processes comes the ability to use new data sources. This is great news for credit risk teams - more data sources mean more ways to understand the creditworthiness of customers and monitor how changes in behaviour affect their ability to repay liabilities. It comes with a list of challenges as well.

New data sources require substantial efforts in data collection, preparation and analysis before any testing of the predictive power can be started. New data sources such as open banking data can add a lot of value, but integrations with banking APIs and data preparation can be a major challenge for IT-strapped companies.

“If a bank or lender didn't use open banking in their lending process before, now is a good time. Open banking data provides an objective overview of the financial health of a customer - it reveals income trends and spending behaviours, which is not available in any other data source. This rich data is very valuable to risk teams,” says Roberts Bernans, Chief Product Officer at Nordigen.

“But we see many lending companies struggling to find development resources to make integrations with third-party banking APIs. Our product roadmap for the post-covid time is filled with solutions and features that help reduce the integration-anxiety that many of our clients face today with open banking data,” adds Roberts.

The new normal

While the world is adjusting to the new constraints, credit risk and data science teams will have to become more agile to accommodate the changing environment.

This opens up a lot of opportunities for teams to explore new technologies and data sources that they previously had no need or chance to test. The result of all these changes is more robust credit scoring models and policies that can perform well even in times of uncertainty, which is important for ensuring the continuous availability of credit.
