A Complete Guide to Credit Risk Modelling

This article explains the basic concepts and methodologies of credit risk modelling and why it matters to financial institutions.

In the credit risk domain, statistics and machine learning play an important role in solving credit risk problems. Hence the role of predictive modelers and data scientists has become so important.

What is Credit Risk?

In simple words, it is the risk that a borrower does not repay a loan, a credit card balance or any other type of credit. Sometimes customers pay some installments of a loan but do not repay the full amount, which includes the principal plus interest.

For example, you took a personal loan of USD 100,000 for 10 years at a 9% interest rate. You paid a few initial installments to the bank but stopped paying afterwards. The remaining unpaid installments are worth USD 30,000. That is a loss to the bank.

Do you remember, or are you aware of, the 2008 global recession? In the US, customers with low creditworthiness were given home loans that were risky because of their high likelihood of default. To compensate for the risk, banks charged high interest rates. Banks further sold these loans to investors as Collateralized Debt Obligations (CDOs), which were considered low-risk from 2004 to 2007. As defaults increased, banks seized (foreclosed) properties. This burst the real estate bubble and caused a sharp decline in home prices, which led to a global recession because many financial institutions had invested in these funds.

Introduction to Credit Risk Modelling

What is Credit Risk Modelling?

Credit risk modelling refers to data-driven risk models that calculate the chance that a borrower defaults on a loan (or credit card). They also estimate, if a borrower fails to repay, how much he/she owes at the time of default and how much of that outstanding amount the lender would lose. In other words, we need to build probability of default, loss given default and exposure at default models as per the regulatory Basel norms.

Basel Regulations

A committee was set up in 1974 by the central bank governors of the G10 countries. Its aim is to ensure that banks hold enough capital to return depositors’ funds. The committee meets regularly at the Bank for International Settlements (BIS) in Basel, Switzerland, to discuss banking supervisory matters. It was expanded in 2009 to 27 countries.

Basel I

The Basel I accord is the first official pact, introduced in 1988. It focused on credit risk and introduced the idea of the capital adequacy ratio, also known as the Capital to Risk Assets Ratio. It is the ratio of a bank's capital to its risk-weighted assets. Banks needed to maintain a ratio of at least 8%, which means capital should be at least 8 percent of the risk-weighted assets. Capital is the sum of Tier 1 and Tier 2 capital.

  1. Tier 1 capital : Primary funding source of the bank. It includes shareholders' equity and retained earnings
  2. Tier 2 capital : Subordinated loans, revaluation reserves, undisclosed reserves and general provisions

In Basel I, fixed risk weights were set based on the type of exposure: 50% for mortgages and 100% for non-mortgage exposures (such as credit cards, overdrafts, auto loans and personal finance). See the example shown below -

Mortgage: $5,000
Risk Weight: 50%
Risk Weighted Assets: $2,500 (Mortgage * Risk Weight)
Minimum Capital Required: $200 (8% * Risk Weighted Assets)
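
The arithmetic in this example can be sketched in Python as follows. This is only an illustration: the function names and the `RISK_WEIGHTS` lookup are made up for this sketch, while the weights and the 8% ratio are the Basel I figures quoted above.

```python
# Basel I fixed risk weights quoted in the text (illustrative lookup, not a library API).
RISK_WEIGHTS = {"mortgage": 0.50, "non_mortgage": 1.00}
MIN_CAPITAL_RATIO = 0.08  # capital must be at least 8% of risk-weighted assets


def risk_weighted_assets(exposure: float, exposure_type: str) -> float:
    """Exposure multiplied by its fixed Basel I risk weight."""
    return exposure * RISK_WEIGHTS[exposure_type]


def minimum_capital(exposure: float, exposure_type: str) -> float:
    """Minimum capital = 8% of risk-weighted assets."""
    return MIN_CAPITAL_RATIO * risk_weighted_assets(exposure, exposure_type)


rwa = risk_weighted_assets(5_000, "mortgage")
capital = minimum_capital(5_000, "mortgage")
print(f"RWA = ${rwa:,.0f}, minimum capital = ${capital:,.0f}")
```

For the $5,000 mortgage this reproduces the $2,500 of risk-weighted assets and $200 of minimum capital shown in the example.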

Basel II

The Basel II accord was introduced in June 2004 to eliminate the limitations of Basel I. For example, Basel I focused only on credit risk, whereas Basel II covers not only credit risk but also operational and market risk. Operational risk includes fraud and system failures. Market risk includes equity, currency and commodity risk.

In Basel II, there are the following three ways to estimate credit risk.

Standardized Approach

For corporate exposures, banks rely on ratings from certified credit rating agencies (CRAs) such as S&P and Moody's to quantify the capital required for credit risk. The risk weight is 20% for highly rated exposures and goes up to 150% for low-rated exposures. For retail exposures, the risk weight is 35% for mortgages and 75% for non-mortgage exposures (no rating by a credit rating agency is required for retail).

Corporate Exposure: $500,000
Credit Assessment: AAA
Risk Weights: 20%
Risk Weighted Assets: $100,000
Minimum Capital Required: $8,000

Internal Ratings Based (IRB) Approach

Probability of Default (PD)

Probability of default is the likelihood that a borrower will default on debt (credit card, mortgage or non-mortgage loan) over a one-year period. In simple words, it is the expected probability that a customer fails to repay the loan. It is expressed as a percentage between 0% and 100%. The higher the probability, the higher the chance of default.

Exposure at Default (EAD)

It is the amount we expect to be outstanding in the case of default, i.e. the amount that the borrower owes the bank at the time of default.

Loss given Default (LGD)

It is the share of the outstanding amount that we expect to lose, i.e. a proportion of the total exposure when the borrower defaults. It is calculated as (1 - Recovery Rate).

LGD = (EAD - PV(recovery) + PV(cost)) / EAD

PV(recovery) = Present value of recoveries, discounted to the time of default.
PV(cost) = Present value of recovery costs, discounted to the time of default. Costs reduce the net recovery, so they add to the loss.

Someone takes a $100,000 home loan from a bank to purchase a flat. At the time of default, the loan has an outstanding balance of $70,000. The bank forecloses on the flat and sells it for $60,000. EAD is $70,000. LGD is ($70,000 - $60,000) / $70,000, i.e. 14.3%.
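
The flat example can be reproduced with a small Python sketch. It assumes LGD is computed as 1 minus the net recovery rate, with any recovery costs reducing the net recovery; the function and parameter names are illustrative only.

```python
def loss_given_default(ead: float, pv_recovery: float, pv_cost: float = 0.0) -> float:
    """LGD = 1 - recovery rate, where net recovery = recoveries minus recovery costs.

    All amounts are assumed to be present values at the time of default.
    """
    if ead <= 0:
        raise ValueError("EAD must be positive")
    net_recovery = pv_recovery - pv_cost
    return 1.0 - net_recovery / ead


# Outstanding balance of $70,000, flat sold for $60,000, no costs considered.
lgd = loss_given_default(ead=70_000, pv_recovery=60_000)
print(f"LGD = {lgd:.1%}")  # LGD = 14.3%
```

Passing a non-zero `pv_cost` raises the LGD, since costs incurred during recovery are part of the lender's loss.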

Expected Loss

Expected Loss is calculated by (PD * LGD * EAD).
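
As a minimal sketch, the formula translates directly into Python. The function name and the range checks are assumptions of this illustration; the inputs are the figures used in this section's example (PD 2%, LGD 20%, EAD $20,000).

```python
def expected_loss(pd: float, lgd: float, ead: float) -> float:
    """Expected loss = PD * LGD * EAD, with PD and LGD given as fractions."""
    for name, value in (("pd", pd), ("lgd", lgd)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return pd * lgd * ead


el = expected_loss(pd=0.02, lgd=0.20, ead=20_000)
print(round(el, 2))  # 80.0
```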

Probability of Default: 2%
Exposure at Default: $20,000
Loss Given Default: 20%
Expected Loss: $80

Foundation and Advanced IRB Approach

There are two types of Internal Ratings Based (IRB) approaches: Foundation IRB and Advanced IRB.

Foundation IRB
PD is estimated internally by the bank while LGD and EAD are prescribed by the regulator.

Advanced IRB
PD, LGD, and EAD can be estimated internally by the bank itself.

Effective Maturity (M)

It is the duration used for the exposure, reflecting standard bank practice. For Foundation IRB, the effective maturity is 2.5 years (the exception is repo-style transactions, where it is 6 months). For Advanced IRB, M is the greater of 1 year or the effective maturity of the specific instrument.

Basel III

The Basel III accord was scheduled to be implemented from March 2019. In view of the coronavirus pandemic, implementation was postponed to January 1, 2023.

Basel III incorporates several risk measures to counter issues that were identified and highlighted in the 2008 financial crisis. It emphasizes revised capital standards (such as leverage ratios), stress testing and tangible equity capital, which is the component with the greatest loss-absorbing capacity.

The concept of building internal models and using external ratings for estimating PD, LGD and EAD remains the same as in Basel II. However, Basel III introduced some changes, which are shown in the table below.

Measure | Basel II | Basel III
Common Tier 1 capital ratio (shareholders’ equity + retained earnings) | 2% * RWA | 4.5% * RWA
Tier 1 capital ratio | 4% * RWA | 6% * RWA
Tier 2 capital ratio | 4% * RWA | 2% * RWA
Capital conservation buffer (common equity) | - | 2.5% * RWA

Does Basel IV exist?

The Basel Committee introduced "Basel III: Finalizing post-crisis reforms" in 2017 as an extension of Basel III. In the US it is termed Basel III Endgame; in the UK it is called Basel 3.1, and some refer to it as Basel IV. Officially, however, there are only three Basel Accords, and these reforms are considered part of Basel III.

The EU regulatory authority has set January 2025 as the implementation date, while both the UK and US regulatory authorities aim to implement the changes by July 2025.

IFRS 9

IFRS 9 is an International Financial Reporting Standard dealing with accounting for financial instruments. It replaces IAS 39 Financial Instruments, which was based on the incurred loss model, whereas IFRS 9 uses an expected loss model that also covers future losses.

In IFRS 9, the idea is to recognize a 12-month loss allowance at initial recognition and a lifetime loss allowance when there is a significant increase in credit risk.
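
A deliberately simplified sketch of this rule is shown below. It assumes the allowance can be approximated as PD * LGD * EAD and ignores discounting and period-by-period exposure profiles, which a real IFRS 9 implementation would need; the function name, the `significant_increase` flag and the input figures are hypothetical.

```python
def loss_allowance(pd_12m: float, pd_lifetime: float, lgd: float, ead: float,
                   significant_increase: bool) -> float:
    """Pick the PD horizon based on whether credit risk has increased significantly.

    Simplification: allowance = PD * LGD * EAD, with no discounting.
    """
    pd = pd_lifetime if significant_increase else pd_12m
    return pd * lgd * ead


# Illustrative inputs only.
at_origination = loss_allowance(0.03, 0.12, 0.20, 20_000, significant_increase=False)
after_deterioration = loss_allowance(0.03, 0.12, 0.20, 20_000, significant_increase=True)
print(round(at_origination, 2), round(after_deterioration, 2))
```

The point of the sketch is only the switch between the 12-month and lifetime horizon; how a significant increase in credit risk is detected is a separate modelling decision.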

Probability of Default Modeling

In this section, we cover the steps and methods related to PD modeling.

Define Dependent Variable

It is a binary variable with values 1 and 0, where 1 refers to bad customers and 0 refers to good customers.

Bad Customers : Customers who defaulted on payment. 'Default' means that any or all of the following scenarios have taken place.

Indeterminates or rollovers : These customers fall into these 2 categories :

All the other customers are good customers. Indeterminates should not be included in model development, as they would reduce the model's ability to discriminate between good and bad. Note that these customers are still included at the time of scoring.

We consider 12 months as the performance window to flag defaults, which means a customer who defaults at any time in the next 12 months is flagged as 'Bad'.
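
A minimal sketch of this flagging logic is given below. It assumes the input is a list of month-end days-past-due values following the observation date and uses the 90+ days past due definition of 'Bad' mentioned later in this article; the names are illustrative, and indeterminates are not handled here.

```python
DEFAULT_DPD = 90           # 90+ days past due counts as default
PERFORMANCE_WINDOW = 12    # months after the observation date


def flag_bad(monthly_dpd: list[int]) -> int:
    """Return 1 (bad) if the account reaches 90+ DPD in any of the next 12 months, else 0 (good)."""
    window = monthly_dpd[:PERFORMANCE_WINDOW]
    return int(any(dpd >= DEFAULT_DPD for dpd in window))


print(flag_bad([0, 0, 30, 60, 90, 0, 0, 0, 0, 0, 0, 0]))  # 1
print(flag_bad([0, 30, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]))    # 0
```

Months beyond the 12-month window are ignored, so a default in month 13 does not change the flag.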

Methodologies for Estimating PD

There are two main methodologies for estimating Probability of Default.

  1. Judgmental Method
  2. Statistical Method

Judgmental Method

It relies on the knowledge of experienced credit professionals. It is generally based on five Cs of the applicant and loan.

Judgmental methods have largely been replaced by statistical methods, which are more popular these days. They are still widely used when historical data is not available (especially for new credit products).

Statistical Method

In today's world, nobody has time to wait 1-2 months to learn the status of a loan application. Many borrowers also apply for loans through the bank's website. Hence real-time credit decisions are required for a bank to remain competitive in the digital world. The advantage of a statistical method is that it produces a mathematical equation, which gives an automated and faster way of making credit decisions.

This method is less exposed to bias and to dishonest or fraudulent conduct by a loan approval officer or manager.

This method also tends to be more accurate, as statistical and machine learning models consider hundreds of data points to identify defaulters.

Data Sources for PD Modeling

Steps of PD Modeling

Statistical Techniques used for Model Development

Model Performance in PD Model

There are 2 main levels of performance testing -

  1. Discrimination : Ability to differentiate between good (non-defaulters) and bad (defaulters) customers
  2. Calibration : Check whether the actual default rate is close to predicted PD values

Statistical Tests for Model Performance

Discrimination : Area under Curve, Gini coefficient, KS Statistics
Calibration : Hosmer and Lemeshow Test, Binomial Test

Check out this link for detailed explanation : Model Performance Simplified
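
The three discrimination measures can be sketched in plain Python as below. This is an illustration on toy data, not a production implementation: the pairwise AUC is quadratic in the number of accounts, the function names are made up, and calibration tests are not covered.

```python
from itertools import groupby


def auc(y_true, scores):
    """Share of (bad, good) pairs where the bad gets the higher score; ties count as half."""
    bads = [s for y, s in zip(y_true, scores) if y == 1]
    goods = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if b > g else 0.5 if b == g else 0.0 for b in bads for g in goods)
    return wins / (len(bads) * len(goods))


def gini(y_true, scores):
    """Gini coefficient derived from AUC."""
    return 2 * auc(y_true, scores) - 1


def ks_statistic(y_true, scores):
    """Largest gap between cumulative shares of bads and goods, walking from high to low score."""
    n_bad = sum(y_true)
    n_good = len(y_true) - n_bad
    cum_bad = cum_good = 0
    ks = 0.0
    ranked = sorted(zip(scores, y_true), reverse=True)
    for _, tied in groupby(ranked, key=lambda pair: pair[0]):
        for _, y in tied:  # accounts with the same score are processed together
            cum_bad += y
            cum_good += 1 - y
        ks = max(ks, abs(cum_bad / n_bad - cum_good / n_good))
    return ks


# Toy data: 1 = bad (defaulter), 0 = good; scores are predicted PDs.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
pd_hat = [0.05, 0.10, 0.20, 0.25, 0.40, 0.60, 0.30, 0.80]

print(auc(y_true, pd_hat), gini(y_true, pd_hat), ks_statistic(y_true, pd_hat))
# 0.875 0.75 0.75
```

Gini is computed here as 2 * AUC - 1, so the two measures always rank models in the same order.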

Rating Philosophy

It refers to the time horizon for which ratings measure credit risk and how much they are influenced by cyclical effects.

In general, a hybrid model (considering both point-in-time (PIT) and through-the-cycle (TTC) approaches) is used.

Credit Scoring and Scorecard

A Probability of Default model is used to score each customer to assess his/her likelihood of default. When you go to a bank for a loan, they check your credit score. This credit score can be built internally by the bank, or the bank can use the score of a credit bureau.

Credit bureaus collect individuals' credit information from various banks and sell it in the form of a credit report. They also release credit scores. In the US, the FICO score is a very popular credit score ranging between 300 and 850. In India, the CIBIL score is used for the same purpose and lies between 300 and 900.

Types of Scorecards

1. Application Scorecard : It applies to new (first-time) customers applying for a loan or credit card. It estimates the probability of default at the time the applicant applies for the loan. See the example below of how it works.

How scorecard works
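
A points-based decision of this kind can be sketched in Python as follows. The cutoff of 350 and the applicant (age 30, male, salary 15000, 305 points in total) come from this section's example. The bin boundaries, the points for other bins and the assumption that a score at or above the cutoff is approved are hypothetical.

```python
CUTOFF = 350  # cutoff for granting a loan, from this section's example


def score_applicant(age: int, gender: str, salary: float) -> int:
    """Add up the points of the bin each attribute falls into (bins are hypothetical)."""
    age_points = 100 if age <= 30 else 130
    gender_points = 85 if gender == "Male" else 95
    salary_points = 120 if salary < 20_000 else 160
    return age_points + gender_points + salary_points


def decide(total_points: int, cutoff: int = CUTOFF) -> str:
    """Assumption: scores at or above the cutoff are approved."""
    return "Grant Loan" if total_points >= cutoff else "Refuse Loan"


points = score_applicant(age=30, gender="Male", salary=15_000)
print(points, decide(points))  # 305 Refuse Loan
```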

Suppose cutoff for granting loan = 350

Profile of a New Customer
Age: 30
Gender: Male
Salary: 15000
Total Points = (100 + 85 + 120) = 305
Decision : Refuse Loan

Data required for application scorecard

We use the customer's application or demographic data along with credit bureau data. There is no observation window for historical data as these are new customers. The definition of Bad is the same, i.e. 90+ days past due. The performance window is generally 12 to 24 months from account opening.

An application scorecard is mainly used for the following tasks:

2. Behavior Scorecard : It applies to existing customers to assess whether a customer will default on loan payments. The performance window is generally 6 to 18 months.

A behavior scorecard is mainly used for the following tasks:

Difference between Application and Behavior Scorecard

An application scorecard is applied to new customers (generally less than 1 year with the bank) whereas a behavior scorecard is applied to existing customers (more than 1 year). For an application scorecard, we don't require well-calibrated default probabilities, but calibrated default probabilities are required for a behavior scorecard as per Basel norms. The two scorecards also differ in usage; see their respective sections above for how they are generally used.

Collections Scoring

It predicts the probability that a loan already late for a given number of days will be late for another given number of days. These models are typically built for performance windows of one month.

Desertion Scoring

It predicts the probability a borrower will apply for a new loan once the current loan is paid off.

Important Terminologies related to Credit Risk

Stressed PD vs. Unstressed PD

Stressed PD: A stressed PD depends on the risk attributes of the borrower but is not highly affected by macroeconomic factors, as adverse economic conditions are already factored into it.

Unstressed PD: An unstressed PD depends on both current macroeconomic conditions and the risk attributes of the borrower. It moves up or down depending on the economic conditions.

Downturn LGD and EAD

Under Basel II and III, financial institutions need to estimate downturn LGD and EAD, where 'downturn' means adverse economic conditions. To do so, we select the month with the highest default rate, take a window of two consecutive quarters (6 months) on both sides of this point as the downturn period, and then take the maximum EAD and LGD within that period as the downturn estimates. This is required because LGD and EAD can be affected by downturn economic conditions.
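
The window selection described here can be sketched as below. The sketch assumes monthly series of default rates and of the quantity of interest (LGD or EAD), aligned by month; the function name and the toy data are hypothetical, and ties in the default rate are resolved by taking the first peak.

```python
def downturn_estimate(default_rates, values, half_window=6):
    """Maximum of `values` within +/- `half_window` months of the peak default-rate month."""
    peak = max(range(len(default_rates)), key=default_rates.__getitem__)
    start = max(0, peak - half_window)
    end = min(len(values), peak + half_window + 1)
    return max(values[start:end])


# Toy series: 24 months, default rate peaks in month index 10.
default_rates = [0.02] * 24
default_rates[10] = 0.06
monthly_lgd = [0.30] * 24
monthly_lgd[14] = 0.45   # inside the downturn window (months 4 to 16)
monthly_lgd[20] = 0.50   # outside the window, so it is ignored

print(downturn_estimate(default_rates, monthly_lgd))  # 0.45
```

The same function can be applied to a monthly EAD series to obtain the downturn EAD.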

Conditional PD

It is the probability that a borrower defaults during the second year given that it did not default during the first year. To calculate conditional PD, we need the probability of not defaulting by the end of year 1 (P0) and the unconditional probability of defaulting during the second year (P1).

If P0 = 0.5 and P1 = 0.1, the conditional PD, i.e. Prob(default | survival), is 0.1 / 0.5 = 20%.
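
The same calculation as a short Python sketch, with illustrative function and parameter names:

```python
def conditional_pd(survival_prob_year1: float, unconditional_pd_year2: float) -> float:
    """P(default in year 2 | survived year 1) = P1 / P0."""
    if not 0.0 < survival_prob_year1 <= 1.0:
        raise ValueError("survival probability must be in (0, 1]")
    return unconditional_pd_year2 / survival_prob_year1


print(f"{conditional_pd(0.5, 0.1):.0%}")  # 20%
```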

Lifetime PD vs 12 month PD

As per IFRS 9, we require two types of PDs for calculating expected credit losses (ECL).

Suppose the 12-month PD is 3%, which means the survival rate is 97% (1 - PD). The 2nd and 3rd year conditional PDs are 4% and 5%.

  1. 1st year cumulative survival rate (CSR) is same as first year survival rate (SR).
  2. 2nd year cumulative survival rate = 1st year CSR * SR of 2nd year = 97% * 96% = 93%
  3. 3rd year cumulative survival rate = 2nd year CSR * SR of 3rd year = 93% * 95% = 88%

Lifetime PD = 1 - 88% = 12%
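
The steps above can be sketched as a loop over the yearly conditional PDs. The function name is illustrative. Note that the code does not round at each step, so it returns about 11.5% rather than the 12% obtained from the rounded figures.

```python
def lifetime_pd(conditional_pds):
    """Lifetime PD = 1 - product of yearly survival rates (1 - conditional PD)."""
    cumulative_survival = 1.0
    for pd in conditional_pds:
        cumulative_survival *= 1.0 - pd
    return 1.0 - cumulative_survival


# 12-month PD of 3%, then conditional PDs of 4% and 5% in years 2 and 3.
print(f"{lifetime_pd([0.03, 0.04, 0.05]):.1%}")  # 11.5%
```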

Macroeconomic factors to consider to estimate ECL

Estimating Expected Credit Loss (ECL) is crucial for banks and other financial institutions to manage the risk of lending money. To do this well, they must think about different macroeconomic factors that can affect how likely people are to repay their loans. Here are some important macroeconomic factors to consider when estimating ECL:

Stress Testing

In simple terms, stress testing is like giving a financial institution (such as a bank) a really tough test to see if it can handle difficult situations. Instead of just looking at regular situations, stress tests make institutions imagine extreme and rare problems, like a big economic crisis or unexpected disasters. By doing this, we can figure out how strong and prepared the institution is to handle these tough times and make sure it can stay stable even in the worst-case scenarios. For example, a stress test might ask how a 5% increase in the unemployment rate affects the performance of a bank.

Types of Stress Testing

There are three types of stress testing.

  1. Scenario Analysis : Banks use scenario analysis to imagine different future situations and see how they might affect their financial health. It helps them prepare for risks and make better decisions.
  2. Reverse Stress Testing : In reverse stress testing, banks start with a negative outcome and figure out what could cause it. It helps them identify vulnerabilities and improve risk management.
  3. Sensitivity Analysis : Sensitivity analysis involves testing different factors to see how they impact the bank's performance. It helps banks understand their exposure to risks and adjust their strategies accordingly.

Software used in risk analytics

Let's split this section into two parts -

1. Data Extraction
Most of the data is stored in relational databases (SQL Server, Teradata). Analysts need expert-level knowledge of SQL to extract and manipulate data. Data is not saved in a single SQL table or database: to extract the relevant fields, you need to select from multiple tables and join them on matching key(s). During this process, you need to apply some business rules (such as excluding certain types of customers or accounts). Transaction tables generally sit in a mainframe environment, so basic knowledge of mainframe and UNIX is useful. These are not primary skill sets banks look for in a risk analyst (they are good to have!); developers are generally hired for this work.
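
As a rough illustration of this kind of extraction, the sketch below joins two tables on a key and applies one business rule. It uses Python's built-in `sqlite3` module with an in-memory database purely so that the example is self-contained; the table names, columns and the "exclude staff accounts" rule are hypothetical, and a bank would run similar SQL against SQL Server or Teradata.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, age INTEGER, is_staff INTEGER);
CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, customer_id INTEGER,
                       balance REAL, days_past_due INTEGER);
INSERT INTO customers VALUES (1, 30, 0), (2, 45, 1), (3, 52, 0);
INSERT INTO accounts VALUES (10, 1, 5000.0, 0), (11, 2, 1200.0, 30), (12, 3, 800.0, 95);
""")

# Join customer and account data on the matching key and
# apply a business rule (here: exclude staff accounts).
rows = conn.execute("""
SELECT c.customer_id, c.age, a.balance, a.days_past_due
FROM customers AS c
JOIN accounts AS a ON a.customer_id = c.customer_id
WHERE c.is_staff = 0
ORDER BY c.customer_id
""").fetchall()
conn.close()

for row in rows:
    print(row)
```

Customer 2 is dropped by the business rule, so only customers 1 and 3 appear in the modelling dataset.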

2. Model Building
SAS is the most widely used software in risk analytics. Despite the huge popularity of R and Python these days, more than 90% of banks and other financial institutions still use SAS. Banks have also started exploring R and Python, and are building (or have already built) syntax libraries (repositories) in R and Python for credit risk projects.

SAS can be easily integrated with relational databases and mainframes. Many companies run both the data extraction and model building steps in the SAS environment alone.

I hope you now have a fair idea of how predictive modeling is used in the credit risk domain and what the key credit risk parameters are. In risk analytics, domain knowledge is more important than technical or statistical knowledge. I hope this article helped fill that gap. Please provide your feedback in the comment box below.

Deepanshu Bhalla

About Author:

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.

35 Responses to "A Complete Guide to Credit Risk Modelling"

Hi Deepanshu, really very informative for beginners like me. Can you please give an example of how a behavior scorecard can be used to set a credit limit?

A behaviour score is generated from customer history (transactions, delinquency, over-limit or past-due status, loan defaults, credit card limit utilization). If the behaviour score is good, he/she is a good customer, so the bank can use the behaviour score range to increase the credit limit.