Categorisation and Insights API

# Insights: Income

Income insights give an overview of the main attributes of a client's income, including income source, type, amount, and information describing how income changes over time. They can be used to verify income and evaluate thin-file customers in a number of use cases, including loan origination, buy now pay later, credit assessment and others. Keys from the income insights output can also be used as parameters in scoring models or as simple business criteria in the loan evaluation process.

## Glossary

Keywords used throughout the product response and API documentation are listed and explained below:

• Income: all income transactions that have been categorised with one of the categories from the income definition. For example, if salary (Nordigen category ID 85) is considered to be the only legitimate income and is used as the whole definition, then all fields calculated on “income” are calculated based on salary transactions only.

• Discretionary income: the amount of an individual's income that is left after all necessary payments are made (such as taxes and personal necessities like food and shelter). Here, income is used as defined in this glossary, and necessary expenses are the expense definition parameter explained in the customization section. The term disposable income is often used interchangeably with discretionary income; the two are similar, but their definitions don't match exactly.

• Debt: all outgoing transactions that have been categorised as Loans (i.e., one of the child categories of Nordigen category ID 79, for example, monthly payments on an existing mortgage).

• Income payment: each individual income transaction (transaction's amount is greater than 0).

## Field purpose and definitions

The income insights response holds 13 key-value pairs: 10 values are floats or integers, and 3 are objects with nested elements. All key-value pairs except the income_by_category and definition objects give an overview of the client’s income based on the whole income and expense definition. The key-value pairs and objects are described below:

#### Income insights

Fields calendar_months and calendar_months_with_income describe the statement’s length in full calendar months. For example, if an uploaded statement’s start date is 15th of January and its end date is 20th of September, the full calendar months are February to August, i.e., 7 months. The count of calendar months with income is always equal to or smaller than the count of calendar months.
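The month counting above can be sketched as follows. This is a minimal illustration, not the service's implementation: the function names are hypothetical, the year 2021 is assumed only to make the dates concrete, and treating a month as "full" when both its first and last day fall inside the statement is an assumption.

```python
from datetime import date, timedelta


def full_calendar_months(start: date, end: date) -> list:
    """Return (year, month) pairs for every month fully covered by [start, end]."""
    # The start month itself only counts if the statement begins on the 1st;
    # otherwise the first candidate is the following month.
    year, month = start.year, start.month
    if start.day != 1:
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)

    months = []
    while True:
        next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
        last_day = date(next_year, next_month, 1) - timedelta(days=1)
        if last_day > end:
            break  # this month is not fully covered by the statement
        months.append((year, month))
        year, month = next_year, next_month
    return months


def months_with_income(months, income_dates):
    """Count the full calendar months that contain at least one income payment."""
    paid = {(d.year, d.month) for d in income_dates}
    return sum(1 for m in months if m in paid)


months = full_calendar_months(date(2021, 1, 15), date(2021, 9, 20))
print(len(months))  # 7 (February to August)
```

Income payments that fall in the incomplete first or last month do not add to the second count, which is why it can never exceed the first.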

Fields average_monthly_income and average_monthly_discretionary_income show the client’s average monthly income and the amount left for leisure activities, savings or new loan repayments once necessities (a direct representation of the expense definition) are paid. For both fields, a monthly aggregate is created first (the sum for income, and the difference between the sum of income and the sum of expenses for discretionary income), and then the mean is taken over the months. For example, let’s imagine a statement with 5 calendar months: each month an individual receives a salary of 1000 EUR and 500 EUR from freelance jobs, and each month a total of 500 EUR is spent on necessities. Average monthly income is $$\frac{5 \cdot (1000 + 500)}{5} = 1500$$ EUR. Average monthly discretionary income is $$\frac{5 \cdot (1000 + 500 - 500)}{5} = 1000$$ EUR.
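A minimal sketch of the aggregate-then-average step, reproducing the 5-month example. The function name, the month keys and the input shape (pairs of month key and positive amount) are assumptions made for illustration only.

```python
from statistics import mean


def monthly_averages(months, income, expenses):
    """Return (average monthly income, average monthly discretionary income).

    months: month keys to average over.
    income / expenses: (month_key, positive amount) pairs.
    """
    income_sum = {m: 0.0 for m in months}
    expense_sum = {m: 0.0 for m in months}
    # Step 1: build the monthly aggregates.
    for m, amount in income:
        if m in income_sum:
            income_sum[m] += amount
    for m, amount in expenses:
        if m in expense_sum:
            expense_sum[m] += amount
    # Step 2: take the mean over the months.
    avg_income = mean(income_sum[m] for m in months)
    avg_discretionary = mean(income_sum[m] - expense_sum[m] for m in months)
    return avg_income, avg_discretionary


months = ["2021-02", "2021-03", "2021-04", "2021-05", "2021-06"]
income = [(m, 1000) for m in months] + [(m, 500) for m in months]
expenses = [(m, 500) for m in months]
print(monthly_averages(months, income, expenses))  # (1500.0, 1000.0)
```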

Field debt_to_income follows the standard formula for the debt-to-income ratio, i.e., the sum of amounts spent on debt (Nordigen parent category ID 79) divided by the sum of income (whole income definition). For example, if a statement holds 10 calendar months of data, where the only debt in each month is a mortgage payment of 100 EUR and monthly income is 1000 EUR in the first 5 months and 2000 EUR in the last 5 months, then the debt-to-income ratio is $$\frac{10 \cdot 100}{5 \cdot 1000 + 5 \cdot 2000} \approx 0.067$$.
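The same ratio as a short sketch. The function name and the per-month list inputs are hypothetical, and returning None when there is no income is an assumption, since the behaviour for that case is not described here.

```python
def debt_to_income(monthly_debt, monthly_income):
    """Sum of debt payments divided by sum of income over the same months."""
    total_income = sum(monthly_income)
    if total_income == 0:
        return None  # assumption: the ratio is undefined without income
    return sum(monthly_debt) / total_income


debt = [100] * 10                    # 10 monthly mortgage payments of 100 EUR
income = [1000] * 5 + [2000] * 5     # 5 months at 1000 EUR, then 5 at 2000 EUR
print(round(debt_to_income(debt, income), 3))  # 0.067
```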

Object last_incomplete_month and fields average_days_between_income_payments and days_since_last_income_payment can be analyzed together. While both fields are quite self-explanatory, the last_incomplete_month values (explained in detail below) give an overview of the current month’s income, e.g., whether some portion of the expected income has already been received this month.

Note: as outlined in the API documentation, days_since_last_income_payment takes the last incomplete month’s transactions into account as well. For example, for a statement that starts on 15th of January and ends on 20th of September, with the last income transaction on 29th of August, the response value is 22 days.
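The 22-day figure can be checked with plain date arithmetic. In this sketch the function name is hypothetical and the year 2021 is assumed only to make the dates concrete.

```python
from datetime import date


def days_since_last_income_payment(end_date, income_dates):
    """Days between the latest income payment and the statement end date."""
    if not income_dates:
        return None  # assumption: no income payments means no value
    return (end_date - max(income_dates)).days


payments = [date(2021, 7, 29), date(2021, 8, 29)]
print(days_since_last_income_payment(date(2021, 9, 20), payments))  # 22
```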

Similarly, monthly_regularity, monthly_stability and monthly_trend can be analyzed together to determine whether income is stable, whether the overall trend is positive, and whether there are significant fluctuations. The purpose of these fields can be illustrated with a quick example: an individual receives their salary once a month for a six-month period. For the first five months the salary is 1000 EUR, but the last month's salary is 10000 EUR. This leads to the conclusion that while income is regular (salary is received every month), income amounts do fluctuate (salary is not the same amount each month). However, taking the positive trend into account (from 1000 to 10000), these fluctuations could be interpreted as a good change.

Income trend is defined as the slope of a linear approximation of monthly income data: the more positive the trend value, the steeper the increase in the data (and vice versa for negative values). Outliers are corrected (smoothed) before the approximation. Trend calculations are limited to the last 12 calendar months of data at most, in order to minimize the impact of old data points. The field is calculated only if the statement contains at least 3 calendar months; otherwise null is returned.
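As a rough illustration of a slope of this kind, the sketch below fits an ordinary least-squares line to monthly income against the month index. Using least squares is an assumption about what "linear approximation" means here, the function name is hypothetical, and the outlier smoothing step is deliberately left out because its method is not described, so the values will not necessarily match the API output.

```python
def monthly_trend(monthly_income):
    """Least-squares slope of monthly income over the last 12 months (no smoothing)."""
    if len(monthly_income) < 3:
        return None  # fewer than 3 calendar months: no trend
    y = monthly_income[-12:]  # keep at most the last 12 calendar months
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    numerator = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


print(monthly_trend([1000, 1100, 1200, 1300]))  # 100.0 (EUR per month)
print(monthly_trend([1000, 1000]))              # None
```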

Monthly regularity is defined as the percentage of calendar months with income. The field is calculated only if the statement contains at least 3 calendar months of data. For more complex regularity calculations that take into account (bi-)weekly regularity and other patterns, please see features insights.
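A minimal sketch of this definition. The function name is hypothetical, and expressing the result as a fraction between 0 and 1 rather than as a value between 0 and 100 is an assumption; returning None below 3 months mirrors the trend field and is likewise assumed.

```python
def monthly_regularity(calendar_months, calendar_months_with_income):
    """Share of full calendar months that contain at least one income payment."""
    if calendar_months < 3:
        return None  # assumption: not calculated for short statements
    return calendar_months_with_income / calendar_months


print(monthly_regularity(6, 6))  # 1.0, income received in every month
print(monthly_regularity(2, 2))  # None
```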

Stability is designed to capture any big fluctuations in an individual's income. While perfect stability is defined as fixed income, smaller fluctuations should also be considered normal; for instance, income from an hourly wage will fluctuate from month to month. In the formula, income in the last 3 calendar months carries three times the weight of older data points. The field is calculated only if the statement contains at least 2 calendar months. The following stability value ranges can be used as a rule of thumb to better understand the obtained result:

• [0.85; 1.00]: stable

• [0.50; 0.85): mostly stable

• [0.30; 0.50): unstable

• [0.00; 0.30): very unstable
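The rule-of-thumb ranges above translate directly into a small lookup. The function name is hypothetical; the boundaries follow the intervals listed, with each lower bound inclusive.

```python
def stability_label(value):
    """Map a stability value in [0, 1] to the rule-of-thumb label."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("stability is expected to lie in [0, 1]")
    if value >= 0.85:
        return "stable"
    if value >= 0.50:
        return "mostly stable"
    if value >= 0.30:
        return "unstable"
    return "very unstable"


print(stability_label(0.85))  # stable
print(stability_label(0.42))  # unstable
```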

#### Fields in last incomplete month object

Because transactions in the first and last incomplete months are dropped for most field calculations, the last_incomplete_month object summarizes received and expected income in the last incomplete month.

Based on the average monthly income from historical full calendar month data, expected_remaining_income is the difference between the average monthly income and the income already received, indicating whether some portion of income has not yet been received in the last incomplete month.

Similarly, remaining_monthly_discretionary_income is the amount of income that remains for spending in this calendar month after all mandatory expenses are paid. The formula takes into account the start balance of the account, the usual average monthly income and the usual average monthly expenses to forecast the overall account balance.
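The difference described above, as a short sketch with illustrative numbers. The function and parameter names are hypothetical, and whether the real field is clamped at zero when more than the average has already been received is not stated here, so the sketch returns the plain difference.

```python
def expected_remaining_income(average_monthly_income, received_this_month):
    """Average monthly income minus income already received in the incomplete month."""
    return average_monthly_income - received_this_month


# Historical average of 1500 EUR, 1000 EUR already received this month.
print(expected_remaining_income(1500.0, 1000.0))  # 500.0
```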

#### Fields in definitions object

As explained in the customization section, the definition object is used for sanity check purposes. It holds the income and expense definitions that were used.

#### Fields in income by category object

Income here is associated with the respective key which is one of the categories listed in the income definition. The purpose of this object is to give a more granular view of a client's income sources.

For example, all values below for key “85” are calculated on all Salary transactions (Nordigen category ID 85), whereas all values for key “5” are calculated on all Pension transactions (Nordigen category ID 5). Note that all values in the Pension block are null, which means that this statement does not have any pension transactions.


```json
...
"income_by_category": {
    "85": {
        "average_income_payment": 1500.00,
        "average_monthly_income": 1500.00,
        "days_since_last_income_payment": 21,
        "median_income_payment": 1500.00,
        "number_of_income_payments": 9
    },
    "5": {
        "average_income_payment": null,
        "average_monthly_income": null,
        "days_since_last_income_payment": null,
        "median_income_payment": null,
        "number_of_income_payments": null
    }
}
...
```



Fields average_monthly_income and days_since_last_income_payment are calculated in the same way as explained above, but only income from the respective key's category is taken into account.

A similar approach is used for fields average_income_payment, median_income_payment and number_of_income_payments, but instead of aggregating transactions on a monthly basis, each individual transaction is treated as an observation, and descriptive statistics (mean, median and count respectively) are applied over the resulting vector.

To give an illustrative example, let’s imagine a bank statement with 10 transactions, 6 of which are recognized as Salary. The statement’s end date is 2021-04-30. In the income by category object for the Salary category, only salary transactions are considered, which gives the following time series: [("2021-01-05", 100), ("2021-01-25", 1000), ("2021-02-05", 100), ("2021-02-25", 1000), ("2021-03-15", 1500), ("2021-04-15", 1500)].

The fields are then calculated as follows:

• The end date is 2021-04-30 and the last salary payment is on 2021-04-15, so days_since_last_income_payment is 15;

• average_monthly_income is the mean of the monthly aggregate array [1100, 1100, 1500, 1500], i.e., 1300;

• average_income_payment is the mean of the array [100, 100, 1000, 1000, 1500, 1500], i.e., approximately 866.67;

• median_income_payment is the median of the array [100, 100, 1000, 1000, 1500, 1500], i.e., 1000;

• number_of_income_payments is the count of transactions that are categorised as salary, i.e., the count of elements in the array: 6.

The purpose of having both average monthly income and average income payment is to show whether a client's income is split across multiple smaller payments (for instance, salary paid out on a weekly basis) or arrives as a single payment per month.
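The calculations in this list can be reproduced with a short sketch. The function name and input shape are hypothetical; the sketch averages only over months that contain at least one payment of the category and returns nulls for an empty category, both of which are assumptions chosen to match the examples in this section.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, median

FIELDS = (
    "average_income_payment",
    "average_monthly_income",
    "days_since_last_income_payment",
    "median_income_payment",
    "number_of_income_payments",
)


def income_by_category_fields(transactions, end_date):
    """transactions: (date, amount) pairs of a single income category."""
    if not transactions:
        return dict.fromkeys(FIELDS)  # all values None, like the Pension block
    amounts = [amount for _, amount in transactions]
    monthly = defaultdict(float)
    for day, amount in transactions:
        monthly[(day.year, day.month)] += amount  # monthly aggregate
    last_payment = max(day for day, _ in transactions)
    return {
        "average_income_payment": round(mean(amounts), 2),
        "average_monthly_income": round(mean(monthly.values()), 2),
        "days_since_last_income_payment": (end_date - last_payment).days,
        "median_income_payment": median(amounts),
        "number_of_income_payments": len(amounts),
    }


salary = [
    (date(2021, 1, 5), 100), (date(2021, 1, 25), 1000),
    (date(2021, 2, 5), 100), (date(2021, 2, 25), 1000),
    (date(2021, 3, 15), 1500), (date(2021, 4, 15), 1500),
]
result = income_by_category_fields(salary, date(2021, 4, 30))
print(result)
```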