

Features v2.0 Launch


We're releasing an API for accessing the latest version of our Features library, Features v2.0. The new feature generation library is designed to extract information from raw bank account data (transactions and account metadata) and produce variables for building predictive models from open banking data. Features v2.0 is now available for developers and data science teams to test and apply in their next open banking data project. To get early access to the API, please request access here.

What is a feature?

In data science, features are used to build predictive models. A feature is an observable event or characteristic that can be quantified and recorded - in other words, information about each observation in the data set. "Feature" is the machine learning term for what other disciplines might call an “attribute”, a “factor”, a “predictor”, an “independent variable” or just a “variable”. The quality of a predictive model depends on the quality of its features, which is why data scientists invest resources in feature engineering. Feature engineering often requires extensive research, so data science teams frequently acquire external libraries of pre-engineered features that let them build and test predictive models in less time.
 

About Features v2.0

The new features library can generate up to 1 million unique features for every bank statement. The features are segmented into groups according to their characteristics, including features generated based on descriptive statistics, end-user financial behaviour patterns or high-level bank account information. 

Most features in the library are numerical; a few are categorical. All categorical features have a small, finite value set, so they can be handled with simple techniques such as one-hot encoding. Feature complexity ranges from simple binary features (e.g. indicating an event) to complex behaviours that capture information over the span of multiple patterns.
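To illustrate why a small, finite value set matters, here is a minimal sketch of one-hot encoding in plain Python. The category names and the `one_hot` helper are hypothetical and are not part of the Features v2.0 API; they only show the general technique.

```python
def one_hot(values, categories):
    """Return one row of 0/1 indicators per value, one column per category."""
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        if value in index:  # values outside the known set stay all-zero
            row[index[value]] = 1
        rows.append(row)
    return rows


# Hypothetical categorical feature with a small, fixed value set.
categories = ["salary", "benefits", "other"]
encoded = one_hot(["salary", "other", "salary"], categories)
print(encoded)
```

Because the value set is known in advance, the number of output columns is fixed and stays the same between model training and scoring.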

The library contains features that support the most popular use-cases for transaction data, including credit risk assessment, credit scoring, credit application fraud detection, income verification, automated loan application screening, customer segmentation for marketing and personal finance management.
 

Feature examples

  • Binary feature:
    does the bank account contain any bailiff transactions
  • Feature generated from simple descriptive statistics:
    average monthly payment to all loan institutions
  • Feature generated from financial behaviour patterns:
    how regularly a user finances a loan with another loan
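The first two examples can be sketched in a few lines of Python. This is an illustration only: the transaction layout, the category labels (`bailiff`, `loan_repayment`) and both helper functions are assumptions made for the example, not the library's actual data model or implementation. The third, behavioural example needs pattern detection over time and is not covered here.

```python
from datetime import date

# Hypothetical, already-categorised transactions (negative amount = outgoing).
transactions = [
    {"date": date(2020, 1, 5), "amount": -120.0, "category": "loan_repayment"},
    {"date": date(2020, 1, 20), "amount": -80.0, "category": "loan_repayment"},
    {"date": date(2020, 2, 5), "amount": -100.0, "category": "loan_repayment"},
    {"date": date(2020, 2, 11), "amount": -45.0, "category": "bailiff"},
    {"date": date(2020, 3, 1), "amount": 1500.0, "category": "salary"},
]


def has_category(txns, category):
    """Binary feature: 1 if any transaction carries the category, else 0."""
    return int(any(t["category"] == category for t in txns))


def average_monthly_outflow(txns, category):
    """Total outgoing amount in a category divided by the number of
    calendar months that contain at least one transaction."""
    months = {(t["date"].year, t["date"].month) for t in txns}
    if not months:
        return 0.0
    total = sum(
        -t["amount"]
        for t in txns
        if t["category"] == category and t["amount"] < 0
    )
    return total / len(months)


bailiff_flag = has_category(transactions, "bailiff")
avg_loan_payment = average_monthly_outflow(transactions, "loan_repayment")
print(bailiff_flag, avg_loan_payment)
```

Dividing by the months observed in the statement is one possible convention; dividing only by months that contain loan payments would give a different, equally defensible feature.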

     

Practical application

Features v2.0 supports the most popular use-cases of open banking data as well as other sources of bank account information, including user-submitted bank statements, card transactions, transactions from mobile wallets and more. 

While the number of features that the new library can generate for every bank statement is substantial, not all features will be predictive or contain valuable information for every use-case and model. To find the most valuable features, data science teams have to run a feature selection process, in which a larger set of features is tested and a smaller subset is picked for the final models. For the current version of our API, the Nordigen data science team assists all API users with the feature selection process.
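As a rough illustration of what a feature selection step can look like, the sketch below ranks candidate features by the absolute Pearson correlation with a binary target and keeps the top `k`. This simple univariate filter is an assumed stand-in chosen for brevity; the article does not describe the selection method Nordigen uses, and the feature names and data are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_top_k(features, target, k):
    """Keep the k feature names with the highest absolute correlation."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


# Hypothetical feature columns for six accounts; target 1 = default.
features = {
    "has_bailiff": [0, 0, 1, 1, 0, 1],
    "avg_loan_payment": [10.0, 20.0, 80.0, 90.0, 15.0, 70.0],
    "constant": [1, 1, 1, 1, 1, 1],
    "noise": [3.0, 1.0, 2.0, 1.0, 3.0, 2.0],
}
target = [0, 0, 1, 1, 0, 1]

selected = select_top_k(features, target, k=2)
print(selected)
```

In practice, selection is usually validated on held-out data and combined with checks for redundancy between features, which a univariate filter like this one ignores.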

Features v2.0 has been tested in credit scoring projects, with initial results yielding a Gini uplift of 7-14 percentage points. The new features library can be used in all 19 countries where Nordigen provides its Transaction Categorisation service.
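For readers unfamiliar with the metric: in credit scoring, the Gini coefficient is commonly computed as 2 × AUC − 1, and an uplift in percentage points is the difference between two Gini values multiplied by 100. The sketch below assumes that definition (the article does not state how the uplift was measured) and uses made-up labels and scores, so the numbers it produces say nothing about the actual test results.

```python
def auc(labels, scores):
    """Probability that a random positive outranks a random negative
    (ties count as half), computed over all positive/negative pairs."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def gini(labels, scores):
    """Gini coefficient as used in credit scoring: 2 * AUC - 1."""
    return 2 * auc(labels, scores) - 1


# Made-up example: 1 = default, scores = predicted default probability.
labels = [0, 0, 1, 1, 0, 1]
baseline_scores = [0.2, 0.6, 0.5, 0.7, 0.4, 0.3]
improved_scores = [0.2, 0.55, 0.5, 0.7, 0.3, 0.6]

uplift_pp = (gini(labels, improved_scores) - gini(labels, baseline_scores)) * 100
print(f"Gini uplift: {uplift_pp:.1f} percentage points")
```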
 

Behind the R&D

Features v2.0 is the result of 9 months of research and testing with external partners to develop a complete set of open banking data features. Some of the newly generated features will be available as part of our existing self-service platform in products like Income, Loans, Risk and Marketing starting this year. The findings from the research will also be used to improve the performance of Simple Score.
 

Comparison with Features v1.0 (launched in August 2019):

  • More than 90% of the features in the library are new,
  • Improved speed - calculations now run on average 3-10x faster,
  • Improved data cleansing - includes an additional layer of transaction pre-processing,
  • Improved stability of features - includes outlier detection and treatment, as well as robust and precise calculations for non-standard bank statements,
  • Additional functionality - feature value normalization and imputation that can be applied and returned as a response. 
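As a generic illustration of the last point, the sketch below applies median imputation followed by min-max normalization to a single feature column. Both techniques are assumptions picked for the example; the article does not say which imputation or normalization methods the API applies, and the helper names are hypothetical.

```python
from statistics import median


def impute_median(values):
    """Replace missing values (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical feature column with one missing value.
raw = [100.0, None, 300.0, 200.0]
imputed = impute_median(raw)
normalized = min_max_normalize(imputed)
print(imputed, normalized)
```

Imputing before normalizing keeps the filled-in value on the same scale as the observed ones.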
     

Request access today

To get access to the API for testing Features v2.0, please fill out this application.
For more information, please contact us: info@nordigen.com


 


Article by

Agrita Garnizone
Tagged with: fintech data openbanking datascience datafeatures
