Creating an open world with fair identity verification

August 20, 2020

The world is changing and we can no longer do everything in person. But at Onfido, we’re working to keep the world open, by enabling businesses to give people access to all the services they need, when they need them. We do this with identity verification.

Remote identity verification is essential when in-person interaction isn't an option. It means banks can still provide financial services for people unable to go into branches, fintechs can help migrants send money to their families abroad, and supermarkets can remotely onboard new staff to feed the nation as demand rises.

Our role in widening access comes with a responsibility to make sure everyone can use the services they need. This is reflected in the growing number of documents we support globally, in our efforts to make our products accessible (through our partnerships with the Royal National Institute of Blind People and the Digital Accessibility Centre), and, importantly, in our efforts to build a product that works consistently across all demographics.

Award-winning tech that represents a global experience

We’re proud to share that our FaceMatch algorithm, which establishes proof of identity document (ID) ownership, has achieved market-leading accuracy while being the fairest it has ever been across ethnicities. It has also recently been recognized by CogX as the “Best Innovation in Algorithmic Bias”.

This recent upgrade to our FaceMatch algorithm was developed in consultation with the Information Commissioner’s Office (ICO) as part of their new Privacy Sandbox. While our participation in the Privacy Sandbox is still ongoing for a few more months, it has helped Onfido ensure that we are developing our algorithms lawfully, fairly and in a transparent manner consistent with privacy laws and our Privacy Policy.

We prove ID ownership by analyzing the face on the document and comparing it to the user's face in a freshly captured selfie or video. Our algorithm is specially trained for this task, which means it can deal with the unique problems that arise: glare on the document, document security features and holograms partially obstructing the face, reflections, tilted angles, significant age gaps, and more.
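At its core, this comparison step can be sketched as scoring the similarity of two face embeddings against a tuned threshold. The function names, embedding dimension, and threshold below are illustrative assumptions, not Onfido's actual pipeline:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two face-embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_same_person(doc_embedding, selfie_embedding, threshold=0.6):
    """Declare a match when the similarity clears an operating threshold.
    In practice the threshold is tuned on held-out data to hit a target
    error rate."""
    return cosine_similarity(doc_embedding, selfie_embedding) >= threshold

# Hypothetical 128-dim embeddings from a face-encoding model; in a real
# pipeline these would come from the document face crop and the selfie.
rng = np.random.default_rng(0)
doc_face = rng.normal(size=128)
selfie_face = doc_face + rng.normal(scale=0.1, size=128)  # similar face
print(is_same_person(doc_face, selfie_face))  # True
```

The hard part is not the comparison itself but producing embeddings that stay close for the same person despite glare, holograms, and age gaps between the document photo and the selfie.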

Our latest False Acceptance Rate is now 0.01%, at a False Rejection Rate of 1%. In other words, there's only a 1 in 10,000 chance of incorrectly matching two faces that don't belong to the same person. It's the highest reported accuracy for algorithms comparing faces on identity documents against those in a selfie or a video. More importantly, the difference in performance between ethnicity groups is the smallest it has ever been. As part of our work with the ICO, we defined nationality, which we extract from the ID, as a proxy for ethnicity, as outlined below. It should also be noted that some groups are more homogenous (e.g. Africa) and others more diverse (e.g. the Americas).

  • Overall average: 0.01% False Acceptance Rate
  • Europe: 0.019%
  • Americas: 0.008%
  • Africa: 0.038%
  • Asia: 0.017%
  • Oceania: 0.014%

The above figures represent an overall 10x improvement over our previous algorithm version, and a 60x improvement in false acceptance for users in the “Africa” category. All the while, our False Rejection Rate remained the same at 1%, so genuine users continue to enjoy a frictionless experience.
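As a rough illustration of how figures like these are computed (with synthetic scores and hypothetical region labels, not our production data): fix the similarity threshold that yields the target False Rejection Rate on genuine pairs, then measure the False Acceptance Rate on impostor pairs, overall and per region.

```python
import numpy as np

rng = np.random.default_rng(0)
regions = ["Europe", "Americas", "Africa", "Asia", "Oceania"]

# Synthetic similarity scores: genuine pairs (same person) score higher
# on average than impostor pairs (different people).
genuine = rng.normal(0.8, 0.1, 50_000)
impostor_region = np.repeat(regions, 10_000)
impostor = rng.normal(0.3, 0.1, impostor_region.size)

# Fix the operating point at a 1% False Rejection Rate, as above.
threshold = np.quantile(genuine, 0.01)
print(f"FRR: {np.mean(genuine < threshold):.2%}")
print(f"overall FAR: {np.mean(impostor >= threshold):.4%}")
for region in regions:
    mask = impostor_region == region
    print(f"{region:9s} FAR: {np.mean(impostor[mask] >= threshold):.4%}")
```

The per-region numbers are simply the same FAR computation restricted to impostor pairs tagged with that region, all measured at the single shared threshold.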

But these improvements are more than data points. They mean the businesses we work with see fewer fraudulent cases falsely accepted, which translates to less money laundering, less terrorist financing, and fewer victims of identity fraud.

Cross-functional collaboration beyond the organizational walls

Our recent improvements represent a mammoth effort leveraging our in-house expertise in Machine Learning, privacy, and data ethics. Our team re-architected our FaceMatch algorithm from scratch and adopted new ways of sampling the training data, focusing on balancing the training dataset across demographics to reduce differences in performance.
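Balanced sampling of the kind described above can be sketched as drawing an equal number of examples from every demographic group, rather than mirroring the skew of the raw pool. The pool and group labels below are hypothetical:

```python
import random
from collections import defaultdict

random.seed(0)

# Hypothetical training pool of (image_id, demographic_group) pairs.
# Group "A" dominates, as a skewed real-world pool might.
pool = [(f"img_{i}", random.choice(["A", "A", "A", "B", "C"]))
        for i in range(10_000)]

def balanced_sample(pool, per_group):
    """Draw up to `per_group` examples from each demographic group."""
    by_group = defaultdict(list)
    for item, group in pool:
        by_group[group].append(item)
    sample = []
    for group, items in by_group.items():
        sample.extend(random.sample(items, min(per_group, len(items))))
    return sample

batch = balanced_sample(pool, per_group=500)
# Each group now contributes 500 examples regardless of its pool share.
```

A naive random sample from this pool would be about 60% group "A"; the balanced sample gives each group equal weight during training, which is one way to reduce performance gaps between groups.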

To improve the fairness of our algorithms, we needed to classify the photos of individuals in a way that reflected race and ethnicity, and then test the performance of our algorithms on those classifications. To ensure our work met the highest possible ethical standards, we applied to the ICO to undertake this work in their new Privacy Sandbox.

In the ICO’s Privacy Sandbox, we have had the privilege to work alongside the ICO’s Sandbox Team, accessing the resources and know-how of the wider organization. Through a series of regular meetings and workshops, we investigated how Onfido can go about measuring and mitigating algorithmic bias. We are also glad to have been able to help the ICO learn more about the technical side of facial recognition technology and to jointly explore ways to combine technological innovation and privacy.

While we aim to conclude this work with the ICO later this year, those looking for a deep dive can read more in this publication by our Director of Privacy in Harvard’s Carr Center for Human Rights Policy.

A call for new standards

Unlike many businesses within the identity space, we don't rely on off-the-shelf solutions. We build our algorithms for identity verification in-house, and we're proud of that.

Building our algorithms in-house enables us to look at identity verification as a whole. For example, off-the-shelf algorithms that compare selfies with selfies underperform when comparing selfies with faces on an ID: photos of IDs have their own partial-obstruction problems, such as security features, as well as blur and glare that selfies don't suffer from, and there are aging differences between the ID photo and the selfie.

Now, in an age of increasingly digital-first products, selfie-to-document matching is needed to ensure that individuals applying for a product remotely are both legitimate and eligible for access. But industry standards have not adapted to reflect this.

Today the industry still talks of “face matching” in terms of matching one face to another. This is reflected in NIST’s Face Recognition Vendor Test (FRVT), which assesses 1:1 face-to-face matching. It covers multiple facial image types, from high-resolution, high-quality images to poorly captured images from surveillance cameras or webcams, and these differences mean very different algorithms top the charts in each category. However, no equivalent exists for matching a face to a document, such as a government ID.

If our industry had a global standard such as the FRVT for face-to-document face-matching, we could improve performance and better combat fraud attack vectors such as facial lookalike fraud. Now’s the time to change this by implementing a global standard.

What to look out for as a buyer

Here are some things to look out for when choosing a vendor for identity verification.

Performance on NIST tables does not translate to proof of document ownership

Ask vendors for performance metrics that are specific to verifying faces against IDs.

Overall performance is not representative

Reporting algorithm performance with overall numbers, such as our 0.01% False Acceptance Rate, is industry-wide practice but only tells half the story. Overall numbers include a large share of “easy matches”, where any algorithm will give the right answer, inflating the figures beyond what they will look like in practice. No real fraudster would attempt to defraud a system using an ID with a face that looks nothing like their own.

It’s important to ask vendors for a breakdown of performance per group of similar demographics, such as our numbers per nationality. Fraudsters will always try to attack systems with faces that already look similar—same gender, same ethnicity, same age. 
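One way to build such a demographically matched test, sketched here with hypothetical labels: bucket people by attributes like gender, region, and age band, then form impostor pairs only within a bucket, so every pair resembles a plausible lookalike-fraud attempt.

```python
import itertools
from collections import defaultdict

# Hypothetical evaluation records: (person_id, (gender, region, age_band)).
records = [
    ("p1", ("F", "Africa", "20-30")),
    ("p2", ("F", "Africa", "20-30")),
    ("p3", ("M", "Asia", "30-40")),
    ("p4", ("M", "Asia", "30-40")),
    ("p5", ("F", "Europe", "20-30")),
]

# Group people by demographic bucket, then pair only within a bucket.
buckets = defaultdict(list)
for person, demo in records:
    buckets[demo].append(person)

hard_impostor_pairs = [
    pair
    for people in buckets.values()
    for pair in itertools.combinations(people, 2)
]
print(hard_impostor_pairs)  # [('p1', 'p2'), ('p3', 'p4')]
```

Note that "p5" yields no pairs: with no demographic match in the set, any pairing involving it would be an easy reject that flatters the headline number.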

Consider third party testing

If the vendor is unable to provide metrics for either category, consider using a third party to test the system at scale. Make sure the testing subjects are representative of your target demographics and that a large sample of IDs is used.

Closing comments

Making digital identity more human starts with making it accessible to all, and enabling access on a global scale is an ongoing commitment. Our False Acceptance Rate delta among different ethnicities is still not zero, and we will continue to work on our algorithms until it is; we already have a new algorithm in development, with promising early results.
