Over the past few months, I asked many friends and family members about their experiences opening an account with one of the thousands of fintechs, setting up an account with a car or scooter sharing company, or going through similar use cases, especially those requiring identity verification. Given my background in identity management and digital identities, my questions naturally focused on the specific steps of the process that dealt with identity document verification. Interestingly, many could not tell that these steps were any different from regular sign-up steps, or even recall all the steps of identity verification. Those who did simply said, "I just needed to take pictures of my ID, a selfie, and do some kind of face movement." It seemed easy, straightforward and harmless. This made me think about how far ID verification companies have come in doing as much of their job as possible in the background, without creating too much friction for the user. For an average user it was as simple as taking automated pictures of their ID documents; they did not even realize how much technology is working behind the scenes to ensure that:
- Genuine users are allowed to go through the process of account opening and other similar use cases
- Fraudsters, who may be impersonating someone else or using potentially fraudulent documents, are blocked from the process
This exercise prompted me to write a bit more about what exactly happens under the hood in identity verification, with a focus on document verification. These documents could be passports, government-issued identity cards, residence permits, driver's licenses and so on.
Various companies have taken different technical approaches to how they go about:
- Reading what is written on the document (name, date of birth, ID number, etc.)
- Ensuring that the document is not manipulated or fraudulent
Here are some of the approaches that I have seen in the market:
A. Static document images using Google's algorithm
Google has invested millions over the years in building the Cloud Vision API and has made it available to companies for professional use, not just for internal use in its own products like Google Translate. Using Cloud Vision, one can also run OCR on various types of documents, including identity documents, even though the technology is not best suited for such cases: context is missing, unlike when reading text from, say, the page of a book. The technology works with a single image of each side of the document, one for the front and one for the back. The IDV (identity verification) vendor is most probably sending these images to one of Google's servers somewhere and retrieving the results with certain confidence scores. As one can imagine, Google's technology is optimized for OCR and not necessarily for detecting the genuineness of the document itself. The IDV provider has to rely on other algorithms to detect whether the document is genuine, as far as that is possible with static images.
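To make the role of those confidence scores a bit more concrete, here is a minimal Python sketch of how a provider might act on them. It does not call Google's API at all. The OcrField structure, the sample values and the 0.9 threshold are assumptions invented for this illustration, and a real vendor's logic is likely more involved.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class OcrField:
    """One field read from a document image (hypothetical structure)."""
    name: str
    value: str
    confidence: float  # assumed to range from 0.0 to 1.0


def split_by_confidence(
    fields: List[OcrField], threshold: float = 0.9
) -> Tuple[Dict[str, str], List[str]]:
    """Accept fields at or above the threshold; list the rest for review."""
    accepted: Dict[str, str] = {}
    needs_review: List[str] = []
    for field in fields:
        if field.confidence >= threshold:
            accepted[field.name] = field.value
        else:
            needs_review.append(field.name)
    return accepted, needs_review


# Made-up OCR output for the front side of an ID card.
front_side = [
    OcrField("first_name", "ANNA", 0.98),
    OcrField("date_of_birth", "1990-01-3l", 0.62),  # "1" misread as "l"
    OcrField("id_number", "X1234567", 0.95),
]

accepted, needs_review = split_by_confidence(front_side)
print("accepted:", accepted)
print("needs review:", needs_review)
```

With only one static image per side, a low-confidence field like the date of birth here leaves the provider few options beyond asking the user to retake the picture or sending the case to manual review.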
B. Static document images with in-house/licensed OCR algorithms
This is somewhat like option A. The big difference is that the IDV provider has either built its own technology in house to read ID documents (OCR) or is licensing someone else's technology, as there are many country-specific providers in the market. When it comes to detecting fraudulent documents, this approach faces the same challenges as those mentioned in option A. One big advantage of this architecture is that the PII itself can reside on the IDV provider's servers (and is not sent to Google or similar third parties), which is always advantageous when dealing with organizations that have strict data privacy requirements or must meet regional compliance requirements like GDPR in Europe. But if the IDV provider is using a third-party OCR algorithm, it is always hard to know what data protection policies are in place to ensure document images are not shared with the original technology provider. That is something to always consider.
C. Dynamic document images and in-house OCR/security algorithms
This technical approach is quite different from the architectures presented in option A or even option B. Instead of single static images of the document, video frames are captured from different angles, either by recording a video of the whole process or by creating a video from multiple frames. There are many advantages to this approach. First, the document can be read out from several frames, which improves the overall field-level accuracy of the readout (e.g. the first name is clearer in, and read out from, the first frame of the video, whereas the date of birth is clearer in, and read out from, the second frame, and so on). The other big advantage is document fraud detection. Because the document video is collected at various angles, various types of document fraud attempts can be detected: whether the document has enough security features such as a hologram, whether a fraudster is trying to work with a color printout of a document, or even whether they are trying to use some orphan or fake document available on certain websites. As one can imagine, this technical architecture is process-heavy and requires the technology to be highly optimized from top to bottom, all the way from the mobile app to the OCR and security algorithms running on the IDV provider's servers.
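The field-level readout across frames described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the FieldReading structure, the per-field confidence values and the "keep the most confident reading" rule are hypothetical, and real providers may combine frames in more sophisticated ways.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FieldReading:
    """A single field value read from one video frame (hypothetical)."""
    value: str
    confidence: float  # assumed to range from 0.0 to 1.0


def fuse_frames(frames: List[Dict[str, FieldReading]]) -> Dict[str, FieldReading]:
    """For each field, keep the reading with the highest confidence."""
    best: Dict[str, FieldReading] = {}
    for frame in frames:
        for name, reading in frame.items():
            current = best.get(name)
            if current is None or reading.confidence > current.confidence:
                best[name] = reading
    return best


# Made-up readouts from two frames of the same document video.
frames = [
    {   # frame 1: name is sharp, date of birth is blurred by glare
        "first_name": FieldReading("ANNA", 0.97),
        "date_of_birth": FieldReading("1990-01-3l", 0.55),
    },
    {   # frame 2: different angle, date of birth is now readable
        "first_name": FieldReading("ANMA", 0.60),
        "date_of_birth": FieldReading("1990-01-31", 0.93),
    },
]

fused = fuse_frames(frames)
for name, reading in fused.items():
    print(name, reading.value, reading.confidence)
```

Here the first name is taken from the first frame and the date of birth from the second, which is exactly the kind of gain a single static image cannot offer.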
Service providers that use identity verification technologies in their products have different understandings of, and tolerances for, security and fraud, depending on the use cases they cater to. But as online services grow exponentially, so does the number of fraudsters. Given that some leading IDV providers in the market have already built their technology stack closer to option C, a service provider can carefully select the right IDV vendor and aim for the highest level of security and fraud detection. If something goes wrong, there is always damage, direct or indirect, ranging from regulatory fines to reputational damage to the company's brand.
Going back to the friends and family I had interviewed about their experience: when I explained to them what is happening under the hood, many were amazed by the sophistication of the technology, almost as if it were science fiction.