Driver's License Verification with AI-Based OCR
Digital identity verification
Nanonets automated digitization for North America's largest player in the digital identity ecosystem. The client provides automated age and identity verification solutions and processes over 10 million ID cards in the US annually.
The client helps its partners verify new customers during on-boarding for hotels, insurance policies, and banks. As part of this on-boarding process, each customer has to submit ID documents, most commonly driver's licenses and passports. To process the more than 50,000 documents submitted each month, the client was using a traditional OCR engine along with a team of 10 contractors who manually reviewed the results, an expensive, time-consuming, and error-prone process.
Challenges with traditional ID verification solutions
A major challenge with a standard out-of-the-box solution was that these documents follow a different template or format in each country, and in some regions in each state, so traditional rule-based solutions failed.
Often the photographs were of poor resolution, tilted, or otherwise did not follow standard photograph protocols.
The client ruled out maintaining the traditional OCR solution internally, as it was becoming more expensive while delivering little improvement in accuracy. It then started looking for solutions available on the market. It had already tried AWS Textract and ABBYY and found that their overall accuracy did not move the needle on its automation requirement. While these models could extract data from some documents well, they completely jumbled up the fields in many others.
The client had a pressing need for a solution that is:
- Highly secure, ensuring the data privacy of users
- Highly accurate, helping to reduce costs and processing time
- Intelligent enough to scan, handle, and process a variety of formats
- Intelligent enough to identify and pre-process images before feeding them into the model
The company then came across Nanonets and found it a perfect match for its driver's license OCR challenge.
Optical Character Recognition for Driver's License
Nanonets differs from other providers by building a custom-trained model, using customer data to train on top of its pre-trained model. The last layers of the neural network learn from the customer's data, so the model predicts accurately on that customer's documents in production.
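To make the idea of training only the last layers concrete, here is a minimal toy sketch in plain Python. It is not Nanonets' actual training procedure: `frozen_features` is a hypothetical stand-in for a pre-trained network whose weights stay fixed, and `train_head` fits a single logistic output layer on top of it using a handful of made-up samples.

```python
import math

def frozen_features(x):
    """Hypothetical stand-in for a pre-trained backbone; never updated."""
    return [
        math.tanh(0.5 * x[0] + 0.2 * x[1]),
        math.tanh(-0.3 * x[0] + 0.8 * x[1]),
    ]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_head(samples, labels, epochs=200, lr=0.5):
    """Fit only the final logistic layer on top of the frozen features."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            feats = frozen_features(x)
            pred = sigmoid(sum(w * f for w, f in zip(weights, feats)) + bias)
            err = pred - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * f for w, f in zip(weights, feats)]
            bias -= lr * err
    return weights, bias

def predict(x, weights, bias):
    """Probability of the positive class for one input."""
    feats = frozen_features(x)
    return sigmoid(sum(w * f for w, f in zip(weights, feats)) + bias)

# Made-up "customer data": two positive and two negative examples.
samples = [[0.0, 2.0], [0.0, 1.5], [0.0, -2.0], [0.0, -1.5]]
labels = [1, 1, 0, 0]
weights, bias = train_head(samples, labels)
```

Because the backbone is frozen, only three numbers (two weights and a bias) are learned here, which is the intuition behind needing relatively few customer samples.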
Our client appreciated that Nanonets provided a lean OCR solution where:
- The client had to provide just a few hundred samples (500-1000) to train a highly accurate model
- The client had the freedom of defining the fields to be extracted
- The solution automatically checked whether images adhered to specifications such as high resolution, full visibility of the ID card, and upright orientation, and provided instant feedback to customers
- The OCR API could identify the structure of the license, including the titles and fields in it
- These fields could be extracted as JSON output and easily integrated with the client's application
- Response times were significantly shorter (less than 15 seconds)
- The solution could easily be integrated in their application to process Driver's Licenses in real time
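As a rough illustration of how JSON output with client-defined fields might be consumed in an application, the sketch below parses a result and lists the fields that should go to a manual reviewer. The payload shape (`fields`, `label`, `text`, `confidence`), the `REQUIRED_FIELDS` tuple, and the `route_result` helper are all assumptions made for this example, not the documented Nanonets response format.

```python
import json

# Hypothetical client-defined fields; the real set is chosen by the client.
REQUIRED_FIELDS = ("name", "license_number", "date_of_birth")

def route_result(raw_json, min_confidence=0.9):
    """Return (fields, needs_review) from an assumed OCR JSON payload."""
    payload = json.loads(raw_json)
    fields = {}
    needs_review = []
    for item in payload.get("fields", []):
        fields[item["label"]] = item["text"]
        # Low-confidence extractions are flagged for a human reviewer.
        if item.get("confidence", 0.0) < min_confidence:
            needs_review.append(item["label"])
    # A required field that was never extracted is also flagged.
    for label in REQUIRED_FIELDS:
        if label not in fields:
            needs_review.append(label)
    return fields, needs_review

# Made-up sample payload for illustration only.
sample = json.dumps({
    "fields": [
        {"label": "name", "text": "JANE DOE", "confidence": 0.98},
        {"label": "license_number", "text": "D1234567", "confidence": 0.72},
    ]
})
fields, needs_review = route_result(sample)
```

With this sample, `license_number` is flagged for low confidence and `date_of_birth` is flagged as missing, so only those two fields would need a human pass.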
Another feature the client really liked was that Nanonets models could be configured to run on-premises in Docker containers, so that user data never leaves the client's infrastructure. Nanonets also supported integration with the client's software so that ID cards could be processed in real time.
Impact of automated data extraction
Nanonets reduced the time taken to verify ID cards by 60% by automating ID card digitization, significantly reducing the effort of manual data entry. Although accuracy was lower than that of human reviewers, the client needed fewer manual reviewers and fewer passes per ID card to make sure there were no errors. As a result, the company reduced its costs by 55% while offering its customers more convenience and its employees less repetitive, more engaging work.