Client:
That Index Consulting
Our client, That Index Consulting, is based in the UK and specializes in document indexing services. As an indexing service provider, they extract the required information from documents and organize the data using predefined frameworks. This streamlines information access for enterprise clients, saving them time and money.
The Challenge
That Index aimed to onboard one of the UK's largest charities, which dealt with numerous suppliers. The charity needed to index financial documents, such as invoices and credit notes, from suppliers worldwide. This task involved managing hundreds of thousands of documents in a wide variety of templates, some of which were difficult to process.
While the company could extract data from invoices and credit notes for other clients using traditional OCR, then enhance the data in Google Sheets before uploading it to their MSSQL database, the sheer volume of documents made this approach unfeasible. Consequently, they couldn't onboard the client without automating the data extraction and enhancement process through an AI service.
A traditional OCR service would require extensive training to adapt to each new format and would lack the flexibility to apply data-action rules that enhance the data. Therefore, they needed an automation tool that could handle current and future edge cases.
Without document post-processing, downstream business processes such as PO matching would fail. That, in turn, would require them to verify the accuracy of the data in every document by hand, at approximately three minutes per document across hundreds of thousands of documents. Correcting an incorrect invoice could take several more minutes, depending on the scope of the error.
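To put that manual review burden in perspective, here is a rough back-of-the-envelope calculation. The three minutes per document comes from the figures above; the document volumes are illustrative assumptions, since the exact count is not stated, and the function name is made up for this sketch.

```python
def manual_review_hours(num_documents: int, minutes_per_document: float = 3.0) -> float:
    """Total person-hours needed to verify every document by hand."""
    return num_documents * minutes_per_document / 60


# Illustrative volumes only (assumed); the real document count is not given.
for volume in (100_000, 300_000, 500_000):
    hours = manual_review_hours(volume)
    print(f"{volume:>7,} documents -> {hours:>8,.0f} hours of manual checks")
```

Even at the low end of 100,000 documents, verification alone would take 5,000 person-hours, before any time spent correcting errors.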
They specifically required a service that offered:
- Accuracy: AI-based OCR that delivers high accuracy across diverse templates.
- Data Enhancement: The ability to run checks and apply data enhancement rules before exporting data, in a workflow that is easy to customize for future needs.
- Cost-effectiveness: Pricing suitable for enterprise use cases, particularly at high document volumes.
- Integrations: Seamless import of data from emails and real-time export to their MSSQL database.
The Solution
Nanonets is a workflow automation company that empowers businesses to streamline their processes.
With Nanonets, a customized AI-powered workflow was developed, featuring a highly accurate OCR and over 100 data-action steps to capture and present data in the correct format.
The workflow breakdown is as follows:
- Import: All documents seamlessly entered the platform through an API call using the import block.
- Data Extraction: The OCR block efficiently extracted crucial information from invoices and credit notes in various formats, achieving over 90% document extraction accuracy.
- Data Enhancement: To address intricacies like incorrect formats or missing digits, That Index designed a workflow with over 100 data block steps. These blocks facilitated various operations, sometimes involving multiple actions for a single field.
For instance, PO codes on purchase orders issued by the client had to be matched accurately with those on credit notes from suppliers, since identifying the correct PO for each invoice or credit note was critical. With traditional OCR, tens of thousands of invoices would have needed manual review and correction.
The challenge arose because UK PO codes differ significantly from those used in Europe or America. They include spacing and a mix of letters and digits, which makes OCR engines prone to misreading 'O' (letter O) as '0' (zero). In-app data blocks were used to correct the formatting, a crucial step for data accuracy, especially since most suppliers were from the UK.
Data blocks proved effective in post-processing, incorporating checks and validation flags. That Index used data blocks to format dates, numbers, and currencies and to match information across two tables.
The simplicity of deploying a workflow allowed That Index to develop the entire process independently and make necessary adjustments on the fly.
- Export: Processed data was then transferred directly to their MSSQL database. Unlike other solutions, which require data to be downloaded and re-uploaded and thereby raise privacy-related concerns, Nanonets enabled real-time export straight into their database.
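The kind of data enhancement rules described in the workflow above can be sketched in a few lines of Python. This is only an illustration and not how Nanonets data blocks are implemented: the PO layout (two letters followed by six digits), the function names, the accepted date formats, and the assumption of UK-style thousands separators are all hypothetical choices made for the example.

```python
from __future__ import annotations

import re
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Hypothetical PO layout: two letters then six digits, e.g. "PO 123 456".
PO_PATTERN = re.compile(r"[A-Z]{2}\d{6}")
LETTER_COUNT = 2


def normalize_po_code(raw: str) -> tuple[str, bool]:
    """Strip spacing, fix O/0 confusion by position, and return (code, is_valid)."""
    compact = re.sub(r"\s+", "", raw).upper()
    # Positions that should hold letters: a zero is really the letter O.
    letters = compact[:LETTER_COUNT].replace("0", "O")
    # Positions that should hold digits: the letter O is really a zero.
    digits = compact[LETTER_COUNT:].replace("O", "0")
    code = letters + digits
    return code, PO_PATTERN.fullmatch(code) is not None


def normalize_date(raw: str) -> str | None:
    """Convert a day-first date to ISO format, or None if it cannot be parsed."""
    for fmt in ("%d/%m/%Y", "%d-%m-%Y"):
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_amount(raw: str) -> Decimal | None:
    """Drop currency symbols and thousands separators (assumes '.' is the decimal mark)."""
    cleaned = re.sub(r"[^\d.\-]", "", raw)
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


code, ok = normalize_po_code("P0 12O 456")
print(code, ok)
print(normalize_date("31/01/2024"), normalize_amount("£1,234.50"))
```

Returning a validity flag alongside the cleaned value mirrors the checks and validation flags mentioned above: records that still fail the pattern after correction can be routed to manual review instead of being exported.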
Nanonets has allowed us to process documents from different sources and formats. It is flexible enough to easily incorporate new changes to the data template with minimal training and add new data enhancement rules.