Advances in technology are transforming companies in the construction industry. Many problems that troubled the construction sector for decades, including difficulties compiling and sharing project information, are now being addressed with software. These developments could not have come at a better time. Construction projects are becoming more complex and expensive, putting managers under tremendous pressure to reduce costs, shorten timelines, and improve efficiency.

However, invoice processing is one area of the construction industry that has yet to adopt the latest technology. As a result, it remains manual, time-consuming, and error-prone, which hurts a company's bottom line. In this blog, we’ll discuss invoice processing automation in the construction sector by reviewing modern techniques like Optical Character Recognition (OCR) and Deep Learning (DL).

Construction Invoices: Types and Procedures

A construction invoice is a document sent by a contractor, subcontractor, or supplier to their customer when payment is owed for work performed. It sets a payment obligation, thereby creating an account receivable. Invoicing is what keeps the cash flowing, so creating, tracking, and managing invoices is an essential task.

Here's what a typical construction invoice will include:

  • The date on which the invoice is generated
  • Names and addresses of both parties
  • Description of the goods and services
  • The price and quantities for those goods and services
  • The terms of payment
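
As a rough illustration, the fields listed above could be modelled as a small data structure. This is only a sketch: the class names `ConstructionInvoice` and `LineItem`, the field names, and the sample values are assumptions made for this example, not a standard schema.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from typing import List


@dataclass
class LineItem:
    description: str      # goods or services supplied
    quantity: Decimal
    unit_price: Decimal

    @property
    def amount(self) -> Decimal:
        return self.quantity * self.unit_price


@dataclass
class ConstructionInvoice:
    issued_on: date       # date on which the invoice is generated
    seller: str           # name and address of the issuing party
    buyer: str            # name and address of the customer
    payment_terms: str    # e.g. "Due On Receipt"
    items: List[LineItem] = field(default_factory=list)

    @property
    def total(self) -> Decimal:
        # Start from Decimal("0") so the sum stays a Decimal
        return sum((item.amount for item in self.items), Decimal("0"))


# Illustrative values only
invoice = ConstructionInvoice(
    issued_on=date(2008, 5, 31),
    seller="Example Builder, 123 Main Street",
    buyer="Example Customer, 120 Pine Street",
    payment_terms="Due On Receipt",
    items=[
        LineItem("Excavation", Decimal("1"), Decimal("630.00")),
        LineItem("Foundation", Decimal("1"), Decimal("1800.00")),
    ],
)
print(invoice.total)  # 2430.00
```

Using `Decimal` rather than `float` avoids rounding surprises when monetary amounts are added up.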

Typically, managers use software to generate these invoices. Sometimes, they enter the details manually into the company's invoice template. Below are some types of invoices that are frequently used in construction companies.

Construction Invoice: A construction supervisor issues a construction invoice. These are usually entered in software and exported as PDFs or printed as paper forms. Every company follows a particular template, which helps in organising and extracting the necessary information when required. Apart from the basic information, a construction invoice template typically includes the work duration, project notes, a breakdown of the labour, the associated costs, and the hourly rates to be charged. It also contains essential tax information.

According to some studies, construction businesses are losing more than one full day of work every week to inefficiency in maintaining their invoices.
Screenshot of construction invoice (src)

Supplier Invoice: Supplier invoices are the sales invoices and bills issued by the supplying vendor and received by the buying customer. Customers also refer to supplier invoices as vendor invoices. Labour accounts for nearly 62% of all accounts payable costs. This includes everything from receiving, routing, and filing supplier invoices to issuing payments to vendors and correcting payment errors. Much of that labour goes directly towards manually resolving problems with invoices, including data entry mistakes and missed payments. Below is a screenshot of what a supplier invoice looks like.

Supplier Invoice Template (src)
Automation can help staff spend less time on manual data entry, limiting it to instances where there may be major errors, and can reduce the need for paper invoices.

Architect Invoice: Architect invoices are documents provided by architects to get paid for their work. Because construction companies do not generate these invoices themselves, storing and organising them is a tedious task.

Screenshot of Architect Invoice (src)

Subcontractor Invoice: A subcontractor invoice is a breakdown of the services provided, the costs, and information about how the client should pay for the work. These invoices come in vast numbers, so organising them with information extraction algorithms helps track the total amount spent and the amount remaining in the budget.

Now that we’ve seen the most frequently used invoices, let’s dive into automating and digitising construction invoices.

How to digitise construction invoices better with OCR?

OCR and Deep Learning have enabled machines to perform many document-processing tasks. One of the most significant strengths of modern OCR technology is its ability to "learn" and develop a more nuanced approach to reading specific types of invoices. Once an OCR solution is trained to read an invoice, the accuracy of its reading can increase over time.

Digitising construction invoices involves several human-moderated steps:

  • Data Collection / Capture
  • Processing Images
  • Extracting Text with OCR and Deep Learning
  • Exporting Text to Databases or Spreadsheets

Let’s look at these steps in detail:

1. Data Collection / Capture

Data collection is the first step in automating information extraction from construction invoices. It includes scanning paper invoices or photographing them with a simple camera, and gathering soft copies in different formats into one place. To make the digitisation process more accurate, make sure the dataset is robust and covers different templates, since the OCR and DL models are later trained on it.

2. Processing Images

Once we have enough data, the invoice images are processed and verified. This includes checking each image’s orientation and resolution to ensure the text and tables are correctly aligned before they are sent to OCR.
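
A minimal sketch of such a pre-check is shown below. It makes several assumptions that are not from this article: the page is US Letter size (8.5 x 11 inches), portrait is the expected orientation, and 300 DPI is used as the minimum because it is a commonly recommended resolution for OCR. The function names `effective_dpi` and `precheck` are hypothetical.

```python
def effective_dpi(width_px, height_px, page_width_in=8.5, page_height_in=11.0):
    """Estimate scan resolution from pixel size and an assumed physical page size."""
    # Take the smaller of the horizontal and vertical estimates to be conservative
    return min(width_px / page_width_in, height_px / page_height_in)


def precheck(width_px, height_px, min_dpi=300.0):
    """Return a list of human-readable issues found before sending a page to OCR."""
    issues = []
    if width_px > height_px:
        # Heuristic only: a wide image may be a portrait page scanned sideways
        issues.append("landscape orientation: page may need rotating")
        dpi = effective_dpi(height_px, width_px)
    else:
        dpi = effective_dpi(width_px, height_px)
    if dpi < min_dpi:
        issues.append(f"low resolution: about {dpi:.0f} DPI, below {min_dpi:.0f}")
    return issues


print(precheck(2550, 3300))  # []
print(precheck(1275, 1650))  # ['low resolution: about 150 DPI, below 300']
```

Pages that return a non-empty list can be routed to a person for rescanning or rotation instead of going straight to OCR.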

That’s not all! One major step that most users miss is checking the authenticity of the invoices. This involves finding blurred spots on the invoice to see whether any numbers or dates have been manipulated. This can be done using computer vision techniques.
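
One simple computer vision heuristic for locating blurred regions is the variance of the Laplacian, computed tile by tile. The sketch below is an illustration chosen for this example, not the method of any particular product. It assumes the image is already available as a 2D list of grayscale values (0 to 255), and the tile size and threshold are arbitrary values that would need tuning. A low score only marks a tile for human review: blank background also scores low, and blur alone does not prove tampering.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over the interior pixels of `gray`."""
    rows, cols = len(gray), len(gray[0])
    responses = [
        gray[r - 1][c] + gray[r + 1][c] + gray[r][c - 1] + gray[r][c + 1] - 4 * gray[r][c]
        for r in range(1, rows - 1)
        for c in range(1, cols - 1)
    ]
    return pvariance(responses)


def blurry_tiles(gray, tile=8, threshold=50.0):
    """Return (top, left) origins of tiles whose sharpness score is below threshold."""
    flagged = []
    for top in range(0, len(gray) - tile + 1, tile):
        for left in range(0, len(gray[0]) - tile + 1, tile):
            patch = [row[left:left + tile] for row in gray[top:top + tile]]
            if laplacian_variance(patch) < threshold:
                flagged.append((top, left))
    return flagged


# Synthetic 8x16 image: a sharp checkerboard tile next to a smooth gradient tile
sharp = [[255 * ((r + c) % 2) for c in range(8)] for r in range(8)]
smooth = [[10 * c for c in range(8)] for r in range(8)]
image = [s + m for s, m in zip(sharp, smooth)]

print(blurry_tiles(image))  # [(0, 8)]
```

In practice the check would be restricted to regions that contain text, such as amounts and dates, so that empty margins are not flagged.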

3. Extracting Text with OCR and Deep Learning

Before getting started, let’s look at what OCR is. It is a technique that converts scanned images or documents into editable text. In our use case of extracting information from construction invoices, OCR lets us extract all the text whenever a form is passed as input.

To see this in action, we’ll look at Tesseract, one of the most powerful open-source OCR engines. Tesseract is written in C++; it was originally developed at Hewlett-Packard, and its development was later sponsored by Google. It can be used from Python through wrappers such as pytesseract. However, even popular tools like Tesseract fail to extract text in some complex scenarios. They extract text from the given images without applying any understanding of layout or business rules. Therefore, they need intelligent algorithms backing them; this is where deep learning comes into the picture.
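
As a minimal sketch, Tesseract can be called from Python through the pytesseract wrapper. This assumes the Tesseract binary, the `pytesseract` package, and Pillow are installed; the function names and the file name `invoice.png` are placeholders for this example.

```python
def ocr_invoice(image_path, lang="eng"):
    """Run Tesseract on one image file and return the raw extracted text."""
    # Imported lazily so this module still loads where OCR dependencies are absent
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img, lang=lang)


def non_empty_lines(raw_text):
    """Strip surrounding whitespace and drop blank lines from raw OCR output."""
    return [line.strip() for line in raw_text.splitlines() if line.strip()]


# Hypothetical usage, given a scanned invoice saved as invoice.png:
# for line in non_empty_lines(ocr_invoice("invoice.png")):
#     print(line)
```

The result is a flat list of text lines with no notion of which line is a customer name, a date, or a total.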

Let’s look at how Tesseract extracts text from construction invoices:

Building Laur Dream # ABCR
123 Maia Street Ste 101. Ashland Orequa.97$20 ~$41-488-1234 ~bholen@hocaom =Laense:012348 Date:05/31/08
John Abercrombie Home
Mona Fitch 1234 Hersey Street
120 Pine Street Ashland OR 97520
Ashland OR 97520
Fellman, 1234 Due On Receipt
Invoice Items:
tom Status | Amount _|
Job Phase: Excavation
Excavation Invoiced $630.00
Job Phase: Foundation
Kit Foundation Invoiced $1,800.00
Description Amount Notes
Original Estimate
06/01/09 $130,398.01
Deposit Received
02/01/07 $1,800.00
Allowance Variances
Cabinets $1,650.00 Allowance: $5,610.13, Actual: $7,260.13
Counters $800.00 Allowance: $1,200.00, Actual: $2,000.00
Doors $80.00 Allowance: $1,320.00, Actual: $1,400.00
Floor Covering $75.00 Allowance: $4,265.00, Actual: $4,340.00
Windows $150.00 Allowance: $8,508.00, Actual: $8,658.00
Total Variance $2,755.00 Estimated: $27,253.13, Actual: $30,008.13
With Markups $3,306.00
Change Orders
Cost Adjustment
Cost Adjustment -$8,003.00 Estimated: $130,398.01, Actual: $122,395.01
Draws (Invoices)
05/30/06 $2,730.00
11/06/06 $5,374.50
11/07/06 $13,393.50

This output isn't orderly or directly usable. As discussed, the core job of OCR is to extract all the text from a given document irrespective of template, layout, language, or fonts. But our goal is to pick out the critical information, such as customer name, form type, and financial details, from construction invoices. OCR engines like Tesseract do not handle this on their own. Therefore, we rely on deep learning models trained on large datasets to learn where these fields appear, as discussed next.
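
To make the limitation concrete, here is a sketch of a purely rule-based alternative: a regular expression that pulls the allowance rows out of raw OCR lines like the ones above. The pattern and the function names are assumptions written for this one layout. Note how the "Total Variance" row, which uses the label "Estimated" instead of "Allowance", is silently skipped. That kind of brittleness is what learned models aim to avoid.

```python
import re
from decimal import Decimal

MONEY = r"-?\$[\d,]+\.\d{2}"
VARIANCE_ROW = re.compile(
    rf"^(?P<item>[A-Za-z ]+?) (?P<variance>{MONEY}) "
    rf"Allowance: (?P<allowance>{MONEY}), Actual: (?P<actual>{MONEY})$"
)


def to_decimal(money):
    """Convert a string such as '$1,650.00' into a Decimal."""
    return Decimal(money.replace("$", "").replace(",", ""))


def parse_variances(lines):
    """Return one dict per line that matches the allowance-row pattern."""
    rows = []
    for line in lines:
        match = VARIANCE_ROW.match(line.strip())
        if match:
            rows.append({
                "item": match["item"],
                "variance": to_decimal(match["variance"]),
                "allowance": to_decimal(match["allowance"]),
                "actual": to_decimal(match["actual"]),
            })
    return rows


sample = [
    "Cabinets $1,650.00 Allowance: $5,610.13, Actual: $7,260.13",
    "Floor Covering $75.00 Allowance: $4,265.00, Actual: $4,340.00",
    "Total Variance $2,755.00 Estimated: $27,253.13, Actual: $30,008.13",
]
rows = parse_variances(sample)
print([row["item"] for row in rows])  # ['Cabinets', 'Floor Covering']
```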

Deep learning approaches have advanced considerably in reading text and extracting structured and unstructured information from images. By combining existing deep learning methods with OCR technology, companies and individuals have automated document digitisation, which reduces manual data entry and brings better logging and storage, fewer errors, and faster response times.

Several tools are available commercially and in the open-source community for such tasks, each with its pros and cons. Commercial options include Google Vision API, Amazon Rekognition, and Microsoft Cognitive Services. Commonly cited open-source options include CUTIE and LayoutLM.

Here’s an image of how a Nanonets model based on OCR and Deep Learning extracts fields and line items from construction invoices.

4. Exporting Text to Databases or Spreadsheets

After the models are trained and deployed, it's important to export the extracted data to a CSV file or a spreadsheet. This lets the end user upload or store all the records in databases, CRMs, or ERP systems. The export step is also a good place to validate the outputs and catch errors made while extracting the data. Additionally, we can draw insights, identify trends, and make predictions based on the information extracted from construction invoices.
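
A minimal sketch of this step using Python's built-in `csv` module is shown below. The field names, the `validate` rule (variance should equal actual minus allowance), and the sample rows are assumptions for illustration; the second row contains a deliberately wrong variance to simulate a misread digit.

```python
import csv
import io
from decimal import Decimal

FIELDS = ["item", "allowance", "actual", "variance"]


def validate(row):
    """Check that the extracted variance equals actual minus allowance."""
    return Decimal(row["actual"]) - Decimal(row["allowance"]) == Decimal(row["variance"])


def export_csv(rows, out):
    """Write valid rows to `out` as CSV and return the number of rows rejected."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    rejected = 0
    for row in rows:
        if validate(row):
            writer.writerow(row)
        else:
            rejected += 1
    return rejected


extracted = [
    {"item": "Doors", "allowance": "1320.00", "actual": "1400.00", "variance": "80.00"},
    # Simulated extraction error: the variance should be 150.00
    {"item": "Windows", "allowance": "8508.00", "actual": "8658.00", "variance": "105.00"},
]

buffer = io.StringIO()
rejected = export_csv(extracted, buffer)
print(rejected)  # 1
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, the same function can be given a file opened with `open("invoices.csv", "w", newline="")`, where the file name is a placeholder. Rejected rows can then be sent for manual review rather than loaded into the ERP system.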

Start using Nanonets for Automation

Try out the model or request a demo today!