Automating Capture to Transform Unstructured Data Processes

We live in a fast-paced world where consumers demand high-quality, rapid service. When we buy an item online, we expect a confirmation email within seconds; otherwise we start to worry that something has gone wrong.

The desire for rapid and meaningful response is also growing in more complex transactions such as a mortgage or insurance application, even though these often require documents to be supplied that contain the data needed to decide on the claim or application. Consumers can now take and send pictures on mobile devices or scan documents from home, and are attracted by the simplicity and availability of 24/7 e-business.

When transforming these types of business process, a capture strategy is needed so that documents can be delivered and processed quickly and at low cost, with a consistently high level of quality.

Within this article I contrast processes that are initiated with structured and unstructured data, review the challenges and costs associated with unstructured data, and suggest how automating the extraction of unstructured data can deliver competitive advantage.

What is Data Capture and Why is it Important?

The execution of each distinct electronic business process is entirely dependent on its specific data values. Capture is the method of gathering the input data that each process instance uses. There are three common approaches to capturing seed data for processes. The data is either:

  • Imported from a line of business system either directly or via an automated transfer mechanism,
  • Manually entered by a customer via an application such as a website or an app,
  • Extracted from semi-structured documents such as invoices or completely unstructured content such as emails, letters or unknown forms.

The first two are examples of highly automated processes that are launched by structured data at the point the data is delivered, where manual intervention is needed only if fulfilment requires it. Consider the example of buying a dress online. The shopper selects the specific dress they want and provides additional information such as colour and size. Data capture is complete the moment the customer elects to buy the item, with address and delivery details subsequently supplied. Manual effort is only needed to box and post the correct dress.

These processes execute consistently and rapidly, which provides a predictable and repeatable customer experience. They typically have a low cost to execute because there is minimal need for checking or manual hand-offs.

The third is an example of processes that use unstructured data in the form of documents. These include traditional paper-based applications and letters, but increasingly also documents sent electronically as pictures from mobile phones or as attachments to emails. Modern technology allows content to arrive from customers almost immediately, but that only increases the customer's expectation of a speedy response.

Unstructured data is not in an accessible format that can be immediately used to launch and add to automated business processes. A strategy is needed to extract the data and transform it into a useful format.
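To make the idea of transforming unstructured content into a useful format concrete, here is a minimal sketch in Python. It assumes a hypothetical insurance claim that arrives as free text in an email, and it pulls out two fields with regular expressions. The `ClaimRecord` type, the field names and the patterns (for example, a policy number made of two letters and six digits) are illustrative assumptions, not part of any particular capture product.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClaimRecord:
    """Structured seed data for a hypothetical insurance-claim process."""
    policy_number: Optional[str]
    claim_amount: Optional[float]


# Illustrative patterns only; real documents vary far more than this.
POLICY_PATTERN = re.compile(
    r"policy\s*(?:number|no\.?)?\s*[:#]?\s*([A-Z]{2}\d{6})", re.IGNORECASE
)
AMOUNT_PATTERN = re.compile(r"[£$]\s*(\d{1,3}(?:,\d{3})*(?:\.\d{2})?)")


def extract_claim(text: str) -> ClaimRecord:
    """Turn free text into a structured record; missing fields become None."""
    policy = POLICY_PATTERN.search(text)
    amount = AMOUNT_PATTERN.search(text)
    return ClaimRecord(
        policy_number=policy.group(1).upper() if policy else None,
        claim_amount=float(amount.group(1).replace(",", "")) if amount else None,
    )


email_body = (
    "Hello, I would like to claim £1,250.00 for storm damage.\n"
    "Policy number: ab123456\n"
    "Thanks"
)
record = extract_claim(email_body)
```

Any field that cannot be found is left as `None`, so a downstream process can tell that the value still needs to be supplied rather than receiving a guess.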

Approaches to Extracting Unstructured Data

Customer satisfaction is increasingly important and increasingly difficult to achieve. Satisfied customers are 50% more likely to listen to a sales offer, while dissatisfied customers (if they even choose to listen) are twice as likely as satisfied customers to decline an offer after listening to it.[i] Customers do not expect poorer or slower services just because documents are provided as process inputs.

There are a number of ways to transform the data. A technology-free approach is possible, in which printed documents are given to workers who read them and then extract and re-key the data or launch actions based on the content. This traditional approach requires knowledgeable workers who are trusted to take the correct action, and it needs quality assurance, measurement and training if the steps are to be executed consistently. This dependence on manual resource makes these processes slow to complete and typically less reliable than processes seeded with structured data. It is difficult to see how this approach can deliver competitive value. The alternative is to invest in technology.

Technology for Extracting Unstructured Data

Extraction technology is now a remarkably mature and reliable choice that is relatively inexpensive to implement and benefit from. These technologies use:

  • ICR (Intelligent Character Recognition) to translate handwritten words,
  • OCR (Optical Character Recognition) to extract machine-printed characters such as those in this article, and
  • IMR (Intelligent Mark Recognition) to determine whether a check box has been selected on a form.
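As a rough illustration of the last item, the sketch below shows one simple way a check box could be judged as selected: count how much of the area inside the box is dark. This is an assumption made for illustration, not a description of how any particular recognition product works. The function name, the grayscale representation and both thresholds are hypothetical.

```python
from typing import List


def is_box_checked(
    region: List[List[int]], dark_below: int = 128, min_fill: float = 0.2
) -> bool:
    """Guess whether a check box is marked.

    region holds rows of grayscale pixels (0 = black, 255 = white), cropped
    to the inside of the box. Both thresholds are illustrative assumptions.
    """
    pixels = [p for row in region for p in row]
    if not pixels:
        return False
    dark = sum(1 for p in pixels if p < dark_below)
    return dark / len(pixels) >= min_fill


size = 5
empty_box = [[255] * size for _ in range(size)]
# A box with an "X" drawn through it: both diagonals are dark.
crossed_box = [
    [0 if col == row or col == size - 1 - row else 255 for col in range(size)]
    for row in range(size)
]
```

In practice the thresholds would need tuning against real scans, since smudges, stray pen strokes and faint marks all shift the proportion of dark pixels.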

Capture technology is most successful and easiest to implement when data is extracted from structured forms where a template is known. It can also be highly effective with semi-structured or unstructured documents, although this takes more effort.

With an automated capture process, human touches can be limited to verifying and validating that the data automatically extracted from the documents is accurate. These tasks tend to be completed on an exception basis only and require minimal effort from an end user. In many instances the task can also be completed by someone with no expertise in the subject matter, as they are only confirming that characters have been extracted correctly.
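Verification on an exception basis can be sketched as a simple routing rule. The example below assumes that the recognition step reports a confidence score between 0 and 1 for each extracted field, which is a common pattern but is not stated in this article. The `ExtractedField` type and the threshold of 0.9 are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by a hypothetical engine


def split_for_review(
    fields: List[ExtractedField], threshold: float = 0.9
) -> Tuple[Dict[str, str], List[ExtractedField]]:
    """Accept confident fields automatically; queue the rest for a person."""
    accepted: Dict[str, str] = {}
    exceptions: List[ExtractedField] = []
    for field in fields:
        if field.confidence >= threshold:
            accepted[field.name] = field.value
        else:
            exceptions.append(field)
    return accepted, exceptions


fields = [
    ExtractedField("surname", "Patel", 0.98),
    ExtractedField("postcode", "AB1 2CD", 0.62),
]
accepted, exceptions = split_for_review(fields)
```

Only the low-confidence postcode would be shown to a person here, and confirming it requires comparing characters against the image rather than any knowledge of the subject matter.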

Leading capture technology solutions also allow for techniques such as double-blind verification, where two people confirm the same data independently of each other, to maximise the likelihood that extracted seed data is accurate. Although this adds extra time and cost, it is typically much faster than the traditional method.
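The comparison step of double-blind verification might look like the sketch below. It assumes each of the two people produces their own set of field values without seeing the other's, and that a field is accepted only when both agree. The function name and the rule of ignoring surrounding whitespace are illustrative assumptions.

```python
from typing import Dict, List, Tuple


def double_blind_verify(
    first: Dict[str, str], second: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Compare two independent passes over the same document.

    Returns the fields both people agree on, plus the names of fields that
    differ or are missing from one pass and so need a further check.
    """
    agreed: Dict[str, str] = {}
    disputed: List[str] = []
    for name in sorted(set(first) | set(second)):
        a = first.get(name)
        b = second.get(name)
        if a is not None and b is not None and a.strip() == b.strip():
            agreed[name] = a.strip()
        else:
            disputed.append(name)
    return agreed, disputed


pass_one = {"surname": "Patel", "postcode": "AB1 2CD"}
pass_two = {"surname": "Patel ", "postcode": "AB1 2CO"}
agreed, disputed = double_blind_verify(pass_one, pass_two)
```

Disputed fields could then be sent to a third person or back to the original image, which keeps the extra effort confined to the values where the two passes disagree.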

This technology typically interfaces well with leading business process management suites (BPMS), providing an end-to-end capability for automating business processes.


When an effective capture strategy is implemented that minimises the time and cost of transforming unstructured data into structured data, there are many benefits to gain. These include:

  • Significantly reducing the time taken to respond to customers who are required to provide documents as process inputs. Use cases include insurance claims and mortgage requests, where drop-off rates between completing the structured application form and submitting paper documents remain high under the traditional “snail mail” model.
  • Removing the need for documents to be sent by post. Uploading via an app or website instead also reduces storage costs, as there is no physical content to manage.
  • Shrinking the cost of processing, as an effective capture strategy reduces the human interactions required to extract and validate the data. Reduced manual involvement also reduces the likelihood of error, which will please the customer.
  • Achieving a rapid ROI, as the maturity of the technology allows for fast implementation.

The net effect is a faster, more reliable and more consistent process that can transform the interaction with customers and deliver a tangible competitive advantage.


In this article I have highlighted the problems that unstructured data creates when attempting to improve processes. A data capture strategy is required that quickly and reliably transforms this data so that it can be processed in a structured and repeatable way. A technology-based capture solution will provide the greatest benefit and should therefore be given due consideration when improving these types of process.

[i] Financial Services Customer Experience Survey, Maritz, 2008. (quoted in “Insurance Providers: Improving Customer Retention through the Contact Center” by Jacada, 2008).


