Unstructured data: How to implement an early warning system for hidden risks


Managing the devil you know is difficult enough. But in risk management, it’s the devil you don’t know that can spring chaos on your organization. And risks are growing—and growing in complexity—at such a clip that the periodic cycle of internal audits and updating of controls often can’t keep up, particularly when it comes to risks that can go viral in this hyper-connected electronic age.

Thus, a new reality is emerging: The scope of risk on which internal controls are focused at many companies is much narrower than the scope of risks actually faced by the organization. So organizations must implement new strategies to identify and manage risks. For most organizations, that strategy should include monitoring a broader swath of internal and external data.

Accountants and management have long used internal controls to identify, assess, and manage risks. And, increasingly, organizations have turned to audit software to conduct more extensive internal control testing and transaction monitoring. The problem is that most companies have used such technology primarily to monitor controls embedded in structured data—e.g., operating statistics and financial facts found in transactional data such as accounts payable, customer sales, inventory, and labor hours—contained in enterprise resource planning and other financial systems. But companies rarely use such technology to effectively identify risks beyond those related to business and transaction processing in financial and operating system internal controls.

What often goes unmonitored is a broad universe of data including email, desktop documents, internet logs, phone calls, text messages, and even social media messages and online customer product reviews. These data have become known throughout industry as “unstructured data,” a catchall term that lacks precision because much of the data included under that rubric actually have structured components. Email, for instance, contains addresses, senders, times, and the like. (See Exhibit 1 for examples of types of unstructured data.)

Regardless of the nomenclature, monitoring these data could help a company spot emerging opportunities and risks. This article highlights techniques that can be used to expand the scope and effectiveness of an organization’s risk management program in a way that may help identify hidden risks before they emerge as full-blown crises.



Every organization faces corporate events where information flow and timing generate risk: confidential product R&D activities, new product announcements, corporate earnings releases, news releases that may affect share price, significant changes in product pricing, procurement activities that indicate a shift in product strategy, confidential executive search activity, or announcements of joint ventures and other major investments.

Often, that large body of underexamined data explains why a risk existed, how the risk developed, who was involved, and what corollary risk exposures there might be. In other words, these are the “who, what, when, where, how, and why” of a risk. Email messages and other electronic documents, for example, are the types of data that attorneys and regulators will seek in lawsuits and forensic accounting investigations. But the tale of the risk, in those instances, is often learned only in retrospect.

Organizations that proactively monitor these kinds of data, however, might be able to better detect risks before they emerge, surprise management, and turn into post facto investigations.


In stock option backdating cases, for example, the manipulation of event timing and uneven access to important earnings and performance data generate the risk. Email data, not the ledgers, generally reflect the timing and intent of the information and activities leading up to the transaction. Investigators examine email traffic surrounding each grant to determine the intended date of the grant and the date on which it was actually communicated. In some cases, data may reflect unusual gaps in email traffic, which might indicate that a key executive has deleted his or her emails on that particular topic. Analysis of the electronic files underlying a stock grant memorandum, meanwhile, may show that a memo was in fact created, finalized, and printed long after the memo date and grant date typed inside the memo.
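The two checks described above, looking for unusual gaps in email traffic and comparing a file's creation date with the date typed in the memo, can be sketched in a few lines of Python. This is a minimal illustration, not a forensic tool: the function names, the three-day gap threshold, and the sample timestamps are all assumptions made for the example.

```python
from datetime import date, datetime, timedelta


def find_traffic_gaps(timestamps, max_gap=timedelta(days=3)):
    """Return (start, end) pairs where consecutive emails are more than max_gap apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]


def created_after_stated_date(stated_date, file_created):
    """True when file metadata shows creation after the date typed inside the memo."""
    return file_created.date() > stated_date


# Hypothetical timestamps of emails that discuss one stock option grant.
grant_emails = [
    datetime(2024, 3, 1, 9, 0),
    datetime(2024, 3, 2, 14, 0),
    datetime(2024, 3, 9, 10, 0),
    datetime(2024, 3, 10, 8, 0),
]
gaps = find_traffic_gaps(grant_emails)

# Hypothetical memo dated March 1 whose file was created in mid-April.
memo_suspect = created_after_stated_date(date(2024, 3, 1), datetime(2024, 4, 15, 16, 30))
```

A flagged gap or date mismatch is only a prompt for human review. Vacations, system outages, or a retyped document can produce the same pattern.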

To avoid these kinds of risks, a company might consider monitoring certain data for a limited period of time just before and after a key event. It could monitor the communications, documents, and data transmissions of the personnel who are relevant to the event: executives, financial managers, IT technicians, product engineers, procurement staff, in-house counsel, public relations personnel, contractors, and even vendors such as the financial printer, search firm, or audit firm.
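As a rough sketch of event-based monitoring, the Python below selects only the messages that fall inside a window around a key event and involve a person on the watch list. The window lengths (14 days before, 7 days after), the field names, and the addresses are hypothetical choices for illustration; an organization would set its own under its monitoring policy.

```python
from datetime import date, timedelta


def monitoring_window(event_date, days_before=14, days_after=7):
    """Return the (start, end) dates of the review period around an event."""
    return (event_date - timedelta(days=days_before),
            event_date + timedelta(days=days_after))


def select_for_review(messages, event_date, watched_people,
                      days_before=14, days_after=7):
    """Keep messages sent inside the window that involve a watched person."""
    start, end = monitoring_window(event_date, days_before, days_after)
    return [m for m in messages
            if start <= m["sent"] <= end
            and (m["sender"] in watched_people or m["recipient"] in watched_people)]


# Hypothetical event and message log.
earnings_release = date(2024, 7, 25)
watched = {"cfo@companya.example", "printer@vendor.example"}
messages = [
    {"sender": "cfo@companya.example", "recipient": "printer@vendor.example",
     "sent": date(2024, 7, 20)},
    {"sender": "cfo@companya.example", "recipient": "hr@companya.example",
     "sent": date(2024, 5, 1)},
    {"sender": "intern@companya.example", "recipient": "hr@companya.example",
     "sent": date(2024, 7, 24)},
]
to_review = select_for_review(messages, earnings_release, watched)
```

Limiting the review to a defined window and a defined set of people keeps the volume manageable and narrows the privacy footprint of the monitoring.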

One practical way to operationalize this strategy is to recognize that these risk events are likely included on the quarterly agenda of the board of directors and its committees. For any critical, upcoming event, the board and management should consider whether preventive monitoring is appropriate. The monitoring may be confidential in order to detect illegitimate activity, or it may be announced as corporate policy and a control measure to ensure that employees are more careful in their communications. In either case, any monitoring activities should be governed by a clear corporate policy that is communicated to employees and accounts for the cultural and legal considerations outlined below.


Proactive monitoring strategies might also help companies during the due diligence of a merger or acquisition.

The Federal Trade Commission and the Department of Justice Antitrust Division often request emails, documents, spreadsheets, presentations, and financial reports from parties to develop a detailed view of the parties’ products, sales, customers, competitors, market share and market dynamics, business plans, and strategy. Their goal is to determine whether a deal would pose an unacceptable threat to free competition in the parties’ particular market.

It’s a model that parties conducting their own due diligence reviews for any significant transaction might want to follow. By conducting streamlined reviews of correspondence, memos, presentations, and even recent social media postings of key executives, finance managers, or account representatives, they can develop a more complete view of the financial, operational, and other risks that affect the transaction’s value.


For many financial industry organizations, the sales function is regulated by the Financial Industry Regulatory Authority and the SEC, which require employers to monitor and preserve sales communications with customers and the general public, including corporate and personal email, telephone calls, instant messages, text messages, and social media postings.

While other industries do not have the same compliance requirements, many nonetheless implement similar measures for regulatory compliance, quality assurance, or talent retention purposes. Although employee confidentiality and privacy are always competing concerns, the needs of the company and employee can be, and have been, balanced effectively.

For example, Company A is a manufacturer competing in a sales-driven industry in which effective sales representatives are in high demand. The sales process is highly regulated and subject to Foreign Corrupt Practices Act risks in several locations overseas. The company’s human resources director would like to ensure that sales personnel succeed in navigating regulatory requirements and that any areas of weakness receive further policy clarification and training. In addition, where the parties are subject to noncompete or nonsolicitation agreements, she may also like to know about any activity by competitors to recruit key sales personnel or to obtain key customer information.

The director is fortunate that, within the past year, the IT department has implemented an email archiving program. The software was installed at the legal department’s request to centralize company email for purposes of records management and destruction, and it has the added benefit of streamlining email servers for the IT department. The software also can search employee email and attachments, and allows the HR director to set up automated searches to monitor the email of key sales reps. Here is what’s included:

  • One search designed to identify any emails or attachments discussing potentially questionable sales practices, significant gifts, creative discounts, and so on. These emails are forwarded to the HR director’s assistant for review and follow-up if necessary. That may include forwarding emails to the sales rep’s manager for clarification or to the legal department to address any regulatory questions, and then compiling any case studies for use in future training sessions.
  • A second set of searches to identify email traffic to and from competitor email addresses. This information gives the HR director visibility into any solicitation of key sales personnel or sharing of confidential customer information.
  • The email searches are supplemented by an anonymous review of web search logs to determine whether sales personnel are unhappy in their positions, spending an unusual amount of time searching job boards, or researching salary information.
  • Further, the HR director has had special concerns about email traffic in overseas subsidiaries that may indicate corrupt practices by employees. In addition to setting up email searches designed to detect these messages, an additional search is conducted for emails and attachments that are encrypted. Depending on the email recipient, this may be an excellent security practice to protect legitimate information, but for other recipients it may indicate an attempt to hide damaging transmissions.
  • IT supplements the search for encrypted employee emails by using its software management program to periodically sweep computers in foreign offices for unapproved encryption software.
  • For any sales representative or circumstances that warrant further scrutiny, the director monitors text messages sent from the employee’s corporate cellphone and emails sent from personal email accounts using corporate computers.
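The automated email searches in the Company A example could look something like the Python sketch below. It is an illustration under stated assumptions, not a description of any archiving product: the watch terms, the competitor domain, and the message fields are hypothetical, and treating a file extension as a sign of encryption is a deliberate simplification.

```python
import re
from email.utils import getaddresses

# Hypothetical watch list; a real list would come from HR and legal.
WATCH_TERMS = re.compile(
    r"\b(?:gift|discount|kickback|facilitation payment)s?\b", re.IGNORECASE)
COMPETITOR_DOMAINS = {"rivalco.example"}         # hypothetical competitor domain
ENCRYPTED_SUFFIXES = (".pgp", ".gpg", ".p7m")    # extensions often used for encrypted files


def flag_message(msg):
    """Return the list of review flags raised by one message (a plain dict)."""
    flags = []
    # Search 1: wording that may point to questionable sales practices.
    if WATCH_TERMS.search(msg["subject"] + " " + msg["body"]):
        flags.append("sales-practice")
    # Search 2: traffic to or from a competitor's email domain.
    addresses = [addr for _, addr in getaddresses([msg["from"], msg["to"]])]
    if any(a.rsplit("@", 1)[-1].lower() in COMPETITOR_DOMAINS for a in addresses):
        flags.append("competitor-contact")
    # Search 3: attachments that appear to be encrypted, judged by file name only.
    if any(name.lower().endswith(ENCRYPTED_SUFFIXES) for name in msg["attachments"]):
        flags.append("encrypted-attachment")
    return flags


# Hypothetical message that trips all three searches.
sample = {
    "from": "Sam Lee <sam@companya.example>",
    "to": "Pat <pat@rivalco.example>",
    "subject": "Q3 pricing",
    "body": "We can offer a creative discount plus a small gift.",
    "attachments": ["terms.pdf.gpg"],
}
sample_flags = flag_message(sample)
```

Each flag routes a message to a human reviewer rather than drawing a conclusion, which matches the review-and-follow-up step described for the HR director’s assistant.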

Some of these monitoring activities should be announced as policy, and others should be kept confidential. In organizations that place a premium on employee confidentiality, a number of mechanisms can ensure that monitoring and reporting activities are conducted anonymously, with personal follow-up only in instances where it is clearly warranted.


Many organizations have adopted unstructured data mining techniques to strengthen their business models and increase profits, focusing largely on external risks and opportunities. Health care organizations, for example, are using unstructured data analytics to streamline health care processes and to suggest or confirm medical diagnoses. Other companies are using these techniques to develop R&D strategies based on far-flung information regarding potential competitors, intellectual property claims, available funding, the regulatory context, and trends in publications.

But an organization’s decision to expand data monitoring and analysis to include an early warning system for hidden internal risks can prove to be sensitive. Decision-makers should consider a number of practical, cultural, and legal factors. Among them:

Organization size. Some smaller or more centralized organizations may decide that the costs of data monitoring exceed the benefits. There may be more direct ways for management to keep its fingers on the pulse.

Organizational culture. Some companies may think that certain forms of data monitoring could compromise their culture and that they prefer to rely solely on an atmosphere of mutual trust to identify and manage risks.

Operational environment. The customers, sales and operating locations, legal and regulatory context, and products and services sold are examples of specific organizational considerations that may cause management to feel a need for a new level of risk monitoring.

Legal concerns. Organizations should always consult counsel to determine whether a particular form of data monitoring and analysis is inconsistent with privacy and other requirements in the relevant jurisdictions or is in fact helpful for detecting potential liability risks (see Step 3 of the methodology below).

Employment policies. It is important to realize that many organizations already have policies in place, typically published in the employee handbook, explicitly stating that the company has the right to monitor employee use of company computer equipment and any personal devices traversing the corporate network. It is very common, for example, for organizations to monitor employee web surfing and to investigate their email messages without first notifying the employee.


Mining and monitoring such diverse data may sound complex—and, perhaps to some, a bit Orwellian. But, if managed carefully, Big Brother could help save the company.

Indeed, such a monitoring system can be one of the most straightforward, immediate, and cost-effective strategies a company can implement to head off any risks before they emerge and impact profits. Solutions may include real-time monitoring of a targeted data stream or occasional mining of a sample data set. The IT department might use existing corporate technologies such as spam filters, email archives, or enterprise search utilities, or a consultant might install new technologies on-site or capture data samples for off-site analysis. Some technologies that might come into play are text mining and analytics, concept clustering, relevance ranking, sentiment analysis, social/entity mapping, visual analytics, similarity analysis, and technology-assisted document review.
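To give a sense of how one of these techniques works, the Python sketch below ranks documents by their similarity to a sample document, using word counts and cosine similarity. It is a deliberately simple stand-in for the commercial similarity-analysis and relevance-ranking tools mentioned above; the function names and the sample text are assumptions for illustration.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Count lowercase word occurrences in a piece of text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 means no shared words)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_by_similarity(seed, documents):
    """Return (score, document) pairs, most similar to the seed first."""
    seed_counts = term_counts(seed)
    scored = [(cosine_similarity(seed_counts, term_counts(d)), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical seed document and corpus.
seed = "backdate the option grant memo"
corpus = [
    "lunch menu for friday",
    "please backdate the grant memo before the audit",
]
ranked = rank_by_similarity(seed, corpus)
```

Production tools typically add refinements such as term weighting and concept clustering, but the underlying idea of scoring documents against a known example is the same.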

Here’s a brief methodology for implementing a data-monitoring program within an organization:

1. Articulate your risk profile. Every risk management program should begin by articulating the top strategic risks the organization needs to manage. There are various ways to accomplish this task, and it is probably something you already do every day. In fact, the data will sometimes tell you about significant risks—or opportunities—that you were not aware of.

2. Develop a matching data profile. Together with your IT team, identify the corporate and internet data sources, as well as third-party databases and data aggregations, that are most appropriate for mining and monitoring the profiled risks.

3. Establish your policy on confidentiality and privacy. Most organizations already have policies in place that alert employees to the employer’s right to monitor activity conducted on the employer’s premises, equipment, and networks. In addition, these policies typically cover use of personal devices, email accounts, social media, and so on to the extent they are used on the enterprise network, to distribute company information, or to express positions on matters of interest to the organization. Nevertheless, employees may expect that such monitoring occurs only in rare circumstances. In consideration of employee relations and retention, a systematic monitoring program may require additional policy refinement and communications. This process may prompt you to perform some monitoring on an anonymous basis or have it performed by a third party as a firewall to limit the types of information relayed to and viewed by management.

4. Define the risk indicators. Analyze the selected data types and develop risk indicators that will be used to determine which data elements are subject to mining and monitoring. Risk indicators might include employee names and roles, competitor names, keywords and phrases, concepts, date ranges, relationships and contacts, or sample/seed documents that can be used to find “more documents like this.”

5. Define any data capture needs. In many cases, data monitoring can be implemented online in an existing system. For example, messages sent to or from a competitor may be flagged by setting up a simple rule in the IT department’s spam filter, or the corporate email or file server may be searched using existing search technologies. As an alternative, data files can be copied and analyzed offline in circumstances where offline technologies are more effective or cost-efficient, or lessen the impact on a specific IT system.

6. Use technology to automatically filter the significant data and documents. Electronic discovery, text mining, and other tools can quickly filter through mountains of data on an automated basis. These technologies are becoming less expensive and are designed to detect a wide array of risk indicators. For example, keyword phrases, web addresses, email addresses, and other criteria can be used to pull risk-related documents from the data. Other forms of technology-assisted review can be used to push unexpected risk relationships to reviewers by clustering documents by topic, email address, social relationship, or similar criteria lying latent in the data. This push capability allows reviewers to see risks that they did not know existed.

For example, the reviewer can detect the status and trends of customer sentiment from internet postings. Or a social map of email traffic can identify who is talking to whom within the enterprise or over the internet.

7. Identify the expert analyst. Once the data have been filtered and ranked for relevance by the technology, an individual will need to be designated to evaluate the results. This individual may vary by the risk being monitored. For violations of corporate policy in social media, it may be the human resources director. For certain compliance violations, it may be an in-house attorney. Illegitimate transmission of confidential information may be routed to the chief risk officer. The CFO might send the CEO a roll-up report of findings relating to items on the board of directors’ agenda.

8. Define a frequency for your analysis. Depending on the nature of the risk, the data may be mined on a periodic but infrequent basis or on a more frequent or continuous basis for more sensitive risks. The key is to identify the frequency that best balances the type and severity of the particular risk and its likelihood of occurrence against the cost and effort needed to evaluate the filtered data.
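Several of the steps above can be tied together in a short Python sketch: a risk indicator definition (Step 4), a keyword filter that pulls matching messages and a social map that shows who is talking to whom (Step 6), a designated analyst (Step 7), and a review frequency (Step 8). Everything here is hypothetical, including the 1-to-5 scoring scales and the thresholds that turn a score into a review cadence; each organization would calibrate these against its own risk profile and review costs.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskIndicator:
    name: str          # label shown to the reviewer
    keywords: tuple    # Step 4: words or phrases to pull on
    analyst: str       # Step 7: who evaluates the hits
    severity: int      # 1 (low) to 5 (high), hypothetical scale
    likelihood: int    # 1 (rare) to 5 (frequent), hypothetical scale


def pull_hits(messages, indicator):
    """Step 6 (pull): keep messages whose text contains any indicator keyword."""
    return [m for m in messages
            if any(k.lower() in m["text"].lower() for k in indicator.keywords)]


def social_map(messages):
    """Step 6 (push): count sender-to-recipient pairs to show who talks to whom."""
    return Counter((m["sender"], r) for m in messages for r in m["recipients"])


def review_frequency(indicator):
    """Step 8: map severity times likelihood to a cadence (illustrative thresholds)."""
    score = indicator.severity * indicator.likelihood
    if score >= 15:
        return "continuous"
    if score >= 6:
        return "weekly"
    return "quarterly"


# Hypothetical indicator and message sample.
leak = RiskIndicator("confidential leak", ("confidential", "do not forward"),
                     "chief risk officer", severity=5, likelihood=3)
messages = [
    {"sender": "ana@companya.example", "recipients": ["raj@rivalco.example"],
     "text": "Confidential pricing attached."},
    {"sender": "ana@companya.example", "recipients": ["lee@companya.example"],
     "text": "Lunch on Friday?"},
    {"sender": "ana@companya.example", "recipients": ["raj@rivalco.example"],
     "text": "Call me later."},
]
hits = pull_hits(messages, leak)
pairs = social_map(messages)
cadence = review_frequency(leak)
```

In this sample, the keyword filter surfaces one message for the chief risk officer, while the social map shows repeated contact with an outside address that no keyword would have caught.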

People-Process Solutions to Better Risk Management

In addition to technological solutions, the people-process approach establishes risk management as a strategic activity integrated at all levels of the organization, with clear processes for identifying, evaluating, and responding to risk. Those initiatives can include:

  • Establishing a lead risk executive.
  • Lobbying the board and C-suite to place greater emphasis on integrating risk management with strategic planning, budgeting, and other management processes.
  • Implementing a defined and rigorous process for risk dialogue between board and management, and driving the results down into the organization.
  • Establishing a management risk committee or working group.
  • Using strategic management techniques such as SWOT (examining strengths, weaknesses, opportunities, and threats), as well as scenario planning to identify and inventory strategic risks.
  • Encouraging out-of-the-box thinking about what risks are strategic.
  • Identifying key risk management practices and embedding them into core business processes and organizational structures.
  • Implementing a structured, efficient, and effective process for monitoring and reporting emerging risks to the board.
  • Providing checkpoints to periodically evaluate and change strategic risks and enterprise risk management processes.
  • Providing ongoing enterprise risk management updates and information to the entire organization, with training and continuing education for directors and senior management.
  • Developing risk scorecards tied to hard risk indicators such as customer retention or inventory.
  • Establishing a hotline for employees to anonymously report potential risks.


The term unstructured data is imprecise and covers a lot of ground. Data such as email, desktop documents, internet logs, phone calls, text messages, and social media messages have become known as “unstructured data,” even though these data actually have structured components.

Unstructured data often go unmonitored. Companies rarely use technology to effectively identify risks beyond those related to business and transaction processing in financial and operating system internal controls.

Unstructured data can help explain why a risk existed. They can also explain how the risk developed, who was involved, and what corollary risk exposures might exist. Organizations that monitor these kinds of data can better detect risks before they surprise management and turn into post facto investigations.

Unstructured data can help detect various types of risks. Checking the data can uncover manipulation of event timing, provide deeper understanding of a merger partner during due diligence, ensure compliance with certain regulations, or help monitor employee engagement and retention.

Privacy can be a concern. In organizations that place a premium on employee confidentiality, a number of mechanisms can ensure that monitoring and reporting activities are conducted anonymously. Organizations should always consult counsel to determine whether a particular form of data monitoring and analysis is inconsistent with privacy and other requirements in the relevant jurisdictions.

People are also part of the risk management puzzle. In addition to technological solutions, a people-process approach can establish risk management as a strategic activity integrated at all levels of the organization, with clear processes for identifying, evaluating, and responding to risk.

Christopher S. Beach (chrisbeach@beachanalyticsllc.com) is managing director of Beach Analytics LLC in Woodlands, Texas. William R. Schiefelbein (bschiefelbein@techlawsolutions.com) is managing director of consulting and chief strategy officer for TechLaw Solutions Inc. in Chantilly, Va.

To comment on this article or to suggest an idea for another article, contact Jack Hagel, editorial director, at jhagel@aicpa.org or 919-402-2111.


