Data mining your general ledger with Excel

Learn how to use Microsoft’s ubiquitous spreadsheet application to unlock the wealth of information stored in your company’s financials.
By J. Carlton Collins, CPA

Data mining your general ledger with Excel
Image by erhui1979/iStock

The general ledger is a vast historical data archive of your company's financial activities, including revenue, expenses, adjustments, account balances, and often much more. The detailed transactions recorded within the typical general ledger represent a treasure-trove of information and potential insights about the company's operations, but the traditional general ledger report's mundane format doesn't immediately serve up those insights.

The solution for wringing out valuable insights is to analyze the general ledger using Microsoft Excel's PivotTable, filtering, and charting capabilities. The problem is that, by default, most general ledger reports aren't correctly formatted for instant analysis in Excel. Fortunately, with a little trickery, a clever CPA can transform a blandly formatted general ledger into a data nugget-enriched PivotTable goldmine.

The trick involves using Excel's Go To Special, Blanks tool to repeat account numbers and descriptions on each row to make your data pivot-ready. This enhanced format helps to unleash Excel's full set of data analysis tools to ferret out those telling trends, useful tidbits, and meaningful observations.

The process of extracting golden information from your general ledger involves the following basic steps: (1) Export the general ledger to Excel; (2) repeat account numbers and descriptions for each row; (3) delete unnecessary rows and columns; (4) pivot the data in an Excel PivotTable; (5) sort and filter the results; (6) add additional data fields to the PivotTable to enhance your reports; and (7) chart your results. This article explains the process step by step. Click here download an Excel file with the sample data I used herein.

Of all the Excel tools, the PivotTable provides the most effective way to analyze your company's general ledger data because it enables you to slice, dice, drill, group, analyze, compare, chart, and mine those data across multiple accounts, departments, locations, months, or years to ferret out meaningful trends and observations. As mentioned, most CPAs don't pivot their general ledgers because they are not organized in a pivot-ready format. This article explains how to pivot a general ledger using QuickBooks as the example accounting system, but these procedures can produce similar results when applied to general ledgers produced by virtually any accounting system. For some very large companies, the general ledger can be so big that it has too many rows for Excel to pivot (see the sidebar, "Special Instructions for Companies With Large General Ledgers," at bottom of page), but the steps described in this article will work in the vast majority of situations. To aid your understanding of the scrubbing and pivoting processes described in this article, I have created a demonstration video that is available at the bottom of this page. (For the sake of simplicity, this video example includes only debit transactions of a general ledger's expense accounts.)

STEP 1: EXPORT GENERAL LEDGER TO EXCEL

In all desktop editions of QuickBooks, start by displaying the General Ledger report on screen, adjust the date range as desired, click the Expand button (to reveal the report's transaction details), and then from the report menu, select Excel, Create New Worksheet, Export. This action reproduces the QuickBooks general ledger report in a new Excel workbook.

In the rare instance that your accounting system does not offer the option to export your general ledger to Excel, you can still accomplish this task by printing your general ledger report to a Print File and then importing the resulting .prn file into Excel. In general, open Windows Print Manager and select Add a printer or scanner. Next, navigate to Manually add a new printer and apply any printer settings you desire, but be sure to select FILE: (Print to File) (as shown in the screen shot "Importing Into Excel") as the printer port.

Importing into Excel

printer-port


Once this Print to File option has been created, you can print the general ledger to this printer option from your accounting system and import the resulting .prn file into Excel. You will then need to parse the data in Excel using the Text to Columns parsing tool before proceeding to Step 2 below.

STEP 2: PREPARE THE DATA FOR PIVOTING

The next step is to prepare the Excel-based general ledger data for pivoting, an action sometimes referred to by CPAs as "scrubbing the data." This step typically involves performing a handful of common cleanup tasks, which are summarized below and discussed in greater depth thereafter:

a. Convert formulas (if any) to values.

b. Clean and format the data. Remove merged cells (if any).

c. Repeat account numbers (or row labels) for each row.

d. Remove unnecessary columns.

e. Remove unnecessary rows.

a. To convert formulas to values, select each column that contains formulas, copy those columns, and paste them to the same locations using the Paste Values command. This action will convert all formulas in the column to values. (If you are unsure of whether your imported general ledger contains formulas, you can search the worksheet for equal signs to identify the presence of formulas and make a note of those columns containing formulas.)

b. To clean and format the data, select the entire worksheet and, as necessary, do the following: Make all fonts and font colors the same, remove all borders and underlines, remove all hyperlinks, format cells containing numbers as values, format cells containing dates as dates, adjust all columns to the appropriate widths, and adjust all rows to the same height. If necessary, remove any merged cells by selecting the entire worksheet, then right-click anywhere on it and select Format Cells, Alignment, and clear the box labeled Merge cells.

c. The process of repeating the report's account numbers and labels for each transaction row requires a simple but clever trick, as follows: As illustrated in the simplified example in the screen shot "Repeating Numbers, Labels for Each Transaction Row," the objective is to repeat all account numbers and/or descriptions (circled in red) for each transaction row (highlighted in yellow).

Repeating numbers, labels for each transaction row

repeating


You might be tempted to copy and paste the account numbers and/or labels individually, but this method is simply impractical when hundreds of accounts or thousands of transactions are involved. A step-by-step example of the more efficient approach to repeating your row labels is as follows:

i. In cell B3, enter the formula =B2.

ii. Copy cell B3 by pressing Ctrl+C.

iii. Highlight the range from B2 to the end of column B's data range.

iv. Press the F5 key to launch the GoTo dialog box.

v. Click the Special button, check the radio button labeled Blanks, and then click OK.

vi. Paste the data by pressing Ctrl+V. Your account descriptions are now repeated for every transaction row.

vii. Next, convert these repeated labels from formulas to text by selecting Column B (in this example), copying the entire column by pressing Ctrl+C, and then pasting the copied data as values to the same location by selecting the Home tab's Paste Special, Paste Values option. (Note: Without this extra step, you will end up with formulas instead of values in column B, which are not suitable for pivoting.)

d. Remove all unnecessary columns such as blank columns and columns containing information not pertinent to your final PivotTable. This is a straightforward procedure that involves selecting and deleting each unwanted column.

e. The process of removing unnecessary rows is more involved than removing columns because general ledger reports tend to have only a dozen or so columns, but they may contain tens of thousands or even hundreds of thousands of rows. This larger volume of rows makes deleting the unnecessary rows one at a time impractical. Instead, you will want to sort the data and delete the unwanted rows in groups, examples of which are discussed below.

(Note: As an option, before you sort your data, it may be a good idea to insert a new blank column A and then number the data's rows sequentially. This will enable you to re-sort the data to their original general ledger order, should you later desire to do so.)

Sort the data according to the column that contains your total row labels (column B in this example); this will group all of the totaling rows together. Select all of the totaling rows and delete them as a group (see the screen shot "Deleting Totaling Rows as a Group").

Deleting totaling rows as a group

totaling-rows


Sort the data according to the column containing your amounts (column H in this example), which will group the rows that contain no values (or zero values) together. Scroll to those rows with no values and select and delete them (as suggested in the simplified example in the screen shot "Deleting Rows With No Values").

Deleting rows with no values

deleting-rows


STEP 3: PIVOT YOUR GENERAL LEDGER DATA

Once the data are pivot-ready, pivot the general ledger data by selecting a single cell in your data range and from the Insert tab, select PivotTable, OK. Place checkmarks next to the Data Fields you want included in your PivotTable, and format the results as desired. For purposes of this article, I am using a general ledger containing 20 years' worth of fictitious QuickBooks data, consisting of 44,479 expense transactions. (While pivoting your general ledger for a single year is probably the norm, I am including 20 years of general ledger data in this example to demonstrate the added dimension of analyzing multiple years.) I exported this general ledger report to Excel, applied the data-scrubbing procedures described above, and pivoted the results by selecting the following data fields: Accounts, Amounts, and Dates. This process produced a PivotTable summarizing amounts with each unique transaction date displayed as a column.

Because daily columns are not very meaningful, I right-clicked a random date in the PivotTable's row labels, selected Group, and then selected the options to group the data by Years (instead of Days) and clicked OK. As you can see, the resulting PivotTable report becomes far more readable and more meaningful, as shown in the screen shot "Grouping Data by Year."

With the general ledger data organized as a PivotTable, the data can now be sorted by any column, any column can be filtered, and any value can be drilled (by double-clicking that value) to reveal the underlying transactions.

Grouping data by year

data-by-year


STEP 4: FILTERING YOUR PIVOTTABLE

One reason for pivoting your general ledger is so the data can be easily filtered. As an example, to review all of the insurance expenditures over the past 20 years in this general ledger, I clicked the Filter button and entered the phrase Insurance.

The resulting filtered report summarizing all accounts containing the phrase Insurance is shown in the screen shot "Sorting Insurance Expenditures by Year."

Sorting insurance expenditures by year

sorting-expenditures


In the event that the report you want is unavailable via filtering, you can cherry-pick the accounts you want. As an example, I used the PivotTable label filter to cherry-pick the electricity, gas, water, garbage, phone, and internet to focus on utility expenditures. The resulting chart is shown in the screen shot "Sorting Utility Payments."

Sorting utility payments

utility-payments


It is also possible to filter a PivotTable using Excel's Slicer tools. As an example, the screen shot "Filtering Tax Expenditures by Year" shows Slicers being used to create an analysis of tax expenditures for a select number of years. This analysis was created by clicking a cell in the PivotTable to select it, then from the Insert tab, the Slicer menu option was selected. In the resulting Insert Slicers dialog box, the Date and Account fields were selected (which inserts date and account slicers on the worksheet). These slicer boxes were then used to filter the general ledger data to display years 2000 through 2005, and only those accounts pertaining to taxes. The resulting filtered PivotTable is positioned at the top of the screen shot.

Filtering tax expenditures by year

tax-expenditures


STEP 5: CHARTING YOUR DATA

Once the general ledger data are organized in a PivotTable format and filtered into subsections, it's relatively easy to chart the results to help visualize the data and detect meaningful trends or anomalies. For example, the 3-D pie chart in the screen shot "Insurance Payments" depicts all insurance payments made over a 20-year period.

Insurance payments

insurance-payments


STEP 6: MINE THE DATA DEEPER

At this juncture, the PivotTable process described thus far better organizes your general ledger data, making them easier to sort, filter, and analyze. However, the best insights may come from adding data general ledger fields to your PivotTable. For example, the QuickBooks general ledger includes additional columns for subaccounts, payment types, payees, and transaction memos. Upon generating the PivotTable, these additional columns become available data fields that can be included in the PivotTable report for added dimensions and additional information.

To illustrate the process of deeper analysis, I pivoted my actual general ledger for the past 20 years and added the QuickBooks Memo data field to the mix. (In QuickBooks, I've used the Memo field to record various transactions' details including the names of hotels, gas stations, restaurants, grocery stores, etc.) I then analyzed my family's expenses for the following categories: educational expenses; insurance payments; charitable gifts; IRA, SEP, and savings contributions; and federal, state, and local taxes. While these reports were very insightful, the data were considered too private for inclusion in this article, so I prepared the following analysis of restaurant expenditures to illustrate the type of detailed reports you might extract from your general ledger data. I started by filtering my general ledger PivotTable to display only the Meals - Restaurants account, and then added the general ledger's Memo field to the PivotTable. The results revealed that I have dined out 3,728 times at 609 restaurants over the past 20 years, which works out to 3.58 times per week. Further, I paid a total of $92,853 for those meals, or an average of $24.91 per meal.

I then sorted the PivotTable by total dollars spent at each establishment and charted the top 20 restaurants using Excel 2016's new Treemap chart format, and then added restaurant logos. The resulting chart is shown in the screen shot "Treemap of Restaurant Expenditures." You can also see how to create a Treemap chart in the video "How to Create an Excel Treemap," available below.

Treemap of restaurant expenditures

treemap


Though the chart's labels are not very legible in this article's small graphic image, the full-size version of this chart reports the total dollars spent at each establishment along with the total number of visits, and the size of each box represents the relative amount of money spent at that restaurant compared with the other 19 restaurants (i.e., Sonny's BBQ topped this chart, raking in a total of $5,162 in purchases from 219 visits, while El Torero was the most frequently visited restaurant by the Collins family with 220 visits, receiving $5,002).

I was also able to calculate that 68 of the 609 restaurants (11.12%) cost me more than $100 per meal on average, including 17 meals at Ruth's Chris Steak House (costing $109.18 per meal), four meals at Bennie's Red Barn (costing $142.44 per meal), and three meals at Little Tokyo (costing $150.95 per meal), as examples. It is important to keep in mind that these restaurant data are incomplete in several respects. For example, these data only include meals that I paid for; I've eaten at many other restaurants where I did not pay, so those details are left out of the equation. Further, the per meal costs I mentioned aren't necessarily comparable because the number of guests present varied and my general ledger memo field was not used to capture the size of the dinner parties. I point out these shortcomings to underscore the importance of capturing as much detail as possible when entering transactions into your accounting system. Failure to capture your transaction details now will undermine your attempts to data-mine your general ledger.

Admittedly, these restaurant details may not lead to any meaningful observations from a CPA's perspective; however, I believe these reports do help convey the type of detailed information that could be hibernating in your general ledger. As you can see by the examples, the process of organizing your general ledger into a PivotTable unlocks tools for organizing, slicing, dicing, drilling, and visualizing the data. When asked whether they've ever pivoted their general ledger data, many CPAs say it never occurred to them to try. However, once they've discovered this handy process, many CPAs report it's an efficient way to mine and analyze the data.

Armed with this knowledge, you may find that pivoting, sorting, filtering, and charting your general ledger reveals useful insights and important trends regarding your organization. You may also be prompted to expand your efforts to record more transaction details to bolster your data archive for future analysis.


Special instructions for companies with large general ledgers

Because each Excel worksheet has a limited number of rows, the solution described in this article works best with smaller general ledger reports (fewer than 1,048,576 rows) so the report will fit on a single Excel worksheet. If your general ledger has more rows than will fit on a single Excel worksheet, here are three possible workarounds, assuming your computer can handle the larger volume of data.

1. Pivot monthly general ledgers. You might consider exporting and pivoting your monthly general ledger reports individually, rather than pivoting the entire annual general ledger report. This approach would likely accommodate any company with an annual general ledger report that contains fewer than 12 million rows.

2. Use Excel's Data Model tool. Another alternative is to export each month's general ledger to separate worksheets and then pivot the multiple worksheets together using Excel's Data Model tool. To use this functionality, when creating a PivotTable, check the Add this data to the Data Model box in the Create PivotTable dialog box as pictured below.

create-pivottable


This Data Model approach enables you to export portions of your large general ledger a million records at a time onto individual worksheets and then pivot all worksheets together into one PivotTable.

Caveat: Working with millions of rows of data requires a powerful computer with many microprocessors and a great deal of RAM. As a test, I created two Excel files—one containing two and another containing 12 monthly general ledgers with 1 million records each—and then used the Excel Data Model to pivot them together on my $1,149 Dell XPS 8900 Signature Edition, Intel Core i7-6700 CPU 4.00 GHz computer with 16 GBs of RAM and eight microprocessors. This process worked satisfactorily with two monthly general ledgers, but it crashed when I attempted to pivot the 12 monthly general ledgers (see the screen shot below).

memory

Many factors affect how a computer uses memory, so it is difficult to pinpoint exactly how many CPUs and how much memory would be needed to quickly pivot a general ledger containing 12 million transactions.

3. PowerPivot. Beginning with Excel 2010, Microsoft offers the free downloadable PowerPivot tool (available at msdn.microsoft.com, which enables users to quickly pivot millions of rows of data. Like Excel's PivotTable tool, PowerPivot also creates a PivotTable but uses a newer and more powerful engine capable of handling and crunching much more data at a much faster speed. For example, once you have imported and cleaned your 12 monthly general ledgers into separate worksheets, PowerPivot could be used to pivot those worksheets and produce a workable PivotTable report in Excel. Best of all, PowerPivot is much faster and operates well on typical computers, such as mine.


About the author

J. Carlton Collins (carlton@asaresearch.com) is a technology consultant, a conference speaker, and a JofA contributing editor.

Note

Instructions for Microsoft Excel in this article refer to the 2007 through 2016 versions, unless otherwise specified.

To comment on this article or to suggest an idea for another article, contact Jeff Drew, senior editor, at jdrew@aicpa.org or 919-402-4056.

SPONSORED REPORT

CPEOs provide peace of mind around payroll services

The creation of these new IRS-certified service providers for small businesses clarifies some issues around traditional professional employer organizations.

QUIZ

Pronoun practice to help polish your prose

Using pronouns correctly in writing and speech can help you make a good impression. Try our 10-question quiz.