The Power of Arrays

The Excel tool that performs multiple functions in a single step.
BY PAUL GOLDWATER AND TIMOTHY FOGARTY

  

 
 

One of the most powerful features of Excel is the array—a formula designed to act simultaneously on sets of two or more values in order to calculate other values. Yet, because arrays appear to be forbidding, few CPAs use them. This article is designed to dispel arrays’ bad reputation and demonstrate how they can speed and simplify your work while making it less prone to errors. So get ready to overcome your bias against arrays.

We’ll begin with the most basic array formula, and as we move—step by step—to more complicated ones, you’ll see how powerful arrays can get. To make it easier for you to follow along, download an Excel file here. The file contains two versions of each worksheet. One worksheet in each set has blank cells in which you can practice entering the arrays and other formulas mentioned in this article, while the other has all the cells already completed.

AVERAGE IT
Accountants often need to tightly summarize data. Exhibit 1 uses a one-dimensional array formula on payroll information to calculate the average pay of each employee and the global average of all employees. (In your downloaded file, see the Average It worksheet).

 

Here’s how we did it:

To calculate the average per employee, select the range G3:G7 and type this formula:

=(B3:B7+C3:C7+D3:D7+E3:E7+F3:F7)/5

Then press Ctrl+Shift+Enter, which does two things: It automatically places curly brackets—{ }—around the formula, labeling it an array formula, and simultaneously triggers the array calculation. The array formula now exists in G3 to G7 and cannot be changed except by rewriting the entire formula.

To calculate the average pay per month, select the range B8:G8 and type in this formula:

=(B3:G3+B4:G4+B5:G5+B6:G6+B7:G7)/5

Then press Ctrl+Shift+Enter. The global average for all employees is now in cell G8.

RANK IT
Creating two-dimensional arrays is slightly more challenging. Consider again the payroll data of Exhibit 1. This time we want to rank the paychecks by size. To do that, copy the list of names (as shown in the lower half of Exhibit 2) and type this formula:

=RANK(B3:F7, B3:F7)

Press Ctrl+Shift+Enter, and presto, the data are ranked—a task that would be far more difficult without arrays.

 

ANALYZE IT
Arrays also are useful when performing analyses that impose conditions upon mathematical operations. For example, say you have a spreadsheet loaded with sales data and you want to see various subsets of the data based on ranges of products’ prices and quantity (see Exhibit 3 ). For this calculation we will use a single-cell array. (See the worksheet Single Cell in your downloaded file.)

If we want to know the sum of range B3:B7 by range C3:C7, typically we could create D3:D7 and place the sum in cell D8. However, an array formula in D9 would do the job in one pass: {=SUM(B3:B7*C3:C7)}

Table 1
Accounting question of interest Cell Formula
Sum (using arrays) D9 {=SUM(B3:B7*C3:C7)}
Sum (if greater than the average price) D10 {=SUM((C3:C7>AVERAGE(C3:C7))*B3:B7*C3:C7)}
Sum (if greater than the average quantity) D11 {=SUM((B3:B7>AVERAGE(B3:B7))*B3:B7*C3:C7)}
Sum (if greater than the average price
and greater than theaverage quantity)
D12 {=SUM((C3:C7>AVERAGE(C3:C7))*B3:B7*(B3:B7>
AVERAGE(B3:B7))*C3:C7)}
Sum (if greater than the average price
or greater than the average quantity)
D13 {=SUM(((C3:C7>AVERAGE(C3:C7))+(B3:B7>
AVERAGE(B3:B7))>0)*B3:B7*C3:C7)}

That simple formula extracts all the information from the underlying cells without the usual sum formulas in D3:D8. The grayed area at the bottom of Exhibit 3 (D9:D13) contains various array formulas to compute numerous values of interest to CPAs. Table 1 lists many of the typical ways CPAs are called upon to manipulate such data and the array formulas that perform each of those calculations.

 

Advisory: Often CPAs need to know, and perhaps to explain to clients or executives who are not handy in Excel, how certain spreadsheet numbers are derived (see screenshot below). You can display this information easily by clicking on Tools, Formula Auditing, Evaluate Formula. This allows you to step through the calculations, first seeing cell references and then the numbers those references represent.

CALCULATE THE CONSTANTS
Arrays often need to import constants, including useful explanatory information such as dates, names and numbers. This comes up, for example, when you need to include transaction fees, such as sales taxes and shipping charges, in formulas. Exhibit 4 shows invoices both before the application of a 4% sales tax and a 1% shipping fee (column D) and after (column G). See the downloaded file’s Single Cell worksheet.

 

This kind of array has to be created and then recalled by way of a drop-down menu, which is evoked by clicking on Insert, Name, Define (see Exhibit 5). Under Names in workbook, add the designation of a name such as StandardCharges, and in the Refers to field, add the specification of the percentage rates, such as = {0.04,0.01}. Both of these values could have been stored in separate cells on the spreadsheet. But often these values (constants) should be hidden or made inaccessible so the employees cannot easily change them.

 

Place the formula in the affected cells (in this case E3:E7 and F3:F7) that recall the percentage rates and apply them to the base amounts. For example, the sales taxes require this formula to be created as an array:

{=D3:D7*INDEX(StandardCharges,1)}

The shipping charge array requires selecting the second value from StandardCharges:

{=D3:D7*INDEX(StandardCharges,2)}

Let’s look at three typical calculations CPAs face. For each, refer to Exhibit 6 and your downloaded worksheet Calculate Constant. Once again, use the Evaluate Formula tool to observe what Excel is doing for each of the following examples.

 

SUMMING BASED ON A CONDITION
When you need to sum values based on one or more conditions, the array formula (in D9) is:

=SUM(IF((D3:D7>=1000)*(D3:D7<1700),D3:D7))

This formula returns the sum of all cells in the range D3:D7 where the value is greater than or equal to 1,000 and less than 1,700. You can include more conditions in the array formula if necessary.

SUMMING THE n LARGEST VALUES IN A RANGE
Cell D10 contains this formula:

=SUM(LARGE(D3:D7,ROW(INDIRECT("1:2"))))

This formula returns the sum of the two largest values in the range D3:D7. LARGE is an Excel function that will be evaluated twice, each time with a different second argument (that is, 1, 2). Also, if you wish to sum the n smallest values in a range, use the Excel SMALL function instead of the LARGE function.

COMPUTING AN AVERAGE THAT INCLUDES ZEROS
Often the use of averaging commands is distorted by the presence of zeros in the data. Zero may indicate that the transaction is a different type, and that the calculation is not relevant. In many situations we wish to ignore zero values when determining an average. In D11 ( Exhibit 6) the array formula =AVERAGE(IF(D3:D7<>0,D3:D7)) a verages all nonzero values in the range D3:D7. If you had used the Excel function =AVERAGE(D3:D7) , you would have received the value 1,133.07 rather than 1,416.34.

COUNTING THE NUMBER OF DIFFERENCES IN TWO RANGES
Arrays are exceptionally helpful in ascertaining difference conditions that might have significance. That’s especially true when performing internal controls, where consistency and agreement are essential. In Exhibit 7, cell E13 contains an array formula that compares corresponding values in two ranges (D3:D11 and E3:E11) and determines the number of differences that are greater than a specific value. The specific value is in I13 ($2.00); think of this as a materiality threshold on the discrepancy, to separate out those that are trivial or caused by predictable events. If there are no differences between the two ranges, the formula will return 0. When entering the data for this formula, both ranges must be the same size. (Refer to the downloaded file’s Differences worksheet.)

 

IDENTIFY THE CELLS
The previous array formula returned the number of differences, but we also may want an itemized listing of those differences. To do that in the range F15:F23 enter this array formula:

=LARGE($E$3:$E$11-$D$3:$D$11,ROW(INDIRECT("1:"&$E$13)))

The formula will return the six cells (see cell E13) that are of interest to us because their hourly rate increased by more than $2. To list in ascending order the six employees whose hourly rate has increased by more than $2 per hour, enter the following formula in E15 and copy it into the range E16:E23:

=INDEX($B$3:$B$11,MATCH(F15,($E$3:$E$11)-($D$3:$D$11),0))

RETURNING THE LOCATION OF THE MAXIMUM VALUE IN A RANGE
To return the row number of the maximum hourly rate in the range E3:E11, enter this formula in F26:

=MIN(IF(E3:E11=MAX(E3:E11),ROW(E3:E11),""))

It would return the value 5. To return the address of the maximum value, use this array formula in G26:

=ADDRESS(MIN(IF(E3:E11=MAX(E3:E11),ROW(E3:E11),"")),COLUMN(E3:E11))

The value $E$5 will be returned. Obviously this task can be accomplished by eyeballing the data, but when you’re dealing with hundreds of employees, Excel can do the job more effectively.

DETERMINING WHETHER A RANGE CONTAINS VALID VALUES
Many times we need to know whether the values in one range are contained in another. This could be used to uncover fraud based on a “padded” payroll, for example. (Refer to the downloaded file’s Padded Payrol l worksheet.) We could check for that by determining whether every employee receiving a check was also in the master list of employee names. Exhibit 8 demonstrates four array formulas that could do the job—that is, compare payroll names against a master list.

 

Start by placing this formula in J2:

=ISNA(MATCH(TRUE,ISNA(MATCH($B$2:$B$12,$A$2:$A$10,0)),0))

It will return either TRUE or FALSE. In this case it returns FALSE because there are names in the range B2:B12 receiving payroll checks that are not in the range A2:A10 (the master list of employee names).

Then, in J3 enter this formula:

=SUM(1*ISNA(MATCH($B$2:$B$12,$A$2:$A$10,0)))

It returns the value 4—the number of names receiving a payroll check that are not in the master employee list.

Next, to discover who these people are, enter this formula in the range C2:C12:

=SMALL(IF(ISNA(MATCH($B$2:$B$12,$A$2:$A$10,0)),
ROW($B$2:$B$12),""),ROW(INDIRECT("1:"&$J$3)))

The formula returns the array of row numbers of concern.

And then in D2:D12 is the formula:

=INDIRECT(ADDRESS(C2,2))

It returns the names on the payroll checks.

As you can see, CPAs can use array formulas in many ways. The formulas do complex jobs in step-saving ways and are able to cull material that would otherwise consume long hours of searching.

Paul Goldwater, Ph.D., is an associate professor of accounting at the University of Central Florida in Orlando, Fla. Timothy Fogarty, CPA, Ph.D., is an associate dean and a professor of accounting at Case Western Reserve University in Cleveland. Their e-mail addresses are paul.goldwater@bus.ucf.edu and timothy.fogarty@case.edu , respectively.

Where to find March’s flipbook issue

The Journal of Accountancy is now completely digital. 

 

 

 

SPONSORED REPORT

Get Clients Ready for Tax Season

This comprehensive report looks at the changes to the child tax credit, earned income tax credit, and child and dependent care credit caused by the expiration of provisions in the American Rescue Plan Act; the ability e-file more returns in the Form 1040 series; automobile mileage deductions; the alternative minimum tax; gift tax exemptions; strategies for accelerating or postponing income and deductions; and retirement and estate planning.