One of the most powerful
features of Excel is the array—a formula designed
to act simultaneously on sets of two or more
values in order to calculate other values. Yet,
because arrays appear to be forbidding, few CPAs
use them. This article is designed to dispel
arrays’ bad reputation and demonstrate how they
can speed and simplify your work while making it
less prone to errors. So get ready to overcome
your bias against arrays. We’ll begin with
the most basic array formula, and as we move—step
by step—to more complicated ones, you’ll see how
powerful arrays can get. To make it easier for you
to follow along, download an Excel file here.
The file contains two versions of each worksheet.
One worksheet in each set has blank cells in which
you can practice entering the arrays and other
formulas mentioned in this article, while the
other has all the cells already completed.
AVERAGE IT Accountants
often need to tightly summarize data. Exhibit 1
uses a one-dimensional array formula on payroll
information to calculate the average pay of each
employee and the global average of all employees.
(In your downloaded file, see the Average
It worksheet).
Here’s how we did it: To calculate the
average per employee, select the range G3:G7 and
type this formula:
=(B3:B7+C3:C7+D3:D7+E3:E7+F3:F7)/5 Then
press Ctrl+Shift+Enter, which does two things: It
automatically places curly brackets—{ }—around the
formula, labeling it an array formula, and
simultaneously triggers the array calculation. The
array formula now exists in G3 to G7 and cannot be
changed except by rewriting the entire formula.
To calculate the average pay per month,
select the range B8:G8 and type in this formula:
=(B3:G3+B4:G4+B5:G5+B6:G6+B7:G7)/5
Then press Ctrl+Shift+Enter. The global average
for all employees is now in cell G8.
RANK IT Creating
two-dimensional arrays is slightly more
challenging. Consider again the payroll data of
Exhibit 1. This time we want to rank the paychecks
by size. To do that, copy the list of names (as
shown in the lower half of Exhibit 2) and type
this formula: =RANK(B3:F7, B3:F7)
Press Ctrl+Shift+Enter, and presto, the data
are ranked—a task that would be far more difficult
without arrays.
ANALYZE IT Arrays also are
useful when performing analyses that impose
conditions upon mathematical operations. For
example, say you have a spreadsheet loaded with
sales data and you want to see various subsets of
the data based on ranges of products’ prices and
quantity (see Exhibit 3 ). For this
calculation we will use a single-cell array. (See
the worksheet Single Cell in your
downloaded file.) If we want to know the
sum of range B3:B7 by range C3:C7, typically we
could create D3:D7 and place the sum in cell D8.
However, an array formula in D9 would do the job
in one pass: {=SUM(B3:B7*C3:C7)} Table 1
| Accounting
question of interest | Cell |
Formula | Sum (using
arrays) | D9 |
{=SUM(B3:B7*C3:C7)} |
Sum (if greater than the average
price) | D10 |
{=SUM((C3:C7>AVERAGE(C3:C7))*B3:B7*C3:C7)}
| Sum (if greater than the
average quantity) | D11 |
{=SUM((B3:B7>AVERAGE(B3:B7))*B3:B7*C3:C7)}
| Sum (if greater than the
average price and greater than
theaverage quantity) | D12 |
{=SUM((C3:C7>AVERAGE(C3:C7))*B3:B7*(B3:B7>
AVERAGE(B3:B7))*C3:C7)} |
Sum (if greater than the average
price or greater than the average
quantity) | D13 |
{=SUM(((C3:C7>AVERAGE(C3:C7))+(B3:B7>
AVERAGE(B3:B7))>0)*B3:B7*C3:C7)}
| That
simple formula extracts all the information from
the underlying cells without the usual sum
formulas in D3:D8. The grayed area at the bottom
of Exhibit 3 (D9:D13) contains various array
formulas to compute numerous values of interest to
CPAs. Table 1 lists many of the typical ways CPAs
are called upon to manipulate such data and the
array formulas that perform each of those
calculations.
Advisory: Often CPAs need to know, and
perhaps to explain to clients or executives who
are not handy in Excel, how certain spreadsheet
numbers are derived (see screenshot below). You
can display this information easily by clicking on
Tools, Formula
Auditing, Evaluate
Formula. This allows you to step through
the calculations, first seeing cell references and
then the numbers those references represent.
CALCULATE THE CONSTANTS
Arrays often need to import constants,
including useful explanatory information such as
dates, names and numbers. This comes up, for
example, when you need to include transaction
fees, such as sales taxes and shipping charges, in
formulas. Exhibit 4 shows invoices both before the
application of a 4% sales tax and a 1% shipping
fee (column D) and after (column G). See the
downloaded file’s Single Cell
worksheet.
This kind of array has to be created and then
recalled by way of a drop-down menu, which is
evoked by clicking on Insert,
Name, Define
(see Exhibit 5). Under Names in
workbook, add the designation of a name
such as StandardCharges, and in the Refers
to field, add the specification of the
percentage rates, such as = {0.04,0.01}. Both of
these values could have been stored in separate
cells on the spreadsheet. But often these values
(constants) should be hidden or made inaccessible
so the employees cannot easily change them.
Place the formula in the affected cells (in
this case E3:E7 and F3:F7) that recall the
percentage rates and apply them to the base
amounts. For example, the sales taxes require this
formula to be created as an array:
{=D3:D7*INDEX(StandardCharges,1)} The
shipping charge array requires selecting the
second value from StandardCharges:
{=D3:D7*INDEX(StandardCharges,2)} Let’s
look at three typical calculations CPAs face. For
each, refer to Exhibit 6 and your
downloaded worksheet Calculate
Constant. Once again, use the
Evaluate Formula tool to
observe what Excel is doing for each of the
following examples.
SUMMING BASED ON A CONDITION
When you need to sum values based
on one or more conditions, the array formula (in
D9) is:
=SUM(IF((D3:D7>=1000)*(D3:D7<1700),D3:D7))
This formula returns the sum of all cells in
the range D3:D7 where the value is greater than or
equal to 1,000 and less than 1,700. You can
include more conditions in the array formula if
necessary.
SUMMING THE n LARGEST VALUES IN A
RANGE Cell D10 contains this
formula:
=SUM(LARGE(D3:D7,ROW(INDIRECT("1:2"))))
This formula returns the sum of the two
largest values in the range D3:D7. LARGE
is an Excel function that will be
evaluated twice, each time with a different second
argument (that is, 1, 2). Also, if you wish to sum
the n smallest values in a range, use the Excel
SMALL function instead of the
LARGE function.
COMPUTING AN AVERAGE THAT INCLUDES
ZEROS Often the use of averaging
commands is distorted by the presence of zeros in
the data. Zero may indicate that the transaction
is a different type, and that the calculation is
not relevant. In many situations we wish to ignore
zero values when determining an average. In D11 (
Exhibit 6) the array formula
=AVERAGE(IF(D3:D7<>0,D3:D7)) a verages all
nonzero values in the range D3:D7. If you had used
the Excel function =AVERAGE(D3:D7) , you would
have received the value 1,133.07 rather than
1,416.34.
COUNTING THE NUMBER OF DIFFERENCES IN TWO
RANGES Arrays are exceptionally
helpful in ascertaining difference conditions that
might have significance. That’s especially true
when performing internal controls, where
consistency and agreement are essential. In
Exhibit 7, cell E13 contains an array formula that
compares corresponding values in two ranges
(D3:D11 and E3:E11) and determines the number of
differences that are greater than a specific
value. The specific value is in I13 ($2.00); think
of this as a materiality threshold on the
discrepancy, to separate out those that are
trivial or caused by predictable events. If there
are no differences between the two ranges, the
formula will return 0. When entering the data for
this formula, both ranges must be the same size.
(Refer to the downloaded file’s
Differences worksheet.)
IDENTIFY THE CELLS The
previous array formula returned the number of
differences, but we also may want an itemized
listing of those differences. To do that in the
range F15:F23 enter this array formula:
=LARGE($E$3:$E$11-$D$3:$D$11,ROW(INDIRECT("1:"&$E$13)))
The formula will return the six cells (see
cell E13) that are of interest to us because their
hourly rate increased by more than $2. To list in
ascending order the six employees whose hourly
rate has increased by more than $2 per hour, enter
the following formula in E15 and copy it into the
range E16:E23:
=INDEX($B$3:$B$11,MATCH(F15,($E$3:$E$11)-($D$3:$D$11),0))
RETURNING THE LOCATION OF THE MAXIMUM
VALUE IN A RANGE To return the
row number of the maximum hourly rate in the range
E3:E11, enter this formula in F26:
=MIN(IF(E3:E11=MAX(E3:E11),ROW(E3:E11),""))
It would return the value 5. To return the
address of the maximum value, use this array
formula in G26:
=ADDRESS(MIN(IF(E3:E11=MAX(E3:E11),ROW(E3:E11),"")),COLUMN(E3:E11))
The value $E$5 will be returned. Obviously
this task can be accomplished by eyeballing the
data, but when you’re dealing with hundreds of
employees, Excel can do the job more effectively.
DETERMINING WHETHER A RANGE CONTAINS VALID
VALUES Many times we need to know
whether the values in one range are contained in
another. This could be used to uncover fraud based
on a “padded” payroll, for example. (Refer to the
downloaded file’s Padded Payrol l
worksheet.) We could check for that by determining
whether every employee receiving a check was also
in the master list of employee names. Exhibit 8
demonstrates four array formulas that could do the
job—that is, compare payroll names against a
master list.
Start by placing this formula in J2:
=ISNA(MATCH(TRUE,ISNA(MATCH($B$2:$B$12,$A$2:$A$10,0)),0))
It will return either TRUE or FALSE. In this
case it returns FALSE because there are names in
the range B2:B12 receiving payroll checks that are
not in the range A2:A10 (the master list of
employee names). Then, in J3 enter this
formula:
=SUM(1*ISNA(MATCH($B$2:$B$12,$A$2:$A$10,0)))
It returns the value 4—the number of names
receiving a payroll check that are not in the
master employee list. Next, to discover
who these people are, enter this formula in the
range C2:C12:
=SMALL(IF(ISNA(MATCH($B$2:$B$12,$A$2:$A$10,0)),
ROW($B$2:$B$12),""),ROW(INDIRECT("1:"&$J$3)))
The formula returns the array of row numbers
of concern. And then in D2:D12 is the
formula: =INDIRECT(ADDRESS(C2,2))
It returns the names on the payroll checks.
As you can see, CPAs can use array formulas
in many ways. The formulas do complex jobs in
step-saving ways and are able to cull material
that would otherwise consume long hours of
searching.
Paul Goldwater,
Ph.D., is an associate professor of accounting
at the University of Central Florida in Orlando,
Fla. Timothy Fogarty, CPA,
Ph.D., is an associate dean and a professor of
accounting at Case Western Reserve University in
Cleveland. Their e-mail addresses are
paul.goldwater@bus.ucf.edu
and
timothy.fogarty@case.edu
, respectively. |