The XML Files

In the 21st century economy, XML is the way you will move information.


XML IS A NEW MARKUP LANGUAGE, a relative of the Web’s HTML, that codes all information, making it accessible to many users across virtually all programs and platforms.

XML TELLS YOU WHAT the information is, such as a customer name in a purchase order, and how it should be presented, such as in print or on a Web site.

CPAs HAVE TO MOVE and manipulate vast amounts of data. XML can automate much of the rote work by allowing different systems to speak to each other, freeing CPAs for more highly valued analytical work.

XML IS NOT THE PROPERTY of any one company; it is an open language available to everyone. Any accounting software vendor can incorporate XML and XML standards.

XML USE INCLUDES the automation and simplification of audit schedules, the elimination of cumbersome and expensive accounts payable systems and the creation of universal, easy-to-implement e-commerce solutions.

THE NEED TODAY IS for standardization of the coding process, so all financial report information, for example, can be easily understood no matter which program or platform you use.

CHARLES HOFFMAN, CPA, is a Tacoma-based information systems consultant specializing in accounting-related solutions. He was the 1997 winner of the AICPA Innovative User of Technology Award. His e-mail is . CHRISTOPHER KURT is a managing associate at PricewaterhouseCoopers’ consumer and industrial systems practice. His e-mail is . RICHARD J. KORETO is senior news editor of the Journal. Mr. Koreto is an employee of the American Institute of CPAs and his views, as expressed in this article, do not necessarily reflect the views of the AICPA. Official positions are determined through certain specific committee procedures, due process and deliberation.

Imagine you could give another dimension to your tax records, audit work-papers, payroll system, financial statements: anything CPAs work with. Imagine you could give each electronic record, each unit of information in your office, a label, or tag, that would explain what the data mean to whoever wanted to use that information, whether a person or a computer program. "Jane Doe" would not be just a name but a person identified as a corporate client in Wichita; "$322.28" would be labeled as an accounts payable item to Acme Office Supplies. Even if the tags were in plain English your computer system would understand them. Imagine your accounting software (all accounting software) could add tags automatically and use a standard method and standard tags, rather than proprietary methods and tags. No one could own the codes because they would be open, not a product of Microsoft or any other company. And as long as you're dreaming, make believe the software is free. Now stop imagining, because it's here. Extensible markup language (XML) may still be an unfamiliar computer language to many, but it's not science fiction and it's readily understandable by anyone who understands the Web. XML is in use today and is completely changing the way business information moves around the world.

XML is about having one universal way to exchange data, rather than hundreds of different ways. XML is both easy to understand and capable of the most sophisticated data-management tasks. Perhaps it's easiest to describe it by looking at the name itself. First, what is a markup language? Markup refers to its tags, or codes, which identify pieces of information (see "Markup Languages: A Genealogy," sidebar). Think of XML as a superintelligent version of hypertext markup language (HTML), the language of the Web. As an experiment, use your browser to reveal the HTML codes on a Web page. A word in boldface looks like this: <B>boldface</B>. A heading centered on a page looks like this: <center>Heading</center>. But HTML describes just presentation style. It can tell you that Jane Doe is a boldface, italic second-level head; it can't tell you whether it names a person or a company. XML can tell you all of this.
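To make the contrast concrete, here is a minimal sketch in Python, a language chosen only for illustration. The tag names `client`, `name` and `city` and the `type` attribute are hypothetical, not part of any published standard. The sketch shows that a program reading HTML-style markup learns only how the text looks, while XML-style markup says what the text is.

```python
import xml.etree.ElementTree as ET

# HTML-style markup: the tag describes appearance only.
html_fragment = "<B>Jane Doe</B>"

# XML-style markup: hypothetical tags describe what the data mean.
xml_fragment = """<client type="corporate">
  <name>Jane Doe</name>
  <city>Wichita</city>
</client>"""

bold = ET.fromstring(html_fragment)
client = ET.fromstring(xml_fragment)

# All the first fragment can tell a program is "this text is bold".
presentation_only = (bold.tag, bold.text)

# The second fragment identifies a corporate client in Wichita.
meaning = {
    "kind": client.get("type"),
    "name": client.findtext("name"),
    "city": client.findtext("city"),
}

print(presentation_only)
print(meaning)
```

The same name appears in both fragments; only the second lets software decide what to do with it.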

What is meant by calling this particular markup language "extensible"? HTML codes are fixed; you can't add new ones to suit your needs. But XML codes can be "extended." The number of different codes is limited only by your imagination. If the accounting profession were to agree on a set of codes for financial statements, every accounting software manufacturer could incorporate it. The result would be uniformly coded financial statements that users could share across all platforms. For example, consider two merging corporations using different programs for accounts payable. Today, this is a problem. But if each company used XML-based software that followed the same standard, the underlying tags would be the same, and the information would translate with little effort. That's the power of standards. (Think about e-mail today: Virtually any computer can send a readable e-mail to anyone in the world. Why? Because software manufacturers and ISPs adhere to widely agreed standards for Internet mail.) In addition, XML can take care of display in printed reports, Web sites and CD-ROMs. That is, the tags incorporate the information the computer needs to know for properly formatting the data for use in any digital or traditional print medium.
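The merger scenario can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a real accounting interface: the tags `payables`, `item`, `vendor` and `amount` are hypothetical, as is the second vendor name. Because both companies' files use the same tags, one small routine can read either of them.

```python
import xml.etree.ElementTree as ET

# Two companies' payables exports. Both use the same hypothetical tags,
# which is the whole point of agreeing on a standard.
company_a = """<payables>
  <item><vendor>Acme Office Supplies</vendor><amount>322.28</amount></item>
</payables>"""

company_b = """<payables>
  <item><vendor>Acme Office Supplies</vendor><amount>100.00</amount></item>
  <item><vendor>Wichita Paper Co.</vendor><amount>50.50</amount></item>
</payables>"""


def total_by_vendor(*documents):
    """Sum payable amounts per vendor across any number of XML documents."""
    totals = {}
    for document in documents:
        for item in ET.fromstring(document).iter("item"):
            vendor = item.findtext("vendor")
            # Work in whole cents so repeated additions do not drift.
            cents = round(float(item.findtext("amount")) * 100)
            totals[vendor] = totals.get(vendor, 0) + cents
    return totals


combined = total_by_vendor(company_a, company_b)
print(combined)
```

If the two companies had used different, proprietary layouts, each would need its own reader and a translation step before the totals could be combined.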

But XML is not about memorizing a series of tags, a purely clerical function. In fact, software manufacturers will likely make these tags a built-in part of their programs, so you don't have to remember what the tag for an accounts payable client is. (You may know how your car's transmission works, but you don't have to think about it as you drive to work every day.) Rather, CPAs should think of XML as a new tool that will help them more than ever be the information experts their companies and clients need. Today, financial information flows to the IRS in tax returns, and paper invoices flow through an accounting system. XML will help automate these processes in one standard way. That is, the first time a piece of information is entered—such as when a new customer buys a product—is the last time any person in your company has to manipulate it. The codes will identify that customer and her purchase in your accounts receivable system; your inventory system (which is connected via extranet to your supplier, who also has an XML-based system); and your tax database, so you know whether or not tax is to be added and how much. Ultimately that information makes its way into financial statements, in a format suitable for the outside auditors.

This means organizations will no longer need to squander enormous economic and human resources to simply translate information from one form to another. Accounting professionals will be free to focus on the value-added work that business managers truly want—analysis of business information. CPAs who start now to learn the basics of how XML works and what it can achieve will cement their positions as corporate knowledge managers.


At present, the Internet is merely an access medium—the conduit connecting millions of islands of information, creating a massive data repository. The recent explosion of Internet use is based partially on widely accepted standards such as the network protocol TCP/IP. But to reach its true potential, the Internet must evolve beyond an information access medium—it must facilitate comprehension by establishing and maintaining context between bits of data. That is, XML can help sort out the difference in your database between London, England; London, Canada; and London, Robert—a customer in New York.
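The London example lends itself to a short Python sketch. The tags used here (`records`, `office`, `customer`, `city`, `country`, `surname`, `given`) are invented for illustration. A plain text match finds three "Londons" it cannot tell apart; a tag-aware query keeps each one in context.

```python
import xml.etree.ElementTree as ET

records = ET.fromstring("""<records>
  <office><city>London</city><country>England</country></office>
  <office><city>London</city><country>Canada</country></office>
  <customer><surname>London</surname><given>Robert</given><city>New York</city></customer>
</records>""")

# A context-free match: three hits, with nothing to distinguish them
# except the tag each one happens to sit in.
plain_hits = [el.tag for el in records.iter() if el.text == "London"]

# Tag-aware queries: cities called London, listed by country.
cities_named_london = [
    office.findtext("country")
    for office in records.iter("office")
    if office.findtext("city") == "London"
]

# Tag-aware queries: customers whose surname is London.
customers_named_london = [
    c.findtext("given") + " " + c.findtext("surname")
    for c in records.iter("customer")
    if c.findtext("surname") == "London"
]

print(plain_hits, cities_named_london, customers_named_london)
```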

An "information-understanding" standard allows people and computers to easily search, sort, move, display, personalize, adapt and otherwise manipulate information while maintaining its context and internal relationships. XML's markup or data-tagging structure helps achieve this goal. Most business information exists in unstructured documents and proprietary databases. Although these data repositories are connected to the Internet, their relevant information is not necessarily easily accessible. Problems that XML can help solve include:

  • Internet searches that return hundreds of thousands of hits for a simple topic. XML-coded data in a search engine's database would allow users to clearly specify they're searching for information about olive oil, not a petroleum product and not Popeye's girlfriend.

  • The daunting task of integrating one application with another.

  • The distribution of documents in a variety of standards that may conflict with your tools. Again, if different manufacturers all use XML, competing products will still be able to talk to each other.

  • The need for multiple copies of the same information: one for your printed reports, one for your Web site and one for CD-ROMs. Today, it takes time-consuming manual effort to turn information in a printed report into a usable format for a digital medium, for example. But XML would automatically code the instructions for each format, saving time and reducing the human-error factor.

But of all XML's advantages, one stands above the rest: XML is an open standard, not a product of one vendor. It does not hold users hostage to one software company. (By comparison, the authors prepared this article on a Windows 95 platform—the property of Microsoft.) The more companies agree to a standard series of codes for a given application—like accounting—the more powerful XML becomes. And because everyone's product is better with XML features, there are incentives to cooperate.

Markup Languages: A Genealogy
Although barely a year old, XML is well founded theoretically and is based on extensive industry experience. XML is a derivative of standard generalized markup language (SGML), which has been a widely supported standard for 10 years. SGML is used extensively in document publishing but has proved too complex for wide adoption and use on the Web. HTML (hypertext markup language), the standard that is partially responsible for the explosion of the World Wide Web, is also a derivative of SGML. However, the simplicity and document display focus of HTML make its use somewhat limited. XML is the direct result of the knowledge gained by experiences with both SGML and HTML.

Markup languages have been around for years. They had their origins in the late 1960s, when IBM realized that it had many different types of systems that didn't talk to each other. The solution that IBM software engineers, led by Charles Goldfarb, came up with was generalized markup language (GML), a series of codes all systems could share. Over the years, that markup language has evolved into several different but still related languages:

Standard generalized markup language: SGML was developed between 1978 and 1986. It is a robust International Standard (ISO 8879) used by many document-intensive industries such as publishing, aerospace, heavy manufacturing and pharmaceuticals. SGML is very complex, but it is the de facto standard for the interchange of large, complex documents of all sorts.

Hypertext markup language: HTML was developed by Tim Berners-Lee beginning in 1989. He called his hypertext system the World Wide Web, which has become the most used service on the Internet. HTML is quick and easy to use. But as Web documents evolved from static display pages to interactive applications, the limitations of HTML became quickly apparent. HTML documents proved time consuming to maintain and could not support these requirements, and Web designers turned to other solutions.

Extensible markup language: XML solves many of the problems of SGML and HTML. The W3C (an international consortium that seeks to promote Web standards) released it as a formal standard in February 1998. Although compatible with SGML, it is far less complicated. XML can incorporate formatting codes, and it works well with a wide variety of computer programs and platforms. It's particularly suitable for the Web, can be viewed and edited with any text editor, and was designed to be readable by humans. As in Goldilocks and the Three Bears, it's the language that's not too hard, not too soft, but just right.


The pace of the evolution of new Internet-based technologies, including XML, is truly astonishing. The pace of these changes is often referred to as "Internet time." Internet time is a concept similar to "dog years," in which seven dog years—or Internet years—equate to one human year. (A 10-year-old refrigerator still does its job; a 10-year-old computer is a museum piece.) Right now, XML is already in use in many applications; it's just not visible to users. That is how well the technology can work. In a very short time—Internet time—it could be ubiquitous.

Already two consumer accounting systems use XML: Microsoft Money and Intuit's Quicken. Both of these packages support a standard called Open Financial Exchange (OFX). Microsoft, Intuit and CheckFree jointly developed OFX to facilitate financial transactions between consumers' personal accounting packages and their banks over the Internet. This is just one application—how about others? For example, when will seamless Internet-based systems transfer an invoice from your suppliers' accounts receivable systems directly into your accounts payable system and then from your accounts payable system into their banks? That's a good question for your accounting software vendor. Customers drive vendor requirements; if end-users start asking, vendors will deliver. This evolution of capabilities appears inevitable.

Another example of XML in use is the Open Applications Group, a nonprofit industry consortium that includes the world's leading business software companies. Among its members are IBM, PricewaterhouseCoopers, Oracle, Great Plains, PeopleSoft, QAD and SAP. This group has developed approximately 100 XML document type definitions (DTDs), the lists of codes in XML that define the data in business transactions. (A DTD may list that every customer name will be coded, or tagged, <CUSTNAME>. It may further mandate that <CUSTNAME> will be subordinate to, that is, nested within, a tag called <PURCHASE> that will indicate it's a name that's part of a purchase order. [See "XML: Behind the Scenes," sidebar.]) This organization is working to make business application integration easy and reliable, while reducing costs and implementation time. Does your accounting system conform to this integration specification? Again, ask your accounting software vendor.
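The nesting rule described in the parenthetical can be illustrated with a small Python check. This is a stand-in, not a real DTD: Python's standard library does not validate documents against DTDs, so the function below hand-codes the one rule that every <CUSTNAME> must sit directly inside a <PURCHASE>. The enclosing <ORDERS> tag in the failing example is hypothetical.

```python
import xml.etree.ElementTree as ET


def custname_is_nested_in_purchase(xml_text):
    """Return True if every <CUSTNAME> sits directly inside a <PURCHASE>."""
    root = ET.fromstring(xml_text)
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    names = list(root.iter("CUSTNAME"))
    # Require at least one name, and the right parent for each one found.
    return bool(names) and all(
        el in parent_of and parent_of[el].tag == "PURCHASE" for el in names
    )


good = "<PURCHASE><CUSTNAME>Jane Doe</CUSTNAME></PURCHASE>"
bad = "<ORDERS><CUSTNAME>Jane Doe</CUSTNAME></ORDERS>"

print(custname_is_nested_in_purchase(good))
print(custname_is_nested_in_purchase(bad))
```

A DTD states such rules once, in a shared file, so that every trading partner's software can apply the same checks.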

If you use products such as PointCast or information channels through your Web browser, you are using information structured in XML. Specialized channels provide information targeted for specific vertical industries. For example, PricewaterhouseCoopers publishes news updates for the Internet and telecommunications industry through PointCast. These channels are not limited to news but also include training and events. The PwC telecommunications channel also provides training and industry event schedules targeted at specific businesses. Other organizations are developing channels to provide up-to-the-minute information in additional areas. Already, XML applications are providing significant value to businesses by making information readily available to everyone.

Many other industries are not far behind in incorporating XML. For example, the health-care industry has an initiative under way to adopt XML to exchange information. Users of SGML, led by software companies such as XML design contributor ArborText, are making use of XML to make their vast collection of SGML documents available to users via the Internet.


CPAs can apply XML to many business problems: handling different file formats, application integration, business transactions, Internet technologies and knowledge management. These are diverse topics, but a common thread links them: the need for error-free conveyance of complex information, including its context and interrelationships. XML can provide a straightforward solution. Below are three examples of ways CPAs can use XML, but many others exist. As additional products with embedded XML capability come to market over the next few years, the uses for XML will multiply.

XML audit schedules. Imagine what auditing would be like if a client could transfer all required audit schedules to its auditor in an industry-standard format.

To test this concept, the CPA firm Knight, Vale & Gregory, working with a Great Plains developer, created a prototype. Information about the prototype is available at .

The prototype—a model of how XML might be used in an audit engagement—includes 10 electronic audit schedules developed using XML. Each schedule has a DTD that specifies which audit schedules are required and what information must be provided for each. For example, general ledger trial balance, accounts receivable trial balance and accounts payable trial balance schedules are necessary for the audit: The XML program will not work if this information is not provided. These 10 draft audit schedules have been submitted to the AICPA as a first draft of an industry standard intended for use throughout the accounting profession.
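The idea of a package that is rejected when a required schedule is missing can be sketched in Python. The prototype's actual DTDs are not reproduced in this article, so the tag names below (`AUDIT_PACKAGE`, `GL_TRIAL_BALANCE`, `AR_TRIAL_BALANCE`, `AP_TRIAL_BALANCE`) are hypothetical, and the check is hand-written rather than driven by a real DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names for three of the schedules mentioned above.
REQUIRED_SCHEDULES = ("GL_TRIAL_BALANCE", "AR_TRIAL_BALANCE", "AP_TRIAL_BALANCE")


def missing_schedules(xml_text):
    """List the required schedules that are absent from an audit package."""
    package = ET.fromstring(xml_text)
    present = {child.tag for child in package}
    return [name for name in REQUIRED_SCHEDULES if name not in present]


incomplete = """<AUDIT_PACKAGE>
  <GL_TRIAL_BALANCE/>
  <AR_TRIAL_BALANCE/>
</AUDIT_PACKAGE>"""

complete = """<AUDIT_PACKAGE>
  <GL_TRIAL_BALANCE/>
  <AR_TRIAL_BALANCE/>
  <AP_TRIAL_BALANCE/>
</AUDIT_PACKAGE>"""

print(missing_schedules(incomplete))
print(missing_schedules(complete))
```

An auditor's software could run a check like this on receipt and ask the client for the missing schedule before any fieldwork begins.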

Much like printing a report, an accounting system user can "print" information to an XML data file. Using the standard report writer that comes with Great Plains Dynamics, the developers created a report that "prints" XML to a text file, thus exporting accounting system data. The result is a text file containing properly formatted XML data. Remember, XML is "open," not a Great Plains product. Any accounting software with a report writer could do this.
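"Printing" to XML amounts to wrapping each field in a tag that names it. The Python sketch below is an assumption-laden illustration, not the Great Plains report itself: the ledger rows, the account numbers and the tags `GL_TRIAL_BALANCE` and `LINE` are all invented.

```python
import xml.etree.ElementTree as ET

# Stand-in for rows a report writer would pull from the ledger.
trial_balance = [
    {"account": "1000", "name": "Cash", "balance": "15000.00"},
    {"account": "2000", "name": "Accounts Payable", "balance": "-322.28"},
]


def to_xml(rows):
    """Build an XML string with one <LINE> element per ledger row."""
    root = ET.Element("GL_TRIAL_BALANCE")
    for row in rows:
        line = ET.SubElement(root, "LINE")
        for field, value in row.items():
            # Each field becomes a tag named after it, e.g. <BALANCE>.
            ET.SubElement(line, field.upper()).text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_xml(trial_balance)
print(xml_text)
```

Writing `xml_text` to a file produces the kind of plain text export described above, readable by a person and by any XML-aware program.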

What problems have these enhancements solved? First, data are transferred from client to auditor electronically rather than via a paper printout. Second, neither auditors nor clients have to understand the accounting system tables or data fields—the program automatically generates a standard format file. Third, the data provided to the auditors are in a familiar format, rather than a difficult-to-understand relational database specific to a particular application. Users can retrieve and manipulate the data files much more easily with XML.

There's more. To test the possibilities, Knight, Vale & Gregory developed a prototype auditing tool that used the accounting system data exported to this XML format. With this prototype tool, an auditor was able to go from raw data in the accounting system to determining the scope for accounts receivable testing and printed confirmations in 10 minutes. A demonstration of this is on the Web site.

These results were promising. In fact, thanks to standardization, it appears that proper use of XML can reduce audit costs significantly while increasing the amount of work auditors can accomplish. Transferring data electronically rather than manually re-keying them leaves less chance for human error in the data exchange process. XML even adds value to audit schedules: The schedule data may also be used for later analysis. In fact, as the next example shows, XML can help you easily migrate analysis information into any format you want.

XML financial reporting. The AICPA High Tech Task Force has created the first full set of XML-based financial statements. See these financial statements at . (Microsoft Internet Explorer 5.0 is required.) The benefits of XML extend further than expected: Reports other than financial statements would benefit from using XML, as seen in the following scenario.

Imagine getting an e-mail message from your bank with an XML representation of your bank statement attached. Rather than having to key "cleared check and deposit" information into your accounting system bank reconciliation software, you could simply import the attached XML file to update your files. If the information in the report is structured with XML tags, specific elements such as check numbers, amounts and transaction details can be processed automatically and easily transferred from one system to another.
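Here is a rough Python sketch of that import step. No bank statement standard is named in this article, so the tags `BANK_STATEMENT`, `CLEARED_CHECK`, `NUMBER` and `AMOUNT`, along with the check numbers and amounts, are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML statement attached to the bank's e-mail.
statement = ET.fromstring("""<BANK_STATEMENT>
  <CLEARED_CHECK><NUMBER>1041</NUMBER><AMOUNT>322.28</AMOUNT></CLEARED_CHECK>
  <CLEARED_CHECK><NUMBER>1043</NUMBER><AMOUNT>75.00</AMOUNT></CLEARED_CHECK>
</BANK_STATEMENT>""")

# Checks recorded in the (hypothetical) accounting system: number -> amount.
issued_checks = {"1041": "322.28", "1042": "18.50", "1043": "75.00"}

# Read the cleared checks straight from the tags; nothing is re-keyed.
cleared = {
    check.findtext("NUMBER"): check.findtext("AMOUNT")
    for check in statement.iter("CLEARED_CHECK")
}

# Outstanding checks are those issued but not yet on the bank statement.
outstanding = sorted(set(issued_checks) - set(cleared))

# Flag any cleared check whose amount differs from the books.
mismatched = sorted(n for n, amt in cleared.items() if issued_checks.get(n) != amt)

print(outstanding, mismatched)
```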

Remember that XML lets you print the reports on paper, to a Web site, into an e-mail, to a screen or even onto a CD. You determine the final presentation format of the data by selecting a standard style sheet appropriate for the selected media. The style sheet can take advantage of the features of the specific media, such as hyperlinks on a Web site, to move from a summarized representation of data to the details. In the future, software companies may design and sell "off-the-shelf" style sheets for different purposes.
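The separation of data from presentation can be shown with a small Python sketch. It is only an analogy: the two functions below play the role of style sheets but are ordinary Python, not XSL, and the `REPORT`, `LINE`, `NAME` and `BALANCE` tags are invented. One XML document feeds two different outputs.

```python
import xml.etree.ElementTree as ET
from html import escape

report = ET.fromstring("""<REPORT>
  <LINE><NAME>Cash</NAME><BALANCE>15000.00</BALANCE></LINE>
  <LINE><NAME>Inventory</NAME><BALANCE>8200.00</BALANCE></LINE>
</REPORT>""")


def render_text(doc):
    """'Print' style: names left-aligned, balances right-aligned."""
    return "\n".join(
        f"{line.findtext('NAME'):<12}{line.findtext('BALANCE'):>10}"
        for line in doc.iter("LINE")
    )


def render_html(doc):
    """'Web' style: an HTML table built from the same data."""
    rows = "".join(
        f"<tr><td>{escape(line.findtext('NAME'))}</td>"
        f"<td>{escape(line.findtext('BALANCE'))}</td></tr>"
        for line in doc.iter("LINE")
    )
    return f"<table>{rows}</table>"


print(render_text(report))
print(render_html(report))
```

Changing the look of the report means swapping the rendering routine; the underlying data file is never touched.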

If all accounting system reports existed in an XML format, CPAs could easily adapt them for many other purposes. Although without XML you can of course print reports to a text file, the meaning of the data in the text file is not clear to computers, so it is hard to use those data. XML makes sure the data retain meaning.

XML electronic transactions. Until recently, cost and complexity often meant e-commerce was an option only for big companies. The wide array of transaction formats made implementing these solutions challenging. Different standards such as EDI transaction sets, ACH transactions, IRS proprietary formats, EDGAR formats, BAI, SWIFT, FSML and others work in different ways. A single XML-based standard for each of these transaction sets would allow online entrepreneurs to use a single set of generalized tools. Of course, many companies have a substantial investment in the current e-commerce technologies, such as EDI. In many cases, these systems are up and running adequately to meet current business needs. However, simplification initiatives such as XML-based EDI may spur smaller, more cost-sensitive companies to get into e-commerce.

The XML/EDI Group, a consortium of both traditional EDI vendors and XML solution providers, is working to establish a framework for a wide variety of e-commerce solutions based on XML. The approach it recommends accommodates a wide range of business needs and enables companies to share information across different platforms. The result will be significantly different from the traditional EDI model, which transmits relatively simple, inflexible information between trading partners. Companies today often implement EDI grudgingly, bullied into it by influential customers. However, XML can make this process much easier to implement, so more companies may start to enter into EDI-type relationships with enthusiasm.


Basically, XML provides the structure to let a single transaction contain complex information, such as processing rules for the data. It is flexible and extensible to meet changing business needs without requiring significant development rework. XML lets CPAs easily index and search the documents to locate specific information. Finally, users can read and update the information with a Web browser without additional processing. These capabilities will revolutionize how business partners exchange information as well as manage it internally.

For example, an office spends a lot of time manually processing each invoice. The work begins when Company A sends an invoice to Company B, which then keys it into its accounts payable system and cuts a check. Company A receives that check, keys it into its accounts receivable system and creates a deposit slip. It takes that deposit to the bank, where the bank processes the check and mails it back to Company B, which then has to reconcile its bank account. With XML, a coded electronic invoice will start the process and automatically interact with the recipient, bank and everyone's accounting systems seamlessly. One day, CPAs of the future will look back on the current system as evidence of our primitiveness.


CPAs can rise above today's methods and processes to see a vision of future capabilities. Sometimes it is difficult to believe that such advances are possible. But they do happen. When compact discs were introduced, LP records and cassettes disappeared from store shelves, as did record and tape players, almost overnight. Ironically, DVDs (digital video discs) may soon replace CDs. "The Future of Finance" (JofA, Aug. 95, page 47) predicted that the amount of time spent on transaction processing would shrink by 25% and that there would be a 40% increase in financial productivity. Technologies such as XML are making these predictions come true. In fact, the future may be here already: Microsoft Internet Explorer 5.0 supports XML and upcoming versions of Netscape Navigator likely will do so as well.

For More on XML...
Great Plains
Knight, Vale & Gregory
Open Applications Group
What Is

It is important that the accounting profession participate in the events that are going to change accounting—such as the development of XML—rather than wait for the changes to happen to the profession. These efforts could benefit all the members of our networked world. Do you want groups that may not even involve CPAs to set the standards for XML codes? Standards groups will need to communicate; otherwise, dissimilar standards serving only a few will result. CPAs should read the following call to arms:

  • Investigate the potential of XML. The AICPA High Tech Task Force has already started this. You can ask companies that make software you use—and recommend to others—if they will support XML in future versions.

  • Support the development of the required DTDs for financial reporting, auditing and business transactions to allow rapid, standard adoption of the technology within the accounting profession. The Open Applications Group has already begun DTD development for business transactions, and the High Tech Task Force has created draft standard audit schedules. See the list of URLs in the sidebar ("For More on XML"), which shows you where to get more details and how you can get involved in professionwide ventures.

  • If you do get involved with others supporting XML standards, use the credibility CPAs are known for to help facilitate rapid yet thoughtful standards for XML-based electronic transactions.

  • Encourage your company or clients to support standards organizations composed of technology leaders working together rather than vendor-driven initiatives that will serve only a few.

  • Learn new technologies such as XML to become leaders and help ensure that the CPA does not go the way of the blacksmith—again, see the URL list.

  • Involve your firm or company in R&D efforts—experiment with technologies such as XML to solve the accounting problems we must solve every day.

  • Learn to operate in Internet time.

Now that has become a household word and start-up Web companies run by billionaire twenty-somethings are giving way to the well-funded efforts of traditional media corporations, you may think the era of rapid change is over. Think again. It's just beginning. Technology continues to provide opportunities to significantly improve the way we do business.

Welcome to the ramparts of the information revolution.

XML: Behind the Scenes

How exactly does XML work? It consists of three basic document types: a main XML data file with the codes and two auxiliary files. The main file provides the raw data, the second includes structure information, and the third provides presentation instructions (how the data will look). These document types could be physically located in one file or in separate files, depending on your need. Think of each document as a container of information. That is, one file contains the data, another explains the organization of the data to the computer and the third governs how the data will look in a printed report, Web site or wherever the user wants to put them.

Here's a description of each document type:

XML data file: This contains marked-up information. (See the example below.) For example, a purchase order document contains all the information about that purchase order, such as the customer number, customer name, purchase order number and the purchase order line items that include the quantity ordered, an item name and a price.

DTD: A document type definition describes the structure of the data in the XML file. The DTD is optional, but it is very useful. It helps preparers properly format the data and also helps the recipient understand the data. Generally, industry groups create DTDs for the documents they transfer internally or between business partners. In fact, DTDs will support all information transferred between business partners. In some cases, everyone in the world could agree to a single DTD for such universal items as business cards or addresses. A preparer can create a DTD that essentially acts as a proofreader: Software that checks the XML file against the DTD will reject the file unless certain information is included.

XSL: A style sheet, or XSL document, provides formatting and presentation information for the data contained in an XML data file. Different style sheets can be used to conform to different media. For example, one style sheet formats documents for a printer, another for display on a Web site and a third for information on CD-ROM. Style sheets are optional. For example, they are generally unnecessary for transferring information from one computer system to another, since a computer doesn't care what the data look like.

What XML Looks Like: A Sample XML Data File
Note the tags in <angle brackets>. They identify John Doe as a <NAME> in a <CUSTOMER> record of a <PURCHASE_ORDER>. It's clear to you—and your computer. See how some tags are nested, subordinate to a more encompassing tag. This XML data file's DTD can insist that there be an entry for the <PRICE> tag, for example. Without this entry, the program won't run, allowing the preparer to catch a problem.
<?xml version="1.0"?>
<PURCHASE_ORDER>
  <CUSTOMER>
    <NAME>John Doe</NAME>
    <ADDRESS>PO Box 99</ADDRESS>
    <CITY>New York</CITY>
  </CUSTOMER>
  <ITEM>Blue Sled</ITEM>
  <ITEM>Red Wagon</ITEM>
</PURCHASE_ORDER>
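For readers who want to see what a program does with such a file, here is a minimal Python sketch. It embeds a well-formed version of the sample data using the <PURCHASE_ORDER>, <CUSTOMER> and <NAME> tags named in the text; the exact nesting is an assumption, and the <PRICE> check is a hand-written stand-in for a DTD rule, since Python's standard library does not validate against DTDs.

```python
import xml.etree.ElementTree as ET

# A well-formed version of the sample purchase order (nesting assumed).
purchase_order = ET.fromstring("""<PURCHASE_ORDER>
  <CUSTOMER>
    <NAME>John Doe</NAME>
    <ADDRESS>PO Box 99</ADDRESS>
    <CITY>New York</CITY>
  </CUSTOMER>
  <ITEM>Blue Sled</ITEM>
  <ITEM>Red Wagon</ITEM>
</PURCHASE_ORDER>""")

# The nested tags give a path straight to the customer's name.
customer_name = purchase_order.findtext("CUSTOMER/NAME")

# Every line item can be collected by its tag.
items = [item.text for item in purchase_order.iter("ITEM")]

# Stand-in for a DTD rule that requires a <PRICE> entry somewhere.
has_price = purchase_order.find(".//PRICE") is not None

print(customer_name, items, has_price)
```

Because this sample has no <PRICE> entry, the last check fails, which is exactly the kind of omission a DTD lets the preparer catch before the file is sent.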

