TECHNOLOGY Q&A
Creating an AI agent in ChatGPT
Q. I heard the new ChatGPT agent is really good with browsing the internet and working with spreadsheets. What is an agent, what can it do, and what can it do with spreadsheets that’s so noteworthy?
A. ChatGPT agent is a new agentic AI tool within the ChatGPT interface. Like other agentic AI systems, this tool can make decisions and perform complex tasks with little to no human help. Where generative AI — including ChatGPT’s basic chatbot interface — is content-focused, creating text, files, images, or video output, agentic AI is focused on autonomous action and results.
ChatGPT’s agent can set complex goals, create a plan to meet those goals, and adapt to changing circumstances. For example, the agent can take a user’s prompt and then open a web browser, click through website pages, and search for data to achieve the desired task. The agent can not only retrieve data, but also add, update, and delete data.
Agentic AI may be poised to change the way CPAs work, but it’s still AI. At its core, it is a statistical model that takes in text and images and calculates the most likely response based on its training data. It does not have the experience of a human.
With that in mind, what’s it like to work with an agent? Let’s walk through how I used a ChatGPT agent to create a focused attendee itinerary for the CPA.com Digital CPA Conference. To begin, I opened a new chat, clicked on the +, and switched to “Agent mode.”

The input interface changed to “Describe a task” and included some suggestions. I then used the following prompt and clicked the up arrow to start the agent.
“I will be attending Digital CPA in the winter of 2025. I need help creating a schedule of sessions which I want to attend. The topics I am interested in are Artificial Intelligence and Client Advisory Services, which can be shortened to CAS. Here is a link to the agenda: https://www.cpa.com/digital-cpa/agenda
Create a spreadsheet of an agenda for me with the name of the session, the start time and the name of the speaker.”

The agent then acknowledged the task and began by creating a browser session. I watched it open the Digital CPA website and navigate around. Because the agent tests different actions to determine which work best for the prompt, different tasks take different lengths of time. This run, for example, took 22 minutes to complete. Following are screenshots of the agent running and a brief explanation of what’s happening in each.

In the screenshot above, the agent acknowledged the task and prepared its digital workspace. The agent wrote the message shown in the black box in the center of the screenshot.

The agent next determined how it could filter the sessions to only CAS and AI topics. The black box shows the agent explaining the action it is trying to take.

The agent then tested to find the speaker information from the session. It recognized that the “More” button might give it more information (see above).

The agent then looked for the “Less” button to hide the extra information. There is some lag in how quickly the agent works, so it has not noticed that “Less” is already on the screen above.

When the agent reached the bottom of the screen, it attempted to grab the scroll bar and drag it up. The agent then recognized that the screen did not move. After two failed attempts to scroll up, it instead used its internal keyboard to press the “Home” key and jump to the top of the webpage.
After gathering all the data, the agent displayed the results of its work. The screenshot below shows the time and a couple of rows of the schedule.

The agent took some interesting liberties with the task. In focusing on the AI and CAS topics, it skipped one of the keynotes because the keynote’s description did not include CAS or AI. It also converted all session times from their local Eastern Standard Time (EST) to Pacific Standard Time (PST) because of my location. Since I am attending this event in person, I wanted sessions to be listed in the local time zone.
Before I downloaded the Excel file of the schedule provided with the agent’s response, I wanted it to change the times back to EST. I used a simple prompt:
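The skipped keynote is what a plain keyword match would produce. The Python sketch below is only an illustration of that behavior, not a description of how the agent works internally. The session rows are hypothetical samples, not entries from the actual agenda, and the names `TOPIC_PATTERN` and `matches_topics` are invented for this example.

```python
import re

# Hypothetical sample rows; these are NOT sessions from the real agenda.
sessions = [
    {"title": "Opening Keynote",
     "description": "The future of the profession."},
    {"title": "Scaling a CAS Practice",
     "description": "Pricing and staffing for client advisory services."},
    {"title": "AI in the Audit",
     "description": "Practical uses of artificial intelligence."},
    {"title": "Retail Tax Update",
     "description": "Changes affecting retail clients."},
]

# Word boundaries (\b) keep "AI" from matching inside words such as "Retail".
TOPIC_PATTERN = re.compile(
    r"\b(AI|CAS|artificial intelligence|client advisory services)\b",
    re.IGNORECASE,
)


def matches_topics(session):
    """Return True when the title or description mentions a topic keyword."""
    text = f"{session['title']} {session['description']}"
    return bool(TOPIC_PATTERN.search(text))


selected = [s["title"] for s in sessions if matches_topics(s)]
skipped = [s["title"] for s in sessions if not matches_topics(s)]

print("Selected:", selected)
print("Skipped:", skipped)
```

A keynote whose description never mentions the keywords lands in the skipped list, so a list like that is worth reviewing by hand before relying on the schedule.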
“Change all the start times to be EST.”
The agent then modified the times in the spreadsheet to be EST, as shown in the screenshot below.
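For readers who would rather check the converted times themselves, here is a minimal Python sketch of the same PST-to-EST adjustment. It assumes fixed standard-time offsets, which is reasonable for a winter event, and the date and time shown are hypothetical examples rather than values from the agenda.

```python
from datetime import datetime, timedelta, timezone

# Fixed standard-time offsets (an assumption that holds in winter).
# For dates that may fall in daylight saving time, zoneinfo.ZoneInfo
# with "America/Los_Angeles" and "America/New_York" is the safer choice.
PST = timezone(timedelta(hours=-8), "PST")
EST = timezone(timedelta(hours=-5), "EST")


def pacific_to_eastern(date_text, time_text):
    """Interpret a date and time as PST and return the same instant in EST."""
    naive = datetime.strptime(f"{date_text} {time_text}", "%Y-%m-%d %I:%M %p")
    return naive.replace(tzinfo=PST).astimezone(EST)


# Hypothetical session start; not taken from the real agenda.
converted = pacific_to_eastern("2025-12-08", "6:30 AM")
label = converted.strftime("%I:%M %p").lstrip("0")
print(label, converted.tzname())
```

Eastern time runs three hours ahead of Pacific time, so each start time in the corrected spreadsheet should be three hours later than in the first version.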

Agents are still in the early stages of development, so there are some potential risks to keep in mind.
For one, agents seem to be inconsistent. I tried the same prompt twice, and the two runs selected somewhat different sessions.
An agent can also misclick in the web browser, creating a risk that it clicks on something I don’t want it to. I would not want, for example, an agent to click on a link that takes it to any system where it could modify data. Agents are not accurate or consistent enough to be trusted to make changes.
I can take control of the agent’s web session if an issue comes up. If, for example, the agent gets stuck, I can step in and help it past that point. One use case is researching information that sits behind a login: I can log in myself, and the agent can then continue gathering information.
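One way to spot this kind of variation is to compare the session lists from two runs. The sketch below uses Python sets. The session titles are hypothetical placeholders, and in practice they would come from the spreadsheets the agent produced.

```python
# Hypothetical session titles from two runs of the same prompt.
run_one = {"AI in the Audit", "Scaling a CAS Practice", "CAS Pricing Workshop"}
run_two = {"AI in the Audit", "Scaling a CAS Practice", "AI Governance Panel"}

# Set operations show where the runs agree and where they differ.
in_both = sorted(run_one & run_two)
only_in_first = sorted(run_one - run_two)
only_in_second = sorted(run_two - run_one)

print("In both runs:", in_both)
print("Only in run 1:", only_in_first)
print("Only in run 2:", only_in_second)
```

Sessions that appear in only one run are the ones to check against the agenda by hand.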
About the author
Wesley Hartman is the founder of Automata Practice Development.
Submit a question
Do you have technology questions for this column? Or, after reading an answer, do you have a better solution? Send them to jofatech@aicpa.org.
