An AI agent is much more than an LLM.
Both can take the form of a chatbot, so the difference might not be obvious.
While an LLM can answer a question and do basic tasks, an agent can chain multiple steps together, reason about the problem, search the web, and take many other actions.

For example, upgrading from the free ChatGPT 3.5 to ChatGPT 4 upgrades the LLM, but it also adds the ability to search the web and review its own answers.
However, the details of these systems are mostly hidden…
I built Polony AI to dive into how such agents might work and why they can do what they can.
A Difficult Question
“Find the tallest mountain in Australia whose height in metres is a prime number.”
ChatGPT 3.5 gave this rambling and incorrect answer (2,209 is 47 × 47, so it is not prime):
The tallest mountain in Australia is Mount Kosciuszko… Mount Townsend, which is located… Mount Townsend’s height is approximately 2,209 meters… 2209 is indeed a prime number. Therefore, Mount Townsend fits the criteria…
Using Polony AI, which is powered only by GPT 3.5, we get:
Mt Northcote
whose height is 2,131 metres. That is certainly a tall mountain and, this time, the height really is a prime number.
We know LLMs have knowledge about the world, so why couldn’t GPT-3.5 get it right?
The question is too complex for a single pass. An LLM simply starts predicting words, which is the equivalent of saying the first thing that comes into your head. Answering this question takes more thought: it requires looking at the heights of many mountains and, one by one, assessing whether each number is prime.
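To make that one-by-one check concrete, here is a minimal sketch of the trial division that the plain LLM skipped. It is not part of Polony AI, and `smallest_factor` is a name made up for this illustration. It shows why 2,209 fails the test and 2,131 passes.

```python
def smallest_factor(n: int) -> int:
    """Return the smallest factor of n greater than 1 (n itself if n is prime)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    d = 2
    # Only divisors up to sqrt(n) need to be tried.
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


for height in (2209, 2131):
    factor = smallest_factor(height)
    verdict = "prime" if factor == height else f"not prime, divisible by {factor}"
    print(f"{height}: {verdict}")
# 2209: not prime, divisible by 47
# 2131: prime
```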
Introducing Polony AI…
Let’s look at the steps taken by Polony AI when asked the exact same question.
First, it wrote a plan of what to do:
1. Begin by searching for a list of the tallest mountains in Australia.
2. Write a Python program to filter out the mountains with prime number heights.
3. Implement a function to check if a number is prime.
4. Iterate through the list of mountains, check if their heights are prime, and store the tallest one.
5. Print out the name of the tallest mountain with a prime number height in meters.
You can see that taking these steps is a much more effective method than just having a guess. Remember, this is the exact same AI model that answered incorrectly before, but this time we’ve given it time to think and a few more abilities.
So, let’s see how it goes at executing this plan. After every action, Polony has the ability to choose between:
- Browsing the web
- Writing code
- Running code
- Submitting its answer
Each of these options is a procedure that may require more input and, once completed, produces some output or information that is added to what Polony already knows. This accumulated information guides every subsequent decision.
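The choose-act-remember cycle described above can be sketched roughly as follows. This is an assumption about the general shape of such a loop, not Polony AI's actual code: `run_agent`, `choose_action`, `handlers` and the action names are all hypothetical, and a scripted policy with canned handlers stands in for the LLM and the real tools.

```python
from typing import Callable

# Hypothetical action names mirroring the four options listed above.
ACTIONS = ("browse", "write_code", "run_code", "submit")


def run_agent(choose_action: Callable[[list[str]], str],
              handlers: dict[str, Callable[[list[str]], str]],
              max_steps: int = 10) -> list[str]:
    """Repeatedly pick an action, run it, and store its output in memory."""
    memory: list[str] = []
    for _ in range(max_steps):
        action = choose_action(memory)  # in a real agent this would be an LLM call
        if action not in ACTIONS:
            raise ValueError(f"unknown action: {action}")
        memory.append(f"{action}: {handlers[action](memory)}")
        if action == "submit":
            break
    return memory


# Stand-ins for demonstration only: a scripted policy and canned handlers.
script = iter(["browse", "write_code", "run_code", "submit"])
handlers = {
    "browse": lambda memory: "fetched a list of mountain heights",
    "write_code": lambda memory: "wrote a prime-filtering program",
    "run_code": lambda memory: "Mt Northcote",
    # Submit whatever the previous step produced.
    "submit": lambda memory: memory[-1].split(": ", 1)[1],
}

trace = run_agent(lambda memory: next(script), handlers)
for entry in trace:
    print(entry)
```

The `max_steps` cap is a design choice in this sketch: it stops the loop if the policy never chooses to submit.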

True to its plan, it first chose to browse the web.
Polony searched for “Tallest mountains in Australia” and, after looking at the top 10 Google results, decided to visit:
https://www.ga.gov.au/scientific-topics/national-location-information/landforms/highest-mountains
The information from this site was then stored in its memory. You can see it draw on this exact data in the next step.
From here it chose to write code and produced:
import math

# Function to check if a number is prime
def is_prime(n):
    if n <= 1:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    for i in range(3, int(math.sqrt(n)) + 1, 2):
        if n % i == 0:
            return False
    return True

# List of mountains and their heights (data obtained from the provided webpage)
mountains = [
    {"name": "Mount Kosciuszko", "height": 2228},
    {"name": "Mount Townsend", "height": 2209},
    {"name": "Mount Twynam", "height": 2195},
    {"name": "Rams Head", "height": 2190},
    {"name": "Unnamed peak on Etheridge Ridge", "height": 2180},
    {"name": "Rams Head North", "height": 2177},
    {"name": "Alice Rawson Peak", "height": 2160},
    {"name": "Unnamed peak south-west of Abbott Peak", "height": 2159},
    {"name": "Abbott Peak and Carruthers Peak", "height": 2145},
    {"name": "Mt Northcote", "height": 2131}
]

# Find the tallest mountain with a prime number height
tallest_mountain = {"name": "", "height": 0}
for mountain in mountains:
    if is_prime(mountain["height"]) and mountain["height"] > tallest_mountain["height"]:
        tallest_mountain = mountain

print(tallest_mountain["name"])
Not the prettiest code in the world, but it certainly works. It uses the data from the webpage, it checks for primes and indeed finds the largest one. We’ll deduct a mark here as it strays from the plan and forgets to print the height, something the question asker is probably interested in.
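For illustration, the missing detail is a small change to the final print. The snippet below is a hedged sketch rather than anything Polony produced: `describe` is a hypothetical helper, and the dictionary simply repeats the result found by the program.

```python
# The result found by the generated program, repeated here so the snippet runs alone.
tallest_mountain = {"name": "Mt Northcote", "height": 2131}


def describe(mountain: dict) -> str:
    """Format a mountain record as 'name (height m)'."""
    return f'{mountain["name"]} ({mountain["height"]} m)'


print(describe(tallest_mountain))
# Mt Northcote (2131 m)
```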
Next it chose to run the code, where we see the output
Mt Northcote
printed to the console. Satisfied with this, Polony chose the submit option.
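How the "running code" action captures console output is not shown in this post. One simple way to do it in Python, sketched here as an assumption, is to execute the source in a fresh namespace while redirecting standard output. `run_code` is a hypothetical name, and running untrusted, model-written code in-process like this is unsafe, so a real agent would want a sandbox or a separate process.

```python
import contextlib
import io


def run_code(source: str) -> str:
    """Execute Python source and return whatever it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # A fresh namespace per run keeps earlier programs from leaking state.
        exec(source, {"__name__": "__main__"})
    return buffer.getvalue().strip()


output = run_code('print("Mt Northcote")')
print(output)
# Mt Northcote
```

The captured string is what would be added to the agent's memory before it decides whether to submit.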
Where to from here?
Just by combining an LLM with quite simple software procedures, we can transform it from giving a best guess to thoughtfully completing a task. The improvements from here are quite clear: rather than a single system performing everything, the tasks can be delegated to specialised agents, e.g. an agent specifically trained to find and clean data from the web. A control agent can then delegate tasks and update the plan on the fly as new information comes in. We can also add many more possible actions, as well as better review processes to catch mistakes and stay aligned with the overall goal.
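As a rough sketch of the delegation idea only, a control agent could walk through a plan and hand each task to a named specialist, passing along what has been found so far. Everything here is hypothetical (`control_agent`, the role names, the canned specialists), and updating the plan on the fly is deliberately left out to keep the example short.

```python
from typing import Callable

Specialist = Callable[[str, list[str]], str]


def control_agent(plan: list[tuple[str, str]],
                  specialists: dict[str, Specialist]) -> list[str]:
    """Hand each (role, task) step to the matching specialist, in order."""
    findings: list[str] = []
    for role, task in plan:
        result = specialists[role](task, findings)
        findings.append(f"{role}: {result}")
    return findings


# Canned specialists for demonstration; real ones would wrap their own models and tools.
specialists = {
    "data": lambda task, findings: "10 peaks with heights",
    "coder": lambda task, findings: f"program using {len(findings)} earlier finding(s)",
}
plan = [("data", "find mountain heights"), ("coder", "filter prime heights")]

findings = control_agent(plan, specialists)
for line in findings:
    print(line)
```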