Why AST beats GPTs - featuring php-parser, ChatGPT 4.5 and Grok 3

As I'm manually writing this article, GPTs are on the hype train. In this post, we'll use the freshly released ChatGPT 4.5 and Grok 3 and see if they know the AST of PHP well enough to be used on a large PHP project.

Understanding AST takes longer than writing an English sentence in a chat. But once you see the abstract syntax tree in the code you're reading, it cannot be unseen.

Today I'll try to convince you to start writing your own AST visitors and see how they make magic happen on your codebase.


"I was blind but now I see"

Where do GPTs excel?

Just a couple of days ago, Pieter Levels released an HTML + JS flight simulator built with Cursor (an IDE-like GPT focused on code).



He shares a vibe coding session with improvements to the game. Fun to watch, and worth following.


This shows the main advantages of GPTs:


GPTs are based on probability, as they learn weights from existing available data. The more data there is on a specific topic, the easier it is for them to understand and generate it.


Using HTML and JS is easy, with 10 000+ sources on the topic. But what about Symfony 7.3?


It's not even released yet, but GPTs don't care.

Where does AST excel?

An abstract syntax tree is a way we see the code. We use it to look for a precisely defined pattern. If the pattern is found, we take some action. If not, we move on.
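The find-a-pattern, then act loop can be sketched in a few lines. This is a toy illustration only: it assumes a hand-rolled tree of plain arrays, while php-parser works with real node objects and visitor classes. The `traverse()` function and the array keys (`type`, `children`, `annotations`, `attributes`) are hypothetical names made up for this sketch.

```php
<?php

declare(strict_types=1);

/**
 * Toy traversal: visit every node and let the visitor decide whether to change it.
 * Nodes are plain arrays here (an assumption for brevity, not the php-parser API).
 *
 * @param array<string, mixed> $node
 * @return array<string, mixed>
 */
function traverse(array $node, callable $visitor): array
{
    // pattern found: the visitor returns a replacement node;
    // pattern not found: it returns null and we move on
    $node = $visitor($node) ?? $node;

    foreach ($node['children'] ?? [] as $key => $child) {
        $node['children'][$key] = traverse($child, $visitor);
    }

    return $node;
}

// hypothetical tree for: class User { private $id; private $name; }
$tree = [
    'type' => 'Class',
    'name' => 'User',
    'children' => [
        ['type' => 'Property', 'name' => 'id', 'annotations' => ['ORM\Id'], 'attributes' => []],
        ['type' => 'Property', 'name' => 'name', 'annotations' => [], 'attributes' => []],
    ],
];

$changedTree = traverse($tree, function (array $node): ?array {
    // the precisely defined pattern: a property that still carries annotations
    if ($node['type'] !== 'Property' || $node['annotations'] === []) {
        return null;
    }

    // the action: move the annotations over to attributes
    $node['attributes'] = $node['annotations'];
    $node['annotations'] = [];

    return $node;
});
```

Only the `id` property matches the pattern, so it is the only node that changes. The `name` property and the class node pass through untouched.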

We can run an AST-based rule on a legacy codebase to upgrade 10 000 Doctrine entity annotations to PHP 8.0 attributes:

use Doctrine\ORM\Mapping as ORM;

class User
{
-    /**
-     * @ORM\Id
-     * @ORM\Column(type="integer")
-     * @ORM\GeneratedValue(strategy="AUTO")
-     */
+    #[ORM\Id]
+    #[ORM\Column(type: 'integer')]
+    #[ORM\GeneratedValue(strategy: 'AUTO')]
    private $id;

This shows the main advantages of AST:



Once you learn AST for PHP, it's reusable in other languages too.


One more advantage is that the abstract syntax tree is a concept. Like light refraction, dependency injection, the cost-benefit effect, etc. Once we learn it, we know it forever.


Innovation Propagation Lag

Do you remember how you wanted to know how to use the latest PHP features? We used Google to find a solution, mostly on Stack Overflow. We lacked understanding, so we copy-pasted the code and hoped it would work - much like someone does with GPTs now. There was just one problem: the most popular solutions had high scores in both Google and Stack Overflow rankings. The new and innovative solutions were not there, as they were not popular yet.

That led to a state where the most popular solutions were the most outdated ones. Especially for a language that releases a new version every year. Yes, we can use a hack to limit all Google/Stack Overflow results to the last 12 months, but the easiest path always wins.

GPTs suffer from the same design flaw - you've probably seen jokes about the ChatGPT cut-off date, as it yields old presidents instead of those elected 2 months ago.

This flaw can be counterbalanced by moving the cut-off date closer to the present or by continuously learning from fresh data. That's what Grok is trying to do. Yet the innovation is very slow and GPTs keep propagating outdated solutions and patterns, because those have more sources, more discussions, and more strong opinions behind them.

6 Years Behind

We humans suffer from the same flaw. We tend to stick with existing solutions that we've known for years, instead of constantly trying new ones.

We have PHP 8.4 out now, but a lot of codebases don't use even old PHP features. One of the most missed features is param and return type declarations for string, int, bool, float, and object:

function addNumbers(int $a, int $b): int
{
    return $a + $b;
}

Too new? They were released in PHP 7.0 in December 2015, and PHP 7.0 itself reached end of life in January 2019, that's 6 years ago. Still, many codebases, including frameworks, are not using them. We're lagging at least 6 years behind the released feature.

How can we adapt our existing codebases to something 1-year-old then?


In the context of this post, we should ask: "How can GPTs adapt to something 1 year old and suggest it on the first shot?"


Legacy Successful Projects

"These mountains that you are carrying
you were only supposed to climb"

When we talk about legacy projects, we don't mean only a mix of PHP and HTML with thousands of files. Legacy projects include existing projects:

These projects already have value and are growing. Also, they have more value to be extracted. The same way old 5-story buildings in capital cities...


...the same way you can extract a whole new floor in your house.


Legacy projects don't have bad karma. Rather, they are a hidden source of great power waiting to be discovered.


How to learn AST?

The best way of learning is by doing something meaningful to you: pick 1 problem in your PHP codebase that you have known about for years and fix it.


Use case: Upgrade FOS Rest bundle 2 to 3

For simple upgrades, we can use an IDE or in-IDE GPTs. For more complex ones, we can use Rector, which already has prepared sets to handle e.g. PHP, Symfony, Laravel, or PHPUnit upgrades.

But we care about our specific project, which neither GPT nor Rector has heard about enough times.

I'll try to ask ChatGPT 4.5 and Grok 3 to help build custom rules and will review their process, so you can see why the AST and your own creativity and determination beat GPT in the long run.


Our status: we assume GPTs will help us handle our work. We don't know much about AST yet and want to see if it's worth learning.

Our task: We have a project with FOS Rest bundle 2 and need to upgrade to version 3, which allows a higher Symfony version.

The challenge: the routing in version 2 is magical and requires only a definition in the YAML file:

some_routes:
    type:         rest
    resource:     "@Controller/ProjectsController.php"
    prefix:       /sites
    defaults: { _format: json }

The official upgrade guide mentions this challenge only vaguely, as it requires a huge amount of manual work. Changing "rest" to "annotation" doesn't help here, because the controller doesn't have any annotations in the first place.


To kick off, let's ask GPTs for help. We want to create a Rector rule that we can copy-paste into our project, run, and get the job done:

Create a Rector rule to migrate fosrest 2 routes to fosrest 3 routes on controller
Make use of @Route annotations, take it step by step
and do before/after code samples

Instead of posting conversation back and forth to this post, I'll share the full conversation:


What Problems have GPTs missed?

Both tools failed to provide working code. Also, due to innovation propagation lag, they're leading us astray with syntax that has been outdated for 2 years now.

I'll comment on the most relevant failures, where using GPT gives us more work and forces us to go back and forth with deeper research. My thesis is that it would be faster to understand AST and write the rule from scratch ourselves, with understanding and full control. This further allows us to improve the Rector rule to catch edge cases used only in our project.


At first reply, both ChatGPT and Grok assumed I already had annotation routes in the controller:

/**
 * @Rest\Get("/users/{id}", name="get_user")
 */
public function getUser($id)
{
    return $this->json(['id' => $id]);
}

This is the happy path of a simple annotation rename, so it makes sense that GPT went for it. After I corrected it that we use YAML files with route definitions, it got lost for the first time. So I added a little more context:

I need a rule that works with route.php file routes like:

$routingConfigurator
    ->import(__DIR__ . '/../src/SomeBundle/Resources/config/routing.yml', 'rest')
    ->prefix('/api/some-prefix');

And convert this to @Route annotations above controller actions



Here is the routing.yml:

some_routes:
    type:         rest
    resource:     "@Controller/ProjectsController.php"
    prefix:       /sites
    defaults: { _format: json }

ChatGPT 4.5 hallucinated a couple of non-existing services. They look real but do not exist. Good luck finding those:


ChatGPT 4.5 started to create a rule for adding @Route annotations correctly, but then it got lost and started to parse the $routingConfigurator->import PHP route file. This part is actually the least important and is not used for adding routes.


ChatGPT 4.5 hallucinated a printNodesToFilePath() method, which does not exist in Rector.


Grok 3 rigidly assumed all the controller actions are named the same way, e.g. 'getAction'. That means getProduct would be skipped. Also, it missed the cget*() prefix, which should also be converted.
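The naming problem can be made concrete. Below is a minimal sketch, assuming a simplified convention where only the method-name prefix decides the HTTP verb. `guessHttpMethod()` is a hypothetical helper written for this post, not part of FOS Rest bundle or Rector, and the real FOS Rest 2 route loader handles more cases than this.

```php
<?php

declare(strict_types=1);

/**
 * Guess the HTTP method from a controller action name.
 * Simplified assumption: only the prefix matters, and "cget" is a collection GET.
 */
function guessHttpMethod(string $methodName): ?string
{
    // the prefix must be followed by an uppercase letter or the end of the name,
    // so "getProduct" matches, but "getter" does not
    $pattern = '/^(cget|get|post|put|patch|delete)(?=[A-Z]|$)/';

    if (preg_match($pattern, $methodName, $match) !== 1) {
        return null;
    }

    return $match[1] === 'cget' ? 'GET' : strtoupper($match[1]);
}
```

A rule built on a check like this would pick up getAction, getProduct, and cgetProductsAction alike, instead of depending on one exact method name.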


Both tools are lagging and using Rector 0.9 config syntax, without the fluent API. Copy-pasting such syntax would probably lead to bugs when running Rector 2.0, setting your project even further behind.


They're also using the wrong namespace - the use Rector\Core\*; namespace no longer exists. Since Rector 1.0, use Rector\ should be used instead. Copy-pasting such code would lead to the infamous class-not-found errors Stack Overflow is full of.
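For comparison, here is a rough sketch of a rector.php config in the current fluent style. The App\Rector\AddRouteAnnotationRector class and the src path are hypothetical placeholders for your own rule and project layout. The rule itself would extend Rector\Rector\AbstractRector rather than the old Rector\Core\Rector\AbstractRector.

```php
<?php

declare(strict_types=1);

use App\Rector\AddRouteAnnotationRector; // hypothetical custom rule
use Rector\Config\RectorConfig;

return RectorConfig::configure()
    // hypothetical path, adjust to your project
    ->withPaths([__DIR__ . '/src'])
    ->withRules([AddRouteAnnotationRector::class]);
```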


They both misunderstood the separation process:


Conclusion

In my opinion, the GPTs fail here for both developer groups.

Developers who don't know AST

If I were a developer with little or no AST knowledge, I would get lost, as GPT gives me only a half-baked cake. Some parts will not work because the code is outdated. If after 6 years we're unable to use PHP 7.0 strict type declarations in PHP projects, I'm not sure how many years it will take GPTs to start using Rector 2.0 syntax.

This is the innovation propagation lag in practice. By the time Rector 3.0 is out, GPTs will still be using Rector 2.0 syntax, and so on. It forces us to stop innovating and go for backward compatibility forever, creating even more of the coupled legacy that we are actually trying to get rid of.

Medior/Senior AST devs

A more experienced AST developer sees the obvious mistakes that GPTs make and tries to fix them. Yet after the first couple of rounds of feedback, GPTs had already forgotten about the controller routes and focused only on YAML configs. It would definitely be faster to write such a rule from scratch.


That's why it's worth learning AST: you'll be able to shape codebases reliably with your own hands and vast imagination.


Happy coding!