Why AST Fixes your Coding Standard Better than Tokens

In the last post, Brief History of Tools Watching and Changing Your PHP Code, we saw that there are over a dozen tools in PHP that can modify code. So it's no surprise that coding standard tools are "upgrading" code from PHP 5.6 to PHP 7.2 without knowing types, and that AST tools are turning comparisons with false into !.

Should coding standard tools upgrade your code? Should AST make your code cleaner? Should AST take care of coding standard changes? Which one was born for it?

Tokens

PHP CS Fixer can upgrade your code to a few features of newer PHP versions using just token_get_all():

<p><a href="https://github.com/PHP-CS-Fixer/PHP-CS-Fixer/blob/03e13fb91c775a151dc57ae51e80ba3f2abe7da6/src/RuleSet.php#L209-L240"><code>RuleSet.php</code></a></p>
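To get a feeling for what a token-based tool works with, here is a minimal sketch of what token_get_all() returns for a tiny snippet. The describeTokens() helper is a hypothetical name made up for this illustration; it is not part of PHP CS Fixer.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: turn source code into a readable list of tokens,
 * skipping whitespace.
 *
 * @return string[]
 */
function describeTokens(string $code): array
{
    $described = [];

    foreach (token_get_all($code) as $token) {
        if (is_array($token)) {
            // Array tokens have the shape [id, text, line].
            [$id, $text] = $token;
            if ($id === T_WHITESPACE) {
                continue;
            }

            $described[] = token_name($id) . ' ' . $text;
        } else {
            // Single-character tokens such as ";" or "=" are plain strings.
            $described[] = $token;
        }
    }

    return $described;
}

print_r(describeTokens('<?php $a = true; return $a;'));
```

The result is a flat list. Nothing in it says that "$a = true" is an assignment or where a statement starts and ends; a fixer has to work that out from the neighbouring tokens.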

AST

Rector can handle rather low-level changes in its code quality level:

-if (! $this->isTrue($condition) === false) {
+if ($this->isTrue($condition)) {

-count(func_get_args()) === 1
+func_num_args() === 1
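Both changes keep the behavior and only remove noise. A small sketch to check that, assuming isTrue() returns a bool; the three functions below are hypothetical stand-ins written for this example, not Rector code.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for $this->isTrue(); assumed to always return a bool.
function isTrue(bool $condition): bool
{
    return $condition;
}

function hasOneArgumentBefore(): bool
{
    // Builds an array of all passed arguments just to count them.
    return count(func_get_args()) === 1;
}

function hasOneArgumentAfter(): bool
{
    // Asks for the number of arguments directly.
    return func_num_args() === 1;
}

foreach ([true, false] as $condition) {
    // "!" binds tighter than "===", so this reads (! isTrue(...)) === false.
    $before = ! isTrue($condition) === false;
    $after = isTrue($condition);

    var_dump($before === $after);
}

var_dump(hasOneArgumentBefore('x') === hasOneArgumentAfter('x'));
var_dump(hasOneArgumentBefore() === hasOneArgumentAfter());
```

Note the first change is only safe when the call really returns a bool, which is exactly the kind of information a type-aware AST tool can check before it touches the code.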

The "Code Quality" Level

It's the favorite level in Rector. Why?

-foreach ($this->oldToNewFunctions as $oldFunction => $newFunction) {
-    if ($currentFunction === $oldFunction) {
-        return $newFunction;
-    }
-}
-
-return null;
+return $this->oldToNewFunctions[$currentFunction] ?? null;
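Here is a minimal sketch of why the two versions are interchangeable. The function names and the sample map are made up for the example, and it assumes the map has non-numeric string keys and no null values.

```php
<?php

declare(strict_types=1);

/**
 * The "before" version: walk the whole map and compare every key.
 *
 * @param array<string, string> $oldToNewFunctions
 */
function findWithLoop(array $oldToNewFunctions, string $currentFunction): ?string
{
    foreach ($oldToNewFunctions as $oldFunction => $newFunction) {
        if ($currentFunction === $oldFunction) {
            return $newFunction;
        }
    }

    return null;
}

/**
 * The "after" version: a direct lookup with a null fallback.
 *
 * @param array<string, string> $oldToNewFunctions
 */
function findWithCoalesce(array $oldToNewFunctions, string $currentFunction): ?string
{
    return $oldToNewFunctions[$currentFunction] ?? null;
}

// Hypothetical sample data, for illustration only.
$oldToNewFunctions = ['split' => 'explode', 'sizeof' => 'count'];

var_dump(findWithLoop($oldToNewFunctions, 'split') === findWithCoalesce($oldToNewFunctions, 'split'));
var_dump(findWithLoop($oldToNewFunctions, 'missing') === findWithCoalesce($oldToNewFunctions, 'missing'));
```

One caveat under these assumptions: PHP casts numeric string keys such as '1' to integers, so the strict comparison in the loop and the direct lookup could disagree for such keys.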

Huge thanks to Gabriel Caruso, who brought this idea to Rector and helped me to shift my view to the one I'll show you below.


Without AST, all of this could be handled by token_get_all() (as PHP_CodeSniffer and PHP CS Fixer do), but such an implementation needs to be a lot longer to achieve similar quality, since you have to check every previous and next token for unexpected values.
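To make the "check every previous and next token" part concrete, here is a deliberately naive sketch of a token-based check for an assignment that is immediately returned. Both functions are hypothetical and written for this post; this is not how PHP_CodeSniffer or PHP CS Fixer implement it.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: does this token end the previous statement or open a block?
 *
 * @param array|string $token
 */
function isStatementBoundary($token): bool
{
    if (is_array($token)) {
        return $token[0] === T_OPEN_TAG;
    }

    return in_array($token, [';', '{', '}'], true);
}

/** Naive check for "$x = ...; return $x;". */
function hasUselessVariable(string $code): bool
{
    // Drop whitespace and comments, otherwise every lookup below must skip them.
    $tokens = [];
    foreach (token_get_all($code) as $token) {
        if (is_array($token) && in_array($token[0], [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT], true)) {
            continue;
        }

        $tokens[] = $token;
    }

    foreach ($tokens as $i => $token) {
        if (! is_array($token) || $token[0] !== T_RETURN) {
            continue;
        }

        // Next tokens: the return must be exactly "return $x;".
        $variable = $tokens[$i + 1] ?? null;
        $semicolon = $tokens[$i + 2] ?? null;
        if (! is_array($variable) || $variable[0] !== T_VARIABLE || $semicolon !== ';') {
            continue;
        }

        // Previous tokens: a statement has to end right before the return.
        if ($i < 1 || $tokens[$i - 1] !== ';') {
            continue;
        }

        // Walk back to where that previous statement starts.
        $j = $i - 2;
        while ($j >= 0 && ! isStatementBoundary($tokens[$j])) {
            --$j;
        }

        $first = $tokens[$j + 1];
        $second = $tokens[$j + 2] ?? null;
        if (is_array($first) && $first[0] === T_VARIABLE && $first[1] === $variable[1] && $second === '=') {
            return true;
        }
    }

    return false;
}

var_dump(hasUselessVariable('<?php function () { $a = true; return $a; };'));
var_dump(hasUselessVariable('<?php $a = true; $b = 1; return $a;'));
```

Even this much code still misses cases, for example an assigned closure whose body contains braces and semicolons, and every such case means more token bookkeeping.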

"I really don't like programming. I built this tool to program less so that I could just reuse code."
Rasmus Lerdorf

Shifting the Scope

We're here at the moment:

That's very narrow and old-school, but the shift has already begun...

Tokens are Best at

AST is Best at

1 Example for Coding Standards in AST

Let's take this case of a useless variable:

 function () {
-    $a = true;
-    return $a;
+    return true;
 };

My first thought was: "Why is it assigned? Is there some magic behind this? I need to explore more." Well, there isn't. It's a trap, both for the programmer who reads it and for PHP, which has to interpret it.

So this change will not only make your code more readable but also faster. A nice side effect, right?

Let's briefly compare how tokens and AST approach this:

Tokens               | AST
PHP_CodeSniffer      | Rector
UselessVariableSniff | SimplifyUselessVariableRector
329 lines            | 120 lines
2 helper services    | 1 helper service

Note that it would be very difficult to write either version shorter while keeping reliability high, and I believe kukulich is very good at implementing sniffs effectively. It's a matter of the technology used, not of implementation skill.

To sum up, the AST version takes only 36.47 % of the code the token version needs (120 of 329 lines).


The AST implementation also solved this case without any extra work:

 function test() {
     $a = 1;
     $b = 1;
-    $c = [
+    return [
         $b-- => $a++,
         --$b => ++$a,
     ];
-    return $c;
 }
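The increments and decrements make this one look risky, but inlining the variable does not change the order in which they are evaluated. A quick sketch to confirm that both versions return the same array; the function names are made up for the example.

```php
<?php

declare(strict_types=1);

// The original version, with the useless $c variable.
function withUselessVariable(): array
{
    $a = 1;
    $b = 1;
    $c = [
        $b-- => $a++,
        --$b => ++$a,
    ];

    return $c;
}

// The version after the change, returning the array directly.
function withDirectReturn(): array
{
    $a = 1;
    $b = 1;

    return [
        $b-- => $a++,
        --$b => ++$a,
    ];
}

// Same keys, same values, same order.
var_dump(withUselessVariable() === withDirectReturn());
```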

Coding Standards on Steroids with AST

I still wonder what PHP would look like today if we had had AST in 2012, when Fabien started PHP CS Fixer.




