Why AST Fixes your Coding Standard Better than Tokens

In the last post, Brief History of Tools Watching and Changing Your PHP Code, we saw that there are over a dozen tools in PHP that can modify code. So it's no surprise that coding standard tools are "upgrading" code from PHP 5.6 to PHP 7.2 without knowing types, and that AST tools are turning false into !.

Should a coding standard upgrade your code? Should AST make your code cleaner? Should AST take care of coding standard changes? Which one was born for it?

Tokens

PHP CS Fixer can upgrade code to a few features of newer PHP versions using just token_get_all():

RuleSet.php: https://github.com/FriendsOfPHP/PHP-CS-Fixer/blob/03e13fb91c775a151dc57ae51e80ba3f2abe7da6/src/RuleSet.php#L209-L240
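To get a feeling for what a token-based tool actually works with, here is a minimal sketch that lists the tokens PHP produces for a snippet. The describeTokens() helper is my own name for illustration, not something from PHP CS Fixer:

```php
<?php

// Hypothetical helper: describe every non-whitespace token of a PHP snippet.
function describeTokens(string $code): array
{
    $described = [];

    foreach (token_get_all($code) as $token) {
        if (is_array($token)) {
            // Named tokens come as [id, text, line]
            if ($token[0] === T_WHITESPACE) {
                continue;
            }

            $described[] = token_name($token[0]) . ' ' . $token[1];
        } else {
            // Single characters like "=", "(" or ";" come as plain strings
            $described[] = $token;
        }
    }

    return $described;
}

$tokens = describeTokens('<?php $items = array(1, 2);');
```

The result is a flat list: there is no nesting and no notion of "this is an array expression", only `T_ARRAY` followed by `(` somewhere later in the stream.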

AST

Rector can handle rather low-level changes in its code quality level:

-if (! $this->isTrue($condition) === false) {
+if ($this->isTrue($condition)) {

-count(func_get_args()) === 1
+func_num_args() === 1
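A quick way to convince yourself that the second change is safe is to run both expressions side by side. This is only a sketch; the two function names are made up for the comparison:

```php
<?php

// Hypothetical functions, used only to compare the two expressions.
function hasSingleArgumentOld(): bool
{
    // Builds an array of all arguments just to count it
    return count(func_get_args()) === 1;
}

function hasSingleArgumentNew(): bool
{
    // Asks PHP for the number of arguments directly
    return func_num_args() === 1;
}
```

Both return the same answer for any call; the second one just skips building the intermediate array.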

The "Code Quality" Level

It's the most popular level in Rector. Why?

  • it makes your code clear
  • it's easy to use on any PHP code regardless of the framework you're using - from pure PHP, through Drupal, WordPress, and Magento, to frameworks like Symfony, Nette, and Laravel
  • it helps you to use direct PHP functions instead of wrapping them into complex structures ↓
-foreach ($this->oldToNewFunctions as $oldFunction => $newFunction) {
-    if ($currentFunction === $oldFunction) {
-        return $newFunction;
-    }
-}
-
-return null;
+return $this->oldToNewFunctions[$currentFunction] ?? null;
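The two versions can be compared in isolation. The sketch below assumes a map with plain string keys; the constant, its contents, and both function names are hypothetical:

```php
<?php

// Hypothetical map, used only for this comparison.
const OLD_TO_NEW_FUNCTIONS = [
    'is_real' => 'is_float',
    'sizeof' => 'count',
];

function findWithLoop(string $currentFunction): ?string
{
    foreach (OLD_TO_NEW_FUNCTIONS as $oldFunction => $newFunction) {
        if ($currentFunction === $oldFunction) {
            return $newFunction;
        }
    }

    return null;
}

function findWithNullCoalesce(string $currentFunction): ?string
{
    // Direct lookup; falls back to null when the key is missing
    return OLD_TO_NEW_FUNCTIONS[$currentFunction] ?? null;
}
```

One caveat of this sketch: PHP casts numeric string keys like '1' to integers, so the strict comparison in the loop and the direct lookup can disagree for such keys. With function names as keys, that does not come up.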

Huge thanks to Gabriel Caruso, who brought this idea to Rector and helped me to shift my view to the one I'll show you below.


Without AST, all of this could be handled by token_get_all() (as PHP_CodeSniffer and PHP CS Fixer do), but such an implementation needs to be a lot longer to achieve similar quality, since you have to check every previous and next token for unexpected values.
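To illustrate the neighbour checking, here is a rough token-only sketch that looks for a variable that is assigned and then returned in the very next statement. It is my own simplified illustration, not code from PHP_CodeSniffer or PHP CS Fixer, and findUselessVariables() is a hypothetical name:

```php
<?php

// Hypothetical detector: finds "return $x;" directly preceded by "$x = ...;".
function findUselessVariables(string $code): array
{
    // Drop whitespace and comments so that neighbours are meaningful tokens
    $tokens = array_values(array_filter(
        token_get_all($code),
        static function ($token): bool {
            return !is_array($token)
                || !in_array($token[0], [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT], true);
        }
    ));

    $found = [];

    foreach ($tokens as $i => $token) {
        if (!is_array($token) || $token[0] !== T_RETURN) {
            continue;
        }

        // Next tokens must be exactly: variable, ";"
        $variable = $tokens[$i + 1] ?? null;
        if (!is_array($variable) || $variable[0] !== T_VARIABLE) {
            continue;
        }
        if (($tokens[$i + 2] ?? null) !== ';') {
            continue;
        }

        // Previous token must close the previous statement
        if (($tokens[$i - 1] ?? null) !== ';') {
            continue;
        }

        // Walk back to where the previous statement starts
        $j = $i - 2;
        while ($j >= 0 && !in_array($tokens[$j], [';', '{', '}'], true)) {
            $j--;
        }

        // That statement must start with the same variable followed by "="
        $first = $tokens[$j + 1] ?? null;
        $second = $tokens[$j + 2] ?? null;
        if (is_array($first) && $first[0] === T_VARIABLE
            && $first[1] === $variable[1] && $second === '=') {
            $found[] = $variable[1];
        }
    }

    return $found;
}
```

Even this simplified version is mostly bookkeeping about what sits before and after each token, and it still gives up on closures in the assigned value and knows nothing about references or static variables.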

"I really don't like programming. I built this tool to program less so that I could just reuse code."
Rasmus Lerdorf

Shifting the Scope

We're here at the moment:

  • tokens / coding standard === styling only
  • AST / static analysis === context aware only

That's very narrow and old-school, but the shift has already begun...

Tokens are Best at

  • spacing and exact positions

    -if ($condition )
    -{
    +if ($condition) {
    
  • sign changes

    -$items = array(1, 2, 3);
    +$items = [1, 2, 3];
    
  • doc block changes

    /**
    -* @param    int|string
    +* @param int|string $id
     */
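The array syntax change is a good fit for tokens because it only needs to match `array` with its parentheses. Here is a minimal sketch of the idea, assuming no comments between `array` and `(`; shortenArraySyntax() is a hypothetical helper, not the real fixer:

```php
<?php

// Hypothetical helper: rewrite array(...) to [...] using only the token stream.
function shortenArraySyntax(string $code): string
{
    $tokens = token_get_all($code);
    $count = count($tokens);
    $output = '';
    // One entry per open "(": true when it belongs to array(...)
    $parenStack = [];

    for ($i = 0; $i < $count; $i++) {
        $token = $tokens[$i];

        if (is_array($token)) {
            if ($token[0] === T_ARRAY) {
                // Skip whitespace to see whether "(" follows;
                // otherwise it is a type declaration such as "array $x"
                $next = $i + 1;
                while ($next < $count && is_array($tokens[$next])
                    && $tokens[$next][0] === T_WHITESPACE) {
                    $next++;
                }

                if ($next < $count && $tokens[$next] === '(') {
                    $output .= '[';
                    $parenStack[] = true;
                    $i = $next; // jump past "array", whitespace and "("
                    continue;
                }
            }

            $output .= $token[1];
            continue;
        }

        if ($token === '(') {
            $parenStack[] = false;
        } elseif ($token === ')' && array_pop($parenStack) === true) {
            $output .= ']';
            continue;
        }

        $output .= $token;
    }

    return $output;
}

$fixed = shortenArraySyntax('<?php $items = array(1, array(2, 3), count($x));');
```

Note that everything outside the matched tokens, including spacing, is copied through untouched. That is exactly where tokens shine.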
    

AST is Best at

  • logic and structure changes

    -if (! $this->isTrue($condition) === false) {
    +if ($this->isTrue($condition)) {
    
  • code cleanup

    -$value = $value;
    
  • context-aware names

    -$formBuilder->add('name', new TextType);
    +$formBuilder->add('name', TextType::class);
    

1 Example for Coding Standards in AST

Let's take this case of a useless variable:

 function () {
-    $a = true;
-    return $a;
+    return true;
 };

My first thought was: "Why is it assigned? Is there some magic behind this? I need to explore more." Well, there isn't. It's a trap, both for the programmer reading it and for PHP interpreting it.

So this change will not only make your code more readable but also faster. A nice side effect, right?

Let's briefly compare how tokens and AST approach this:

                   Tokens                  AST
Tool               PHP_CodeSniffer         Rector
Rule               UselessVariableSniff    SimplifyUselessVariableRector
Length             329 lines               120 lines
Helper services    2                       1

Note that it would be very difficult to write either version shorter while keeping reliability high, and I believe kukulich is very good at implementing sniffs effectively. It's a matter of the technology used, not implementation skill.

To sum up, the AST version needs only 36.47 % of the code of the token version (120 of 329 lines).


The AST implementation also handles this case without any extra work:

 function test() {
     $a = 1;
     $b = 1;
-    $c = [
+    return [
         $b-- => $a++,
         --$b => ++$a,
     ];
-    return $c;
 }

Coding Standards on Steroids with AST

I still wonder what PHP would look like today if we had had AST in 2012, when Fabien started PHP CS Fixer.


  • Would you be interested in such AST rules for coding standard?

  • What rules would you add if it were easier to create them with AST?



