Tag: PHP

  • Why @@ is the best attribute syntax for PHP

    Update 2020-09-02: The Shorter Attribute Syntax Change proposal was accepted, with #[Attr] voted as the final attribute syntax for PHP 8.0.

    Background

    Analogous to docblock annotations, attributes make it possible to apply metadata to a class, function, or property with a native syntax and reflection API.

    The Attributes v2 RFC (accepted on May 4, 2020) added support for native attributes in PHP 8 using the <<Attr>> syntax borrowed from Hacklang. In June, I authored the Shorter Attribute Syntax RFC to propose using @@Attr or #[Attr] to address five shortcomings of the original syntax (verbosity, poor readability of nested attributes, confusion with generics and shift operators, and dissimilarity to other languages). The result of this RFC was that the @@Attr syntax was accepted and implemented in PHP 8.0 beta 1.

    However, several people were unhappy with this outcome, and proposed a new RFC to vote again on the declined attribute syntaxes (as well as a couple new alternatives) in hopes of a different result. This RFC puts forward four main arguments for changing the syntax: grouped attribute support, consistency, forward compatibility, and “potential future benefits”. Are these really good arguments?

    Attribute grouping

    Grouped attribute support (e.g. <<Attr1, Attr2>>) was added to partially reduce the verbosity of the <<>> syntax. However, this has the downside of creating two different syntaxes for declaring attributes. Furthermore, grouped attributes result in unnecessary diff noise when adding or removing a second attribute on its own line:

    <<SomeAttr("argument")>>
    function foo() {}
    
    // changes to
    <<
        SomeAttribute("argument"),
        OtherAttr([1, 2, 3]),
    >>
    function foo() {}

    In contrast, with the @@Attr syntax, individual attributes can always be added or removed without having to change attributes on other lines:

    @@SomeAttribute("argument")
    @@OtherAttr([1, 2, 3]) // can be added/removed independently
    function foo() {}

    Finally, the @@Attr syntax without grouping is equally concise as the alternatives with grouping, so this is not a reason to prefer any other syntax over @@Attr.

    Consistency

    The RFC argues that an attribute end delimiter is necessary for consistency with other syntax constructs in PHP. However, attributes are not standalone declarations, but modifiers on the declaration that follows them, similar to type declarations and visibility/extendibility modifiers:

    // declaration modifiers do not have end delimiters like this:
    [final] class X {
        [public] function foo([int|float] $bar) {}
    }
    
    // the @@ syntax is consistent with other declaration modifiers:
    @@Attribute
    final class X {
        @@Jit
        public function foo(@@Deprecated int|float $bar) {}
    }

    The RFC responds to this by arguing that attributes are more complex than modifiers and type declarations since they have an optional argument list. However, this fails to recognize that an attribute’s argument list already has its own start/end delimiters (parentheses)! So adding another end delimiter if anything reduces consistency rather than improving it.

    The @@Attr syntax is consistent with existing declaration modifiers in PHP, as well as docblock annotations and other languages using the @Attr syntax.

    Forward compatibility

    The #[Attr] syntax could provide a temporary forward compatibility benefit to library authors, who would be able to reuse the same class both as a PHP 8 attribute and to store information from a docblock annotation when the library is used on PHP 7. However, this benefit will be irrelevant once most of the library users upgrade to PHP 8, and the library wants to take advantage of any other PHP 8 syntax.

    Even without partial syntax forward compatibility, a library can support both PHP 8 attributes and PHP 7 annotations with a small amount of extra work. A parent class can be used to store/handle annotation information, and a child class can be registered as an attribute for PHP 8 users.

    The downside of forward compatibility is that it can result in code that is valid in both PHP 7 and PHP 8, but runs very differently on both. For example:

    class X {
        // This comments out the first parameter entirely in
        // PHP 7, silently leading to different behavior.
        public function __construct(
            #[MyImmutable] public bool $x,
            private bool $flag = false,
        ) {}
    }
    $f1 = #[ExampleAttribute] function () {};
    
    $f2 = #[ExampleAttribute] fn() => 1;
    
    $object = new #[ExampleAttribute] class () {};
    foo();
    
    // On PHP 7 this is interpreted as
    $f1 = $f2 = $object = new foo();
    <?php
    // This example echoes the rest of the source code in
    // PHP 7 and echoes "Test" in PHP 8.
    #[DeprecationReason('reason: <https://some-website/reason?>')]
    function main() {}
    const APP_SECRET = 'app-secret';
    echo "Test";

    Is the temporary forward compatibility benefit (which realistically will only simplify code slightly for library authors) really worth the downside of a larger BC break and risk of valid code being interpreted very differently across PHP versions?

    Potential future benefits?

    Lastly, the RFC argues that an end delimiter could be helpful for enabling future syntaxes such as applying a function decorator:

    #[fn ($x) => $x * 4]
    function foo($x) {...}

    However, there are other potential ways to accomplish the same thing that would arguably be even more readable and flexible:

    // via a built-in attribute:
    @@Before(fn ($x) => $x * 4)
    function foo($x) {...}
    
    // via a syntax for checked attributes:
    @@!MyDecorator
    function foo($x) {...}

    Whether the attribute syntax has an end delimiter or not, it will be possible to extend functionality in the future. The @Attr syntax has been proven over many years through its use in docblock annotations and other languages, and if it was deficient in some way it almost certainly would have been discovered long ago.

    The case for @@

    The @@Attr syntax arguably strikes the best balance between conciseness, familiarity with docblock annotations, and a very small BC break. Unlike the #[Attr] and @[Attr] proposals, it does not break useful, functional syntax.

    The lack of a separate end delimiter is consistent with other declaration modifiers, and can help avoid confusion when both docblocks and attributes are being used. The syntaxes with an end delimiter make attributes appear as if they are standalone declarations which a docblock can be applied to, even though they are not.

    If you read this far, thank you! Let me know if this post changed your mind, or if there is some other argument that convinces you why @@ or another syntax is best.

  • PolyCast: a library for safe type conversion in PHP

    PolyCast: a library for safe type conversion in PHP

    On March 16, 2015, something amazing happened in the world of PHP. The long-awaited, hotly debated Scalar Type Declarations RFC was accepted for PHP 7! Finally, it will be possible to declare scalar types (int, float, bool, and string) for function parameters and return values:

    <?php
    
    function itemTotal(int $quantity, float $price): float
    {
        return $quantity * $price;
    }

    The need for safe type casts

    By default, scalar types are enforced weakly. So while passing a value such as “my string” to an int parameter would produce an error, values such as 10.9, “42.5”, true, and false would be accepted and cast to 10, 42, 1, and 0, respectively. This behavior lacks safety, since any of these values are likely to be errors, and casting them results in data loss.

    Enabling the optional strict mode will prevent values with an incorrect type from being passed, but this isn’t a complete solution. Whenever you are dealing with user input, whether from a posted form, url parameters, or an uploaded CSV, the data will arrive as a string. Before it can be passed to a function expecting an int or float, the data must be converted to the corresponding type.

    Simple, right?

    <?php
    declare(strict_types=1);
    
    $total = itemTotal((int)$_POST['quantity'], (float)$_POST['price']);

    Wrong. This is even less safe than the default type coercion! A user could pass a value such as “5 hundred” or “ten” and it would be cast to 5 or 0 without producing an error. This is especially concerning in scenarios where sensitive financial information is being handled.

    PHP filters?

    In the past I’ve tried to solve this problem by using PHP’s built-in FILTER_VALIDATE_INT and FILTER_VALIDATE_FLOAT validation filters. However, there are two problems with this approach. First is verbosity: validating just two inputs for our itemTotal function requires eight additional lines of code:

    <?php
    declare(strict_types=1);
    
    $quantity = filter_var($_POST['quantity'], FILTER_VALIDATE_INT);
    $price = filter_var($_POST['price'], FILTER_VALIDATE_FLOAT);
    
    if ($quantity === false) {
        throw new Exception("quantity must be an integer");
    } elseif ($price === false) {
        throw new Exception("price must be a number");
    }
    
    $total = itemTotal($quantity, $price);

    Secondly, and even more problematic, filter_var casts the value being checked to a string and trims whitespace, which results in various unsafe conversions being accepted.

    Introducing PolyCast

    In October of last year, Andrea Faulds proposed a Safe Casting Functions RFC to fill the need for safe type conversion. At the same time, I started developing a userland implementation called PolyCast. Although Andrea’s RFC was ultimately declined, I continued to move PolyCast forward, with a number of improvements based on community feedback.

    PolyCast comes with two sets of functions. The first (safe_int, safe_float, and safe_string) return true if a value can be cast to the corresponding type without data loss, and false if it cannot. The second (to_int, to_float, and to_string) will directly cast and return a value if it is safe, and otherwise throw a CastException.

    This makes safe type conversion nearly as simple as forced casts, without compromising safety:

    <?php
    declare(strict_types=1);
    
    use function theodorejb\polycast\{ to_int, to_float };
    
    $total = itemTotal(to_int($_POST['quantity']), to_float($_POST['price']));

    For more examples and details on which values are considered safe, check out the project on GitHub. PolyCast is tested on PHP 5.4+, and you can easily install it with composer require theodorejb/polycast.

  • Implementing a linked list in SQL

    Implementing a linked list in SQL

    Recently I was challenged with enabling users to drag and drop items in a list to sort them in any order, and persisting that order in a SQL database. One way to handle this would be to add an index column to the table, which could be updated when an item is reordered. The downside of this approach is that whenever an item is added or moved, the index of every item beneath it must be updated. This could become a performance bottleneck in very large lists.

    A more efficient approach is to use a linked list, where each item contains a reference to the previous item in the list, and the first item has a null reference (you could alternatively reference the next item, with the last item containing a null reference, but this requires the list to be sorted back-to-front, which I find less intuitive).

    Let’s start by creating a minimal table for items:

    CREATE TABLE ordered_items (
        item_id INT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
        item_name VARCHAR(100) NOT NULL,
        previous_item_id INT UNSIGNED NULL,
        FOREIGN KEY (previous_item_id) REFERENCES ordered_items(item_id)
    );

    Next, we’ll write two functions for adding items to the list: one to insert the item and the other to update any item referencing the same previous ID as the new item.

    <?php
    
    function addItem($name, $previousId)
    {
        $sql = "INSERT INTO ordered_items (item_name, previous_item_id)
                VALUES (?, ?)";
    
        // The query function is assumed to prepare and execute a SQL statement with
        // an array of bound parameters, and return the insert ID or selected row.
        $itemId = query($sql, [$name, $previousId]);
        setInsertedItemReference($itemId, $previousId);
    }
    
    /**
     * If another item in the list has the same previous ID as the
     * inserted item, change it to reference the inserted item.
     */
    function setInsertedItemReference($itemId, $itemPreviousId)
    {
        $params = [$itemId, $itemId];
    
        if ($itemPreviousId === null) {
            $condition = 'IS NULL';
        } else {
            $condition = '= ?';
            $params[] = $itemPreviousId;
        }
    
        $sql = "UPDATE ordered_items
                SET previous_item_id = ?
                WHERE item_id <> ?
                AND previous_item_id {$condition}";
    
        query($sql, $params);
    }

    To remove an item from the list, we will again need two functions: one to delete the item row and another to update any item referencing the removed item.

    <?php
    
    function deleteItem($itemId)
    {
        $previousId = selectItem($itemId)['previous_item_id'];
        closeMovedItemGap($itemId, $previousId);
        query("DELETE FROM ordered_items WHERE item_id = ?", [$itemId]);
    }
    
    function selectItem($itemId)
    {
        $sql = "SELECT * FROM ordered_items WHERE item_id = ?";
        return query($sql, [$itemId]);
    }
    
    /**
     * If any other item has a previous ID referencing the moved item,
     * change it to point to the moved item's original previous ID.
     */
    function closeMovedItemGap($itemId, $itemPreviousId)
    {
        $sql = "UPDATE ordered_items
                SET previous_item_id = ?
                WHERE previous_item_id = ?";
    
        query($sql, [$itemPreviousId, $itemId]);
    }

    Finally, we can add a function to update items (including their sort order):

    <?php
    
    function updateItem($id, $name, $previousId)
    {
        if ($id === $previousId) {
            throw new Exception('Items cannot reference themselves');
        }
    
        $originalItem = selectItem($id);
    
        if ($previousId !== $originalItem['previous_item_id']) {
            // the item was reordered
            closeMovedItemGap($id, $originalItem['previous_item_id']);
            setInsertedItemReference($id, $previousId);
        }
    
        $sql = "UPDATE ordered_items
                SET item_name = ?,
                previous_item_id = ?
                WHERE item_id = ?";
    
        query($sql, [$name, $previousId, $id]);
    }

    As can be seen, whether an item is added, removed, or reordered, at most three rows will need to be updated. This keeps the performance nearly constant, regardless of the size of the list. With the basic database implementation complete, in my next post I’ll share an approach to efficiently sort the linked list in client-side code.

  • Responsive Captcha: A small PHP library for preventing spam

    If you’re reading this, you probably already know what a CAPTCHA is. The most common form consists of an image with warped or obscured characters which must be entered into a text field. While these image-based CAPTCHAs tend to be effective at stopping spam, they are also poorly accessible, often slow, and require a third-party service or large font files. Surely there must be a better way.

    There is. Text-based CAPTCHAs use simple logic questions to weed out bots while remaining accessible to users with disabilities. I found numerous text CAPTCHA implementations floating around the Web, but I was disappointed that they all either relied on a third-party service or required setting up a database. So I decided to make my own.

    The result is Responsive Captcha, a PHP library which generates simple, random arithmetic and logic questions, and can be easily integrated into an existing form.

    Some example questions generated by Responsive Captcha include:

    • Which is smallest: eight, sixty-nine, or seven?
    • What is nine minus five?
    • What is the third letter in rainbow?
    • What is eight multiplied by one?

    For more examples and instructions for use, check out the project on GitHub: https://github.com/theodorejb/Responsive-Captcha.