Selenium testing PHP sites in Bitbucket Pipelines

It looks like mounting files isn’t possible in the bitbucket services without getting all meta and using custom Docker-in-Docker images, so I ended up ditching the nginx service. However I successfully managed to run the PHP built in webserver!

php -S 127.0.0.1:80 public/index.php &

The -S option starts the built in dev server, and public/index.php is my app’s entrypoint. The & pushes the process into the background. It only seems to work with the 127.0.0.1 IP, if you try a custom host it won’t work.

Here’s my full pipeline config:

image:
  name: php:8.1-fpm
  username: $DH_USER
  password: $DH_PASS

definitions:
  services:
    db:
      image: mariadb:10.2
      variables:
        MARIADB_DATABASE: MY_TEST_DB
        MARIADB_USER: root
        MARIADB_ROOT_PASSWORD: 'root'
    chrome:
      image: selenium/standalone-chrome
      ports:
        - 4444
        - 5900


pipelines:
  pull-requests:
    '**':
      - step:
          services:
            - db
            - chrome
          name: Installing Composer and running Unit Tests...
          script:
            - if [[ "${BITBUCKET_PR_DESTINATION_BRANCH}" != "develop" ]] && [[ "${BITBUCKET_PR_DESTINATION_BRANCH}" != "release/"* ]]; then printf "Skipping Tests for this target (${BITBUCKET_PR_DESTINATION_BRANCH})"; exit; fi
            - curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer
            - composer install --no-scripts
            - vendor/bin/grumphp git:pre-commit
            - cp config/autoload/data.test.php.dist config/autoload/data.test.php
            - php bin/console orm:schema-tool:create
            - php bin/console fixtures:load
            - php -S 127.0.0.1:80 public/index.php &
            - vendor/bin/codecept run --env=bitbucket

  branches:
    develop:
      - step:
          services:
            - db
          name: Installing Composer and running Unit Tests...
          script:
            - curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer
            - composer install --no-scripts
            - vendor/bin/grumphp git:pre-commit
            - cp config/autoload/data.test.php.dist config/autoload/data.test.php
            - php bin/console orm:schema-tool:update --force --complete
            - php bin/console fixtures:load
            - php -S 127.0.0.1:80 public/index.php &
            - vendor/bin/codecept run unit --env=bitbucket

Auto prepend a JIRA ticket number to your commit messages

To ensure commit messages contain the JIRA ticket, in the following format:

ABC-1234 : [your commit message here]

You can have this prepended automatically by creating `.git/hooks/prewpare-commit-msg`, with the following code:

#!/bin/sh
COMMIT_MSG_FILE=$1
branch=$(git branch --show-current)
issue=$(sed -e 's/.*\/\([A-Z]\{3\}-[0-9]\{4\}\)\/.*/\1/' <<< $branch)

if [[ $issue =~ [A-Z]{3}-[0-9]{4} ]]; then
originalText=$(cat "$COMMIT_MSG_FILE")
echo "$issue : " > "$COMMIT_MSG_FILE"
echo "$originalText" >> "$COMMIT_MSG_FILE"
fi

After creating the file, run `chmod +x .git/hooks/prepare-commit-msg` to make it executable.

You should also have a `commit-msg` hook to check with regex that the first line contains more than just the ticket
[A-Z]{3}-[0-9]{4}\s:\s(.*). Exit with a non zero status if the regex fails

In my case I am using GrumPHP to manage the commit-msg hook. GrumPHP has a JIRA ticket task, which we can set up in the yaml file like so:

  tasks:
    git_commit_message:
      matchers:
        Must contain JIRA issue number: '/[A-Z]{3}-[0-9]{4} : (.+)/'

Switching PHP versions via the terminal

I ran composer install today, and was rather surprised to find that the intl module wasn’t installed?! Then I remembered I had installed package updates etc, and it turned out I had BOTH PHP 7.3 AND PHP 7.4 installed!

All my sites were developed on 7.3, so I had to switch back.

update-alternatives --list php

Typing that command in will output something like this:

/usr/bin/php.default
/usr/bin/php7.3
/usr/bin/php7.4

Now you can see the available versions. To switch versions, I simply typed

sudo update-alternatives --set php /usr/bin/php7.3

Middleware all the things

Image result for middleware psr 15

The middleware pattern is really awesome, you have layers of midlleware classes that take a request and return a reponse, but pass to a handler first, eventually reaching the original callable your request routed to.

My framework Bone MVC makes use of PSR-7 requests, so I’ve been refactoring for v3.0.0 to use league/route and it’s middleware stack, and I’ve just realised a great use case, so I’m about to refactor and create a middleware!

Take a look at this controller action:

/**
* @param ServerRequestInterface $request
* @param array $args
* @return ResponseInterface
* @throws \Doctrine\ORM\EntityNotFoundException
*/
public function viewAction(ServerRequestInterface $request, array $args): ResponseInterface
{
$dragon = $this->service->getRepository()->find($args['id']);
$server = $request->getServerParams();
$hal = [
'_links' => [
'self' => [
'href' => $server['REQUEST_SCHEME'] . '://' . $server['SERVER_NAME'] . '/api/dragon/' . $args['id']
]
],
];
$payload = array_merge($hal, $dragon->toArray());

return new JsonResponse($payload);
}

As well as returning the entity in array format to the JSON response, I’m also creating HAL content negotiation. I was about to do the same for the index listing the collection of entities, when i thought I could simplify my controller by taking HAL stuff out of it!

The middleware signature looks like this:

namespace Psr\Http\Server;

use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\ServerRequestInterface;

/**
 * Participant in processing a server request and response.
 *
 * An HTTP middleware component participates in processing an HTTP message:
 * by acting on the request, generating the response, or forwarding the
 * request to a subsequent middleware and possibly acting on its response.
 */
interface MiddlewareInterface
{
    /**
     * Process an incoming server request.
     *
     * Processes an incoming server request in order to produce a response.
     * If unable to produce the response itself, it may delegate to the provided
     * request handler to do so.
     */
    public function process(ServerRequestInterface $request, RequestHandlerInterface $handler): ResponseInterface;
}

So first up I’ll create my Middleware class and make it do nothing but pass stuff along as it was received

<?php

namespace Bone\Http\Middleware;

use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\ServerRequestInterface;
use Psr\Http\Server\MiddlewareInterface;
use Psr\Http\Server\RequestHandlerInterface;

class HalEntity implements MiddlewareInterface
{
    /**
     * @param ServerRequestInterface $request
     * @param RequestHandlerInterface $handler
     * @return ResponseInterface
     */
    public function process(ServerRequestInterface $request, RequestHandlerInterface $handler): ResponseInterface
    {
        return $handler->handle($request);
    }
}

Then register the middleware on your stack. For myself using league/router, you can add middleware to all routes, groups of routes, or individual routes, which is what I’ll do here:

$route->map('GET', '/dragon/{id:number}', [DragonApiController::class, 'viewAction'])
->middleware(new HalEntity());

A quick check, and yes, the page is still loading as per normal. So now to get the refactoring done!

First we strip out the HAL link stuff and the array merge, and move it to a class impolementing the PSR middleware interface, to make it look like this:

<?php

namespace Bone\Http\Middleware;

use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\ServerRequestInterface;
use Psr\Http\Server\MiddlewareInterface;
use Psr\Http\Server\RequestHandlerInterface;

class HalEntity implements MiddlewareInterface
{
/**
* @param ServerRequestInterface $request
* @param RequestHandlerInterface $handler
* @return ResponseInterface
*/
public function process(ServerRequestInterface $request, RequestHandlerInterface $handler): ResponseInterface
{
$uri = $request->getUri();

$hal = [
'_links' => [
'self' => [
'href' => $uri->getScheme() . '://' . $uri->getHost() . $uri->getPath(),
]
],
];

$response = $handler->handle($request);

$data = \json_decode($response->getBody()->getContents(), true);
$data = \array_merge($hal, $data);

$body = $response->getBody();
$body->rewind();
$body->write(\json_encode($data));

return $response->withBody($body);
}
}

And your controller action will look like this 🙂 :

/**
 * @param ServerRequestInterface $request
 * @param array $args
 * @return ResponseInterface
 * @throws \Doctrine\ORM\EntityNotFoundException
 */
public function viewAction(ServerRequestInterface $request, array $args): ResponseInterface
{
    $dragon = $this->service->getRepository()->find($args['id']);

    return new JsonResponse($dragon->toArray() );
}

Bone MVC Framework v3 released

vX.Y.Z – Major.minor.patch – or, BC breaks, features, hotfixes! The best software would be 1.∞.x, as that would mean an absolute tonne of new features have been added, without breaking any backwards compatibility!

Having said all of that, I would now like to present my latest BC break, and announce the release of Bone MVC Framework v3.0 😀

https://bonemvc.delboysplace.co.uk/en_PI

What’s new? Quite a lot. Try it out! We (I) ripped out the old Bone Router and replaced it with league/route, a PSR-15 middleware router based on Fastroute! We replaced Pimple with delboy1978uk/barnacle, a PSR-11 wrapper for Pimple (Pimple has a PSR-11 wrapper but it doesn’t implement the interface correctly and so doesn’t work. Fabien Potencier has went in a huff with the PHP-Fig group so I didn’t bother sending him a Pull Request 😐). Both of these improvements have allowed for a far more flexible and modular approach to site building, as you will see if you try it out!

Ok then, let’s see it in action

The recommended approach is to use Docker, as this means you can get the identical dev enironment that I do, regardless of your operating system. If you don’t wish to take this approach, you can set a virtualhost and database yourself.

Docker VM dev environment setup

To get the Docker VM working, make sure you have Git, VirtualBox (or VMWare), and Docker installed. Once you do, open a terminal window. We need to run a one off command, telling Docker to create a VirtualBox VM which will be your new Docker Machine.

docker-machine create --driver virtualbox default

You now have a Docker Machine. So, the usual process when you would like to begin coding ais as follows:

docker-machine start

Now it’s started, run this for every tab you open in the terminal to set up environment vars in your terminal

eval $(docker-machine env)

Installing Bone MVC Framework

By default, we use a fake domain https://awesome.scot, so sudo nano /etc/hosts and add the following:

awesome.scot 192.168.99.100

Now if you cd into wherever you wish to store the project (~/Sites in my case), we can clone the framework skeleton project:

git clone https://github.com/delboy1978uk/bonemvc
cd bonemvc

Inside the skeleton files, you’ll see a docker-compose.yml, and a build folder. If your’e nosey, have a look and you’ll see how it works! The VM comes with Apache, PHP 7.3, XDebug and a stack of PHP modules, Composer, Mariadb, Mailhog, self signed SSL certificate, and most of the stuff you need. Lets start it up.

docker-compose up

This will take a few minutes the first time you do this, but thereafter should be a lot faster, even with different Bone MVC projects in different folders.

At this point, your terminal is now tailing the logs from each of your services, mail, php, apache, mariadb, and so we need another tab open as we want to log into to the Linux VM to run composer. Remember when you open a new tab, you need to run the eval command again to get the environment variables.

eval $(docker-machine env)
docker-compose exec php /bin/bash

At this point, you’ll notice a slight change in your shell’s look. We are now logged into the server! Lets install the dependencies and we are ready to rock!

eval $(docker-machine env)
docker-compose exec php /bin/bash
composer install

Once composer is finished doing it’s thing, we are ready to browse to the project! Open https://awesome.scot in your browser, and you should see the following!

Okay great, so how is it all structured?

As per most projects these days, the main entrypoint is public/index.php, and all public assets are installed in public/. Bone use the environment variable APPLICATION_ENV to differentiate between your dev setup and staging, production, etc.

The build folder is part of the Docker setup as mentioned earlier, so can be ignored.

The config folder allows for subfolders based on the APPLICATION_ENV, which will override the config files sitting directly under config/. You can add as many .php files into these folders that you like.

The data folder containsa variety of things. There is a cache storage folder (delboy/bone-cache coming soon), a logs folder (delboy1978uk/bone-log coming soon), a proxies folder (for doctrine entity proxies if using delboy1978uk/bone-doctrine), a translations folder with several languages pre-configured (use poedit to edit these files), and an uploads folder.

The migrations folder is use to store database migrations if using delboy1978uk/bone-doctrine.

The src/ folder contains your applications modules, and currently only contains one module, App.

Inside src/App, you will see a class called AppPackage.php, look inside!

All packages implement RegistrationInterface, and any packages containing controller routes will also implement RouterConfigInterface. addToContainer() allows you to create object factories so you can inject any dependencies. In the AppPackage, the Controller requires the view engine so it can render pages, so we create a controller factory and pull the view from the container (view is set up automatically by Bone):

    /**
     * @param Container $c
     */
    public function addToContainer(Container $c)
    {
        $c[IndexController::class] = $c->factory(function (Container $c) {
            $view = $c->get(PlatesEngine::class);
            return new IndexController($view);
        });
    }

The addRoutes() method tells the router which actions in which controller to call.

    /**
     * @param Container $c
     * @param Router $router
     * @return Router
     */
    public function addRoutes(Container $c, Router $router): Router
    {
        $router->map('GET', '/', [IndexController::class, 'indexAction']);
        $router->map('GET', '/learn', [IndexController::class, 'learnAction']);

        return $router;
    }

If you have Doctrine entities, return true in the hasEntityPath() method, and return the path to the Entity folder in the getEntityPath() method.

Controllers now just take in a PSR-7 request, and return a PSR-7 Response. However we can wrap the request and response in middleware! See league/route docs for more info.

For a bit of fun, I’ll show you a module with a Doctrine entity, and just how quickly you can add a new feature! In the VM CLI, type the following:

composer require delboy1978uk/bone-doctrine
composer require delboy1978uk/bone-gentest

You now have the doctrine and my test entity package in the vendor folder.

Open config/packages.php, and add the two packages.

<?php

use BoneMvc\Module\App\AppPackage;
use BoneMvc\Module\BoneMvcDoctrine\BoneMvcDoctrinePackage;
use Random\Developer\JediPackage;

return [
    'packages' => [
        AppPackage::class,
        BoneMvcDoctrinePackage::class,
        JediPackage::class,
    ],
    'viewFolder' => 'src/App/View'
];

Ok, so we have entities need migrated to the database. In the VM CLI, type the following:

migrant diff
migrant migrate

You’ll notice that migrant is just Doctrine migrations, but tweaked for usage with Bone MVC. Now you have migrated, head over to https://awesome.scot/jedi

You now have a demo module with an index list page, add edit and delete functionality, with form filtering and validation and rendering done via delboy1978uk/form. Have a look through the module code to see how to set everything up!

Please play around and have fun with it, and feel free to leave any comments!

Ready to rock PHP 7.3 Docker LAMP Stack

I have built a full LAMP stack that comes with the following:

  • Apache
  • PHP 7.3.3
  • Mariadb
  • MailHog
  • XDebug
  • HTTPS Virtualhost with holding page

Here’s how you get a full LAMP stack up and running in no time at all, which will be identical on any platform.

Firstly, install Docker and Virtualbox if you don’t have them. Then create a default base VM.

https://www.docker.com/community-edition

https://www.virtualbox.org/

docker-machine create --driver virtualbox default

OK. So, first thing you need to do is start your VM:

docker-machine start
docker-machine env
eval $(docker-machine env)

You should run eval $(docker-machine env) in any other terminal tabs you open too, this sets up environment vars for setting up docker. Take a note of that IP address, and edit your /etc/hosts file

192.168.99.100 awesome.scot

Ok, lets clone the LAMP stack:

git clone https://github.com/delboy1978uk/lamp
cd lamp

This is another one off, build the image.

docker-compose build

That’s it! Now start your docker image like this

docker-compose up

and you can finally open your browser to

https://awesome.scot

Install your composer package somewhere outside vendor

And I don’t mean change the vendor folder name, that’s easy. I mean getting one package in a custom location, whereas the rest still go into vendor.

This will only work with packages that require composer/installers, so if it isn’t your own package and they don’t require that in, then you can stop reading.

Still here? Awesome. In your vendor package, you need to add the installer

composer require composer/installers

Now in your composer.json, change (or add) the type. The package we just required in is actually to help various CMS’es and frameworks, so you must supply a valid type. Thankfully, it doesn’t matter which we choose, as we override the install path anyway.

"type": "ppi-module",

Commit that, and then go to your main project. In the composer.json, add the following:

"extra": {
"installer-paths": {
"/some/custom/path": ["your/package"]
}
}

Now when you run composer install, you’ll see everything but your package in the vendor folder, and your custom package in its custom location! 😀

Setting up a multi user XDebug proxy

I absolutely love XDebug, it’s a tool that no PHP developer should go without! I constantly find myself in contracts where the workplace aren’t using it, and I end up installing and configuring it for them. However, using XDebug alone will only allow one connection at a time. This is fine when we are are developing in an isolated environment, but in many offices an entire team of developers could be working on a server at once! The solution to this is to add an XDebug proxy!

I will assume for this tutorial that you already have XDebug working and can connect with one user.

The first step is to download the debugger proxy itself to the machine running PHP. This can be found here http://code.activestate.com/komodo/remotedebugging/

Now, before you click on the PHP download, DON’T. You actually want the Python one. Download the package for your OS. If you use Linux or Mac, you’ll need python installed too. Windows comes with a binary .exe so Python isn’t required.

Depending whether you are using PHP-FPM or not will determine which port XDebug is currently listening on. PHP-FPM itself runs on port 9000, so if you are using that, your XDebug port will probably be 9001. If you are using PHP as an apache module or whatever, XDebug will probably be listening on port 9000. For the purposes of this article I will assume XDebug is listening on port 9000.

To start the proxy on Windows:

.\pydbgpproxy.exe -d 127.0.0.1:9000 -i 100.200.30.40:9001

To start the proxy on Linux:

export PYTHONPATH=<_Komodo install directory_>/lib/support/dbgp/pythonlib;$PYTHONPATH
cd <_Komodo install directory_>/lib/support/dbgp/bin
python pydbgpproxy -d 127.0.0.1:9000 -i 100.200.30.40:9001

To start the proxy on OS X:

export PYTHONPATH=<_Komodo install directory_>/Contents/SharedSupport/dbgp/pythonlib;$PYTHONPATH
cd <_Komodo install directory_>/Contents/SharedSupport/dbgp/bin
python pydbgpproxy -d 127.0.0.1:9000 -i 100.200.30.40:9001

The options;

-d is the debugger itself, listening on port 9000 on the same machine

-i is the IDE listener port. The IP itself is actually the external IP of this same machine.

If it’s running, you should see something like this:

proxy> .\pydbgpproxy.exe -d 127.0.0.1:9000 -i 10.227.148.40:9001
INFO: dbgp.proxy: starting proxy listeners. appid: 8080
INFO: dbgp.proxy: dbgp listener on 127.0.0.1:9000
INFO: dbgp.proxy: IDE listener on 10.227.148.40:9001

Now back in PHPStorm (or some inferior product) Goto the DBGp Proxy settings in the tools menu, and select configure. Put your name in the IDE key box, the IP of the PHP server, and port 9001.

Again in the tool DBGp proxy, and click Register IDE. If successful, you should be able to see lines like this:

INFO: dbgp.proxy: Server:onConnect ('10.227.148.38', 60748) [proxyinit -p 9000 -k DEREK -m 1 ]
INFO: dbgp.proxy: Server:onConnect ('10.227.148.38', 60996) [proxyinit -p 9000 -k KPATIL -m 1 ]

And that’s it! Now two or more developers can simultaneously use XDebug at the same time. Have fun!

Fixing detatched Doctrine entity problems

I’m not sure WHY it happens, but sometimes Doctrine entities can become detatched from the entity manager (If anyone does, please comment below!).

If you’ve ever added an existing object to a new one and tried to persist, you may see this error:

[Doctrine\ORM\ORMInvalidArgumentException]
A new entity was found through the relationship ‘OAuth\Client#user’ that was not configured to cascade persist operations for entity: OAuth\OAuthUser@000
000007e2d73fd0000000054f34ea6. To solve this issue: Either explicitly call EntityManager#persist() on this unknown entity or configure cascade persist t
his association in the mapping for example @ManyToOne(..,cascade={“persist”}). If you cannot find out which entity causes the problem implement ‘OAuth\OA
uthUser#__toString()’ to get a clue.

In the case of my code above, the user entity had become detached somehow. Previously, I had managed to solve this problem by calling merge($entity) instead of persist($entity), but this is itself causes problems. If you have join column collections, these will no longer persist!

Therefore the answer is to re-attach the entity first!

public function create(Client $client)
{
$em = $this->getEntityManager();
$user = $client->getUser();
if ($em->getUnitOfWork()->getEntityState($user) !== UnitOfWork::STATE_MANAGED) {
$user = $em->merge($user);
$client->setUser($user);
}
$em->persist($client);
$em->flush();
return $client;
}

Binary Search Trees in PHP

Full credit to googleguy, who can be found on Freenode IRC channel ##php (sometimes).

Imagine you’re a phone company and you want to implement an automated system on your website for assigning phone numbers for your customers. There are two use cases. One, is that the customer can ask if a specific number is available. The other, is that the customer can ask for any random number that’s available. What data structure would you use to efficiently store and search these phone numbers? Keep in mind that you could potentially have billions of these phone numbers (1011 if we include international numbers).

If you use an Array, access is O(1) so it seems like a good idea at first, but that’s only in cases where you know the index of the value you want to access. Finding that index means searching the entire array, which is really O(n). If you use a Hash Table, the cost of search is O(1) so it seems like a better idea, but the cost of deletion and insertion can be O(n) in the worst case scenario (where you have 100% collision).

The best answer is to use a BST or Binary Search Tree, because they have average O(log n) for search, insertion, and deletion. That is assuming it’s a self-balancing tree implementation.


If you’re not at all familiar with Binary Search Trees or Binary Trees in general, let me give you a simple analogy to try and grasp this particular concept of search in computer science.

Let’s say a friend of yours is stuck on the road due to car trouble. Like a good friend you go to pick them up. You know they are stuck at a fork on this road and you arrive at a fork. At the fork there is a sign with a number on it. The number is 5 and your friend sees a sign with the number 16. You know that if you go left, you will only find numbers less than 5 and if you go right you will find numbers greater than 5. So you go right and arrive at another fork. This time you see a sign with the number 11. What do you do?

Knowing how to get to your friend, on this road, is no different than knowing how to search a (BST). You must always know two things, and always be able to make one of two simple choices. Is this number more than or less than the number I’m searching for? Do I go left, or do I go right at the fork?

Learning Opportunity

A BST is a data structure made up of nodes (the forks in our road analogy) which have left and right branches that, may or may not, each have an attached node (i.e there could be a dead end). A node attached to the left branch is less than the current node. A node attached to the right branch is greater than the current node. This always holds true no matter which node you’re currently looking at. The entirety of the structure makes up what looks like a tree, given all these nodes connected by branches. The fact that there can only be two branches on each node is what makes it binary. In the context of search these distinct properties are what make a BST very efficient.

The idea here is that you can always discard one-half of the tree with every iteration. This makes it possible to search for any value in the tree logarithmically, i.e. O(log n). Consequently, the cost of insertion or deletion to and from the tree are the same.

So say we have 9 random integers in a BST and we want to find out if the integer 16 is one of them. You always start at the root node of the tree and compare the given value to the search value. If the node’s value is greater than the search value we move to the node in the left branch. If the node’s value is less than the search value we move to the node in the right branch. Repeat this step until you find the value or reach the end (a null branch).

In PHP, we know that arrays are really just ordered hashmaps, and the cost of search in a hashmap is O(1), which makes it much faster than O(log n). However, BSTs have other properties that make them more desirable than a hashmap in many use cases. For example, finding the min or max value in a hashmap is O(n) whereas in a BST it is O(log n). In a real vector, or contiguous block, the values would need to be sorted before search can be O(1), but Quicksort requires O(n log n) making it very inefficient for insertion and deletion, because you have to resort each time you insert or delete.

Implementation in PHP

The best way to learn is to just roll up your sleeves and do it, right? So let’s implement a BST in PHP and learn something!

We already know that a binary search tree is made up of nodes. The properties that best describe these nodes is that they have a value, a left branch, and a right branch. Each branch can also contain a node. So let’s write up the node class first since it’s simply described by a set of properties and doesn’t have any real behavior (i.e. methods).

class Node
{
    public $value, $left, $right;

    public function __construct($value)
    {
        $this->value = $value;
    }
}

Simple enough. Now that we have a way to create a node, we’ll need the actual binary search tree implementation to be able to create the tree. So far we understand that a BST has a root node, so that will be one of the properties that describe this object.

The BST also has a couple of behaviors, that will make up the methods of this class. We’ll start by implementing the simplest of these behaviors, which is search.

class BST
{
    public $root;

    public function __construct($value = null)
    {
        if ($value !== null) {
            $this->root = new Node($value);
        }
    }

    public function search($value)
    {
        $node = $this->root;

        while($node) {
            if ($value > $node->value) {
                $node = $node->right;
            } elseif ($value < $node->value) {
                $node = $node->left;
            } else {
                break;
            }
        }

        return $node;
    }
}

As you can see search is quite simple. We merely compare the search value to the current node (starting at the root node). If it’s greater than the value of the current node, we go to the node in the right branch. If it’s less than the value of the current node we go to the node in the left branch. If they’re equal we return the current node. This repeats in the loop until we reach a branch that is null. This tells us that the value does not exist in the tree.

Insertion

Now let’s implement the insert functionality of the tree in order to make it really useful. This is very similar to our search behavior, with one difference. We’re looking for a null branch to insert the value. So ideally, our insert function should look fairly similar to our search function with the minor difference being that we will attach a new node to the null branch reached at the end of our loop.

public function insert($value)
{
    $node = $this->root;
    if (!$node) {
        return $this->root = new Node($value);
    }

    while($node) {
        if ($value > $node->value) {
            if ($node->right) {
                $node = $node->right;
            } else {
                $node = $node->right = new Node($value);
                break;
            }
        } elseif ($value < $node->value) {
            if ($node->left) {
                $node = $node->left;
            } else {
                $node = $node->left = new Node($value);
                break;
            }
        } else {
            break;
        }
    }
    return $node;
}

OK, so what’s going on here? Every time we go right or left we check to see if the branch is null before proceeding and based on that information we make a decision to either attach a new node to the branch and break from the loop, or move to the next node. So insertion happens just before the point where we would normally end up with null and break from the loop, but since we can’t attach a node to null, we attach just before the $node is assigned null, and then break.

Min

Before we can learn about deleting nodes from the tree we must first grasp an important concept of binary search trees, which is how to find their minimum and maximum values. Remember that every time you go left in a BST you are guaranteed to get a value smaller than its parent. So if we start at the root node, and keep going left until we arrive at a node with a null left branch, we should have reached the smallest value in the tree.

So our implementation of min in the BST class should look something like this.

public function min()
{
    if (!$this->root) {
        throw new Exception('Tree root is empty!');
    }

    $node = $this->root;
    return $node->min();
}

Notice that BST::min() will only call on root->min() so we must implement the actual functionality in our Node class. This makes sense since we will likely find it useful to find the minimum value from a specific branch in the tree and not just from the root.

So our implementation of min in the Node class should look something like this.

public function min()
{
    $node = $this;
    while ($node->left) {
        if (!$node->left) {
            break;
        }
        $node = $node->left;
    }
    return $node;
}

Notice that if you start at the root and keep going left in our example tree illustrated earlier, you will ultimately reach the smallest value in the tree, but consequently, if you were to start at any other node in the tree and keep going left until you arrive at a null left branch, you will have the smallest value in that branch.

Max

You can probably guess how the implementation for max will work in our BST class…

public function max()
{
    if (!$this->root) {
        throw new Exception('Tree root is empty!');
    }

    $node = $this->root;
    return $node->max();
}

In our Node class the same concept applies, but we’re going right instead of left.

public function max()
{
    $node = $this;
    while ($node->right) {
        if (!$node->right) {
            break;
        }
        $node = $node->right;
    }
    return $node;
}

Deletion

Great, now we have a way to insert nodes into the tree and a way to search the tree. Next, we need to have a way to delete nodes from the tree. Deletion is slightly trickier than search and insertion, but that’s OK, because we’re super ninja programmers that can code our way out of any problem, right?

Crickets…

There are three kinds of deletion in a binary search tree we should be aware of, because they’re each handled just a little differently.

  1. Deleting a node that has no children
  2. Deleting a node that has one child node
  3. Deleting a node that has two child nodes

The easiest is the first one with no nodes, obviously. You simply just delete the node by assigning null to its parent branch, which means you must keep track of the parent branch. So in our earlier BST, if we decided to delete the node with the value of 1, its parent node 2, would then have a left branch of null and the node would be removed from the tree.

Now, let’s say we want to delete the node with the value 2, which at this point only has one branch (containing the node with the value 4). It doesn’t actually matter if the child’s branch is right or left. As long as it only has one branch we simply take the node in that branch and attach it to the parent branch of the node we’re about to delete.

Finally, the last case is when we wish to delete a node that has both branches occupied. To do this we must first find the smallest node of its right branch. We can do this using the node’s min method that we implemented earlier.

Now we assign that min value to value of the node we’re trying to delete, which will give us duplicate nodes.

Next we apply a delete operation to the duplicate node.

Now, to make this concept a little easier to implement we can make one small modification to our Node class. By allowing each node to keep track of its parent node, we never need to pass along the parent by keeping track of it from the calling context. This will simplify the last and most complex case of deletion for us quite a bit. It will also make it easier to walk the tree later.

class Node
{
    public $parent, $value, $left, $right;

    public function __construct($value, Node $parent = null)
    {
        $this->value = $value;
        $this->parent = $parent;
    }
}

Now we just need to modify the insert method of our BST class to pass the parent into the Node’s constructor when doing insertion.

What this gives us is a doubly-linked list of nodes. Making it possible for a child node, in the tree, to know about its parent node. This is also a recursive relationship, because the child stores the parent property as the parent node object itself. So, be careful of (recursion there) recursion there.

public function insert($value)
{
    $node = $this->root;
    if (!$node) {
        return $this->root = new Node($value);
    }

    while($node) {
        if ($value > $node->value) {
            if ($node->right) {
                $node = $node->right;
            } else {
                $node = $node->right = new Node($value, $node);
                break;
            }
        } elseif ($value < $node->value) {
            if ($node->left) {
                $node = $node->left;
            } else {
                $node = $node->left = new Node($value, $node);
                break;
            }
        } else {
            break;
        }
    }
    return $node;
}

The root will have a null parent, and that’s fine. So nothing else should change in our implementation. We’re now ready to implement delete in our BST class.

public function delete($value)
{
    $node = $this->search($value);
    if ($node) {
        $node->delete();
    }
}

The delete method in our BST class is rather simple. It just searches for the node we’re trying to delete, and calls the delete method on that Node. The Node should be able to take care of the rest from here.

public function delete()
{
    if ($this->left && $this->right) {
        $min = $this->right->min();
        $this->value = $min->value;
        $min->delete();
    } elseif ($this->right) {
        if ($this->parent->left === $this) {
            $this->parent->left = $this->right;
            $this->right->parent = $this->parent->left;
        } elseif ($this->parent->right === $this) {
            $this->parent->right = $this->right;
            $this->right->parent = $this->parent->right;
        }
        $this->parent = null;
        $this->right   = null;
    } elseif ($this->left) {
        if ($this->parent->left === $this) {
            $this->parent->left = $this->left;
            $this->left->parent = $this->parent->left;
        } elseif ($this->parent->right === $this) {
            $this->parent->right = $this->left;
            $this->left->parent = $this->parent->right;
        }
        $this->parent = null;
        $this->left   = null;
    } else {
        if ($this->parent->right === $this) {
            $this->parent->right = null;
        } elseif ($this->parent->left === $this) {
            $this->parent->left = null;
        }
        $this->parent = null;
    }
}

Our Node::delete(), on the other hand, is not so simple.

The first if condition, in this method, is the number 3 scenario on our list, mentioned earlier. When the node we’re trying to delete has both a left and right branch, we must first find the min of its right branch and set its value to the value of that node. Then we call delete on that node. Notice, once we do that the min node cannot meet the first condition again. It will only fall into one of the other two scenarios—it either has one child or no children at all (because it will never have a left branch).

The animation here, highlights the most complex of deletion cases, where we need to perform a delete on a node that has both left and right branches occupied.

Let’s Take a Walk

Now, we’ll look at how you can walk through the binary search tree and some of the different ways you can do it.

One thing you should note about tree structures like the binary search tree is that they are not linear structures, like vectors, or linked lists. They can not be viewed sequentially. They are actually traversed as graphs, which means there is more than one way to traverse them. However, there must be a systematic way in which you can visit every node in the tree exactly once. Otherwise, the implementation is considered inefficient and potentially defective. Imagine you wanted to print all of the node values in the tree and your method allowed some nodes to be visited more than once. This wouldn’t be very helpful.

There are two conceptual ways in which you can traverse a binary tree. One way is iterative, meaning it works sequentially (one after the other), but requires a little more memory. The other way is to do it recursively, and since a binary tree is by definition is a corecursive structure, that would make sense. Doing it iteratively requires storing or deferring some nodes to memory before traversing them. This is usually accomplished by using a stack or queue data structure. I actually find it easier to implement a walk method on a tree recursively, but that’s just me. Then again, I’m the one writing this so, what do I care how you like to do it?!

Now whether the method of implementation is recursive or iterative, you still have to choose the order of traversal on the tree, which can be either Depth First or Breadth First. Breadth First order is also sometimes referred to as level order, because you visit every node on a particular level of the tree before moving down to the next level.

In Depth First order there are three different algorithms commonly used for traversal.

  1. Pre-order
  2. In-order
  3. Post-order

We’re not going to go through all of these approaches here. In this article, we’re just going to use Depth First In-order traversal, since I find it to make the most sense for demonstration purposes.

Depth First In-order Traversal

First, let’s plot out our chartered course so that I have some high-level view of where we want to go, before we try to get there.

The sequence looks kind of like a roller coaster ride, but this particular aproach will give us the nodes in order.

So the result is all values in ascending order as [1, 2, 4, 5, 7, 11, 16, 23, 34], which is what you’d expect if this was a sorted array, for example.

public function walk(Node $node = null)
{
    if (!$node) {
        $node = $this->root;
    }
    if (!$node) {
        return false;
    }
    if ($node->left) {
        yield from $this->walk($node->left);
    }
    yield $node;
    if ($node->right) {
        yield from $this->walk($node->right);
    }
}

I’ve used a generator implementation, which will only work in PHP 7, but that’s just because it makes the implementation so much simpler. You can accomplish the same thing in PHP 5 by replacing the yield calls with an array assignment, for example, and then using array_reduce to return the actual results back to the caller. You might need a wrapper function for this, but the concept is the same. It’s just so much less code using a generator!

If we ran this code…

// Instantiate a new tree with root node of 5
$bst = new BST(5);

// Insert left branch nodes
$bst->insert(2);
$bst->insert(1);
$bst->insert(4);

// Insert right branch nodes
$bst->insert(11);
$bst->insert(7);
$bst->insert(23);
$bst->insert(16);
$bst->insert(34);

// Walk the tree
$tree = $bst->walk();
foreach($tree as $node) {
    echo "{$node->value}\n";
}

We should get output like this…

1
2
4
5
7
11
16
23
34

Other Real World Applications

So, you’re probably thinking, what on earth would I need a binary tree for, and especially in a language like PHP? And you’re probably very right for thinking that, but don’t confuse what you’ll need it for with why you almost never need to implement a binary tree in practice. In PHP, particularly, the performance gains of binary search are dwarfed by the context-switching overhead of having to implement them as objects (it takes a lot longer to carry out a function call in PHP than it would to do simple pointer dereferencing in a language like C, for example).

It’s very important to know about and understand this data structure in concert with other data structures so that you’re aware of the performance (time and memory) characteristics of each in a given problem. In many cases, you’ll be using tools that implement a btree quite frequently, but will rarely be required to implement one yourself. For example, your Database Management System (DBMS) implements a type of btree commonly referred to as B+ tree. In 3D computer graphics, many games will often implement a kind of btree usually called a Binary Space Parition (BSP), which allows for recursively subdividing a space into convex sets by hyperplanes.

There are many areas where trees or graphs (nodes and edges) are generally useful in a wide array of programming problems. So having a good grasp of BST, is not a bad start!

Final Thoughts

As mentioned in the beginning of this article, only a self-balancing implementation of a BST, such as AVL-tree will accomplish average O(log n) in search, insertion, and deletion, because as you can see by our implementation if I were to simply insert the numbers [1,2,3,4,5] or [5,4,3,2,1] you would have O(n) since the tree would be entirely right or left heavy.

So, as always, I like to leave something up to the imagination. So I’ll leave it to you to play around with the implementation and see if you can figure out how to implement a balance method that will automatically balance the tree.