This post is about reducing the number of SQL queries a page makes by using caching. I'm going to talk about several different methods of caching.

Consider this code using Eloquent models in Laravel:

    private function addRoleCheckboxes()
    {
        // Add one checkbox widget to the form for each role.
        foreach (Role::all() as $role) {
            $checkbox = new Checkbox($this->roleCheckboxName($role), $role->name);
            $this->addWidget($checkbox);
        }
    }

    public function getRolesFromCheckboxes(): array
    {
        // Collect the ids of the roles whose checkboxes are ticked.
        $ids = [];
        foreach (Role::all() as $role) {
            $name = $this->roleCheckboxName($role);
            if ($this->$name->get()) {
                $ids[] = $role->id;
            }
        }
        return $ids;
    }

This is from a class that represents an HTML form for managing permissions for Roles. You can see here that one function is responsible for adding a Checkbox to the form for each role, and the other returns a list of the role ids that have been checked.

Now my problem here is the Role::all() part. Given that the Role model is part of the site's permissions, it's likely to be used many times on most pages, and we don't want to be hitting the database for every call. The same is probably true of other data on a site as well, so more generally we could consider anything of the form Model::all(), or even Model::find($id).

If we had something like Role::find(1) called many times, what that means is that some text is generated that looks something like select * from roles where id=1, which is sent to the PostgreSQL server. The server parses and tokenizes the query, then looks up the data in memory or on disk to get the list of rows that match, in this case a single row, and returns the result, usually via TCP/IP. TCP/IP involves packetization, which is another data transformation to allow the data to be sent over the network, though in most of my cases it's not a real network, it's the same machine. Then PHP parses that result set into a data structure, which Eloquent turns into an instance of the Role model.

Basically I'm saying that finding a single record requires quite a bit of work, and that it might be good not to do that same work over and over again, certainly when we know the second response is not going to be different from the first one, or when we know the response is highly unlikely to be different, and even if it was, it wouldn't matter.

In the words of the old joke, the only two hard problems in computer science are naming things, cache invalidation, and off-by-one errors.

Cache invalidation can be hard. If you cache a product listing, or the search results for a product, then a change to the product's price needs to show up immediately, but a change to a product gives you no guarantee about which cached listings or search results that product appears in. You have to do work to figure out that reverse relationship and then invalidate the right caches, which is easy to get wrong, and then you end up with horrible bugs.

Or you can cheat and just invalidate everything fairly quickly, all the time. Because another common failure of caching is that when you have too many items in the cache, dealing with the cache can end up being more work than working things out from scratch. You may think that the difficult generic relationship structure that generates the navigation on your site is complicated and needs to be cached, only to discover that the number of variations you would have to store (one per user? one per role?) is higher than the number of items you're deciding between.

So we now have three things to consider:

  1. Repeated queries
  2. Cache invalidation
  3. Overall cache size.

Is it possible to code something that can manage all this automatically? Well, yes, I guess it is possible, but the code that did this would be making an attempt to understand which of a number of caching methods is best to use in each case, and that's information we coders probably already have as we build our sites. Perhaps this can be an AI thing as AI improves.

The normal Laravel Cache mechanism is a key => value type structure. You generate the HTML (or whatever) you want, give it a name, and store the data against that name. It then goes to some Laravel cache driver, like Memcached, or Redis, or just a file somewhere. Loading from the cache will require some parsing, but that might be OK if it's just a string or something. Caching like this works across many requests, so you want to be careful not to cache things that contain one user's data and then show them to another user. I've seen and heard about this kind of error loads of times.
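As a rough sketch of that style of caching using Laravel's Cache facade (the key name, the one-hour TTL, and the view name are just made-up examples):

    use Illuminate\Support\Facades\Cache;

    // Render the role list once, store it under a key, and reuse it for an hour.
    $html = Cache::remember('role-list-html', 3600, function () {
        return view('roles.list', ['roles' => Role::all()])->render();
    });

    // Whenever a role changes, the cached fragment has to be invalidated.
    Cache::forget('role-list-html');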

So what I want is a cache where only repeated SQL queries within the same request are cached.

Well, each request is dealt with by a single process. Let’s just store the results of the SQL query in memory. It’s basically there already.

I could create something like this:


    class MyRoleProvider {
        private static $roles = null;
        private static $loaded = false;

        public static function all()
        {
            // Only hit the database on the first call; afterwards return
            // the result object we already have in memory.
            if (!self::$loaded) {
                self::load();
            }
            return self::$roles;
        }

        public static function load(): void
        {
            self::$roles = Role::all();
            self::$loaded = true;
        }
    }

Then instead of typing Role::all() I could type MyRoleProvider::all(). On the first call it would run Role::all(), but subsequent calls would just return the result object already in memory. So that would solve my problem, and in some sense the result would be "cached".
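For example, the form code from earlier only needs to change where it fetches the roles; a quick sketch of how I'd use it:

    private function addRoleCheckboxes()
    {
        // Same as before, but the roles now come from the in-process cache
        // rather than a fresh query each time.
        foreach (MyRoleProvider::all() as $role) {
            $checkbox = new Checkbox($this->roleCheckboxName($role), $role->name);
            $this->addWidget($checkbox);
        }
    }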

I could add a function to the code above to cover find(), and it could perhaps be clever enough to understand that if we've already done all() then we might already have the data we need for the find(), saving us another SQL query. This is simple code; we could pass the model class as a parameter and have it work for all sorts of models. Excellent. Ah, but there is a cost, and that cost is the memory used to hold the records. This might not be a concern, but if there were a large number of roles, or if we did the same thing with other models, it could cause memory issues.
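Here's a rough sketch of what that more generic version might look like (the ModelCache name and method signatures are mine, just for illustration, and it assumes the models use an id primary key):

    class ModelCache {
        // Per-request cache of whole-table results, keyed by model class name.
        private static $collections = [];

        public static function all(string $modelClass)
        {
            if (!isset(self::$collections[$modelClass])) {
                self::$collections[$modelClass] = $modelClass::all();
            }
            return self::$collections[$modelClass];
        }

        public static function find(string $modelClass, $id)
        {
            // If the whole table is already loaded, search it in memory
            // instead of issuing another query.
            if (isset(self::$collections[$modelClass])) {
                return self::$collections[$modelClass]->firstWhere('id', $id);
            }
            return $modelClass::find($id);
        }
    }

    // Usage: ModelCache::all(Role::class) or ModelCache::find(Role::class, 1)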

I have used this strategy, combined with normal Laravel Cache style caching, to get 300+ query pages down to an average of fewer than 10 queries.

There was another caching trick I used once as well, which was to store the basket total and the basket items in the client's local storage, return a template with only those two values missing, and use JavaScript to load them in. That meant I could cache the complete header and footer for an ecommerce site as two cache values (I was using PHP's APC functions). I guess we could call that caching data in the browser.

Don’t forget that it can be worthwhile using the HTTP caching directives as well.
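In Laravel that can be as simple as setting a header on the response; a minimal sketch (the view name and the one-hour max-age are arbitrary):

    public function show()
    {
        $content = view('products.index')->render();

        // Let the browser, and any intermediate caches, reuse this response
        // for up to an hour instead of requesting it again.
        return response($content)->header('Cache-Control', 'public, max-age=3600');
    }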

So, when it comes to caching your SQL queries, understand the methods you have and their trade-offs. Here's the list of caching methods:

  1. Multi request accessible data cache
  2. Data cache in memory of process
  3. In browser data cache
  4. In browser page cache (including image cache)