# Caches

WoltLab Suite offers two distinct types of caches:

1. [Persistent caches](php_api_caches_persistent-caches.md) created by cache builders whose data can be stored using different cache sources.
2. [Runtime caches](php_api_caches_runtime-caches.md) store objects for the duration of a single request.

## Understanding Caching

Every so often, plugins use cache builders or runtime caches to store their
data, even if there is absolutely no need for them to do so. Usually, this
stems from a strong opinion about the total number of SQL queries on a page,
including but not limited to some magic threshold numbers that should not be
exceeded for "performance reasons".

This misconception can easily lead to the belief that SQL queries should be
avoided, or that their results should at least be written to a cache so that
the queries don't need to be executed as often. Unfortunately, this completely
ignores the fact that a single query can take down your app (e.g. a full table
scan on millions of rows), whereas 10 queries using a primary key on a table
with a few hundred rows will not slow down your page.
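
The difference between the two kinds of queries can be made visible by asking the database for its query plan. The following is a minimal, framework-neutral sketch using Python's built-in `sqlite3` module; the `item` table and its columns are hypothetical and not part of WoltLab Suite, and other database engines report their plans in a different format.

```python
import sqlite3

# Hypothetical schema for illustration only; not a WoltLab table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (itemID INTEGER PRIMARY KEY, name TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO item (itemID, name, note) VALUES (?, ?, ?)",
    [(i, f"item-{i}", f"note-{i}") for i in range(1, 501)],
)


def query_plan(sql, params=()):
    """Return the human-readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the textual description of the step.
    return [row[-1] for row in rows]


# Lookup by primary key: the engine jumps straight to the matching row.
pk_plan = query_plan("SELECT name FROM item WHERE itemID = ?", (42,))

# Filter on an unindexed column: the engine has to visit every row.
scan_plan = query_plan("SELECT name FROM item WHERE note = ?", ("note-42",))

print(pk_plan)
print(scan_plan)
```

The first plan is reported as a `SEARCH` step and the second as a `SCAN` step. What matters is how the rows are found, and the number of queries says very little about that.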

There are some queries that should go into caches by design, but most of the
cache builders weren't there initially; they were added because they were
required to reduce the load _significantly_. You need to understand that caches
always come at a cost, and even a runtime cache does! In particular, they
always consume memory that is not released for the duration of the request
lifecycle, and they can even leak memory by holding references to objects and
data structures that are no longer required.
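
The memory cost can be demonstrated with a small, framework-neutral sketch. It is written in Python and assumes a plain dictionary as the request-scoped cache; `UserProfile`, `runtime_cache` and `get_profile` are made-up names for illustration and do not mirror the actual runtime cache API.

```python
import gc
import weakref


class UserProfile:
    """Stand-in for an object loaded during a request (hypothetical)."""

    def __init__(self, user_id):
        self.user_id = user_id


# A minimal request-scoped cache: object ID -> object.
runtime_cache = {}


def get_profile(user_id):
    """Return the cached profile, loading it on first access."""
    if user_id not in runtime_cache:
        runtime_cache[user_id] = UserProfile(user_id)
    return runtime_cache[user_id]


profile = get_profile(1)
probe = weakref.ref(profile)  # lets us observe whether the object is still alive

# The caller is done with the object and drops its own reference ...
del profile
gc.collect()
alive_while_cached = probe() is not None  # ... but the cache still holds it.

# Only evicting the entry allows the memory to be reclaimed.
runtime_cache.clear()
gc.collect()
alive_after_eviction = probe() is not None

print(alive_while_cached, alive_after_eviction)
```

As long as the entry sits in the cache, the object cannot be freed, even though no other code needs it anymore.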

Caching should always be a solution for a problem.

### When to Use a Cache

It's difficult to provide a definite answer or checklist for when to use a
cache and why it is required at that point, because the answer is: it depends.
The permission cache for user groups is a good example of a valid cache, where
we can achieve a significant performance improvement compared to processing
this data on every request.

Its caches are built for each permutation of user group memberships that is
encountered for a page request. Building this data is an expensive process that
involves both inheritance and specific rules regarding when a value for a
permission overrules another value. The added benefit of this cache is that one
cache usually serves a large number of users with the same group memberships,
and by computing these permissions once, we can serve many different requests.
Also, the permissions are rather static values that change very rarely, and
thus we can expect a very long cache lifetime before it gets rebuilt.
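
The idea of one cache entry per set of group memberships can be sketched as follows. This is a simplified Python illustration, not the actual implementation: the group data, the permission names and the "highest value wins" merge rule are all assumptions, and the real rules for inheritance and overruling are more involved.

```python
# Hypothetical per-group permission values; real data would come from the database.
GROUP_PERMISSIONS = {
    1: {"canViewBoard": 1, "maxAttachments": 0},
    2: {"canViewBoard": 1, "maxAttachments": 5},
    3: {"canViewBoard": 1, "maxAttachments": 20},
}

permission_cache = {}
build_count = 0


def build_permissions(group_ids):
    """Expensive step: merge the values of all groups (assumed rule: highest wins)."""
    global build_count
    build_count += 1
    merged = {}
    for group_id in group_ids:
        for name, value in GROUP_PERMISSIONS[group_id].items():
            merged[name] = max(merged.get(name, value), value)
    return merged


def get_permissions(group_ids):
    """Return merged permissions, building them once per distinct set of groups."""
    key = tuple(sorted(set(group_ids)))  # membership order must not affect the key
    if key not in permission_cache:
        permission_cache[key] = build_permissions(key)
    return permission_cache[key]


# Three users, two distinct group combinations.
alice = get_permissions([1, 2])
bob = get_permissions([2, 1])
carol = get_permissions([1, 3])

print(build_count)
```

Three users are served here, but the expensive merge runs only twice, because two of them share the same group memberships.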

### When not to Use a Cache

I remember that, a few years ago, there was a plugin that displayed a user's
character from an online video game. The character sheet not only included a
list of basic statistics, but also displayed the items that this character was
wearing and/or holding at the time.

The data for these items was downloaded in bulk from the game vendor's servers
and stored in a persistent cache file that was periodically renewed. There is
nothing wrong with the idea of caching the data on your own server rather than
requesting it every time from the vendor's servers, not least because they
imposed a limit on the number of requests per hour.

Unfortunately, the character sheet had sub-par performance and the users were
upset by the significant loading times compared to literally every other page
on the same server. The author of the plugin was working hard to resolve this
issue and was evaluating all kinds of methods to improve the page performance,
including deep-diving into the realm of micro-optimizations to squeeze out every
last bit of performance.

The real problem was the cache file itself: it turned out that it was holding
the data for several thousand items with a total file size of about 13
megabytes. That doesn't look like much at first glance, after all this isn't
the '90s anymore, but unserializing a 13 megabyte array is really slow, and
looking up items in such a large array isn't exactly fast either.

The solution was rather simple: the data that was fetched from the vendor's API
was instead written into a separate database table. Next, the persistent cache
was removed, and the character sheet now requested the item data for that
specific character straight from the database. Previously, the character sheet
took several seconds to load; after the change, it was done in a fraction of
a second. Although quite extreme, this illustrates a situation where the cache
file was introduced in the design process without evaluating whether the cache,
at least the way it was implemented, was really necessary.
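
The approach described above can be sketched roughly like this. The example uses Python's built-in `sqlite3` module to stay self-contained; the `game_item` and `character_item` tables, their columns and the sample data are hypothetical, since the original plugin's schema is not known.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: one row per game item, plus the items a character carries.
conn.execute("CREATE TABLE game_item (itemID INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE character_item ("
    "characterID INTEGER, itemID INTEGER, PRIMARY KEY (characterID, itemID))"
)

# Bulk import, as it might run after each periodic download from the vendor.
conn.executemany(
    "INSERT INTO game_item (itemID, name) VALUES (?, ?)",
    [(i, f"Item {i}") for i in range(1, 5001)],
)
conn.executemany(
    "INSERT INTO character_item (characterID, itemID) VALUES (?, ?)",
    [(7, 12), (7, 340), (7, 4999), (8, 12)],
)


def items_for_character(character_id):
    """Fetch only the items of one character instead of loading every item."""
    rows = conn.execute(
        "SELECT i.itemID, i.name FROM character_item AS c "
        "JOIN game_item AS i ON i.itemID = c.itemID "
        "WHERE c.characterID = ? ORDER BY i.itemID",
        (character_id,),
    ).fetchall()
    return dict(rows)


sheet_items = items_for_character(7)
print(sheet_items)
```

Only the three rows that belong to the requested character are read and turned into objects, no matter how many thousands of items the table contains.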

Caching should always be a solution for a problem. Not the other way around.