.. _Introduction:

Introduction
------------

Should you use it?
..................

Django-cachalot is the perfect speedup tool for most Django projects.
It will speed up a website of 100 000 visits per month without any problem.
In fact, **the more visitors you have, the faster the website becomes**.
That’s because every possible SQL query on the project ends up being cached.

Django-cachalot is especially efficient in the Django administration website
since it’s unfortunately badly optimised (use foreign keys in ``list_editable``
if you need to be convinced).

However, it’s not suited for projects where there is **a high number
of modifications per minute** on each table, like a social network with
more than 50 messages per minute. Django-cachalot may still give a small
speedup in such cases, but it may also slow things down a bit
(in the worst case scenario, a 20% slowdown,
according to :ref:`the benchmark <Benchmark>`).
If you have a website like that, optimising your SQL database and queries
is the number one thing you have to do.

There is also an obvious case where you don’t need django-cachalot:
when the project is already fast enough (all pages load in less than 300 ms).
Like any other dependency, django-cachalot is a potential source of problems
(even though it’s currently bug free).
Don’t use dependencies you can avoid; a “future you” may thank you for that.

Features
........

- **Saves in cache the results of any SQL query** generated by the Django ORM
  that reads data. These saved results are then returned instead
  of executing the same SQL query, which is faster.
- The first execution of a query is about 10% slower; subsequent executions
  are much faster (7× faster on average).
- Automatically invalidates saved results,
  so that **you never get stale results**.
- **Invalidates per table, not per object**: if you change an object,
  all the queries done on other objects of the same model are also invalidated.
  It is unfortunately technically impossible to make a reliable
  per-object cache. Don’t be fooled by packages claiming to have
  that per-object feature, they are unreliable and dangerous for your data.
- **Handles everything in the ORM**. You can use the most advanced features
  from the ORM without a single issue, django-cachalot is extremely robust.
- Easy control thanks to :ref:`settings` and :ref:`a simple API <API>`.
  But that’s only required if you have a complex infrastructure. Most people
  will never use settings or the API.
- A few bonus features like
  :ref:`a signal triggered at each database change <Signal>`
  (including bulk changes) and
  :ref:`a template tag for a better template fragment caching <Template utils>`.
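
The caching and per-table invalidation listed above can be pictured with a toy
model. This is only a minimal sketch of the general idea, not django-cachalot’s
actual implementation: ``TableCache``, its methods and ``run_query`` are
hypothetical names, and plain dictionaries stand in for the cache backend and
the database.

.. code-block:: python

    class TableCache:
        """Toy query cache invalidated per table (hypothetical, illustration only)."""

        def __init__(self):
            self.results = {}   # SQL text -> cached rows
            self.by_table = {}  # table name -> SQL texts that read this table
            self.misses = 0     # number of queries actually executed

        def read(self, sql, tables, run_query):
            """Return cached rows for ``sql``; execute the query only on a miss."""
            if sql not in self.results:
                self.misses += 1
                self.results[sql] = run_query(sql)
                for table in tables:
                    self.by_table.setdefault(table, set()).add(sql)
            return self.results[sql]

        def invalidate(self, table):
            """Forget every cached query that read ``table``."""
            for sql in self.by_table.pop(table, set()):
                self.results.pop(sql, None)


    # Assumed stand-in for a database: one list of rows per table.
    rows = {"book": ["Dune"], "author": ["Herbert"]}

    def run_query(sql):
        # In these toy queries, the last word is the table name.
        return list(rows[sql.split()[-1]])

    cache = TableCache()
    first = cache.read("SELECT * FROM book", ["book"], run_query)
    second = cache.read("SELECT * FROM book", ["book"], run_query)  # cache hit
    cache.read("SELECT * FROM author", ["author"], run_query)

    rows["book"].append("Emma")
    cache.invalidate("book")  # a write to "book" drops every query on "book"
    third = cache.read("SELECT * FROM book", ["book"], run_query)

In this sketch, the write to ``book`` discards every cached query on ``book``,
whichever rows it touched, while the cached query on ``author`` survives.
That is the trade-off of invalidating per table rather than per object.
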

Comparison with similar tools
.............................

This comparison was done in December 2015. It compares django-cachalot
to the other popular automatic ORM caches at that time:
`django-cache-machine <https://github.com/django-cache-machine/django-cache-machine>`_
& `django-cacheops <https://github.com/Suor/django-cacheops>`_.

Features
~~~~~~~~

===================================================== ========= ============= ==========
Feature                                               cachalot  cache-machine cacheops
===================================================== ========= ============= ==========
Easy to install                                       ✔         ✘             quite
Cache agnostic                                        ✔         ✔             ✘
Type of invalidation                                  per table per object    per query
CPU performance                                       excellent excellent     excellent
Memory performance                                    excellent good          excellent
Reliable                                              ✔         ✘             ✘
Useful for > 50 modifications per minute              ✘         ✔             ✔
Handles transactions                                  ✔         ✘             ✘
Handles Django admin save                             ✔         ✘             ✘
Handles multi-table inheritance                       ✔         ✔             ✘
Handles ``QuerySet.count``                            ✔         ✘             ✔
Handles ``QuerySet.aggregate``/``annotate``           ✔         ✔             ✘
Handles ``QuerySet.update``                           ✔         ✘             ✘
Handles ``QuerySet.select_related``                   ✔         ✔             ✘
Handles ``QuerySet.extra``                            ✔         ✘             ✘
Handles ``QuerySet.values``/``values_list``           ✔         ✘             ✔
Handles ``QuerySet.dates``/``datetimes``              ✔         ✘             ✔
Handles subqueries                                    ✔         ✔             ✘
Handles querysets generating a SQL ``HAVING`` keyword ✔         ✔             ✘
Handles ``cursor.execute``                            ✔         ✘             ✘
Handles the Django command ``flush``                  ✔         ✘             ✘
===================================================== ========= ============= ==========

Explanations
''''''''''''

“Handles [a feature]” means that the package correctly invalidates SQL queries
using that feature. So if a package doesn’t handle a feature, you may get
stale query results when using this feature.
It does not mean that it caches a query with this feature, although
django-cachalot caches all queries except random queries
or those run through ``cursor.execute``.
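
As an illustration of what not handling a feature means, here is a minimal
sketch of a cache that invalidates on single-row saves but not on bulk updates.
Everything in it is hypothetical (the ``table`` and ``cache`` dictionaries and
the ``statuses``, ``save`` and ``bulk_update`` functions); it does not reproduce
the behaviour of any particular package.

.. code-block:: python

    table = {1: "draft", 2: "draft"}  # assumed stand-in for a database table
    cache = {}

    def statuses():
        """Cached read of every status in the table."""
        if "statuses" not in cache:
            cache["statuses"] = sorted(table.values())
        return cache["statuses"]

    def save(pk, status):
        """Single-row write: handled, so the cached read is invalidated."""
        table[pk] = status
        cache.pop("statuses", None)

    def bulk_update(status):
        """Bulk write: not handled in this sketch, so the cache is left untouched."""
        for pk in table:
            table[pk] = status

    before = statuses()
    bulk_update("published")
    stale = statuses()   # served from the cache, no longer matches the table
    save(1, "published")
    fresh = statuses()   # recomputed after the handled write

After the unhandled bulk write, ``stale`` still lists two drafts although the
table only contains published rows. Only the next handled write brings the
cached result back in line.
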

This comparison was done by running the test suite of cachalot against
cache-machine & cacheops. This test suite is relevant for other
packages (such as cache-machine & cacheops) since most of it is written in
a cachalot-independent way.

Similarly, the performance comparison was done using our benchmark,
combined with a memory measurement.

To me, cache-machine & cacheops are not reliable, for these reasons:

- Neither cache-machine nor cacheops handles transactions, which is critical.
  **Transactions are used a lot in Django internals**: at least
  in any Django admin save, many-to-many relations modification,
  bulk creation or update, migrations, session save.
  If an error occurs during one of these operations, good luck finding out
  whether stale data is returned. The best you can do in this case is to clear
  the cache manually.
- If you use a query that’s not handled, you may get stale data. It ends up
  ruining your database since it lets you save modifications to stale data,
  therefore overwriting the latest version that’s in the database.
  And you always end up using queries that are not handled since there is no
  list of unhandled queries in the documentation of each module.
- In the case of cache-machine, another issue is that it relies
  on “flush lists”, which can’t work reliably when implemented in a cache
  like this (see `cache-machine#107 <https://github.com/django-cache-machine/django-cache-machine/issues/107>`_).
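
To make the first point about transactions more concrete, one way a cache can
cope with them is to keep results computed inside a transaction out of the
shared cache until the transaction commits. The sketch below only illustrates
that general idea under simplifying assumptions (a single transaction level, a
dictionary as the shared cache). It is not taken from any of the three packages,
and ``BufferedCache`` is a hypothetical name.

.. code-block:: python

    class BufferedCache:
        """Holds entries written during a transaction (hypothetical sketch)."""

        def __init__(self, shared):
            self.shared = shared  # cache visible to every request
            self.pending = None   # entries written inside the open transaction

        def begin(self):
            self.pending = {}

        def set(self, key, value):
            target = self.shared if self.pending is None else self.pending
            target[key] = value

        def get(self, key):
            if self.pending is not None and key in self.pending:
                return self.pending[key]
            return self.shared.get(key)

        def commit(self):
            # Only now do the buffered results become visible to everyone.
            self.shared.update(self.pending)
            self.pending = None

        def rollback(self):
            # Discard results that were computed from rolled-back data.
            self.pending = None


    shared = {}
    cache = BufferedCache(shared)
    key = "SELECT COUNT(*) FROM book"

    cache.begin()
    cache.set(key, 1)                # computed from an uncommitted INSERT
    seen_inside = cache.get(key)
    cache.rollback()                 # the INSERT never happened
    after_rollback = cache.get(key)

    cache.begin()
    cache.set(key, 1)
    cache.commit()                   # the result may now be shared
    after_commit = shared.get(key)

Without the buffering step, the count computed before the rollback would stay
in the shared cache and be served as if the rolled-back row existed.
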

Number of lines of code
~~~~~~~~~~~~~~~~~~~~~~~

Django-cachalot tries to be as minimalist as possible, while handling most
use cases. Being minimalist is essential to create maintainable projects,
and having a large test suite is essential to achieve excellent quality.
The statistics below speak for themselves…

============ ======== ============= ========
Project part cachalot cache-machine cacheops
============ ======== ============= ========
Application  743      843           1662
Tests        3023     659           1491
============ ======== ============= ========