Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sqlite cache #187

Closed
thechief389 opened this issue May 9, 2018 · 10 comments
Closed

sqlite cache #187

thechief389 opened this issue May 9, 2018 · 10 comments
Labels

Comments

@thechief389
Copy link

Is it possible to cache responses into a sqlite database?

@AndCycle
Copy link

AndCycle commented May 14, 2018

write your own, and propose a pull req,

I do have a sqlite cache that adapt CacheControl but it require some refactor to proper propose as pull req.

@AndCycle
Copy link

AndCycle commented May 14, 2018

here is a minimal base that should work

class SQLiteCache(BaseCache):
    def __init__(self, db_path):

        self.conn = sqlite3.connect(db_path)

        db_init_script = f'''
            CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value BLOB);
        '''


        # sqlite use auto-index to enforce PRIMARY KEY/UNIQUE constrain, 
        # so it's not necessary to create additional index

        self.conn.executescript(db_init_script)

    def _execute(self, sql, parameters):
        with self.conn:
            self.conn.execute(sql, parameters)

    def _execute_read(self, sql, parameters):
        with self.conn:
            for val in self.conn.execute(sql, parameters):
                yield val

    def _execute_fetch_single_column(self, sql, parameters = {}):
        for val in self._execute_read(sql, parameters):
            yield val[0]

    def _execute_fetch_single_element(self, sql, parameters = {}):
        for val in self._execute_fetch_single_column(sql, parameters):
            return val

    def keys(self):
        for val in self._execute_fetch_single_column('SELECT key FROM cache'):
            yield val

    def get(self, key):
        return self._execute_fetch_single_element('SELECT value FROM cache WHERE key = :key', {'key': key})

    def set(self, key, val):
        self._execute('INSERT OR REPLACE INTO cache(key, value) VALUES (:key, :val)', {'key': key, 'val': val})

    def delete(self, key):
        self._execute('DELETE FROM cache WHERE key = :key', {'key': key})

@thechief389
Copy link
Author

Thank you for showing me this.

@ionrock
Copy link
Contributor

ionrock commented May 15, 2018

Also, I won't be adding any other cache implementations to CacheConrol proper. I'd much rather you create your own package and submit a PR to the docs linking to it.

@AndCycle
Copy link

AndCycle commented May 15, 2018

I original write SQLite cache because there is too much problem with FileCache,
primarily due to it's file locking implement have tons of issue if used on win32,

I think provide a simple SQLite implement will be very sufficient for most people,
as I use SQLite implement for around 9 millions of files cached which occupy 1.3TB on my disk,
very reliable as sqlite3 is a built-in package.

the down side is there is no easy implement to take care outdated files and vacuum for sqlite3,
but that problem also present in current FileCache implement, which also require some manual work,
so I don't think things can get any worse in this

@thechief389
Copy link
Author

There's requests-cache, which supports SQLite caching, but doesn't have many features like CacheControl

@thechief389
Copy link
Author

The cache doesn't appear in the database when I used your base.

@AndCycle
Copy link

AndCycle commented May 17, 2018

@thechief389
comment edited,
as it's a strip down version of my original complex code with multi-consumer and have mem cache layer I did't verify it well,
but you get the idea.

@jaap3
Copy link
Contributor

jaap3 commented Jun 4, 2018

It might be worth doing some research into adapting https://github.com/grantjenks/python-diskcache to cachecontrol. That package provides a generic cache implementation which is a mix of SQLite and file based caching. It would also be an interesting solution for issue #185

@jaap3
Copy link
Contributor

jaap3 commented Jun 19, 2018

I did some testing, it seems that python-diskcache can be used as a drop in replacement for FileCache:

import requests

from cachecontrol import CacheControl
from diskcache import FanoutCache

class MyFanoutCache(FanoutCache):
    # Workaround until either grantjenks/python-diskcache#77 or #195 is fixed
    def __bool__(self): 
        return True
    __nonzero__ = __bool__

cache = MyFanoutCache('./tmp', size_limit=2 ** 30, eviction_policy='least-recently-used')
session = CacheControl(requests.Session(), cache=cache)

Then you could periodically call cache.cull() to remove expired items and get the size back down.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants