Storage

Storage is a place where all the data is stored (not a surprise).

There are currently three built-in Storage classes. Since the places where one could store REST API data are practically endless, there is also a fairly straightforward way of creating additional Storages.

DictStorage

All the data is stored in a Python dictionary. Useful for prototyping, testing and deploying one-time applications.

from blargh import engine

data = {}
storage = engine.DictStorage(data)

Initial data doesn’t have to be empty:

data = {
    'cookie': {1: {'id': 1, 'type': 'muffin'},
               2: {'id': 2, 'type': 'shortbread', 'jar': 1}},
    'jar': {1: {'id': 1, 'cookies': [2]}}
}
storage = engine.DictStorage(data)

PickledDictStorage

The same as DictStorage, but the dictionary is read from a pickle file and saved there after every change. This should not be considered in any way a “production” quality storage - it’s just an extension of DictStorage that allows restarts without losing data.

file_with_pickled_dictionary = 'data.pickle'
storage = engine.PickledDictStorage(file_with_pickled_dictionary)
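
The read-on-start, save-after-every-change behaviour described above can be sketched with the standard pickle module. This only illustrates the idea and is not the actual PickledDictStorage code; the helper names load_data and save_data are made up for this example.

```python
import os
import pickle
import tempfile


def load_data(path):
    """Return the pickled dictionary, or an empty one if the file is missing."""
    if not os.path.exists(path):
        return {}
    with open(path, 'rb') as f:
        return pickle.load(f)


def save_data(path, data):
    """Overwrite the pickle file with the current dictionary."""
    with open(path, 'wb') as f:
        pickle.dump(data, f)


# Demo in a temporary directory so no real file is touched.
path = os.path.join(tempfile.mkdtemp(), 'data.pickle')

data = load_data(path)          # first start: nothing has been saved yet
data.setdefault('cookie', {})[1] = {'id': 1, 'type': 'muffin'}
save_data(path, data)           # would be done after every change

restored = load_data(path)      # simulates reading the file after a restart
```

Rewriting the whole file on every change is simple but gets slower as the dictionary grows, which is one reason this approach suits small, non-production data only.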

PGStorage

The only serious storage. Data is stored in a single schema in a PostgreSQL Database.

import psycopg2

connstr = ''
schema_name = 'cookies'

conn = psycopg2.connect(connstr)
storage = engine.PGStorage(conn, schema_name)

This works fine, but if:

  • your app uses more than one process
  • storage is initialized before fork

all processes will share the same connection, which is something you’d better avoid.

The solution is to use a storage-returning function instead of a storage object:

#   this function will be called once for each request
def storage_func():
    conn = psycopg2.connect(connstr)
    return engine.PGStorage(conn, schema_name)

#   nothing else changes, engine.setup is called with the function as second arg
engine.setup(dm, storage_func)

This way a separate storage (and connection) is created for every request, so processes never share a connection. By the way, this is still not a perfect solution: creating a connection for every call is slow, and there should be one connection per process, but this is not a blarghish problem.

PGStorage can be modified/extended in many ways; a few examples are in the cookbook.

Creating your own storage

Each storage should extend the abstract class blargh.engine.storage.BaseStorage, implementing all of its required methods.

class blargh.engine.storage.BaseStorage

Abstract base class for all Storage classes.

When designing the class, especially __init__, keep in mind that engine.setup() accepts either an initialized Storage or a function returning a new Storage. If a function is provided, a new Storage instance will be created for every request, so it’s recommended to either:

  • provide object, not function, if possible,
  • or make Storage initialization as simple as possible (e.g. avoid reading large files)

save(instance)
Parameters: instance – blargh.engine.Instance with already set .id().
Returns: None

Save single object to the database. If such object already exists, it should be replaced. This method is an inverse of .load().

load(name, id_)
Parameters:
  • name – resource name
  • id_ – resource id
Returns:

dict

Return dictionary with all saved data for object identified by (name, id_). This method is an inverse of .save().

delete(name, id_)
Parameters:
  • name – resource name
  • id_ – resource id
Returns:

None

Remove the object identified by (name, id_) from storage. Should raise exceptions.e404 if the object does not exist.
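
To make the save/load/delete contract more concrete, here is a rough, self-contained sketch that keeps everything in a nested dictionary. It deliberately does not import blargh, so several things are assumptions: FakeInstance is a hypothetical stand-in for blargh.engine.Instance (its .name and .data attributes are invented, only .id() comes from the description above), NotFound stands in for exceptions.e404, and raising it from load is a guess. A real storage would extend BaseStorage instead.

```python
class NotFound(Exception):
    """Stand-in for blargh's exceptions.e404."""


class FakeInstance:
    """Hypothetical stand-in for blargh.engine.Instance."""

    def __init__(self, name, id_, data):
        self.name = name    # assumed attribute: resource name
        self.data = data    # assumed attribute: field values as a dict
        self._id = id_

    def id(self):
        return self._id


class SketchStorage:
    """Keeps all objects in a nested dict: {name: {id: data}}."""

    def __init__(self):
        self._data = {}

    def save(self, instance):
        # An object already saved under the same (name, id) is replaced.
        resource = self._data.setdefault(instance.name, {})
        resource[instance.id()] = dict(instance.data)

    def load(self, name, id_):
        try:
            return dict(self._data[name][id_])
        except KeyError:
            raise NotFound((name, id_)) from None

    def delete(self, name, id_):
        try:
            del self._data[name][id_]
        except KeyError:
            raise NotFound((name, id_)) from None


storage = SketchStorage()
storage.save(FakeInstance('cookie', 1, {'id': 1, 'type': 'muffin'}))
loaded = storage.load('cookie', 1)   # the same data that was saved
storage.delete('cookie', 1)          # a second delete would raise NotFound
```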

selected_ids(name, data, sort, limit)
Parameters:
  • name – resource name
  • data – dict, with resource field names as keys
  • sort – list of field names the instances should be sorted by, each with an optional ‘-’ prefix indicating descending order; a storage does not have to implement this
  • limit – non-negative integer indicating the maximum number of instances that should be returned; a storage does not have to implement this (and should not if it does not implement sort)
Returns:

sorted list of ids

Return the list of ids of all objects for which data is a subset of the value returned by .load().

In other words:

for id_ in Storage.selected_ids('foo', {'bar': 'baz'}):
    assert Storage.load('foo', id_)['bar'] == 'baz'
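
The subset rule, together with the optional sort and limit arguments, could look roughly like the standalone function below. It is a sketch under assumptions: it works on a plain {id: dict} mapping for a single resource instead of a real Storage, ids are assumed to be sortable, and every sorted field is assumed to be present in every matching object.

```python
def selected_ids(stored, data, sort=None, limit=None):
    """Return ids of objects in `stored` ({id: dict}) that contain `data`."""
    ids = [id_ for id_, row in stored.items()
           if all(key in row and row[key] == value
                  for key, value in data.items())]
    ids.sort()
    # Apply sort fields from last to first: Python's sort is stable, so the
    # first field in the list ends up as the most significant one.
    for field in reversed(sort or []):
        descending = field.startswith('-')
        key = field[1:] if descending else field
        ids.sort(key=lambda id_: stored[id_][key], reverse=descending)
    return ids if limit is None else ids[:limit]


cookies = {
    1: {'id': 1, 'type': 'muffin', 'jar': 1},
    2: {'id': 2, 'type': 'shortbread', 'jar': 1},
    3: {'id': 3, 'type': 'muffin', 'jar': 2},
}

muffins = selected_ids(cookies, {'type': 'muffin'})
jar_one_by_type_desc = selected_ids(cookies, {'jar': 1}, sort=['-type'])
first_two = selected_ids(cookies, {}, sort=['type', '-id'], limit=2)
```

An empty data dictionary matches every object, which is why the last call can be used to list a resource.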

next_id(name)
Parameters: name – resource name
Returns: next free id

There is no guarantee this id will be used, but no value should ever be returned more than once.

This method is required only for creation of POSTed resources (or created in any other way without supplying id), so Storage might just raise some exception if all resources should be created via PUT.
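
A minimal in-memory sketch of the "never return the same value twice" rule follows. The IdSource class and its attributes are hypothetical, integer ids are assumed, and the counter state lives only in memory, so it would not survive a restart; a database-backed storage would more likely rely on something like a sequence.

```python
class IdSource:
    """Hands out integer ids that are neither stored nor handed out before."""

    def __init__(self, stored):
        self._stored = stored   # {name: {id: data}}, e.g. a DictStorage-like dict
        self._last = {}         # highest id handed out so far, per resource name

    def next_id(self, name):
        used = self._stored.get(name, {})
        # Go past both the ids already stored and the ids already handed out,
        # even if a handed-out id was never actually saved.
        candidate = max([self._last.get(name, 0), *used]) + 1
        self._last[name] = candidate
        return candidate


stored = {'cookie': {1: {'id': 1}, 2: {'id': 2}}}
ids = IdSource(stored)

a = ids.next_id('cookie')   # ids 1 and 2 are taken
b = ids.next_id('cookie')   # the previous id was never saved, still not reused
c = ids.next_id('jar')      # counters are independent per resource name
```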

begin()

Start a transaction that will later be .commit()ed or .rollback()ed.

Together with .commit() and .rollback(), provides a transactional interface. Might be left empty if this Storage does not support transactions.

commit()

Save all changes since last .begin().

Together with .begin() and .rollback(), provides a transactional interface. Might be left empty if this Storage does not support transactions.

rollback()

Discard all changes since last .begin().

Together with .begin() and .commit(), provides a transactional interface. Might be left empty if this Storage does not support transactions.
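
For a dictionary-backed storage, one simple way to provide these three methods is to take a deep copy in begin() and restore it in rollback(). The class below is an illustrative sketch, not how any built-in Storage necessarily works, and the name TransactionalDict is made up.

```python
import copy


class TransactionalDict:
    """Snapshot-based begin/commit/rollback over an in-memory dict."""

    def __init__(self, data=None):
        self._data = data if data is not None else {}
        self._snapshot = None

    def begin(self):
        # Remember the state to return to if the transaction is rolled back.
        self._snapshot = copy.deepcopy(self._data)

    def commit(self):
        # Changes were made in place, so only the snapshot is dropped.
        self._snapshot = None

    def rollback(self):
        if self._snapshot is not None:
            self._data = self._snapshot
            self._snapshot = None

    def data(self):
        return self._data


store = TransactionalDict({'jar': {1: {'id': 1}}})

store.begin()
store.data()['jar'][2] = {'id': 2}
store.rollback()
after_rollback = copy.deepcopy(store.data())   # the added jar is gone

store.begin()
store.data()['jar'][2] = {'id': 2}
store.commit()
after_commit = store.data()                    # the added jar is kept
```

Copying the whole dictionary on every begin() is wasteful for large data, which is acceptable for a prototype-grade storage but not for anything bigger.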

data()
Returns: all storage data in any convenient format.

Used only for debugging & testing.