Deadbeat developer documentation

Note

Please be aware that this version of the documentation applies to:

  • version: 0.1
  • distribution: development (unstable)
  • Git revision: 13f0cc1722184600751120dd817f76589ca44380

Warning

Although production code is based on this library, it should still be considered a work in progress.

This documentation is for developers of the Deadbeat library itself and for developers of cogs usable with Deadbeat. If you are a user of the library (that is, you are writing an application based on it), this document will be of limited use to you unless you need a special cog for your application, though it may be interesting reading nonetheless.

Guidelines

Let PEP 20 be the guide for your mind.

Let PEP 8 be the guide for your hand.

Let you and PEP 257 and PEP 287 be the guide for others.

Use Sphinx format for argument docstrings.

Pull and merge often.

Use the devel branch for small updates and bugfixes.

For bigger features, fork devel into a feature branch, merge it back once the feature is accepted, and then delete the branch.

master must not break unit tests. devel should not.

New features should be accompanied by unit tests.

Do not introduce new dependencies. Code that needs a dependency should go into its own submodule, so that the dependency is required only at runtime and can be enforced by the application developer if necessary, but not by the library.

Reuse existing (even soft) dependencies. There is no need to use three competing IP address libraries. However, do not prevent application developers from using a different one in their applications, should they need to.

Events management

There is one principal object of class deadbeat.movement.Escapement, instantiated as a singleton in every application, which takes care of the event queue. Most of its methods can be called from cogs (supply cogs must do so) to bind themselves to particular events.

These methods usually bind a callable (typically some cog’s method) and its arguments to a specific system event.

Immediate events

Use deadbeat.movement.Escapement.enqueue() and deadbeat.movement.Escapement.dequeue() to manipulate the event queue directly. Enqueuing an event immediately is useful when a cog needs to schedule another of its own methods.

File descriptor events

Use deadbeat.movement.Escapement.fd_register() and deadbeat.movement.Escapement.fd_unregister() to bind a cog’s method to a select.epoll event on a specific file descriptor. These are usually used to watch sockets for incoming or outgoing data.
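
To illustrate what such a binding amounts to, here is a minimal sketch that uses select.epoll from the standard library directly (Linux only). It does not call Deadbeat at all; the handlers dictionary and the on_readable function are hypothetical and only mimic the idea of mapping a file descriptor event to a callable.

```python
import select
import socket

# A connected socket pair stands in for a real network connection.
reader, writer = socket.socketpair()
received = []


def on_readable(fd):
    """Hypothetical handler, playing the role of a cog's bound method."""
    received.append(reader.recv(1024))


handlers = {reader.fileno(): on_readable}   # file descriptor -> callable

poller = select.epoll()
poller.register(reader.fileno(), select.EPOLLIN)

writer.send(b"ping")
for fd, eventmask in poller.poll(1):        # wait up to one second
    if eventmask & select.EPOLLIN:
        handlers[fd](fd)

# Release everything that was registered or opened.
poller.unregister(reader.fileno())
poller.close()
reader.close()
writer.close()
```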

Inotify events

deadbeat.movement.Escapement.inotify_add() and deadbeat.movement.Escapement.inotify_rm() can be used to watch files or directory entries through the Linux inotify(7) mechanism. Inotify is useful for watching for changes in file contents, for files appearing in or disappearing from a directory, etc.

Note that the inotify(7) mechanism may not be available on some systems due to system architecture or compatibility issues. So do not make any assumptions and do not rely on inotify unconditionally. Use the inotify methods only after checking that deadbeat.movement.Escapement.inotify is True. If it is not, gracefully fall back to another method (for example, regular polling).
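
A minimal sketch of this check follows. FakeEscapement is a hypothetical stand-in that only records calls, and the argument lists of inotify_add() and enqueue() are assumptions made for illustration; consult the real deadbeat.movement.Escapement for the actual signatures.

```python
class FakeEscapement:
    """Hypothetical stand-in for deadbeat.movement.Escapement."""

    def __init__(self, inotify):
        self.inotify = inotify          # True when inotify(7) is usable
        self.calls = []                 # records what the cog asked for

    def inotify_add(self, path, callback):      # assumed signature
        self.calls.append(("inotify_add", path))

    def enqueue(self, callback):                # assumed signature
        self.calls.append(("enqueue", callback.__name__))


class FileWatcher:
    """Watches one file, preferring inotify and falling back to polling."""

    def __init__(self, esc, path):
        self.esc = esc
        self.path = path
        if self.esc.inotify:
            self.esc.inotify_add(self.path, self.handle_change)
        else:
            # No inotify on this system: schedule a regular poll instead.
            self.esc.enqueue(self.poll)

    def handle_change(self):
        pass    # react to the inotify notification

    def poll(self):
        pass    # stat() the file and compare with the previous state


with_inotify = FileWatcher(FakeEscapement(inotify=True), "/tmp/input.log")
without_inotify = FileWatcher(FakeEscapement(inotify=False), "/tmp/input.log")
```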

Signal events

Use deadbeat.movement.Escapement.signal_register() and deadbeat.movement.Escapement.signal_unregister() to bind a cog’s method to a *nix signal(7).

Broadcast events

Use deadbeat.movement.Train.notify() to send a broadcast to all cogs that have a method of the given name. Conversely, if your cog has a method with one of these names, it may get called. The names currently broadcast are:

event_done(event_id)

Processing of the event_id event has just finished. The cog may release any and all related resources it still holds.

event_rename(event_id)

The event_id event has just been given a new name. The old ID may be forgotten; however, the event data or its (possibly merged or split) parts continue on.

finish()

Called when the application is about to shut down all the cogs. See also Shutting down.

done()

Called when the application is about to quit. See also Shutting down.
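
The name-based dispatch can be pictured with the sketch below. The notify function here is a hypothetical illustration of the idea, not the implementation of deadbeat.movement.Train.notify(), whose signature may differ; both cog classes are made up for the example.

```python
def notify(cogs, name, *args):
    """Call method `name` on every cog that has one; skip the others."""
    for cog in cogs:
        handler = getattr(cog, name, None)
        if callable(handler):
            handler(*args)


class CachingCog:
    """Keeps per-event resources until the event is reported as done."""

    def __init__(self):
        self.held = {"ev-1": "open file", "ev-2": "socket"}

    def event_done(self, event_id):
        self.held.pop(event_id, None)   # release resources of this event


class StatelessCog:
    pass                                # no event_done, so never called


caching = CachingCog()
notify([caching, StatelessCog()], "event_done", "ev-1")
```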

Cogs

A cog (in Deadbeat terminology) is a callable that accepts the data travelling through the geartrain as its sole argument and returns an iterable containing

  • one instance of modified data, or
  • several instances of new or possibly delayed older data, or
  • None, if it needs to swallow or delay the data.

Usual pattern is thus:

def Pinion(data):
    do_something_with(data)
    return (data,)
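
To make the return contract concrete, here is a minimal sketch of how a chain of such callables could be driven. The run_train helper and the three sample cogs are hypothetical and not part of Deadbeat; the real train also deals with scheduling and events. The helper tolerates both a bare None return and None items inside the returned iterable, since either reading of the contract is plausible.

```python
def run_train(cogs, data):
    """Push one piece of data through the cogs, honouring the return contract."""
    batch = [data]
    for cog in cogs:
        next_batch = []
        for item in batch:
            result = cog(item)
            if result is None:          # the cog swallowed or delayed the data
                continue
            next_batch.extend(piece for piece in result if piece is not None)
        batch = next_batch
    return batch


def strip(line):
    return (line.strip(),)              # one modified instance


def split_words(line):
    return line.split()                 # several new instances


def drop_comments(word):
    return None if word.startswith("#") else (word,)


output = run_train([strip, split_words, drop_comments], "  alpha #beta gamma \n")
```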

If the callable needs to initialize and/or keep state (as most nontrivial ones do), use a callable object derived from deadbeat.movement.Cog:

from deadbeat import movement

class Pinion(movement.Cog):

    def __init__(self, momentum):
        self.momentum = momentum

    def __call__(self, data):
        do_something_with(data)
        return (data,)

Winding up

Save all __init__ arguments to instance variables of the same name; the deadbeat.movement.Cog.__str__() method will then produce a nice rendering, which you’ll appreciate while debugging.

Cogs should accept a suitable set of getters and setters as __init__ arguments, so that the caller can control how information is extracted from the data and incorporated back in. See Getters/setters.

Cogs must ask for a deadbeat.movement.Train instance as the first argument of __init__ if they need any of the Events management facilities.

Supply cogs must schedule their own first run as needed, either by means of time events or by binding themselves to some system event (for example, on a socket they opened).
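
The reason behind this convention can be sketched as follows. ReprCog is a hypothetical stand-in, not the real deadbeat.movement.Cog, whose __str__() may work differently; it only illustrates how a string rendering can rebuild the constructor call when every __init__ argument is stored under its own name.

```python
import inspect


class ReprCog:
    """Hypothetical base class; not the real deadbeat.movement.Cog."""

    def __str__(self):
        # Look up each __init__ parameter as an attribute of the same name.
        params = inspect.signature(type(self).__init__).parameters
        args = ", ".join(
            "%s=%r" % (name, getattr(self, name, "<not saved>"))
            for name in params
            if name != "self"
        )
        return "%s(%s)" % (type(self).__name__, args)


class Pinion(ReprCog):

    def __init__(self, momentum, ratio=2):
        self.momentum = momentum        # same name as the argument
        self.ratio = ratio

    def __call__(self, data):
        return (data,)


rendered = str(Pinion(momentum=5))
```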

Shutting down

When the application is going to shut down, all cogs are notified by the finish event.

The cog is required to flush all caches, send all data and events it still holds down the train, release all scheduled or watched events, and strictly refrain from scheduling or binding to any new events, because the application is shutting down. You may use deadbeat.movement.DummyEscapement to replace the real one to prevent any new event bindings (without complicating the code where these are acquired).

Cog’s finish is called in the order of the train, so as long as all cogs obey the finishing rules, cogs will not receive any further events (except done).

After all cogs have finished, each cog receives the done event. The cog is required to release all remaining resources. Event processing ends and the application quits.
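
The two phases might look like this in a cog that buffers data. This is a sketch under stated assumptions: DummyEsc is a hypothetical no-op stand-in for deadbeat.movement.DummyEscapement, and the emit callable is an invented way of sending data down the train, since the real mechanism is not described here.

```python
import os


class DummyEsc:
    """Hypothetical no-op replacement, standing in for DummyEscapement."""

    def enqueue(self, *args, **kwargs):
        pass                            # silently ignore new bindings


class BatchingCog:
    """Collects items into batches of `size` before passing them on."""

    def __init__(self, esc, emit, size=3):
        self.esc = esc
        self.emit = emit                # hypothetical: sends data down the train
        self.size = size
        self.buffer = []
        self.scratch = open(os.devnull, "w")    # a resource held until done

    def __call__(self, data):
        self.buffer.append(data)
        if len(self.buffer) < self.size:
            return None                 # delay the data for now
        batch, self.buffer = self.buffer, []
        return (batch,)

    def finish(self):
        if self.buffer:                 # flush what is still held
            self.emit(self.buffer)
            self.buffer = []
        self.esc = DummyEsc()           # no new event bindings from now on

    def done(self):
        self.scratch.close()            # release the remaining resources


sent = []
cog = BatchingCog(esc=object(), emit=sent.append)
cog("a")
cog("b")
cog.finish()
cog.done()
```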

Configuration

Cogs should provide a recommended configuration insert, as is the case with deadbeat.daemon.daemon_base_config, deadbeat.ip.anonymise_base_config, and others. See also Conf.

Ideally, the insert should be usable by means of the Python star (keyword argument unpacking) convention, like:

log.configure(**cfg.log)
daemon.detach(**cfg.daemon)
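
For the star convention to work, the keys of the configuration subsection have to match the keyword arguments of the callable. A minimal sketch, assuming a plain dictionary insert: log_base_config, its keys, and the configure function are all hypothetical, and the real inserts may have a different shape.

```python
from types import SimpleNamespace

# Hypothetical recommended insert shipped alongside a cog or helper module.
log_base_config = {"level": "INFO", "syslog": False}


def configure(level="WARNING", syslog=True):
    """Hypothetical stand-in for a function such as log.configure()."""
    return "level=%s syslog=%s" % (level, syslog)


# Application configuration: the insert plus one user override.
cfg = SimpleNamespace(log={**log_base_config, "level": "DEBUG"})

# Keys of the subsection become keyword arguments of the callable.
summary = configure(**cfg.log)
```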

However, that may not always make sense, as not every cog warrants its own configuration subsection, or it may make more sense to gather the configuration from more loosely fitting pieces, like:

watcher = fs.FileWatcherSupply(train.esc, filenames=cfg.input_files, tail=True)

Logging

Use the logging framework. The deadbeat.log helper module simplifies and unifies logging (and its configuration) for the user of the library (and, hopefully, for the user of the resulting application), so you can safely assume that logging is alive and kicking.

Wherever possible, prefix all log messages with the event ID (which requires you to ask for id_getter in __init__):

event_id = self.id_getter(data)

try:
    os.unlink(long_name)
except Exception:
    logging.exception("%s: attempt to remove nonexistent file %s", event_id, long_name)
else:
    logging.debug("%s: removed %s", event_id, long_name)

Getters/setters

If possible, use the concept of an itemgetter or itemsetter, so that the cog doesn’t need to know the underlying data structure. A cog should do only its own work (calculation, communication, etc.); accessing and modifying the data should be outside its realm and done by means of the provided callables.

For example, an enrichment cog does not work on the whole data; it only reads the IP address piece and sets the hostname piece. Which piece holds the IP address is defined by an itemgetter (which can be operator.itemgetter("ip") if the data is a dictionary, or operator.itemgetter(2) if the data is a list, or something completely different).

The complementary concept is the itemsetter, a small function that takes the data and the piece, puts the piece into the right place in the data, and returns the data. The return value is important: the data can be an immutable structure, so the itemsetter may return a completely new object instead of a modified one.

Supply cogs will also need to accept an id_setter (again, exposed as an argument to the user of the library), which is a function that generates and sets a unique ID for the event.

Cogs that want to generate useful log messages must also ask for the complementary id_getter, so that messages can be accompanied by the data ID.
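
A sketch of the getter/setter pair on both a mutable and an immutable structure. The itemsetter function below is a hypothetical illustration written for this example; movement.itemsetter may be implemented differently.

```python
import operator


def itemsetter(key):
    """Hypothetical itemsetter; returns a function (data, value) -> data."""
    def setter(data, value):
        try:
            data[key] = value           # mutable data: modify in place
            return data
        except TypeError:
            # Immutable sequence such as a tuple: build a new object.
            return data[:key] + (value,) + data[key + 1:]
    return setter


# Dictionary data: the same object comes back, modified.
ip_getter = operator.itemgetter("ip")
host_setter = itemsetter("hostname")
event = {"ip": "192.0.2.1"}
event = host_setter(event, "host-for-" + ip_getter(event))

# Tuple data: a new tuple comes back, the original is untouched.
row = ("2024-01-01", "192.0.2.1", None)
new_row = itemsetter(2)(row, "example.org")
```

Because the caller always rebinds the data to the setter's return value, the cog works the same way with either kind of structure.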
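
A minimal sketch of such a pair, assuming dictionary data and an id_setter that takes only the data and returns it. The uuid_id_setter function, the LineSupply class, and the "id" key are hypothetical; the signatures expected by Deadbeat may differ.

```python
import operator
import uuid


def uuid_id_setter(data):
    """Hypothetical id_setter: generate a unique ID, store it, return the data."""
    data["id"] = str(uuid.uuid4())
    return data


class LineSupply:
    """Hypothetical supply cog turning text lines into events."""

    def __init__(self, id_setter=None):
        self.id_setter = id_setter or uuid_id_setter    # reasonable default

    def __call__(self, line):
        return (self.id_setter({"line": line}),)


# The complementary getter, as a downstream cog would receive it.
id_getter = operator.itemgetter("id")

supply = LineSupply()
(first,) = supply("hello")
(second,) = supply("world")
```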

Usual pattern here is:

import operator
from deadbeat import movement

class ModifierPinion(movement.Cog):

    def __init__(self, piece_getter=None, result_setter=None):
        self.piece_getter = piece_getter or operator.itemgetter("piece")    # use reasonable default
        self.result_setter = result_setter or movement.itemsetter("result")

    def __call__(self, data):
        piece = self.piece_getter(data)
        result = compute_something(piece)
        data = self.result_setter(data, result)
        return (data,)

See also ref:getset.