User’s Guide

Automatic engine selection

Automatic engine selection is achieved by auto_engine(), which is called by compfile.open(). Users rarely need to call auto_engine() directly.

auto_engine() calls an ordered list of engine determination functions (EDFs) to decide the appropriate engine type. The signature of an EDF must be edf_func(path: path-like), where path is the path to the archive. An EDF should return a callable object if it can determine the engine, or None otherwise. The actual engine (i.e. the callable object returned by the EDF) can open a specific type of compressed file. Typical engines are bz2.open(), gzip.open(), etc.
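The dispatch described above can be sketched as a simple loop. This is only an illustration, not the library's implementation: EDF_LIST, auto_engine_sketch, edf_gzip and edf_bz2 are hypothetical names, and returning None when no EDF matches is an assumption.

```python
import bz2
import gzip
import os


def edf_gzip(path):
    """Hypothetical EDF: choose gzip.open for paths ending in .gz."""
    return gzip.open if os.fspath(path).endswith('.gz') else None


def edf_bz2(path):
    """Hypothetical EDF: choose bz2.open for paths ending in .bz2."""
    return bz2.open if os.fspath(path).endswith('.bz2') else None


# Illustrative stand-in for the library's internal ordered EDF list.
EDF_LIST = [edf_gzip, edf_bz2]


def auto_engine_sketch(path):
    """Return the engine from the first EDF that recognizes path."""
    for edf in EDF_LIST:
        engine = edf(path)
        if engine is not None:
            return engine
    # Assumption: no EDF matched, so there is no engine to return.
    return None
```

Because the list is walked in order, an EDF placed earlier wins whenever two EDFs would both recognize the same path.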

The EDF list already contains several EDFs. Users can extend the list by registering new EDFs:

compfile.register_auto_engine(func)

A priority value can also be specified:

compfile.register_auto_engine(func, priority)

The priority value defines the ordering of the registered EDFs: the smaller the value, the higher the priority. EDFs with higher priority are called before EDFs with lower priority. The default priority value is 50.

A third, boolean argument, prepend, can also be passed to register_auto_engine(). When prepend is true, the EDF is placed before (i.e. given higher priority than) other registered EDFs with the same priority value. Otherwise, it is placed after them.

register_auto_engine() can also be used as a decorator:

import compfile


@compfile.register_auto_engine
def func(path):
    # function definition
    ...


@compfile.register_auto_engine(priority=50, prepend=False)
def func2(path):
    # function definition
    ...

Current implementation

Currently, the following engines are registered:

Extend the library

The architecture of the library is flexible enough to accommodate more compressed file types. Adding a new compressed file type simply involves registering an EDF:

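import os

from compfile import register_auto_engine


def open_abc(fpath, mode):
    # open the *.abc compressed file and return the uncompressed file object;
    # the decoding logic is format-specific and left as a placeholder here
    raise NotImplementedError


@register_auto_engine
def edf_abc(fpath):
    # fpath is path-like, so convert it before checking the extension
    if os.fspath(fpath).endswith('.abc'):
        return open_abc
    return None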
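An EDF is not limited to checking the file name. As a hedged illustration, the sketch below recognizes gzip data by its leading magic bytes instead of its extension. edf_gzip_magic and GZIP_MAGIC are hypothetical names, and the EDF is left unregistered so that the example runs without the library installed.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b'\x1f\x8b'  # the first two bytes of a gzip stream


def edf_gzip_magic(fpath):
    """Hypothetical EDF: recognize gzip files by content, not by name."""
    try:
        with open(fpath, 'rb') as f:
            header = f.read(len(GZIP_MAGIC))
    except OSError:
        # Missing or unreadable file: leave the decision to other EDFs.
        return None
    return gzip.open if header == GZIP_MAGIC else None


# Demo: a gzip file with a misleading extension is still recognized.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, 'data.bin')
    with gzip.open(sample, 'wb') as f:
        f.write(b'hello')
    engine = edf_gzip_magic(sample)
    with engine(sample, 'rb') as f:
        payload = f.read()
```

Such a function could be registered in the same way as edf_abc above. Since it opens the file, it costs more than a name check, so a larger priority value (tried later) may be a sensible choice.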