Python API

Remake classes

class remake.Remake(name: Optional[str] = None, config: Optional[dict] = None, special_paths: Optional[remake.special_paths.SpecialPaths] = None)

Core class. A remakefile is defined by creating an instance of Remake.

Acts as an entry point to running all tasks via python, and retrieving information about the state of any task. Contains a list of all tasks added by any TaskRule.

A remakefile must contain:

>>> demo = Remake()

This must be near the top of the file - after the imports but before any TaskRule is defined.

all_ancestors(tasks)

Find all ancestors of tasks

Parameters

tasks – tasks to start from

Returns

all ancestors

all_descendants(tasks)

Find all descendants of tasks

Parameters

tasks – tasks to start from

Returns

all descendants

ancestors(task)

All ancestors of a given task.

Parameters

task – task to start from

Returns

all ancestors
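
To illustrate what the ancestor queries compute, here is a minimal sketch on a plain dict. It is not remake's implementation: the `parents` graph, the task names, and the helper `ancestors_of` are all hypothetical.

```python
from collections import deque

# Hypothetical task graph: each task maps to the tasks it depends on directly.
parents = {
    'plot': ['analyse'],
    'analyse': ['clean_a', 'clean_b'],
    'clean_a': ['download'],
    'clean_b': ['download'],
    'download': [],
}


def ancestors_of(task, parents):
    """Return every task reachable by following parent links from task."""
    seen = set()
    queue = deque(parents[task])
    while queue:
        current = queue.popleft()
        if current in seen:
            # Already visited via another path through the graph.
            continue
        seen.add(current)
        queue.extend(parents[current])
    return seen


print(sorted(ancestors_of('analyse', parents)))
```

Descendants can be found the same way by traversing a child mapping instead of a parent mapping.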

property completed_tasks

configure(print_reasons: bool, executor: str, display: str)

Allow Remake object to be configured after creation.

Parameters
  • print_reasons – print reason for running individual task

  • executor – name of which remake.executor to use

  • display – how to display task status after each task is run

current_remake = {}

descendants(task)

All descendants of a given task.

Parameters

task – task to start from

Returns

all descendants

display_task_dag()

Display all tasks as a Directed Acyclic Graph (DAG)

file_info(filenames)

File info for all given files.

Parameters

filenames – filenames to get info for

Returns

dict containing all info

finalize()

Finalize this Remake object

property finalized

Is this finalized (i.e. ready to be run)?

find_task(task_path_hash_key: Union[remake.task.Task, str])

Find a task from its path_hash_key.

Parameters

task_path_hash_key – key of task

Returns

found task

find_tasks(task_path_hash_keys)

Find all tasks given by their path hash keys

Parameters

task_path_hash_keys – list of path hash keys

Returns

all found tasks

list_files(filetype=None, exists=False, produced_by_rule=None, used_by_rule=None, produced_by_task=None, used_by_task=None)

List all files subject to criteria.

Parameters
  • filetype – one of input_only, output_only, input, output, inout

  • exists – whether file exists

  • produced_by_rule – whether file is produced by rule

  • used_by_rule – whether file is used by rule

  • produced_by_task – whether file is produced by task (path hash key)

  • used_by_task – whether file is used by task (path hash key)

Returns

all matching files

list_rules()

List all rules

list_tasks(tfilter=None, rule=None, requires_rerun=False, uses_file=None, produces_file=None, ancestor_of=None, descendant_of=None)

List all tasks subject to requirements.

Parameters
  • tfilter – dict of key/value pairs to filter tasks on

  • rule – rule that tasks belong to

  • requires_rerun – whether tasks require rerun

  • uses_file – whether tasks use a given file

  • produces_file – whether tasks produce a given file

  • ancestor_of – whether tasks are an ancestor of this task (path hash key)

  • descendant_of – whether tasks are a descendant of this task (path hash key)

Returns

all matching tasks

property name

property pending_tasks

property remaining_tasks

remakes = {}

rerun_required()

Rerun status of this Remake object.

Returns

True if any tasks remain to be run

reset()

Reset the internal state

run_all(force=False)

Run all tasks.

Parameters

force – force rerun of each task

run_one()

Run the next pending task

run_random()

Run a random task (pot luck out of pending)

run_requested(requested, force=False, handle_dependencies=False)

Run requested tasks.

Parameters
  • requested – tasks to run

  • force – force rerun of each task

  • handle_dependencies – add all ancestor tasks to ensure given tasks can be run

short_status(mode='logger.info')

Log/print a short status line.

Parameters

mode – ‘logger.info’ or ‘print’

task_info(task_path_hash_keys)

Task info for all given tasks.

Parameters

task_path_hash_keys – task hash keys for tasks

Returns

dict containing all info

task_status(task: remake.task.Task) → str

Get the status of a task.

Parameters

task – task to get status for

Returns

status

class remake.SpecialPaths(**paths)

Special paths to use for all input/output filenames.

When tasks use inputs or create outputs, they are referenced by their filesystem path. This class makes it easy to define special paths that are used internally to locate the actual file. For example, CWD is a special path for the current working directory. This can be used to make paths consistent across different machines. If machine A has a file at path /A/data/path, and machine B has a file at path /B/data/path, a special path called DATA could be set up, pointing to the right path on each machine. E.g. on machine A:

>>> special_paths = SpecialPaths(DATA='/A/data')
>>> special_paths.DATA
PosixPath('/A/data')

This must be passed into Remake to take effect:

>>> from remake import Remake
>>> demo = Remake(special_paths=special_paths)

class remake.TaskRule(task_ctrl, func, inputs, outputs, *, force=False, depends_on=())

Core class. Defines a set of tasks in a remakefile.

Each class must have class-level properties: rule_inputs, rule_outputs, and each must have a method: rule_run. Each output file must be unique within a remakefile. In the rule_run method, the inputs and outputs are available through e.g. the self.inputs property.

>>> demo = Remake()
>>> class TaskSet(TaskRule):
...     rule_inputs = {'in': 'infile'}
...     rule_outputs = {'out': 'outfile'}
...     def rule_run(self):
...         self.outputs['out'].write_text(self.inputs['in'].read_text())
>>> len(TaskSet.tasks)
1

Each class can also optionally define a var_matrix, and dependency functions/classes. var_matrix should be a dict with string keys, and a list of items for each key. There will be as many tasks created as the itertools.product between the lists for each key. The values will be substituted in to the inputs/outputs.

>>> def fn():
...     print('in fn')
>>> class TaskSet2(TaskRule):
...     rule_inputs = {'in': 'infile'}
...     rule_outputs = {'out_{i}{j}': 'outfile_{i}{j}'}
...     var_matrix = {'i': [1, 2], 'j': [3, 4]}
...     dependencies = [fn]
...     def rule_run(self):
...         fn()
...         self.outputs[f'out_{self.i}{self.j}'].write_text(str(self.i) + self.inputs['in'].read_text())
>>> len(TaskSet2.tasks)
4
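
The task count follows from itertools.product, as described above. The snippet below is a standalone sketch of that expansion using only the standard library. It reuses the var_matrix and output template from the example, while the variable names `combos` and `outputs` are illustrative and not part of remake.

```python
from itertools import product

var_matrix = {'i': [1, 2], 'j': [3, 4]}
output_template = 'outfile_{i}{j}'

# One combination of values per task, in itertools.product order.
keys = list(var_matrix)
combos = [dict(zip(keys, values)) for values in product(*var_matrix.values())]

# Substitute each combination into the output template.
outputs = [output_template.format(**combo) for combo in combos]

print(len(combos))
print(outputs)
```

Two values for i and two for j give 2 × 2 = 4 combinations, matching the 4 tasks reported above.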

Note that all tasks created by these TaskRules are added to the Remake object:

>>> len(demo.tasks)
5

When the remakefile is run ($ remake run on the command line), all the tasks will be triggered according to their ordering. If any of the rule_run methods is changed, then those tasks will be rerun, and if their output is different, subsequent tasks will be rerun.
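
One common way to detect that a method has changed is to compare a hash of its source text against a stored hash. The sketch below shows that idea only. It is an assumption about how such a check could work, not a description of remake's internals, and `content_hash` and `needs_rerun` are hypothetical names.

```python
import hashlib


def content_hash(text):
    """Return the SHA-1 hex digest of a string."""
    return hashlib.sha1(text.encode('utf-8')).hexdigest()


def needs_rerun(stored_hash, current_source):
    """A task needs rerunning if its source no longer matches the stored hash."""
    return stored_hash != content_hash(current_source)


# Hypothetical source text of a rule_run method before and after an edit.
old_source = "def rule_run(self):\n    return 1\n"
new_source = "def rule_run(self):\n    return 2\n"

stored = content_hash(old_source)
print(needs_rerun(stored, old_source))
print(needs_rerun(stored, new_source))
```

The same comparison applied to output file contents would decide whether downstream tasks need to be rerun.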

class remake.TaskQuerySet(iterable=None, task_ctrl=None)

List of tasks with some extra capabilities.

Provides some simple methods for filtering, running and displaying the status of all its tasks.

exclude(**kwargs)

As filter, but exclude instead of include tasks.

Parameters

kwargs – key-value pairs to exclude

Returns

filtered tasks

filter(cast_to_str=False, **kwargs)

Filter tasks based on kwargs.

>>> class DummyTask:
...     pass
>>> tasks = []
>>> for i in range(10):
...     task = DummyTask()
...     setattr(task, 'i', i)
...     setattr(task, 'j', i % 3 == 0)
...     tasks.append(task)
>>> task_query_set = TaskQuerySet(tasks, None)
>>> len(task_query_set.filter(j=True))
4

Parameters
  • cast_to_str – Cast values to string first

  • kwargs – key-value pairs of task properties

Returns

filtered TaskQuerySet

first()

Get first task.

get(**kwargs)

Get one and only one task.

Parameters

kwargs – key-value pairs to get

Returns

task meeting criteria
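
The "one and only one" behaviour can be sketched on a plain list as follows. This is an illustration under stated assumptions, not TaskQuerySet's code: `filter_tasks` and `get_one` are hypothetical helpers, and the use of ValueError is an assumption, since the exception remake raises is not documented here.

```python
class DummyTask:
    """Stand-in task with the same attributes as the filter example."""

    def __init__(self, i):
        self.i = i
        self.j = i % 3 == 0


def filter_tasks(tasks, **kwargs):
    """Keep tasks whose attributes equal every given key-value pair."""
    return [t for t in tasks
            if all(getattr(t, k) == v for k, v in kwargs.items())]


def get_one(tasks, **kwargs):
    """Return the single matching task, or raise if there is not exactly one."""
    matches = filter_tasks(tasks, **kwargs)
    if len(matches) != 1:
        raise ValueError(f'expected exactly one task, found {len(matches)}')
    return matches[0]


tasks = [DummyTask(i) for i in range(10)]
print(get_one(tasks, i=3).i)

try:
    get_one(tasks, j=True)  # four tasks have j == True
except ValueError as err:
    print(err)
```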

in_rule(rule)

Filter all tasks to those in provided rule.

Parameters

rule – rule to filter on

Returns

filtered TaskQuerySet

last()

Get last task.

run(force=False)

Run all tasks.

Parameters

force – force run

status(reasons=False, task_diff=False)

Print status for all tasks.

Parameters
  • reasons – show reasons for why tasks have their statuses

  • task_diff – show a diff of the tasks’ rule_run methods

remake.loader

remake.loader.load_remake(filename: Union[str, pathlib.Path], finalize: bool = False) → Remake

Load a remake instance from a file.

>>> ex1 = load_remake('examples/ex1.py')
>>> ex1.finalized
False

Parameters
  • filename – file that contains exactly one remake = Remake()

  • finalize – finalize the remake instance

Returns

instance of Remake

remake.util

remake.util.format_path(path: Union[pathlib.Path, str], **kwargs) → pathlib.Path

Format a path based on **kwargs.

>>> format_path(Path('some/path/{dirname}/{filename}'), dirname='output', filename='out.txt')
PosixPath('some/path/output/out.txt')

Parameters
  • path – path with python format-style braces

  • kwargs – keyword args to substitute

Returns

formatted path
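
A function with this behaviour can be approximated with str.format. The sketch below assumes the substitution is plain format-style replacement on the path string. `format_path_sketch` is a hypothetical stand-in, not the library function.

```python
from pathlib import Path


def format_path_sketch(path, **kwargs):
    """Substitute format-style braces in a path and return a Path."""
    return Path(str(path).format(**kwargs))


result = format_path_sketch(
    Path('some/path/{dirname}/{filename}'),
    dirname='output',
    filename='out.txt',
)
print(result)
```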

remake.util.load_module(local_filename: Union[str, pathlib.Path])

Use Python internals to load a Python module from a filename.

>>> load_module('examples/ex1.py').__name__
'ex1'

Parameters

local_filename – name of module to load

Returns

module
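
Loading a module from a filename is typically done with importlib. The following sketch shows the standard pattern, and whether load_module is implemented this way is an assumption. `load_module_sketch` and the temporary file `ex_demo.py` are hypothetical.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_module_sketch(filename):
    """Load a Python source file as a module named after the file stem."""
    path = Path(filename)
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    # Execute the file's code in the new module's namespace.
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmpdir:
    source = Path(tmpdir) / 'ex_demo.py'
    source.write_text('VALUE = 42\n')
    module = load_module_sketch(source)
    print(module.__name__, module.VALUE)
```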

remake.util.sha1sum(path: pathlib.Path, buf_size: int = 65536) → str

Calculate sha1 sum for a path.

>>> sha1sum(Path('examples/data/in.txt'))
'3620f0704e803d65098e5f2b836633b166e25474'

Parameters
  • path – file path to calculate sha1 sum for

  • buf_size – buffer size for reading file

Returns

sha1 sum of input path
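
The buf_size parameter suggests the file is read in chunks rather than all at once. A minimal sketch of chunked hashing with hashlib follows. `sha1sum_sketch` is a hypothetical stand-in and the test data is made up for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha1sum_sketch(path, buf_size=65536):
    """Hash a file in buf_size chunks so large files need little memory."""
    sha1 = hashlib.sha1()
    with Path(path).open('rb') as f:
        while True:
            chunk = f.read(buf_size)
            if not chunk:
                break
            sha1.update(chunk)
    return sha1.hexdigest()


# Data larger than one buffer, so several chunks are read.
data = b'remake\n' * 50000

with tempfile.TemporaryDirectory() as tmpdir:
    target = Path(tmpdir) / 'in.txt'
    target.write_bytes(data)
    chunked = sha1sum_sketch(target)

# The chunked digest equals the digest of the whole content.
print(chunked == hashlib.sha1(data).hexdigest())
```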

remake.util.sysrun(cmd)

Run a system command and return a CompletedProcess.

>>> print(sysrun('echo "hello"').stdout)
hello

Raises CalledProcessError if cmd fails (exits with a non-zero status). To access the output, use sysrun(cmd).stdout.
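
The behaviour described matches subprocess.run with check=True. The sketch below assumes the command is run through the shell with output captured as text. Those choices, and the name `sysrun_sketch`, are assumptions and not taken from remake's source.

```python
import subprocess


def sysrun_sketch(cmd):
    """Run cmd in a shell, capturing output as text; raise on non-zero exit."""
    return subprocess.run(
        cmd,
        check=True,  # raise CalledProcessError on failure
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        encoding='utf8',
    )


result = sysrun_sketch('echo hello')
print(result.stdout.strip())

try:
    sysrun_sketch('exit 3')
except subprocess.CalledProcessError as err:
    print('command failed with exit status', err.returncode)
```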

remake.util.tmp_to_actual_path(path: pathlib.Path) → pathlib.Path

Convert a temporary remake path to an actual path.

When writing to an output path, remake uses a temporary path then copies to the actual path on completion. This function can be used to see the actual path from the temporary path.

>>> tmp_to_actual_path(Path('.remake.tmp.output.txt'))
PosixPath('output.txt')

Parameters

path – temporary remake path

Returns

actual path
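
Judging from the doctest above, the temporary name appears to be the actual filename with a `.remake.tmp.` prefix. The sketch below inverts that mapping under this assumption. `TMP_PREFIX` and `tmp_to_actual_path_sketch` are hypothetical names, and the real naming scheme may differ.

```python
from pathlib import Path

# Assumed prefix, inferred from the doctest; not confirmed by remake's source.
TMP_PREFIX = '.remake.tmp.'


def tmp_to_actual_path_sketch(path):
    """Strip the assumed temporary prefix from the filename of path."""
    if not path.name.startswith(TMP_PREFIX):
        raise ValueError(f'not a temporary path: {path}')
    return path.parent / path.name[len(TMP_PREFIX):]


print(tmp_to_actual_path_sketch(Path('.remake.tmp.output.txt')))
print(tmp_to_actual_path_sketch(Path('data/.remake.tmp.out.txt')))
```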