Python API¶
Remake classes¶
- class remake.Remake(name: Optional[str] = None, config: Optional[dict] = None, special_paths: Optional[remake.special_paths.SpecialPaths] = None)¶
Core class. A remakefile is defined by creating an instance of Remake.
Acts as an entry point for running all tasks from Python and for retrieving information about the state of any task. Contains a list of all tasks added by any TaskRule.
A remakefile must contain:
>>> demo = Remake()
This must be near the top of the file - after the imports but before any TaskRule is defined.
- all_ancestors(tasks)¶
Find all ancestors of tasks
- Parameters
tasks – tasks to start from
- Returns
all ancestors
- all_descendants(tasks)¶
Find all descendants of tasks
- Parameters
tasks – tasks to start from
- Returns
all descendants
- ancestors(task)¶
All ancestors of a given task.
- Parameters
task – task to start from
- Returns
all ancestors
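The ancestor and descendant queries above amount to a traversal of the task DAG. The sketch below is not remake's implementation: it assumes a hypothetical `parents` dict that maps a task name to the names of the tasks it depends on directly, and collects every transitive ancestor with a breadth-first search.

```python
from collections import deque


def all_ancestors(parents, start):
    """Return every task reachable from `start` by following parent links.

    `parents` is a hypothetical dict mapping a task name to the names of
    the tasks it depends on directly (an assumption for illustration).
    """
    seen = set()
    queue = deque(start)
    while queue:
        task = queue.popleft()
        for parent in parents.get(task, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical DAG: a -> b -> d and a -> c -> d
parents = {'b': ['a'], 'c': ['a'], 'd': ['b', 'c']}
ancestors_of_d = all_ancestors(parents, ['d'])
```

Descendants can be found the same way by inverting the mapping so that each task points at the tasks that depend on it.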
- property completed_tasks¶
- configure(print_reasons: bool, executor: str, display: str)¶
Allow Remake object to be configured after creation.
- Parameters
print_reasons – print reason for running individual task
executor – name of which remake.executor to use
display – how to display task status after each task is run
- current_remake = {}¶
- descendants(task)¶
All descendants of a given task.
- Parameters
task – task to start from
- Returns
all descendants
- display_task_dag()¶
Display all tasks as a Directed Acyclic Graph (DAG)
- file_info(filenames)¶
File info for all given files.
- Parameters
filenames – filenames to get info for
- Returns
dict containing all info
- finalize()¶
Finalize this Remake object
- property finalized¶
Is this finalized (i.e. ready to be run)?
- find_task(task_path_hash_key: Union[remake.task.Task, str])¶
Find a task from its path_hash_key.
- Parameters
task_path_hash_key – key of task
- Returns
found task
- find_tasks(task_path_hash_keys)¶
Find all tasks given by their path hash keys
- Parameters
task_path_hash_keys – list of path hash keys
- Returns
all found tasks
- list_files(filetype=None, exists=False, produced_by_rule=None, used_by_rule=None, produced_by_task=None, used_by_task=None)¶
List all files subject to criteria.
- Parameters
filetype – one of input_only, output_only, input, output, inout
exists – whether file exists
produced_by_rule – whether file is produced by rule
used_by_rule – whether file is used by rule
produced_by_task – whether file is produced by task (path hash key)
used_by_task – whether file is used by task (path hash key)
- Returns
all matching files
- list_rules()¶
List all rules
- list_tasks(tfilter=None, rule=None, requires_rerun=False, uses_file=None, produces_file=None, ancestor_of=None, descendant_of=None)¶
List all tasks subject to requirements.
- Parameters
tfilter – dict of key/value pairs to filter tasks on
rule – rule that tasks belong to
requires_rerun – whether tasks require rerun
uses_file – whether tasks use a given file
produces_file – whether tasks produce a given file
ancestor_of – whether tasks are an ancestor of this task (path hash key)
descendant_of – whether tasks are a descendant of this task (path hash key)
- Returns
all matching tasks
- property name¶
- property pending_tasks¶
- property remaining_tasks¶
- remakes = {}¶
- rerun_required()¶
Rerun status of this Remake object.
- Returns
True if any tasks remain to be run
- reset()¶
Reset the internal state
- run_all(force=False)¶
Run all tasks.
- Parameters
force – force rerun of each task
- run_one()¶
Run the next pending task
- run_random()¶
Run a random task (pot luck out of pending)
- run_requested(requested, force=False, handle_dependencies=False)¶
Run requested tasks.
- Parameters
requested – tasks requested to be run
force – force rerun of each task
handle_dependencies – add all ancestor tasks to ensure given tasks can be run
- short_status(mode='logger.info')¶
Log/print a short status line.
- Parameters
mode – ‘logger.info’ or ‘print’
- task_info(task_path_hash_keys)¶
Task info for all given tasks.
- Parameters
task_path_hash_keys – task hash keys for tasks
- Returns
dict containing all info
- task_status(task: remake.task.Task) → str¶
Get the status of a task.
- Parameters
task – task to get status for
- Returns
status
- class remake.SpecialPaths(**paths)¶
Special paths to use for all input/output filenames.
When tasks use inputs or create outputs, they are referenced by their filesystem path. This class makes it easy to define special paths that are used internally to locate the actual file. For example, CWD is a special path for the current working directory. This can be used to make paths consistent across different machines. If machine A has a file at path /A/data/path, and machine B has a file at path /B/data/path, a special path called DATA could be set up, pointing to the right path on each machine. E.g. on machine A:
>>> special_paths = SpecialPaths(DATA='/A/data')
>>> special_paths.DATA
PosixPath('/A/data')
This must be passed into Remake to take effect:
>>> from remake import Remake
>>> demo = Remake(special_paths=special_paths)
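The idea of one named root per machine can be illustrated without remake itself. The following is a minimal sketch using only `pathlib`; the `MACHINE_ROOTS` table and the `data_file` helper are hypothetical names, and how remake resolves special paths internally is not shown here.

```python
from pathlib import Path

# Hypothetical per-machine roots; only the DATA value differs between hosts.
MACHINE_ROOTS = {
    'machine_a': {'DATA': Path('/A/data')},
    'machine_b': {'DATA': Path('/B/data')},
}


def data_file(machine, relative):
    """Build a path under the DATA root configured for `machine`."""
    return MACHINE_ROOTS[machine]['DATA'] / relative


path_a = data_file('machine_a', 'path')
path_b = data_file('machine_b', 'path')
```

The same relative name resolves to /A/data/path on one machine and /B/data/path on the other, which is the consistency the special path is meant to provide.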
- class remake.TaskRule(task_ctrl, func, inputs, outputs, *, force=False, depends_on=())¶
Core class. Defines a set of tasks in a remakefile.
Each class must have class-level properties: rule_inputs, rule_outputs, and each must have a method: rule_run. Each output file must be unique within a remakefile. In the rule_run method, the inputs and outputs are available through e.g. the self.inputs property.
>>> demo = Remake()
>>> class TaskSet(TaskRule):
...     rule_inputs = {'in': 'infile'}
...     rule_outputs = {'out': 'outfile'}
...     def rule_run(self):
...         self.outputs['out'].write_text(self.inputs['in'].read_text())
>>> len(TaskSet.tasks)
1
Each class can also optionally define a var_matrix and dependency functions/classes. var_matrix should be a dict with string keys and a list of values for each key. One task is created for each element of the itertools.product of these lists, and the values are substituted into the inputs and outputs.
>>> def fn():
...     print('in fn')
>>> class TaskSet2(TaskRule):
...     rule_inputs = {'in': 'infile'}
...     rule_outputs = {'out_{i}{j}': 'outfile_{i}{j}'}
...     var_matrix = {'i': [1, 2], 'j': [3, 4]}
...     dependencies = [fn]
...     def rule_run(self):
...         fn()
...         self.outputs[f'out_{self.i}{self.j}'].write_text(str(self.i) + self.inputs['in'].read_text())
>>> len(TaskSet2.tasks)
4
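The count of 4 follows from the Cartesian product of the two value lists. The sketch below reproduces that expansion with the standard library only. It is an illustration of the arithmetic, not remake's own code, and the variable names `combinations` and `output_names` are chosen here for the example.

```python
import itertools

var_matrix = {'i': [1, 2], 'j': [3, 4]}
output_template = 'outfile_{i}{j}'

# One combination per element of the Cartesian product of the value lists.
keys = list(var_matrix)
combinations = [
    dict(zip(keys, values))
    for values in itertools.product(*(var_matrix[k] for k in keys))
]

# Substitute each combination into the template with str.format.
output_names = [output_template.format(**combo) for combo in combinations]
```

Each combination yields a distinct output name, which is consistent with the requirement that every output file be unique within a remakefile.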
Note, all tasks created by these TaskRule are added to the Remake object:
>>> len(demo.tasks)
5
When the remakefile is run ($ remake run on the command line), all the tasks will be triggered according to their ordering. If any of the rule_run methods is changed, then those tasks will be rerun, and if their output is different, subsequent tasks will be rerun.
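Assuming that "their ordering" means dependency order, a valid run order is one in which every task comes after the tasks whose outputs it consumes. The sketch below shows such an ordering with `graphlib.TopologicalSorter` from the standard library (Python 3.9 or later). The task names and the `dependencies` dict are hypothetical, and remake may compute its ordering differently.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency graph: each task maps to the tasks it needs first.
dependencies = {
    'plot': {'analyse'},
    'analyse': {'clean'},
    'clean': {'download'},
    'download': set(),
}

# static_order() yields tasks so that every task follows its dependencies.
run_order = list(TopologicalSorter(dependencies).static_order())
```

With this ordering, a change to one task only affects the tasks that come after it in the chain.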
- class remake.TaskQuerySet(iterable=None, task_ctrl=None)¶
List of tasks with some extra capabilities.
Provides some simple methods for filtering, running and displaying the status of all its tasks.
- exclude(**kwargs)¶
As filter, but exclude instead of include tasks.
- Parameters
kwargs – key-value pairs to exclude
- Returns
filtered tasks
- filter(cast_to_str=False, **kwargs)¶
Filter tasks based on kwargs.
>>> class DummyTask:
...     pass
>>> tasks = []
>>> for i in range(10):
...     task = DummyTask()
...     setattr(task, 'i', i)
...     setattr(task, 'j', i % 3 == 0)
...     tasks.append(task)
>>> task_query_set = TaskQuerySet(tasks, None)
>>> len(task_query_set.filter(j=True))
4
- Parameters
cast_to_str – Cast values to string first
kwargs – key-value pairs of task properties
- Returns
filtered TaskQuerySet
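Attribute-based filtering of this kind can be sketched as a plain function over a list. This is not the TaskQuerySet implementation: `filter_tasks` is a hypothetical helper, and its handling of missing attributes and of `cast_to_str` (both sides converted with `str` before comparison) is an assumption.

```python
from types import SimpleNamespace


def filter_tasks(tasks, cast_to_str=False, **kwargs):
    """Keep tasks whose attributes equal every key-value pair in kwargs."""
    def matches(task):
        for key, wanted in kwargs.items():
            # Treat a missing attribute as a non-match rather than an error.
            if not hasattr(task, key):
                return False
            actual = getattr(task, key)
            if cast_to_str:
                actual, wanted = str(actual), str(wanted)
            if actual != wanted:
                return False
        return True
    return [task for task in tasks if matches(task)]


tasks = [SimpleNamespace(i=i, j=i % 3 == 0) for i in range(10)]
multiples_of_three = filter_tasks(tasks, j=True)
as_strings = filter_tasks(tasks, cast_to_str=True, i='4')
```

Casting to strings is useful when the filter values arrive as text, for example from a command line, while the task attributes are integers.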
- first()¶
Get first task.
- get(**kwargs)¶
Get one and only one task.
- Parameters
kwargs – key-value pairs to get
- Returns
task meeting criteria
- in_rule(rule)¶
Filter all tasks to those in provided rule.
- Parameters
rule – rule to filter on
- Returns
filtered TaskQuerySet
- last()¶
Get last task.
- run(force=False)¶
Run all tasks.
- Parameters
force – force run
- status(reasons=False, task_diff=False)¶
Print status for all tasks.
- Parameters
reasons – show reasons for why tasks have their statuses
task_diff – show a diff of the tasks’ rule_run methods
remake.loader¶
- remake.loader.load_remake(filename: Union[str, pathlib.Path], finalize: bool = False) → Remake¶
Load a remake instance from a file.
>>> ex1 = load_remake('examples/ex1.py')
>>> ex1.finalized
False
- Parameters
filename – file that contains exactly one remake = Remake()
finalize – finalize the remake instance
- Returns
instance of Remake
remake.util¶
- remake.util.format_path(path: Union[pathlib.Path, str], **kwargs) → pathlib.Path¶
Format a path based on **kwargs.
>>> format_path(Path('some/path/{dirname}/{filename}'), dirname='output', filename='out.txt')
PosixPath('some/path/output/out.txt')
- Parameters
path – path with python format-style braces
kwargs – keyword args to substitute
- Returns
formatted path
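The behaviour described above can be approximated with `str.format` applied to the string form of the path. The function below is a sketch under that assumption. `format_path_sketch` is a hypothetical name and is not the library function.

```python
from pathlib import Path


def format_path_sketch(path, **kwargs):
    """Substitute keyword values into brace fields and return a Path."""
    return Path(str(path).format(**kwargs))


formatted = format_path_sketch(
    Path('some/path/{dirname}/{filename}'),
    dirname='output',
    filename='out.txt',
)
```

Accepting either a str or a Path and always returning a Path keeps callers from having to convert types themselves.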
- remake.util.load_module(local_filename: Union[str, pathlib.Path])¶
Use Python internals to load a Python module from a filename.
>>> load_module('examples/ex1.py').__name__
'ex1'
- Parameters
local_filename – name of module to load
- Returns
module
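Loading a module from a filename is typically done with `importlib.util`. The sketch below shows that standard recipe. It is an assumption that remake does something similar, and `load_module_sketch` and the throwaway file `ex_sketch.py` are hypothetical names used only for this example.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_module_sketch(filename):
    """Load a Python source file as a module named after the file stem."""
    path = Path(filename)
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Write a throwaway module to a temporary directory and load it.
with tempfile.TemporaryDirectory() as tmpdir:
    source = Path(tmpdir) / 'ex_sketch.py'
    source.write_text('VALUE = 42\n')
    loaded = load_module_sketch(source)

loaded_name = loaded.__name__
loaded_value = loaded.VALUE
```

The module name is taken from the file stem, which matches the doctest above where examples/ex1.py loads as 'ex1'.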
- remake.util.sha1sum(path: pathlib.Path, buf_size: int = 65536) → str¶
Calculate sha1 sum for a path.
>>> sha1sum(Path('examples/data/in.txt'))
'3620f0704e803d65098e5f2b836633b166e25474'
- Parameters
path – file path to calculate sha1 sum for
buf_size – buffer size for reading file
- Returns
sha1 sum of input path
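The buf_size parameter suggests that the file is hashed in chunks rather than read whole. A minimal sketch of that technique with `hashlib` follows. `sha1sum_sketch` is a hypothetical stand-in, not the library function, and the sample file is created only for the demonstration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha1sum_sketch(path, buf_size=65536):
    """Hash a file in buf_size chunks so large files are not read whole."""
    sha1 = hashlib.sha1()
    with Path(path).open('rb') as f:
        while True:
            chunk = f.read(buf_size)
            if not chunk:
                break
            sha1.update(chunk)
    return sha1.hexdigest()


with tempfile.TemporaryDirectory() as tmpdir:
    sample = Path(tmpdir) / 'in.txt'
    sample.write_bytes(b'hello\n')
    # A tiny buffer forces several read/update rounds.
    digest = sha1sum_sketch(sample, buf_size=2)
```

The digest does not depend on the buffer size, which only trades memory use against the number of read calls.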
- remake.util.sysrun(cmd)¶
Run a system command and return a CompletedProcess.
>>> print(sysrun('echo "hello"').stdout)
hello
Raises CalledProcessError if cmd fails. To access the output, use sysrun(cmd).stdout.
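A helper with this behaviour can be built on `subprocess.run`. The sketch below assumes the command is passed to the shell as a single string and that output is captured as text. `sysrun_sketch` is a hypothetical name, and the exact arguments remake passes are not confirmed by this page.

```python
import subprocess


def sysrun_sketch(cmd):
    """Run cmd through the shell, capture text output, raise on failure."""
    return subprocess.run(
        cmd,
        check=True,           # non-zero exit raises CalledProcessError
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        encoding='utf8',
    )


result = sysrun_sketch('echo hello')
greeting = result.stdout.strip()

try:
    sysrun_sketch('exit 3')
    failed_code = None
except subprocess.CalledProcessError as err:
    failed_code = err.returncode
```

Because `check=True` turns a non-zero exit status into an exception, callers do not need to inspect the return code themselves.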
- remake.util.tmp_to_actual_path(path: pathlib.Path) → pathlib.Path¶
Convert a temporary remake path to an actual path.
When writing to an output path, remake uses a temporary path then copies to the actual path on completion. This function can be used to see the actual path from the temporary path.
>>> tmp_to_actual_path(Path('.remake.tmp.output.txt'))
PosixPath('output.txt')
- Parameters
path – temporary remake path
- Returns
actual path
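The doctest above suggests that the temporary name is the actual filename with a `.remake.tmp.` prefix. Under that assumption, the conversion can be sketched as follows. `tmp_to_actual_sketch` and the `TMP_PREFIX` constant are hypothetical, and the behaviour for paths without the prefix is a choice made for this example.

```python
from pathlib import Path

# Assumed prefix, inferred from the doctest above; not confirmed elsewhere.
TMP_PREFIX = '.remake.tmp.'


def tmp_to_actual_sketch(path):
    """Strip the assumed temporary prefix from the final path component."""
    path = Path(path)
    if not path.name.startswith(TMP_PREFIX):
        raise ValueError(f'{path} is not a temporary path')
    return path.with_name(path.name[len(TMP_PREFIX):])


actual = tmp_to_actual_sketch(Path('out/.remake.tmp.output.txt'))
```

Only the final path component is changed, so the temporary and actual files live in the same directory.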