Python API¶
Remake classes¶
- class remake.Remake(name: Optional[str] = None, config: Optional[dict] = None, special_paths: Optional[remake.special_paths.SpecialPaths] = None)¶
- Core class. A remakefile is defined by creating an instance of Remake.
- Acts as an entry point for running all tasks via Python and for retrieving information about the state of any task. Contains a list of all tasks added by any TaskRule.
- A remakefile must contain:
    >>> demo = Remake()
- This must be near the top of the file: after the imports but before any TaskRule is defined.
- all_ancestors(tasks)¶
- Find all ancestors of tasks - Parameters
- tasks – tasks to start from 
- Returns
- all ancestors 
 
 - all_descendants(tasks)¶
- Find all descendants of tasks - Parameters
- tasks – tasks to start from 
- Returns
- all descendants 
 
 - ancestors(task)¶
- All ancestors of a given task. - Parameters
- task – task to start from 
- Returns
- all ancestors 
 
 - property completed_tasks¶
 - configure(print_reasons: bool, executor: str, display: str)¶
- Allow Remake object to be configured after creation. - Parameters
- print_reasons – print reason for running individual task 
- executor – name of which remake.executor to use 
- display – how to display task status after each task is run 
 
 
 - current_remake = {}¶
 - descendants(task)¶
- All descendants of a given task. - Parameters
- task – task to start from 
- Returns
- all descendants 
 
 - display_task_dag()¶
- Display all tasks as a Directed Acyclic Graph (DAG) 
 - file_info(filenames)¶
- File info for all given files. - Parameters
- filenames – filenames to get info for 
- Returns
- dict containing all info 
 
 - finalize()¶
- Finalize this Remake object 
 - property finalized¶
- Is this finalized (i.e. ready to be run)? 
 - find_task(task_path_hash_key: Union[remake.task.Task, str])¶
- Find a task from its path_hash_key. - Parameters
- task_path_hash_key – key of task 
- Returns
- found task 
 
 - find_tasks(task_path_hash_keys)¶
- Find all tasks given by their path hash keys - Parameters
- task_path_hash_keys – list of path hash keys 
- Returns
- all found tasks 
 
 - list_files(filetype=None, exists=False, produced_by_rule=None, used_by_rule=None, produced_by_task=None, used_by_task=None)¶
- List all files subject to criteria. - Parameters
- filetype – one of input_only, output_only, input, output, inout 
- exists – whether file exists 
- produced_by_rule – whether file is produced by rule 
- used_by_rule – whether file is used by rule 
- produced_by_task – whether file is produced by task (path hash key) 
- used_by_task – whether file is used by task (path hash key) 
 
- Returns
- all matching files 
 
 - list_rules()¶
- List all rules 
 - list_tasks(tfilter=None, rule=None, requires_rerun=False, uses_file=None, produces_file=None, ancestor_of=None, descendant_of=None)¶
- List all tasks subject to requirements. - Parameters
- tfilter – dict of key/value pairs to filter tasks on 
- rule – rule that tasks belong to 
- requires_rerun – whether tasks require rerun 
- uses_file – whether tasks use a given file 
- produces_file – whether tasks produce a given file 
- ancestor_of – whether tasks are an ancestor of this task (path hash key) 
- descendant_of – whether tasks are a descendant of this task (path hash key) 
 
- Returns
- all matching tasks 
 
 - property name¶
 - property pending_tasks¶
 - property remaining_tasks¶
 - remakes = {}¶
 - rerun_required()¶
- Rerun status of this Remake object. - Returns
- True if any tasks remain to be run 
 
 - reset()¶
- Reset the internal state 
 - run_all(force=False)¶
- Run all tasks. - Parameters
- force – force rerun of each task 
 
 - run_one()¶
- Run the next pending task 
 - run_random()¶
- Run a randomly chosen pending task 
 - run_requested(requested, force=False, handle_dependencies=False)¶
- Run requested tasks. - Parameters
- requested – tasks requested to be run 
- force – force rerun of each task 
- handle_dependencies – add all ancestor tasks to ensure given tasks can be run 
 
 
 - short_status(mode='logger.info')¶
- Log/print a short status line. - Parameters
- mode – ‘logger.info’ or ‘print’ 
 
 - task_info(task_path_hash_keys)¶
- Task info for all given tasks. - Parameters
- task_path_hash_keys – path hash keys of the tasks 
- Returns
- dict containing all info 
 
 - task_status(task: remake.task.Task) → str¶
- Get the status of a task. - Parameters
- task – task to get status for 
- Returns
- status 
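The ancestor and descendant queries above (all_ancestors, ancestors, all_descendants, descendants) amount to traversals of the task DAG. The following is a minimal sketch of that idea, not remake's implementation: the `parents` mapping and the function name `find_all_ancestors` are hypothetical and stand in for whatever structure remake uses internally.

```python
from collections import deque


def find_all_ancestors(parents, tasks):
    """Collect every task reachable by following dependency links upwards.

    `parents` is a hypothetical mapping: task name -> list of tasks it depends on.
    """
    seen = set()
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        for parent in parents.get(task, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Toy DAG: 'b' and 'c' depend on 'a'; 'd' depends on both 'b' and 'c'.
parents = {'b': ['a'], 'c': ['a'], 'd': ['b', 'c']}
ancestors_of_d = find_all_ancestors(parents, ['d'])
ancestors_of_a = find_all_ancestors(parents, ['a'])
```

Descendants can be found the same way by walking an inverted mapping (task name -> tasks that depend on it).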
 
 
- class remake.SpecialPaths(**paths)¶
- Special paths to use for all input/output filenames.
- When tasks use inputs or create outputs, they are referenced by their filesystem path. This class makes it easy to define special paths that are used internally to locate the actual file. For example, CWD is a special path for the current working directory. Special paths can be used to make paths consistent across different machines. If machine A has a file at /A/data/path and machine B has the same file at /B/data/path, a special path called DATA could be set up that points to the right directory on each machine. E.g. on machine A:
    >>> special_paths = SpecialPaths(DATA='/A/data')
    >>> special_paths.DATA
    PosixPath('/A/data')
- This must be passed into Remake to take effect:
    >>> from remake import Remake
    >>> demo = Remake(special_paths=special_paths)
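To illustrate why a named root makes paths comparable across machines, here is a small sketch using only pathlib. It does not use remake itself; the `ROOTS` table and the `to_special` helper are hypothetical and only mirror the machine A / machine B example above.

```python
from pathlib import PurePosixPath

# Hypothetical roots: the same logical name, DATA, points to a different
# directory on each machine.
ROOTS = {
    'machine_a': {'DATA': PurePosixPath('/A/data')},
    'machine_b': {'DATA': PurePosixPath('/B/data')},
}


def to_special(path, roots):
    """Express an absolute path as (special name, path relative to that root)."""
    path = PurePosixPath(path)
    for name, root in roots.items():
        try:
            return name, path.relative_to(root)
        except ValueError:
            # path does not lie under this root; try the next one
            continue
    return None, path


on_a = to_special('/A/data/path', ROOTS['machine_a'])
on_b = to_special('/B/data/path', ROOTS['machine_b'])
```

Both machines end up with the same machine-independent description of the file, which is the property the DATA example relies on.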
- class remake.TaskRule(task_ctrl, func, inputs, outputs, *, force=False, depends_on=())¶
- Core class. Defines a set of tasks in a remakefile.
- Each subclass must define the class-level properties rule_inputs and rule_outputs, and a rule_run method. Each output file must be unique within a remakefile. In the rule_run method, the inputs and outputs are available through e.g. the self.inputs property.
    >>> demo = Remake()
    >>> class TaskSet(TaskRule):
    ...     rule_inputs = {'in': 'infile'}
    ...     rule_outputs = {'out': 'outfile'}
    ...     def rule_run(self):
    ...         self.outputs['out'].write_text(self.inputs['in'].read_text())
    >>> len(TaskSet.tasks)
    1
- Each subclass can also optionally define a var_matrix, and dependency functions/classes. var_matrix should be a dict with string keys and a list of items for each key. One task is created for each element of the itertools.product of the lists. The values are substituted into the inputs/outputs.
    >>> def fn():
    ...     print('in fn')
    >>> class TaskSet2(TaskRule):
    ...     rule_inputs = {'in': 'infile'}
    ...     rule_outputs = {'out_{i}{j}': 'outfile_{i}{j}'}
    ...     var_matrix = {'i': [1, 2], 'j': [3, 4]}
    ...     dependencies = [fn]
    ...     def rule_run(self):
    ...         fn()
    ...         self.outputs[f'out_{self.i}{self.j}'].write_text(str(self.i) + self.inputs['in'].read_text())
    >>> len(TaskSet2.tasks)
    4
- Note that all tasks created by these TaskRules are added to the Remake object:
    >>> len(demo.tasks)
    5
- When the remakefile is run ($ remake run on the command line), all the tasks will be triggered according to their ordering. If any of the rule_run methods is changed, then those tasks will be rerun, and if their output is different, subsequent tasks will be rerun.
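The var_matrix expansion described above can be reproduced with itertools.product alone. This sketch does not import remake; it only shows how the four variable combinations and output names of the TaskSet2 example would arise, assuming the template is filled in with str.format-style substitution.

```python
from itertools import product

var_matrix = {'i': [1, 2], 'j': [3, 4]}
output_template = 'outfile_{i}{j}'

# One dict of variable values per element of the Cartesian product.
keys = list(var_matrix)
combos = [dict(zip(keys, values)) for values in product(*var_matrix.values())]

# Substitute each combination into the output template.
outputs = [output_template.format(**combo) for combo in combos]
```

Two values for i and two for j give 2 x 2 = 4 combinations, matching len(TaskSet2.tasks) in the example.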
- class remake.TaskQuerySet(iterable=None, task_ctrl=None)¶
- List of tasks with some extra capabilities.
- Provides some simple methods for filtering, running and displaying the status of all its tasks.
- exclude(**kwargs)¶
- As filter, but exclude instead of include tasks. - Parameters
- kwargs – key-value pairs to exclude 
- Returns
- filtered tasks 
 
 - filter(cast_to_str=False, **kwargs)¶
- Filter tasks based on kwargs.
    >>> class DummyTask:
    ...     pass
    >>> tasks = []
    >>> for i in range(10):
    ...     task = DummyTask()
    ...     setattr(task, 'i', i)
    ...     setattr(task, 'j', i % 3 == 0)
    ...     tasks.append(task)
    >>> task_query_set = TaskQuerySet(tasks, None)
    >>> len(task_query_set.filter(j=True))
    4
- Parameters
- cast_to_str – Cast values to string first 
- kwargs – key-value pairs of task properties 
 
- Returns
- filtered TaskQuerySet 
 
 - first()¶
- Get first task. 
 - get(**kwargs)¶
- Get one and only one task. - Parameters
- kwargs – key-value pairs to get 
- Returns
- task meeting criteria 
 
 - in_rule(rule)¶
- Filter all tasks to those in provided rule. - Parameters
- rule – rule to filter on 
- Returns
- filtered TaskQuerySet 
 
 - last()¶
- Get last task. 
 - run(force=False)¶
- Run all tasks. - Parameters
- force – force run 
 
 - status(reasons=False, task_diff=False)¶
- Print status for all tasks. - Parameters
- reasons – show reasons for why tasks have their statuses 
- task_diff – show a diff of the tasks’ rule_run methods 
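The filter and exclude methods above select tasks by comparing attributes against keyword arguments. As a rough sketch of that pattern (not the TaskQuerySet source), a list subclass can do the same thing; `QueryList` and `_matches` are hypothetical names, and the cast_to_str option is left out.

```python
from types import SimpleNamespace

_MISSING = object()  # sentinel so that a missing attribute never matches


def _matches(item, criteria):
    """True if every key in `criteria` equals the item's attribute of that name."""
    return all(getattr(item, key, _MISSING) == wanted
               for key, wanted in criteria.items())


class QueryList(list):
    """Hypothetical stand-in for a filterable list of tasks."""

    def filter(self, **kwargs):
        return QueryList(item for item in self if _matches(item, kwargs))

    def exclude(self, **kwargs):
        return QueryList(item for item in self if not _matches(item, kwargs))


# Same data as the filter doctest: j is True for i in {0, 3, 6, 9}.
tasks = QueryList(SimpleNamespace(i=i, j=(i % 3 == 0)) for i in range(10))
kept = tasks.filter(j=True)
dropped = tasks.exclude(j=True)
```

Because both methods return a new QueryList, calls can be chained, e.g. tasks.filter(j=True).exclude(i=0).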
 
 
 
remake.loader¶
- remake.loader.load_remake(filename: Union[str, pathlib.Path], finalize: bool = False) → Remake¶
- Load a remake instance from a file.
    >>> ex1 = load_remake('examples/ex1.py')
    >>> ex1.finalized
    False
- Parameters
- filename – file that contains exactly one remake = Remake() 
- finalize – finalize the remake instance 
 
- Returns
- instance of Remake 
 
remake.util¶
- remake.util.format_path(path: Union[pathlib.Path, str], **kwargs) → pathlib.Path¶
- Format a path based on **kwargs.
    >>> format_path(Path('some/path/{dirname}/{filename}'), dirname='output', filename='out.txt')
    PosixPath('some/path/output/out.txt')
- Parameters
- path – path with python format-style braces 
- kwargs – keyword args to substitute 
 
- Returns
- formatted path 
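The behaviour shown in the format_path doctest can be approximated with str.format applied to the path's string form. This is a sketch under that assumption, not the library's code; `format_path_sketch` is a hypothetical name.

```python
from pathlib import Path


def format_path_sketch(path, **kwargs):
    """Substitute {name} placeholders in a path and return a Path."""
    return Path(str(path).format(**kwargs))


result = format_path_sketch(
    Path('some/path/{dirname}/{filename}'),
    dirname='output',
    filename='out.txt',
)
```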
 
- remake.util.load_module(local_filename: Union[str, pathlib.Path])¶
- Use Python internals to load a Python module from a filename.
    >>> load_module('examples/ex1.py').__name__
    'ex1'
- Parameters
- local_filename – name of module to load 
- Returns
- module 
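One standard way to load a module from a file path is importlib.util. The sketch below assumes that approach and takes the module name from the file stem, as the doctest output 'ex1' suggests; whether remake.util.load_module works exactly this way is not confirmed here, and `load_module_sketch` is a hypothetical name.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_module_sketch(local_filename):
    """Load a Python source file as a module named after the file stem."""
    path = Path(local_filename)
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module


# Write a throwaway module to a temporary directory and load it.
with tempfile.TemporaryDirectory() as tmpdir:
    source = Path(tmpdir) / 'ex_demo.py'
    source.write_text('ANSWER = 42\n')
    module = load_module_sketch(source)
```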
 
- remake.util.sha1sum(path: pathlib.Path, buf_size: int = 65536) → str¶
- Calculate sha1 sum for a path.
    >>> sha1sum(Path('examples/data/in.txt'))
    '3620f0704e803d65098e5f2b836633b166e25474'
- Parameters
- path – file path to calculate sha1 sum for 
- buf_size – buffer size for reading file 
 
- Returns
- sha1 sum of input path 
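The buf_size parameter suggests the file is hashed in chunks rather than read into memory at once. A minimal sketch of chunked hashing with hashlib follows; it assumes that design and is not taken from the remake source (`sha1sum_sketch` is a hypothetical name).

```python
import hashlib
import tempfile
from pathlib import Path


def sha1sum_sketch(path, buf_size=65536):
    """Return the hex SHA-1 digest of a file, reading buf_size bytes at a time."""
    sha1 = hashlib.sha1()
    with Path(path).open('rb') as f:
        while True:
            chunk = f.read(buf_size)
            if not chunk:
                break
            sha1.update(chunk)
    return sha1.hexdigest()


# Hash a small temporary file, using a tiny buffer to force several reads.
with tempfile.TemporaryDirectory() as tmpdir:
    sample = Path(tmpdir) / 'in.txt'
    sample.write_bytes(b'abc')
    digest = sha1sum_sketch(sample, buf_size=2)
```

The digest does not depend on buf_size; the buffer only bounds memory use for large files.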
 
- remake.util.sysrun(cmd)¶
- Run a system command and return a CompletedProcess.
    >>> print(sysrun('echo "hello"').stdout)
    hello
- Raises CalledProcessError if cmd fails. To access the output, use sysrun(cmd).stdout.
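The behaviour described for sysrun (a CompletedProcess on success, CalledProcessError on failure, text available via .stdout) matches subprocess.run with check=True. The sketch below assumes those options and a shell being available; the exact arguments used by remake.util.sysrun are not confirmed here, and `sysrun_sketch` is a hypothetical name.

```python
import subprocess


def sysrun_sketch(cmd):
    """Run a shell command, capture its output as text, raise on non-zero exit."""
    return subprocess.run(
        cmd,
        check=True,   # raise CalledProcessError on a non-zero exit status
        shell=True,   # cmd is a single string interpreted by the shell
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        encoding='utf8',
    )


result = sysrun_sketch('echo hello')

# A failing command raises instead of returning.
try:
    sysrun_sketch('exit 3')
except subprocess.CalledProcessError as err:
    failed_code = err.returncode
```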
- remake.util.tmp_to_actual_path(path: pathlib.Path) → pathlib.Path¶
- Convert a temporary remake path to an actual path.
- When writing to an output path, remake uses a temporary path then copies to the actual path on completion. This function can be used to see the actual path from the temporary path.
    >>> tmp_to_actual_path(Path('.remake.tmp.output.txt'))
    PosixPath('output.txt')
- Parameters
- path – temporary remake path 
- Returns
- actual path
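Going by the doctest above, the temporary name is the actual file name with a '.remake.tmp.' prefix. Under that assumption, the conversion can be sketched as follows; the error handling for names without the prefix is a choice made for this example, not documented behaviour, and `tmp_to_actual_path_sketch` is a hypothetical name.

```python
from pathlib import Path

TMP_PREFIX = '.remake.tmp.'  # prefix inferred from the doctest above


def tmp_to_actual_path_sketch(path):
    """Strip the temporary prefix from the file name, keeping the directory."""
    if not path.name.startswith(TMP_PREFIX):
        raise ValueError(f'{path} is not a temporary remake path')
    return path.with_name(path.name[len(TMP_PREFIX):])


actual = tmp_to_actual_path_sketch(Path('.remake.tmp.output.txt'))
nested = tmp_to_actual_path_sketch(Path('data') / '.remake.tmp.output.txt')
```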