actantial

Runner

`actantial.runner.run_extract(data, backend, output_dir, template, templates_dir=Path(__file__).parent / 'templates', actor_labels=None, object_labels=None, resume_timestamp=None, template_columns=None)`

Run the actantial extraction pipeline over a DataFrame of texts.

Iterates over each row, renders the prompt template, calls the backend, parses the JSON output, and writes per-text result files to disk. Supports resuming an interrupted run by skipping texts that already have a saved result. Logs all activity to a timestamped log file under `output_dir/logs/`.
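The resume behaviour described above can be pictured with a minimal sketch. It is not the library's implementation: the helper `pending_ids` is hypothetical, and the `<id>.json` file naming is an assumption made for illustration only.

```python
import tempfile
from pathlib import Path


def pending_ids(text_ids, run_dir):
    """Return the IDs that have no saved result in run_dir yet.

    Assumption: each result is stored as ``<id>.json``; the file naming
    actually used by actantial may differ.
    """
    run_dir = Path(run_dir)
    return [tid for tid in text_ids if not (run_dir / f"{tid}.json").exists()]


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    # Simulate an interrupted run that already saved a result for "t1".
    (run_dir / "t1.json").write_text("{}")
    remaining = pending_ids(["t1", "t2", "t3"], run_dir)

print(remaining)  # only the texts without a saved result
```

Checking for an existing file per text keeps the resume step cheap, because no saved result has to be parsed to decide whether a text can be skipped.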

Results are saved under `output_dir/actantial_models/{model_name}/{template}/{timestamp}/`, with one JSON file per text ID and one file containing the full raw backend response.
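As a rough illustration of this layout, the sketch below assembles such a path with `pathlib`. The helper `result_dir` and the model and template names are hypothetical, and the timestamp is assumed to follow the `YYYYMMDD_HHMMSS` format documented for `resume_timestamp`.

```python
from datetime import datetime
from pathlib import Path


def result_dir(output_dir, model_name, template, timestamp):
    """Build the per-run results directory (hypothetical helper)."""
    return Path(output_dir) / "actantial_models" / model_name / template / timestamp


# Assumption: run timestamps use the YYYYMMDD_HHMMSS format.
stamp = datetime(2024, 1, 31, 9, 5, 0).strftime("%Y%m%d_%H%M%S")
path = result_dir("out", "example-model", "example_template", stamp)

print(stamp)
print(path.as_posix())
```

Because the model name and template are part of the path, runs with different models or templates do not overwrite each other's results.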

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `data` | `DataFrame` | DataFrame with at least `id` and `text` columns. | required |
| `backend` | `LLMBackend` | An initialised `LLMBackend` instance. | required |
| `output_dir` | `Path` | Root directory for saving results and logs. | required |
| `template` | `str` | Name of the prompt template to use. It is looked up in `templates_dir/{backend.model_name}/` first, then in `templates_dir/default/` as a fallback, and must exist in one of them. | required |
| `templates_dir` | `Path` | Root directory containing per-model template subdirectories and the shared `default/` directory. Defaults to the built-in `templates/` folder. | `Path(__file__).parent / 'templates'` |
| `actor_labels` | `Optional[list]` | List of actor labels for closed-set annotation. Only used if the template supports `actor_labels`. | `None` |
| `object_labels` | `Optional[list]` | List of object labels for closed-set annotation. Only used if the template supports `object_labels`. | `None` |
| `resume_timestamp` | `Optional[str]` | Timestamp of a previous run to resume, in `YYYYMMDD_HHMMSS` format. Texts already processed in that run are skipped. The model and template must match the original run. | `None` |
| `template_columns` | `Optional[list[str]]` | Column names from `data` to pass as additional template variables. Each name maps directly to a Jinja2 variable of the same name (e.g. `"parent_post"` → `{{ parent_post }}`). Columns must be string dtype; cast with `data[col].astype(str)` before calling if needed. | `None` |
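The template lookup described for `template` and `templates_dir` can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the helper `resolve_template` is hypothetical, and it assumes the template name is used directly as a file name (here `example.j2`).

```python
import tempfile
from pathlib import Path


def resolve_template(templates_dir, model_name, template):
    """Prefer the model-specific template, else fall back to default/.

    Assumption: `template` is the file name inside each subdirectory.
    """
    templates_dir = Path(templates_dir)
    for folder in (model_name, "default"):
        candidate = templates_dir / folder / template
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"No template {template!r} for {model_name!r} or default")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "default").mkdir()
    (root / "default" / "example.j2").write_text("placeholder")
    # Only the shared default exists, so the lookup falls back to it.
    fallback = resolve_template(root, "example-model", "example.j2").parent.name

    (root / "example-model").mkdir()
    (root / "example-model" / "example.j2").write_text("placeholder")
    # A model-specific template now exists and takes priority.
    specific = resolve_template(root, "example-model", "example.j2").parent.name

print(fallback, specific)
```

Trying the model-specific directory first lets a single template name be shared across models while still allowing a per-model override.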