Runner

actantial.runner.run_extract(data, backend, output_dir, template, templates_dir=Path(__file__).parent / 'templates', actor_labels=None, object_labels=None, resume_timestamp=None, template_columns=None)

Run the actantial extraction pipeline over a DataFrame of texts.

Iterates over each row, renders the prompt template, calls the backend, parses the JSON output, and writes per-text result files to disk. Supports resuming an interrupted run by skipping texts that already have a saved result. Logs all activity to a timestamped log file under output_dir/logs/.

Results are saved under output_dir/actantial_models/{model_name}/{template}/{timestamp}/, with one JSON file per text ID and one file containing the full raw backend response.
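The directory layout and the resume behaviour described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper names `run_dir` and `pending_ids` are hypothetical, and the `<id>.json` file naming is an assumption (the documentation only states that there is one JSON file per text ID).

```python
from datetime import datetime
from pathlib import Path
from typing import Iterable, List, Optional


def run_dir(output_dir: Path, model_name: str, template: str,
            timestamp: Optional[str] = None) -> Path:
    """Build output_dir/actantial_models/{model_name}/{template}/{timestamp}/."""
    # A fresh run gets a new YYYYMMDD_HHMMSS stamp; a resumed run reuses the old one.
    stamp = timestamp or datetime.now().strftime("%Y%m%d_%H%M%S")
    return output_dir / "actantial_models" / model_name / template / stamp


def pending_ids(ids: Iterable[str], directory: Path) -> List[str]:
    """Return the IDs that have no saved result yet.

    Assumes results are stored as '<id>.json'; the real file naming may differ.
    """
    return [i for i in ids if not (directory / f"{i}.json").exists()]
```

Because the model name and template are part of the path, resuming with a different model or template would point at a different directory, which is consistent with the requirement that both match the original run.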

Parameters:

Name Type Description Default
data DataFrame

DataFrame with at least id and text columns.

required
backend LLMBackend

An initialised LLMBackend instance.

required
output_dir Path

Root directory for saving results and logs.

required
template str

Name of the prompt template to use. Must exist in templates_dir/{backend.model_name}/ or fall back to templates_dir/default/.

required
templates_dir Path

Root directory containing per-model template subdirectories and the shared default/ directory. Defaults to the built-in templates/ folder.

Path(__file__).parent / 'templates'
actor_labels Optional[list]

List of actor labels for closed-set annotation. Only used if the template supports actor_labels.

None
object_labels Optional[list]

List of object labels for closed-set annotation. Only used if the template supports object_labels.

None
resume_timestamp Optional[str]

Timestamp of a previous run to resume, in YYYYMMDD_HHMMSS format. Texts already processed in that run are skipped. The model and template must match the original run.

None
template_columns Optional[list[str]]

Column names from data to pass as additional template variables. Each name maps directly to a Jinja2 variable of the same name (e.g. the column "parent_post" becomes {{ parent_post }}). Columns must have string dtype; cast with data[col].astype(str) before calling if needed.

None
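The template lookup order described for the template parameter (a model-specific directory first, then the shared default/ directory) could look roughly like the sketch below. The function name `resolve_template`, the `.j2` file suffix, and the choice to raise `FileNotFoundError` are all assumptions made for illustration; the documentation does not specify them.

```python
from pathlib import Path


def resolve_template(templates_dir: Path, model_name: str, template: str,
                     suffix: str = ".j2") -> Path:
    """Prefer templates_dir/{model_name}/, then fall back to templates_dir/default/."""
    filename = f"{template}{suffix}"  # the suffix is an assumption, not from the docs
    for folder in (model_name, "default"):
        candidate = templates_dir / folder / filename
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(
        f"Template {template!r} not found for {model_name!r} or in default/"
    )
```

With this ordering, a model-specific template of the same name always overrides the one in default/.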