Input / Output

actantial.io.load_annotations(data, label_folder, actor_labels_path=None, object_labels_path=None, verbose=True, **kwargs)

Load actant annotations from a run output folder into a DataFrame.

Matches each row to an annotation file in label_folder by its id value, then extracts actant roles from each file. Rows without a matching file receive None for all actant columns.
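The matching step can be pictured with a small standard-library sketch. It is not the library's implementation: the helper name `match_annotations`, the `<id>.json` file naming scheme, and the three role names are assumptions made purely for illustration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical subset of actant roles; the real list lives in ACTANTS.
ROLES = ["Subject", "Object", "Sender"]


def match_annotations(ids, label_folder, roles=ROLES):
    """Return one dict per id; ids without a file get None for every role."""
    folder = Path(label_folder)
    rows = []
    for text_id in ids:
        path = folder / f"{text_id}.json"  # assumed naming scheme
        if path.exists():
            annotation = json.loads(path.read_text(encoding="utf-8"))
            row = {role: annotation.get(role) for role in roles}
        else:
            # No matching annotation file: every actant column is None.
            row = {role: None for role in roles}
        rows.append({"id": text_id, **row})
    return rows


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a1.json").write_text(
        json.dumps({"Subject": "government", "Object": "reform"}),
        encoding="utf-8",
    )
    rows = match_annotations(["a1", "a2"], tmp)
```

Here `a1` picks up its annotated roles, while `a2` has no file and ends up with None in every actant column.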

When actor_labels_path or object_labels_path is provided, values not in the corresponding allowed set are replaced with None. This is useful for closed annotation, where the LLM may still assign labels outside the predefined label set.
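A minimal sketch of this closed-label filtering, under stated assumptions: the allowed labels are passed as in-memory sets rather than read from YAML files, and `filter_labels` is a hypothetical helper, not part of the package.

```python
def filter_labels(row, actor_labels, object_labels):
    """Replace values outside the allowed set with None.

    The Object role is checked against object_labels; every other
    role is checked against actor_labels.
    """
    filtered = {}
    for role, value in row.items():
        allowed = object_labels if role == "Object" else actor_labels
        filtered[role] = value if value in allowed else None
    return filtered


row = {"Subject": "government", "Opponent": "aliens", "Object": "reform"}
filtered = filter_labels(
    row,
    actor_labels={"government", "citizens"},
    object_labels={"reform"},
)
```

In this example "aliens" is not an allowed actor label, so the Opponent value becomes None while the other two values are kept.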

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| data | DataFrame | DataFrame with at least an id column. | required |
| label_folder | str | Path to the folder containing per-text JSON annotation files, as produced by run_extract. | required |
| actor_labels_path | Optional[str] | Path to a YAML file with allowed actor labels. If provided, values for non-Object actants not in the list are replaced with None. | None |
| object_labels_path | Optional[str] | Path to a YAML file with allowed object labels. If provided, values for the Object actant not in the list are replaced with None. | None |
| verbose | bool | If True, print warnings about missing annotation files and a per-actant summary of dropped unknown actors. | True |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | A copy of the input DataFrame with actant columns added. |

actantial.io.load_actors(data, file_path_column='file_name', actant_columns=ACTANTS, select_actor='first', actor_labels_path=None, object_labels_path=None, verbose=True, missing_actant_token='[UNK]')

Read per-text JSON annotation files and add actant columns to the DataFrame.

Each file is expected to map actant role names to actor values. When multiple actors are listed for a role, select_actor controls whether to keep only the first or join them all.
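The two select_actor strategies can be sketched as follows. This is an illustration only: `select_actors` is a hypothetical helper, role values are assumed to be lists of strings, and the exact separator used by "combine" is assumed to be a comma followed by a space.

```python
def select_actors(actors, strategy="first"):
    """Collapse the actors listed for one role into a single value."""
    if not actors:
        return None  # nothing annotated for this role
    if isinstance(actors, str):
        return actors  # a single actor given as a plain string
    if strategy == "first":
        return actors[0]
    if strategy == "combine":
        return ", ".join(actors)  # assumed separator
    raise ValueError(f"unknown strategy: {strategy!r}")


first = select_actors(["government", "parliament"], "first")
combined = select_actors(["government", "parliament"], "combine")
```

With "first" the result is a single label, which is easier to compare against a closed label set; "combine" keeps all annotated actors at the cost of producing composite strings.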

When actor_labels_path or object_labels_path is provided, values not in the corresponding allowed set are replaced with None.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| data | DataFrame | DataFrame containing a column with file paths to JSON annotation files. | required |
| file_path_column | str | Name of the column containing the file paths. | 'file_name' |
| actant_columns | Optional[list[str]] | List of actants to extract from the JSON files. Defaults to the global ACTANTS list. | ACTANTS |
| select_actor | Literal['first', 'combine'] | Strategy for handling multiple actors per actant role. "first" keeps only the first actor; "combine" joins all actors into a comma-separated string. | 'first' |
| actor_labels_path | Optional[str] | Path to a YAML file with allowed actor labels. If provided, actor values for non-Object actants not in the list are replaced with None. | None |
| object_labels_path | Optional[str] | Path to a YAML file with allowed object labels. If provided, actor values for the Object actant not in the list are replaced with None. | None |
| verbose | bool | If True, print a per-actant summary of dropped unknown actors. | True |
| missing_actant_token | Optional[str] | Token used in the data to denote a missing or unknown actant. Occurrences are replaced with None. Set to None to disable. Defaults to "[UNK]". | '[UNK]' |
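The effect of missing_actant_token can be illustrated with a short sketch. `replace_missing` is a hypothetical helper written for this example, and it operates on a plain dict rather than a DataFrame.

```python
def replace_missing(row, missing_actant_token="[UNK]"):
    """Map the missing-actant token to None; pass None to disable."""
    if missing_actant_token is None:
        return dict(row)  # replacement disabled, values kept as-is
    return {
        role: (None if value == missing_actant_token else value)
        for role, value in row.items()
    }


row = {"Subject": "government", "Helper": "[UNK]"}
cleaned = replace_missing(row)
untouched = replace_missing(row, missing_actant_token=None)
```

With the default token the Helper value becomes None; with the token set to None the literal string "[UNK]" is left in place.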

Returns:

| Type | Description |
| --- | --- |
| DataFrame | A copy of the input DataFrame with one column added per actant role. |