ALDashboard.docx_wrangling

get_docx_run_text

def get_docx_run_text(document: Union[docx.document.Document, str],
                      paragraph_number: int, run_number: int) -> str

Get the text of a single run, addressed by a unified paragraph index that spans the body, tables, headers, and footers.

get_docx_run_items

def get_docx_run_items(
        document: Union[docx.document.Document, str]) -> List[List[Any]]

Return a list of [paragraph_index, run_index, run_text] items, one per run, across the body, tables, headers, and footers.
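The item shape described above lends itself to a simple lookup table. The following is a minimal sketch, assuming the items arrive as `[paragraph_index, run_index, run_text]` lists. The helper name `index_run_items` and the sample data are hypothetical, and no DOCX file is read here.

```python
from typing import Any, Dict, List, Tuple


def index_run_items(items: List[List[Any]]) -> Dict[Tuple[int, int], str]:
    """Map (paragraph_index, run_index) to run text for direct lookup."""
    return {(int(p), int(r)): str(text) for p, r, text in items}


# Hypothetical sample in the documented shape (not output from a real file).
sample_items = [
    [0, 0, "Dear "],
    [0, 1, "John Smith"],
    [1, 0, "Your hearing is on March 3."],
]

lookup = index_run_items(sample_items)
print(lookup[(0, 1)])
```

Keying on the (paragraph, run) pair mirrors how get_docx_run_text addresses a run, so the same two numbers can be reused with either function.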

update_docx

def update_docx(
        document: Union[docx.document.Document, str],
        modified_runs: List[Tuple[int, int, str, int]]
) -> docx.document.Document

Update the document with modified runs.

Arguments

  • document - the docx.Document object, or the path to the DOCX file
  • modified_runs - a list of tuples, each containing the paragraph number, the run number, the modified text, and an integer from -1 to 1 indicating whether a new paragraph should be inserted before or after the current paragraph.

Returns

The modified document.
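Because each entry in modified_runs is a positional 4-tuple, it can help to check the shape before handing the list to update_docx. The sketch below is one way to do that, not part of the library: `validate_modified_runs` is a hypothetical helper, and the assumption that paragraph and run numbers are non-negative is ours.

```python
from typing import Any, List, Sequence, Tuple


def validate_modified_runs(
        modified_runs: Sequence[Any]) -> List[Tuple[int, int, str, int]]:
    """Check each entry is (paragraph, run, text, new_paragraph flag)."""
    checked: List[Tuple[int, int, str, int]] = []
    for position, entry in enumerate(modified_runs):
        if len(entry) != 4:
            raise ValueError(f"entry {position}: expected 4 values")
        paragraph, run, text, new_paragraph = entry
        for value in (paragraph, run, new_paragraph):
            # bool is a subclass of int, so reject it explicitly.
            if not isinstance(value, int) or isinstance(value, bool):
                raise ValueError(f"entry {position}: indices must be int")
        # Assumption: paragraph and run numbers are non-negative.
        if paragraph < 0 or run < 0:
            raise ValueError(f"entry {position}: negative index")
        if not isinstance(text, str):
            raise ValueError(f"entry {position}: text must be str")
        # The documented range for the last value is -1 to 1.
        if new_paragraph not in (-1, 0, 1):
            raise ValueError(f"entry {position}: flag must be -1, 0, or 1")
        checked.append((paragraph, run, text, new_paragraph))
    return checked


# Hypothetical entry: replace run 1 of paragraph 0 with a Jinja2 expression.
runs = validate_modified_runs([(0, 1, "{{ users[0] }}", 0)])
```

The validated list can then be passed as the modified_runs argument.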

get_labeled_docx_runs

def get_labeled_docx_runs(
        docx_path: str,
        custom_people_names: Optional[List[Tuple[str, str]]] = None,
        openai_client: Optional[Any] = None,
        openai_api: Optional[str] = None,
        openai_base_url: Optional[str] = None,
        model: str = "gpt-5-nano",
        custom_prompt: Optional[str] = None,
        additional_instructions: Optional[str] = None,
        max_output_tokens: Optional[int] = None
) -> List[Tuple[int, int, str, int]]

Scan the DOCX and return a list of modified runs whose text has Jinja2 variable names inserted.

Arguments

  • docx_path - path to the DOCX file
  • custom_people_names - optional list of custom (name, description) pairs, e.g. [("clients", "the person benefiting from the form")]
  • openai_api - optional API key override. If omitted, ALToolbox default resolution is used.

Returns

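A list of tuples, each containing a paragraph number, a run number, the modified text of the run, and an integer indicating whether a new paragraph should be inserted (the same 4-tuple shape that update_docx accepts as modified_runs).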
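Since the labels are model guesses, it can be useful to review them against the original run text before applying them. The following is a rough sketch under stated assumptions: `preview_changes` is a hypothetical helper, the original items are in the `[paragraph_index, run_index, run_text]` shape, the labeled runs are 4-tuples as in the signature above, and all sample values (including the variable name `users[0]`) are invented for illustration.

```python
from typing import Any, List, Optional, Sequence, Tuple

Change = Tuple[int, int, Optional[str], str]


def preview_changes(
        original_items: Sequence[Sequence[Any]],
        labeled_runs: Sequence[Tuple[int, int, str, int]]) -> List[Change]:
    """List (paragraph, run, old_text, new_text) for runs whose text differs."""
    original = {(p, r): text for p, r, text in original_items}
    changes: List[Change] = []
    for p, r, new_text, _new_paragraph in labeled_runs:
        old_text = original.get((p, r))  # None if the run is unknown
        if old_text != new_text:
            changes.append((p, r, old_text, new_text))
    return changes


# Hypothetical data, not produced by a real model call.
original_items = [[0, 0, "Dear "], [0, 1, "John Smith"]]
labeled_runs = [(0, 0, "Dear ", 0), (0, 1, "{{ users[0] }}", 0)]

changes = preview_changes(original_items, labeled_runs)
for paragraph, run, old, new in changes:
    print(f"paragraph {paragraph}, run {run}: {old!r} -> {new!r}")
```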

modify_docx_with_openai_guesses

def modify_docx_with_openai_guesses(docx_path: str) -> docx.document.Document

Uses OpenAI to guess the variable names for a document and then modifies the document with the guesses.

Arguments

  • docx_path (str) - path to the DOCX file to modify

Returns

  • docx.Document - The modified document, ready to be saved to the same or a new path
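Because the returned document still has to be saved, one cautious pattern is to write it next to the original under a new name instead of overwriting the source. This is a sketch only: `label_and_save` and the `_labeled` suffix are hypothetical, and the labeling function is passed in as the `modify` argument so that the snippet makes no assumption about the import path. The only requirement is that the returned object has a `save(path)` method, as python-docx documents do.

```python
from pathlib import Path
from typing import Any, Callable


def label_and_save(docx_path: str,
                   modify: Callable[[str], Any],
                   suffix: str = "_labeled") -> Path:
    """Run `modify` on docx_path and save the result beside the original."""
    source = Path(docx_path)
    # e.g. forms/lease.docx -> forms/lease_labeled.docx
    target = source.with_name(f"{source.stem}{suffix}{source.suffix}")
    document = modify(str(source))
    document.save(str(target))
    return target
```

In practice, `modify` would be modify_docx_with_openai_guesses, for example `label_and_save("lease.docx", modify_docx_with_openai_guesses)`, where the file name is a placeholder.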