
ALToolbox.llms

chat_completion

def chat_completion(
system_message: Optional[str] = None,
user_message: Optional[str] = None,
openai_client: Optional[OpenAI] = None,
openai_api: Optional[str] = None,
temperature: float = 0.5,
json_mode=False,
model: str = "gpt-4o",
messages: Optional[List[Dict[str, str]]] = None,
skip_moderation: bool = True,
openai_base_url: Optional[str] = None,
max_output_tokens: Optional[int] = None,
max_input_tokens: Optional[int] = None,
reasoning_effort: Optional[Literal["minimal", "low", "medium",
"high"]] = None
) -> Union[List[Any], Dict[str, Any], str]

A light wrapper around the OpenAI chat completion endpoint.

Includes support for token limits, minimal error handling, and moderation.

Arguments

  • system_message str - The role the chat engine should play
  • user_message str - The message (data) from the user
  • openai_client Optional[OpenAI] - An OpenAI client object, optional. If omitted, will fall back to creating a new OpenAI client with the API key provided as an environment variable
  • openai_api Optional[str] - the API key for an OpenAI client, optional. If provided, a new OpenAI client will be created.
  • temperature float - The temperature to use for the GPT API
  • json_mode bool - Whether to use JSON mode for the GPT API. Requires the word "json" in the system message; it will be added automatically if you omit it.
  • model str - The model to use for the GPT API
  • messages Optional[List[Dict[str, str]]] - A list of messages to send to the chat engine. If provided, system_message and user_message will be ignored.
  • skip_moderation bool - Whether to skip the OpenAI moderation step, which may save seconds but risks banning your account. Only enable when you have full control over the inputs.
  • openai_base_url Optional[str] - The base URL for the OpenAI API. Defaults to value provided in the configuration or "https://api.openai.com/v1/".
  • max_output_tokens Optional[int] - The maximum number of tokens to return from the API. Defaults to 16380.
  • max_input_tokens Optional[int] - The maximum number of tokens to send to the API. Defaults to 128000.
  • reasoning_effort Optional[Literal["minimal", "low", "medium", "high"]] - The reasoning effort to use for thinking models. Defaults to the value provided in the configuration or "low".

Returns

A string with the response from the API endpoint or JSON data if json_mode is True
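For example, a minimal sketch of a call using the messages parameter (the prompt text is illustrative, and the call assumes an API key is available in the environment or configuration):

```python
# Sketch of a chat_completion call. The prompts here are illustrative,
# not part of the library.
messages = [
    {
        "role": "system",
        "content": (
            "You are a paralegal. Summarize the user's problem as JSON "
            'with the keys "issue" and "urgency".'
        ),
    },
    {"role": "user", "content": "My landlord changed the locks yesterday."},
]

# When messages is provided, system_message and user_message are ignored:
# result = chat_completion(messages=messages, json_mode=True, temperature=0)
# With json_mode=True, result is parsed JSON (a dict or list), not a string.
```

Because the system message already contains the word "json", json_mode would not need to rewrite it.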

extract_fields_from_text

def extract_fields_from_text(
text: str,
field_list: Dict[str, str],
openai_client: Optional[OpenAI] = None,
openai_api: Optional[str] = None,
temperature: float = 0,
model="gpt-5-nano",
reasoning_effort: Optional[Literal["minimal", "low", "medium",
"high"]] = "low"
) -> Dict[str, Any]

Extracts fields from text.

Arguments

  • text str - The text to extract fields from
  • field_list Dict[str, str] - A dictionary of fields to extract, with the key being the field name and the value being a description of the field
  • openai_client Optional[OpenAI] - An OpenAI client object. Defaults to None.
  • openai_api Optional[str] - An OpenAI API key. Defaults to None.
  • temperature float - The temperature to use for the OpenAI API. Defaults to 0.
  • model str - The model to use for the OpenAI API. Defaults to "gpt-5-nano".
  • reasoning_effort Optional[Literal["minimal", "low", "medium", "high"]] - The reasoning effort to use for the LLM. Defaults to "low".

Returns

  • dict - A dictionary of fields extracted from the text
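A sketch of a typical field_list (the field names, descriptions, and sample text are illustrative):

```python
# Sketch of inputs for extract_fields_from_text. The names and
# descriptions are illustrative.
field_list = {
    "client_name": "The full name of the client",
    "date_of_birth": "The client's date of birth, in ISO format",
    "case_number": "The docket or case number, if any",
}

letter = "Re: case 23-CV-0001. I am writing on behalf of Jane Doe ..."

# results = extract_fields_from_text(letter, field_list, temperature=0)
# results would be a dict keyed by the same field names.
```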

extract_fields_from_file

def extract_fields_from_file(
the_file: Union[DAFile, DAFileList],
field_list: Dict[str, str],
openai_client: Optional[OpenAI] = None,
openai_api: Optional[str] = None,
model: str = "gpt-5-nano",
reasoning_effort: Optional[Literal["minimal", "low", "medium",
"high"]] = "low",
llm_hint: Optional[str] = "",
process_pdfs_with_ai: bool = True,
ocr_images_and_pdfs: bool = False,
ocr_use_google: Optional[bool] = False) -> Dict[str, Any]

Extracts data (in the form of a list of expected fields) from a file using an LLM.

When the file is a PDF, relies on the OpenAI vision API to interpret the document. Note that this may increase cost, but will also improve accuracy.

If it is another file type that is convertible by Markitdown, it uses Markitdown to convert the file to text first.

Can be combined with define_fields_from_dict to populate Docassemble fields.

You can provide a hint to the LLM if it would help with data extraction. For example: "the document ID is usually found near the top right of the first page."

You should normally call this function in the background as it may take some time to run, especially when ocr_images_and_pdfs is True.

Arguments

  • the_file Union[DAFile, DAFileList] - The file to extract fields from
  • field_list Dict[str, str] - A dictionary of fields to extract, with the key being the field name and the value being a description of the field
  • openai_client Optional[OpenAI] - An OpenAI client object. Defaults to None.
  • openai_api Optional[str] - An OpenAI API key. Defaults to None.
  • model str - The model to use for the OpenAI API. Defaults to "gpt-5-nano".
  • reasoning_effort Optional[Literal["minimal", "low", "medium", "high"]] - The reasoning effort to use for the LLM. Defaults to "low".
  • llm_hint Optional[str] - An optional hint to improve processing of the text layer with the LLM. Defaults to "".
  • process_pdfs_with_ai bool - Whether to process PDFs with the OpenAI API (True) or convert them to text first (False). Defaults to True.
  • ocr_images_and_pdfs bool - Whether to perform OCR on PDFs before processing with the OpenAI API. Defaults to False. May be useful if the PDF's text layer is incomplete.
  • ocr_use_google Optional[bool] - Whether to use the Google Vision API instead of local OCR. Only applies if ocr_images_and_pdfs is True.

Returns

  • dict - A dictionary of fields extracted from the file
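A sketch of extracting fields from an uploaded file inside an interview. Here the_upload is assumed to be a DAFile collected earlier, and the field descriptions and hint text are illustrative:

```python
# Sketch of extract_fields_from_file inputs. the_upload would be a
# DAFile from the interview; the names and hint are illustrative.
field_list = {
    "document_id": "The unique ID printed on the document",
    "signing_date": "The date the document was signed",
}

llm_hint = (
    "The document ID is usually found near the top right of the first page."
)

# Typically run in the background, since it can take a while:
# results = extract_fields_from_file(
#     the_upload,
#     field_list,
#     llm_hint=llm_hint,
# )
# define_fields_from_dict(results) would then assign the extracted
# values to interview variables with the same names.
```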

match_goals_from_text

def match_goals_from_text(question: str,
user_response: str,
goals: Dict[str, str],
openai_client: Optional[OpenAI] = None,
openai_api: Optional[str] = None,
temperature: float = 0,
model="gpt-4o-mini") -> Dict[str, Any]

Reads a user's message and determines whether it meets a set of goals, with the help of an LLM.

Arguments

  • question str - The question that was asked to the user
  • user_response str - The user's response to the question
  • goals Dict[str,str] - A dictionary of goals to check, with the key being the goal name and the value being a description of the goal
  • openai_client Optional[OpenAI] - An OpenAI client object. Defaults to None.
  • openai_api Optional[str] - An OpenAI API key. Defaults to None.
  • temperature float - The temperature to use for the OpenAI API. Defaults to 0.
  • model str - The model to use for the OpenAI API. Defaults to "gpt-4o-mini".

Returns

A dictionary of fields extracted from the text
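A sketch of the goals dictionary this function expects (the goal names, question, and response are illustrative):

```python
# Sketch of match_goals_from_text inputs. The goals are illustrative.
question = "Tell us what happened with your landlord."
user_response = "He changed the locks on Tuesday and I can't get my things."

goals = {
    "described_event": "The user described what the landlord did",
    "gave_date": "The user said when the event happened",
    "stated_impact": "The user explained how the event affects them",
}

# matched = match_goals_from_text(question, user_response, goals)
# matched would map each goal name to whether the response satisfies it.
```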

classify_text

def classify_text(text: str,
choices: Dict[str, str],
default_response: str = "null",
openai_client: Optional[OpenAI] = None,
openai_api: Optional[str] = None,
temperature: float = 0,
model="gpt-4o-mini") -> str

Given a text, classify it into one of the provided choices with the assistance of a large language model.

Arguments

  • text str - The text to classify
  • choices Dict[str,str] - A dictionary of choices to classify the text into, with the key being the choice name and the value being a description of the choice
  • default_response str - The default response to return if the text cannot be classified. Defaults to "null".
  • openai_client Optional[OpenAI] - An OpenAI client object, optional. If omitted, will fall back to creating a new OpenAI client with the API key provided as an environment variable
  • openai_api Optional[str] - the API key for an OpenAI client, optional. If provided, a new OpenAI client will be created.
  • temperature float - The temperature to use for GPT. Defaults to 0.
  • model str - The model to use for the GPT API

Returns

The classification of the text.
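A sketch of classifying an intake message (the choice names and descriptions are illustrative):

```python
# Sketch of classify_text inputs. The categories are illustrative.
choices = {
    "housing": "Evictions, landlord-tenant disputes, and housing conditions",
    "family": "Divorce, custody, and child support",
    "consumer": "Debt collection, scams, and contract disputes",
}

text = "My landlord is trying to evict me even though I paid rent."

# label = classify_text(text, choices, default_response="null")
# label would be one of the keys above, or "null" if nothing fits.
```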

synthesize_user_responses

def synthesize_user_responses(messages: List[Dict[str, str]],
custom_instructions: Optional[str] = "",
openai_client: Optional[OpenAI] = None,
openai_api: Optional[str] = None,
temperature: float = 0,
model: str = "gpt-4o-mini") -> str

Given a first draft and a series of follow-up questions and answers, use an LLM to synthesize the user's responses into a single, coherent reply.

Arguments

  • messages List[Dict[str, str]] - A list of questions from the LLM and responses from the user
  • custom_instructions str - Custom instructions for the LLM to follow in constructing the synthesized response
  • openai_client Optional[OpenAI] - An OpenAI client object, optional. If omitted, will fall back to creating a new OpenAI client with the API key provided as an environment variable
  • openai_api Optional[str] - the API key for an OpenAI client, optional. If provided, a new OpenAI client will be created.
  • temperature float - The temperature to use for GPT. Defaults to 0.
  • model str - The model to use for the GPT API

Returns

A synthesized response from the user.
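A sketch of the messages structure, assuming the same role/content shape as chat_completion (the questions and answers are illustrative):

```python
# Sketch of a thread for synthesize_user_responses: alternating LLM
# questions and user answers, oldest first. Content is illustrative.
messages = [
    {"role": "assistant", "content": "What happened with your landlord?"},
    {"role": "user", "content": "He changed the locks."},
    {"role": "assistant", "content": "When did that happen?"},
    {"role": "user", "content": "Last Tuesday."},
]

# draft = synthesize_user_responses(
#     messages,
#     custom_instructions="Write in the first person, in plain language.",
# )
# draft would be a single narrative combining the user's answers.
```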

define_fields_from_dict

def define_fields_from_dict(field_dict: Dict[str, Any],
fields_to_ignore: Optional[List] = None) -> None

Assign values from a dictionary to corresponding Docassemble interview fields.

Docassemble and built-in keywords are never defined by this function. If fields_to_ignore is provided, those fields will also be ignored.

Arguments

  • field_dict Dict[str, Any] - A dictionary of fields to define, with the key being the field name and the value presumably taken from the output of extract_fields_from_text.
  • fields_to_ignore Optional[List] - A list of fields to ignore. Defaults to None. Should be used to ensure safety when defining fields from untrusted sources. E.g., ["user_is_logged_in"]
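A sketch of using the denylist to keep untrusted extraction output from defining sensitive variables (the dictionary contents are illustrative):

```python
# Sketch of define_fields_from_dict with fields_to_ignore. In practice
# the dict comes from extract_fields_from_text or extract_fields_from_file.
extracted = {
    "client_name": "Jane Doe",
    "user_is_logged_in": True,  # untrusted input trying to set a flag
}

# define_fields_from_dict(extracted, fields_to_ignore=["user_is_logged_in"])
# Afterward, client_name would be defined in the interview, but
# user_is_logged_in would be left untouched.
```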

Goal Objects

class Goal(DAObject)

A class to represent a goal.

Attributes

  • name str - The name of the goal
  • description str - A description of the goal
  • satisfied bool - Whether the goal is satisfied

response_satisfies_me_or_follow_up

def response_satisfies_me_or_follow_up(
messages: List[Dict[str, str]],
openai_client: Optional[OpenAI] = None,
model="gpt-4o-mini",
system_message: Optional[str] = None,
llm_assumed_role: Optional[str] = "teacher",
user_assumed_role: Optional[str] = "student") -> str

Returns the text of the next question to ask the user or the string "satisfied" if the user's response satisfies the goal.

Arguments

  • messages List[Dict[str, str]] - The messages to check
  • openai_client Optional[OpenAI] - An OpenAI client object. Defaults to None.
  • model str - The model to use for the OpenAI API. Defaults to "gpt-4o-mini".
  • system_message Optional[str] - The system message to use for the OpenAI API. Defaults to None.
  • llm_assumed_role Optional[str] - The role for the LLM to assume. Defaults to "teacher".
  • user_assumed_role Optional[str] - The role for the user to assume. Defaults to "student".

Returns

The text of the next question to ask the user or the string "satisfied"

get_next_question

def get_next_question(thread_so_far: List[Dict[str, str]],
openai_client: Optional[OpenAI] = None,
model="gpt-4o-mini") -> str

Returns the text of the next question to ask the user.

Arguments

  • thread_so_far List[Dict[str, str]] - The thread of the conversation so far
  • openai_client Optional[OpenAI] - An OpenAI client object. Defaults to None.
  • model str - The model to use for the OpenAI API. Defaults to "gpt-4o-mini".

Returns

The text of the next question to ask the user.

GoalDict Objects

class GoalDict(DADict)

A class to represent a DADict of Goals.

satisfied

def satisfied() -> bool

Returns True if all goals are satisfied, False otherwise.

Returns

True if all goals are satisfied, False otherwise.

GoalQuestion Objects

class GoalQuestion(DAObject)

A class to represent a question about a goal.

Attributes

  • goal Goal - The goal the question is about
  • question str - The question to ask the user
  • response str - The user's response to the question

complete

@property
def complete()

Returns True if the goal, question, and response attributes are present.

GoalSatisfactionList Objects

class GoalSatisfactionList(DAList)

A class to help ask the user questions until all goals are satisfied.

Uses an LLM to prompt the user with follow-up questions if the initial response isn't complete. By default, the number of follow-up questions is limited to 10.

This can consume a lot of tokens, as each follow-up has a chance to send the whole conversation thread to the LLM.

By default, this will use the OpenAI API key defined in the global configuration under this path:

open ai:
  key: sk-...

You can specify the path to an alternative configuration by setting the openai_configuration_path attribute.

This object does NOT accept the key as a direct parameter, as the key would be leaked in the user's answers.

Attributes

  • goals List[Goal] - The goals in the list, provided as a dictionary
  • goal_list GoalList - The list of Goals
  • question_limit int - The maximum number of follow-up questions to ask the user
  • question_per_goal_limit int - The maximum number of follow-up questions to ask the user per goal
  • initial_draft str - The initial draft of the user's response
  • initial_question str - The original question posed in the interview

mark_satisfied_goals

def mark_satisfied_goals() -> None

Marks goals as satisfied if the user's response satisfies the goal. This should be used as soon as the user gives their initial reply.

keep_going

def keep_going() -> bool

Returns True if there is at least one unsatisfied goal and if the number of follow-up questions asked is less than the question limit, False otherwise.

Returns

True if there is at least one unsatisfied goal and if the number of follow-up questions asked is less than the question limit, False otherwise.

need_more_questions

def need_more_questions() -> bool

Returns True if there is at least one unsatisfied goal, False otherwise.

Also has the side effect of checking the user's most recent response to see if it satisfies the goal and updating the next question to be asked.

Returns

True if there is at least one unsatisfied goal, False otherwise.

satisfied

def satisfied() -> bool

Returns True if all goals are satisfied, False otherwise.

Returns

True if all goals are satisfied, False otherwise.

get_next_goal_and_question

def get_next_goal_and_question() -> tuple

Returns the next unsatisfied goal, along with a follow-up question to ask the user, if relevant.

Returns

A tuple of (Goal, str) where the first item is the next unsatisfied goal and the second item is the next question to ask the user, if relevant. If the user's response to the last question satisfied the goal, returns (None, None).

synthesize_draft_response

def synthesize_draft_response() -> str

Returns a draft response that synthesizes the user's responses to the questions.

Returns

A draft response that synthesizes the user's responses to the questions.

provide_feedback

def provide_feedback(
feedback_prompt: str = "") -> Union[List[Any], Dict[str, Any], str]

Returns feedback to the user based on the goals they satisfied.

Arguments

  • feedback_prompt str - The prompt to use for the feedback. Defaults to "".

Returns

Feedback to the user based on the goals they satisfied.

GoalOrientedQuestion Objects

class GoalOrientedQuestion(DAObject)

A class to represent a question in a goal-oriented questionnaire.

Attributes

  • question str or dict - The question to ask the user (text or field structure)
  • response str or dict - The user's response to the question (text or field values)

complete

@property
def complete()

Returns True if the question and response attributes are present.

response_as_text

def response_as_text() -> str

Returns the response in a readable text format for the LLM.

Combines both structured responses from response_dict and the open-ended response. Uses original labels from the question for better context. Handles checkboxes specially by using .true_values() to show only checked items.

Returns

A formatted string representation of all responses.

build_field_list

def build_field_list() -> List[Dict[str, Any]]

Build a field list from this question object for use in docassemble fields.

Returns

A list of field dictionaries suitable for use with code: inside a fields: block

GoalOrientedQuestionList Objects

class GoalOrientedQuestionList(DAList)

A class to help ask the user follow-up questions until their response satisfies a single rubric.

Unlike GoalSatisfactionList which tracks multiple individual goals, this class focuses on a single rubric that describes what constitutes a complete response. The AI will continue asking follow-up questions until the response satisfies the rubric or the question limit is reached.

This can consume a lot of tokens, as each follow-up has a chance to send the whole conversation thread to the LLM.

By default, this will use the OpenAI API key defined in the global configuration under this path:

open ai:
  key: sk-...

You can specify the path to an alternative configuration by setting the openai_configuration_path attribute.

This object does NOT accept the key as a direct parameter, as the key would be leaked in the user's answers.

Attributes

  • rubric str - The rubric that describes what constitutes a complete response
  • question_limit int - The maximum number of follow-up questions to ask the user. Defaults to 6.
  • initial_draft str, optional - The initial draft of the user's response (for open-ended initial questions)
  • initial_draft_dict DADict, optional - Dictionary of structured responses (for structured initial questions)
  • initial_draft_response str, optional - Open-ended response for structured initial questions
  • initial_question str - The original question posed in the interview
  • use_structured_initial_question bool - If True, generate structured fields for initial question. Defaults to False.
  • model str - The model to use for the OpenAI API. Defaults to "gpt-5-nano".
  • llm_assumed_role str - The role for the LLM to assume. Defaults to "legal aid intake worker".
  • user_assumed_role str - The role for the user to assume. Defaults to "applicant for legal help".
  • skip_moderation bool - If True, skips moderation checks when generating structured fields. Defaults to True.
  • reasoning_effort Optional[Literal["minimal", "low", "medium", "high"]] - The level of reasoning effort to use when generating responses. Defaults to "low"; use "minimal" for increased speed.

generate_initial_question_fields

def generate_initial_question_fields() -> Dict[str, Any]

Generate structured fields for the initial question using the LLM.

This allows the initial question to use structured fields (radio, checkboxes, etc.) instead of requiring an open-ended narrative response.

Returns

A dict with the structure for the initial question fields.

build_initial_field_list

def build_initial_field_list() -> List[Dict[str, Any]]

Build field list for the initial question when using structured format.

Returns

A list of field dictionaries suitable for use with code: inside a fields: block

initial_response_as_text

def initial_response_as_text() -> str

Returns the initial response in text format, handling both string and dict formats.

Handles checkboxes specially by using .true_values() to show only checked items.

Returns

A formatted string representation of the initial response.

keep_going

def keep_going() -> bool

Returns True if the response is not yet complete and the question limit hasn't been reached.

Returns

True if more questions can be asked, False otherwise.

need_more_questions

def need_more_questions() -> bool

Returns True if the user needs to answer more questions, False otherwise.

Also has the side effect of checking the user's most recent response to see if it satisfies the rubric and updating the next question to be asked.

Returns

True if more questions are needed, False otherwise.

satisfied

def satisfied() -> bool

Returns True if the rubric is satisfied, False otherwise.

Returns

True if the rubric is satisfied, False otherwise.

get_next_question

def get_next_question() -> Optional[Union[str, Dict[str, Any]]]

Returns the text or field structure of the next question to ask the user.

Returns

The text/fields of the next question, or None if no more questions are needed.

synthesize_draft_response

def synthesize_draft_response() -> str

Returns a draft response that synthesizes the user's responses to the questions.

Returns

A draft response that synthesizes the user's responses to the questions.

provide_feedback

def provide_feedback(
feedback_prompt: str = "") -> Union[List[Any], Dict[str, Any], str]

Returns feedback to the user based on how well they satisfied the rubric.

Arguments

  • feedback_prompt str - The prompt to use for the feedback. Defaults to "".

Returns

Feedback to the user based on how well they satisfied the rubric.

IntakeQuestion Objects

class IntakeQuestion(DAObject)

A class to represent a question in an LLM-assisted intake questionnaire.

Attributes

  • question str - The question to ask the user
  • response str - The user's response to the question

complete

@property
def complete()

Returns True if the question and response attributes are present.

IntakeQuestionList Objects

class IntakeQuestionList(DAList)

Class to help create an LLM-assisted intake questionnaire.

The LLM will be provided a free-form set of in/out criteria (like that provided to a phone intake worker), an initial draft question from the user, and then guide the user through a series of follow-up questions to gather only enough information to determine if the user meets the criteria.

In/out criteria are often pretty short, so we do not make or support embeddings at the moment.

Attributes

  • criteria Dict[str, str] - A dictionary of criteria to match, indexed by problem type
  • problem_type_descriptions Dict[str, str] - A dictionary of descriptions of the problem types
  • problem_type str - The type of problem to match. E.g., a unit/department inside the law firm
  • initial_problem_description str - The initial description of the problem from the user
  • initial_question str - The original question posed in the interview
  • question_limit int - The maximum number of follow-up questions to ask the user. Defaults to 10.
  • model str - The model to use for the GPT API. Defaults to "gpt-4.1".
  • max_output_tokens int - The maximum number of tokens to return from the API. Defaults to 4096.
  • llm_role str - The role the LLM should play. Allows you to customize the script the LLM uses to guide the user. We have provided a default script that should work for most intake questionnaires.
  • llm_user_qualifies_prompt str - The prompt to use to determine if the user qualifies. We have provided a default prompt.
  • out_of_questions bool - Whether the user has run out of questions to answer
  • qualifies bool - Whether the user qualifies based on the criteria

need_more_questions

def need_more_questions() -> bool

Returns True if the user needs to answer more questions, False otherwise.

Also has the side effect of checking the user's most recent response to see if it satisfies the criteria and updating both the next question to be asked and the current qualification status.

Returns

True if the user needs to answer more questions, False otherwise.