interlab.queries.query_for_json
````python
import json
import re
from typing import Any, TypeVar

import pydantic
from fastapi.encoders import jsonable_encoder

from treetrace import FormatStr, TracingNode

from .json_examples import generate_json_example
from .json_parsing import find_and_parse_json_block
from .json_schema import get_json_schema, get_pydantic_model
from .query_failure import ParsingFailure
from .query_model import query_model

_FORMAT_PROMPT = """\
# Format instructions:\n
{deliberation}Write the answer as a single valid JSON value, conforming to the following JSON schema:\n
```json
{schema}
```\n
The answer should contain exactly one valid JSON code block delimited by "```json" and "```".
"""


_FORMAT_PROMPT_DELIBERATE = """\
1. Deliberate about the problem and write your thoughts as free-form text containing no JSON.
2. """


_FORMAT_PROMPT_EXAMPLE = """\
Here is an example JSON instance of the given schema.\n
```json
{example}
```\n"""


_FORMAT_VAR = "FORMAT_PROMPT"

TOut = TypeVar("TOut")


def query_for_json(
    model: Any,
    T: type,
    prompt: str,
    with_example: bool | TOut | str = False,
    with_cot: bool = False,
    max_repeats: int = 5,
    model_for_examples: Any = None,
) -> TOut:
    """
    Prompt `model` to produce a JSON representation of type `T`, and return it parsed and validated.

    * `model` can be a langchain normal or chat model, an interlab model, or just any callable object.
    * `T` needs to be a dataclass, a pydantic BaseModel, or a pydantic dataclass.
      While defining the classes, use field names and descriptions that will help the LLM fill in the data as you
      expect it. Recursive classes are not supported.
      After parsing, the models will also be validated.
    * `prompt` is any string query to the model. If `prompt` contains "{FORMAT_PROMPT}", it will be replaced with format
      instructions, the JSON schema, and the optional JSON example. Otherwise this information will be appended
      at the end of the prompt (this seems to work well).
    * `with_example=True` will generate an example JSON instance of the type from its schema, and the example will
      be added to the prompt. Examples can help smaller LLMs or with more complex tasks, but it is for now unclear
      how much they help larger models, and there is some chance they influence the answer.
      The example is generated by an LLM (default: gpt-3.5-turbo), so it tries to be a semantically meaningful
      instance of type T relative to field names and descriptions.
      In-memory and on-disk caching of the examples for schemas is TODO.
      You can also provide your own example by passing a JSON string or JSON-serializable object in `with_example`.
      Note that a provided example is not validated (TODO: validate it).
    * `with_cot=True` adds a minimal prompt for writing chain-of-thought reasoning before writing
      out the JSON response. This may improve response quality (via CoT deliberation) but has some risks:
      the models may include JSON in their deliberation (confusing the parser) or run out of token limit via
      lengthy deliberation.
    * `max_repeats` limits how many times the model will be queried before raising an exception -
      all models have some chance to fail to follow the instructions, and this gives them several chances.
      Repetition is triggered if valid JSON is not found in the output, or if it fails to validate
      against the schema or any validators in the dataclasses.
      Note there is no repetition on LLM model failure (the model is expected to take care of network failures etc.).
    * `model_for_examples` can specify a model to use to generate the example JSON. By default,
      `gpt-3.5-turbo` is used.

    Returns a valid instance of `T` or raises `ParsingFailure` if all retries failed to find valid JSON.

    *Notes:*

    - Tracing: `query_for_json` logs one TracingNode for its call, and uses `query_model`, which
      also logs TracingNodes for the LLM calls themselves by default.

    - Uses pydantic under the hood for construction of JSON schemas, flexible conversion of types to schema,
      validation, etc.

    - The prompts ask the LLMs to wrap the JSON in markdown-style code blocks for additional robustness
      (e.g. against a wild `{` or `}` somewhere in the surrounding text, which is hard to avoid reliably),
      and the parser falls back to looking for the outermost `{}` pair.
      This may still fail, e.g. when your task itself talks about JSON, or when the JSON answer
      contains "```" as a substring. While the current version seems sufficient, there are TODOs for improvement.

    - The schema presented to the LLM is reference-free; all `$ref`s from the JSON schema are resolved.
    """
    if isinstance(prompt, str):
        fmt_count = len(re.findall(f'{"{"}{_FORMAT_VAR}{"}"}', prompt))
        if fmt_count > 1:
            raise ValueError(
                f'Multiple instances of {"{"}{_FORMAT_VAR}{"}"} found in prompt'
            )
        if fmt_count == 0:
            prompt = (
                FormatStr() + prompt + FormatStr("\n\n{" + _FORMAT_VAR + "#77777726}")
            )
    elif isinstance(prompt, FormatStr):
        if _FORMAT_VAR not in prompt.free_params():
            prompt += FormatStr("\n\n{" + _FORMAT_VAR + "#77777726}")
    else:
        raise TypeError("query_for_json only accepts str or FormatStr as `prompt`")

    deliberation = _FORMAT_PROMPT_DELIBERATE if with_cot else ""

    pdT = get_pydantic_model(T)
    schema = get_json_schema(pdT)
    format_prompt = _FORMAT_PROMPT.format(schema=schema, deliberation=deliberation)

    if with_example is True:
        with_example = generate_json_example(schema, model=model_for_examples)
    if with_example and not isinstance(with_example, str):
        with_example = json.dumps(jsonable_encoder(with_example))
    if with_example:
        format_prompt += _FORMAT_PROMPT_EXAMPLE.format(example=with_example)

    if isinstance(prompt, str):
        prompt_with_fmt = prompt.replace(f'{"{"}{_FORMAT_VAR}{"}"}', format_prompt)
    else:
        prompt_with_fmt = prompt.format(**{_FORMAT_VAR: format_prompt})

    with TracingNode(
        f"query for JSON of type {T}",
        kind="query",
        inputs=dict(
            prompt=prompt,
            with_example=with_example,
            with_cot=with_cot,
            max_repeats=max_repeats,
            T=str(T),
        ),
    ) as c:
        for i in range(max_repeats):
            res = query_model(model, prompt_with_fmt)
            assert isinstance(res, str)
            try:
                d = find_and_parse_json_block(res)
                # TODO: Is the following conversion/validation working for nested fields as well?
                # Convert to pydantic type for permissive conversion and validation
                d = pdT(**d)
                # Convert back to match expected type (nested types are ok)
                d = T(**d.dict())
                assert isinstance(d, T)
                c.set_result(d)
                return d
            except (ValueError, pydantic.ValidationError) as e:
                if i < max_repeats - 1:
                    continue
                # Errors on the last attempt get logged into tracing and propagated
                raise ParsingFailure(
                    f"model repeatedly returned a response without a valid JSON instance of {T.__name__}"
                ) from e
````
```python
def query_for_json(
    model: Any,
    T: type,
    prompt: str,
    with_example: bool | TOut | str = False,
    with_cot: bool = False,
    max_repeats: int = 5,
    model_for_examples: Any = None,
) -> TOut:
```
Prompt `model` to produce a JSON representation of type `T`, and return it parsed and validated.

* `model` can be a langchain normal or chat model, an interlab model, or just any callable object.
* `T` needs to be a dataclass, a pydantic BaseModel, or a pydantic dataclass. While defining the classes, use field names and descriptions that will help the LLM fill in the data as you expect it. Recursive classes are not supported. After parsing, the models will also be validated.
* `prompt` is any string query to the model. If `prompt` contains "{FORMAT_PROMPT}", it will be replaced with format instructions, the JSON schema, and the optional JSON example. Otherwise this information will be appended at the end of the prompt (this seems to work well).
* `with_example=True` will generate an example JSON instance of the type from its schema, and the example will be added to the prompt. Examples can help smaller LLMs or with more complex tasks, but it is for now unclear how much they help larger models, and there is some chance they influence the answer. The example is generated by an LLM (default: gpt-3.5-turbo), so it tries to be a semantically meaningful instance of type `T` relative to field names and descriptions. In-memory and on-disk caching of the examples for schemas is TODO. You can also provide your own example by passing a JSON string or JSON-serializable object in `with_example` (see the usage sketch after this list). Note that a provided example is not validated (TODO: validate it).
* `with_cot=True` adds a minimal prompt for writing chain-of-thought reasoning before writing out the JSON response. This may improve response quality (via CoT deliberation) but has some risks: the models may include JSON in their deliberation (confusing the parser) or run out of token limit via lengthy deliberation.
* `max_repeats` limits how many times the model will be queried before raising an exception - all models have some chance to fail to follow the instructions, and this gives them several chances. Repetition is triggered if valid JSON is not found in the output, or if it fails to validate against the schema or any validators in the dataclasses. Note there is no repetition on LLM model failure (the model is expected to take care of network failures etc.).
* `model_for_examples` can specify a model to use to generate the example JSON. By default, `gpt-3.5-turbo` is used.
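A minimal usage sketch, under stated assumptions: the import path is inferred from the module name, `Person` and `stub_model` are illustrative, and (per the docstring) any callable taking the final prompt and returning a string can serve as `model`:

```python
from dataclasses import dataclass

from interlab.queries import query_for_json  # import path assumed from the module name


@dataclass
class Person:
    name: str  # field names guide the LLM when filling in the data
    age: int


# Stand-in for a real LLM: any callable taking the final prompt string and
# returning a string is accepted as `model`; this one always answers well-formed JSON.
def stub_model(prompt: str) -> str:
    return '```json\n{"name": "Ada Lovelace", "age": 36}\n```'


person = query_for_json(
    stub_model,
    Person,
    "Who wrote the first computer program?",
    with_example='{"name": "Alan Turing", "age": 41}',  # custom example; skips the example-generating LLM
)
assert person == Person(name="Ada Lovelace", age=36)
```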
Returns a valid instance of `T` or raises `ParsingFailure` if all retries failed to find valid JSON.
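Failure handling then looks roughly like this (a sketch reusing the names from the usage example above; the `ParsingFailure` import path is assumed):

```python
from interlab.queries import ParsingFailure  # import path assumed

try:
    person = query_for_json(stub_model, Person, "Extract the person.", max_repeats=3)
except ParsingFailure:
    # All 3 attempts either lacked a parseable JSON block or failed validation.
    person = None
```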
*Notes:*
- Tracing: `query_for_json` logs one TracingNode for its call, and uses `query_model`, which also logs TracingNodes for the LLM calls themselves by default.

- Uses pydantic under the hood for construction of JSON schemas, flexible conversion of types to schema, validation, etc.
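As an illustration of what pydantic contributes here, a standalone sketch of schema generation and permissive validation (interlab's own helpers `get_pydantic_model`/`get_json_schema` are internal and may differ):

```python
import pydantic


class Point(pydantic.BaseModel):
    x: int
    y: int


# A JSON schema like the one embedded in the format instructions:
print(Point.schema_json(indent=2))  # pydantic v1 API; v2 offers Point.model_json_schema()

# Permissive conversion during validation: the string "3" is coerced to the int 3.
print(Point(x="3", y=4))
```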
- The prompts ask the LLMs to wrap the JSON in markdown-style code blocks for additional robustness (e.g. against a wild `{` or `}` somewhere in the surrounding text, which is hard to avoid reliably), and the parser falls back to looking for the outermost `{}` pair. This may still fail, e.g. when your task itself talks about JSON, or when the JSON answer contains "```" as a substring. While the current version seems sufficient, there are TODOs for improvement.

- The schema presented to the LLM is reference-free; all `$ref`s from the JSON schema are resolved.
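To make the last note concrete, here is a sketch of `$ref` inlining (interlab's actual resolver lives in the internal `json_schema` module; this illustrative re-implementation assumes a non-recursive pydantic v1 schema):

```python
import json
from typing import Any

import pydantic


class Address(pydantic.BaseModel):
    city: str


class Person(pydantic.BaseModel):
    name: str
    address: Address  # nested model: the plain schema refers to it via "$ref"


def inline_refs(node: Any, defs: dict) -> Any:
    """Recursively replace {"$ref": "#/definitions/X"} nodes with the definition body.

    Assumes a non-recursive schema (recursive classes are unsupported anyway)."""
    if isinstance(node, dict):
        if "$ref" in node:
            name = node["$ref"].rsplit("/", 1)[-1]
            return inline_refs(defs[name], defs)
        return {k: inline_refs(v, defs) for k, v in node.items() if k != "definitions"}
    if isinstance(node, list):
        return [inline_refs(v, defs) for v in node]
    return node


schema = Person.schema()  # pydantic v1; v2 uses model_json_schema() and "$defs"
flat = inline_refs(schema, schema.get("definitions", {}))
print(json.dumps(flat, indent=2))  # no "$ref" keys remain
```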