Models that support this method
Provider | Tool calling | Structured output | JSON mode | Local | Multimodal | Package |
---|---|---|---|---|---|---|
ChatAnthropic | ✅ | ✅ | ❌ | ❌ | ✅ | langchain-anthropic |
ChatMistralAI | ✅ | ✅ | ❌ | ❌ | ❌ | langchain-mistralai |
ChatFireworks | ✅ | ✅ | ✅ | ❌ | ❌ | langchain-fireworks |
AzureChatOpenAI | ✅ | ✅ | ✅ | ❌ | ✅ | langchain-openai |
ChatOpenAI | ✅ | ✅ | ✅ | ❌ | ✅ | langchain-openai |
ChatTogether | ✅ | ✅ | ✅ | ❌ | ❌ | langchain-together |
ChatVertexAI | ✅ | ✅ | ❌ | ❌ | ✅ | langchain-google-vertexai |
ChatGoogleGenerativeAI | ✅ | ✅ | ❌ | ❌ | ✅ | langchain-google-genai |
ChatGroq | ✅ | ✅ | ✅ | ❌ | ❌ | langchain-groq |
ChatCohere | ✅ | ✅ | ❌ | ❌ | ❌ | langchain-cohere |
ChatBedrock | ✅ | ✅ | ❌ | ❌ | ❌ | langchain-aws |
ChatHuggingFace | ✅ | ✅ | ❌ | ✅ | ❌ | langchain-huggingface |
ChatNVIDIA | ✅ | ✅ | ❌ | ✅ | ❌ | langchain-nvidia-ai-endpoints |
ChatOllama | ✅ | ✅ | ✅ | ✅ | ❌ | langchain-ollama |
ChatLlamaCpp | ✅ | ✅ | ❌ | ✅ | ❌ | langchain-community |
ChatAI21 | ✅ | ✅ | ❌ | ❌ | ❌ | langchain-ai21 |
ChatUpstage | ✅ | ✅ | ❌ | ❌ | ❌ | langchain-upstage |
ChatDatabricks | ✅ | ✅ | ❌ | ❌ | ❌ | langchain-databricks |
This is the easiest and most reliable way to get structured outputs. It is implemented for models that provide native APIs for structuring outputs, such as tool/function calling or JSON mode, and it makes use of these capabilities under the hood.
This method takes a schema as input, which specifies the names, types, and descriptions of the desired output attributes. The method returns a model-like Runnable, except that instead of outputting strings or messages it outputs objects corresponding to the given schema.
The schema can be specified as a TypedDict class, a JSON Schema, or a Pydantic class. If TypedDict or JSON Schema is used, the Runnable will return a dict; if a Pydantic class is used, it will return a Pydantic object.
If we want the model to return a Pydantic object, we just need to pass in the desired Pydantic class. The key advantage of using Pydantic is that the model-generated output will be validated: Pydantic will raise an error if any required fields are missing or if any fields are of the wrong type.
from typing import Optional

from pydantic import BaseModel, Field


# Pydantic
class Joke(BaseModel):
    """Joke to tell user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(
        default=None, description="How funny the joke is, from 1 to 10"
    )


structured_llm = llm.with_structured_output(Joke)

structured_llm.invoke("Tell me a joke about cats")
Output:
Joke(setup='Why was the cat sitting on the computer?', punchline='Because it wanted to keep an eye on the mouse!', rating=7)
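Because Pydantic validates the output at parse time, the failure mode is easy to see without calling a model. A minimal sketch (the `Joke` class is repeated here so the snippet is self-contained):

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class Joke(BaseModel):
    """Joke to tell user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(
        default=None, description="How funny the joke is, from 1 to 10"
    )


# A complete payload validates; the optional `rating` falls back to its default.
joke = Joke(
    setup="Why was the cat sitting on the computer?",
    punchline="Because it wanted to keep an eye on the mouse!",
)
print(joke.rating)  # None

# A payload missing the required `punchline` raises a ValidationError.
try:
    Joke(setup="Why was the cat sitting on the computer?")
except ValidationError as exc:
    print("missing fields:", [e["loc"] for e in exc.errors()])
```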
If you don't want to use Pydantic, explicitly don't want validation of the arguments, or want to be able to stream the model outputs, you can define your schema with a TypedDict class. We can optionally use the special `Annotated` syntax supported by LangChain, which lets you specify the default value and description of each field. Note that the default value is not automatically filled in if the model doesn't generate it; it is only used in defining the schema that is passed to the model.
Requirements
- Core: `langchain-core>=0.2.26`
- Typing extensions: It is highly recommended to import `Annotated` and `TypedDict` from `typing_extensions` instead of `typing` to ensure consistent behavior across Python versions.
from typing import Optional

from typing_extensions import Annotated, TypedDict


# TypedDict
class Joke(TypedDict):
    """Joke to tell user."""

    setup: Annotated[str, ..., "The setup of the joke"]

    # Alternatively, we could have specified setup as:

    # setup: str                    # no default, no description
    # setup: Annotated[str, ...]    # no default, no description
    # setup: Annotated[str, "foo"]  # default, no description

    punchline: Annotated[str, ..., "The punchline of the joke"]
    rating: Annotated[Optional[int], None, "How funny the joke is, from 1 to 10"]


structured_llm = llm.with_structured_output(Joke)

structured_llm.invoke("Tell me a joke about cats")
{'setup': 'Why was the cat sitting on the computer?',
'punchline': 'Because it wanted to keep an eye on the mouse!',
'rating': 7}
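In contrast to Pydantic, TypedDict performs no runtime checking: it only describes the schema handed to the model and to static type checkers. A small sketch of this difference (importing from `typing` here just to keep the snippet dependency-free; as noted above, `typing_extensions` is the recommended source):

```python
from typing import Annotated, Optional, TypedDict


class Joke(TypedDict):
    """Joke to tell user."""

    setup: Annotated[str, ..., "The setup of the joke"]
    punchline: Annotated[str, ..., "The punchline of the joke"]
    rating: Annotated[Optional[int], None, "How funny the joke is, from 1 to 10"]


# No runtime validation happens: a dict missing required keys (or with wrong
# types) is accepted silently, unlike the Pydantic version above.
joke: Joke = {"setup": "Cargo"}  # a type checker would flag this; Python does not
print(joke)  # {'setup': 'Cargo'}
```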
Equivalently, we can pass in a JSON Schema dict. This requires no imports or classes, and makes very clear exactly how each parameter is documented, at the cost of being a bit more verbose.
json_schema = {
    "title": "joke",
    "description": "Joke to tell user.",
    "type": "object",
    "properties": {
        "setup": {
            "type": "string",
            "description": "The setup of the joke",
        },
        "punchline": {
            "type": "string",
            "description": "The punchline to the joke",
        },
        "rating": {
            "type": "integer",
            "description": "How funny the joke is, from 1 to 10",
            "default": None,
        },
    },
    "required": ["setup", "punchline"],
}
structured_llm = llm.with_structured_output(json_schema)

structured_llm.invoke("Tell me a joke about cats")
{'setup': 'Why was the cat sitting on the computer?',
'punchline': 'Because it wanted to keep an eye on the mouse!',
'rating': 7}
If you want to let the model choose between multiple schemas, you can create a parent schema that has a Union-typed attribute:

from typing import Optional, Union

from pydantic import BaseModel, Field


# Pydantic
class Joke(BaseModel):
    """Joke to tell user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(
        default=None, description="How funny the joke is, from 1 to 10"
    )


class ConversationalResponse(BaseModel):
    """Respond in a conversational manner. Be kind and helpful."""

    response: str = Field(description="A conversational response to the user's query")


class FinalResponse(BaseModel):
    final_output: Union[Joke, ConversationalResponse]


structured_llm = llm.with_structured_output(FinalResponse)

structured_llm.invoke("Tell me a joke about cats")
FinalResponse(final_output=Joke(setup='Why was the cat sitting on the computer?', punchline='Because it wanted to keep an eye on the mouse!', rating=7))
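How the union resolves can be seen without a model call: validating a payload against `FinalResponse` picks whichever member of the `Union` the data satisfies. A sketch using Pydantic v2's `model_validate` (the class definitions mirror the cell above; the payloads are hand-written):

```python
from typing import Optional, Union

from pydantic import BaseModel, Field


class Joke(BaseModel):
    """Joke to tell user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(default=None)


class ConversationalResponse(BaseModel):
    """Respond in a conversational manner. Be kind and helpful."""

    response: str = Field(description="A conversational response to the user's query")


class FinalResponse(BaseModel):
    final_output: Union[Joke, ConversationalResponse]


# Joke-shaped data validates as a Joke...
r1 = FinalResponse.model_validate(
    {"final_output": {"setup": "Cargo", "punchline": "Cargo 'vroom vroom'!"}}
)
# ...while conversational data falls through to ConversationalResponse.
r2 = FinalResponse.model_validate(
    {"final_output": {"response": "I'm doing great, thanks for asking!"}}
)
print(type(r1.final_output).__name__, type(r2.final_output).__name__)
```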
# Streaming works when the output type is a dict (i.e., when the schema is
# specified as a TypedDict class or a JSON Schema dict):
structured_llm = llm.with_structured_output(Joke)

for chunk in structured_llm.stream("Tell me a joke about cats"):
    print(chunk)
{}
{'setup': ''}
{'setup': 'Why'}
{'setup': 'Why was'}
{'setup': 'Why was the'}
{'setup': 'Why was the cat'}
{'setup': 'Why was the cat sitting'}
{'setup': 'Why was the cat sitting on'}
{'setup': 'Why was the cat sitting on the'}
{'setup': 'Why was the cat sitting on the computer'}
{'setup': 'Why was the cat sitting on the computer?'}
For more complex schemas it's very useful to add few-shot examples to the prompt. The simplest and most universal way is to add the examples to a system message:

from langchain_core.prompts import ChatPromptTemplate
system = """You are a hilarious comedian. Your specialty is knock-knock jokes. \
Return a joke which has the setup (the response to "Who's there?") and the final punchline (the response to " who?").
Here are some examples of jokes:
example_user: Tell me a joke about planes
example_assistant: {{"setup": "Why don't planes ever get tired?", "punchline": "Because they have rest wings!", "rating": 2}}
example_user: Tell me another joke about planes
example_assistant: {{"setup": "Cargo", "punchline": "Cargo 'vroom vroom', but planes go 'zoom zoom'!", "rating": 10}}
example_user: Now about caterpillars
example_assistant: {{"setup": "Caterpillar", "punchline": "Caterpillar really slow, but watch me turn into a butterfly and steal the show!", "rating": 5}}"""
prompt = ChatPromptTemplate.from_messages([("system", system), ("human", "{input}")])
few_shot_structured_llm = prompt | structured_llm
few_shot_structured_llm.invoke("what's something funny about woodpeckers")
{'setup': 'Woodpecker',
'punchline': "Woodpecker who? Woodpecker who can't find a tree is just a bird with a headache!",
'rating': 7}
When the underlying method for structuring outputs is tool calling, we can pass in our examples as explicit tool calls. You can check your model's API reference to see whether it uses tool calling.
from langchain_core.messages import AIMessage, HumanMessage, ToolMessage

examples = [
    HumanMessage("Tell me a joke about planes", name="example_user"),
    AIMessage(
        "",
        name="example_assistant",
        tool_calls=[
            {
                "name": "joke",
                "args": {
                    "setup": "Why don't planes ever get tired?",
                    "punchline": "Because they have rest wings!",
                    "rating": 2,
                },
                "id": "1",
            }
        ],
    ),
    # Most tool-calling models expect a ToolMessage(s) to follow an AIMessage with tool calls.
    ToolMessage("", tool_call_id="1"),
    # Some models also expect an AIMessage to follow any ToolMessages,
    # so you may need to add an AIMessage here.
    HumanMessage("Tell me another joke about planes", name="example_user"),
    AIMessage(
        "",
        name="example_assistant",
        tool_calls=[
            {
                "name": "joke",
                "args": {
                    "setup": "Cargo",
                    "punchline": "Cargo 'vroom vroom', but planes go 'zoom zoom'!",
                    "rating": 10,
                },
                "id": "2",
            }
        ],
    ),
    ToolMessage("", tool_call_id="2"),
    HumanMessage("Now about caterpillars", name="example_user"),
    AIMessage(
        "",
        tool_calls=[
            {
                "name": "joke",
                "args": {
                    "setup": "Caterpillar",
                    "punchline": "Caterpillar really slow, but watch me turn into a butterfly and steal the show!",
                    "rating": 5,
                },
                "id": "3",
            }
        ],
    ),
    ToolMessage("", tool_call_id="3"),
]

system = """You are a hilarious comedian. Your specialty is knock-knock jokes. \
Return a joke which has the setup (the response to "Who's there?") \
and the final punchline (the response to " who?")."""

prompt = ChatPromptTemplate.from_messages(
    [("system", system), ("placeholder", "{examples}"), ("human", "{input}")]
)
few_shot_structured_llm = prompt | structured_llm
few_shot_structured_llm.invoke({"input": "crocodiles", "examples": examples})
{'setup': 'Crocodile',
'punchline': 'Crocodile be seeing you later, alligator!',
'rating': 7}
# With method="json_mode" no schema is passed in, so the desired keys
# must be specified in the prompt itself:
structured_llm = llm.with_structured_output(None, method="json_mode")

structured_llm.invoke(
    "Tell me a joke about cats, respond in JSON with `setup` and `punchline` keys"
)
{'setup': 'Why was the cat sitting on the computer?',
'punchline': 'Because it wanted to keep an eye on the mouse!'}
PydanticOutputParser

The following example uses the built-in PydanticOutputParser to parse the output of a chat model prompted to match the given Pydantic schema. Note that we are adding format_instructions directly to the prompt from a method on the parser:
from typing import List

from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field


class Person(BaseModel):
    """Information about a person."""

    name: str = Field(..., description="The name of the person")
    height_in_meters: float = Field(
        ..., description="The height of the person expressed in meters."
    )


class People(BaseModel):
    """Identifying information about all people in a text."""

    people: List[Person]


# Set up a parser
parser = PydanticOutputParser(pydantic_object=People)

# Prompt
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Answer the user query. Wrap the output in `json` tags\n{format_instructions}",
        ),
        ("human", "{query}"),
    ]
).partial(format_instructions=parser.get_format_instructions())
Let's take a look at what information is sent to the model:
query = "Anna is 23 years old and she is 6 feet tall"
print(prompt.invoke(query).to_string())
System: Answer the user query. Wrap the output in `json` tags
The output should be formatted as a JSON instance that conforms to the JSON schema below.
As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.
Here is the output schema:
```
{"description": "Identifying information about all people in a text.", "properties": {"people": {"title": "People", "type": "array", "items": {"$ref": "#/definitions/Person"}}}, "required": ["people"], "definitions": {"Person": {"title": "Person", "description": "Information about a person.", "type": "object", "properties": {"name": {"title": "Name", "description": "The name of the person", "type": "string"}, "height_in_meters": {"title": "Height In Meters", "description": "The height of the person expressed in meters.", "type": "number"}}, "required": ["name", "height_in_meters"]}}}
```
Human: Anna is 23 years old and she is 6 feet tall
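The model's reply then flows into `parser`, which validates the extracted JSON against `People`. That validation step can be exercised directly on a hand-written reply, without calling a model (a sketch using Pydantic alone; the `reply` string below is invented for illustration):

```python
import json
from typing import List

from pydantic import BaseModel, Field


class Person(BaseModel):
    """Information about a person."""

    name: str = Field(..., description="The name of the person")
    height_in_meters: float = Field(
        ..., description="The height of the person expressed in meters."
    )


class People(BaseModel):
    """Identifying information about all people in a text."""

    people: List[Person]


# A reply the model might produce for the prompt above (hand-written here):
reply = '{"people": [{"name": "Anna", "height_in_meters": 1.8288}]}'
people = People.model_validate(json.loads(reply))
print(people.people[0].name, people.people[0].height_in_meters)  # Anna 1.8288
```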
We can also create a custom prompt and parser, using a plain function to parse the output from the model:

import json
import re
from typing import List

from langchain_core.messages import AIMessage
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field


class Person(BaseModel):
    """Information about a person."""

    name: str = Field(..., description="The name of the person")
    height_in_meters: float = Field(
        ..., description="The height of the person expressed in meters."
    )


class People(BaseModel):
    """Identifying information about all people in a text."""

    people: List[Person]


# Prompt
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Answer the user query. Output your answer as JSON that "
            "matches the given schema: ```json\n{schema}\n```. "
            "Make sure to wrap the answer in ```json and ``` tags",
        ),
        ("human", "{query}"),
    ]
).partial(schema=People.schema())
# Custom parser
def extract_json(message: AIMessage) -> List[dict]:
    """Extracts JSON content from a string where JSON is embedded between ```json and ``` tags.

    Parameters:
        message (AIMessage): The message containing the JSON content.

    Returns:
        list: A list of parsed JSON objects.
    """
    text = message.content
    # Define the regular expression pattern to match JSON blocks
    pattern = r"```json(.*?)```"

    # Find all non-overlapping matches of the pattern in the string
    matches = re.findall(pattern, text, re.DOTALL)

    # Parse each matched block, stripping any leading or trailing whitespace
    try:
        return [json.loads(match.strip()) for match in matches]
    except Exception:
        raise ValueError(f"Failed to parse: {message}")
Here is the prompt sent to the model:
query = "Anna is 23 years old and she is 6 feet tall"
print(prompt.format_prompt(query=query).to_string())
System: Answer the user query. Output your answer as JSON that matches the given schema: \`\`\`json
{'title': 'People', 'description': 'Identifying information about all people in a text.', 'type': 'object', 'properties': {'people': {'title': 'People', 'type': 'array', 'items': {'$ref': '#/definitions/Person'}}}, 'required': ['people'], 'definitions': {'Person': {'title': 'Person', 'description': 'Information about a person.', 'type': 'object', 'properties': {'name': {'title': 'Name', 'description': 'The name of the person', 'type': 'string'}, 'height_in_meters': {'title': 'Height In Meters', 'description': 'The height of the person expressed in meters.', 'type': 'number'}}, 'required': ['name', 'height_in_meters']}}}
\`\`\`. Make sure to wrap the answer in \`\`\`json and \`\`\` tags
Human: Anna is 23 years old and she is 6 feet tall
And here is what it looks like when we invoke it:
chain = prompt | llm | extract_json
chain.invoke({"query": query})
[{'people': [{'name': 'Anna', 'height_in_meters': 1.8288}]}]
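The regex extraction above can also be checked in isolation on a canned reply, without calling a model. A standalone sketch (the helper name `extract_json_blocks` and the `reply` string are illustrative):

```python
import json
import re


def extract_json_blocks(text: str) -> list:
    """Pull every ```json ... ``` fenced block out of a reply string."""
    matches = re.findall(r"```json(.*?)```", text, re.DOTALL)
    return [json.loads(match.strip()) for match in matches]


reply = 'Sure!\n```json\n{"people": [{"name": "Anna", "height_in_meters": 1.8288}]}\n```'
print(extract_json_blocks(reply))
# [{'people': [{'name': 'Anna', 'height_in_meters': 1.8288}]}]
```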