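Conversational search with RAG
Conversational search lets you ask questions in natural language and refine the answers by asking follow-up questions. The conversation thus becomes a dialogue between you and a large language model (LLM). For this to work, the model needs to remember the context of the entire conversation instead of answering each question in isolation.
Conversational search is implemented with the following components:
- Conversation history: Allows an LLM to remember the context of the current conversation and understand follow-up questions.
- Retrieval-augmented generation (RAG): Allows an LLM to supplement its static knowledge base with proprietary or current information.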
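Conversation history
Conversation history consists of a simple CRUD-like API with two resources: memories and messages. All messages for the current conversation are stored within one conversation memory. A message represents a question-answer pair: a human-input question and an AI answer. Messages cannot exist on their own; they must be added to a memory.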
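RAG
RAG retrieves data from the index and the conversation history and sends all of that information as context to the LLM. The LLM then supplements its static knowledge base with the dynamically retrieved data. In OpenSearch, RAG is implemented through a search pipeline that contains a retrieval-augmented generation processor. The processor intercepts the OpenSearch query results, retrieves previous messages in the conversation from the conversation memory, and sends a prompt to the LLM. After the processor receives a response from the LLM, it saves the response in the conversation memory and returns both the original OpenSearch query results and the LLM response.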
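To make the flow above concrete, the following Python sketch assembles a chat-style prompt from search hits, earlier messages, and a new question. It is only an illustration of the idea, not the processor's actual implementation: the function name build_chat_messages, the "Context: " and "Question: " prefixes, and the shortened hit text are assumptions made for this example.

```python
def build_chat_messages(system_prompt, user_instructions, hits, history, question,
                        context_field="text"):
    """Combine instructions, past messages, search results, and the new question."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_instructions},
    ]
    # Previous question-answer pairs read from the conversation memory.
    for past in history:
        messages.append({"role": "user", "content": past["input"]})
        messages.append({"role": "assistant", "content": past["response"]})
    # Retrieved documents supply the dynamically retrieved context.
    for hit in hits:
        messages.append({"role": "user",
                         "content": "Context: " + hit["_source"][context_field]})
    # The new question goes last so the model answers it with everything above in view.
    messages.append({"role": "user", "content": "Question: " + question})
    return messages


messages = build_chat_messages(
    "You are a helpful assistant",
    "Generate a concise and informative answer in less than 100 words for the given question",
    hits=[{"_source": {"text": "The metro area population of New York City in 2022 was 18,867,000."}}],
    history=[{
        "input": "What's the population of NYC metro area in 2023",
        "response": "The population of the New York City metro area in 2023 is projected to be 18,937,000.",
    }],
    question="What was it in 2022",
)
```

Because the earlier question-answer pair is part of the prompt, a follow-up such as "What was it in 2022" can be resolved against the conversation topic.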
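As of OpenSearch 2.11, the RAG technique has been tested only with OpenAI models and the Anthropic Claude model on Amazon Bedrock.
When the Security plugin is enabled, all memories exist in private security mode. Only the user who created a memory can interact with that memory. No user can see another user's memory.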
Prerequisites
To begin using conversational search, enable the conversation memory and RAG pipeline features:
PUT /_cluster/settings
{
"persistent": {
"plugins.ml_commons.memory_feature_enabled": true,
"plugins.ml_commons.rag_pipeline_feature_enabled": true
}
}
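If you prefer to send this request from a script rather than a console, the following Python sketch builds it with the standard library. The build_request helper and the http://localhost:9200 base URL are assumptions for illustration; a real cluster will usually need HTTPS and credentials, which are omitted here.

```python
import json
import urllib.request


def build_request(method, path, body=None, base_url="http://localhost:9200"):
    """Create (but do not send) a JSON request for an OpenSearch endpoint."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


settings_request = build_request("PUT", "/_cluster/settings", {
    "persistent": {
        "plugins.ml_commons.memory_feature_enabled": True,
        "plugins.ml_commons.rag_pipeline_feature_enabled": True,
    }
})

# To actually send it (requires a reachable cluster):
# with urllib.request.urlopen(settings_request) as response:
#     print(json.load(response))
```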
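Configuring conversational search
There are two ways to configure conversational search:
Automated workflow
OpenSearch provides a workflow template that automatically creates a connector for the LLM, registers and deploys the LLM, and configures a search pipeline. When you create the workflow, you must provide the API key for the configured LLM. Review the default settings of the conversational search workflow template to determine whether you need to update any parameters. For example, if the model endpoint differs from the default (https://api.cohere.ai/v1/chat), specify your model's endpoint in the create_connector.actions.url parameter. To create the default conversational search workflow, send the following request: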
POST /_plugins/_flow_framework/workflow?use_case=conversational_search_with_llm_deploy&provision=true
{
"create_connector.credential.key": "<YOUR_API_KEY>"
}
OpenSearch returns a workflow ID for the created workflow:
{
"workflow_id" : "U_nMXJUBq_4FYQzMOS4B"
}
To check the workflow status, send the following request:
GET /_plugins/_flow_framework/workflow/U_nMXJUBq_4FYQzMOS4B/_status
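Provisioning may not finish immediately, so a client would typically repeat this status request until the state field reads COMPLETED. The following Python sketch shows one way to do that. The wait_for_state helper is hypothetical, and the canned responses (including the intermediate PROVISIONING state) stand in for real calls to the status endpoint.

```python
import time


def wait_for_state(fetch_status, target="COMPLETED", attempts=10, delay_seconds=1.0):
    """Call fetch_status() until its "state" equals target or attempts run out."""
    for attempt in range(attempts):
        status = fetch_status()
        if status.get("state") == target:
            return status
        # Sleep between attempts, but not after the last one.
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    raise TimeoutError(f"state did not reach {target} after {attempts} attempts")


# Stand-in for the real GET request: canned responses instead of a live cluster.
canned = iter([{"state": "PROVISIONING"}, {"state": "COMPLETED"}])
final_status = wait_for_state(lambda: next(canned), delay_seconds=0)
```

Passing the fetch function as an argument keeps the loop independent of the HTTP client, so the same helper could poll any endpoint that reports a state field.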
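When the workflow completes, the state changes to COMPLETED. The workflow creates the following components:
- A model connector: Connects to the specified model.
- A registered and deployed model: The model is ready for inference.
- A search pipeline: Configured to handle conversational queries.
You can now continue with steps 4, 5, and 6 to ingest RAG data into the index, create a conversation memory, and use the pipeline for RAG.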
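Manual setup
To configure conversational search manually, follow these steps:
Step 1: Create a connector to a model
RAG requires an LLM in order to function. To connect to an LLM, create a connector. The following request creates a connector for the OpenAI GPT 3.5 model: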
POST /_plugins/_ml/connectors/_create
{
"name": "OpenAI Chat Connector",
"description": "The connector to public OpenAI model service for GPT 3.5",
"version": 2,
"protocol": "http",
"parameters": {
"endpoint": "api.openai.com",
"model": "gpt-3.5-turbo",
"temperature": 0
},
"credential": {
"openAI_key": "<YOUR_OPENAI_KEY>"
},
"actions": [
{
"action_type": "predict",
"method": "POST",
"url": "https://${parameters.endpoint}/v1/chat/completions",
"headers": {
"Authorization": "Bearer ${credential.openAI_key}"
},
"request_body": """{ "model": "${parameters.model}", "messages": ${parameters.messages}, "temperature": ${parameters.temperature} }"""
}
]
}
OpenSearch returns a connector ID for the connector:
{
"connector_id": "u3DEbI0BfUsSoeNTti-1"
}
For example requests that connect to other services and models, see Connector blueprints.
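The url, headers, and request_body fields in the connector contain ${parameters.*} and ${credential.*} placeholders. The following Python sketch illustrates how such placeholders resolve to concrete values. It is a simplified illustration, not the ML Commons implementation; the render function and its regular expression are assumptions made for this example.

```python
import re

# Matches placeholders such as ${parameters.endpoint} or ${credential.openAI_key}.
PLACEHOLDER = re.compile(r"\$\{(parameters|credential)\.(\w+)\}")


def render(template, parameters, credential):
    """Replace each placeholder with the matching connector value."""
    sources = {"parameters": parameters, "credential": credential}

    def substitute(match):
        return str(sources[match.group(1)][match.group(2)])

    return PLACEHOLDER.sub(substitute, template)


parameters = {"endpoint": "api.openai.com", "model": "gpt-3.5-turbo", "temperature": 0}
credential = {"openAI_key": "<YOUR_OPENAI_KEY>"}

url = render("https://${parameters.endpoint}/v1/chat/completions", parameters, credential)
auth_header = render("Bearer ${credential.openAI_key}", parameters, credential)
```

Here url resolves to the chat completions endpoint on api.openai.com, and auth_header carries whatever key is stored in the credential object.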
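Step 2: Register and deploy the model
Register the LLM for which you created a connector in the previous step. To register the model with OpenSearch, provide the connector_id returned in the previous step: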
POST /_plugins/_ml/models/_register
{
"name": "openAI-gpt-3.5-turbo",
"function_name": "remote",
"description": "test model",
"connector_id": "u3DEbI0BfUsSoeNTti-1"
}
OpenSearch returns a task ID for the register task and a model ID for the registered model:
{
"task_id": "gXDIbI0BfUsSoeNT_jAb",
"status": "CREATED",
"model_id": "gnDIbI0BfUsSoeNT_jAw"
}
To verify that the registration is complete, call the Tasks API:
GET /_plugins/_ml/tasks/gXDIbI0BfUsSoeNT_jAb
The state in the response changes to COMPLETED:
{
"model_id": "gnDIbI0BfUsSoeNT_jAw",
"task_type": "REGISTER_MODEL",
"function_name": "REMOTE",
"state": "COMPLETED",
"worker_node": [
"kYv-Z5-mQ4uCUy_cRC6LXA"
],
"create_time": 1706927128091,
"last_update_time": 1706927128125,
"is_async": false
}
To deploy the model, provide the model_id to the Deploy API:
POST /_plugins/_ml/models/gnDIbI0BfUsSoeNT_jAw/_deploy
OpenSearch acknowledges that the model is deployed:
{
"task_id": "cnDObI0BfUsSoeNTDzGd",
"task_type": "DEPLOY_MODEL",
"status": "COMPLETED"
}
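Each call in this step reuses an ID returned by an earlier call. As a small Python sketch of that chaining, the hypothetical helpers below derive the Tasks API and Deploy API paths from a register response, so the IDs never need to be copied by hand.

```python
def task_path(register_response):
    """Path for checking the register task, built from the returned task_id."""
    return "/_plugins/_ml/tasks/" + register_response["task_id"]


def deploy_path(register_response):
    """Path for deploying the model, built from the returned model_id."""
    return "/_plugins/_ml/models/" + register_response["model_id"] + "/_deploy"


# The register response shown in this step.
register_response = {
    "task_id": "gXDIbI0BfUsSoeNT_jAb",
    "status": "CREATED",
    "model_id": "gnDIbI0BfUsSoeNT_jAw",
}

print(task_path(register_response))
print(deploy_path(register_response))
```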
Step 3: Create a search pipeline
Next, create a search pipeline with a retrieval_augmented_generation processor:
PUT /_search/pipeline/rag_pipeline
{
"response_processors": [
{
"retrieval_augmented_generation": {
"tag": "openai_pipeline_demo",
"description": "Demo pipeline Using OpenAI Connector",
"model_id": "gnDIbI0BfUsSoeNT_jAw",
"context_field_list": ["text"],
"system_prompt": "You are a helpful assistant",
"user_instructions": "Generate a concise and informative answer in less than 100 words for the given question"
}
}
]
}
For information about the processor fields, see Retrieval-augmented generation processor.
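When the model ID comes from a script rather than from copy and paste, the pipeline body can be generated. The Python sketch below builds the same structure as the request in this step; the rag_pipeline_body function is a hypothetical helper, and it covers only the processor fields that appear in the example.

```python
import json


def rag_pipeline_body(model_id, context_fields, system_prompt, user_instructions,
                      tag=None, description=None):
    """Build a search pipeline body with one retrieval_augmented_generation processor."""
    processor = {
        "model_id": model_id,
        "context_field_list": list(context_fields),
        "system_prompt": system_prompt,
        "user_instructions": user_instructions,
    }
    # tag and description are included only when provided.
    if tag is not None:
        processor["tag"] = tag
    if description is not None:
        processor["description"] = description
    return {"response_processors": [{"retrieval_augmented_generation": processor}]}


pipeline_body = rag_pipeline_body(
    model_id="gnDIbI0BfUsSoeNT_jAw",
    context_fields=["text"],
    system_prompt="You are a helpful assistant",
    user_instructions="Generate a concise and informative answer in less than 100 words for the given question",
    tag="openai_pipeline_demo",
)
print(json.dumps(pipeline_body, indent=2))
```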
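Step 4: Ingest RAG data into an index
RAG augments the LLM's knowledge with some supplementary data.
First, create an index in which to store this data and set the default search pipeline to the pipeline created in the previous step: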
PUT /my_rag_test_data
{
"settings": {
"index.search.default_pipeline" : "rag_pipeline"
},
"mappings": {
"properties": {
"text": {
"type": "text"
}
}
}
}
Next, ingest the supplementary data into the index:
POST _bulk
{"index": {"_index": "my_rag_test_data", "_id": "1"}}
{"text": "Abraham Lincoln was born on February 12, 1809, the second child of Thomas Lincoln and Nancy Hanks Lincoln, in a log cabin on Sinking Spring Farm near Hodgenville, Kentucky.[2] He was a descendant of Samuel Lincoln, an Englishman who migrated from Hingham, Norfolk, to its namesake, Hingham, Massachusetts, in 1638. The family then migrated west, passing through New Jersey, Pennsylvania, and Virginia.[3] Lincoln was also a descendant of the Harrison family of Virginia; his paternal grandfather and namesake, Captain Abraham Lincoln and wife Bathsheba (née Herring) moved the family from Virginia to Jefferson County, Kentucky.[b] The captain was killed in an Indian raid in 1786.[5] His children, including eight-year-old Thomas, Abraham's father, witnessed the attack.[6][c] Thomas then worked at odd jobs in Kentucky and Tennessee before the family settled in Hardin County, Kentucky, in the early 1800s."}
{"index": {"_index": "my_rag_test_data", "_id": "2"}}
{"text": "Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."}
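The bulk request body is newline-delimited JSON: an action line followed by a document line for each document, with a newline after the last line. The following Python sketch generates a body in that shape from a dictionary of documents. The to_bulk_body helper and the short sample texts are assumptions for illustration.

```python
import json


def to_bulk_body(index_name, documents):
    """Serialize {doc_id: source} pairs as a newline-delimited bulk body."""
    lines = []
    for doc_id, source in documents.items():
        # Action line: which index and ID the next line should be written to.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Document line: the source to index.
        lines.append(json.dumps(source))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


bulk_body = to_bulk_body("my_rag_test_data", {
    "1": {"text": "first sample document"},
    "2": {"text": "second sample document"},
})
print(bulk_body)
```

Using json.dumps for each line keeps every document on a single line and escapes any embedded quotes or newlines.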
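RAG pipeline
RAG is a technique that retrieves documents from an index, passes them through a sequence-to-sequence model such as an LLM, and then supplements the static LLM information with the dynamically retrieved data in context.
As of OpenSearch 2.12, the RAG technique has been tested only with OpenAI models, the Anthropic Claude model on Amazon Bedrock, and the Cohere Command model.
Configuring the Cohere Command model to enable RAG requires using a post-processing function to transform the model output. For more information, see the Cohere RAG tutorial.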
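Step 5: Create a conversation memory
You need to create a conversation memory that will store all messages from a conversation. To make the memory easily identifiable, provide a name for it in the optional name field, as shown in the following example. Because the name parameter is not updatable, this is your only opportunity to name the conversation: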
POST /_plugins/_ml/memory/
{
"name": "Conversation about NYC population"
}
OpenSearch returns a memory ID for the newly created memory:
{
"memory_id": "znCqcI0BfUsSoeNTntd7"
}
You will use the memory_id to add messages to the memory.
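Step 6: Use the pipeline for RAG
To use the RAG pipeline, send a query to OpenSearch and provide additional parameters in the ext.generative_qa_parameters object.
The generative_qa_parameters object supports the following parameters.
Parameter | Required | Description |
---|---|---|
llm_question | Yes | The question that the LLM must answer. |
llm_model | No | Overrides the original model set in the connection in cases where you want to use a different model (for example, GPT 4 instead of GPT 3.5). This option is required if a default model is not set during pipeline creation. |
memory_id | No | If you provide a memory_id, the pipeline retrieves the 10 most recent messages in the specified memory and adds them to the LLM prompt. If you don't specify a memory_id, the prior context is not added to the LLM prompt. |
context_size | No | The number of search results sent to the LLM. This is typically needed in order to meet the token size limit, which can vary by model. Alternatively, you can use the size parameter in the Search API to control the number of search results sent to the LLM. |
message_size | No | The number of messages sent to the LLM. Similarly to the number of search results, this affects the total number of tokens received by the LLM. If not set, the pipeline uses the default message size of 10. |
timeout | No | The number of seconds that the pipeline waits for the remote model using a connector to respond. Default is 30. |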
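If your LLM has a set token limit, set the size field in your OpenSearch query to limit the number of documents used in the search response. Otherwise, the RAG pipeline will send every document in the search results to the LLM.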
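The parameters described above can be assembled programmatically. The Python sketch below builds a search body with a match query and a generative_qa_parameters object, leaving out any optional parameter that is not set so that the pipeline defaults apply. The rag_search_body helper is hypothetical, and it assumes that the same text is used for both the match query and llm_question.

```python
def rag_search_body(question, field="text", llm_model=None, memory_id=None,
                    context_size=None, message_size=None, timeout=None):
    """Build a search body that carries the RAG parameters in ext."""
    qa_parameters = {"llm_question": question}
    optional = {
        "llm_model": llm_model,
        "memory_id": memory_id,
        "context_size": context_size,
        "message_size": message_size,
        "timeout": timeout,
    }
    # Only send optional parameters that were explicitly provided.
    qa_parameters.update({key: value for key, value in optional.items()
                          if value is not None})
    return {
        "query": {"match": {field: question}},
        "ext": {"generative_qa_parameters": qa_parameters},
    }


search_body = rag_search_body(
    "What's the population of NYC metro area in 2023",
    llm_model="gpt-3.5-turbo",
    memory_id="znCqcI0BfUsSoeNTntd7",
    context_size=5,
)
```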
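If you ask an LLM a question about the present, it cannot provide an answer because it was trained on data from a few years ago. However, if you add current information as context, the LLM is able to generate a response. For example, you can ask the LLM about the population of the New York City metro area in 2023. You'll construct a query that includes an OpenSearch match query and an LLM query. Provide the memory_id so that the message is stored in the appropriate memory object: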
GET /my_rag_test_data/_search
{
"query": {
"match": {
"text": "What's the population of NYC metro area in 2023"
}
},
"ext": {
"generative_qa_parameters": {
"llm_model": "gpt-3.5-turbo",
"llm_question": "What's the population of NYC metro area in 2023",
"memory_id": "znCqcI0BfUsSoeNTntd7",
"context_size": 5,
"message_size": 5,
"timeout": 15
}
}
}
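Because the context included a document containing information about the population of New York City, the LLM was able to answer the question correctly (though it included the word "projected" because it was trained on data from previous years). The response contains the matching documents from the supplementary RAG data and the LLM response: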
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 5.781642,
"hits": [
{
"_index": "my_rag_test_data",
"_id": "2",
"_score": 5.781642,
"_source": {
"text": """Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."""
}
},
{
"_index": "my_rag_test_data",
"_id": "1",
"_score": 0.9782871,
"_source": {
"text": "Abraham Lincoln was born on February 12, 1809, the second child of Thomas Lincoln and Nancy Hanks Lincoln, in a log cabin on Sinking Spring Farm near Hodgenville, Kentucky.[2] He was a descendant of Samuel Lincoln, an Englishman who migrated from Hingham, Norfolk, to its namesake, Hingham, Massachusetts, in 1638. The family then migrated west, passing through New Jersey, Pennsylvania, and Virginia.[3] Lincoln was also a descendant of the Harrison family of Virginia; his paternal grandfather and namesake, Captain Abraham Lincoln and wife Bathsheba (née Herring) moved the family from Virginia to Jefferson County, Kentucky.[b] The captain was killed in an Indian raid in 1786.[5] His children, including eight-year-old Thomas, Abraham's father, witnessed the attack.[6][c] Thomas then worked at odd jobs in Kentucky and Tennessee before the family settled in Hardin County, Kentucky, in the early 1800s."
}
}
]
},
"ext": {
"retrieval_augmented_generation": {
"answer": "The population of the New York City metro area in 2023 is projected to be 18,937,000.",
"message_id": "x3CecI0BfUsSoeNT9tV9"
}
}
}
Now you'll ask the LLM a follow-up question as part of the same conversation. Again, provide the memory_id in the request:
GET /my_rag_test_data/_search
{
"query": {
"match": {
"text": "What was it in 2022"
}
},
"ext": {
"generative_qa_parameters": {
"llm_model": "gpt-3.5-turbo",
"llm_question": "What was it in 2022",
"memory_id": "znCqcI0BfUsSoeNTntd7",
"context_size": 5,
"message_size": 5,
"timeout": 15
}
}
}
The LLM correctly identifies the subject of the conversation and returns a relevant response:
{
...
"ext": {
"retrieval_augmented_generation": {
"answer": "The population of the New York City metro area in 2022 was 18,867,000.",
"message_id": "p3CvcI0BfUsSoeNTj9iH"
}
}
}
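In both responses, the generated text sits under ext.retrieval_augmented_generation, next to the regular hits. As a minimal Python sketch, the hypothetical extract_answer helper below reads the answer and message ID from a parsed response and returns None for both if the RAG section is missing.

```python
def extract_answer(search_response):
    """Return (answer, message_id) from a RAG search response, or (None, None)."""
    rag = search_response.get("ext", {}).get("retrieval_augmented_generation", {})
    return rag.get("answer"), rag.get("message_id")


# The follow-up response shown above, with the elided hits left out.
follow_up_response = {
    "ext": {
        "retrieval_augmented_generation": {
            "answer": "The population of the New York City metro area in 2022 was 18,867,000.",
            "message_id": "p3CvcI0BfUsSoeNTj9iH",
        }
    }
}

answer, message_id = extract_answer(follow_up_response)
```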
To verify that both messages were added to the memory, provide the memory_id to the Get Messages API:
GET /_plugins/_ml/memory/znCqcI0BfUsSoeNTntd7/messages
The response contains both messages:
{
"messages": [
{
"memory_id": "znCqcI0BfUsSoeNTntd7",
"message_id": "x3CecI0BfUsSoeNT9tV9",
"create_time": "2024-02-03T20:33:50.754708446Z",
"input": "What's the population of NYC metro area in 2023",
"prompt_template": """[{"role":"system","content":"You are a helpful assistant"},{"role":"user","content":"Generate a concise and informative answer in less than 100 words for the given question"}]""",
"response": "The population of the New York City metro area in 2023 is projected to be 18,937,000.",
"origin": "retrieval_augmented_generation",
"additional_info": {
"metadata": """["Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019.","Abraham Lincoln was born on February 12, 1809, the second child of Thomas Lincoln and Nancy Hanks Lincoln, in a log cabin on Sinking Spring Farm near Hodgenville, Kentucky.[2] He was a descendant of Samuel Lincoln, an Englishman who migrated from Hingham, Norfolk, to its namesake, Hingham, Massachusetts, in 1638. The family then migrated west, passing through New Jersey, Pennsylvania, and Virginia.[3] Lincoln was also a descendant of the Harrison family of Virginia; his paternal grandfather and namesake, Captain Abraham Lincoln and wife Bathsheba (née Herring) moved the family from Virginia to Jefferson County, Kentucky.[b] The captain was killed in an Indian raid in 1786.[5] His children, including eight-year-old Thomas, Abraham's father, witnessed the attack.[6][c] Thomas then worked at odd jobs in Kentucky and Tennessee before the family settled in Hardin County, Kentucky, in the early 1800s."]"""
}
},
{
"memory_id": "znCqcI0BfUsSoeNTntd7",
"message_id": "p3CvcI0BfUsSoeNTj9iH",
"create_time": "2024-02-03T20:36:10.24453505Z",
"input": "What was it in 2022",
"prompt_template": """[{"role":"system","content":"You are a helpful assistant"},{"role":"user","content":"Generate a concise and informative answer in less than 100 words for the given question"}]""",
"response": "The population of the New York City metro area in 2022 was 18,867,000.",
"origin": "retrieval_augmented_generation",
"additional_info": {
"metadata": """["Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019.","Abraham Lincoln was born on February 12, 1809, the second child of Thomas Lincoln and Nancy Hanks Lincoln, in a log cabin on Sinking Spring Farm near Hodgenville, Kentucky.[2] He was a descendant of Samuel Lincoln, an Englishman who migrated from Hingham, Norfolk, to its namesake, Hingham, Massachusetts, in 1638. The family then migrated west, passing through New Jersey, Pennsylvania, and Virginia.[3] Lincoln was also a descendant of the Harrison family of Virginia; his paternal grandfather and namesake, Captain Abraham Lincoln and wife Bathsheba (née Herring) moved the family from Virginia to Jefferson County, Kentucky.[b] The captain was killed in an Indian raid in 1786.[5] His children, including eight-year-old Thomas, Abraham's father, witnessed the attack.[6][c] Thomas then worked at odd jobs in Kentucky and Tennessee before the family settled in Hardin County, Kentucky, in the early 1800s."]"""
}
}
]
}
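Each message in this response holds one question-answer pair in its input and response fields. The following Python sketch turns a parsed Get Messages response into a readable transcript. The format_transcript helper and the "Q:" and "A:" labels are assumptions for illustration, and the sample keeps only the fields the helper reads.

```python
def format_transcript(messages_response):
    """Render each stored question-answer pair as two labeled lines."""
    lines = []
    for message in messages_response.get("messages", []):
        lines.append("Q: " + message["input"])
        lines.append("A: " + message["response"])
    return "\n".join(lines)


# Trimmed copy of the response above.
messages_response = {
    "messages": [
        {
            "input": "What's the population of NYC metro area in 2023",
            "response": "The population of the New York City metro area in 2023 is projected to be 18,937,000.",
        },
        {
            "input": "What was it in 2022",
            "response": "The population of the New York City metro area in 2022 was 18,867,000.",
        },
    ]
}

transcript = format_transcript(messages_response)
print(transcript)
```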
Next steps
- Explore our tutorials to learn how to build AI search applications.