Retrieval-augmented generation processor
Introduced 2.12
The `retrieval_augmented_generation` processor is a search results processor that you can use in conversational search for retrieval-augmented generation (RAG). The processor intercepts query results, retrieves previous messages in the conversation from the conversational memory, and sends a prompt to the large language model (LLM). After the processor receives a response from the LLM, it saves the response in conversational memory and returns both the original OpenSearch query results and the LLM response.

As of OpenSearch 2.12, the `retrieval_augmented_generation` processor supports only OpenAI and Amazon Bedrock models.
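The conversational memory that the processor reads from and writes to is managed separately from the pipeline. As a minimal sketch, assuming the ML Commons Memory API available in recent OpenSearch versions (verify the endpoint for your version), a conversation can be created ahead of time and its ID reused across searches; the memory name is hypothetical:

```json
POST /_plugins/_ml/memory
{
  "name": "Conversation about US presidents"
}
```

The response contains a `memory_id`, which is the value you later pass in the query `ext` object so that follow-up questions share the same conversation history.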
Request body fields
The following table lists all available request fields.
Field | Data type | Description |
---|---|---|
`model_id` | String | The ID of the model used in the pipeline. Required. |
`context_field_list` | Array | A list of fields contained in document sources that the pipeline uses as context for RAG. Required. For more information, see Context field list. |
`system_prompt` | String | The system prompt sent to the LLM to adjust its behavior, such as its response tone. Can be a persona description or a set of instructions. Optional. |
`user_instructions` | String | Human-generated instructions sent to the LLM to guide it in producing results. |
`tag` | String | The processor's identifier. Optional. |
`description` | String | A description of the processor. Optional. |
Context field list
The `context_field_list` is a list of fields contained in document sources that the pipeline uses as context for RAG. For example, suppose your OpenSearch index contains a collection of documents, each including a `title` and a `text` field:
```json
{
  "_index": "qa_demo",
  "_id": "SimKcIoBOVKVCYpk1IL-",
  "_source": {
    "title": "Abraham Lincoln 2",
    "text": "Abraham Lincoln was born on February 12, 1809, the second child of Thomas Lincoln and Nancy Hanks Lincoln, in a log cabin on Sinking Spring Farm near Hodgenville, Kentucky.[2] He was a descendant of Samuel Lincoln, an Englishman who migrated from Hingham, Norfolk, to its namesake, Hingham, Massachusetts, in 1638. The family then migrated west, passing through New Jersey, Pennsylvania, and Virginia.[3] Lincoln was also a descendant of the Harrison family of Virginia; his paternal grandfather and namesake, Captain Abraham Lincoln and wife Bathsheba (née Herring) moved the family from Virginia to Jefferson County, Kentucky.[b] The captain was killed in an Indian raid in 1786.[5] His children, including eight-year-old Thomas, Abraham's father, witnessed the attack.[6][c] Thomas then worked at odd jobs in Kentucky and Tennessee before the family settled in Hardin County, Kentucky, in the early 1800s.[6]\n"
  }
}
```
You can specify that only the contents of the `text` field be sent to the LLM by setting `"context_field_list": ["text"]` in the processor.
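If you want the LLM to see more of each document, you can list several source fields. For instance, the following fragment, a hypothetical variation on the setting above, sends both the title and the body text as context:

```json
"context_field_list": ["title", "text"]
```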
Example
The following example demonstrates using a search pipeline with a `retrieval_augmented_generation` processor.
Creating a search pipeline
The following request creates a search pipeline containing a `retrieval_augmented_generation` processor for an OpenAI model:
```json
PUT /_search/pipeline/rag_pipeline
{
  "response_processors": [
    {
      "retrieval_augmented_generation": {
        "tag": "openai_pipeline_demo",
        "description": "Demo pipeline Using OpenAI Connector",
        "model_id": "gnDIbI0BfUsSoeNT_jAw",
        "context_field_list": ["text"],
        "system_prompt": "You are a helpful assistant",
        "user_instructions": "Generate a concise and informative answer in less than 100 words for the given question"
      }
    }
  ]
}
```
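To confirm that the processor configuration was stored as intended, you can retrieve the pipeline definition. This is a standard search pipeline request, shown here as a quick check assuming the pipeline name used above:

```json
GET /_search/pipeline/rag_pipeline
```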
Using the search pipeline
Combine an OpenSearch query with an `ext` object that stores generative question answering parameters for the LLM:
```json
GET /my_rag_test_data/_search?search_pipeline=rag_pipeline
{
  "query": {
    "match": {
      "text": "Abraham Lincoln"
    }
  },
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "gpt-3.5-turbo",
      "llm_question": "Was Abraham Lincoln a good politician",
      "memory_id": "iXC4bI0BfUsSoeNTjS30",
      "context_size": 5,
      "message_size": 5,
      "timeout": 15
    }
  }
}
```
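Because the processor saves each exchange to conversational memory, a follow-up question can reuse the same `memory_id` so that the LLM has access to the earlier question and answer. The request below is a hypothetical continuation of the conversation above, changing only `llm_question`:

```json
GET /my_rag_test_data/_search?search_pipeline=rag_pipeline
{
  "query": {
    "match": {
      "text": "Abraham Lincoln"
    }
  },
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "gpt-3.5-turbo",
      "llm_question": "Where was he born",
      "memory_id": "iXC4bI0BfUsSoeNTjS30",
      "context_size": 5,
      "message_size": 5,
      "timeout": 15
    }
  }
}
```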
For more information about setting up conversational search, see Conversational search using RAG.