ML inference search request processor

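Introduced 2.16

The ml_inference search request processor is used to invoke registered machine learning (ML) models in order to rewrite queries using the model output.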

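Prerequisites

Before using the ml_inference search request processor, you must have either a local ML model hosted on your OpenSearch cluster or an externally hosted model connected to your OpenSearch cluster through the ML Commons plugin. For more information about local models, see Using ML models within OpenSearch. For more information about externally hosted models, see Connecting to externally hosted models.

Syntax

The following is the syntax for the ml_inference search request processor: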

{
  "ml_inference": {
    "model_id": "<model_id>",
    "function_name": "<function_name>",
    "full_response_path": "<full_response_path>",
    "query_template": "<query_template>",
    "model_config": {
      "<model_config_field>": "<config_value>"
    },
    "model_input": "<model_input>",
    "input_map": [
      {
        "<model_input_field>": "<query_input_field>"
      }
    ],
    "output_map": [
      {
        "<query_output_field>": "<model_output_field>"
      }
    ]
  }
}

Configuration parameters

The following table lists the required and optional parameters for the ml_inference search request processor.

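Parameter	Data type	Required/Optional	Description
`model_id`	String	Required	The ID of the ML model used by the processor.
`query_template`	String	Optional	A query string template used to construct a new query containing a `new_document_field`. Often used when rewriting the search query to a new query type.
`function_name`	String	Optional for externally hosted models; required for local models	The function name of the ML model configured in the processor. For local models, valid values are `sparse_encoding`, `sparse_tokenize`, `text_embedding`, and `text_similarity`. For externally hosted models, the valid value is `remote`. Default is `remote`.
`model_config`	Object	Optional	Custom configuration options for the ML model. For externally hosted models, if set, this configuration overrides the default connector parameters. For local models, you can add `model_config` to `model_input` to override the model configuration set during registration. For more information, see the `model_config` object.
`model_input`	String	Optional for externally hosted models; required for local models	A template that defines the input field format expected by the model. Each local model type might use a different set of inputs. For externally hosted models, default is `"{ \"parameters\": ${ml_inference.parameters} }"`.
`input_map`	Array	Required	An array specifying how to map query string fields to the model input fields. Each element of the array is a map in the `"<model_input_field>": "<query_input_field>"` format and corresponds to one model invocation of a document field. If no input mapping is specified for an externally hosted model, then all document fields are passed to the model directly as input. The `input_map` size indicates the number of times the model is invoked (the number of Predict API requests).
`<model_input_field>`	String	Required	The model input field name.
`<query_input_field>`	String	Required	The name or JSON path of the query field used as the model input.
`output_map`	Array	Required	An array specifying how to map the model output fields to new fields in the query string. Each element of the array is a map in the `"<query_output_field>": "<model_output_field>"` format.
`<query_output_field>`	String	Required	The name of the query field in which the model's output (specified by `model_output_field`) is stored.
`<model_output_field>`	String	Required	The name or JSON path of the field in the model output to be stored in the `query_output_field`.
`full_response_path`	Boolean	Optional	Set this parameter to `true` if the `model_output_field` contains a full JSON path to the field instead of the field name. The model output will then be fully parsed to get the value of the field. Default is `true` for local models and `false` for externally hosted models.
`ignore_missing`	Boolean	Optional	If `true` and any of the input fields defined in the `input_map` or `output_map` are missing, then this processor is ignored. Otherwise, a missing field causes a failure. Default is `false`.
`ignore_failure`	Boolean	Optional	Specifies whether to continue execution even if the processor encounters an error. If `true`, then this processor is ignored and the search continues. If `false`, then any failure causes the search to be canceled. Default is `false`.
`max_prediction_tasks`	Integer	Optional	The maximum number of concurrent model invocations that can run during query search. Default is `10`.
`description`	String	Optional	A brief description of the processor.
`tag`	String	Optional	An identifier tag for the processor. Useful in debugging to distinguish between processors of the same type.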

The input_map and output_map mappings support standard JSON path notation for specifying complex data structures.
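As a rough illustration of the required and optional parameters, the following Python sketch assembles an ml_inference processor definition as a dictionary and serializes it to JSON. The helper name build_ml_inference_processor is hypothetical and is not part of any OpenSearch client; it only checks that the three required parameters are present and passes optional parameters through unchanged.

```python
import json


def build_ml_inference_processor(model_id, input_map, output_map, **optional):
    """Assemble an ml_inference processor definition (hypothetical helper).

    Only the parameters marked Required in the table are validated here;
    optional parameters are passed through unchanged.
    """
    if not model_id:
        raise ValueError("model_id is required")
    if not input_map or not output_map:
        raise ValueError("input_map and output_map are required")
    body = {"model_id": model_id, "input_map": input_map, "output_map": output_map}
    body.update(optional)
    return {"ml_inference": body}


processor = build_ml_inference_processor(
    "<model_id>",
    input_map=[{"<model_input_field>": "<query_input_field>"}],
    output_map=[{"<query_output_field>": "<model_output_field>"}],
    ignore_failure=False,
)

# Per the table, the input_map size is the number of model invocations.
expected_predict_calls = len(processor["ml_inference"]["input_map"])

print(json.dumps(processor, indent=2))
```

The printed JSON can be placed in the request_processors array of a search pipeline definition.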

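Using the processor

Follow these steps to use the processor in a pipeline. You must provide a model ID, input_map, and output_map when creating the processor. Before testing a pipeline using the processor, make sure that the model is successfully deployed. You can check the model state using the Get Model API.

For local models, you must provide a model_input field that specifies the model input format. Add any input fields in model_config to model_input.

For externally hosted models, the model_input field is optional, and its default value is "{ \"parameters\": ${ml_inference.parameters} }".

Setup

Create an index named my_index and index two documents: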

POST /my_index/_doc/1
{
  "passage_text": "I am excited",
  "passage_language": "en",
  "label": "POSITIVE",
  "passage_embedding": [
    2.3886719,
    0.032714844,
    -0.22229004
    ...]
}

POST /my_index/_doc/2
{
  "passage_text": "I am sad",
  "passage_language": "en",
  "label": "NEGATIVE",
  "passage_embedding": [
    1.7773438,
    0.4309082,
    1.8857422,
    0.95996094,
    ...]
}

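When you run a term query on the created index without a search pipeline, the query searches for documents that contain the exact term specified in the query. The following query does not return any results because the query text does not match any of the documents in the index: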

GET /my_index/_search
{
  "query": {
    "term": {
      "passage_text": {
        "value": "happy moments",
        "boost": 1
      }
    }
  }
}

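By using a model, the search pipeline can dynamically rewrite the term value based on the model inference to enhance or alter the search results. This means that the model takes an initial input from the search query, processes it, and then updates the query term to reflect the model inference, potentially improving the relevance of the search results.

Example: Externally hosted model

The following example configures an ml_inference processor with an externally hosted model.

Step 1: Create a pipeline

This example demonstrates how to create a search pipeline for an externally hosted sentiment analysis model that rewrites the term query value. The model requires an inputs field and produces results in a label field. Because the function_name is not specified, it defaults to remote, indicating an externally hosted model.

The term query value is rewritten based on the model's output. The ml_inference processor in the search request needs an input_map to retrieve the query field value for the model input and an output_map to assign the model output to the query string.

In this example, an ml_inference search request processor is used for the following term query: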

{
  "query": {
    "term": {
      "label": {
        "value": "happy moments",
        "boost": 1
      }
    }
  }
}

The following request creates a search pipeline that rewrites the preceding term query:

PUT /_search/pipeline/ml_inference_pipeline
{
  "description": "Generate passage_embedding for searched documents",
  "request_processors": [
    {
      "ml_inference": {
        "model_id": "<your model id>",
        "input_map": [
          {
            "inputs": "query.term.label.value"
          }
        ],
        "output_map": [
          {
            "query.term.label.value": "label"
          }
        ]
      }
    }
  ]
}

When making a Predict API request to an externally hosted model, all necessary fields and parameters are usually contained within a parameters object:

POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_predict
{
  "parameters": {
    "inputs": [
      {
        ...
      }
    ]
  }
}

Thus, to use an externally hosted sentiment analysis model, send a Predict API request in the following format:

POST /_plugins/_ml/models/cywgD5EB6KAJXDLxyDp1/_predict
{
  "parameters": {
    "inputs": "happy moments"
  }
}

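The model processes the input and generates a prediction based on the sentiment of the input text. In this case, the sentiment is positive: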

{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "label": "POSITIVE",
            "score": "0.948"
          }
        }
      ],
      "status_code": 200
    }
  ]
}

When specifying the input_map for an externally hosted model, you can directly reference the inputs field instead of providing its dot path, parameters.inputs:

"input_map": [  
  {
    "inputs": "query.term.label.value"
  }
]
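To make the mapping concrete, here is a minimal Python sketch of how the input side could be resolved for an externally hosted model: the value at the query path is looked up and placed under parameters by its model input field name. The functions get_by_dot_path and build_remote_predict_body are hypothetical; this illustrates the mapping only and is not the ML Commons implementation.

```python
def get_by_dot_path(document, path):
    """Return the value at a dot-separated path such as 'query.term.label.value'."""
    current = document
    for key in path.split("."):
        current = current[key]
    return current


def build_remote_predict_body(search_request, input_map_entry):
    """Build a Predict API body, nesting each model input under 'parameters'."""
    parameters = {
        model_field: get_by_dot_path(search_request, query_path)
        for model_field, query_path in input_map_entry.items()
    }
    return {"parameters": parameters}


search_request = {
    "query": {"term": {"label": {"value": "happy moments", "boost": 1}}}
}
predict_body = build_remote_predict_body(
    search_request, {"inputs": "query.term.label.value"}
)
print(predict_body)
```

The resulting dictionary has the same shape as the Predict API request for the sentiment analysis model.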

Step 2: Run the pipeline

Once you have created a search pipeline, you can run the same term query with the search pipeline:

GET /my_index/_search?search_pipeline=ml_inference_pipeline
{
  "query": {
    "term": {
      "label": {
        "value": "happy moments",
        "boost": 1
      }
    }
  }
}

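The query term value is rewritten based on the model's output. The model determines that the sentiment of the query term is positive, so the rewritten query appears as follows: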

{
  "query": {
    "term": {
      "label": {
        "value": "POSITIVE",
        "boost": 1
      }
    }
  }
}
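The rewrite can be mimicked in a few lines of Python. The sketch below takes the Predict API response for the sentiment analysis model and applies the output_map entry from the pipeline. It assumes that, with full_response_path left at false, the model output field name is looked up in the dataAsMap of the first output. The helper names are hypothetical, and this is not the processor's actual code.

```python
import copy


def set_by_dot_path(document, path, value):
    """Set the value at a dot-separated path whose parent keys already exist."""
    *parents, leaf = path.split(".")
    current = document
    for key in parents:
        current = current[key]
    current[leaf] = value


def apply_output_map(search_request, predict_response, output_map_entry):
    """Return a copy of the request with the model output written into it."""
    # Assumption: the field name is resolved against the first output's dataAsMap.
    data = predict_response["inference_results"][0]["output"][0]["dataAsMap"]
    rewritten = copy.deepcopy(search_request)
    for query_path, model_field in output_map_entry.items():
        set_by_dot_path(rewritten, query_path, data[model_field])
    return rewritten


original = {"query": {"term": {"label": {"value": "happy moments", "boost": 1}}}}
predict_response = {
    "inference_results": [
        {
            "output": [
                {
                    "name": "response",
                    "dataAsMap": {"label": "POSITIVE", "score": "0.948"},
                }
            ],
            "status_code": 200,
        }
    ]
}

rewritten = apply_output_map(
    original, predict_response, {"query.term.label.value": "label"}
)
print(rewritten)
```

The original request is copied rather than modified, so the query before and after the rewrite can be compared.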

The response contains the document whose label field has the value POSITIVE:

{
  "took": 288,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.00009405752,
    "hits": [
      {
        "_index": "my_index",
        "_id": "1",
        "_score": 0.00009405752,
        "_source": {
          "passage_text": "I am excited",
          "passage_language": "en",
          "label": "POSITIVE"
        }
      }
    ]
  }
}

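Example: Local model

The following example shows you how to configure an ml_inference processor with a local model to rewrite a term query into a k-NN query.

Step 1: Create a pipeline

The following example shows you how to create a search pipeline for the huggingface/sentence-transformers/all-distilroberta-v1 local model. The model is a pretrained sentence transformer model hosted in your OpenSearch cluster.

If you invoke the model using the Predict API, then the request appears as follows: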

POST /_plugins/_ml/_predict/text_embedding/cleMb4kBJ1eYAeTMFFg4
{
  "text_docs": [
    "today is sunny"
  ],
  "return_number": true,
  "target_response": [
    "sentence_embedding"
  ]
}

Using this schema, specify the model_input as follows:

 "model_input": "{ \"text_docs\": ${input_map.text_docs}, \"return_number\": ${model_config.return_number}, \"target_response\": ${model_config.target_response} }"

In the input_map, map the query.term.passage_embedding.value query field to the text_docs field expected by the model:

"input_map": [
  {
    "text_docs": "query.term.passage_embedding.value"
  } 
]
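The following Python sketch shows one way the ${...} placeholders in the model_input template could be resolved into a Predict API request body. It is only an approximation for illustration: the render_model_input function and the regular-expression substitution are assumptions, and the plugin's real substitution logic may differ. The input values are taken from the Predict API request shown earlier in this example.

```python
import json
import re

MODEL_INPUT = (
    '{ "text_docs": ${input_map.text_docs}, '
    '"return_number": ${model_config.return_number}, '
    '"target_response": ${model_config.target_response} }'
)

# Matches placeholders such as ${input_map.text_docs}.
PLACEHOLDER = re.compile(r"\$\{(input_map|model_config)\.(\w+)\}")


def render_model_input(template, input_values, model_config):
    """Replace each ${scope.name} placeholder with the JSON form of its value."""
    scopes = {"input_map": input_values, "model_config": model_config}

    def substitute(match):
        scope, name = match.group(1), match.group(2)
        return json.dumps(scopes[scope][name])

    return json.loads(PLACEHOLDER.sub(substitute, template))


body = render_model_input(
    MODEL_INPUT,
    input_values={"text_docs": ["today is sunny"]},
    model_config={"return_number": True, "target_response": ["sentence_embedding"]},
)
print(json.dumps(body, indent=2))
```

Serializing each value with json.dumps keeps lists and Booleans valid once they are spliced into the JSON template.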

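Because you specified the field to be converted into embeddings as a JSON path, you need to set the full_response_path to true. Then the full JSON document is parsed in order to obtain the input field: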

"full_response_path": true

The text in the query.term.passage_embedding.value field will be used to generate embeddings:

{
  "text_docs": "happy passage"
}

The Predict API request returns the following response:

{
  "inference_results": [
    {
      "output": [
        {
          "name": "sentence_embedding",
          "data_type": "FLOAT32",
          "shape": [
            768
          ],
          "data": [
            0.25517133,
            -0.28009856,
            0.48519906,
            ...
          ]
        }
      ]
    }
  ]
}

The model generates embeddings in the $.inference_results.*.output.*.data field. The output_map maps this field to the query field in the query template:

"output_map": [
  {
    "modelPredictionOutcome": "$.inference_results.*.output.*.data"
  }
]
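To see what this path selects, the following Python sketch walks a Predict API response using a deliberately tiny subset of JSON path (only $, dotted keys, and * wildcards). The select function is hypothetical and stands in for a real JSON path library. The embedding is truncated to the three values visible in the response.

```python
def select(document, path):
    """Return all values matching a path made of '$', keys, and '*' wildcards."""
    steps = path.split(".")
    if steps[0] != "$":
        raise ValueError("path must start with '$'")
    matches = [document]
    for step in steps[1:]:
        next_matches = []
        for node in matches:
            if step == "*":
                # A wildcard expands every element of a list or value of a dict.
                children = node.values() if isinstance(node, dict) else node
                next_matches.extend(children)
            else:
                next_matches.append(node[step])
        matches = next_matches
    return matches


response = {
    "inference_results": [
        {
            "output": [
                {
                    "name": "sentence_embedding",
                    "data_type": "FLOAT32",
                    "shape": [768],
                    # Truncated for illustration; the real vector has 768 values.
                    "data": [0.25517133, -0.28009856, 0.48519906],
                }
            ]
        }
    ]
}

embeddings = select(response, "$.inference_results.*.output.*.data")
print(embeddings)
```

Because the path contains wildcards, this sketch returns a list of matches; with a single inference result and a single output, that list holds one embedding vector.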

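To configure an ml_inference search request processor with a local model, specify the function_name explicitly. In this example, the function_name is text_embedding. For information about valid function_name values, see Configuration parameters.

The following is the final configuration of the ml_inference processor with the local model: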

PUT /_search/pipeline/ml_inference_pipeline_local
{
  "description": "searches reviews and generates embeddings",
  "request_processors": [
    {
      "ml_inference": {
        "function_name": "text_embedding",
        "full_response_path": true,
        "model_id": "<your model id>",
        "model_config": {
          "return_number": true,
          "target_response": [
            "sentence_embedding"
          ]
        },
        "model_input": "{ \"text_docs\": ${input_map.text_docs}, \"return_number\": ${model_config.return_number}, \"target_response\": ${model_config.target_response} }",
        "query_template": """{
        "size": 2,
        "query": {
          "knn": {
            "passage_embedding": {
              "vector": ${modelPredictionOutcome},
              "k": 5
              }
            }
           }
          }""",
        "input_map": [
          {
            "text_docs": "query.term.passage_embedding.value"
          }
        ],
        "output_map": [
          {
            "modelPredictionOutcome": "$.inference_results.*.output.*.data"
          }
        ],
        "ignore_missing": true,
        "ignore_failure": true
      }
    }
  ]
}
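As an illustration of how the query_template turns the original term query into a k-NN query, the Python sketch below fills ${modelPredictionOutcome} with a vector using string.Template from the standard library. The vector is truncated to three values for readability, and the substitution approach is an assumption for demonstration, not the processor's implementation.

```python
import json
from string import Template

QUERY_TEMPLATE = Template("""{
  "size": 2,
  "query": {
    "knn": {
      "passage_embedding": {
        "vector": ${modelPredictionOutcome},
        "k": 5
      }
    }
  }
}""")

# Truncated embedding, for illustration only.
vector = [0.25517133, -0.28009856, 0.48519906]

# Serialize the vector as JSON so the filled template is itself valid JSON.
rewritten_query = json.loads(
    QUERY_TEMPLATE.substitute(modelPredictionOutcome=json.dumps(vector))
)
print(json.dumps(rewritten_query, indent=2))
```

The rendered query contains only a knn clause, because the template replaces the whole request body rather than editing a single field of the term query.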

Step 2: Run the pipeline

Run the following query, providing the pipeline name in the request:

GET /my_index/_search?search_pipeline=ml_inference_pipeline_local
{
  "query": {
    "term": {
      "passage_embedding": {
        "value": "happy passage"
      }
    }
  }
}

The response confirms that the processor ran a k-NN query, which returned document 1 with a higher score:

{
  "took": 288,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 0.00009405752,
    "hits": [
      {
        "_index": "my_index",
        "_id": "1",
        "_score": 0.00009405752,
        "_source": {
          "passage_text": "I am excited",
          "passage_language": "en",
          "label": "POSITIVE",
          "passage_embedding": [
            2.3886719,
            0.032714844,
            -0.22229004
            ...]
        }
      },
      {
        "_index": "my_index",
        "_id": "2",
        "_score": 0.00001405052,
        "_source": {
          "passage_text": "I am sad",
          "passage_language": "en",
          "label": "NEGATIVE",
          "passage_embedding": [
            1.7773438,
            0.4309082,
            1.8857422,
            0.95996094,
            ...
          ]
        }
      }
    ]
  }
}
