
Hybrid search explain

Introduced 2.19

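You can provide the explain parameter to understand how scores are calculated, normalized, and combined in hybrid queries. When enabled, it returns detailed information about the scoring process for each search result, including the score normalization technique used, how the scores were combined, and how each subquery score was calculated. This insight makes it easier to understand and optimize your hybrid query results. For more information about explain, see Explain API.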

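explain is an expensive operation in terms of both resources and time. On production clusters, we recommend using it sparingly and only for troubleshooting.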

When running a complete hybrid query, you can provide the explain parameter in the URL using the following syntax:

GET <index>/_search?search_pipeline=<search_pipeline>&explain=true
POST <index>/_search?search_pipeline=<search_pipeline>&explain=true
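As a minimal sketch of assembling this URL programmatically, the following Python helper builds the path and query string with the standard library only. The function name build_explain_search_path and the example index and pipeline names are illustrative assumptions, not part of OpenSearch.

```python
from urllib.parse import quote, urlencode


def build_explain_search_path(index: str, search_pipeline: str) -> str:
    """Return the _search path with the search pipeline and explain=true set."""
    params = urlencode({"search_pipeline": search_pipeline, "explain": "true"})
    # Percent-encode the index name so unusual characters cannot break the path.
    return f"/{quote(index, safe='')}/_search?{params}"


print(build_explain_search_path("my-nlp-index", "my_pipeline"))
```

The resulting path can be appended to the cluster's base URL and used with either GET or POST.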

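To use the explain parameter, you must configure the hybrid_score_explanation response processor in your search pipeline. For more information, see Hybrid score explanation processor.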
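The following sketch shows what such a pipeline definition might look like when built in Python. It assumes the pipeline also contains a normalization-processor that uses the min_max and arithmetic_mean techniques seen in the sample response on this page; the pipeline name and description are placeholders, and the exact processor options should be checked against the processor documentation.

```python
import json

# Hypothetical pipeline name used only for this example.
PIPELINE_NAME = "my_pipeline"

pipeline_body = {
    "description": "Hybrid search pipeline with score explanation",
    # Assumed setup: normalizes and combines the subquery scores.
    "phase_results_processors": [
        {
            "normalization-processor": {
                "normalization": {"technique": "min_max"},
                "combination": {"technique": "arithmetic_mean"},
            }
        }
    ],
    # Required for explain: adds normalization and combination details to the response.
    "response_processors": [{"hybrid_score_explanation": {}}],
}

# Print the request line and the JSON body that would be sent to the cluster.
print(f"PUT /_search/pipeline/{PIPELINE_NAME}")
print(json.dumps(pipeline_body, indent=2))
```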

You can also use explain with an individual document ID:

GET <index>/_explain/<id>
POST <index>/_explain/<id>

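In this case, the result contains only low-level scoring information, for example, Okapi BM25 scores for text-based queries such as term or match. For a response example, see the Explain API example response.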

To see the explain output for all results, set the parameter to true either in the URL or in the request body:

POST my-nlp-index/_search?search_pipeline=my_pipeline&explain=true
{
  "_source": {
    "exclude": [
      "passage_embedding"
    ]
  },
  "query": {
    "hybrid": {
      "queries": [
        {
          "match": {
            "text": {
              "query": "horse"
            }
          }
        },
        {
          "neural": {
            "passage_embedding": {
              "query_text": "wild west",
              "model_id": "aVeif4oB5Vm0Tdw8zYO2",
              "k": 5
            }
          }
        }
      ]
    }
  }
}
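The request above sets explain in the URL. As a hedged sketch of the request-body alternative, the helper below adds a top-level explain field to a search body. The name with_explain is hypothetical, and the abbreviated query stands in for the full hybrid query; the search_pipeline parameter would still be passed in the URL.

```python
import json


def with_explain(search_body: dict) -> dict:
    """Return a copy of a search request body with explain enabled."""
    body = dict(search_body)  # shallow copy so the caller's dict is left untouched
    body["explain"] = True
    return body


hybrid_query = {
    "query": {
        "hybrid": {
            "queries": [
                {"match": {"text": {"query": "horse"}}},
                # Further subqueries, such as a neural query, would go here.
            ]
        }
    }
}

print(json.dumps(with_explain(hybrid_query), indent=2))
```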

The response contains scoring information:
{
    "took": 54,
    "timed_out": false,
    "_shards": {
        "total": 2,
        "successful": 2,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 5,
            "relation": "eq"
        },
        "max_score": 0.9251075,
        "hits": [
            {
                "_shard": "[my-nlp-index][0]",
                "_node": "IsuzeVYdSqKUfy0qfqil2w",
                "_index": "my-nlp-index",
                "_id": "5",
                "_score": 0.9251075,
                "_source": {
                    "text": "A rodeo cowboy , wearing a cowboy hat , is being thrown off of a wild white horse .",
                    "id": "2691147709.jpg"
                },
                "_explanation": {
                    "value": 0.9251075,
                    "description": "arithmetic_mean combination of:",
                    "details": [
                        {
                            "value": 1.0,
                            "description": "min_max normalization of:",
                            "details": [
                                {
                                    "value": 1.2336599,
                                    "description": "weight(text:horse in 0) [PerFieldSimilarity], result of:",
                                    "details": [
                                        {
                                            "value": 1.2336599,
                                            "description": "score(freq=1.0), computed as boost * idf * tf from:",
                                            "details": [
                                                {
                                                    "value": 2.2,
                                                    "description": "boost",
                                                    "details": []
                                                },
                                                {
                                                    "value": 1.2039728,
                                                    "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                                                    "details": [
                                                        {
                                                            "value": 1,
                                                            "description": "n, number of documents containing term",
                                                            "details": []
                                                        },
                                                        {
                                                            "value": 4,
                                                            "description": "N, total number of documents with field",
                                                            "details": []
                                                        }
                                                    ]
                                                },
                                                {
                                                    "value": 0.46575344,
                                                    "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                                                    "details": [
                                                        {
                                                            "value": 1.0,
                                                            "description": "freq, occurrences of term within document",
                                                            "details": []
                                                        },
                                                        {
                                                            "value": 1.2,
                                                            "description": "k1, term saturation parameter",
                                                            "details": []
                                                        },
                                                        {
                                                            "value": 0.75,
                                                            "description": "b, length normalization parameter",
                                                            "details": []
                                                        },
                                                        {
                                                            "value": 16.0,
                                                            "description": "dl, length of field",
                                                            "details": []
                                                        },
                                                        {
                                                            "value": 17.0,
                                                            "description": "avgdl, average length of field",
                                                            "details": []
                                                        }
                                                    ]
                                                }
                                            ]
                                        }
                                    ]
                                }
                            ]
                        },
                        {
                            "value": 0.8503647,
                            "description": "min_max normalization of:",
                            "details": [
                                {
                                    "value": 0.015177966,
                                    "description": "within top 5",
                                    "details": []
                                }
                            ]
                        }
                    ]
...
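To make the nested details easier to read, the following sketch recomputes the BM25 score of the match subquery from the values reported in the response, using the formulas quoted in the description fields. The variable names are illustrative; the numbers are copied from the response above.

```python
import math

# Values copied from the "details" entries of the match subquery.
boost = 2.2
n, N = 1, 4             # documents containing the term / documents with the field
freq, k1, b = 1.0, 1.2, 0.75
dl, avgdl = 16.0, 17.0  # field length / average field length

# idf, computed as log(1 + (N - n + 0.5) / (n + 0.5))
idf = math.log(1 + (N - n + 0.5) / (n + 0.5))
# tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl))
tf = freq / (freq + k1 * (1 - b + b * dl / avgdl))
# score, computed as boost * idf * tf
score = boost * idf * tf

print(f"idf={idf:.7f} tf={tf:.8f} score={score:.7f}")
```

The computed idf, tf, and score should agree with the reported 1.2039728, 0.46575344, and 1.2336599 up to small rounding differences.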

Response body fields

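Field          Description
explanation    The explanation object has three properties: value, description, and details. The value property shows the result of the calculation, description explains what type of calculation was performed, and details shows any subcalculations performed. For score normalization, the information in the description property includes the technique used for normalization or combination and the corresponding score.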
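Because details can nest to any depth, a small recursive walk is one way to inspect an explanation. The sketch below flattens a tree into indented lines; the function name format_explanation is an assumption, and the sample is an abbreviated copy of the neural subquery branch from the response on this page.

```python
def format_explanation(node: dict, depth: int = 0) -> list[str]:
    """Flatten an explanation tree into indented 'value  description' lines."""
    lines = [f"{'  ' * depth}{node['value']}  {node['description']}"]
    # Each entry in details is itself an explanation object, so recurse into it.
    for child in node.get("details", []):
        lines.extend(format_explanation(child, depth + 1))
    return lines


# Abbreviated sample shaped like one branch of a hit's _explanation.
sample = {
    "value": 0.8503647,
    "description": "min_max normalization of:",
    "details": [
        {"value": 0.015177966, "description": "within top 5", "details": []}
    ],
}

print("\n".join(format_explanation(sample)))
```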

