Hybrid search explain
Introduced 2.19
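You can provide the explain parameter to understand how scores are calculated, normalized, and combined in hybrid queries. When enabled, it returns detailed information about the scoring process for each search result: the score normalization technique used, how the scores were combined, and how each subquery score was calculated. This makes it easier to understand and optimize hybrid query results. For more information about explain, see Explain API.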
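explain is an expensive operation in terms of both resources and time. For production clusters, we recommend using it sparingly and only for troubleshooting.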
When running a complete hybrid query, you can provide the explain parameter in the URL using the following syntax:
GET <index>/_search?search_pipeline=<search_pipeline>&explain=true
POST <index>/_search?search_pipeline=<search_pipeline>&explain=true
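The URL syntax above can also be assembled programmatically. The following is a minimal Python sketch that uses only the standard library. The helper name build_hybrid_explain_url and the base URL http://localhost:9200 are illustrative assumptions; only the search_pipeline and explain query parameters come from the syntax above.

```python
from urllib.parse import quote, urlencode


def build_hybrid_explain_url(base_url, index, search_pipeline):
    """Return a _search URL that enables explain for a hybrid query.

    The function name and base URL are hypothetical; the query
    parameters mirror the URL syntax shown in the documentation.
    """
    query = urlencode({"search_pipeline": search_pipeline, "explain": "true"})
    return f"{base_url.rstrip('/')}/{quote(index, safe='')}/_search?{query}"


if __name__ == "__main__":
    # Assumed local cluster address; replace with your own endpoint.
    print(build_hybrid_explain_url("http://localhost:9200", "my-nlp-index", "my_pipeline"))
```

The resulting URL can be used with either GET or POST, with the hybrid query supplied in the request body.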
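To use the explain parameter, you must configure the hybrid_score_explanation response processor in your search pipeline. For more information, see Hybrid score explanation processor.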
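As a rough illustration of that requirement, the sketch below builds a search pipeline body in Python that pairs a normalization processor with the hybrid_score_explanation response processor. The body layout, the description text, and the helper name build_explain_pipeline are assumptions made for this example; the min_max and arithmetic_mean techniques are chosen only because they appear in the example response on this page. Check the processor reference for the authoritative configuration.

```python
import json


def build_explain_pipeline(normalization="min_max", combination="arithmetic_mean"):
    """Build a search pipeline body that includes hybrid_score_explanation.

    Assumed layout: a normalization-processor in phase_results_processors
    and an empty hybrid_score_explanation entry in response_processors.
    """
    return {
        "description": "Hybrid search pipeline with score explanation",
        "phase_results_processors": [
            {
                "normalization-processor": {
                    "normalization": {"technique": normalization},
                    "combination": {"technique": combination},
                }
            }
        ],
        "response_processors": [
            {"hybrid_score_explanation": {}}
        ],
    }


if __name__ == "__main__":
    # The printed JSON would be sent as the body of
    # PUT /_search/pipeline/<search_pipeline> (pipeline name is up to you).
    print(json.dumps(build_explain_pipeline(), indent=2))
```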
You can also use explain with an individual document ID:
GET <index>/_explain/<id>
POST <index>/_explain/<id>
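In this case, the result contains only low-level scoring information, for example, the Okapi BM25 scores for text-based queries such as term or match. For an example response, see Explain API example response.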
To see the explain output for all results, set the parameter to true either in the URL or in the request body:
POST my-nlp-index/_search?search_pipeline=my_pipeline&explain=true
{
  "_source": {
    "exclude": [
      "passage_embedding"
    ]
  },
  "query": {
    "hybrid": {
      "queries": [
        {
          "match": {
            "text": {
              "query": "horse"
            }
          }
        },
        {
          "neural": {
            "passage_embedding": {
              "query_text": "wild west",
              "model_id": "aVeif4oB5Vm0Tdw8zYO2",
              "k": 5
            }
          }
        }
      ]
    }
  }
}
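The request above enables explain through the URL. As a sketch of the request-body alternative, the following Python helper adds a top-level explain flag to an existing search body without modifying the original. The helper name with_explain is hypothetical, and the sample body is a trimmed copy of the request above.

```python
import copy
import json


def with_explain(search_body):
    """Return a copy of search_body with the top-level explain flag set."""
    new_body = copy.deepcopy(search_body)  # leave the caller's dict untouched
    new_body["explain"] = True
    return new_body


# Trimmed version of the hybrid query shown above.
hybrid_body = {
    "_source": {"exclude": ["passage_embedding"]},
    "query": {
        "hybrid": {
            "queries": [
                {"match": {"text": {"query": "horse"}}},
                {
                    "neural": {
                        "passage_embedding": {
                            "query_text": "wild west",
                            "model_id": "aVeif4oB5Vm0Tdw8zYO2",
                            "k": 5,
                        }
                    }
                },
            ]
        }
    },
}

if __name__ == "__main__":
    print(json.dumps(with_explain(hybrid_body), indent=2))
```

The search pipeline is still passed in the URL through the search_pipeline parameter; only the explain flag moves into the body.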
The response contains scoring information:
{
  "took": 54,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 5,
      "relation": "eq"
    },
    "max_score": 0.9251075,
    "hits": [
      {
        "_shard": "[my-nlp-index][0]",
        "_node": "IsuzeVYdSqKUfy0qfqil2w",
        "_index": "my-nlp-index",
        "_id": "5",
        "_score": 0.9251075,
        "_source": {
          "text": "A rodeo cowboy , wearing a cowboy hat , is being thrown off of a wild white horse .",
          "id": "2691147709.jpg"
        },
        "_explanation": {
          "value": 0.9251075,
          "description": "arithmetic_mean combination of:",
          "details": [
            {
              "value": 1.0,
              "description": "min_max normalization of:",
              "details": [
                {
                  "value": 1.2336599,
                  "description": "weight(text:horse in 0) [PerFieldSimilarity], result of:",
                  "details": [
                    {
                      "value": 1.2336599,
                      "description": "score(freq=1.0), computed as boost * idf * tf from:",
                      "details": [
                        {
                          "value": 2.2,
                          "description": "boost",
                          "details": []
                        },
                        {
                          "value": 1.2039728,
                          "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                          "details": [
                            {
                              "value": 1,
                              "description": "n, number of documents containing term",
                              "details": []
                            },
                            {
                              "value": 4,
                              "description": "N, total number of documents with field",
                              "details": []
                            }
                          ]
                        },
                        {
                          "value": 0.46575344,
                          "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                          "details": [
                            {
                              "value": 1.0,
                              "description": "freq, occurrences of term within document",
                              "details": []
                            },
                            {
                              "value": 1.2,
                              "description": "k1, term saturation parameter",
                              "details": []
                            },
                            {
                              "value": 0.75,
                              "description": "b, length normalization parameter",
                              "details": []
                            },
                            {
                              "value": 16.0,
                              "description": "dl, length of field",
                              "details": []
                            },
                            {
                              "value": 17.0,
                              "description": "avgdl, average length of field",
                              "details": []
                            }
                          ]
                        }
                      ]
                    }
                  ]
                }
              ]
            },
            {
              "value": 0.8503647,
              "description": "min_max normalization of:",
              "details": [
                {
                  "value": 0.015177966,
                  "description": "within top 5",
                  "details": []
                }
              ]
            }
          ]
...
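To make the lowest level of this explanation concrete, the following Python sketch recomputes the BM25 score of the match subquery from the inputs listed in the response (boost, n, N, freq, k1, b, dl, avgdl), using the idf and tf formulas quoted in the description strings. The function name bm25_term_score is an assumption made for this example; it is not an OpenSearch or Lucene API.

```python
import math


def bm25_term_score(boost, n, N, freq, k1, b, dl, avgdl):
    """Recompute boost * idf * tf using the formulas from the explain output."""
    # idf, computed as log(1 + (N - n + 0.5) / (n + 0.5))
    idf = math.log(1 + (N - n + 0.5) / (n + 0.5))
    # tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl))
    tf = freq / (freq + k1 * (1 - b + b * dl / avgdl))
    return boost * idf * tf


if __name__ == "__main__":
    # Input values copied from the "details" entries in the response above.
    score = bm25_term_score(
        boost=2.2, n=1, N=4, freq=1.0, k1=1.2, b=0.75, dl=16.0, avgdl=17.0
    )
    print(score)
```

With these inputs, the result agrees with the reported value of 1.2336599 to within floating-point rounding. This is the raw subquery score that min_max normalization then maps to 1.0.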
Response body fields
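Field | Description |
---|---|
explanation | The explanation object has three properties: value, description, and details. The value property shows the result of the calculation, description explains what type of calculation was performed, and details shows any subcalculations performed. For score normalization, the information in the description property includes the technique used for normalization or combination and the corresponding score. |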
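Because every explanation node has the same value, description, and details shape, the tree can be flattened into an indented outline for easier reading. The following Python sketch shows one way to do this. The function name format_explanation is hypothetical, and the sample node is a trimmed copy of the second normalization branch from the response above.

```python
def format_explanation(node, depth=0):
    """Return one indented line per explanation node, parents before children."""
    lines = [f"{'  ' * depth}{node['value']}  {node['description']}"]
    for child in node.get("details", []):
        lines.extend(format_explanation(child, depth + 1))
    return lines


# Trimmed sample taken from the example response.
sample = {
    "value": 0.8503647,
    "description": "min_max normalization of:",
    "details": [
        {"value": 0.015177966, "description": "within top 5", "details": []}
    ],
}

if __name__ == "__main__":
    print("\n".join(format_explanation(sample)))
```

In practice, you would pass the _explanation object of each hit to the helper. The indentation depth then shows which subcalculation feeds which normalization or combination step.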
Next steps
- To learn how to use explain with inner hits, see Using inner hits in hybrid queries.