Reranking search results by a field
Introduced in version 2.18
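You can use the by_field rerank type to rerank search results by a document field. Reranking by a field is useful if a model has already run and produced a numerical score for your documents, or if a previous search response processor was applied and you want to rerank the documents differently based on an aggregated field.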
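To implement reranking, you need to configure a search pipeline that runs at search time. The search pipeline intercepts the search results and applies the rerank processor to them. The rerank processor evaluates the search results and sorts them based on the new scores obtained from a document field.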
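The behavior described above can be sketched in plain Python. This is only an illustrative model of by_field reranking, not the processor's actual implementation: the helper names get_by_path and rerank_by_field are hypothetical, and the hit structure is simplified to the _id, _score, and _source keys.

```python
from copy import deepcopy


def get_by_path(source, dotted_path):
    """Follow a dot path such as 'reviews.stars' through nested dicts."""
    value = source
    for key in dotted_path.split("."):
        value = value[key]
    return value


def rerank_by_field(hits, target_field, keep_previous_score=False):
    """Return copies of the hits, rescored by the target field, highest first."""
    reranked = []
    for hit in deepcopy(hits):  # leave the caller's hits untouched
        if keep_previous_score:
            # Preserve the original query score alongside the document source.
            hit["_source"]["previous_score"] = hit["_score"]
        hit["_score"] = float(get_by_path(hit["_source"], target_field))
        reranked.append(hit)
    return sorted(reranked, key=lambda h: h["_score"], reverse=True)


# Simplified hits, as a match_all query might score them (all 1.0).
hits = [
    {"_id": "1", "_score": 1.0, "_source": {"reviews": {"stars": 4.2}}},
    {"_id": "2", "_score": 1.0, "_source": {"reviews": {"stars": 4.7}}},
    {"_id": "4", "_score": 1.0, "_source": {"reviews": {"stars": 4.8}}},
]
reranked = rerank_by_field(hits, "reviews.stars", keep_previous_score=True)
print([h["_id"] for h in reranked])  # ['4', '2', '1']
```

The sketch copies the hits before modifying them so that the original scores remain available for comparison.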
Running a search with reranking
To run a search with reranking, follow these steps.
Step 1: Configure a search pipeline
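Configure a search pipeline with a rerank processor and specify the by_field rerank type. The pipeline sorts by the reviews.stars field (specified by the complete dot path to the field) and returns the original query scores for all documents along with their new scores: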
PUT /_search/pipeline/rerank_byfield_pipeline
{
  "response_processors": [
    {
      "rerank": {
        "by_field": {
          "target_field": "reviews.stars",
          "keep_previous_score": true
        }
      }
    }
  ]
}
For more information about the request fields, see Request fields.
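If you prefer to create the pipeline from a script, the request can be assembled as follows. This is a minimal sketch that uses only the Python standard library. The helper names build_rerank_pipeline and put_pipeline are hypothetical, and the base URL http://localhost:9200 with no authentication is an assumption about your cluster, so the network call is left commented out.

```python
import json
import urllib.request


def build_rerank_pipeline(target_field, keep_previous_score=True):
    """Build the body of a search pipeline with a by_field rerank processor."""
    return {
        "response_processors": [
            {
                "rerank": {
                    "by_field": {
                        "target_field": target_field,
                        "keep_previous_score": keep_previous_score,
                    }
                }
            }
        ]
    }


def put_pipeline(base_url, name, body):
    """Send PUT /_search/pipeline/<name>. Assumes an unsecured cluster."""
    request = urllib.request.Request(
        url=f"{base_url}/_search/pipeline/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


pipeline_body = build_rerank_pipeline("reviews.stars")
print(json.dumps(pipeline_body, indent=2))

# Assumed local endpoint; uncomment to send the request to a running cluster.
# put_pipeline("http://localhost:9200", "rerank_byfield_pipeline", pipeline_body)
```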
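Step 2: Create an index for ingestion
To use the rerank processor defined in your pipeline, create an OpenSearch index and add the pipeline created in the previous step as the default pipeline: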
PUT /book-index
{
  "settings": {
    "index.search.default_pipeline": "rerank_byfield_pipeline"
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text"
      },
      "author": {
        "type": "text"
      },
      "genre": {
        "type": "keyword"
      },
      "reviews": {
        "properties": {
          "stars": {
            "type": "float"
          }
        }
      },
      "description": {
        "type": "text"
      }
    }
  }
}
Step 3: Ingest documents into the index
To ingest documents into the index created in the previous step, send the following bulk request:
POST /_bulk
{ "index": { "_index": "book-index", "_id": "1" } }
{ "title": "The Lost City", "author": "Jane Doe", "genre": "Adventure Fiction", "reviews": { "stars": 4.2 }, "description": "An exhilarating journey through a hidden civilization in the Amazon rainforest." }
{ "index": { "_index": "book-index", "_id": "2" } }
{ "title": "Whispers of the Past", "author": "John Smith", "genre": "Historical Mystery", "reviews": { "stars": 4.7 }, "description": "A gripping tale set in Victorian England, unraveling a century-old mystery." }
{ "index": { "_index": "book-index", "_id": "3" } }
{ "title": "Starlit Dreams", "author": "Emily Clark", "genre": "Science Fiction", "reviews": { "stars": 4.5 }, "description": "In a future where dreams can be shared, one girl discovers her imaginations power." }
{ "index": { "_index": "book-index", "_id": "4" } }
{ "title": "The Enchanted Garden", "author": "Alice Green", "genre": "Fantasy", "reviews": { "stars": 4.8 }, "description": "A magical garden holds the key to a young girls destiny and friendship." }
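The bulk body is newline-delimited JSON: each document is preceded by an action line, and the final line must end with a newline character. The following sketch shows one way to generate such a body from Python dictionaries. The helper name to_bulk_ndjson is hypothetical, and the documents are abbreviated to two fields for brevity.

```python
import json


def to_bulk_ndjson(index_name, docs_by_id):
    """Serialize documents as alternating action and source lines."""
    lines = []
    for doc_id, doc in docs_by_id.items():
        # Action line: which index and document ID the next line belongs to.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(doc))
    # The bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n"


# Abbreviated versions of two of the documents from this step.
books = {
    "1": {"title": "The Lost City", "reviews": {"stars": 4.2}},
    "2": {"title": "Whispers of the Past", "reviews": {"stars": 4.7}},
}
payload = to_bulk_ndjson("book-index", books)
print(payload, end="")
```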
Step 4: Search using reranking
As an example, run a match_all query on your index:
POST /book-index/_search
{
  "query": {
    "match_all": {}
  }
}
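The response contains the documents sorted in descending order by the reviews.stars field. Each document contains its original query score in the previous_score field: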
{
  "took": 33,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 4,
      "relation": "eq"
    },
    "max_score": 4.8,
    "hits": [
      {
        "_index": "book-index",
        "_id": "4",
        "_score": 4.8,
        "_source": {
          "reviews": {
            "stars": 4.8
          },
          "author": "Alice Green",
          "genre": "Fantasy",
          "description": "A magical garden holds the key to a young girls destiny and friendship.",
          "previous_score": 1,
          "title": "The Enchanted Garden"
        }
      },
      {
        "_index": "book-index",
        "_id": "2",
        "_score": 4.7,
        "_source": {
          "reviews": {
            "stars": 4.7
          },
          "author": "John Smith",
          "genre": "Historical Mystery",
          "description": "A gripping tale set in Victorian England, unraveling a century-old mystery.",
          "previous_score": 1,
          "title": "Whispers of the Past"
        }
      },
      {
        "_index": "book-index",
        "_id": "3",
        "_score": 4.5,
        "_source": {
          "reviews": {
            "stars": 4.5
          },
          "author": "Emily Clark",
          "genre": "Science Fiction",
          "description": "In a future where dreams can be shared, one girl discovers her imaginations power.",
          "previous_score": 1,
          "title": "Starlit Dreams"
        }
      },
      {
        "_index": "book-index",
        "_id": "1",
        "_score": 4.2,
        "_source": {
          "reviews": {
            "stars": 4.2
          },
          "author": "Jane Doe",
          "genre": "Adventure Fiction",
          "description": "An exhilarating journey through a hidden civilization in the Amazon rainforest.",
          "previous_score": 1,
          "title": "The Lost City"
        }
      }
    ]
  },
  "profile": {
    "shards": []
  }
}
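To confirm that a response was reranked as expected, you could check it programmatically. The sketch below works on an abbreviated copy of the preceding response (only the fields it inspects are kept). The helper names summarize_hits and is_sorted_descending are hypothetical.

```python
def summarize_hits(response):
    """Return (id, new score, previous score) for each hit, in response order."""
    return [
        (hit["_id"], hit["_score"], hit["_source"].get("previous_score"))
        for hit in response["hits"]["hits"]
    ]


def is_sorted_descending(scores):
    """True if every score is at least as large as the one after it."""
    return all(a >= b for a, b in zip(scores, scores[1:]))


# Abbreviated copy of the response shown above.
response = {
    "hits": {
        "hits": [
            {"_id": "4", "_score": 4.8, "_source": {"reviews": {"stars": 4.8}, "previous_score": 1}},
            {"_id": "2", "_score": 4.7, "_source": {"reviews": {"stars": 4.7}, "previous_score": 1}},
            {"_id": "3", "_score": 4.5, "_source": {"reviews": {"stars": 4.5}, "previous_score": 1}},
            {"_id": "1", "_score": 4.2, "_source": {"reviews": {"stars": 4.2}, "previous_score": 1}},
        ]
    }
}

summary = summarize_hits(response)
scores = [score for _, score, _ in summary]

# The new score of each hit should equal its reviews.stars value.
scores_match_field = all(
    hit["_score"] == hit["_source"]["reviews"]["stars"]
    for hit in response["hits"]["hits"]
)
print(summary)
print(is_sorted_descending(scores), scores_match_field)
```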
Next steps
- Learn more about the rerank processor.
- See a comprehensive example of reranking by a field using an externally hosted cross-encoder model.