Rank field types
Introduced 1.0
The following table lists all rank field types that OpenSearch supports.
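Field data type | Description |
---|---|
rank_feature | Boosts or decreases the relevance score of documents. |
rank_features | Boosts or decreases the relevance score of documents. Used when the list of features is sparse. |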
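Rank feature and rank features fields can be queried only with rank feature queries. They do not support aggregating or sorting.
Rank feature
A rank feature field type uses a positive float value to boost or decrease the relevance score of a document in a rank_feature query. By default, this value boosts the relevance score. To decrease the relevance score, set the optional positive_score_impact parameter to false.
Example
Create a mapping with a rank feature field: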
PUT chessplayers
{
"mappings": {
"properties": {
"name" : {
"type" : "text"
},
"rating": {
"type": "rank_feature"
},
"age": {
"type": "rank_feature",
"positive_score_impact": false
}
}
}
}
Index three documents with a rank_feature field that boosts the score (rating) and a rank_feature field that decreases the score (age):
PUT chessplayers/_doc/1
{
"name" : "John Doe",
"rating" : 2554,
"age" : 75
}
PUT chessplayers/_doc/2
{
"name" : "Kwaku Mensah",
"rating" : 2067,
"age": 10
}
PUT chessplayers/_doc/3
{
"name" : "Nikki Wolf",
"rating" : 1864,
"age" : 22
}
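Rank feature query
Using a rank feature query, you can rank players by rating, by age, or by both rating and age. If you rank players by rating, higher-rated players have higher relevance scores. If you rank players by age, younger players have higher relevance scores.
Use a rank feature query to search for players based on age and rating: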
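The ranking directions described here can be modeled with a small sketch. It assumes a saturation-style function, value / (value + pivot), and models positive_score_impact set to false by scoring the inverse of the value. The function form, the helper names, and the pivot constants (RATING_PIVOT, AGE_PIVOT) are assumptions chosen for illustration and are not taken from OpenSearch, so only the resulting order is meaningful, not the numeric scores.

```python
# Hypothetical model of rank feature scoring. The saturation form and the
# pivot constants are assumptions for illustration, not documented behavior.


def saturation(value: float, pivot: float) -> float:
    """Map a positive value into (0, 1); larger values score higher."""
    return value / (value + pivot)


def feature_score(value: float, pivot: float,
                  positive_score_impact: bool = True) -> float:
    """Score one feature; a negative-impact feature is modeled by inverting it."""
    if value <= 0:
        raise ValueError("rank feature values must be positive")
    modeled = value if positive_score_impact else 1.0 / value
    return saturation(modeled, pivot)


# The three players indexed in the example.
PLAYERS = {
    "John Doe": {"rating": 2554, "age": 75},
    "Kwaku Mensah": {"rating": 2067, "age": 10},
    "Nikki Wolf": {"rating": 1864, "age": 22},
}

RATING_PIVOT = 2000.0  # assumed constant
AGE_PIVOT = 0.05       # assumed constant; applies to 1 / age


def rating_score(player: dict) -> float:
    return feature_score(player["rating"], RATING_PIVOT)


def age_score(player: dict) -> float:
    return feature_score(player["age"], AGE_PIVOT, positive_score_impact=False)


def combined_score(player: dict) -> float:
    """Sum of both feature scores, similar to two should clauses."""
    return rating_score(player) + age_score(player)


def rank(players: dict, score) -> list:
    """Return player names ordered from highest to lowest score."""
    return sorted(players, key=lambda name: score(players[name]), reverse=True)


if __name__ == "__main__":
    print("by rating:", rank(PLAYERS, rating_score))
    print("by age:   ", rank(PLAYERS, age_score))
    print("by both:  ", rank(PLAYERS, combined_score))
```

Under these assumptions, ranking by rating puts the highest-rated player first, while ranking by age puts the youngest player first.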
GET chessplayers/_search
{
"query": {
"bool": {
"should": [
{
"rank_feature": {
"field": "rating"
}
},
{
"rank_feature": {
"field": "age"
}
}
]
}
}
}
When ranked by both age and rating, younger players and players with higher ratings score better:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.2093145,
"hits" : [
{
"_index" : "chessplayers",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.2093145,
"_source" : {
"name" : "Kwaku Mensah",
"rating" : 2067,
"age" : 10
}
},
{
"_index" : "chessplayers",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0150313,
"_source" : {
"name" : "Nikki Wolf",
"rating" : 1864,
"age" : 22
}
},
{
"_index" : "chessplayers",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.8098284,
"_source" : {
"name" : "John Doe",
"rating" : 2554,
"age" : 75
}
}
]
}
}
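Rank features
A rank features field type is similar to the rank feature field type, but it is more suitable for a sparse list of features. A rank features field can index numeric feature vectors that are later used to boost or decrease documents' relevance scores in rank_feature queries.
Example
Create a mapping with a rank features field: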
PUT testindex1
{
"mappings": {
"properties": {
"correlations": {
"type": "rank_features"
}
}
}
}
To index a document with a rank features field, use a hashmap with string keys and positive float values:
PUT testindex1/_doc/1
{
"correlations": {
"young kids" : 1,
"older kids" : 15,
"teens" : 25.9
}
}
PUT testindex1/_doc/2
{
"correlations": {
"teens": 10,
"adults": 95.7
}
}
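Because a rank features value must be a map of string keys to positive float values, it can be useful to check documents on the client side before sending them. The following is a minimal sketch of such a check. The helper name validate_rank_features is hypothetical and is not part of any OpenSearch client.

```python
import math


def validate_rank_features(features: dict) -> dict:
    """Return a copy with float values, or raise ValueError on invalid input.

    Hypothetical client-side helper: it only enforces the rule stated in the
    text (string keys, positive float values).
    """
    if not isinstance(features, dict):
        raise ValueError("rank features must be a map")
    cleaned = {}
    for key, value in features.items():
        if not isinstance(key, str):
            raise ValueError(f"feature name must be a string: {key!r}")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"feature {key!r} must be a number")
        if not math.isfinite(value) or value <= 0:
            raise ValueError(f"feature {key!r} must be positive and finite")
        cleaned[key] = float(value)
    return cleaned


if __name__ == "__main__":
    print(validate_rank_features({"teens": 10, "adults": 95.7}))
```

A zero or negative value, such as {"teens": 0}, raises ValueError in this sketch.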
Query the documents using a rank feature query:
GET testindex1/_search
{
"query": {
"rank_feature": {
"field": "correlations.teens"
}
}
}
The response is ranked by relevance score:
{
"took" : 123,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 0.6258503,
"hits" : [
{
"_index" : "testindex1",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.6258503,
"_source" : {
"correlations" : {
"young kids" : 1,
"older kids" : 15,
"teens" : 25.9
}
}
},
{
"_index" : "testindex1",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.39263803,
"_source" : {
"correlations" : {
"teens" : 10,
"adults" : 95.7
}
}
}
]
}
}
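Rank feature and rank features fields use the top nine significant bits for precision, leading to about 0.4% relative error. Values are stored with a relative precision of 2^-8 = 0.00390625.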
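The effect of keeping nine significant bits can be illustrated with a short sketch. It assumes that the value is treated as a 32-bit float and that everything below the top eight explicit mantissa bits is dropped (eight explicit bits plus the implicit leading bit give nine significant bits). This layout is an assumption made for illustration, and the function names are hypothetical.

```python
import struct


def keep_nine_significant_bits(value: float) -> float:
    """Truncate a positive value to 9 significant bits of a 32-bit float.

    Assumed layout: sign bit, 8 exponent bits, and the top 8 of the 23
    mantissa bits are kept; the low 15 mantissa bits are zeroed.
    """
    if value <= 0:
        raise ValueError("rank feature values must be positive")
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    bits &= 0xFFFF8000  # clear the low 15 bits
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def relative_error(value: float) -> float:
    """Relative difference between the original and the truncated value."""
    return abs(value - keep_nine_significant_bits(value)) / value


if __name__ == "__main__":
    for original in (1, 10, 15, 25.9, 95.7, 2554):
        stored = keep_nine_significant_bits(original)
        print(f"{original} -> {stored} (relative error {relative_error(original):.5f})")
```

Under this assumption, 25.9 is kept as 25.875 and 10 is kept exactly, and the relative error always stays below 2^-8.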