
Rank field types

Introduced 1.0

The following table lists all rank field types that OpenSearch supports.

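Field data type    Description
rank_feature       Boosts or decreases the relevance score of documents.
rank_features      Boosts or decreases the relevance score of documents. Used when the list of features is sparse.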

Rank feature and rank features fields can be queried only with rank feature queries. They do not support aggregating or sorting.

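Rank feature

A rank feature field type uses a positive float value to boost or decrease the relevance score of a document in a rank_feature query. By default, this value boosts the relevance score. To decrease the relevance score, set the optional positive_score_impact parameter to false.

Example

Create a mapping with a rank feature field: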

PUT chessplayers
{
  "mappings": {
    "properties": {
      "name" : {
        "type" : "text"
      },
      "rating": {
        "type": "rank_feature" 
      },
      "age": {
        "type": "rank_feature",
        "positive_score_impact": false 
      }
    }
  }
}

Index three documents with one rank_feature field that boosts the score (rating) and another rank_feature field that decreases the score (age):

PUT chessplayers/_doc/1
{
  "name" : "John Doe",
  "rating" : 2554,
  "age" : 75
}

PUT chessplayers/_doc/2
{
  "name" : "Kwaku Mensah",
  "rating" : 1967,
  "age": 10
}

PUT chessplayers/_doc/3
{
  "name" : "Nikki Wolf",
  "rating" : 1864,
  "age" : 22
}

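Rank feature query

Using a rank feature query, you can rank players by rating, by age, or by both rating and age. If you rank players by rating, higher-rated players have higher relevance scores. If you rank players by age, younger players have higher relevance scores.

Use a rank feature query to search for players based on age and rating: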

GET chessplayers/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "rank_feature": {
            "field": "rating"
          }
        },
        {
          "rank_feature": {
            "field": "age"
          }
        }
      ]
    }
  }
}

When ranked by both age and rating, younger players and players with higher ratings score better:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.2093145,
    "hits" : [
      {
        "_index" : "chessplayers",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.2093145,
        "_source" : {
          "name" : "Kwaku Mensah",
          "rating" : 1967,
          "age" : 10
        }
      },
      {
        "_index" : "chessplayers",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0150313,
        "_source" : {
          "name" : "Nikki Wolf",
          "rating" : 1864,
          "age" : 22
        }
      },
      {
        "_index" : "chessplayers",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.8098284,
        "_source" : {
          "name" : "John Doe",
          "rating" : 2554,
          "age" : 75
        }
      }
    ]
  }
}
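
The ordering in this response can be sanity-checked offline. The following Python sketch is not OpenSearch code. It assumes that each rank feature clause is scored with a saturation function, value / (value + pivot), which this page does not state, that a field with positive_score_impact set to false contributes the reciprocal of its value, and that the bool query adds the two clause scores. The pivot is taken as the exact geometric mean of the three values, which is also an assumption: OpenSearch derives its own pivot from index statistics, so the absolute scores differ from those in the response and only the ordering is expected to match. All helper names are illustrative.

```python
import math


def geometric_mean(values):
    """Exact geometric mean, used here as a stand-in pivot (assumption)."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def saturation(value, pivot):
    """Saturation function: approaches 1 as value grows, equals 0.5 at the pivot."""
    return value / (value + pivot)


# Values as they appear in the search response.
players = [
    {"name": "John Doe", "rating": 2554, "age": 75},
    {"name": "Kwaku Mensah", "rating": 1967, "age": 10},
    {"name": "Nikki Wolf", "rating": 1864, "age": 22},
]

rating_pivot = geometric_mean([p["rating"] for p in players])
# Assumption: a negative-impact field is scored on 1 / value.
inverse_age_pivot = geometric_mean([1 / p["age"] for p in players])


def combined_score(player):
    """Sum of the two clause scores, as a bool/should query would add them."""
    return (saturation(player["rating"], rating_pivot)
            + saturation(1 / player["age"], inverse_age_pivot))


ranked = sorted(players, key=combined_score, reverse=True)
ranking = [p["name"] for p in ranked]
print(ranking)
```

Under these assumptions the youngest player ranks first even though the oldest player has the highest rating, because the age clause penalizes a large age more than the rating clause rewards the rating difference.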

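Rank features

The rank features field type is similar to the rank feature field type, but it is better suited to a sparse list of features. A rank features field can index numeric feature vectors that are later used to boost or decrease the relevance scores of documents in rank_feature queries.

Example

Create a mapping with a rank features field: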

PUT testindex1
{
  "mappings": {
    "properties": {
      "correlations": {
        "type": "rank_features" 
      }
    }
  }
}

To index a document with a rank features field, use a hashmap with string keys and positive float values:

PUT testindex1/_doc/1
{
  "correlations": { 
    "young kids" : 1,
    "older kids" : 15,
    "teens" : 25.9
  }
}

PUT testindex1/_doc/2
{
  "correlations": {
    "teens": 10,
    "adults": 95.7
  }
}
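
Because the values must be positive, it can help to validate a feature map on the client before sending it. The following Python sketch builds the JSON body for such a document. The function name build_rank_features_doc is hypothetical and is not part of any OpenSearch client. The check for strictly positive, finite numbers reflects the requirement stated above and is an assumption about exactly which values are rejected.

```python
import json
import math


def build_rank_features_doc(field, features):
    """Return a JSON body mapping `field` to a validated feature map.

    Hypothetical helper: checks that keys are strings and that values are
    finite, strictly positive numbers before serializing.
    """
    for key, value in features.items():
        if not isinstance(key, str):
            raise ValueError(f"feature name must be a string: {key!r}")
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"feature {key!r} must be a number: {value!r}")
        if not math.isfinite(value) or value <= 0:
            raise ValueError(f"feature {key!r} must be positive: {value!r}")
    return json.dumps({field: features})


body = build_rank_features_doc("correlations", {"teens": 10, "adults": 95.7})
print(body)
```

The returned string can be used as the body of the PUT requests shown above.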

Query the documents using a rank feature query:

GET testindex1/_search
{
  "query": {
    "rank_feature": {
      "field": "correlations.teens"
    }
  }
}

The response is ranked by relevance score:

{
  "took" : 123,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.6258503,
    "hits" : [
      {
        "_index" : "testindex1",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.6258503,
        "_source" : {
          "correlations" : {
            "young kids" : 1,
            "older kids" : 15,
            "teens" : 25.9
          }
        }
      },
      {
        "_index" : "testindex1",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.39263803,
        "_source" : {
          "correlations" : {
            "teens" : 10,
            "adults" : 95.7
          }
        }
      }
    ]
  }
}

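Rank feature and rank features fields use the top nine significant bits for precision, leading to about 0.4% relative error. Values are stored with a relative precision of 2^-8 = 0.00390625.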
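
The effect of this precision can be illustrated with a short Python sketch. It assumes the value is held as an IEEE 754 single-precision float whose low 15 bits are dropped, which leaves the sign, the 8 exponent bits, and the top 8 explicit mantissa bits (9 significant bits with the implicit leading bit). It also assumes saturation scoring, value / (value + pivot). The pivot 15.46875 does not come from this page; it was back-computed from the first score in the rank features response above. Function names are illustrative.

```python
import struct


def keep_top_nine_bits(value):
    """Truncate a float32 to 9 significant bits by clearing its low 15 bits."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    truncated = (bits >> 15) << 15
    return struct.unpack("<f", struct.pack("<I", truncated))[0]


def saturation(value, pivot):
    """Saturation function (assumed scoring function)."""
    return value / (value + pivot)


stored_teens_doc1 = keep_top_nine_bits(25.9)  # 25.875
stored_teens_doc2 = keep_top_nine_bits(10.0)  # 10.0, already representable
relative_error = (25.9 - stored_teens_doc1) / 25.9

PIVOT = 15.46875  # inferred from the response scores, not stated in the source
score_doc1 = saturation(stored_teens_doc1, PIVOT)
score_doc2 = saturation(stored_teens_doc2, PIVOT)

print(stored_teens_doc1, relative_error)
print(score_doc1, score_doc2)
```

With the truncated value 25.875 in place of 25.9, the two computed scores agree with the _score values 0.6258503 and 0.39263803 in the response to within rounding, and the relative error stays below 2^-8.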
