
Classic token filter

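The classic token filter is designed primarily to work with the classic tokenizer. It processes tokens by applying the following common transformations, which aid text analysis and search: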

  • Removes possessive endings such as 's. For example, John's becomes John.
  • Removes periods from acronyms. For example, D.A.R.P.A. becomes DARPA.
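The two transformations above can be approximated in a few lines of Python. This is only an illustrative sketch, not the filter's actual implementation: the function name `classic_filter_sketch` is hypothetical, and it assumes that each token arrives with a type label from the classic tokenizer (`<APOSTROPHE>` appears in the sample response on this page; `<ACRONYM>` is assumed to be the label for dotted acronyms).

```python
def classic_filter_sketch(token: str, token_type: str) -> str:
    """Approximate the two transformations for a single token.

    Hypothetical helper for illustration; the type labels are assumed
    to be supplied by the classic tokenizer.
    """
    # Possessive ending: drop a trailing 's (or 'S) from apostrophe tokens.
    if token_type == "<APOSTROPHE>" and token[-2:] in ("'s", "'S"):
        return token[:-2]
    # Acronym: remove every period.
    if token_type == "<ACRONYM>":
        return token.replace(".", "")
    # All other tokens pass through unchanged.
    return token


examples = [
    ("John's", "<APOSTROPHE>"),
    ("D.A.R.P.A.", "<ACRONYM>"),
    ("excellent", "<ALPHANUM>"),
]
results = [classic_filter_sketch(tok, typ) for tok, typ in examples]
print(results)  # ['John', 'DARPA', 'excellent']
```

The sketch keys on the token type rather than on the text alone, which is one reason the filter is normally paired with the classic tokenizer: a token that is not typed as an apostrophe or acronym token is left untouched.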

Example

The following example request creates a new index named custom_classic_filter and configures an analyzer with the classic filter:

PUT /custom_classic_filter
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_classic": {
          "type": "custom",
          "tokenizer": "classic",
          "filter": ["classic"]
        }
      }
    }
  }
}

Generated tokens

Use the following request to examine the tokens generated using the analyzer:

POST /custom_classic_filter/_analyze
{
  "analyzer": "custom_classic",
  "text": "John's co-operate was excellent."
}

The response contains the generated tokens:

{
  "tokens": [
    {
      "token": "John",
      "start_offset": 0,
      "end_offset": 6,
      "type": "<APOSTROPHE>",
      "position": 0
    },
    {
      "token": "co",
      "start_offset": 7,
      "end_offset": 9,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "operate",
      "start_offset": 10,
      "end_offset": 17,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "was",
      "start_offset": 18,
      "end_offset": 21,
      "type": "<ALPHANUM>",
      "position": 3
    },
    {
      "token": "excellent",
      "start_offset": 22,
      "end_offset": 31,
      "type": "<ALPHANUM>",
      "position": 4
    }
  ]
}
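One detail in this response is easy to miss: the token John still reports offsets 0 to 6, which cover the original John's in the input. The following sketch, which simply copies the token text and offsets from the response above into a hypothetical `tokens` list, slices the input string to show which source span each token maps back to.

```python
text = "John's co-operate was excellent."

# (token, start_offset, end_offset) copied from the response above.
tokens = [
    ("John", 0, 6),
    ("co", 7, 9),
    ("operate", 10, 17),
    ("was", 18, 21),
    ("excellent", 22, 31),
]

# Slice the original text with each token's offsets.
source_spans = [text[start:end] for _, start, end in tokens]
print(source_spans)  # ["John's", 'co', 'operate', 'was', 'excellent']
```

Only the first span differs from its token: the filter changed the token text but left the offsets pointing at the unmodified input, which matters for features such as highlighting that rely on those offsets.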
