Stop analyzer

The stop analyzer removes words that appear in a predefined list of stop words. It consists of a lowercase tokenizer and a stop token filter.
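
The two stages described above can be approximated in a few lines of Python. This is a minimal sketch for building intuition, not the OpenSearch implementation: the names `lowercase_tokenize`, `stop_analyze`, and `ENGLISH_STOP_WORDS` are hypothetical, and the stop word set is assumed to match the default English list shipped with Lucene, so check your OpenSearch version for the authoritative list.

```python
import re

# Assumption: mirrors the default `_english_` stop word list from Lucene.
ENGLISH_STOP_WORDS = frozenset({
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that",
    "the", "their", "then", "there", "these", "they", "this", "to", "was",
    "will", "with",
})


def lowercase_tokenize(text):
    """Approximate the lowercase tokenizer: split on non-letters, then lowercase."""
    # [^\W\d_]+ matches runs of letters only (word characters minus digits and "_").
    return [match.group().lower() for match in re.finditer(r"[^\W\d_]+", text)]


def stop_analyze(text, stopwords=ENGLISH_STOP_WORDS):
    """Approximate the stop analyzer: tokenize, then drop every stop word."""
    return [token for token in lowercase_tokenize(text) if token not in stopwords]


print(stop_analyze("The large turtle is green and brown"))
# ['large', 'turtle', 'green', 'brown']
```

The sketch returns only the token text. The real analyzer also records offsets and positions for each token.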

Parameters

You can configure the stop analyzer with the following parameters.

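Parameter	Required/Optional	Data type	Description
`stopwords`	Optional	String or list of strings	A string specifying a predefined stop word list (for example, `_english_`) or an array specifying a custom stop word list. Default is `_english_`.
`stopwords_path`	Optional	String	The path (absolute or relative to the config directory) to the file containing a list of stop words.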

Example

Use the following command to create an index named my_stop_index that uses the stop analyzer:

PUT /my_stop_index
{
  "mappings": {
    "properties": {
      "my_field": {
        "type": "text",
        "analyzer": "stop"
      }
    }
  }
}

Configuring a custom analyzer

Use the following command to create an index with a custom analyzer that is equivalent to the stop analyzer:

PUT /my_custom_stop_analyzer_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_stop_analyzer": {
          "tokenizer": "lowercase",
          "filter": [
            "stop"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "my_field": {
        "type": "text",
        "analyzer": "my_custom_stop_analyzer"
      }
    }
  }
}

Generated tokens

Use the following request to examine the tokens generated by the analyzer:

POST /my_custom_stop_analyzer_index/_analyze
{
  "analyzer": "my_custom_stop_analyzer",
  "text": "The large turtle is green and brown"
}

The response contains the generated tokens:

{
  "tokens": [
    {
      "token": "large",
      "start_offset": 4,
      "end_offset": 9,
      "type": "word",
      "position": 1
    },
    {
      "token": "turtle",
      "start_offset": 10,
      "end_offset": 16,
      "type": "word",
      "position": 2
    },
    {
      "token": "green",
      "start_offset": 20,
      "end_offset": 25,
      "type": "word",
      "position": 4
    },
    {
      "token": "brown",
      "start_offset": 30,
      "end_offset": 35,
      "type": "word",
      "position": 6
    }
  ]
}
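
Note that the positions in the response are 1, 2, 4, and 6. Positions are counted from 0, and the stop filter leaves a gap wherever it removes a token, so the missing positions 0, 3, and 5 correspond to "The", "is", and "and". The following sketch recovers those gaps from the response. The helper name `removed_positions` is hypothetical, and the offsets are omitted for brevity.

```python
# Token entries copied from the response above (offsets and type omitted).
response = {
    "tokens": [
        {"token": "large", "position": 1},
        {"token": "turtle", "position": 2},
        {"token": "green", "position": 4},
        {"token": "brown", "position": 6},
    ]
}


def removed_positions(tokens):
    """Return the positions up to the last kept token that no token occupies."""
    kept = {entry["position"] for entry in tokens}
    last = max(kept, default=-1)
    return [pos for pos in range(last + 1) if pos not in kept]


print(removed_positions(response["tokens"]))
# [0, 3, 5]
```

This approach cannot detect stop words removed after the last surviving token, because no later position remains to reveal the gap.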

Specifying stop words

The following example request specifies a custom list of stop words:

PUT /my_new_custom_stop_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_stop_analyzer": {
          "type": "stop",                     
          "stopwords": ["is", "and", "was"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "description": {
        "type": "text",
        "analyzer": "my_custom_stop_analyzer" 
      }
    }
  }
}
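
A custom list replaces the default `_english_` list rather than extending it, so with the settings above only "is", "and", and "was" are removed, and default stop words such as "the" are kept. The following sketch illustrates the effect on an already tokenized, lowercased sentence. The function name `apply_stop_filter` is hypothetical, and this is an illustration of the filtering step only, not OpenSearch code.

```python
def apply_stop_filter(tokens, stopwords):
    """Drop every token that appears in the given stop word list."""
    stop_set = set(stopwords)
    return [token for token in tokens if token not in stop_set]


# The example sentence after lowercase tokenization.
tokens = ["the", "large", "turtle", "is", "green", "and", "brown"]

custom = apply_stop_filter(tokens, ["is", "and", "was"])
print(custom)
# ['the', 'large', 'turtle', 'green', 'brown']
```

To keep the default English behavior and also remove extra words, the custom list would need to include the default words explicitly.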

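The following example request specifies the path to a file containing stop words. Because it reuses the index name my_new_custom_stop_index, delete the index created by the previous request first or choose a different index name: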

PUT /my_new_custom_stop_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_stop_analyzer": {
          "type": "stop",                     
          "stopwords_path": "stopwords.txt"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "description": {
        "type": "text",
        "analyzer": "my_custom_stop_analyzer" 
      }
    }
  }
}

In this example, the file is located in the config directory. You can also specify a full path to the file.
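
As a rough sketch, a stop word file can be generated as follows. This assumes the file is UTF-8 encoded plain text with one stop word per line. The helper name `write_stopwords_file` is hypothetical, and the example writes to a temporary directory because the location of the config directory depends on your installation.

```python
import tempfile
from pathlib import Path


def write_stopwords_file(path, words):
    """Write one stop word per line, UTF-8 encoded (assumed file format)."""
    path = Path(path)
    path.write_text("\n".join(words) + "\n", encoding="utf-8")
    return path


# Write to a temporary directory; in practice, target the config directory
# of each node (its location is installation specific).
with tempfile.TemporaryDirectory() as tmp:
    target = write_stopwords_file(Path(tmp) / "stopwords.txt", ["is", "and", "was"])
    content = target.read_text(encoding="utf-8")

print(content)
```

In a multi-node cluster, the file would presumably need to be present at the same path on every node that analyzes text with this analyzer.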
