Geotile grid aggregations
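The geotile grid aggregation groups documents into grid cells for geographical analysis. Each grid cell corresponds to a map tile and is identified using the {zoom}/{x}/{y} format.
You can use a geotile grid aggregation to aggregate documents on geopoint or geoshape fields. One notable difference is that a geopoint is present in only one bucket, whereas a geoshape is counted in every geotile grid cell that it intersects.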
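Precision
The precision parameter controls the level of granularity that determines the grid cell size. The lower the precision, the larger the grid cells.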
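To get a feel for how precision relates to cell size, the following Python sketch assumes that geotile cells follow the standard Web Mercator map tiling scheme, in which each zoom level splits the world into 2^zoom by 2^zoom tiles. The helper names and the equatorial circumference constant are illustrative assumptions and are not part of the OpenSearch API.

```python
# Rough tile dimensions per precision, assuming the standard Web Mercator
# tiling scheme (2**precision tiles along each axis). Illustrative only.

EQUATOR_KM = 40075.017  # approximate equatorial circumference of the Earth


def tiles_per_axis(precision: int) -> int:
    """Number of tiles along the x (or y) axis at the given precision."""
    if not 0 <= precision <= 29:
        raise ValueError("precision must be in the [0, 29] range")
    return 2 ** precision


def tile_width_km_at_equator(precision: int) -> float:
    """Approximate east-west width of one tile at the equator, in km."""
    return EQUATOR_KM / tiles_per_axis(precision)


if __name__ == "__main__":
    # Print precision, total tile count, and approximate tile width.
    for precision in (1, 6, 7):
        total_tiles = tiles_per_axis(precision) ** 2
        width = round(tile_width_km_at_equator(precision), 1)
        print(precision, total_tiles, width)
```

Each additional precision level halves the tile width, so the number of possible buckets grows by a factor of four per level.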
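The following example illustrates low-precision and high-precision aggregation requests.
To start, create an index and map the location field as a geo_point: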
PUT national_parks
{
"mappings": {
"properties": {
"location": {
"type": "geo_point"
}
}
}
}
Index the following documents into the sample index:
PUT national_parks/_doc/1
{
"name": "Yellowstone National Park",
"location": "44.42, -110.59"
}
PUT national_parks/_doc/2
{
"name": "Yosemite National Park",
"location": "37.87, -119.53"
}
PUT national_parks/_doc/3
{
"name": "Death Valley National Park",
"location": "36.53, -116.93"
}
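You can index geopoints in several formats. For a list of all supported formats, see the geopoint documentation.
Low-precision requests
Run a low-precision request that buckets all three documents together: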
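As an illustration of how the same location can be written in different geopoint formats, the following Python sketch normalizes four of them to a (latitude, longitude) pair. The to_lat_lon helper is hypothetical and exists only for this example; it does not handle geohashes, and it is not how OpenSearch parses the values internally.

```python
import re

# Matches WKT points such as "POINT (-110.59 44.42)" (longitude first).
WKT_POINT = re.compile(
    r"\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*",
    re.IGNORECASE,
)


def to_lat_lon(value):
    """Normalize a geopoint representation to a (lat, lon) tuple."""
    if isinstance(value, dict):           # {"lat": 44.42, "lon": -110.59}
        return float(value["lat"]), float(value["lon"])
    if isinstance(value, (list, tuple)):  # [lon, lat]
        lon, lat = value
        return float(lat), float(lon)
    if isinstance(value, str):
        match = WKT_POINT.fullmatch(value)
        if match:                         # "POINT (lon lat)"
            return float(match.group(2)), float(match.group(1))
        lat, lon = value.split(",")       # "lat, lon"
        return float(lat), float(lon)
    raise TypeError(f"unsupported geopoint value: {value!r}")


same_point = [
    "44.42, -110.59",
    {"lat": 44.42, "lon": -110.59},
    [-110.59, 44.42],
    "POINT (-110.59 44.42)",
]
for value in same_point:
    print(to_lat_lon(value))
```

Note the coordinate order: the string and object forms put latitude first, while the array and WKT forms put longitude first.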
GET national_parks/_search
{
"aggregations": {
"grouped": {
"geotile_grid": {
"field": "location",
"precision": 1
}
}
}
}
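You can use either the GET or POST HTTP method for geotile grid aggregation queries.
The response groups all documents together because they are close enough to fall into a single grid cell:
Response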
{
"took": 51,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1,
"hits": [
{
"_index": "national_parks",
"_id": "1",
"_score": 1,
"_source": {
"name": "Yellowstone National Park",
"location": "44.42, -110.59"
}
},
{
"_index": "national_parks",
"_id": "2",
"_score": 1,
"_source": {
"name": "Yosemite National Park",
"location": "37.87, -119.53"
}
},
{
"_index": "national_parks",
"_id": "3",
"_score": 1,
"_source": {
"name": "Death Valley National Park",
"location": "36.53, -116.93"
}
}
]
},
"aggregations": {
"grouped": {
"buckets": [
{
"key": "1/0/0",
"doc_count": 3
}
]
}
}
}
High-precision requests
Now run a high-precision request:
GET national_parks/_search
{
"aggregations": {
"grouped": {
"geotile_grid": {
"field": "location",
"precision": 6
}
}
}
}
Because of the finer granularity, each of the three documents falls into its own bucket:
Response
{
"took": 15,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1,
"hits": [
{
"_index": "national_parks",
"_id": "1",
"_score": 1,
"_source": {
"name": "Yellowstone National Park",
"location": "44.42, -110.59"
}
},
{
"_index": "national_parks",
"_id": "2",
"_score": 1,
"_source": {
"name": "Yosemite National Park",
"location": "37.87, -119.53"
}
},
{
"_index": "national_parks",
"_id": "3",
"_score": 1,
"_source": {
"name": "Death Valley National Park",
"location": "36.53, -116.93"
}
}
]
},
"aggregations": {
"grouped": {
"buckets": [
{
"key": "6/12/23",
"doc_count": 1
},
{
"key": "6/11/25",
"doc_count": 1
},
{
"key": "6/10/24",
"doc_count": 1
}
]
}
}
}
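The bucket keys in these responses can be reproduced offline. The following Python sketch assumes that the aggregation uses the standard Web Mercator (slippy map) tile formula; tile_key is a hypothetical helper written for this example and is not an OpenSearch function.

```python
import math


def tile_key(lat: float, lon: float, precision: int) -> str:
    """Return the {zoom}/{x}/{y} key of the tile containing a point.

    Assumes the standard Web Mercator tiling formula.
    """
    n = 2 ** precision  # number of tiles along each axis
    x = math.floor((lon + 180.0) / 360.0 * n)
    merc = math.asinh(math.tan(math.radians(lat)))
    y = math.floor((1.0 - merc / math.pi) / 2.0 * n)
    # Clamp so that points on the map edge stay in the valid tile range.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return f"{precision}/{x}/{y}"


parks = {
    "Yellowstone National Park": (44.42, -110.59),
    "Yosemite National Park": (37.87, -119.53),
    "Death Valley National Park": (36.53, -116.93),
}

for name, (lat, lon) in parks.items():
    print(name, tile_key(lat, lon, 1), tile_key(lat, lon, 6))
```

For the three sample documents, this yields one shared tile at precision 1 and three distinct tiles at precision 6, matching the keys returned in the preceding responses.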
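You can also restrict the geographical area by providing the coordinates of the bounding envelope in the bounds parameter. Both bounds and geo_bounding_box coordinates can be specified in any of the geopoint formats. The following query uses the well-known text (WKT) "POINT(longitude latitude)" format for the bounds parameter: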
GET national_parks/_search
{
"size": 0,
"aggregations": {
"grouped": {
"geotile_grid": {
"field": "location",
"precision": 6,
"bounds": {
"top_left": "POINT (-120 38)",
"bottom_right": "POINT (-116 36)"
}
}
}
}
}
The aggregation in the response contains only the two buckets that fall within the specified bounds:
Response
{
"took": 48,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": null,
"hits": []
},
"aggregations": {
"grouped": {
"buckets": [
{
"key": "6/11/25",
"doc_count": 1
},
{
"key": "6/10/24",
"doc_count": 1
}
]
}
}
}
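As a rough mental model of the bounds parameter for point data, the following Python sketch keeps only the points that lie inside the box. It is a simplification under stated assumptions: in_bounds is a hypothetical helper, the box does not cross the antimeridian, and the sketch does not reproduce how OpenSearch treats cells that straddle the edge of the box.

```python
def in_bounds(lat, lon, top_left, bottom_right):
    """Check whether a point lies inside a bounding box.

    top_left and bottom_right are (lon, lat) pairs, matching the
    WKT "POINT (longitude latitude)" coordinate order.
    """
    left, top = top_left
    right, bottom = bottom_right
    return left <= lon <= right and bottom <= lat <= top


parks = {
    "Yellowstone National Park": (44.42, -110.59),
    "Yosemite National Park": (37.87, -119.53),
    "Death Valley National Park": (36.53, -116.93),
}

# Same corners as the bounds parameter in the query above.
top_left = (-120.0, 38.0)
bottom_right = (-116.0, 36.0)

inside = [
    name
    for name, (lat, lon) in parks.items()
    if in_bounds(lat, lon, top_left, bottom_right)
]
print(inside)
```

Yellowstone lies north of the box, which is consistent with its bucket being absent from the aggregation result.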
The bounds parameter can be used with or without the geo_bounding_box filter; these two parameters are independent and can have any spatial relationship to each other.
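To illustrate that independence, the following Python sketch builds a request body that applies a geo_bounding_box filter in the query and a different, smaller box in the aggregation's bounds parameter. The filter coordinates are arbitrary values chosen for this example, and the script only prints the JSON body; it does not send the request.

```python
import json

# Query filter and aggregation bounds are specified separately.
# The geo_bounding_box corners below are illustrative values only.
body = {
    "size": 0,
    "query": {
        "bool": {
            "filter": {
                "geo_bounding_box": {
                    "location": {
                        "top_left": {"lat": 46.0, "lon": -121.0},
                        "bottom_right": {"lat": 36.0, "lon": -109.0},
                    }
                }
            }
        }
    },
    "aggregations": {
        "grouped": {
            "geotile_grid": {
                "field": "location",
                "precision": 6,
                "bounds": {
                    "top_left": "POINT (-120 38)",
                    "bottom_right": "POINT (-116 36)",
                },
            }
        }
    },
}

# Print the JSON body to send with GET national_parks/_search.
print(json.dumps(body, indent=2))
```

In this sketch, the query filter decides which documents are considered at all, and bounds then limits which grid cells are returned as buckets.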
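Aggregating geoshapes
To run an aggregation on a geoshape field, first create an index and map the location field as a geo_shape. If the national_parks index from the preceding examples still exists, delete it first or use a different index name, because the mapping of an existing field cannot be changed from geo_point to geo_shape: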
PUT national_parks
{
"mappings": {
"properties": {
"location": {
"type": "geo_shape"
}
}
}
}
Next, index some documents into the national_parks index:
PUT national_parks/_doc/1
{
"name": "Yellowstone National Park",
"location":
{"type": "envelope","coordinates": [ [-111.15, 45.12], [-109.83, 44.12] ]}
}
PUT national_parks/_doc/2
{
"name": "Yosemite National Park",
"location":
{"type": "envelope","coordinates": [ [-120.23, 38.16], [-119.05, 37.45] ]}
}
PUT national_parks/_doc/3
{
"name": "Death Valley National Park",
"location":
{"type": "envelope","coordinates": [ [-117.34, 37.01], [-116.38, 36.25] ]}
}
You can run an aggregation on the location field as follows:
GET national_parks/_search
{
"aggregations": {
"grouped": {
"geotile_grid": {
"field": "location",
"precision": 6
}
}
}
}
When aggregating geoshapes, one geoshape can be counted in multiple buckets because it overlaps multiple grid cells:
Response
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "national_parks",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"name" : "Yellowstone National Park",
"location" : {
"type" : "envelope",
"coordinates" : [
[
-111.15,
45.12
],
[
-109.83,
44.12
]
]
}
}
},
{
"_index" : "national_parks",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "Yosemite National Park",
"location" : {
"type" : "envelope",
"coordinates" : [
[
-120.23,
38.16
],
[
-119.05,
37.45
]
]
}
}
},
{
"_index" : "national_parks",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"name" : "Death Valley National Park",
"location" : {
"type" : "envelope",
"coordinates" : [
[
-117.34,
37.01
],
[
-116.38,
36.25
]
]
}
}
}
]
},
"aggregations" : {
"grouped" : {
"buckets" : [
{
"key" : "6/12/23",
"doc_count" : 1
},
{
"key" : "6/12/22",
"doc_count" : 1
},
{
"key" : "6/11/25",
"doc_count" : 1
},
{
"key" : "6/11/24",
"doc_count" : 1
},
{
"key" : "6/10/24",
"doc_count" : 1
}
]
}
}
}
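The following Python sketch shows why some envelopes land in two buckets. It assumes the standard Web Mercator tile formula and simply enumerates every tile between the tiles containing the envelope's two corners. The helpers tile_xy and envelope_tiles are hypothetical, and the handling of shapes that exactly touch a tile edge is not modeled.

```python
import math


def tile_xy(lat, lon, precision):
    """Tile column and row containing a point (Web Mercator tiling)."""
    n = 2 ** precision
    x = math.floor((lon + 180.0) / 360.0 * n)
    merc = math.asinh(math.tan(math.radians(lat)))
    y = math.floor((1.0 - merc / math.pi) / 2.0 * n)
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def envelope_tiles(top_left, bottom_right, precision):
    """Keys of all tiles overlapped by an envelope.

    Corners are [lon, lat] pairs, as in the envelope coordinates.
    """
    x_min, y_min = tile_xy(top_left[1], top_left[0], precision)
    x_max, y_max = tile_xy(bottom_right[1], bottom_right[0], precision)
    return {
        f"{precision}/{x}/{y}"
        for x in range(x_min, x_max + 1)
        for y in range(y_min, y_max + 1)
    }


envelopes = {
    "Yellowstone National Park": ([-111.15, 45.12], [-109.83, 44.12]),
    "Yosemite National Park": ([-120.23, 38.16], [-119.05, 37.45]),
    "Death Valley National Park": ([-117.34, 37.01], [-116.38, 36.25]),
}

for name, (top_left, bottom_right) in envelopes.items():
    print(name, sorted(envelope_tiles(top_left, bottom_right, 6)))
```

Under these assumptions, the Yellowstone and Death Valley envelopes each span two vertically adjacent tiles, while the Yosemite envelope fits in one, which accounts for the five buckets in the response.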
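Currently, OpenSearch supports geoshape aggregation through the API but not in OpenSearch Dashboards visualizations. If you'd like to see geoshape aggregation implemented for visualizations, upvote the related GitHub issue.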
Supported parameters
Geotile grid aggregation requests support the following parameters.
Parameter | Data type | Description |
---|---|---|
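field | String | The field that contains the geopoints or geoshapes to aggregate. This field must be mapped as a geo_point or geo_shape field. If the field contains an array, all array values are aggregated. Required. |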
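precision | Integer | The level of granularity that determines the grid cell size used to bucket results. Cells cannot exceed the size (diagonal) specified by the required precision. Valid values are in the [0, 29] range. Optional. Default is 7. |
bounds | Object | The bounding box for filtering geopoints. The bounding box is defined by its upper-left and lower-right vertices. Each vertex is specified as a geopoint in one of the following formats: an object with a latitude and longitude; an array in the [longitude, latitude] format; a string in the "latitude,longitude" format; a geohash; or WKT. For formatting examples, see the geopoint formats. Optional. |
size | Integer | The maximum number of buckets to return. When there are more buckets than size, OpenSearch returns the buckets that contain the most documents. Optional. Default is 10,000. |
shard_size | Integer | The maximum number of buckets to return from each shard. Optional. Default is max(10, size · number of shards), which provides a more accurate count of the higher-priority buckets. |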
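The parameters in this table can be assembled programmatically. The following Python sketch is one possible way to do so: build_geotile_grid and default_shard_size are hypothetical helpers that apply the defaults and the precision range listed in the table. They are not part of any OpenSearch client library.

```python
def default_shard_size(size: int, number_of_shards: int) -> int:
    """Default shard_size as described in the parameter table."""
    return max(10, size * number_of_shards)


def build_geotile_grid(field, precision=7, bounds=None, size=10000,
                       shard_size=None):
    """Build the geotile_grid clause of an aggregation request."""
    if not 0 <= precision <= 29:
        raise ValueError("precision must be in the [0, 29] range")
    clause = {"field": field, "precision": precision, "size": size}
    # Optional parameters are included only when explicitly provided.
    if bounds is not None:
        clause["bounds"] = bounds
    if shard_size is not None:
        clause["shard_size"] = shard_size
    return {"geotile_grid": clause}


request_body = {
    "size": 0,
    "aggregations": {
        "grouped": build_geotile_grid(
            "location",
            precision=6,
            bounds={
                "top_left": "POINT (-120 38)",
                "bottom_right": "POINT (-116 36)",
            },
        )
    },
}
print(request_body)
```

Validating precision on the client side is a design choice in this sketch; it surfaces an out-of-range value before a request is sent.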