复杂查询
除了简单的 SFW(SELECT-FROM-WHERE)查询外,SQL 插件还支持子查询、连接、联合和减法等复杂查询。这些查询在一个以上的 OpenSearch 索引上操作。要了解这些查询在后台如何执行,请使用 explain 操作。
连接
OpenSearch SQL 支持内连接、交叉连接和左外连接。
约束
连接有许多约束
- 您只能连接两个索引。
- 您必须为索引使用别名(例如,
people p)。 - 在 ON 子句中,您只能使用 AND 条件。
-
在 WHERE 语句中,不要组合包含多个索引的树。例如,以下语句有效:
WHERE (a.type1 > 3 OR a.type1 < 0) AND (b.type2 > 4 OR b.type2 < -1)以下语句无效:
WHERE (a.type1 > 3 OR b.type2 < 0) AND (a.type1 > 4 OR b.type2 < -1) - 您不能对结果使用 GROUP BY 或 ORDER BY。
- 不支持带 OFFSET 的 LIMIT(例如
LIMIT 25 OFFSET 25)。
描述
JOIN 子句使用每个索引共有的值组合一个或多个索引的列。
语法
规则 tableSource

规则 joinPart

示例 1:内连接
内连接根据您的连接谓词组合两个索引的列,创建新的结果集。它迭代这两个索引并比较每个文档以找到满足连接谓词的文档。您可以选择在 JOIN 子句之前加上 INNER 关键字。
连接谓词由 ON 子句指定。
SQL 查询
SELECT
a.account_number, a.firstname, a.lastname,
e.id, e.name
FROM accounts a
JOIN employees_nested e
ON a.account_number = e.id
解释
explain 输出很复杂,因为 JOIN 子句与在单独的查询规划器框架中执行的两个 OpenSearch DSL 查询相关联。您可以通过检查 Physical Plan 和 Logical Plan 对象来解释它。
{
"Physical Plan" : {
"Project [ columns=[a.account_number, a.firstname, a.lastname, e.name, e.id] ]" : {
"Top [ count=200 ]" : {
"BlockHashJoin[ conditions=( a.account_number = e.id ), type=JOIN, blockSize=[FixedBlockSize with size=10000] ]" : {
"Scroll [ employees_nested as e, pageSize=10000 ]" : {
"request" : {
"size" : 200,
"from" : 0,
"_source" : {
"excludes" : [ ],
"includes" : [
"id",
"name"
]
}
}
},
"Scroll [ accounts as a, pageSize=10000 ]" : {
"request" : {
"size" : 200,
"from" : 0,
"_source" : {
"excludes" : [ ],
"includes" : [
"account_number",
"firstname",
"lastname"
]
}
}
},
"useTermsFilterOptimization" : false
}
}
}
},
"description" : "Hash Join algorithm builds hash table based on result of first query, and then probes hash table to find matched rows for each row returned by second query",
"Logical Plan" : {
"Project [ columns=[a.account_number, a.firstname, a.lastname, e.name, e.id] ]" : {
"Top [ count=200 ]" : {
"Join [ conditions=( a.account_number = e.id ) type=JOIN ]" : {
"Group" : [
{
"Project [ columns=[a.account_number, a.firstname, a.lastname] ]" : {
"TableScan" : {
"tableAlias" : "a",
"tableName" : "accounts"
}
}
},
{
"Project [ columns=[e.name, e.id] ]" : {
"TableScan" : {
"tableAlias" : "e",
"tableName" : "employees_nested"
}
}
}
]
}
}
}
}
}
结果集
| a.account_number | a.firstname | a.lastname | e.id | e.name |
|---|---|---|---|---|
| 6 | Hattie | Bond | 6 | Jane Smith |
示例 2:交叉连接
交叉连接,也称为笛卡尔连接,将第一个索引中的每个文档与第二个索引中的每个文档组合。结果集是两个索引文档的笛卡尔积。此操作类似于没有指定连接条件的 ON 子句的内连接。
对大型或中型索引执行交叉连接存在风险。它可能会触发断路器,终止查询以避免内存不足。
SQL 查询
SELECT
a.account_number, a.firstname, a.lastname,
e.id, e.name
FROM accounts a
JOIN employees_nested e
结果集
| a.account_number | a.firstname | a.lastname | e.id | e.name |
|---|---|---|---|---|
| 1 | Amber | Duke | 3 | Bob Smith |
| 1 | Amber | Duke | 4 | Susan Smith |
| 1 | Amber | Duke | 6 | Jane Smith |
| 6 | Hattie | Bond | 3 | Bob Smith |
| 6 | Hattie | Bond | 4 | Susan Smith |
| 6 | Hattie | Bond | 6 | Jane Smith |
| 13 | Nanette | Bates | 3 | Bob Smith |
| 13 | Nanette | Bates | 4 | Susan Smith |
| 13 | Nanette | Bates | 6 | Jane Smith |
| 18 | Dale | Adams | 3 | Bob Smith |
| 18 | Dale | Adams | 4 | Susan Smith |
| 18 | Dale | Adams | 6 | Jane Smith |
示例 3:左外连接
如果第一个索引不满足连接谓词,则使用左外连接保留其行。OUTER 关键字是可选的。
SQL 查询
SELECT
a.account_number, a.firstname, a.lastname,
e.id, e.name
FROM accounts a
LEFT JOIN employees_nested e
ON a.account_number = e.id
结果集
| a.account_number | a.firstname | a.lastname | e.id | e.name |
|---|---|---|---|---|
| 1 | Amber | Duke | null | null |
| 6 | Hattie | Bond | 6 | Jane Smith |
| 13 | Nanette | Bates | null | null |
| 18 | Dale | Adams | null | null |
子查询
子查询是用于另一个语句内并用括号括起来的完整 SELECT 语句。从解释输出中,您可以看到某些子查询实际上被转换为等效的连接查询以执行。
示例 1:表子查询
SQL 查询
SELECT a1.firstname, a1.lastname, a1.balance
FROM accounts a1
WHERE a1.account_number IN (
SELECT a2.account_number
FROM accounts a2
WHERE a2.balance > 10000
)
解释
{
"Physical Plan" : {
"Project [ columns=[a1.balance, a1.firstname, a1.lastname] ]" : {
"Top [ count=200 ]" : {
"BlockHashJoin[ conditions=( a1.account_number = a2.account_number ), type=JOIN, blockSize=[FixedBlockSize with size=10000] ]" : {
"Scroll [ accounts as a2, pageSize=10000 ]" : {
"request" : {
"size" : 200,
"query" : {
"bool" : {
"filter" : [
{
"bool" : {
"adjust_pure_negative" : true,
"must" : [
{
"bool" : {
"adjust_pure_negative" : true,
"must" : [
{
"bool" : {
"adjust_pure_negative" : true,
"must_not" : [
{
"bool" : {
"adjust_pure_negative" : true,
"must_not" : [
{
"exists" : {
"field" : "account_number",
"boost" : 1
}
}
],
"boost" : 1
}
}
],
"boost" : 1
}
},
{
"range" : {
"balance" : {
"include_lower" : false,
"include_upper" : true,
"from" : 10000,
"boost" : 1,
"to" : null
}
}
}
],
"boost" : 1
}
}
],
"boost" : 1
}
}
],
"adjust_pure_negative" : true,
"boost" : 1
}
},
"from" : 0
}
},
"Scroll [ accounts as a1, pageSize=10000 ]" : {
"request" : {
"size" : 200,
"from" : 0,
"_source" : {
"excludes" : [ ],
"includes" : [
"firstname",
"lastname",
"balance",
"account_number"
]
}
}
},
"useTermsFilterOptimization" : false
}
}
}
},
"description" : "Hash Join algorithm builds hash table based on result of first query, and then probes hash table to find matched rows for each row returned by second query",
"Logical Plan" : {
"Project [ columns=[a1.balance, a1.firstname, a1.lastname] ]" : {
"Top [ count=200 ]" : {
"Join [ conditions=( a1.account_number = a2.account_number ) type=JOIN ]" : {
"Group" : [
{
"Project [ columns=[a1.balance, a1.firstname, a1.lastname, a1.account_number] ]" : {
"TableScan" : {
"tableAlias" : "a1",
"tableName" : "accounts"
}
}
},
{
"Project [ columns=[a2.account_number] ]" : {
"Filter [ conditions=[AND ( AND account_number ISN null, AND balance GT 10000 ) ] ]" : {
"TableScan" : {
"tableAlias" : "a2",
"tableName" : "accounts"
}
}
}
}
]
}
}
}
}
}
结果集
| a1.firstname | a1.lastname | a1.balance |
|---|---|---|
| Amber | Duke | 39225 |
| Nanette | Bates | 32838 |
示例 2:From 子查询
SQL 查询
SELECT a.f, a.l, a.a
FROM (
SELECT firstname AS f, lastname AS l, age AS a
FROM accounts
WHERE age > 30
) AS a
解释
{
"from" : 0,
"size" : 200,
"query" : {
"bool" : {
"filter" : [
{
"bool" : {
"must" : [
{
"range" : {
"age" : {
"from" : 30,
"to" : null,
"include_lower" : false,
"include_upper" : true,
"boost" : 1.0
}
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
},
"_source" : {
"includes" : [
"firstname",
"lastname",
"age"
],
"excludes" : [ ]
}
}
结果集
| f | l | a |
|---|---|---|
| Amber | Duke | 32 |
| Dale | Adams | 33 |
| Hattie | Bond | 36 |