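The OpenSearch high-level Python client (opensearch-dsl-py) will be deprecated after version 2.1.0. We recommend switching to the Python client (opensearch-py), which now includes the functionality of opensearch-dsl-py.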
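High-level Python client
The OpenSearch high-level Python client (opensearch-dsl-py) provides wrapper classes for common OpenSearch entities, such as documents, so you can work with them as Python objects. The high-level client also simplifies writing queries and supplies convenient Python methods for common OpenSearch operations. It supports creating and indexing documents, searching with and without filters, and updating documents using queries.
This getting started guide shows how to connect to OpenSearch, index documents, and run queries. For the client source code, see the opensearch-dsl-py repository.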
Setup
To add the client to your project, install it using pip:
pip install opensearch-dsl
After installing the client, you can import it like any other module:
from opensearchpy import OpenSearch
from opensearch_dsl import Search
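Connecting to OpenSearch
If you are using the Security plugin, create a client object with SSL enabled to connect to the default OpenSearch host. You can use the default credentials for testing purposes: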
host = 'localhost'
port = 9200
auth = ('admin', 'admin') # For testing only. Don't store credentials in code.
ca_certs_path = '/full/path/to/root-ca.pem' # Provide a CA bundle if you use intermediate CAs with your root CA.
# Create the client with SSL/TLS enabled, but hostname verification disabled.
client = OpenSearch(
    hosts = [{'host': host, 'port': port}],
    http_compress = True, # enables gzip compression for request bodies
    http_auth = auth,
    use_ssl = True,
    verify_certs = True,
    ssl_assert_hostname = False,
    ssl_show_warn = False,
    ca_certs = ca_certs_path
)
If you have your own client certificates, specify them in the client_cert_path and client_key_path parameters:
host = 'localhost'
port = 9200
auth = ('admin', 'admin') # For testing only. Don't store credentials in code.
ca_certs_path = '/full/path/to/root-ca.pem' # Provide a CA bundle if you use intermediate CAs with your root CA.
# Optional client certificates if you don't want to use HTTP basic authentication.
client_cert_path = '/full/path/to/client.pem'
client_key_path = '/full/path/to/client-key.pem'
# Create the client with SSL/TLS enabled, but hostname verification disabled.
client = OpenSearch(
    hosts = [{'host': host, 'port': port}],
    http_compress = True, # enables gzip compression for request bodies
    http_auth = auth,
    client_cert = client_cert_path,
    client_key = client_key_path,
    use_ssl = True,
    verify_certs = True,
    ssl_assert_hostname = False,
    ssl_show_warn = False,
    ca_certs = ca_certs_path
)
If you are not using the Security plugin, create a client object with SSL disabled:
host = 'localhost'
port = 9200
# Create the client with SSL/TLS and hostname verification disabled.
client = OpenSearch(
    hosts = [{'host': host, 'port': port}],
    http_compress = True, # enables gzip compression for request bodies
    use_ssl = False,
    verify_certs = False,
    ssl_assert_hostname = False,
    ssl_show_warn = False
)
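The code comments above advise against storing credentials in code. One possible way to follow that advice is to read the connection settings from environment variables and build the keyword arguments for OpenSearch() from them. The sketch below is only an illustration: the variable names OPENSEARCH_HOST, OPENSEARCH_PORT, OPENSEARCH_USER, OPENSEARCH_PASSWORD, and OPENSEARCH_CA_CERTS, as well as the helper build_client_kwargs, are assumptions made for this example and are not part of the client.

```python
import os


def build_client_kwargs(env=None):
    """Build keyword arguments for OpenSearch() from environment-style settings.

    All variable names read here are hypothetical; adapt them to your deployment.
    """
    env = os.environ if env is None else env
    kwargs = {
        'hosts': [{
            'host': env.get('OPENSEARCH_HOST', 'localhost'),
            'port': int(env.get('OPENSEARCH_PORT', '9200')),
        }],
        'http_compress': True,  # gzip-compress request bodies
    }
    user = env.get('OPENSEARCH_USER')
    password = env.get('OPENSEARCH_PASSWORD')
    if user and password:
        # Credentials present: assume the Security plugin is in use and enable TLS.
        kwargs.update(
            http_auth=(user, password),
            use_ssl=True,
            verify_certs=True,
            ssl_assert_hostname=False,
            ssl_show_warn=False,
        )
        ca_certs = env.get('OPENSEARCH_CA_CERTS')
        if ca_certs:
            kwargs['ca_certs'] = ca_certs
    else:
        # No credentials: assume the Security plugin is not in use.
        kwargs.update(use_ssl=False, verify_certs=False)
    return kwargs


# Explicit dictionaries stand in for os.environ so the example runs anywhere.
secure = build_client_kwargs({
    'OPENSEARCH_USER': 'admin',
    'OPENSEARCH_PASSWORD': 'admin',
    'OPENSEARCH_CA_CERTS': '/full/path/to/root-ca.pem',
})
insecure = build_client_kwargs({})
print(secure['use_ssl'], insecure['use_ssl'])
```

With the client installed, the result could then be passed on as client = OpenSearch(**build_client_kwargs()).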
Creating an index
To create an OpenSearch index, use the client.indices.create() method. You can use the following code to construct a JSON object with custom settings:
index_name = 'my-dsl-index'
index_body = {
    'settings': {
        'index': {
            'number_of_shards': 4
        }
    }
}
response = client.indices.create(index_name, body=index_body)
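Creating an index that already exists fails with an error, so rerunning the snippet above against the same cluster will raise an exception. Below is a minimal sketch of one way to guard against that, assuming the client exposes indices.exists() alongside indices.create(), as opensearch-py does. The helper name create_index_if_missing is made up for this example.

```python
def create_index_if_missing(client, index_name, index_body):
    """Create the index only when it does not exist yet.

    Returns the create response, or None when the index was already present.
    Note: another process could still create the index between the two calls.
    """
    if client.indices.exists(index=index_name):
        return None
    return client.indices.create(index=index_name, body=index_body)
```

Assuming a connected client, it could be called as response = create_index_if_missing(client, index_name, index_body).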
Indexing a document
To represent the documents that you'll index in OpenSearch, create a class that extends the Document class:
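from opensearch_dsl import Document, Text, Keyword
class Movie(Document):
    # 'raw' is a keyword subfield of title for exact matching and sorting.
    title = Text(fields={'raw': Keyword()})
    director = Text()
    year = Text()
    class Index:
        # The index that Movie documents are stored in.
        name = index_name
    def save(self, **kwargs):
        return super(Movie, self).save(**kwargs)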
To index a document, create an object of the new class and call its save() method:
# Set up the opensearch-py version of the document
Movie.init(using=client)
doc = Movie(meta={'id': 1}, title='Moneyball', director='Bennett Miller', year='2011')
response = doc.save(using=client)
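Performing bulk operations
You can perform several operations at the same time by using the bulk() method of the client. The operations may be of the same type or of different types. Note that the operations must be separated by \n and the entire string must be written on a single line: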
movies = '{ "index" : { "_index" : "my-dsl-index", "_id" : "2" } } \n { "title" : "Interstellar", "director" : "Christopher Nolan", "year" : "2014"} \n { "create" : { "_index" : "my-dsl-index", "_id" : "3" } } \n { "title" : "Star Trek Beyond", "director" : "Justin Lin", "year" : "2015"} \n { "update" : {"_id" : "3", "_index" : "my-dsl-index" } } \n { "doc" : {"year" : "2016"} }'
client.bulk(movies)
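Writing the bulk payload by hand as one long string is error-prone. As an illustration only, the same payload can be assembled from Python dictionaries with the standard json module, one JSON object per line. The helper build_bulk_body below is a hypothetical name, not a client method.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, source) pairs into a newline-delimited bulk body.

    `source` may be None for actions that carry no second line (for example, delete).
    """
    lines = []
    for action, source in operations:
        lines.append(json.dumps(action))
        if source is not None:
            lines.append(json.dumps(source))
    # The bulk API expects a newline after every line, including the last one.
    return '\n'.join(lines) + '\n'


movies = build_bulk_body([
    ({'index': {'_index': 'my-dsl-index', '_id': '2'}},
     {'title': 'Interstellar', 'director': 'Christopher Nolan', 'year': '2014'}),
    ({'create': {'_index': 'my-dsl-index', '_id': '3'}},
     {'title': 'Star Trek Beyond', 'director': 'Justin Lin', 'year': '2015'}),
    ({'update': {'_index': 'my-dsl-index', '_id': '3'}},
     {'doc': {'year': '2016'}}),
])
print(movies)
```

The resulting string can be passed to client.bulk(movies) in place of the handwritten one.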
Searching for documents
You can use the Search class to construct a query. The following code creates a Boolean query with a filter:
s = Search(using=client, index=index_name) \
    .filter("term", year="2011") \
    .query("match", title="Moneyball")
response = s.execute()
The preceding query is equivalent to the following query in the OpenSearch domain-specific language (DSL):
GET my-dsl-index/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "title": "Moneyball"
        }
      },
      "filter": {
        "term": {
          "year": "2011"
        }
      }
    }
  }
}
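To make the correspondence concrete, the same request body can be written as a plain Python dictionary. This sketch does not use the Search class. The helper bool_query is a hypothetical name, and the resulting dictionary is the kind of body that the low-level client.search() method accepts.

```python
import json


def bool_query(must, filter_clause):
    """Return a search body with a bool query made of one must and one filter clause."""
    return {
        'query': {
            'bool': {
                'must': must,
                'filter': filter_clause,
            }
        }
    }


body = bool_query(
    must={'match': {'title': 'Moneyball'}},
    # year is passed as a string, as in the Search example.
    filter_clause={'term': {'year': '2011'}},
)
print(json.dumps(body, indent=2))
```

Assuming a connected client, the body could be sent with client.search(index='my-dsl-index', body=body). For a query built with the Search class, s.to_dict() returns the body that will be sent, which is a convenient way to check it.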
Deleting a document
You can delete a document using the client.delete() method:
response = client.delete(
    index = 'my-dsl-index',
    id = '1'
)
Deleting an index
You can delete an index using the client.indices.delete() method:
response = client.indices.delete(
    index = 'my-dsl-index'
)
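Sample program
The following sample program creates a client, adds an index with non-default settings, inserts a document, performs bulk operations, searches for the document, deletes the document, and then deletes the index: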
from opensearchpy import OpenSearch
from opensearch_dsl import Search, Document, Text, Keyword
host = 'localhost'
port = 9200
auth = ('admin', 'admin') # For testing only. Don't store credentials in code.
ca_certs_path = 'root-ca.pem'
# Create the client with SSL/TLS disabled (no Security plugin).
# For a secured cluster, set use_ssl and verify_certs to True
# and uncomment http_auth and ca_certs.
client = OpenSearch(
    hosts=[{'host': host, 'port': port}],
    http_compress=True, # enables gzip compression for request bodies
    # http_auth=auth,
    use_ssl=False,
    verify_certs=False,
    ssl_assert_hostname=False,
    ssl_show_warn=False,
    # ca_certs=ca_certs_path
)
index_name = 'my-dsl-index'
index_body = {
    'settings': {
        'index': {
            'number_of_shards': 4
        }
    }
}
response = client.indices.create(index_name, body=index_body)
print('\nCreating index:')
print(response)
# Create the structure of the document
class Movie(Document):
    title = Text(fields={'raw': Keyword()})
    director = Text()
    year = Text()
    class Index:
        name = index_name
    def save(self, **kwargs):
        return super(Movie, self).save(**kwargs)
# Set up the opensearch-py version of the document
Movie.init(using=client)
doc = Movie(meta={'id': 1}, title='Moneyball', director='Bennett Miller', year='2011')
response = doc.save(using=client)
print('\nAdding document:')
print(response)
# Perform bulk operations
movies = '{ "index" : { "_index" : "my-dsl-index", "_id" : "2" } } \n { "title" : "Interstellar", "director" : "Christopher Nolan", "year" : "2014"} \n { "create" : { "_index" : "my-dsl-index", "_id" : "3" } } \n { "title" : "Star Trek Beyond", "director" : "Justin Lin", "year" : "2015"} \n { "update" : {"_id" : "3", "_index" : "my-dsl-index" } } \n { "doc" : {"year" : "2016"} }'
client.bulk(movies)
# Refresh the index so that the newly indexed documents are visible to search.
client.indices.refresh(index=index_name)
# Search for the document.
s = Search(using=client, index=index_name) \
    .filter('term', year='2011') \
    .query('match', title='Moneyball')
response = s.execute()
print('\nSearch results:')
for hit in response:
    print(hit.meta.score, hit.title)
# Delete the document.
response = client.delete(index=index_name, id='1')
print('\nDeleting document:')
print(response)
# Delete the index.
response = client.indices.delete(
    index = index_name
)
print('\nDeleting index:')
print(response)