
Elasticsearch Tutorial Series (2): Commands, Part 2: Batch Processing, Data Operations, and Search

2019-03-19  抹布先生M

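Articles in this series so far:
Elasticsearch Tutorial Series (0): Getting Started and Installation
Elasticsearch Tutorial Series (1): Commands, Part 1: Cluster Health, Indices, and Document Operations
Elasticsearch Tutorial Series (2): Commands, Part 2: Batch Processing, Data Operations, and Search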

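The previous article covered checking how an Elasticsearch cluster is running, along with CRUD operations on indices and documents. Now let's move on to some new commands.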

Batch Processing

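Besides indexing, updating, and deleting individual documents, Elasticsearch can run any of these operations in batches through the _bulk API. This matters because it executes many operations as quickly as possible with as few network round trips as possible.
As a simple example, the following call indexes two documents (ID 1 - John Doe and ID 2 - Jane Doe) in one bulk operation: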

# The bulk body is newline-delimited JSON: each action and each source document sits on its own line
[builder@master ~]$ curl -X POST "localhost:9200/customer/_doc/_bulk?pretty" -H 'Content-Type: application/json' -d '
{"index":{"_id":"1"}}
{"name": "John Doe"}
{"index":{"_id":"2"}}
{"name": "Jane Doe"}
'
{
    "took": 19,
    "errors": false,
    "items": [
    {
        "index":
        {
            "_index": "customer",
            "_type": "_doc",
            "_id": "1",
            "_version": 6,
            "result": "updated",
            "_shards":
            {
                "total": 2,
                "successful": 1,
                "failed": 0
            },
            "_seq_no": 5,
            "_primary_term": 2,
            "status": 200
        }
    },
    {
        "index":
        {
            "_index": "customer",
            "_type": "_doc",
            "_id": "2",
            "_version": 1,
            "result": "created",
            "_shards":
            {
                "total": 2,
                "successful": 1,
                "failed": 0
            },
            "_seq_no": 0,
            "_primary_term": 2,
            "status": 201
        }
    }]
}
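
For readers who generate bulk requests from a script, here is a minimal Python sketch of how such a body could be assembled. It assumes the newline-delimited layout the _bulk endpoint expects (an action line, then an optional source line, with a trailing newline). build_bulk_body is a hypothetical helper name used for illustration, not part of any Elasticsearch client.

```python
import json


def build_bulk_body(actions):
    """Serialize (action, doc_id, source) tuples into a _bulk request body.

    source is None for actions that take no source line, such as delete.
    """
    lines = []
    for action, doc_id, source in actions:
        # Action/metadata line, e.g. {"index":{"_id":"1"}}
        lines.append(json.dumps({action: {"_id": doc_id}}, separators=(",", ":")))
        if source is not None:
            # Source line: the document itself, kept on a single line
            lines.append(json.dumps(source, separators=(",", ":")))
    # The bulk body has to end with a newline
    return "\n".join(lines) + "\n"


body = build_bulk_body([
    ("index", "1", {"name": "John Doe"}),
    ("index", "2", {"name": "Jane Doe"}),
])
print(body, end="")
```

Printing body gives one JSON object per line, which is the shape to hand to curl with --data-binary. A delete action would be passed with None as its source, since it has no source line.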

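A bulk request can also mix action types, for example updating the first document (ID 1) and deleting the second (ID 2) in a single call by using update and delete action lines. The run below re-sends the two index actions from the first example; because both IDs now exist, each document is overwritten and reported as "updated":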

# Same two index actions as before; both IDs already exist this time
[builder@master ~]$ curl -X POST "localhost:9200/customer/_doc/_bulk?pretty" -H 'Content-Type: application/json' -d '
{"index":{"_id":"1"}}
{"name": "John Doe"}
{"index":{"_id":"2"}}
{"name": "Jane Doe"}
'
{
    "took": 15,
    "errors": false,
    "items": [
    {
        "index":
        {
            "_index": "customer",
            "_type": "_doc",
            "_id": "1",
            "_version": 7,
            "result": "updated",
            "_shards":
            {
                "total": 2,
                "successful": 1,
                "failed": 0
            },
            "_seq_no": 6,
            "_primary_term": 2,
            "status": 200
        }
    },
    {
        "index":
        {
            "_index": "customer",
            "_type": "_doc",
            "_id": "2",
            "_version": 2,
            "result": "updated",
            "_shards":
            {
                "total": 2,
                "successful": 1,
                "failed": 0
            },
            "_seq_no": 1,
            "_primary_term": 2,
            "status": 200
        }
    }]
}

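Note that a delete action has no source document line after it, because deleting a document only requires its ID.
The Bulk API does not fail as a whole when one of its operations fails. If a single operation fails for any reason, processing continues with the operations after it. When the call returns, the response carries a status for every operation, in the order they were sent, so you can check whether a particular operation failed.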
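
Checking those per-operation statuses can be sketched in a few lines of Python. This is only an illustration: failed_items is a hypothetical helper, and the sample response is made up and trimmed to the fields the helper reads. It assumes each entry in items is a single-key object named after its action, as in the responses shown earlier, and that a failed entry has a status of 400 or higher or carries an error object.

```python
def failed_items(bulk_response):
    """Return (position, action, status, error) for each failed bulk item."""
    failures = []
    for position, item in enumerate(bulk_response.get("items", [])):
        # Each item is a single-key dict: {"index": {...}}, {"delete": {...}}, ...
        (action, result), = item.items()
        if "error" in result or result.get("status", 200) >= 400:
            failures.append(
                (position, action, result.get("status"), result.get("error"))
            )
    return failures


# Hypothetical response, trimmed to the fields used above
sample = {
    "errors": True,
    "items": [
        {"index": {"_id": "1", "status": 200}},
        {"update": {"_id": "9", "status": 404,
                    "error": {"type": "document_missing_exception"}}},
    ],
}

for failure in failed_items(sample):
    print(failure)
```

The position in each tuple matches the order in which the operations were sent, so it can be used to find the request line that needs a retry. The top-level errors flag gives a quick hint on whether scanning the items is needed at all.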

Data Operations

Importing Data

Suppose a folder holds a single JSON file with the following content:

[builder@master ~/Developer/esTempData]$ cat accounts.json
{"index":{"_id":"1"}}
{"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}

{"index":{"_id":"6"}}
{"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"hattiebond@netagy.com","city":"Dante","state":"TN"}
...
# The JSON file contains 1000 documents; only 2 are shown above

To import the file into Elasticsearch, run the following from the directory that contains it:

[builder@master ~/Developer/esTempData]$ curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_doc/_bulk?pretty&refresh" --data-binary "@accounts.json"
...
## List the indices; the bank row (the second one) now shows 1000 documents
[builder@master ~]$  curl -X GET "http://localhost:9200/_cat/indices?v"
health status index     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .kibana_1 WShElb71RVigvhHRsU5vIA   1   0          3            0     11.9kb         11.9kb
yellow open   bank      LLrKZKoNT-ifFpiv-dBc9w   5   1       1000            0    474.6kb        474.6kb
yellow open   customer  jrkOIUCjTLec7_5OcwldYw   5   1          3            0     10.9kb         10.9kb
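
If the document count needs to be checked from a script instead of by eye, the table can be parsed with a short Python sketch like the one below. parse_cat_table is a hypothetical helper, and it assumes every column in every row is populated, as in the output above (rows with empty columns would not line up with the header).

```python
def parse_cat_table(text):
    """Parse whitespace-aligned _cat output (requested with ?v) into dicts."""
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    header, data = rows[0], rows[1:]
    # Pair each value with the column name from the header row
    return [dict(zip(header, row)) for row in data]


# Output copied from the _cat/indices call above
cat_output = """\
health status index     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .kibana_1 WShElb71RVigvhHRsU5vIA   1   0          3            0     11.9kb         11.9kb
yellow open   bank      LLrKZKoNT-ifFpiv-dBc9w   5   1       1000            0    474.6kb        474.6kb
yellow open   customer  jrkOIUCjTLec7_5OcwldYw   5   1          3            0     10.9kb         10.9kb
"""

counts = {row["index"]: int(row["docs.count"]) for row in parse_cat_table(cat_output)}
print(counts["bank"])  # prints 1000
```

For anything beyond a quick check, asking the cat API for JSON with the format=json parameter is likely to be more robust than splitting columns.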

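Search API

There are two basic ways to run a search: send the search parameters in the REST request URI, or send them as JSON in the REST request body. The request body method is more expressive and defines the search in a more readable JSON format. We will try one example of the request URI method, but the rest of this tutorial uses the request body method exclusively.

The REST API for search is available at the _search endpoint. This example returns all documents in the bank index: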

[builder@master ~]$ curl -X GET 'localhost:9200/bank/_search?q=*&sort=account_number:asc&pretty'

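The q=* parameter tells Elasticsearch to match every document in the index. The sort=account_number:asc parameter sorts the results in ascending order by each document's account_number field. The pretty parameter, as before, tells Elasticsearch to return pretty-printed JSON.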
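
To see how these parameters combine into the request URI, here is a small Python sketch that builds the same URL with the standard library. search_url is a hypothetical helper, the host default is an assumption taken from the curl command, and pretty is written as pretty=true.

```python
from urllib.parse import urlencode


def search_url(index, params, host="http://localhost:9200"):
    """Build a URI-search URL for the given index and query parameters."""
    # safe="*:" keeps the wildcard and the field:order separator unescaped
    return f"{host}/{index}/_search?{urlencode(params, safe='*:')}"


url = search_url("bank", {"q": "*", "sort": "account_number:asc", "pretty": "true"})
print(url)
```

The printed URL can be passed to curl in place of the hand-written one, and the search request returns the same kind of response.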

Response (partially shown):

{
  "took" : 63,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1000,
    "max_score" : null,
    "hits" : [ {
      "_index" : "bank",
      "_type" : "_doc",
      "_id" : "0",
      "sort": [0],
      "_score" : null,
      "_source" : {"account_number":0,"balance":16623,"firstname":"Bradshaw","lastname":"Mckenzie","age":29,"gender":"F","address":"244 Columbus Place","employer":"Euron","email":"bradshawmckenzie@euron.com","city":"Hobucken","state":"CO"}
    }, {
      "_index" : "bank",
      "_type" : "_doc",
      "_id" : "1",
      "sort": [1],
      "_score" : null,
      "_source" : {"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
    }, ...
    ]
  }
}

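The fields in the response mean the following:
took: time in milliseconds that Elasticsearch took to execute the search
timed_out: whether the search timed out
_shards: how many shards were searched, and how many of them succeeded or failed
hits: the search results
hits.total: the total number of documents matching the search criteria
hits.hits: the actual array of search results (the first 10 documents by default)
hits.sort: the sort value of each result (absent when sorting by score)
hits._score and max_score: can be ignored for now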
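
As a rough illustration of reading these fields in code, the Python sketch below pulls them out of a parsed response. summarize_search is a hypothetical helper, and the sample dict is a trimmed copy of the response shown above. It assumes hits.total is a plain integer, as in that response; newer Elasticsearch versions return an object there instead.

```python
def summarize_search(response):
    """Pull the commonly inspected fields out of a parsed _search response."""
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "total": hits["total"],
        "ids": [hit["_id"] for hit in hits["hits"]],
        # sort is absent when results are ordered by score
        "sort_values": [hit.get("sort") for hit in hits["hits"]],
    }


# Trimmed copy of the response shown above
response = {
    "took": 63,
    "timed_out": False,
    "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
    "hits": {
        "total": 1000,
        "max_score": None,
        "hits": [
            {"_index": "bank", "_id": "0", "sort": [0], "_score": None,
             "_source": {"account_number": 0}},
            {"_index": "bank", "_id": "1", "sort": [1], "_score": None,
             "_source": {"account_number": 1}},
        ],
    },
}

summary = summarize_search(response)
print(summary)
```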

The same search can be run with the JSON request body method:

$ curl -X GET 'localhost:9200/bank/_search' -d '{
  "query": { "match_all": {} },
  "sort": [
    { "account_number": "asc" }
  ]
}' -H "Content-Type:application/json"

# Response:
{"took":14,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1000,"max_score":null,"hits":[{"_index":"bank","_type":"_doc","_id":"0","_score":null,"_source":{"account_number":0,"balance":16623,"firstname":"Bradshaw","lastname":"Mckenzie","age":29,"gender":"F","address":"244 Columbus Place","employer":"Euron","email":"bradshawmckenzie@euron.com","city":"Hobucken","state":"CO"},"sort":[0]}...
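
When the request body is produced by a program, it is usually easier to build it as a data structure and serialize it than to write the JSON by hand. A minimal Python sketch, where match_all_sorted is a hypothetical helper and only the body is built (sending it would need an HTTP client of your choice):

```python
import json


def match_all_sorted(field, order="asc"):
    """Return a request body that matches every document, sorted by one field."""
    return {"query": {"match_all": {}}, "sort": [{field: order}]}


body = json.dumps(match_all_sorted("account_number"))
print(body)
```

The resulting string is equivalent to the body passed to curl with -d above, so it can be sent to the bank/_search endpoint with the Content-Type: application/json header.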