Installing the IK Analyzer on Elasticsearch 6.2.1 (Part 3)

2018-05-26  wangfs

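The previous article walked through deploying an Elasticsearch search cluster. This one rounds out the search functionality by adding a Chinese analyzer (tokenizer), IK.
Analyzer source code: https://github.com/medcl/elasticsearch-analysis-ik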

3.1 Install the analyzer
[root@elastic-redis-03 elasticsearch]# ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.2.1/elasticsearch-analysis-ik-6.2.1.zip
-> Downloading https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.2.1/elasticsearch-analysis-ik-6.2.1.zip
[=================================================] 100%   
-> Installed analysis-ik
[root@elastic-redis-03 elasticsearch]# ls
bin  config  data  lib  LICENSE.txt  logs  modules  NOTICE.txt  plugins  README.textile
[root@elastic-redis-03 elasticsearch]# ls -l plugins/
total 4
drwxr-xr-x 2 root root 4096 Jun 14 13:04 analysis-ik  # this directory is created by the install
[root@elastic-redis-03 elasticsearch]# ls -l plugins/analysis-ik/
total 1420
-rw-r--r-- 1 root root 263965 Jun 14 13:04 commons-codec-1.9.jar
-rw-r--r-- 1 root root  61829 Jun 14 13:04 commons-logging-1.2.jar
-rw-r--r-- 1 root root  51097 Jun 14 13:04 elasticsearch-analysis-ik-6.2.1.jar
-rw-r--r-- 1 root root 736658 Jun 14 13:04 httpclient-4.5.2.jar
-rw-r--r-- 1 root root 326724 Jun 14 13:04 httpcore-4.4.4.jar
-rw-r--r-- 1 root root   1805 Jun 14 13:04 plugin-descriptor.properties
[root@elastic-redis-03 elasticsearch]# 

[root@elastic-redis-03 elasticsearch]# ls -l config/
total 68
drwxr-x--- 2 elastic elastic  4096 Jun 14 13:04 analysis-ik  # this directory is created by the install
-rw-rw---- 1 elastic elastic  1868 Jun 14 14:52 elasticsearch.yml
-rw-rw---- 1 elastic elastic  2767 Jun 14 12:39 jvm.options
-rw-rw---- 1 elastic elastic  5091 Feb  8 03:30 log4j2.properties
-rw------- 1 elastic elastic 41824 Jun 14 14:50 nohup.out
[root@elastic-redis-03 elasticsearch]# 

3.2 Run the same installation on the other two nodes
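Repeating the install by hand on every node is error-prone, so the step can be scripted. The following is only a sketch: the URL pattern is copied from the install command in 3.1, while the `ES_HOME` default, the host names, and passwordless SSH access are assumptions that must be adapted to the actual environment. `--batch` simply skips the interactive confirmation prompt, which is convenient over SSH.

```shell
# Build the release URL for a given IK version
# (same pattern as the URL used in step 3.1).
ik_plugin_url() {
  version="$1"
  printf 'https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v%s/elasticsearch-analysis-ik-%s.zip\n' \
    "$version" "$version"
}

# Install the plugin on one remote node over SSH.
# ES_HOME is an assumed install location; override it as needed.
install_ik_on() {
  host="$1"
  version="$2"
  es_home="${ES_HOME:-/usr/local/elasticsearch}"
  url=$(ik_plugin_url "$version")
  ssh "$host" "cd '$es_home' && ./bin/elasticsearch-plugin install --batch '$url'"
}

# Usage (hypothetical host names):
#   for h in elastic-redis-01 elastic-redis-02; do install_ik_on "$h" 6.2.1; done
```

The plugin version should match the Elasticsearch version on each node, which is why the version is passed explicitly rather than hard-coded.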
3.3 Check that the plugin was installed successfully
[root@elastic-redis-03 analysis-ik]# curl -XGET 172.31.15.172:9200/_cat/plugins
node-1 analysis-ik 6.2.1
node-2 analysis-ik 6.2.1
node-3 analysis-ik 6.2.1
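Reading the list by eye works for three nodes; for a scripted check, the output can be counted instead. A minimal sketch, assuming the three-column `node plugin version` layout shown above (the function name `check_ik_plugin` is made up for this example):

```shell
# Reads `_cat/plugins` output on stdin and succeeds only if exactly
# $1 rows report analysis-ik at version $2.
check_ik_plugin() {
  expected_nodes="$1"
  expected_version="$2"
  found=$(awk -v v="$expected_version" \
    '$2 == "analysis-ik" && $3 == v { n++ } END { print n + 0 }')
  [ "$found" -eq "$expected_nodes" ]
}

# Usage:
#   curl -s '172.31.15.172:9200/_cat/plugins' | check_ik_plugin 3 6.2.1 \
#     && echo "IK is installed on all nodes"
```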
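3.4 Restart Elasticsearch (all three servers must be restarted). The restart is also a good opportunity to watch the cluster state. The simplest way to do that is the web page provided by the head plugin, described in the previous article.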

The figure below shows the cluster state so far: the star marks the master node and the circles mark data nodes.


3.png
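The cluster state can also be watched from the command line while the nodes restart. The sketch below polls `_cat/health` and reads the status from the fourth column of its default output (epoch, timestamp, cluster name, status, ...). The helper names, retry count, and sleep interval are arbitrary choices for this example.

```shell
# Print the status column (green / yellow / red) from one line of
# `_cat/health` output read on stdin.
health_status() {
  awk 'NR == 1 { print $4 }'
}

# Poll the cluster until it reports green, or give up after N attempts.
wait_for_green() {
  host="$1"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status=$(curl -s "$host/_cat/health" | health_status)
    [ "$status" = "green" ] && return 0
    i=$((i + 1))
    sleep 2
  done
  return 1
}

# Usage:
#   wait_for_green 172.31.15.172:9200 && echo "cluster is green"
```

Elasticsearch can also do the waiting itself: `curl '172.31.15.172:9200/_cluster/health?wait_for_status=green&timeout=60s&pretty'` blocks until the status is reached or the timeout expires.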

Back to the main topic: the analyzer.

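3.5 Create an index
Note: the first command below creates an index named custome (without the trailing r), while the documents that follow are written to customer. With default settings Elasticsearch creates customer automatically on the first write, so the later steps still work.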
[root@pro-3-b ~]# curl -XPUT 'http://172.31.15.172:9200/custome?pretty' 
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "custome"
}
[root@pro-3-b ~]# curl -XPOST -H "Content-Type: application/json" '172.31.15.172:9200/customer/external?pretty' -d '
{
    "name": "安徽省长江流域"
}'

[root@pro-3-b ~]# curl -XPOST -H "Content-Type: application/json" '172.31.15.172:9200/customer/external?pretty' -d '
{
    "name": "省长是中华人民共和国的部级官员"
}'
[root@pro-3-b ~]# curl -XPOST -H "Content-Type: application/json" '172.31.15.172:9200/customer/external?pretty' -d '
{
  "name": "河南省郑州市"
}'
{
  "_index" : "customer",
  "_type" : "external",
  "_id" : "jB86m2MB0bX0KdnnB-Dc",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 1,
  "_primary_term" : 1
}
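The three documents above are posted one request at a time. For more than a handful of documents the `_bulk` API is usually more convenient. The sketch below builds the newline-delimited JSON body that `_bulk` expects; `build_bulk_body` is a hypothetical helper, and it does no JSON escaping, so it assumes the names contain no double quotes or backslashes.

```shell
# Print a _bulk request body (NDJSON): one action line plus one
# document line per argument. No JSON escaping is performed.
build_bulk_body() {
  for name in "$@"; do
    printf '{"index":{}}\n{"name":"%s"}\n' "$name"
  done
}

# Usage: index and type are taken from the URL, so the action lines stay empty.
#   build_bulk_body "安徽省长江流域" "省长是中华人民共和国的部级官员" "河南省郑州市" |
#     curl -s -XPOST -H 'Content-Type: application/x-ndjson' \
#       '172.31.15.172:9200/customer/external/_bulk?pretty' --data-binary @-
```

`--data-binary` matters here: plain `-d` strips newlines, and `_bulk` requires every line, including the last, to end with one.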
3.6 Test keyword searches
# Search for the keyword "省长" (provincial governor)
[root@pro-3-b ~]# curl -XGET '172.31.15.172:9200/_search' -H "Content-Type: application/json"  -d '
{
"query": {
"match": {
"name": "省长"
}
}
}'
{"took":4,"timed_out":false,"_shards":{"total":15,"successful":15,"skipped":0,"failed":0},"hits":{"total":3,"max_score":0.74487394,"hits":[{"_index":"customer","_type":"external","_id":"ix-tmmMB0bX0KdnnhuCs","_score":0.74487394,"_source":
{
    "name": "省长是中华人民共和国的部级官员"
}},{"_index":"customer","_type":"external","_id":"ih-pmmMB0bX0Kdnn0OAK","_score":0.5753642,"_source":
{
    "name": "安徽省长江流域"
}},{"_index":"customer","_type":"external","_id":"jB86m2MB0bX0KdnnB-Dc","_score":0.22108285,"_source":
{
  "name": "河南省郑州市"
}}]}}
[root@pro-3-b ~]# 

# Search for the keyword "中国" (China)

[root@pro-3-b ~]# curl -XGET '172.31.15.172:9200/_search' -H "Content-Type: application/json"  -d '
{
"query": {
"match": {
"name": "中国"
}
}
}'
{"took":7,"timed_out":false,"_shards":{"total":15,"successful":15,"skipped":0,"failed":0},"hits":{"total":1,"max_score":1.179499,"hits":[{"_index":"customer","_type":"external","_id":"ix-tmmMB0bX0KdnnhuCs","_score":1.179499,"_source":
{
    "name": "省长是中华人民共和国的部级官员"
}}]}}
[root@pro-3-b ~]# 

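Both searches return hits, but these results do not actually show IK at work. None of the steps above assigns an IK analyzer to the name field, so it falls back to the default standard analyzer, which splits Chinese text into single characters. That is why 河南省郑州市 matches the query 省长 (through the single character 省) and why 中国 matches 中华人民共和国 (through 中 and 国). For IK to tokenize the field, name has to be mapped with an IK analyzer such as ik_max_word or ik_smart. The same queries can also be run through head, as in the figure below: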


7.png
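A more direct way to see how a text is tokenized is the `_analyze` API, which returns the tokens an analyzer produces without indexing anything. The helpers below are a sketch: `analyze_body` and `analyze_with` are names invented for this example, and the body builder does no JSON escaping, so it assumes plain text without quotes or backslashes.

```shell
# Build the JSON body for the _analyze API (no JSON escaping).
analyze_body() {
  analyzer="$1"
  text="$2"
  printf '{"analyzer":"%s","text":"%s"}\n' "$analyzer" "$text"
}

# Ask the cluster how a given analyzer tokenizes a text.
analyze_with() {
  host="$1"
  analyzer="$2"
  text="$3"
  curl -s -XGET "$host/_analyze?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(analyze_body "$analyzer" "$text")"
}

# Usage: compare the default analyzer with IK on the same text.
#   analyze_with 172.31.15.172:9200 standard "安徽省长江流域"
#   analyze_with 172.31.15.172:9200 ik_smart "安徽省长江流域"
```

With `standard`, every Chinese character comes back as its own token. If the IK call fails with an unknown-analyzer error, the plugin is not loaded on the node that answered.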

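Note:
ik_max_word: splits the text at the finest granularity, exhausting the possible combinations. For example, “俄罗斯世界杯即将开幕” is split into “俄罗斯, 罗斯, 斯世, 世界杯, 世界, 杯, 即将, 开幕”.
ik_smart: splits the text at the coarsest granularity. For example, the same sentence is split into “俄罗斯, 世界杯, 即将, 开幕”.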
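These analyzers only take effect on fields that are mapped to use them. The sketch below creates an index whose name field is indexed with ik_max_word and searched with ik_smart, which is one common pairing and not the only reasonable one. The mapping type external matches the type used earlier; the index name customer_ik and the helper names are assumptions made for this example.

```shell
# Mapping body for Elasticsearch 6.x with a single mapping type "external".
ik_mapping_body() {
  cat <<'EOF'
{
  "mappings": {
    "external": {
      "properties": {
        "name": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_smart"
        }
      }
    }
  }
}
EOF
}

# Create an index with the mapping above.
create_ik_index() {
  host="$1"
  index="$2"
  ik_mapping_body |
    curl -s -XPUT "$host/$index?pretty" \
      -H 'Content-Type: application/json' --data-binary @-
}

# Usage (customer_ik is a hypothetical index name):
#   create_ik_index 172.31.15.172:9200 customer_ik
```

The analyzer of an existing field cannot be changed in place, so documents already stored in another index would need to be indexed again into the new one.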

