(2) Elasticsearch Tokenization
2022-02-17
刀鱼要到岛上掉
analyzer
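- The default analyzer is standard.
- Other built-in analyzers include whitespace, simple, stop, keyword, and language-specific ones such as english.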
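#1 standard (the default): splits on word boundaries and lowercases; Chinese text ends up as single characters
GET /_analyze
{
  "analyzer": "standard",
  "text": "司马南手撕联想集团倒拔垂杨柳"
}

#2 simple: splits on non-letter characters and lowercases
POST /_analyze
{
  "analyzer": "simple",
  "text": "2 running Quick Brown-foxes"
}

#3 whitespace: splits on whitespace only
POST /_analyze
{
  "analyzer": "whitespace",
  "text": "2 running Quick Brown-foxes"
}

#4 stop: like simple, but also removes stop words such as "is" and "the"
POST /_analyze
{
  "analyzer": "stop",
  "text": "2 running Quick Brown-foxes in the summer"
}

#5 keyword: no tokenization, the whole input is kept as one token
POST /_analyze
{
  "analyzer": "keyword",
  "text": "2 running Quick Brown-foxes in the summer"
}

# Language-specific analyzers, for example english
GET /_analyze
{
  "analyzer": "english",
  "text": "2 running Quick Brown-foxes in the summer"
}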
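To see the differences between these analyzers without a running cluster, here is a rough Python sketch that imitates whitespace, simple, stop, and keyword. It is only an approximation for illustration, not how Elasticsearch implements them. The function names are made up for this example, and the stop-word list is assumed to match Lucene's default English set.

```python
import re

# Assumption: this mirrors the default English stop-word set used by the
# `stop` analyzer; check the Elasticsearch docs for the authoritative list.
ENGLISH_STOP_WORDS = frozenset({
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that",
    "the", "their", "then", "there", "these", "they", "this", "to", "was",
    "will", "with",
})


def whitespace_tokens(text):
    """Approximate `whitespace`: split on runs of whitespace, keep case."""
    return text.split()


def simple_tokens(text):
    """Approximate `simple`: split on non-letter characters, then lowercase."""
    # [^\W\d_]+ matches runs of letters only (no digits, no underscore).
    return [token.lower() for token in re.findall(r"[^\W\d_]+", text)]


def stop_tokens(text):
    """Approximate `stop`: same as `simple`, minus English stop words."""
    return [t for t in simple_tokens(text) if t not in ENGLISH_STOP_WORDS]


def keyword_tokens(text):
    """Approximate `keyword`: the whole input is a single token."""
    return [text]


if __name__ == "__main__":
    sample = "2 running Quick Brown-foxes in the summer"
    for name, tokenize in [
        ("whitespace", whitespace_tokens),
        ("simple", simple_tokens),
        ("stop", stop_tokens),
        ("keyword", keyword_tokens),
    ]:
        print(f"{name:>10}: {tokenize(sample)}")
```

On the sample sentence, the sketch keeps "2" and "Brown-foxes" intact for whitespace, drops the digit and splits on the hyphen for simple, and additionally drops "in" and "the" for stop.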
- IK analyzer (for Chinese text; provided by a plugin that must be installed separately)
GET /_analyze
{
  "analyzer": "ik_max_word",
  "text": "司马南手撕联想集团杨柳"
}
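- ik_max_word: splits the text at the finest granularity, producing every word it can find, so tokens may overlap
- ik_smart: splits the text at the coarsest granularity; once a span has been emitted as a word, it is not reused by any other token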
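To compare the two modes side by side, a small script can send the same text to `_analyze` with each analyzer. This is a minimal sketch using only the Python standard library. It assumes an unsecured node at http://localhost:9200 with the IK plugin installed, and the helper names `build_analyze_request` and `analyze` are hypothetical, chosen for this example.

```python
import json
import urllib.request

# Assumption: a local Elasticsearch node without authentication,
# with the IK analysis plugin already installed.
BASE_URL = "http://localhost:9200"


def build_analyze_request(analyzer, text, base_url=BASE_URL):
    """Build (but do not send) a POST /_analyze request."""
    body = json.dumps({"analyzer": analyzer, "text": text}, ensure_ascii=False)
    return urllib.request.Request(
        url=f"{base_url}/_analyze",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(analyzer, text, base_url=BASE_URL):
    """Send the request and return only the token strings."""
    request = build_analyze_request(analyzer, text, base_url)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # The _analyze response carries a "tokens" list; each entry has a "token" field.
    return [entry["token"] for entry in payload["tokens"]]


if __name__ == "__main__":
    text = "司马南手撕联想集团杨柳"
    for analyzer in ("ik_max_word", "ik_smart"):
        print(analyzer, analyze(analyzer, text))
```

The exact tokens depend on the IK version and its dictionary, so the output is not shown here. In general ik_max_word should return at least as many tokens as ik_smart for the same input.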