Getting Started with the Elasticsearch Painless Scripting Language
1. Prepare the data
curl -X PUT "localhost:9200/hockey/player/_bulk?refresh" -H 'Content-Type: application/json' -d'
{"index":{"_id":1}}
{"first":"johnny","last":"gaudreau","goals":[9,27,1],"assists":[17,46,0],"gp":[26,82,1],"born":"1993/08/13"}
{"index":{"_id":2}}
{"first":"sean","last":"monohan","goals":[7,54,26],"assists":[11,26,13],"gp":[26,82,82],"born":"1994/10/12"}
{"index":{"_id":3}}
{"first":"jiri","last":"hudler","goals":[5,34,36],"assists":[11,62,42],"gp":[24,80,79],"born":"1984/01/04"}
{"index":{"_id":4}}
{"first":"micheal","last":"frolik","goals":[4,6,15],"assists":[8,23,15],"gp":[26,82,82],"born":"1988/02/17"}
{"index":{"_id":5}}
{"first":"sam","last":"bennett","goals":[5,0,0],"assists":[8,1,0],"gp":[26,1,0],"born":"1996/06/20"}
{"index":{"_id":6}}
{"first":"dennis","last":"wideman","goals":[0,26,15],"assists":[11,30,24],"gp":[26,81,82],"born":"1983/03/20"}
{"index":{"_id":7}}
{"first":"david","last":"jones","goals":[7,19,5],"assists":[3,17,4],"gp":[26,45,34],"born":"1984/08/10"}
{"index":{"_id":8}}
{"first":"tj","last":"brodie","goals":[2,14,7],"assists":[8,42,30],"gp":[26,82,82],"born":"1990/06/07"}
{"index":{"_id":39}}
{"first":"mark","last":"giordano","goals":[6,30,15],"assists":[3,30,24],"gp":[26,60,63],"born":"1983/10/03"}
{"index":{"_id":10}}
{"first":"mikael","last":"backlund","goals":[3,15,13],"assists":[6,24,18],"gp":[26,82,82],"born":"1989/03/17"}
{"index":{"_id":11}}
{"first":"joe","last":"colborne","goals":[3,18,13],"assists":[6,20,24],"gp":[26,67,82],"born":"1990/01/30"}
'
2. Examples
Document values can be accessed from a Map named doc.
- Sum the goals of each document (player)
GET hockey/_search
{
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "total_goals": {
      "script": {
        "lang": "painless",
        "source": """
          int total = 0;
          for (int i = 0; i < doc['goals'].length; ++i) {
            total += doc['goals'][i];
          }
          return total;
        """
      }
    }
  }
}
The result of running this query in Kibana is shown in the screenshot (1.png).
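As a sanity check on the script above, the same per-player sum can be reproduced outside Elasticsearch. The sketch below is plain Python rather than Painless, and it hard-codes three goals arrays copied from the bulk data in section 1. The function name total_goals is only an illustration that mirrors the script field name; it is not part of any Elasticsearch API.

```python
# Goals arrays copied from the sample data (documents with _id 1, 2 and 5).
players = {
    "gaudreau": [9, 27, 1],
    "monohan": [7, 54, 26],
    "bennett": [5, 0, 0],
}


def total_goals(goals):
    """Mirror the Painless loop: add up every element of the goals array."""
    total = 0
    for g in goals:
        total += g
    return total


totals = {name: total_goals(goals) for name, goals in players.items()}
print(totals)
```

If the total_goals values returned by the query differ from these sums, the script or the indexed data is the first place to look.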
- Sort by the concatenation of the first and last fields
GET hockey/_search
{
  "query": {
    "match_all": {}
  },
  "sort": {
    "_script": {
      "type": "string",
      "order": "asc",
      "script": {
        "lang": "painless",
        "source": "doc['first.keyword'].value + ' ' + doc['last.keyword'].value"
      }
    }
  }
}
Note 1: If the document has no value for the field in doc['field'].value, the result depends on the field's datatype:
- 0 if the field has a numeric datatype (long, double, etc.)
- false if the field has a boolean datatype
- epoch date if the field has a date datatype
- null if the field has a string datatype
- null if the field has a geo datatype
- "" if the field has a binary datatype
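Note 2: Starting with Elasticsearch 7.0, doc['field'].value throws an exception if the field is missing from the document. On earlier versions this behavior is opt-in: set the JVM option -Des.scripting.exception_for_missing_value=true on the node. To check whether a document has a value for the field, test doc['field'].size() == 0.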
- Update the value of a field
First, look at the current content:
GET hockey/_search
{
  "stored_fields": [
    "_id",
    "_source"
  ],
  "query": {
    "term": {
      "_id": 1
    }
  }
}
Part of the response:
{
  "_index" : "hockey",
  "_type" : "player",
  "_id" : "1",
  "_score" : 1.0,
  "_source" : {
    "first" : "johnny",
    "last" : "gaudreau",
    "goals" : [9,27,1],
    "assists" : [17,46,0],
    "gp" : [26,82,1],
    "born" : "1993/08/13"
  }
The update script:
POST hockey/player/1/_update
{
  "script": {
    "lang": "painless",
    "source": "ctx._source.last = params.last",
    "params": {
      "last": "hockey"
    }
  }
}
The response:
{
  "_index" : "hockey",
  "_type" : "player",
  "_id" : "1",
  "_version" : 2,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 3,
  "_primary_term" : 3
}
Run the earlier search again to confirm the change; the output is not repeated here.
Updating several fields at once:
POST hockey/player/1/_update
{
  "script": {
    "lang": "painless",
    "source": """
      ctx._source.last = params.last;
      ctx._source.nick = params.nick
    """,
    "params": {
      "last": "gaudreau",
      "nick": "hockey"
    }
  }
}
This adds a new field, nick. The document now looks like this:
{
  "_index" : "hockey",
  "_type" : "player",
  "_id" : "1",
  "_score" : 1.0,
  "_source" : {
    "first" : "johnny",
    "last" : "gaudreau",
    // some fields omitted
    "born" : "1993/08/13",
    "nick" : "hockey"
  }
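- Working with dates
Date fields are exposed as ReadableDateTime, so they support methods such as getYear and getDayOfWeek. Use getMillis to get the number of milliseconds since the epoch. To call these methods in a script, drop the get prefix and lowercase the first letter of what remains. For example, the following returns the birth year of every hockey player: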
GET hockey/_search
{
  "script_fields": {
    "birth_year": {
      "script": {
        "source": "doc.born.value.year"
      }
    }
  }
}
Part of the response:
"fields" : { "birth_year" : [ 1996 ] }
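The naming rule described above (drop get, lowercase the next letter) is easy to get wrong for multi-word names such as getDayOfWeek. The following is a small Python sketch of that rule only; painless_shortcut is a hypothetical helper written for illustration and does not exist in Elasticsearch or Painless.

```python
def painless_shortcut(getter_name):
    """Turn a Java-style getter name into the property form used in scripts.

    Hypothetical helper: it only illustrates the naming rule from the text.
    """
    if not getter_name.startswith("get") or len(getter_name) == 3:
        return getter_name  # not a getter, leave it unchanged
    rest = getter_name[3:]
    # Only the first letter after "get" is lowercased; the rest keeps its case.
    return rest[0].lower() + rest[1:]


for name in ["getYear", "getDayOfWeek", "getMillis"]:
    print(name, "->", painless_shortcut(name))
```

So doc.born.value.year in the query above corresponds to calling getYear on the date value.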
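- Regular expressions
Regular expressions are disabled by default because they circumvent Painless's protection against long-running and memory-hungry scripts. Worse, even innocuous-looking regular expressions can have surprisingly bad performance and stack-depth behavior. They remain a very powerful tool, but are too risky to enable by default. To enable them yourself, set the following in elasticsearch.yml:
script.painless.regex.enabled: true
The syntax constructs are:
- /pattern/: a pattern literal creates a pattern. This is the only way to create a pattern in Painless.
- =~: the find operator returns a boolean: true if any subsequence of the text matches the pattern, false otherwise.
- ==~: the match operator returns a boolean: true if the entire text matches the pattern, false otherwise.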
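The difference between find (=~) and match (==~) is whether the pattern may hit a substring or must cover the whole text. The sketch below shows that distinction in Python, not Painless, under the assumption that re.search is a fair stand-in for find and re.fullmatch for match. The four last names are taken from the sample data.

```python
import re

# Last names copied from the sample data in section 1.
last_names = ["gaudreau", "hudler", "bennett", "brodie"]

# Analogue of `=~ /b/`: true if the pattern is found anywhere in the text.
contains_b = [n for n in last_names if re.search(r"b", n)]

# Analogue of `==~ /[^aeiou].*[aeiou]/`: the whole text must match.
pattern = r"[^aeiou].*[aeiou]"
whole_match = [n for n in last_names if re.fullmatch(pattern, n)]

# The same pattern used as a find also hits names that merely contain a match.
partial_match = [n for n in last_names if re.search(pattern, n)]

print(contains_b)
print(whole_match)
print(partial_match)
```

Note that "hudler" passes the find-style check (the substring "hudle" matches) but fails the whole-text check because it ends in a consonant.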
Find last names that contain a b and append "matched" to them:
POST hockey/player/_update_by_query
{
  "script": {
    "lang": "painless",
    "source": """
      if (ctx._source.last =~ /b/) {
        ctx._source.last += "matched";
      } else {
        ctx.op = "noop";
      }
    """
  }
}
Match last names that start with a consonant and end with a vowel:
POST hockey/player/_update_by_query
{
  "script": {
    "lang": "painless",
    "source": """
      if (ctx._source.last ==~ /[^aeiou].*[aeiou]/) {
        ctx._source.last += "matched";
      } else {
        ctx.op = "noop";
      }
    """
  }
}
Use Pattern.matcher directly to get a Matcher instance and remove all vowels from every last name:
POST hockey/player/_update_by_query
{
  "script": {
    "lang": "painless",
    "source": "ctx._source.last = /[aeiou]/.matcher(ctx._source.last).replaceAll('')"
  }
}
Uppercase every vowel in the last names:
POST hockey/player/_update_by_query
{
  "script": {
    "lang": "painless",
    "source": """
      ctx._source.last = ctx._source.last.replaceAll(/[aeiou]/, m ->
        m.group().toUpperCase(Locale.ROOT))
    """
  }
}
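Because _update_by_query rewrites the stored documents, it can help to preview what the two replacement scripts above would produce before running them. The following Python sketch approximates both transformations with re.sub. It is an analogue, not the Painless implementation, and the helper names strip_vowels and upper_vowels are made up for this example.

```python
import re

VOWELS = r"[aeiou]"


def strip_vowels(name):
    """Analogue of /[aeiou]/.matcher(name).replaceAll('')."""
    return re.sub(VOWELS, "", name)


def upper_vowels(name):
    """Analogue of name.replaceAll(/[aeiou]/, m -> m.group().toUpperCase(...))."""
    return re.sub(VOWELS, lambda m: m.group().upper(), name)


for last in ["gaudreau", "hudler", "brodie"]:
    print(last, strip_vowels(last), upper_vowels(last))
```

For the ASCII names in the sample data, Python's upper() and Java's toUpperCase(Locale.ROOT) give the same result; that equivalence is not guaranteed for arbitrary Unicode text.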