ES Search Series: Collapse Search
2022-06-14
走过分叉路
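1. Basics
- The collapse field must be a single-valued keyword or numeric field with doc_values enabled. Multi-valued (array) fields are not supported.
- Collapsing does not affect the total count in the search results: total still reports the number of matching documents before collapsing. To count the number of distinct groups, use an aggregation, for example a cardinality aggregation on the collapse field.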
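To make the second point concrete, here is a minimal sketch that builds a request body combining collapse with a cardinality aggregation on the same field. The function name build_collapse_with_group_count and the aggregation name unique_groups are made up for this example; the price field comes from the test data below. Keep in mind that cardinality returns an approximate count.

```python
import json


def build_collapse_with_group_count(field: str) -> dict:
    """Build a search body that collapses on `field` and counts distinct groups."""
    return {
        "query": {"match_all": {}},
        # Collapse the hits so that only one document per value of `field` is returned.
        "collapse": {"field": field},
        "aggs": {
            # "unique_groups" is a hypothetical name chosen for this example.
            # cardinality gives an approximate number of distinct values.
            "unique_groups": {"cardinality": {"field": field}}
        },
    }


body = build_collapse_with_group_count("price")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request. The number of groups is then read from the aggregation result rather than from hits.total.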
Test data
[
{
"name": "xiaomi phone",
"desc": "shouji zhong de zhandouji",
"price": 3999,
"tags": [
"xingjiabi",
"fashao",
"buka"
]
},
{
"name": "xiaomi phone",
"desc": "exercitation commodo cillum",
"price": 23,
"tags": [
"sunt"
]
},
{
"name": "几水即统",
"desc": "labore commodo ullamco",
"price": 46,
"tags": [
"qui",
"proident",
"ut Duis sint cillum"
]
},
{
"name": "天常表效",
"desc": "fugiat culpa dolor",
"price": 46,
"tags": [
"non cupidatat aute magna occaecat",
"veniam reprehenderit",
"nulla commodo quis laborum",
"fugiat ex minim nulla"
]
},
{
"name": "上容等位得志",
"desc": "consequat do laboris magna anim",
"price": 46,
"tags": [
"amet ex ut sed aute",
"dolor veniam",
"consequat ut aute sunt fugiat",
"ullamco ipsum sed",
"tempor enim veniam eu consectetur"
]
}
]
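2. Search examples
2.1 Collapse search
Search request
- The query here is only meant to match every document (each one has a name field).
- Results are collapsed on the price field.
- Results are sorted by price in descending order. Within each group, the document that ranks first under this sort is the one returned.
- The from parameter is the number of collapsed results to skip. If collapsing yields 3 results and from=1, the output starts at the second collapsed document.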
{
"query": {
"exists": {
"field": "name"
}
},
"collapse": {
"field": "price"
},
"sort": [
{
"price": {
"order": "desc"
}
}
],
"from": 0
}
Search result
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 5, // number of matching documents before collapsing
"relation": "eq"
},
"max_score": null,
"hits": [
{
"_index": "product",
"_id": "1",
"_score": null,
"_source": {
"name": "xiaomi phone",
"desc": "shouji zhong de zhandouji",
"price": 3999,
"tags": [
"xingjiabi",
"fashao",
"buka"
]
},
"fields": {
"price": [
3999
]
},
"sort": [
3999
]
},
{
"_index": "product",
"_id": "3",
"_score": null,
"_source": {
"name": "几水即统",
"desc": "labore commodo ullamco",
"price": 46,
"tags": [
"qui",
"proident",
"ut Duis sint cillum"
]
},
"fields": {
"price": [
46
]
},
"sort": [
46
]
},
{
"_index": "product",
"_id": "2",
"_score": null,
"_source": {
"name": "xiaomi phone",
"desc": "exercitation commodo cillum",
"price": 23,
"tags": [
"sunt"
]
},
"fields": {
"price": [
23
]
},
"sort": [
23
]
}
]
}
}
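The behaviour of this request can be imitated locally with a small simulation. This is only a sketch of the semantics, not how Elasticsearch implements it. The function collapse_search and its parameters are invented for illustration, the _id of the fifth document is assumed to be "5" (it never appears in the response), and ties between equal prices are broken by input order, which is an assumption that happens to match the response above.

```python
# Trimmed copy of the test data; only the fields needed here are kept.
DOCS = [
    {"_id": "1", "name": "xiaomi phone", "price": 3999},
    {"_id": "2", "name": "xiaomi phone", "price": 23},
    {"_id": "3", "name": "几水即统", "price": 46},
    {"_id": "4", "name": "天常表效", "price": 46},
    {"_id": "5", "name": "上容等位得志", "price": 46},  # _id assumed
]


def collapse_search(docs, field, offset=0, size=10):
    """Sort by `field` descending, keep the first document per value, then page."""
    # sorted() is stable, so documents with equal prices keep their input order.
    ranked = sorted(docs, key=lambda d: d[field], reverse=True)
    seen = set()
    groups = []
    for doc in ranked:
        key = doc[field]
        if key not in seen:
            seen.add(key)
            groups.append(doc)  # top-ranked document represents the group
    # total counts documents before collapsing; offset skips collapsed results.
    return {"total": len(docs), "hits": groups[offset:offset + size]}


result = collapse_search(DOCS, "price")
print(result["total"], [d["_id"] for d in result["hits"]])  # 5 ['1', '3', '2']

second_page = collapse_search(DOCS, "price", offset=1)
print([d["_id"] for d in second_page["hits"]])  # ['3', '2']
```

The first call mirrors the response above: total stays at 5 while only three collapsed hits come back. The second call shows that offset (playing the role of from) skips collapsed groups, not raw documents.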
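2.2 Expanding collapsed results
- The key parameter is inner_hits. If a collapse key groups 100 search results, we can retrieve only the top N of them, where N is set by the size parameter.
- max_concurrent_group_searches: expanding with inner_hits sends additional requests, so it is worth limiting how many of them run concurrently. The default is derived from the number of data nodes and the size of the default search thread pool.
- inner_hits can also be an array, so each group can be expanded along several dimensions at once, for example: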
"inner_hits": [
{
"name": "largest_responses",
"size": 3,
"sort": [
{
"http.response.bytes": {
"order": "desc"
}
}
]
},
{
"name": "most_recent",
"size": 3,
"sort": [
{
"@timestamp": {
"order": "desc"
}
}
]
}
]
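- How expansion works: every inner_hits request for every collapsed group is served by an additional query. With too many of these queries, response time degrades noticeably.
- A second level of collapsing is supported inside inner_hits, for example: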
{
"query": {
"match": {
"message": "GET /search"
}
},
"collapse": {
"field": "geo.country_name",
"inner_hits": {
"name": "by_location",
"collapse": { "field": "user.id" },
"size": 3
}
}
}
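To get a feel for why expansion can become expensive, the following back-of-the-envelope sketch counts the additional queries implied by the description above: one per collapsed group per inner_hits definition. The function estimate_expand_cost is hypothetical, and the notion of "waves" (batches of at most max_concurrent queries) is a deliberate simplification, not a description of Elasticsearch's actual scheduling.

```python
import math


def estimate_expand_cost(num_groups: int, num_inner_hits: int, max_concurrent: int):
    """Return (extra_queries, waves) for a rough model of inner_hits expansion."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    # One additional query per collapsed group per inner_hits definition.
    extra_queries = num_groups * num_inner_hits
    # Simplified model: queries run in batches of at most max_concurrent.
    waves = math.ceil(extra_queries / max_concurrent)
    return extra_queries, waves


# 3 collapsed groups, one inner_hits definition, max_concurrent_group_searches = 1
print(estimate_expand_cost(3, 1, 1))     # (3, 3)
# 100 collapsed groups, two inner_hits definitions, 10 concurrent queries
print(estimate_expand_cost(100, 2, 10))  # (200, 20)
```

Under this rough model, keeping the number of returned groups (size) and the number of inner_hits definitions small matters more than tuning the concurrency limit.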
Search request
{
"query": {
"exists": {
"field": "name"
}
},
"collapse": {
"field": "price",
"inner_hits":{
"name":"top_2_price",
"size":2,
"sort":[
{
"price":"desc"
}
]
},
"max_concurrent_group_searches":1
},
"sort": [
{
"price": {
"order": "desc"
}
}
],
"from": 0
}
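Response
As the response shows, the group with price 46 returns 2 documents in inner_hits. Its inner_hits total is 3, but size limits the expansion to 2.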
{
"took": 22,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 5,
"relation": "eq"
},
"max_score": null,
"hits": [
{
"_index": "product",
"_id": "1",
"_score": null,
"_source": {
"name": "xiaomi phone",
"desc": "shouji zhong de zhandouji",
"price": 3999,
"tags": [
"xingjiabi",
"fashao",
"buka"
]
},
"fields": {
"price": [
3999
]
},
"sort": [
3999
],
"inner_hits": {
"top_2_price": {
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": null,
"hits": [
{
"_index": "product",
"_id": "1",
"_score": null,
"_source": {
"name": "xiaomi phone",
"desc": "shouji zhong de zhandouji",
"price": 3999,
"tags": [
"xingjiabi",
"fashao",
"buka"
]
},
"sort": [
3999
]
}
]
}
}
}
},
{
"_index": "product",
"_id": "3",
"_score": null,
"_source": {
"name": "几水即统",
"desc": "labore commodo ullamco",
"price": 46,
"tags": [
"qui",
"proident",
"ut Duis sint cillum"
]
},
"fields": {
"price": [
46
]
},
"sort": [
46
],
"inner_hits": {
"top_2_price": {
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": null,
"hits": [
{
"_index": "product",
"_id": "3",
"_score": null,
"_source": {
"name": "几水即统",
"desc": "labore commodo ullamco",
"price": 46,
"tags": [
"qui",
"proident",
"ut Duis sint cillum"
]
},
"sort": [
46
]
},
{
"_index": "product",
"_id": "4",
"_score": null,
"_source": {
"name": "天常表效",
"desc": "fugiat culpa dolor",
"price": 46,
"tags": [
"non cupidatat aute magna occaecat",
"veniam reprehenderit",
"nulla commodo quis laborum",
"fugiat ex minim nulla"
]
},
"sort": [
46
]
}
]
}
}
}
},
{
"_index": "product",
"_id": "2",
"_score": null,
"_source": {
"name": "xiaomi phone",
"desc": "exercitation commodo cillum",
"price": 23,
"tags": [
"sunt"
]
},
"fields": {
"price": [
23
]
},
"sort": [
23
],
"inner_hits": {
"top_2_price": {
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": null,
"hits": [
{
"_index": "product",
"_id": "2",
"_score": null,
"_source": {
"name": "xiaomi phone",
"desc": "exercitation commodo cillum",
"price": 23,
"tags": [
"sunt"
]
},
"sort": [
23
]
}
]
}
}
}
}
]
}
}
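The shape of this response can be reproduced with a short local simulation. As before, this is a sketch of the semantics only: collapse_with_inner_hits and the keys of its return value are invented for this example, the _id of the fifth document is assumed to be "5", and ties between equal prices are broken by input order.

```python
from collections import defaultdict

# Trimmed copy of the test data; only the fields needed here are kept.
DOCS = [
    {"_id": "1", "price": 3999},
    {"_id": "2", "price": 23},
    {"_id": "3", "price": 46},
    {"_id": "4", "price": 46},
    {"_id": "5", "price": 46},  # _id assumed
]


def collapse_with_inner_hits(docs, field, inner_size):
    """Group by `field`, and keep the top `inner_size` members of each group."""
    # Stable sort: documents with equal prices keep their input order.
    ranked = sorted(docs, key=lambda d: d[field], reverse=True)
    grouped = defaultdict(list)
    for doc in ranked:
        grouped[doc[field]].append(doc)
    results = []
    # Dicts preserve insertion order, so groups come out in ranked order.
    for key, members in grouped.items():
        results.append({
            "key": key,
            "representative": members[0]["_id"],  # the collapsed hit
            "inner_total": len(members),           # all members of the group
            "inner_ids": [m["_id"] for m in members[:inner_size]],
        })
    return results


groups = collapse_with_inner_hits(DOCS, "price", inner_size=2)
for group in groups:
    print(group)
```

For the price 46 group this yields an inner total of 3 but only two inner ids, "3" and "4", matching the top_2_price section of the response.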