mongo回顾(九:聚合查询三)
mongo的聚合查询操作符是比较多的,平常也无需全部用法都记住,有个大体概念后,使用时翻阅下官方文档也是一个不错的方法,今天继续介绍mongo比较常用的聚合操作
常见步骤中的运算符

这几个运算符见名知意就不做具体介绍了
接下来分别介绍几个常用聚合操作;
$addFields
$addFields将新字段追加到现有文档中。您可以$addFields在聚合操作中包括一个或多个阶段,我一般用于操作文档中的数组并返回结果,当然在project中用sum也能达到这个效果
{
_id: 1,
student: "Maya",
homework: [ 10, 5, 10 ],
quiz: [ 10, 8 ],
extraCredit: 0
}
{
_id: 2,
student: "Ryan",
homework: [ 5, 6, 5 ],
quiz: [ 8, 8 ],
extraCredit: 8
}
db.scores.aggregate( [
{
$addFields: {
totalHomework: { $sum: "$homework" } ,
totalQuiz: { $sum: "$quiz" }
}
},
{
$addFields: { totalScore:
{ $add: [ "$totalHomework", "$totalQuiz", "$extraCredit" ] } }
}
] )
//result
{
"_id" : 1,
"student" : "Maya",
"homework" : [ 10, 5, 10 ],
"quiz" : [ 10, 8 ],
"extraCredit" : 0,
"totalHomework" : 25,
"totalQuiz" : 18,
"totalScore" : 43
}
{
"_id" : 2,
"student" : "Ryan",
"homework" : [ 5, 6, 5 ],
"quiz" : [ 8, 8 ],
"extraCredit" : 8,
"totalHomework" : 16,
"totalQuiz" : 16,
"totalScore" : 40
}
$facet
facet将多个聚合操作的结果结合起来,成为最终的输出文档
$group
同sql的group,对字段分组之后进行处理
group的内存限制为100 MB,超过会报错,或者运行磁盘操作allowDiskUse:true,不建议
{
$group:
{
_id: <expression>, // Group By Expression
<field1>: { <accumulator1> : <expression1> },//可选
...
}
}
名称 | 描述 |
---|---|
$addToSet |
返回每个组的唯一表达式值的数组。数组元素的顺序未定义。 |
$avg |
返回数值的平均值。忽略非数字值。 |
$first |
从每个组的第一个文档返回一个值。仅当文档按定义的顺序定义顺序。 |
$last |
从每个组的最后一个文档返回一个值。仅当文档按定义的顺序定义顺序。 |
$max |
返回每个组的最高表达式值。 |
$mergeObjects |
返回通过组合每个组的输入文档而创建的文档。 |
$min |
返回每个组的最低表达式值。 |
$push |
返回每个组的表达式值数组。 |
$stdDevPop |
返回输入值的总体标准偏差。 |
$stdDevSamp |
返回输入值的样本标准偏差。 |
$sum |
返回数值的总和。忽略非数字值。 |
如果id为null的group,不建议使用,mongo不适合大范围聚合
db.sales.aggregate([
{
$group : {
_id : null,
totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } },
averageQuantity: { $avg: "$quantity" },
count: { $sum: 1 }
}
}
])
//等效于以下SQL语句:
SELECT Sum(price * quantity) AS totalSaleAmount,
Avg(quantity) AS averageQuantity,
Count(*) AS Count
FROM sales
group push的使用
db.books.insertMany([
{ "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
{ "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
{ "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 },
{ "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
{ "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
])
db.books.aggregate([
// First Stage
{
$group : { _id : "$author", books: { $push: "$$ROOT" } }
},
// Second Stage
{
$addFields:
{
totalCopies : { $sum: "$books.copies" }
}
}
])
根据author分组后把根数据都放进book数组中,再往数组中增加totalCopies字段,结果为
{ "_id" : "Homer",
"books" :
[
{ "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
{ "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
]
},
{ "_id" : "Dante",
"books" :
[
{ "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
{ "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
{ "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
]
}
$out
$out
如果当前数据库中尚不存在一个新集合,则该操作将创建一个新集合。在聚合完成之前,该集合不可见。如果聚合失败,则MongoDB不会创建集合
out只能用于聚合操作的最后一个管道,4.2版本后可以用功能更完善的merge来替代
{ $out: { db: "<output-db>", coll: "<output-collection>" } }
若输出的集合不存在则新建集合,若存在,更新集合内容,并不会影响到索引
集合books包含以下文档:
{ "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 }
{ "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 }
{ "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
{ "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 }
{ "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
以下聚合操作将books 集合中的数据按作者分组,然后将结果写入到authors集合中。
db.books.aggregate( [
{ $group : { _id : "$author", books: { $push: "$title" } } },
{ $out : "authors" }
] )
操作后,authors集合包含以下文档:
{ "_id" : "Homer", "books" : [ "The Odyssey", "Iliad" ] }
{ "_id" : "Dante", "books" : [ "The Banquet", "Divine Comedy", "Eclogues" ] }
$sortByCount
$sortByCount等效于$group+ $sort
即为分组后排序
{ $group: { _id: <expression>, count: { $sum: 1 } } },
{ $sort: { count: -1 } }
{ "_id" : 1, "title" : "The Pillars of Society", "artist" : "Grosz", "year" : 1926, "tags" : [ "painting", "satire", "Expressionism", "caricature" ] }
{ "_id" : 2, "title" : "Melancholy III", "artist" : "Munch", "year" : 1902, "tags" : [ "woodcut", "Expressionism" ] }
{ "_id" : 3, "title" : "Dancer", "artist" : "Miro", "year" : 1925, "tags" : [ "oil", "Surrealism", "painting" ] }
{ "_id" : 4, "title" : "The Great Wave off Kanagawa", "artist" : "Hokusai", "tags" : [ "woodblock", "ukiyo-e" ] }
{ "_id" : 5, "title" : "The Persistence of Memory", "artist" : "Dali", "year" : 1931, "tags" : [ "Surrealism", "painting", "oil" ] }
{ "_id" : 6, "title" : "Composition VII", "artist" : "Kandinsky", "year" : 1913, "tags" : [ "oil", "painting", "abstract" ] }
{ "_id" : 7, "title" : "The Scream", "artist" : "Munch", "year" : 1893, "tags" : [ "Expressionism", "painting", "oil" ] }
{ "_id" : 8, "title" : "Blue Flower", "artist" : "O'Keefe", "year" : 1918, "tags" : [ "abstract", "painting" ] }
db.exhibits.aggregate( [ { $unwind: "$tags" }, { $sortByCount: "$tags" } ] )
该操作返回以下文档,按计数降序排列:
{ "_id" : "oil", "count" : 4 }
{ "_id" : "Expressionism", "count" : 3 }
{ "_id" : "Surrealism", "count" : 2 }
{ "_id" : "abstract", "count" : 2 }
{ "_id" : "woodblock", "count" : 1 }
{ "_id" : "woodcut", "count" : 1 }
{ "_id" : "ukiyo-e", "count" : 1 }
{ "_id" : "satire", "count" : 1 }
{ "_id" : "caricature", "count" : 1 }
$unwind
用于展开数组
{
$unwind:
{
path: <field path>,
includeArrayIndex: <string>,
preserveNullAndEmptyArrays: <boolean>
}
}
includeArrayIndex 可选值。一个新字段的名称,用于保存元素的数组索引。
preserveNullAndEmptyArrays 可选值。
- 如果为
true
,则如果path
为null,丢失或为空数组,则$unwind
输出文档。 - 如果为
false
,如果path
为null,缺少或为空数组,$unwind
则不会输出文档。
默认值为false
。
db.inventory2.insertMany([
{ "_id" : 1, "item" : "ABC", price: NumberDecimal("80"), "sizes": [ "S", "M", "L"] },
{ "_id" : 2, "item" : "EFG", price: NumberDecimal("120"), "sizes" : [ ] },
{ "_id" : 3, "item" : "IJK", price: NumberDecimal("160"), "sizes": "M" },
{ "_id" : 4, "item" : "LMN" , price: NumberDecimal("10") },
{ "_id" : 5, "item" : "XYZ", price: NumberDecimal("5.75"), "sizes" : null }
])
db.inventory2.aggregate( [ { $unwind: "$sizes" } ] )
{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "S" }
{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "M" }
{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "L" }
{ "_id" : 3, "item" : "IJK", "price" : NumberDecimal("160"), "sizes" : "M" }
若是preserveNullAndEmptyArrays 为true时
db.inventory2.aggregate( [
{ $unwind: { path: "$sizes", preserveNullAndEmptyArrays: true } }
] )
{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "S" }
{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "M" }
{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "L" }
{ "_id" : 2, "item" : "EFG", "price" : NumberDecimal("120") }
{ "_id" : 3, "item" : "IJK", "price" : NumberDecimal("160"), "sizes" : "M" }
{ "_id" : 4, "item" : "LMN", "price" : NumberDecimal("10") }
{ "_id" : 5, "item" : "XYZ", "price" : NumberDecimal("5.75"), "sizes" : null }
总结:mongo的聚合操作对文档的操作还是很全面的,也比较灵活,对于处理数据量不大的情况,大胆使用
下面一篇准备聊聊mongo是内存数据库吗?