mongo回顾（九：聚合查询三）

2021-04-15 本文已影响0人 supremecsp

mongo的聚合查询操作符是比较多的，平常也无需全部用法都记住，有个大体概念后，使用时翻阅下官方文档也是一个不错的方法，今天继续介绍mongo比较常用的聚合操作

常见步骤中的运算符

运算符.png
这几个运算符见名知意就不做具体介绍了
接下来分别介绍几个常用聚合操作；
$addFields

$addFields将新字段追加到现有文档中。您可以$addFields在聚合操作中包括一个或多个阶段,我一般用于操作文档中的数组并返回结果,当然在project中用sum也能达到这个效果

{
  _id: 1,
  student: "Maya",
  homework: [ 10, 5, 10 ],
  quiz: [ 10, 8 ],
  extraCredit: 0
}
{
  _id: 2,
  student: "Ryan",
  homework: [ 5, 6, 5 ],
  quiz: [ 8, 8 ],
  extraCredit: 8
}

db.scores.aggregate( [
   {
     $addFields: {
       totalHomework: { $sum: "$homework" } ,
       totalQuiz: { $sum: "$quiz" }
     }
   },
   {
     $addFields: { totalScore:
       { $add: [ "$totalHomework", "$totalQuiz", "$extraCredit" ] } }
   }
] )
//result
{
  "_id" : 1,
  "student" : "Maya",
  "homework" : [ 10, 5, 10 ],
  "quiz" : [ 10, 8 ],
  "extraCredit" : 0,
  "totalHomework" : 25,
  "totalQuiz" : 18,
  "totalScore" : 43
}
{
  "_id" : 2,
  "student" : "Ryan",
  "homework" : [ 5, 6, 5 ],
  "quiz" : [ 8, 8 ],
  "extraCredit" : 8,
  "totalHomework" : 16,
  "totalQuiz" : 16,
  "totalScore" : 40
}

$facet

facet将多个聚合操作的结果结合起来，成为最终的输出文档

$group

同sql的group，对字段分组之后进行处理
group的内存限制为100 MB，超过会报错，或者运行磁盘操作allowDiskUse:true,不建议

{
  $group:
    {
      _id: <expression>, // Group By Expression
      <field1>: { <accumulator1> : <expression1> },//可选
      ...
    }
 }

名称	描述
`$addToSet`	返回每个组的唯一表达式值的数组。数组元素的顺序未定义。
`$avg`	返回数值的平均值。忽略非数字值。
`$first`	从每个组的第一个文档返回一个值。仅当文档按定义的顺序定义顺序。
`$last`	从每个组的最后一个文档返回一个值。仅当文档按定义的顺序定义顺序。
`$max`	返回每个组的最高表达式值。
`$mergeObjects`	返回通过组合每个组的输入文档而创建的文档。
`$min`	返回每个组的最低表达式值。
`$push`	返回每个组的表达式值数组。
`$stdDevPop`	返回输入值的总体标准偏差。
`$stdDevSamp`	返回输入值的样本标准偏差。
`$sum`	返回数值的总和。忽略非数字值。

如果id为null的group,不建议使用，mongo不适合大范围聚合

db.sales.aggregate([
  {
    $group : {
       _id : null,
       totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } },
       averageQuantity: { $avg: "$quantity" },
       count: { $sum: 1 }
    }
  }
 ])
//等效于以下SQL语句：
SELECT Sum(price * quantity) AS totalSaleAmount,
       Avg(quantity)         AS averageQuantity,
       Count(*)              AS Count
FROM   sales

group push的使用

db.books.insertMany([
  { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
  { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
  { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 },
  { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
  { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
])

db.books.aggregate([
   // First Stage
   {
     $group : { _id : "$author", books: { $push: "$$ROOT" } }
   },
   // Second Stage
   {
     $addFields:
       {
         totalCopies : { $sum: "$books.copies" }
       }
   }
 ])

根据author分组后把根数据都放进book数组中，再往数组中增加totalCopies字段，结果为

{ "_id" : "Homer",
  "books" :
    [
       { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
       { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
    ]
 },
 { "_id" : "Dante",
   "books" :
     [
       { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
       { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
       { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
     ]
 }

$out

$out如果当前数据库中尚不存在一个新集合，则该操作将创建一个新集合。在聚合完成之前，该集合不可见。如果聚合失败，则MongoDB不会创建集合
out只能用于聚合操作的最后一个管道，4.2版本后可以用功能更完善的merge来替代
{ $out: { db: "<output-db>", coll: "<output-collection>" } }
若输出的集合不存在则新建集合，若存在，更新集合内容，并不会影响到索引

集合books包含以下文档：

{ "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 }
{ "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 }
{ "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
{ "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 }
{ "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }

以下聚合操作将books 集合中的数据按作者分组，然后将结果写入到authors集合中。

db.books.aggregate( [
                      { $group : { _id : "$author", books: { $push: "$title" } } },
                      { $out : "authors" }
                  ] )

操作后，authors集合包含以下文档：

{ "_id" : "Homer", "books" : [ "The Odyssey", "Iliad" ] }
{ "_id" : "Dante", "books" : [ "The Banquet", "Divine Comedy", "Eclogues" ] }

$sortByCount

$sortByCount等效于$group+ $sort
即为分组后排序

{ $group: { _id: <expression>, count: { $sum: 1 } } },
{ $sort: { count: -1 } }

{ "_id" : 1, "title" : "The Pillars of Society", "artist" : "Grosz", "year" : 1926, "tags" : [ "painting", "satire", "Expressionism", "caricature" ] }
{ "_id" : 2, "title" : "Melancholy III", "artist" : "Munch", "year" : 1902, "tags" : [ "woodcut", "Expressionism" ] }
{ "_id" : 3, "title" : "Dancer", "artist" : "Miro", "year" : 1925, "tags" : [ "oil", "Surrealism", "painting" ] }
{ "_id" : 4, "title" : "The Great Wave off Kanagawa", "artist" : "Hokusai", "tags" : [ "woodblock", "ukiyo-e" ] }
{ "_id" : 5, "title" : "The Persistence of Memory", "artist" : "Dali", "year" : 1931, "tags" : [ "Surrealism", "painting", "oil" ] }
{ "_id" : 6, "title" : "Composition VII", "artist" : "Kandinsky", "year" : 1913, "tags" : [ "oil", "painting", "abstract" ] }
{ "_id" : 7, "title" : "The Scream", "artist" : "Munch", "year" : 1893, "tags" : [ "Expressionism", "painting", "oil" ] }
{ "_id" : 8, "title" : "Blue Flower", "artist" : "O'Keefe", "year" : 1918, "tags" : [ "abstract", "painting" ] }

db.exhibits.aggregate( [ { $unwind: "$tags" },  { $sortByCount: "$tags" } ] )

该操作返回以下文档，按计数降序排列：
{ "_id" : "oil", "count" : 4 }
{ "_id" : "Expressionism", "count" : 3 }
{ "_id" : "Surrealism", "count" : 2 }
{ "_id" : "abstract", "count" : 2 }
{ "_id" : "woodblock", "count" : 1 }
{ "_id" : "woodcut", "count" : 1 }
{ "_id" : "ukiyo-e", "count" : 1 }
{ "_id" : "satire", "count" : 1 }
{ "_id" : "caricature", "count" : 1 }

$unwind

用于展开数组

{
  $unwind:
    {
      path: <field path>,
      includeArrayIndex: <string>,
      preserveNullAndEmptyArrays: <boolean>
    }
}

includeArrayIndex 可选值。一个新字段的名称，用于保存元素的数组索引。

preserveNullAndEmptyArrays 可选值。

如果为true，则如果path为null，丢失或为空数组，则$unwind输出文档。
如果为false，如果path为null，缺少或为空数组，$unwind则不会输出文档。
默认值为false。

db.inventory2.insertMany([
  { "_id" : 1, "item" : "ABC", price: NumberDecimal("80"), "sizes": [ "S", "M", "L"] },
  { "_id" : 2, "item" : "EFG", price: NumberDecimal("120"), "sizes" : [ ] },
  { "_id" : 3, "item" : "IJK", price: NumberDecimal("160"), "sizes": "M" },
  { "_id" : 4, "item" : "LMN" , price: NumberDecimal("10") },
  { "_id" : 5, "item" : "XYZ", price: NumberDecimal("5.75"), "sizes" : null }
])

db.inventory2.aggregate( [ { $unwind: "$sizes" } ] )

{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "S" }
{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "M" }
{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "L" }
{ "_id" : 3, "item" : "IJK", "price" : NumberDecimal("160"), "sizes" : "M" }

若是preserveNullAndEmptyArrays 为true时

db.inventory2.aggregate( [
   { $unwind: { path: "$sizes", preserveNullAndEmptyArrays: true } }
] )

{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "S" }
{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "M" }
{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "L" }
{ "_id" : 2, "item" : "EFG", "price" : NumberDecimal("120") }
{ "_id" : 3, "item" : "IJK", "price" : NumberDecimal("160"), "sizes" : "M" }
{ "_id" : 4, "item" : "LMN", "price" : NumberDecimal("10") }
{ "_id" : 5, "item" : "XYZ", "price" : NumberDecimal("5.75"), "sizes" : null }

总结：mongo的聚合操作对文档的操作还是很全面的，也比较灵活，对于处理数据量不大的情况，大胆使用
下面一篇准备聊聊mongo是内存数据库吗？

mongo回顾（九：聚合查询三）

猜你喜欢

热点阅读