Nested FOREACH and the Cartesian Product Effect in Cypher

2019-03-17 redcohen

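FOREACH in Cypher

Most programming languages offer some foreach-style syntax for iterating over a list, and Cypher is no exception.

The documentation describes FOREACH as follows: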

The FOREACH clause is used to update data within a list, whether components of a path, or result of aggregation.

The general form is:

FOREACH (e IN someList | updating commands)
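As a concrete instance of the general form, the following minimal sketch creates one node per list element. The Fruit label is borrowed from the example later in this post; the name property and the list values are assumptions made up for illustration.

```cypher
// Create one :Fruit node for each name in the list.
// The `name` property and the values are illustrative assumptions.
FOREACH (fruitName IN ['Apple', 'Pear', 'Plum'] |
  CREATE (:Fruit {name: fruitName}))
```

The part before the pipe binds each list element to fruitName in turn; the part after the pipe is executed once per element.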

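FOREACH can be nested

The updating commands on the right-hand side of FOREACH can be CREATE, CREATE UNIQUE, MERGE, DELETE, and FOREACH. (CREATE UNIQUE is deprecated in newer Neo4j releases in favor of MERGE.) The official documentation puts it this way: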

Within the FOREACH parentheses, you can do any of the updating commands — CREATE, CREATE UNIQUE, MERGE, DELETE, and FOREACH.

Note that the list includes FOREACH itself, so Cypher's FOREACH can be nested: several FOREACH clauses can be placed one inside another.

For example:

FOREACH (e IN LIST_1 |
     FOREACH (f IN LIST_2 |
          updating cmd1
          updating cmd2))
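A runnable version of the nested shape above might look like the sketch below. The Stock label, its properties, and the list values are hypothetical and only serve to show that the inner body can see both loop variables.

```cypher
// Two-level loop: the body runs once per (market, fruit) pair,
// so 2 x 3 = 6 :Stock nodes are created.
// :Stock, `market`, and `fruit` are illustrative assumptions.
FOREACH (marketName IN ['North', 'South'] |
  FOREACH (fruitName IN ['Apple', 'Pear', 'Plum'] |
    CREATE (:Stock {market: marketName, fruit: fruitName})))
```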

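Variable scope inside FOREACH

One caveat: the variable context inside the FOREACH parentheses is isolated from the context outside. If you later want to use a node variable that was created inside FOREACH, you have to MATCH the node again.

The official documentation says the same: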

The variable context within the FOREACH parenthesis is separate from the one outside it. This means that if you CREATE a node variable within a FOREACH, you will not be able to use it outside of the foreach statement, unless you match to find it.
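The scoping rule can be illustrated with a small sketch. The Person label, the name property, and the names themselves are assumptions chosen for the example.

```cypher
WITH ['Alice', 'Bob'] AS names
FOREACH (personName IN names |
  // p is only visible inside these parentheses
  CREATE (p:Person {name: personName}))
// Referring to p here would fail, because p is not defined outside FOREACH.
// A WITH is needed before reading again after the update.
WITH names
MATCH (p:Person)
WHERE p.name IN names
RETURN p.name
```

The second p is a fresh variable bound by MATCH; it merely happens to find the nodes that the FOREACH body created.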

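Watch out for the Cartesian product when nesting FOREACH

There is one thing to watch out for, though: the Cartesian product effect of MATCH in Cypher.

Suppose you want to relate every Market, every Fruit, and every Customer to one another. If there are 10 nodes of each label, a three-level loop should run 10×10×10 = 1000 times.

The most 'direct' implementation: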

   MATCH (m:Market), (f:Fruit), (c:Customer)
   WITH collect(m) AS mList, collect(f) AS fList, collect(c) AS cList
   FOREACH (m IN mList |
        FOREACH (f IN fList |
            FOREACH (c IN cList |
                   DO STH)))

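The result, however, is painful: mList, fList, and cList each contain 1000 elements, because collect gathers one value per result row and every node appears in 100 of the 1000 rows. The three-level loop therefore runs 1000^3 times in total.

The cause is that the MATCH on the first line returns the Cartesian product of the three patterns.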
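One way to see the effect directly is to count the rows the MATCH produces and the size of a collected list. This is only a diagnostic sketch; the numbers in the comments assume exactly 10 nodes per label, as in the scenario above.

```cypher
// With 10 nodes per label, the three disconnected patterns yield
// 10 x 10 x 10 = 1000 rows, and collect(m) picks up one entry per row.
MATCH (m:Market), (f:Fruit), (c:Customer)
RETURN count(*) AS rows,              // 1000
       size(collect(m)) AS mListSize  // 1000, each Market repeated 100 times
```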

The fix is simple: add DISTINCT inside the collect calls on the second line.

   MATCH (m:Market), (f:Fruit), (c:Customer)
   WITH collect(DISTINCT m) AS mList, collect(DISTINCT f) AS fList, collect(DISTINCT c) AS cList
   FOREACH (m IN mList |
        FOREACH (f IN fList |
            FOREACH (c IN cList |
                   DO STH)))

With this change each list holds 10 elements, so the loop body runs only 1000 times.
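A possible alternative, not taken from the original post, is to collect each label in its own MATCH so that the 1000-row Cartesian product is never built in the first place. In the sketch below the Purchase label and the name properties are hypothetical stand-ins for whatever the loop body should really do.

```cypher
// Collect each label separately; every WITH reduces the result to a single row.
MATCH (m:Market)
WITH collect(m) AS mList
MATCH (f:Fruit)
WITH mList, collect(f) AS fList
MATCH (c:Customer)
WITH mList, fList, collect(c) AS cList
FOREACH (m IN mList |
  FOREACH (f IN fList |
    FOREACH (c IN cList |
      // Hypothetical body: record one node per combination.
      CREATE (:Purchase {market: m.name, fruit: f.name, customer: c.name}))))
```

One caveat under this approach: if any of the three labels has no nodes, the corresponding MATCH returns no rows and the FOREACH never runs, which in this scenario is the same outcome as an empty loop.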
