[Theory] Streams "on" CPS

2020-06-24 本文已影响0人何幻

以前写的一篇文章

1. 处理状态的几种方式

In information technology and computer science, a program is described as stateful if it is designed to remember preceding events or user interactions; the remembered information is called the state of the system.

维基百科中是这样解释状态（state）的，系统状态指的是程序记住的那些信息。
虽然这样理解起来比较抽象，但状态却是和我们的日常开发息息相关的。

1.1 命令式

for (let i = 0; i < 10; i++) {
    console.log(i);  // 0 1 2 3 4 5 6 7 8 9
}

其中变量 i，就保存了程序的状态，在执行的过程中 i 的值是会变的。
更重要的是，如果程序在执行过程中停止了，我们就很难直接从停止的位置继续执行了。
以上程序会依次输出 0~9 共十个数字。

现在我们想要让程序输出的数字翻倍，该怎样编写呢？

for (let i = 0; i < 10; i++) {
    const r = i * 2;
    console.log(r);  // 0 2 4 6 8 10 12 14 16 18
}

我们必须手动处理每一个 i 值。

这种做法经常会给调试带来麻烦，
因为如果我们要想调试 i 等于 5 时的程序功能，就必须先把前面 i 等于 0~4 也执行一遍。

1.2 流式

为了克服这种麻烦事，人们想到了流（stream），

流可能缓和状态模拟中的复杂性。—— 《SICP P220》

In computer science, a stream is a sequence of data elements made available over time.

下面我们直接看例子，因为将流（stream）类比为“河流”多多少少也是不准确的。

const stream = cont => {
    for (let i = 0; i < 10; i++) {
        cont(i);
    }
};

stream(r => console.log(r));  // 0 1 2 3 4 5 6 7 8 9

值得注意的是，流（stream）只是一种处理数据的方法，所以其表现形式并不是唯一的，
本例中，我们将流（stream）表示成了一个函数。

另外一种常见的表现形式是使用 generator，

const makeStream = function* () {
    for (let i = 0; i < 10; i++) {
        yield i;
    }
};

const stream = makeStream();

for (let r of stream) {
    console.log(r);  // 0 1 2 3 4 5 6 7 8 9
}

其中，makeStream 并不是流（stream），它只是一个流（stream）的生成器，
makeStream 会返回一个 iterator，用来表示流（stream）。

1.3 面向对象

面向对象也可以用来处理状态，
它将系统看成是一些互相影响的组成部分，每个组成部分可以有自己的状态，
系统的演变，通过这些组成部分之间相互传递消息来完成。

这并不是本文的重点，因此先略过。

2. 流（stream）的变换

上一节我们介绍了流（stream），以及流（stream）的不同表现形式，
这一节我们来看看，流（stream）式程序是怎么处理状态问题的。

以下我们选取了流（stream）的函数表示，假如有这样的一个流（stream），

const stream = cont => {
    for (let i = 0; i < 10; i++) {
        cont(i);
    }
};

stream(r => console.log(r));  // 0 1 2 3 4 5 6 7 8 9

我们如何让程序输出的数字翻倍呢？
一个直接的办法就是和命令式程序一样，在 stream 实现中，把每一个 i 翻倍。

const stream = cont => {
    for (let i = 0; i < 10; i++) {
        const r = i * 2;
        cont(r);
    }
};

stream(r => console.log(r));  // 0 2 4 6 8 10 12 14 16 18

除此之外，我们还可以采取另外的办法，
那就是把整个流（stream）作为一个整体进行操作，把当前流（stream）变成一个新的流（stream）。

const stream = cont => {
    for (let i = 0; i < 10; i++) {
        cont(i);
    }
};

const map = stream => cont => stream(i => cont(i * 2));
const anotherStream = map(stream);

anotherStream(r => console.log(r));  // 0 2 4 6 8 10 12 14 16 18

以上代码我们实现了一个 map 函数，它把一个 stream，变成了另外一个 anotherStream。

image.png

这个 map 的内部自动对流（stream）中的每一个元素进行了操作，
但从外面来看，map 的操作对象并不是流（stream）中的元素，而是流（stream）本身。

可以，第一次看到 map 的内部实现，总会有点莫名其妙的感觉。
不明白正常人是怎样写出这么难懂的程序来的？

const map = stream => cont => stream(i => cont(i * 2));

为了理解这么写的原因，
我们还要回过头来重新看待一下函数的调用和返回行为。

3. return vs CPS

在使用函数的时候，我们最常用的写法是，通过函数的入参向函数传值，然后通过函数的返回值得到结果。

const f = function (x) {
    return x + 1;
};

const r = f(1);
console.log(r);  // 2

当在一个函数中调用另一个函数时，这种用法将更为明显，

const f = function (x) {
    return g(x + 1);
};

const g = function (y) {
    return y * 2
};

const r = f(1);
console.log(r);  // 4

然而，对于像JavaScript这样支持first-class function的编程语言来说，就又多了一种从函数中返回值的方法。
那就是使用回调函数来返回，

const f = function (x, cont) {
    g(x + 1, v => {
        cont(v);
    });
};

const g = function (y, cont) {
    cont(y * 2);
};

f(1, r => {
    console.log(r);  // 4
});

我们给每个函数增加了一个 cont 参数（是一个函数），以便于当函数想要返回的时候去回调它。
像这种充当了函数 return 作用的 cont 参数，可以称之为该函数的 Continuation。

我们再看 f 是怎样调用 g 的，

g(x + 1, v => {
      cont(v);
});

它传递了一个函数 v => { ... }; 过去，作为 g 的 Continuation，
这种编码风格称为，Continuation Passing Style（简称 CPS）。

4. fact && fib

现在我们练习一下，将常见的两个递归函数 fact（阶乘函数）和 fib（斐波那契数列）写成CPS。
为此可以先写出它们的 return 版本，

const fact = n => n == 1 ? 1 : n * fact(n - 1);

const r = fact(10);
console.log(r);  // 3628800

const fib = n => n == 1 || n == 2 ? 1 : fib(n - 1) + fib(n - 2);

const r = fib(10);
console.log(r);  // 55

fact 和 fib 的 CPS 版本如下，其中 fib 的 CPS 版更有难度，
感兴趣的朋友，可以先不看后面的答案，手动写一下。

const fact = (n, cont) => n == 1 ? cont(1) : fact(n - 1, r => cont(n * r));
fact(10, r => {
    console.log(r);  // 3628800
});

const fib = (n, cont) => n == 1 || n == 2 ? cont(1) : fib(n - 1, r1 => fib(n - 2, r2 => cont(r1 + r2)));
fib(10, r => {
    console.log(r);  // 55
});

5. 反转一个流

流的反转是一个有趣的问题，我们可以用多种方式来实现，
例如，我们可以先把流（stream）中的元素放到一个数组中，然后再反转这个数组。

const stream = cont => {
    for (let i = 0; i < 10; i++) {
        cont(i);
    }
};

const reverse = stream => cont => {
    let memo = [];
    stream(r => memo.push(r));
    memo.reverse().forEach(r => cont(r));
};

const reverseStream = reverse(stream);
reverseStream(r => {
    console.log(r);
});

另外一个极其巧妙的办法是，把流（stream）中的元素倒序放到一个惰性函数中，
最后再调用一下这个函数，触发 cont。

const stream = cont => {
    for (let i = 0; i < 10; i++) {
        cont(i);
    }
};

const reverse = stream => cont => {
    let memo = () => 0;

    // 将 memo 保存到闭包中，避免取值时只能取到最终的那个 memo
    stream(r => (oldMemo => {
        memo = () => oldMemo(cont(r));
    })(memo));
  
    memo();
};

const reverseStream = reverse(stream);
reverseStream(r => {
    console.log(r);
});

这个思路来源于《Real World Haskell》第4章，Folding from the right，
用 foldr 实现 foldl 的技巧。

myFoldl :: (a -> b -> a) -> a -> [b] -> a
myFoldl f z xs = foldr step id xs z
    where step x g a = g (f a x)

翻译成 JavaScript，我们就可以用 reduce （实际上是 reduceLeft ）实现 reduceRight 了，

const reduceRight = (items, fn, init) => items.reduce(
    (memo, val) => x => memo(fn(x, val)),
    x => x
)(init);

const items = [1, 2, 3];

const r1 = items.reduce((memo, val) => memo + val, '');
console.log(r1);  // 123

const r2 = reduceRight(items, (memo, val) => memo + val, '');
console.log(r2);  // 321

其中，items.reduce((memo, val) => x => memo(fn(x, val)), x => x) 返回了一个函数。
reduceRight 是这个函数调用 init 的结果。

我们可以仔细分析一下 memo 的变化规律，

memo0: x => x
memo0(init): init

memo1: x => memo0(fn(x, val1)) 
memo1(init): memo0(fn(init, val1)) = fn(init, val1)

memo2: x => memo1(fn(x, val2))
memo2(init): memo1(fn(init, val2)) = fn(fn(init, val2), val1)

memo3: x => memo2(fn(x, val3))
memo3(init): memo2(fn(init, val3)) = fn(fn(fn(init, val3), val2), val1)

...

巧了，最终的 memo(init) 正好是我们要 reduceRight 的结果。

Any sufficiently advanced technology is indistinguishable from magic.

参考

Wikipedia: State (computer science)
Wikipedia: Stream (computing)
Wikipedia: Continuation
Wikipedia: Continuation-passing style
SICP