
Generating Every Bytecode Instruction in a class File (Part 1)

2021-08-29  jyjz2008

References

  1. 2.11. Instruction Set Summary
  2. Chapter 6. The Java Virtual Machine Instruction Set
  3. Chapter 7. Opcode Mnemonics by Opcode
  4. 深入理解JVM字节码, section 2.3.1 (load and store instructions)
  5. 深入理解Java虚拟机 (3rd edition), Appendix C

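Main text

Every bytecode instruction used by the Java Virtual Machine is listed in Chapter 6. The Java Virtual Machine Instruction Set. There are a little over two hundred of them.

We can try writing some Java code so that all of these instructions show up in compiled class files. Squeezing every one of them into a single class file would probably make the corresponding Java file bloated, so let's take a different approach: write several Java files, each of which, once compiled, uses some of the instructions in its class file. Taken together, the files still cover every instruction.
For example, compiling the following Java file should produce an aconst_null instruction.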

public class Temp {
  public Object f() {
    return null;
  }
}

This post starts with some of the load and store instructions.

How The Java® Virtual Machine Specification describes load and store instructions

Section 2.11.2 of The Java® Virtual Machine Specification reads as follows

2.11.2. Load and Store Instructions

The load and store instructions transfer values between the local variables (§2.6.1) and the operand stack (§2.6.2) of a Java Virtual Machine frame (§2.6):

  • Load a local variable onto the operand stack: iload, iload_<n>, lload, lload_<n>, fload, fload_<n>, dload, dload_<n>, aload, aload_<n>.

  • Store a value from the operand stack into a local variable: istore, istore_<n>, lstore, lstore_<n>, fstore, fstore_<n>, dstore, dstore_<n>, astore, astore_<n>.

  • Load a constant on to the operand stack: bipush, sipush, ldc, ldc_w, ldc2_w, aconst_null, iconst_m1, iconst_<i>, lconst_<l>, fconst_<f>, dconst_<d>.

  • Gain access to more local variables using a wider index, or to a larger immediate operand: wide.

Instructions that access fields of objects and elements of arrays (§2.11.5) also transfer data to and from the operand stack.

Instruction mnemonics shown above with trailing letters between angle brackets (for instance, iload_<n>) denote families of instructions (with members iload_0, iload_1, iload_2, and iload_3 in the case of iload_<n>). Such families of instructions are specializations of an additional generic instruction (iload) that takes one operand. For the specialized instructions, the operand is implicit and does not need to be stored or fetched. The semantics are otherwise the same (iload_0 means the same thing as iload with the operand 0). The letter between the angle brackets specifies the type of the implicit operand for that family of instructions: for <n>, a nonnegative integer; for <i>, an int; for <l>, a long; for <f>, a float; and for <d>, a double. Forms for type int are used in many cases to perform operations on values of type byte, char, and short (§2.11.1).

This notation for instruction families is used throughout this specification.

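My understanding

The passage quoted above divides the load and store instructions into four categories. The first category is "Load a local variable onto the operand stack", that is, pushing a variable from the local variable table onto the operand stack. The instructions involved are
. iload
. iload_<n> (n can be 0, 1, 2, or 3)
. lload
. lload_<n> (n can be 0, 1, 2, or 3)
. fload
. fload_<n> (n can be 0, 1, 2, or 3)
. dload
. dload_<n> (n can be 0, 1, 2, or 3)
. aload
. aload_<n> (n can be 0, 1, 2, or 3)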
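
The opcode numbers of these 25 instructions follow a regular layout, which can be sketched in a few lines of Java. This is only an illustration: the class and method names are hypothetical, and the values 21 to 25 for the generic forms come from Chapter 6 of the specification rather than from this post. The short-form values it computes agree with the ones listed in the sections below.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; the names are not from the post.
final class LoadOpcodes {
  // Type prefixes in opcode order: int, long, float, double, reference.
  private static final String TYPES = "ilfda";

  private static int typeIndex(char type) {
    int t = TYPES.indexOf(type);
    if (t < 0) {
      throw new IllegalArgumentException("unknown type prefix: " + type);
    }
    return t;
  }

  // Generic form with an explicit index operand: iload = 21 (0x15) ... aload = 25 (0x19).
  static int generic(char type) {
    return 21 + typeIndex(type);
  }

  // Short form with an implicit index n in 0..3: iload_0 = 26 (0x1a) ... aload_3 = 45 (0x2d).
  static int shortForm(char type, int n) {
    if (n < 0 || n > 3) {
      throw new IllegalArgumentException("n must be in 0..3, got " + n);
    }
    return 26 + 4 * typeIndex(type) + n;
  }

  // Mnemonic -> opcode for all 25 instructions: generic forms first, then short forms.
  static Map<String, Integer> all() {
    Map<String, Integer> table = new LinkedHashMap<>();
    for (char type : TYPES.toCharArray()) {
      table.put(type + "load", generic(type));
    }
    for (char type : TYPES.toCharArray()) {
      for (int n = 0; n <= 3; n++) {
        table.put(type + "load_" + n, shortForm(type, n));
      }
    }
    return table;
  }

  public static void main(String[] args) {
    all().forEach((name, op) -> System.out.printf("%-8s = %d (0x%02x)%n", name, op, op));
  }
}
```

Running main prints one line per instruction in the same "name = decimal (hex)" style used in the lists below.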

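Now for the hands-on part.

Hands-on

Generating the iload and iload_<n> instructions

iload and iload_<n> deal with the int type. For iload_<n>, n can be 0, 1, 2, or 3, so it actually covers the following 4 instructions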

  1. iload_0 = 26 (0x1a)
  2. iload_1 = 27 (0x1b)
  3. iload_2 = 28 (0x1c)
  4. iload_3 = 29 (0x1d)

Chapter 6. The Java Virtual Machine Instruction Set describes iload_<n> as follows

The <n> must be an index into the local variable array of the current frame (§2.6). The local variable at <n> must contain an int. The value of the local variable at <n> is pushed onto the operand stack.

The iload instruction takes an index operand.
Chapter 6. The Java Virtual Machine Instruction Set describes iload as follows

The index is an unsigned byte that must be an index into the local variable array of the current frame (§2.6). The local variable at index must contain an int. The value of the local variable at index is pushed onto the operand stack.

Let's write a program and check whether iload and iload_<n> appear in the compiled class file.
The Java code is as follows

public class Load {
  public static int f(int p0, int p1, int p2, int p3, int p4) {
    return p0 + p1 + p2 + p3 + p4;
  }
}

Run the following commands to compile it and inspect the contents of the class file

javac Load.java
javap -cp . -c Load

The result is as follows

Compiled from "Load.java"
public class Load {
  public Load();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public static int f(int, int, int, int, int);
    Code:
       0: iload_0
       1: iload_1
       2: iadd
       3: iload_2
       4: iadd
       5: iload_3
       6: iadd
       7: iload         4
       9: iadd
      10: ireturn
}

As you can see, iload and all 4 iload_<n> instructions appear in the Code attribute of f().
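
In the listing above, the iload_<n> instructions each take one byte, while iload 4 takes two (offsets 7 and 8), which is why the next instruction sits at offset 9. A minimal sketch of how an encoder might choose between the forms follows. The class and method names are hypothetical, and the three-way choice (short form, generic form, wide-prefixed form with opcode 0xc4 for indexes above 255) is my reading of the specification, not something javac exposes as an API.

```java
// Hypothetical encoder, for illustration only.
final class LoadEncoder {
  // Type prefixes in opcode order: int, long, float, double, reference.
  private static final String TYPES = "ilfda";
  // Opcode of the wide prefix, which widens the index operand to 16 bits.
  private static final int WIDE = 0xc4;

  // Returns the bytes of the shortest load instruction for the given type and slot index.
  static byte[] encode(char type, int index) {
    int t = TYPES.indexOf(type);
    if (t < 0) {
      throw new IllegalArgumentException("unknown type prefix: " + type);
    }
    if (index < 0 || index > 0xffff) {
      throw new IllegalArgumentException("index out of range: " + index);
    }
    int generic = 0x15 + t; // iload, lload, fload, dload, aload
    if (index <= 3) {
      // Short form: the index is implied by the opcode itself.
      return new byte[] {(byte) (0x1a + 4 * t + index)};
    }
    if (index <= 0xff) {
      // Generic form: opcode followed by an unsigned byte index.
      return new byte[] {(byte) generic, (byte) index};
    }
    // Wide form: wide, opcode, then the index as two bytes, high byte first.
    return new byte[] {(byte) WIDE, (byte) generic, (byte) (index >>> 8), (byte) index};
  }

  public static void main(String[] args) {
    for (int index : new int[] {0, 3, 4, 255, 256}) {
      StringBuilder hex = new StringBuilder();
      for (byte b : encode('i', index)) {
        hex.append(String.format("%02x ", b & 0xff));
      }
      System.out.println("int slot " + index + " -> " + hex.toString().trim());
    }
  }
}
```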

Generating the lload and lload_<n> instructions

lload and lload_<n> deal with the long type.
Here n can be 0, 1, 2, or 3, so lload_<n> actually covers the following 4 instructions

  1. lload_0 = 30 (0x1e)
  2. lload_1 = 31 (0x1f)
  3. lload_2 = 32 (0x20)
  4. lload_3 = 33 (0x21)

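The lload and lload_<n> instructions can be generated in a similar way (the code, commands, and output are pasted below without repeating the explanation). A long occupies two consecutive local variable slots, so f0 reaches indexes 0, 2, and 4, while the leading int parameter of f1 shifts its long parameters to indexes 1 and 3.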

public class Load {
  public static long f0(long p0, long p1, long p2) {
    return p0 + p1 + p2;
  }
  public static long f1(int a, long p0, long p1) {
    return p0 + p1;
  }
}

javac Load.java
javap -cp . -c Load

Compiled from "Load.java"
public class Load {
  public Load();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public static long f0(long, long, long);
    Code:
       0: lload_0
       1: lload_2
       2: ladd
       3: lload         4
       5: ladd
       6: lreturn

  public static long f1(int, long, long);
    Code:
       0: lload_1
       1: lload_3
       2: ladd
       3: lreturn
}
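
The slot arithmetic behind these indexes can be sketched as follows. This is a minimal illustration with hypothetical names, relying on the specification's rule that long and double values take two local variable slots and everything else takes one, and that instance methods keep this in slot 0.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the post's Load class.
final class ParamSlots {
  // Returns the local variable index of each parameter.
  // For instance methods (isStatic == false), slot 0 is reserved for `this`.
  static int[] slots(boolean isStatic, Class<?>... params) {
    int[] result = new int[params.length];
    int next = isStatic ? 0 : 1;
    for (int i = 0; i < params.length; i++) {
      result[i] = next;
      // long and double occupy two consecutive slots; all other types occupy one.
      boolean twoSlots = params[i] == long.class || params[i] == double.class;
      next += twoSlots ? 2 : 1;
    }
    return result;
  }

  public static void main(String[] args) {
    // f0(long, long, long): prints [0, 2, 4]
    System.out.println(Arrays.toString(slots(true, long.class, long.class, long.class)));
    // f1(int, long, long): prints [0, 1, 3]
    System.out.println(Arrays.toString(slots(true, int.class, long.class, long.class)));
  }
}
```

The two results line up with the listing: lload_0, lload_2, and lload 4 in f0, and lload_1 and lload_3 in f1.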

Generating the fload and fload_<n> instructions

fload and fload_<n> deal with the float type.
Here n can be 0, 1, 2, or 3, so fload_<n> actually covers the following 4 instructions

  1. fload_0 = 34 (0x22)
  2. fload_1 = 35 (0x23)
  3. fload_2 = 36 (0x24)
  4. fload_3 = 37 (0x25)

The fload and fload_<n> instructions can be generated the same way as iload/iload_<n>.

public class Load {
  public static float f(float p0, float p1, float p2, float p3, float p4) {
    return p0 + p1 + p2 + p3 + p4;
  }
}

Run the following commands to compile it and inspect the contents of the class file

javac Load.java
javap -cp . -c Load

Here is what I see

Compiled from "Load.java"
public class Load {
  public Load();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public static float f(float, float, float, float, float);
    Code:
       0: fload_0
       1: fload_1
       2: fadd
       3: fload_2
       4: fadd
       5: fload_3
       6: fadd
       7: fload         4
       9: fadd
      10: freturn
}

As you can see, fload and all 4 fload_<n> instructions appear in the Code attribute of f().

dload and dload_<n>

dload and dload_<n> deal with the double type.
Here n can be 0, 1, 2, or 3, so dload_<n> actually covers the following 4 instructions

  1. dload_0 = 38 (0x26)
  2. dload_1 = 39 (0x27)
  3. dload_2 = 40 (0x28)
  4. dload_3 = 41 (0x29)

The dload and dload_<n> instructions can also be generated in a similar way.

public class Load {
  public static double f0(double p0, double p1, double p2) {
    return p0 + p1 + p2;
  }
  public static double f1(int a, double p0, double p1) {
    return p0 + p1;
  }
}

javac Load.java
javap -cp . -c Load

Compiled from "Load.java"
public class Load {
  public Load();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public static double f0(double, double, double);
    Code:
       0: dload_0
       1: dload_2
       2: dadd
       3: dload         4
       5: dadd
       6: dreturn

  public static double f1(int, double, double);
    Code:
       0: dload_1
       1: dload_3
       2: dadd
       3: dreturn
}

As you can see, dload and all 4 dload_<n> instructions appear in the Code attributes of f0() and f1().

aload and aload_<n>

aload and aload_<n> deal with reference types.
Here n can be 0, 1, 2, or 3, so aload_<n> actually covers the following 4 instructions

  1. aload_0 = 42 (0x2a)
  2. aload_1 = 43 (0x2b)
  3. aload_2 = 44 (0x2c)
  4. aload_3 = 45 (0x2d)

These instructions can be generated with the following code.

public class Load {
  public static void f0(String s0, String s1, String s2, String s3, String s4) {
    Object o = s0;
    o = s1;
    o = s2;
    o = s3;
    o = s4;
  }
}

javac Load.java
javap -cp . -c Load

Here is what I see

Compiled from "Load.java"
public class Load {
  public Load();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public static void f0(java.lang.String, java.lang.String, java.lang.String, java.lang.String, java.lang.String);
    Code:
       0: aload_0
       1: astore        5
       3: aload_1
       4: astore        5
       6: aload_2
       7: astore        5
       9: aload_3
      10: astore        5
      12: aload         4
      14: astore        5
      16: return
}

As you can see, aload and all 4 aload_<n> instructions appear in the Code attribute of f0().

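Chapter 7. Opcode Mnemonics by Opcode groups the bytecode instructions into categories.

[Figure: Loads.png]
All of the instructions in the green box in the figure above have been covered in this post. That is the end of this post.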