16 October 2019
Note: This post is part of a series on D8 and R8, Android’s new dexer and optimizer, respectively. For an intro to D8 read “Android’s Java 8 support”. For an intro to R8 read “R8 Optimization: Staticization”.
The previous post on R8 covered enum ordinals which then allowed branch elimination to apply to a switch statement. In that post, the full bytecode for switch
on an enum was omitted because there’s actually more to the optimization.
Let’s start with a simple enum and a switch over its contents in two separate source files (this will be important later).
enum Greeting {
FORMAL, INFORMAL
}
class Main {
static String greetingType(Greeting greeting) {
switch (greeting) {
case FORMAL: return "formal";
case INFORMAL: return "informal";
default: throw new AssertionError();
}
}
public static void main(String... args) {
System.out.println(greetingType(Greeting.INFORMAL));
}
}
If we compile and run these files the output is as expected.
$ javac Greeting.java Main.java
$ java -cp . Main
informal
The bytecode in the previous post showed that the compiler produces a call to ordinal()
which is then used in the switch. But if that was all that the compiler did, re-ordering the constants of Greeting
would break the output of Main
.
enum Greeting {
- FORMAL, INFORMAL
+ INFORMAL, FORMAL
}
After changing the constant order, we can recompile only Greeting.java
and yet the application still produces the correct output.
$ javac Greeting.java
$ java -cp . Main
informal
If the bytecode was only relying on the value of ordinal()
, this code would have produced “formal”.
To understand how this works we can look at the Java bytecode of greetingType
.
$ javap -c Main.class
class Main {
static java.lang.String greetingType(Greeting);
Code:
0: getstatic #2 // Field Main$1.$SwitchMap$Greeting:[I
3: aload_0
4: invokevirtual #3 // Method Greeting.ordinal:()I
7: iaload
8: lookupswitch {
1: 36
2: 39
default: 42
}
36: ldc #4 // String formal
38: areturn
39: ldc #5 // String informal
41: areturn
42: new #6 // class java/lang/AssertionError
45: dup
46: invokespecial #7 // Method java/lang/AssertionError."<init>":()V
49: athrow
}
Let’s break the contents down. The first bytecode of this method has a lot of information to unpack:
0: getstatic #2 // Field Main$1.$SwitchMap$Greeting:[I
This looks up a static field on the class Main$1
with the name $SwitchMap$Greeting
and the type int[]
. We obviously did not write this class or field, so it must have been generated automatically by javac
.
The next two bytecodes perform the call to ordinal()
on the method argument.
3: aload_0
4: invokevirtual #3 // Method Greeting.ordinal:()I
Java bytecode is stack-based, so the int[]
result of getstatic
and the int
value of ordinal()
both remain on the stack. (If you don’t understand how a stack-based machine works, you can watch this presentation for an introduction.) The next instruction uses that int[]
and int
as its operands.
7: iaload
This “integer array load” instruction looks up a value in the int[]
at the index returned by ordinal()
. The rest of the bytecodes of the method are a “normal” switch statement which uses the value from the array as its input.
It’s pretty clear that this $SwitchMap$Greeting
array is the mechanism which allows our code to continue to work despite the ordinals changing their value. So how does it work?
When compiled, each case
of the switch
is assigned one-based index. The default
branch is assigned zero.
switch (greeting) {
case FORMAL: ... // <-- index 1
case INFORMAL: ... // <-- index 2
default: ... // <-- index 0
}
The $SwitchMap$Greeting
array is populated at runtime in the static initializer of Main$1
. The empty int[]
is created first and assigned to the $SwitchMap$Greeting
field.
0: invokestatic #1 // Method Greeting.values:()[LGreeting;
3: arraylength
4: newarray int
6: putstatic #2 // Field $SwitchMap$Greeting:[I
The length of this array is the same as the number of constants (which might not match the number of case
blocks). This is important since ordinals are used as an index into this array.
The next bytecodes are repeated for each constant used in the switch statement.
9: getstatic #2 // Field $SwitchMap$Greeting:[I
12: getstatic #3 // Field Greeting.FORMAL:LGreeting;
15: invokevirtual #4 // Method Greeting.ordinal:()I
18: iconst_1
19: iastore
The ordinal of FORMAL
, the first case
subject, is used as the offset in the array where its corresponding switch index value of 1 is stored. The same is done for the ordinal of INFORMAL
and the value 2. This int[]
effectively creates a map from the ordinals which may change to a fixed set of integer values which will not.
By using this map, the switch statement can remain stable even when we re-arrange the constants of Greeting
.
The switch map indirection created by javac
is useful when the enum may be recompiled separately from the callers. Android applications are packaged as a single unit, so the indirection is nothing but wasted binary size and runtime overhead.
Running D8 on the class files from above shows that the indirection is maintained.
$ java -jar $R8_HOME/build/libs/d8.jar \
--lib $ANDROID_HOME/platforms/android-29/android.jar \
--release \
--output . \
*.class
$ $ANDROID_HOME/build-tools/29.0.2/dexdump -d classes.dex
⋮
[00040c] Main.greetingType:(LGreeting;)Ljava/lang/String;
0000: sget-object v0, LMain$1;.$SwitchMap$Greeting:[I
0002: invoke-virtual {v1}, LGreeting;.ordinal:()I
0005: move-result v1
0006: aget v1, v0, v1
0008: packed-switch v1, 00000024
⋮
R8, however, performs whole-program analysis and optimization. There’s no point for it to retain this indirection since the enum cannot change independently of the switch.
[00040c] Main.greetingType:(LGreeting;)Ljava/lang/String;
-0000: sget-object v0, LMain$1;.$SwitchMap$Greeting:[I
0000: invoke-virtual {v1}, LGreeting;.ordinal:()I
0003: move-result v1
-0006: aget v1, v0, v1
0004: packed-switch v1, 00000024
The branches of the switch are rewritten to account for the fact that the input now uses the zero-based ordinal directly instead of the one-based values from the switch map. With the Main$1
class and its array being no longer referenced, it is eliminated like normal dead code.
Only with this indirection removed can the enum ordinal optimization from the previous post result in eliminating the switch. Otherwise, the ordinal value would flow into the int[]
as an index which is not safe to eliminate in the general case.
An enum used in a Kotlin when
will also produce a similar indirection for the same reasons.
val Greeting.type get() = when (this) {
Greeting.FORMAL -> "formal"
Greeting.INFORMAL -> "informal"
}
When compiled, the Java bytecode shows a similar mechanism but with different names.
$ javap -c MainKt
public final class MainKt {
public static final java.lang.String getType(Greeting);
Code:
0: aload_0
1: getstatic #21 // Field MainKt$WhenMappings.$EnumSwitchMapping$0:[I
4: swap
5: invokevirtual #27 // Method Greeting.ordinal:()I
8: iaload
9: tableswitch {
1: 36
2: 41
default: 46
}
⋮
The generated class is suffixed with $WhenMappings
instead of an arbitrary integer and the array is named $EnumSwitchMapping$0
.
R8 initially did not detect Kotlin mappings because of these slightly different names. Version 1.6 of R8 (included in AGP 3.6) will correctly detect and eliminate them.
Switch map elimination is a nice win for binary size and runtime performance. More importantly, by removing an indirection between the input to a switch and its branching logic, other optimizations like turning calls to ordinal()
into a constant can result in branch elimination.
More R8 optimization posts coming soon. Stay tuned!
— Jake Wharton