Devirtualization Analyzer

Extends the Java 1.4 frontend

This is an example of how a large JastAdd project can be extended in an easy and declarative way. The Java 1.4 Frontend is extended with various kinds of static analysis with the goal of finding candidates for devirtualization, i.e., the replacement of virtual method calls with faster static calls. The analysis includes building a call graph, cross-referencing, and reachability. It is illustrated how different computations can be expressed in succint small declarative modules, making use of reference attributes, rewrites, and also circular attributes.

There are several different techniques for devirtualizing method calls, most based on that the complete program and its class hierarchy are known. For example, if we know that a class has no subclasses, calls to its methods can be replaced by static calls. Even more calls can be devirtualized if we look also at classes with subclasses, but find cases where the called method is not overridden in any of the subclasses. Better results still can be achieved by looking also which classes are in fact instantiated in the program, taking the call graph reachability into account.

Below, we show how several of these analyses can be implemented in JastAdd. Note, however, that the implementation is experimental and for illustration purposes only. You can run the tool, but don't trust the figures that come out since we have not yet taken all Java's language features into account. For example, the implementation currently takes classes into account but ignores interfaces.

This project builds on the Java 1.4 Frontend. To run it, you need to download both this project and the Java1.4Frontend, and place them in sibling directories. The modules defined in this project will make use of the abstract syntax and attributes defined in the Java1.4Frontend, and also refine the abstract syntax and add new attributes.

Devirtualization analysis modules

Classes with zero subclasses

A simple devirtualization condition is that a class does not have any subclasses. More precisely, if we have a virtual method call exp.m(), the call to m() is devirtualizable if the formal type of exp is a class that has no subclasses.

This condition is implemented by the attribute ZSdevirt() in the module ZSdevirt.jrag. The implementation uses the type() attribute defined in the Java1.4Frontend, and the subclasses() attribute described below in the Helper modules section.

In order to see how often this condition is fulfilled for real programs, we have implemented an attribute ZSdevirtCountInSourceFiles() for the Program (root) node. It is implemented in ZSdevirtCount.jrag by a simple traversal through those parts of the AST corresponding to source code compilation units (i.e., we do not consider compilation units read in from class files). We also compute the total number of virtual calls, see TotalVirtualCount.jrag. The counts are tested in the testDevirtualization() testcases in quicktests/TestAnalysis.java and largetests/TestJavac.java.

Called methods that are not overridden

A better condition for finding devirtualizable calls is that the called method is not overridden. This allows devirtualization also in some cases when the formal receiver type does have subclasses. More precisely, suppose we have a virtual method call exp.m(), and that the formal type of exp is F. Then m() is devirtualizable if there is no method overriding m() in any class below F.

This condition is implemented by the attribute NOMdevirt() in the module NOMdevirt.jrag. The implementation uses the type(), decl() and instanceOf(TypeDecl) attributes defined in the Java1.4Frontend. The instanceOf(TypeDecl) attribute compares types according to the subtyping relation. In addition, an attribute overriders() is used which is described below in the Helper modules section. The overriders() attribute contains the collection of method declarations that override a given method declaration.

A count is implemented for the NOMdevirt() attribute as well, see NOMdevirtCount.jrag. The testDevirtualization() testcase that was mentioned above tests this count as well.

Taking reachability into account

Even better devirtualization can be done by taking reachability into account, as done by the well-known RTA algorithm. If there is an overriding method, we can ignore it if it is inside a class that is never instantiated. To know which classes are instantiated, we could find the NEW expressions inside methods that are reachable from the main method. As a first step in this direction, we have defined an attribute reachable() for methods in the module Reachable.jrag. This attribute is simple to define using a circular attribute: a method is reachable if it is the main method. In addition, it is reachable if it is called by another method that is also reachable. The definition makes use of an attribute callers() defined in the Helper modules below.

A count is implemented for the reachable() attribute as well, see ReachableCount.jrag. The testDevirtualization() testcase that was mentioned above tests this count as well.

We have not yet completed the implementation of devirtualization analysis based on reachability. You may also notice that the definition of reachability is not complete. For example, we need also take initializers and constructors into account to find calls. And the roots of the reachability analysis should include also run() methods. These additions are straight-forward and will be added in future versions of this example.

Helper modules for class analysis

The Java1.4Frontend includes many attributes useful for the devirtualization computations. But we also need some additional attributes and refinements of the AST, including, e.g., the call graph. These attributes would be useful for many other analyses than devirtualization, so when the implementation is complete it will be useful to move these modules into a separate class analysis component that can be reused together with the Java1.4Frontend.

Call graph

The call graph is implemented by an attribute calls() for method declarations. The calls() is a set of references to all methods potentially called in the method body, taking overriding into account, but not control flow. For example, suppose the method contains a call exp.m(), where the formal type of exp is F. Then F's method m() will be included in calls(), as well as overriding methods m() in subclasses to F. The calls() attribute is implemented in the module CallGraph.jrag.

The CallGraph module uses the attribute overriders() which is a cross-referencing attribute described below. The total number of edges in the call graph are counted in the module CallGraphCount.jrag. This attribute is used by test cases.

The implementation of the call graph is not yet complete. For example, constructor calls need to be taken into account.

Specializing static and virtual method calls

In the Java1.4FrontEnd, there is no syntactic difference between static and virtual method calls. But in the devirtualization it is useful to separate these two cases since it allows computations concerning virtual method calls to not have to take into account that the call might be static.

To implement this separation, we add new subclasses to the abstract syntax in the module staticvirtual.ast. The MethodDot class is specialized into StaticMethodDot and VirtualMethodDot, and similarly for MethodAccess. (A MethodDot models a qualified method call, like "a.b()", whereas a MethodAccess models an unqualified call, like "b()").

To actually perform the separation, we have rewrite rules that replace MethodDot and MethodAccess with the new more specialized classes. This is shown in module SpecializeStaticVirtualCalls.jrag. These rewrites are specializations, i.e., they replace a node of a given class with a node of a subclass, and the rewriting conditions make use of a specialization idiom, stating that the node must be of exactly class MethodDot (or MethodAccess) in order for the rewrite rule to apply. This prevents the rule from being applied again, after the node has already been rewritten to the subclass.

Cross references

Several of the computations make use of cross-referencing attributes. For example, the Java1.4Frontend provides attributes for accessing the superclass from a subclass, but in the ZS devirtualization analysis, the reverse relation is needed: the set of subclasses of each superclass. Similarly, the Java1.4Frontend provides an attribute overrides() for method declarations, but the reverse relation is needed in the NOM devirtualization analysis. A third example is the call graph, where the reverse relation, callers() is needed for the reachability analysis.

A cross-referencing attribute such as the subclasses() attribute is a collection attribute, i.e., it is a collection of references to objects matching a certain condition. To evaluate such an attribute, the entire AST has to be traversed since the matching objects can (in general) be located anywhere in the tree. This is thus a fairly expensive operation. In the case of the subclasses() attribute, each class in the AST will have this attribute, so to evaluate all of them would mean traversing the entire AST once for each class declaration. To improve the performance, we have chosen an alternative implementation where all the subclasses() attributes are evaluated in a single traversal of the AST. This is done by implementing the attribute as a method with hand-coded demand evaluation inside: the first access to any of the subclasses() attributes will trigger the evaluation of all of them. Notice that the API to the computation is the same as for the synthesized attribute solution. A client of the subclasses() attribute does not have to explicitly schedule its evaluation, it can simply use the attribute as if it is already computed.

The same idiom is used for the implementation of all three cross-referencing attributes, as seen in the modules SubclassCrossRefs.jrag, OverridesCrossRefs.jrag, and CallerCrossRefs.jrag. In a future version of JastAdd, we plan to add explicit support for such jointly evaluated collection attributes so that their specification is more concise, and that the code that you see in these modules is generated automatically.

There is once again a counting module as well, aCallerCrossRefsCount.jrag

Additional contents of this directory

DevirtualizationAnalyzer.zip for downloading the example to your computer. To run the example you also need to download the Java 1.4 Frontend, and unpack it in a sibling directory.
tests
- quicktests/TestAnalysis.java: test cases small examples, testing the computations described above.
- largetests/TestJavac.java: test cases for a large example, analyzing the source code for the javac compiler (bundled with the example so you can run it). To run this test case you probably need to increase the size of the Java heap. Use, e.g., java -Xms256M -Xmx256M
testframework: package containing classes useful for testing
- TestCaseParse.java: Parses a set of source files into an AST.
- TestAll.java: Used for finding all testcases when running tests from a terminal window.

build.xml: Ant build file to be used from a terminal window and from Eclipse. Makes use of tools and source files in the Java 1.4 Frontend component.

Use the targets build (default), test, and clean, when working from a terminal window. For example, type

ant	generate all java files and compile them
ant test	run all test cases
ant clean	remove all generated files and all the class files for both generated and handwritten java files

Use the targets gen and cleanGen when working from eclipse (see Running JastAdd Under Eclipse):
ant gen
generate all java files
ant cleanGen
remove all generated files and their class files.

ant gen	generate all java files
ant cleanGen	remove all generated files and their class files.