Thursday, October 6, 2011

Manipulating Java Class Files with ASM 4 - Part One : Hello World!

What is ASM: ASM is an open source Java library for manipulating Java bytecode, so it serves the same purpose as Apache BCEL. This article assumes that the reader has some knowledge of the Java class file format; it is advisable to read about it here. So how is it different from BCEL? Firstly, it offers an event-driven way to manipulate bytecode, which eliminates the need to load the whole class into memory just to make a small addition. Secondly, it does not have a separate class for every single instruction. Instead, it handles opcodes directly and only has a constant representing each one. This reduces the size of the library considerably. So ASM is simply lighter and smarter. However, ASM also provides a mechanism for loading the whole class into memory, in case the operation is too complex to be handled through event-based processing.

The current stable version of ASM is 3.3. However, version 4.0 RC2 is out, so that is the version I am going to discuss here.

Event-based processing vs in-memory processing: ASM supports both event-driven and in-memory processing. Event-driven processing is lightweight but somewhat limiting. In-memory processing, on the other hand, is more flexible and easier to use, but it is more heavyweight. The in-memory processor internally uses the event-driven processor, just as a DOM XML parser can internally use a SAX XML parser.

The following program demonstrates listening to class processing events.



The above program does nothing except print which event is reached when. Here we are visiting the class java.lang.String. Note that in a Java class file, things appear in a fixed order; for example, methods always appear after the fields. The events occur in the same order. For more information, go through the class file format discussed here.

The code itself is very straightforward and does not need much explanation. We first extend the class ClassVisitor and override the necessary methods. Then we set up a ClassReader and call its accept method, passing the ClassVisitor object as a parameter.
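As a rough sketch of these steps (not necessarily the exact program from this post), a visitor along the following lines records the class-level events in the order they arrive. The names VisitEventsDemo, EventRecorder and eventsFor are made up for illustration, and the code assumes an ASM release that accepts Opcodes.ASM4 and understands the version of the class file it is given.

```java
import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.FieldVisitor;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class VisitEventsDemo {

    // Feeds the class bytes through a ClassReader and returns the recorded events.
    static List<String> eventsFor(byte[] classBytes) {
        EventRecorder recorder = new EventRecorder();
        new ClassReader(classBytes).accept(recorder, 0);
        return recorder.events;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: VisitEventsDemo <path to a .class file>");
            return;
        }
        for (String event : eventsFor(Files.readAllBytes(Paths.get(args[0])))) {
            System.out.println(event);
        }
    }
}

// Hypothetical visitor: overrides only the callbacks we care about.
class EventRecorder extends ClassVisitor {
    final List<String> events = new ArrayList<String>();

    EventRecorder() {
        super(Opcodes.ASM4); // ASM 4 API level, no delegate visitor
    }

    @Override
    public void visit(int version, int access, String name, String signature,
                      String superName, String[] interfaces) {
        events.add("visit " + name + " extends " + superName);
    }

    @Override
    public FieldVisitor visitField(int access, String name, String desc,
                                   String signature, Object value) {
        events.add("field " + name + " " + desc);
        return null; // we do not want to look inside the field
    }

    @Override
    public MethodVisitor visitMethod(int access, String name, String desc,
                                     String signature, String[] exceptions) {
        events.add("method " + name + desc);
        return null; // null makes the reader skip the method body
    }

    @Override
    public void visitEnd() {
        events.add("end");
    }
}
```

Returning null from visitField and visitMethod tells the ClassReader that we are not interested in the details of that member, which keeps this example at the class level.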

New in version 4: Up to version 3.3, ClassVisitor was an interface, so we had to implement all of its methods. To spare the developer this chore, there was ClassAdapter, which the developer could extend; it simply forwarded every call to a wrapped ClassVisitor. In version 4, ClassVisitor has become an abstract class and ClassAdapter is no longer there.


Notice that some visitor methods return other visitor objects. The methods of those visitors are called when the corresponding part of the class is traversed. The following example shows how to use a MethodVisitor to step through a method's code.



The following is the output in this case.


<init>()V
177

main([Ljava/lang/String;)V
89
89
3
177


Note that the instructions are reported as raw opcode values.
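To make this output easier to read, here is a small sketch that maps just the three opcode values seen above to their mnemonics from the JVM specification. The class OpcodeNames and its table are illustrative only and deliberately cover nothing else. If the visitor overrides only visitInsn, as the snippet in the comments below suggests, then only instructions without operands are reported, which would explain why so few opcodes show up for each method.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class OpcodeNames {
    // Only the opcodes that appear in the output above; not a complete table.
    private static final Map<Integer, String> NAMES = new LinkedHashMap<Integer, String>();
    static {
        NAMES.put(3, "ICONST_0");  // push the int constant 0
        NAMES.put(89, "DUP");      // duplicate the top stack value
        NAMES.put(177, "RETURN");  // return from a void method
    }

    // Returns the mnemonic, or a generic label for opcodes outside the table.
    static String nameOf(int opcode) {
        String name = NAMES.get(opcode);
        return name != null ? name : "opcode " + opcode;
    }

    public static void main(String[] args) {
        int[] seenInMain = {89, 89, 3, 177};
        for (int opcode : seenInMain) {
            System.out.println(opcode + " = " + nameOf(opcode));
        }
    }
}
```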

Generating a class: Before discussing the details of generating a class, I will talk about a really cool tool that is available in ASM. It's called ASMifier. It's a class that prints the code required to generate a given class. The following example shows how to use ASMifier. The class has a main method and hence can be used from the command prompt.

java -cp asm-all-4.0_RC2.jar org.objectweb.asm.util.ASMifier java.lang.String>>StringDump.java


And the following is the output.



The code shows how to create most class artifacts. I will now explain parts of the code, but using a simpler class for demonstration and with appropriate comments added. The reason I discussed ASMifier first is that if you get stuck on how to create a particular artifact, you can simply compile a class containing it and then run ASMifier on it to see how it is created. I will use the following code as an example. The class ClassCreationDemoMaker creates the class that would have resulted from compiling the class ClassCreationDemo.



The fundamental principle of creating a class is to use a ClassWriter object, which is a ClassVisitor, and call its visit methods in the proper order. In this case we have created only one field and three methods. Unlike javac, ASM does not add a default constructor, so the constructor has to be generated explicitly. The code is simple to understand. Note that after calling visitMethod on the ClassWriter, we use the MethodVisitor it returns (a MethodWriter internally) to write the method's code. Also note that we have manually specified the maximum stack size for each method. Alternatively, we can pass the ClassWriter.COMPUTE_MAXS flag to the ClassWriter constructor to avoid doing this.
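To illustrate the principle on something even smaller, here is a minimal sketch that generates a class with one constructor and one static method, using COMPUTE_MAXS instead of manual stack sizes. The names GenerateDemo, Generated and answer are invented for this example and are not the ClassCreationDemo classes from this post. The four-argument visitMethodInsn is the ASM 4 form; later releases add an extra flag and deprecate it.

```java
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

class GenerateDemo {

    // Emits the equivalent of:
    //   public class Generated {
    //       public Generated() { super(); }
    //       public static int answer() { return 42; }
    //   }
    static byte[] generate() {
        // COMPUTE_MAXS: let ASM work out max stack and max locals.
        ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_MAXS);
        cw.visit(Opcodes.V1_5, Opcodes.ACC_PUBLIC | Opcodes.ACC_SUPER,
                "Generated", null, "java/lang/Object", null);

        // The constructor is not added for us; it must call super().
        MethodVisitor ctor = cw.visitMethod(Opcodes.ACC_PUBLIC, "<init>", "()V", null, null);
        ctor.visitCode();
        ctor.visitVarInsn(Opcodes.ALOAD, 0); // push 'this'
        ctor.visitMethodInsn(Opcodes.INVOKESPECIAL, "java/lang/Object", "<init>", "()V");
        ctor.visitInsn(Opcodes.RETURN);
        ctor.visitMaxs(0, 0); // must still be called; the values are recomputed
        ctor.visitEnd();

        MethodVisitor answer = cw.visitMethod(
                Opcodes.ACC_PUBLIC | Opcodes.ACC_STATIC, "answer", "()I", null, null);
        answer.visitCode();
        answer.visitIntInsn(Opcodes.BIPUSH, 42); // push the constant 42
        answer.visitInsn(Opcodes.IRETURN);
        answer.visitMaxs(0, 0);
        answer.visitEnd();

        cw.visitEnd();
        return cw.toByteArray();
    }

    // Defines the generated bytes in a throwaway class loader and calls answer().
    static int runGenerated() {
        final byte[] bytes = generate();
        ClassLoader loader = new ClassLoader() {
            @Override
            protected Class<?> findClass(String name) throws ClassNotFoundException {
                if (!"Generated".equals(name)) {
                    throw new ClassNotFoundException(name);
                }
                return defineClass(name, bytes, 0, bytes.length);
            }
        };
        try {
            return (Integer) loader.loadClass("Generated").getMethod("answer").invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runGenerated()); // prints 42
    }
}
```

Loading the bytes straight into a class loader is just a convenient way to check the result; writing them to a .class file works equally well.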

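Writing a class is always an in-memory process. This is because every constant used in the class must be in the constant pool, which sits near the start of the class file and stores all the field and method names, among other things. Since the pool's contents are not fully known until the whole class has been visited, the file cannot be written out piece by piece.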
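To see how close the constant pool is to the start of a class file, here is a small sketch that uses only the JDK (no ASM) to read the fixed header fields. ClassHeaderDemo and readHeader are illustrative names; the layout (magic number, minor version, major version, constant pool count) follows the class file format, and the code assumes the class file is visible as a resource to the system class loader.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassHeaderDemo {

    // Returns {magic, minorVersion, majorVersion, constantPoolCount}.
    static int[] readHeader(String internalName) {
        InputStream in = ClassLoader.getSystemResourceAsStream(internalName + ".class");
        if (in == null) {
            throw new IllegalArgumentException("class file not found: " + internalName);
        }
        try (DataInputStream data = new DataInputStream(in)) {
            int magic = data.readInt();             // always 0xCAFEBABE
            int minor = data.readUnsignedShort();
            int major = data.readUnsignedShort();
            int poolCount = data.readUnsignedShort(); // the pool entries follow immediately
            return new int[] {magic, minor, major, poolCount};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader("java/lang/String");
        System.out.println("magic               = 0x" + Integer.toHexString(header[0]));
        System.out.println("version             = " + header[2] + "." + header[1]);
        System.out.println("constant pool count = " + header[3]);
    }
}
```

The count is only ten bytes into the file, ahead of every field and method, which is why a writer has to know the complete pool before it can emit anything after it.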


Modifying classes: Honestly, this is where ASM shines, because of its sheer ease of use. In this section, we are going to create the class ClassModifierDemo, which modifies the class ClassModificationDemo. The job of the modifier is to insert code into each method (including constructors) that prints the method's name whenever it is invoked.





This example is a combination of reading and creating a class. We simply wrap the ClassWriter with our own ClassVisitor. It delegates all calls directly to the ClassWriter except visitMethod, which wraps the MethodVisitor returned by the ClassWriter with our own. Our MethodVisitor in turn delegates all calls to the superclass except visitCode, which it uses to insert the custom code.
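The wrapping described here can be sketched roughly as follows. This is an illustration under assumptions rather than the exact ClassModifierDemo: the names ModifyDemo, MethodNamePrinter and instrument are hypothetical, and COMPUTE_MAXS is used so that the two extra stack slots needed by the inserted println call do not have to be accounted for by hand.

```java
import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

import java.nio.file.Files;
import java.nio.file.Paths;

class ModifyDemo {

    // Reader -> our visitor -> writer: events flow through the chain.
    static byte[] instrument(byte[] original) {
        ClassReader reader = new ClassReader(original);
        ClassWriter writer = new ClassWriter(ClassWriter.COMPUTE_MAXS);
        reader.accept(new MethodNamePrinter(writer), 0);
        return writer.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: ModifyDemo <input .class> <output .class>");
            return;
        }
        byte[] modified = instrument(Files.readAllBytes(Paths.get(args[0])));
        Files.write(Paths.get(args[1]), modified);
    }
}

// Delegates everything to the wrapped visitor except visitMethod.
class MethodNamePrinter extends ClassVisitor {

    MethodNamePrinter(ClassVisitor next) {
        super(Opcodes.ASM4, next);
    }

    @Override
    public MethodVisitor visitMethod(int access, final String name, String desc,
                                     String signature, String[] exceptions) {
        MethodVisitor original = super.visitMethod(access, name, desc, signature, exceptions);
        if (original == null) {
            return null;
        }
        return new MethodVisitor(Opcodes.ASM4, original) {
            @Override
            public void visitCode() {
                super.visitCode();
                // Inserted at the very start: System.out.println("<method name>");
                visitFieldInsn(Opcodes.GETSTATIC, "java/lang/System", "out",
                        "Ljava/io/PrintStream;");
                visitLdcInsn(name);
                visitMethodInsn(Opcodes.INVOKEVIRTUAL, "java/io/PrintStream", "println",
                        "(Ljava/lang/String;)V");
            }
        };
    }
}
```

Because visitCode is only called for methods that have a body, abstract and native methods pass through unchanged.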

After running the program, a new version of ClassModificationDemo will be created in the out directory with the proper package structure. We can then run the ClassModificationDemo class from the out directory and see the result: it prints the name of each method as it is invoked.

13 comments:

  1. Very good walkthrough. Just what I was looking for.
  2. Great Post. It introduced me to the ASM library which is really fascinating.
    I have been trying my hands on ASM library ever since and have created a very simple agent that simply logs the entry and exit to a method.

    I have a small question: do you know of a way to ignore classes in certain packages while instrumenting code with ASM? For example, I want to visit classes in the package org.demo.* and not org.junit.*. Do you know what the best way to do it is?
  3. That's actually very easy, just put an if condition on the class name in your transformer.

  4. Can I configure the whole package name to be ignored somehow instead of ignoring every class in the package?

  5. The class name is a string, so do pattern matching on it. Remember that this code is called once for every class load, so it will not affect the performance of warm systems. There is no way to tell the JVM to ignore those classes; you need to do it in the transformer.
  6. Great. Thanks Debashish. Things are becoming more clear now to me.

  7. Hi, would it be possible to change or add a parent (extends X) for a class?
  8. Yes, implement the visit method of your ClassVisitor. While calling cv.visit from within the visit method, change the superclass name. Here cv is the ClassWriter passed to you in its constructor. You cannot add a parent class, because there is only one parent class. You can only change it. You can however add interfaces, in the same visit method, also change the list of interfaces if you like.

    Check the visit method in http://asm.ow2.org/asm40/javadoc/user/org/objectweb/asm/ClassVisitor.html

  9. Great post for beginners. Now I have some idea how to proceed further. Thanks!! :-)
  10. Hi, very nice post and helpful too. But when I run the DemoClassInstructionViewer class as described in the post to get details about methods inside MethodVisitor, I always get:

    Exception in thread "main" java.lang.IllegalArgumentException
    at org.objectweb.asm.MethodVisitor.<init>(Unknown Source)
    at org.objectweb.asm.MethodVisitor.<init>(Unknown Source)
    at com.geekyarticles.asm.DemoClassInstructionViewer.<init>(DemoClassInstructionViewer.java:18)
    at com.geekyarticles.asm.DemoClassInstructionViewer.main(DemoClassInstructionViewer.java:35)

    Please help..

    Thankyou very much

    ~dina

  11. This comment has been removed by the author.

  12. Thanks for pointing it out.

    The code was written using ASM 4, the current API needs some changes. I will update the code in the blog very soon.

    You need to specify the API as Opcodes.ASM4 and InstructionAdapter must use api as a parameter like the following.

    InstructionAdapter instMv = new InstructionAdapter(api, oriMv) {

        @Override
        public void visitInsn(int opcode) {
            System.out.println(opcode);
            super.visitInsn(opcode);
        }

    };

  13. Even when I tried running this code with ASM 4, I got an error. How can I solve this?

    The error is Exception in thread "main" java.lang.IllegalArgumentException
    at org.objectweb.asm.ClassReader.<init>(Unknown Source)
    at org.objectweb.asm.ClassReader.<init>(Unknown Source)
    at org.objectweb.asm.ClassReader.<init>(Unknown Source)
    at com.geekyarticles.asm.ASMHelloWorld.main(ASMHelloWorld.java:114)