Monday, November 14, 2011

OpenJDK Java8 Lambda Syntax

This document provides some working examples of the future OpenJDK 8 lambda syntax.   There was a lot of discussion whether the language changes would get included in the Java7 or Java8 so it looks like we will see the changes in Java8 slated for release in 2013.

The lambda conversion uses a target SAM type or single abstract method.

You can see the conversion in this example:

    interface F {
        void f();
    }


        final F func = () -> System.out.println("Test");
        System.out.println("Function Object : " + func);
        func.f();
...
The variable type on the left is an interface of type 'F' with a single method 'f'.

It seems that the Java lambda implementation provides a slightly more expressive syntax for code blocks that you would normally see in an anonymous inner class.  These sections of  code are synonymous.

        for (int i = 0; i < 10; i++) {
            final int ii = i;
            final int res6 = ((Fx)( (final int y) -> y + ii )).f(10);          
            System.out.println("[6] Output from recent lambda call index="+i+" : " + res6);
        }
        ...

        for (int i = 0; i < 10; i++) {
            final int ii = i;
            // @line70, anon inner class call
            final Fx fxx = new Fx() {
              public int f(int y) { return y + ii; }
            };
            final int res6 = fxx.f(10);          
            System.out.println("[7] Output from anonymous inner call index="+i+" : " + res6);
        }


Resources:


The Lambda team provided a binary build of Javac and Java for Windows and Linux.

[1] http://jdk8.java.net/lambda/


$ ./java.bat
openjdk version "1.8.0-ea"
OpenJDK Runtime Environment (build 1.8.0-ea-b1314)
OpenJDK Client VM (build 23.0-b04, mixed mode)

https://github.com/berlinbrown/BuildingOpenJDKCookbook

BuildingJDK8/ExampleLambdaProject/LambdaJava8Test.java


http://hg.openjdk.java.net/lambda/lambda/

http://mail.openjdk.java.net/pipermail/lambda-dev/attachments/20100122/3764c21a/attachment.txt

----

The Thoracic system may be compared to a great freight system, with each of its tributaries[Pg 218]—from the main trunk arteries down to the tiniest blood vessels—starting from the heart and carrying its cargo of blood to every part of the body by means of the power furnished by the lungs.



Monday, November 7, 2011

Implementing a Parser and Simple Compiler for the Java Virtual Machine

1. Overview


  This document covers the implementation of a simple recursive-descent parser for an infix adder language with a lexer, parser, and compiler.  The language is implemented with Java and compiles to Java Virtual Machine (JVM) bytecode.  The first implementation of the compiler is different from other implementations because it is fully implemented in one class without third party parser generator libraries.  The implementation of the language only uses the Java standard libraries.  The code for the parser does not use a parser generator so it only contains pure Java standard libraries and no references to third-party libraries.

I wanted to discuss this particular example it helps for you to understand the nature of programming languages and their implementation.  Don't be scared about the parsing input token routines in them.  They help as a tool to the developer.  This a basic implementation but at least I provided a JVM compiler.  Actually code generator from the AST to JVM byte code.


Figure 1 : Mini adder language input (6 tokens are visible)
  The language program in the snippet above consists of a SYMBOL name token, an assignment token, NUMBER value the PLUS token and another NUMBER token.  The result of the add operation is stored in  global memory.  


2. Lexical Analysis


  The lexical analysis consists of reading the character tokens from the input stream and then converting those   ASCII characters into tokens that are used by our language.
Figure 2: Source code for lexer method, convert input ASCII characters into our language token types

  If you look at the code snippet above, the lexer method mostly consists of a large case statement that checks for a particular character and then creates a wrapper token object for that character. Once the token object is created, we add start to add the token node to the abstract syntax tree.

3. Lexer and Simple Parser with ANTLR


The parser example code uses only the Java standard libraries.   Most developers that write domain specific languages or other programming languages may use a parser generator.  ANTLR is a parser generator system written in Java that can output parser code in Java programming language or Python.  You provide a grammar and ANTLR generates the code that you can use for the lexer and parser.  In our case, are going to create a simple grammar and then two classes are created, the lexer and parser.

When then feed the input tokens to our ANTLR code generated code.





4. Parser Implementation and Abstract Syntax Tree (AST)


  The parser implementation converts and analyzes the tokens and coverts the tokens to an abstract syntax tree.  Once we build the AST, then we can traverse the tree and perform operations at each node.

Figure 3.1 : Lookahead parsing (LL(2)) and look for match errors

  We used look ahead two parsing.   Match method.

Figure 3.2 : Output from Abstract Syntax Tree
Figure 3:  Building the AST



4.1 Mini Language Functionality: Adding, Global Memory, Printing

Figure 4:  Visitor implementation, visit the AST expression nodes.  Perform 'add' operation

5. Java Bytecode


  The Java bytecode consists of  the magic number CAFEBABE.  After that comes the constant pool data.  After the constant pool data comes the data buffer which consists of the code block.


5.1 Java Tools : Javap

Figure 5 : Output from Javap Disassemble/Decompile on a Java Class

5.2 The Constant Pool


  The constant pool is a collection of constants that are referenced by ID in the data buffer or code section.


5.3 The Data Buffer and Code Section Attributes

The data buffer code section contains the code.

5.4 Analyzing the OpenJDK Java Source


Figure 6 : OpenJDK Java Source ClassWriter.java, writes Java bytecode to file/stream




6. Mini Compiler Implementation (writing to JVM bytecode)


  When thinking about the Java programming language, think of it as a wrapper for generating Java bytecode.  You don't necessarily need a mainstream programming language.  You do need to format the bytecode input into a class file and feed that into the Java virtual machine.


Figure 7 : Mini adder compiler, adding constants to constant pool. Matches Javap output (see comments)




7. Conclusion


This document covers the implementation of a compiler, a very basic compiler for the JVM.  We looked at the lexer, the parser, the AST, JVM bytecode and compiler.
A. Resources
B. Source Code
https://github.com/berlinbrown/ImperativeStaticCompiledLangOneFile - Github Source, see CompiledLangOneFile.java)

Also see:
http://math-services.appspot.com/lang2.jsp

C. Building the OpenJDK Java Compiler


D. Tools


http://www.antlr.org/


E. Disclaimer


For this document, I included snapshot images of the source instead of the actual source because I plan on converting this document to other formats.  I wanted to keep some of the syntax highlighting from my IDE.

-- Berlin Brown on 9/22/2011