How to use Bayou

Bayou is a system for automatic generation of API idioms in Java, powered by a method called Neural Sketch Learning. Here we walk through the main features of Bayou.

You can think of Bayou as a system for assisting a programmer who is using Java APIs. A programmer interacts with Bayou by writing a draft program (a Java method with “holes”, or missing blocks of code) and supplying some clues in a query about how these holes should be filled. Bayou’s job is then to fill these holes with code in a way that will, hopefully, match the programmer’s intent. More precisely, Bayou infers a probability distribution over the set of candidate completions of a given draft, then returns the top-k programs from this distribution.
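
One way to picture "returns the top-k programs" is as ranking the candidates by probability and keeping the best k. The sketch below illustrates only that reading: the class `TopK`, the method `topK`, and the candidate labels and probabilities are all made up for this example and are not part of Bayou.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not Bayou's actual ranking code.
final class TopK {
    // Returns the k candidates with the highest probability, best first.
    static List<String> topK(Map<String, Double> scored, int k) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(scored.entrySet());
        // Sort by probability, highest first.
        entries.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> best = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            best.add(entries.get(i).getKey());
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up candidates and probabilities, for illustration only.
        Map<String, Double> scored = new LinkedHashMap<>();
        scored.put("read via InputStream", 0.15);
        scored.put("read via BufferedReader", 0.60);
        scored.put("read via Scanner", 0.25);
        // prints [read via BufferedReader, read via Scanner]
        System.out.println(topK(scored, 2));
    }
}
```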

The current version of Bayou allows only one hole per draft, although this is not a fundamental restriction. Each term in a query can be:

  • an API method call that the generated program should use, or
  • an API data type that the user wants the generated code to use.

Now let us examine a few queries in more detail.

Query 1: API calls

A single API call

Suppose you want Bayou to generate a Java method that reads from a given file. You know that this method should call the API method “readLine”. You can express this knowledge as a query using a draft like the following:

import java.io.File;
public class TestIO {
    void read(File file) {
        {
            /// call:readLine
        }
    }  
}

Queries are specified to Bayou using a special /// notation. It is important to note that queries should be specified in an otherwise empty block – note the { } around the line containing ///. This is how Bayou understands that there is a “hole” at that program location.
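
To make the "otherwise empty block" rule concrete, here is a minimal sketch of how such a hole could be recognized in a draft. It is only an illustration: the class `HoleFinder`, the method `findQuery`, and the regular expression are assumptions made for this example, and Bayou itself may locate holes quite differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not how Bayou itself locates holes.
final class HoleFinder {
    // A hole: an opening brace, one /// line, and a closing brace,
    // with nothing but whitespace around the /// line.
    private static final Pattern HOLE =
            Pattern.compile("\\{\\s*///([^\\r\\n]*)\\s*\\}");

    // Returns the query text of the first hole, or null if there is none.
    static String findQuery(String draft) {
        Matcher m = HOLE.matcher(draft);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String draft = String.join("\n",
                "void read(File file) {",
                "    {",
                "        /// call:readLine",
                "    }",
                "}");
        System.out.println(findQuery(draft)); // prints call:readLine
    }
}
```

A block that contains other statements before the /// line does not match the pattern, which mirrors the requirement that the query block be otherwise empty.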

In this query, “call:readLine” suggests to Bayou that the program it generates should preferably make a call to an API method named “readLine”. It is not necessary to specify the class name or full signature of the method; the name alone is sufficient.

You might notice that once you start typing in “call:readLine”, a drop-down box with suggestions will appear below the cursor. You can simply select an element from this list to auto-complete it instead of typing it in full. In fact, every query term must be an element of this list; otherwise, an error is shown when the query is issued to Bayou.

Bayou will replace the empty block in which the query is given, possibly adding relevant imports, and leaving the rest of the code unchanged. Here is a sample program that Bayou is highly likely to return on this input in the top-5 results.

import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.FileNotFoundException;
import java.io.FileReader;
public class TestIO {
  void read(File file) {
    {
      FileReader fr1;
      BufferedReader br1;
      String s1;
      try {
        fr1 = new FileReader(file);
        br1 = new BufferedReader(fr1);
        s1 = br1.readLine();
      } catch (FileNotFoundException _e) {
      } catch (IOException _e) {
      }
      return;
    }
  }
}

This program initializes a FileReader and a BufferedReader, and then calls “readLine” on the BufferedReader. The reason Bayou generates this program is that in real-world code, to read a line from a file, one tends to use BufferedReader, as it reads from the file in a buffered manner. Thus, when a user offers “readLine” as a clue about the program they want, it is quite likely that the above kind of program is what they have in mind.

We will refer to this program later in this document, so let us give it a name: ReadLineFromFile. Naturally, because of the uncertain nature of queries, Bayou will also return some other programs that do not meet your intent. For example, you will often also get a program that uses the file to create a stream, then proceeds to read from the stream. As we explain later, you can supply additional evidence to lower the probability of these results.

Multiple API calls

What if you know more than one method that the synthesized code ought to call? For instance, notice that the program generated above leaves the reader open for further reads from the file, if needed. Suppose you are sure that the reader does not need to stay open and can be closed after reading by calling the “close” API method. In this case, you can add this additional piece of knowledge to the query:

import java.io.File;
public class TestIO {
    void read(File file) {
        {
            /// call:readLine call:close
        }
    } 
}

Invoking Bayou with this would get you the program below in the top-5 results. The program is basically the same as ReadLineFromFile but also closes the file after reading.

import java.io.File;
import java.io.FileNotFoundException;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.FileReader;
public class TestIO {
  void read(File file) {
    {
      BufferedReader br1;
      FileReader fr1;
      String s1;
      try {
        fr1 = new FileReader(file);
        br1 = new BufferedReader(fr1);
        s1 = br1.readLine();
        br1.close();
      } catch (FileNotFoundException _e) {
      } catch (IOException _e) {
      }
      return;
    }
  }
}
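
Note that in the generated program, “close” is reached only when “readLine” succeeds; if “readLine” throws an IOException, the reader stays open. If you adapt the result by hand, a common Java idiom for this situation is try-with-resources. The following is a hand-written sketch for comparison, not output from Bayou; the class name `ReadFirstLine`, the String return value, and the choice to propagate the exception are assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hand-written variant for comparison; this is not output produced by Bayou.
final class ReadFirstLine {
    // Reads the first line of the file. The reader is closed on every path,
    // including when readLine throws.
    static String read(File file) throws IOException {
        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            return br.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Create a small temporary file so the example is self-contained.
        File tmp = File.createTempFile("bayou-demo", ".txt");
        tmp.deleteOnExit();
        Files.write(tmp.toPath(), "hello\nworld\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(read(tmp)); // prints hello
    }
}
```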

 

Query 2: API Data Types

You can also specify queries using API data types. Suppose you know that the code that you want uses the types “FileReader” and “BufferedReader”. You can express this intent in a draft like the following.

import java.io.File;
public class TestIO {
    void read(File file) {
        {
            /// type:FileReader type:BufferedReader
        }
    }
}

The program ReadLineFromFile is likely to be among the top-5 programs returned on this query, although some other programs that use FileReader and BufferedReader in other ways will also likely be returned.

 

Mixed Queries

A key feature of Bayou is that its query interface is multimodal: you are allowed to freely mix different kinds of terms in a query.
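
To illustrate what mixing term kinds means syntactically, the sketch below splits a query line into its terms and groups them by kind. It is a hypothetical reading of the notation used in this guide, not Bayou's own parser: the class `QueryTerms`, the method `parse`, and the error handling are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the query syntax; not Bayou's own parser.
final class QueryTerms {
    // Groups the terms of a query line by kind ("call" or "type").
    static Map<String, List<String>> parse(String line) {
        String text = line.trim();
        if (!text.startsWith("///")) {
            throw new IllegalArgumentException("not a query line: " + line);
        }
        Map<String, List<String>> byKind = new LinkedHashMap<>();
        byKind.put("call", new ArrayList<>());
        byKind.put("type", new ArrayList<>());
        String body = text.substring(3).trim();
        if (body.isEmpty()) {
            return byKind;
        }
        // Terms are separated by whitespace and have the form kind:name.
        for (String term : body.split("\\s+")) {
            String[] parts = term.split(":", 2);
            if (parts.length != 2 || parts[1].isEmpty() || !byKind.containsKey(parts[0])) {
                throw new IllegalArgumentException("bad query term: " + term);
            }
            byKind.get(parts[0]).add(parts[1]);
        }
        return byKind;
    }

    public static void main(String[] args) {
        // prints {call=[readLine], type=[FileReader]}
        System.out.println(parse("/// call:readLine type:FileReader"));
    }
}
```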

Concretely, take again our running scenario. Suppose you want to assert to Bayou that the generated code should likely use the method “readLine” as well as the type “FileReader”. You can do this using a draft such as the following.

import java.io.File;
public class TestIO {
    void read(File file) {
        {
            /// call:readLine type:FileReader
        }
    }
}

The “mixed query” supplied here significantly raises the probability of getting the program ReadLineFromFile, and effectively rules out programs that, for example, read lines from streams rather than files. The variation in the results now arises from different ways of reading from a file (e.g., calling a single “readLine” vs calling it in a loop), and different ways of constructing a reader for the file (some programs use a LineNumberReader created from the FileReader), etc.
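
For a sense of what such a variation looks like, here is a hand-written sketch that combines two of the differences mentioned above: it calls “readLine” in a loop, and it wraps the FileReader in a LineNumberReader. This is not actual Bayou output; the class name `ReadAllLines`, the returned list, and the use of try-with-resources are choices made for this example.

```java
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.LineNumberReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Hand-written illustration of one variation; not output produced by Bayou.
final class ReadAllLines {
    // Calls readLine in a loop until the end of the file is reached,
    // prefixing each line with its line number.
    static List<String> read(File file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (LineNumberReader lnr = new LineNumberReader(new FileReader(file))) {
            String line;
            while ((line = lnr.readLine()) != null) {
                lines.add(lnr.getLineNumber() + ": " + line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Create a small temporary file so the example is self-contained.
        File tmp = File.createTempFile("bayou-demo", ".txt");
        tmp.deleteOnExit();
        Files.write(tmp.toPath(), "alpha\nbeta\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(read(tmp)); // prints [1: alpha, 2: beta]
    }
}
```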

Queries and surrounding code

Bayou implicitly adds the types of the draft method’s formal parameters to the query, as terms of the “type:” kind. In addition, you can combine queries, declared within the /// block, with regular code outside the block that declares and initializes variables. In this case, Bayou works with those variables when generating programs in the same way it works with the formal parameters.

The current version of Bayou, however, does not work with any other arbitrary code in the draft method. Although this is not a technical limitation, allowing arbitrary code might give the user the impression that Bayou would interpret and work with that code, which it currently does not.

To declare and initialize variables for Bayou to work with, simply provide those statements within the draft method, but outside the (empty) query block. Here is a legitimate draft of this sort:

import java.io.File;
public class TestIO {
    void read() {
        File file = new File("abc.txt");
        {
            /// call:readLine call:close
        }
    }
}

In the returned programs, Bayou would replace the query block with code.

Return values

Bayou generates return statements for the draft method based on its return type. It will try its best to match and return a variable of the right type in the generated program. If no such variable is found, Bayou will return some default value (null for objects, 0 for integers, etc.).
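
The fallback rule can be sketched as a small lookup from the declared return type to a default value. The code below is a hypothetical illustration, not Bayou's implementation: the class `DefaultReturn` and its methods are invented for this example, and only "null for objects" and "0 for integers" come from the description above; the other cases are assumptions.

```java
// Hypothetical illustration of the fallback rule; not Bayou's implementation.
final class DefaultReturn {
    // Returns the text of a default value for a declared return type,
    // or an empty string when the method returns nothing.
    static String defaultValueFor(String returnType) {
        switch (returnType) {
            case "void":
                return "";
            case "boolean":
                return "false"; // assumption: not stated in this guide
            case "int":
            case "long":
            case "short":
            case "byte":
                return "0";     // "0 for integers"
            default:
                return "null";  // "null for objects"
        }
    }

    // Builds the return statement that would be emitted for the type.
    static String returnStatement(String returnType) {
        String value = defaultValueFor(returnType);
        return value.isEmpty() ? "return;" : "return " + value + ";";
    }

    public static void main(String[] args) {
        System.out.println(returnStatement("String")); // prints return null;
        System.out.println(returnStatement("int"));    // prints return 0;
        System.out.println(returnStatement("void"));   // prints return;
    }
}
```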

For instance, in our running example, if you would like Bayou to return the line that is read from the file, you can provide the return type “String” as follows:

import java.io.File;
public class TestIO {
    String read(File file) {
        {
            /// call:readLine
        }
    }
}

With this draft program, Bayou would likely generate a program similar to ReadLineFromFile, but one that returns the read line:

import java.io.IOException;
import java.io.File;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileNotFoundException;
public class TestIO {
  String read(File file) {
    {
      BufferedReader br1;
      FileReader fr1;
      String s1;
      try {
        fr1 = new FileReader(file);
        br1 = new BufferedReader(fr1);
        s1 = br1.readLine();
      } catch (FileNotFoundException _e) {
      } catch (IOException _e) {
      }
      return s1;
    }
  }
}

Note that Bayou does not currently pass the return type into the query implicitly.