
A Java GUI for Genetic Programming























Dr. David Wedge has produced a Java genetic programming library, 'geneticWedge', that can easily be called from a command-line Java program.

To provide wider access to the software within the research group, I have developed a Graphical User Interface (GUI) front end for the library.

At the moment, the GUI software is best regarded as a first-draft alpha release; however, it should be usable.

I'm particularly keen to get feedback: bug reports naturally, but I would also welcome suggestions for improvements and new features.

As yet, there's no documentation, help or instructions - hopefully anyone familiar with genetic programming will be able to work out the basic operation of the GUI. I will be adding documentation / help in due course.

BTW, I'd certainly like to thank David for coding the gp library in the first place - I think he had the harder task!

Update: some minor fixes and improvements, and updated to newer versions of the CDK and JFreeChart.





Prerequisites




1) Java Standard Edition 1.6
To run Java programs, you will need the Java Runtime Environment (JRE) installed. If you plan to develop or write Java programs, you'll need the JDK. I've used a little Java 1.6-specific code, so users of Java 1.5 will need to upgrade.

2) Chemistry Development Kit 1.2.2
I used the CDK to allow the rendering of SMILES strings as 2D molecular graphics. Parsing and displaying the graphics is quite slow, so this feature is not really usable for very large tables (although the current version parses and renders a lot faster than earlier versions). SMILES rendering is currently only implemented for row labels. The latest version of the CDK can be found here.

Note: Using the CDK, I believe there's potential for adding GP nodes / functions which implement chemical operations - let me know if you have any ideas along these lines.

3) JFreeChart 1.0.13 & JCommon 1.0.16
These Java libraries were used to provide simple 2D charting, e.g. of the fitness history. The latest versions are here and here.

I've provided links to the versions of these Java libraries I used during development of the GUI just in case the latest versions introduce compatibility problems.




Downloads




1) The GP library. This is a slightly modified version of David's library: extra classes were added, and a few existing ones were modified, to enable the GP code to run interactively. I'll link to his library once he incorporates the changes into his code.

2) The GP GUI.

3) CDK 1.2.2

4) JFreeChart 1.0.13

5) JCommon 1.0.16

I will release the GP GUI source code once the software moves out of the alpha development stage. However, if you are keen to get your hands on it sooner, please request it by email.




Running the Software




This is really just a quick and dirty explanation of setting up and running the JavaGPGUI; I'll certainly expand on this with proper documentation in due course. If anyone needs help installing or using the software, just give me an (email) shout.

The data to be modelled should be in the form of a comma-separated values (.csv) text file.

Assuming all the jar files have been downloaded to the same directory, the following command, run from that directory, will start the GUI:

On Linux (prefix the class path with a colon & use the '/' character):

java -cp :./* JavaGP.JavaGPGUI 

(You might also need to ensure that the files are set to executable with chmod +x *.jar).

On Windows (prefix the class path with a semicolon & use the '\' character):

java -cp ;.\* JavaGP.JavaGPGUI

Once you're sure this works, you can create a program shortcut / link specifying the same command.

To open your data file use:
File -> Import Data -> CSV File

You will need to use the check boxes at the right-hand side of the open file dialogue to specify whether there are column headers and / or row labels.
For row labels, you can also specify that they are SMILES strings, though as noted above, this feature is only really usable for small tables.

Once you've loaded your data, it will appear as a table under the Data tab.

Row labels will appear in the left-hand pane, and the data in the right-hand pane (with scroll bars). It is assumed that the last column is the target 'y-data', and the remainder are input 'x-data'. If not, you can right-click the desired target and select 'promote to target'.
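The column convention above can be illustrated with a minimal sketch. This is not the GUI's or geneticWedge's actual code: the class TargetColumnSketch and its methods are hypothetical names, and the naive comma split assumes no quoted fields. Passing a different targetCol mimics the effect of 'promote to target'.

```java
/** Hypothetical helper for illustration only; not part of geneticWedge or the GUI. */
final class TargetColumnSketch {

    /** Parses CSV lines into numeric rows, skipping an optional header line and row-label column. */
    static double[][] parse(String[] lines, boolean hasHeader, boolean hasRowLabels) {
        int firstRow = hasHeader ? 1 : 0;
        int firstCol = hasRowLabels ? 1 : 0;
        double[][] rows = new double[lines.length - firstRow][];
        for (int i = firstRow; i < lines.length; i++) {
            String[] cells = lines[i].split(","); // naive split: assumes no quoted fields
            double[] row = new double[cells.length - firstCol];
            for (int j = firstCol; j < cells.length; j++) {
                row[j - firstCol] = Double.parseDouble(cells[j].trim());
            }
            rows[i - firstRow] = row;
        }
        return rows;
    }

    /** Extracts the target 'y-data' column; by default this would be the last column. */
    static double[] target(double[][] rows, int targetCol) {
        double[] y = new double[rows.length];
        for (int i = 0; i < rows.length; i++) {
            y[i] = rows[i][targetCol];
        }
        return y;
    }

    /** Extracts the input 'x-data': every column except the target. */
    static double[][] inputs(double[][] rows, int targetCol) {
        double[][] x = new double[rows.length][];
        for (int i = 0; i < rows.length; i++) {
            double[] xi = new double[rows[i].length - 1];
            int k = 0;
            for (int j = 0; j < rows[i].length; j++) {
                if (j != targetCol) {
                    xi[k++] = rows[i][j];
                }
            }
            x[i] = xi;
        }
        return x;
    }
}
```

For a file with a header line and a row-label column, parse(lines, true, true) followed by target(rows, rows[0].length - 1) gives the default 'last column is the target' behaviour.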

The Functions / Data tab presents the selection of functions to be included, the data pre-processing required (e.g. normalisation), and the training set / validation set selection method.

The GP Setting tab allows you to set the mutation rates, population sizes, the maximum number of generations, etc. For a typical run, you can just use the default parameters.
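For readers unfamiliar with these two steps, here is a rough sketch of one common choice for each: min-max normalisation of the inputs and a random training / validation split. Which methods the GUI actually offers is not stated here, so treat this as an assumption; PreprocessSketch, minMaxNormalise and randomSplit are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration only; the GUI's own pre-processing may differ. */
final class PreprocessSketch {

    /** Rescales each column of x to [0, 1] in place; constant columns are set to 0. */
    static void minMaxNormalise(double[][] x) {
        if (x.length == 0) {
            return;
        }
        int cols = x[0].length;
        for (int j = 0; j < cols; j++) {
            double min = Double.POSITIVE_INFINITY;
            double max = Double.NEGATIVE_INFINITY;
            for (double[] row : x) {
                min = Math.min(min, row[j]);
                max = Math.max(max, row[j]);
            }
            double range = max - min;
            for (double[] row : x) {
                row[j] = (range == 0.0) ? 0.0 : (row[j] - min) / range;
            }
        }
    }

    /**
     * Randomly assigns row indices to two sets.
     * Returns {trainingIndices, validationIndices}; validationFraction is assumed to lie in [0, 1].
     */
    static int[][] randomSplit(int rowCount, double validationFraction, long seed) {
        List<Integer> idx = new ArrayList<Integer>();
        for (int i = 0; i < rowCount; i++) {
            idx.add(i);
        }
        Collections.shuffle(idx, new Random(seed));
        int nVal = (int) Math.round(rowCount * validationFraction);
        int[] val = new int[nVal];
        int[] train = new int[rowCount - nVal];
        for (int i = 0; i < rowCount; i++) {
            if (i < nVal) {
                val[i] = idx.get(i);
            } else {
                train[i - nVal] = idx.get(i);
            }
        }
        return new int[][] { train, val };
    }
}
```

Every row index lands in exactly one of the two sets, which is the property the fixed bug listed under Known Bugs was about.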

Select Run and the GP will start running with the Fitness History, the Best Individual (so far) and a plot of the Best Individual Fit to Target appearing in the Data / Results tab.

At the end of the run, you can select File -> Save -> Save Results to save the results to a CSV file.

As the software stands at the moment, it's not possible to save the data table or the 'project' settings; however, this is on the list of features planned for later implementation.






To Do



These are my current ideas for new features, in no particular order:
  • Add a tab for 'Test Data': import test data from csv; move data from input data table to test data table; select an 'individual' from the fitness history and evaluate this against the test data. Include the ability to save the results.
  • Have a setting for the GP to evaluate a random subset of training or validation data during each generation.
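
As a rough idea of how the random-subset setting could work, the sketch below draws a fresh set of distinct row indices that a fitness function could be restricted to in each generation. Nothing like this exists in the library yet as far as this page states; SubsetSketch and randomSubset are hypothetical names, and a partial Fisher-Yates shuffle is just one way to do the sampling.

```java
import java.util.Random;

/** Hypothetical illustration of per-generation subsampling; not an existing feature. */
final class SubsetSketch {

    /** Picks subsetSize distinct row indices out of rowCount using a partial Fisher-Yates shuffle. */
    static int[] randomSubset(int rowCount, int subsetSize, Random rng) {
        if (subsetSize < 0 || subsetSize > rowCount) {
            throw new IllegalArgumentException("subsetSize must be between 0 and rowCount");
        }
        int[] idx = new int[rowCount];
        for (int i = 0; i < rowCount; i++) {
            idx[i] = i;
        }
        for (int i = 0; i < subsetSize; i++) {
            int j = i + rng.nextInt(rowCount - i); // choose from the not-yet-fixed tail
            int tmp = idx[i];
            idx[i] = idx[j];
            idx[j] = tmp;
        }
        int[] subset = new int[subsetSize];
        System.arraycopy(idx, 0, subset, 0, subsetSize);
        return subset;
    }
}
```

Calling randomSubset once per generation with the same Random instance would give each generation a different sample while keeping a run reproducible from its seed.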



Known Bugs



  • Training fitness and validation fitness returned to the GUI from the GP are the same, even for bad validation data. Cause: the GUI wasn't splitting the training / validation data properly. Status: FIXED.



Last update: 29 May 2009

