Classification module documentation



JCLEC-Classification is an intuitive, usable and extensible open source module for genetic programming (GP) classification algorithms. It lets researchers and end users develop classification algorithms based on GP and grammar-guided genetic programming (G3P) models. It houses implementations of rule-based classification methods based on GP, supports multiple model representations, and gives users the tools to implement their own classifiers. The library is a module for JCLEC, a software system for Evolutionary Computation (EC) research developed in the Java programming language. JCLEC provides a high-level software environment for implementing any kind of Evolutionary Algorithm (EA), with support for genetic algorithms (binary, integer and real encoding), genetic programming (Koza style, strongly typed, and grammar based) and evolutionary programming.

The classification module includes several GP and G3P proposals described in the literature, and provides the classes and methods needed to develop evolutionary algorithms for classification problems.

JCLEC classification module

Like the JCLEC core, the JCLEC classification module is organized in packages. This section describes its main packages. For further information, please visit the API reference. The following figure shows the class diagram for the JCLEC classification module.

JCLEC class diagram


This package contains all the interfaces of the JCLEC classification module. These interfaces represent classifiers and individuals.


  • IClassifier
  • IClassifierIndividual


This base package provides the abstract classes with the properties and methods that any classification algorithm must contain.


  • ClassificationAlgorithm
  • ClassificationReporter
  • Rule
  • RuleBase


This package contains implementations of the primitive functions that can be used in an expression tree node.


  • And
  • Or
  • Not
  • Equal
  • NotEqual
  • Less
  • LessOrEqual
  • Greater
  • GreaterOrEqual
  • In
  • Out
  • AttributeValue
  • ConstantValue
  • RandomConstantOfContinuousValues
  • RandomConstantOfDiscreteValues
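
To make the role of these primitives more concrete, the following is a minimal sketch of how relational and logical primitives can be composed into a rule condition and evaluated against one instance. The names `Condition`, `CrispPrimitives` and `CrispPrimitivesDemo` are hypothetical stand-ins written for this illustration and are not the JCLEC classes; treating the bounds of `in` as inclusive is also an assumption.

```java
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT the JCLEC classes.
interface Condition {
    boolean test(Map<String, Double> instance);
}

final class CrispPrimitives {
    private CrispPrimitives() {}

    // Relational primitives compare an attribute value with a constant.
    static Condition less(String attr, double c) {
        return x -> x.get(attr) < c;
    }

    static Condition greaterOrEqual(String attr, double c) {
        return x -> x.get(attr) >= c;
    }

    // IN holds when the attribute lies inside [lo, hi] (inclusive bounds assumed);
    // OUT is its complement.
    static Condition in(String attr, double lo, double hi) {
        return x -> x.get(attr) >= lo && x.get(attr) <= hi;
    }

    static Condition out(String attr, double lo, double hi) {
        return x -> !in(attr, lo, hi).test(x);
    }

    // Logical primitives combine sub-trees of the expression tree.
    static Condition and(Condition a, Condition b) {
        return x -> a.test(x) && b.test(x);
    }

    static Condition or(Condition a, Condition b) {
        return x -> a.test(x) || b.test(x);
    }

    static Condition not(Condition a) {
        return x -> !a.test(x);
    }
}

class CrispPrimitivesDemo {
    public static void main(String[] args) {
        // Tree for: (AND (IN PetalWidth 0.5 1.8) (NOT (< PetalLength 3.0)))
        Condition rule = CrispPrimitives.and(
                CrispPrimitives.in("PetalWidth", 0.5, 1.8),
                CrispPrimitives.not(CrispPrimitives.less("PetalLength", 3.0)));

        Map<String, Double> flower = Map.of("PetalWidth", 1.3, "PetalLength", 4.2);
        System.out.println(rule.test(flower)); // prints "true"
    }
}
```

Each primitive returns a small function object, so a rule is simply a tree of nested calls whose leaves read attribute values and constants.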


This package contains implementations of the fuzzy primitive functions that can be used in an expression tree node.


  • Is
  • Maximum
  • MembershipFunction
  • Minimum
  • TriangularMembershipFunction
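
As an illustration of the kind of computation these fuzzy primitives perform, the sketch below implements a textbook triangular membership function and combines two membership degrees with minimum and maximum. The class names `Triangular`, `FuzzyOps` and `FuzzyDemo` are hypothetical and do not reproduce the JCLEC API; reading Minimum and Maximum as the usual fuzzy AND and OR operators is an assumption.

```java
// Hypothetical stand-ins for illustration only; these are NOT the JCLEC classes.
final class Triangular {
    private final double a, b, c; // left foot, peak, right foot (a < b < c)

    Triangular(double a, double b, double c) {
        if (!(a < b && b < c)) {
            throw new IllegalArgumentException("require a < b < c");
        }
        this.a = a;
        this.b = b;
        this.c = c;
    }

    // Membership degree in [0, 1]: rises linearly from a to b, falls from b to c.
    double degree(double x) {
        if (x <= a || x >= c) {
            return 0.0;
        }
        return x <= b ? (x - a) / (b - a) : (c - x) / (c - b);
    }
}

final class FuzzyOps {
    private FuzzyOps() {}

    // Assumed semantics: minimum acts as fuzzy AND, maximum as fuzzy OR.
    static double minimum(double u, double v) {
        return Math.min(u, v);
    }

    static double maximum(double u, double v) {
        return Math.max(u, v);
    }
}

class FuzzyDemo {
    public static void main(String[] args) {
        Triangular mediumWidth = new Triangular(0.5, 1.25, 2.0);
        Triangular longLength = new Triangular(3.0, 5.0, 7.0);

        double w = mediumWidth.degree(1.25); // at the peak: 1.0
        double l = longLength.degree(4.0);   // halfway up the left slope: 0.5

        System.out.println(FuzzyOps.minimum(w, l)); // prints 0.5
        System.out.println(FuzzyOps.maximum(w, l)); // prints 1.0
    }
}
```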


This package provides implementations that represent the phenotype of a crisp rule-base individual.


  • CrispRule
  • CrispRuleBase


This package provides implementations that represent the phenotype of a fuzzy rule-base individual.


  • FuzzyRule
  • FuzzyRuleBase


This package defines the necessary classes to implement genetic programming encoding individuals.


  • ExprTreeSpecies
  • ExprTreeRuleIndividual


This package defines the necessary classes to implement genetic programming encoding multiple individuals.


  • MultiExprTreeRuleIndividual


This package defines the necessary classes to implement grammar guided genetic programming encoding multiple individuals.


  • MultiSyntaxTreeRuleIndividual


This package defines the necessary classes to implement grammar guided genetic programming encoding individuals.


  • SyntaxTreeRuleIndividual
  • SyntaxTreeSchema
  • SyntaxTreeSpecies


This package defines the listener to obtain reports in each generation.


  • RuleBaseReporter

API Reference

For further information, please visit the API reference.


The JCLEC core and the classification module can be obtained as follows:

A tutorial for the JCLEC classification module is available for download.

Running a classification algorithm

This section describes how to encode the configuration file required to run the algorithm in the JCLEC classification module.

How to encode the configuration file

The configuration file comprises a series of parameters required to run the algorithm. For further details, see the configuration page. The configuration file for the Tan algorithm is shown below.

	<process algorithm-type="net.sf.jclec.problem.classification.algorithm.tan.TanAlgorithm">
		<rand-gen-factory type="net.sf.jclec.util.random.RanecuFactory" seed="123456789"/>
		<dataset type="net.sf.jclec.problem.util.dataset.KeelDataSet">
		<listener type="net.sf.jclec.problem.classification.listener.RuleBaseReporter">

How to execute the Tan et al. algorithm

Once we have downloaded JCLEC, the JCLEC classification module and its examples, and designed our experiment in the configuration file, we can execute the experiment.

Using JAR file

You can execute JCLEC modules using a JAR file.

java -jar jclec4-classification.jar examples/Tan.cfg

Using Eclipse

You can execute JCLEC modules using Eclipse.

Run -> Run Configurations

Create a new launch configuration as Java Application


Project: jclec4-classification

Main class: net.sf.jclec.RunExperiment


Program arguments: examples/Tan.cfg

Finally, we execute our algorithm by clicking on the Run button.

Tan et al. algorithm results

File name: data/iris/iris-10-1tst.dat
Runtime (s): 4.407
Number of different attributes: 4
Number of rules: 4
Number of conditions: 8
Average number of conditions per rule: 2,0
Accuracy (Percentage of correct predictions): 0,9333
Geometric mean: 0,9283
Cohen's Kappa rate: 0,9000
AUC: 0,9667

Percentage of correct predictions per class
Class Iris-setosa: 100,00%
Class Iris-versicolor: 100,00%
Class Iris-virginica: 80,00%
End percentage of correct predictions per class
1 Rule: IF (AND NOT AND AND IN SepalLength 5.521539 7.4334537 < PetalWidth 1.579072 > PetalWidth 1.275671 < PetalWidth 0.769985 ) THEN (Class = Iris-setosa)
2 Rule: ELSE IF (AND IN PetalWidth 0.582293 1.815772 IN PetalWidth 0.190182 1.7987844 ) THEN (Class = Iris-versicolor)
3 Rule: ELSE IF (>= PetalLength 4.755571 ) THEN (Class = Iris-virginica)
4 Rule: ELSE (Class = Iris-setosa)

Test Classification Confusion Matrix

			C0	C1	C2	|
	Actual	C0	5	0	0	|	C0 = Iris-setosa
		C1	0	5	0	|	C1 = Iris-versicolor
		C2	1	0	4	|	C2 = Iris-virginica
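
The summary figures in the report can be cross-checked against the confusion matrix above. The sketch below uses the standard textbook definitions of accuracy, geometric mean of per-class hit rates, and Cohen's kappa; whether JCLEC computes them in exactly this way is an assumption, although these definitions do reproduce the reported values. The class names `ConfusionMetrics` and `ConfusionMetricsDemo` are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; NOT part of JCLEC.
final class ConfusionMetrics {
    private ConfusionMetrics() {}

    // Accuracy: correctly classified instances (diagonal) over all instances.
    static double accuracy(int[][] m) {
        int correct = 0;
        int total = 0;
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                total += m[i][j];
                if (i == j) {
                    correct += m[i][j];
                }
            }
        }
        return (double) correct / total;
    }

    // Geometric mean of the per-class hit rates (rows are the actual classes).
    static double geometricMean(int[][] m) {
        double product = 1.0;
        for (int i = 0; i < m.length; i++) {
            int rowSum = 0;
            for (int v : m[i]) {
                rowSum += v;
            }
            product *= (double) m[i][i] / rowSum;
        }
        return Math.pow(product, 1.0 / m.length);
    }

    // Cohen's kappa: (po - pe) / (1 - pe), where pe is the agreement
    // expected by chance from the row and column marginals.
    static double kappa(int[][] m) {
        int n = m.length;
        double total = 0.0;
        double[] row = new double[n];
        double[] col = new double[n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                row[i] += m[i][j];
                col[j] += m[i][j];
                total += m[i][j];
            }
        }
        double pe = 0.0;
        for (int i = 0; i < n; i++) {
            pe += (row[i] / total) * (col[i] / total);
        }
        double po = accuracy(m);
        return (po - pe) / (1.0 - pe);
    }
}

class ConfusionMetricsDemo {
    public static void main(String[] args) {
        // Rows: actual class, columns: predicted class (C0, C1, C2).
        int[][] m = { {5, 0, 0}, {0, 5, 0}, {1, 0, 4} };
        System.out.println(String.format(Locale.ROOT, "%.4f", ConfusionMetrics.accuracy(m)));      // 0.9333
        System.out.println(String.format(Locale.ROOT, "%.4f", ConfusionMetrics.geometricMean(m))); // 0.9283
        System.out.println(String.format(Locale.ROOT, "%.4f", ConfusionMetrics.kappa(m)));         // 0.9000
    }
}
```

Here 14 of the 15 test instances lie on the diagonal, which gives the accuracy of 0.9333, and the single misclassified Iris-virginica instance accounts for the 80% hit rate of that class.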

Implementing new algorithms

Adding a new classification algorithm is straightforward. The following classes are required:


A new algorithm included in the module should inherit from the ClassificationAlgorithm class. In this new class, all its properties should be defined (parent selector, recombinator, mutator, species, etc.). Each of these properties is configured by means of the configuration file, which specifies the classes and the attribute values. (See [Bojarczuk Algorithm example]).


This new class inherits from the AbstractEvaluator class and evaluates each individual in the evolutionary algorithm. (See [Bojarczuk Evaluator example]).


This class represents the species to be used in the evolutionary algorithm. Here, the user can choose between expression-tree and syntax-tree representations. Each GP classification individual is represented by the ExprTreeRuleIndividual class, which comprises all the required features: the genotype, the phenotype, and the fitness value. The nodes and functions in GP trees are defined by the ExprTreeSpecies class. Similarly, the SyntaxTreeRuleIndividual class specifies all the features required to represent a G3P individual, while the SyntaxTreeSpecies class defines the terminal and nonterminal symbols of the grammar used to generate the individuals. (See [Bojarczuk Species example]).

Running Tests

This section presents the results of running unit tests over the algorithms available in the JCLEC classification package. A unit test is a piece of code, written by a developer, that exercises a specific piece of functionality. Unit tests can verify that the functionality works and that it still works after code changes.

The unit tests were written with the JUnit framework.

Falco unit test


Tan unit test


Bojarczuk unit test