Tutorial

From JCLEC wiki
Jump to: navigation, search

The tutorial covers four different domains and problems using genetic algorithms (GAs) and Genetic Programming.

GAs using binary encoding: Knapsack problem

This section aims to introduce the use of JCLEC by analyzing how to solve a simple optimization problem using a simple genetic algorithm. The problem considered is the knapsack problem. In this problem, given a set of items, each with a weight and a value, the objective is to determine the number of each item to include in the knapsack, so that the total weight is less than a given limit and the total value is as large as possible.


Problem definition

To solve the knapsack problem and in general any other problem in JCLEC, firstly it is necessary to define the problem that is intended to be solved, and then it is necessary to select one of the algorithms available in the library and set up its parameters.

Consider n distinct objects and a knapsack. Each object has a weight <math>w_i</math> and a price <math>p_i</math> associated for <math>i = 1, 2, ..., n</math>. The knapsack can carry a weight not exceeding a given amount W. Taking into account these considerations, the goal of the problem is to fill the knapsack in order to maximize the value of the objects contained. Notice that any object can be put in the knapsack if there is enough space for it, but the knapsack can not carry an object fraction. Therefore, to achieve the desired solution, for each object it is necessary to decide either it is put it in the knapsack or not. Each object has a variable <math>x_i</math> associated that takes the value 1 if the object is introduced in the knapsack, or 0 otherwise.

The problem can be formulated mathematically as shown below:

<math>\sum_{i=1}^{n} x_{i}*v_{i}</math> subject to <math>\sum_{i=1}^{n} x_{i}*w_{i} \le W</math>

With respect to the parameter configuration, the following restrictions to the knapsack problem are considered: the maximum weight allowed is equal to 4200 grams, the number of articles per type can not exceed from 10 and there are 10 different kinds of objects.


Encoding criterion

It is necessary to determine that an individual follows a binary encoding approach. In this scheme, the individual encodes a binary genotype, whose size is given by the number of different articles considered and the maximum number of articles per type (i.e., in our example the size is 10*10 = 100 genes), and where each gene corresponds to the object number in Table. Notice that a value of 1 in the i−th gene indicates that the i−th article has been put in the knapsack, while a value of 0 indicates the opposite. For example, the following genotype indicates that the individual includes the first, third, fourth and sixth objects in the knapsack:

1 0 1 1 0 1 0 0 0 0


Object Weight Value
1 150 20
2 325 40
3 600 50
4 805 36
5 430 25
6 1200 64
7 770 54
8 60 18
9 930 46
10 353 28


Implementation

This section describes how to encode the configuration file required to run the algorithm in the library JCLEC and how to implement basic methods for evaluating and comparing individuals in the population.

<experiment>
	 <process algorithm-type="net.sf.jclec.algorithm.classic.SG">
		 <rand-gen-factory type="net.sf.jclec.util.random.RanecuFactory" seed="987328938"/>
		 <population-size>100</population-size>
		 <max-of-generations>1000</max-of-generations>
		 <species type="net.sf.jclec.binarray.BinArrayIndividualSpecies" genotype-length="100"/>
		 <evaluator type="tutorial.Knapsack">
			 <products>
				 <max-each-product>10</max-each-product>
				 <product weight="150" profit="20"/>
				 <product weight="325" profit="40"/>
				 <product weight="600" profit="50"/>
				 <product weight="805" profit="36"/>
				 <product weight="430" profit="25"/>
				 <product weight="1200" profit="64"/>
				 <product weight="770" profit="54"/>
				 <product weight="60" profit="18"/>
				 <product weight="930" profit="46"/>
				 <product weight="353" profit="28"/>
				 <max-weight>4200</max-weight>
			 </products>
		 </evaluator>
		 <provider type="net.sf.jclec.binarray.BinArrayCreator"/>
		 <parents-selector type="net.sf.jclec.selector.TournamentSelector">
			 <tournament-size>2</tournament-size>
		 </parents-selector>
		 <recombinator type="net.sf.jclec.binarray.rec.UniformCrossover" rec-prob="0.75" />
		 <mutator type="net.sf.jclec.binarray.mut.OneLocusMutator" mut-prob="0.1" />
		 <listener type="net.sf.jclec.listener.PopulationReporter">
			 <report-frequency>50</report-frequency>	
		 </listener>
	 </process>
</experiment>

It is necessary to implement a java class to obtain an evaluator appropriated to the characteristics that were determined to evaluate the population and to treat unfeasible individuals. An example can be seen in Knapsack.java partially shown below. This class will extend from AbstractEvaluator, located in the package net.sf.jclec.base. By extending it, those abstract methods will be implemented. The method evaluate() evaluates individuals given as argument and sets their fitness values.

protected void evaluate(IIndividual ind)
{
	// Individual genotype
	byte [] genotype = ((BinArrayIndividual)ind).getGenotype();
	// Total weigth and profit
	int totalweigth = 0, totalprofit = 0;
 	// Calculate weight
	for (int i=0; i<genotype.length; i++) 
	{
		totalweigth += genotype[i]*weights.get(i);
 		totalprofit += genotype[i]*profits.get(i);
	}
 	
	// Calculate profit (if necessary)
	if (totalweigth <= maxWeight) 
 	{
		// Set individual fitness
  		ind.setFitness(new SimpleValueFitness(totalprofit)); 
	}
	else
	{
		ind.setFitness(new SimpleValueFitness(-totalprofit));
 	}
}


The Knapsack class is for general purposes and it can solve any knapsack problem. Before running the experiment, the weights and profits values must be set as well as the maximum value that the bag supports. Once parameters are set in the the configuration file, the RunExperiment class can be run giving the configuration file as an argument. This method will run the genetic algorithm and will output a report with all the information requested. This information is usually the best individual, the worst individual found and the average fitness of individuals every 50 generations.

GAs using integer encoding: Traveling Salesman Problem

The Traveling Salesman Problem (TSP) is a problem in combinatorial optimization and fits within known as transportation problems. The TSP has several applications for personal services, goods or other services (messaging, garbage collection, etc). The study of how to make these services costs profitable either minimizing consumption or obtaining the shortest time, make this an explotaited area because of its applications. These problems generally require to consider simultaneously a number of restrictions, conditions and factors affecting the efficiency and the quality of the solutions provided. The variety of objectives, resources and constraints that often have real transportation problems make it very difficult to treat with exact optimization methods.


Problem definition

The Salesman Problem can be stated as follows: A salesman must visit <math>n</math> cities, starting and finishing in his own city. Knowing the cost of going from one town to another, decide the path of minimal cost.

The solution to be decided, given a road map, is the order that the salesman must follow to visit <math>n</math> cities starting whichever, driving fewer miles and returning to the beginning after visiting each city only once. In this case, instead of having to select a subset of items, it is about choosing a permutation of <math>n</math> cities so that the distances traveled are minimal.

The Traveling Salesman Problem has the following properties:

  • It is one of the most interesting which has arisen in Operations Research and for which extensive material has been published.
  • It is very intuitive and easy to understand.
  • Most of the techniques that have been appearing in the area of Combinatorial Optimization have been tested on it because its resolution is very complex.

When the theory of algorithmic complexity was stated, the TSP was one of the first problems to be studied, Karp proved in 1972 that it belongs to the class of NP-hard problems. Sahni and Gonzalez (1976) proved that, unless P = NP, there is no heuristic algorithm for the TSP to work in polynomial time on the size of the problem and is <math>\varepsilon</math>-Average.

This problem can be formally represented using a graph, where nodes represent cities, and edges the connection between these cities. It would be in addition a weighted graph, so each edge would have a value representing the euclidean distance from one city to another. The objective is to find a path in such a way that every city is visited only once, being the starting and the ending city the same. The next figure shows a sample graph for a TSP problem with 7 cities:

Graph.jpg

To solve the problem it is supposed that the graph is complete, in which every pair of distinct vertices is connected by an edge, so that all cities will be connected with all others. The distance considered is the euclidean distance.

This tutorial will practise with 5 problems:

  • berlin52: Number of cities: 52. Optimal solution cost: 7.542.
  • bier127: Number of cities: 127. Optimal solution cost: 118.282.
  • d198: Number of cities: 198. Optimal solution cost: 15.780.
  • lin318: Number of cities: 318. Optimal solution cost: 42.029.
  • rat575: Number of cities: 575. Optimal solution cost: 6.773.


These instances have been obtained from TSPLIB library, a web site that collects various cases of this problem, along with the cost of its optimal solution or the best solution obtained so far. All files have the same format, a list of two values for each city that represent its coordinates in the map. To build the matrix of costs, it must be calculated the euclidean distance between each pair of cities.

To solve the problem, a genetic algorithm is used. This technique does not guarantee to achieve the optimal solution but an acceptable solution; furthermore, it requires less computational time and therefore it is more efficient.


Encoding criterion

The traveling salesman problem is an order representation problem. This representation consist of as many genes as cities there are, and its values will be integer numbers. The chromosome will indicate the order in which to travel the graph, so that if there are have 5 cities (1,2,3,4,5), the code of the chromosome of an individual could be any permutation of that set, so a chromosome representation could be the following:

2 3 5 1 4

which indicates that the cities will be visited in the following order: from city 2 to city 3, from city 3 to city 5, from city 5 to city 1, from city 1 to city 4 and finally as a cycle from city 4 to city 2.

Implementation

This section describes how to encode the configuration file required to run the algorithm in the library JCLEC and how to implement basic methods for evaluating individuals in the population.

This tutorial uses net.sf.jclec.orderarray package which includes the codification of individuals, crossover and mutation operators and all those classes that are necessary for the resolution of the TSP problem.


<experiment>
	<process algorithm-type="net.sf.jclec.algorithm.classic.SGE">
		<rand-gen-factory type="net.sf.jclec.util.random.RanecuFactory" seed="987328938"/>
		<population-size>100</population-size>
		<max-of-generations>1000</max-of-generations>
		<species type="net.sf.jclec.orderarray.OrderArrayIndividualSpecies" genotype-length="52"/>
		<evaluator type="tutorial.TSP" file-name="berlin52.tsp" number-cities="52"/>
		<provider type="net.sf.jclec.orderarray.OrderArrayCreator" />
		<parents-selector type="net.sf.jclec.selector.TournamentSelector" tournament-size="2"/>
		<mutator type="net.sf.jclec.orderarray.mut.Order2OptMutator" mut-prob="0.1" />
		<recombinator type="net.sf.jclec.orderarray.rec.OrderPMXCrossover" rec-prob="0.9" />
		<listener type="net.sf.jclec.listener.PopulationReporter">
			<report-frequency>50</report-frequency>	
		</listener>
 	</process>
</experiment>

It is necessary to implement a java class to obtain an evaluator appropriated to the characteristics that were determined to evaluate the population and to treat unfeasible individuals. An example can be seen in TSP.java partially shown below. This class will extend from AbstractEvaluator, located in the package net.sf.jclec.base. By extending it, those abstract methods will be implemented. The method evaluate() evaluates individuals given as argument and sets their fitness values.

protected void evaluate(IIndividual ind)
{	
	// Individual genotype
	int [] genotype = ((OrderArrayIndividual)ind).getGenotype();
	double distance = 0; 	
	
	for (int i=0; i<genotype.length-1; i++) 
		distance+=distancesMatrix[genotype[i]][genotype[i+1]];

	distance+=distancesMatrix[genotype[genotype.length-1]][genotype[0]];
	
	ind.setFitness(new SimpleValueFitness(distance));
}

GAs using real encoding: Optimization of real functions

This section aims to explain how to optimize real functions of real variable applying genetic algorithms. The purpose of this section is to study the performance of a set of real-coded evolutionary algorithms for the optimization of real functions of real variable, also analyzing the quality of the solutions obtained. These algorithms, that are available in JCLEC library, will be applied to a set of 3 different functions.


Problem definition

The problem proposed here is to find the global optimum of 3 functions. Each function will be minimized in two different domains for variable <math>x</math>. Their description and representation; first represents the Sphere function, then the Rastrigin function, and finally the Ackley function) are shown below:

Name Definition Domain 1 Domain 2
Spehere <math>f_{1}(x)=\Sigma_{i=1}^{p}x_{i}^{2}</math> [-5.12,5.12] [0,5.12]
Rastrigin <math>f_{2}(x)=\Sigma_{i=1}^{p}(x_{i}^{2}-10\cos(2\pi x_{i})+10)</math> [-5.12,5.12] [0,5.12]
Ackley <math>f_{3}(x)=e-20exp(-0.2\sqrt{\frac{1}{p}\Sigma_{i=1}^{p}x_{i}^{2}})-exp(\frac{1}{p}\Sigma_{t=1}^{p}\cos(2\pi x_{t}))</math> [-30,30] [0,30]


Encoding criterion

The representation will adopt the shape of a vector of real values. Thus, each solution is encoded using a real vector of size n, being n equal to 5 for this problem. Each gene of a chromosome is a real number that takes a value within a range depending on the domain of the corresponding variable.


Implementation

This section describes how to encode the configuration file required to run the algorithm in the library JCLEC and how to implement basic methods for evaluating and comparing individuals in the population.

<experiment>
	<process algorithm-type="net.sf.jclec.algorithm.classic.SGE">
		<population-size>100</population-size>
		<max-of-generations>1000</max-of-generations>
		<rand-gen-factory type="net.sf.jclec.util.random.RanecuFactory" seed="124321453"/>
		<species type="net.sf.jclec.realarray.RealArrayIndividualSpecies">
			<genotype-schema>
			<locus type="net.sf.jclec.util.range.Interval" left="0" right="5.12" closure="closed-closed" />
			<locus type="net.sf.jclec.util.range.Interval" left="0" right="5.12" closure="closed-closed" />
			<locus type="net.sf.jclec.util.range.Interval" left="0" right="5.12" closure="closed-closed" />
			<locus type="net.sf.jclec.util.range.Interval" left="0" right="5.12" closure="closed-closed" />
			<locus type="net.sf.jclec.util.range.Interval" left="0" right="5.12" closure="closed-closed" />
			</genotype-schema>
		</species>
		<evaluator type="tutorial.RealEvaluator" />
		<provider type="net.sf.jclec.realarray.RealArrayCreator"/>
 		<parents-selector type="net.sf.jclec.selector.TournamentSelector" tournament-size="2"/>
		<mutator type="net.sf.jclec.realarray.mut.NonUniformMutator" mut-prob="0.15" />
		<recombinator type="net.sf.jclec.realarray.rec.BLXAlphaCrossover" rec-prob="0.9" alpha="0.3"/>
		<listener type="net.sf.jclec.listener.PopulationReporter">
			<report-frequency>50</report-frequency>	
		</listener>
	</process>
</experiment>

It is necessary to implement a java class to obtain an evaluator appropriated to the characteristics that were determined to evaluate the population and to treat unfeasible individuals. An example can be seen in RealEvaluator.java partially shown below. This class will extend from AbstractEvaluator, located in the package net.sf.jclec.base. By extending it, those abstract methods will be implemented. The method evaluate() evaluates individuals given as argument and sets their fitness values.

The method that represents the evaluator for the Sphere function is shown below:

protected void evaluate(IIndividual ind)
{	
	// Individual genotype
	double [] genotype = ((RealArrayIndividual)ind).getGenotype();
	double fitness = 0.;

	for(int i=0; i<genotype.length; i++)
		fitness+= Math.pow(genotype[i],2);
 	ind.setFitness(new SimpleValueFitness(fitness));
}

The method that represents the evaluator for the Rastrigin function is shown below:

protected void evaluate(IIndividual ind)
{	
	// Individual genotype
	double [] genotype = ((RealArrayIndividual)ind).getGenotype();
	double fitness = 0.;

	for(int i=0; i<genotype.length; i++)
		fitness+= Math.pow(genotype[i],2)-10*Math.cos(2*Math.PI*genotype[i])+10;
	ind.setFitness(new SimpleValueFitness(fitness));
}

The method that represents the evaluator for the Ackley function is shown below:

protected void evaluate(IIndividual ind)
{	
	// Individual genotype
	double [] genotype = ((RealArrayIndividual)ind).getGenotype();
	double fitness = 0., sum = 0., sum2 = 0;
	for(int i=0; i<genotype.length; i++)
	{
		sum+= Math.pow(genotype[i],2);
		sum2+= Math.cos(2.*Math.PI*genotype[i]);
 	}
	for(int i=0; i<genotype.length; i++)
		fitness+= 20.+Math.E-20.*Math.exp(-0.2*Math.sqrt(sum/genotype.length))-Math.exp(sum2/genotype.length);
	ind.setFitness(new SimpleValueFitness(fitness));
}

Genetic programming: a symbolic regression problem

This section aims to explain how to solve symbolic regression problems applying Genetic Programming (GP) algorithms, and it also studies the performance of these algorithms according to their parameterization setup. The results obtained can be used as guidelines to suggest the most effective way to set up the parameters for other GP-based experiments.

The main difference with respect to GAs lies in the fact that in GP it must be shown how solutions are represented, and therefore, the user has to define a suitable terminal and a function set (notice that expressions in GP are not typed, and any function can take any terminal or any result of another function as argument). Thus, the purpose of a GP algorithm is to evolve a measurable expression or a program that fits such representation.

Grammar Guided Genetic Programming (GGGP) differs from conventional GP in the use of a context-free grammar that guides the search of expressions, restricting the search space and ensuring that any expression found will be valid regarding this grammar. However, this encoding will not be used to solve symbolic regression problems, because conventional GP is more appropriate for solving this kind of problems.

JCLEC library can deal both with conventional GP representation –expression tree encoding– and GGGP –syntax tree encoding–. The former is supported in the use of the net.sf.jclec.exprtree package, which defines the encoding of the individuals, their structure and operators to manipulate them. The latter is implemented in the net.sf.jclec.syntaxtree package of the library.


Problem definition

The goal of symbolic regression is to obtain a symbolic expression that represents a mathematical function. In particular, this tutorial explains how to approximate the following functions: <math>f(x) = x^4 + x^3 + x^2 + x</math> and <math>f(x) = x^4 + x^3 + x^2 + x + 1</math>. For this purpose, the evaluator to implement will employ a set of pairs (input,output) for both functions, as shown:

<math>f(x) = x^4 + x^3 + x^2 + x</math>
<math>x=-2</math> <math>f(x) = 10</math>
<math>x=-1</math> <math>f(x) = 0</math>
<math>x=0</math> <math>f(x) = 0</math>
<math>x=1</math> <math>f(x) = 4</math>
<math>x=2</math> <math>f(x) = 30</math>
<math>f(x) = x^4 + x^3 + x^2 + x + 1</math>
<math>x=-2</math> <math>f(x) = 9</math>
<math>x=-1</math> <math>f(x) = 1</math>
<math>x=0</math> <math>f(x) = 1</math>
<math>x=1</math> <math>f(x) = 5</math>
<math>x=2</math> <math>f(x) = 31</math>


Implementation

In this tutorial we use the net.sf.jclec.exprtree package. The efficiency of this package is supported in the use of a virtual machine that uses a stack to calculate the results of intermediate operations:

  • Instead of using a preorder route to evaluate each node of the tree, nodes are added to the stack (from right to left and from bottom to top, which is not exactly a postfix tree-traversal).
  • When the last node added is a non-terminal node, the stack evaluates the function, which will affect to the two elements that were introduced previously in the stack (these argument nodes can be either terminals or evaluation results from other functions). For example, the stack contains the values 3.2453 and 2.4532. The next symbol to analyze is the non-terminal node <math>+</math>. The stack values 3.2453 and 2.4532 are popped. These values are added and then the result 5.6985 is pushed into the stack.

First, it is necessary to select the terminal and function set necessary to solve the problem. The operators <math>+, -, *</math> will be used as the function set and the terminal <math>x</math> as the terminal set. Thus, one class per symbol is implemented in a new sub-package.

Below it is shown the whole XML configuration file.

<experiment>
	<process algorithm-type="net.sf.jclec.algorithm.classic.SGE">
		<population-size>200</population-size>
		<max-of-generations>100</max-of-generations>
		<rand-gen-factory type="net.sf.jclec.util.random.RanecuFactory" seed="234567895"/>
		<species type="net.sf.jclec.exprtree.ExprTreeIndividualSpecies" number-of-trees="1">
			<expression-tree>
				<min-tree-size>3</min-tree-size>
				<max-tree-size>25</max-tree-size>
				<root-type>java.lang.Double</root-type>		
				<terminals>
					<terminal class="tutorial.symreg.X"/>
				</terminals>	
				<functions>
					<function class="tutorial.symreg.Add"/>
					<function class="tutorial.symreg.Sub"/>
					<function class="tutorial.symreg.Mul"/>
 				</functions> 
			</expression-tree>
		</species>
		<evaluator type="tutorial.symreg.SymregEvaluator"/>
		<provider type="net.sf.jclec.exprtree.ExprTreeCreator"/>
		<parents-selector type="net.sf.jclec.selector.TournamentSelector" tournament-size="2"/>
		<recombinator type="net.sf.jclec.exprtree.ExprTreeRecombinator" rec-prob="0.8">
			<base-op type="net.sf.jclec.exprtree.rec.SubtreeCrossover" />
		</recombinator>
		<mutator type="net.sf.jclec.exprtree.ExprTreeMutator" mut-prob="0.1">
			<base-op type="net.sf.jclec.exprtree.mut.SubtreeMutator" />
		</mutator>
		<listener type="net.sf.jclec.listener.PopulationReporter">
			<report-title>SymReg</report-title>
			<report-frequency>1</report-frequency>
			<report-on-console>true</report-on-console>
			<report-on-file>false</report-on-file>
			<save-complete-population>false</save-complete-population>
		</listener>	
	</process>
</experiment>


Class X

This class represents a terminal symbol. The evaluate() method pushes this symbol in the stack.

package tutorial.symreg;

import net.sf.jclec.exprtree.fun.Argument;

public class X extends Argument<Double> 
{
	public X() 
	{
		super(Double.class, 0);
	}
	
	// java.lang.Object methods
	
	public boolean equals(Object other)
	{
		return other instanceof X;
	}	
	
	public String toString()
	{
		return "X";
	}
}


Class Add

This class sums two arguments. Its evaluate() method pops the two top arguments of the stack, sums them ant pushes the result back to the stack.

package tutorial.symreg;

import net.sf.jclec.exprtree.fun.AbstractPrimitive; import net.sf.jclec.exprtree.fun.ExprTreeFunction;

public class Add extends AbstractPrimitive 
{ 
	/**
	 * This operator receives two double arrays as arguments and return
	 * a double array as result.
	 */
	
	public Add() 
	{
		super(new Class<?> [] {Double.class, Double.class}, Double.class);
	}

	@Override
	protected void evaluate(ExprTreeFunction context) 
	{
		// Get arguments (in context stack)
		Double arg1 = pop(context);
		Double arg2 = pop(context);
		// Push result in context stack
		push(context, arg1+arg2);
	}

	// java.lang.Object methods
	
	public boolean equals(Object other)
	{
		return other instanceof Add;
	}	
	
	public String toString()
	{
		return "+";
	}	
}


Class Sub

This class subtracts two arguments. Its evaluate() method pops the two top arguments of the stack, subtracks them ant pushes the result back to the stack.

package tutorial.symreg;

import net.sf.jclec.exprtree.fun.AbstractPrimitive; import net.sf.jclec.exprtree.fun.ExprTreeFunction;

public class Sub extends AbstractPrimitive 
{ 
	/**
	 * This operator receives two double arrays as arguments and return
	 * a double array as result.
	 */
	
	public Sub() 
	{
		super(new Class<?> [] {Double.class, Double.class}, Double.class);
	}

	@Override
	protected void evaluate(ExprTreeFunction context) 
	{
		// Get arguments (in context stack)
		Double arg1 = pop(context);
		Double arg2 = pop(context);
		// Push result in context stack
		push(context, arg1-arg2);
	}

	// java.lang.Object methods
	
	public boolean equals(Object other)
	{
		return other instanceof Sub;
	}	
	
	public String toString()
	{
		return "-";
	}	
}


Class Mul

This class multiplies two arguments. Its evaluate() method pops the two top arguments of the stack, multiplies them ant pushes the result back to the stack.

package tutorial.symreg;

import net.sf.jclec.exprtree.fun.AbstractPrimitive; import net.sf.jclec.exprtree.fun.ExprTreeFunction;

public class Mul extends AbstractPrimitive 
{ 
	/**
	 * This operator receives two double arrays as arguments and return
	 * a double array as result.
	 */
	
	public Mul() 
	{
		super(new Class<?> [] {Double.class, Double.class}, Double.class);
	}

	@Override
	protected void evaluate(ExprTreeFunction context) 
	{
		// Get arguments (in context stack)
		Double arg1 = pop(context);
		Double arg2 = pop(context);
		// Push result in context stack
		push(context, arg1*arg2);
	}

	// java.lang.Object methods
	
	public boolean equals(Object other)
	{
		return other instanceof Mul;
	}	
	
	public String toString()
	{
		return "*";
	}	
}


Class SymregEvaluator

Once the aforementioned classes have been defined, it is necessary to design the evaluator to compute the fitness of each individual. Thus, a new class is created, SymregEvaluator, and this class is included in the package net.sf.jclec.tutorial.symreg. In the case of the first function, a training set composed of two arrays is defined, xvalues and yvalues. Then, for each xvalues[i] the method execute() of ExprTreeFunction is called, finding out the difference between this result and the real one (yvalues[i]). The squared difference of each element as accumulated as error, and the square root of the final value will be the fitness of the individual (notice that the objective is to minimize the error, i.e., the smaller is the value, the better is the individual).

package tutorial.symreg;

import java.util.Comparator; 
import net.sf.jclec.IFitness;
import net.sf.jclec.IIndividual;
import net.sf.jclec.base.AbstractEvaluator;
import net.sf.jclec.exprtree.ExprTree;
import net.sf.jclec.exprtree.ExprTreeIndividual;
import net.sf.jclec.exprtree.fun.ExprTreeFunction;
import net.sf.jclec.fitness.SimpleValueFitness;
import net.sf.jclec.fitness.ValueFitnessComparator;

public class SymregEvaluator extends AbstractEvaluator
{ 
	private static final long serialVersionUID = 1L;

	private static final Comparator<IFitness> COMPARATOR = new ValueFitnessComparator(true);
	
	private static final double [] xvalues = {-2., -1., 0., 1., 2.};
	
	private static final double [] yvalues = {10., 0., 0., 4., 30.};
	
	//////////////////////////////////////////////////////////////////////
	// ------------------------------------------------- Protected methods
	//////////////////////////////////////////////////////////////////////

	@Override
	protected void evaluate(IIndividual ind) 
	{
		// Individual genotype
		ExprTree genotype = ((ExprTreeIndividual) ind).getGenotype();

		ExprTreeFunction function = new ExprTreeFunction(genotype);

		// Estimated values
		double [] y = new double[xvalues.length];
		for(int i = 0; i<xvalues.length; i++)
			y[i] = function.<Double>execute(xvalues[i]);
		// Pass all
		double rms = 0.0;
		for (int i=0; i<yvalues.length; i++) {
			double diff = y[i] - yvalues[i];
			rms += diff * diff;
		}
		rms = Math.sqrt(rms);
		// Set rms as fitness for ind
		ind.setFitness(new SimpleValueFitness(rms));
	}

	@Override
	public Comparator<IFitness> getComparator() {
		return COMPARATOR;
	}
}