Particle Swarm Optimization: Artificial Intelligence
Particle swarm optimization (PSO) is an artificial intelligence (AI) technique that can be used to find approximate solutions to extremely difficult or impossible numeric maximization and minimization problems. The version of PSO I describe in this article was first presented in a 1995 research paper by J. Kennedy and R. Eberhart. PSO is loosely modeled on group behavior, such as bird flocking and fish schooling. The best way for you to get a feel for what PSO is and to see where I'm heading here is to examine Figure 1. The first part of the figure describes a dummy problem being solved by a demonstration PSO program. The goal is to find values x0 and x1 so that the value of the function f = 3 + x0^2 + x1^2 is minimized. In this article, I use the notation ^2 to indicate the squaring operation. Notice that I've deliberately chosen an unrealistically simple problem to keep the ideas of PSO clear. It's obvious that the solution to this problem is x0 = 0.0 and x1 = 0.0, which yields a minimum function value of 3.0, so using PSO isn't really necessary. I discuss more realistic problems that can be solved by PSO later in this article. In this example, the dimension of the function to minimize is 2 because we need to solve for 2 numeric values. In general, PSO is well-suited to numeric problems with dimensions of 2 or larger. In most situations, PSO must have some constraints on the range of possible x values. Here x0 and x1 are arbitrarily limited to the range -100.0 to 100.0.
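To make the goal concrete, here is the function being minimized expressed as a small C# method. This matches the ObjectiveFunction method used by the demo program later in the article:

static double ObjectiveFunction(double[] x)
{
  // f = 3 + x0^2 + x1^2; the minimum value 3.0 occurs at x0 = 0.0, x1 = 0.0
  return 3.0 + (x[0] * x[0]) + (x[1] * x[1]);
}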
Figure 1 Particle Swarm Optimization Demo Run

The next part of Figure 1 indicates that the PSO program is using 10 particles and that the program will iterate 1,000 times. As you'll see shortly, each particle represents a possible solution to the PSO problem being solved. PSO is an iterative technique and in most cases it's not possible to know when an optimal solution has been found. Therefore, PSO algorithms usually must have some limit on the number of iterations to perform. The next lines in Figure 1 indicate that each of the 10 particles in the swarm is initialized to a random position. A particle's position represents a possible solution to the optimization problem to be solved. The best randomly generated initial position is x0 = 26.53 and x1 = -6.09, which corresponds to a fitness (the measure of solution quality) of 3 + 26.53^2 + (-6.09)^2 = 744.12. The PSO algorithm then enters a main processing loop where each particle's position is updated on each pass through the loop. The update procedure is the heart of PSO and I'll explain it in detail later in this article. After 1,000 iterations, the PSO algorithm did in fact find the optimal solution of x0 = 0.0 and x1 = 0.0, but let me emphasize that in most situations you won't know whether a PSO program has found an optimal solution. In this article, I'll explain the PSO algorithm in detail and walk you line by line through the program shown running in Figure 1. I coded the demo program in C#, but you should be able to easily adapt the code presented here to another language, such as Visual Basic .NET or Python. The complete source code for the program presented in this article is available at code.msdn.microsoft.com/mag201108PSO. This article assumes you have intermediate coding skills with a modern procedural language but does not assume you know anything about PSO or related AI techniques.
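The overall flow just described can be summarized with the following skeleton. This is only a structural sketch, not the demo program itself; the helper names InitializeSwarm and UpdateSwarm are hypothetical placeholders for the initialization and update code presented later in the article:

// sketch of the demo program's overall structure (helper names are hypothetical)
static void Main(string[] args)
{
  Random ran = new Random(0);              // seeded random number generator
  int numberParticles = 10;                // swarm size
  int numberIterations = 1000;             // main processing loop limit
  Particle[] swarm = InitializeSwarm(numberParticles, ran); // random positions and velocities
  for (int iteration = 0; iteration < numberIterations; ++iteration)
    UpdateSwarm(swarm, ran);               // velocity and position updates for every particle
  // display the best position and fitness found by any particle
}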
Particles
When using PSO, a possible solution to the numeric optimization problem under investigation is represented by the position of a particle. Additionally, each particle has a current velocity, which represents a magnitude and direction toward a new, presumably better, solution/position. A particle also has a measure of the quality of its current position, the particle's best known position (that is, a previous position with the best known quality), and the quality of the best known position. I coded a Particle class as shown in Figure 2.

Figure 2 Particle Definition

public class Particle
{
  public double[] position;
  public double fitness;
  public double[] velocity;

  public double[] bestPosition;
  public double bestFitness;

  public Particle(double[] position, double fitness,
    double[] velocity, double[] bestPosition, double bestFitness)
  {
    this.position = new double[position.Length];
    position.CopyTo(this.position, 0);
    this.fitness = fitness;
    this.velocity = new double[velocity.Length];
    velocity.CopyTo(this.velocity, 0);
    this.bestPosition = new double[bestPosition.Length];
    bestPosition.CopyTo(this.bestPosition, 0);
    this.bestFitness = bestFitness;
  }

  public override string ToString()
  {
    string s = "";
    s += "==========================\n";
    s += "Position: ";
    for (int i = 0; i < this.position.Length; ++i)
      s += this.position[i].ToString("F2") + " ";
    s += "\n";
    s += "Fitness = " + this.fitness.ToString("F4") + "\n";
    s += "Velocity: ";
    for (int i = 0; i < this.velocity.Length; ++i)
      s += this.velocity[i].ToString("F2") + " ";
    s += "\n";
    s += "Best Position: ";
    for (int i = 0; i < this.bestPosition.Length; ++i)
      s += this.bestPosition[i].ToString("F2") + " ";
    s += "\n";
    s += "Best Fitness = " + this.bestFitness.ToString("F4") + "\n";
    s += "==========================\n";
    return s;
  }
} // class Particle

The Particle class has five public data members: position, fitness, velocity, bestPosition and bestFitness. When using PSO, for simplicity I prefer using public scope fields, but you may want to use private fields along with get and set properties instead. The field named position is an array of type double and represents a possible solution to the optimization problem under investigation. Although PSO can be used to solve non-numeric problems, it's generally best-suited for solving numeric problems. Field fitness is a measure of how good the solution represented by position is. For minimization problems, which are the most common types of problems solved by PSO, smaller values of the fitness field are better than larger values; for maximization problems, larger values of fitness are better. Field velocity is an array of type double and represents the information necessary to update a particle's current position/solution. I'll explain particle velocity in detail shortly. The fourth and fifth fields in the Particle type are bestPosition and bestFitness. These fields hold the best position/solution found by the Particle object and the associated fitness of that best position.
The Particle class has a single constructor that accepts five parameters corresponding to each of the Particle's five data fields. The constructor simply copies each parameter value to its corresponding data field. Because all five Particle fields have public scope, I could have omitted the constructor and then just used field assignment statements in the PSO code, but I think the constructor leads to cleaner code. The Particle class definition contains a ToString method that echoes the values of the five data fields. As with the constructor, because I declared the position, fitness, velocity, bestPosition and bestFitness fields with public scope, I don't really need a ToString method to view a Particle object's values, but including it simplifies viewing the fields and it's useful for WriteLine-style debugging during development. In the ToString method I use string concatenation rather than the more efficient StringBuilder class to make it easier for you to refactor my code to a non-Microsoft .NET Framework-based language if you wish.
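For example, assuming the class definition above, a single particle can be created and displayed like this; the position and velocity values here are arbitrary illustration values, not part of the demo:

// create a particle at position {3.0, 4.0} with velocity {-1.0, -1.5} (illustrative values)
double[] pos = new double[] { 3.0, 4.0 };
double fit = 3.0 + (3.0 * 3.0) + (4.0 * 4.0); // fitness = 28.0 for this position
double[] vel = new double[] { -1.0, -1.5 };
Particle p = new Particle(pos, fit, vel, pos, fit); // the initial position is also the best known position
Console.WriteLine(p.ToString());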
Figure 3 Plot of f = 3 + x0^2 + x1^2

Let's say that a particle's current position, x(t), is {x0, x1} = {3.0, 4.0}, and that the particle's current velocity, v(t), is {-1.0, -1.5}. Let's also assume that constant w = 0.7, constant c1 = 1.4, constant c2 = 1.4, and that random numbers r1 and r2 are 0.5 and 0.6, respectively. Finally, suppose the particle's best known position is p(t) = {2.5, 3.6} and the global best known position found by any particle in the swarm is g(t) = {2.3, 3.4}. Recall that the velocity and position update equations are v(t+1) = (w * v(t)) + (c1 * r1 * (p(t) - x(t))) + (c2 * r2 * (g(t) - x(t))) and x(t+1) = x(t) + v(t+1). Then the new velocity and position values are:

v(t+1) = (0.7 * {-1.0, -1.5}) + (1.4 * 0.5 * ({2.5, 3.6} - {3.0, 4.0})) + (1.4 * 0.6 * ({2.3, 3.4} - {3.0, 4.0}))
       = {-0.70, -1.05} + {-0.35, -0.28} + {-0.59, -0.50}
       = {-1.64, -1.83}

x(t+1) = {3.0, 4.0} + {-1.64, -1.83} = {1.36, 2.17}

Recall that the optimal solution is {x0, x1} = {0.0, 0.0}. Observe that the update process has improved the old position/solution from {3.0, 4.0} to {1.36, 2.17}. If you mull over the update process a bit, you'll see that the new velocity is the old velocity (times a weight) plus a factor that depends on a particle's best known position, plus another factor that depends on the best known position of all particles in the swarm. Therefore, a particle's new position tends to move toward a better position based on the particle's best known position and the best known position of all particles. The graph in Figure 4 shows the movement of one of the particles during the first eight iterations of the demo PSO run. The particle starts at x0 = 100.0, x1 = 80.4 and tends to move toward the optimal solution of x0 = 0, x1 = 0. The spiral motion is typical of PSO.
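The same worked example can be verified with a few lines of C#. This is just an illustrative calculation, not part of the demo program; the variable names mirror the symbols used above:

// verify the worked velocity and position update (illustrative values from the example above)
double w = 0.7, c1 = 1.4, c2 = 1.4, r1 = 0.5, r2 = 0.6;
double[] x = { 3.0, 4.0 };   // current position x(t)
double[] v = { -1.0, -1.5 }; // current velocity v(t)
double[] p = { 2.5, 3.6 };   // particle's best known position p(t)
double[] g = { 2.3, 3.4 };   // swarm's best known position g(t)

double[] newV = new double[2];
double[] newX = new double[2];
for (int j = 0; j < 2; ++j)
{
  newV[j] = (w * v[j]) + (c1 * r1 * (p[j] - x[j])) + (c2 * r2 * (g[j] - x[j]));
  newX[j] = x[j] + newV[j];
}
// newV is approximately { -1.64, -1.83 } and newX is approximately { 1.36, 2.17 }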
double minX = -100.0;
double maxX = 100.0;

Particle[] swarm = new Particle[numberParticles];
double[] bestGlobalPosition = new double[Dim];
double bestGlobalFitness = double.MaxValue;

double minV = -1.0 * maxX;
double maxV = maxX;

// Initialize all Particle objects

double w = 0.729; // inertia weight
double c1 = 1.49445; // cognitive weight
double c2 = 1.49445; // social weight
double r1, r2; // randomizations

// Main processing loop

// Display results
Console.WriteLine("\nEnd PSO demo\n");
}
catch (Exception ex)
{
  Console.WriteLine("Fatal error: " + ex.Message);
}
} // Main()

static double ObjectiveFunction(double[] x)
{
  return 3.0 + (x[0] * x[0]) + (x[1] * x[1]);
}

} // class Program

public class Particle
{
  // Definition here
}

} // ns

After instantiating the Random object with an arbitrary seed value of 0, I initialize some key PSO variables:

int numberParticles = 10;
int numberIterations = 1000;
int iteration = 0;
int Dim = 2;
double minX = -100.0;
double maxX = 100.0;

I use 10 Particle objects. As a rule of thumb, more Particle objects are better than fewer, but more can significantly slow program performance. I set the number of main processing loop iterations to 1,000. The number of iterations you'll want to use will depend on the complexity of the problem you're trying to optimize and the processing power of your host machine. Typically, PSO programs use a value between 1,000 and 100,000. The variable named iteration is a counter to keep track of the number of main loop iterations. The Dim variable holds the number of x values in a solution/position. Because my example problem needs to find the values of x0 and x1 that minimize 3 + x0^2 + x1^2, I set Dim to 2.
As I mentioned earlier, in most PSO situations you'll want to limit the x values that make up the position/solution vector to some problem-dependent range. Without some limits, you're effectively searching from double.MinValue to double.MaxValue. Here I arbitrarily limit x0 and x1 to [-100.0, +100.0]. Next, I prepare to instantiate the particle swarm:

Particle[] swarm = new Particle[numberParticles];
double[] bestGlobalPosition = new double[Dim];
double bestGlobalFitness = double.MaxValue;
double minV = -1.0 * maxX;
double maxV = maxX;

I create an array of Particle objects named swarm. I also set up an array to hold the global best known position determined by any Particle (denoted by g(t) in the algorithm) and the corresponding fitness of that position array. I set constraints for the maximum and minimum values for a new velocity. The idea here is that because a new velocity determines a particle's new position, I don't want the magnitude of any of the velocity components to be huge. The code to initialize the swarm is as follows:

for (int i = 0; i < swarm.Length; ++i)
{
  double[] randomPosition = new double[Dim];
  for (int j = 0; j < randomPosition.Length; ++j) {
    double lo = minX;
    double hi = maxX;
    randomPosition[j] = (hi - lo) * ran.NextDouble() + lo;
  }
  ...

I iterate through each Particle object in the array named swarm. I declare an array of size Dim to hold a random position for the current Particle. Then for each x value of the position I generate a random value between minX (-100.0) and maxX (+100.0). In many realistic PSO problems, the range for each x value will be different, so you'll have to add code to deal with each x value in the position array separately (see the sketch at the end of this initialization discussion). Now I continue the initialization process:

double fitness = ObjectiveFunction(randomPosition);
double[] randomVelocity = new double[Dim];
for (int j = 0; j < randomVelocity.Length; ++j) {
  double lo = -1.0 * Math.Abs(maxX - minX);
  double hi = Math.Abs(maxX - minX);
  randomVelocity[j] = (hi - lo) * ran.NextDouble() + lo;
}
swarm[i] = new Particle(randomPosition, fitness, randomVelocity,
  randomPosition, fitness);
...

First I compute the quality of the current random position array by passing that array to the method ObjectiveFunction. If you refer back to Figure 5, you'll see that the ObjectiveFunction method simply computes the value of the function I'm trying to minimize, namely 3 + x0^2 + x1^2. Next I compute a random velocity for the current Particle object. After I have a random position, the fitness of the random position and a random velocity, I pass those values to the Particle constructor. Recall that the fourth and fifth parameters are the particle's best known position and its associated fitness, so when initializing a Particle the initial random position and fitness are the best known values. The swarm initialization code finishes with:

  ...
  if (swarm[i].fitness < bestGlobalFitness) {
    bestGlobalFitness = swarm[i].fitness;
    swarm[i].position.CopyTo(bestGlobalPosition, 0);
  }
} // End initialization loop

I check to see if the fitness of the current Particle is the best (smallest, in the case of a minimization problem) fitness found so far. If so, I update the array bestGlobalPosition and the corresponding variable bestGlobalFitness.
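As noted above, realistic problems often need a different range for each x value. Here is a minimal sketch of how the random-position code might be adapted; the arrays minXs and maxXs are hypothetical per-dimension limits that I introduce only for illustration:

// sketch: per-dimension position ranges (minXs and maxXs are hypothetical arrays of length Dim)
double[] minXs = new double[] { -100.0, -50.0 }; // illustrative lower limits for x0 and x1
double[] maxXs = new double[] { 100.0, 50.0 };   // illustrative upper limits for x0 and x1
double[] randomPosition = new double[Dim];
for (int j = 0; j < randomPosition.Length; ++j)
  randomPosition[j] = (maxXs[j] - minXs[j]) * ran.NextDouble() + minXs[j];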
Next, I prepare to enter the main PSO processing loop:

double w = 0.729; // inertia weight
double c1 = 1.49445; // cognitive weight
double c2 = 1.49445; // social weight
double r1, r2; // randomizers

I set the value for w, the inertia weight, to 0.729. This value was recommended by a research paper that investigated the effects of various PSO parameter values on a set of benchmark minimization problems. Instead of a single, constant value for w, an alternative approach is to vary the value of w. For example, if your PSO algorithm is set to iterate 10,000 times, you could initially set w to 0.90 and gradually decrease w to 0.40 by reducing w by 0.10 after every 2,000 iterations (a sketch of this schedule appears at the end of this discussion). The idea of a dynamic w is that early in the algorithm you want to explore larger changes in position, but later on you want smaller particle movements. I set the values for both c1 and c2, the cognitive and social weights, to 1.49445. Again, this value was recommended by a research study. If you set the value of c1 to be larger than the value of c2, you place more weight on a particle's best known position than on the swarm's global best known position, and vice versa. The random variables r1 and r2 add a random component to the PSO algorithm and help prevent the algorithm from getting stuck at a non-optimal local minimum or maximum solution.
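Returning to the dynamic inertia weight idea mentioned above, one possible implementation is to adjust w at the top of the main processing loop. This is only a sketch of the stepwise schedule just described, assuming numberIterations is 10,000 and that the main loop is driven by the iteration counter shown earlier; it is not part of the demo program:

// sketch: stepwise-decreasing inertia weight (assumes numberIterations = 10000)
double w = 0.90; // start with large, exploratory position changes
while (iteration < numberIterations)
{
  if (iteration > 0 && iteration % 2000 == 0 && w > 0.40)
    w -= 0.10; // reduce w by 0.10 after every 2,000 iterations, toward smaller movements
  // ... velocity and position updates for each particle use the current value of w ...
  ++iteration;
}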
Next, I begin the main PSO processing loop:

for (int i = 0; i < swarm.Length; ++i)
{
  Particle currP = swarm[i];

  for (int j = 0; j < currP.velocity.Length; ++j)
  {
    r1 = ran.NextDouble();
    r2 = ran.NextDouble();

    newVelocity[j] = (w * currP.velocity[j]) +
      (c1 * r1 * (currP.bestPosition[j] - currP.position[j])) +
      (c2 * r2 * (bestGlobalPosition[j] - currP.position[j]));
    ...

I iterate through each Particle object in the swarm array using i as an index variable. I create a reference to the current Particle object named currP to simplify my code, but I could have used swarm[i] directly. As explained in the previous section, the first step is to update each particle's velocity vector. For the current Particle object, I walk through each one of the values in the object's velocity array, generate random variables r1 and r2, and then update each velocity component as explained in the previous section. After I compute a new velocity component for the current Particle object, I check to see if that component is between the minimum and maximum values for a velocity component:

    if (newVelocity[j] < minV)
      newVelocity[j] = minV;
    else if (newVelocity[j] > maxV)
      newVelocity[j] = maxV;
  } // each j
  newVelocity.CopyTo(currP.velocity, 0);
  ...

If the component is out of range, I bring it back in range. The idea here is that I don't want extreme values for the velocity component because extreme values could cause my new position to spin out of bounds. After all velocity components have been computed, I update the current Particle object's velocity array using the handy .NET CopyTo method. Once the velocity of the current Particle has been determined, I can use the new velocity to compute and update the current Particle's position:

  for (int j = 0; j < currP.position.Length; ++j)
  {
    newPosition[j] = currP.position[j] + newVelocity[j];
    if (newPosition[j] < minX)
      newPosition[j] = minX;
    else if (newPosition[j] > maxX)
      newPosition[j] = maxX;
  }
  newPosition.CopyTo(currP.position, 0);
  ...

Again I perform a range check, this time on each of the current particle's new position components. In a sense, this is a redundant check because I've already constrained the value of each velocity component, but in my opinion the extra check is warranted here. Now that I have the current Particle object's new position, I compute the new fitness value and update the object's fitness field:

  newFitness = ObjectiveFunction(newPosition);
  currP.fitness = newFitness;

  if (newFitness < currP.bestFitness) {
    newPosition.CopyTo(currP.bestPosition, 0);
    currP.bestFitness = newFitness;
  }
  if (newFitness < bestGlobalFitness) {
    newPosition.CopyTo(bestGlobalPosition, 0);
    bestGlobalFitness = newFitness;
  }

} // each Particle
} // main PSO loop
...

After updating the current particle, I check to see if the new position is the best known position of the particle; I also check to see if the new position is a best global swarm position. Notice that logically, there can be a new global best position only if there's a new local best position, so I could have nested the global best check inside the check for a local best position.
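The nested alternative just mentioned would look something like the following sketch. It works because a particle's best fitness is never better than the global best fitness, so a new fitness that fails to improve the particle's own best cannot improve the global best either:

if (newFitness < currP.bestFitness)
{
  newPosition.CopyTo(currP.bestPosition, 0);
  currP.bestFitness = newFitness;

  if (newFitness < bestGlobalFitness) // a new global best implies a new local best
  {
    newPosition.CopyTo(bestGlobalPosition, 0);
    bestGlobalFitness = newFitness;
  }
}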
At this point my main PSO algorithm loop is finished and I can display my results:

Console.WriteLine("\nProcessing complete");
Console.Write("Final best fitness = ");
Console.WriteLine(bestGlobalFitness.ToString("F4"));
Console.WriteLine("Best position/solution:");
for (int i = 0; i < bestGlobalPosition.Length; ++i) {
  Console.Write("x" + i + " = ");
  Console.WriteLine(bestGlobalPosition[i].ToString("F4") + " ");
}
Console.WriteLine("");
Console.WriteLine("\nEnd PSO demonstration\n");
}
catch (Exception ex)
{
  Console.WriteLine("Fatal error: " + ex.Message);
}
} // Main()