ISP611 (GA Assignment)
INDIVIDUAL ASSIGNMENT
TITLE:
MAXIMIZING AUDIENCE REACH FOR MARKETING USING GENETIC ALGORITHM
PREPARED BY:
KHAIRUL AMRI BIN MAZLAN (2021619556)
GROUP:
CS2595B
PREPARED FOR:
PROF. MADYA DR MARINA BINTI YUSOFF
SUBMISSION DATE:
1st December 2023
TABLE OF CONTENTS
1. Case Study
2. Objective Function
3. Chromosome Representation
4. Programming Language
8. Conclusion
1. CASE STUDY
An advertising agency is tasked with promoting a new product through various online
channels, such as social media, display ads, and search engine marketing. The main challenge is
to determine the optimal allocation of the advertising budget across different platforms to
maximize audience reach, considering varying costs and potential reach for each channel.
This issue closely resembles a classic optimization problem known as the knapsack problem.
The knapsack problem concerns choosing a combination of items that maximizes the total value
of the selected items while keeping their total weight within a stated capacity. Here, each
advertising channel is an item, its cost is the weight, its percentage of potential audience
reached is the value, and the budget limit is the capacity.
Many optimization algorithms have been developed for problems of this kind; one of them is
the Genetic Algorithm (GA). Because this case is a 0/1 knapsack problem, in which each channel
is either used or not used, a Genetic Algorithm is chosen: it can handle complex search spaces
and typically finds good, near-optimal solutions within a relatively short period of time. It is
therefore used to search for the combination that satisfies the objective function while staying
within the specified constraint.
In conclusion, a Genetic Algorithm offers a practical way to identify the combination of
advertising channels that maximizes the percentage of potential audience reached while staying
within the budget limit. The following sections describe how the problem is formulated and how
the algorithm is applied to it.
2. OBJECTIVE FUNCTION
The objective function is set up to maximize the percentage of potential audience reached
while keeping the total cost of the selected advertising channels less than or equal to the budget
limit. The objective function is as follows:
\[
F(X) = \max \sum_{m=1}^{n} X_m P_m
\]
\[
\text{subject to } \sum_{m=1}^{n} X_m C_m \le B, \qquad X_m \in \{0, 1\}, \quad m \in N = \{1, 2, \ldots, n\}
\]
In this formulation, X_m is a binary decision variable (1 if advertising channel m is used in
the combination and 0 otherwise), P_m is the percentage of potential audience reached by
channel m, C_m is the cost of channel m, and B is the budget limit. The algorithm multiplies
the value of each channel by its binary variable and adds the products up to obtain the total
percentage of potential audience reached, which it tries to maximize. At the same time, it must
obey the constraint: the total cost of the selected advertising channels must be less than or equal
to the budget limit, B. The algorithm therefore searches for the combination with the largest
total audience reached among those that satisfy the constraint.
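As a minimal sketch of how the objective and the constraint could be evaluated for a single candidate, the function below computes both sums for a binary vector. The function name `evaluate` and the three-channel numbers are illustrative assumptions, not values taken from this report.

```python
def evaluate(x, reach, cost, budget):
    """Return (total_reach, total_cost, feasible) for a 0/1 selection vector x."""
    if not (len(x) == len(reach) == len(cost)):
        raise ValueError("x, reach and cost must have the same length")
    total_reach = sum(xm * pm for xm, pm in zip(x, reach))  # sum of X_m * P_m
    total_cost = sum(xm * cm for xm, cm in zip(x, cost))    # sum of X_m * C_m
    return total_reach, total_cost, total_cost <= budget    # constraint: cost <= B


# Hypothetical three-channel example (made-up numbers, for illustration only).
reach = [30, 20, 10]
cost = [6, 5, 4]
print(evaluate([1, 0, 1], reach, cost, budget=10))  # (40, 10, True)
print(evaluate([1, 1, 0], reach, cost, budget=10))  # (50, 11, False)
```

The second candidate reaches a larger audience but breaks the budget, so only the first one is a valid solution.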
3. CHROMOSOME REPRESENTATION
Each advertising channel is described by two attributes: its cost and its percentage of
potential audience reached. The values of both attributes are stored in a list in which each
entry holds the attributes of one channel. The table below specifies the cost and the percentage
of potential audience reached for each channel, where the number of channels is n = 7.
Channel                            1    2    3    4    5    6    7
Cost                              50   25    7   10    3   15   11
% of potential audience reached   25    8   12   15   10   10   20
From the table above, each advertising channel carries two attributes, which are used in the
computational part to calculate the total cost and the total potential audience reached
respectively. The chromosome itself is a string of n = 7 binary genes, one gene per channel: as
previously mentioned, the presence or absence of each channel in the combination is represented
by a binary number. An example is as follows:
Chromosome 1    1  0  1  0  1  0  1
Chromosome 2    1  1  0  0  0  1  1
…
The table above shows examples of combinations of advertising channels, each encoded as a
chromosome. Each chromosome represents a potential solution and is evaluated using the
objective function established earlier. In chromosome 1, four advertising channels, namely TV,
print, mail, and video, are present, while radio, outdoor, and sponsor are absent; the other
chromosomes are interpreted in the same way. Using this information, the fitness (the sum of
potential audience reached) is calculated while simultaneously checking that the chromosome
satisfies the constraint (the budget limit).
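The decoding described above can be sketched as follows, using the cost and audience values from the table in this section and the two example chromosomes. The helper names `decode` and `totals` are hypothetical, and the budget of 65 is an assumption taken from the default budget limit used later in this report.

```python
COST = [50, 25, 7, 10, 3, 15, 11]    # cost row of the table in this section
REACH = [25, 8, 12, 15, 10, 10, 20]  # % of potential audience reached
BUDGET = 65                          # assumed: default budget limit in this report


def decode(chromosome):
    """Return the 1-based positions of the channels switched on in the chromosome."""
    return [i + 1 for i, gene in enumerate(chromosome) if gene == 1]


def totals(chromosome):
    """Return (total cost, total % of audience reached) of the selected channels."""
    cost = sum(c for c, gene in zip(COST, chromosome) if gene)
    reach = sum(r for r, gene in zip(REACH, chromosome) if gene)
    return cost, reach


for name, chrom in [("Chromosome 1", [1, 0, 1, 0, 1, 0, 1]),
                    ("Chromosome 2", [1, 1, 0, 0, 0, 1, 1])]:
    cost, reach = totals(chrom)
    status = "feasible" if cost <= BUDGET else "over budget"
    print(name, decode(chrom), "cost:", cost, "reach:", reach, status)
```

With these table values, both example chromosomes exceed a budget of 65 (their total costs are 71 and 101), so under that constraint they would be treated as infeasible.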
4. PROGRAMMING LANGUAGE
I have chosen Python as the programming language to implement the optimization algorithm.
Python is used because it is equipped with various libraries that significantly help in translating
the steps involved in the Genetic Algorithm into a computer program.
Step 3: Selection
Selecting the two chromosomes with highest fitness.
The function is going to select two individuals from the population and then return the one
with a higher fitness value.

    def selection(population):
        cand1, cand2 = random.sample(population, k=2)
        fitness1 = fitness(cand1)
        fitness2 = fitness(cand2)
        if fitness1 > fitness2:
            return cand1
        else:
            return cand2
    newPopulation.sort(key=lambda x: fitness(x), reverse=True)
    if pastBest == newPopulation[0]:
        totalTolerance += 1
    else:
        totalTolerance = 0
        pastBest = newPopulation[0]
    if totalTolerance == tolerance:
        t = population[0]
        print(f"Solution is found at generation {i+1}: Value = {fitness(t)}")
        break
    population = newPopulation
Complete Program:
import random
from collections import namedtuple as ntp

import matplotlib.pyplot as plt

# One advertising channel: name, cost, and % of potential audience reached.
Item = ntp("Item", ["name", "cost", "aud"])

maxPopulation = 15
maxGeneration = 50
budgetLimit = 65
itemList = [
    Item("A", 50, 30),
    Item("B", 25, 15),
    Item("C", 3, 8),
    Item("D", 10, 13),
    Item("E", 7, 5),
    Item("F", 15, 10),
    Item("G", 11, 12),
]
mutationProb = 0.1
tolerance = 20
def createIndividual():
    # Random binary chromosome: one gene per advertising channel.
    return [random.randint(0, 1) for _ in range(len(itemList))]

def createPopulation():
    solution = []
    while len(solution) < maxPopulation:
        newIndividual = createIndividual()
        if newIndividual in solution:
            # Duplicate found: replace the existing copy with a mutated version.
            duplicateIndex = solution.index(newIndividual)
            solution[duplicateIndex] = mutate(newIndividual)
        else:
            # Append the individual that was just checked, not a fresh random one.
            solution.append(newIndividual)
    return solution

def fitness(individual):
    totalCost = 0
    totalAudience = 0
    for i, exist in enumerate(individual):
        if exist == 0:
            continue
        totalCost += itemList[i].cost
        if totalCost > budgetLimit:
            # Over budget: the combination is infeasible, so its fitness is 0.
            return 0
        totalAudience += itemList[i].aud
    return totalAudience

def selection(population):
    # Draw two candidates at random and keep the one with the higher fitness.
    cand1, cand2 = random.sample(population, k=2)
    fitness1 = fitness(cand1)
    fitness2 = fitness(cand2)
    if fitness1 > fitness2:
        return cand1
    else:
        return cand2

def selectPair(population):
    parent1 = selection(population)
    parent2 = selection(population)
    return parent1, parent2
def crossover(parent1, parent2):
    # Each gene is copied from either parent with equal probability.
    offspring = []
    for p1, p2 in zip(parent1, parent2):
        prob = random.random()
        if prob < 0.5:
            offspring.append(p1)
        else:
            offspring.append(p2)
    if random.random() <= mutationProb:
        offspring = mutate(offspring)
    return offspring

def mutate(individual):
    # Flip one randomly chosen gene (the list is modified in place).
    idx = random.randrange(len(individual))
    individual[idx] = individual[idx] ^ 1
    return individual
def main():
    population = createPopulation()
    population.sort(key=lambda x: fitness(x), reverse=True)
    totalTolerance = 0
    fitnessPlot = []
    pastBest = population[0]
    for i in range(maxGeneration):
        fitnessValues = [fitness(individual) for individual in population]
        meanFitness = round(sum(fitnessValues) / len(fitnessValues), 2)
        fitnessPlot.append(meanFitness)
        print(f"Generation {i+1}: {population[0]} - Fitness: {fitness(population[0])} - Mean Fitness : {meanFitness}")
        newPopulation = []
        while len(newPopulation) < maxPopulation:
            # Parents are drawn from the fitter half of the current population.
            parent1, parent2 = selectPair(population[:int(maxPopulation * 0.5)])
            offspring = crossover(parent1, parent2)
            newPopulation.append(offspring)
        newPopulation.sort(key=lambda x: fitness(x), reverse=True)
        # Count consecutive generations in which the best individual is unchanged.
        if pastBest == newPopulation[0]:
            totalTolerance += 1
        else:
            totalTolerance = 0
            pastBest = newPopulation[0]
        if totalTolerance == tolerance:
            t = population[0]
            print(f"Solution is found at generation {i+1}: Value = {fitness(t)}")
            break
        population = newPopulation
    # Plot the mean fitness recorded for each generation that was run.
    plt.plot(range(1, i + 2), fitnessPlot, marker='o')
    plt.title('Fitness Level per Generation')
    plt.xlabel('Generation')
    plt.ylabel('Fitness')
    plt.show()

if __name__ == "__main__":
    main()
6. RESULTS AND DISCUSSION
To compare results, the parameters of the algorithm are varied and the algorithm is run over
multiple generations in search of the optimal solution; the mean fitness of each generation is
calculated and displayed in a graph. In these comparisons, the following parameters are changed:
i) mutation rate (default: 0.1)
ii) maximum generation (default: 50)
iii) maximum population (default: 15)
iv) budget limit (default: 65)
Experiment   mutationProb   tolerance   budgetLimit   maxGeneration   maxPopulation
1            0.2            20          65            50              15
2            0.2            20          65            100             15
3            0.2            20          65            50              5
4            0.2            20          80            100             5
From the results, we can see that changing the parameters also changes the solution found.
Changing the parameters allows the algorithm to adjust the population accordingly while it
searches for the best probable solution. For example, in Experiment 4, where the budget limit is
raised from 65 to 80, the final fitness value obtained is 111, whereas in Experiment 3, where the
budget limit is set at 65, the final fitness value obtained is 70. The solution found is not
guaranteed to be the best one, but it gives an indication of what the best solution to the problem
could be.
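Because n = 7, there are only 2^7 = 128 possible combinations, so one way to check how close a run comes to the best solution is to enumerate all of them. The sketch below does this for the item values copied from `itemList` in the complete program; the function name `brute_force` is a hypothetical helper and is not part of the original program.

```python
from itertools import product

# (name, cost, audience) values copied from itemList in the complete program.
ITEMS = [("A", 50, 30), ("B", 25, 15), ("C", 3, 8), ("D", 10, 13),
         ("E", 7, 5), ("F", 15, 10), ("G", 11, 12)]


def brute_force(items, budget):
    """Enumerate all 2**n selections; return (best audience, best selection)."""
    best_value, best_choice = 0, (0,) * len(items)
    for choice in product((0, 1), repeat=len(items)):
        cost = sum(c for (_, c, _), bit in zip(items, choice) if bit)
        if cost > budget:
            continue  # violates the budget constraint
        value = sum(a for (_, _, a), bit in zip(items, choice) if bit)
        if value > best_value:  # strict '>' keeps the first of any tied optima
            best_value, best_choice = value, choice
    return best_value, best_choice


if __name__ == "__main__":
    for budget in (65, 80):
        value, choice = brute_force(ITEMS, budget)
        names = [name for (name, _, _), bit in zip(ITEMS, choice) if bit]
        print(f"budget={budget}: best audience={value}, channels={names}")
```

Exhaustive search is only practical here because the number of channels is small; the search space doubles with every additional channel, which is why a Genetic Algorithm is used for larger instances.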
7. SUGGESTION FOR IMPROVEMENT
To enhance the performance of the genetic algorithm for solving the advertising strategies
problem, several improvements can be considered. Firstly, incorporating a more diverse set of
crossover operators, such as uniform crossover or multi-point crossover, could help explore the
solution space more effectively. This diversity can prevent premature convergence and enable the
algorithm to discover better solutions.
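The `crossover` function in the complete program already copies each gene from either parent with probability 0.5. As one possible alternative operator, a two-point crossover could be sketched as below. The function name `two_point_crossover` and the optional `rng` parameter are assumptions made for illustration.

```python
import random


def two_point_crossover(parent1, parent2, rng=random):
    """Swap the segment between two random cut points; return two children."""
    if len(parent1) != len(parent2):
        raise ValueError("parents must have the same length")
    n = len(parent1)
    if n < 3:
        # Too short for two interior cut points: return plain copies.
        return list(parent1), list(parent2)
    a, b = sorted(rng.sample(range(1, n), 2))  # cut points with 1 <= a < b <= n - 1
    child1 = list(parent1[:a]) + list(parent2[a:b]) + list(parent1[b:])
    child2 = list(parent2[:a]) + list(parent1[a:b]) + list(parent2[b:])
    return child1, child2


if __name__ == "__main__":
    c1, c2 = two_point_crossover([0] * 7, [1] * 7)
    print(c1, c2)
```

Unlike gene-by-gene mixing, this operator keeps contiguous blocks of genes from each parent together.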
Additionally, implementing a dynamic mutation strategy, where the mutation rate adapts
based on the convergence status of the population, can strike a balance between exploration and
exploitation. Furthermore, employing a more sophisticated fitness function that considers not
only the total value but also penalizes solutions for exceeding the knapsack capacity can guide
the algorithm towards more feasible and practical solutions.
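The two ideas in this paragraph could be sketched as follows. In the complete program, an over-budget chromosome simply receives a fitness of zero; the first function instead subtracts a penalty proportional to the overrun, and the second raises the mutation rate while the best solution stalls. The function names and the `penalty`, `base`, `step`, and `cap` values are illustrative assumptions, not tuned settings.

```python
def penalized_fitness(individual, costs, values, budget, penalty=2.0):
    """Total value minus `penalty` per unit of budget overrun (never below 0)."""
    total_cost = sum(c for c, gene in zip(costs, individual) if gene)
    total_value = sum(v for v, gene in zip(values, individual) if gene)
    overrun = max(0, total_cost - budget)
    return max(0.0, total_value - penalty * overrun)


def adaptive_mutation_rate(stalled_generations, base=0.1, step=0.05, cap=0.5):
    """Increase the mutation rate for each generation without improvement."""
    return min(cap, base + step * stalled_generations)


if __name__ == "__main__":
    costs = [50, 25, 3, 10, 7, 15, 11]   # costs from itemList in the complete program
    values = [30, 15, 8, 13, 5, 10, 12]  # audience values from itemList
    print(penalized_fitness([0, 1, 1, 1, 1, 1, 1], costs, values, budget=65))
    print(adaptive_mutation_rate(stalled_generations=4))
```

A graded penalty lets the algorithm distinguish between a chromosome that is slightly over budget and one that is far over budget, which a flat zero cannot do.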
Lastly, employing elitism, which ensures that the best solutions from one generation are
directly passed on to the next, can help preserve valuable genetic material and accelerate
convergence towards optimal solutions. These enhancements collectively aim to make the
genetic algorithm more robust, adaptable, and capable of efficiently navigating the complex
solution space of the knapsack problem.
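A minimal sketch of elitism, assuming the caller supplies a fitness function and a function that produces one offspring from the ranked population. The names `next_generation_with_elitism`, `make_offspring`, and `elite_count` are hypothetical and do not appear in the original program.

```python
def next_generation_with_elitism(population, fitness, make_offspring, elite_count=2):
    """Copy the top `elite_count` individuals unchanged, then fill up with offspring."""
    ranked = sorted(population, key=fitness, reverse=True)
    # Copy the elites so that later in-place mutation cannot alter them.
    new_population = [list(individual) for individual in ranked[:elite_count]]
    while len(new_population) < len(population):
        new_population.append(make_offspring(ranked))
    return new_population


if __name__ == "__main__":
    # Toy example: fitness is the number of 1s; offspring are all-zero placeholders.
    population = [[0, 0], [1, 0], [1, 1], [0, 1]]
    print(next_generation_with_elitism(population, sum, lambda ranked: [0, 0]))
```

Because the best individuals are carried over unchanged, the best fitness in the population can never decrease from one generation to the next.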
8. CONCLUSION