
2021 8th NAFOSTED Conference on Information and Computer Science (NICS)

Performance Evaluation of Quine-McCluskey Method on Multi-core CPU
2021 8th NAFOSTED Conference on Information and Computer Science (NICS) | 978-1-6654-1001-4/21/$31.00 ©2021 IEEE | DOI: 10.1109/NICS54270.2021.9701506

Hoang-Gia Vu, Ngoc-Dai Bui, Anh-Tu Nguyen, Thanh-Bang Le
Faculty of Radio-Electronic Engineering, Le Quy Don Technical University, Ha Noi, Vietnam
[email protected]

Abstract—The Quine-McCluskey method is an algorithm for minimizing Boolean functions. Although the method can be programmed on computers, it takes a long time to return the set of prime implicants, which slows the analysis and design of digital logic circuits and, in turn, the dynamic reconfiguration of programmable logic devices. In this paper, we first propose a data representation for storing implicants in memory that reduces the cache misses of the program. We then propose an algorithm to find all prime implicants of a Boolean function. The algorithm aims to reuse the data already available in the cache, thus decreasing cache misses. After that, we propose an algorithm for step 2 of the Quine-McCluskey method to select a minimal number of essential prime implicants. The evaluation shows that our proposals achieve much higher performance than the original Quine-McCluskey method. The number of essential prime implicants is a low percentage, less than 50%, of the total prime implicants generated in step 1 of the method.

Keywords—Quine-McCluskey, prime implicant, multithreading, Boolean function

I. INTRODUCTION

A Boolean function produces a Boolean output by logical calculation on Boolean inputs. It is a key element in the analysis, design, and implementation of digital logic circuits. Minimizing a Boolean function means optimizing the algorithm of the function to obtain a simpler structure, which in turn simplifies the corresponding digital logic circuit. There are two popular methods to minimize Boolean functions: 1) the Karnaugh method, which is based on a graphical representation of Boolean functions [1], and 2) the Quine-McCluskey method, which generates prime implicant lists using the tabulation method [2]. The latter was first proposed by Quine [3, 4] and then improved by McCluskey. The Quine-McCluskey method is functionally identical to the Karnaugh method. However, while Karnaugh mapping is suitable for Boolean functions of a few input variables, the Quine-McCluskey algorithm is dedicated to Boolean functions with a large number of Boolean inputs [5]. Therefore, the Karnaugh method is often used in education, whereas the Quine-McCluskey method is employed in practice for the analysis and design of digital logic circuits in real-world applications.

However, the computational complexity of the Quine-McCluskey method is $O(N^{\log_2 3} \log^2 N)$, where $N$ is the input length [6]. The run-time of the method also grows exponentially with the number of input variables. This slows down the analysis, design, and verification of digital logic circuits. The problem becomes more serious in the design of dynamically run-time reconfigurable hardware architectures or adapted hardware architectures.

The Quine-McCluskey method consists of two steps:

Step 1: Finding all prime implicants of the Boolean function.
Step 2: Selecting the essential prime implicants that cover all the minterms of the function.

Both steps are memory-intensive since they involve many repeated memory references. We believe that both steps can be accelerated on a cached CPU by exploiting temporal and spatial data locality. The contributions of this paper are as follows:

1) We propose a bitarray-based data representation for implicants that consumes little memory. This helps to reduce the cache miss rate of the method running on the CPU.
2) We propose an algorithm for step 1 of the method that exploits data locality, thus minimizing the cache misses of the step running on the CPU.
3) We propose an algorithm for step 2 of the method that minimizes the number of prime implicants required to cover all the minterms of the Boolean function.

The rest of the manuscript is organized as follows: Section II discusses the related work. Section III presents the data representation of implicants. Section IV presents the proposed algorithms of the method. Section V shows the evaluation. The conclusion is summarized in Section VI.

II. RELATED WORK

Wegener proved that the minimization of Boolean functions is a hard problem [7]. Prasad et al. analyzed the simplification achieved by the Quine-McCluskey method for different numbers of product terms [8]. The complexity of the Quine-McCluskey method was mathematically modelled in the following equation [8]:


$$N = a \cdot t^{b} \cdot e^{-ct} + 1 \qquad (1)$$

where
N: the number of literals,
t: the number of non-repeating product terms in the Boolean function,
a, b, and c: three constants depending on the number of input variables.

Several works have aimed to simplify Boolean functions quickly and automatically. Dusa et al. proposed eQMC to reduce the computational complexity of the Quine-McCluskey method by taking into account only the minterms whose corresponding outputs are ‘1’ [9]. The eQMC method performs an exhaustive procedure that relies on index vectors instead of complex matrices. The method achieved higher performance and smaller memory usage than their implementation of the original Quine-McCluskey method. However, the performance was still low: the execution time was 4.87 seconds with 15 input variables and only 20 observed configurations. Gurunath et al. introduced an algorithm for multiple output minimization [10]. Jain et al. [11] optimized the Quine-McCluskey method by introducing the concept of Reduced Mask. The new concept helped to reduce the computational complexity of the method. As a result, the execution time decreased significantly. Majumder et al. presented a technique based on decimal values to decrease the probability of an error occurrence [12].

A couple of works address the acceleration of the Quine-McCluskey method. Siladi et al. proposed a scheme to adapt step 1 of the Quine-McCluskey method to a parallel computing platform, the GPU [13, 14]. In these works, the authors presented a parallel algorithm for step 1 on the GPU. They process the implicants in multiple rounds. In each round of step 1, implicants are first partitioned by the positions of dashes in the terms. Each partition is then scanned for mergeable terms. In their merging algorithm, the list of terms is first converted to a bitmap representation, so that each term is a bit set. Dashes in the term are treated as zeroes when calculating the bit indexes. For large numbers of input variables, the algorithm achieved only slightly higher performance than the implementation on the CPU. Small instances running on the GPU are even slower than those running on the CPU. These works did not take into account the output value ‘x’ (don’t care) of the Boolean function.

III. BITARRAY-BASED DATA REPRESENTATION

To reduce the memory usage and the cache misses of the Quine-McCluskey method, we propose to represent implicants in the form of bit arrays instead of ASCII characters. In particular, the symbols ‘1’, ‘0’, and ‘-’ are represented as follows:

Symbol ‘1’ → bit array ‘01’
Symbol ‘0’ → bit array ‘00’
Symbol ‘-’ → bit array ‘10’

As a result, the implicant (10-00-10) is represented in the following form:

(10-00-10) → bit array ‘0100100000100100’

The implicant (10-00-10) consumes 8 bytes if it is represented as ASCII characters, but only 2 bytes (16 bits) if it is represented as a bit array. Therefore, the bitarray-based representation uses four times less memory than the ASCII-based representation.

To compare two bit arrays and find out whether they can be combined into a new implicant, we propose to use the XOR operator as in Algorithm 1. If the two implicants differ in only one symbol, then the XOR of the two corresponding bit arrays returns a bit array containing exactly one bit ‘1’. It is noted that the comparison is only performed between implicants having the same number of dashes. In the function, the XOR of the two bit arrays is executed first, followed by counting the number of ‘1’ bits in the result. If the number of ‘1’ bits is equal to one, the position of that bit is returned. Otherwise, ‘-1’ is returned.

Algorithm 1: Comparison of two bit arrays
1: def compare_2(a, b):
2:     temp = a ^ b
3:     if temp.count(1) == 1:
4:         return temp.find(1)
5:     else:
6:         return -1

IV. THE ALGORITHMS FOR THE QUINE-MCCLUSKEY METHOD

In this paper, we focus on optimizing both step 1 and step 2 of the Quine-McCluskey method.

A. Algorithm for Step 1: Finding all Prime Implicants

In step 1 of the Quine-McCluskey method, the major operations are memory accesses to read all the implicants and comparisons of implicants. We believe that the performance bottleneck in this step is the huge number of memory references that are repeated again and again. In this part, we propose an algorithm to exploit temporal data locality in cache memory. The pseudocode is described in Algorithm 2.

• Let m be the number of input variables in the Boolean function.
• Let list-1 be the list of all minterms whose corresponding output evaluates to ‘1’.
• Let list-x be the list of all minterms whose corresponding output evaluates to ‘x’ (don’t care).
• Let implicant-list be the list of all implicants in each round of comparison.
• Let prime-list be the list of all prime implicants found after step 1 of the method.
• Let new-implicant-list be the new implicant list generated after each round of comparison and merging.
• Let combined be the Boolean variable indicating whether any two implicants have been combined. It starts at True.

In the while loop, combined is first assigned False. Then all the implicants are classified into groups as in Line 10. Implicants belonging to the same group have the same number of symbols


‘1’. Therefore, there are at most m+1 groups. The implicants in each group are then compared with the implicants of the next consecutive group to find out whether they can be merged into a new implicant, as in Line 11 to Line 19. If any combination is found, the variable combined is set to True in Line 21. That means at least one new implicant has been generated, so another round of the while loop is required. At the same time, the two combined implicants are marked as used.

It is noted that in the first for loop, a variable named reversed is used to indicate whether the third for loop should be iterated in reversed order or in normal order. This variable starts at False as in Line 12. Iterating the loop in reversed order helps to reuse the data in cache memory that were fetched recently from the lower-level memory, before those data are replaced in the cache. As a result, the cache miss rate decreases. This is the key point of Algorithm 2.

All the marked-as-used implicants are then removed from implicant-list before the remaining implicants are added to prime-list, as in Line 32. After that, implicant-list is assigned new-implicant-list before going to the next round of the while loop.

Algorithm 2: Finding all prime implicants
 1: m = # input variables
 2: list-1 = [minterms]
 3: list-x = [d-terms]
 4: implicant-list = list-1.append(list-x)
 5: prime-list = []
 6: new-implicant-list = []
 7: combined = True
 8: while (combined):
 9:     combined = False
10:     groups = make_groups(implicant-list)
11:     for i in range(m):
12:         reversed = False
13:         for x1 in groups[i]:
14:             if reversed:
15:                 range-list = groups[i+1].reverse()
16:             else:
17:                 range-list = groups[i+1]
18:             for x2 in range-list:
19:                 pos = compare_2(x1, x2)
20:                 if pos != -1:
21:                     combined = True
22:                     new-term = x1
23:                     new-term[pos -1] = 1
24:                     new-term[pos] = 0
25:                     x1.used = True
26:                     x2.used = True
27:                     new-implicant-list.append(new-term)
28:             reversed = not reversed
29:     for impl in implicant-list:
30:         if impl.used:
31:             implicant-list.remove(impl)
32:     prime-list.append(implicant-list)
33:     implicant-list = new-implicant-list

B. Algorithm for Step 2: Selection of Essential Prime Implicants

In step 2 of the Quine-McCluskey method, essential prime implicants are selected from the full set of prime implicants produced by step 1. In this step, we aim to choose the smallest number of prime implicants that cover all the minterms of the Boolean function. To that end, we propose the pseudocode in Algorithm 3. After step 1 of the Quine-McCluskey method, prime-list contains all prime implicants of the Boolean function.

• Let final-prime-list be the final list of essential prime implicants covering all the minterms of the function.
• Let victim be the prime implicant that is selected in each round of the while loop. victim is the prime implicant that was merged from the largest number of minterms.
• Let max-len be the maximum length over the minterm sets of all prime implicants.
• Let val-1 be the minterm set of a prime implicant.

In the while loop, max-len is assigned zero in Line 5. max-len is then found by iterating over prime-list and comparing the lengths of the minterm sets of all prime implicants. It is noted that after a victim is found in each round of the while loop, the victim is removed from prime-list as in Line 11. In the next round, the minterm set of each remaining prime implicant is updated as in Line 7, by removing the minterms already covered by the victim, before max-len is searched again.

Algorithm 3: Selection of Essential Prime Implicants
 1: final-prime-list = []
 2: victim = None
 3: max-len = 1
 4: while max-len != 0:
 5:     max-len = 0
 6:     for prime in prime-list:
 7:         prime.val-1.difference_update(victim.val-1)
 8:         if len(prime.val-1) > max-len:
 9:             max-len = len(prime.val-1)
10:             temp = prime
11:     prime-list.remove(temp)
12:     victim = temp
13:     final-prime-list.append(temp)

V. EVALUATION

Table I shows the experimental setup we used to evaluate the proposed algorithms. In this evaluation, the number of input variables in the Boolean function is scaled from 10 to 20. For a k-input Boolean function, the number of possible input vectors is $2^k$, each of which evaluates to ‘0’, ‘1’, or ‘x’ (don’t care).

• Let fill-factor-1 be the ratio between the number of input vectors that evaluate to ‘1’ and the total number of input vectors, $2^k$.


• Let fill-factor-x be the ratio between the number of input vectors that evaluate to ‘x’ and the total number of input vectors, $2^k$.

In this section, we describe the cache performance, the execution time, and the number of essential prime implicants of our algorithms. For 10 input variables, we scale fill-factor-1 and fill-factor-x from 0.1 to 0.4.

TABLE I. EVALUATION SETUP

CPU                       Intel Core i3-7100
Number of cores           4
CPU operation frequency   800 MHz
L1 Dcache                 32 KB
L1 Icache                 32 KB
L2 cache                  256 KB
L3 cache                  3 MB
Cache block size          64 Bytes
Operating system          Ubuntu 18.4
Cache profiling tool      Perf 5.4.143

A. Cache Performance

Table II shows the L1 data cache load misses and the corresponding miss rate for the original Quine-McCluskey method and for our proposal, evaluated on a Boolean function with 10 input variables while scaling the fill factor. As can be seen from the table, the number of L1 cache load misses in our proposal is smaller than that of the original Quine-McCluskey method. The reduced number of cache load misses is a consequence of the compact data representation described in Section III and the algorithm presented in Section IV.A. The same holds in Table III, where the number of input variables is scaled and the fill factor is fixed at 0.1. Although the miss rate is generally higher in our proposal, except for the function with 10 input variables and a fill factor of 0.1, the larger number of L1 data cache load misses causes a longer total miss penalty in the original Quine-McCluskey method. As a result, the execution time of our proposal is expected to be shorter than that of the original Quine-McCluskey method.

TABLE II. CACHE MISS FOR DIFFERENT FILL FACTORS

fill-factor   L1-load-misses (Original)   Miss-rate   L1-load-misses (Our proposal)   Miss-rate
0.1           3,612,183                   4.12%       3,363,111                       3.45%
0.2           4,194,506                   0.54%       4,153,117                       1.15%
0.3           45,486,911                  0.15%       36,471,029                      0.37%
0.4           22,437,678,570              0.12%       19,667,227,871                  0.35%

TABLE III. CACHE MISS FOR DIFFERENT NUMBER OF INPUT VARIABLES

Inputs   L1-load-misses (Original)   Miss-rate   L1-load-misses (Our proposal)   Miss-rate
10       3,612,183                   4.12%       3,363,111                       3.45%
15       203,862,212                 0.21%       166,546,860                     0.55%
20       321,717,266,536             0.19%       225,112,520,565                 0.52%

B. Execution Time

Table IV shows the execution time of step 1 and step 2 of the Quine-McCluskey method for different fill factors. In this experiment, we first executed step 1 of the original method on a single core. Then step 1 was executed with our proposals from Section III and Section IV.A on a single core. After that, step 1 with our proposals was executed in multiple threads running on the multi-core CPU. The results show that our proposal achieved much higher performance than the original Quine-McCluskey method. The same holds in Table V, where the number of input variables is scaled. This higher performance comes from the smaller number of cache load misses revealed in Section V.A.

TABLE IV. EXECUTION TIME OF STEP 1 AND STEP 2 FOR DIFFERENT FILL FACTOR

fill-factor   Step 1 (Original)   Step 1 (Our proposal)   Step 1 (Multi-thread)   Step 2
0.1           0.0379              0.0043                  0.0057                  0.0014
0.2           0.1806              0.0641                  0.0588                  0.0116
0.3           7.2595              2.4564                  2.0784                  0.1597
0.4           4,301.5054          1,453.5304              1,235.8151              59.5760

TABLE V. EXECUTION TIME OF STEP 1 AND STEP 2 FOR DIFFERENT NUMBER OF INPUT VARIABLES

Inputs   Step 1 (Original)   Step 1 (Our proposal)   Step 1 (Multi-thread)   Step 2
10       0.0379              0.0043                  0.0057                  0.0014
15       22.3329             6.0953                  5.1162                  2.1205
20       41629.7808          8994.4525               7503.7638               3101.154

As can be seen in the two tables, our proposal running in multiple threads achieved only slightly higher performance than running on a single thread, apart from the experiment with 10 input variables and a fill factor of 0.1, where the multi-threaded run was slightly slower. This is because the Quine-McCluskey method is a memory-intensive application: the majority of the execution time is consumed by memory accesses, not by computation in the CPU cores. The two tables also show that the execution time of step 2 is much smaller than that of step 1. However, it is noted that the proposed algorithm for step 2 does not focus on improving the performance, but on reducing the number of essential prime implicants.
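To make the two measured steps more concrete, the following is a minimal Python sketch of the encoding of Section III, the XOR comparison of Algorithm 1, a plain version of step 1, and the greedy selection of Algorithm 3. It is an illustration under stated assumptions, not the implementation evaluated here: implicants are packed into plain Python integers instead of a bit-array type, step 1 compares all pairs rather than grouping by the number of ‘1’ symbols and alternating the iteration direction, no multithreading is used, and the helper names `encode`, `merge`, `covers`, `prime_implicants`, and `greedy_cover` are hypothetical.

```python
from itertools import combinations

# Two-bit codes from Section III: '1' -> 01, '0' -> 00, '-' -> 10.
SYMBOL_BITS = {"1": 0b01, "0": 0b00, "-": 0b10}


def encode(term):
    """Pack an implicant string such as '10-00-10' into an int, 2 bits per symbol."""
    value = 0
    for symbol in term:
        value = (value << 2) | SYMBOL_BITS[symbol]
    return value


def compare_2(a, b):
    """Return the index (0 = least significant) of the only differing bit, else -1."""
    diff = a ^ b
    if diff != 0 and diff & (diff - 1) == 0:  # exactly one bit is set
        return diff.bit_length() - 1
    return -1


def merge(a, pos):
    """Overwrite the 2-bit symbol whose low bit is at index pos with a dash (10)."""
    return (a & ~(0b11 << pos)) | (0b10 << pos)


def prime_implicants(terms):
    """Simplified step 1: merge pairs round by round until nothing combines."""
    current, primes = set(terms), set()
    while current:
        used, merged = set(), set()
        for a, b in combinations(current, 2):
            pos = compare_2(a, b)
            # An even index means the two terms differ as '0' versus '1';
            # an odd index would mean '-' versus '0', which must not merge.
            if pos != -1 and pos % 2 == 0:
                merged.add(merge(a, pos))
                used.update((a, b))
        primes |= current - used  # terms that never merged are prime
        current = merged
    return primes


def covers(prime, minterm, n):
    """True if every non-dash symbol of prime equals the minterm's symbol."""
    for i in range(n):
        p = (prime >> (2 * i)) & 0b11
        if p != 0b10 and p != (minterm >> (2 * i)) & 0b11:
            return False
    return True


def greedy_cover(primes, minterms, n):
    """Repeatedly pick the prime covering the most still-uncovered minterms."""
    remaining = {p: {m for m in minterms if covers(p, m, n)} for p in primes}
    chosen = []
    while True:
        best = max(remaining, key=lambda p: len(remaining[p]), default=None)
        if best is None or not remaining[best]:
            return chosen
        covered = remaining.pop(best)
        chosen.append(best)
        for minterm_set in remaining.values():
            minterm_set.difference_update(covered)  # as in Line 7 of Algorithm 3
```

For example, `format(encode("10-00-10"), "016b")` gives the bit string shown in Section III, and for the three-variable minterms 000, 001, 010, 011, and 111 the sketch finds the primes 0-- and -11 and selects both. Because the selection is greedy, it is not guaranteed to return a minimum cover in general.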


C. Number of Essential Prime Implicants

Table VI and Table VII show the number of prime implicants before and after step 2 of the Quine-McCluskey method. As can be seen from the tables, the number of essential prime implicants obtained after step 2 is significantly lower than the number of prime implicants. The number of essential prime implicants is always smaller than the number of minterms and much smaller than the total number of prime implicants. In particular, the number of essential prime implicants is less than 50% of the total number of prime implicants. For the high fill factors, 0.3 and 0.4, the percentages are even lower than 2%: 1.4% at a fill factor of 0.3 and 0.003% at a fill factor of 0.4. Table VI reveals that the higher the fill factor, the lower the percentage of prime implicants that are essential.

TABLE VI. NUMBER OF ESSENTIAL PRIME IMPLICANTS FOR DIFFERENT FILL FACTOR

fill-factor   Minterms   Primes      Essential primes
0.1           104        146         72 (49.3%)
0.2           205        673         106 (15.8%)
0.3           308        7,952       110 (1.4%)
0.4           411        3,485,799   97 (0.003%)

TABLE VII. NUMBER OF ESSENTIAL PRIME IMPLICANTS FOR DIFFERENT NUMBER OF INPUT VARIABLES

Inputs   Minterms   Primes    Essential primes
10       104        146       72 (49.3%)
15       3,278      6919      2048 (29.6%)
20       104,857    296,035   60,682 (20.5%)

VI. CONCLUSION

In this paper, we attribute the performance bottleneck of the Quine-McCluskey method to memory access, since the application is memory-intensive. We propose a bitarray-based data representation for implicants of Boolean functions and an algorithm to find all the prime implicants. The proposals exploit the data locality of cache memories. The experimental results show that our proposal causes fewer cache load misses than the original Quine-McCluskey method does, leading to higher performance. In this work, we also minimize the number of essential prime implicants. The results show that the percentage of prime implicants that become essential is small, less than 50%. This helps to reduce the hardware cost of implementing the Boolean function. In future work, we will take into account the minimization of multi-output Boolean functions as well as a framework for the design and verification of Boolean functions.

References

[1] M. Karnaugh, “The map method for synthesis of combinational logic circuits,” Transactions of the American Institute of Electrical Engineers, vol. 72 part I, pp. 593–598, 1953.
[2] E. J. McCluskey, “Minimization of boolean functions,” Bell System Technical Journal, vol. 35, Issue 6, pp. 1417–1444, 1956.
[3] W. V. Quine, “The problem of simplifying truth functions,” The American Mathematical Monthly, vol. 59, no. 8, pp. 521–531, Mathematical Association of America, 1952.
[4] W. V. Quine, “A way to simplify truth functions,” The American Mathematical Monthly, vol. 62, no. 9, pp. 627–631, Mathematical Association of America, 1955.
[5] M. Petrík, “Quine–McCluskey method for many-valued logical functions,” Soft Computing, vol. 12, Issue 4, pp. 393–402, Springer-Verlag, 2007.
[6] S. P. Tomaszewski, I. U. Celik, G. E. Antoniou, “WWW-based boolean function minimization,” International Journal of Applied Mathematics and Computer Science, vol. 13, no. 4, pp. 577–583, 2003.
[7] I. Wegener, “The complexity of boolean functions,” John Wiley & Sons, Inc. New York, NY, USA, 1987.
[8] P. W. Chandana Prasad, Azam Beg, Ashutosh Kumar Singh, “Effect of Quine-McCluskey simplification on boolean space complexity,” In: Innovative Technologies in Intelligent Systems and Industrial Applications, CITISIA 2009, IEEE, 2009.
[9] A. Duşa, A. Thiem, “Enhancing the minimization of boolean and multivalue output functions with eQMC,” The Journal of Mathematical Sociology, 39:2, pp. 92–108, 2015.
[10] B. Gurunath, N.N. Biswas, “An algorithm for multiple output minimization,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 8, Issue 9, pp. 1007–1013, IEEE, 1989.
[11] T. K. Jain, D. S. Kushwaha, A. K. Misra, “Optimization of the Quine-McCluskey method for the minimization of the boolean expressions,” Fourth International Conference on Autonomic and Autonomous Systems (ICAS’08), Gosier, 2008, pp. 165–168.
[12] A. Majumder, B. Chowdhury, A. J. Mondai, K. Jain, “Investigation on Quine McCluskey method: A decimal manipulation based novel approach for the minimization of boolean function,” International Conference on Electronic Design, Computer Networks & Automated Verification (EDCAV), 2015, DOI: 10.1109/EDCAV.2015.7060531.
[13] V. Siládi, T. Filo, “Quine-McCluskey algorithm on GPGPU,” In: AWER Procedia Information Technology and Computer Science, vol. 4 3rd World Conference on Innovation and Computer Science (INSODE-2013), pp. 814–820, 2013.
[14] V. Siládi, M. Povinsky, L. Trajtel, “Adapted parallel Quine-McCluskey algorithm using GPGPU,” 14th International Scientific Conference on Informatics, pp. 327–331, 2017.
