
1422 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 37, NO. 7, JULY 2018

Logic Synthesis for RRAM-Based In-Memory Computing

Saeideh Shirinzadeh, Student Member, IEEE, Mathias Soeken, Member, IEEE, Pierre-Emmanuel Gaillardon, Senior Member, IEEE, and Rolf Drechsler, Fellow, IEEE

Abstract—Design of nonvolatile in-memory computing devices has attracted high attention to resistive random access memories (RRAMs). We present a comprehensive approach for the synthesis of resistive in-memory computing circuits using binary decision diagrams, and-inverter graphs, and the recently proposed majority-inverter graphs for logic representation and manipulation. The proposed approach allows parallel computing to be performed on a multirow crossbar architecture for the logic representations of the given Boolean functions throughout a level-by-level implementation methodology. It also provides alternative implementations utilizing two different logic operations for each representation, and optimizes them with respect to the number of RRAM devices and operations, addressing area and delay, respectively. Experiments show that upper bounds of the aforementioned cost metrics for the implementations obtained by our synthesis approach are considerably improved in comparison with the corresponding existing methods, in both area and especially latency.

Index Terms—BDD, in-memory computing, logic synthesis, RRAM.

I. INTRODUCTION

The abrupt switching capability of an oxide insulator sandwiched by two metal electrodes was known from the 1960s, but it did not come into interest for several decades until feasible device structures were proposed. Nowadays, a variety of two-terminal devices based on the resistance switching property exist which use different materials. These devices possess resistive switching characteristics between two high and low resistance values and are known by various acronyms, such as OxRAM, ReRAM, and resistive random access memory (RRAM) [1]. RRAM devices have also attracted high attention due to the theory of memristors proposed in 1971 [2], since they possess the same resistive characteristics [3].

High scalability of RRAMs [1] makes it possible to implement ultra dense resistive memory arrays [4]. Such architectures using memristive devices are of high interest for their possible applications in nonvolatile memory design [5], [6], digital and analog programmable systems [7]–[9], and neuromorphic computing structures [10].

In [11], it was shown that material implication (IMP) can be used for logic synthesis with resistive devices. In the same work, a memristive NAND gate was proposed which enables the realization of any Boolean function. This allows advanced computer architectures different from classical von Neumann architectures by providing memories capable of computing [12], [13]. Although RRAM-based implication logic is sufficient to express any Boolean function, the number of required computational steps to synthesize a given function is a real drawback [14], which has only been addressed by a few works [13], [15].

So far, various memristive logic circuits based on IMP operators have been proposed. An RRAM-based 2-to-1 multiplexer (MUX) containing six RRAM devices that requires seven IMP operations was proposed in [16]. In [17], a similar structure, but more efficient in the number of RRAM devices and operations, was used for the synthesis of Boolean functions based on binary decision diagrams (BDDs). Besides BDDs, and-inverter graphs (AIGs) have also been used for logic synthesis with resistive memories [18]. However, none of these works optimize the utilized data structures with respect to the cost metrics of in-memory computing circuit design.

A novel homogeneous logic representation structure, the majority-inverter graph (MIG), was proposed in [19]; it uses the majority function together with negation as the only logic operations. MIGs allow higher speeds in the design of logic circuits and field-programmable gate array implementations [20]. In comparison with the well-known data structures BDDs and AIGs, MIGs have experimentally shown better results in logic optimization, especially in propagation delay [19]. In particular, MIGs are highly qualified for logic synthesis of RRAM-based circuits since they can efficiently execute the built-in resistive majority operation in RRAM devices [12].

In this paper, we present a comprehensive approach for logic synthesis of RRAM-based in-memory computing circuits using the three mentioned data structures for efficient representation, i.e., BDDs, AIGs, and MIGs. The presented approach includes the following contributions.

1) We present two realizations for each data structure's primitives: a) a realization based on IMP and b) a realization that exploits the built-in resistive majority property of RRAM devices [12], denoted by built-in majority operation (MAJ).

Manuscript received March 10, 2017; revised June 30, 2017; accepted August 17, 2017. Date of publication September 7, 2017; date of current version June 18, 2018. This work was supported in part by the University of Bremen's graduate school SyDe through the German Excellence Initiative, in part by CyberCare under Grant H2020-ERC-2014-ADG 669354, in part by the Swiss National Science Foundation Projects 200021 146600 and 200021 169084, and in part by the United States-Israel Binational Science Foundation under Grant 2016016. This paper was recommended by Associate Editor T. Mitra. (Corresponding author: Saeideh Shirinzadeh.)

S. Shirinzadeh is with the Department of Mathematics and Computer Science, University of Bremen, 28359 Bremen, Germany (e-mail: [email protected]).

M. Soeken is with the Integrated Systems Laboratory, EPFL, 1015 Lausanne, Switzerland.

P.-E. Gaillardon is with the Electrical and Computer Engineering Department, University of Utah, Salt Lake City, UT 84112 USA.

R. Drechsler is with Cyber-Physical Systems, DFKI GmbH, 28359 Bremen, Germany, and also with the Department of Mathematics and Computer Science, University of Bremen, 28359 Bremen, Germany.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCAD.2017.2750064
0278-0070 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

2) For each logic representation, we present optimization algorithms with respect to the number of RRAM devices or operations. For BDDs and MIGs, we also propose multiobjective optimization algorithms to lower both cost metrics of in-memory computing, addressing the area and delay of the resulting implementations. Experiments confirm the efficiency of the proposed optimization algorithms in comparison with existing approaches.

3) We present efficient design methodologies which enable a certain amount of parallel computing on a multirow multicolumn crossbar, according to the features of the optimized logic representations. We show that the suggested approach guarantees the validity of computations and avoids data distortion during parallel computing, requiring only small crossbar dimensions for any Boolean function.

4) We provide a range of design preferences regarding area and latency for logic-in-memory computing, by surveying the three data structures widely used for logic synthesis over two basic operations, i.e., IMP and MAJ.

The remainder of this paper is organized as follows. Section II introduces the employed logic representations and the logic operations for in-memory computing design, and discusses the related work. In Section III, we present our synthesis method for RRAM-based in-memory computing design and the experimental results, separately for BDDs, AIGs, and MIGs. Section IV makes a comparison between the exploited logic representations based on their effect on the metrics designating area and latency. Considerations for a crossbar implementation are discussed in Section V, and this paper is concluded in Section VI.

II. BACKGROUND

A. Logic Representations

1) Binary Decision Diagrams: A BDD (e.g., [21]) is a graph-based representation of a function that is derived from the Shannon decomposition f = xi fxi ⊕ x̄i fx̄i. Applying this decomposition recursively allows dividing the function into many smaller subfunctions, which constitute the nodes of the BDD representation. By use of the complement attribute, a subfunction and its complement can be represented by the same node. An example of a BDD representing a function with four variables is shown in Fig. 1. The nodes corresponding to each input variable xi represent a BDD level i, which needs to be calculated in order, starting from the bottom of the graph to the root node f. Each node at level i has two successors, high and low, denoted by solid and dashed lines, referring to the assignments xi = 1 and xi = 0, respectively. The complemented edges are shown by dots on the successors.

Fig. 1. Initial BDD representation for the function f = (x1 ⊕ x2) ∨ (x3 ⊕ x4), using the ascending variable ordering and complemented edges.

BDDs make use of the fact that for many functions of practical interest, smaller subfunctions occur repeatedly and need to be represented only once. Combined with an efficient recursive algorithm that makes use of caching techniques and hash tables to implement elementary operations, BDDs are a powerful data structure for many applications. BDDs are ordered in the sense that the Shannon decomposition is applied with respect to some given variable ordering, which also has an effect on the BDD's number of nodes. Improving the variable ordering for BDDs is NP-complete [22], and many heuristics have been presented that aim at finding a good ordering. Throughout this paper, we consider initial BDD representations before optimization with a fixed ascending variable ordering x1 < x2 < · · · < xn, where n is the number of input variables; e.g., in Fig. 1, n = 4 and therefore the ordering is x1 < x2 < x3 < x4.

2) Homogeneous Logic Representations for Circuits: In this paper, we use AIGs [23] and MIGs [19] as homogeneous logic representations. Each node in the graphs represents one logic operation: x · y (conjunction) in the case of AIGs, and M(x, y, z) = x·y + x·z + y·z (majority of three) in the case of MIGs. Inverters are represented in terms of complemented edges; regular edges represent noncomplemented inputs. Homogeneous logic representations allow for efficient and simpler algorithms due to their regular structure, since no case distinction is required for the logic operations. Consequently, such logic representations are the major data structure in state-of-the-art logic synthesis tools.

Logic representations in MIGs are at least as compact as in AIGs, since each AND node can be mapped to exactly one majority node; we have x · y = M(x, y, 0). However, even smaller MIGs can be obtained if their capability of compactness is fully exploited, such that no node in the graph has constant inputs [24].

A Boolean algebra was proposed in [19] in order to optimize MIGs. The following set (Ω) includes the primitive transformations that can be applied to an existing MIG to get a more efficient representation:

Commutativity (Ω.C): M(x, y, z) = M(y, x, z) = M(z, y, x)
Majority (Ω.M): M(x, x, z) = x and M(x, x̄, z) = z
Associativity (Ω.A): M(x, u, M(y, u, z)) = M(z, u, M(y, u, x))
Distributivity (Ω.D): M(x, y, M(u, v, z)) = M(M(x, y, u), M(x, y, v), z)
Inverter Propagation (Ω.I): ¬M(x, y, z) = M(x̄, ȳ, z̄)

It was proven in [24] that any MIG can be transformed into another logically equivalent MIG using only the Ω axioms. This means that reaching a desired MIG optimized with respect to the considered cost metric is possible by applying Ω; however, the length of the transformation sequence might be impractical. To solve this problem, a more advanced set of transformations derived from the basic rules in Ω was proposed in [19], denoted by Ψ. We only refer to complementary associativity (Ψ.C) from the set Ψ, which is used in this paper and is formally expressed by

Ψ.C: M(x, u, M(y, ū, z)) = M(x, u, M(y, x, z)).

(a) (b)
Fig. 3. Intrinsic majority operation within an RRAM device.
Fig. 2. IMP operation. (a) Implementation of IMP using RRAM devices.
(b) Truth table for IMP (q ← p IMP q = p + q) [11].

C. Related Work
B. Logic Operations for RRAM-Based Design
So far, few synthesis approaches using logic representations
1) Material Implication: IMP and FALSE operation, i.e., have been proposed for in-memory computing. All the existing
assigning the output to logic 0, are sufficient to express any approaches in this area exploit IMP for realization of the nodes
Boolean function [11]. Fig. 2 shows the implementation of of their employed graph-based data structures. The unfavorable
an IMP gate which was proposed in [11]. P and Q designate nature of sequential operations for RRAM-based in-memory
two resistive devices connected to a load resistor RG . Three computing has been mostly exploited to reduce the number of
voltage levels VSET , VCOND , and VCLEAR are applied to the required RRAM devices. Some approaches evaluate the graph-
devices to execute IMP and FALSE operations by switching based representation completely in sequence such that only a
between low-resistance (logic 1) or high-resistance (logic 0) single node can be computed each time [18], [25]. These evalu-
states. ation methods increase the length of computational sequences
The FALSE operation can be performed by applying in comparison with the parallel evaluation proposed in [17]
VCLEAR to an RRAM device. An RRAM device can be also in which nodes of equal latency are computed at the same
switched to logic 1 by applying a voltage larger than a thresh- time at a higher cost in area. However, the approach presented
old VSET to its voltage driver. To execute IMP, two voltage in [18] tries to avoid higher increase in the number of oper-
levels VSET and VCOND are applied to the switches P and Q ations by providing a tradeoff between the additional number
simultaneously. The magnitude of VCOND is smaller than the of operations and RRAM devices required for maintaining the
required threshold to change the state of the switch. However, intermediate results.
the interaction of VSET and VCOND can execute IMP according In [25], IMP was used to synthesize combinational logic cir-
to the current states of the switches, such that switch Q is set cuits with resistive memories using or-inverter graphs (OIGs).
to 1 if p = 0 and it retains its current state if p = 1 [11]. The approach applies an extension of the delay minimization
2) Built-in Majority Operation: RRAM devices have two algorithm proposed in [26] to the OIGs and also uses an
terminals and their internal resistance R can be switched area minimization to lower the costs of the equivalent circuits
between two logic states 0 and 1 designating high and low constructed with resistive memories. Synthesis of in-memory
resistance states, respectively. Denoting the top and bottom computing circuits using OIGs can be also possible with NOR
terminals by P and Q, the memory can be switched with a gates based on memristor-aided loGIC (MAGIC) proposed
negative or positive voltage VPQ based on the device polar- in [27]. MAGIC provides a memristive stateful logic, which
ity. Here, we assume that the logic statements (P = 1, Q = 0) has experimentally shown lower latency in comparison with
switches the RRAM device to logic 1, (P = 0, Q = 1) switches IMP [15].
the device to logic 0, and (P = Q) does not change the current Another approach using AIGs was proposed in [18] for syn-
state of the device. Accordingly, we can make the truth tables thesis of in-memory computing logic circuits. The approach
shown in Fig. 3 for the next sate of the switch (R ) when the uses the state-of-the-art synthesis tool ABC [28] to map an
current state (R) is either 0 or 1. In the following, the Boolean arbitrary Boolean function to an AIG and optimize it. An AIG
relations represented by tables in Fig. 3 are extended which representing a given function is then mapped to an equivalent
formally express the MAJ of RRAM devices [12]: network of IMP gates (Fig. 2) according to the IMP-based real-
R = (P · Q) · R + (P + Q) · R ization of NAND gate proposed in [11]. The approach executes
a given Boolean function using N + 2 RRAM devices, where
= P·R+Q·R+P·Q·R N is the number of input RRAM devices, which keep their
= P·R+Q·R+P·Q·R+P·Q·R initial values until the target function is executed and 2 is the
= P·R+Q·R+P·Q number of work RRAM devices, which states are changed dur-
= M(P, Q, R). ing the operations by intermediate results or the final output.
Nevertheless, some extra RRAM devices are also considered
The operation above is referred to 3-input resistive major- to maintain values of the IMP gates which have more than
ity RM3 (x, y, z), such as RM3 (x, y, z) = M(x, ȳ, z) [12]. one fanout.
According to RM3 , the next state of a resistive switch is equal BDD-based synthesis of Boolean functions using resis-
to a result of a built-in majority gate when one of the three tive memories has been proposed in [17]. Two IMP-based
variables x, y, and z is already preloaded and the variable corre- realizations are proposed for a 2-to-1 MUX one for a min-
sponding to the logic state of the bottom electrode is inverted. imum number of resistive switches and the other for a
We denote this intrinsic property of RRAM devices by MAJ minimum number of operations when lower latency is of
which provides an alternative for IMP and enables shorter higher importance than area. It has not been referred to any
computational length for synthesis of Boolean functions using BDD optimization method in [17] to lower either the number
resistive switches. of RRAM devices or operations. For a given Boolean function,
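The switching rule of Fig. 3 and the RM3 relation derived in Section II-B can be cross-checked by simulating the device behavior directly. The sketch below is our illustration, not from the paper: next_state encodes the set/reset/hold rule assumed for the device, and the loop confirms that it coincides with M(P, Q̄, R), i.e., with RM3(P, Q, R):

```python
from itertools import product

def maj(x, y, z):
    # Majority-of-three: M(x, y, z)
    return (x and y) or (x and z) or (y and z)

def next_state(p, q, r):
    """Next state R' of an RRAM device with current state r when its top
    and bottom electrodes see logic values p and q: (P=1, Q=0) sets the
    device, (P=0, Q=1) resets it, and P=Q leaves the state unchanged."""
    if p and not q:
        return True   # set to logic 1
    if q and not p:
        return False  # reset to logic 0
    return r          # hold the current state

# The truth tables of Fig. 3 collapse to R' = M(P, NOT Q, R) = RM3(P, Q, R).
for p, q, r in product([False, True], repeat=3):
    assert next_state(p, q, r) == maj(p, not q, r)
```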

For a given Boolean function, the approach maps the corresponding BDD representation to a netlist of RRAM devices using either of the MUX realizations. This is carried out using two mapping approaches: one fully sequential, which is slow but needs a small number of RRAM devices, and the other partially parallel, which performs much faster but needs some considerations for the complemented edges and fanouts.

III. RRAM-BASED IN-MEMORY COMPUTING DESIGN

In this section, we present our proposed synthesis approach for RRAM-based logic-in-memory computing using the three representations explained before. For each representation, two realizations for a graph node are presented using IMP and MAJ, as well as the methodology to map the graph to its equivalent circuit constructed by RRAM devices.

We present optimization algorithms for each logic representation to lower the cost metrics of the resulting in-memory computing circuits, dissimilar to the conventional optimization algorithms that are mainly designed to reduce the size, i.e., the number of nodes of the graph, also called area, or the depth, i.e., the number of levels of the graph.

In the rest of this section, we present the node realizations, design methodology, optimization, and the experimental results for each graph-based representation introduced in Section II-A, in order.

A. BDD-Based Synthesis for In-Memory Computing Design

1) Realization of Multiplexer Using RRAM Devices: Fig. 4 shows the IMP-based realization for the 2-to-1 MUX proposed in [17]. The implementation requires six operations and five RRAM devices, of which three, named S, X, and Y, store the inputs, and the two others, A and B, are required for operations. The corresponding implication steps of the MUX realization shown in Fig. 4 are as follows.
1) S = s, X = x, Y = y, A = 0, B = 0.
2) a ← s IMP a = s̄.
3) a ← y IMP a = ȳ + s̄.
4) b ← a IMP b = y · s.
5) s ← x IMP s = x̄ + s.
6) b ← s IMP b = x · s̄ + y · s.

Fig. 4. Realization of an IMP-based MUX using RRAM devices [17].

In the first step, the devices keeping the input variables and the two extra work switches are initialized. The remaining steps are performed by sequential IMP operations that are executed by applying simultaneous voltage pulses VCOND and VSET.

To find the MAJ-based realization of the MUX, we first express the Boolean function of a MUX with majority gates and then simply convert it to RM3 by adding a complement attribute to each gate. For this purpose, the AND and OR operations are represented by majority gates using a constant as the third input variable, i.e., 0 for AND and 1 for OR [19]. Accordingly, a MUX with input variables x, y and a select input s can be expressed as

x · s̄ + y · s = M(M(x, s̄, 0), M(y, s, 0), 1)
             = M(M(x, s̄, 0), M(y, 0, s), 1)
             = RM3(RM3(x, s, 0), ¬RM3(y, 1, s), 1).

The equations above can be executed by three RM3 operations as well as a negation. Therefore, the MAJ-based realization of the MUX can be obtained by the following operations after a data loading step.
1) S = s, X = x, Y = y, A = 0, B = 0, C = 1.
2) PA = x, QA = s, RA = 0 ⇒ RA = x · s̄.
3) PS = y, QS = 1, RS = s ⇒ RS = y · s.
4) PB = 1, QB = y · s, RB = 0 ⇒ RB = ¬(y · s).
5) PC = a, QC = b, RC = 1 ⇒ RC = x · s̄ + y · s.

The proposed MAJ-based MUX can be realized quite similarly to the IMP-based circuit shown in Fig. 4, such that the bottom electrodes of the switches are electrically connected via a horizontal nanowire and the switching can be done by applying the three discussed voltage levels to the top electrodes. As can be seen, the MAJ-based realization of the MUX needs one more RRAM device and one less operation. Considering area and delay as two equally important cost metrics, using IMP or MAJ does not make a difference in the circuits synthesized by the proposed BDD-based approach. Indeed, the MAJ-based realization of BDD nodes allows faster circuits, while the IMP-based realization leads to circuits with smaller area consumption. This property of the two realizations can be exploited when higher efficiency in delay or area is intended.

2) Design Methodology for BDD-Based Synthesis: In order to escape heavy delay penalties, we assume parallelization per level for BDD-based synthesis [17], [29]. As explained before, in the parallel implementation, one BDD level at a time is evaluated entirely, proceeding from the level designating the last ordered variable to the first ordered variable, the so-called root node. This is performed by transferring the computation results between successive levels, i.e., using the outputs of each computed level as the inputs of the next level. Using IMP, the results of previous levels are read and copied wherever required within the first loading step of the next level, while for executing MAJ the results are read and then applied as voltages to the rows and columns.

Regardless of the possible fanouts and complemented edges in the BDD, the number of RRAM devices required for computing by this approach is equal to five or six times the maximum number of nodes in any BDD level. In a similar way, the number of operations is six or five times the number of BDD levels, for the IMP-based and MAJ-based realizations, respectively. A multiple-row crossbar architecture entirely based on resistive switches was proposed in [30], which can be used to realize the presented parallel evaluation.

The cost metrics of the proposed BDD-based synthesis approach are given in Table I. However, while the larger part of the costs representing area and delay of the resulting circuits is explained above, some additional RRAM devices addressing complemented edges and fanouts are still required.
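The implication steps of the IMP-based MUX realization can be replayed in software. In the sketch below (ours, for illustration), imp models q ← p IMP q = p̄ + q, and the six-step sequence is checked against the MUX function x · s̄ + y · s for all inputs:

```python
from itertools import product

def imp(p, q):
    """Material implication: the target device holding q is overwritten
    with NOT p OR q [11]."""
    return (not p) or q

def imp_mux(x, y, s):
    # Step 1: load the inputs and clear the two work devices A and B.
    S, X, Y, A, B = s, x, y, False, False
    A = imp(S, A)  # Step 2: a = NOT s
    A = imp(Y, A)  # Step 3: a = NOT y OR NOT s
    B = imp(A, B)  # Step 4: b = y AND s
    S = imp(X, S)  # Step 5: s = NOT x OR s
    B = imp(S, B)  # Step 6: b = (x AND NOT s) OR (y AND s)
    return B

# Device B ends up holding the multiplexer output for every input triple.
for x, y, s in product([False, True], repeat=3):
    assert imp_mux(x, y, s) == ((x and not s) or (y and s))
```

Note how step 5 destroys the stored select value s; this kind of overwriting is exactly why nonconsecutive fanouts in the BDD need extra devices to retain intermediate results.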

TABLE I
COST METRICS OF LOGIC REPRESENTATIONS FOR RRAM-BASED IN-MEMORY COMPUTING

Every complemented edge in the BDD requires a NOT gate to invert its logic value. As shown in the computational steps for both the IMP-based and MAJ-based realizations, inverting a variable can be executed after an operation with a zero-loaded RRAM device (see step 2 in the IMP-based MUX and step 4 in the MAJ-based MUX descriptions). Accordingly, for each MUX with a complemented input, an extra RRAM device should be considered and set to FALSE (Z = 0), which can be performed in parallel with the first loading step without any increase in the number of steps. Then, an IMP or MAJ operation should be executed to complete the logic NOT operation. It is obvious that the required operations for all complemented edges in a level can be carried out simultaneously, which means that for any level with ingoing complemented edges only one extra step is required. This implies that the number of additional steps required for inverting all of the complemented edges cannot exceed the number of BDD levels. Therefore, the number of steps to evaluate a BDD possessing complemented edges is equal to the number of BDD levels with ingoing complemented edges besides the basic value required for the level counts [29].

It is obvious that the RRAM devices keeping the outputs of each BDD level can be reused and assigned to the inputs of the next successive level. Nevertheless, the results of nodes targeting levels which are not right after their origin level might be lost during computations if their corresponding RRAM devices are rewritten by the next operations. Thus, we consider extra RRAM devices for such nonconsecutive fanouts to retain the result of their origin nodes, to be used as an input signal of their target nodes. The required number of RRAM devices for this is equal to the maximum number of such fanouts over all BDD levels. This will not increase the number of steps, because copying the results of nodes with nonconsecutive fanouts into additional RRAM devices and using the stored value in the fanouts' targets can be performed simultaneously in the first data loading step of the nodes on both sides of the fanouts.

3) BDD Optimization for RRAM-Based Circuit Design: Optimization of BDDs in this paper is carried out as a bi-objective problem aiming at minimizing the number of RRAM devices and computational steps simultaneously, i.e., finding a tradeoff between the number of RRAM devices and operations of the resulting circuits. For this purpose, we have exploited a multi-objective genetic algorithm (MOGA). The general framework of the MOGA employed for BDD optimization is based on the nondominated sorting genetic algorithm [31], which has been experimentally proven useful for solving NP-complete problems such as BDD optimization [29], [32]. MOGA is also capable of giving higher priority to either of the cost metrics, which allows the design of smaller in-memory computing circuits at a fair cost in latency, or vice versa. We refer to [32] for the details of MOGA.

Fig. 5. Cost metrics of RRAM-based in-memory computing for an arbitrary BDD, (a) before (initial) and (b) after optimization (optimized).

Fig. 5 shows an example with two BDDs, both representing a 4-variable 2-output Boolean function. The left BDD has the initial ordering, whereas the second BDD has the ordering obtained by MOGA. The number of RRAM devices required for computing the BDD levels (N + CE) (see Table I) is equal before and after optimization, since both BDDs have a maximum number of two nodes and one ingoing complemented edge. However, there is a nonconsecutive fanout of node x3 targeting x1 before optimization, requiring an extra RRAM device to maintain the intermediate result. In the optimized BDD, the inputs of all of the nodes come from the consecutive levels or the constant 1, which has reduced the number of required RRAM devices by 1. The number of operations has also been reduced after optimization, since one level has been released from complemented edges.

As can be seen, the numbers of RRAM devices and operations decrease although the number of BDD nodes increases. The effect of BDD optimization may seem too small for the example function, reducing each of the cost metrics only by one. Nevertheless, this reduction can be much more visible for larger functions, due to the higher possibility of finding BDDs with a smaller number of nonconsecutive fanouts, complemented edges, and level sizes offered by the larger search space.

4) Results of BDD-Based Synthesis: We have evaluated our proposed synthesis approaches using a set of 25 benchmark functions selected from LGsynth91 [33]. The results of the BDD-based and AIG-based synthesis approaches have also been compared with the similar existing approaches introduced in Section II-C, which use the same data structures for RRAM-based in-memory computing. For each benchmark function, MOGA has been run ten times with a termination criterion of 500 generations. The population is three times as large as the number of inputs of each function, with a maximum allowed size of 120. The rest of the experimental setup, including the genetic operators and their probabilities, is the same as used in [32].
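The selection step of such a bi-objective optimization rests on Pareto dominance between (devices, operations) cost pairs. The sketch below is our illustration of the core test behind nondominated sorting, not the authors' MOGA implementation, and the candidate cost pairs are hypothetical:

```python
def dominates(a, b):
    """Cost pairs are (RRAM devices, operations); a dominates b if it is
    no worse in both metrics and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the nondominated (devices, operations) pairs."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

# Hypothetical cost pairs for five candidate variable orderings.
candidates = [(14, 9), (12, 11), (13, 9), (12, 12), (15, 8)]
print(pareto_front(candidates))  # -> [(12, 11), (13, 9), (15, 8)]
```

Prioritizing one metric, as the prioritized MOGA variants do, then amounts to picking the front member with the smallest first or second coordinate.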

TABLE II
C OMPARISON OF R ESULTS BY G ENERAL AND P RIORITIZED MOGA W ITH C HAKRABORTI et al. [17]

138
178
180

201
208

Table II presents the results of the three versions of MOGA and compares them with the results of the BDD-based synthesis approach proposed in [17]. For MOGA with priority to the number of RRAM devices and operations, we chose the results with the smallest number of RRAM devices and operations among all runs and populations. The results shown in the table for the general MOGA have also been selected such that they represent a good tradeoff between the minimum and maximum values found by the prioritized algorithms. It is worth mentioning that the runtime varies between 0.56 and 187.22 s for the benchmark functions 5xp1_90 and seq_201, respectively.

According to Table II, the number of RRAM devices obtained by MOGA with priority to R for the IMP-based realization is reduced by 5.74% on average compared to the corresponding value by the general MOGA. MOGA with priority to the number of operations also achieves smaller latency by reducing the average operation count by up to 0.74% in comparison to the general MOGA. It should be noted that optimization cannot noticeably lower the number of operations. As shown in Table I, the main contribution to the operation count is the number of BDD levels, i.e., the number of input variables, and hence it is not changeable.

In comparison with the results of [17], our BDD-based synthesis approach has achieved better performance in both cost metrics. The average values of the results over the whole benchmark set by the general MOGA for the IMP-based realization, which is also used by [17], show reductions of 21.11% and 31.74% in the number of RRAM devices and operations, respectively. The reduction in the number of operations reaches 42.64% for the MAJ-based realization, which also has a 12.22% smaller number of RRAM devices.

B. AIG-Based Synthesis for In-Memory Computing Design

1) Realization of NAND/AND Gate Using RRAM Devices: Realization of a NAND gate using resistive switches based on IMP has been proposed in [11]. The proposed NAND gate in [11] corresponds to a node with complemented fanout in an AIG and can therefore be utilized as the IMP-based implementation realizing AIGs with RRAM devices. In this case, a FALSE operation is required for any regular edge in the graph. The implementation proposed in [11] requires three resistive memories connected by a common horizontal nanowire to a load resistor, i.e., it is structurally similar to the circuit shown in Fig. 4 with a different number of switches. The interaction of the tri-state voltage drivers on the RRAM devices executes the NAND operation within the three computational steps listed below.
1) X = x, Y = y, A = 0.
2) a ← x IMP a = x̄.
3) a ← y IMP a = x̄ + ȳ.

Using MAJ, an AIG can also be implemented with an equal number of RRAM devices and operations. A majority operation of two variables x and y together with a constant logic value of 0 (M(x, 0, y)) [19] executes the AND operation. This corresponds to MAJ(x, 1, y), which only needs one extra operation to preload operand y in a resistive switch. The required steps are as follows.
1) X = x, Y = y, A = 0.
2) PA = y, QA = 0, RA = 0 ⇒ RA = y.
3) PA = x, QA = 1, RA = y ⇒ RA = x · y.

2) Design Methodology for AIG-Based Synthesis: Although both of the realizations using IMP and MAJ for the AIG-based synthesis approach impose sequential circuit implementations, they allow a reduction in area by reusing RRAM devices released from previous computations. According to the parallel evaluation method, we only consider one AIG level at a time, such that the RRAM devices employed to evaluate the level can be reused for the next levels. Starting from the inputs of the graph, the RRAM devices in a level are released when all the required operations are executed. Then, the RRAM devices are
1428 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 37, NO. 7, JULY 2018

TABLE III
RESULTS OF AIG-BASED SYNTHESIS USING SIZE AND DEPTH REWRITING BY ABC [28]

reused for the upper level, and this procedure is continued until the target function is evaluated. Depending on the use of IMP or MAJ in the realization, such an implementation requires as many NAND or AND gates as the maximum number of nodes in any level of the AIG. Hence, the corresponding numbers of RRAM devices and operations for synthesizing the AIG are three times the number of required gates and three times the number of levels, respectively.

However, some additional RRAM devices must still be allocated for the required NOT operations, i.e., for the regular edges in the IMP-based realization, where the outputs of AIG nodes are already negated due to being implemented by NAND gates, and for the complemented edges in the MAJ-based realization. Table I shows the number of RRAM devices and computational steps of the resulting RRAM-based circuits. Since the implementation starts from the inputs of the AIG, the ingoing regular edges for the IMP-based realization, and the ingoing complemented edges for the MAJ-based realization, of any level should first be inverted, similarly to the procedure explained for BDDs. Therefore, the total number of RRAM devices required for the synthesis of the whole graph for both the IMP-based and MAJ-based realizations is equal to the maximum, over all AIG levels, of three times the number of nodes in the level plus the number of ingoing edges to be inverted.
3) AIG Optimization for RRAM-Based Circuit Design: For AIG optimization we have used ABC [28] commands. To address the area of the resulting circuits using RRAM devices, we use the command dc2, which minimizes the number of nodes in the graph. The latency of the circuits has also been reduced before mapping them to their corresponding netlist of RRAM devices by another ABC command, if -x -g. This command minimizes the depth of the AIG, which is actually the most significant term in the required number of operations given in Table I due to the factor of three for both IMP-based and MAJ-based realizations. Neither the area nor the depth AIG rewriting command of ABC targets the extra number of RRAM devices and computational steps caused by the NOT operations for synthesis. Nevertheless, applying any of the aforementioned commands iteratively can noticeably reduce the cost metrics of RRAM-based in-memory computing.

It should be noted that we cannot optimize the AIG for both cost metrics, since area minimization leads to worsening the latency, and depth minimization, on the other hand, increases the number of nodes in the graph. Thus, according to the application, one can choose the optimization command regarding the area or delay of the resulting circuits.

4) Results of AIG-Based Synthesis: Results of the proposed AIG-based synthesis approach for in-memory computing are presented in Table III for both area and depth rewriting methods by ABC [28]. A quick look at Table III reveals that the number of RRAM devices is smaller for the MAJ-based realization, while the operation counts are almost equal. According to Table III, area and depth rewriting reduce the total number of RRAM devices and operations, respectively, by 24.31% and 10.04% on average compared to each other.

Table IV makes a comparison with the AIG-based approach proposed in [18] for a different benchmark set with single-output functions, i.e., PO = 1. Since the numbers of required RRAM devices for the benchmark set are not given in [18], we can only compare with respect to the number of operations. The number of operations obtained by our proposed method using the IMP-based realization, which is also used by [18], is seven times smaller than that of [18] for both rewritings. Furthermore, the method proposed in [18] fails to keep the number of computational steps at a reasonable value when the number of inputs increases. For example, the numbers of operations by [18] for the functions sym10_d and t481_d are equal to 1172 and 1564, respectively, while using our method, both functions can be synthesized with less than 80 operations. It is worth mentioning that the runtime for each benchmark function in both Tables III and IV is in the range of milliseconds.

C. MIG-Based Synthesis for In-Memory Computing Design

1) Realization of Majority Gate Using RRAM Devices: We propose two realizations for the majority gate based on IMP and MAJ [34]. The proposed IMP-based realization of the majority gate is similar to the circuit shown in Fig. 4 with six RRAM devices. It requires ten sequential steps to execute the majority function. The corresponding steps for executing the majority function are as follows.
1) X = x, Y = y, Z = z, A = 0, B = 0, C = 0.
2) a ← x IMP a = x̄.
3) b ← y IMP b = ȳ.
4) y ← a IMP y = x + y.
5) b ← x IMP b = x̄ + ȳ.
6) c ← y IMP c = x̄ · ȳ.
7) c ← z IMP c = x̄ · ȳ + z̄.
8) a = 0.
9) a ← b IMP a = x · y.
10) a ← c IMP a = x · y + y · z + x · z.

Three RRAM devices denoted by X, Y, and Z keep the input variables, and the remaining three RRAM devices A, B, and C are required for retaining the intermediate results and the final output. In the first step, the input variables are loaded and the other RRAM devices are assigned to FALSE to be

TABLE IV
COMPARISON OF RESULTS BY THE PROPOSED AIG-BASED SYNTHESIS WITH BÜRGER et al. [18]

used later for the next operations. Another FALSE operation is also performed in step 8, to clear an RRAM device holding an intermediate result which is not required anymore. Finally, the Boolean function representing a majority gate is executed by implying the results from the seventh and ninth steps.

It is obvious that the MAJ-based majority gate can be realized with a smaller number of RRAM devices and computational steps due to benefiting from the discussed built-in majority property. Using MAJ, the majority gate requires only four RRAM devices that can be placed in the same structure shown in Fig. 4. Furthermore, the majority function can be executed within only three steps carrying out simple operations. The MAJ-based computational steps for the proposed RRAM-based realization are as follows.
1) X = x, Y = y, Z = z, A = 0.
2) PA = 1, QA = y, RA = 0 ⇒ RA = ȳ.
3) PZ = x, QZ = ȳ, RZ = z ⇒ RZ = M(x, y, z).

In the first step, the initial values of the input variables as well as an additional RRAM device are loaded by applying VSET or VCLEAR to their voltage drivers. Step 2 executes the required NOT operation in RRAM device A. This can be done by applying the appropriate voltage level VSET or VCOND to switch A, for the cases y = 0 and y = 1, respectively. In the last step, the majority function is executed by use of MAJ at RRAM device Z by applying any of the three voltage levels corresponding to the difference between the logic states of x and ȳ.

2) Design Methodology for MIG-Based Synthesis: The numbers of RRAM devices and operations for the proposed MIG-based synthesis approach are given in Table I. The method of mapping MIGs to equivalent RRAM-based in-memory computing circuits is exactly similar to the design methodology for AIGs with the MAJ-based realization. Since both the IMP-based and MAJ-based realizations proposed for MIGs represent a majority gate without an extra negation, the same formula can be used for them with different constant factors addressing the number of RRAM devices and operations required by each realization [34].

3) MIG Optimization for RRAM-Based Circuit Design: In general, MIG optimization is performed by applying a set of valid transformations to an existing MIG to find an equivalent MIG that is more efficient with respect to the considered cost metrics. In this section, we present three MIG optimization algorithms tackling the cost metrics of logic synthesis with RRAM devices. The first proposed algorithm considers both cost metrics simultaneously [34], while the others aim at reducing the number of operations [34] or RRAM devices [35].

In [19], two algorithms for conventional MIG optimization in terms of delay and area have been proposed, which aim at reducing the depth, i.e., the number of levels, or the size of the graph, i.e., the number of nodes, respectively. For area rewriting, [19] suggests a set of axioms called eliminate, including Ω.M and Ω.DR→L. eliminate can remove some of the MIG nodes by repeatedly applying the majority rule (Ω.M) and distributivity from right to left (Ω.DR→L) to the entire MIG. Assuming x, y, z, u, and v as input variables, Ω.DR→L transforms M(M(x, y, u), M(x, y, v), z) to M(x, y, M(u, v, z)), which means that the total number of nodes has decreased from three to two.

In general, the depth of the graph is of high importance in MIG optimization to lower the latency of the resulting circuits. The depth of the MIG can be reduced by pushing the critical variable with the longest arrival time to upper levels. For this purpose, a set of axioms called push-up has been proposed in [19]. Push-up includes the majority, distributivity, and associativity axioms applied in a sequence, i.e., Ω.M; Ω.DL→R; Ω.A; and Ω.C. It is obvious that the majority rule may reduce the depth by removing unnecessary nodes. Applying distributivity from left to right (Ω.DL→R), such that M(x, y, M(u, v, z)) is transformed to M(M(x, y, u), M(x, y, v), z), may also result in an MIG with smaller depth at a cost of one extra node. If either x or y is the critical variable with the latest arrival, distributivity cannot reduce the depth of M(x, y, M(u, v, z)). However, if z is the critical variable, applying Ω.DL→R will reduce the depth of the MIG by pushing z one level up. In the cases where the associativity rules (Ω.A, Ω.C) are applicable, the depth can be reduced by one if the axioms move the critical variable to the upper level. After performing push-up, the relevance axiom (Ω.R) is applied to replace the reconvergent variables, which might provide a further possibility of depth reduction for another push-up.

Using RRAM devices for implementation, considerable parts of the metrics determining area and delay depend on the number and distribution of complemented edges, which are not addressed in conventional area and depth optimization. We propose a multiobjective MIG optimization algorithm to obtain efficient RRAM-based logic circuits with a good tradeoff between both objectives. Algorithm 1 includes a combination of conventional area and depth optimization algorithms besides techniques tackling complemented edges from both aspects of area and delay, and iterates them for a maximum number of cycles called effort. The algorithm starts with applying push-up to obtain a smaller depth. Then, the complemented edges are targeted by applying an extension of the axiom inverter propagation from right to left (Ω.IR→L) under the condition that the considered node has at least two ingoing complemented edges. The three cases satisfying this condition and their equivalent

Algorithm 1 MIG Rewriting for RRAM Cost Optimization

for (cycles = 0; cycles < effort; cycles++) do
    Ω.ML→R; Ω.DL→R; Ω.A; Ω.C;    (push-up)
    Ω.IR→L(1−3);
    Ω.ML→R; Ω.DL→R; Ω.A; Ω.C;    (push-up)
    Ω.A; Ω.DR→L;
end for

Fig. 6. Applying an extension of Ω.IR→L to reduce the extra RRAM devices and steps caused by complemented edges.

majority gates are shown below and discussed in the following, considering their effect on both cost metrics:

M(x̄, ȳ, z̄) = M̄(x, y, z) (1)
M(x̄, ȳ, z) = M̄(x, y, z̄) (2)
M(x̄, y, z̄) = M̄(x, ȳ, z). (3)

In the first case, the ingoing complemented edges of the gate are decreased from three to zero, while one complement attribute is moved to the upper level, i.e., the level including the output of the gate. Assuming that the current level, i.e., the level including the ingoing edges, is the critical level with the maximum number of required RRAM devices, this case is favorable for area optimization. However, if the upper level is the critical level, the number of required RRAM devices will increase by only one. Similar scenarios exist for the two other cases, although the last case might be less interesting because the number of complemented edges in both levels is changed equally by one. That means a penalty of one is possible as the cost for a reduction of one, while transformations (1) and (2) may reduce the number of RRAM devices by three and two, respectively.

To reduce the number of operations, the number of levels possessing complemented edges should be reduced. Depending on the presence of complemented edges by other gates in both levels, the first two transformations given above might reduce or increase the number of operations or even leave it unchanged. Case (1) is beneficial if the upper level already has complemented edges and the transformation also removes all the complemented edges from the current level. It might also be neutral if none of the levels is going to be improved to a complement-free level. The worst case occurs when moving the complement attribute to the upper level increments the number of levels with complemented edges. Similar arguments can be made for the remaining cases. However, case (2) is more favorable because it never adds a level with complemented edges, and case (3) cannot be advantageous because it can never release a level from complemented edges.

Fig. 6 shows a simple MIG to which transformation (2) (Ω.IR→L(2)) is applicable. The transformation has released one level of the MIG from the complement attribute (black dot), which results in a smaller number of computational steps. Furthermore, as a result of removing one complemented edge from the critical level, the required number of RRAM devices is decreased by one.

After applying inverter propagation for the aforementioned conditions (Ω.IR→L(1−3)), the MIG is also reshaped, and more chances for reducing the depth might be created. Thus, push-up is applied to the entire MIG again to reduce the number of operations as much as possible. In the last step, the number of RRAM devices is reduced to make a tradeoff between both objectives. By applying Ω.A, some of the changes by push-up that have increased the maximum level size can be undone. Finally, distributivity from right to left (Ω.DR→L) is applied to the graph to reduce the number of nodes in the levels.

Algorithm 2 MIG Rewriting for Operation Count Minimization

for (cycles = 0; cycles < effort; cycles++) do
    Ω.ML→R; Ω.DL→R; Ω.A; Ω.C;    (push-up)
    Ω.IR→L;
    Ω.IR→L(1−3);
    Ω.ML→R; Ω.DL→R; Ω.A; Ω.C;    (push-up)
end for

Due to the importance of latency in logic synthesis, and the issue of sequential implementation in RRAM-based circuits, we propose Algorithm 2 for reducing the number of operations. In the proposed operation minimization algorithm, the two axioms of inverter propagation are applied to the MIG after push-up. First, only the axiom presented by case (1), i.e., the base rule of inverter propagation from right to left (Ω.IR→L), is applied to the entire MIG to lower the number of levels with complemented edges. Since the transformation moves one complement attribute to the upper level, it might create new inverter propagation candidates for all three discussed cases if the upper level already has one or two ingoing complemented edges. Hence, we apply Ω.IR→L(1−3) again to ensure maximum coverage of complemented edges. Although case (3) cannot reduce the number of operations, it is not excluded from Ω.IR→L(1−3) due to its effect on balancing the levels' sizes. Finally, push-up is applied to the MIG to reduce the depth further if new opportunities are generated. It should be noted that the number of operations is mainly determined by the MIG depth. In fact, in the worst case caused by complemented edges, the total number of operations would be equal to seven times the number of levels, i.e., the MIG depth. Nevertheless, we show the efficiency of our proposed step optimization algorithm in the following section.

Algorithm 3 is proposed to reduce the number of required RRAM devices [35]. The algorithm starts with eliminate to reduce the number of nodes. Then, it applies Ω.A, Ω.C to reshape the MIG to enable further reduction of area, and applies eliminate for the second time as suggested in [19]. After eliminating the unnecessary nodes, we use Ω.IR→L(1−3) to reduce the number of additional RRAM devices required for complemented edges. At the end, since the MIG might have been changed after the three inverter propagation transformations, Ω.IR→L is applied again to ensure that the most costly case with respect to the complemented edges is removed. In general, applying the last two lines of Algorithm 3 over the entire MIG repetitively can lead to a much lower RRAM cost.

4) Results of MIG-Based Synthesis: Table V shows the results of the experiments performed to evaluate the three proposed MIG rewriting algorithms. The number of iterations

TABLE V
RESULTS OF MIG-BASED SYNTHESIS FOR RRAM-BASED IN-MEMORY COMPUTING USING THE THREE PROPOSED MIG OPTIMIZATION ALGORITHMS

Algorithm 3 MIG Rewriting for RRAM Device Minimization

for (cycles = 0; cycles < effort; cycles++) do
    Ω.M; Ω.DR→L;    (eliminate)
    Ω.A; Ω.C;
    Ω.M; Ω.DR→L;    (eliminate)
    Ω.IR→L(1−3);
    Ω.IR→L;
end for

Fig. 7. Comparison of synthesis results by logic representations for RRAM-based in-memory computing. (a) Average number of RRAMs. (b) Average number of operations.

(effort) was set to 40. We observed that the MIGs are well optimized after 40 cycles and the cost metrics do not change noticeably with more iterations. The total runtime for the entire benchmark set under this setting has been about 3 s.

As expected, the cost metrics are much lower for MIGs using the MAJ-based realization. The results confirm that the proposed operation count and RRAM device minimization algorithms have achieved the smallest value for their corresponding optimization objective, at the price of worsening the other cost metric. The numbers of RRAM devices and operations given by the proposed multiobjective algorithm are between the minimum and maximum boundaries found by the operation count and RRAM device minimization algorithms. This confirms the capability of the proposed MIG rewriting technique to find a good tradeoff between both objectives.

More precise comparisons for the results of the MAJ-based realization show that the number of RRAM devices by the bi-objective algorithm is on average 19.78% less than that of the operation minimization algorithm, at a cost of a 21.09% increase in the number of operations. A similar comparison with the results obtained by the RRAM device minimization algorithm shows an average reduction of 44.11% in the number of operations at a fair cost of 39.64% in the number of required RRAM devices.

IV. COMPARISON OF LOGIC REPRESENTATIONS

Fig. 7 compares the average values of the synthesis results over the whole benchmark set for the three discussed logic representations. For a fair comparison, and due to the high importance of latency in logic-in-memory computing synthesis, the values shown in Fig. 7 are chosen from the optimization results with respect to the number of operations, i.e., MOGA with priority to the number of operations for BDDs, MIG rewriting for operation count minimization, and AIG depth rewriting.

According to Fig. 7, BDDs clearly achieve the smallest number of RRAM devices and can therefore be a better choice when area is considered the more critical cost metric. On the other hand, the operation counts obtained by the BDD-based synthesis are much higher than the same values resulting from the AIG-based and MIG-based methods. A comparison of the synthesis results by the AIGs and MIGs also shows that the average number of operations for the MIG-based method using the MAJ-based realization is reduced by 19.37% compared to the AIG-based synthesis using depth minimization. This confirms the advantage of MIGs in providing higher-speed in-memory computing circuits in comparison with the two other representations.
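Before turning to the crossbar implementation, the ten-step IMP realization of the majority gate from Section III-C1, which the IMP-based circuits reuse for every MIG node, can be replayed in software. This is a functional sketch assuming IMP(p, q) = p̄ + q with the result written back to the target device; it is not the authors' simulator:

```python
# Replaying the ten-step IMP sequence for the majority gate.
def imp(p, q):
    # material implication on Boolean 0/1 values: p IMP q = ~p OR q
    return (1 - p) | q

def majority_imp(x, y, z):
    a = b = c = 0          # 1) inputs loaded, work devices set to FALSE
    a = imp(x, a)          # 2) a = ~x
    b = imp(y, b)          # 3) b = ~y
    y = imp(a, y)          # 4) y = x + y
    b = imp(x, b)          # 5) b = ~x + ~y
    c = imp(y, c)          # 6) c = ~x * ~y
    c = imp(z, c)          # 7) c = ~x * ~y + ~z
    a = 0                  # 8) clear a for reuse
    a = imp(b, a)          # 9) a = x * y
    a = imp(c, a)          # 10) a = x*y + y*z + x*z
    return a

# truth-table check against the majority function
for v in range(8):
    x, y, z = (v >> 2) & 1, (v >> 1) & 1, v & 1
    assert majority_imp(x, y, z) == (x & y) | (y & z) | (x & z)
```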

Fig. 8. MIG representing a three-bit XOR gate.

V. CROSSBAR IMPLEMENTATION FOR THE PROPOSED SYNTHESIS APPROACH

We have already discussed the number of required RRAMs and operations for the presented in-memory computing approach using any of the three mentioned logic representations. In this section, we will show how an entire graph can be computed on a crossbar and what determines its required dimensions. We present a step-by-step implementation of the example MIG shown in Fig. 8, which represents a three-input XOR gate.
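The example can first be checked functionally: in the MIG of Fig. 8, node 1 computes M(x̄, y, z), node 2 computes M(x, y, z), and the root combines them as M(node 1, x, node 2 complemented). A short truth-table sketch (helper names are ours) confirms that this equals x ⊕ y ⊕ z:

```python
# Truth-table check of the Fig. 8 MIG for the three-input XOR.
from itertools import product

def maj(a, b, c):
    return (a & b) | (b & c) | (a & c)

for x, y, z in product((0, 1), repeat=3):
    node1 = maj(1 - x, y, z)         # M(~x, y, z)
    node2 = maj(x, y, z)             # M(x, y, z)
    root = maj(node1, x, 1 - node2)  # M(node1, x, ~node2)
    assert root == x ^ y ^ z         # three-input XOR
```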
Both MAJ-based and IMP-based implementations of in-memory computing logic circuits can be executed on a standard crossbar architecture as shown in Fig. 9(a). In such an architecture, an entire row should be allocated for computing a single graph node, which represents a data structure primitive, e.g., a majority gate if the MIG is used for synthesis. To compute an entire level of a data structure simultaneously, all nodes of the level should be computed in parallel in separate rows. This means that a number of rows equal to the maximum level size in the entire graph is required. For example, for a logic representation whose largest level has four nodes, a crossbar architecture with at least four rows is needed, independent of the type of the utilized representation or the basic operation of MAJ or IMP. Nevertheless, the number of RRAM devices in each row, i.e., the number of columns, is determined by the RRAM-based realization of the exploited data structure primitive and therefore absolutely depends on both of the aforementioned conditions.

Fig. 9. Crossbar implementation for the presented synthesis approach for logic-in-memory computing. (a) Standard crossbar architecture. (b) Upper-bound crossbar for MAJ-based, and (c) IMP-based implementations for the MIG shown in Fig. 8.

In the following, we assume that all the primary inputs are already written in the memory.

A. MAJ-Based Implementation

The values given for the MAJ-based implementation in Table I indeed present the upper bounds for the crossbar realization. According to Table I, the MAJ-based synthesis of the MIG shown in Fig. 8 can be realized using a maximum of nine RRAM devices, since the critical level needs 2×4 for its nodes and 1 more RRAM for the ingoing complemented edge. Also, each level can require three operations (2×3), which results in a total of eight operations for the whole MIG considering the presence of complemented edges at both levels.

Here, we show that the resulting MAJ-based implementation can be executed much more efficiently with respect to both time and area.

The crossbar with the upper-bound dimensions for the MAJ-based implementation of the 3-bit XOR gate is shown in Fig. 9(b). The MIG has a maximum level size of two, and accordingly the required crossbar needs two rows. Each row consists of four RRAM devices to compute a node and one additional device to be used in case of having a complemented edge. We assume that a maximum of two ingoing edges of an MIG node can be complemented after rewriting, of which one can directly be used as the second, inverted operand of MAJ, and thus only one needs to be negated. The RRAM devices allocated for the complemented edges are displayed in red dashed surrounds at the end of the rows.

The implementation steps for the MAJ-based computation of the MIG shown in Fig. 8 are listed below.

Initialization: Rij = 0: Qij = 1, Pij = 0;
1: Loading the third operands: Q1 = Q2 = 0, P1 = P2 = z; R11: RM3(z, 0, 0) = M(z, 1, 0) = z; R21: RM3(z, 0, 0) = M(z, 1, 0) = z;
2: Negation for node 2: Q1 = Q2 = x, P1 = x, P2 = 1; R25: RM3(1, x, 0) = M(1, x̄, 0) = x̄;
3: Computing level 1: node 1: P1 = y, Q1 = x, R11 = z; R11: RM3(y, x, z) = M(y, x̄, z); node 2: P1 = y, Q2 = x̄ (@R25), R21 = z; R21: RM3(y, x̄, z) = M(y, x, z);
4: Computing level 2 (root node): P1 = x, Q1 = @R21, R11 = M(x̄, y, z); R11: RM3(x, @R21, @R11) = M(x, @R̄21, @R11) = M(M(x̄, y, z), x, M̄(x, y, z)).

We assume that all RRAM devices are first loaded with zero. For initialization, voltage levels 1 and 0 should be applied to the bottom electrodes (Qij) and the top electrodes (Pij), respectively. This step is not considered in the operation count. Step 1 starts to compute nodes 1 and 2 in level 1 (see Fig. 8) by loading the variable z as the third operands of MAJ. As said before, every node of the level should be computed in a separate crossbar row. Accordingly, nodes 1 and 2 are, respectively, computed in rows 1 and 2 by selecting R11 and R21 as the corresponding third operands, i.e., the destinations of the operations. Then, the primary inputs are read from memory

and applied to the corresponding rows and columns to execute the operations.

It is worth noting that the MAJ-based realization of the majority gate in Section III-C1 also allocates RRAM devices for the first and the second operands. This is not actually required by the MAJ operation, dissimilar to IMP, which needs all variables to be accessible on the same row. Nevertheless, all of the input RRAM devices are already considered in the crossbar with the upper-bound dimensions shown in Fig. 9(b). Furthermore, in the MAJ-based realization of the majority gate, we simply assumed that the second operand needs to be inverted and considered an RRAM device for it, while this is not always required. An MIG node with a single ingoing complemented edge can actually be implemented faster by skipping the negation and using the complemented edge directly as the second operand. In the MIG shown in Fig. 8, nodes 1 and 3 (the root node) are ideal for MAJ due to possessing a single complemented edge, but node 2 requires one negation, which needs to be performed first.

The negation required for node 2 is performed in step 2 at R25 by setting its bottom electrode (Q2) to the value of x and its top electrode (P2) to 1. It should be noted that R25 is not independently accessible, and the entire second row and column are exposed to these voltage levels. Therefore, we need to ensure that the previously stored values are retained. By setting Q2 to x, the bottom electrode of R21 is also changed to x. To maintain the value stored in R21, its top electrode (P1) should also be set to the same voltage level x, as shown in step 2. By doing this, the top electrode of R11 changes to x, and thus its bottom electrode (Q1) also needs to be set to x for keeping the current state of the device.

The entire level 1 is computed simultaneously in step 3. One read is required to apply the value of x̄ to Q2. As shown in step 3, both nodes can be computed in the same column since their first operands are equal, which is not necessarily true in all cases. Step 4 computes the root node of the MIG. This can be done at one of the RRAM devices storing the intermediate results from the previous step and does not require any data loading. Since the value of node 2 is complemented, it is more efficient to use the value stored in R21 as the second operand to skip the negation. This requires one read from R21, and R11 can be set as the third operand, which is also the final destination of the computation.

B. IMP-Based Implementation

The required crossbar for the IMP-based implementation is shown in Fig. 9(c), which has one extra RRAM at the end of each row for the complemented edges. According to Table I, the example MIG shown in Fig. 8 with a maximum level size of 2 needs an upper bound of 12 (2×6) RRAM devices placed in two rows, in addition to one more for the ingoing complemented edge. As Table I suggests, the computation needs 22 steps, 2×10 for the two levels plus two more steps for the

2: Negation for node 1: R17 ← x IMP R17: R17 = x̄;
3–11: Computing level 1: node 1: R14 = M(x̄, y, z); node 2: R24 = M(x, y, z);
12: Loading variables for level 2: R11 = x, R12 = M(x, y, z), R13 = M(x̄, y, z); R14 = R15 = R16 = R17 = 0;
13: Negation for node 3: R17 ← R12 IMP R17: R17 = R̄12 = M̄(x, y, z);
14–22: Computing level 2 (root node): R14 = M(M(x̄, y, z), x, M̄(x, y, z)).

To explain the implementation step by step, we use the names R1 to R6 for the RRAM devices in each row to denote X, Y, Z, A, B, and C, respectively, as used in Section III-C1. For initialization, all of the RRAM devices in the entire crossbar are cleared. Dissimilar to MAJ, IMP needs all variables used for computation to be stored in the same horizontal line. This means that there may be a need to have several copies of primary inputs or intermediate results in different rows simultaneously, as shown in step 1, where the variables of nodes 1 and 2 are loaded into RRAM devices in both rows. Step 2 computes the complemented edge of node 1 in the seventh RRAM device considered for this case at the end of the first row, R17. Steps 3–11 compute both nodes at level one and store the results in the fourth RRAM device of the corresponding crossbar row, similarly to the RRAM device A used in Section III-C1. The same procedure continues to compute the second level, which only consists of the MIG root node. Two out of the three inputs of node 3 are intermediate results, which have to be first read and then copied into the corresponding RRAM devices at row 1 besides other input and work devices, as shown in step 12. In step 13, the complemented edge originating at node 2 is negated, and then the root node is computed in step 22.

C. Discussion

The MAJ-based implementation for the example MIG in Fig. 8 was carried out using a small number of RRAM devices and within only four operations, far less than the upper bounds, while no operation or RRAM devices could be saved during the IMP-based implementation. The length of the operations required for data loading and negation in the MAJ-based implementation can be shortened even more for larger Boolean functions. The number of RRAM devices can also be much lower than the MAJ-based upper bounds given in Table I by performing operations successively in the devices carrying the results of previous levels.

It is obvious that using MAJ provides higher efficiency, especially for MIG-based synthesis. Moreover, IMP requires additional voltages, which is not the case for MAJ, and as a result needs a more complex control scheme and peripheral circuitry. However, the MAJ-based implementation needs active read operations for each RM3 cycle, while the IMP-based implementation can reduce this requirement and propagate data
complemented edges, including the IMP operations and the natively within the memory array. MAJ-based implementation
loads. also does not allow to independently set the values of the top
The required steps for the IMP-based implementation of the electrodes of the computing RRAM devices placed in the same
MIG shown in Fig. 8 are listed below columns. Such computation correlations do not occur during
IMP-based operations since all IMP operations are executed
with the same voltage levels VSET and VCOND .
Initialization: Rij = 0;
Nonetheless, dependency of the voltage levels of crossbar’s
1: Loading variables for R11 = x, R12 = y, R13 = z; rows and columns can be managed in many cases due to the
level 1: R21 = x, R22 = y, R23 = z; commutativity property of the majority operation. This allows
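The two primitives compared in this discussion can be sanity-checked with a small bit-level model. The following is an illustrative sketch, not the authors' tooling; the function names are ours, and the resistive-majority electrode convention z ← M(p, q̄, z) (top electrode p, bottom electrode q, stored state z) is an assumption consistent with the negation step described above, where Q = x and P = 1 on an initialized device yield x̄.

```python
def maj(a, b, c):
    """Three-input Boolean majority."""
    return (a & b) | (b & c) | (a & c)

def rm3(p, q, z):
    """Resistive majority: next state of a device storing z when its top
    electrode is driven with p and its bottom electrode with q
    (assumed convention: z' = M(p, NOT q, z))."""
    return maj(p, 1 - q, z)

def imp(p, q):
    """Material implication: the target device is overwritten with NOT p OR q."""
    return (1 - p) | q

for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            # One RM3 cycle evaluates a full majority node: preload the
            # device with c, then drive the bottom electrode with NOT b
            # (obtained by one active read of b).
            assert rm3(a, 1 - b, c) == maj(a, b, c)
            # IMP needs a sequence even for a single NAND: a work device
            # initialized to 0 first accumulates NOT a, then NAND(a, b).
            s = imp(a, 0)   # s = NOT a
            s = imp(b, s)   # s = NOT b OR NOT a = NAND(a, b)
            assert s == 1 - (a & b)
```

The single-cycle majority versus the multi-step implication sequence is what drives the gap in operation counts between the two implementations contrasted above.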

This commutativity allows performing computations simultaneously at RRAM devices in the same column if they share a single operand to be applied to the entire column. Using an automated procedure, the RRAM devices allocated for parallel computations can be placed in different nonconflicting columns, as shown in Fig. 10(a). When none of the MAJ operations share any operand, the computations should be performed diagonally [Fig. 10(b)]. The rows or columns with computing devices may also contain previously stored RRAM devices, whose values have to be maintained during computations by applying equal voltage levels to their top and bottom electrodes, i.e., their row and column drivers. For example, in Fig. 10(c), the second column has a stored device in the second row from the top and a computing one in the third row. To keep the value of the stored device unchanged, the same voltage level which is applied to its column for computing has to be applied to its row. Setting the voltage level of the third column from the left also needs a similar consideration, as the column has a stored device located in a computing row.

Fig. 10. Selecting crossbar computing RRAM devices to avoid computation interferences for MAJ-based implementation. (a) Performing conflicting computations at different columns. (b) Diagonal computation. (c) Retaining previously stored devices.

It is obvious that a computation can coexist with a stored device in the same row or column if one of its operands is equal to what is applied to the coordinates of the stored device. However, the presence of several stored devices in a row or column may make it complex to arrange safe computations, which can be handled by freeing such rows or columns from computation. As Fig. 10 shows, considerations regarding conflicting computing or previously stored devices increase the area of the crossbar architecture, since a larger number of columns or rows may be required, although the number of required RRAM devices does not change. Nevertheless, the number of steps can increase if the crossbar array does not provide the required number of rows and columns, which requires moving some computations into successive steps.

VI. CONCLUSION

We presented an approach for logic synthesis of RRAM-based in-memory computing circuits using the logic representations BDDs, AIGs, and MIGs. We also showed that the presented approach provides valid and efficient crossbar implementations. The following remarks are concluded from a comparison of the experimental results.

1) The proposed BDD-based synthesis approach using multiobjective optimization reduces both cost metrics considerably compared to an existing BDD-based method. Using BDDs for synthesis results in a smaller number of RRAM devices at a high cost in the number of operations in comparison with the two other logic representations.

2) The proposed AIG-based synthesis approach reduces the number of operations by an order of magnitude in comparison with an existing approach, as well as providing a fair tradeoff between both cost metrics among the experimented representations.

3) In comparison with BDDs and AIGs, MIGs show a high capability in reducing the length of operations, which is mostly considered the main drawback of RRAM-based in-memory computing.

4) MAJ, combined with the use of MIGs, provides a platform for logic-in-memory computing synthesis which is highly efficient with respect to latency and crossbar dimensions.
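The interference-avoidance placement of Fig. 10 can be illustrated with a toy column-assignment routine. This is a hypothetical reconstruction, not the automated procedure used by the authors: operations that share the operand applied to a column driver may be packed into that column, while all remaining operations receive fresh columns, which for fully disjoint operations degenerates to the diagonal placement of Fig. 10(b). The `column_operand` key is an assumed descriptor, not notation from the paper.

```python
def assign_columns(ops):
    """Assign each concurrent MAJ operation a crossbar column.

    `ops` is a list of dicts with a (hypothetical) 'column_operand' key:
    the operand that would be applied to the whole column driver. Two
    operations may share a column only if this operand is identical;
    otherwise each is placed in a distinct column.
    """
    operand_to_col = {}   # operand driven onto a column -> column index
    placement = []
    for op in ops:
        key = op["column_operand"]
        if key not in operand_to_col:
            operand_to_col[key] = len(operand_to_col)
        placement.append(operand_to_col[key])
    return placement

# Two operations sharing x are packed into one column; the third gets
# its own column (and, on a real crossbar, its own row).
print(assign_columns([
    {"column_operand": "x"},
    {"column_operand": "x"},
    {"column_operand": "y"},
]))  # -> [0, 0, 1]
```

A full placement would additionally pin matching voltages on the rows and columns of previously stored devices, as described for Fig. 10(c).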
SHIRINZADEH et al.: LOGIC SYNTHESIS FOR RRAM-BASED IN-MEMORY COMPUTING 1435

Saeideh Shirinzadeh (S'16) received the B.Sc. and M.Sc. degrees in electrical engineering from the University of Guilan, Rasht, Iran, in 2010 and 2012, respectively. She is currently pursuing the Ph.D. degree with the Group of Computer Architecture, University of Bremen, Bremen, Germany.

Her current research interests include multiobjective optimization, evolutionary computation, logic synthesis, and in-memory computing.

Mathias Soeken (S'09–M'13) received the Diploma degree in engineering and the Ph.D. degree in computer science and engineering from the University of Bremen, Bremen, Germany, in 2008 and 2013, respectively.

He is a Scientist with the Integrated Systems Laboratory, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland. He is involved in active collaborations with the University of California at Berkeley, Berkeley, CA, USA, and Microsoft Research, Redmond, WA, USA. He is also actively maintaining the logic synthesis frameworks CirKit and RevKit. His current research interests include the many aspects of logic synthesis and formal verification, constraint-based techniques in logic synthesis, and industrial-strength design automation for quantum computing.

Dr. Soeken was a recipient of the Scholarship from the German Academic Scholarship Foundation, Germany's oldest and largest organization that sponsors outstanding students in the Federal Republic of Germany. He has been serving as a TPC member for several conferences, including Design Automation Conference'17 and IEEE/ACM International Conference on Computer-Aided Design (ICCAD)'17. He is a Reviewer for Mathematical Reviews as well as for several journals.

Pierre-Emmanuel Gaillardon (S'10–M'11–SM'16) received the Electrical Engineering degree from CPE-Lyon, Villeurbanne, France, in 2008, the M.Sc. degree in electrical engineering from INSA Lyon, Villeurbanne, in 2008, and the Ph.D. degree in electrical engineering from CEA-LETI, Grenoble, France, and the University of Lyon, Lyon, France, in 2011.

He was a Research Assistant with CEA-LETI. He was a Research Associate with the Swiss Federal Institute of Technology, Lausanne, Switzerland, within the Laboratory of Integrated Systems (Prof. De Micheli), and a Visiting Research Associate with Stanford University, Palo Alto, CA, USA. He is an Assistant Professor with the Electrical and Computer Engineering Department, University of Utah, Salt Lake City, UT, USA, and he leads the Laboratory for NanoIntegrated Systems. His current research interests include the development of reconfigurable logic architectures and digital circuits exploiting emerging device technologies, and novel EDA techniques.

Prof. Gaillardon was a recipient of the C-Innov 2011 Best Thesis Award and the Nanoarch 2012 Best Paper Award. He is an Associate Editor of the IEEE TRANSACTIONS ON NANOTECHNOLOGY. He has been serving as a TPC member for many conferences, including Design, Automation & Test in Europe (DATE)'15-16, Design Automation Conference'16, and Nanoarch'12-16. He is a Reviewer for several journals and funding agencies. He will serve as the Topic Co-Chair "Emerging Technologies for Future Memories" for DATE'17.

Rolf Drechsler (M'94–SM'03–F'15) received the Diploma and Dr.Phil.Nat. degrees in computer science from Johann Wolfgang Goethe University Frankfurt am Main, Frankfurt, Germany, in 1992 and 1995, respectively.

He was with the Institute of Computer Science, Albert-Ludwigs University, Freiburg im Breisgau, Germany, from 1995 to 2000, and with the Corporate Technology Department, Siemens AG, Munich, Germany, from 2000 to 2001. Since 2001, he has been with the University of Bremen, Bremen, Germany, where he is currently a Full Professor and the Head of the Group for Computer Architecture, Institute of Computer Science. In 2011, he additionally became the Director of the Cyber-Physical Systems Group, German Research Center for Artificial Intelligence, Bremen. He is the Co-Founder of the Graduate School of Embedded Systems and he is the Coordinator of the Graduate School "System Design" funded within the German Excellence Initiative. His current research interests include the development and design of data structures and algorithms with a focus on circuit and system design.

Mr. Drechsler was a recipient of the best paper awards at the Haifa Verification Conference in 2006, the Forum on Specification & Design Languages in 2007 and 2010, the IEEE Symposium on Design and Diagnostics of Electronic Circuits and Systems in 2010, and the IEEE/ACM International Conference on Computer-Aided Design (ICCAD) in 2013. He is an Associate Editor of the IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, the ACM Journal on Emerging Technologies in Computing Systems, the IET Cyber-Physical Systems: Theory & Applications, and the International Journal on Multiple-Valued Logic and Soft Computing. He was a member of the Program Committees of numerous conferences, including Design Automation Conference (DAC), ICCAD, Design, Automation & Test in Europe (DATE), Asia and South Pacific Design Automation Conference, FDL, ACM-IEEE International Conference on Formal Methods and Models for System Design, and Formal Methods in Computer-Aided Design, the Symposiums Chair of the IEEE International Symposium on Multiple-Valued Logic 1999 and 2014, and the Topic Chair for "Formal Verification" at DATE 2004, DATE 2005, DAC 2010, as well as DAC 2011. He is the General Chair of the International Workshop on Logic Synthesis 2016 and the General Co-Chair of FDL 2016.
