IET Computers Digital Tech - 2017 - Khan - Comparative Analysis of Network‐on‐Chip Simulation Tools


IET Computers & Digital Techniques

Research Article

Comparative analysis of network-on-chip simulation tools

ISSN 1751-8601
Received on 8th April 2017
Revised 5th August 2017
Accepted on 12th September 2017
E-First on 5th December 2017
doi: 10.1049/iet-cdt.2017.0068
www.ietdl.org

Sarzamin Khan1, Sheraz Anjum2, Usman Ali Gulzari3, Frank Sill Torres4
1Department of Electrical Engineering, COMSATS Institute of Information Technology, Wah Cantt, Pakistan
2Department of Computer Science, COMSATS Institute of Information Technology, Wah Cantt, Pakistan
3Department of Electrical Engineering, COMSATS Institute of Information Technology, Islamabad, Pakistan
4Department of Electronic Engineering, Federal University of Minas Gerais, Belo Horizonte, Brazil

E-mail: [email protected]

Abstract: Network-on-chip (NoC) is a reliable and scalable communication paradigm deemed as an alternative to classic bus systems in modern systems-on-chip designs. Consequently, one can observe extensive multidimensional research related to the design and implementation of NoC-based systems. A basic requirement for most of these activities is the availability of NoC simulators that enable the study and comparison of different technologies. This study analyses different NoC simulators and highlights their contributions to NoC research. Various NoC tools such as NoCTweak, Noxim, Nirgam, Nostrum, BookSim, WormSim, NOCMAP and ORION are evaluated and their strengths and weaknesses are highlighted. The comparative analysis includes methods for estimation of latency, throughput and energy consumption. Further, an exemplary real-world application, the video object plane decoder, is mapped on a 2D mesh NoC using different mapping algorithms under the NOCMAP and NoCTweak simulators for comparative analysis of the NoC simulators and their embedded mapping algorithms.

1 Introduction

Modern integrated system designs incorporate system-on-chip (SoC) technology in order to improve performance, cost and energy consumption. Embedded SoC systems comprise intellectual property (IP) cores, memory units, processors etc. [1]. In state-of-the-art SoC designs, these blocks are connected by traditional busses to communicate and exchange data with each other. However, beyond a certain number of elements, bus-based systems reach their communication limits due to high power consumption, high bandwidth requirements and high latency [2]. Hence, the network-on-chip (NoC) paradigm has been proposed as an alternative solution to these communication bottlenecks. Bus-based systems enable communication among the resources via custom busses, shared busses, hierarchical busses or a bus matrix, as shown in Fig. 1.

In contrast, NoC-based systems include 2D, 3D mesh, Tree and Torus NoC designs. Hybrid designs of these architectures may also be adopted [3, 4].

Fig. 1 Embedded system categories

NoCs apply packet-based switching for on-chip communication. The packets enable the exchange of data among processing elements (PEs), using resource network interfaces (RNI), routers and interconnecting links, as shown in Fig. 2. Routers forward packets across the network and may consist of input/output buffers, routing logic, allocators and crossbars [5]. Links/channels interconnect the routers for data transmission. They are usually bidirectional and multi-layered. The RNI is the interfacing unit between the PEs and the router. RNI and router collectively work as a resource network router. Its function is to packetise and de-packetise the messages for processing and routing across the network. A PE is the functional block that processes the data and may be an IP core, a memory element, a processor etc.

The NoC architecture is specified by its topology, routing algorithm and flow control. Here, topology means the physical ordering and arrangement of the links connecting the network nodes. Exemplary topologies are Mesh, Torus, Folded Torus and Tree [6, 7]. The routing algorithm defines how messages between

IET Comput. Digit. Tech., 2018, Vol. 12 Iss. 1, pp. 30-38 30


© The Institution of Engineering and Technology 2017
1751861x, 2018, 1, Downloaded from https://ietresearch.onlinelibrary.wiley.com/doi/10.1049/iet-cdt.2017.0068, Wiley Online Library on [19/09/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
PEs traverse the network. Routing may be deterministic or adaptive. A good routing algorithm balances the traffic load across the network and should avoid deadlock and live-lock. Flow control is the allocation of communication resources for guaranteed and reliable transmission of packets between the PEs. In other words, flow control allocates resources such as buffers and link bandwidth to the messages as they traverse the network [8–10].

Fig. 2 Elements of a common NoC

NoCs are generally evaluated with respect to their performance parameters, such as energy consumption, area, communication bandwidth, throughput and latency [11]. Several simulators have been designed in order to evaluate these parameters and predict the system's performance prior to the design implementation [12–26]. Some simulators like ORION [12, 13] focus only on the performance of individual components, e.g. router power and area, while others such as Noxim, Nirgam and NoCTweak [14–17] are designed for measurement of the entire network performance. The ORION simulator is used as a standard model for estimating the energy of a router and a link of the network. It is usually embedded in other network simulators (e.g. Noxim, Nirgam) to calculate the energy of the entire network. Other energy models, such as the CMOS standard library model [17] and the bit energy model [27], are also used for estimation of the network energy consumption. As NoC inherits many features from general computer networks, general network simulators can also be used for NoC system simulations [28–32]. Given the growing number of simulators, this work is an attempt to provide a road-map that facilitates the selection of appropriate tools.

This paper is organised as follows: Section 2 presents related work on NoC simulators. Section 3 briefly describes the representation of NoC applications and Section 4 relates to the comparative analysis of NoC simulators. Simulation results are evaluated in Section 5 and concluding remarks are presented in Section 6.

2 Related work

Current research is mostly concerned with topology design, router design, routing protocols, mapping and scheduling techniques [33]. Only a limited number of comparisons and surveys of NoC tools are available [34–42]. The work presented in [37] discusses some NoC proposals and contributions regarding their attainable capabilities. The author has collected a few NoC tools from the literature and presented a short description of the characteristics of the simulators. A comprehensive study is carried out about NoC concepts, but the survey lacks a comparative analysis of the NoC performance measurements.

In the research work of Neuenhahn [38], static and dynamic performance analysis of NoC is performed and evaluated at different abstraction levels. The static performance analysis includes timing models, while the dynamic performance models are composed of FPGA, VHDL, Colored Petri Nets and SystemC-based simulations. Comparative analysis of different modelling techniques showed significant deviations in terms of performance parameters of the simulators. The author has presented a quantitative analysis of the tools regarding their structure, but focused only on general aspects of the NoC tools.

The study in [39] presented an overview of the NoC concepts and simulation tools. Furthermore, the recent contributions on simulation tools are highlighted. The survey presented in [40] reviewed different approaches for NoC traffic models and performance evaluation. In this work, the evaluation-based analytical models and the design challenges of the NoC simulators are discussed, but only a descriptive study is presented about the NoC tools. The author of the research work in [41] presented a short comparison of the HNOCS simulator with a few NoC simulators and highlighted the contribution of HNOCS to NoC research. The work presented in [42] collected numerous programming models and proposals for MPSoC but did not discuss dedicated NoC simulators. The availability of 3D, optical and wireless NoC simulators is also limited, because these are new research areas for the research community. It is an open problem for future NoC research to design specific simulators for such NoC designs.

3 Representation of NoC applications

NoC systems are generally represented by characterisation graphs. Different simulators extract data from these graphs for performance simulations. An application can be represented by a network task graph (NTG), which is subsequently scheduled on the available IPs through a network core graph (NCG). The NCG is then transformed and mapped on the NoC packet-switched network architecture through a NoC architecture graph (NAG). In this work, these characterisation graphs are used to input data to the NoC simulators for comparative analysis and are briefly discussed in the following sections.

3.1 Network task graph

An NTG is a directed acyclic graph, $G_r = G_r(T, C)$, in which each vertex characterises a task ($t_i \in T$, $i = 1, 2, 3, \ldots$) for the computational resource of the application and represents information such as execution time, energy consumption and task deadlines, as shown in Fig. 3a. The directed arc ($c_{i,j} \in C$, $i = 1, 2, 3, \ldots$, $j = 1, 2, 3, \ldots$) represents either data or dependency information between processing tasks ($t_i$ and $t_j$). The arc $c_{i,j}$ is associated with a value $v(c_{i,j})$, which characterises the data exchanged between the processing tasks. A real application is usually represented by an NTG, which provides the necessary information about the application for processing and simulations, as shown in Fig. 3b.

3.2 Network core graph

An NCG, $G_r' = G_r'(P, A)$, is a directed graph in which each vertex ($p_i \in P$) represents a PE, while the directed arc ($a_{i,j} \in A$) shows characteristic parameters between the PEs ($p_i$ to $p_j$) as

shown in Fig. 3c. An arc $a_{i,j}$ is associated with communication information, e.g. the rate and volume of the communicated data, and may carry system design constraints, e.g. required latency and data bandwidth. The NCG intrinsically represents the scheduling of the tasks ($T$) on the available PEs ($P$) for processing. When the number of PEs equals or exceeds the number of tasks ($|P| \geq |T|$), a single task can be scheduled on an individual PE, but when $|P| < |T|$, two or more tasks must be scheduled on a single PE. For this purpose, a scheduler is required before performance simulations.

3.3 NoC architecture graph

The NAG, $A = A(R, H)$, is an architecture graph in which each vertex ($r_i \in R$) is a router (node), while a directed arc ($h_{i,j} \in H$) indicates the bidirectional routing channel between the routers $r_i$ and $r_j$, as shown in Fig. 3d. Arc $h_{i,j}$ is associated with $M_{i,j}$ (the set of minimum paths from $t_i$ to $t_j$) and $L(m_{i,j})$, which represents the set of all links associated with $m_{i,j}$. The cost of the arc is represented by $e(h_{i,j})$, which is the average energy consumption (in joules) of sending data between network tiles $t_i$ and $t_j$, e.g. $E_{i,j}$ (bit). The NAG represents the applications mapped on NoC tiles, which are then simulated on a specific tool for performance measurements.

Fig. 3 Representation of embedded applications
(a) NTG, (b) VOPD application, (c) NCG, (d) NAG

To evaluate the potential of NoC simulators for mapping a real application on the NAG, the video object plane decoder (VOPD) application is considered in this research work. The NTG of VOPD consists of 16 distinct tasks, as shown in Fig. 3b. In this case, we assume that the number of tasks equals the number of available processors, therefore scheduling is not required. The NTG and NCG have a one-to-one mapping, so we directly input the NTG to NoCTweak and NOCMAP for simulation, as discussed in Section 5.2.

4 NoC simulators

A common challenge in selecting the right NoC simulator is that available tools are usually strong in certain measurements and have deficits in others. NoC simulators can be divided into two broad categories:

(1) General network simulators that can be used for NoC simulations (e.g. NS2, NS3, Omnet++, Wattch, Hotspot, Netsim, Gem5, Graphite, Hornet, Opnet, Fusionsim, Esece) [28–32].
(2) Specific NoC simulators, which are explicitly designed for NoC simulation (e.g. BookSim, HNOCS, WormSim, Ocsim, Vnoc, Matrics, SICOSY, Tpzsimul, Garnet, SUNMAP, Ocintsim, Noxim, Nostrum, Nirgam, Occn, Nocsim, NoCTweak, Atlas, Gpnocsim, Xmulator, NOCMAP, ReliableNoC, MapoNoC, Phoenixsim, Access Noxim and ORION) [12–26, 43–48] (Table 1).
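
To make the chain from NCG to NAG in Section 3 concrete, the following Python sketch places a small core graph on a 2D mesh and scores each placement by its communication energy. It is a minimal illustration under stated assumptions, not the method of any simulator discussed here: routing is taken to be minimal (so the hop count equals the Manhattan distance), every traversed router and link is assumed to add a fixed per-bit energy, the constants `E_ROUTER_BIT` and `E_LINK_BIT` and the four-core graph `NCG` are hypothetical, and bandwidth constraints are ignored.

```python
from itertools import permutations

# Hypothetical per-bit energies in arbitrary integer units (assumption).
E_ROUTER_BIT = 5
E_LINK_BIT = 2


def hop_count(tile_a, tile_b, width):
    """Manhattan distance between two tiles of a mesh `width` tiles wide."""
    ax, ay = tile_a % width, tile_a // width
    bx, by = tile_b % width, tile_b // width
    return abs(ax - bx) + abs(ay - by)


def bit_energy(tile_a, tile_b, width):
    """Energy to move one bit: d links and d + 1 routers for distance d."""
    d = hop_count(tile_a, tile_b, width)
    return (d + 1) * E_ROUTER_BIT + d * E_LINK_BIT


def mapping_energy(ncg, mapping, width):
    """Sum of volume * per-bit energy over all arcs of the core graph."""
    return sum(
        volume * bit_energy(mapping[src], mapping[dst], width)
        for (src, dst), volume in ncg.items()
    )


def best_mapping(ncg, cores, width, height):
    """Exhaustive search over all placements; only viable for tiny graphs."""
    best = None
    for placement in permutations(range(width * height), len(cores)):
        mapping = dict(zip(cores, placement))
        energy = mapping_energy(ncg, mapping, width)
        if best is None or energy < best[0]:
            best = (energy, mapping)
    return best


# Hypothetical NCG: (source core, destination core) -> volume in bits.
NCG = {("p0", "p1"): 100, ("p1", "p2"): 50, ("p2", "p3"): 100, ("p3", "p0"): 10}
CORES = ["p0", "p1", "p2", "p3"]

naive = {"p0": 0, "p1": 1, "p2": 2, "p3": 3}
naive_energy = mapping_energy(NCG, naive, width=2)
best_energy, best_map = best_mapping(NCG, CORES, width=2, height=2)
print(naive_energy, best_energy, best_map)
```

Exhaustive search grows factorially with the number of cores, so it only serves to illustrate the cost function; for a 16-task application such as VOPD a heuristic search is needed.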

In this section, we describe some important simulators for NoC-based designs. Fig. 4 depicts the block diagram of a generic NoC simulator with its characteristic parameters. The NoC simulator may have the following input parameters:

(1) Configuration options: Configuration options define the type of application traffic simulated on the NoC tool. It may be synthetic traffic patterns or embedded application traces. They may also include a seed value for the simulation, a log file for simulation outputs, a warm-up time for the network to become stable and the simulation run time.
(2) Synthetic options: Synthetic options define the size and type of topology for the traffic, such as 2D mesh, and the type of synthetic traffic pattern, such as random, transpose, bit-complement, bit-reverse, bit-shuffle, bit-rotate and hotspot router selection. Hotspots are routers in the network that receive packetised data at a higher rate than they can handle. This phenomenon reduces system performance and can lead to deadlocks. An intelligent routing algorithm can prevent the formation of hotspots.
(3) Embedded application traces: Embedded applications are real application task graphs used in the simulation, such as a VOPD, multimedia system, multi-window display, MPEG4 decoder and E3S benchmarks.
(4) Mapping option: Mapping options such as near-optimal mapping (NMAP), simulated annealing (SA) and branch and bound (BB) should also be included for obtaining optimised latency, throughput and energy consumption.
(5) Traffic options: Traffic options include the number of flits injected by each core per cycle (flit injection rate), the probability distribution of the period between two injected packets, the packet length and the number of flits per packet.
(6) Router settings: Router settings provide the type of router, such as wormhole router, virtual channel router, shared queues router, bufferless router and circuit-switched router. They also define the pipeline type, the number of pipeline stages, the buffer depth etc.
(7) Routing options: Routing options define the routing algorithm, such as XY dimension-ordered routing, west-first, north-last and odd-even (OE) minimal adaptive routing. They may also include an output port selection policy, such as the X dimension first, the dimension nearest to the destination first, the dimension farthest from the destination first, round-robin among output ports or the output port with the highest credit first, as well as the switching arbitration policy and the inter-router link length.
(8) Technology settings: These include the CMOS technology process (e.g. 90, 65, 45, 32, 22 nm), clock frequency and supply voltage.
(9) Measurement options: The output measurement parameters include throughput, power, latency and energy consumption. The results of the output performance parameters predict the behaviour of the NoC multicore system before physical implementation.

4.1 NoCTweak

NoCTweak is a SystemC-based simulator developed by Tran and Baas [17] for NoC. It is designed for early exploration of performance, e.g. throughput, latency and energy estimation of NoC designs. NoCTweak uses standard CMOS library cell data for post-layout timing and power estimation. The simulator has a command line interface for changing input simulation parameters. The current version of the simulator is designed only for 2D mesh topology, but offers both synthetic and embedded traffic patterns. Hotspot and mapping options, i.e. random and NMAP algorithms for mapping the IPs, are included. Router design selection, switching strategies, CMOS technology process selection, and frequency and voltage selection are also included in this simulator.

4.2 Noxim

Noxim is developed by Maurizio Palesi, Davide Patti and Fabrizio Fazzino at the University of Catania [15, 16]. It is a SystemC-based simulator with a command line interface for changing input parameters. The simulator is based on the ORION power model and is currently designed for 2D mesh topology with only synthetic traffic patterns and a wormhole router design. Energy, throughput and communication delay are its output statistic parameters. The input parameters include network size, packet size, packet injection rate, buffer depth, routing strategies and traffic distribution. The routing algorithms include XY, negative-first, west-first, north-last, OE, dyad, fully adaptive and lookup table based.

4.3 Nirgam

Nirgam is developed by Lavina Jain as a joint collaboration between the University of Southampton, UK and Malaviya National Institute of Technology, India [14]. It is an open source, discrete event, cycle accurate simulator for NoC. The current version of the simulator is designed for 2D mesh and torus topologies. The wormhole switching mechanism is adopted, in which a packet consists of head, body and tail flits. The number of virtual channels, buffer size and clock frequency can also be changed for simulation. The simulator supports source, XY and OE routing mechanisms. Traffic options include source/sink and a synthetic generator, which may be constant bit rate, bursty or input trace based. Performance parameters include average latency per packet (in clock cycles), average latency per flit (in clock cycles) and average throughput (in Gbps) for each channel.

4.4 Nostrum

Nostrum is a cycle accurate, layered NoC simulator written in SystemC and Python by the Nostrum Team at the Royal Institute of Technology (KTH), Stockholm [19]. It supports 2D mesh and torus topology with wormhole store and forward switching mechanism. The routing algorithms are XY and deflection order routing. It has the ability to map an application to the nodes of the network. The user can also change the arbitration policy and buffering options. It can be configured for both best effort and guaranteed communications. Best effort communication provides average performance, but better utilisation of network resources. Guaranteed communication is based on time division multiplexing with virtual circuits to ensure guaranteed latency and throughput.

4.5 BookSim

BookSim is an interconnection network simulator written in C++ [18]. This simulator can simulate a wide range of topologies, e.g. 2D mesh, torus, concentrated mesh, fat tree, butterfly, flattened butterfly, quad tree and user-specified anynet etc. The simulator supports input queued router and event-driven router micro-architectures with virtual channel support. The performance parameters are either latency or throughput versus offered load. The simulator also supports changing buffer size, routing mechanism and arbitration policy. Currently, it only supports synthetic traffic patterns.

4.6 NOCMAP/reliableNoC

NOCMAP is an open source mapping simulator for NoC written in C++ by Hu et al. [27, 50]. Two mapping algorithms, BB and SA, are implemented in this tool. It uses the bit energy model to calculate the minimum total communication energy of the NoC. The BB mapping algorithm is used for topological placement of IPs onto the NoC platform to minimise total communication energy consumption. The link bandwidth is considered as a constraint parameter. For comparison, an ad-hoc SA method is also implemented, which indicates that BB is faster than the SA technique with comparable results.

Based on the bit energy model EPAM (XY, OE and WF), which is energy and performance aware mapping with different routing algorithms (XY, OE and west-first), an efficient BB algorithm is implemented in this simulator. For comparison, an SA algorithm is also implemented, which shows that the proposed algorithms are more efficient than SA regarding result optimality and simulation speed. Moreover, it is observed that EPAM-OE gives more accurate results when applied to real and complex applications with large system size. ReliableNoC is an extended version of

NOCMAP simulator, which adds a reliability parameter to the NOCMAP simulator.

Fig. 4 Generic NoC simulator

4.7 ORION

Architectural power estimation is an important factor in designing NoC-based systems. ORION is a fast and accurate architectural power model for router power and area simulation of NoC [12]. ORION 3.0 [13] is its latest release after ORION 1.0 and ORION 2.0. ORION 2.0 and ORION 3.0 offer some additional functionality as well as performance parameters closer to actual NoC designs. A comparison between ORION 2.0 simulations and the actual power consumption of the link, FIFO, clock, arbiter and crossbar of the Intel 80-core design closely verifies the simulation results of the ORION 2.0 simulator.

4.8 Component-based interconnection network simulator (CINSim)

CINSim is a general purpose simulator for communication networks developed by a research group at Technische Universität Berlin (Real-Time Systems and Robotics) [28, 29]. The core of the simulator is written in C++ and has a command line interface. The editor of the simulator is written in Java and is platform independent. The graphical user interface (CINSim GUI) of the simulator works through an XML file. The simulator is capable of simulating both regular and irregular communication networks. The performance parameters are throughput, delay and latency. The network components are source buffers, non-shared buffers, routers, target buffers, routing and switching techniques and scheduling algorithms. Like NS-2, it can also be used for NoC simulations.

5 Comparative simulations

To carry out a comparative analysis of NoC tools, a generic configuration setup is described in order to evaluate each simulator on the basis of its performance metrics. The available simulators do not have a common approach for input/output parameter selection and also have different simulation models, so it is difficult to precisely scrutinise each simulator. Therefore, a common configuration setup is selected in such a way that it has the same input parameters for comparison. The parameters in Table 2 are used for analysis of the tools.

Most of the available simulators support 2D mesh, synthetic traffic patterns, the wormhole switching mechanism, a buffer depth option and the XY routing algorithm. Throughput, latency and power are the output characteristic parameters of these simulators. In this research work, the results of different simulators are compared on the basis of a unified configuration setup for a 5 × 5 2D mesh. The input parameters and the design structure of most of the simulators are different, so it is difficult to precisely compare all the simulators on a unified configuration setup. An attempt is made to compare some of these simulators on the basis of a best uniform approach. A 2D mesh of size 5 × 5 is selected, because most of the

Table 1 Comparison of NoC simulators

Simulator | NoCTweak [17] | Noxim [15, 16] | Nirgam [14] | Nostrum [19] | BookSim [18] | WormSim [49] | NOCMAP [27, 50] | ORION [12, 13]
language | SystemC | SystemC | SystemC | SystemC | C++ | C++ | C++ | C++
topology | 2D mesh | 2D mesh | 2D mesh, torus | 2D mesh, torus | wide range | 2D mesh, torus | 2D mesh | no
traffic pattern | synthetic, embedded | synthetic | synthetic, embedded | synthetic | synthetic | synthetic, embedded | synthetic, embedded | no
switching mechanism | wormhole, virtual channel, RoShaQ, bufferless, circuit switched | wormhole with virtual channel | wormhole with virtual channel | wormhole with virtual channel | wormhole with virtual channel | wormhole with virtual channel | wormhole with virtual channel | user design
buffer depth option | yes | yes | yes | yes | yes | yes | yes | yes
routing algorithm | XY, negative first, west first, north last, OE, lookup table | XY, negative first, west first, north last, OE, dyad, fully adaptive, lookup table | source routing, XY, OE | XY, deflection routing | all | XY, OE, dyad | XY, OE, west first, dyad | no
performance parameters | power/energy consumption, throughput, latency | energy, throughput, communication delay | power, latency, throughput | latency, throughput, link utilisation | latency, throughput | energy | energy, reliability | router power, link power, router area
energy model | CMOS standard cell library model | ORION model | ORION model | no | no | ORION/bit energy model | bit energy model | ORION model
input parameters interface | command line | command line | log file | command line | log file | command line | command line | command line
hotspot option | yes | yes | no | no | no | yes | no | no
mapping | NMAP, random | no | manual | manual | no | no | BB, SA | no
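
As an illustration of how Table 1 can serve as a selection aid, the following Python sketch encodes three of its rows (traffic pattern, hotspot option and mapping) and filters the simulators by requirement. The class `Simulator`, the helper `sim` and the function `select` are hypothetical names introduced only for this example, and treating "manual" mapping as not automatic is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Simulator:
    name: str
    traffic: frozenset   # supported traffic patterns
    hotspot: bool        # hotspot option available
    mapping: frozenset   # mapping methods ("manual" = placed by the user)


def sim(name, traffic, hotspot, mapping=()):
    """Small constructor helper that converts lists to frozensets."""
    return Simulator(name, frozenset(traffic), hotspot, frozenset(mapping))


# Rows "traffic pattern", "hotspot option" and "mapping" of Table 1.
TABLE_1 = [
    sim("NoCTweak", ["synthetic", "embedded"], True, ["NMAP", "random"]),
    sim("Noxim", ["synthetic"], True),
    sim("Nirgam", ["synthetic", "embedded"], False, ["manual"]),
    sim("Nostrum", ["synthetic"], False, ["manual"]),
    sim("BookSim", ["synthetic"], False),
    sim("WormSim", ["synthetic", "embedded"], True),
    sim("NOCMAP", ["synthetic", "embedded"], False, ["BB", "SA"]),
    sim("ORION", [], False),
]


def select(sims, traffic=None, hotspot=None, automatic_mapping=None):
    """Return names of simulators meeting every requirement that is not None."""
    chosen = []
    for s in sims:
        if traffic is not None and traffic not in s.traffic:
            continue
        if hotspot is not None and s.hotspot != hotspot:
            continue
        has_auto = bool(s.mapping - {"manual"})
        if automatic_mapping is not None and has_auto != automatic_mapping:
            continue
        chosen.append(s.name)
    return chosen


print(select(TABLE_1, traffic="embedded", automatic_mapping=True))
print(select(TABLE_1, traffic="embedded", hotspot=True))
```

The first query returns the two tools used for the real application benchmark in Section 5.2, which is consistent with the choice made there.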

Table 2 Platform description

Platform | Description | Remarks
topology | 2D mesh | most widely used
network size | 5 × 5 (25 nodes) | selected for simulation
workload/benchmark | synthetic | most simulators support
fixed packet length | 10 flits | nominal size
flit injection rate | 0.1–0.7 flits/cycle/node | range 0–1
traffic type | uniform | predictable results
router type | wormhole | most common
routing algorithm | XY | commonly supported
buffer size | 8 flits | selected for simulation
input voltage | 1 V | selected for simulation
operating clock frequency | 1000 MHz | selected for simulation
warm-up time | 20,000 cycles | for accuracy
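
The platform of Table 2 can be explored with a few lines of Python. The sketch below is only an illustration under stated assumptions: node identifiers are assumed to be row-major (id = y × width + x), uniform traffic is taken to mean that every other node is an equally likely destination, and the function names are hypothetical. It computes an XY route, the average hop count under uniform traffic, the packet injection rate implied by a flit injection rate, and the warm-up time in microseconds.

```python
WIDTH, HEIGHT = 5, 5          # network size from Table 2
PACKET_LENGTH_FLITS = 10      # fixed packet length
CLOCK_MHZ = 1000              # operating clock frequency
WARMUP_CYCLES = 20_000        # warm-up time


def xy_route(src, dst, width=WIDTH):
    """Nodes visited by dimension-ordered XY routing (X first, then Y).

    Node ids are assumed to be row-major: id = y * width + x.
    """
    x, y = src % width, src // width
    dst_x, dst_y = dst % width, dst // width
    path = [src]
    while x != dst_x:
        x += 1 if dst_x > x else -1
        path.append(y * width + x)
    while y != dst_y:
        y += 1 if dst_y > y else -1
        path.append(y * width + x)
    return path


def average_hops(width=WIDTH, height=HEIGHT):
    """Mean hop count over all ordered source/destination pairs, src != dst."""
    nodes = range(width * height)
    pairs = [(s, d) for s in nodes for d in nodes if s != d]
    return sum(len(xy_route(s, d, width)) - 1 for s, d in pairs) / len(pairs)


def packet_rate(flit_rate, packet_length=PACKET_LENGTH_FLITS):
    """Packets injected per cycle per node for a given flit injection rate."""
    return flit_rate / packet_length


clock_period_ns = 1000 / CLOCK_MHZ
warmup_us = WARMUP_CYCLES * clock_period_ns / 1000

print(xy_route(0, 24))
print(average_hops())
print(packet_rate(0.1), packet_rate(0.7))
print(clock_period_ns, warmup_us)
```

With 10-flit packets, the 0.1–0.7 flits/cycle/node range corresponds to roughly 0.01–0.07 packets/cycle/node, and at 1000 MHz the 20,000-cycle warm-up lasts 20 µs.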

simulators support 2D mesh topology, and size 5 × 5 is chosen to minimise simulation time.

5.1 Synthetic benchmark

For comparison of NoC simulators, a synthetic benchmark is selected because some simulators, such as Noxim and CINSim, support only synthetic traffic patterns. Injected traffic is also limited to 0.7 flits/cycle/node in order to avoid network traffic congestion. In the latency analysis, it is observed that Noxim and Nirgam have almost the same latency trend with injected traffic (flit injection rate), as shown in Fig. 5a. NoCTweak and CINSim have a non-linear response at low traffic, but when the flit injection rate exceeds 0.5 flits/cycle/node, their response is comparable with Noxim and Nirgam. This may be due to the fact that NoCTweak and CINSim require some extra parameters for analysis, which are limited in Noxim and Nirgam. Variations in the results are also due to the unavailability of an exact and uniform embedded platform in these simulators. The design structures of these simulators also differ from each other.

In throughput calculations (Fig. 5b), variation in the results of NoCTweak is observed at the start, but the response closely matches that of Noxim and CINSim at a slightly higher injection rate. Similarly, in network energy analysis, NoCTweak and Noxim show a minor difference in results with the same injected traffic, as shown in Fig. 5c. This is due to the fact that Noxim uses the ORION energy model [12, 13] while NoCTweak utilises the CMOS standard cell library model [17]. In network power calculations (Fig. 5d), the results of NoCTweak and Nirgam are again comparable and have almost the same trend at a high injection rate.

5.2 Real application benchmark

NoCTweak and NOCMAP have the ability to map a real application on NoC tiles for energy optimisation. A VOPD (Fig. 3b) is simulated on these simulators for energy comparison. VOPD has 16 tasks and therefore requires a 4 × 4 2D mesh for implementation. NoCTweak has built-in random and NMAP mapping algorithms, while NOCMAP utilises BB and SA algorithms. In NoCTweak, mapping through the NMAP algorithm yields a 23.52% improvement in energy when compared with the random algorithm, as shown in Table 3.

Also, the BB algorithm in NOCMAP has a 1% improvement in energy over the SA algorithm. The simulation times of the NMAP, SA, BB and random algorithms of the NoCTweak and NOCMAP simulators are also shown in Table 3. The SA algorithm of NOCMAP is the slowest among all the algorithms. As application mapping is an NP-hard problem, the SA algorithm in NOCMAP is not feasible for mapping large applications.

A 24.89% improvement in power, 2% reduction in throughput and 22% improvement in latency of the NMAP algorithm are also observed when compared with random mapping of the VOPD application (Table 4). The mapping results of the BB, SA, random and NMAP algorithms are shown in Fig. 6. As different algorithms

Fig. 5 Performance parameters of NoC simulators
(a) Network latency of NoCTweak, Noxim, Nirgam and CINSim, (b) Network throughput of NoCTweak, Noxim and CINSim, (c) Average energy of NoCTweak and Noxim, (d) Network power of NoCTweak and Nirgam

Table 3 VOPD application energy and simulation time comparison

                              NoCTweak random   NoCTweak NMAP   NOCMAP BB   NOCMAP SA
                              mapping           mapping         mapping     mapping
energy consumption, pJ/flit   34.56             26.43           29.90       30.01
simulation time, s            14                14              16          150
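
The figures in Table 3 can be turned into relative savings and a slowdown factor with a few lines of arithmetic. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the 16-core application size used to count candidate placements is an assumption made for the example, not a value taken from the table. Percentages derived from the rounded table entries may differ slightly from those quoted in the text.

```python
import math

# Energy (pJ/flit) and simulation time (s) copied from Table 3.
ENERGY = {"random": 34.56, "NMAP": 26.43, "BB": 29.90, "SA": 30.01}
SIM_TIME = {"random": 14, "NMAP": 14, "BB": 16, "SA": 150}


def saving_percent(baseline, value):
    """Percentage reduction of `value` relative to `baseline`."""
    return 100.0 * (baseline - value) / baseline


# Energy saving of each algorithm relative to random mapping.
energy_saving = {
    name: saving_percent(ENERGY["random"], energy)
    for name, energy in ENERGY.items()
    if name != "random"
}

# How many times longer SA runs than BB (both are NOCMAP algorithms).
sa_slowdown = SIM_TIME["SA"] / SIM_TIME["BB"]

# Assumption for illustration: 16 cores placed on 16 tiles (a 4x4 mesh).
# An exhaustive search would have to rank 16! distinct placements.
search_space = math.factorial(16)

for name, pct in energy_saving.items():
    print(f"{name}: {pct:.2f}% less energy than random mapping")
print(f"SA takes {sa_slowdown:.1f}x the simulation time of BB")
print(f"Placements of 16 cores on 16 tiles: {search_space:,}")
```

The factorial growth of the placement count is one way to see why slow search heuristics become impractical as the application grows.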

Table 4 VOPD application performance comparison

Performance parameters            NoCTweak random mapping   NoCTweak NMAP mapping
average latency, cycles           21.81                     17.01
average throughput, flits/cycle   0.048                     0.047
total power, mW                   26.30                     19.76
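
As a cross-check, the signed percentage change of each Table 4 metric when moving from random mapping to NMAP can be recomputed as sketched below. The names are hypothetical, and the inputs are the rounded table entries. With these inputs the power change comes out near 24.87%, marginally below the 24.89% quoted in the text, possibly because the quoted figure was derived from unrounded data.

```python
# Values copied from Table 4 (VOPD application, NoCTweak).
RANDOM = {
    "latency_cycles": 21.81,
    "throughput_flits_per_cycle": 0.048,
    "power_mW": 26.30,
}
NMAP = {
    "latency_cycles": 17.01,
    "throughput_flits_per_cycle": 0.047,
    "power_mW": 19.76,
}


def percent_change(old, new):
    """Signed percentage change from `old` to `new` (negative = decrease)."""
    return 100.0 * (new - old) / old


# Change of every metric when random mapping is replaced by NMAP.
changes = {
    metric: percent_change(RANDOM[metric], NMAP[metric]) for metric in RANDOM
}

for metric, pct in changes.items():
    print(f"{metric}: {pct:+.2f}% (NMAP vs random)")
```

A decrease is an improvement for latency and power, but a loss for throughput.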

generate different application mappings (Figs. 6a–d) owing to their diverse design structures and convergence times, the performance parameters differ from each other even under similar configurations. The NMAP algorithm yields the best application mapping for the performance parameters among the algorithms compared. This efficiency arises because NMAP has a better algorithm and design parameters for mapping applications onto network nodes.
The results and analysis of NoC tools reveal that an efficient mapping algorithm should be embedded when designing a complete NoC simulator for embedded system applications. Synthetic traffic generators alone do not predict the real behaviour of NoC embedded systems. The mapping of a task onto a specific network tile has a great impact on performance parameters such as latency, power, throughput and energy consumption of the network. Therefore, real-world applications should be properly scheduled and mapped onto a specific topology before physical implementation.

6 Conclusion
This paper tries to pinpoint an important aspect of NoC research by analysing some renowned tools used for the simulation and comparison of NoC platforms. A comparative analysis of a few open-source NoC simulators is also carried out to identify the strong and weak points of these simulators with respect to NoC performance parameters such as latency, throughput, power and energy consumption. Most of these simulators support the 2D mesh topology, the XY routing algorithm, the wormhole switching mechanism and synthetic traffic patterns. Performance parameters such as energy/power, throughput and latency are compared with respect to the offered load (flit injection rate). Some NoC simulators also support area estimation (ORION) and application mapping optimisation (NOCMAP, NoCTweak). The simulators and tools available for NoC are not mature, which creates a big standardisation challenge for the research community in this important field of study. To accommodate most of the aforementioned performance parameters, generalised, accurate and precise simulators and tools are mandatory for modern embedded system designs. Future NoC research will mainly focus on optical, wireless and 3D NoC designs; efficient simulators and tools will therefore be an essential part of these designs.
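
To make the link between task placement and network cost concrete, the following minimal sketch scores a mapping on a 2D mesh under XY routing as the sum of bandwidth times hop count over all communicating task pairs. Everything in it is an assumption for illustration: the 2x2 mesh, the four-task graph with its bandwidth weights, the cost function and the function names are hypothetical, and none of them is taken from NOCMAP, NoCTweak or the VOPD benchmark.

```python
from itertools import permutations

MESH_W, MESH_H = 2, 2  # hypothetical 2x2 mesh, tiles numbered row by row

# Hypothetical task graph: (source task, destination task, bandwidth).
EDGES = [(0, 1, 100), (1, 2, 50), (2, 3, 10)]


def tile_xy(tile):
    """Convert a tile index into (x, y) mesh coordinates."""
    return tile % MESH_W, tile // MESH_W


def xy_route(src, dst):
    """Tiles visited by dimension-ordered XY routing: X first, then Y."""
    x, y = tile_xy(src)
    dst_x, dst_y = tile_xy(dst)
    path = [(x, y)]
    while x != dst_x:
        x += 1 if dst_x > x else -1
        path.append((x, y))
    while y != dst_y:
        y += 1 if dst_y > y else -1
        path.append((x, y))
    return path


def mapping_cost(edges, mapping):
    """Sum of bandwidth x hop count; `mapping` is a task -> tile dict."""
    return sum(
        bw * (len(xy_route(mapping[src], mapping[dst])) - 1)
        for src, dst, bw in edges
    )


def best_mapping(edges, n_tasks):
    """Exhaustive search over all placements (only viable for tiny meshes)."""
    tiles = range(MESH_W * MESH_H)
    candidates = (dict(enumerate(p)) for p in permutations(tiles, n_tasks))
    return min(candidates, key=lambda m: mapping_cost(edges, m))


poor = {0: 0, 1: 3, 2: 1, 3: 2}  # heavy traffic placed on diagonal tiles
good = best_mapping(EDGES, 4)
print("poor mapping cost:", mapping_cost(EDGES, poor))
print("best mapping cost:", mapping_cost(EDGES, good))
```

Exhaustive search is used here only because the example is tiny; heuristics such as those compared in this study exist precisely because this enumeration does not scale.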

Fig. 6 Comparison of VOPD application mapping of NOCMAP and NoCTweak
(a) NOCMAP-BB mapping, (b) NOCMAP-SA mapping, (c) NoCTweak-NMAP mapping, (d) NoCTweak-random mapping

7 References
[1] Benini, L., De Micheli, G.: ‘Networks on chips: a new SoC paradigm’, IEEE Comput. Soc., 2002, 35, (1), pp. 70–78
[2] Dally, W.J., Towles, B.: ‘Route packets, not wires, on-chip interconnection networks’. Proc. 38th DAC, 2001, pp. 684–689
[3] Choudhary, N.: ‘Network-on-chip: a new SoC communication infrastructure paradigm’, Int. J. Soft Comput. Eng., 2012, 1, (6), pp. 332–335
[4] Mineo, C., Davis, W.R.: ‘The benefits of 3D networks-on-chip as shown with LDPC decoding’. IEEE Int. Conf. 3D System Integration, 2009. 3DIC 2009, pp. 1–8
[5] Jantsch, A., Tenhunen, H.: ‘Networks on chip’ (Kluwer Academic Publishers, 2003)
[6] Gulzari, U.A., Anjum, S., Torres, F.S., et al.: ‘A new cross-by-pass-torus architecture based on CBP-mesh and torus interconnection for on-chip communication’, PLoS ONE December 1, 2016, 11, (12), pp. 1–18, e0167590
[7] Gulzari, U.A., Khan, S., Anjum, S., et al.: ‘An efficient and scalable cross-by-pass-mesh architecture for on-chip communication’, IET Comput. Digit. Tech., 2017, (11), pp. 140–148
[8] Atienza, D., Angiolini, F., Murali, S., et al.: ‘Network-on-chip design and synthesis outlook’, VLSI J., 2008, 41, pp. 340–359
[9] Tsai, W.C., Lan, Y.C., Hu, Y.H., et al.: ‘Networks on chips: structure and design methodologies’, Hindawi J. Electric. Comput. Eng., 2012, 2012, Article ID 509465, 15 pages
[10] Sahu, S., Kittur, H.M.: ‘Area and power efficient network-on-chip router architecture’. IEEE Conf. Information and Communication Technologies (ICT), 2013, pp. 855–859
[11] Gehlot, P., Singh Chouhan, S.: ‘Performance evaluation of network-on-chip architectures’. Int. Conf. Emerging Trends in Electronic and Photonic Devices and Systems (ELECTRO-2009)
[12] Kahng, A.B.: ‘ORION 2.0: a power-area simulator for interconnection networks’, IEEE Trans. VLSI Syst., 2012, 20, (1), p. 191
[13] Kahng, A.B., Lin, B., Nath, S.: ‘ORION 3.0: a comprehensive NoC router estimation tool’, IEEE Embedded Syst., 2015, 7, (2), pp. 41–45
[14] Jain, L., Al-Hashimi, B., Gaur, M., et al.: ‘NIRGAM: a simulator for NoC interconnect routing and application modeling’. Workshop on Diagnostic Services in Network-on-Chips, DATE, 2007, pp. 16–20
[15] Fazzino, F., Palesi, M., Patti, D.: ‘Noxim: network-on-chip simulator’, 2008
[16] Catania, V., Mineo, A., Palesi, M., et al.: ‘Cycle-accurate network-on-chip simulation with Noxim’, ACM Trans. Model. Comput. Simul., 2016, 27, (1), Article 4, 25 pages
[17] Tran, A.T., Baas, B.M.: ‘NoCTweak: a highly parameterizable simulator for early exploration of performance and energy of networks on chip’. Technical Report, VLSI Computation Lab, ECE Department, UC Davis, July, 2012
[18] Jiang, N.: ‘BookSim 2.0 user's guide’, May 7, 2013
[19] Lu, Z.: ‘NNSE: Nostrum network-on-chip simulation environment’. Swedish, System on Chip, 2005
[20] Murali, S., De Micheli, G.: ‘SUNMAP: a tool for automatic topology selection and generation for NoCs’. Proc. 41st Design Automation Conf., 2004, pp. 914–919
[21] Jueping, C., Gang, H., Shaoli, W.: ‘OPNEC-Sim: an efficient simulation tool for network-on-Chip communication and energy performance analysis’. 10th IEEE Int. Conf. Solid-State and Integrated Circuit Technology (ICSICT), 2010, pp. 1892–1894
[22] Hossain, H., Ahmed, M., Al-Nayeem, A., et al.: ‘Gpnocsim – a general purpose simulator for network-on-chip’. Int. Conf. Information and Communication Technology, 2007. ICICT ’07, Dhaka, 2008
[23] Lis, M., Shim, K., Cho, M., et al.: ‘DARSIM: a parallel cycle-level NoC simulator’. 6th Annual Workshop on Modeling, Benchmarking and Simulation, Saint Malo, France, June 2010, pp. 1–10
[24] Chan, J., Parameswaran, S.: ‘NoCGEN: a template based reuse methodology for networks on chip architecture’. IEEE 17th Int. Conf. VLSI Design, 2004, pp. 717–720
[25] Cong, J., Gururaj, K., Han, G., et al.: ‘MC-Sim: an efficient simulation tool for MPSoC designs’. IEEE/ACM Int. Conf. Computer-Aided Design (ICCAD), 2008, pp. 364–371
[26] Lv, M., Guo, Y., Guan, N., et al.: ‘RTNoc: a simulation tool for real-Time communication scheduling on networks-on-Chips’. Int. Conf. Computer Science and Software Engineering, 2008, vol. 4, pp. 102–105
[27] Hu, J., Marculescu, R.: ‘Energy-aware mapping for tile-based NoC architectures under performance constraints’. Asia and South Pacific Design Automation Conf., 2003, pp. 233–239
[28] Tutsch, D., Lüdtke, D., Walter, A., et al.: ‘CINSim – a component based interconnection network simulator for modeling dynamic reconfiguration’. Proc. 12th Int. Conf. ASMTA, 2005, pp. 132–137
[29] Anjum, S., Munir, E.U.: ‘Simulation and performance evaluation of network-on-chip architectures and algorithms using CINSIM’, J. Basic Appl. Sci. Res., 2011, 1, (10), pp. 1594–1602
[30] Ning, W.: ‘Simulation and performance analysis of network-on-chip architectures using OPNET’, 1-4244-1132-7/07 © 2007 IEEE
[31] Ben-Itzhak, Y., Zahavi, E., Cidon, I., et al.: ‘NoCs simulation framework for OMNeT++’. Fifth IEEE/ACM Int. Symp. Networks on Chip (NoCS), 2011, pp. 265–266
[32] Kourdy, R., Yazdanpanah, S., Rad, M.R.N.: ‘Using the NS-2 network simulator for evaluating multi-Protocol label switching in network-on-Chip’. Second Int. Conf. Computer Research and Development, 2010, pp. 795–799
[33] Marculescu, R., Ogras, U.Y., Peh, L.S., et al.: ‘Outstanding research problems in NoC design: systems, micro architecture, and circuit perspectives’, IEEE Trans. Computer-Aided Des. Integr. Circ. Syst., 2009, 28, (1), pp. 03–21
[34] Agarwal, A., Iskander, C., Shankar, R.: ‘Survey of network-on-chip (NoC) architectures & contributions’, J. Eng. Comput. Arch., 2009, 3, (1), pp. 1–15
[35] Abbas, A.: ‘A survey on energy-efficient methodologies and architectures of network-on-chip’, Comput. Electric. Eng. J., 2014, 40, (8), pp. 333–347
[36] Kumar Sahu, P., Chattopadhyay, S.: ‘A survey on application mapping strategies for network-on-chip design’, J. Syst. Archit., 2013, 59, pp. 60–76

[37] Ben Achballah, A., Ben Saoud, S.: ‘A survey of network-on-chip tools’, Int. J. Adv. Comput. Sci. Applic., 2013, 4, (9), p. 61
[38] Neuenhahn, M.C., Schleifer, J., Blume, H., et al.: ‘Quantitative comparison of performance analysis techniques for modular and generic network-on-chip’, Adv. Radio Sci., 2009, 7, pp. 107–112
[39] Alalaki, M.S., Agyeman, M.O.: ‘A study of recent contribution on simulation tools for network-on-chip’, Int. J. Comput. Electric. Autom. Control Inf. Eng., 2017, 11, (4), pp. 33–37
[40] Qian, Z., Bogdan, P., Tsui, C.-Y., et al.: ‘Performance evaluation of NoC-based multicore systems: from traffic analysis to NoC latency modelling’, ACM Trans. Design Autom. Electron. Syst., 2016, 21, (3), pp. 1–38
[41] Ben-Itzhak, Y., Zahavi, E., Cidon, I., et al.: ‘HNOCS: modular open-source simulator for heterogeneous NoCs’. Embedded Computer Systems (SAMOS), 2012 Int. Conf. Samos, 2013
[42] Fernandez-Alonso, E., Castells-Rufas, D., Joven, J.: ‘Survey of NoC and programming models proposals for MPSoC’, IJCSI Int. J. Comput. Sci. Issues, 2012, 9, (2), No 3, pp. 22–32
[43] Amoretti, M.: ‘Modeling and simulation of network-on-chip systems with DEVS and DEUS’, Hindawi Sci. World J., 2014, 2014, Article ID 982569, pp. 1–9
[44] Dahir, N.S., Mak, T., Xia, F., et al.: ‘Modeling and tools for power supply variations analysis in networks-on-chip’, IEEE Trans. Comput., 2014, 63, (3), pp. 679–690
[45] Liu, W., Xu, J., Wu, X., et al.: ‘A NoC traffic suite based on real applications’. IEEE Computer Society Annual Symp. VLSI (ISVLSI), 2011, pp. 66–71
[46] Ghosh, D., Ghosal, P., Mohanty, S.P.: ‘A highly parameterizable simulator for performance analysis of NoC architectures’. Int. Conf. Information Technology (ICIT), 2014, pp. 311–315
[47] Onizawa, N., Funazaki, T., Matsumoto, A., et al.: ‘Asynchronous network-on-chip simulation based on a delay-aware mode’. IEEE Computer Society Annual Symp. VLSI, 2010, pp. 357–362
[48] Genko, N., Atienza, D., De Micheli, G., et al.: ‘Feature – NoC emulation: a tool and design flow for MPSoC’, IEEE Circuits Syst. Mag., 2007, 7, (4), pp. 42–51
[49] Ogras, U.Y., Marculescu, R.: ‘“It's a small world after all”: NoC performance optimization via long-range link insertion’, IEEE Trans. Very Large Scale Integr. Syst., 2006, 14, (7), pp. 693–706
[50] Hu, J.: ‘Energy and performance aware mapping for regular NoC architectures’, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2005, 24, (4), pp. 551–562
