IET Computers Digital Tech - 2017 - Khan - Comparative Analysis of Network‐on‐Chip Simulation Tools
Research Article
Sarzamin Khan1, Sheraz Anjum2, Usman Ali Gulzari3, Frank Sill Torres4
1Department of Electrical Engineering, COMSATS Institute of Information Technology, Wah Cantt, Pakistan
2Department of Computer Science, COMSATS Institute of Information Technology, Wah Cantt, Pakistan
3Department of Electrical Engineering, COMSATS Institute of Information Technology, Islamabad, Pakistan
4Department of Electronic Engineering, Federal University of Minas Gerais, Belo Horizonte, Brazil
E-mail: [email protected]
Abstract: Network-on-chip (NoC) is a reliable and scalable communication paradigm deemed as an alternative to classic bus
systems in modern systems-on-chip designs. Consequently, one can observe extensive multidimensional research related to
the design and implementation of NoC-based systems. A basic requirement for most of these activities is the availability of NoC
simulators that enable the study and comparison of different technologies. This study analyses different NoC
simulators and highlights their contributions towards NoC research. Various NoC tools such as NoCTweak, Noxim, Nirgam,
Nostrum, BookSim, WormSim, NOCMAP and ORION are evaluated and their strengths and weaknesses are highlighted. The
comparative analysis includes methods for estimation of latency, throughput and energy consumption. Further, an exemplary
real-world application, the video object plane decoder, is mapped on a 2D mesh NoC using different mapping algorithms under
the NOCMAP and NoCTweak simulators for comparative analysis of the NoC simulators and their embedded mapping algorithms.
PEs traverse the network. Routing may be deterministic or adaptive. A good routing algorithm will balance the traffic load across the network and should avoid deadlock and live-lock. Flow control is the allocation of communication resources for guaranteed and reliable transmission of packets between the PEs. In other words, flow control allocates resources such as buffers and link bandwidth to the messages as they traverse the network [8–10].
NoCs are generally evaluated with respect to their performance parameters, such as energy consumption, area, communication bandwidth, throughput and latency [11]. Several simulators have been designed in order to evaluate these parameters and predict the system's performance prior to the design implementation [12–26]. Some simulators, like ORION [12, 13], focus only on the performance of individual components, e.g. router power and area, while others, such as Noxim, Nirgam and NoCTweak [14–17], are designed for measurement of the entire network performance. The ORION simulator is used as a standard model for estimating the energy of a router and a link of the network. It is usually embedded in other network simulators (e.g. Noxim, Nirgam) to calculate the energy of the entire network. Other energy models, such as the CMOS standard library model [17] and the bit energy model [27], are also used for estimation of the network energy consumption. As NoC inherits many features from general computer networks, general network simulators can also be used for NoC system simulations [28–32]. Bearing in mind the rising number of different simulators, this work is an attempt to provide a road-map that facilitates the selection of appropriate tools.
This paper is organised as follows: Section 2 presents related work on NoC simulators. Section 3 briefly describes the representation of NoC applications and Section 4 relates to the comparative analysis of NoC simulators. Simulation results are evaluated in Section 5 and concluding remarks are presented in Section 6.

2 Related work
Current research is mostly involved in topology design, router design, routing protocols, mapping and scheduling techniques [33]. Only limited study is available on NoC tool comparisons and surveys [34–42]. The work presented in [37] discusses some NoC proposals and contributions regarding their attainable capabilities. The author has collected a few NoC tools from the literature and presented a short description of the characteristics of the simulators. A comprehensive study is carried out on NoC concepts, but the survey lacks a comparative analysis of the NoC performance measurements.
In the research work of Neuenhahn [38], static and dynamic performance analysis of NoC is performed and evaluated at different abstraction levels. The static performance analysis included timing models, while the dynamic performance models are composed of FPGA, VHDL, Colored Petri Nets and SystemC-based simulations. Comparative analysis of the different modelling techniques showed significant deviations in terms of the performance parameters of the simulators. The author has presented a quantitative analysis of the tools regarding their structure, but focused only on general aspects of the NoC tools.
The study in [39] presented an overview of the NoC concepts and simulation tools. Furthermore, the recent contributions on simulation tools are highlighted. The survey presented in [40] reviewed different approaches for NoC traffic models and performance evaluation. In that work, the evaluation-based analytical models and the design challenges of the NoC simulators are discussed, but only a descriptive study of the NoC tools is presented. The author of the research work in [41] presented a short comparison of the HNOCS simulator with a few NoC simulators and highlighted the contribution of HNOCS to NoC research. The work presented in [42] collected numerous programming models and proposals for MPSoC but did not discuss dedicated NoC simulators. The availability of 3D, optical and wireless NoC simulators is also limited, because this is a new research area for the research community. It is an open problem for future NoC research to design specific simulators for NoC designs.

3 Representation of NoC applications
NoC systems are generally represented by characterisation graphs. Different simulators extract data from these graphs for performance simulations. An application can be represented by a network task graph (NTG), which is subsequently scheduled on the available IPs through a network core graph (NCG). The NCG is then transformed and mapped onto the NoC packet-switched network architecture through a NoC architecture graph (NAG). In this work, these characterisation graphs are used to input data to the NoC simulators for comparative analysis and are briefly discussed in the following sections.

3.1 Network task graph
An NTG is a directed acyclic graph, Gr = Gr(T, C), in which each vertex characterises a task (t_i ∈ T, i = 1, 2, 3, …) for the computational resource of the application and represents information such as execution time, energy consumption and task deadlines, as shown in Fig. 3a. The directed arc (c_{i,j} ∈ C, i = 1, 2, 3, …, j = 1, 2, 3, …) represents either data or dependent information between processing tasks (t_i and t_j). The arc (c_{i,j}) is associated with a value (v(c_{i,j})), which characterises the communicating data that are exchanged between the processing tasks. A real application is usually represented by an NTG, which provides necessary information about the application for processing and simulations, as shown in Fig. 3b.

3.2 Network core graph
An NCG, Gr′ = Gr′(P, A), is a directed graph in which each vertex (p_i ∈ P) represents a PE, while the directed arc (a_{i,j} ∈ A) shows characteristic parameters between the PEs (p_i to p_j) as
shown in Fig. 3c. An arc (a_{i,j}) is associated with communication information, e.g. rate and volume, of the communicating data and may have system design constraints, e.g. required latency and data bandwidth. The NCG intrinsically represents the scheduling of the tasks (T) on the available PEs (P) for processing. When P = T or P > T, a single task can be scheduled on an individual PE, but when P < T, two or more tasks should be scheduled on a single PE. For this purpose, a scheduler is required before performance simulations.

3.3 NoC architecture graph
NAG, A = A(R, H), is an architecture graph in which each vertex (r_i ∈ R) is a router (node) in the graph, while a directed arc (h_{i,j} ∈ H) indicates the bidirectional routing channel between the routers (r_i) and (r_j), as shown in Fig. 3d. Arc (h_{i,j}) is associated with M_{i,j} (the set of minimum paths from t_i to t_j) and L(m_{i,j}), which represents the set of all links associated with m_{i,j}. The cost of the arc is represented by e(h_{i,j}), which shows the average energy consumption (joule) of sending data between network tiles (t_i and t_j), e.g. E_{i,j} (bit). The NAG represents the applications mapped on NoC tiles, which is then simulated on a specific tool for performance measurements.
To evaluate the potential of NoC simulators for mapping a real application on a NAG, the video object plane decoder (VOPD) application is considered in this research work. The NTG of VOPD consists of 16 distinct tasks, as shown in Fig. 3b. In this case, we assume that the number of tasks equals the number of available processors, therefore scheduling is not required. The NTG and NCG have a one-to-one mapping, so we directly input the NTG to NoCTweak and NOCMAP for simulation, as discussed in Section 5.2.

4 NoC simulators
A common challenge in selecting the right NoC simulator is that the available tools are usually strong in certain measurements and have deficits in others. NoC simulators can be divided into two broad categories:

(1) General network simulators that can be used for NoC simulations (e.g. NS2, NS3, Omnet++, Wattch, Hotspot, Netsim, Gem5, Graphite, Hornet, Opnet, Fusionsim, Esece) [28–32].
(2) Specific NoC simulators, which are explicitly designed for NoC simulation (e.g. BookSim, HNOCS, WormSim, Ocsim, Vnoc, Matrics, SICOSY, Tpzsimul, Garnet, SUNMAP, Ocintsim, Noxim, Nostrum, Nirgam, Occn, Nocsim, NoCTweak, Atlas, Gpnocsim, Xmulator, NOCMAP, ReliableNoC, MapoNoC, Phoenixsim, Access Noxim and ORION) [12–26, 43–48] (Table 1).
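
The simulators listed above take the characterisation graphs of Section 3 as their input. The following Python sketch is illustrative only and is not taken from any of these tools. It encodes a small hypothetical core graph, applies the P versus T scheduling rule of Section 3.2, and scores two mappings onto a 2D mesh with a simplified cost (volume × XY hop count × a per-bit energy `e_bit`). The edge volumes, the function names and the flat per-hop energy are all assumptions made for illustration, and the cost is a simplification of the energy models cited in the text.

```python
import math

# Hypothetical core graph (NCG): (source PE, destination PE) -> data volume.
# The volumes are made up for illustration; they are not VOPD figures.
ncg = {(0, 1): 70, (1, 2): 362, (2, 3): 27, (0, 3): 16}


def needs_scheduler(num_tasks, num_pes):
    """Section 3.2 rule: tasks must share PEs only when P < T."""
    return num_pes < num_tasks


def mesh_side(num_pes):
    """Side length of the smallest square 2D mesh with at least num_pes tiles."""
    return math.isqrt(num_pes - 1) + 1 if num_pes > 0 else 0


def xy_hops(src_tile, dst_tile, side):
    """Hop count under XY (dimension-ordered) routing: the Manhattan distance."""
    sx, sy = src_tile % side, src_tile // side
    dx, dy = dst_tile % side, dst_tile // side
    return abs(sx - dx) + abs(sy - dy)


def mapping_energy(graph, mapping, side, e_bit=1.0):
    """Simplified communication cost: sum of volume * hops * e_bit over all arcs."""
    return sum(
        volume * xy_hops(mapping[src], mapping[dst], side) * e_bit
        for (src, dst), volume in graph.items()
    )


side = mesh_side(4)                     # four PEs fit on a 2 x 2 mesh
identity = {0: 0, 1: 1, 2: 2, 3: 3}     # PE i placed on tile i
swapped = {0: 0, 1: 1, 2: 3, 3: 2}      # PEs 2 and 3 exchange tiles

print(needs_scheduler(num_tasks=16, num_pes=16))
print(mapping_energy(ncg, identity, side), mapping_energy(ncg, swapped, side))
```

With these made-up volumes, the second mapping places the heavily communicating PEs on adjacent tiles and therefore yields the lower cost, which is the effect that mapping algorithms such as NMAP, BB and SA try to exploit.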
NOCMAP simulator which has the addition of reliability parameter in NOCMAP simulator.

Fig. 4 Generic NoC simulator

4.7 ORION
Architectural power estimation is an important factor in designing NoC-based systems. ORION is a fast and accurate architectural power model for router power and area simulation of NoC [12]. ORION 3.0 [13] is its latest release after ORION 1.0 and ORION 2.0. ORION 2.0 and ORION 3.0 add functionality and produce performance parameters that are closest to those of actual NoC designs. A comparison of ORION 2.0 simulations with the actual power consumption of the link, FIFO, clock, arbiter and crossbar of the Intel 80-core chip closely verifies the simulation results of the ORION 2.0 simulator.

5 Comparative simulations
To carry out a comparative analysis of NoC tools, a generic configuration setup is described in order to evaluate each simulator on the basis of its performance metrics. The available simulators do not have a common approach for input/output parameter selection and also have different simulation models, so it is difficult to precisely scrutinise each simulator. Therefore, a common configuration setup is selected in such a way that it has the same input parameters for comparison. The parameters in Table 2 are used for analysis of the tools.
Most of the available simulators support 2D mesh, synthetic traffic patterns, the wormhole switching mechanism, buffer depth availability and the XY routing algorithm. Throughput, latency and power are the output parameters of these simulators. In this research work, the results of different simulators are compared on the basis of a unified configuration setup for a 5 × 5 2D mesh. The input parameters and the design structure of most of the simulators are different, so it is difficult to precisely compare all the simulators on a unified configuration setup. We therefore compare some of these simulators using the most uniform setup achievable. A 2D mesh of size 5 × 5 is selected, because most of the
simulators support the 2D mesh topology, and the size 5 × 5 is chosen to minimise simulation time.

5.1 Synthetic benchmark
For comparison of the NoC simulators, a synthetic benchmark is selected because some simulators, such as Noxim and CinSim, support only synthetic traffic patterns. Injected traffic is limited to 0.7 flits/cycle/node in order to avoid network traffic congestion.
In the latency analysis, Noxim and Nirgam show almost the same latency trend with injected traffic (flit injection rate), as shown in Fig. 5a. NoCTweak and CinSim have a non-linear response at low traffic, but when the flit injection rate exceeds 0.5 flits/cycle/node, their response is comparable with that of Noxim and Nirgam. This may be because NoCTweak and CinSim require some extra parameters for analysis, which are limited in Noxim and Nirgam. Variations in the results are also due to the unavailability of an exact and uniform embedded platform in these simulators. The design structures of these simulators also differ from each other.
In the throughput calculations (Fig. 5b), variation in the results of NoCTweak is observed at the start, but the response closely matches that of Noxim and CinSim at a slightly higher injection rate. Similarly, in the network energy analysis, NoCTweak and Noxim show a minor difference in results for the same injected traffic, as shown in Fig. 5c. This is because Noxim uses the ORION energy model [13, 14] while NoCTweak utilises the CMOS standard cell library model [17]. In the network power calculations (Fig. 5d), the results of NoCTweak and Nirgam are again comparable and have almost the same trend at a high injection rate.

Fig. 5 Continued

5.2 Real application benchmark
NoCTweak and NOCMAP are able to map a real application onto NoC tiles for energy optimisation. A VOPD (Fig. 3b) is simulated on these simulators for energy comparison. VOPD has 16 tasks and therefore requires a 4 × 4 2D mesh for implementation. NoCTweak has built-in random and NMAP mapping algorithms, while NOCMAP utilises the BB and SA algorithms. In NoCTweak, mapping through the NMAP algorithm gives a 23.52% improvement in energy compared with the random algorithm, as shown in Table 3.
Also, the BB algorithm in NOCMAP gives a 1% improvement in energy over the SA algorithm. The simulation times of the NMAP, SA, BB and random algorithms of the NoCTweak and NOCMAP simulators are also shown in Table 3. The SA algorithm of NOCMAP is the slowest of all the algorithms. As application mapping is an NP-hard problem, the SA algorithm in NOCMAP is not feasible for mapping large applications.
A 24.89% improvement in power, a 2% reduction in throughput and a 22% improvement in latency are also observed for the NMAP algorithm when compared with random mapping of the VOPD application (Table 4). The mapping results of the BB, SA, random and NMAP algorithms are shown in Fig. 6. As different algorithms generate different application mappings (Figs. 6a–d) due to their diverse design structure and convergence time, the performance parameters differ from each other even under similar configurations. The NMAP algorithm produces the best application mapping for the performance parameters among all the algorithms. This efficiency is attributed to its better algorithm and design parameters for mapping applications onto network nodes.
The results and analysis of the NoC tools indicate that an efficient mapping algorithm should be embedded when designing a complete NoC simulator for embedded system applications. Synthetic traffic generators alone do not predict the real behaviour of NoC embedded systems. The mapping of a task onto a specific network tile has a great impact on performance parameters such as latency, power, throughput and energy consumption of the network. Therefore, real-world applications should be properly scheduled and mapped onto a specific topology before physical implementation.

6 Conclusion
This paper tries to pinpoint an important aspect of NoC research by analysing some renowned tools used for the simulation and comparison of NoC platforms. A comparative analysis of a few open-source NoC simulators is also carried out to identify the strong and weak points of these simulators with respect to NoC performance parameters such as latency, throughput, power and energy consumption. Most of these simulators support the 2D mesh topology, the XY routing algorithm, the wormhole switching mechanism and synthetic traffic patterns. Performance parameters such as energy/power, throughput and latency are compared with respect to the offered load (flit injection rate). Some NoC simulators also support area (ORION) and application mapping optimisation (NOCMAP, NoCTweak). The simulators and tools available for NoC are not mature, which creates a big standardisation challenge for the research community in this important field of study. To accommodate most of the afore-mentioned performance parameters, generalised, accurate and precise simulators and tools are mandatory for modern embedded system designs. Future NoC research will mainly focus on optical, wireless and 3D NoC designs, therefore efficient simulators and tools will be an essential part of these designs.
7 References
[1] Benini, L., De Micheli, G.: ‘Networks on chips: a new SoC paradigm’, IEEE Comput. Soc., 2002, 35, (1), pp. 70–78
[2] Dally, W.J., Towles, B.: ‘Route packets, not wires, on-chip interconnection networks’. Proc. 38th DAC, 2001, pp. 684–689
[3] Choudhary, N.: ‘Network-on-chip: a new SoC communication infrastructure paradigm’, Int. J. Soft Comput. Eng., 2012, 1, (6), pp. 332–335
[4] Mineo, C., Davis, W.R.: ‘The benefits of 3D networks-on-chip as shown with LDPC decoding’. IEEE Int. Conf. 3D System Integration, 2009. 3DIC 2009, pp. 1–8
[5] Jantsch, A., Tenhunen, H.: ‘Networks on chip’ (Kluwer Academic Publishers, 2003)
[6] Gulzari, U.A., Anjum, S., Torres, F.S., et al.: ‘A new cross-by-pass-torus architecture based on CBP-mesh and torus interconnection for on-chip communication’, PLoS ONE December 1, 2016, 11, (12), pp. 1–18, e0167590
[7] Gulzari, U.A., Khan, S., Anjum, S., et al.: ‘An efficient and scalable cross-by-pass-mesh architecture for on-chip communication’, IET Comput. Digit. Tech., 2017, (11), pp. 140–148
[8] Atienza, D., Angiolini, F., Murali, S., et al.: ‘Network-on-chip design and synthesis outlook’, VLSI J., 2008, 41, pp. 340–359
[9] Tsai, W.C., Lan, Y.C., Hu, Y.H., et al.: ‘Networks on chips: structure and design methodologies’, Hindawi J. Electric. Comput. Eng., 2012, 2012, Article ID 509465, 15 pages
[10] Sahu, S., Kittur, H.M.: ‘Area and power efficient network-on-chip router architecture’. IEEE Conf. Information and Communication Technologies (ICT), 2013, pp. 855–859
[11] Gehlot, P., Singh Chouhan, S.: ‘Performance evaluation of network-on-chip architectures’. Int. Conf. Emerging Trends in Electronic and Photonic Devices and Systems (ELECTRO-2009)
[12] Kahng, A.B.: ‘ORION 2.0: a power-area simulator for interconnection networks’, IEEE Trans. VLSI Syst., 2012, 20, (1), p. 191
[13] Kahng, A.B., Lin, B., Nath, S.: ‘ORION 3.0: a comprehensive NoC router estimation tool’, IEEE Embedded Syst., 2015, 7, (2), pp. 41–45
[14] Jain, L., Al-Hashimi, B., Gaur, M., et al.: ‘NIRGAM: a simulator for NoC interconnect routing and application modeling’. Workshop on Diagnostic Services in Network-on-Chips, DATE, 2007, pp. 16–20
[15] Fazzino, F., Palesi, M., Patti, D.: ‘Noxim: network-on-chip simulator’, 2008
[16] Catania, V., Mineo, A., Palesi, M., et al.: ‘Cycle-accurate network-on-chip simulation with Noxim’, ACM Trans. Model. Comput. Simul., 2016, 27, (1), Article 4, 25 pages
[17] Tran, A.T., Baas, B.M.: ‘NoCTweak: a highly parameterizable simulator for early exploration of performance and energy of networks on chip’. Technical Report, VLSI Computation Lab, ECE Department, UC Davis, July, 2012
[18] Jiang, N.: ‘BookSim 2.0 user's guide’, May 7, 2013
[19] Lu, Z.: ‘NNSE: Nostrum network-on-chip simulation environment’. Swedish, System on Chip, 2005
[20] Murali, S., De Micheli, G.: ‘SUNMAP: a tool for automatic topology selection and generation for NoCs’. Proc. 41st Design Automation Conf., 2004, pp. 914–919
[21] Jueping, C., Gang, H., Shaoli, W.: ‘OPNEC-Sim: an efficient simulation tool for network-on-Chip communication and energy performance analysis’. 10th IEEE Int. Conf. Solid-State and Integrated Circuit Technology (ICSICT), 2010, pp. 1892–1894
[22] Hossain, H., Ahmed, M., Al-Nayeem, A., et al.: ‘Gpnocsim – a general purpose simulator for network-on-chip’. Int. Conf. Information and Communication Technology, 2007. ICICT ‘07. Dhaka, 2008
[23] Lis, M., Shim, K., Cho, M., et al.: ‘DARSIM: a parallel cycle-level NoC simulator’. 6th Annual Workshop on Modeling, Benchmarking and Simulation, Saint Malo, France, June 2010, pp. 1–10
[24] Chan, J., Parameswaran, S.: ‘NoCGEN: a template based reuse methodology for networks on chip architecture’. IEEE 17th Int. Conf. VLSI Design, 2004, pp. 717–720
[25] Cong, J., Gururaj, K., Han, G., et al.: ‘MC-Sim: an efficient simulation tool for MPSoC designs’. IEEE/ACM Int. Conf. Computer-Aided Design (ICCAD), 2008, pp. 364–371
[26] Lv, M., Guo, Y., Guan, N., et al.: ‘RTNoc: a simulation tool for real-Time communication scheduling on networks-on-Chips’. Int. Conf. Computer Science and Software Engineering, 2008, vol. 4, pp. 102–105
[27] Hu, J., Marculescu, R.: ‘Energy-aware mapping for tile-based NoC architectures under performance constraints’. Asia and South Pacific Design Automation Conf., 2003, pp. 233–239
[28] Tutsch, D., Lüdtke, D., Walter, A., et al.: ‘CINSim – a component based interconnection network simulator for modeling dynamic reconfiguration’. Proc. 12th Int. Conf. ASMTA, 2005, pp. 132–137
[29] Anjum, S., Munir, E.U.: ‘Simulation and performance evaluation of network-on-chip architectures and algorithms using CINSIM’, J. Basic Appl. Sci. Res., 2011, 1, (10), pp. 1594–1602
[30] Ning, W.: ‘Simulation and performance analysis of network-on-chip architectures using OPNET’, 1-4244-1132-7/07 © 2007 IEEE
[31] Ben-Itzhak, Y., Zahavi, E., Cidon, I., et al.: ‘NoCs simulation framework for OMNeT++’. Fifth IEEE/ACM Int. Symp. Networks on Chip (NoCS), 2011, pp. 265–266
[32] Kourdy, R., Yazdanpanah, S., Rad, M.R.N.: ‘Using the NS-2 network simulator for evaluating multi-Protocol label switching in network-on-Chip’. Second Int. Conf. Computer Research and Development, 2010, pp. 795–799
[33] Marculescu, R., Ogras, U.Y., Peh, L.S., et al.: ‘Outstanding research problems in NoC design: systems, micro architecture, and circuit perspectives’, IEEE Trans. Computer-Aided Des. Integr. Circ. Syst., 2009, 28, (1), pp. 03–21
[34] Agarwal, A., Iskander, C., Shankar, R.: ‘Survey of network-on-chip (NoC) architectures & contributions’, J. Eng. Comput. Arch., 2009, 3, (1), pp. 1–15
[35] Abbas, A.: ‘A survey on energy-efficient methodologies and architectures of network-on-chip’, Comput. Electric. Eng. J., 2014, 40, (8), pp. 333–347
[36] Kumar Sahu, P., Chattopadhyay, S.: ‘A survey on application mapping strategies for network-on-chip design’, J. Syst. Archit., 2013, 59, pp. 60–76