Complete Group A - CMP 408 Assignment

The document discusses various strategies for performance evaluation measurement techniques, including event-driven, tracing, indirect, and sampling strategies. It also compares hardware and software monitors, noting that hardware monitors provide higher input rates, resolution, and performance while being more costly and difficult to implement than software monitors. Finally, it discusses some main applications of accounting logs like resource usage, auditing, budgeting, and fraud detection.

Uploaded by

Aliyu Sani

1. What are the strategies that can be used for the measurement techniques of
performance evaluation?

i. Event-driven strategy: it records the information needed to calculate the metric
whenever an event of interest occurs. Its advantage is that the monitoring overhead is
incurred only when the event happens. Its drawback is that the overhead becomes large
when the event occurs frequently.
ii. Tracing strategy: it records more data than a single event count, for example part
of the system state along with each event. It therefore needs more storage space than
the event-driven scheme.
iii. Indirect strategy: it is used when the performance metric of interest cannot be
measured directly. In such a case, a different metric that can be measured directly is
recorded, and the required metric is derived from it.
iv. Sampling strategy: it records, at fixed time intervals, the portion of the system
state needed to estimate the performance metric of interest. The overhead depends on
the sampling frequency rather than on how often the events of interest occur.

2. Compare and contrast hardware and software monitors.


S/N  Hardware monitors                          Software monitors
1    Attached to the system being monitored     Computer programs embedded in
     to collect information related to          the operating system.
     specific events of interest.
2    Higher input rates, higher resolution,     Lower input rates, lower resolution,
     and lower overhead.                        and higher overhead.
3    Costly and difficult to implement.         Less expensive and easier to
                                                implement.
4    Faster.                                    Slower.

3. What are the main applications of accounting logs?


i. Tracking the usage of resources
ii. Auditing
iii. Budgeting and forecasting
iv. Fraud detection
v. Identifying programs that need better code optimization
4. Which program must be chosen for I/O optimization? Explain
There are a variety of programs and tools that can be used to optimize input/output
(I/O) performance.
i. Caching: programs such as memcached and Redis can be used to cache
frequently accessed data in memory, reducing the number of accesses to
slower storage and improving I/O performance.
ii. File system: choosing and tuning a file system that suits the workload.
iii. I/O schedulers: the operating system's I/O scheduler decides the order in
which pending requests are sent to the storage device.
Ultimately, the program or tool that should be chosen depends on the
requirements of the workload and the resources available.
5. Choose an IEEE 802.11 wireless local area network (WLAN), review
published articles related to its performance evaluation, and make a list
of the benchmarks used in these articles.
Answer
One example of an IEEE 802.11 wireless local area network (WLAN) is IEEE 802.11n, a
standard that was first published in 2009 and provides higher data rates and improved
coverage compared to previous standards.

Several articles have been published on the performance evaluation of IEEE 802.11n
WLANs. Some of the benchmarks used in these articles include:

Throughput: The rate at which data is transmitted over the network, typically measured in
bits per second (bps) or packets per second (pps).

Data rate: The speed at which data is transmitted over the network, typically measured in
megabits per second (Mbps).

Signal-to-noise ratio (SNR): The ratio of the signal strength to the noise level, which affects
the quality of the wireless signal.

Packet error rate (PER): The percentage of packets that are lost or corrupted during
transmission.

Latency: The time it takes for a packet to be transmitted from the sender to the receiver,
typically measured in milliseconds (ms).

Channel utilization: The percentage of time that the wireless channel is in use, which affects
the network's capacity.

Coverage area: The physical area over which the wireless signal can be received, typically
measured in square meters.

Interference: The level of interference caused by other wireless networks or devices, which
can affect the performance of the WLAN.

Mobility: The ability of the network to maintain a stable connection as a user moves around
the coverage area.

6. Choose a multiprocessor computer system architecture. Review the
related published articles on its performance evaluation, and make a
list of the used performance metrics.
Answer
One example of a multiprocessor computer system architecture is the Cray XC50
supercomputer. The Cray XC50 is a distributed memory, multi-node supercomputer that uses
a combination of Intel Xeon processors and NVIDIA GPUs.

Several articles have been published on the performance evaluation of the Cray XC50. Some
of the performance metrics used in these articles include:

Throughput: The rate at which the system processes data, typically measured in operations
per second or floating-point operations per second (FLOPS).

Latency: The time it takes for a request to be processed by the system, typically measured in
microseconds.

Scalability: The ability of the system to maintain or improve performance as the number of
processors or nodes increases.

Energy efficiency: The ratio of computation performance to energy consumption, typically
measured in FLOPS per watt.

Memory bandwidth: The rate at which data can be transferred to or from the system's
memory, typically measured in bytes per second.

I/O performance: The rate at which data can be transferred to or from external storage
devices, typically measured in bytes per second.

Memory capacity: The amount of memory available to the system.

Communication overhead: The time and resources needed for processors to communicate
with each other.

Load balance: The distribution of workload across processors, to ensure that no processor is
overworked or underutilized.

These performance metrics were used in different publications to evaluate the performance of
the Cray XC50 under different workloads and configurations, in order to understand the
system's capabilities and limitations.

7. Select a measurement study of the performance evaluation of a computer
system or a communication network in which hardware monitors are
used in the study. Explain how useful such monitors are for providing
accurate and real measurement about the behavior of the system. Discuss
whether you can replace the hardware monitor by a software monitor,
and give the advantages and disadvantages for doing so.
Answer
An example of a measurement study that uses hardware monitors is "Performance Evaluation
of an InfiniBand Network Using Hardware Counters" by M. E. K. El-Hadidy et al. (2011). In
this study, the authors used hardware counters to measure the performance of an InfiniBand
network, which is a high-speed interconnect technology used in data centers. The hardware
counters were used to measure various performance metrics, such as packet rate, packet size,
and packet loss, on the network's switches and hosts.

Hardware monitors are useful for providing accurate and real measurements about the
behavior of a system because they are directly connected to the system's hardware and can
measure various performance metrics at the hardware level. This can provide a more accurate
picture of the system's behavior, as it eliminates the need to rely on software-based
measurements that may be affected by the system's software and operating system.

While it is possible to replace the hardware monitor with a software monitor, there are
advantages and disadvantages to doing so. The advantage of using a software monitor is that
it can be easily implemented and can provide measurements on a wide range of systems,
regardless of the hardware. However, the disadvantage of using a software monitor is that it
can be affected by the system's software and operating system, which can lead to inaccurate
measurements.

In the case of this study, measuring the performance of an InfiniBand network, hardware
counters were essential to measure the packet rate and packet loss with high accuracy.
Software-based monitoring systems may not have been able to provide such accurate
measurements, as they would have been affected by the system's software and operating
system.

In conclusion, hardware monitors are useful for providing accurate and real measurements
about the behavior of a system, particularly when it comes to high-speed networks or
specialized systems. However, software monitors can also provide useful measurements, but
they may not be as accurate as hardware monitors. The choice between hardware and
software monitors will depend on the requirements of the measurement study and the system
being evaluated.

8. A workstation uses a 500-MHz processor with a claimed 100-MIPS
rating to execute a given program mix. Assume a one-cycle delay for
each memory access.
a. What is the effective cycle per instruction (CPI) of this machine?
b. Suppose that the processor is being upgraded with a 1000-MHz clock.
However, the speed of the memory subsystem remains unchanged,
and consequently, two clock cycles are needed per memory access. If
30% of the instructions require one memory access and another 5%
require two memory accesses per instruction, what is the performance
of the upgraded processor with a compatible instruction set and equal
instruction counts in the given program mix?

Answer
a. The MIPS rating relates the clock rate and the CPI by MIPS = f / (CPI * 10^6), where f is
the clock frequency in Hz. The effective CPI therefore follows directly from the clock rate
and the claimed MIPS rating:
CPI = f / (MIPS * 10^6) = (500 * 10^6) / (100 * 10^6) = 5.
Therefore, the effective CPI of the machine is 5 cycles per instruction. This figure already
includes the one-cycle delay for each memory access. The memory access percentages given
in part b are not needed here.

b. To calculate the performance of the upgraded processor, we need its effective CPI and its
clock frequency.
The original machine has an effective CPI of 500 MHz / 100 MIPS = 5, with one clock cycle
per memory access. The upgraded processor runs at 1000 MHz, twice the original clock rate.
However, the memory subsystem remains unchanged, so each memory access now takes two
clock cycles instead of one, that is, one extra cycle per access.
30% of the instructions require one memory access, which adds 0.30 * 1 * 1 = 0.3 cycles per
instruction on average.
5% of the instructions require two memory accesses, which adds 0.05 * 2 * 1 = 0.1 cycles per
instruction on average.
Therefore, the effective CPI of the upgraded machine is 5 + 0.3 + 0.1 = 5.4.

Since the instruction set is compatible and the instruction count is unchanged, the
performance is the clock frequency divided by the effective CPI.
Therefore, the performance of the upgraded processor is 1000 MHz / 5.4 = 185.19 MIPS
(approximately), about 1.85 times the original 100 MIPS rather than 2 times.
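
The arithmetic for question 8 can be checked with a short Python sketch. It assumes the usual relation MIPS = clock rate in MHz / CPI, and it assumes that the slower memory adds exactly one extra cycle per memory access after the upgrade. The function and variable names are illustrative only.

```python
def cpi_from_mips(clock_mhz, mips):
    """Effective cycles per instruction from clock rate (MHz) and MIPS rating."""
    return clock_mhz / mips


def mips_from_cpi(clock_mhz, cpi):
    """MIPS rating from clock rate (MHz) and effective CPI."""
    return clock_mhz / cpi


# Part a: 500-MHz processor rated at 100 MIPS.
base_cpi = cpi_from_mips(500, 100)

# Part b: each memory access costs one extra cycle on the upgraded machine
# (assumption: two cycles per access instead of one).
extra_cycles_per_access = 1
extra_cpi = (0.30 * 1 + 0.05 * 2) * extra_cycles_per_access
new_cpi = base_cpi + extra_cpi
new_mips = mips_from_cpi(1000, new_cpi)

print("original CPI:", base_cpi)
print("upgraded CPI:", round(new_cpi, 2))
print("upgraded MIPS:", round(new_mips, 2))
```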

9. A linear pipeline processor has eight stages. It is required to execute a
task that has 600 operands. Find the speedup factor, Sk, assuming that
the CPU runs at 1.5 GHz. Note that the speedup factor of a
linear pipeline processor is defined by the following expression:
Sk = speedup = (time needed by a one-stage pipeline processor to do
a task) / (time needed by a k-stage processor to do the same task) = T1/Tk.
Answer
The speedup factor, Sk, of a linear pipeline processor is the ratio of the time needed by a
one-stage (non-pipelined) processor to execute a task to the time needed by a k-stage
pipeline processor to execute the same task.
Here k = 8 stages and n = 600 operands. Assuming each stage takes one clock cycle, the
cycle time is t = 1 / 1.5 GHz = 0.667 ns (approximately).
Without pipelining, each operand passes through all k stages before the next one starts, so:
T1 = n * k * t = 600 * 8 * 0.667 ns = 3.2 us.

With pipelining, the first operand needs k cycles to fill the pipeline, and each of the
remaining n - 1 operands completes one cycle later, so:
Tk = (k + n - 1) * t = (8 + 600 - 1) * 0.667 ns = 607 * 0.667 ns = 0.405 us (approximately).

Therefore, the speedup factor, Sk, for the 8-stage pipeline is:
Sk = T1/Tk = (n * k) / (k + n - 1) = 4800 / 607 = 7.91 (approximately).

The speedup factor for the 8-stage pipeline is about 7.91. It approaches the ideal value of
k = 8 only as the number of operands becomes very large. The clock rate cancels out of the
ratio, so it affects T1 and Tk but not Sk.
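
As a cross-check of question 9, the sketch below evaluates the standard linear-pipeline timing model, in which a k-stage pipeline needs k + n - 1 cycles for n operands and a non-pipelined processor needs n * k cycles. The assumption that every stage takes exactly one clock cycle, and the function names, are illustrative.

```python
def pipeline_times(stages, operands, clock_hz):
    """Return (T1, Tk) in seconds under the one-cycle-per-stage assumption."""
    cycle = 1.0 / clock_hz
    t1 = operands * stages * cycle          # non-pipelined: k cycles per operand
    tk = (stages + operands - 1) * cycle    # pipelined: fill, then one per cycle
    return t1, tk


def pipeline_speedup(stages, operands):
    """Speedup Sk = T1 / Tk; the clock rate cancels out."""
    return (operands * stages) / (stages + operands - 1)


t1, tk = pipeline_times(stages=8, operands=600, clock_hz=1.5e9)
print("T1 (us):", t1 * 1e6)
print("Tk (us):", tk * 1e6)
print("Sk:", pipeline_speedup(8, 600))
```

With very large operand counts the same function returns values close to the stage count, which is the ideal speedup.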

10. Devise an experiment to find out the performance metrics for an IEEE 802.3 local
area network (LAN)

a. The throughput of the network as a function of the number of nodes in the LAN.

b. The average packet delay as a function of the number of nodes in the LAN.
c. The throughput-delay relationship.

Answer
One way to devise an experiment to find out the performance metrics for an IEEE 802.3 LAN
would be as follows:

a. To measure the throughput of the network as a function of the number of nodes in the
LAN, we can use a network traffic generator to send a large amount of data packets to the
LAN. We can then measure the number of packets that are successfully received by the
destination node, and divide that by the time it took to send the packets. We can repeat this
process for different numbers of nodes in the LAN, and plot the results to see how the
throughput changes with the number of nodes.
b. To measure the average packet delay as a function of the number of nodes in the LAN, we
can use a network traffic generator to send a large number of packets to the LAN. We can
then measure the time it takes for each packet to be successfully received by the destination
node. We can calculate the average delay for each number of nodes in the LAN, and plot the
results to see how the delay changes with the number of nodes.

c. To measure the throughput-delay relationship, we can use the data obtained from the
previous two steps and plot them on a graph with throughput on the x-axis and delay on the
y-axis. This will show us how the delay changes as the throughput of the network increases.
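
The post-processing step of the experiment in question 10 can be sketched in Python. The sketch assumes that the traffic generator or a capture tool yields, for every delivered packet, a send time, a receive time, and a size in bits. This record format and the function name are hypothetical and are not tied to any particular tool.

```python
def throughput_and_delay(records):
    """Compute (throughput in bits/s, mean packet delay in s).

    records: list of (send_time_s, recv_time_s, size_bits) tuples,
    one per packet that was successfully received.
    """
    if not records:
        raise ValueError("no delivered packets recorded")
    start = min(send for send, _, _ in records)
    end = max(recv for _, recv, _ in records)
    total_bits = sum(size for _, _, size in records)
    throughput = total_bits / (end - start)
    mean_delay = sum(recv - send for send, recv, _ in records) / len(records)
    return throughput, mean_delay


# Hypothetical measurements for one run with a fixed number of nodes.
sample_run = [(0.0, 0.5, 1000), (1.0, 2.0, 1000)]
print(throughput_and_delay(sample_run))
```

Repeating the run for each number of nodes gives one (throughput, delay) pair per configuration, which is the data needed for the three plots.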

CHAPTER 9 QUESTIONS

1. Describe what you think would be the most effective way to study
each of the following systems:
a. A wireless local area network that consists of 100 nodes.
b. A 1000-processor massively parallel computer system.
c. The performance of an Asynchronous Transfer Mode (ATM) based
local area network (LAN) system.
d. The operation of a simple bank branch in a town.

Answer
a. To study the performance of a wireless local area network with 100 nodes, it would be
most effective to set up a test environment that mimics the network's real-world conditions as
closely as possible. This could include using similar hardware and software, and simulating
the same types of traffic and usage patterns. Testing tools such as network analyzers and
performance monitoring software could then be used to measure key performance indicators
such as throughput, latency, and packet loss.

b. To study the performance of a 1000-processor massively parallel computer system, it
would be most effective to use benchmarking and simulation tools to measure the system's
performance under different loads and workloads. This could include running parallel
computing benchmarks such as Linpack and MPI benchmarks, and simulating different types
of workloads such as scientific simulations, data analytics, and machine learning.

c. To study the performance of an ATM-based local area network, it would be most effective
to set up a test environment that mimics the network's real-world conditions as closely as
possible. This could include using similar hardware and software, and simulating the same
types of traffic and usage patterns. Testing tools such as network analyzers and performance
monitoring software could then be used to measure key performance indicators such as
throughput, latency, and packet loss. Additionally, it would be useful to compare the
performance of the ATM-based network to other types of networks, such as Ethernet or fiber-
optic networks.

d. To study the operation of a simple bank branch in a town, it would be most effective to use
a combination of observational and analytical methods. This could include conducting
interviews with bank employees and customers, observing branch operations firsthand, and
analyzing data such as transaction logs and customer surveys. Additionally, it would be
useful to compare the branch's performance to other branches in the area, and to industry
benchmarks.

2. For each of the systems in problem 1, assume that it has been decided to make a study via a
simulation model. Discuss whether the simulation should be static or dynamic, deterministic
or stochastic, and continuous or discrete.
Answer
a. A wireless local area network that consists of 100 nodes: The simulation of this system
should be dynamic, stochastic, and discrete. The dynamic simulation would be useful in order
to capture the time-varying nature of the network traffic and usage patterns. Stochastic
simulation would be useful to capture the randomness and uncertainty of the network traffic,
such as the arrival and departure of nodes. Discrete simulation would be useful to model the
discrete events that occur in the network, such as packet transmissions and receptions.

b. A 1000-processor massively parallel computer system: The simulation of this system
should be dynamic and discrete, and in most cases stochastic. The dynamic simulation would
be useful in order to capture the time-varying nature of the system's performance under
different loads and workloads. Discrete simulation would be useful because the state of the
system changes at discrete events, such as job arrivals, task completions, and message
exchanges between processors. Stochastic simulation would be useful to capture the
randomness of job arrivals and service times. A deterministic simulation is appropriate only
when the model is driven by a fixed, recorded trace of a specific workload.

c. The performance of an Asynchronous Transfer Mode (ATM) based local area network
LAN system: The simulation of this system should be dynamic, stochastic, and discrete. The
dynamic simulation would be useful in order to capture the time-varying nature of the
network traffic and usage patterns. Stochastic simulation would be useful to capture the
randomness and uncertainty of the network traffic, such as the arrival and departure of nodes.
Discrete simulation would be useful to model the discrete events that occur in the network,
such as packet transmissions and receptions.

d. The operation of a simple bank branch in a town: The simulation of this system should be
dynamic, stochastic, and discrete. The dynamic simulation would be useful because the state
of the branch, such as the number of customers waiting and the number of busy tellers,
changes over time. Stochastic simulation would be useful to capture the randomness and
uncertainty of the customer behavior, such as the arrival times and service times of
customers. Discrete simulation would be useful to model the discrete events that occur in
the branch, such as customer arrivals, the start of service, and departures.

3. The technique for producing an exponential random variate with mean interarrival time of
1/λ uses the formula -(1/λ) ln U, where U is a uniformly distributed random variate between 0
and 1, U(0, 1). This approach could correctly be modified to return -(1/λ) ln(1 - U). Explain
why this is possible.
Answer
This modification is possible because if U is uniformly distributed between 0 and 1, then
1 - U is also uniformly distributed between 0 and 1. The two quantities have the same
distribution, so either one can be used as the uniform input.
The formula comes from the inverse transform method. The exponential distribution with rate λ
(mean 1/λ) has the cumulative distribution function F(x) = 1 - e^(-λx). Setting F(x) = U and
solving for x gives x = -(1/λ) ln(1 - U), which is the form obtained directly from the
derivation. Replacing 1 - U by U saves one subtraction and gives x = -(1/λ) ln U. Both forms
produce exponential random variates with the same mean interarrival time of 1/λ.
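
The equivalence in question 3 can be illustrated empirically with a minimal Python sketch. It draws samples with both formulas using the standard `random` and `math` modules and compares the sample means with 1/λ. The rate value, sample size, seeds, and function names are arbitrary choices for illustration.

```python
import math
import random


def exp_variate_ln_u(rate, rng):
    """Exponential variate using -(1/rate) * ln(U)."""
    u = rng.random()
    while u == 0.0:          # random() may return 0.0; ln(0) is undefined
        u = rng.random()
    return -math.log(u) / rate


def exp_variate_ln_one_minus_u(rate, rng):
    """Exponential variate using -(1/rate) * ln(1 - U)."""
    return -math.log(1.0 - rng.random()) / rate


def sample_mean(generator, rate, count, seed):
    """Average of `count` variates drawn with a seeded generator."""
    rng = random.Random(seed)
    return sum(generator(rate, rng) for _ in range(count)) / count


rate = 2.0  # lambda; the expected mean interarrival time is 1 / rate = 0.5
print(sample_mean(exp_variate_ln_u, rate, 100_000, seed=1))
print(sample_mean(exp_variate_ln_one_minus_u, rate, 100_000, seed=1))
```

Both sample means should be close to 0.5, although the individual samples differ because each formula maps the same uniform value to a different variate.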

4. Which type of simulation would you use for the following problems:
a. To model traffic in a wireless cell network given that the traffic is bursty.
b. To model scheduling in a multiprocessor computer system given that the request arrivals
have a geometric distribution.
c. To verify the value of π, which is defined as the ratio of a circle's circumference to its
diameter, C/D.
Answer
a. Discrete event simulation
b. Discrete event simulation
c. Monte Carlo simulation
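
For part c, a common Monte Carlo approach is sketched below in Python. Note that it estimates π from an area ratio (the fraction of random points in the unit square that fall inside a quarter circle) rather than from C/D directly; this choice of method, the sample size, and the function name are assumptions made for illustration.

```python
import random


def estimate_pi(samples, seed=0):
    """Estimate pi as 4 * (fraction of random points inside the quarter circle)."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples


print(estimate_pi(200_000, seed=1))
```

The estimate is static (time plays no role) and stochastic, which is why Monte Carlo simulation fits this problem; its error shrinks roughly with the square root of the number of samples.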
5. Using the multiplicative congruential method, find the period of the generator for a= 17, m
= 26, and X0= 1, 2, 3, and 4. Comment on the produced numbers and resulting periods
Answer
The Multiplicative Congruential Method is a method for generating pseudo-random numbers
using the following formula:

Xn = (a*Xn-1) mod m

Where Xn is the nth generated number, Xn-1 is the previous generated number, a is the
multiplier, and m is the modulus.

Given a= 17, m = 26, and X0= 1, 2, 3, and 4.


We can find the period of the generator for each of the initial values of X0:

X0 = 1:
X1 = (17 * 1) mod 26 = 17
X2 = (17 * 17) mod 26 = 3
X3 = (17 * 3) mod 26 = 25
X4 = (17 * 25) mod 26 = 9
X5 = (17 * 9) mod 26 = 23
X6 = (17 * 23) mod 26 = 1

The period of the generator for X0 = 1 is 6. It means that there are 6 unique numbers
generated before the sequence repeats.

X0 = 2:
X1 = (17 * 2) mod 26 = 8
X2 = (17 * 8) mod 26 = 6
X3 = (17 * 6) mod 26 = 24
X4 = (17 * 24) mod 26 = 18
X5 = (17 * 18) mod 26 = 20
X6 = (17 * 20) mod 26 = 2

The period of the generator for X0 = 2 is 6.

X0 = 3:
X1 = (17 * 3) mod 26 = 25
X2 = (17 * 25) mod 26 = 9
X3 = (17 * 9) mod 26 = 23
X4 = (17 * 23) mod 26 = 1
X5 = (17 * 1) mod 26 = 17
X6 = (17 * 17) mod 26 = 3

The period of the generator for X0 = 3 is 6. This is the same cycle as for X0 = 1, entered at
a different point.

X0 = 4:
X1 = (17 * 4) mod 26 = 16
X2 = (17 * 16) mod 26 = 12
X3 = (17 * 12) mod 26 = 22
X4 = (17 * 22) mod 26 = 10
X5 = (17 * 10) mod 26 = 14
X6 = (17 * 14) mod 26 = 4

The period of the generator for X0 = 4 is 6.

Comment: with m = 26, every one of the four seeds gives a period of 6, so only 6 of the 25
possible nonzero values are ever produced. Odd seeds produce only odd numbers and even seeds
produce only even numbers, and the seeds 1 and 3 lie on the same cycle. Such a short period
makes this generator unsuitable for practical use. The period of a generator depends on the
choice of the multiplier, the modulus, and the initial value, and these should be chosen
carefully for a specific application.
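
The periods in question 5 can be checked mechanically with the short Python sketch below. It takes the modulus as 26, as written in the question; the same function can be called with another modulus (for example 64, if the modulus was meant to be a power of two, which is only a guess) without any change. The function names are illustrative.

```python
def mcg_sequence(a, m, x0, count):
    """First `count` values X1, X2, ... of X(n) = (a * X(n-1)) mod m."""
    values = []
    x = x0
    for _ in range(count):
        x = (a * x) % m
        values.append(x)
    return values


def mcg_period(a, m, x0):
    """Length of the cycle that the generator eventually enters from seed x0."""
    first_seen = {}
    x, step = x0, 0
    while x not in first_seen:
        first_seen[x] = step
        x = (a * x) % m
        step += 1
    return step - first_seen[x]


for seed in (1, 2, 3, 4):
    print(seed, mcg_sequence(17, 26, seed, 6), "period:", mcg_period(17, 26, seed))
```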

6. Generate five 6-bit numbers using the Tausworthe method for the characteristic
polynomial x^6 + x + 1, starting with a seed of X0 = (0.111111)2.

Answer
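The characteristic polynomial x^6 + x + 1 corresponds to the bit recurrence
b(n+6) = b(n+1) XOR b(n).
Starting from the seed bits b0 to b5 = 111111, the generated bit stream is:
111111 000001 000011 000101 001111 010001 ...
Taking successive non-overlapping 6-bit groups after the seed gives:
X1 = (0.000001)2 = 0.015625
X2 = (0.000011)2 = 0.046875
X3 = (0.000101)2 = 0.078125
X4 = (0.001111)2 = 0.234375
X5 = (0.010001)2 = 0.265625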
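
A minimal Python sketch of the Tausworthe bit generation for question 6 follows. It assumes that the polynomial x^6 + x + 1 is read as the recurrence b(n+6) = b(n+1) XOR b(n), and that the numbers are formed from successive non-overlapping 6-bit groups after the seed. Other textbooks use different grouping conventions, so both points are assumptions, and the function names are illustrative.

```python
def tausworthe_bits(seed_bits, total):
    """Extend the seed with b(n+6) = b(n+1) XOR b(n) until `total` bits exist."""
    bits = list(seed_bits)
    while len(bits) < total:
        n = len(bits) - 6
        bits.append(bits[n + 1] ^ bits[n])
    return bits


def tausworthe_words(seed_bits, words, width=6):
    """Return `words` binary strings taken from the bit stream after the seed."""
    bits = tausworthe_bits(seed_bits, width * (words + 1))
    return ["".join(str(b) for b in bits[i:i + width])
            for i in range(width, len(bits), width)]


for word in tausworthe_words([1, 1, 1, 1, 1, 1], 5):
    # Interpret each word as a binary fraction 0.b1b2...b6.
    print("0." + word, "=", int(word, 2) / 64)
```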
