SamKnows Testing Methodology, September 2018 (Commerce Commission, comcom.govt.nz)
SamKnows test methodology

Download and Upload (TCP)

Measures the download and upload speed of the broadband connection in bits per second. The transfer is conducted over one or more concurrent HTTP connections (using the GET verb for downloads and the POST verb for uploads). In the download
speed test the client will fetch a portion of an infinitely sized binary (nonzero, randomly generated)
payload hosted on an HTTP server on the target test server. The content is discarded as soon as it is
received. In the upload test the client will generate the payload itself (using /dev/urandom as a
nonblocking source of random content) to send to the server. The measure of throughput may be
optionally carried out on the server side (the receiver) in the upload test. The speed tests (both
download and upload) operate for either a fixed-duration (specified in seconds) or a fixed-volume
(specified in MB). Where possible, a fixed-duration test is preferred as it will cater well for all broadband
access speeds. However, a fixed-volume test may be necessary where predictability of bandwidth usage
is desired. Four separate variations of the test are supported:

• Single TCP connection download speed test
• Multiple TCP connection download speed test
• Single TCP connection upload speed test
• Multiple TCP connection upload speed test

For multiple TCP connection tests we typically recommend
that three concurrent connections are used. In some cases (e.g. where the round-trip time between
client and server is very high) it may be necessary to increase this. Factors such as TCP slow start are
accounted for through the use of a “warm-up” period. This period begins as soon as the test starts and
seeks to establish that the throughput has reached a stable rate before starting the real test (which will
continue over the same TCP connection(s)). It is important to note that the data transferred in the
warm-up period is excluded from the main test results, but it is still recorded separately as a
supplementary metric. The speed test client will record the throughput, bytes transferred and time
taken at the end of the test. It may also record these values at multiple intervals during the test. This is
commonly used to help characterise the difference between 'burst' and 'sustained' throughput (where
transfer speeds may be inflated at the start of a TCP connection).

Latency and packet loss (UDP)

Measures the round trip time of small UDP packets between the router and a target test server.
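The probe/echo exchange can be sketched as follows. This is a minimal illustrative client, not the SamKnows implementation: the in-process reflector stands in for the real test server, and the packet layout (an 8-byte sequence number plus an 8-byte timestamp, with a two-second loss timeout) follows the description in this section.

```python
import socket
import struct
import threading
import time

def reflector(sock):
    # Stand-in for the test server: echo every packet straight back.
    while True:
        try:
            data, addr = sock.recvfrom(64)
        except OSError:
            return  # socket closed, stop reflecting
        sock.sendto(data, addr)

def run_latency_test(server_addr, n_packets=20, timeout=2.0):
    """Send small UDP probes and record round trip times and loss.
    Each probe is an 8-byte sequence number plus an 8-byte timestamp;
    a probe not reflected within `timeout` seconds counts as lost."""
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(timeout)
    rtts, lost = [], 0
    for seq in range(n_packets):
        client.sendto(struct.pack("!Qd", seq, time.monotonic()), server_addr)
        try:
            data, _ = client.recvfrom(64)
            _, sent_at = struct.unpack("!Qd", data)
            rtts.append(time.monotonic() - sent_at)
        except socket.timeout:
            lost += 1
    client.close()
    return rtts, lost

# Local demonstration against the in-process reflector.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
threading.Thread(target=reflector, args=(server,), daemon=True).start()
rtts, lost = run_latency_test(server.getsockname(), n_packets=20)
print(f"sent=20 lost={lost} avg_rtt_ms={1000 * sum(rtts) / len(rtts):.3f}")
server.close()
```

In the real methodology the probes are spread randomly across a fixed reporting interval rather than sent back to back, as described below.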
Each packet consists of an 8-byte sequence number and an 8-byte timestamp. If a packet is not received
back within two seconds of sending, it is treated as lost. The test records the number of packets sent
each hour, the average round trip time of these and the total number of packets lost. The test will use
the 99th percentile when calculating the summarised minimum, maximum and average results on the
router. The test operates continuously in the background. It is configured to randomly distribute the
sending of the echo requests over a fixed interval, reporting the summarised results once the interval
has elapsed. Typically 2000 samples are taken per hour, distributed throughout the hour. If the line is
busy then fewer samples will be taken. A higher sampling rate may be used if desired.

Contiguous packet loss / disconnections (UDP)

This test is an optional extension to the UDP Latency/Loss test. It records
instances when two or more consecutive packets are lost to the same test server. Alongside each event
we record the timestamp, the number of packets lost and the duration of the event. By executing the
test against multiple diverse servers, a user can begin to observe server outages (when multiple probes
see disconnection events to the same server simultaneously) and disconnections of the user's home
connection (when a single probe loses connectivity to all servers simultaneously). Typically, this test is
accompanied by a significant increase in the sampling frequency of the UDP Latency/Loss client to ~2000
packets per hour (providing a resolution of 2-4 seconds for disconnection events).

Latency, jitter and packet loss (UDP)

This test uses a fixed-rate stream of UDP traffic, running between client and test node.
A bidirectional 64kbps stream is used with the same characteristics and properties (i.e. packet sizes,
delays, bitrate) as the G.711 codec. The client initiates the connection, thus overcoming NAT issues, and
informs the server of the rate and characteristics that it would like to receive the return stream with.
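The fixed-rate pacing can be sketched as below. The exact packet size is not stated in this document; a common G.711 packetisation is assumed here (160 payload bytes every 20 ms, i.e. 160 × 8 × 50 = 64,000 bit/s), and the sequence-number/timestamp header and loopback receiver are purely illustrative.

```python
import socket
import struct
import time

PACKET_BYTES = 160   # assumed G.711 packetisation: 160 payload bytes...
INTERVAL_S = 0.020   # ...every 20 ms -> 50 pkt/s -> 64 kbps of payload

def send_fixed_rate_stream(sock, server_addr, n_packets):
    """Send a fixed-rate UDP stream with G.711-like pacing. Each packet
    carries a sequence number and send timestamp (so the receiver can
    derive loss and jitter); the remainder is zero padding."""
    for seq in range(n_packets):
        header = struct.pack("!Qd", seq, time.monotonic())
        sock.sendto(header + b"\x00" * (PACKET_BYTES - len(header)), server_addr)
        time.sleep(INTERVAL_S)

# Local demonstration: receive a short stream on a loopback socket.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_fixed_rate_stream(sender, receiver.getsockname(), n_packets=5)
sizes = [len(receiver.recvfrom(2048)[0]) for _ in range(5)]
print(sizes)
sender.close()
receiver.close()
```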
The standard configuration uses 500 packets upstream and 500 packets downstream. The client records
the number of packets it sent and received (thus providing a loss rate), and the jitter observed for
packets it received from the server. The server does the same, but with the reverse traffic flow, thus
providing bi-directional loss and jitter. Jitter is calculated using the PDV approach described in section
4.2 of RFC 5481. The 99th percentile will be recorded and used in all calculations when deriving the PDV.
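A sketch of the PDV summarisation, assuming one-way delay samples in milliseconds and a nearest-rank percentile (the exact percentile method is not specified here): per the RFC 5481 section 4.2 approach, each delay is compared against the minimum delay in the sample, and the 99th percentile of the differences is reported.

```python
def pdv_99th(one_way_delays_ms):
    """Packet Delay Variation in the style of RFC 5481 section 4.2:
    each packet's delay is measured against the minimum delay seen in
    the sample, and the 99th percentile of those differences is kept."""
    if not one_way_delays_ms:
        return None
    base = min(one_way_delays_ms)
    variations = sorted(d - base for d in one_way_delays_ms)
    # Nearest-rank 99th percentile (an assumption; the document does not
    # specify the interpolation method).
    idx = max(0, round(0.99 * len(variations)) - 1)
    return variations[idx]

# Illustrative delay samples (ms): mostly ~20 ms with one delayed packet.
print(pdv_99th([20.1, 20.3, 20.0, 24.8, 20.2, 21.0, 20.4]))
```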
DNS resolution time and failure rate (UDP)

This test measures the DNS resolution time of a
selection of common website domain names. These tests will be targeted directly at the ISP's recursive
resolvers. A list of appropriate servers will be sought from each ISP in advance of the tests for manual configuration; otherwise, the test will use the default DNS servers in use on the router.

Web browsing (TCP)

Measures
the time taken to fetch the HTML and referenced resources from a page of a popular website. This test
does not test against centralised testing nodes; instead it tests against real websites, allowing for
content distribution networks and other performance enhancing factors to be considered. Each
Whitebox will test ten common websites on every test run. The time taken to download the resources,
the number of bytes transferred and the calculated rate per second will be recorded. The primary
measure for this test is the total time taken to download the HTML page and all associated images,
JavaScript and stylesheet resources. The results include the time taken for DNS resolution. The test uses
up to eight concurrent TCP connections to fetch resources from targets. The test pools TCP connections
and utilises persistent connections where the remote HTTP server supports them. The test may
optionally run with or without HTTP headers advertising cache support (through the inclusion or
exclusion of the “Cache-Control: no-cache” request header). The client advertises the user agent of
Microsoft Internet Explorer 10.

Netflix (TCP)

The Netflix test is an application-specific test, supporting
the streaming of binary data from Netflix's servers using the same CDN selection logic as their real client
uses. The test has been developed in direct cooperation with Netflix. The test begins by calling a
Netflix hosted web-based API. This API examines the client's source IP address and uses the existing
proprietary internal Netflix logic to determine which Netflix server this user's IP address would normally
be served content from. This logic will consider the ISP and geographic location of the requesting IP
address. Where the ISP participates in Netflix's Open Connect programme, it is likely that one of these
servers will be used. The API will return to the client an HTTP 302 redirect to a 25MB binary file hosted on
the applicable content server. The test will then establish an HTTP connection to the returned server
and attempt to fetch the 25MB binary file. This runs for a fixed 20 seconds of realtime. HTTP pipelining is
used to request multiple copies of the 25MB binary, ensuring that if the payload is exhausted before the
20 seconds are complete, we can continue receiving more data. The client downloads data at full rate
throughout; there is no client-side throttling taking place. It's important to note that this 25MB binary
content does not contain video or audio; it is just random binary data. However, with knowledge of the
bitrates that Netflix streams content at, we can treat the binary as if it were video/audio
content operating at a fixed rate. This allows us to determine the amount of data consumed for each
frame of video (at a set bitrate) and the duration that it represents. Using this, we then can infer when a
stall occurred (by examining when our simulated video stream has fallen behind realtime). The test
currently simulates videos at bitrates of 235Kbps, 375Kbps, 560Kbps, 750Kbps, 1050Kbps, 1750Kbps,
2350Kbps, 3000Kbps, 4500Kbps, 6000Kbps and 15600Kbps. This approach also allows us to derive the
'bitrate reliably streamed', using the same methodology as the YouTube test. A small difference here is
that we do not need to restart the download at a lower bitrate if a stall is encountered; because the
incoming stream of binary data is decoded at a simulated bitrate, we can simply recompute the playback
characteristics of the same network stream at a different bitrate entirely on the client side. This simply
means that the test uses a predictable amount of bandwidth, even in cases where stalls occur. The key outputs from this metric are:

• The bitrate reliably streamed
• The startup delay (the time taken to download two seconds of video)
• The TCP connection time
• The number of stalls and their duration
• The downstream throughput achieved

YouTube (TCP)

The YouTube test is an application-specific test,
supporting the streaming of video and audio content from YouTube using their protocols and codecs.
The test begins by seeking out the most popular video in the user's country. This is achieved by fetching
a list of the most popular YouTube videos from a central SamKnows server. The central list of videos is
refreshed once every 12 hours using the YouTube API. We filter for videos that are at least 60 seconds in
length and have an HD quality variant. Note that by interacting with the YouTube API from a central
location we can ensure that every probe is delivered the same list of videos. The test running on the
probe will now fetch the YouTube web page for the most popular video, and parse the JavaScript
contained within the page. Within this JavaScript is held a list of all the encodings of the video in
question and the content server hostname. By making this request from the probe we ensure that the
test is directed to the same content server as the user would be if they were using a desktop computer on
the same connection. The test will then connect to the content server (using whatever server YouTube
would normally direct a real client on the same connection to) and begin streaming the video and
audio. MPEG4, WebM, Dash (adaptive) and Flash video codecs are supported. Although the
adaptive codec is supported, the test does not actually adapt its rate; we stream at full rate all the time,
which provides for reproducibility. The test parses video frames as it goes, capturing the timestamp
contained within each video frame. After each frame, we sample how much realtime has elapsed versus
video time. If video time > realtime at a sample period, then an underrun has not occurred. Otherwise,
one has occurred. The test downloads 10 seconds of audio and video at a time, with a buffer of 40
seconds. So, on startup, the test will immediately download (at full speed) 40 seconds of video and
audio, and will then download more as required, keeping the 40 second playback buffer full. By default,
the test will run for a fixed duration of 20 seconds of realtime. In its default mode of operation, the test
will capture the 'bitrate that can be reliably streamed' on the user's connection. This is achieved through
the following process:

1. Find the fastest recent speedtest result that the probe has completed.
2. As described above, fetch the list of YouTube videos, find the most popular one, and then select the highest bitrate encoding which is less than the fastest speedtest result found in step 1.
3. Attempt to stream this video, for a fixed duration of 20 seconds of realtime. If successful, then the “bitrate reliably streamed” for this instance is the bitrate that we just fetched.
4. However, if a stall event occurs, then we immediately abort the test and retry at the next lower bitrate.
5. If we find a bitrate that we can stream without a stall event occurring, then that bitrate is our “bitrate reliably streamed” for this instance.
6. However, if we encounter stalls for every bitrate, then the “bitrate reliably streamed” is zero.

The key outputs from this metric are:

• The bitrate reliably streamed
• The startup delay (the time taken to download two seconds of video)
• The TCP connection time
• The number of stalls and their duration (this is only applicable if the test is not running in the 'bitrate reliably streamed' mode)
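The "bitrate reliably streamed" procedure in steps 1-6 can be sketched as below. `stream_without_stalls` is a hypothetical stand-in for a 20-second streaming attempt, and the ladder shown is the Netflix simulation ladder listed earlier, used here purely for illustration.

```python
def bitrate_reliably_streamed(ladder_kbps, fastest_speedtest_kbps, stream_without_stalls):
    """Walk down the bitrate ladder as described in steps 1-6.
    `stream_without_stalls(bitrate)` stands in for a 20-second streaming
    attempt and returns True if no stall event occurred."""
    # Step 2: start from the highest encoding below the fastest recent speedtest.
    candidates = sorted(
        (b for b in ladder_kbps if b < fastest_speedtest_kbps), reverse=True
    )
    # Steps 3-5: the first bitrate that streams cleanly is the result.
    for bitrate in candidates:
        if stream_without_stalls(bitrate):
            return bitrate
    # Step 6: every attempted bitrate stalled.
    return 0

# Illustration: a connection that stalls above 2350 kbps.
ladder = [235, 375, 560, 750, 1050, 1750, 2350, 3000, 4500, 6000, 15600]
result = bitrate_reliably_streamed(ladder, 5000, lambda b: b <= 2350)
print(result)
```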