SCTE-35
Ad insertion is an important part of many video delivery systems because it generates
revenue. With Over-The-Top (OTT) video delivery on the internet, the holy grail of
advertisement is finally achievable: it is technically possible to send individual,
personalized ads to each viewer. Such systems are based, in
part, on the traditional ad insertion workflows that use the SCTE-104 and SCTE-35
standards as their starting point. This paper provides an overview of such systems,
showing how a traditional ad-insertion workflow at the programmer side can be used as
a basis for an OTT system. We also show some other uses for the ad insertion
infrastructure (for program delimitation) and comment on the importance of frame
accuracy.
· Start with a video feed containing programs and ads. This is typically a national network
feed. Some of the ads are high-priority national ads and some of the ads are low-priority
national ads that could be replaced by local ads down the chain.
· An automation/playout system “decorates” the baseband video feed with markers that
delimit the ads. These baseband markers are defined in SCTE-104.
· An encoder converts the baseband video into a compressed bit stream. The baseband
markers are translated into compressed stream markers for transmission with the
content. The compressed stream markers are defined in SCTE-35.
· Somewhere in the reception chain, before the video is delivered to the consumer, new
ads are spliced in at the locations indicated by the markers. This is where OTT systems
take over.
The Programmer Side is responsible for generating the compressed bit stream with the
appropriate ad insertion markers. A general diagram is presented in Figure 1.
The final output of the programmer side is a transport stream with the compressed
content, plus the SCTE-35 markers. This is the feed to the affiliates and is a good
starting point for the OTT ad insertion workflow.
The traditional system has a “one-size-fits-all” approach. The resulting output is a fully
compliant linear stream, decodable by any set-top box, no special features or
functionality needed. This stream typically goes into a cable plant, or a terrestrial
transmitter, and all receivers in the service area show exactly the same ad.
A variation of the traditional system has been proposed in the SCTE-138 standard to
allow for a small amount of individual ad targeting. In this approach, the transport
stream will contain the main program and a certain number of additional ads, carried as
separate audio/video PIDs in the same service as the main program. Markers are left in
place and processed by the receiving set-top boxes. The set-top box will then decide
which one of the small number of available ads to display (or not; it can also leave the ad
in the main program). This allows for a very limited amount of ad targeting.
The term “Over-The-Top” (OTT) refers to content delivery services that use the internet (i.e.,
“on top” of the network services from the provider). Figure 3 shows the basic OTT
operation.
· The original transport stream from the programmer side (possibly decorated with
SCTE-35 markers) goes to a transcoder device.
· The transcoder device produces several versions of the stream, at different resolutions
and bit rates. These versions are called “profiles.”
· Each profile is further divided into individual files, called segments. Each segment is
individually decodable—in other words, no data from a previous segment is required to
start decoding it and it can be decoded up to its last frame, with no data required from
the next segment. For H.264 streams, the segment starts on an IDR (“Instantaneous
Decoding Refresh”) frame and the last GOP (“Group Of Pictures”) of the segment is
closed.
· A “manifest” is also placed on the web server. Each profile has a manifest that lists its
segments, and a top-level manifest lists the available profiles and their characteristics.
Manifests are text files, and their format varies from standard to standard.
· Playback devices will read the top-level manifest and learn the available profiles. They
will then decide on a profile, read its individual manifest and start reading and decoding the
segments. If the network conditions change, the playback device may switch to a higher
or lower profile as needed. On a live stream, manifests are frequently updated.
· OTT is unicast: each player device establishes its own connection to the server. With
appropriate support, ads can be personalized to each specific viewer. Moreover, since
these are connections to a web server, the user identification and tracking
methodologies developed for the web can be used here as a basis.
· In general, the video frame identified by the SCTE-35 marker is not aligned with a
segment boundary. The transcoder creating the profiles must ensure that a new segment
starts at that frame—i.e., it will terminate the previous segment “early”.
· The SCTE-35 marker is added to the manifest. Depending on the OTT standard being
used, the way to do that varies as follows:
o SCTE-67 details how to do this for HLS, DASH and HDS. Some of the same
information can also be found in SCTE-35 2016.
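The early termination of a segment at a marked frame can be sketched as follows. This is a minimal illustration, not the behavior of any particular transcoder: the function name `segment_boundaries`, the fixed number of frames per segment, and the choice to restart the regular cadence at each splice point are all assumptions made for the example.

```python
def segment_boundaries(total_frames, frames_per_segment, splice_frames):
    """Return the frame index at which each segment starts.

    Segments normally start every `frames_per_segment` frames. A new
    segment is also forced at every frame listed in `splice_frames`,
    which ends the previous segment early.
    """
    # Keep only splice points that fall strictly inside the stream.
    splices = sorted(f for f in set(splice_frames) if 0 < f < total_frames)
    starts = [0]
    next_splice = 0
    while True:
        candidate = starts[-1] + frames_per_segment
        # A splice point at or before the regular boundary wins.
        if next_splice < len(splices) and splices[next_splice] <= candidate:
            candidate = splices[next_splice]
            next_splice += 1
        if candidate >= total_frames:
            break
        starts.append(candidate)
    return starts


# 600 frames, nominal 180-frame segments, marker on frame 250 (hypothetical values).
boundaries = segment_boundaries(600, 180, [250])
print(boundaries)
```

In this example the second segment is cut short so that a new segment starts exactly on frame 250, which is the property the splicer relies on later.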
In some situations, additional control is required for OTT systems, due to the following:
· Differences in how the SCTE-35 messages are interpreted between different providers
may require them to be translated.
· Choices between how the ad is inserted—will it be left up to the playback device, or will
it be done by manifest manipulation?
This functionality is provided by the CableLabs Event Signaling and Management API
(ESAM). This is a set of interfaces between a transcoder, segment packager and a
control element. It provides the following functionality:
· Manifest conditioning:
OTT AD INSERTION
After the program is transcoded, converted to OTT profiles and possibly conditioned,
the actual ad insertion can happen. There are two possibilities for the insertion:
· Client-Side Ad Insertion:
· Server-Side Ad Insertion:
o Server manipulates the manifest provided to client, to point at the ad at the correct
points in time
Note that both possibilities support the notion of individually targeted ads.
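As an illustration of server-side insertion, the sketch below rewrites a simplified HLS-style media playlist for one viewer. It assumes that the ad break is delimited in the manifest by `#EXT-X-CUE-OUT` and `#EXT-X-CUE-IN` lines (one common convention; the actual marker syntax depends on the packager and OTT standard in use), and the function name `stitch_ad` and all URIs are hypothetical.

```python
def stitch_ad(playlist_lines, ad_segments):
    """Replace the content between cue-out and cue-in with ad segments.

    `ad_segments` is a list of (duration_seconds, uri) pairs selected
    for one viewer. The cue tags are an assumed marker convention.
    """
    out = []
    in_break = False
    for line in playlist_lines:
        if line.startswith("#EXT-X-CUE-OUT"):
            in_break = True
            # Signal a change of encoding timeline going into the ad.
            out.append("#EXT-X-DISCONTINUITY")
            for duration, uri in ad_segments:
                out.append(f"#EXTINF:{duration:.3f},")
                out.append(uri)
            # And again when returning to the program.
            out.append("#EXT-X-DISCONTINUITY")
        elif line.startswith("#EXT-X-CUE-IN"):
            in_break = False
        elif not in_break:
            out.append(line)  # Program content outside the break is kept.
    return out


playlist = [
    "#EXTM3U",
    "#EXTINF:6.000,", "prog_001.ts",
    "#EXT-X-CUE-OUT:12",
    "#EXTINF:6.000,", "prog_002.ts",
    "#EXTINF:6.000,", "prog_003.ts",
    "#EXT-X-CUE-IN",
    "#EXTINF:6.000,", "prog_004.ts",
]
ads = [(6.0, "ads/viewer42/ad_001.ts"), (6.0, "ads/viewer42/ad_002.ts")]
personalized = stitch_ad(playlist, ads)
print("\n".join(personalized))
```

Because each viewer receives its own rewritten manifest, the same program segments can be shared by everyone while the ad segments differ per viewer.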
The next step in the workflow is to decide which ad to play. In the traditional system, a
splicer simply uses SCTE-30 to request an ad from the ad server; in the simplest system,
the ad server has a list of ads and provides them in sequence. However, if the final
objective is to individually target the ad, a more sophisticated system is required. Two
standards cover this functionality:
· The Interactive Advertising Bureau (IAB) created a standard called VAST (Video Ad
Serving Template) for a client to request an ad from a server.
o VAST is a layer on top of browser technology—HTTP is used for the player to server
interactions.
VAST is typically used as an interface to get ads from third parties, while SCTE-130 is
used when only one party controls (or owns) the whole infrastructure.
VAST OPERATION
VAST supports both client-side and server-side ad serving. Figure 6 illustrates
client-side ad serving.
2. The Ad Server may return the ad, or may point the client to another server. In the
figure, the first ad server sends a Wrapper Response to the client, directing it to contact
another server.
4. In the figure, the second server provides an InLine Response with the ad to be played.
5. The client plays the ad. Ad tracking happens at the client, using standard web cookies.
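The Wrapper and InLine exchange described in these steps can be sketched as a small resolver. This is a simplified illustration: it reads only the `VASTAdTagURI` and `MediaFile` elements, ignores tracking and error handling, and uses a hypothetical `fetch` callable with made-up URLs in place of real HTTP requests.

```python
import xml.etree.ElementTree as ET


def resolve_vast(url, fetch, max_redirects=5):
    """Follow Wrapper responses until an InLine response is found.

    `fetch` is any callable that maps a URL to a VAST XML string
    (an assumption standing in for an HTTP client).
    Returns the media file URLs listed in the InLine ad.
    """
    for _ in range(max_redirects + 1):
        root = ET.fromstring(fetch(url))
        wrapper_uri = root.find("./Ad/Wrapper/VASTAdTagURI")
        if wrapper_uri is not None:
            # Wrapper Response: contact the server it points to.
            url = wrapper_uri.text.strip()
            continue
        media = root.findall(
            "./Ad/InLine/Creatives/Creative/Linear/MediaFiles/MediaFile"
        )
        return [m.text.strip() for m in media]
    raise RuntimeError("too many wrapper redirects")


# Two fake ad servers, mirroring the two-server exchange in the figure.
FAKE_SERVERS = {
    "https://ads.example/first": (
        "<VAST version='3.0'><Ad><Wrapper>"
        "<VASTAdTagURI>https://ads.example/second</VASTAdTagURI>"
        "</Wrapper></Ad></VAST>"
    ),
    "https://ads.example/second": (
        "<VAST version='3.0'><Ad><InLine><Creatives><Creative><Linear>"
        "<MediaFiles><MediaFile>https://cdn.example/ad.mp4</MediaFile></MediaFiles>"
        "</Linear></Creative></Creatives></InLine></Ad></VAST>"
    ),
}
media_urls = resolve_vast("https://ads.example/first", FAKE_SERVERS.__getitem__)
print(media_urls)
```

The redirect limit is a design choice for the sketch: it keeps a misconfigured chain of wrappers from looping indefinitely.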
The steps for server-side ad serving, illustrated in Figure 7, are:
1. In response to a marker, the splicer sends a VAST request to the server. The splicer
itself can trigger this request by inspecting the manifest and finding the marker, or the
request may come from the client.
2. The server will send a VAST response, which will include the ad in a mezzanine file
format. This is a high bit rate format with good quality that will need to be transcoded
prior to transmission to the client.
3. The splicer will contact a transcoder for the ad. The transcoder may have a cached
version of that ad ready to go; in this case, it is provided to the splicer, which will
“stitch” the ad in the right place and send it to the client. The standard has provisions for
the case where the transcoder does not have a cached version of the ad: another
lower-priority ad is served while the transcoder prepares the new ad. If this
happens, this insertion opportunity is lost, but the next time this ad is requested, it will
be ready.
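The cached-or-fallback decision in step 3 can be sketched as below. The names `choose_ad`, `transcoded_cache`, and `transcode_queue` are hypothetical, and the sketch assumes the fallback ad is always available in transcoded form.

```python
def choose_ad(requested_ad, transcoded_cache, transcode_queue, fallback_ad):
    """Return the transcoded ad to stitch into this insertion opportunity."""
    if requested_ad in transcoded_cache:
        return transcoded_cache[requested_ad]
    # Not ready yet: queue it for transcoding so a later request can be
    # served, and fill this opportunity with a lower-priority ad.
    if requested_ad not in transcode_queue:
        transcode_queue.append(requested_ad)
    return transcoded_cache[fallback_ad]


cache = {"house_ad": "house_ad_profiles/"}
queue = []

# First request: the ad is not cached, so the fallback is served.
first = choose_ad("car_ad", cache, queue, "house_ad")

# Simulate the transcoder finishing its work before the next request.
cache["car_ad"] = "car_ad_profiles/"
second = choose_ad("car_ad", cache, queue, "house_ad")
print(first, second)
```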
The Interactive Advertising Bureau has defined a protocol called “Video Player-Ad
Interface Definition” (VPAID) to support interactive ads. This is an extension of the
client-side ad serving case.
VPAID is layered on top of VAST. As part of the VAST response, the server may provide
a VPAID ad unit. This is an executable “app” that remains in contact with the
appropriate server, providing interactivity and possibly impression reporting.
Executable ads can be written in ActionScript 3, Silverlight, or JavaScript.
SCTE-130 defines logical functions and interfaces for managing advertisement systems,
and is similar to some aspects of VAST. The logical functions provided are:
SCTE-130 uses XML-based data interfaces, defined in part two of the standard. The
network transport is SOAP, defined in part seven. Figure 8 shows a diagram of the basic
blocks in the SCTE-130 set of standards.
· The Content Information Service (CIS) has knowledge of what is available to be played.
It can manage both ads and programs. The interface to the CIS is in SCTE-130 part four.
· The Placement Opportunity Information Service (POIS) manages policies, rights and
constraints. The interface to the POIS is in SCTE-130 part five. Note that ESAM can also
be used here.
· Finally, the Subscriber Information Service (SIS) may have knowledge of individual
subscribers, and may help refine ad targeting. The interface to the SIS is in SCTE-130
part six.
Figure 9 shows a high-level block diagram of the main ad insertion modules described in
this paper.
A much more detailed block diagram can be found on the SCTE DVS site, and is
reproduced in Figure 10 below.
The figure indicates the main interfaces and protocols previously discussed in this
paper.
o EPGs are usually not very precise—recording may miss head or tail
o Triggering the DVR by SCTE-35 messages ensures that the recording does not miss
the beginning or end of the program.
· Enforcing DRM restrictions on specific programs and regions
o Due to content rights, sometimes an entire program cannot be sent to a given region
or transmitted over the Internet.
o Live content may be recorded into VOD servers for later access.
o SCTE-35 markers can be used to segment these recordings and automatically create
separate programs in the VOD server.
FRAME ACCURACY
At the inserter on the programmer side (refer to Figure 1), the SCTE-104 markers are
associated with very specific, well-defined video frames. These markers are inserted in
the VANC for a specific frame, so they are completely frame accurate. When the
baseband video is compressed, the corresponding SCTE-35 markers make a reference to
the Presentation Time Stamp (PTS) of a specific frame in the bitstream. Therefore, with
the right equipment, frame accuracy can be maintained end-to-end.
Applications can benefit from this as follows:
· Frame-accurate DVR recording: start and end at the right places
· Exact ad placement
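The relationship between a PTS value and a specific frame can be illustrated with a short calculation. MPEG-2 transport streams express PTS as a 33-bit count of a 90 kHz clock, so at 30000/1001 frames per second each frame lasts exactly 3003 ticks. The helper names below are hypothetical, and the sketch assumes a constant frame rate and ignores any time adjustment applied to the marker in transit.

```python
from fractions import Fraction

PTS_CLOCK_HZ = 90_000   # PTS ticks per second
PTS_WRAP = 1 << 33      # PTS is a 33-bit counter


def frame_to_pts(frame_index, frame_rate, first_pts=0):
    """PTS of a frame, given the PTS of frame 0 and a constant frame rate."""
    ticks = Fraction(frame_index * PTS_CLOCK_HZ) / frame_rate
    return (first_pts + round(ticks)) % PTS_WRAP


def pts_to_frame(pts, frame_rate, first_pts=0):
    """Return (nearest frame index, whether the PTS lands exactly on it)."""
    elapsed = (pts - first_pts) % PTS_WRAP
    frames = Fraction(elapsed) * frame_rate / PTS_CLOCK_HZ
    index = round(frames)
    return index, frames == index


rate = Fraction(30000, 1001)            # 29.97 fps
marker_pts = frame_to_pts(100, rate)    # marker placed on frame 100
print(marker_pts, pts_to_frame(marker_pts, rate))
```

A marker whose PTS does not land exactly on a frame (the second element of the returned pair is `False`) indicates that frame accuracy was lost somewhere in the chain.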