DiVA: a Distributed Video Analysis framework applied to video-surveillance systems
Juan Carlos San Miguel, Jesús Bescós, José M. Martínez and Álvaro García
Grupo de Tratamiento de Imágenes
Escuela Politécnica Superior, Universidad Autónoma de Madrid, SPAIN
E-mail: {Juancarlos.Sanmiguel, J.Bescos, JoseM.Martinez, Alvaro.Garciamartin}@uam.es

Abstract

This paper describes a generic, scalable and distributed framework for real-time video analysis intended for research, prototyping and service deployment. The architecture supports multiple cameras and is based on a server/client model. The information generated by each analysis module and the context information are made accessible to the whole system through a database system. System modules can be interconnected in several ways, thus achieving flexibility. The two main design criteria have been low computational cost and easy component integration. The experimental results show the potential of this system.

1. Introduction

The growing demand for surveillance, especially for outdoor/indoor security in buildings, is increasing the need to create and develop intelligent systems. Most surveillance systems suffer from non-scalability or low frame rates due to computationally expensive algorithms. Current research in this field is focused on the design of wide-area automatic and intelligent surveillance systems (third generation systems)[1]. These systems can be categorized as concurrent, distributed, embedded and real time. They are usually designed in a synchronous way using an object-oriented approach (e.g. CORBA), which can produce bottlenecks or exhaustion of resources. Usually, the systems deployed are designed for a specific context and cannot be easily adapted to other contexts. It is generally accepted that the use of context information improves the quality of the results of the analysis processes[2]. Traditionally, this information has been incorporated into these processes through manual parameter adjustment or implicit inclusion in the algorithm code.

The present work proposes an approach to the design of intelligent distributed surveillance systems. This approach is based on a distributed and flexible configuration design that adds low computational cost to the algorithms running on the system. Due to easy component integration, the system provides a good environment for researchers to develop new video analysis algorithms.

The paper is organised as follows: section 2 describes the state-of-the-art in video-surveillance system design and architectures, sections 3 and 4 describe, respectively, the system and its components, section 5 shows the performance obtained with this flexible and scalable approach and finally section 6 concludes and indicates future work.

2. State-of-the-art

The work presented in this paper covers mainly the design and implementation of video-analysis systems where multiple cameras are in play; video-surveillance systems are currently among the most important systems demanding these features. The requirements for designing this type of system are the object of very active research[1][3][4]. In general, they can be described by the following desirable functionalities: 1) scalable systems with computational load distribution, 2) real-time operation, 3) low resource consumption, 4) communication control, 5) communication over standard networks and 6) runtime re-configuration.

Several surveillance systems have been designed and developed by industry and academia. In the literature, these systems are described at a very high level, which makes it difficult to distinguish the functionalities of the different approaches. One of the classical distinctions between these systems is their purpose: they can be divided into specialized and general-purpose systems. For instance, [5] focuses on metro stations and [6] focuses on traffic surveillance. Another classification distinguishes distributed from non-distributed systems. The need for distributed systems is briefly discussed in [4]. The distribution process can be further categorized depending on which technology is used. Thus, there are systems like [7][8]
that use their own communication protocol to distribute work over the network; others use RSTP[9], SOAP[4] or CORBA[5] to manage their communications. Systems can also be classified by whether or not a centralised server controls all the system components. In [3] a system without a server is presented, which avoids centralisation by making all the independent subsystems completely self-contained. On the other hand, systems like [10] use a central server, which restricts system scalability but allows better system management.

The design of this type of system [5][6] is usually based on an object-oriented and synchronous approach (e.g. CORBA). This approach can produce overhead at run-time and may cause communication bottlenecks. To avoid this constraint, the MASCOT[11] design method was proposed for improving the communication protocol, making it simple and allowing asynchronous communications.

3. Framework overview

The proposed framework is divided into two levels of abstraction: the physical part and the logical part.

The physical part (see Fig. 1) is composed of the required hardware: the cameras and a cluster of standard computers (PCs) connected together through a fast Ethernet network.

Figure 1. Physical system description

In order to cope with bandwidth restrictions and to improve the system performance, the network architecture is composed of two networks. The main processing units of the system are a set of rack-mounted standard PCs interconnected by a dedicated Gigabit Ethernet (core network). The other system units (mainly processing modules) are distributed in a 100BaseT Ethernet network around the core network. Different types of cameras are plugged either into an acquisition card on a PC or, for IP cameras, directly into the Ethernet network. The computers are used to acquire the video, run algorithms and store the data. The main advantage of this architecture is its flexibility. Future needs in computing power can be addressed simply by adding PCs to the cluster (or replacing existing ones with more powerful ones).

The logical part is composed of four independent layers (see Fig. 2). Each layer is designed in a modular way and has a specific role. The next section describes them in more detail. The different modules can be distributed in several ways, allowing flexible configuration. The system also supports the addition of modules at operation time. Depending on application requirements, layers can be combined into one single component with the required functionality (e.g. fast data acquisition, fast module intercommunication).

[Figure 2 diagram labels: Video source; Users; Acquisition layer (video capture system); Processing modules layer: analysis modules (analysis algorithms, classifiers and reasoners, …) and presentation system (display, …); Data Management layer: analysis results (database system) and context knowledge (database system); exchanged data: video frames, images, MPEG-7 descriptors, context information]

Figure 2. Logical system layers

The main features of the proposed system are:

o Distributed environment for research, prototyping and deployment of visual analysis systems with multicamera and contextual information support.
o Modular and multithreaded design.
o Based on processing at frame level.
o Based on a client/server model.
o Flexible configuration (cascading or parallel interconnection of processing algorithms).
o Monitoring and reuse of analysis results.
o Plug and play support.
o Asynchronous operation mode.

4. Logical System layer description

This section describes the different layers of the logical system of DiVA.

4.1 Acquisition layer

This layer acquires the video from multiple video feeds and distributes it frame by frame to the whole system. The distribution is based on a server/client model. Video frames are currently exchanged using baseline JPEG (ISO/IEC 10918-1) or uncompressed format. A time stamp is attached to each frame at grabbing time, and used in the processing stage (e.g., tracking algorithms).

Due to its modular design, the system can easily support new camera connectivity protocols with very little effort. Currently, our system handles IP, IEEE1394, GigE and USB protocols as well as input via video files.

4.2 Communications layer

The communication between system components is based on a server/client model using point-to-point connections; flow control is realized through a TCP-based network. To initiate a communication, the client logs on to the server and then starts the data transfer. The communications protocol is based on transmitting only useful information, allowing fast communication between modules. To avoid network problems, data buffering between modules is supported at both sides.

4.3 Processing modules layer

In our approach, a processing module is a component of the system responsible for some particular task not related to the other layers (e.g. video content analysis module, player module).

The modules run concurrently and asynchronously: typically each module will run on its own processor, but this is not mandatory. This is the main mechanism of parallel and distributed processing in our approach. The modules are arranged in a fixed network topology as determined by the overall system structure. They communicate only with the Acquisition and Data Management layers to request and store data. In these two layers, there are servers waiting to deliver the stored content. The processing modules act as clients requesting the video data and analysis results necessary for completing their task. Each module has a specific task, and the modules cooperate to achieve the system goals. The communication between modules is mapped through a database system. Modules read/write their results in the database system (acting as a content server).

To allow easy integration in the overall system, some module templates have been created to ensure fast development of new algorithms within the framework.

4.4 Data Management layer

This layer is in charge of storing and distributing information required for analysis purposes. Due to the existence of different types of information in the system, this layer is divided into two modules: one for analysis results and the other for contextual information.

The first module is composed of a database that manages the availability and intercommunication of analysis results between system modules. This module has two different levels of storage: one for system configuration and the other for analysis results. Metadata associated with analysis results are generated in MPEG-7 format. To exploit the information described, the metadata are extracted from the MPEG-7 structures and stored in the database. A relational database is used with a straightforward mapping of the data onto the database tables. Additionally, some of the MPEG-7 structures are compressed to reduce their size, providing a bandwidth reduction in the communication processes.

The second module enables the use of domain and application context knowledge in the whole system. It allows the use of the system (designed as a ‘generic’ system) in different specific scenarios. This kind of information is coded using ontologies for knowledge representation and stored in this module. System modules can then request this information and use it in their tasks.

5. Results and performance

The proposed framework has been validated on real use-cases. One of them is the detection of abandoned or removed objects in three different scenarios at the same time (each with one camera input: IEEE1394, IP and GigE), alerting when an event occurs.

A logical description of the system modules can be seen in Fig. 3. The implementation runs on a Pentium 4 at 2.8GHz under Windows OS. For each scenario, the system extracts the foreground objects and classifies the static foreground regions as abandoned or removed objects. The candidate foreground object extraction task involves two Processing Modules (PM): one performs foreground segmentation (PM1) and the other candidate object extraction (PM2). The former generates a binary mask that is stored in the database system. The latter requests video frames and previous segmentation masks, and generates an MPEG-7 description of the candidate objects that is stored in the database system. Finally, module PM3 performs event detection using video data and the MPEG-7 description previously generated: candidate objects are classified as abandoned or
removed by matching the boundaries of static foreground regions. These three modules run on the same PC.

Module PM4 uses the final MPEG-7 descriptions generated for each scenario, showing the alarms detected over the corresponding scenario frames. The measured metadata flow is less than 15KB/s per communication. This shows that the overhead introduced by multiple transmissions is lower than that of a video frame transmission.

Overall results show that the system can process data from multiple cameras in real time at CIF resolution, confirming that the computational cost introduced by the system does not reduce its performance. This configuration demonstrates that DiVA allows flexible configuration.

6. Conclusions and future work

This paper presents a distributed surveillance system/framework that allows flexible configuration and dynamic reconfiguration at runtime. Due to its low computation management cost, it operates in real time on standard computers. Additionally, it provides a flexible environment to develop computer vision algorithms via easy component integration.

In the future, other extensions and improvements will be made to the global system, such as the integration of a compressed video analysis path or the adaptation of the system to work under Linux (using POSIX threads for multitasking scheduling).

7. Acknowledgements

This work is supported by Cátedra Infoglobal-UAM for “Nuevas Tecnologías de video aplicadas a la seguridad”, by the Spanish Government (TEC2007-65400 SemanticVideo), by the Comunidad de Madrid (S-050/TIC-0223 - ProMultiDis-CM), by the Consejería de Educación of the Comunidad de Madrid and by the European Social Fund. The authors would like to thank J. Molina, H. Cabrejas and V. Fernández-Carbajales for their valuable contributions.

References

[1] M. Valera and S.A. Velastin, “Intelligent distributed surveillance systems: A Review”, IEEE Procs. of VISP, 152(2):192-204, 2005.
[2] N. Maillot, M. Thonnat, A. Boucher, “Towards Ontology Based Cognitive Vision”, Procs. of MVA, 16(1):33-40, 2004.
[3] M. Christensen and R. Alblas, “V2-design issues in distributed video surveillance systems”, Tech. Rep., Department of Computer Science, Aalborg University, 2000.
[4] R.G.J. Wijnhoven et al., “Flexible Surveillance System Architecture for prototyping Video Content Analysis Algorithms”, Procs. of SPIE 6073, 2006.
[5] N. Siebel and S. Maybank, “The advisor visual surveillance system”, Procs. of ECCV, pp. 103-111, 2004.
[6] S. Kamijo et al., “Development and evaluation of real-time video surveillance system on highway based on semantic hierarchy and decision surface”, IEEE Procs. of SMC, pp. 840-846, 2005.
[7] X. Yuan et al., “A distributed visual surveillance system”, Procs. of AVSS, pp. 199-204, 2003.
[8] H. Dias et al., “Distributed Surveillance System”, Procs. of EPIA, pp. 257-261, 2005.
[9] D. Ostheimer, “A modular distributed Video Surveillance System Over IP”, Procs. of CCECE, pp. 518-521, 2006.
[10] B. Lo et al., “An Intelligent Distributed Surveillance System For Public Transport”, IEEE Procs. of IDSS, pp. 10/1-10/5, 2003.
[11] M. Valera et al., “Real-Time Architecture for a large distributed Surveillance System”, IEEE Procs. of IDSS, pp. 41-45, 2004.

Figure 3. Logical system description for sample application
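
The module arrangement of Figure 3 (PM1 to PM4 exchanging results only through the database system, as described in sections 4.3 and 5) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names ResultStore, pm1_segment, pm2_candidates, pm3_events and pm4_alarms are hypothetical, an in-memory dictionary stands in for the relational database, a bounding box stands in for the MPEG-7 description, and the event rule in pm3_events is a simple "object stays in place" placeholder, not the boundary-matching classification used in the paper.

```python
from threading import Lock


class ResultStore:
    """Hypothetical stand-in for the database system acting as content server."""

    def __init__(self):
        self._rows = {}
        self._lock = Lock()  # modules may run in different threads

    def put(self, kind, timestamp, value):
        with self._lock:
            self._rows[(kind, timestamp)] = value

    def get(self, kind, timestamp):
        with self._lock:
            return self._rows.get((kind, timestamp))


def pm1_segment(store, timestamp, frame, background, threshold=30):
    """PM1: store a binary foreground mask (plain background differencing)."""
    mask = [[abs(p - b) > threshold for p, b in zip(row, bg_row)]
            for row, bg_row in zip(frame, background)]
    store.put("mask", timestamp, mask)


def pm2_candidates(store, timestamp):
    """PM2: read the mask and store a candidate descriptor (bounding box)."""
    mask = store.get("mask", timestamp)
    points = [(x, y) for y, row in enumerate(mask)
              for x, on in enumerate(row) if on]
    box = None
    if points:
        xs, ys = zip(*points)
        box = (min(xs), min(ys), max(xs), max(ys))
    store.put("candidate", timestamp, box)


def pm3_events(store, timestamps, min_static_frames=3):
    """PM3: placeholder rule, flag a candidate that does not move for
    min_static_frames consecutive frames (not the paper's classifier)."""
    run, previous = 0, None
    for t in timestamps:
        box = store.get("candidate", t)
        if box is not None and box == previous:
            run += 1
        else:
            run = 1 if box is not None else 0
        previous = box
        if run == min_static_frames:
            store.put("event", t, {"type": "static_object", "box": box})


def pm4_alarms(store, timestamps):
    """PM4: collect the stored events for presentation."""
    return [(t, store.get("event", t)) for t in timestamps
            if store.get("event", t) is not None]


# Tiny synthetic run: a 4x4 grey-level scene where an object appears at
# timestamp 1 and then stays in place.
background = [[0] * 4 for _ in range(4)]
store = ResultStore()
timestamps = list(range(5))
for t in timestamps:
    frame = [row[:] for row in background]
    if t >= 1:
        frame[1][2] = 200
    pm1_segment(store, t, frame, background)
    pm2_candidates(store, t)
pm3_events(store, timestamps)
alarms = pm4_alarms(store, timestamps)
print(alarms)
```

In this sketch no module calls another one directly: each reads what it needs from the store by time stamp and writes its own result back, which is the property that lets modules be cascaded, run in parallel or moved to another machine without changing their code.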
