Content Analysis and Network DELivery Architectures


PR - Publications

"The CANDELA video sequences will be publicly available via this website in the near future. Currently we are in the process of arranging hosting for the DVD data"

Name/Partner Abstract/Title Venue Date
J. Nesvadba, A. Hanjalic, P. Fonseca, B. Kroon, H. Celik, E. Hendriks Title: Towards a real-time and distributed system for face detection, pose estimation and face-related features



The evolution of storage capacity, computation power and connectivity in Consumer-Electronics (CE), in-vehicle, medical-IT and on-chip networks allows the easy implementation of grid-computing-based real-time and distributed face-related analysis systems. A combination of face-related analysis components, called Service Units (SUs), such as face detection, pose estimation, face tracking and facial feature localization provides the basic set of visual descriptors required for advanced facial- and human-related feature analysis SUs, such as face recognition and facial-based mood interpretation. Smart reuse of the available computational resources across individual CE devices or across in-vehicle or medical-IT networks, in combination with descriptor databases, facilitates the establishment of a powerful analytical system applicable to various domains and applications.

Invited Paper, Proc. Int. Conf. on Methods and Techniques in Behavioral Research, Wageningen, The Netherlands, 2005
Paavo Pietarila, Utz Westermann, Sari Järvinen, Jari Korva, Janne Lahti, Henri Löthman "CANDELA - Storage, Analysis and Retrieval of Video Content in Distributed Systems: Personal Mobile Multimedia Management"

accepted to IEEE International Conference on Multimedia and Expo (ICME 2005),


Amsterdam, The Netherlands July 6-8, 2005.
E.G.T. Jaspers, R.G.J. Wijnhoven, A.H.R. Albers, X. Desurmont, M. Barais, J. Hamaide, B. Lienard "CANDELA – Storage, Analysis and Retrieval of Video Content in Distributed Systems: Real-time Video Surveillance and Retrieval", Proc. of Int. Conf. for Multimedia and Expo (ICME)

Abstract - Although many different types of technologies for information systems have evolved over the last decades (such as databases, video systems, the Internet and mobile telecommunication), the integration of these technologies is just in its infancy and has the potential to introduce "intelligent" systems. This paper describes the novelties of a video content analysis in a surveillance system, demonstrating the benefits for fast retrieval in huge video databases.

Amsterdam, The Netherlands July 6-8, 2005.
Multitel Material from CANDELA project for performance evaluation of content analysis:
Scenario:  Abandoned object.

The detection of abandoned objects is essentially the detection of idle (non-moving) objects that remain stationary for longer than a certain period of time. This period is adjustable. Idle objects should be detected in several types of scenes. In a parking lot, for example, an idle object can be a parked car or a left suitcase. This scenario does not look at the object types "person" or "car", but at unidentified objects, called "unknown objects". An unknown object is any object that is not a person or a vehicle. In general, unknown objects cannot move.

Video Analysis Group
Multitel A.S.B.L. , Mons
Parc Initialis  B-7000 Mons (Belgium)
ph: +32 (65) 374752  fax:+32 (65) 374729
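
The abandoned-object scenario above can be illustrated with a minimal sketch. It assumes that an upstream tracker already delivers per-object centroid observations and a class label; the names `Observation`, `find_abandoned`, `min_idle_seconds` and `max_drift` are hypothetical and not taken from the CANDELA material.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Observation:
    frame: int   # frame index at which the object was observed
    x: float     # centroid position in pixels
    y: float


def find_abandoned(tracks, labels, fps, min_idle_seconds=30.0, max_drift=5.0):
    """Return ids of "unknown" objects that stayed idle long enough.

    tracks: dict mapping object id -> list of Observation, ordered by frame.
    labels: dict mapping object id -> "person", "car" or "unknown".
    An object counts as idle while its centroid stays within max_drift
    pixels of the position where it last came to rest (assumed criterion).
    """
    min_frames = int(min_idle_seconds * fps)
    abandoned = []
    for obj_id, observations in tracks.items():
        # The scenario only considers objects that are neither person nor car.
        if labels.get(obj_id) != "unknown":
            continue
        anchor = None
        for obs in observations:
            moved = anchor is None or hypot(obs.x - anchor.x, obs.y - anchor.y) > max_drift
            if moved:
                anchor = obs  # object moved: restart the idle timer here
            elif obs.frame - anchor.frame >= min_frames:
                abandoned.append(obj_id)
                break
    return abandoned
```

The idle period is exposed as a parameter because the scenario states that it is adjustable; the drift tolerance is an added assumption to absorb tracker jitter.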

J. Nesvadba, P. Fonseca , A. Sinitsyn, F. de Lange, M. Thijssen, P. van Kaam, H. Liu, R. van Leeuwen, J. Lukkien, A. Korostelev, J. Ypma, B. Kroon, H. Celik, A. Hanjalic, U. Naci, J. Benois-Pineau , P. de With, J. Han Real-Time and Distributed AV Content Analysis system for Consumer Electronics Networks

Abstract: The ever-increasing complexity of generic Multimedia-Content-Analysis-based (MCA) solutions, their demand for processing power and the need to prototype and assess solutions in a fast and cost-saving manner motivated the development of the Cassandra Framework. The combination of state-of-the-art network and grid-computing solutions and recently standardized interfaces facilitated the set-up of this framework, forming the basis for multiple cross-domain and cross-organizational collaborations [1]. It enables distributed computing scenario simulations, e.g. for Distributed Content Analysis (DCA) across Consumer Electronics (CE) in-home networks, as well as the rapid development and assessment of complex multi-MCA-algorithm-based applications and system solutions. Furthermore, the framework's modular nature, in which logical MCA units are wrapped into so-called Service Units (SUs), eases the split between system-architecture-related and algorithm-related work and additionally facilitates reusability, extensibility and upgradeability of those SUs.

Int. Conf. for Multimedia and Expo ICME, Amsterdam, The Netherlands, July 6-8, 2005
P. Fonseca, J. Nesvadba Face Tracking in the Compressed Domain

Abstract: A compressed domain generic object tracking algorithm offers, in combination with a face detection algorithm, a low computational cost solution to the problem of detecting and locating faces in frames of compressed (such as MPEG-1 or MPEG-2) video sequences. Objects such as faces can thus be tracked through a compressed video stream using motion information provided by existing forward and backward motion vectors. The described solution requires only low computational resources on CE devices and offers at one and the same time sufficiently good

Journal: EURASIP Journal on Applied Signal Processing, 17 May 2005
J. Nesvadba, F. Ernst, A. Dommisse, J. Perhavc, J. Benois-Pineau Comparison of Shot Boundary Detectors

Abstract: Video Shot Boundary Detection (SBD) is an essential element for spatio-temporal audiovisual (AV) segmentation and video processing technologies. Platform, processing and performance constraints forced the development of various dedicated SBDs. Future platforms allow the usage of advanced SBD algorithms with higher reliability. In order to enable an appropriate trade-off decision to be made between reliability and the required processing power, four SBD algorithms were benchmarked on the basis of a generic, real-world multi-genre AV corpus. In terms of complexity/performance trade-off, a field-difference based SBD proved to be optimal.

Venue: Int. Conf. for Multimedia and Expo ICME, Amsterdam, The Netherlands, July 6-8, 2005
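
The abstract above does not spell out the field-difference detector, so the following is only a minimal sketch of the general idea behind difference-based shot boundary detection: compare consecutive frames and report a cut where the difference exceeds a threshold. It uses plain frame differencing rather than the field-difference method that was benchmarked. The function names, the nested-list grayscale frame representation and the threshold value are illustrative assumptions.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute pixel difference between two equally sized, non-empty
    grayscale frames given as lists of rows of integer pixel values."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    return total / count


def detect_cuts(frames, threshold=40.0):
    """Return the indices of frames that start a new shot.

    A cut is reported at index i when frame i differs from frame i - 1
    by more than `threshold` (an assumed, content-dependent value).
    """
    cuts = []
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold:
            cuts.append(i)
    return cuts
```

A lower threshold catches more boundaries at the price of more false alarms on fast motion.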
F. de Lange, J. Nesvadba

A Networked Hardware/Software Framework for the Rapid Prototyping of Multimedia Analysis Systems

Abstract: This paper describes a hardware/software framework and approach for fast integration and testing of complex real-time multimedia analysis algorithms. It enables the rapid assessment of combinations of multimedia analysis algorithms, in order to determine their usefulness in future consumer storage products. The framework described here consists of a set of networked personal computers, running a variety of multimedia analysis algorithms and a multi-media database. The database stores both multimedia content and metadata – as generated by multimedia content analysis algorithms – and maintains links between the two. The multimedia (meta)database is crucial in enabling applications to offer advanced content navigation and searching capabilities to the end-user. The full hardware/software solution functions as a test-bed for new, advanced content analysis algorithms; new algorithms are easily plugged-in into any of the networked PCs, while outdated algorithms are simply removed. Once a selected consumer system configuration has passed important user-tests, a more dedicated embedded consumer product implementation is derived in a straightforward way from the framework.

Int. Conf. on Web Information Systems and Technologies (WEBIST), co-located with ICEIS 2005, Miami, USA, May 26-28, 2005
F. de Lange, J. Nesvadba

Applying PC network technology to assess new multimedia content analysis applications for future consumer electronics storage devices

Abstract: The paper deals with software productivity improvement for consumer multimedia devices by means of PC & component technology and shows how this is done for complex real-time content analysis applications that are expected to be used in advanced new storage products of the future.

4th Int. Conf. On Intelligent Multimedia Computing and Networking (IMMCN) Salt Lake City, USA, July 21-26, 2005
Candela Candela Demo's ICME 2005 July 2005
J. Nesvadba, N. Louis, J. Benois-Pineau, M. Desainte-Catherine and M. Klein Middelink "Low-level cross-media statistical approach for semantic partitioning of audio-visual content in a home multimedia environment" Proc. IEEE IWSSIP’04 (Int. Workshop on Systems, Signals and Image Processing), pp. 235-238, Poznan, Poland, September 13-15, 2004
J. Nesvadba, P. Miguel Fonseca, R. Kleihorst, H. Broers, J. Fan, ‘Face Related Features in Consumer Electronic (CE) device environments’, Invited paper



Proc. IEEE Int'l Conf. on Systems, Man, and Cybernetics, pp 641-648, ISBN 0-7803-8567-5, The Hague, Netherlands, 2004, Special Session on Automatic Facial Expression Recognition


July 2005
W. Berkvens, A. Claassen, J. van Gassel, A. Sinitsyn 'Media Distribution in a Pervasive Computing Environment'

Abstract: Distribution of media in the fast-growing world of digitally stored content and connected multimedia devices calls for a new media distribution architecture. The user should be provided with the experience of having an overview of his full media collection, regardless of the time, the place, and the connectivity. The architecture presented in this paper fulfils these needs and can furthermore cooperate with non-compliant devices.

IEEE Proc. Conf. on Pervasive Computing and Communications (Perware04), accepted for publication March 2005
Press release from VTT and Solid to promote CANDELA and published in several press/online news


VTT is serving as an expert in a European project for developing processing methods for mobile videos. In the near future consumers will be able to store videos taken with video cameras and video phones in their personal digital archives, where they can search and browse them, share them with their friends and view them on their own devices. The videos are easy to find and view on a computer, mobile phone or handheld computer. The new methods promote the commercialisation of video services and business activity in the field and improve the competitiveness of Finnish companies.


May 2004
Wolski and K. Laiho: "Rolling Upgrades for Continuous Services",

Proc. International Service Availability Symposium (ISAS 2004), 

May 13-14, 2004, Munich, Germany.
J. van der Peijl, “Automated detection and segmentation of the lungs in CT datasets”, Master thesis at Eindhoven University of Technology, Nov. 2003
M. Quist, H. Bouma, F.A. Gerritsen, “Computer-aided pulmonary embolism detection on multi-detector CT”, ECR 2004, the Matrix

I.W.O. Serlie, et al.

“Automatic cleansing for CT colonography using a three material transition model”,

5th International Virtual Colonoscopy Symposium, Boston 2004

Vries, et al.

“Feasibility of automatic prone-supine matching in CT-colonography: precision and practical use”,

5th International Virtual Colonoscopy Symposium, Boston 2004

P. Fonseca, J. Nesvadba

“Face detection in the compressed domain”, Proc. IEEE Int. Conf. on Image Processing,

Singapore, October 2004

Vehkaperä, Janne

 Behaviour of Scalable Video in Packet-switched Networks

Master Thesis VTT


Sari Järvinen, Jari Korva, Anna Sachinopoulou, Paavo Pietarila, Janne Vehkaperä

"Videoarkistot Hallintaan",

Prosessori Special issue on electronics design,

November 2004

A-P Liedes


 “Checkpointing a Main-Memory Database”

Master Thesis Helsinki University of



11 October 2004

Rintaluoma, Tero

Optimisation of Software Based MPEG-4 Video Decoder

Master Thesis Hantro


Hyyryläinen A

Realisation of image post-processing in a hardware-based video decoder  

Master Thesis Hantro


J. Peltola et al


Video jungle: The ease of creating and viewing video content has produced a multimedia jungle. Solutions are needed to store this multimedia material in a manner that facilitates retrieving the multimedia presentations, or parts of videos, that are of interest to the user. Delivering these videos to the user via a range of different networks and terminals also requires scalable and adaptive solutions. VTT, Solid and Hantro, in cooperation with their international partners in the Candela project, are working on solutions related to video and audio analysis, compression, storage, retrieval and networked delivery.

“Ensiapua videoviidakossa (Video jungle)”, Prosessori, Vol. 24

11 November 2003

M. Rautiainen, et al


MediaTeam Oulu and VTT Technical Research Centre of Finland participated jointly in the semantic feature extraction, manual search and interactive search tasks of TRECVID 2003. We participated in semantic feature extraction by submitting results for 15 of the 17 defined semantic categories. Our approach utilized spatio-temporal visual features based on correlations of quantized gradient edges and color values together with several physical features from the audio signal. The most recent version of our Video Browsing and Retrieval System (VIRE) contains an interactive cluster-temporal browser of video shots exploiting three semantic levels of similarity: visual, conceptual and lexical. The informativeness of the browser was enhanced by incorporating automatic speech recognition (ASR) transcription texts into the visual views based on shot key frames. The experimental results for the interactive search task were obtained by conducting a user experiment with eight people and two system configurations: browsing by (I) visual features only (visual and conceptual browsing was allowed, no browsing with ASR text) or (II) visual features and ASR text (all semantic browsing levels were available and ASR-text content was visible). The interactive results using ASR-based features were better than the results using only visual features. This indicates the importance of successful integration of both visual and textual features for video browsing. In contrast to the previous version of VIRE, which performed early feature fusion by training unsupervised self-organizing maps, the newest version capitalises on late fusion of feature queries, which was evaluated in the manual search task. This paper gives an overview of the developed system and summarises the results.

Experiments at MediaTeam Oulu and VTT”. TRECVID Workshop at Text 2003

X. Desurmont, J-F. Delaigle

“A generic flexible and robust approach for intelligent real-time video-surveillance systems”, Proceedings of the SPIE - Real-time imaging VIII, Vol 5297, No 1

San Jose, CA, USA, p 134-141

20-22 Jan. 2004

Xavier Desurmont, et al

“A seamless modular image analysis architecture for surveillance systems” p 66-70

London, UK

23 Feb. 2004

Dirk Farin, et al


We present a new semi-automatic segmentation tool, which is motivated by the Intelligent Scissors algorithm, but which uses a modified concept of user-interaction. This new interface provides better capabilities for modifying previous segmentation results. The advantage of the new approach is that it enables the quality of the segmentation to be increased gradually. The segmentation tool is based on a shortest circular path search within a corridor that is drawn by the user along the object boundary. For this purpose, we present a new algorithm for computing the shortest circular paths. Our algorithm is so fast that it almost achieves the speed of a regular non-circular shortest path search, while still ensuring an optimal solution.

“Corridor Scissors: A Semi-Automatic Segmentation Tool Employing Minimum-Cost Circular Paths”

IEEE Proceedings Int. Conf. on Image Processing (ICIP), Singapore

Oct. 2004

Dirk Farin, Peter H. N. de With, Wolfgang Effelsberg


The background subtraction algorithm is a frequently-used object segmentation technique because of its algorithmic simplicity. However, we show that for general rotational camera-motion, it is impractical or even impossible to use a single background image. As a solution, we propose to use multi-sprite backgrounds, which enable processing of arbitrary rotational camera-motion. This paper describes a complete video-object segmentation system employing multi-sprites. The system generates object masks and background sprites that are compatible with the MPEG-4 object-oriented video-coding tools. Good segmentation results are also obtained for sequences that cannot be processed with ordinary background images.

Video-Object Segmentation using Multi-Sprite Background Subtraction

proceedings of the IEEE International Conference on Multimedia and Expo (ICME) Taipei, Taiwan.

June 2004

Ling-Zhen Wu

Master thesis

Advanced Connection Management - establishing cooperation of distributed image processing components

TU Eindhoven


R.G.J. Wijnhoven et al


In this paper we describe an architecture for a multi-client, multi-channel video streaming server targeted at the security market. To obtain scalability in bit-rate, multiple compressed video streams are available for each video channel. For transmission to the user, one stream has to be selected for each video channel. Parameters and cost functions to derive the optimal set of streams are defined and heuristic algorithms to find the best combination of video streams are introduced.

Multi-channel video streaming server for surveillance systems

IEEE proceedings of the International Symposium on Consumer Electronics, Reading, UK

1-3 September 2004
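
The stream-selection problem described in the abstract above (one compressed stream per video channel, chosen under a bandwidth constraint) can be illustrated with a small greedy heuristic. This is a sketch under stated assumptions, not the cost functions or heuristics of the paper, which the abstract does not specify: each stream is modelled as a hypothetical `(bitrate_kbps, quality)` pair, and the goal is assumed to be maximising total quality within the available bandwidth.

```python
def select_streams(channels, bandwidth):
    """Pick one stream per channel without exceeding `bandwidth`.

    channels: dict mapping channel name -> non-empty list of
              (bitrate_kbps, quality) tuples (assumed representation).
    Returns a dict channel -> chosen (bitrate_kbps, quality), or None
    if even the cheapest stream of every channel does not fit.
    """
    # Sort each channel's streams by bitrate; start from the cheapest one.
    options = {ch: sorted(streams) for ch, streams in channels.items()}
    choice = {ch: 0 for ch in options}
    used = sum(options[ch][0][0] for ch in options)
    if used > bandwidth:
        return None

    while True:
        best = None  # (quality gain per extra kbps, channel, index, extra kbps)
        for ch, idx in choice.items():
            cur_rate, cur_quality = options[ch][idx]
            for j in range(idx + 1, len(options[ch])):
                rate, quality = options[ch][j]
                extra = rate - cur_rate
                gain = quality - cur_quality
                if gain <= 0 or used + extra > bandwidth:
                    continue
                ratio = gain / extra if extra > 0 else float("inf")
                if best is None or ratio > best[0]:
                    best = (ratio, ch, j, extra)
        if best is None:
            break  # no affordable upgrade is left
        _, ch, j, extra = best
        choice[ch] = j
        used += extra

    return {ch: options[ch][idx] for ch, idx in choice.items()}
```

The greedy upgrade rule is fast but not guaranteed to find the optimal combination; an exact search over all combinations would be the reference against which such a heuristic is judged.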

P. Merkus, et al

“CANDELA – Integrated Storage, Analysis and Distribution of Video Content for Intelligent Information Systems”

Proceedings of the European Workshop on the Integration of Knowledge, Semantics and Digital Media Technology, EWIMT, London, U.K.

Nov. 25-26, 2004

P. Merkus

“Embedded software for digital surveillance video products by Bosch Security Systems”

BITS&Chips conference, Eindhoven, The Netherlands.

22 April 2004

R.G.J. Wijnhoven Bosch et al


In this paper we describe an architecture for a multi-client, multi-channel video streaming server targeted at the security market. For each connected client, the best selection of available compressed video streams per channel is made, optimising the perceived quality of the video channels requested by the client. We propose techniques to scale the bit rate per video stream and introduce a scheduling scheme to select the best video streams for a given set of channels and bandwidth constraints.

 "Architecture for Multi-Client Multi-Channel Compressed Video Streaming", in Proceedings of the VCA Workshop On the Design of Multimedia Architectures, Eindhoven, The Netherlands,  December 2003
Rob Wijnhoven Bosch


Verification of Video Content Analysis (VCA) algorithms is quite difficult. To verify the performance of an algorithm, the results have to be compared with the results that should have been generated, which are called the “ground truth”. The generation of this ground truth has to be done manually. This document describes the scenarios that will be used within the CANDELA project for the verification of video content analysis systems.

Scenario Description Document 2004
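
Comparing algorithm results with a manually created ground truth, as described above, is commonly summarised with precision and recall. The sketch below is one possible way to do this and is not taken from the scenario document: events are assumed to be `(start_frame, end_frame)` intervals, a detection is assumed to match a ground-truth event when the two intervals overlap, and the function names are hypothetical.

```python
def overlaps(a, b):
    """True if the closed frame intervals a and b share at least one frame."""
    return a[0] <= b[1] and b[0] <= a[1]


def evaluate(detections, ground_truth):
    """Return (precision, recall) of detections against the ground truth.

    Each ground-truth event can be matched by at most one detection, so
    several detections of the same event count only once as correct.
    """
    matched = set()
    true_positives = 0
    for det in detections:
        hit = next(
            (i for i, gt in enumerate(ground_truth)
             if i not in matched and overlaps(det, gt)),
            None,
        )
        if hit is not None:
            matched.add(hit)
            true_positives += 1
    precision = true_positives / len(detections) if detections else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```

Precision drops with false alarms and recall drops with missed events, so reporting both shows which kind of error an algorithm makes.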
Article in the "Prosessori"

Candela video-Viidakossa


November 2003