
COMPUTER SCIENCE, TECHNOLOGY AND APPLICATIONS








COMPUTER ANIMATION


No part of this digital document may be reproduced, stored in a retrieval system or transmitted in any form or
by any means. The publisher has taken reasonable care in the preparation of this digital document, but makes no
expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No
liability is assumed for incidental or consequential damages in connection with or arising out of information
contained herein. This digital document is sold with the clear understanding that the publisher is not engaged in
rendering legal, medical or any other professional services.
COMPUTER SCIENCE,
TECHNOLOGY AND APPLICATIONS


Additional books in this series can be found on Nova’s website at:


https://www.novapublishers.com/catalog/index.php?cPath=23_29&seriesp=Computer%20Science%2C%20Technology%20and%20Applications&sort=2a&page=1



Additional e-books in this series can be found on Nova’s website at:


https://www.novapublishers.com/catalog/index.php?cPath=23_29&seriespe=Computer+Science%2C+Technology+and+Applications

COMPUTER SCIENCE, TECHNOLOGY AND APPLICATIONS










COMPUTER ANIMATION







JARON S. WRIGHT
AND
LLOYD M. HUGHES
EDITORS
















Nova Science Publishers, Inc.
New York



Copyright © 2010 by Nova Science Publishers, Inc.


All rights reserved. No part of this book may be reproduced, stored in a retrieval system or
transmitted in any form or by any means: electronic, electrostatic, magnetic, tape, mechanical
photocopying, recording or otherwise without the written permission of the Publisher.

For permission to use material from this book please contact us:
Telephone 631-231-7269; Fax 631-231-8175
Web Site: http://www.novapublishers.com

NOTICE TO THE READER
The Publisher has taken reasonable care in the preparation of this book, but makes no expressed or
implied warranty of any kind and assumes no responsibility for any errors or omissions. No
liability is assumed for incidental or consequential damages in connection with or arising out of
information contained in this book. The Publisher shall not be liable for any special,
consequential, or exemplary damages resulting, in whole or in part, from the readers’ use of, or
reliance upon, this material. Any parts of this book based on government reports are so indicated
and copyright is claimed for those parts to the extent applicable to compilations of such works.

Independent verification should be sought for any data, advice or recommendations contained in
this book. In addition, no responsibility is assumed by the publisher for any injury and/or damage
to persons or property arising from any methods, products, instructions, ideas or otherwise
contained in this publication.

This publication is designed to provide accurate and authoritative information with regard to the
subject matter covered herein. It is sold with the clear understanding that the Publisher is not
engaged in rendering legal or any other professional services. If legal or any other expert
assistance is required, the services of a competent person should be sought. FROM A
DECLARATION OF PARTICIPANTS JOINTLY ADOPTED BY A COMMITTEE OF THE
AMERICAN BAR ASSOCIATION AND A COMMITTEE OF PUBLISHERS.

LIBRARY OF CONGRESS CATALOGING-IN-PUBLICATION DATA

Computer animation / editors, Jaron S. Wright and Lloyd M. Hughes.
p. cm.
ISBN 978-1-61209-078-8 (eBook)




Published by Nova Science Publishers, Inc. New York











CONTENTS


Preface vii

Chapter 1 Computer Animation Applied to the Recovery of Preindustrial Heritage: A New Approach 1
José Ignacio Rojas-Sola and Francisco Javier Contreras-Anguita

Chapter 2 Virtual Engineering in Augmented Reality 57
Pier Paolo Valentini, Eugenio Pezzuti and Davide Gattamelata

Chapter 3 A Survey of Popular 3D Soft-Body Animation Compression Approaches 85
S. Ramanathan and A.A. Kassim

Chapter 4 Virtual Emotion to Expression: A Comprehensive Dynamic Emotion Model to Facial Expression Generation Using the MPEG-4 Standard 113
Paula Rodrigues, Asla Sá and Luiz Velho

Chapter 5 Example-Based Performance-Driven Animation of an Anatomical Face Model 129
Yu Zhang

Chapter 6 Dynamics for Managing Occlusion of Buildings in Panoramic Maps 145
Neeharika Adabala

Chapter 7 Constraint-Based and Feature-Based CAD Systems and Applications 157
Ioannis Fudos and Vasiliki Stamati

Chapter 8 Computer Aided Geometric Design with Powell-Sabin Splines 177
Hendrik Speleers, Paul Dierckx and Stefan Vandewalle

Chapter 9 An Ontology of Computer-Aided Design 209
Udo Kannengiesser and John S. Gero

Index 235










PREFACE


During the last decades, computer-aided engineering (CAE) methodologies have deeply
changed the way of designing and developing products, systems and services. Thanks also to
significant hardware and software improvements, CAE techniques are widely used by
designers from the early conceptual phases up to the final stages of engineering processes. At
the industry level, these methodologies have become a fundamental tool to be competitive
and to ensure high quality standards. In industrial engineering, computer-aided methodologies
typically are instrumental for design teams in shape modeling, behavioral simulations, digital
mock-ups and realistic animations. They are able to follow the development of a product from
conception to production, also managing its life-cycle. Character animation is one of the key
research areas in computer graphics and multimedia. It has applications in many fields, ranging from entertainment and games to virtual presence. This important new book gathers the latest research from around the globe in this dynamic field.
The heritage of the preindustrial period is today coming under examination more often, as
engineering must accept the study of its evolution as a discipline, from a technical as well as a
historical perspective.
Engineering therefore provides industrial Archaeology and the history of technology with
an important element in order to complete the study of industrial heritage. These studies are
generally considered from the perspectives of history, ethnography, philology and
architecture, but do not usually include studies from an engineering perspective.
Chapter 1 provides a detailed examination of the infographic work carried out on a
Manchegan windmill (La Mancha – Quixote), as an example of preindustrial heritage, in
order to obtain a computer animation, so that the procedure followed can be extrapolated to
other examples of preindustrial heritage.
One of the reasons for choosing the windmill is that flour mills represented an important
nucleus of the economy and of the industrial and social development of society. For this
reason its study is important, especially for industrial history.
The study and analysis of these windmills is especially important owing to their general
state of abandonment and deterioration, including analysis of the techniques used in their
construction and those used in the working of the windmill. Computer animation is a key
element in the recovery of this interesting preindustrial heritage.
In addition, the chapter discusses the advantages of this technique compared with others
such as virtual reality, and why the majority of museum interpretation centres already possess
these tools.
CAD-CAE (Computer-Aided Design/Computer-Aided Engineering) techniques provide
through computer animation a fundamental tool to present an integral study from the
perspective of engineering of any example of preindustrial heritage.
The importance of this chapter resides in that it presents in an innovative and structured
way the procedure for generating a computer animation of preindustrial heritage.
In Chapter 2 the authors discuss several approaches in order to integrate computer-aided
engineering instruments into Augmented Reality environment. Engineers and designers often
develop their creative ideas in front of a computer monitor using mouse and keyboard.
Although the integration between numerical computation and graphics leads to the generation
of very realistic digital mock-ups, they are still far from the real context and the user has
limited interaction with them. The purpose is to illustrate how recent development in
computer graphics and image processing can improve the realism and interactivity with
digital mock-ups. Starting from the interactive modeling of 3d shapes, the chapter presents
some examples about the integration of real-time mechanism motion simulation, structural
and fluid dynamics analysis post-processing.
In Chapter 3, the authors review 3D dynamic mesh compression algorithms and
investigate how vertex clustering, which chiefly contributes to animation coding complexity,
affects compression performance. The authors finally conclude this chapter with observations
that need to be effectively addressed by future 3D animation coding algorithms.
In Chapter 4 the authors present a framework for generating dynamic facial expressions
synchronized with speech, rendered using a tridimensional realistic face. Dynamic facial
expressions are those temporal-based facial expressions semantically related with emotions,
speech and affective inputs that can modify a facial animation behavior.
The framework is composed of an emotion model for speech virtual actors, named VeeM (Virtual emotion-to-expression Model), which is based on a revision of Plutchik's emotion wheel model. The VeeM introduces the emotional hypercube concept in the canonical R⁴ space to combine pure emotions and create new derived emotions.
The VeeM model is implemented using the MPEG-4 face standard through an innovative tool named DynaFeX (Dynamic Facial eXpression). DynaFeX is an authoring and playback facial animation tool in which speech processing is performed to allow phoneme and viseme synchronization. The tool allows both the definition and refinement of emotions for each frame, or group of frames, as well as the editing of the facial animation using a high-level approach based on animation scripts. The tool's player controls the animation presentation, synchronizing the speech and emotional features with the virtual character's performance. Finally, DynaFeX is built over a tridimensional polygonal mesh compliant with the MPEG-4 facial animation standard, which favors the tool's interoperability with other facial animation systems.
Recent development of physics-based face modeling that emulates the anatomical
structure including skin, muscles, and skull allows us to create detailed, realistic animations.
However, synthesis of facial expressions on such complex models often involves significant
manual work due to the difficulty in determining appropriate values of the muscle actuation
parameters. Chapter 5 presents an example-based performance-driven method to
automatically estimate facial muscle actuation parameters from markerless video footage. The
authors' method is based on an efficient face tracker which uses a facial deformation subspace
model. During the training phase of the tracker a set of templates associated with the subspace
basis is computed to alleviate the online computation. At runtime, the tracking algorithm
establishes temporal correspondence of the face region in the video sequence by
simultaneously determining both motion and appearance parameters. Using a set of example
pairs that consist of the appearance and animation parameters corresponding to the key
expressions, we learn the relationship between facial appearances and animation parameters.
It enables the animation parameters to be computed in real-time from the appearance
parameters obtained by the tracker, allowing animation of the anatomical model at interactive
rates.
Panoramic maps depict urban areas in oblique view. This form of cartography was
prevalent from the late sixteenth century to the early nineteenth century, when there were not
many skyscrapers in urban areas. But oblique view maps in the current urban scenarios suffer
from loss of details due to occlusion among closely located multistory buildings. In Chapter 6
the authors leverage the time dimension to overcome the clutter in the space dimension by
introducing functional dynamics. The authors define a parameter called occlusion index for
an urban scene at a given viewpoint. Solving the problem of occlusion involves devising
methods for visualizing the urban scene that reduce/minimize the occlusion index. They
explore occlusion reduction techniques that involve selecting optimal viewpoints, displacing
buildings, making buildings transparent and changing building heights. The authors
demonstrate these approaches by presenting screen shots of the solution applied to a
prototype city block, and discuss the advantages and disadvantages of these solutions. This
work is pioneering in its approach to applying animation in cartography, which has previously
used animations only to depict time-dependent phenomena or fly-throughs.
A new generation of Computer Aided Design systems has become available in which
geometric constraints can be defined to determine properties of large designs. The new design
concept, often called constraint-based design or design by features offers users the capability
of easily defining and modifying a design, but introduces the problem of solving complicated,
not always well defined, constraint problems. Traditional parametric models can also be
enhanced to partially support declarative constraint-based descriptions. In Chapter 7 the
authors provide an overview of representation schemes for CAD applications. Then they
present a survey of methods for geometric constraint solving appropriate for Computer Aided
Design. The authors demonstrate how these representations and constraint solving methods
can be combined or adapted to support a broad range of CAD applications by presenting two
example cases of successfully using a feature-based constraint-based representation scheme to
support two different CAD applications.
Powell-Sabin splines are bivariate C1-continuous quadratic splines defined on an
arbitrary triangulation. Their construction is based on a particular split of each triangle in the
triangulation into six smaller triangles. In Chapter 8 the authors give an overview of the
properties of Powell-Sabin splines in the context of computer aided geometric design. These
splines can be represented in a compact normalized B-spline basis with an intuitive geometric
interpretation involving control triangles. Using these triangles one can interactively change
the shape of the splines in a predictable way. The authors describe the simple subdivision
rules for Powell-Sabin splines, and discuss some applications. The authors consider a new
efficient spline visualization technique based on subdivision. The authors also look at two
useful generalizations of the Powell-Sabin splines, i.e., QHPS splines and NURPS surfaces.
The QHPS splines are a hierarchical variant of Powell-Sabin splines. They have very similar
properties as the Powell-Sabin splines, and their hierarchical nature allows a local refinement
of the spline in a very straightforward way. The NURPS surface is the rational extension of
the Powell-Sabin spline. By means of weights they give extra degrees of freedom to the
designer for the modelling of surfaces.
Chapter 9 develops an ontology of computer-aided design, based on the function-
behaviour-structure (FBS) ontology. It proposes two complementary views of the process of
design. The object-centred view applies the FBS ontology to the artefact being designed.
Integrating an ontology of three “design worlds”, this view establishes a framework of
designing as a set of transformations between the function, behaviour and structure of the
design object, driven by interactions between the three design worlds. Building on this
framework, the process-centred view applies the FBS ontology to the activities defined by the
object-centred view. This increases the level of detail and provides a more well-defined set of
representations of these activities. The authors' ontological framework can be used to provide
a better understanding of the functionalities required of existing and future computer-aided
design support.


In: Computer Animation ISBN: 978-1-60741-559-6
Editors: J.S. Wright and L.M. Hughes, pp. 1-56 © 2010 Nova Science Publishers, Inc.






Chapter 1



COMPUTER ANIMATION APPLIED
TO THE RECOVERY OF PREINDUSTRIAL HERITAGE:
A NEW APPROACH


José Ignacio Rojas-Sola* and Francisco Javier Contreras-Anguita
University of Jaén, Department of Engineering Graphics, Design and Projects,
Campus de las Lagunillas, s/n, Jaén 23071, Spain
Abstract
The heritage of the preindustrial period is today coming under examination more often, as
engineering must accept the study of its evolution as a discipline, from a technical as well as a
historical perspective.
Engineering therefore provides industrial Archaeology and the history of technology with
an important element in order to complete the study of industrial heritage. These studies are
generally considered from the perspectives of history, ethnography, philology and
architecture, but do not usually include studies from an engineering perspective.
This chapter provides a detailed examination of the infographic work carried out on a
Manchegan windmill (La Mancha – Quixote), as an example of preindustrial heritage, in order
to obtain a computer animation, so that the procedure followed can be extrapolated to other
examples of preindustrial heritage.
One of the reasons for choosing the windmill is that flour mills represented an important
nucleus of the economy and of the industrial and social development of society. For this
reason its study is important, especially for industrial history.
The study and analysis of these windmills is especially important owing to their general
state of abandonment and deterioration, including analysis of the techniques used in their
construction and those used in the working of the windmill. Computer animation is a key
element in the recovery of this interesting preindustrial heritage.
In addition, the chapter discusses the advantages of this technique compared with others
such as virtual reality, and why the majority of museum interpretation centres already possess
these tools.

* E-mail address: [email protected]. Tel: +34-953-212452; Fax: +34-953-212334. Corresponding author: Professor Dr. José Ignacio Rojas-Sola, University of Jaén, Department of Engineering Graphics, Design and Projects, Campus de las Lagunillas, s/n, Jaén 23071, Spain.
CAD-CAE (Computer-Aided Design/Computer-Aided Engineering) techniques provide
through computer animation a fundamental tool to present an integral study from the
perspective of engineering of any example of preindustrial heritage.
The importance of this chapter resides in that it presents in an innovative and structured
way the procedure for generating a computer animation of preindustrial heritage.
Introduction
A Google search for the term “computer animation” returns 2,650,000 results, while a search for the term “heritage” gives 117,000,000 results. A search for both terms together gives 91,400 results (search carried out on 22nd December 2008). This shows the increasing importance of heritage in any of its facets.
This importance can also be seen in the numerous prestigious international congresses on the subject, such as “World Heritage in the Digital Age” (organized by UNESCO's World Heritage Centre) or the VAST conferences (International Symposium on Virtual Reality, Archaeology and Cultural Heritage). We must also consider the existence of high-impact journals, such as the Journal of Cultural Heritage (JCR) among others, the large number of websites dedicated to the issue [1], [2], and the European Union's 7th Framework Programme [3].
The UNESCO World Heritage [4] defines heritage as “our legacy from the past, what we
live with today, and what we pass on to future generations”.
In terms of the virtual heritage which concerns us here, researchers believe that it can
serve to encourage people to visit the actual site, and can provide a complement to such a visit
[5]; visitors can benefit from the changes and opportunities it offers [6].
Current trends in work on virtual heritage point to three different steps: complete 3D
documentation, 3D representation (from historical reconstruction to visualization) and 3D
publication (from immersive reality to augmented reality) [7]. Many applications have been
developed which deal with historical sites or buildings, and in 2000 it was already forecast
that in the following decade work would be centered on virtual industrial heritage [8].
Industrial heritage has a close relationship with Industrial Archaeology. Much has been
written on this subject, defining it variously as the discovery, analysis, record and
preservation of past industrial remains [9], the discovery, cataloguing and study of the physical remnants of the industrial past, in order to learn about significant aspects of the world of work and technical and production processes [10], or the study of material culture and aspects linked to production, distribution and consumption, in the future and in connection with the past [11].
Today there are many examples of industrial heritage that are about to disappear, and in
many cases in ruins. Many organizations are working to study and analyze these cases, some
linked to industrial archaeology, such as TICCIH (The International Committee for the
Conservation of Industrial Heritage) [12], AIA (Association for Industrial Archaeology - UK)
[13], or linked to the history of technology, such as SHOT (Society for the History Of
Technology - USA) [14], as well as branches of UNESCO which study the many aspects of
heritage; architectural, industrial, cultural, ethnographic, to name but a few.
The recovery of heritage is in many cases linked to the history of technology, as it is a
fundamental element in the study of the technological evolution of any invention.
Engineering Graphics, and more specifically infographic techniques, play an essential role in
the study of the history of technology, given the universal character of graphic language, as is
shown by the large number of articles in print which deal with graphic reconstructions of
various inventions and devices [15 - 19].
However, in many cases the efforts of conservationists, archaeologists and restorers are
not enough. In particular, the heritage provided by ruined buildings and constructions,
whether architectural or industrial, is often lost owing to the interests of urban development or
the lack of a renovation project which could give life to the area and bring opportunities for
work. This loss is more clearly shown in the case of preindustrial heritage¹, which has been
part of production processes, not only because of the wideness of its scope, but also because
older machines suffer greater deterioration when they are no longer used.
On many occasions initiatives are put in place to conserve examples of industrial
heritage, for example Museums of Science and Technology, which are becoming increasingly
frequent, as they are a way of safeguarding a form of culture linked to the socio-economic
development of a given area [20]. This example is all the more evident in the case of elements
related to proto-industrialization (windmills, watermills, fulling mills, or oil presses, among
others), as they date from the preindustrial period, and their age makes them more susceptible
to deterioration and disappearance.
The role of synthesis (computer-generated) images in the conservation of industrial heritage has grown exponentially in recent years. They allow an area, building or object to be preserved and interpreted in ways otherwise impossible to achieve using photographic techniques. The
most important factor is however that when using virtual models, it is not necessary to disturb
or modify the original item.
There are also other advantages, stemming from the computer animation itself. Firstly
there is a socio-cultural objective in the conservation of the ‘collective historical memory’ of
an area where a given type of heritage was prevalent, providing information on the evolution
of that society. Secondly, there is a clear educational objective in showing details of an
abandoned culture [21]. Thirdly, there is also technological interest, as the use of computer
animation techniques and processes provides valued know-how. A computer-generated image
should be as faithful as a figure in a journal, although this is rarely possible [22].
In sum, whenever an element of a society’s heritage is lost, it also becomes impossible to
study, analyze and value its impact on that society.
This chapter presents a new approach to the use of computer animation techniques
applied to an element of preindustrial heritage, the Manchegan windmill (La Mancha, Spain),
which were built in the 16
th
century, and some of which remain in near-perfect condition
today. The specific windmill under study is the ‘Sardinero’, one of the 10 which still stand in
the area of Campo de Criptana (Ciudad Real, Spain); it has also been declared of special
cultural interest by the Spanish Government. These famous windmills appear in the
masterpiece of Spanish literature, Don Quixote, by Miguel de Cervantes.
The Windmill
The windmill is one of the devices most widely used over the centuries to obtain flour, an essential part of the human diet. A detailed study of its working mechanisms [23-25] and its design codes allows us to develop a computer model and completely conserve this example of heritage, leaving a legacy which can be studied by future generations.

¹ The following is applicable to both preindustrial and industrial heritage.
Architecture
Although there are various different types of windmills both in Spain and in other countries,
the original architecture of a Manchegan windmill has three different floors. The ground floor
(cuadra) is where the cereal was received and the canvas sails were stored. The first floor
(camareta) is where the flour was packed into sacks, and the second floor (moledero) housed
all the machinery necessary for milling the cereal [26].
The windmill had a cylindrical masonry tower about 8 m in height, capped with a conical
cover (windmill cap) made of zinc, about 3.5 m in height. This rested on a ring on the top of
the tower, which allowed this part of the windmill to turn to face the prevailing wind.
Working
The way a Manchegan windmill worked can be explained using the following photographs,
which were taken by the author.
Figure 1 shows an exterior view of the ‘Sardinero’ windmill in Campo de Criptana. The
photograph shows different functional elements of the windmill, such as the sails (which
would be covered with canvas to increase their surface area), the windshaft and the windmill
cap.


Figure 1. View of the windmill, with the sails, windmill cap, windshaft and upper windows labelled.

Figure 2. Close-up view of the sails.

Figure 3. Close-up view of the join between the sails and the windshaft.
Each windmill has two rotation systems: a horizontal system, formed by the windshaft,
the sails and other elements which will be described below; and a vertical system formed by
the windmill cap and the tailpole. This vertical rotation system allowed the windmill cap to
turn so that the sails faced the prevailing wind. This was done by the miller, who would use
the 12 small upper windows around the windmill to determine which way the wind was
blowing.
Figure 2 shows the geometry of the sails, which have become deformed over time. They measure some 16 m from tip to tip and are formed by 2 central stocks; each sail had its central stock, four lengthways struts and 19 crossways struts, giving rigidity to the sail.
Figure 3 shows the detail of the join between the sails and the windshaft, and the three
struts which provide rigidity to the sails.


Figure 4. Front view of the entrance, with the first-floor window, entrance, tailpole, tripod and stone markers labelled.

Figure 5. Space for storing sail canvasses.
Figure 4 is a view of the front of the windmill, showing the only entrance to the ground
floor (cuadra), the window of the first floor (camareta) which provided the only source of
natural light, the tailpole, which allowed the windmill cap to turn into the wind, the tripod or
support for the tailpole, and the 12 stone markers which marked the 12 possible positions of
the tailpole.


Figure 6. Entrance to the ground floor, with the counterweight which was used to separate the milling
stones.

Figure 7. Spiral staircase leading to the first floor.
In the entrance to the ground floor there was an area where the sail canvasses were stored
(Figure 5), and where the counterweight hung (Figure 6). This counterweight could be pulled
down manually in order to separate the two milling stones.


Figure 8. Beams (marranos) supporting the milling floor.

Figure 9. View of the flour channel where the flour was put into sacks.
The spiral staircase (Figure 7) which leads up to the first floor runs alongside the two
huge beams (marranos), which supported the second floor (milling floor) (Figure 8). On this
floor there was a flour channel (Figure 9) through which the milled flour passed directly to be
put into sacks.
The mechanism which separated the milling stones (relief mechanism) (Figure 10) was
located on the milling floor, and was activated by the counterweight shown in Figure 6.
Figure 11 shows the runner and the bedstone, between which the cereal was milled.
These stones were normally grooved to aid the milling process. The photograph also shows
the outlet for flour which led to the flour channel shown in Figure 9.


Figure 10. Stairway to the milling floor, with the mechanism for separating the milling stones labelled.

Figure 11. View of the two milling stones (bedstone and runner) and the outlet for flour leading to the flour channel.
Figure 12 shows the gearing between the wallower and the brake wheel (fixed to the
windshaft), which were the gear wheels which transmitted movement to the milling
mechanism. The brake wheel had 40 cogs, and the wallower had 8 segments in which the
cogs fitted, giving a gear ratio of 1:5 (8/40). Therefore, when the sails turned (normally at
around 9 rpm), the brake wheel transmitted movement to the wallower, which in turn drove
the rotation of the runner stone via the iron wallower axle.
A further important element in the working mechanism was the brake rim, which was
activated using a set of struts and a rope.
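As a quick numerical check of the transmission just described, the short Python sketch below (purely illustrative; it only restates the figures quoted in the text) computes the speed of the runner stone implied by the 40-cog brake wheel, the 8-segment wallower and a sail speed of about 9 rpm.

    # Illustrative check of the gearing described in the text (not part of the original study).
    def runner_stone_rpm(sail_rpm, brake_wheel_cogs, wallower_segments):
        """Runner stone speed, assuming it turns with the wallower via the iron axle."""
        gear_ratio = brake_wheel_cogs / wallower_segments   # 40 / 8 = 5
        return sail_rpm * gear_ratio

    print(runner_stone_rpm(9, 40, 8))   # sails at ~9 rpm -> runner stone at ~45 rpm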
Figure 13 shows the joint between the tailpole and the windmill cap, which were joined
by a wooden block called the fraile. This linked the tailpole to the roof structure of the
windmill, which was strengthened by wooden ribs.


Figure 12. Detail of the gearing between the wallower and the brake wheel, with the iron wallower axle, brake rim and cogs labelled.

Figure 13. Detail of the joint between the tailpole and the windmill cap at the fraile.
Figure 14 shows the hopper where the cereal was housed and the channel which fed the
cereal into the central hole in the runner stone.
Lastly, Figure 15 shows the roof structure (formed by perpendicular beams called madres
and manzanos), which was the wooden structure on which the ribs of the roof section of the
windmill rested, as well as the tailpole, which allowed the windmill cap to turn. It turned on a
ring (rueda terrera), which was greased in order to avoid excessive friction. The photograph
also shows the windshaft and one of the two stones on which it rested, the forestone
(fuélliga), and the tailstone (rabote).



Figure 14. Hopper and channel feeding the hole in the runner stone.


Figure 15. View of the roof structure, windshaft and tailstone.
Methodology
Computer animation forms part of an innovative methodology for the conservation, diffusion
and updating of industrial heritage [27].
In order to systematize the process of acquisition, classification, treatment and distribution
of the available sources of information dealing with the object of study (plans, photographs,
documents, slides, analogue recordings, hard-copy texts) and thereby both improve the
conservation of this material and help to generate formats with higher added value, a working
methodology was developed, as shown in Figure 16.


Figure 16. Diagram of the methodology proposed for the recovery and updating of industrial heritage: Updating, Execution and Verification stages leading to the target material, with corrective actions fed back.
The Updating stage is divided into three sections:

• Location, classification, nomenclature and storage.
• Digitalization of the original material.
• Classification in digital repositories.

The Execution stage has four steps:

• Identification of the technical requirements.
• Definition of functional features.
• Workflows between applications.
• Analysis of the creation and publication processes.

The Verification stage has two parts:

• Development of tests.
• Analysis and verification.

This methodology provides an organized sequence of procedures, structured in three
stages. A computer animation forms part of the first stage of the procedure (digitalization of the original material), as it consists of creating a sequence of frames which, when played back at an adequate speed, forms a video animation.
There are many programs for modeling, synthesizing images and computer animation,
and a comparative study of them [28] shows the most well known characteristics of each.
Two of the most outstanding are Autodesk 3ds Max™ and Autodesk Maya™. Although
either of the two could have been chosen for this study, Autodesk 3ds Max was chosen owing
to the need to create particles (grains of wheat and flour).
One of the critical phases of this process was the generation of digital models of the
object, in order to create realistic images from the original sources. The applications and
processes used in this task are described in the following sections.
The work process followed these steps:

1. General outline of the virtual recreation of the ‘Sardinero’ windmill
2. Creation of CAD model with AutoCAD and import to Autodesk 3ds Max
2.1. Fieldwork
2.2. Modeling
2.2.1. From AutoCAD, by exporting .3ds files
2.2.2. From Autodesk 3ds Max, by importing .dwg files
3. Cameras and illumination
3.1. Camera movement. Creation of path
3.2. Illumination
4. Animation of working parts
4.1. Runner Stone raising mechanism
4.2. Brake rim mechanism
5. Materials and maps. Mapping coordinates
6. Creation of textures
7. Rendering and video creation
8. Postproduction
Development
1. General Outline of the Virtual Recreation of the ‘Sardinero’ Windmill
It is advisable, when working with industrial heritage, to produce two sequences or videos: a virtual ‘static’ view, which shows the object and its surroundings, and a second, ‘dynamic’ view, showing the working of the object, following the logical order of the productive process. This is how the work has been carried out in the case of the ‘Sardinero’ windmill studied here, establishing a PAL playback speed of 25 frames per second.
Given the nature of this work we decided to create a single file in Autodesk 3ds Max
which included both sequences (static and dynamic) so as not to have to make adjustments in
texture and illumination in various files.
Lastly, we chose the .avi file format, defined by Microsoft as part of its Video for Windows technology, as it is compatible with most video players. The file was created from the frames rendered individually in .png format.
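As an illustration of this final step only (this is not the authors' actual pipeline, and the frame file names are hypothetical), a folder of individually rendered .png frames can be assembled into a 25 fps .avi file with a few lines of Python and OpenCV:

    # Minimal sketch: assemble rendered PNG frames into a 25 fps AVI (hypothetical file names).
    import glob
    import cv2

    frames = sorted(glob.glob("frames/frame_*.png"))
    height, width = cv2.imread(frames[0]).shape[:2]

    fourcc = cv2.VideoWriter_fourcc(*"MJPG")                      # a codec commonly used in .avi
    writer = cv2.VideoWriter("windmill.avi", fourcc, 25.0, (width, height))
    for path in frames:
        writer.write(cv2.imread(path))
    writer.release()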
2. Creation of CAD Model with AutoCAD and Import to Autodesk 3ds Max
Although there are many existing procedures to digitalize industrial heritage objects in 3D
[29], such as Empirical techniques, Topographic techniques, Laser scanning techniques or
Photogrammetry, we have used empirical techniques, owing to their ease of use, their
transferability and to the fact that precision measurement was not a determining factor. In
addition, the geometry is relatively simple, with a cylindrical tower which could easily be
modeled using CAD techniques, and from the perspective of engineering graphics this
technique allows us to obtain all types of views, perspectives and sections of the windmill.
This in turn allows us to make comparisons with other forms.
Two examples of the plans obtained are shown in Figures 17 and 18.


Figure 17. Section Perspective of the windmill modeled in 3D.

Figure 18. Exploded view of the horizontal rotation system of the windmill.
The development of the empirical approach applied to the ‘Sardinero’ windmill is based on two fundamental preliminary stages: fieldwork and graphic reconstruction.
2.1. Fieldwork
The fieldwork necessary for the project includes both taking photographs and drawing
sketches of the building and its mechanisms. The quality of the computer animation depends
on that of the photographs, as the texture captured from them is applied to the model in order
to provide a high degree of realism in the final video.


Figure 19. Transition from sketch to 3D CAD model.
We used a Nikon D-200 digital camera to take the photographs, with an ISO setting of
800. This allowed us to obtain clear images, and we took around 500 photographs of the
exterior and of the three floors of the windmill. The windmill was measured using a 10 m tape
measure in order to draw the sketches. The inside areas and mechanisms of the windmill were
measured using an engineer’s scale. As the windmill is not a precision-built construction,
various references were taken and measurements had to be adjusted.
The precision of the final model depends on the accuracy of the sketches and
measurements taken. It is also necessary to bear in mind that the geometrical data obtained
also allow us to infer certain technological considerations, and to make possible comparisons
with other types of windmills.
2.2. Modeling
After sketching, the next step is modeling, which is necessary in the graphic reconstruction to
plan the virtual tour of the windmill. From the perspective of engineering, modeling is a
powerful tool which allows for an accurate study of each part of the machinery, as well as
giving an overall idea of how the different parts of the mechanism worked together.
Some of the measurements taken ‘in situ’ were not completely accurate, and so further
measurements had to be taken in order to shed light on certain assembly details which were
not totally clear.
The program used for modeling was AutoCAD. Obtaining a model which is as faithful as possible to the original is a complex task, as there are often limiting factors, such as elements which cannot be measured by hand, so that other techniques have to be used. In this way the CAD model is obtained from the hand-drawn sketches (Figure 19).
Given that AutoCAD and Autodesk 3ds Max were developed by the same company, it is easy to exchange information between them; for example, a point with coordinates x, y, z in AutoCAD corresponds exactly to another with coordinates u, v, w in Autodesk 3ds Max. This is an added advantage, because although there are neutral exchange formats such as IGES, STEP or VDA-FS, these files sometimes produce a loss of information.
This exchange of information can be made in two ways:

2.2.1. From AutoCAD, by Exporting .3ds Files
Using this method it is possible to export only the parts which are necessary, avoiding the need to remove the unneeded elements later, but they must be solids, surfaces, lines, 3D polylines or 3D faces, among others. However, there are some disadvantages:
Sometimes, when working with complex geometry, AutoCAD cannot generate the .3ds file.
In order to avoid curved surfaces taking on a faceted, multi-sided appearance in Autodesk 3ds Max, it is necessary to increase an AutoCAD system variable (FACETRES) from its default value of 0.5 to a value of 10, which causes a notable slowing of the program.
For these reasons, we chose the second option:

2.2.2. From Autodesk 3ds Max, by Importing .dwg Files
The model created in AutoCAD can be imported directly with the extension .dwg, although it
is necessary beforehand to configure the .max receiver file in Autodesk 3ds Max with a series
of options such as the measurement units, considerations about AutoCAD primitives,
geometry, layering and rendering options of the splines.
Once the model has been imported, the screen is divided into four windows, called
graphic windows (Figure 20), in order to create the sequences from different angles and
perspectives. The active window is marked with a thick grey line, and all options can be
accessed from a contextual menu using the right mouse button.


Figure 20. Working Screen in Autodesk 3ds Max.
3. Cameras and Illumination
Cameras can be used to obtain personalized views of a scene much in the same way as with
real cameras. Here, they need lens adjustments which are measured in millimeters.
Autodesk 3ds Max has two types of cameras: Target and Free. The first is centered on the given object and the area around it, giving an independent animation of the object, while the free type simply records a scene in the direction in which it is pointing, without being linked to a specific object. In the case of the ‘Sardinero’ windmill, we used target cameras to obtain general plans of the exterior views, and of the first and second floors. Free cameras were used to focus on specific elements or movements, for example the counterweight relief mechanism used to lift the runner stone.
Although Autodesk 3ds Max provides a wide variety of lenses, from 35 mm to 200 mm, in the windmill cameras with a focal length of 24.29 mm were used, which is the default setting, in order to obtain wide-angle views of the scene. As well as the lens, it is necessary to adjust the field of view (FOV), which is measured in degrees (Figure 21). This is linked directly to the focal length and measures the visible part of the scene. In the case of the default focal length, the program adjusts the value directly to 45º.
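The link between focal length and FOV mentioned here is the usual pinhole-camera relation. The sketch below assumes a 36 mm film-back (aperture) width, which is an illustrative value only and not necessarily the one used by the software:

    # Standard focal-length / field-of-view relation (pinhole model); 36 mm aperture is assumed.
    import math

    def fov_degrees(focal_length_mm, aperture_width_mm=36.0):
        """Horizontal field of view, in degrees, for a given focal length."""
        return math.degrees(2.0 * math.atan(aperture_width_mm / (2.0 * focal_length_mm)))

    print(fov_degrees(35.0))    # ~54.4 degrees (wide angle)
    print(fov_degrees(200.0))   # ~10.3 degrees (telephoto)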


Figure 21. Camera with adjusted FOV.
3.1. Camera Movement. Creation of Path
Although a free camera is usually the better option if it is in movement and a target camera is more useful in situations where the camera does not move, we chose to use a target camera to produce the static video of the windmill (that is, a virtual visit where the windmill is not working), animating both the ‘body’ of the camera and its target.
It is very important to maintain a constant and appropriate speed during the path of the
camera, and therefore movement constraints were used, which link objects to others or to the
path of the camera. Autodesk 3ds Max offers different constraints, such as:

• Attachment constraint
• Surface constraint
• Path constraint
• Position constraint
• Link constraint
• LookAt constraint
• Orientation constraint


Figure 22. Path of the camera following a spline curve.

Figure 23. Dummy linked to the camera lens and moved through the scene.
In this case, the path constraint has been used, so that the camera follows a spline curve
previously created in Autodesk 3ds Max (Figure 22). Using the appropriate commands, the
following path was created, which is the path of the camera outside and inside the ‘Sardinero’
windmill.
Once the path has been generated, this has to be assigned to the ‘body’ of the camera.
Although this can be done directly, in this case it was done indirectly, by assigning the path to
a false object (Dummy) and then linking the camera to this object. This allows us to create a camera travelling shot while the camera uniformly follows its path; this is very useful for positioning objects and measuring dimensions.
A simple cube was used as the Dummy, with a pivot point in the centre, which was not
rendered and which had no parameters. The link was then made between objects lower and
higher in the kinematic chain.
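As a generic, software-independent illustration of how a constant speed along a path can be achieved, the sketch below samples a cubic Bézier curve and re-spaces the samples by arc length, so that the camera position advances an equal distance on every frame; the control points and frame count are hypothetical.

    # Sketch: camera positions at constant speed along a cubic Bezier path (hypothetical data).
    import numpy as np

    P = np.array([[0.0, 0.0, 2.0], [10.0, 0.0, 2.0],
                  [10.0, 10.0, 5.0], [0.0, 10.0, 5.0]])   # hypothetical control points

    def bezier(t):
        """Point on the cubic Bezier curve for parameter t in [0, 1]."""
        return ((1 - t) ** 3 * P[0] + 3 * (1 - t) ** 2 * t * P[1]
                + 3 * (1 - t) * t ** 2 * P[2] + t ** 3 * P[3])

    # Dense sampling of the curve and cumulative arc-length table.
    ts = np.linspace(0.0, 1.0, 1000)
    pts = np.array([bezier(t) for t in ts])
    arc = np.concatenate([[0.0], np.cumsum(np.linalg.norm(np.diff(pts, axis=0), axis=1))])

    # One camera position per frame, equally spaced in travelled distance (constant speed).
    n_frames = 250                                          # 10 seconds at 25 fps
    targets = np.linspace(0.0, arc[-1], n_frames)
    camera_positions = pts[np.searchsorted(arc, targets).clip(0, len(pts) - 1)]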
Finally, the camera lens was animated independently of the ‘body’ of the camera using key
frames. A helper was also linked, and moved through the scene by movement
transformations, which does not imply any changes in the geometry of the object, but rather a
modification of its initial state.
The following figure shows the situation of the camera which travels the path through the
windmill.
3.2. Illumination
Illumination is the most intricate and complex part of the creation of any scene, as it forms the
basis of the work carried out with textures and materials, and also determines to a large extent
the rendering.
In cases of complex examples of industrial heritage such as a windmill, simplicity should
be a key factor, in order to find an optimum balance between rendering time and the quality
of the result. The configuration of the materials in the scene will also be a conditioning factor.
In this example, we have used different types of lights from Autodesk 3ds Max, including
the Daylight system to simulate natural sunlight (Figure 24) with its various options, in which
the software simulates the position of the sun at a specific time and date, and from a specific
direction.
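For readers curious about what such a daylight system computes, the following is a rough solar-position sketch using the standard declination and hour-angle approximation; it is only indicative and is not the algorithm implemented in the software. The latitude in the example is roughly that of Campo de Criptana.

    # Rough solar elevation from day of year, local solar time and latitude (simplified formulas).
    import math

    def solar_elevation_deg(day_of_year, solar_hour, latitude_deg):
        decl = 23.44 * math.sin(math.radians(360.0 / 365.0 * (284 + day_of_year)))
        hour_angle = 15.0 * (solar_hour - 12.0)              # degrees away from solar noon
        lat, d, h = map(math.radians, (latitude_deg, decl, hour_angle))
        return math.degrees(math.asin(math.sin(lat) * math.sin(d)
                                      + math.cos(lat) * math.cos(d) * math.cos(h)))

    print(solar_elevation_deg(172, 12.0, 39.4))   # midsummer noon at ~39.4 N: about 74 degrees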
It is also necessary to activate shadows by selecting the ray-traced type (Figure 25), which is very accurate, as Autodesk 3ds Max calculates the shadows according to each ray of light which enters the scene. In addition, we activated the option which determines the
transition between bright areas and areas without illumination, and the exponential
attenuation of light with distance.
The other values are default values, and the color and intensity of the light are according
to the geographical location selected earlier.
It is also necessary to include fill lights, which do not generate shadows, in areas where the principal light does not provide illumination. A fill light projects light from a defined
area rather than from a single point, and with a lower intensity than that of the principal
light. In the windmill these lights have been placed in each of the small upper windows
(Figure 26).


Figure 24. Simulation of sunlight.

Figure 25. Selection of Ray Traced Shadows.

Figure 26. Fill light situated in the upper windows.
4. Animation of Working Parts
The following working subgroups were studied and animated:

1. Runner Stone raising mechanism. Inverse kinematics was used, which allows the
designation of the movements of objects higher in the kinematics chain through the
movement of the objects lower in the kinematics chain.
2. Brake rim mechanism. Here, inverse kinematics was also used, as well as Free Form
Deformation (FFD), which allows elastic deformations of objects.
3. Creation and animation of ropes. Here the Reactor module has been used, and so
once the approximate forms have been created, gravity is applied, giving a realistic
curve.
4. Brake wheel–wallower. In this case we have used forward kinematics, that is, determining the movements of the objects lower in the kinematics chain by acting on the objects higher in the kinematics chain.
5. Obtaining flour from grains of wheat. This operation was carried out in two phases: in the first, a Reactor module was used to achieve the effect of the grains of wheat, stored in the hopper, falling into the channel through an opening and from there falling onto the milling stones. In the second phase, particle systems were introduced; these are elements which generate groups of objects called particles, which behave as a single unit and allow the creation of real-time simulations of natural phenomena such as rain, dust and snow, among others (a minimal generic sketch of the idea appears after this list).
6. Movement of the sail canvases. Here an independent simulation system called Cloth
is used, which allows the creation and animation of deformable material. Autodesk
3ds Max also has a specific modifier called Garment Maker, which transforms
geometric primitives into material patterns.
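To make the particle-system idea mentioned in point 5 more concrete, here is a minimal generic sketch (plain Euler integration under gravity); it is unrelated to the actual 3ds Max particle systems and all values are hypothetical.

    # Generic particle-system sketch: 'grains' emitted from a small opening fall under gravity.
    import random

    GRAVITY = -9.81          # m/s^2 on the vertical (z) axis
    DT = 1.0 / 25.0          # time step of one PAL frame

    class Particle:
        def __init__(self, x, y, z):
            self.pos = [x, y, z]
            self.vel = [random.uniform(-0.1, 0.1), random.uniform(-0.1, 0.1), 0.0]

        def step(self):
            self.vel[2] += GRAVITY * DT                      # gravity accelerates the particle
            self.pos = [p + v * DT for p, v in zip(self.pos, self.vel)]

    particles = []
    for frame in range(100):
        # Emit five new particles per frame from an opening 2 m above the stones.
        particles += [Particle(random.uniform(-0.05, 0.05),
                               random.uniform(-0.05, 0.05), 2.0) for _ in range(5)]
        for p in particles:
            p.step()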

As it would take up too much space to give a detailed explanation of each of these
subgroups, we have chosen two as examples: the runner stone raising mechanism and the
brake rim mechanism.
4.1. Runner Stone Raising Mechanism
In order to animate this mechanism which separates the milling stones, we used inverse
kinematics, which allows us to determine the movements of objects higher in the kinematics
chain by controlling the objects lower in the kinematics chain. This is more effective than
forward kinematics.
Autodesk 3ds Max includes various methods to animate using inverse kinematics, such as
IK Solvers, and the traditional methods Interactive IK and Applied IK.
IK solvers are helpers which apply inverse kinematics to systems of linked objects. For
example, there is a History-Dependent solver which is recommended in mechanical systems
with sliding joints in inverse kinematics, as it has controls for damping, priority, and spring
back.
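As a generic illustration of the inverse-kinematics idea (move one control element and let a solver work out the rest of the chain), the sketch below solves a two-link planar chain with the classical cyclic coordinate descent (CCD) method. It is purely illustrative and is not the solver built into Autodesk 3ds Max; link lengths and the target are hypothetical.

    # Two-link planar inverse kinematics via cyclic coordinate descent (illustrative only).
    import math

    L1, L2 = 1.0, 1.0          # hypothetical link lengths
    angles = [0.3, 0.3]        # joint angles in radians

    def joint_positions(angles):
        """Base, elbow and end-effector positions for the current joint angles."""
        a1, a2 = angles
        elbow = (L1 * math.cos(a1), L1 * math.sin(a1))
        end = (elbow[0] + L2 * math.cos(a1 + a2), elbow[1] + L2 * math.sin(a1 + a2))
        return (0.0, 0.0), elbow, end

    def ccd_pass(angles, target):
        """One CCD pass: rotate each joint so the end effector points towards the target."""
        for i in reversed(range(2)):
            base = joint_positions(angles)[i]
            end = joint_positions(angles)[2]
            to_end = math.atan2(end[1] - base[1], end[0] - base[0])
            to_target = math.atan2(target[1] - base[1], target[0] - base[0])
            angles[i] += to_target - to_end
        return angles

    target = (1.2, 0.8)
    for _ in range(20):
        angles = ccd_pass(angles, target)
    print(joint_positions(angles)[2])   # end effector is now very close to the target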


Figure 27. Elements in the scene.

Figure 28. Control Element and actions on the other elements.
Interactive IK allows the positioning of a hierarchy linked to objects in different frames,
and Autodesk 3ds Max interpolates all the key frames. This is not an accurate method,
although it uses a minimum number of keys. Finally, Applied IK is a method which applies a solution in a range of frames, calculating the keys in each frame; it is more accurate than Interactive IK, although it creates a large number of key frames.
In this case we used inverse kinematics applying both methods. Once the necessary links
between the objects had been established using Interactive IK, their behavior was observed,
and the animation was carried out using Applied IK.
To animate the movement of the mechanism which separates the milling stones (the raising mechanism of the runner stone), the elements first have to be renamed: when the AutoCAD model is imported into Autodesk 3ds Max, a predetermined name is given to every element in the scene, and these names are not clear when there are many elements. Therefore, it is necessary to re-designate all the elements (Figure 27).
Then, the control element for inverse kinematics is established. This is the element which
will be animated manually, to be used as the basis for the animation (Figure 28).
The next step is to determine the links between the control element and the other
elements. This is the most difficult and time-consuming step, as it is necessary to use helpers
and to relocate the pivot points of some objects. We added 6 helpers, to allow for the
interconnection between all the elements and the combination of movements of some of them,
for example the runner stone, which must turn and rise at the same time.

Figure 29. Positions of helpers.
Figure 29 shows the functions of the helpers in the mechanism:

Helper 01: Linked to the runner stone (including rings), the wallower (including rings
and cogs) and to the iron wallower axle, which is the object higher in the kinematics
chain. It allows the transmission of circular movement to these elements.
Helper 02: Linked in the same way as helper 01, and transmits the raising and lowering movement through the hierarchy to this set of elements.
Helper 03: Linked to the lever-beam joint, acting as a link between this and the other
elements.
Helper 04: Linked to the exterior raising beam, allowing for two pivot points on this
element.
Helper 05: Linked to the exterior raising beam, at the point where this joins the interior
raising beam. Its function is to connect these two beams.
Helper 06: This helper is lower on the hierarchy than the interior raising beam and its
function is to create two pivot points and also to connect the interior and exterior
raising beams.

In addition, it is necessary to link two elements which are already joined, helper 04 and
helper 03. In the same way, helper 06 is linked to helper 05, and helper 02 to helper 05.
It is then necessary to define the constraints of the joints of each element, as each has six degrees of freedom: rotation about and movement along the X, Y and Z axes. The Rotational
Joints and Sliding Joints options were used (Figure 30).
Each joint has three sections referring to each of the three axes, and if the Active option is
deselected, that axis is constrained; that is, if the active option of the x axis of the interior
raising beam is deselected in the rotational joints window, the element cannot turn on this
axis. In the same way, if the same option is deselected in the sliding joints window, the joint
cannot slide along this axis. This is how the joints are defined for the rest of the elements of
the mechanism.
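The per-axis ‘Active’ switches described above can be pictured as a small data structure. The sketch below is a generic abstraction for illustration only and is not the 3ds Max joint data model:

    # Generic abstraction of per-axis rotational and sliding joint activation (illustrative only).
    from dataclasses import dataclass, field

    def all_axes_active():
        return {"x": True, "y": True, "z": True}

    @dataclass
    class Joint:
        rotational_active: dict = field(default_factory=all_axes_active)
        sliding_active: dict = field(default_factory=all_axes_active)

        def constrain_rotation(self, axis):
            self.rotational_active[axis] = False    # deselecting "Active" locks this axis

        def constrain_sliding(self, axis):
            self.sliding_active[axis] = False

    # Example: the interior raising beam may neither rotate about nor slide along its x axis.
    interior_raising_beam = Joint()
    interior_raising_beam.constrain_rotation("x")
    interior_raising_beam.constrain_sliding("x")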


Figure 30. Rotational & Sliding Joints.
The rings of the runner stone, the wallower, and the screw and bolts which form part of
the mechanism do not have defined joints, as they are linked solidly to elements which
already have these joints defined.
Once the links and joints (rotational and sliding) have been established, Interactive IK is used to check that the elements of the mechanism move correctly. The Select and Rotate button applies another of the three transformations available in Autodesk 3ds Max, which does not imply any change in the geometry of the object, but rather a modification of its initial state (Figure 31).
It can be seen that from a certain angle of the control element, the joints do not function
as in real life; specifically, from 30º, the elements begin to intersect with one another. To
solve this problem, other animation tools can be used, such as the Reactor plug-in, which allows
the creation of key frames when objects interact according to the laws of physics.


Figure 31. Inverse Kinematics and select and rotate buttons.
However, it would be time-consuming to configure the scene using the Reactor module,
and given that the runner stone has a movement of approximately 1 cm, for which the angle at
which the control element turned was not more than 6º, this simulation is unnecessary.
Therefore it is only necessary to apply the inverse kinematic solution using applied IK, which
can be applied to any range of frames. The required animation is therefore obtained, as the
program calculates the key frames for the other elements according to the control element and
the links established.
4.2. Brake Rim Mechanism
Inverse Kinematics has also been used to simulate the mechanism which brakes the brake
wheel. In addition, a Free Form Deformation (FFD) modifier has been used to achieve the
elastic deformation of the brake rim (Figure 32).
In this case the elements which are present in the animation are the linking beam,
counterweight beam, hook joint with windmill cap, counterweight-linking beam joint, linking
beam-counterweight joint, pin flange, pin and bolt (Figure 33). The ring and the rim itself will
be animated once the movement of the other parts has been determined.
The control element is the linking beam, which is also the real-life control element
(Figure 34).
The links are then made between the control element and the other elements. A helper has
been added, not to link objects, but to make possible the presence of two pivot points on the
counterweight beam. This beam is then established as higher in the kinematics chain than the
pin, pin flange, bolt, hook joint with windmill cap, counterweight-linking beam joint, as well
as the helper, and lastly, the control element is designated as higher in the kinematics chain
than the linking beam-counterweight joint.
The counterweight-linking beam joint is linked to the linking beam-counterweight joint;
specifically, the counterweight-linking beam joint is made to follow the linking beam-
counterweight joint, and finally, the constraints of the joints are defined in these elements in
the same way as before, using the Rotational Joints and Sliding Joints options.


Figure 32. Brake rim mechanism.

Figure 33. Elements of the brake rim mechanism.
As before, correct movement is checked using interactive inverse kinematics, turning the
control element with respect to its y axis, and observing the behavior of the other elements.
Applied IK is then used as it can be applied to a given range of frames. Once the animation of
the beams has been completed, the brake rim and its metal ring are animated. For this, the
FFD modifier is used, as it can model rounded deformations without arrises, adjusting the
control points of a lattice.


Figure 34. Control element and actions on the other elements.

Figure 35. FFD modifier button.

Figure 36. Cylindrical Geometry of the FFD modifier, and surface adjustment button.

Figure 37. Modifier adjusted to fit the geometry of the rim.
The selection which is closest to the geometry of this example is the cylinder type, FFD (cyl).
The geometry of the modifier is then situated in the desired location (Figure 36).

Figure 38. Control points selected two by two.

Figure 39. Effect of the modifier on the geometry after moving the control points.
The lattice is then selected and its resolution and size are configured, so that it coincides as
closely as possible with the brake rim and the ring. The higher the resolution, the better the
results in the elastic deformation of the object, but more time is required. In this case, the
resolution of the lattice was set at 42 control points, giving a perfect fit with the geometry of
the rim and also allowing us to animate it quickly.
Figure 37 shows the geometry of the FFD modifier (cyl) after making these adjustments.
The brake rim and its ring are then linked to the animation, similarly to the way objects
are linked using the command Select and Link. Finally, the lattice control points are animated
using key frames, defining an initial and final state using movement transformation.
Therefore, the control points should be selected two by two (Figure 38) in order to achieve a
good result, and to ensure that the elastic deformation of the rim coincides with the movement
of the beams.
Figure 39 shows the effect of the FFD modifier on the geometry of the brake rim after
moving the control points vertically upwards.
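For reference, free-form deformation has a compact mathematical description: in the classical formulation by Sederberg and Parry, a point with local lattice coordinates (s, t, u) is mapped to a weighted average of the lattice control points using Bernstein polynomials; the cylindrical FFD used here behaves analogously on a cylindrical lattice. This is the textbook form of the technique, given for orientation rather than as the exact algorithm implemented in Autodesk 3ds Max:

$$X_{ffd}(s,t,u) = \sum_{i=0}^{l}\sum_{j=0}^{m}\sum_{k=0}^{n} B_{i}^{l}(s)\,B_{j}^{m}(t)\,B_{k}^{n}(u)\,P_{ijk}$$

where $P_{ijk}$ are the positions of the lattice control points and $B_{i}^{l}$ are Bernstein basis polynomials. Moving a control point therefore produces a smooth, local change in the embedded geometry, which is exactly the behavior exploited here to bend the brake rim together with the beams.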
5. Materials and Maps. Mapping Coordinates
Autodesk 3ds Max uses materials to cover objects in order to imitate the effect of light on
them. Maps or textures are elements which are applied to materials in order to achieve a
realistic appearance, using mapping coordinates, which define how the maps are
aligned on the objects by means of the three-dimensional coordinates u, v and w.
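As a simple illustration of what mapping coordinates represent, a planar projection (one of the projections offered by the UVW Map gizmo described below) amounts to measuring each vertex against the gizmo axes and normalizing by the gizmo size. The following C++ sketch is a hypothetical illustration, not the 3ds Max implementation:

#include <cstdio>

struct Vec3 { double x, y, z; };

// Hypothetical planar projection: the gizmo is a rectangle of size
// (width, length) centered at 'center'; u and v are the normalized coordinates
// of the vertex projected onto the gizmo plane (w would be the depth).
void planarUV(const Vec3& vertex, const Vec3& center,
              double width, double length, double& u, double& v)
{
    u = (vertex.x - center.x) / width + 0.5;   // 0..1 across the gizmo width
    v = (vertex.y - center.y) / length + 0.5;  // 0..1 across the gizmo length
}

int main()
{
    double u = 0.0, v = 0.0;
    planarUV({0.2, -0.1, 0.0}, {0.0, 0.0, 0.0}, 1.0, 1.0, u, v);
    std::printf("u=%.2f v=%.2f\n", u, v);      // texture lookup coordinates
    return 0;
}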


Figure 40. Material selection window in Autodesk 3ds Max.
Autodesk 3ds Max has various types of materials, and the choice depends to a great
extent on the rendering engine used and the type of illumination, among other factors. In the
case of our windmill we have used standard materials, found in the Material Editor (Figure 40),
to give realism to the animation. Below is a description of the process used for the
‘dustcover’, a piece of wood which covered the milling stones.
In many cases it is necessary to create a new material because it does not exist in the
library contained in the software. In our case, as the element which is to be textured is formed
by a series of wooden staves, a material was created for each stave with a similar texture, so
that the final appearance of the surface of the element does not have repeated patterns.
It is also necessary to define the shader to be used, as this is the algorithm which
calculates the appearance of the material according to the specified parameters. In this case
the Blinn shader has been used (the default setting), as it renders simple circular highlights
and softens adjacent surfaces.
This shader has color panels to configure the Ambient, Diffuse and Specular colors,
which determine the final color appearance of an object. However, in the texturizing process
we used maps and textures taken from images of the original model, applied through the
Diffuse component.
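For orientation, the Blinn (Blinn-Phong) model combines these three contributions roughly as follows, the specular term using the half-vector H between the light direction L and the view direction V; this is the textbook form of the model, not necessarily the exact expression used internally by 3ds Max:

$$I = k_a I_a + k_d\,(N\cdot L)\,I_d + k_s\,(N\cdot H)^{n}\,I_s, \qquad H = \frac{L+V}{\lVert L+V\rVert}$$

where N is the surface normal, $k_a$, $k_d$ and $k_s$ correspond to the Ambient, Diffuse and Specular colors, and the exponent n controls the size of the round highlight.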
It is also possible to obtain better results by configuring some aspects of the indirect
illumination in the rendering engine, although this takes more time. This decision ultimately
depends on the designer.
It is then necessary to define the mapping coordinates in the element to be texturized. We
have used the UVW Map modifier, with a projection gizmo, which defines how the map will
be projected onto the surface and how the material will be applied (Figure 41).


Figure 41. Projection Gizmo and material assignment.

Figure 42. Adjustment of dimensions of gizmo to those of the image.

Figure 43. Final Result of Mapping.
However, the visualization of the texture on the surface of the element is not correct, as
the adaptation of the gizmo to its geometry implies uneven steps, and therefore when the
texture is applied it seems stretched or compacted. Final adjustments have to be made to
ensure that the dimensions of the gizmo are proportional to those of the image (Figure 42).
Figure 43 shows the final result, which is very similar to real life.
6. Creation of Textures
Maps or textures are applied to the materials to obtain a realistic effect. In our case the
textures (Figure 44) were taken from digital photographs of the real object and edited
with Adobe Photoshop™.


Figure 44. Texture taken from digital photograph.

Figure 45. Exterior ground of the windmill processed by Adobe Photoshop.

Figure 46. Wooden door from which repetition texture is extracted.
The digital photographs consist of a number of pixels with color and intensity values, giving
images which Adobe Photoshop can work with. These textures can be prepared so that they repeat
indefinitely without the sensation of a repeated pattern. This is shown in Figure 45, applied to
the ground of the exterior of the windmill. The upper part of the image has not been processed
with Adobe Photoshop and the edges of the images do not match; the lower part of the
image shows how this problem is solved.


Figure 47. Image with corrected perspective and area to be cut selected.

Figure 48. Rendered image with texture applied.
The first step when using Adobe Photoshop is to select the RGB (Red-Green-Blue) color
mode and the 8 bits/channel option, which is a standard mode found in televisions and color
monitors. The files used should be saved in .psd format. The next step is to extract the area of the
photograph (texture) which is to be applied, using the lens correction and crop tools. The
example shows the process used for the wooden door of a food store (Figure 46).
The lens correction tool is used to correct the perspective, eliminating the divergence of
parallel lines, and the crop tool is then used to extract the desired area of the photograph (Figure 47).
As digital photographs are already illuminated, it is necessary to adjust the brightness in
order to ensure that the textures extracted are not excessively bright. The brightness has
therefore been reduced by between 10% and 20% in photographs taken inside the windmill,
and between 25% and 30% in photographs taken outside.
Once this has been done, the image is cut and adjusted, obtaining a texture which can be
repeated on the model without clear edges. This is undoubtedly one of the most complex
stages of the work, to give a texture which has a high degree of realism (Figure 48).
7. Rendering and Video Creation
Rendering is a process which calculates the properties of objects before they are shown on
screen; that is, it generates a synthesized image of the scene created. Autodesk 3ds Max has a
rendering engine called Mental Ray and an additional plug-in called VRay, which give
excellent results thanks to the representation of light through rays. Rendering is done in the
active window, which is marked by a thick border (Figure 49).


Figure 49. Active window where the scene is rendered.

Figure 50. Render process window.

Figure 51. Video Post process window.
The Rendering drop-down menu includes the Render command, where the configuration
is done, defining the rendering engine, the range of frames to render, and the format and size of the
images, among other factors.
We chose the .png (Portable Network Graphics) format, with an RGB 24-bit color configuration
(16.7 million colors), with the alpha channel and interlacing activated, which is one of the best
image formats for computer animation. The resolution of the image is 768x576 pixels, with a
width to height ratio of 4:3, although the final video format is a DVD.
Once the configuration has been done, the rendering process itself starts, and a dialogue
box shows the adjustments made and the progress of the process (Figure 50).
Before rendering, the final image is configured, defining the range of color and the output
levels of the final image. This stage is very important, as it controls the clarity of the colors of
the scene (not the illumination), the intensity of the tones, the intensity of the standard lights,
and adjusts the colors so that they correspond to an exterior scene.
Once the rendering process is complete, the frames are linked together using the Video
Post command, which allows the inclusion of many effects; this command is also found in the
Rendering menu.
The first operation to obtain the video is to include the rendered images to make up the list of
images. Then the output format AVI (Audio Video Interleave) is set, as it is a simple and
standard digital video format, and a compression codec is chosen to reduce the size of the
final video so that it is more manageable. The resolution of the final video file is also set, in
this case PAL 768x576 pixels. We obtained a video of 720 MB for the static sequence (a
virtual tour with the windmill not working) and a video of 8.41 GB for the dynamic sequence
(a virtual tour of the working windmill).
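These file sizes can be put into perspective with a rough estimate (assuming the PAL rate of 25 frames per second and 24-bit color) of the uncompressed data rate, which is why a compression codec is indispensable:

$$768 \times 576 \times 3 \approx 1.3\ \text{MB per frame}, \qquad 1.3\ \text{MB} \times 25\ \text{fps} \approx 33\ \text{MB/s} \approx 2\ \text{GB per minute}$$

so even a short uncompressed sequence would greatly exceed the sizes reported above.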


The sequence is executed and the following window appears (Figure 51).
The following 25 images, obtained by rendering, show the degree of realism achieved in
the computer animation process.




8. Postproduction
The final stage in any project in the graphic conservation of industrial heritage is the
postproduction or editing of the video, to create an audiovisual document which effectively
combines audio, video and text. The objective is to give a clear idea of the heritage and its
setting in the production process.
This stage has been carried out with Adobe Premiere, which is a very versatile and
intuitive program with a wide range of video, audio and transition effects.
The final video has the DV-PAL configuration with standard 48 kHz audio, allowing it to
be shown anywhere using any equipment, and with options which allow the user to
personalize the video.
First, the sound and video files are loaded, and the title and subtitles are created using a
text editor. These are then added to the timeline, to set the order and timing of each element,
and the transitions between videos are established. Lastly, the video is exported with the
necessary settings.
Conclusion
This project shows that creating a realistic computer animation requires a great deal of time
and effort. This process is usually carried out by a team of designers equipped with powerful
computers with several high-speed microprocessors, sufficient RAM and graphics cards with
large amounts of memory. In our case, two people and three computers took almost one year
to complete the project.
Another important conclusion to be drawn is the importance of the technical training of
the person who carries out the virtual recreation of the apparatus and devices which make up
the element of preindustrial heritage, as in order to create a true-to-life animation it is
necessary to know how these devices worked and were originally designed. Without such
knowledge, it would be impossible for example to reproduce real working speeds of the
machinery in the animation. At the same time it is extremely useful to be familiar with
forward and inverse kinematics, as it makes working with Autodesk 3ds Max much easier.
Generating a high quality computer animation requires great effort, not only in learning
the main software packages used, but also in learning to use other graphic programs, such as
video editing software and photographic software, which offer many possibilities.
There has been a great deal of progress in the field of virtual reality, for example
augmented reality, which is especially useful in the design of virtual scenes where real-life
images are mixed with virtual images. However, computer animation using specific software
still gives very high quality results, which makes it very useful when the objective is to show
in detail how old machinery worked and its environment.
Computer animation using specific software provides a better solution when dealing with
complex machinery than a virtual reality in which the user interacts with the system, as the
user would need to know in detail how the machinery worked. For example, in the case of a
windmill, how to use the regulation elements such as the counterweight which operates the
raising mechanism of the runner stone or the mechanism which controls the brake rim; this is
specialist knowledge which a normal user is unlikely to possess.
Therefore we have produced an audiovisual animation with as many camera angles as
necessary, which shows quickly and intuitively all the details of the working of the machinery
and its environment, without requiring the user to have expert knowledge. This
shows how non-immersive virtual reality using computer animations can provide many
advantages.
Funding
This research was funded by the Spanish National R&D Plan (HUM2006-00377), “Estudio
histórico-tecnológico y representación gráfica de la evolución en el diseño de los molinos de
viento en la mancha, en la España de los siglos XVI y XVII, mediante técnicas de Dibujo
Asistido”, of the Research Projects Subdepartment of the Universities Department of the
Ministry of Science and Innovation.
References
[1] http://www.virtualheritage.net/
[2] http://www.itabc.cnr.it/VHLab
[3] http://cordis.europa.eu/fp7/ict/telearn-digicult/home_en.html (Cultural Heritage &
Technology Enhanced Learning)
[4] http://whc.unesco.org/en/35/
[5] Refsland, S. T., Ojika, T., Addison A. C., & Stone, R. J. (2000) Virtual heritage:
Breathing new life into our ancient past. IEEE Multimedia, 7, 20-21.
[6] Arnold, D. (2001). Virtual heritage: Challenges and opportunities. Digital content
creation (281-293). New York: Springer-Verlag.
[7] Addison, A. C. (2000). Emerging trends in virtual heritage. IEEE Multimedia, 7, 22-25.
[8] Stone, R. J., & Ojika, T. (2000). Virtual Heritage: What Next?. IEEE Multimedia, 7,
73-74.
[9] Buchanan, R. A. (1972). Industrial Archaeology in Britain. Harmondsworth: Penguin
Books.
[10] Hudson, K. (1971). A guide to the industrial archaeology of Europe. London: Adams
& Dart.
[11] Carandini, A. (1984). Arqueología y cultura material. Barcelona: Mitre.
[12] http://www.mnactec.cat/ticcih/
[13] http://www.industrial-archaeology.org.uk/
[14] http://www.historyoftechnology.org/
[15] Rojas-Sola, J. I. (2005). Ancient technology and Computer-Aided Design: olive oil
production in Southern Spain. Interdisciplinary Science Reviews, 30, 59-67.
[16] Rojas-Sola, J. I., & Domene-García, J. (2005). Engineering and Computer-aided design: A
Study of watermills in Southeastern Spain. Interciencia, 30, 745-751.
[17] Rojas-Sola, J. I., Suárez-Quirós, J., & Rubio-García, R. (2007). The tradition of fulling
mills: a study from engineering. Interciencia, 32, 675-678.
[18] Rojas-Sola, J. I., & López-García, R. (2007). Engineering graphics and watermills:
Ancient technology in Spain. Renewable Energy, 32, 2019-2033.
[19] Pennestri, E., Pezzuti, E., Valentini, P. P., & Vita, L. (2006). Computer-aided virtual
reconstruction of Italian ancient clocks. Computer animation and virtual worlds, 17, 565-
572.
[20] Rojas-Sola, J. I. (2006). Cultural heritage and information technologies: improvement
proposal for science and technology museums and interactive Centers of Venezuela.
[21] Rojas-Sola, J. I., & López-García, R. (2007). Computer-aided design in the recovery and
analysis of industrial heritage: Application to a watermill. International Journal of
Engineering Education, 23, 192-198.
[22] Bakker, G., Meulenberg, F., & De Rode, J. (2003). Truth and Credibility as a Double
Ambition: Reconstruction of the Built Past, Experiences and Dilemmas. Journal of
Visualization and Computer Animation, 14, 159-167.
[23] Rojas-Sola, J. I., & Amezcua-Ogáyar, J.M. (2005). Graphical and Technical study of
windmills in Spain. Interciencia, 30, 339-346.
[24] Rojas-Sola, J. I., & Amezcua-Ogáyar, J.M. (2005). Southern Spanish windmills:
technological aspects. Renewable Energy, 30, 1943-1953.
[25] Rojas-Sola, J. I., Gómez-Elvira González, M.A., & Pérez-Martín, E. (2006). Computer-
aided design and engineering: a study of windmills in La Mancha (Spain). Renewable
Energy, 31, 1471-1482.
[26] Rojas-Sola, J. I., & Amezcua-Ogáyar, J.M. (2005). Origin and expansion of windmills in
Spain. Interciencia, 30, 316-325.
[27] Suárez-Quirós, J., Rojas-Sola, J. I., Rubio-García, R., Martín-González, S., & Morán-
Fernández, S. (2009). Teaching applications of the new computer-aided modelling
technologies in the recovery and diffusion of the industrial heritage. Computer
Applications in Engineering Education, 17, 455-466.
[28] http://www.tdt3d.be/articles_viewer.php?art_id=99
[29] Pavlidis, G., Koutsoudis, A., Arnaoutoglou, F., Tsioukas, V., & Chamzas, C. (2007).
Methods for 3d Digitization of Cultural Heritage. Journal of Cultural Heritage, 8, 93-
98.



In: Computer Animation ISBN: 978-1-60741-559-6
Editors: J.S. Wright and L.M. Hughes, pp. 57-83 © 2010 Nova Science Publishers, Inc.
Chapter 2
VIRTUAL ENGINEERING IN AUGMENTED REALITY
Pier Paolo Valentini*, Eugenio Pezzuti and Davide Gattamelata
University of Rome “Tor Vergata”, Department of Mechanical Engineering
Via del Politecnico, 1 – 00133 Rome, Italy
* E-mail address: [email protected]. (Address all correspondence to this author)
Abstract
In this chapter the authors discuss several approaches to integrating computer-aided
engineering instruments into an Augmented Reality environment. Engineers and designers
often develop their creative ideas in front of a computer monitor using mouse and keyboard.
Although the integration between numerical computation and graphics leads to the generation
of very realistic digital mock-ups, they are still far from the real context and the user has
limited interaction with them. The purpose is to illustrate how recent developments in
computer graphics and image processing can improve the realism of, and the interactivity with,
digital mock-ups. Starting from the interactive modeling of 3D shapes, the chapter presents some
examples of the integration of real-time mechanism motion simulation and of structural and fluid
dynamics analysis post-processing.
Keywords: Virtual Engineering; Augmented Reality; Simulation; Computer-Aided Design
1. Introduction
1.1. The Role of Virtual Engineering
During the last decades computer-aided engineering (CAE) methodologies have deeply changed
the way of designing and developing products, systems and services [1]. Thanks also to
significant hardware and software improvements, CAE techniques are widely used by
designers from the early conceptual phases up to the final stages of engineering processes. At
the industry level, these methodologies have become a fundamental tool for being competitive
and for ensuring high quality standards. In industrial engineering, computer-aided methodologies
are typically instrumental for design teams in shape modeling, behavioral simulations, digital

mock-ups and realistic animations. They are able to follow the development of a product
from conception to production, also managing its life-cycle. For this reason, the science that
supports the use of these tools is called Virtual Engineering. It means that all the tasks which
are typical of an engineer or a designer can be carried out in a virtual way, using informatics
and computers. The benefits of this approach are many. First of all there is a saving of time:
many solutions can be tested, compared and optimized without building physical prototypes.
As a consequence there is a saving of money, because a digital mock-up is much less
expensive than a physical one. On the other hand, a meaningful and reliable virtual simulation
needs an accurate study of the phenomena involved, the definition of parameters and appropriate
simplifications. The available computing capabilities allow the production of very realistic models
and animations which mimic the real behavior of a system. Examples of these models are reported in Figure 1.
Figure 1. Examples of digital mock-ups built and simulated using virtual engineering techniques.
1.2. Augmented Reality
The Augmented reality (AR) is an emerging field of the visual communication and
information technologies [2-4]. It deals with the combination of real world images and
computer generated data. Although the idea of virtual reality can be dated in the last century
and the development of portable displays is dated 1966, the phrase “augmented reality” was
coined by Prof. Tom Caudell only in 1990 considering an application developed at Boeing to
help workers in assembling aircraft components. At present, most AR research is concerned
with the use of live video imagery which is digitally processed and "augmented" by the
addition of computer generated graphics. The idea behind the augmented reality is simple, but
its development has been slowed down by the inadequacy of hardware resources to support
real time heavy computation. Only during last decades, thanks to the increasing of hardware
performance, the research in the field of augmented reality boosted [5].
With an AR system, the user can extend his visual perception of the world, being
supported by additional information and virtual objects. The level of detail of the augmented
scene has to be very realistic in order to give the user the illusion of a unique real world.
There are different types of AR applications depending on the level of details, graphics
effects and interactivity (Figure 2).
A generic AR implementation is depicted in Figure 3. The user interacts with the
application, which has to manage the presence of interfaces, ensure a correct tracking of
the user and objects in the real scene and compute an adequate and realistic collimation between
the real world and the augmented contents.
At the first level there are the tracking and the registration technologies; the first aims to
calculate the user’s point of view with respect to the scene while the registration deals with
the collimation of virtual world objects with the real environment. The human-scene
interfaces can be mechanical, electromagnetic or optical devices. The second basic element
for an AR system is the capability of real time rendering. The detail of rendered scene
depends on hardware capabilities. The display technology ensures the immersion of the user
in the scene. The ways to capture images from the real world, process them and project them back
to the user can differ [6]. Three different technologies are currently implemented:
• the video based system;
• the optical see-through system;
• the video see-through system.
In the first one, as shown in the left-hand picture of Figure 2, the images coming from the
real world are augmented with simple graphics data and streamed on a monitor. This
approach is often used in sports to superimpose data on a common television broadcast. It
is quite simple, but the user is not immersed in the scene at all.
In optical see-through systems, the information is not directly captured from the real
world, but the augmented contents are added by projecting them onto a semi-transparent visor
which naturally mixes the real perception with the augmented one. The user feels immersed
in the scene, but there are some limitations on the quality of the images and some difficulties
in collimating real and virtual objects.

Figure 2. Examples of augmented reality applications. On the left: a simple video augmented
application in which a virtual line is added to the video stream in a swimming competition. On the
right: a complex augmented scene for training in medicine (courtesy of Institute for Computer Graphics
and Vision, Austria).
Figure 3. Layout of a generic high-end Augmented Reality System.
The video see-through system (see Figure 2, on the right, and the scheme depicted in
Figure 4) is based on the use of one or two cameras which acquire an image stream from the
real world. The stream is processed by a computer which adds virtual contents, producing an
augmented image stream which is projected back to the user by means of a blind visor. This
increases the level of immersion in the scene, and the quality of the images depends
on the resolution of the visor. As witnessed by several investigations, video see-through
systems have proved to be more suitable for different applications, especially in the engineering
field.
Figure 4. AR Video see-through system.
Scientific literature reports an increasing interest for the development of applications of
augmented reality in many different fields [7-9]. The AR has been used in medicine and
surgery [10] to improve the reliability of complex clinical treatments and assist operations
(image-guided surgery). Moreover, for military purposes, the army is already using AR
displays in cockpits where screened information are shown to the pilot on the windshield of
the cockpit or the visor of their flight helmets. Other fields of interests in AR technologies are
the robotics [11] and telerobotics in which an augmented display can assist the user with a
visual image of the remote workspace to guide the robot movements. AR is also useful in
maintenance and assembling activities [12-14] where technicians can approach a new or
unfamiliar piece of equipment simply putting on an AR display, instead of opening several
repair manuals, and visualizing information directly on the desired object. AR applications
have been developed in architecture [15] for perceiving the structural modifications inside
and outside a house, by superimposing walls and furniture on the current solution and
perceiving the results in a realistic way. There are also applications in e-learning [15-18],
manufacturing [19-20], services and logistics [21-22], arts [23], navigation [24], etc.
Most of these applications deal with merging into the real world objects, scenes
and animations which have been modeled and simulated outside the system. It means that the
user perceives a real scene augmented with pre-computed objects. His interaction with them is
often limited to exploration.
In order to extend the application of augmented reality in the engineering field we have to
move some steps forward. A designer or an engineer cannot be limited to the exploration of
the scene; he has to interact with objects to modify, create, animate and materialize his
creativity [25-29]. For this purpose augmented reality has to be enriched with tools (haptic
devices) which allow interaction with the objects in the scene. The most important interaction
concerns the tracking of the user’s position in the augmented scene. There are five types of
tracking devices, depending on the methodology used for measuring the
position in the space: mechanical, electromagnetic, optical, acoustic and inertial. The first
ones use a multi degree-of-freedom linkage to compute the position in the space of a pointer.
They have the advantage to be very precise and rather cheap but they have a small workspace
and limit the user’s movement. The electromagnetic devices are comprised of an emitter and
a receiver. The emitter generates a magnetic field which is captured by the receiver. The
changing of the acquired signal is converted to information about the position and attitude of
the receiver. Due to its small size, the receiver can easily be worn by the user or attached to a
tracking stick. On the other hand, electromagnetic devices are sensitive to interference with
the magnetic field caused by electronic equipment or bulky ferromagnetic objects. Optical
tracking systems are more complex. They require the use of several markers and two or more
high-speed cameras. Their precision depends on the resolution of the cameras and the size of
the markers. They allow wide working area, but require a specific setup to ensure that the
markers are always visible to the cameras. Acoustic devices are comprised of an emitter and
several receivers. The emitter generates an acoustic signal whose time of flight is acquired by
each receiver and converted into spatial position information. They are quite cheap devices but
cannot ensure great precision, and they are sensitive to temperature and humidity variations and
to the presence of echoes. Inertial devices use accelerometers and gyroscopes to measure the
position and the attitude. They require a frequent calibration but offer good precision.
In a general application, the interaction with the scene has to fulfill the following
requirements:
• the devices have to be easy to use;
• the devices have to be precise and allow acquisition at a high frequency;
• their application has to be intuitive and natural, without limiting the user’s movement;
• they have to support the ability and the intent of the designer, being an ally and not
an obstacle.
Unfortunately many of these devices or systems are very expensive and their cost
limits their usage to large research facilities or large industries.
The AR system has some similarities with the Virtual Reality (VR) one. The main
difference is that in VR the perceived world is fully virtual (generated by one or more
rendering pipelines), while in AR the virtual world merges with the real one. This mixing
involves not only the generation of rendered graphics, but also complex procedures for
registering the real and the virtual video streams. Moreover, in order to give the user the most
natural perception of the augmented world, all these activities need to be processed in real
time. Such demanding requirements slowed down the advancement and research of
AR systems with respect to VR ones.
1.3. Motivation and Objectives
The motivations of this chapter come from all the considerations discussed in the previous
section and are fuelled by the idea that the future of virtual engineering will lie on virtual and
augmented platforms, in order to increase the user’s interaction and the level of realism and
perception.
According to many researchers, the AR system can be the future of Computer-Aided
Design (CAD) technologies and Virtual Engineering. Presently CAE applications support the
designer through numerical computation and computer graphics. Often engineers and
designers develop their creative ideas in front of a computer monitor using mouse and
keyboard. Although the integration between numerical computation and graphics leads to the
generation of realistic digital mock-ups, they are still far from the real context and the user
has a limited interaction with them. This limitation can generate problems (non-conformities,
unexpected behaviour and appearance, for instance) when the designed products have to be
integrated in the real world. For overcoming this disadvantage, new instruments, based on AR
systems can be set up.
The contents of the chapter concern the possible implementation of
virtual engineering tools in an augmented reality environment. Three aspects will be addressed:
the modeling of three dimensional shapes, the simulation of the physical motion of mechanisms
and the visualization of engineering structural and fluid-dynamics investigations. Both
hardware and software details will be discussed, proposing an implementation of a low-cost
system.
2. 3D Modelling in Augmented Reality
The first step toward the building of a virtual prototype is the modeling of shapes. Standard
applications implemented into Computer-Aided Design systems make use of keyboard,
mouse and, in some cases, of spatial pointers. The results of modeling actions can be viewed
by the designer on the pc monitor. They have a good level of realism but they are limited
inside the monitor. The designer, with his personal ability, has to extrapolate these results and
imagine how they will fit in the real world. In order to overcome this limitation, virtual shapes
can be merged in the real world using augmented reality techniques. This solution still limits
the creativity of the designer because he has to model using a monitor interface and then
project the shapes in the real world. A more interactive solution involves the possibility of
creating virtual shapes directly in the real environment, bypassing the use of external interfaces.
In order to arrange a system able to allow the user real-time modeling and visualization,
appropriate hardware has to be set up and software has to be implemented. In the following
section a low cost solution that has been implemented by the authors will be described.
2.1. Hardware Setup
The implemented system is made of input, processing and output devices. Input devices have
to acquire a real world video stream and user actions. Output devices have to project an
augmented perception of the real world enriched with virtual objects. Processing units have to
manage inputs coming from different devices, store and arrange data flows and render the
augmented video stream.
The input video device of the implemented system is a Microsoft LifeCam VX6000 USB
2.0 camera, able to capture frames at up to 30 Hz with a resolution of 1024x768 pixels. This
camera has been rigidly mounted on the Head Mounted Display (Figure 5). The position
tracking system has been developed starting from an electromagnetic 6 degrees of freedom
sensor named Flock of Birds by Ascension (http://www.ascension-tech.com) in combination
with a wood pen (named “I-Pen”). Flock of Birds is an electromagnetic tracker similar to
those described in the introduction. Its receiver (a small cube) is rigidly constrained to the I-
Pen and the emitter is placed inside the design space. Another input device is a standard pc
keyboard for user key command.
The processing unit is a personal computer with a Pentium IV quad core processor, 3 GB
RAM and an NVidia Quadro FX3700 graphics card; the operating system is Windows XP
Professional and the development suite for programming is Microsoft Visual Studio 2005.
The output display is a Head Mounted Display equipped with OLED displays (Z800 3D
visor by Emagin - http://www.3dvisor.com/). It is able to support stereovision up to a
resolution of 800x600 per eye.
The emitter of the electromagnetic tracker has to be rigidly fixed inside the working area.
Since it has a transmission range of less than 1 m, it is useful to place it in the middle of the
area. The presence of metallic objects strongly affects the measurement, so it is important to
ensure correct shielding or an appropriate distance of ferromagnetic parts from the working
zone. In the working area, in a place that can always be seen by the camera, a patterned
marker has to be located. Its presence allows the computation of the relative position between
the camera and the real world, as explained in the following section. The user has to wear the
head mounted display and the camera and has to grab the I-Pen in a natural way (Figure 5).
2.2. Software Setup
Since the system is assembled from scratch, reliable software has to be implemented in
order to manage the various data flows, to control the devices and to generate an augmented
real-time video stream. Since the drivers of the electromagnetic tracker were available in C++,
all the procedures have been implemented in the same programming language.
Figure 5. Input and output devices of the implemented system.
In the system there are three main data flows. The first is the video stream coming from
the camera. This video is processed in order to find out the relative position between the
camera and the real world. Since the camera is fixed on the head mounted display, it is also
the relative position between the user and the real world. The computation is possible thanks to
the ArtToolkit libraries. Their source code (freely available together with documentation at
http://sourceforge.net/projects/artoolkit) is widely used to develop augmented reality
applications. The ArtToolkit routines can analyze the scene, recognize a patterned square
marker (such as that in Figure 4), and find out the spatial transformation between the camera and
the real world. At the same time another data flow comes from the tracking system. So it is
possible to locate the tip P of the I-Pen in the world reference frame of the emitter (O-XYZ) as
(see Figure 6):
$$\{P\}_{O-XYZ} = \{O_p\}_{O-XYZ} + [T_{\text{I-Pen}}]\,\{P\}_{O_p - x_p y_p z_p} \qquad (1)$$
where:
$\{*\}_A$ are the coordinates of the generic vector with respect to the $A$ system of coordinates;
$[T_{\text{I-Pen}}]$ is the spatial transformation between the reference frame of the I-Pen sensor and
that of the emitter.
Both $\{O_p\}_{O-XYZ}$ and $[T_{\text{I-Pen}}]$ can be computed with the embedded driver of the tracker.
Figure 6. World, emitter and I-Pen reference frames.
In order to collimate the three reference frames of the camera, the world and the I-Pen,
another transformation has to be computed. It relates the emitter with the world patterned
marker. Since these two reference frames are fixed in space, this transformation
($\{O_W\}_{O-XYZ}$ and $[T_{\text{marker}}]$) can be computed off-line when setting up the working
environment. By doing this it is possible to locate the I-Pen tip P in the reference frame of the
marker as:
$$\{r'\}_{O-XYZ} = \{P\}_{O-XYZ} - \{O_W\}_{O-XYZ} \qquad (2)$$
$$\{P\}_{O_W - x_W y_W z_W} = [T_{\text{marker}}]^{-1}\,\{r'\}_{O-XYZ} = [T_{\text{marker}}]^{T}\,\{r'\}_{O-XYZ} \qquad (3)$$
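A minimal C++ sketch of this change of coordinates is given below; the matrix and vector types are hypothetical, the tracker output is assumed to be already available as a rotation matrix plus origin, and the sketch simply restates equations (1)-(3) rather than reproducing the authors' actual code:

#include <array>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// r = M * v
Vec3 mul(const Mat3& M, const Vec3& v) {
    Vec3 r{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            r[i] += M[i][j] * v[j];
    return r;
}

// r = transpose(M) * v (the inverse of a rotation matrix is its transpose)
Vec3 mulT(const Mat3& M, const Vec3& v) {
    Vec3 r{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            r[i] += M[j][i] * v[j];
    return r;
}

// Equation (1): tip of the I-Pen in the emitter frame O-XYZ.
// T_ipen and O_p come from the tracker driver; P_local is the (constant)
// position of the tip in the sensor frame.
Vec3 tipInEmitterFrame(const Mat3& T_ipen, const Vec3& O_p, const Vec3& P_local) {
    Vec3 t = mul(T_ipen, P_local);
    return {O_p[0] + t[0], O_p[1] + t[1], O_p[2] + t[2]};
}

// Equations (2)-(3): the same point expressed in the marker (world) frame,
// given the off-line calibrated marker origin O_w and rotation T_marker.
Vec3 tipInMarkerFrame(const Mat3& T_marker, const Vec3& O_w, const Vec3& P_emitter) {
    Vec3 r = {P_emitter[0] - O_w[0], P_emitter[1] - O_w[1], P_emitter[2] - O_w[2]};
    return mulT(T_marker, r); // transpose = inverse for a pure rotation
}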
2.3. Examples
In this section some examples of modeling will be discussed. With reference to Figure 7 one
can see how to sketch planar geometry such as lines and polygons. The user can draw
using the I-Pen in the same way he would with a pencil and a sheet of paper. The graphic
modeler, based on OpenGL, renders the scene, augmented with the sketched objects. The
augmented scene also includes virtual tripods in order to show where the world system of
coordinates is located (on the patterned marker) and where the I-Pen is pointing. Figure 8
shows examples of three dimensional geometry. Starting from sketches, the user can extrude
1D and 2D entities, building surfaces and solids. This can be done by simply dragging the I-Pen
along the path of extrusion in a natural way. The results can be visualized in preview
during the modeling, so the user can adjust and correct his operation in real time. Starting
from the picking of points in space, the user can also define free form surfaces (Figure 9).
These entities are very suitable for aesthetic design purposes, reverse engineering and
advanced modeling. The user can define and modify the control points of a surface, simply by
moving the I-Pen in the working area and selecting locations.
Figure 7. Modelling of objects in plane (lines, on the left and a triangle, on the right).
3. Simulating and Animating in AR
The capabilities of augmented reality can be also used to support engineering simulation. In
particular, this section focuses on the three main types of simulation often performed by
engineers: the movement of a linkage (kinematics and dynamics), the deformation of an
object subjected to loads, and the dynamics of fluids inside or outside objects. When a
designer prepares a virtual model for one of these simulations, he has to define geometries,
constraints and boundary conditions. All these tasks can be supported by the implemented AR
system which can serve as pre- and post-processor. Moreover, the great advantage that the
augmented reality can give to simulation is in the visualization of results, i.e. in post-
processing. In this way the user can integrate in the real world the visual results of several
numerical analyses in a more realistic way.
Figure 8. Modelling of 3D extruded objects (a surface based on a spline, on the left and two cylinders,
on the right).
Figure 9. Modelling of 3D complex objects (a free form surface based on control points, on the left and
a patched surface to be integrated in an existing models of a car, on the right).
3.1. Multibody Animation
With multibody simulations engineers analyze the kinematics and dynamics of moving parts
and linkages. Starting from the knowledge of mass and inertial properties of each component,
topological constraints (joints and fixtures) and external actions (forces, torques, motors), it is
possible to simulate the movement of a system in a reliable way. Readers interested in
techniques for deducing the governing equations and solution strategies can refer to
referenced books [30-31]. The AR can support multibody simulations in two different ways.
The first is the possibility of projecting onto the real world the results coming from a pre-
computed simulation. It concerns the rendering on the scene of all the objects involved in the
simulation, whose positions are updated according to the results of the computation. This
implementation is similar to that of common post-processing software for visualizing
graphics results. The only difference lies in the merging of the simulated system into the real
world. The advantage is to perceive the interaction with the real world and check working
spaces, possible interferences, etc. Although useful, this approach does not use all the
potential of AR. Let us consider a practical application in order to illustrate this first possible
integration between motion simulation and augmented reality. The purpose is to simulate a
robot that has to be mounted on a flange by means of a revolute joint. The flange is present in
the real world as a physical component (Figure 10). The robot and its movement have to be
added as virtual shapes. The first step is to build the virtual parts. This task can be done using
the modeling techniques illustrated in the previous section or also using a computer-aided
design program. In this case it is useful to export geometries using .vrml (Virtual Reality
Mark-up Language) file. This format is quite common and can be exported by almost all solid
modelers. Dealing with .vrml is useful for rendering geometries using OpenGL libraries. The
second step is to perform the numerical computation using a specific solver. In order to
prepare the input file for simulation, it is useful to consider how to relate the real world to the
virtual world. The main issue is to collimate the two main reference frames. This operation
can be made using communication reference frames in the same way as in sub-structuring large
assemblies. In other words, we can consider the virtual world (and the objects in it) as a
subsystem of the real world. The relation between the main system and a subsystem is
controlled by several constraint equations acting on the communication reference frames. For
the specific example of the robot, it is convenient to locate the real world communication
frame on the flange where the revolute joint has to be implemented. Similarly, we can define
the virtual communication frame coincident with the inertial reference frame of the virtual
system. When building the equations for the simulation, we have to create a fictitious revolute
joint between the virtual inertial reference frame and a reference frame on the first link of the
robot. This approach can be summarized in the following five steps:
1. Before the simulation starts, the geometries and topological properties (joints and
connections) have to be defined, built and stored in files;
2. The real scene has to contain information for collimating the real world to the virtual
objects (communication frames);
3. The equations of motion of the investigated mechanisms have to be built and
externally solved by means of a specific solver;
4. After the numerical solution, the simulation results have to be accessible to the AR
executable.
5. At each frame acquisition, virtual objects have to be rendered on the scene in the correct
position and attitude according to the simulation results, considering the position of
the communication frames and using OpenGL ModelView transformations.
The definition of the real world reference frame can be done in two different ways. In the
first, the patterned marker can be placed on the flange where the communication frame is
desired (Figure 10). The second way involves picking the points defining the
communication frame with the I-Pen and recording the coordinates of the origin and
main axes.
As a result, the augmented scene includes the real world with the virtual robot. Since the
movement of the manipulator has been simulated by solving the equations of motion, the
augmented scene will show an animation of the robot. In this way the user can visualize, from
different points of view, the motion of the manipulator directly in the real world, looking at its
performance, verifying working space, checking possible interference with real objects and
the aesthetic impact on the real environment.
The augmented video can be enriched with other visual information on kinematic and
dynamic parameters such as velocity, acceleration, force, torque, joint reactions, etc. This can be
done by rendering on the scene both vectors (for the direction) and numerical values (for
the amplitude) as static overlays.
A smarter way to enhance the multibody simulation is to introduce interactivity. It means
that the user does not only watch the augmented scene, but interacts with it. Let us imagine the
kinematic simulation of a robot arm whose end-effector can be grabbed by the user and
moved. The purpose of the simulation is to compute the attitude and position of all the links in
order to obtain the required position and orientation of the end-effector. This simulation can
be supported by augmented reality by introducing into the real scene the virtual robot, which can
be interactively manipulated by the user using virtual sensors. Of course, the simulation of the
mechanism has to be computed in real time in order to update the scene with quick information.
This idea can be implemented following five steps (Figure 11):
1. Before the simulation starts, the geometries and topological properties (joints and
connections) have to be defined, built and stored in files;
2. The real scene has to contain information for collimating the real world to the virtual
objects and the virtual sensor(s) for the interactive action of the user;
3. At each frame, the position and attitude of all the markers in the scene have to be acquired
and the mathematical transformations between the camera and the markers have to be
computed;
4. At each frame, starting from the previous recognition and calculation, the dynamic or
kinematic equations have to be solved in order to compute the correct position and
attitude of all the virtual bodies in the scene;
5. At each frame acquisition, virtual objects have to be rendered on the scene in the correct
position and attitude.
Figure 10. Simulation of the movement of a manipulator in augmented reality. Starting from a real
environment and adding collimated virtual objects (on the top), an augmented animation can be built (at
the bottom).
Figure 11. Activities for implementing interactive simulation in augmented reality.
The input and output procedures (data acquisition, markers recognition, rendering of the
virtual objects) can be implemented using the ArtToolkit open source library.
Between the input procedures (data acquisition and markers recognition) and the output
procedures (rendering of the virtual objects on the input scene) a specific kinematic or
dynamic solution has to be implemented. This portion of the algorithm depends on the
specific mechanism to be simulated. Since it has to be performed between the acquisition of
two consecutive frames, all the equations need a real-time solution [32-34]. For this purpose it
is useful to optimize the solution strategy; kinematic simulations are more suitable
because they involve the solution of a system of nonlinear equations, instead of a system of
differential-algebraic equations.
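As a sketch of what such a real-time kinematic solution can look like, the closure equations can be solved at every frame by a few Newton-Raphson iterations, warm-started from the previous frame's solution. The function names and the dense linear solver below are generic illustrations under that assumption, not the specific solver used by the authors:

#include <cmath>
#include <functional>
#include <vector>

// Solve Phi(q) = 0 by Newton-Raphson, starting from the previous frame's
// solution q. 'phi' returns the constraint residuals, 'jac' their Jacobian.
// Both are supplied by the specific mechanism model (left abstract here).
bool solveClosure(std::vector<double>& q,
                  const std::function<std::vector<double>(const std::vector<double>&)>& phi,
                  const std::function<std::vector<std::vector<double>>(const std::vector<double>&)>& jac,
                  int maxIter = 10, double tol = 1e-8)
{
    const std::size_t n = q.size();
    for (int it = 0; it < maxIter; ++it) {
        std::vector<double> f = phi(q);
        double norm = 0.0;
        for (double v : f) norm += v * v;
        if (std::sqrt(norm) < tol) return true;             // converged

        std::vector<std::vector<double>> J = jac(q);         // n x n Jacobian
        // Solve J * dq = -f by Gaussian elimination (no pivoting, for brevity).
        std::vector<double> b(n);
        for (std::size_t i = 0; i < n; ++i) b[i] = -f[i];
        for (std::size_t k = 0; k < n; ++k) {
            for (std::size_t i = k + 1; i < n; ++i) {
                double m = J[i][k] / J[k][k];
                for (std::size_t j = k; j < n; ++j) J[i][j] -= m * J[k][j];
                b[i] -= m * b[k];
            }
        }
        std::vector<double> dq(n);
        for (std::size_t i = n; i-- > 0;) {
            double s = b[i];
            for (std::size_t j = i + 1; j < n; ++j) s -= J[i][j] * dq[j];
            dq[i] = s / J[i][i];
        }
        for (std::size_t i = 0; i < n; ++i) q[i] += dq[i];   // update estimate
    }
    return false; // did not converge within the frame budget
}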
In order to explain in detail all the steps needed to implement a multibody simulation in an
Augmented Reality environment, we develop an example based on a 3D manipulator.
Figure 12. The 3D manipulator of the example, and its closure loop.
With reference to Figure 12, let us consider a robot with 4 bodies connected with 3
revolute joints and 1 spherical joint. According to Grubler’s count the mechanism has 6
degrees of freedom:
$$dof = 6 \cdot n_{link} - \sum_{i=1}^{joints} f_i = 6$$
where $n_{link}$ is the number of moving parts, $joints$ is the number of kinematic pairs and $f_i$ are
the degrees of constraint of the i-th pair (here $6 \cdot 4 - (5 + 5 + 5 + 3) = 6$).
It means that, in order to define in a unique way the position in space of the manipulator,
we have to prescribe 6 independent parameters (i.e. position and attitude of the end-effector).
For an interactive simulation it means that the user can freely choose the position and attitude
of the end effector and the Augmented Reality scene has to be able to include such a 6 d.o.f.
sensor. This sensor can be represented by a patterned marker.
The first step in building the model is the construction of geometries of each link that can
be stored in .vrml (Virtual Reality Mark-up Language) file. This format is quite common and
can be exported by almost all solid modelers. Dealing with .vrml is useful for rendering
geometries using OpenGL. The second step is about the preparation of the scene. We need a
marker (marker 0) to define the position and orientation of the manipulator world coordinate
system (i.e. its position inside the scene) and another marker (marker 1) to define the position
and the orientation of the end-effector that will work as a sensor (Figure 12, on the right).
The third step is about the implementation of the system of constraint equations which
can be built considering the closed loop of vectors (Figure 12, on the right). Several
approaches can be used for building the system of equations. It is useful, for the subsequent
graphical operations, to use the 4x4 homogeneous transformation matrix $[T_{l1-l2}]$ to express
the relative position and attitude between two generic links 2 and 1. The first 3x3 portion of
this matrix is used to define the relative orientation between the two reference frames attached
to the two links. The last column is used to describe the relative position between the origins
of the coordinate frames. The last row of the matrix is [0 0 0 1]:
$$[T_{l1-l2}] = \begin{bmatrix} [\text{Orientation}]_{3\times 3} & \{\text{Position}\}_{3\times 1} \\ 0\;\;0\;\;0 & 1 \end{bmatrix}$$
Looking at the tip point P on the link 2 (center of the spherical joint) we can deduce its
position with respect to the marker 0 as:
$$\{P\}_{marker0} = [T_{marker0-l0}]\cdot[T_{l0-l1}]\cdot[T_{l1-l2}]\cdot\{P\}_{link2} \qquad (1)$$
where:
$\{P\}_{marker0}$ is the position vector of point P in the marker 0 (world) coordinate system;
$\{P\}_{link2}$ is the position vector of point P in the link 2 (local) coordinate system;
$[T_{marker0-l0}]$ is the homogeneous transformation matrix between link 0 and marker 0. It is a
function of the parameter which describes the relative rotation at their relative
revolute joint;
$[T_{l0-l1}]$ is the homogeneous transformation matrix between link 1 and link 0. It is a
function of the parameter which describes the relative rotation at their relative
revolute joint;
$[T_{l1-l2}]$ is the homogeneous transformation matrix between link 2 and link 1. It is a
function of the parameter which describes the relative rotation at their relative
revolute joint.
Looking at the same point P, but on the slider and considering that the slider is attached
to the marker 1, we can deduce its position with respect to the marker 0 as:
$$\{P\}_{marker0} = [T_{0-1}]\cdot\{P\}_{slider} \qquad (2)$$
where:
$\{P\}_{slider}$ is the position vector of point P in the slider or marker 1 (local) coordinate system;
$[T_{0-1}]$ is the homogeneous transformation matrix between marker 1 and marker 0. It is a
function of the 6 independent parameters which describe the relative position and
the relative rotation between the two markers. These parameters can be considered as
the input of the kinematic analysis because they can be freely chosen by the user to move
the manipulator in space.
Since at point P, link 2 is connected to the slider by means of a spherical joint, we can obtain
the closure loop equation of the mechanism as:
$$[T_{marker0-l0}]\cdot[T_{l0-l1}]\cdot[T_{l1-l2}]\cdot\{P\}_{link2} - [T_{0-1}]\cdot\{P\}_{slider} = \begin{Bmatrix}0\\0\\0\\0\end{Bmatrix} \qquad (3)$$
The system of equations in (3) can be solved for the unknown kinematic parameters starting from the knowledge of the position and attitude of marker 1 (and the end-effector slider) with respect to marker 0. In order to compute this information we have to know the relative transformation between marker 1 and marker 0.
ARToolkit deals with matrices which are similar to the homogeneous ones. They are 3x4 transformation matrices, containing the same information about relative position and attitude as the homogeneous ones, but without the last dummy row. Quaternions and the position vector can be extracted from these matrices by using the ARToolkit procedure arUtilMat2QuatPos.
The pose of the camera with respect to marker 0 ($[T_{0-c}]$) and with respect to marker 1 ($[T_{1-c}]$) can be computed using the ARToolkit procedure arGetTransMat, which computes the camera position and attitude as a function of the detected markers. Their inverse matrices
($[T_{c-0}]$ and $[T_{c-1}]$) represent the relative transformations between the markers and the camera. The relative transformation between marker 1 and marker 0 can then be computed as:

$$[T_{0-1}] = [T_{0-c}] \cdot [T_{c-1}] \qquad (4)$$
This relation is used to compute the coordinates of point P on the slider (fixed to marker 1) with respect to marker 0. By doing this, equation (3) can be solved for the unknown angles. At this point the virtual position and attitude of each link of the mechanism are known. The next step is to compute the projection of the geometries in the augmented scene. Since the renderer is the OpenGL engine [35], we have to deal with the ModelView projection matrix, which maps the 3D point coordinates into 2D (scene) coordinates. The first step concerns the loading of the projection matrix computed from the perspective of marker 0. This task can be performed by using arglCameraViewRH, whose output is a vector of 16 elements containing the transformation and scale information needed to project an object from the 3D coordinate system of marker 0 into the camera view plane. This transformation can be used to relate the position and attitude of each object between the virtual world coordinate system and marker 0. The next steps concern the computation of the transformations needed to draw the manipulator links in the scene, using the OpenGL operators glRotated and glTranslated. The first performs a rotation of a given angle about a specified axis, the second performs a translation of a specified amplitude along a specified direction.
For the base it is sufficient to perform a rotation about the z axis of marker 0 of an angle α:
glRotated(alpha, 0.0, 0.0, 1.0);
For the first link, three transformations are needed: a translation a1 along the z axis of marker 0, a rotation of an angle α about the same z axis, and a rotation of an angle β about the axis of the first revolute joint:
glTranslated(0.0, 0.0, a1);
glRotated(alpha, 0.0, 0.0, 1.0);
glRotated(beta, 0.0, -1.0, 0.0);
For the second link, three transformations are needed: a rotation of an angle α about the z axis of marker 0, a translation to the center of the revolute joint between the 1st and the 2nd links, and a rotation of an angle γ about the axis of the revolute joint between the 1st and the 2nd links:
glRotated(alpha, 0.0, 0.0, 1.0);
glTranslated(l1*cos(beta), 0.0, l1*sin(beta)+a1);
glRotated(gamma, 0.0, -1.0, 0.0);
The end-effector, since it is fixed to marker 1, can be projected using the marker 1 projection matrix without applying any further transformation.
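The per-link transformations can be grouped into a single drawing routine. The sketch below is only an illustration: it assumes the 16-element modelview array for marker 0 has already been obtained (in this application from arglCameraViewRH, whose exact signature depends on the ARToolkit version); drawBase, drawLink1 and drawLink2 are hypothetical placeholders for the .vrml geometries; and the joint angles are assumed to be expressed in degrees, as glRotated expects, hence the conversion before cos/sin.

#include <cmath>
#include <GL/gl.h>

// Hypothetical helpers that would render the .vrml link geometries.
void drawBase()  { /* render base geometry here */ }
void drawLink1() { /* render first link geometry here */ }
void drawLink2() { /* render second link geometry here */ }

const double DEG2RAD = 3.14159265358979323846 / 180.0;

// modelview0: 16-element modelview matrix projecting marker 0 coordinates into
// the camera view (filled elsewhere). alpha, beta, gamma are the joint angles
// (degrees) solved from equation (3); a1 and l1 are the link dimensions.
void drawManipulator(const double modelview0[16],
                     double alpha, double beta, double gamma,
                     double a1, double l1)
{
    glMatrixMode(GL_MODELVIEW);
    glLoadMatrixd(modelview0);                  // camera <- marker 0

    glPushMatrix();                             // base: rotation alpha about z
    glRotated(alpha, 0.0, 0.0, 1.0);
    drawBase();
    glPopMatrix();

    glPushMatrix();                             // first link
    glTranslated(0.0, 0.0, a1);
    glRotated(alpha, 0.0, 0.0, 1.0);
    glRotated(beta, 0.0, -1.0, 0.0);
    drawLink1();
    glPopMatrix();

    glPushMatrix();                             // second link
    glRotated(alpha, 0.0, 0.0, 1.0);
    glTranslated(l1 * std::cos(beta * DEG2RAD), 0.0,
                 l1 * std::sin(beta * DEG2RAD) + a1);
    glRotated(gamma, 0.0, -1.0, 0.0);
    drawLink2();
    glPopMatrix();
}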
Screenshots of the animation are reported in Figure 13.
Figure 13. Snapshots of the interactive animation.
3.2. FEM Pre and Postprocessing
By means of finite element analyses, engineers evaluate the mechanical stresses inside a structure and its deformation under applied loads. Starting from the knowledge of the elastic properties of each part, the topological constraints between parts and the presence of external actions (forces, torques, enforced displacements), it is possible to simulate the deformation of a system and
assess the level of mechanical stresses inside it. Readers interested in techniques for deducing the governing equations and solution strategies can refer to the referenced book [36].
Let us consider a practical application in order to illustrate how augmented reality can support this kind of engineering simulation. The purpose is to simulate a steel bar that is pinned on two mountings and loaded with a force. The two mountings and the bar are physically present in the real world (Figure 14). The force, the resultant deformation and the stress field have to be added as virtual contents. The first step is to build the virtual parts. This task can be done using the modeling techniques illustrated in the previous section, using the I-Pen to acquire the geometry of the bar. The second step is to define the location and the amplitude of the force. This can also be done with the I-Pen, by pointing at the force location in the real world and defining the load amplitude using the keyboard. Then, we have to perform the numerical computation using an external FEM solver. The last step is the projection of the numerical results onto the real scene. Generally, the main results of such simulations consist of a deformed shape of the structure (which reproduces the deformation in an amplified way) coloured with a palette indicating the stress level. This information has to be added to the real world. Again, we have to collimate the two main reference frames of the real world and the virtual one.
Figure 14. Simulation of the deformation and stresses of a pinned bar in augmented reality. Starting from a real environment and adding collimated virtual objects (on the top), an augmented animation can be built (at the bottom).
This operation can be made using communication reference frames in the same way as in the previous discussion. For the specific example of the bar, it can be convenient to put a visible patterned marker located near the physical bar (Figure 14), defining the real world communication frame. Similarly, we can define the virtual communication frame at the corresponding location in the virtual world. In this way it is possible to locate the virtual deformed and colored bar in the real scene with a simple coordinate transformation (translation and rotation) using the OpenGL operators glRotated and glTranslated.
As a result, the augmented scene includes the real world with the virtual bar (Figure 14, at the bottom). The scene can also be enriched by including an animation of the deformed shape instead of a static geometry. This is especially useful for vibration analyses, where the visualization of the modal shapes is very important. In this way, the user can visualize, from different points of view, the deformation and the stress level of the bar directly in the real world, looking at its performance and verifying excessive stresses or deformations.
The implementation can be summarized in the following six steps:
1. The real scene has to contain information for collimating the real world to the virtual objects. A patterned marker or a coordinate system defined by picking points with the I-Pen can be used;
2. The geometry of the component(s) under investigation has to be acquired;
3. Boundary conditions (constraints and loads) have to be defined too;
4. The computation of deformed shapes and stress levels can be performed using an external FEM solver;
5. The results coming from step 4 have to be converted into coloured .vrml entities (i.e. deformed shapes with a stress contour plot);
6. The .vrml entities have to be collimated with the real world and rendered on the augmented scene.
The augmented video can be enriched with other visual information on the exact numerical values of stresses (e.g. by adding a virtual legend), constraint and restraint forces, etc. This can be performed by rendering in the scene both vectors (for the direction) and numerical values (for the amplitude).
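As an illustration of step 5 of the list above, the sketch below (an assumed data layout, not the authors' code) converts per-vertex FEM results, displacements and a normalized scalar stress, into a coloured VRML 2.0 IndexedFaceSet that can then be collimated and rendered in the augmented scene.

#include <cstdio>
#include <vector>

struct Vec3 { double x, y, z; };

// Map a normalized stress value in [0,1] to a simple blue-to-red palette.
Vec3 stressToColor(double s) {
    return { s, 0.2, 1.0 - s };
}

// Write a coloured, deformed mesh as a VRML 2.0 IndexedFaceSet.
// verts/disp/stress are hypothetical per-vertex FEM results, tris holds vertex
// indices (three per triangle), scale amplifies the deformation for display.
bool writeDeformedVrml(const char* path,
                       const std::vector<Vec3>& verts,
                       const std::vector<Vec3>& disp,
                       const std::vector<double>& stress,   // normalized [0,1]
                       const std::vector<int>& tris,
                       double scale) {
    std::FILE* f = std::fopen(path, "w");
    if (!f) return false;
    std::fprintf(f, "#VRML V2.0 utf8\nShape {\n geometry IndexedFaceSet {\n");
    std::fprintf(f, "  coord Coordinate { point [\n");
    for (std::size_t i = 0; i < verts.size(); ++i)
        std::fprintf(f, "   %g %g %g,\n",
                     verts[i].x + scale * disp[i].x,
                     verts[i].y + scale * disp[i].y,
                     verts[i].z + scale * disp[i].z);
    std::fprintf(f, "  ] }\n  color Color { color [\n");
    for (std::size_t i = 0; i < stress.size(); ++i) {
        Vec3 c = stressToColor(stress[i]);
        std::fprintf(f, "   %g %g %g,\n", c.x, c.y, c.z);
    }
    std::fprintf(f, "  ] }\n  colorPerVertex TRUE\n  coordIndex [\n");
    for (std::size_t t = 0; t + 2 < tris.size(); t += 3)
        std::fprintf(f, "   %d, %d, %d, -1,\n", tris[t], tris[t + 1], tris[t + 2]);
    std::fprintf(f, "  ]\n }\n}\n");
    std::fclose(f);
    return true;
}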
3.3. CFD Postprocessing
With computational fluid dynamics, engineers analyze liquid and gas flows inside or around parts. Starting from the knowledge of the fluid properties, the boundary conditions (flow rate, pressure, temperature, etc.) and the wall properties, it is possible to simulate the kinematic and thermodynamic behavior of fluids and their interaction with solids. Readers interested in techniques for deducing the governing equations and solution strategies can refer to the referenced book [37].
Let us consider a practical application in order to illustrate how augmented reality can support this kind of engineering simulation. The purpose is to simulate an external air flow around a cylinder on a table. Both the table and the cylinder are physically present in the real world (Figure 15). The streamlines and the pressure field on the surfaces have to be added as virtual contents. The first step is to build the virtual parts. This task can be done using the modeling
techniques illustrated in the previous section, using the I-Pen to acquire the geometry of the cylinder and of the table. The second step is to define all the boundary conditions for the simulation. Because there are many values to be defined, it is convenient to complete the model outside the augmented reality environment. Then we have to perform the numerical computation using an external CFD solver. The last step is the projection of the numerical results onto the real scene. Generally, the main results of such simulations consist of several streamlines describing the fluid trajectories, which are coloured with a palette indicating the velocity, pressure or temperature value. This information has to be added to the real world. Once again, we have to collimate the two main reference frames of the real world and the virtual one. This operation can be made using communication reference frames in the same way as in the previous discussions. For the specific example of the cylinder, it can be convenient to put a visible patterned marker located near the physical cylinder (Figure 15), defining the real world communication frame. Similarly, we can define the virtual communication frame at the corresponding location in the virtual world. In this way it is possible to locate the virtual streamlines and surface plots in the real scene with a simple coordinate transformation (translation and rotation) using the OpenGL operators glRotated and glTranslated.
Figure 15. Simulation of the air flow around a cylinder on a table in augmented reality. Starting from a real environment and adding collimated virtual objects (on the top), an augmented animation can be built (at the bottom).
As a result, the augmented scene includes the real world with the virtual surfaces touched by the fluid (cylinder and table) (Figure 15, at the bottom). The scene can also be enriched by including an animation of the streamlines. In this way the user can visualize, from different points of view, the fluid stream around the cylinder and the pressure acting on the boundary surfaces directly in the real world.
The implementation can be summarized in the following six steps:
1. The real scene has to contain information for collimating the real world to the virtual objects. A patterned marker or a coordinate system defined by picking points with the I-Pen can be used;
2. The geometry of the component(s) under investigation has to be acquired;
3. Boundary conditions (fluid flows, pressure openings, wall properties, etc.) have to be defined too;
4. The computation of the fluid field and of the pressure on the surfaces can be performed using an external CFD solver;
5. The results coming from step 4 have to be converted into coloured .vrml entities (i.e. streamlines, surfaces coloured according to pressure values);
6. The .vrml entities have to be collimated with the real world and rendered on the augmented scene.
The augmented video can also be enriched with other visual information on the exact numerical values of fluid or surface parameters by adding a virtual legend (e.g. for describing pressures, velocities, temperatures, etc.).
4. Conclusion
The integration between computer aided engineering tools and augmented reality has proved to be a valid instrument for supporting designers and users in modeling, testing and reviewing their products. With the integration of specific hardware devices such as trackers and sensors, the user can interact with the scene in an immersive way. For the modeling of shapes, a magnetic sensor can be used to acquire the precise position in space of a virtual pen in order to allow the user to sketch virtual objects directly in the real world. Moreover, the comprehension of the results of engineering simulations such as motion analyses, structural investigations and fluid dynamics computations can be improved by enhanced visualization and interaction with real and virtual objects.
The discussed instruments and methodologies can also be useful for collaborative design. Very often, teams of scientists or engineers work on the same project at different locations. In this case, all the designers can wear an AR sub-system and all these sub-systems can communicate among them. Imagine that a group of designers is working on the model of a complex device for their customers. The designers and customers want to perform a joint design review even though they are physically separated. If each of them is equipped with an augmented reality display, this can be accomplished. The physical prototype that the designers have mocked up is imaged and displayed in the client's AR system in 3D. The client may look at different aspects of it, testing engineering performances and checking its integration with the real world.
The future of this combination of AR and Virtual Engineering is very promising. The discussed examples are only a small part of the capabilities of such an integration. The main future challenges concern the manipulation of objects, the capabilities for virtual assembly and the full integration of the numerical methodologies without requiring external solvers.
References
[1] Bernard A. (2005). Virtual engineering: methods and tools. Proceedings of the
Institution of Mechanical Engineers, Part B: Journal of Engineering Manufacture Vol.
219 (5), pp. 413-422.
[2] Azuma R.T. (1997). A survey of augmented reality. Teleoperators and Virtual
Environments, 6(4), pp. 355–385.
[3] Bimber O., Raskar R. (2005). Spatial Augmented Reality: Merging Real and Virtual
Worlds, A K Peters, Ltd.
[4] Vallino J. (1998). Interactive augmented reality. PhD thesis, Department of Computer
Science, University of Rochester, USA.
[5] Azuma R., Baillot Y. et al. (2001). Recent advances in augmented reality. IEEE
Computer Graphics 21(6), pp. 34–47.
[6] Haniff D., Baber C., Edmondson W. (2000). Categorizing augmented reality systems.
Journal of Three Dimensional Images 14(4), pp. 105– 109.
[7] Klinker G., Ahlers K.H. et al. (1997). Confluence of computer vision and interactive
graphics for augmented reality, PRESENCE: teleoperations and virtual environments.
Special issue on Augmented Reality, 6(4), pp. 433–451.
[8] Fotis Liarokapis (2007). An augmented reality interface for visualizing and interacting
with virtual content, Virtual Reality 11, pp. 23–43.
[9] Muller A., Conrad S., Kruijff E. (2003). Multifaceted interaction with a virtual
engineering environment using a scenegraph-oriented approach. Proceedings of the
11th Int. Conf. in Central Europe on Computer Graphics, Visualization and Computer
Vision, Czech Republic.
[10] Samset E., Talsma A., Elle O., Aurdal L., Hirschberg H., Fosse E. (2002). A virtual
environment for surgical image guidance in intraoperative MRI, Computer Aided
Surgery 7(4), pp.187– 196.
[11] Stilman M., Michel P., Chestnutt J., Nishiwaki K., Kagami S., Kuffner J.J. (2005).
Augmented Reality for Robot Development and Experimentation. Tech. Report CMU-
RI-TR-05-55, Robotics Institute, Carnegie Mellon University.
[12] Ong S.K., Pang Y., Nee, A.Y.C. (2007). Augmented Reality Aided Assembly Design
and Planning. Annals of the CIRP Vol. 56/1 pp.49-52.
[13] Pang Y., Nee A.Y.C., Ong S.K., Yuan M.L. Youcef-Toumi K. (2006). Assembly
Feature Design in an Augmented Reality Environment, Assembly Automation, 26/1, pp.
34-43.
[14] Sharma R. and Molineros J. (1995). Computer vision-based augmented reality for
guiding manual assembly. PRESENCE: Teleoperators and Virtual Environments, n. 3,
pp. 292-317.
[15] Webster A., Feiner S., MacIntyre B., Massie W., Krueger T. (1996). Augmented reality
in architectural construction, inspection, and renovation. Int. Proc. Of Third Congress
on Computing in Civil Engineering ASCE 3, Anaheim, CA, pp. 913-919.
[16] Liarokapis F., Petridis P., Lister P.F., White M. (2002). Multimedia augmented reality interface for E-learning (MARIE). World Trans. Eng. Technol. Educ. 1(2), pp. 173–176.
[17] Pan Z., Cheok A.D., Yang H., Zhu J., Shi J. (2006). Virtual reality and mixed reality for virtual learning environments. Computers & Graphics 30, pp. 20–28.
[18] Kaufmann H., Schmalstieg D., Wagner, M. (2000). Construct3D: A Virtual Reality
Application for Mathematics and Geometry Education. Education and Information
Technologies 5(4), pp. 263-276.
[19] Dangelmaier W., Fischer M., Gausemeier J., Grafe M., Matysczok C., Mueck B.
(2005). Virtual and augmented reality support for discrete manufacturing system
simulation. Computers in Industry 56, pp. 371–383.
[20] Ong S.K., Nee A.Y.C. (2004). Virtual and Augmented Reality Applications in
Manufacturing. Springer, London, UK.
[21] Reif R., Walch D. (2008). Augmented & Virtual Reality applications in the field of
logistics. Visual Computing 24, pp. 987–994.
[22] Friedrich W. (2004). ARVIKA—Augmented Reality for Development, Production and
Service. Publicis Corporate Publishing, Erlangen.
[23] Liarokapis F., Sylaiou S., et al. (2004). An interactive visualization interface for virtual
museum. Proceedings of the 5th international symposium on Virtual Reality,
Archaeology and Cultural Heritage, Brussels and Oudenaarde, pp. 47–56.
[24] Narzt W., Pomberger G., Ferscha A., Kolb D., Muller R, Wieghardt J., Hortner H.,
Lindinger C. (2006). Augmented reality navigation systems. Univ Access Inf Soc 4, pp.
177–187.
[25] Klinker G., Stricker D., Reiners D. (1999). Optically based direct manipulation for
augmented reality, Computers & Graphics 23, pp. 827-830.
[26] Bowman D.A. (1996). Conceptual Design Space: Beyond Walk-through to Immersive
Design. In Bertol D., Designing Digital Space, John Wiley & Sons, New York.
[27] Bowman D.A. (1999). Interaction Techniques for Common Tasks in Immersive Virtual
Environments: Design, Evaluation, and Application. Ph.D. thesis, Virginia Polytechnic
& State University.
[28] Liang J. and Green M. (1994). JDCAD: a highly interactive 3D modeling system.
Computers & Graphics 18(4), pp. 499-506.
[29] Raskar R., Welch G., Cutts M., Lake A., Stesin L., Fuchs H. (1998). The office of the
future: a unified approach to image-Based modeling and spatially immersive displays.
Proceedings of SIGGRAPH 98, Orlando, FL, July 19–24.
[30] García de Jalón J., Bayo E. (2004). Kinematic and Dynamic Simulation of Multibody Systems – the Real-Time Challenge. Springer-Verlag, New York.
[31] Haug E.J. (1989). Computer-Aided Kinematics and Dynamics of Mechanical Systems.
Allyn and Bacon, Boston, MA.
[32] Korkealaakso P.M., Rouvinen A.J., Moisio S.M., Peusaari J.K. (2007). Development of
a real-time simulation environment. Multibody System Dynamics, 17, pp. 177–194.
[33] Naya M.A., Dopico D., Perez J.A., Cuadrado J. (2007). Real-time multi-body
formulation for virtual-reality-based design and evaluation of automobile controllers.
Proceedings of the Institution of Mechanical Engineers, Part K: Journal of Multi-body
Dynamics, Vol. 221 (2), pp. 261-276.
[34] Isnard F., Dodds G., Vallée C., Fortuné D. (2000). Real-time dynamics simulation of a
closed-chain robot within a virtual reality environment. Proceedings of the Institution
of Mechanical Engineers, Part K: Journal of Multi-body Dynamics Vol. 214(4), pp.
219-232.
[35] Wright R.S., Lipchak B. (2004). OpenGL SuperBible, Third Edition, Sams Publishing, USA.
[36] Zienkiewicz O.C. (1977). The Finite Element Method in Engineering Science.
McGraw-Hill Publ., London.
[37] Anderson J.D. (1995). Computational Fluid Dynamics, McGraw-Hill Inc., USA.
In: Computer Animation
Editors: J.S. Wright and L.M. Hughes, pp. 85-111
ISBN 978-1-60741-559-6
© 2010 Nova Science Publishers, Inc.
Chapter 3
A SURVEY OF POPULAR 3D SOFT-BODY
ANIMATION COMPRESSION APPROACHES
S. Ramanathan and A.A. Kassim
National University of Singapore,
Dept. of Electrical and Computer Engineering, Singapore
1. Introduction
The world of Computer Graphics has seen rapid developments over the past few decades.
Since the appearance of 3D computer animation in the science fiction movie, Futureworld
(1976), animations have been successfully employed in Toy Story (1995), Shrek (2001) and
Happy Feet (2006).
Contemporary animations are synthesized by deforming 3D objects, typically modeled using triangular meshes. A triangular mesh is defined by its geometry (location of vertices), connectivity (how the triangles are connected), surface color, normal and texture properties. An isolated triangle requires more than 100 bytes for its description, since the geometry, normal and texture components are represented using floating point numbers. For accurate representation of real world objects, 3D mesh models typically require many thousands of vertices and triangles. Therefore, 3D meshes demand extensive storage memory and transmission bandwidth, which makes 3D animation compression¹ a very relevant problem that has generated much interest in recent years. Besides entertainment and video games, 3D meshes and animations are used in biometrics, computer-aided design (CAD) and medicine.
3D soft-body animations are synthesized by applying non-linear deformations on a static mesh to generate realistic, life-like motion. Complex, realistic and human-like motion of these characters is achieved using physically-based [41, 9] or synthetic [30] animation techniques, by moving the 3D mesh vertices in separate trajectories, as compared to rigid-body transformations where the whole mesh deforms homogeneously. As shown in Fig. 1, mesh deformations can be achieved by either
¹ In the 1980s, the need for efficient multimedia storage and transmission led to the development of the popular JPEG (Joint Photographic Experts Group) and MPEG (Moving Picture Experts Group) techniques.
i. changing the mesh geometry while keeping the connectivity constant - where the mesh is deformed by changing the positions of the mesh vertices. These animations are termed dynamic geometry sequences.
ii. altering the mesh geometry as well as the connectivity - where deformations are accompanied by movement of the mesh vertices as well as addition/deletion of new/existing vertices, which modifies the existing connectivity.
While it is intuitive to treat coding of 3D animations (essentially 3D video) similarly to 2D video compression [24], the two problems are inherently different. The key to efficient 3D animation coding is the compact representation of the inter-mesh motion using a few parameters, analogous to motion vectors in video compression. A number of animation coding algorithms [25, 1, 46] have attempted to address the compact 3D motion representation problem using the video coding approach, which involves the following steps:
(i) Segmentation - a necessary pre-processing step to efficiently represent the motion between consecutive video frames. The current video instance (frame) is segmented into smaller blocks (of size 8×8 or 16×16 pixels) during this process.
(ii) Motion prediction - which allows for compact inter-frame motion representation in terms of motion parameters. The motion parameters are computed by finding the best-match block in the temporal reference for each block in the current frame.
(iii) Parameter encoding - where the amount of information required to represent the motion parameters is further reduced using data compression techniques such as Huffman/Arithmetic coding [45].
However, the arbitrary topology (i.e., arrangement of vertices in space) of 3D meshes makes it difficult to efficiently segment 3D meshes and thereby translate well-known video compression techniques to 3D. The difficulty involved in segmenting non-planar 3D meshes, as compared to trivial planar image segmentation, is illustrated in Fig. 2. Therefore, a number of new methods have been proposed for efficient 3D dynamic mesh coding. MPEG-4 Part 25 [19] and www.3dcompression.com are resources dedicated to research on 3D animation compression.
In this chapter, we review 3D dynamic mesh compression algorithms and investigate
how vertex clustering, which chiefly contributes to animation coding complexity, affects
compression performance. We finally conclude this chapter with observations that need to
be effectively addressed by future 3D animation coding algorithms.
2. An Overview of Mesh Coding Algorithms
2.1. Static Mesh Compression
Over the past decade, much research has been focused on the compression of 3D static meshes and a majority of these works [7, 8, 10, 11, 26, 34, 40, 42] address efficient encoding of mesh connectivity. Using spiralling tree-based encoding schemes, popular algorithms like Cut-Border [11] and Edgebreaker [34] achieve lossless connectivity compression with only up to 1.5-2 bits per mesh triangle.
Figure 1. Examples of animations synthesized by (a) changing mesh geometry, as seen from frames 95 and 105 of the Chicken animation, or (b) changing both mesh geometry and connectivity (around the mouth) for realistic facial expression generation.
Figure 2. (a) Image segmentation into equal-sized blocks is a simple pre-processing step to motion prediction in video coding; a segmented frame from the Foreman sequence is shown. (b) Segmentation of the Dino 3D mesh (2039 vertices, 3999 triangles) into 29 pieces using spectral mesh decomposition [28]. Since 3D meshes are non-planar, segmenting 3D meshes into coherent pieces is a non-trivial and compute-intensive task.
A few multiresolution geometry-cum-connectivity representation techniques [17, 32] have been proposed to facilitate progressive data transmission and achieve a compression efficiency of around 4-10 bits per vertex (bpv). Compression of mesh geometry is only supplementary to the connectivity coding scheme in these algorithms.
Geometry compression, which involves coding of the floating point (x,y,z) vertex coordinates, is inherently lossy and has been attempted using predictive coding as well as signal processing-based techniques. Predictive coding [42] exploits correlation in the mesh data by predicting a vertex position using the positions of its neighboring vertices. Prediction errors are quantized and entropy coded for compact representation. A typical compression efficiency of 7-12 bpv is obtained using this scheme. Spectral compression [48] is a popular signal processing-based mesh compression method, where the mesh geometry is projected onto an orthonormal basis and reconstructed using a small number of components in the basis. This method achieves a compression efficiency of around 14 bpv for perceptually lossless encoding. A wavelet-based geometry compression technique [2] achieves a compression efficiency of 8 bpv. Since the number of components for geometry reconstruction can be adaptively varied, [48] and [2] can also be used for multiresolution mesh representation. Other techniques that tackle the problem of mesh representation with various levels of detail are [32, 8].
2.2. Dynamic Mesh Coding Algorithms
As noted above, there are two types of 3D animation sequences: (i) dynamic geometry sequences, where mesh motion is achieved by moving the mesh vertices with time, and (ii) dynamic geometry-cum-connectivity sequences, where mesh motion is accompanied by changes in mesh geometry as well as connectivity. Dynamic geometry compression algorithms can be grouped into three major classes based on their implementation: registration-based, prediction-based and PCA-based multiresolution representation. Examples of these algorithms are discussed below.
2.2.1. Registration-Based Compression
In Lengyel's pioneering work on registration-based dynamic geometry compression [25], he proposes the segmentation of the mesh into smaller sub-meshes and represents the motion of each of these sub-meshes using rigid-body affine transforms. His compression mechanism yields an efficiency of 3.45 bpvf (bpv per frame) for the Chicken animation with 16 and 4 bits used for affine and vertex quantization respectively. Ibarria et al. report a compression efficiency ranging from 1.37 to 2.91 bpvf for their Dynapack algorithm [18] when the quantization ranges from 7 to 13 bits for test animations. Their algorithm exploits space-time coherence in dynamic geometry by predicting the position of each vertex v in frame f from three of its neighbors in f and the positions of v and its neighbors in the previous frame.
A video coding-like method, which segments the 3D mesh into blocks and computes motion vectors and error residuals for each mesh block, is proposed by Ahn et al. [1]. A compression efficiency of 9.6 bpvf is obtained using an encoding scheme that consists of I
(Intra), P (Predicted) and B (Bi-directionally predicted) meshes for the Chicken animation.
In Gupta et al.’s dynamic geometry compression scheme [13], the mesh is partitioned into
segments, and the displacement of vertices in each segment is computed using Iterative
Closest Point (ICP)-based registration. The encoding scheme describes mesh motion using
a few affine parameters and residual errors to achieve a compression efficiency of 2.5 bpvf
for the Chicken animation.
2.2.2. Prediction-Based Compression
Another interesting work on dynamic geometry compression is that of Yang et al. [46],
based on vertex-wise motion vector (MV) prediction. Each vertex is given a motion vec-
tor obtained from the neighborhood of the vertex, defined as the set of all vertices within
a threshold distance around the vertex. This their coding procedure requires a third of the
bitrate compared to [25] for the same quality of animation reconstruction measured in terms
of Signal-to-Noise ratio (SNR). Stefanoski et al. propose a connectivity-based prediction
technique in [38], where prediction is performed in a frame-to-frame fashion using the pre-
vious frame and the partly decoded current frame. Mesh connectivity is used to determine
the order of vertex compression and the spatial-cum-temporal dependency between vertex
locations is exploited using a non-linear spatio-temporal predictor with angle preserving
properties. They report a 25% improvement in compression performance over competing
prediction schemes like [18], especially for high quality animation reconstruction. Muller
et al. [31] propose another prediction-based compression algorithm using Differential Pulse
Code Modulation (DPCM) where errors in prediction fromthe previously decoded mesh are
clustered in an octree. Only a representative from each cluster is used for further processing
which results in a significant reduction in bit-rate.
2.2.3. Multiresolution Representation
Recently, multi-resolution mesh representation for bandwidth-limited streaming applications has generated much interest. A notable work on multiresolution representation of dynamic geometry is that of Alexa et al. [3], who propose a Principal Component Analysis (PCA)-based compact animation representation scheme where each mesh in the animation sequence is projected on a basis of n PCA eigenvectors. The animation may be reconstructed using k eigenvectors where k << n; the higher the k, the greater the level of detail. Another example of wavelet-based multiresolution encoding is that of Guskov et al. [14], which exploits parametric coherence in mesh sequences using an anisotropic wavelet transform and progressively encodes wavelet details. Payan et al. [33] propose another wavelet-based multiresolution representation scheme based on a temporal lifting scheme that exploits the temporal redundancy in dynamic geometry.
In [21], Karni et al. propose a compression scheme that employs a combination of Principal Component Analysis (PCA) and Linear Predictive Coding (LPC). Recently, localized PCA-based dynamic geometry coding techniques have yielded good compression performance. Sattler et al. [35] propose animation compression using Clustered PCA (CPCA), where the mesh is first segmented into meaningful components based on vertex motion analysis and PCA is applied on each of these components. This compression scheme outperforms both pure PCA-based and PCA+LPC approaches while achieving better animation
reconstruction. Another Localized PCA Analysis (LPCA)-based compression scheme is
proposed by Amjoun et al. [4]. Upon clustering the mesh using local similarity properties,
a local coordinate system is defined for each cluster with respect to which the cluster motion
is encoded using PCA. LPCA coding achieves better compression performance compared
to CPCA-based compression for similar quality of reconstructed animation.
2.2.4. Other Coding Algorithms
Varakliotis et al. propose animation encoding with RTP packetization in [43], and recommend insertion of I frames in the encoded/transmitted mesh sequence to maintain animation smoothness. A Differential Pulse Code Modulation (DPCM)-based encoder, whose compression efficiency is low, is used to compress the animation. The main contributions of this work include (i) an analysis of the trade-off between compression performance and reconstructed animation quality and (ii) the introduction of a Peak Mean Square Error (PMSE)-based distortion metric to tackle degradation of animation smoothness under noise.
MPEG-4 Part 25 [19] presents generic tools for dynamic 3D mesh compression using
Bone-Based Animation (BBA), which involves decomposition of geometric motions in the
animation to elementary transformations, and Frame-based Animation Mesh Compression
(FAMC), where the animation is divided into segments that can be decoded independently.
A spatially and temporally scalable compression scheme for 3D animations using FAMC,
where the original animation is reconstructed at multiple layers corresponding to different
spatial resolutions is proposed in [39]. Boulfani et al. [5] propose a 3D dynamic mesh
compression scheme where geometry compensation is performed upon clustering the mesh
using motion characteristics followed by application of the scan-based wavelet transform.
2.2.5. Encoding 3D Dynamic Meshes with Changing Connectivity
Very few works deal with compression of animations with changing connectivity. Shamir
et al. [36] suggest a multi-resolution representation scheme for animations with changing
connectivity using the T-DAG data structure, which can be incrementally constructed even
as the input mesh is processed. Gupta et al. [12] propose an Iterative Closest Point (ICP)
registration-based geometry-cum-connectivity coding scheme for dynamic 3D MMs. As
in [13], the current and previous frames are partitioned to generate sub-meshes and inter-
mesh correspondences are computed using ICP to identify the added/deleted vertices over
time. Subsequently, the errors in geometry and connectivity prediction are encoded and
transmitted.
3. Vertex Clustering for Dynamic Geometry Coding
In this section, we investigate how vertex clustering, which involves grouping of mesh
vertices for motion prediction, affects dynamic geometry compression. We first discuss
the impact of vertex clustering on registration-based dynamic mesh coding with ICP-
based compression [13] as an example, and see how PCA coding approaches developed on
similar lines [35, 4] have yielded superior compression performance for dynamic geometry
animations in general.
Vertex clustering techniques group mesh vertices into a number of sets, where the grouping may be done on the basis of (a) mesh topology, (b) mesh geometry or (c) semantic mesh segmentation.
3.1. Overview of Vertex Clustering Techniques
3.1.1. Topology-Based Clustering
Topology-based clustering techniques partition the mesh based on vertex adjacency. They are more popularly known as "graph partitioning techniques" and many algorithms have been developed to solve the graph partitioning problem [16, 23]. The graph partitioning problem was originally formulated to solve a compute-intensive problem using a number of parallel processors, where the objective was to share the workload equally among the processors while keeping the number of inter-processor communications small. In the case of a 3D triangular mesh, the aim is to divide it into equal-sized pieces with a minimum number of edges between the pieces. To this end, the multilevel k-way graph partitioning algorithm divides the mesh into k roughly equal partitions such that the number of cut edges connecting vertices in different partitions is minimized.
In MPEG video compression [24], the image is divided into smaller, equal-sized blocks for efficient motion prediction. Likewise, decomposition of the mesh into equal-sized segments can be achieved using the topology-based multilevel k-way graph partitioning algorithm [16]. Given the mesh geometry V, the function of the graph partitioning algorithm is to divide the mesh, consisting of n vertices, into k subsets $V_1, V_2, \dots, V_k$ such that

$$V_i \cap V_j = \phi \quad \text{for } i \neq j \qquad (1)$$

$$|V_i| = n/k \qquad (2)$$

$$\bigcup_{i=1..k} V_i = V \qquad (3)$$
where $|V_i|$ denotes the cardinality of the $i$-th cluster and $\bigcup V_i$ denotes the union of the k clusters. In this mode of clustering, vertices are clustered with their connected neighbors as given by the mesh connectivity, and no knowledge of the mesh geometry is required.
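The input to such a partitioner is simply the vertex adjacency graph implied by the mesh connectivity. The sketch below (illustrative only) builds that graph in the compressed sparse row (CSR) layout commonly expected by multilevel k-way partitioning libraries; the partitioning call itself is omitted because its parameters depend on the specific library and version.

#include <cstddef>
#include <set>
#include <vector>

// Build the vertex-adjacency graph of a triangle mesh in CSR form
// (xadj/adjncy), the usual input layout of multilevel k-way graph
// partitioners. Only connectivity is used; vertex positions are ignored.
void buildAdjacencyCSR(std::size_t numVertices,
                       const std::vector<int>& triangles,   // 3 indices per face
                       std::vector<int>& xadj,
                       std::vector<int>& adjncy) {
    std::vector<std::set<int>> nbrs(numVertices);
    for (std::size_t t = 0; t + 2 < triangles.size(); t += 3) {
        int v[3] = { triangles[t], triangles[t + 1], triangles[t + 2] };
        for (int a = 0; a < 3; ++a)
            for (int b = 0; b < 3; ++b)
                if (a != b) nbrs[v[a]].insert(v[b]);   // each triangle edge
    }
    xadj.assign(1, 0);
    adjncy.clear();
    for (std::size_t i = 0; i < numVertices; ++i) {
        for (int j : nbrs[i]) adjncy.push_back(j);
        xadj.push_back(static_cast<int>(adjncy.size()));
    }
    // xadj/adjncy can now be handed to a k-way partitioner, which returns one
    // cluster index per vertex while balancing cluster sizes and minimizing
    // the number of cut edges.
}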
For a graph G(V, E) containing vertex set V and edge set E, the graph partitioning algorithm first coarsens the original graph $G_0 = G(V_0, E_0)$ into a series of coarser graphs $G_i = G(V_i, E_i)$, such that the number of vertices at $G_i$ is approximately half the number of vertices at $G_{i-1}$, i.e., $|V_i| \approx \frac{1}{2}|V_{i-1}|$. The graph is coarsened by performing a series of edge contractions. A maximal set of edges, no two of which are incident on the same vertex, is first determined and these edges are contracted. This coarsening procedure maps each vertex in the fine graph $G_{i-1}$ to a unique vertex in the coarse graph $G_i$ and therefore graph topology is preserved. The coarsening terminates when the original graph has been coarsened to $G_m = G(V_m, E_m)$, where $|V_m|$ is typically a small number.
The coarsest graph $G_m$ is now segmented using a spectral partitioner [15] which uses eigenvectors of the graph Laplacian for partitioning. The partitions obtained for the coarsest graph are propagated back to the finer graphs by projecting the k partitions onto $G_{m-1}, G_{m-2}, \dots, G_0$. The partitioning projected onto $G_{i-1}$ is occasionally refined using local refinement heuristics based on the Kernighan-Lin (KL) algorithm [23]. Vertices are
incrementally swapped among the partitions to reduce the number of cut edges connecting vertices in different partitions. The mesh is finally divided through recursive bisection into k sets, each containing about $|V_0|/k$ vertices. The clusters generated by the k-way graph partitioning algorithm for various meshes are shown in Fig 3.
Overall, multilevel k-way partitioning performs better than competing inertial or spectral bisection approaches [37] in terms of execution time and partition quality (based on the number of cut edges). However, the vertex clusters obtained by minimization of the number of cut edges are ineffective for determining the mesh motion. This is evident from Fig 3, where vertices belonging to distinct mesh regions are clustered together (parts of the nose, forehead and cheeks in the face; pelvis and thigh regions for the blade), while vertices corresponding to the same region fall in different clusters (the mouth region of the face and the claws of the chicken). Clearly, the clustered vertices do not undergo homogeneous motion. Also, when the same clusters are used for the entire sequence, detecting the coherent motion regions becomes difficult and, consequently, the compression performance is affected, as we will see in later sections where we analyze related experimental results.
Figure 3. Clusters generated by the topology-based k-way partitioning algorithm for (a) Chicken - 32 partitions, (b) Face - 8 partitions, (c) Dinopet - 32 partitions and (d) Blade - 8 partitions.
3.1.2. Geometry-Based Clustering
Geometry-based clustering involves grouping of vertices based on their positional closeness and is independent of the mesh connectivity. Lloyd's k-means algorithm [29] is a popularly used geometry-based clustering technique. Given a set of n data points $\{x_i\}$ in d-dimensional space and the required number of clusters k, the problem is to determine a set of k centers $\{c_j\}$ such that the mean squared distance of each point to its nearest center, termed the average distortion D, is minimum.
The algorithm works as follows. The initial k cluster centers are chosen at random and the data points $\{x_i\}$ are partitioned into k clusters by assigning each point to the cluster containing the closest $c_i$. The set of data points to which $c_i$ is the nearest center is known
as the neighborhood of $c_i$ and is denoted by $V(c_i)$. Once the initial centers and their neighborhoods have been determined, the algorithm proceeds by moving the $c_i$'s to the centroids of their clusters and recomputing $V(c_i)$ for each of the $c_i$'s. This process iterates until convergence is achieved or the mean distortion D reaches a local minimum. A summary of Lloyd's algorithm is presented below, followed by a short implementation sketch.
• Step 1: Initialize $\{c_j\}$ by selecting the $c_j$'s at random.
• Step 2: Determine the neighborhood $V(c_j)$ for each of the $c_j$'s by assigning the $x_i$'s to their closest center:

$$V(c_j) = \{x_i : d(x_i, c_j) \le d(x_i, c_k),\ \text{for all } k \neq j\} \qquad (4)$$

• Step 3: Move each of the $c_j$'s to the centroid of $V(c_j)$:

$$c_j = \frac{1}{|V(c_j)|}\sum_i x_i, \quad x_i \in V(c_j) \qquad (5)$$

• Step 4: Repeat Step 2 and Step 3 until the mean distortion D is minimum, i.e.,

$$D = \frac{1}{k}\sum_{j=1}^{k}\frac{1}{|V(c_j)|}\sum_{x_i \in V(c_j)}(x_i - c_j)^2 = D_{min} \qquad (6)$$
Figure 4. Illustration of Lloyd's clustering for k=3. (a) The initial cluster centers in red, blue and green and their computed neighborhoods. (b) Centers are moved to the centroids of the clusters and the data points are re-assigned to the nearest centers. (c) Final clusters.
Fig 4 illustrates the working of Lloyd's algorithm. In the context of mesh partitioning, the clusters themselves are more important than the cluster centers. For k-means clustering, it can be proved that the local minimum of the distortion measure corresponds to a "centroidal Voronoi" configuration [20], where each data point is closer to its own cluster center than to any other cluster center. The partitions move closer to this configuration at every step until convergence, and the final clusters correspond to local energy minima, even when the initial centers are badly chosen. However, slightly different initial
partitionings do not produce the same set of clusters. Also, while the final partitioning is
definitely better than the initial partitioning, it need not correspond to the global minimum.
Nevertheless, this is not a significant problem since data repartitioning may be performed
later, as explained in the next section. Since vertex clustering is independent of the mesh
connectivity, vertex neighbors have to be computed explicitly. For 3D meshes, computing
nearest neighbors is not a trivial problem. The Lloyd’s implementation in [20] computes
nearest neighbors using a kd-tree (k dimensional tree) data structure. Also, since the data
points do not change throughout the cluster computation process, the kd-tree needs to be
computed only once. The clusters at every step are determined by computing the nearest
center for each of the tree nodes.
Since the clusters are determined based on the vertex positions, the cluster configurations will vary for different meshes in the animation sequence (Fig 5). Clustering based on vertex proximity produces better quality partitions, whose vertices are more likely to undergo homogeneous motion. The cluster sizes are variable and the mesh can be segmented into an arbitrary number of clusters. A noticeable improvement in compression performance is observed when geometry-based clustering is employed, especially for high-motion sequences, as noted in later sections. However, since the vertex clusters do not correspond to the distinctive mesh components, the general performance of geometry-based clustering is inferior to that of semantic mesh decomposition for dynamic mesh coding.
Figure 5. Segmentation using Lloyd's clustering for frames (a) 70 and (b) 120 of the Chicken animation (maximum cluster size = 100); frames (c) 0 and (d) 505 of the Face animation (maximum cluster size = 75).
3.1.3. Spectral-Based 3D Mesh Segmentation
A third set of techniques performs semantic mesh decomposition to segment the mesh into meaningful components. These techniques exploit both topology and geometry features to generate components that represent distinctive features of the 3D polygonal mesh. Unlike in geometry- or topology-based clustering, where the number of clusters is user specified, most semantic mesh decomposition algorithms [28, 22, 27] automatically determine the number of vertex clusters based on the homogeneity of the mesh regions. The mesh components represent the distinctive regions of the object consistent with human perception, which defines boundaries along concavities of the surface. They can be used to establish
shape correspondence, and in most cases, also correspond to regions capable of undergoing
independent motion (Fig 6).
Liu and Zhang proposed a spectral clustering approach to mesh decomposition in [28] where, in order to segment a 3D mesh with n faces along its edges, the n×n affinity matrix W is initially constructed for the dual of the mesh graph in order to group faces that are close to each other. Each vertex in the dual graph corresponds to a mesh face, and two vertices are connected if and only if the corresponding mesh faces are adjacent to each other. For the grouping of faces, the pairwise face distance measure used in [22] is used to define the affinity matrix. The distance measure between mesh faces $f_i$ and $f_j$ is defined as the shortest path between their dual vertices, given by

$$\mathrm{Dist}(i,j) = \mathrm{weight}(\mathrm{dual}(f_i), \mathrm{dual}(f_j)) = \delta\,\frac{\mathrm{Geod}(f_i,f_j)}{\mathrm{avg}(\mathrm{Geod})} + (1-\delta)\,\frac{\mathrm{Ang\_Dist}(\alpha_{ij})}{\mathrm{avg}(\mathrm{Ang\_Dist})} \qquad (7)$$
Here, $\mathrm{Geod}(f_i, f_j)$ is the geodesic distance between $f_i$ and $f_j$, while the angular distance is defined as

$$\mathrm{Ang\_Dist}(\alpha_{ij}) = \eta(1 - \cos\alpha_{ij}) \qquad (8)$$

where $\alpha_{ij}$ is the angle between the normals of the adjacent faces $f_i$ and $f_j$. Since the angular distance plays a more important role for visually meaningful segmentation, $\delta$ is set to a value close to zero. Also, a smaller value of $\eta$ favors concavities and therefore it is set in the range $0.1 \le \eta \le 0.2$. On obtaining the pairwise face distances, the affinity matrix is defined by the Gaussian kernel

$$W(i,j) = e^{-\mathrm{Dist}(i,j)/2\sigma^2} \qquad (9)$$

It can easily be seen that $0 < W(i,j) < 1$ and that it takes larger values for faces closer to each other. A suitable value for the width of the Gaussian, $\sigma$, is empirically set to $\frac{1}{n^2}\sum_{1\le i,j\le n}\mathrm{Dist}(i,j)$.
$W(i,j)$ encodes the likelihood of faces i and j belonging to the same patch. The normalization of the affinity matrix is performed as $N = D^{-\frac{1}{2}} W D^{-\frac{1}{2}}$, where D is the diagonal matrix whose $i$-th diagonal element is the sum of the $i$-th row of W, the vertex degree at node i. N possesses desirable properties in the context of spectral clustering [44] and $N_{ij} = W_{ij}/\sqrt{D_{ii} D_{jj}}$. Let V be the n×k matrix formed using the k leading eigenvectors of N. Then the n×n matrix $Q = V V^T$ represents the most energy-preserving projection of N to rank k. Normalizing the rows of V to unit length gives $\hat{V}$, whose rows $\hat{v}_1 \dots \hat{v}_n$ (of dimension k) represent the embedding of the $W_{ij}$'s onto the k-dimensional unit sphere centered at the origin. $\hat{Q} = \hat{V}\hat{V}^T$ is known as the association matrix, whose elements $\hat{Q}_{ij} = \hat{v}_i \hat{v}_j^T = \cos\theta_{ij}$ are the cosines of the angles between the unit vectors $\hat{v}_i$ and $\hat{v}_j$. As N is projected to successively lower rank k, the sum of squared angle cosines $\sum_{i,j}(\cos\theta_{ij})^2$ is strictly increasing [6]. Point pairs likely to be clustered together will move towards each other as k decreases, while other pairs will move further apart. Therefore, clustering points in k-dimensional space is easier than clustering the original data and is accomplished by performing k-means clustering on the rows of $\hat{V}$.
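The final embedding step is simple to implement once the k leading eigenvectors are available: each row of V is scaled to unit length and ordinary k-means (as in the previous subsection) is run on the resulting k-dimensional points. A minimal sketch, assuming V is stored row-major with one row per mesh face:

#include <cmath>
#include <vector>

// Normalize each row of the n x k eigenvector matrix V to unit length,
// producing the spherical embedding on which k-means is then performed.
// V is stored row-major: V[i*k + c] is component c of the embedding of face i.
void normalizeRows(std::vector<double>& V, int n, int k) {
    for (int i = 0; i < n; ++i) {
        double norm = 0.0;
        for (int c = 0; c < k; ++c) norm += V[i * k + c] * V[i * k + c];
        norm = std::sqrt(norm);
        if (norm > 0.0)
            for (int c = 0; c < k; ++c) V[i * k + c] /= norm;
    }
    // The rows of V now lie on the k-dimensional unit sphere; clustering them
    // with k-means yields the face labels of the semantic decomposition.
}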
The spectral clustering algorithm performs a semantic decomposition of the mesh, with the generated components corresponding to the salient object features capable of undergoing independent motion, as shown in Fig 6. The algorithm tends to segment the mesh in a hierarchical fashion on varying the number of eigenvectors chosen for V. The computation of the shortest-distance face pairs and of the affinity matrix W are of complexity $O(n^2\log(n))$ and $O(n^2)$ respectively, but this computation time is greatly reduced in the implementation described in [47]. Also, a recursive 2-way spectral cut procedure used in [47] overcomes the problem of choosing the optimal k for clustering and produces better quality partitions. Since the components generated upon mesh decomposition correspond to the salient features of the object, the same set of vertex clusters can be used for performing motion estimation over the entire animation sequence. As the components can describe the piecewise affine mesh motion effectively, it is possible to encode the animation more efficiently. Experimental results confirm the superior compression performance obtained through semantic mesh decomposition compared to k-way partitioning or Lloyd's clustering for dynamic geometry compression.
Figure 6. Segmentation of the (a) Chicken (59 components), (b) Face (6 components), (c) Dolphin (7 components) and (d) Dinopet (29 components) meshes through spectral clustering. A number of mesh segments, e.g., the fins of the dolphin and the limbs of the dinosaur, can undergo independent motion.
3.1.4. Analysis of Registration-Based Coding Algorithms
A majority of the registration-based dynamic geometry compression algorithms use topology-based clustering for segmenting meshes. Lengyel's algorithm [25] uses a greedy vertex clustering approach based on the triangulation of the original mesh. Prediction-based geometry compression algorithms [46, 38] define vertex neighborhoods for prediction based on mesh connectivity. Ahn et al. [1] segment the mesh by converting the triangular mesh structure into a linear triangle strip form; the triangle strip is divided into blocks such that each block has the same number of vertices. Gupta et al. [13] use the multilevel k-way graph partitioning technique [16] that generates clusters of approximately equal sizes. Since the
mesh connectivity remains constant for dynamic geometry, topology-based clustering needs
to be performed only for the first mesh in the sequence (I mesh) and the clusters remain
fixed thereafter for the entire sequence.
We find that mesh segmentation based on the fixed mesh topology is unsuitable for compressing mesh sequences with changing mesh geometry. Efficient detection of the mesh pieces that have moved over time is possible only when the components generated upon clustering roughly represent these pieces. When the mesh undergoes arbitrary deformation, the vertex clusters undergoing coherent motion will be different at different times. Clearly, it is impossible for a given set of clusters generated using topology-based partitioning to represent the coherent motion regions at all times. Clustering the mesh on the basis of positional proximity instead of graph adjacency is better suited for encoding dynamic geometry sequences. Alternatively, a fixed set of clusters will effectively describe the piecewise affine mesh motion in animations only if the clusters correspond to the distinctive mesh components that can undergo independent motion. Our experimental results (Section 3.7) confirm that geometry-based clustering and semantic mesh decomposition techniques produce better compression performance than topology-based clustering.
The next section briefly discusses ICP-based dynamic mesh coding [13] and related performance metrics, while the following sections demonstrate how potent vertex clustering improves the compression performance of registration-based and PCA-based dynamic mesh coding algorithms. Finally, we conclude with observations that will require critical consideration during the development of an efficient 3D animation coding standard.
3.2. ICP-based 3D Dynamic Geometry Compression
The vertex clustering algorithms described in the previous section simplify the problem
of determining the regions of coherent motion by segmenting the mesh into pieces that
can possibly undergo affine motion. In order to determine the actual regions that have
moved, motion prediction needs to be performed. For 3D dynamic geometry, the inter-mesh
motion is typically small. The mesh motion can be completely described using a few affine
transformations and residual errors, and this compact representation leads to compression.
An efficient dynamic geometry compression algorithm that performs systematic motion
estimation is described in [13].
The algorithm uses the multilevel k-way graph partitioning algorithm for initially segmenting the mesh into pieces, and for each piece in the current mesh, the corresponding piece in the temporal reference is detected using the Iterative Closest Point (ICP) algorithm. Using the results of ICP-based registration, the motion segmentation module segments the mesh vertices into three distinct sets based on their motion characteristics.
• First set (Type 1) - consisting of clusters of vertices such that the motion of each cluster may be described accurately using the associated affine transform $A_i$. The reconstruction error for the vertices falling in this set is less than a threshold, τ.
• Second set (Type 2) - consisting of clusters of vertices such that each cluster has an affine transformation matrix $A_i$ associated with it. In addition, residual errors also need to be encoded for accurately representing the vertex positions.
The reconstruction error for the vertices falling in this set using the affine transform alone is less than 20τ.
• Third set (Type 3) - consisting of vertices whose motion cannot be described effectively using affine transforms. These are encoded using DPCM-based techniques. A small classification sketch is given after this list.
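The sketch below shows how the three sets can be formed once ICP registration has produced a per-cluster affine transform and a per-vertex reconstruction error (the thresholds τ and 20τ follow the description above; how the errors are computed is left to the registration step).

#include <vector>

enum class VertexType { Type1, Type2, Type3 };

// err[i] is the reconstruction error of vertex i when its cluster's affine
// transform is applied to the reference mesh (computed elsewhere by the
// ICP-based registration step). tau is the Type 1 threshold.
std::vector<VertexType> classifyVertices(const std::vector<double>& err,
                                         double tau) {
    std::vector<VertexType> type(err.size());
    for (std::size_t i = 0; i < err.size(); ++i) {
        if (err[i] < tau)
            type[i] = VertexType::Type1;         // affine transform alone suffices
        else if (err[i] < 20.0 * tau)
            type[i] = VertexType::Type2;         // affine transform + residual error
        else
            type[i] = VertexType::Type3;         // fall back to DPCM coding
    }
    return type;
}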
The ability of ICP-based motion segmentation to divide the vertices of the mesh into distinct sets helps achieve efficient compression. This is because the motion of Type 1 and Type 2 vertices, which constitute over 70% of the total, can be described using a few affine transformations and residual errors. Also, the residual errors can be adaptively coded using a variable number of bits for different groups of vertices in order to maintain the animation smoothness. ICP-based compression produces two types of meshes: I and P. I (Intra) meshes are encoded using static mesh compression techniques and complete information can be obtained by decoding the I mesh without any reference. P (Predicted) meshes contain only the differences from the temporally previous I or P mesh and chiefly contribute to the compression. The differences need to be added to the reference mesh data in order to obtain the complete P mesh information. Although dynamic geometry compression is inherently lossy, it is still acceptable if the reconstructed animation is 'perceptually lossless', as there is a trade-off between compression and quality. For P meshes, the following information needs to be encoded:
• Affine matrices and vertices associated with each affine transform matrix.
• Error values associated with the vertex positions for Type 2 and Type 3 vertices.
To compactly encode the above information, the following procedure is used:
• Vertices whose motion can be described using affine transforms are given the symbol
P. Every P vertex is associated with a patch index to denote the associated affine
transform. When the patch index of the vertex hasn’t changed from the reference, a
symbol N is used to signify no change and the previous patch index is used.
• Type 2 vertices are represented using the symbol E to denote error information.
• Type 3 vertices encoded using DPCM techniques are assigned the symbol D.
• As affine transforms associated with the vertex clusters exhibit a high spatio-temporal
correlation, the differences between the affine matrices are quantized and encoded.
• The number of bits used to encode error data for Type 2 and Type 3 vertices is deter-
mined by the PSNR. More bits are added to encode error for vertices whose PSNR is
below a certain threshold.
• Vertex symbols, affine matrices and residual errors are encoded using Arithmetic
coding.
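A minimal sketch of the per-vertex symbol assignment described above (the function name and the simple stream layout are illustrative assumptions, not the bitstream format of [13]):

    def assign_symbols(types, patch_idx, prev_patch_idx):
        """Assign coding symbols to the vertices of a P mesh.

        types          : list of 1/2/3 vertex types from motion segmentation
        patch_idx      : affine-transform (patch) index per vertex
        prev_patch_idx : patch index per vertex in the temporal reference
        Returns a list of (symbol, payload) pairs to be entropy coded.
        """
        stream = []
        for v, t in enumerate(types):
            if t == 3:
                stream.append(('D', v))              # DPCM-coded vertex
            elif t == 2:
                stream.append(('E', patch_idx[v]))   # affine transform plus residual error
            elif patch_idx[v] == prev_patch_idx[v]:
                stream.append(('N', None))           # patch index unchanged from reference
            else:
                stream.append(('P', patch_idx[v]))   # new patch index for this vertex
        return stream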
The process of reconstructing mesh geometry from P meshes is outlined in Fig 7. When
there is large inter-mesh motion, the same vertex may be associated with a number of affine
transforms and many vertices may require error information to be encoded, which results
in considerably lower compression ratios. Therefore, such meshes are encoded as I meshes
for which only the spatial coherence in mesh geometry is exploited. The Edgebreaker algo-
rithm [34] is used for encoding I meshes. Insertion of I meshes helps maintain animation
smoothness with a marginal reduction in compression. Also, periodic transmission of I
meshes is necessary while transmitting data over noisy channels and for enabling random
access to the animation sequence.
Figure 7. Reconstruction of mesh geometry from P meshes.
3.2.1. Performance Metrics
The Signal to Noise Ratio (SNR) is extensively used for evaluating the performance of
dynamic geometry coding algorithms. The SNR and PSNR (Peak Signal to Noise Ratio)
are considered objective measures for comparing the quality of the compressed data against
the original data. The following definition for Peak Signal to Noise Ratio (PSNR) [43] is
used for evaluating the reconstruction quality of the encoding scheme.
\mathrm{PSNR} = -10 \log_{10} \mathrm{PMSE} \qquad (10)

where PMSE is the Peak Mean Square Error per vertex given by

\mathrm{PMSE} = \frac{1}{N_n} \sum_{j=1}^{n} \frac{1}{3} \sum_{i=x,y,z} \frac{\left( v_{ji}(t) - \bar{v}_{ji}(t) \right)^2}{R^2} \qquad (11)
where R is the maximum inter-mesh displacement for the entire animation sequence, N_n
and n denote the number of vertices that have moved between two consecutive meshes and
the total number of mesh vertices respectively. The PSNR provides a quantitative measure
of the animation smoothness.
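As a quick illustration, equations (10)-(11) can be computed per frame as follows (a minimal sketch; the array shapes and the use of numpy are our assumptions):

    import numpy as np

    def pmse(verts, recon, R, n_moved):
        """Peak Mean Square Error per vertex, equation (11).

        verts, recon : (n, 3) original and reconstructed vertex positions
        R            : maximum inter-mesh displacement of the sequence
        n_moved      : N_n, the number of vertices that moved between meshes
        """
        per_vertex = np.sum((verts - recon) ** 2, axis=1) / 3.0   # mean over x, y, z
        return per_vertex.sum() / (n_moved * R ** 2)

    def psnr(verts, recon, R, n_moved):
        """PSNR in decibels, equation (10)."""
        return -10.0 * np.log10(pmse(verts, recon, R, n_moved))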
Another performance metric used for comparing various compression algorithms is the
compression ratio, which is defined as the ratio of the size of the original data to that of the
encoded data. The per-frame compression ratio (CR) is calculated as follows:
\mathrm{CR} = \frac{\text{Bits for raw data}}{\text{Encoded vertex data bits} + \text{Affine transform bits} + \text{Error bits}} \qquad (12)
A number of encoding schemes also use the Distortion Factor d_a (also termed KG error)
to evaluate reconstruction quality, defined as

d_a = 100 \, \frac{\| B - \hat{B} \|}{\| B - C(B) \|} \qquad (13)
where B is a 3V × F matrix representing the geometry of the V vertices in the F frames
of the original animation, \hat{B} represents the reconstructed animation geometry and C(B)
contains the average vertex positions for the animation. Likewise, the compression perfor-
mance is alternatively measured using encoded Bits per Vertex per Frame (bpvf), which is
related to the compression ratio as given by the following equation.
\mathrm{bpvf} = \frac{\text{Bits for encoding each vertex}}{F} = \frac{\text{Compressed bits (total)}}{F \, V} = \frac{96}{\frac{1}{F} \sum_{i=1}^{F} \mathrm{CR}_i} \qquad (14)

where the raw data amounts to 96 bits per vertex per frame.
3.3. Impact of Vertex Clustering on Compression Performance
In this section, we study the impact of vertex clustering on dynamic geometry compression
by comparing the aforementioned performance metrics for the clustering schemes described
previously.
3.3.1. Test Animations
Four animation sequences, namely, Chicken², Face³, Cow and Dance were used for com-
paring the clustering schemes. The Chicken animation contains 400 frames with each mesh
in the sequence consisting of 3029 vertices and 5664 triangles. The animation is highly
non-linear with the motion becoming extremely rapid after frame 260. The Face sequence
contains a realistic animation of a talking human face in various poses and exhibiting vari-
ous facial expressions as well. There are 952 frames in the sequence with 757 vertices and
1468 triangles per mesh. The Cow (2904 vertices, 5804 triangles, 204 frames) animation
is also a high motion sequence while the Dance sequence (7061 vertices, 14118 triangles,
201 frames) depicts a person performing various dance movements. Due to the similar na-
ture of motion throughout the Dance animation, only the first 100 meshes were used in the
experiments.
3.3.2. Experimental Results
ICP-based dynamic geometry compression [13] using k-way partitioning, with about 100
vertices per cluster, achieves an average compression ratio of 45 for the Chicken animation. It can
be observed that the Lloyd's clustering (Fig 4) and spectral decomposition (Fig 6) schemes
are able to segment the chicken's neck from its torso more effectively compared to k-way
topology partitioning (Fig 3). The motion in frames 40-60 and 200-230 is mainly localized
around the neck of the chicken as seen in Fig 8.

² The Chicken, created by Andrew Glassner, Tom McClure, Scott Benza, and Mark Van Langeveld for Microsoft Corporation (1996).
³ The Face sequence is the property of Visage Technologies.
Figure 8. Frames 48, 56, 203 and 221 of the Chicken animation.
For P frames, the per-frame compression is directly proportional to the number of Type
1 and Type 2 vertices and inversely proportional to the number of Type 3 vertices. The
number of Type 1 vertices, in turn, depends on how well the affine transforms computed
using ICP based registration can represent the piecewise motion of the mesh. There exists
a direct relationship between the number of Type 1 vertices registered using ICP and the
accuracy with which the initial clusters input to ICP can represent the independent mesh
regions. To illustrate this point, frame sequences 40-60 and 200-230 were encoded using
Lloyd’s clustering (100 vertices per cluster) and spectral mesh decomposition (59 mesh
components).
Table 1 shows that the number of Type 1 vertices registered using ICP is much higher
when Lloyd's and spectral clustering are used instead of k-way graph partitioning for both
frame sequences. Better clustering produces more Type 1 vertices, and hence, better com-
pression performance. For frames 40-60, all mesh vertices are encoded as either Type 1
or Type 2. However, for frames 200-230, affine transforms can effectively describe motion
of only a few mesh vertices, and therefore, there are many Type 3 vertices. For this frame
sequence, while the number of Type 1 vertices registered using Lloyd’s and spectral cluster-
ing are about the same, more Type 2 vertices generated using spectral clustering produces
higher compression.
The latter part of the Chicken animation (frames 261-399) is characterized by extensive
motion. In these frames, reconstruction errors are associated with a large number of vertices
and the reconstructed animation is noisy. Additional bits need to be encoded to improve
animation quality at the expense of compression performance for these frames as seen from
Fig 9. PSNR calculations are used to measure and improve the animation smoothness for
this set of frames. For each frame in the animation, we measure the PSNR for Type 1, Type
2, Type 3 vertices and the entire reconstructed frame for analysis. For ensuring smooth
animation reconstruction, the overall minimum PSNR varies for different sequences and
depends on the nature of the mesh motion. A minimum PSNR of 35 db is required for the
Chicken sequence (R = 0.7662), while a PSNR threshold of 20 db is sufficient for the low-
motion Face sequence (R = 0.0673). The frame PSNR can be improved by (i) allocating
extra bits for encoding Type 2 and Type 3 vertices (ii) using a smaller error threshold τ to
register Type 1 vertices and (iii) transmission of I meshes. Examples of (i) and (ii) are
shown in Fig 9.
For frames 261-399, Lloyd’s clustering performs better than spectral clustering and
provides the best compression performance (Table 1). This is because the motion in these
frames is concentrated around the chicken’s wings which is not well segmented by spectral
clustering. The rapidness in motion also necessitates a number of meshes in the animation
to be coded as I meshes which correspond to the minima in the compression curve. The
performance of various dynamic geometry coding schemes for the Chicken animation is
presented in Table 2. Clearly, the use of Lloyd’s clustering over k-way partitioning improves
performance of ICP-based compression by 3.8% under similar distortion (d_a).
Table 1. Impact of vertex clustering on compression performance for frame
sequences (a) 40-60, (b) 200-230 and (c) 261-399 of the Chicken animation. The table
contains the mean values of Type 1 count, Type 2 count, CR and PSNR for the frame
sequence under consideration. The number of Type 1 vertices and compression ratios
increase with improvement in quality of input clusters.
Frame nos. | Clust. Algo. | Type 1 count | Type 2 count | CR   | PSNR
40-60      | k-way        | 733          | 2296         | 52   | 66.1
40-60      | Lloyd's      | 746          | 2283         | 52.2 | 66.5
40-60      | Spectral     | 839.6        | 2189.4       | 52.8 | 67
200-230    | k-way        | 470          | 1787         | 45.6 | 63.7
200-230    | Lloyd's      | 604          | 1632         | 47.2 | 63.9
200-230    | Spectral     | 603          | 1925         | 48.3 | 64.6
261-399    | k-way        | 264          | 1406         | 38.3 | 51.1
261-399    | Lloyd's      | 357          | 1299         | 39.9 | 50.8
261-399    | Spectral     | 296          | 1407         | 39.1 | 51.1
For the Face sequence, the inter-frame motion is very low and very few vertices are
registered with error threshold τ = ν/4, where ν is average inter-frame motion, for many
frames. The compression results for the different clustering schemes for τ = ν/4, ν/3 and
ν/2 are shown in Table 3. Clearly, spectral clustering produces higher compression perfor-
mance than Lloyd’s or k-way partitioning. The ability of the spectral clustering algorithm
to accurately segment the various face regions (Fig 6) enables the ICP module to register a
maximum number of Type 1 vertices. This leads to a major improvement in the compres-
sion performance even when the encoded mesh is small in size. The compression obtained
using spectral clustering is 9.7%, 15% and 7.8% higher than k-way partitioning for τ equal
to ν/4, ν/3 and ν/2 respectively. However, the compression obtained using Lloyd’s and
k-way partitioning is very similar even though Lloyd's clustering produces better quality clusters.
Figure 9. (a) Reconstructed Chicken frame 286 with (i) PSNR = 25.7 db (CR =
46.6) and (ii) PSNR = 43.2 db using τ = ν/11 and 6 bits for error encoding (CR
= 46.3). (b) Reconstructed frame 320 with (i) PSNR = 31.8 db (CR = 49.5) and (ii)
PSNR = 39.7 db using τ = ν/7 and 5 bits for error encoding (CR = 49.3).
Figure 10. Frames of the Cowand Dance animations partitioned using (a) k-way (b) Lloyd’s
and (c) Spectral clustering.
Table 2. Performance of various dynamic geometry compression algorithms for the
Chicken animation (uncompressed file size = 13.9 MB). Table contains mean values of
CR and d_a (wherever available).

Compression algorithm                            | CR   | d_a
Motion compensated compression [1]               | 10   | -
Motion vector prediction-based compression [46]  | 18.3 | -
Time-dependent geometry compression [25]         | 27   | -
PCA Representation [3]                           | 39.8 | -
Connectivity-guided predictive compression [38]  | 33   | 0.13
Clustered PCA Analysis [35]                      | 34.3 | 0.076
Partitioning based compression [13]              | 45.3 | 0.11
[13] with spectral clustering                    | 46.6 | 0.12
[13] with Lloyd's clustering                     | 46.7 | 0.12
Local PCA Analysis [4]                           | 64   | 0.057
This is possibly because a marginal improvement in cluster quality does not sig-
nificantly improve the number of Type 1 vertices for the low-motion, small-sized Face mesh
sequence. A PSNR threshold of 20 db is sufficient to smoothly reconstruct the animation.
Table 3. Compression performance of the different clustering schemes at various
values of τ for the Face animation. Table contains mean values of CR and PSNR.

τ   | Clust. algo. | Type 1 count | CR   | PSNR
ν/4 | k-way        | 182          | 41.9 | 44.3
ν/4 | Lloyd's      | 165          | 41.2 | 44.5
ν/4 | Spectral     | 268          | 45.9 | 43.8
ν/3 | k-way        | 372          | 53.3 | 39.9
ν/3 | Lloyd's      | 367          | 52.9 | 40
ν/3 | Spectral     | 526.8        | 61.3 | 39.2
ν/2 | k-way        | 670          | 69.5 | 34.4
ν/2 | Lloyd's      | 675          | 69.2 | 33.9
ν/2 | Spectral     | 710          | 74.9 | 34.9
Some partitioned frames of the Cow and the Dance animations are shown in Fig 10.
A minimum PSNR of 35 db is required to smoothly reconstruct the animation for both
sequences. It is evident from the figure that k-way and Lloyd’s produce more mesh clus-
ters compared to spectral clustering. As vertex clustering is performed solely based on
proximity for k-way and Lloyd’s clustering, the cluster sizes affect the compression and
SNR performance as observed in [13]. While ICP works well on small-sized clusters, a
large number of mesh clusters are associated with increased processing time and reduced
compression performance (as more affines need to be encoded). Also, large-sized clus-
ters produce registration errors and consequently, a degradation in compression and SNR
performance. We observe that the best compression and SNR performance is achieved for
cluster sizes of 100 and 125 for k-way and Lloyd’s clustering respectively.
Table 4. Compression performance of the different clustering schemes for the Cow
and Dance animations. Mean values of CR and PSNR are listed in the table.

Anim. | Clust. algo. | Type 1 count | Type 2 count | CR (% incr.) | PSNR
Cow   | k-way        | 351          | 1386         | 40.3 (-)     | 43
Cow   | Spectral     | 332.3        | 1597         | 41.7 (3.5)   | 44.7
Cow   | Lloyd's      | 368          | 1581         | 42 (4.2)     | 42.5
Dance | k-way        | 312          | 4038         | 41.4 (-)     | 42.2
Dance | Lloyd's      | 414          | 3993         | 41.9 (1.2)   | 42.9
Dance | Spectral     | 740          | 4765         | 45.8 (10.6)  | 41.5
The Cow animation sequence is characterized by high motion and I meshes need to
be encoded frequently after frame 100. Lloyd’s clustering enables most efficient encoding
of the mesh motion as shown in Table 4. The number of Type 1 vertices is lowest for
spectral clustering, which nevertheless outperforms k-way partitioning as there are more registered Type 2
vertices. The poor performance of spectral compression for the Cow and the latter part of
the Chicken animations underlines the limitations of semantic mesh decomposition. This
is because the mesh decomposition is purely based on the intrinsic geometric structure of
the mesh. While it is difficult to achieve accurate segmentation of the mesh into distinctive
components, efficient segmentation of coherent motion regions can only be performed by
exploiting the motion cues available from the animation. Overall, about a 4% improvement
in compression performance is obtained when Lloyd’s clustering is used instead of graph
partitioning. On the other hand, spectral clustering performs exceedingly well for the Dance
animation. The increased number of ICP registered Type 1 vertices using spectral clustering
improves compression performance by over 10% for the Dance animation as shown in Table
4.
3.4. Comparison with PCA-Based Algorithms
As seen from the experimental results, the clustering scheme employed to segment the
mesh for motion prediction greatly affects compression performance.

Figure 11. (a) Clusters generated for the Chicken and Cow in [35]. (b) Vertex clusters for
the Chicken, Cow and Dance meshes in [4]. Figures adapted from [35, 4].

The k-way partitioning, Lloyd's clustering and Spectral mesh decomposition are static mesh segmentation
techniques that segment the mesh to be encoded into smaller pieces without any temporal
considerations. For encoding motion in dynamic mesh sequences, temporal cues can also
be used to group vertices likely to undergo similar motion. Motion-based segmentation can
facilitate identification of those ”pieces” that cannot be easily detected using static mesh
decomposition. For example, for the human figure in the Dance sequence, segmentation of
the arms and limbs is achieved by spectral clustering (Fig 10). Motion cues can be used to
achieve further segmentation around articulated joints like the elbow and knee, and clearly,
the segmented parts will correspond to the coherent motion regions better. Next, we look
at two recent algorithms that segment the mesh based on motion characteristics to reinforce
the idea that meaningful mesh segmentation greatly impacts performance of 3D dynamic
mesh coding algorithms.
Two approaches that perform motion-based clustering to efficiently represent motion
have been found to achieve high compression performance. Sattler et al. [35] propose the
clustered PCA (CPCA) approach to dynamic geometry compression that can identify the
mesh parts undergoing coherent motion over time. The vertex trajectories are clustered us-
ing Lloyd’s clustering [29] in combination with PCA to segment the coherent mesh parts.
Each mesh part is then compressed using PCA on the complete animation as performed
in [3]. This method results in higher compression than standard PCA and PCA+LPC ap-
proaches while producing lesser distortion. Also, Amjoun et al. [4] propose local PCA-
based compression, where the mesh is segmented into clusters based on local motion char-
acteristics and a local coordinate system is defined for each cluster, with respect to which
the cluster motion is encoded. Table 5 compares the performance of ICP-based compression
using spectral clustering with CPCA and LPCA-based compression for similar distortion.
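As a rough sketch of the motion-based grouping these methods build on, each vertex can be described by its concatenated trajectory over all frames and then grouped with a plain Lloyd's iteration; the details of [35] and [4] (normalization, the per-cluster PCA, local coordinate frames) are omitted and the helper below is only illustrative:

    import numpy as np

    def trajectory_clusters(frames, k, iters=20, seed=0):
        """Group vertices whose trajectories move similarly.

        frames : (F, n, 3) vertex positions over F frames.
        Returns an (n,) array of cluster labels; each cluster could then be
        compressed independently (e.g. with PCA over the animation).
        """
        F, n, _ = frames.shape
        traj = frames.transpose(1, 0, 2).reshape(n, 3 * F)   # one trajectory per row
        rng = np.random.default_rng(seed)
        centers = traj[rng.choice(n, size=k, replace=False)].copy()
        labels = np.zeros(n, dtype=int)
        for _ in range(iters):
            dists = np.linalg.norm(traj[:, None, :] - centers[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            for c in range(k):
                members = traj[labels == c]
                if len(members):
                    centers[c] = members.mean(axis=0)
        return labels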
From the table, it is evident that the bpvf values for ICP-based compression using
spectral decomposition are much lower than those for Clustered PCA for comparable values
of distortion. While LPCA-based compression performs better than CPCA or ICP-based
coding for the Chicken sequence, ICP coding with spectral clustering achieves maximum
compression for the Cow animation. This could be attributed to the inadequate segmenta-
tion achieved by the pure motion-based clustering schemes in [35, 4] as shown in Fig. 11.
While motion-based clustering can produce meaningful segmentation of the mesh into com-
ponents, e.g., wings and legs of the Chicken, more coherent mesh segments are obtained for
the Cow using spectral decomposition (Fig 10) compared to pure motion-based clustering.
Table 5. Comparison of CPCA and LPCA-based compression with ICP-based
compression using Spectral clustering for the Chicken and Cow animations.

Animation | CPCA bpvf | CPCA d_a | LPCA bpvf | LPCA d_a | ICP bpvf | ICP d_a
Chicken   | 4.7       | 0.076    | 3.5       | 0.008    | 2.16     | 0.12
Chicken   | 2.8       | 0.139    | 1.5       | 0.057    | 1.72     | 0.26
Cow       | 7.4       | 0.16     | 6.8       | 0.128    | 2.9      | 0.33
Cow       | 3.8       | 0.50     | 4.1       | 0.47     | 2.3      | 0.47
Cow       | 2.0       | 7.4      | 2.2       | 1.22     | 1.9      | 0.9
Therefore, an ideal vertex clustering scheme for 3D dynamic mesh coding needs to exploit
both structural and motion-based cues for maximal compression performance.
4. Conclusion
3D soft-body animation compression is non-trivial as non-planar 3D mesh deformations
cannot be described using well-known video motion prediction techniques. A number of tech-
niques have been proposed for efficiently coding 3D animations, and in particular, 3D dy-
namic geometry. These 3D dynamic mesh coding algorithms can be divided into three ma-
jor classes: registration-based, prediction-based and PCA-based multiresolution represen-
tation. Efficient vertex clustering for motion prediction enhances the compression performance
achieved using registration-based and PCA-based coding.
Our experimental results clearly demonstrate that meaningful mesh segmentation en-
hances compression performance of dynamic mesh coding algorithms. Using spectral
mesh decomposition, which segments the mesh based on structural cues, performance of
registration-based dynamic geometry coding improves by as much as 10% (for the Dance
animation), while CPCA and LPCA-based compression algorithms, that use motion cues
for vertex clustering, also perform significantly better compared to other PCA-based coding
schemes. Nevertheless, vertex clustering exclusively using structural or motion cues does
not produce the best clusters for motion prediction: Lloyd's clustering outperforms spectral
decomposition for the high-motion Chicken animation while clusters generated using mo-
tion cues are inadequate for the Cow sequence. A hierarchical mesh segmentation scheme
that initially segments the mesh based on structural cues followed by generation of finer
vertex clusters through motion analysis appears to be best suited for dynamic geometry
coding. Detecting motion-coherent vertex clusters could be the key in solving the anima-
tion compression problem. Recently proposed Bone-based animation [19], which involves
detection of the elementary transformations that constitute mesh motion, offers an exciting
prospect in this regard.
References
[1] J. H. Ahn, C. S. Kim, C. C. Kuo, and Y. S. Ho. Motion compensated compression of
3d animation models. Electronic Letters, 37(24):1445–1446, 2001.
[2] A. Khodakovsky, P. Schroder, and W. Sweldens. Progressive geometry compression.
In Proceedings of the 27th annual conference on Computer graphics and interactive
techniques, pages 271–278, 2000.
[3] M. Alexa and W. Muller. Representing animations by principal components. In EU-
ROGRAPHICS, volume 19(3), pages 411–418, 2000.
[4] Rachida Amjoun and Wolfgang Straßer. Efficient compression of 3d dynamic mesh
sequences. Journal of the WSCG, 2007.
[5] Y. Boulfani, F. Payan, and M. Antonini. Temporal wavelet-based compression of
3d animated meshes using motion-based clustering. In Proceedings of the Workshop
TAIMA’07. Tunisia, May 2007.
[6] M. Brand and K. Huang. A unifying theorem for spectral embedding and cluster-
ing. In Proceedings of the Ninth International Workshop on Artificial Intelligence and
Statistics, 2003.
[7] M. Chow. Geometry compression for real-time graphics. In Proceedings of Visualiza-
tion’97, 1997.
[8] D. Cohen-Or, O. Remez, and D. Levin. Progressive compression of arbitrary triangu-
lar meshes. In Proceedings of Visualization ’99, pages 67–72, 1999.
[9] G. Debunne, M. Desbrun, M.P. Cani, and A. H. Barr. Dynamic real-time deformations
using space and time adaptive sampling. In SIGGRAPH 2001, Computer Graphics
Proceedings, pages 31–36, 2001.
[10] M. Deering. Geometry compression. In Proceedings of SIGGRAPH ’95, pages 13–20,
1995.
[11] S. Gumhold and W. Strasser. Real time compression of triangle mesh connectivity. In
Proceedings of SIGGRAPH ’98, pages 133–140, 1998.
[12] S Gupta, K Sengupta, and A.A. Kassim. Registration, partitioning based compression
of 3d dynamic data. IEEE Transactions on Circuits and Systems for Video Technology,
13(11):1144–1155, 2003.
[13] Sumit Gupta, Kuntal Sengupta, and A.A. Kassim. Compression of 3d dynamic geom-
etry data using iterative closest point algorithm. Computer Vision and Image Under-
standing, 87:116–130, 2002.
[14] Igor Guskov and Andrei Khodakovsky. Wavelet compression of parametrically
coherent mesh sequences. In SCA ’04: Proceedings of the 2004 ACM SIG-
GRAPH/Eurographics symposium on Computer animation, pages 183–192, 2004.
[15] Bruce Hendrickson and Robert W. Leland. An improved spectral graph partitioning
algorithm for mapping parallel computations. SIAM Journal on Scientific Computing,
16(2):452–469, 1995.
[16] Bruce Hendrickson and Robert W. Leland. A multi-level algorithm for partitioning
graphs. In Supercomputing, 1995.
[17] Hugues Hoppe. Progressive meshes. In Proceedings of the 23rd annual conference
on Computer graphics and interactive techniques, pages 99–108, 1996.
[18] L. Ibarria and J. Rossignac. Dynapack: Space-time compression of the 3d animation
of triangle meshes with fixed connectivity. In Proceedings of the ACM SIGGRAPH
Symposium on Computer Animation, 1999.
[19] B. Jovanova, M. Preda, and F. Preteux. Mpeg-4 part 25: A generic model for 3d
graphics compression. In 3DTV08, pages 101–104, 2008.
[20] Tapas Kanungo, David M. Mount, Nathan S. Netanyahu, Christine D. Piatko, Ruth
Silverman, and Angela Y. Wu. The analysis of a simple k-means clustering algorithm.
In Proc of the 16th Annual Symposium on Computational Geometry, pages 100–109,
1991.
[21] Z. Karni and C. Gotsman. Compression of soft body animation sequences. Computers
and Graphics, 28:25–34, 2004.
[22] Sagi Katz and Ayellet Tal. Hierarchical mesh decomposition using fuzzy clustering
and cuts. ACM Transactions on Graphics, 22(3):954–961, 2003.
[23] B. W. Kernighan and S. Lin. An efficient heuristic procedure for partitioning graphs.
The Bell System Technical Journal, 49(2):291–307, 1970.
[24] Rob Koenen. Overview of the MPEG-4 standard. Moving Picture Experts Group,
2000.
[25] J. Lengyel. Compression of time dependent geometry. In Symposium on Interactive
3D Graphics, pages 89–95, 1999.
[26] J. Li and C.C. Kuo. A dual graph approach to 3d triangular mesh compression. In
Proceedings of the IEEE International Conference on Image Processing, pages 891–
894, 1998.
[27] X. Li, T. Toon, T. Tan, and Z. Huang. Decomposing polygon meshes for interactive
applications. In Proceedings of the Symposium on Interactive 3D Graphics, pages
35–42, 2001.
[28] R. Liu and H. Zhang. Segmentation of 3d meshes through spectral clustering. In
Proceedings of Pacific Graphics, pages 298–305, 2004.
[29] Stuart P. Lloyd. Least squares quantization in pcm. IEEE Transactions on Information
Theory, 18(2):129–137, 1982.
[30] S. Moradoff and D. Lischinski. Synthesis of textural motion with hard constraints.
In Proceedings of the 4th Israel-Korea Bi-national Conference on Geometric Modeling
and Computer Graphics, pages 123–128, 2003.
[31] K. Muller, A. Smolic, M. Kautzner, P. Eisert, and T. Wiegand. Predictive compression
of dynamic 3d meshes. In ICIP05, pages 589–592, 2005.
[32] R. Pajarola and J. Rossignac. Compressed progressive meshes. IEEE Transactions on
Visualization and Computer Graphics, 6(1):79–93, 2000.
[33] F. Payan and M. Antonini. Wavelet-based compression of 3d mesh sequences. In
Proceedings of IEEE ACIDCA-ICMI’2005, 2005.
[34] J. Rossignac. Edgebreaker: Connectivity compression for triangle meshes. IEEE
Transactions on Visualization and Computer Graphics, 5(1):47–61, 1999.
[35] Mirko Sattler, Ralf Sarlette, and Reinhard Klein. Simple and efficient compres-
sion of animation sequences. In SCA ’05: Proceedings of the 2005 ACM SIG-
GRAPH/Eurographics symposium on Computer animation, pages 209–217, 2005.
[36] Ariel Shamir, Chandrajit Bajaj, and Valerio Pascucci. Multi-resolution dynamic
meshes with arbitrary deformations. In Proceedings of the conference on Visualization
’00, pages 423–430, 2000.
[37] H.D. Simon. Partitioning of unstructured problems for parallel processing. In Proc.
Conference on Parallel Methods on Large Scale Structural Analysis and Physics Ap-
plications, pages 135–148. Pergamon Press, 1991.
[38] N. Stefanoski and J. Ostermann. Connectivity-guided predictive compression of dy-
namic 3d meshes. In ICIP06, pages 2973–2976, 2006.
[39] N. Stefanoski and J. Ostermann. Spatially and temporally scalable compression of
animated 3d meshes with mpeg-4/famc. In ICIP08, pages 2696–2699, 2008.
[40] G. Taubin and J. Rossignac. Geometric compression through topological surgery.
ACM Transactions on Graphics, 17(2):84–115, 1998.
[41] D. Terzopoulos, J. Platt, A. Barr, and K. Fleischer. Elastically deformable models. In
SIGGRAPH ’87: Proceedings of the 14th annual conference on Computer graphics
and interactive techniques, pages 205–214, 1987.
[42] C. Touma and C. Gotsman. Triangle mesh compression. In Proceedings of Graphics
Interface, pages 26–34, 1998.
[43] S. Varakliotis, J. Ostermann, and V. Hardman. Coding of animated 3d wireframe mod-
els for internet streaming applications. In Proc. International Conference on Multi-
media and Expo., pages 353–356, 2001.
[44] Yair Weiss. Segmentation using eigenvectors: A unifying view. In International
Conference on Computer Vision, pages 975–982, 1999.
[45] Ian H. Witten, A. Moffat, and Timothy C. Bell. Managing Gigabytes: Compressing
and Indexing Documents and Images. Morgan Kaufmann Publishers, San Francisco,
CA, 1999.
[46] J.H. Yang, C.S. Kim, and S.U. Lee. Compression of 3-d triangle mesh sequences
based on vertex-wise motion vector prediction. IEEE Transactions on Circuits and
Systems for Video Technology, (12):1178–1184, December 2002.
[47] H. Zhang and R. Liu. Mesh segmentation via recursive and visually salient spectral
cuts. In Proceedings of Vision, Modeling and Visualization, pages 429–436, 2005.
[48] Z. Karni and C. Gotsman. Spectral compression of mesh geometry. In Proceedings of
the 27th annual conference on Computer graphics and interactive techniques, pages
279–286, 2000.
In: Computer Animation
Editors: J.S. Wright and L.M. Hughes, pp. 113-127
ISBN 978-1-60741-559-6
© 2010 Nova Science Publishers, Inc.
Chapter 4
VIRTUAL EMOTION TO EXPRESSION:
A COMPREHENSIVE DYNAMIC EMOTION
MODEL TO FACIAL EXPRESSION GENERATION
USING THE MPEG-4 STANDARD
Paula Rodrigues¹,∗, Asla Sá²,† and Luiz Velho³,‡
¹ Informatics Department, PUC-Rio, Brazil
² TecGraf, PUC-Rio, Brazil
³ IMPA - Instituto de Matemática Pura e Aplicada, Brazil
Abstract
In this paper we present a framework for generating dynamic facial expressions
synchronized with speech, rendered using a tridimensional realistic face. Dynamic fa-
cial expressions are those temporal-based facial expressions semantically related with
emotions, speech and affective inputs that can modify a facial animation behavior.
The framework is composed of an emotion model for speech virtual actors, named
VeeM (Virtual emotion-to-expression Model), which is based on a revision of the
emotional wheel of Plutchik model. The VeeM introduces the emotional hypercube
concept in the R^4 canonical space to combine pure emotions and create new derived
emotions.
The VeeM model implementation uses the MPEG-4 face standard through an in-
novative tool named DynaFeX (Dynamic Facial eXpression). DynaFeX is an
authoring and player facial animation tool, where speech processing is performed to
allow phoneme and viseme synchronization. The tool allows both the definition
and refinement of emotions for each frame, or group of frames, as well as facial anima-
tion editing using a high-level approach based on animation scripts. The tool player
controls the animation presentation, synchronizing the speech and emotional features
with the virtual character performance. Finally, DynaFeX is built over a tridimensional
polygonal mesh, compliant with the MPEG-4 facial animation standard, which favors tool
interoperability with other facial animation systems.
Keywords: Facial Animation, Talking Heads, Expressive Virtual Characters.

E-mail address: [email protected]

E-mail address: [email protected]

E-mail address: [email protected]
1. Introduction
Character Animation is one of the key research areas in Computer Graphics and Multi-
media. It has applications in many fields, ranging from Entertainment, Games, Virtual
Presence and others.
Within the general area of character animation, the modeling and animation of faces is
perhaps the single most important and challenging topic. This is because the expressiveness
and personality of a character are communicated by facial expressions.
The research in face modeling and animation dates back to the seminal work of Frederic
Parke in the early 1970's [9]. Since that time, the area has experienced very intense devel-
opment. Practically all problems related to generating the shape and motion of faces have
been deeply studied. This body of research includes a plethora of techniques for capturing
the geometry and appearance of human faces, learning facial expressions, modeling mus-
cles and the dynamics of deformations, together with realistic rendering methods for skin
and hair.
Despite the amazing progress in the area of facial animation, there is one problem
which is still open and poses a great challenge to researchers: how to incorporate emo-
tion into animated characters. This is the crucial step toward believable virtual characters.
While a talented artist with the help of powerful modeling and animation tools can
manually create a very expressive character, the same is not true for an automatic or even
semi-automatic animation system.
It is our intent in this paper to address the challenge of generating believable virtual
characters automatically by incorporating a computational emotion model. We propose a
comprehensive emotion model for facial animation that considers the various aspects of an
expressive character. The model is implemented using a system based on the guidelines of
the MPEG-4 standard for faces.
The rest of the paper is structured as follows: in the next section the emotion models and
related work are discussed. In Section 3. we propose an emotion space named emotion
Hypercube. The emotion hypercube enables pure emotion combinations in order to generate
derived emotions in a natural way. We then describe a derived emotion classification scheme
based on the proposed space. In Section 4. some affective phenomena, like mood and
personality, are incorporated into the model. In Section 5. the VeeM (Virtual emotion-
to-expression Model) is formalized; it consists of a representation of a facial expression
from an emotion description with dynamic features. Section 6. presents an overview of the
MPEG-4 standard for facial animation. In Section 7. we explain how the proposed model is
implemented. Finally, conclusions and future work are discussed in Section 8.
2. Emotion Models and Related Work
Several models have been proposed to explain what an emotion is and how it is repre-
sented [13]. Here we summarize the main approaches related to this topic.
Basic Emotion is probably the most well-known emotion approach. The reason for
this is its association with universally recognized emotions [5]. Nevertheless, there is not yet a
consensus for defining which are the basic emotions.
Figure 1. Six universal basic emotions defined by Ekman (surrounded by a dashed line) and
additional Plutchik basic emotions (accepted and aware).
As discussed in [7], the basic emotion approach aims to build a psychologically ir-
reducible emotion set, which means that these emotions cannot be derived by any other
emotion and new emotions are derived from them. Note that these considerations match
the mathematical definition of a basis.
As mentioned above, the best known method used to study basic emotion is by observ-
ing facial expressions. Through this, Ekman [5] defined six universal emotions: anger,
fear, disgust, surprise, joy and sadness, illustrated in Figure 1.
An extension of Ekman’s model to the basic emotions representation is the approach
proposed by Plutchik [11], where two additional basic emotions are defined (emphasized in
Figure 1): anticipation (also referred as aware, curiosity or interest) and acceptance (also
referred as trust). Plutchik describes the basic emotions as pairs of opposite emotions.
Plutchik emotions are disposed in a wheel of opposed pairs, as illustrated in Figure 2.
Derived emotions are defined as the combination of two neighboring basic emotions or as a basic
emotion intensity variation. In the emotion literature, the Plutchik wheel is considered to
be enough to span most human emotional states.
Examples of computational systems that use the basic emotion approach to generate
their facial expressions are: SMILE [6], eFASE [3], EE-FAS [15], Cloning Expression [12],
the MPEG-4 Standard [8] and the CSLU Toolkit [2].
In addition to any model of emotion, emotion perception becomes unique for each
person due to factors such as mood and personality.
The approach proposed in [10] is to model mood as a simple and unique dimension:
good mood and bad mood. A more complete approach proposed by Thayer in [16] uses
emotion spaces to represent mood in two dimensions (calm/tense and energy/tired), result-
ing in four mood emotional states: Energetic-calm, Energetic-tense, Tired-calm and Tired-
tense. An example of a computational system that incorporates mood into the character using
Thayer’s model to generate dynamic facial expressions is the DER [14] [15].
Personality is another important aspect that makes the action and reaction of each per-
son unique, even when submitted to the same situation as another person. Until now,
there is no formal consensus on how to define the personality traits of a person; however, the Big
Figure 2. Plutchik wheel.
Five (or Big OCEAN) model is well-known. In this model, each letter of the word OCEAN
defines a dimension of the personality trait: Openness to experience, Conscientiousness,
Extraversion, Agreeableness, Neuroticism¹.
Emotions are not static. They are experienced by each individual differently because of
characteristics such as personality and mood, referred to here as affective phenomena. Addi-
tionally, affective phenomena also interfere in the reaction of each person when a
stimulus is received, defining the emotion sustaining time.
Our aim is to propose a computational system which incorporates and implements a
robust model based on basic emotions. So, the Plutchik model [11] is revisited and gen-
eralized to allow the description of new emotions from the eight basic emotions as well as
the incorporation of emotion dynamics in a comprehensive manner in order to allow the
automatic generation of believable virtual characters.
3. The Emotion Hypercube
The emotion description space proposed in this paper is a reinterpretation of Plutchik’s
emotional wheel.
We consider a family of emotions as a set of emotions composed by different intensity
levels of a given basic emotion E_i. Plutchik has considered a discrete set of three levels
of intensity of a given basic emotion, namely, an attenuation of the pure basic emotion, the
pure basic emotion itself and an extrapolation of the pure basic emotion, as described in
Table 1.
Assuming that the Plutchik’s set of basic emotions is psychologically irreducible, our
¹ More information about the Big OCEAN model can be found at http://www.answers.com/topic/big-five-
personality-traits (accessed 22-jan-2008).
Table 1. Basic emotions

family (axis) | attenuation (|α_i| < 1) | basic emotion (|α_i| = 1) | extrapolation (|α_i| > 1)
1 (x+) | serenity     | joy          | ecstasy
2 (y+) | annoyance    | anger        | rage
3 (z+) | acceptance   | trust        | admiration
4 (w+) | distraction  | surprise     | amazement
5 (x-) | pensiveness  | sadness      | grief
6 (y-) | apprehension | fear         | terror
7 (z-) | boredom      | disgust      | loathing
8 (w-) | interest     | anticipation | vigilance
goal is to define a basis that represents the space of derived emotions. In order to do so, we
define an emotion axis, denoted by e, as composed by a pair of opposed families as stated
in the Plutchik emotional wheel. Thus, the eight basic emotions are arranged in 4 emotion axes
denoted by x, y, z and w.
The level of intensity of an emotion axis is modeled as a continuum parameter repre-
sented by the real value α_e ∈ [−γ, +γ], with |γ| ≥ 1. A basic emotion is mapped
to the intensity level 1 and its opposed basic emotion is mapped to -1. The neutral emo-
tion is mapped to level 0. Though the intensity level |α_i| = 1 matches a basic emotion,
|α_i| < 1 is the emotion attenuation and |α_i| > 1 is the emotion extrapolation (see Table 1).
The most obvious space to adopt for representing the space of emotions is R^n, where
n is the number of opposed pairs of emotions to be considered in the model. Since we adopt
4 emotion axes, limited to the interval [−γ, +γ], we obtain the emotion hypercube

H = [−γ, +γ] × [−γ, +γ] × [−γ, +γ] × [−γ, +γ]
A given emotional stimulus can then be completely defined by a vector of intensity
levels u = (α_x, α_y, α_z, α_w) that represents the character's state of emotion.
The emotion hypercube H is a comprehensive emotion space useful for facial expression
generation. Observe that the proposed space can be easily extended to more than four axes
if a new pair of opposed basic emotions is incorporated, observing that the new axis should
be independent from the previously defined ones in order to preserve the property that the
set of emotion axes is a basis to H.
3.1. Derived Emotions
The combination of basic emotions is an adequate approach to use with virtual talking
heads. Plutchik [11] states that two basic emotions can be combined if they are not opposed
to each other.
The emotion hypercube H leads to a simple way to derive combined emotions. For
instance, binary combinations can be defined by setting two intensity levels to zero. Thus,
given two non-opposite basic emotions E_i and E_j, their combination is defined by their non-
zero intensity values α_i and α_j. Although the original Plutchik model restricts combinations
to adjacent basic emotions, we do not adopt this restriction in the proposed model.
We emphasize that by using the hypercube model, n-ary combinations, where n is
greater than two, are easily stated, although in the literature combinations of more than
two emotions are not considered, probably due to the combinatorial growth of the num-
ber of derived emotions to be considered and semantically interpreted. Observe also that
the restriction to not combine opposite emotions is intrinsic to the disposition of opposite
emotions in the same axis.
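To make the construction concrete, here is a minimal Python sketch of the hypercube (the naming and the use of numpy are ours; the axis assignment follows Table 1): an emotion state is a 4-vector of intensities clamped to [−γ, +γ], basic emotions sit at ±1 on a single axis, and derived emotions are produced simply by setting several intensities at once.

    import numpy as np

    # Each basic emotion occupies one end of an axis, as in Table 1.
    AXES = {'joy': ('x', +1), 'sadness': ('x', -1),
            'anger': ('y', +1), 'fear': ('y', -1),
            'trust': ('z', +1), 'disgust': ('z', -1),
            'surprise': ('w', +1), 'anticipation': ('w', -1)}
    AXIS_INDEX = {'x': 0, 'y': 1, 'z': 2, 'w': 3}

    def emotion_state(intensities, gamma=2.0):
        """Build a point u of the hypercube H = [-gamma, +gamma]^4.

        intensities : mapping from basic emotion name to intensity alpha;
        opposite emotions share an axis with opposite signs.
        """
        u = np.zeros(4)
        for name, alpha in intensities.items():
            axis, sign = AXES[name]
            u[AXIS_INDEX[axis]] = sign * alpha
        return np.clip(u, -gamma, gamma)

    # A binary derived emotion ("love" in Table 2): joy combined with trust.
    love = emotion_state({'joy': 1.0, 'trust': 1.0})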
3.2. Binary Derived Emotions Taxonomy
In this section we propose a natural and complete taxonomy of the binary derived emotions
that extends Plutchik’s work [11] by defining additional semantic interpretations.
Two emotion axes e_i and e_j define an emotion plane of derived emotions Π_ij. We will
refer to each quadrant of a plane as a sector of derived emotion. The combination of the 4
axes, 2 by 2, results in 6 derived planes. In Table 2 each sector of each plane is named with
a semantic interpretation of the derived emotion.
We emphasize that the semantic interpretation of each sector corresponds to the combi-
nation of two basic emotions with level of intensity |α_i| = 1. If the basic emotion intensity
is attenuated or extrapolated, the semantic interpretation of the derived emotion may change.
4. Modeling Affective Phenomena in H
Personality, mood, physical environment and other factors interfere with the expression of
emotion. We refer to such factors collectively as affective phenomena and when applied to
an individual character as its affective pattern. Due to affective phenomena, people under
the same emotion stimulus react and feel it in different ways and intensities according to their
affective pattern.
Affective phenomena are more enduring emotions and usually occur as an emotional
background of much lower intensity than emotional episodes. In order to model the influ-
ence of an affective phenomenon on an emotional episode we assume that the basic emotion
parameters are defined considering a neutral character while an affective pattern is modeled
as a distortion (warping) of the original emotion space.
4.1. Affective Pattern Description
Suppose that in a given instant a happiness stimulus is augmented, then it is reasonable to
think that the level of intensity related to joy also augments even if the affective pattern of
the character is biased to sadness.
Thus it is reasonable to state that an affective pattern can be defined by a set of mono-
tonic functions f_i : [−γ, γ] → [−γ, γ]. The new intensity level on each emotion axis is
α̃_i = f_i(α_i), corresponding to the original stimulus α_i, where i = x, y, z, w is one of the emo-
tion axes from H. Then, the vector ũ = (α̃_x, α̃_y, α̃_z, α̃_w) is an instance of the affective
pattern in H.
This approach to modeling affective patterns is general and capable of describing non-
neutral character behavior. The set of functions χ = {f_x, f_y, f_z, f_w} characterizes an af-
fective pattern. The difficulty resides in determining such functions in a meaningful manner.
Table 2. Taxonomy

Derived plane | basic emotion i | basic emotion j | derived emotion
Π_xy | Joy     | Fear         | thrill
Π_xy | Joy     | Anger        | negative pride
Π_xy | Sadness | Fear         | despair
Π_xy | Sadness | Anger        | envy
Π_xz | Joy     | Trust        | love
Π_xz | Joy     | Disgust      | morbidness
Π_xz | Sadness | Trust        | sentimentalism
Π_xz | Sadness | Disgust      | remorse
Π_xw | Joy     | Anticipation | optimism
Π_xw | Joy     | Surprise     | absent
Π_xw | Sadness | Anticipation | pessimism
Π_xw | Sadness | Surprise     | disappointment
Π_yz | Fear    | Trust        | submission
Π_yz | Fear    | Disgust      | distress
Π_yz | Anger   | Trust        | dominance
Π_yz | Anger   | Disgust      | contempt
Π_yw | Fear    | Anticipation | anxiety
Π_yw | Fear    | Surprise     | awe
Π_yw | Anger   | Anticipation | aggression
Π_yw | Anger   | Surprise     | outrage
Π_zw | Trust   | Anticipation | positive pride
Π_zw | Trust   | Surprise     | curiosity
Π_zw | Disgust | Anticipation | cynicism
Π_zw | Disgust | Surprise     | defeat
Note that the physical environment can also be described by a set of such functions.
For example, if a character is in a formal environment the emotion expression tends to be
attenuated, that is, α̃_i < α_i; thus the χ set should be a set of emotion attenuation functions.
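A brief sketch of an affective pattern as a set of monotonic warping functions χ = {f_x, f_y, f_z, f_w}, following the attenuation example above (the specific functional form and names are our assumptions):

    import numpy as np

    def attenuation(factor, gamma=2.0):
        """A monotonic warp f : [-gamma, gamma] -> [-gamma, gamma] that scales
        intensities down, e.g. for a character in a formal environment."""
        return lambda alpha: np.clip(factor * alpha, -gamma, gamma)

    def apply_affective_pattern(u, chi):
        """Warp an emotion state u = (a_x, a_y, a_z, a_w) axis by axis with chi."""
        return np.array([f(a) for f, a in zip(chi, u)])

    # A character biased towards subdued reactions: every axis attenuated.
    chi = [attenuation(0.5)] * 4
    u_tilde = apply_affective_pattern(np.array([1.0, 0.0, 1.0, 0.0]), chi)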
4.2. The Dynamics of an Affective Pattern
It is important to notice that affective patterns are not static. For instance, personality traits
evolve over a lifetime while mood can change during the day or as a reaction to an emo-
tional episode.
The emotion dynamics can be modeled as a function of emotion and time f : (H, t) →
H.
Semantic aspects also need to be taken into account in the context of emotion dynamics.
For example: a character cannot mix opposite emotion families and cannot swap them
frequently, otherwise a conflict or emotional instability is established. The combination of
basic emotions with the farthest families generates destructive emotions, because they are
close to a conflict state.

Figure 3. Emotion vector generation.
5. VeeM: Virtual Emotion to Expression Model
Figure 4. VeeM architecture.
The H space is used to define a given character’s state of emotion in an instant of time.
In order to simulate a believable character animation, the affective patterns and its dynamic
characteristics as well as dynamic characteristics such as head movements and eyes blink
have to be combined together. The dynamics of the facial expression of a given emotion
will be treated by VeeM. In Figure 4 a schematic view of the proposed model is illustrated.
There are other equally important dynamic characteristics to be considered when aim-
ing to generate a believable character, namely the speech dynamics, the eyes movements
dynamics and the head movement dynamics, referred to in the literature as non-verbal movements.
The difference between affective patterns and non-verbal movements is their domain of
action: instead of affecting an emotional state, the non-verbal movements act directly on the
facial expression domain F. Thus, they can be modeled as a function of emotion and time
g : (H, t) → F.
Speech interferes with mouth movements and a lot of work has been done in order to
define the movements that characterize it. The subject is complex and depends on the
spoken language and character’s culture. But the consensus is to define visemes to represent
the phonemes and combine them to produce speech visualization. A viseme is a visual
representation of a phoneme that describes the facial movements that occur alongside the
voicing of phonemes.
Head and eye movements can be treated as random functions (random noise) consid-
ering the fact that it is uncommon to keep them fixed. Another complementary approach is
to model them as directed reactions that simulate the attentional focus. In both cases the
functions that model these behaviors should interfere with the head and eye positions as well
as the motion velocity.
Speech and non-verbal characteristics such as eyes and head movements should be com-
bined with the emotion description in order to produce the resultant facial expression [13].
VeeM also incorporates dynamics of reaction to an emotion episode (directly related to
stimuli or facts which elicit an emotion). The emotion expression reaction depends on
the affective pattern. The reaction curve r_j(t), similarly to the description of emotional
stimulus given by Picard [10], has 3 stages: onset (usually very fast), sustain and decay
stages, as illustrated in Figure 5.
Figure 5. Emotion reaction curve.
Emotion transition is incorporated as a blending between two subsequent emotions.
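The onset/sustain/decay reaction curve r_j(t) of Figure 5 can be sketched as a simple piecewise envelope that scales the emotion intensity over time; the linear segments below are our assumption, since the model only fixes the three stages:

    def reaction_curve(t, onset=0.2, sustain=1.0, decay=0.8, peak=1.0):
        """Piecewise-linear reaction envelope r_j(t) with three stages."""
        if t < 0.0:
            return 0.0
        if t < onset:                       # onset: fast rise to the peak
            return peak * t / onset
        if t < onset + sustain:             # sustain: hold the emotion
            return peak
        if t < onset + sustain + decay:     # decay: fade back to neutral
            return peak * (1.0 - (t - onset - sustain) / decay)
        return 0.0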
6. The MPEG-4 Standard
Once the domain of emotion description H has already been defined we now turn to the
description of the range space, that is, the space where the emotions are to be visualized:
the facial expression space F.
In general, different works in facial animation generation develop their own facial model
without worrying about compatibility. Often the only common approach among these works
is the emotion model, which is usually the set of six Ekman basic emotions.
Aiming to define a standard, MPEG-4 [4] [8] agreed on a set of control points to
define a facial model, proposing a facial polygonal mesh that can be considered universal. It
is important to mention that the MPEG-4 facial animation standard is the first effort in this
direction.
The MPEG-4 specifies a face model in its neutral state, a number of feature points (FPs)
on this neutral face as reference points, and a set of facial animation parameters (FAPs),
each corresponding to a particular facial action that deforms the face model starting from
the neutral state. In this work we identify the facial expression space F to the space of
FAPs.
A neutral face in the MPEG-4 standard must consider the following properties:
• Gaze is in the direction of the z-axis;
• All face muscles are relaxed;
• Eyelids are tangent to the iris;
• The pupil is one third the diameter of the iris;
• Lips are in contact; the line of the lips is horizontal and the same height at lip corners;
• The mouth is closed and the upper teeth touch the lower ones; and
• The tongue is flat, horizontal with the tip of the tongue touching the boundary be-
tween upper and lower teeth.
In order to define FAPs for arbitrary face models, MPEG-4 defines facial animation
parameter units (FAPUs) that serve to scale FAPs for any face model. FAPUs are defined as fractions
of distances between key facial features. These features, such as eye separation are defined
on a face model that is in the neutral state.
From the FAPUs definition, MPEG-4 specifies 84 FPs on the neutral face. The main
purpose of these FPs is to provide spatial reference for defining FAPs. FPs are arranged in
groups such as cheeks, eyes and mouth. The location of these FPs has to be known for any
MPEG-4 compliant face model.
The FAPs are based on the study of minimal perceptible actions and are closely related
to muscle actions [8]. The 68 parameters are categorized into 10 groups related to parts
of the face (Table 3). FAPs represent a complete set of basic facial actions including head
motion, eye and mouth control.
The FAP group 1 contains two high-level parameters: visemes and expressions. The
MPEG-4 standard defines 14 visemes to represent English phonemes [13] [8]. The expres-
sion parameter defines the six basic facial expressions. Facial expressions are animated by
a value defining the excitation of the expression. A benefit of using FAP Group 1 is that
each facial model preserves its personality in the sense that a specific face model preserves
its particular version of facial expression.
Table 3. FAP groups

Group                                            | Number of FAPs
1. visemes and expressions                       | 2
2. jaw, chin, inner lowerlip, cornerlips, midlip | 16
3. eyeballs, pupils, eyelids                     | 12
4. eyebrow                                       | 8
5. cheeks                                        | 4
6. tongue                                        | 5
7. head rotation                                 | 3
8. outer-lip positions                           | 10
9. nose                                          | 4
10. ears                                         | 4
FAP groups 2 to 10 are considered low-level parameters. They specify precisely how
much a FP of a face has to be moved for a given amplitude [8].
An MPEG-4 facial expression is then obtained by moving the Feature Points (FP) asso-
ciated with the FAPs. Each basic emotion has a set of FAPs defined to produce its corre-
sponding facial expression. We call signal the facial expression related to a specific emotion and
we denote the set of FAP values that define the emotion j as v_j.
The facial animation sequence is obtained by specifying FAP values at each time instant,
v_j^t, according to an input timeline. We adopt the MPEG-4 standard as our space of facial
expressions.
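As a small illustration of the adopted representation (the data layout below is our own, not mandated by the standard), a facial expression can be stored as a 68-entry FAP vector, and the signal v_j of an emotion j is then just such a vector with the FAPs of that emotion filled in:

    import numpy as np

    NUM_FAPS = 68                       # MPEG-4 facial animation parameters
    VISEME_FAP, EXPRESSION_FAP = 0, 1   # 0-based indices of high-level FAPs 1 and 2

    def emotion_signal(fap_values):
        """Build the signal v_j of an emotion j from a {FAP index: value} mapping;
        unspecified FAPs keep their neutral value 0."""
        v = np.zeros(NUM_FAPS)
        for index, value in fap_values.items():
            v[index] = value
        return v

    # A hypothetical partial signal: two arbitrary low-level FAPs set to a small amplitude.
    example_signal = emotion_signal({30: 200.0, 31: 200.0})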
7. VeeM applied on an MPEG-4 Face Model
As already mentioned, VeeM permits generating different facial emotions taking into
account affective patterns and the emotion dynamics. A remaining challenge is emotion vi-
sualization using an MPEG-4 face model, that is, to set FAP parameter values from the
defined emotion.
In this work we adopt the open source Xface [1] face model. The model had to be
extended since the original implementation defines only the six Ekman’s basic emotions
as specified in the MPEG-4 standard. Figure 6 shows the eight basic emotions of VeeM
specified in the Xface MPEG-4 polygonal mesh.
Speech is a key element to generate a dynamic and natural facial animation combined
with the character’s emotional state. The implemented system uses an audio file with the
character’s speech as input. This file goes through a speech recognition stage generating,
as output, the speech phonemes [13]. The produced phonemes are mapped into the 14
MPEG-4 visemes, each one mapped into a set with 26 FAPs that define the mouth region.
Figure 6. VeeM basic emotions viewed in a MPEG-4 facial animation.

The FAP values generated at each stage of facial animation specification (verbal ex-
pressions, non-verbal expressions and emotions) need to be blended to define the final value
that a FAP takes in each animation frame. For each animation frame, this blending is done
in two stages (Figure 7):
• Facial expression for the emotion (pure or derived); and
• Facial expression for the resulting emotion and viseme blending.
Figure 7. FAP blending to generate the final facial expression for each animation frame.
The first stage can receive as input a single vector of FAPs, two vectors of FAPs or
nothing. Whatever the input, the results are always two vectors with dimension 68: a vector
containing the mask of FAPs (signaling if a FAP participates or not in the current frame)
and another with the values for the participating FAPs.
If a vector is not provided as input, the first stage result is the FAPs vector for natural
emotion and a mask containing the value 0 for the 26 FAPs of the mouth region and for the
FAP 1 (viseme), and the value 1 for the other FAPs. The purpose of this configuration is that the face
remains in the natural state, with its expression influenced only by the viseme FAPs.
In the case where a single FAPs vector is provided, the first stage output is the FAPs
vector itself and a mask containing a value of 1 for all FAPs, with the exception of FAP 1
(viseme).
When the first stage receives two vectors of FAPs, a blending is necessary to define the
derived emotion. This blending is obtained by calculating the mean between the values of
each FAP, except for FAPs 1 and 2 which are the high-level FAPs. In FAP 1 the value 0
is set, once it represents the viseme. In FAP 2, which describes the emotion, the value 0
is assigned, because this FAP does not has influence in the animation, since only low-level
FAPs are considered by the module of synchronization.
If the two FAPs vectors are provided as input, the output mask passes through the same
process as in the case of only one input vector.
The second blending stage generates the resulting FAPs vector using as input the first
stage output, the viseme FAPs vector and a viseme contribution parameter indicating a
blending factor, denoted by β, where 0 ≤ β ≤ 1.
Before applying the viseme-emotion blending rule, the second stage creates a mask for
the viseme FAPs vector.
If there is a viseme FAPs vector, the generated mask vector has value 1 for the 26 FAPs
related to the mouth region and value 0 for the other FAPs. If there is no viseme FAPs vector
specified as input to the second stage, the mask is created with all elements of the vector having
value 0.
The blending rule is simple. Denote by FAP_emo and MASK_emo, respectively, the resulting
emotion FAPs vector and mask vector; by FAP_vis and MASK_vis, respectively, the viseme
FAPs vector and mask, both created in the second stage; and by β the viseme contribution
factor. For each index i of the resulting FAPs vector FAP_res, the following algorithmic
logic is applied:
• if MASK_vis,i = 0 ⇒ FAP_res,i = MASK_emo,i · FAP_emo,i
• if MASK_vis,i ≠ 0 ∧ MASK_emo,i = 0 ⇒ FAP_res,i = MASK_vis,i · FAP_vis,i
• if MASK_vis,i ≠ 0 ∧ MASK_emo,i ≠ 0 ⇒ FAP_res,i = (1 − β) · MASK_emo,i · FAP_emo,i + β · MASK_vis,i · FAP_vis,i
With this rule, the value β = 1 corresponds to a blending strategy in which visemes
override emotions, ignoring the emotion influence on the facial expression in the mouth
region. The β value can be set based on a qualitative analysis of the visual results obtained.
Good results have been achieved with values in the range 0.6 ≤ β ≤ 0.7.
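For illustration, the per-index blending rule can be sketched as below; the names and the default β = 0.65 (inside the range reported above) are ours, and the sketch is not the authors' implementation.

import numpy as np

def blend_faps(fap_emo, mask_emo, fap_vis, mask_vis, beta=0.65):
    """Second-stage blending of emotion and viseme FAP vectors (68 entries)."""
    fap_res = np.zeros(68)
    for i in range(68):
        if mask_vis[i] == 0:                      # no viseme contribution: emotion only
            fap_res[i] = mask_emo[i] * fap_emo[i]
        elif mask_emo[i] == 0:                    # viseme only
            fap_res[i] = mask_vis[i] * fap_vis[i]
        else:                                     # weighted viseme-emotion blend
            fap_res[i] = ((1 - beta) * mask_emo[i] * fap_emo[i]
                          + beta * mask_vis[i] * fap_vis[i])
    return fap_res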
Once the FAP values for each animation frame are generated, the next step is the syn-
chronization between the audio speech file and each frame. At the beginning of the animation
presentation, a thread for audio playback is started in parallel with the thread used to
control the face. The thread responsible for face control synchronizes the FAPs with the
audio by checking the machine clock and using the frame frequency defined in the animation.
At each iteration, the elapsed time from the start of the animation is calculated and the
corresponding frame is loaded. If the frame differs from the previous one, its FAP structure
is applied: the FAPs are mapped onto the mesh and the new expression is rendered.
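A minimal sketch of such a face-control loop is given below; the callbacks for applying FAPs to the mesh and for querying the audio thread are assumed helpers, not part of the original system.

import time

def face_control_loop(fap_frames, fps, apply_faps, is_audio_playing):
    """Synchronize FAP frames with the audio using the machine clock."""
    start = time.monotonic()
    last_frame = -1
    while is_audio_playing():
        elapsed = time.monotonic() - start             # elapsed time since animation start
        frame = int(elapsed * fps)                     # frame index from the animation frame rate
        if frame != last_frame and frame < len(fap_frames):
            apply_faps(fap_frames[frame])              # map the FAPs onto the mesh and render
            last_frame = frame
        time.sleep(0.001)                              # avoid a busy wait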
8. Conclusion
This paper introduced a new emotion model for the generation of facial expressions in
virtual characters. The proposed Virtual emotion-to-expression Model (VeeM) is based on
a generalization of Plutchik's emotional wheel [11] to the emotional Hypercube H ⊂ R^4.
This mathematical formulation allows the combination of the 8 pure emotions in a general
and coherent way. Furthermore, it sets the ground for a comprehensive framework which
integrates emotions and affective phenomena by time-varying functions f : (H, t) → H, as
well as the non-verbal movements g : (H, t) → F, from the configuration space H to the
space of facial expressions F. The dynamics of expressions is modeled by considering the
temporal properties of the functions f_t in the space H and g_t in the space F.
Facial expressions are defined in our framework using the MPEG-4 standard. Conse-
quently, another relevant contribution of this paper is a computational methodology that
incorporates VeeM expressions into the representation of a face under the MPEG-4 guide-
lines. Our system generates animations of believable virtual characters from emotional and
verbal elements. This is done by mapping emotion and speech to FAPs at each frame of the
animation with lip-sync and expression dynamics.
An important aspect that we plan to address in the future is the development of tests to
validate the combination between visemes and facial expressions in VeeM. Another avenue
for future work is the investigation of derived emotions resulting from the combination of
more than two axes of the emotional hypercube.
Further research for the VeeM framework includes: a detailed analysis of warpings of
the emotional hypercube to model affective phenomena; an in-depth study of how to effec-
tively model the dynamics of expressions through time-dependent properties of functions
in the emotional space; and a validation of the best strategies to perform transition between
emotional states. Finally, our end goal is to fully exploit the potential of VeeM in actual
applications.
References
[1] Balci, K.: Xface: MPEG-4 based open source toolkit for 3D Facial Animation. Pro-
ceedings of the working conference on Advanced visual interfaces, 399–402 (2004).
[2] Cole, R.: Tools for research and education in speech science. Proceedings of the In-
ternational Conference of Phonetic Sciences (1999).
[3] Deng, Z., Neumann, U.: eFASE: Expressive Facial Animation Synthesis and Edit-
ing with Phoneme-Isomap Control. Proceedings of ACM SIGGRAPH/Eurographics
Symposium on Computer Animation (2006).
[4] Ebrahimi, T., Pereira, F.: The MPEG-4 Book. Prentice Hall PTR, 1st edition (2002).
[5] Ekman, P.: Universal and cultural differences in facial expressions of emotion. Ne-
braska Symposium on Motivation. Ed. Lincoln, 207–283 (1971).
[6] Kalra, P. et al.: SMILE: A Multilayered Facial Animation System. Proceedings of
IFIP WG 5 (10), 189–198 (1991).
[7] Ortony, A., Turner, T.J.: What’s basic about basic emotions? American Psychological
Association, Inc. 97 (3), 315–321 (1990).
[8] Pandzic, I.S., Forchheimer, R.: MPEG-4 Facial Animation: The Standard, Implemen-
tation and Applications. John Wiley and Sons, Ltd (2002).
[9] Parke, F.: A parametric model for human faces. PhD Thesis, University of Utah
(1974).
[10] Picard, R. W.: Affective computing. Cambridge, Mass. : M.I.T. Press (1997).
[11] Plutchik, R.: A general psychoevolutionary theory of emotion. Emotion: Theory, re-
search, and experience. Theories of emotion Vol 1, 3–33 (1980).
[12] Pyun, H. et al.: An Example-Based Approach for Facial Expression Cloning.
Proceedings of the 2003 ACM SIGGRAPH/Eurographics Symposium on Computer
Animation, 167–176 (2003).
[13] Rodrigues, P.S.L.: A System for Generating Dynamic Facial Expressions in 3D Facial
Animation with Speech Processing. PhD Thesis, PUC-Rio, Brazil (2007).
[14] Tanguy, E. A. R.: Emotions: the Art of Communication Applied to Virtual Actors. PhD
Thesis, Department of Computer Science; University of Bath; CSBU-2006-06 (ISSN
1740-9497) (2006).
[15] Tanguy, E., Willis, P., Bryson, J.: A Dynamic Emotion Representation Model Within a
Facial Animation System. Technical Report, CSBU-2005-14 (2005).
[16] Thayer, R.E.: The origin of everyday moods. Oxford University Press (1996).
In: Computer Animation
Editors: J.S. Wright and L.M. Hughes, pp. 129-144
ISBN 978-1-60741-559-6
© 2010 Nova Science Publishers, Inc.
Chapter 5
EXAMPLE-BASED PERFORMANCE-DRIVEN
ANIMATION OF AN ANATOMICAL FACE MODEL
Yu Zhang
Institute of High Performance Computing,
Singapore 117528
Abstract
Recent development of physics-based face modeling that emulates the anatomi-
cal structure including skin, muscles, and skull allows us to create detailed, realistic
animations. However, synthesis of facial expressions on such complex models of-
ten involves significant manual work due to the difficulty in determining appropriate
values of the muscle actuation parameters. This paper presents an example-based
performance-driven method to automatically estimate facial muscle actuation param-
eters from markerless video footage. Our method is based on an efficient face tracker
which uses a facial deformation subspace model. During the training phase of the
tracker a set of templates associated with the subspace basis is computed to alleviate
the online computation. At runtime, the tracking algorithm establishes temporal cor-
respondence of the face region in the video sequence by simultaneously determining
both motion and appearance parameters. Using a set of example pairs that consist
of the appearance and animation parameters corresponding to the key expressions, we
learn the relationship between facial appearances and animation parameters. It enables
the animation parameters to be computed in real-time from the appearance parameters
obtained by the tracker, allowing animation of the anatomical model at interactive
rates.
1. Introduction
Realistic modeling and animation of human faces has been one of the most interesting
problems in computer graphics. In particular, synthesizing facial expressions of virtual
characters has experienced increased attention for its important applications in entertain-
ment (e.g., movies and computer games), advanced man-machine interfaces, electronic
commerce, tele-presence and shared virtual world systems, and facial expression recog-
nition. However, the task of modeling the expressive human face by computer remains a
major challenge. First, facial movement is a product of the underlying skeletal and muscu-
lar forms, as well as the mechanical properties of the skin and subcutaneous layers. This
is very complex, because there are numerous specific muscles and important interactions
between muscles and bone structure. Second, the human visual system is very sensitive to
the nuances of facial expressions that it can perceive, and the slightest deviation from real
facial appearance or movement can be immediately detected as wrong by the most casual
viewer.
Advances in facial animation systems show the potential of physics-based approaches,
where an anatomically accurate model of facial musculature, passive tissues and underlying
skeletal structure is simulated [15, 16, 18, 25, 30, 33]. This kind of techniques can be used
to create detailed, realistic animations. However, as the model becomes more complex, an-
imating detailed models of this sort becomes more difficult, requiring complex coordinated
stimulation of the underlying musculature. Although the Facial Action Coding System
(FACS) [10] that codes facial movements in small units provides guidance to activate indi-
vidual muscles for synthesizing specific expressions, extensive manual intervention is still
required due to the difficulty in determining appropriate muscle parameter values. Once a
model exists, it is often desirable to automatically determine muscle contractions from real
facial motion data.
One solution to this problem comes from the performance-driven animation approach,
in which video footage recording the performance of a human actor is used to control the
animation of a synthetic model. The face can be tracked throughout the video by recovering
the position and expression at each frame. This information can then be used to estimate
animation parameter values. While various techniques have been used for performance-
driven animation, most existing ones use colored markers painted on the actor’s face to aid
the face tracker. Once the position of the markers has been determined, the position of the
face and the facial features can be derived easily. However, the use of markers on the face
is intrusive and limits the type of video that can be processed.
In this paper, we present an example-based performance-driven animation method by
automatically determining facial muscle activations that track markerless video footage of
a person’s face. Our method consists of several stages: facial deformation subspace con-
struction, facial motion tracking, and expression retargeting. Fig. 1 shows a block diagram
of the system architecture. In facial deformation subspace construction, we build a low-
dimensional linear subspace that models image variation due to non-rigid facial deforma-
tions. The subspace is trained offline by processing a video sequence of the person with
different expressions. In face tracking procedure, the deformation subspace model is incor-
porated into an efficient tracking algorithm which establishes temporal correspondence of
the face region in the video sequence by simultaneously determining both motion and ap-
pearance parameters with no more computation than would otherwise be required. During the training
phase of the tracking algorithm a set of templates associated with the subspace basis is com-
puted to alleviate the online computation. In expression retargeting, a set of example face
images that show key expressions are selected. Their appearance parameters in the facial
deformation subspace together with the corresponding animation parameters of an anatom-
ical 3D face model are used to learn the relationship between the animation parameters and
the appearance parameters. At runtime, it allows us to efficiently estimate the animation
parameters from the appearance parameters provided by the tracker.
Figure 1. Overview of the example-based performance-driven expression synthesis system.
The paper is organized as follows. Section 2. reviews the previous and related work.
Section 3. presents our anatomical face model and creation of key example expressions.
Construction of the deformation subspace is explained in Section 4.. Section 5. details the
facial motion tracking algorithm. Section 6. describes efficient estimation of animation
parameters in the expression retargeting process. Experimental results are shown in Section
7.. Section 8. presents conclusions and proposes avenues for future work.
2. Previous and Related Work
Realistic facial animation remains a fundamental challenge in computer graphics. Since
the pioneering work of Parke [24], a large body of literature on modeling and animating
faces has been published in the last four decades. A good overview can be found in the
textbook by Parke and Waters [23] and in the survey by Noh and Neumann [21]. In the
context of this paper, we focus on publications that address performance-driven facial ani-
mation and muscle-based face modeling.
Remarkably, one of the oldest publications in this context is the one that uses three-
dimensional sparse motion capture marker data to control facial movement of computer-
generated models [32]. The system synthesizes expressions by changing texture coordinates
calculated from the positions of the markers on the performer’s face. Eisert and Girod [9]
model a face with a B-spline surface, and analyze facial expressions into feature point
positions to estimate the facial animation parameters of the MPEG-4 standard. Guenter
et al. [13] capture both 3D geometry and shading information of a human face, and reproduce
photorealistic expressions. In all of these methods, the locations of the markers are used
to drive the 3D model. Since the markers usually are quite sparse compared to the dense
surface mesh of the model, an interpolation function is typically used to deform the mesh
so that vertices in between the markers are displaced properly.
Another category of performance-driven animation is to synthesize expressions by
blending pre-modeled key expressions. The animation is achieved by computing a set of
blending weights that minimize the Euclidean distance between the corresponding markers
on the actor’s face and the 3D model. Pighin et al. [27] reconstruct the geometry and tex-
ture of an individual face from several face images taken from different view angles. They
also model basic expressions and generate novel expressions by blending them. Later, they
propose a method to find the blending weights by minimizing an error function over the set
of pre-modeled expressions and face positions spanned by the model [26]. Kouadio et al.
[17] animate a synthetic character by extracting the interpolation weights from the feature
points traced by an optical capture system. Choe and Ko [5] develop an artist-in-the-loop
method for analyzing captured expressions. The expressions are synthesized by a linear
combination of the elements in a muscle actuation basis which consists of face shapes re-
sulting from the contraction of each single facial muscle. Typically, the basis elements need
to be resculpted a number of times to obtain satisfactory results. In the approach proposed
by Chuang and Bregler [6], the 2D key expressions are automatically found from the track-
ing data, and the corresponding 3D key shapes of a face model are created manually. Facial
animation is produced by applying the blending weights recovered from facial feature de-
composition to the 3D key shapes. However, a complete bank of 3D key shapes must be
built for any new subject, which is a tedious task.
Some approaches involve mapping the motion of a facial expression from the source
model to the target model directly [20]. Since the target model may have different shape,
the source motion vectors need to be transformed to follow the curvature of the new face
shape. In order to generate delicate skin deformations, dense mesh motion is required as
input. But this may not be available from some motion capture systems. Moreover, dense
correspondences between the source and target models should be established for motion
retargeting, which is difficult if the source and target shapes are very different.
In facial animation, desire for improved realism has driven researchers to extend ge-
ometric models with physical models of facial anatomy which attempt to emulate the in-
fluence of muscle contraction onto the skin surface by approximating the biomechanical
properties of skin [15, 16, 18, 25, 30, 33]. Animating a detailed muscle-based model can be
rather difficult since facial muscles contract in a complex coordinated manner to generate
expressions. A solution to this problem is to automatically determine muscle activations
from the facial motion data. Terzopoulos and Waters [31] extract muscle contraction pa-
rameters based on the position of facial features tracked by snakes. Morishima et al. [19]
use 2D marker positions as input for a neural network which estimates muscle actuation
parameters. Both of these approaches require heavy makeup of the actor’s face. Some
techniques compute an optical flow from the video sequence and decompose the flow into
muscle activations. Essa et al. [12] use a physical face model and develop a system to
estimate muscle contractions that match optical flow input based on feedback control the-
ory. DeCarlo and Metaxas [8] employ a similar model that incorporates variations in head
shape using anthropometric measurements. In [1, 4], a 2D quasi-static finite element model
is used to simulate movements of the lips [1] or the facial skin surface [4]. Given marker
data, the authors use a steepest descent iterative solver to calculate the lip model parameters
or the facial muscle activations that best track the motion data. More recently, Sifakis et al.
[29] employ an optimization framework to determine muscle activations that track a sparse
set of surface landmarks, and use it for speech animation [28]. However, the computa-
tional complexity of the nonlinear optimization process makes this method unsuitable to
retarget facial expressions in real-time.
3. Creating Key Expressions on an Anatomy-based Face Model
We have developed an anatomy-based face model for physically-based facial animation
[33]. The model encapsulates three structural layers: skin, muscles, and skull (see Fig. 2).
The skin surface is represented as a triangular mesh, consisting of 4,517 triangles. The
edges and vertices of the skin mesh are converted to nonlinear springs and point masses
to simulate dynamic deformation of the soft tissue. A layer of 23 muscles is attached
to the skull and inserted into the skin to control facial movement. Our muscle models
simulate the distribution of muscle force exerted on the facial skin. When muscles contract,
the surrounding skin tissue is dynamically deformed under a field of muscle forces. An
animation of facial expressions is carried out by a deformation of the skin mesh resulting
from the combined contractions of a set of muscles based on the FACS [10]. The skull is
also represented as a triangular mesh. During the runtime of an animation, articulation of
the jaw causes motion of the mouth, and shape of the skull constrains skin deformation,
preventing skull penetration. The reader is referred to [33] for detailed description of this
model.
Figure 2. The anatomy-based face model. Left: face geometry. Right: multi-layer anatomical
structure of the skin, muscles and skull. The labeled muscles include the Frontalis Inner,
Frontalis Major and Frontalis Outer, Corrugator Supercilliary, Orbicularis Oculi, Nasalis,
Zygomaticus Major and Minor, Risorius, Orbicularis Oris, Depressor Anguli and Mentalis.
The anatomy-based face model is animated to generate a set of key expressions, which
is done once offline. Ekman [10] illustrated that facial expressions result from the actuation
of a single or multiple facial muscles. Using FACS, expressions are coded as combinations
of the Action Units and levels of muscle activation, which serves as an excellent guide to
the key expression modeling job. With a windows GUI, different expressions can be readily
synthesized on the face model using the muscle and jaw parameters. For each muscle,
the degree of the muscle contraction is controlled by the parameter muscle contraction
rate which is defined between 0 (passive) and 1 (maximally active). The motion of the
jaw is realized by a 3D coordinate transformation which is controlled by six parameters:
three rotation angles and three translation parameters. Rigid movements of two eyes are
controlled by four transformation parameters. The 33 facial animation parameters are
grouped into a vector p = [p_1, p_2, ..., p_m]^T, where m = 33. Following the categorization
of emotions in psychological study [10], we create a set of 54 key expressions which are
believed to correspond to situations eliciting different kinds of emotion. Fig. 3 shows some
examples. Each key expression is rendered into an image, and its corresponding animation
parameter vector, p_i, is recorded.
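For concreteness, the layout of the animation parameter vector can be sketched as follows; only the counts (23 muscle contraction rates, 6 jaw parameters, 4 eye parameters) come from the text, while the ordering is an assumption made for illustration.

import numpy as np

# 23 muscle contraction rates in [0, 1], 6 jaw parameters (3 rotations,
# 3 translations) and 4 eye transformation parameters; m = 33 in total.
N_MUSCLES, N_JAW, N_EYES = 23, 6, 4

def make_parameter_vector(muscles, jaw, eyes):
    """Assemble the animation parameter vector p from its three groups."""
    p = np.concatenate([muscles, jaw, eyes])
    assert p.shape == (N_MUSCLES + N_JAW + N_EYES,)   # m = 33
    return p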
Figure 3. Examples of key expressions generated on the anatomy-based face model.
4. Face Deformation Subspace Model
The tracking model is trained from an annotated training set which contains a number of
images from the training video sequence. In acquisition of the training data, the subjects
were asked to make all kinds of expressions including the key expressions. To obtain the
training set, a handful of M frames from the video sequence that exhibit pronounced dif-
ferences are manually labeled with 65 feature points which are located on the eyebrows,
eyes, nose, mouth, and outline of the face. Feature points are distributed evenly along each
contour (see an example in Fig. 4).
Figure 4. A training face image (a) and the feature points (b).
Assume that the feature point set of the training frames is given as {P_i}, i = 1, ..., M, where
P_i = ((x_1^i, y_1^i), ..., (x_K^i, y_K^i)) ∈ R^(2K) is a sequence of K (K = 65) points in the image
plane. Let P̄ be the mean shape of the feature point set. P̄ is calculated after the P_i are aligned
to remove the affine motion of the face. Each training image is then warped correspondingly
from its original feature point set P_i to the mean shape P̄ by using a thin plate spline
approach [3]. After this normalization procedure, we define a target region ℛ ∈ R^N which
is the patch of N image pixels enclosed by the bounding box of P̄. Let the set {ℛ_i}, i = 1, ..., M,
be the set of M target regions in the warped training images. For each image in the set {ℛ_i}
we construct a 1D vector by scanning it in the standard lexicographic order. We assume that
the number of training images, M, is less than the number of pixels, N. The average over the
M formed vectors is given by Φ_0 = (1/M) Σ_{i=1}^{M} ℛ_i. Each formed vector differs from
the average by the vector dℛ_i = ℛ_i − Φ_0. We arrange the deviation vectors into a matrix
D = [dℛ_1, dℛ_2, ..., dℛ_M]. Principal component analysis (PCA) of the matrix D yields
a set of M principal orthogonal modes of variation in {ℛ_i}, Φ_j, and their corresponding
eigenvalues λ_j. The Φ_j are sorted according to the decreasing order of their eigenvalues. The
PCA model is obtained as:

ℛ = Φ_0 + Σ_{j=1}^{M} α_j Φ_j = Φα    (1)

where Φ = [Φ_0, Φ_1, ..., Φ_M] is the matrix consisting of the average and the M principal
modes of variation in the training set, and α = [1, α_1, ..., α_M]^T is the vector of appearance
parameters. The projection from ℛ to α is

α = Φ^T ℛ    (2)

By truncating the expansion of Eq. (1) at j = k we introduce an error whose magnitude
decreases as k is increased. We choose k such that

Σ_{j=1}^{k} λ_j ≥ τ Σ_{j=1}^{M} λ_j    (3)

where τ defines the proportion of the total variation exhibited in the training set (98% in
our case). By this, a k (<< N) dimensional deformation subspace is defined by the k
basis vectors, and each training image, ℛ_i, is represented as a point in the subspace in
R^k. The linear subspace basis, Φ, models the non-rigid deformation of the face in generating
expressions.
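A minimal sketch of this subspace construction (Eqs. (1)-(3)) in Python/NumPy is given below, assuming the warped, normalized training patches are already available as rows of an array; it is an illustration rather than the authors' implementation.

import numpy as np

def build_deformation_subspace(patches, tau=0.98):
    """Build the linear deformation subspace Phi = [Phi_0, Phi_1, ..., Phi_k].

    `patches` is an (M, N) array: one lexicographically scanned target region
    per row. `tau` is the fraction of total variation to retain (Eq. (3)).
    """
    M, N = patches.shape
    phi0 = patches.mean(axis=0)                     # average patch, Phi_0
    D = (patches - phi0).T                          # N x M matrix of deviation vectors
    # PCA via SVD of the deviation matrix: columns of U are the modes Phi_j
    U, s, _ = np.linalg.svd(D, full_matrices=False)
    eigvals = s ** 2                                # eigenvalues lambda_j
    # Smallest k capturing at least tau of the total variation, Eq. (3)
    k = int(np.searchsorted(np.cumsum(eigvals) / eigvals.sum(), tau)) + 1
    Phi = np.column_stack([phi0, U[:, :k]])         # average + k principal modes
    return Phi

# Appearance parameters of a normalized patch R (Eq. (2)): alpha = Phi.T @ patch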
5. Tracking the Face
Let I(x, t) be the pixel value at the location x = (x, y)^T in the image acquired at time t.
Over time, the relative motion between the subject and the camera causes the image of the
face to shift. We use a warping function f(x, β) to model the rigid motion of the face, where
β = [β_1, β_2, ..., β_l]^T is the motion parameter vector, with f(x, 0) = x. f is assumed to be
differentiable in both x and β. Tracking a face amounts to recovering the motion parameter
vector for each image in the tracking sequence. We assume that the only changes in images
of the face are completely described by f, i.e., there are no changes in the illumination of
the face. Our tracking model is represented by the image constancy assumption

I(f(x, β_t), t) = [Φ α_t](x),  ∀x ∈ ℛ    (4)

where I(f(x, β_t), t) is the image acquired at time t rectified with motion model f(x, β_t)
and motion parameters β_t. By [Φα](x) we denote the value of Φα for the pixel with posi-
tion x in the image. Intuitively, Eq. (4) states that the rigidly rectified image I(f(x, β_t), t)
can be expressed as a linear combination of the subspace basis vectors Φ.
Tracking a face consists of estimating for each frame in the sequence the values of the
motion parameter vector β and appearance parameter vector α to minimize the following
least squares objective function

O(α, β) = Σ_{x∈ℛ} (I(f(x, β_t), t) − [Φ α_t](x))^2 = ‖ I(β_t, t) − Φ α_t ‖^2    (5)

where I(β_t, t) is the image of the target region, under the change of coordinates with pa-
rameters β, in vector form in an N-dimensional space:

I(β_t, t) = [ I(f(x_1, β_t), t), I(f(x_2, β_t), t), ..., I(f(x_N, β_t), t) ]^T    (6)
Minimizing Eq. (5) can be a difficult task as it defines a nonconvex cost function. In the
absence of a good starting point, some costly global optimization approaches are required
to solve this problem. In our case, by taking advantage of the continuity of face motion
in the tracking sequence, we recast the tracking problem as one of determining a vector of
offsets, ∆β, such that β_{t+∆t} = β_t + ∆β for a frame acquired at t + ∆t. Incorporating
this modification into Eq. (5) and using a first order Taylor series expansion, we reduce the
problem to a linearized version

O(α, ∆β) ≈ ‖ I(β_t, t + ∆t) + J ∆β − Φ α_{t+∆t} ‖^2    (7)

where J ∈ R^(N×l) is the Jacobian matrix of I with respect to the components of β, i.e.,
J = ∂I(β, t)/∂β evaluated at β_t. Such a linearization enables us to apply continuous optimization procedures
to the tracking problem.
The optimization scheme we use first assumes α constant and uses the most recent
appearance parameter estimate α_t to rectify the target region. Then the solution for ∆β
can be obtained by solving the set of equations ∇O = 0:

∆β = −(J^T J)^{−1} J^T [ I(β_t, t + ∆t) − Φ α_t ]    (8)

We define the error vector as the difference between the rectified image and the linear
combination of the subspace basis vectors:

e(t + ∆t) = I(β_t, t + ∆t) − Φ α_t    (9)

Thus, the solution of Eq. (5) at time t + ∆t given a solution at time t is

β_{t+∆t} = β_t − (J^T J)^{−1} J^T e(t + ∆t)    (10)

From Eq. (10), we see that the obstacle to efficiently tracking the face region through the
image sequence is the computational cost of estimating J for each frame, which involves the
calculation of the image gradient vector. However, it is possible to reduce this computation
by factoring J. Each element of J can be written as

s_{ij} = I_{β_j}(f(x_i, β), t) = ∇_f I(f(x_i, β), t)^T f_{β_j}(x_i, β)    (11)
By differentiating both sides of Eq. (4) with respect to x, we obtain

∇_f I(f(x, β), t)^T f_x(x, β) = ∇_x Φα    (12)

From Eq. (11) and (12), we get

J(α, β) = [ ∇_x Φ(x_1) α f_x(x_1, β)^{−1} f_β(x_1, β) ; ∇_x Φ(x_2) α f_x(x_2, β)^{−1} f_β(x_2, β) ; ... ; ∇_x Φ(x_N) α f_x(x_N, β)^{−1} f_β(x_N, β) ]    (13)

where the rows are stacked over i = 1, ..., N. Therefore, J can be expressed in terms of the gradient of the subspace basis vectors, ∇_x Φ,
which are constant, and the appearance and motion parameters, (α, β), which are time-
varying. If we choose a motion model f such that

α f_x(x, β)^{−1} f_β(x, β) = Λ(x) Γ(α, β),    (14)
where Λ and Γ are matrices depending only on image coordinates and on the parameters (α, β),
respectively, J can then be factored into

J(α, β) = [ ∇_x Φ(x_1) Λ(x_1) ; ∇_x Φ(x_2) Λ(x_2) ; ... ; ∇_x Φ(x_N) Λ(x_N) ] Γ(α, β) = J_0 Γ(α, β)    (15)

where ∇_x Φ(x_i) is the Jacobian of Φ with respect to the image coordinates, J_0 is a constant
matrix, and Γ is a time-varying matrix. The columns of J_0 can be regarded as a set of fixed
template images and can be pre-computed offline.
By exploiting this factorization, from (8), an efficient solution for ∆β can be obtained:

∆β = −(Γ^T Ω Γ)^{−1} Γ^T J_0^T e    (16)
where Ω = J_0^T J_0. J_0 can be precomputed and stored, and only e and Γ need to be evaluated
online at (α_t, β_t). Our optimization scheme then assumes β constant and computes the
minimum of O(α, ∆β) with respect to α to obtain the solution for α_{t+∆t}:

α_{t+∆t} = Φ^T [ I(β_t, t + ∆t) + J ∆β ]    (17)

The term J ∆β represents the pixel value variation in I due to a motion of magnitude ∆β.
Intuitively, Eq. (17) states that the appearance parameters are computed by projecting into
the subspace Φ the rectified image corrected to take into account the incremental motion
∆β.
In our experiments, we use a projective motion model, f(x, β) = H x_h, where H is a
3 × 3 projective transformation matrix containing l = 8 motion parameters, and the homoge-
neous image coordinates x_h are related to x by x_h = (r, s, λ)^T → x = (r/λ, s/λ)^T =
(x, y)^T, with λ ≠ 0. The above optimization is performed using the Gauss-Newton approach.
With the obtained α_{t+∆t}, we iteratively estimate ∆β using (16). Normally the convergence
is reached in two to three iterations.
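To make the online procedure concrete, a sketch of one tracking step built around Eqs. (9), (16) and (17) is shown below; the warping and Γ-construction helpers are assumed, and the code is illustrative rather than the authors' implementation.

import numpy as np

def track_frame(image_patch, beta, alpha, Phi, J0, Omega,
                warp, gamma_matrix, n_iters=3):
    """One tracking step: estimate (beta, alpha) for a new frame.

    `warp(image, beta)` rectifies the target region with motion beta and
    returns it as an N-vector; `gamma_matrix(alpha, beta)` builds the
    time-varying factor Gamma of Eq. (15). Both are assumed helpers.
    """
    for _ in range(n_iters):
        I_rect = warp(image_patch, beta)             # I(beta_t, t + dt)
        e = I_rect - Phi @ alpha                     # error vector, Eq. (9)
        Gamma = gamma_matrix(alpha, beta)            # only e and Gamma are computed online
        # Incremental motion, Eq. (16), using the precomputed J0 and Omega = J0^T J0
        d_beta = -np.linalg.solve(Gamma.T @ Omega @ Gamma,
                                  Gamma.T @ (J0.T @ e))
        beta = beta + d_beta
        # Appearance update, Eq. (17): project the motion-corrected patch onto Phi
        alpha = Phi.T @ (I_rect + (J0 @ Gamma) @ d_beta)
    return beta, alpha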
6. Facial Expression Retargeting
The face tracker described in Section 5. provides the vector of subspace coefficients, α_t,
which encapsulates the deformation appearance of the face at time t. α_t is used to estimate
the animation parameters of the anatomy-based face model.
We select a set of example face images {ℛ_j}, j = 1, ..., n (n = 54), corresponding to the key
facial expressions from the normalized training data set {ℛ_i}, i = 1, ..., M, and project each
example image into the constructed deformation subspace to obtain its appearance parame-
ters:

α_j = Φ^T ℛ_j    (18)

Let A ∈ R^((k+1)×n) and P ∈ R^(m×n) be the matrices obtained by storing column-wise the
computed appearance parameters of the example images and the pre-stored facial animation
parameters corresponding to the key facial expressions, respectively. We construct a matrix
G ∈ R^((k+m+1)×n):

G = [ A ; WP ] = [ α_1 ... α_n ; W(p_1 ... p_n) ]    (19)

where W is a diagonal matrix of weights for the facial animation parameters to compensate for
the difference in scale between the animation parameters and the appearance parameters. We
set W = diag(w), where w^2 is the ratio of the total variation of the appearance parameters
to the total variation of the animation parameters.
Applying PCA on G, we obtain a matrix Ψ ∈ R^((k+m+1)×(q+1)) which consists of the
mean vector of the examples and the q (q < n) eigenvectors corresponding to the q largest eigen-
values of the covariance matrix GG^T:

Ψ = [ Ψ_α ; Ψ_p ]    (20)

Each example pair (α_j, p_j) can be approximated as:

[ α_j ; w p_j ] = Ψ γ_j    (21)

where γ_j = [1, γ_{j,1}, ..., γ_{j,q}]^T is the vector of coefficients. By this, a concatenated param-
eter vector, which originally lies in R^(k+m+1), can be represented as a point in the low di-
mensional parameter subspace R^(q+1). Ψ parameterizes the example parameter vectors
and represents the relationship between the appearance parameters in A and the animation
parameters in P.
For each frame in the tracking sequence, once the vector of appearance parameters is
obtained by the tracker, it can be represented by a linear combination of the parameter
subspace basis vectors: α_t = Ψ_α γ_t, where γ_t is the unknown. γ_t is solved by

γ_t = pinv(Ψ_α) α_t    (22)

where pinv(·) is the matrix pseudo-inverse operator computed using the SVD. The vector of facial ani-
mation parameters corresponding to the tracked face at time t is then computed as:

p_t = (1/w) Ψ_p γ_t = (1/w) Ψ_p pinv(Ψ_α) α_t = C α_t    (23)

where the constant matrix C ∈ R^(m×(k+1)) is precomputed offline. Table 1 shows the imple-
mentation steps of our efficient performance-driven facial animation algorithm.
Table 1. The steps of our facial animation algorithm.

Offline
1. Compute the gradient of the subspace basis ∇_x Φ(x) and the matrix Λ(x).
2. Compute and store J_0 and Ω.
3. Compute the parameter subspace basis Ψ.
4. Compute and store C.

Runtime
1. Reconstruct the image vector Φ α_t.
2. Use the motion parameter β_t to compute I(β_t, t + ∆t).
3. Compute e(t + ∆t) according to Eq. (9).
4. Compute Γ(α_t, β_t).
5. Compute ∆β according to Eq. (16).
6. Compute β_{t+∆t} = β_t + ∆β.
7. Compute α_{t+∆t} according to Eq. (17).
8. Go to step 5 to compute ∆β using α_{t+∆t} until convergence.
9. Compute the facial animation parameters p_{t+∆t} according to Eq. (23).
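As an illustration of the retargeting computation in Eqs. (18)-(23) and of the offline steps listed in Table 1, the following sketch precomputes the constant matrix C from the example pairs; the exact weight normalization is an assumption based on the ratio rule stated in the text, and the code is not the authors' implementation.

import numpy as np

def build_retargeting_matrix(A, P, q):
    """Learn the expression retargeting map p_t = C @ alpha_t.

    A is (k+1, n): appearance parameters of the n example images.
    P is (m, n): animation parameters of the key expressions.
    q is the number of retained eigenvectors.
    """
    w = np.sqrt(A.var(axis=1).sum() / P.var(axis=1).sum())   # w^2 = ratio of total variations (assumed form)
    G = np.vstack([A, w * P])                                 # stacked example matrix, Eq. (19)
    mean = G.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(G - mean, full_matrices=False)
    Psi = np.hstack([mean, U[:, :q]])                         # mean vector + q eigenvectors, Eq. (20)
    k1 = A.shape[0]
    Psi_alpha, Psi_p = Psi[:k1], Psi[k1:]                     # split as in Eq. (20)
    C = (1.0 / w) * Psi_p @ np.linalg.pinv(Psi_alpha)         # Eq. (23), precomputed offline
    return C

# Runtime, per tracked frame: p_t = C @ alpha_t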
7. Results
Our facial animation system is programmed with C++/OpenGL and runs on a 3.2GHz PC
with 1GB memory. In order to evaluate the accuracy of estimation of facial animation
parameters, we use a synthetic sequence as the input. The anatomy-based face model is
animated to generate two synthetic image sequences: a training sequence (944 frames) and
a tracking sequence (1865 frames). The facial expressions in the training sequence include
the key expressions and are different from those in the tracking sequence. We select 134
normalized images from the training sequence to train the tracker. The target face region
contains N = 147 × 220 pixels. These selected images allow us to form the deformation
subspace, Φ, for tracking. The set of 54 normalized images of the key expressions and the
pre-stored animation parameter vectors is used to compute the parameter subspace, Ψ, for
facial expression retargeting.
Figure 5. Some tracked images from the synthetic sequence.
Figure 6. Errors of estimated animation parameters.
Figure 7. Example frames from two live tracking sequences and expressions synthesized
on the 3D face model.
Fig. 5 shows some tracking frames from the synthetic sequence. We assess the rean-
imation by measuring the maximum, mean, and root mean square (RMS) errors from the
estimated animation parameters to the ground truth.

Table 2. Parameter values of the two experiments and performance of our system.
Notation: tracking time (T_t), retargeting time (T_r) and expression simulation time (T_s).

Fig. 6 plots the normalized errors of
all parameters (the ratio between measured errors and the maximum parameter range). The
result shows that the overall parameter estimation is accurate. Some relatively large errors
(e.g., the maximum errors of the parameters 3-8, 18 and 19 which control the peripheral
muscles) occur when the model in the tracking sequence has a large out-of-plane rotation
(> 20 degrees).
We also test our system using the live video sequences of different subjects. Fig. 7
shows some tracking frames and animation snapshots. Note that we do not estimate 3D
head motion, only the in-plane 2D rotation and translation extracted during the tracking
stage are used to produce the global head motion. The first tracking video (male) con-
sists of 5,100 frames, acquired by a Sony VL500 firewire camera at 30 fps. The second
video (female) consists of 5,400 frames. Also, long sequences are used for training the
tracker and the expression retargeting module. In particular, a set of images corresponding
to the key expressions are manually selected from the training sequence. Together with the
stored animation parameter vectors, the appearance parameters of these images are used to
form the example parameter pairs for building the parameter subspace. Table 2 shows the
parameter values used in two experiments and the average time of each process for resyn-
thesizing facial animation. With the proposed algorithm we can achieve standard video rate
performance (30Hz) for face tracking and estimation of animation parameters. The bulk
of computation is consumed by the physically-based expression simulation [33]. Never-
theless, we can still achieve an average frame rate of about 15 fps animation speed on the
current experimental platform.
8. Conclusion
We have presented an efficient example-based method to synthesize facial expressions on
an anatomical model by retargeting captured performance. Our method automatically de-
termines facial muscle activations from markerless video footage. In offline processing,
we build a linear subspace model to compactly describe appearance variation due to facial
deformations. Based upon this model, an efficient face tracker is used to track the target
face region by simultaneously solving for both the motion and appearance parameters. Ef-
ficiency is gained by precomputing a set of motion templates resulting from factorization
of the image Jacobian used in minimizing the tracking error function. Using a set of ex-
ample pairs that consist of the appearance and animation parameters corresponding to the
key expressions, we learn an expression retargeting matrix. Given the appearance parame-
ters provided by the tracker, facial animation parameters are estimated in real-time, and the
anatomical model is animated at an interactive rate.
One improvement of the existing system consists in splitting the face into the upper
region (including eyes and forehead) and the lower region (including nose, mouth, and
chin). This allows a more compact and accurate model of the regions of interest. It enables
us to use a subspace of much lower dimension (i.e., number of eigenvectors) to model the
appearance of a target region with lower dimensionality (i.e., length of an eigenvector),
which would speed up the tracking process. Moreover, we would like to automatically
select key expressions of the actor from the training sequence. One possibility is to build a
personalized model by conforming the anatomical 3D face model to the face shape in video
footage, and generate key expressions on this model. The corresponding video image can
then be found by minimizing an objective function that measures the similarity between the
expression in the video and 2D projection of the synthesized expression.
References
[1] S. Basu, N. Oliver, and A. Pentland. “3D modeling of human lip motion.” Proc.
ICCV’98, pp. 337-343, 1998.
[2] M. J. Black and A. D. Jepson. “Eigentracking: Robust matching and tracking of artic-
ulated objects using a view-based representation.” International Journal of Computer
Vision, 26(1): 63-84, 1998.
[3] F. L. Bookstein. “Principal warps: Thin plate splines and the decomposition of defor-
mations.” IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(6):567-
585, 1989.
[4] B. Choe, H. Lee, and H.-S. Ko. “Performance-driven muscle-based facial animation.”
Journal of Visualization and Computer Animation, 12: 67-79, 2001.
[5] B. Choe and H. S. Ko. “Analysis and synthesis of facial expressions with hand-
generated muscle actuation basis.” Proc. Computer Animation’01, pp. 12-19, 2001.
[6] E. Chuang and C. Bregler. “Performance driven facial animation using blendshape in-
terpolation.” Stanford University Computer Science Technical Report, CSTR-2002-
02, April 2002.
[7] T. F. Cootes, G. J. Edwards, and C. J. Taylor. “Active appearance models.” Proc.
ECCV’98, vol. 2, pp. 484-498, 1998.
[8] D. DeCarlo and D. Metaxas. “Deformable model-based shape and motion analysis
from images using motion residual error.” Proc. ICCV’98, pp. 113-119, 1998.
[9] P. Eisert and B. Girod. “Analyzing facial expression for virtual conferencing.” IEEE
Computer Graphics and Application, 18(5):70-78, 1998.
[10] P. Ekman and W. V. Friesen, Facial Action Coding System, Consulting Psychologists
Press Inc., Palo Alto, California 94306, 1978.
[11] R. Enciso, J. Li, D. Fidaleo, T-Y. Kim, J-Y. Noh, and U. Neumann. “Synthesis of 3D
faces.” International Workshop on Digital and Computational Video, December 1999.
[12] I. Essa and A. Pentland. “Coding, analysis, interpretation, and recognition of facial
expressions.” IEEE Tran. Pattern Analysis and Machine Intelligence, 19(7):757-763,
July 1997.
[13] B. Guenter, C. Grimm, D. Wood, H. Malvar, F. Pighin. “Making faces.” Proc. SIG-
GRAPH’98, pp. 55-66, July 1998.
[14] G. Hager and P. Belhumeur. “Efficient region tracking with parametric models of ge-
ometry and illuminations.” IEEE Tran. Pattern Analysis and Machine Intelligence,
20(10):1025-1039, 1998.
[15] K. Kähler, J. Haber, H. Yamauchi, and H.-P. Seidel. “Head shop: Generating animated
head models with anatomical structure.” Proc. ACM SIGGRAPH Symp. on Comput.
Anim., pp. 55-64, 2002.
[16] R. Koch, M. Gross, and A. Bosshard. “Emotion editing using finite elements.”Proc.
EUROGRAPHICS’98, pp. 295-302, 1998.
[17] C. Kouadio, P. Poulin, and P. Lachapelle. “Real-time facial animation based upon a
bank of 3D facial expression.” Proc. Computer Animation’98, pp. 128-136, 1998.
[18] Y. Lee, D. Terzopoulos, and K. Waters. “Realistic modeling for facial animation.”
Proc. SIGGRAPH’95, pp. 55-62, August 1995.
[19] S. Morishima, T. Ishikawa, and D. Terzopoulos. “Facial muscle parameter decision
from 2D frontal image.” Proc. the Int. Conf. on Pattern Recognition, vol. 1, 160-162,
1998.
[20] J. Y. Noh and U. Neumann. “Expression cloning.” Proc. SIGGRAPH’01, pp. 277-288,
August 2001.
[21] J. Y. Noh and U. Neumann. A survey of facial modeling and animation techniques.
USC Technical Report 99-705, USC, Los Angeles, CA, 1999.
[22] J. Ohya, Y. Kitamura, H. Takemura, H. Ishi, F. Kishino, and N. Terashima. “Virtual
space teleconferencing: Real-time reproduction of 3d human images.” Journal of Vi-
sual Communications and Image Representation, 6(1): 1-25, 1995.
[23] F. I. Parke and K. Waters. Computer Facial Animation. AK Peters, Wellesley, MA,
1996.
[24] F. I. Parke. Computer generated animation of faces. Master’s thesis, University of
Utah, Salt Lake City, 1972.
[25] S. Platt and N. Badler. “Animating facial expressions.” Proc. SIGGRAPH’81, pp. 245-
252, 1981.
[26] F. Pighin, R. Szeliski, and D. H. Salesin. “Resynthesizing facial animation through 3d
model-based tracking.” Proc. ICCV’99, pp. 143-150, 1999.
[27] F. Pighin, J. Hecker, D. Lischinski, R. Szeliski, and D. H. Salesin. “Synthesizing re-
alistic facial expressions from photographs.” Proc. SIGGRAPH’98, pp. 75-84, July
1998.
[28] E. Sifakis, A. Selle, A. Robinson-Mosher, and R. Fedkiw. “Simulating speech with a
physics-based facial muscle model.” Proc. ACM SIGGRAPH/Eurographics Symp. on
Comput. Anim.’06, pp. 261-270, 2006.
[29] E. Sifakis, I. Neverov, and R. Fedkiw. “Automatic determination of facial muscle acti-
vations from sparse motion capture marker data.” Proc. SIGGRAPH’05, pp. 417-425,
2005.
[30] D. Terzopoulos and K. Waters. “Physically-based facial modeling, analysis and ani-
mation.” Journal of Visualization and Computer Animation, vol.1, pp. 73-80, 1990.
[31] D. Terzopoulos and K. Waters. “Analysis and synthesis of facial image sequences
using physical and anatomical models.” IEEE Tran. Pattern Analysis and Machine
Intelligence, 15(6):569-579, June 1993.
[32] L. Williams. “Performance-driven facial animation.” Proc. SIGGRAPH’90, pp. 235-
242, August 1990.
[33] Y. Zhang, E. C. Prakash, and E. Sung. “Efficient modeling of an anatomy-based face
and fast 3D facial expression synthesis.” Computer Graphics Forum, 22(2): 159-169,
June 2003.
In: Computer Animation
Editors: J.S. Wright and L.M. Hughes, pp. 145-156
ISBN 978-1-60741-559-6
© 2010 Nova Science Publishers, Inc.
Chapter 6
DYNAMICS FOR MANAGING OCCLUSION
OF BUILDINGS IN PANORAMIC MAPS
Neeharika Adabala
Microsoft Research India,
“Scientia”, 196/36, 2nd Main, Sadashivnagar,
Bangalore 560080
Abstract
Panoramic maps depict urban areas in oblique view. This form of cartography was
prevalent from the late sixteenth century to the early nineteenth century, when there
were not many skyscrapers in urban areas. But oblique view maps in the current urban
scenarios suffer from loss of details due to occlusion among closely located multi-
story buildings. In this work we leverage the time dimension to overcome the clutter
in space dimension by introducing functional dynamics. We define a parameter called
occlusion index for an urban scene at a given viewpoint. Solving the problem of occlu-
sion involves devising methods for visualizing the urban scene that reduce/minimize
the occlusion index. We explore occlusion reduction techniques that involve selecting
optimal viewpoints, displacing buildings, making buildings transparent and changing
building heights. We demonstrate these approaches by presenting screen shots of the
solution applied to a prototype city block, and discuss the advantages and disadvan-
tages of these solutions. This work is pioneering in its approach to applying animation
in cartography, which has previously used animations only to depict time-dependent
phenomena or fly-throughs.
1. Introduction
Panoramic maps were a vibrant form of representing urban locales from the late sixteenth
century to the early nineteenth century [8]. They depicted cities in oblique view and in-
cluded trees, people, horse carts on roads, etc., composed together aesthetically. The
beauty of panoramic maps makes them popular wall hangings to this day. These maps
in oblique view are also appealing because they represent landmarks in three-dimensions,
making them easier to recognize for map users while finding their way in a city.
E-mail address: [email protected]
The making of panoramic maps disappeared over time, as it involved extensive manual ef-
fort by skilled artists. Also, current-day urban areas contain several buildings that occlude
each other, making oblique view maps unattractive and lacking in information. Recent
advances in computer graphics and computer vision have enabled creation of large-scale
urban models [15, 13, 14] that can be used to generate panoramic maps. These techniques
significantly reduce the manual effort of panoramic map artists. However, the problem of
extensive building occlusion severely limits the usefulness of the maps. Therefore there
is a need to explore techniques to overcome this limitation, and increase the visibility of
buildings.
A simple way to address the problem of occlusion is to modify the viewpoint of the
map. This approach is applicable to either static maps printed on paper or dynamic
maps displayed by online mapping services. However, it is not always possible to find an
acceptable solution as one or more buildings may be occluded in every possible viewpoint.
Therefore a solution cannot be guaranteed by this approach.
When it is not possible to find an ideal viewpoint, then it is clear that there is no pos-
sible solution in the three dimensions of space. We overcome this problem by resorting
to the time dimension - we do this by introducing the concept of functional dynamics into
the rendering and solve the occlusion problem. The solutions can take various forms, the
specific techniques that we describe in this work include: displacing buildings to improve
visibility, viewpoint-dependent building transparency and altering building heights. These
solutions are especially relevant in the context of the recent trend towards providing online
mapping services on digital displays. These online maps no longer have to be static. The
modifications based on the proposed approaches are applied to entities (buildings) present
in the map resulting in a time varying appearance of the map.
The above time variation in maps during user interaction is different from the typical
dynamics that are often incorporated in contemporary maps to depict time-dependent phe-
nomena (example: the evolution of a storm, or the expansion of an empire with time). We
note the distinction of the dynamics introduced by us to solve the problem of occlusion by
emphasizing that this dynamic is unrelated to the physical concept of time lapsed. There-
fore the animation is said to be functional and has the sole function of reducing occlusion
between buildings. This differentiation is important when developing GI systems that need
to visualize large volumes of information by employing both functional dynamics and an-
imations of time-dependent phenomena. Functional dynamics impacts the user interaction
with the geographic information system; an effective implementation of functional dynam-
ics results in conveying more information to the user with less interaction.
The rest of the paper is organized as follows. The following section gives an overview of
related work. In section 3 we define a parameter called occlusion index, which characterizes
the extent to which the buildings are occluded for a given viewpoint. The visibility of
the buildings in a map can be optimized by selecting a configuration that minimizes the
occlusion index. Sections 4 to 7 describe techniques to manipulate the oblique view maps
to minimize the value of the occlusion index when rendering an urban region model. We
present example results on a hypothetical synthesized city block to illustrate the working of
the proposed technique in section 8 and give conclusions in section 9.
2. Related Work
Several approaches to large-scale urban modeling have been developed and are described
in [10]. Other interesting results on non-photorealistic rendering of city models inspired
by panoramic maps are described in Buchholz et al. [2] and techniques for rendering them
stylistically are presented in the work of Adabala [1] and Döllner and Walther [4]. A tech-
nique for procedurally creating city models is presented in [16]. Rendering the models of
[16] in bird's-eye view results in impressive panoramic views of urban areas; however, the
resulting images cannot be used as maps, as several of the buildings occlude each other.
The problem of being able to use the rendered results as maps has not been ad-
dressed. Landmarks are best represented by three-dimensional models in oblique view, they
have a key role in user-friendliness of a map and are often seen in tourist maps. Therefore
techniques to make oblique view maps from models of cities is attractive.
There has been significant amount of work done in the area of determining optimal
viewpoints for rendering of graphics models. These include the work of Pere-Pau V´ azquez
et al. [17] where they develop a method of determining the optimal viewpoint for viewing a
geometric model by defining viewpoint entropy. The entropy attempts to capture the degree
of visibility of features of the geometric model, higher entropy of a viewpoint indicates
greater visible information.
A large body of work exists in computer vision literature on finding optimal viewpoints
at which to place cameras and lights such that the resulting image is most suitable to apply
computer vision algorithms. Capture of edges and features of objects is crucial for identi-
fying objects. Therefore these algorithms have mainly focused on single objects rather than
scenes where elements occlude each other [19]. All research in the context of aspect graphs
in computer vision is relevant to this work; however, the concept of an aspect graph is very
general and has to be adapted significantly to be applied in a particular context. Another
group of pertinent algorithms in computer vision is developed in the context of robot path
planning; indoor environments form the main subject of these studies [12].
Rendering algorithms like radiosity and ray-tracing often solve the visibility problem
in the context of whether light from one surface reaches another [7]. Similarly, shadow
casting algorithms also solve the visibility problem. The work by Wonka and Schmalstieg
[18] solves a visibility problem in the context of walk-throughs in urban environments.
A survey of walk-through related visibility results can be found in [3]. A summary of sev-
eral studies on visibility can be found in the excellent multidisciplinary survey of visibility
studies by Durand [5, 6].
The other interesting group of problems that require viewpoint analysis are the prob-
lems that can be mapped onto the art gallery problem. Several analytical solutions have
been proposed for the problem that are based on polygon triangulation [11]. The solutions
limit themselves to two dimensions by considering the art gallery as a polygon. Interesting
solutions have been developed for this problem; however, they cannot be easily applied in
the context of our problem.
All these visibility-related solutions treat the graphics models as a collection of triangles
and define optimization techniques applicable to the triangles. The number of triangles can
be large, making the optimization algorithms slow. Also, the triangles cannot be associated
with the semantics of the model they represent. In this work we introduce a simple concept
of occlusion index that is specific to urban models. The occlusion index provides a tool that
enables us to retain the semantic significance of each building as an entity while optimizing
visibility.
The work on automatically generating tourist maps [9] explores the problem of reducing
occlusion in tourist maps, and addresses the problem by adopting the traditional technique
used by artists that involves expanding the roads. This technique considers static maps
and emphasizes the visibility of roads. The idea of using functional dynamics to over-
come occlusion and improve visualization has not been explored prior to this work.
3. The Occlusion Index
In our approach buildings within the region of interest are represented by their bounding
boxes for the purpose of visibility computations. This is quite accurate for rectangular
building blocks, and it is acceptable for more irregularly shaped buildings as well. A general
rule of thumb is that such a bounding box is suitable for all buildings that can be mod-
eled completely by extruding a footprint to the final height with no modifications in cross
section.
This approximation can lead to faulty results when the building tapers towards the top.
In this case if the full height of the building is considered in the bounding box, then the
solutions can be erroneous for the visibility of the tapering portion. A viewpoint in which
a small portion of the tapered tip of the building is visible may be selected as an optimal
viewpoint. We overcome this problem by limiting the height of the bounding box to the
height of the building that is not tapered when it is being evaluated for visibility. We use
the bounding box for the full height of the building in computations where it occludes other
buildings so that occlusion by the tapering part is not ignored in the computation. Using
a variable bounding box, illustrated in figure 1, is a significant approximation to visibility
computations on actual geometry, but we err on the side of being more conservative in our
estimates in this approach. This leads to reasonable solutions while keeping the computa-
tions simple.
Once the bounding boxes that represent the buildings have been identified, we determine
the extent of occlusion of buildings for a given viewpoint using these bounding boxes. For
this purpose we define a parameter called occlusion index that is an aggregate of the fraction
of buildings occluded for a given viewpoint of the urban region. The building occlusion
bo_i of a building i for a viewpoint is defined by bo_i = Σ_{j (j≠i)} BlkFrac_ij, where
BlkFrac_ij is the fraction of the visible area of building i that is blocked by building j. We
apply the model's transformation matrix and the projection matrix to compute the screen
coordinates of each of the models, and the value of BlkFrac_ij is computed in screen
coordinate space.
In the case of buildings with tapering roofs, or buildings that require the use of the
variable bounding box approach, we use the smaller height value shown in figure 1 while
computing the value of BlkFrac_ij for the building i, and the larger height value when
it occurs as the building j occluding the building i behind it. Thus we are able
to separate the height property of a building during visibility computation and occlusion
computation.
Figure 1. Bounding boxes of a building that cannot be thought of as an extrusion of its
footprint.
When we want to find the solution for a subset of the total number of buildings in the scene,
we limit the computation of bo_i to the buildings of interest. However, the computation
of the values BlkFrac_ij is performed over all buildings j in the scene.
The occlusion index for a given viewpoint and position of buildings is given by
O = Σ_i bo_i
where the sum is over all buildings i in the scene for which bo_i is computed.
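As a concrete illustration of these definitions, the following sketch (in Python, and not part of the original work) computes BlkFrac_ij and the occlusion index from screen-space bounding rectangles. The ScreenBox record, its depth field, and the use of rectangle overlap as a stand-in for the blocked visible area are assumptions made only for this sketch.

    from dataclasses import dataclass

    @dataclass
    class ScreenBox:
        """Hypothetical record holding a building's projected screen-space
        bounding rectangle and its depth (distance from the viewpoint,
        smaller = nearer)."""
        xmin: float
        xmax: float
        ymin: float
        ymax: float
        depth: float

    def overlap_area(a: ScreenBox, b: ScreenBox) -> float:
        """Area of the intersection of two screen-space rectangles."""
        w = min(a.xmax, b.xmax) - max(a.xmin, b.xmin)
        h = min(a.ymax, b.ymax) - max(a.ymin, b.ymin)
        return max(w, 0.0) * max(h, 0.0)

    def blk_frac(i: ScreenBox, j: ScreenBox) -> float:
        """BlkFrac_ij: fraction of building i's screen area covered by a
        nearer building j (rectangle overlap approximates the blocked area)."""
        area_i = (i.xmax - i.xmin) * (i.ymax - i.ymin)
        if area_i <= 0.0 or j.depth >= i.depth:
            return 0.0          # j is behind i, so it cannot occlude i
        return overlap_area(i, j) / area_i

    def occlusion_index(boxes, of_interest=None):
        """O = sum_i bo_i with bo_i = sum_{j != i} BlkFrac_ij.
        `of_interest` optionally restricts the outer sum to selected
        buildings, while every building still acts as a potential occluder."""
        ids = range(len(boxes)) if of_interest is None else of_interest
        total = 0.0
        for i in ids:
            bo_i = sum(blk_frac(boxes[i], boxes[j])
                       for j in range(len(boxes)) if j != i)
            total += bo_i
        return total

A full implementation would clip or rasterize the projected boxes so that simultaneous occlusion by several buildings is handled exactly; the rectangle-overlap version above only approximates BlkFrac_ij.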
In the following four sections we describe ways in which this occlusion index can be
exploited to reduce occlusion among buildings in panoramic maps.
4. Modification of Viewpoint
An obvious way to handle occlusion is to change the viewpoint [6]. To do this, the occlusion
index is computed repeatedly for rotating viewpoints, and the viewing direction that has the
minimum value of the occlusion index is selected as the optimal viewpoint. The algorithm for
determining the optimal viewpoint is outlined below:
1. Select region or buildings for which optimal viewpoint is to be determined. Region
can be selected by truncating an urban scene to an extended bounded region around
the current visible portion of environment on the screen. If buildings are to be selected
this can be done by querying a database that contains additional information about
the functionality of the buildings.
2. Select the angle of elevation and zoom level at which to display the panoramic map
of region.
3. Determine bounding boxes for buildings.
4. Rotate 360 degrees about the center of the selected region, and evaluate occlusion
index for each viewpoint:
(a) For each building i in the scene compute the building occlusion term bo_i by
considering all the buildings that can block it for the given viewpoint.
(b) Sum up the values of bo_i to find the occlusion index for the scene.
5. Fix the viewpoint to the direction at which the occlusion index assumed its minimum value.
When no efficient data structure is used for storing the building information, this is an O(n²)
algorithm when there are n buildings in the scene and we are optimizing for all of them. If
m buildings are selected as buildings of interest then the algorithm complexity is O(mn).
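The viewpoint search of steps 1-5 can be sketched as a simple sampling loop. In the sketch below, `occlusion_at` is a hypothetical callback that applies the model and projection transforms for a given rotation angle, builds the screen-space boxes, and returns the occlusion index O (for instance by calling the occlusion_index sketch above).

    import math

    def best_view_angle(occlusion_at, num_samples=360):
        """Sample viewpoint rotations about the centre of the region and keep
        the angle with the minimum occlusion index.

        occlusion_at(angle) -> occlusion index O for that rotation angle."""
        best_angle, best_value = 0.0, float("inf")
        for k in range(num_samples):
            angle = 2.0 * math.pi * k / num_samples   # step 4: rotate 360 degrees
            value = occlusion_at(angle)               # steps 4(a)-(b): compute O
            if value < best_value:                    # step 5: keep the minimum
                best_angle, best_value = angle, value
        return best_angle, best_value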
This approach may not always result in a solution in which at least some part of every
building is seen, especially in overcrowded downtown areas. Also, modifying the
viewpoint is not applicable in all cases; often the cartographer or user may decide which
direction - north, south, east, or west - of the urban area should be displayed at the top of the
map. Therefore there is a need to explore techniques that optimize the visibility of buildings
after the orientation of the map has been selected. The following three sections describe
techniques that use functional dynamics to optimize the visibility of buildings.
5. Displacing Buildings
In this technique for handling occlusion we shift buildings about their original positions by small
amounts when they occlude each other. This movement is dependent on the viewpoint, and
it can make completely occluded buildings become partially visible in the map.
Constraints are applied to the extent to which a building can be moved from its original
position to maintain the usefulness of the map. A building cannot change its relative location
with respect to roads or with respect to other buildings. The occlusion index is minimized
to obtain the most suitable building locations for a given viewpoint. This approach cannot
guarantee a solution in all cases, especially in overcrowded downtown areas where the
buildings have a limited region in which they can be displaced.
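A minimal sketch of this idea is given below. It assumes the screen-space boxes of the earlier sketch and an `occlusion_fn` callback returning O, and it uses a greedy accept/reject rule with a hard displacement limit; the constraints on relative position with respect to roads and neighbouring buildings described above would have to be added for a real map.

    import copy

    def displace_buildings(boxes, occlusion_fn, max_shift=3.0, step=1.0,
                           of_interest=None):
        """Greedy, viewpoint-dependent displacement sketch.

        boxes        -- screen-space bounding boxes (e.g. ScreenBox records)
        occlusion_fn -- callable returning the occlusion index O for the boxes
        Each building of interest is nudged by small steps, never farther than
        max_shift from its original position; a move is kept only if it lowers O."""
        ids = list(range(len(boxes))) if of_interest is None else list(of_interest)
        shift = {i: (0.0, 0.0) for i in ids}
        for i in ids:
            for dx, dy in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
                sx, sy = shift[i]
                if abs(sx + dx) > max_shift or abs(sy + dy) > max_shift:
                    continue                      # displacement limit reached
                trial = copy.deepcopy(boxes)
                b = trial[i]
                b.xmin += dx; b.xmax += dx; b.ymin += dy; b.ymax += dy
                if occlusion_fn(trial) < occlusion_fn(boxes):
                    boxes, shift[i] = trial, (sx + dx, sy + dy)
        return boxes, shift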
6. Making Buildings Transparent
An alternative dynamic approach is to make the buildings that are blocking other buildings
transparent. This is again a viewpoint-dependent approach. When a building becomes
transparent, the contribution of the building to the occlusion index O is considered to be
zero.
This approach guarantees that all the buildings are visible from any given viewpoint.
However, it can happen that for certain viewpoints several buildings must become transparent
to make all the buildings visible. An alternative approach of assigning a depth-dependent
translucency to the buildings can be adopted in this case. Here the contribution
of a building to the occlusion index is weighted by the translucency associated with the
building. Therefore the occlusion index is given by
O = Σ_i α_i bo_i
where α_i is the transparency associated with the building i, and the values of bo_i are
computed with the equation
bo_i = Σ_{j, j≠i} α_j BlkFrac_ij
where α_j is the transparency associated with building j.
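A sketch of the weighted index follows. It assumes a `blk_frac_fn` helper such as the one in the earlier sketch, and it interprets α as opacity, so that a fully transparent building (α = 0) contributes nothing either as an occluded building or as an occluder, matching the statement above.

    def weighted_occlusion_index(boxes, alpha, blk_frac_fn, of_interest=None):
        """Translucency-weighted occlusion index:
            O = sum_i alpha_i * bo_i,  bo_i = sum_{j != i} alpha_j * BlkFrac_ij.

        alpha[k]       -- opacity of building k (0 means fully transparent)
        blk_frac_fn    -- callable returning BlkFrac for building a occluded by b,
                          e.g. the blk_frac helper of the earlier sketch."""
        ids = range(len(boxes)) if of_interest is None else of_interest
        total = 0.0
        for i in ids:
            bo_i = sum(alpha[j] * blk_frac_fn(boxes[i], boxes[j])
                       for j in range(len(boxes)) if j != i)
            total += alpha[i] * bo_i
        return total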
Figure 2. Optimal viewpoint for a simple urban scene.
7. Altering Heights of Buildings
An alternative approach is to scale down the height of obstructing buildings. This approach
is adopted in cases where the accuracy of the skyline and scale is not important, and in
stylized maps where information and aesthetics are emphasized. This kind of
distortion is often seen in tourist maps, where some of the monuments of a city are drawn to
the scale of their fame rather than their physical size.
Different ways of changing the heights of buildings can be explored. For example, one can
exchange the heights of buildings when a taller building obstructs a shorter one. This
approach to changing heights prevents an overly uniform appearance of buildings. Also, it
prevents excessive changing of building heights with changes in viewpoint. Alternatively,
one can assign an ascending order of heights to buildings for a given viewpoint. This approach
can result in a regular, monotonous representation of the urban area. In the approaches
described here, where we shorten the buildings in the foreground relative to the ones at
the back, the occlusion index reduces as the values of BlkFrac_ij decrease when shorter
buildings are in front.
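The exchange rule can be sketched as follows; `occludes(j, i)` is a hypothetical viewpoint-dependent predicate (for example, BlkFrac_ij above a threshold) and is not part of the original description.

    def exchange_heights(heights, occludes):
        """Height-exchange rule sketch: for the current viewpoint, whenever a
        taller building j occludes a shorter building i behind it, swap their
        heights.  heights maps building id -> height; occludes(j, i) is a
        viewpoint-dependent predicate.  Exchanging rather than uniformly
        shrinking keeps the skyline from becoming overly uniform."""
        h = dict(heights)
        ids = list(h)
        for i in ids:
            for j in ids:
                if i != j and occludes(j, i) and h[j] > h[i]:
                    h[i], h[j] = h[j], h[i]
        return h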
Note that the approaches of manipulating properties of buildings present in a map are
not completely novel. In fact these approaches have been used extensively in static visual-
izations of cities that are found in the Nuremberg Chronicle. Buildings were moved, and
heights changed in the town plans printed in this fifteenth century incunabulum.
8. Results and Discussion
To demonstrate our results, the model of an urban area was synthesized with several multistory
buildings in adjacent blocks. In real-world scenarios a subset of buildings from the map may be
displayed in a separate window, similar to the panoramic map viewing tool provided at [8], and
our techniques can work in real time in this window.
We have implemented our algorithm in C++ and OpenGL on a Windows system. The
results are illustrated in figures 2 to 6.
Figure 2 shows the synthesized simple urban area at an optimal viewpoint.
In the case of the results shown in figure 3, the model was created by considering an aerial
image of a region in Las Vegas, shown on the left. We marked out the footprints of some
Figure 3. Optimal viewpoint for an urban scene with irregularly shaped building footprints.
(Left) Aerial image of an urban area - a region in Las Vegas. (Top right) Optimal viewpoint
for the urban area model created from the aerial image. (Bottom right) The two viewpoints at
which the occlusion index reaches peak values.
complex-shaped buildings from this image and created a model by extruding these footprints
to random heights. The variation of the occlusion index for this model is given in figure 4.
Notice that the occlusion index peaks at two points; these are the two viewpoints in which
the buildings line up one behind the other in two rows, shown at the bottom right of figure
3. There are two points at which the occlusion index reaches a minimal value; one such
viewpoint is shown at the top right of figure 3.
A limitation of this approach is that it cannot guarantee that all buildings are visible at
optimal viewpoints. At some elevations of viewing the urban region some of the buildings
are completely occluded even at optimal viewpoints.
The other three approaches make use of functional dynamics to enhance the information
visible to the user. Frames captured during interaction with the synthesized urban block are
presented in figures 5 to 8; observe the occlusion of the pink building in each case.
Figure 5 shows the completely occluded building. In figure 6 the green building in the
front is displaced suitably, and the pink building becomes partially visible. In figure 7
the buildings obstructing a large portion of the buildings behind them become transparent. A
threshold value of the extent of obstruction is used to decide when a building should be
made transparent; this value is provided as a user control. A low threshold value makes
buildings become transparent even when they occlude only a small fraction of the building
behind them. In the implementation shown in figure 7 we made the building faces fully
transparent and retained only the edges, to indicate that a building has been made transparent
for this viewpoint to improve visibility.
Figure 8 shows the results when the buildings change height. Such a modification of
the ground truth can be disorienting to a user. However, this can be mitigated if the buildings
are labeled and the user is aware that the map is stylized and not to scale with respect to the
building heights.
(Plot area of Figure 4: occlusion index on the vertical axis, with tick marks from 0 to 25.)
Figure 4. Graph of the variation of the occlusion index with change in viewpoint for the scene
shown in Figure 3. Notice the two peaks where the viewpoint coincides with the two directions
in which the buildings are apparently lined up one behind the other, as shown in Figure 3.
Our initial experimentation with users showed that they find the approach
interesting; however, more detailed user studies are required for wider use of this approach.
These approaches enable viewing of more information in a single frame; therefore a
better picture of the relative locations of buildings in the urban region is obtained with less
interaction. A more thorough manipulation of the region would have been required without
the support of functional dynamics.
9. Conclusions
In this work we addressed the issue of information loss in oblique-view maps due to occlusion
among closely located multistory buildings. Maps are abstract representations of spatial
information.
Figure 5. Handling occlusion of buildings in panoramic maps; observe the pink building in
all cases. The pink building is completely occluded for this viewpoint.
Figure 6. Moving position of buildings slightly.
Figure 7. Making buildings transparent.
Figure 8. Changing height of buildings - exchange of height between obstructing building
and building at the back.
We therefore manipulated the visualization scheme to maximize the
information presented to the user. We defined a parameter called the occlusion index in the
context of urban areas with multistory buildings, and determined optimal viewpoints for
the region of interest. This solution did not always guarantee the visibility of all buildings
in the urban region of interest. We therefore proposed alternative solutions to visualize
overcrowded three-dimensional spaces by taking recourse to the fourth, ‘time’, dimension;
we introduced the concept of functional dynamics. We applied this approach and suggested
techniques that displace buildings slightly from their original positions, make buildings
transparent, and change the heights of buildings to improve the visibility of information in a
region. It should be noted that while our work demonstrated the results individually for each of
the approaches, they can be combined to give more effective solutions. That is,
buildings can be displaced, made transparent, and scaled simultaneously to create optimal
visualizations of urban regions.
The concept of introducing dynamics to improve visualization is relatively new to
cartography, and it is important to conduct user studies to learn which approaches are more
attractive and effective for users. The initial reaction of users to the proposed techniques
has been encouraging.
Acknowledgment
The author would like to acknowledge the discussions with Dr. Kentaro Toyama that led to
the development of this work.
References
[1] N. Adabala. A technique for building representation in oblique-view maps of modern
urban areas. To appear in The Cartographic Journal, 2009.
[2] H. Buchholz, J. Döllner, M. Nienhaus, and F. Kirsch. Real-time non-photorealistic
rendering of 3d city models. In Proceedings of the 1st International Workshop on
Next Generation 3D City Models, 2005.
[3] D. Cohen-Or, Y. Chrysanthou, C. Silva, and F. Durand. A survey of visibility for
walkthrough applications. IEEE Transactions on Visualization and Computer Graph-
ics, 9(3):412–431, July-Sept. 2003.
[4] J. Döllner and M. Walther. Real-time expressive rendering of city models. In Sev-
enth International Conference on Information Visualization, Proceedings IEEE 2003
Information Visualization, pages 245–250, 2003.
[5] F. Durand. A multidisciplinary survey of visibility. In ACM SIGGRAPH Course Notes:
Visibility, Problems, Techniques, and Applications, July 2000.
[6] F. Durand. 3D Visibility: analytical study and applications. PhD thesis, Université
Joseph Fourier, Grenoble I, July 1999. http://www-imagis.imag.fr.
[7] F. Durand, G. Drettakis, and C. Puech. Fast and accurate hierarchical radiosity using
global visibility. ACM Transactions on Graphics, 18(2):128–170, 1999.
[8] Geography and Maps Division. The Library of Congress: Panoramic maps collection.
http://memory.loc.gov/ammem/pmhtml/panhome.html.
[9] F. Grabler, M. Agrawala, R. W. Sumner, and M. Pauly. Automatic generation of tourist
maps. ACM Trans. Graph., 27(3):1–11, 2008.
[10] J. Hu, S. You, and U. Neumann. Approaches to large-scale urban modeling. IEEE
Comput. Graph. Appl., 23(6):62–69, 2003.
[11] D. T. Lee and A. K. Lin. Computational complexity of art gallery problems. IEEE
Trans. Inf. Theor., 32(2):276–282, 1986.
[12] J. Lengyel, M. Reichert, B. R. Donald, and D. P. Greenberg. Real-time robot motion
planning using rasterizing computer graphics hardware. In SIGGRAPH ’90: Proceed-
ings of the 17th annual conference on Computer graphics and interactive techniques,
pages 327–335, New York, NY, USA, 1990. ACM Press.
[13] P. Müller, P. Wonka, S. Haegler, A. Ulmer, and L. V. Gool. Procedural modeling of
buildings. In SIGGRAPH ’06: ACM SIGGRAPH 2006 Papers, pages 614–623, New
York, NY, USA, 2006. ACM.
[14] P. Müller, G. Zeng, P. Wonka, and L. V. Gool. Image-based procedural modeling of
facades. ACM Trans. Graph., 26(3):85, 2007.
[15] Y. I. H. Parish and P. Müller. Procedural modeling of cities. In SIGGRAPH ’01: Pro-
ceedings of the 28th annual conference on Computer graphics and interactive tech-
niques, pages 301–308, New York, NY, USA, 2001. ACM.
[16] Y. I. H. Parish and P. Müller. Procedural modeling of cities. In SIGGRAPH ’01: Pro-
ceedings of the 28th annual conference on Computer graphics and interactive tech-
niques, pages 301–308, New York, NY, USA, 2001. ACM Press.
[17] P.-P. Vázquez, M. Feixas, M. Sbert, and W. Heidrich. Viewpoint selection using view-
point entropy. In VMV ’01: Proceedings of the Vision Modeling and Visualization
Conference 2001, pages 273–280, 2001.
[18] P. Wonka and D. Schmalstieg. Occluder shadows for fast walkthroughs of urban en-
vironments. In P. Brunet and R. Scopigno, editors, Computer Graphics Forum (Euro-
graphics ’99), volume 18(3), pages 51–60. The Eurographics Association and Black-
well Publishers, 1999.
[19] S. Yi, M. Haralick, and L. G. Shapiro. Automatic sensor and light source positioning
for machine vision. In In Proceedings of the 10th International Conference on Pattern
Recognition, pages 55–59, Piscataway, NJ, USA, 1990. IEEE Press.
In: Computer Animation ISBN: 978-1-60741-559-6
Editors: J.S. Wright and L.M. Hughes, pp. 157-175 © 2010 Nova Science Publishers, Inc.
Chapter 7
CONSTRAINT-BASED AND FEATURE-BASED
CAD SYSTEMS AND APPLICATIONS
Ioannis Fudos^a and Vasiliki Stamati^b
Department of Computer Science, University of Ioannina, Greece
Abstract
A new generation of Computer Aided Design systems has become available in which
geometric constraints can be defined to determine properties of large designs. The new design
concept, often called constraint-based design or design by features, offers users the capability
of easily defining and modifying a design, but introduces the problem of solving complicated,
not always well-defined, constraint problems. Traditional parametric models can also be
enhanced to partially support declarative constraint-based descriptions. We provide an
overview of representation schemes for CAD applications. Then we present a survey of
methods for geometric constraint solving appropriate for Computer Aided Design. We
demonstrate how these representations and constraint solving methods can be combined or
adapted to support a broad range of CAD applications by presenting two example cases of
successfully using a feature-based constraint-based representation scheme to support two
different CAD applications.
1. Introduction
Computer Aided Design has been the motivation for major breakthroughs in various fields of
computer science such as computer graphics, visualization, and computer architecture. On the
other hand, advancements in graph theory, geometric constraint solving, algorithms and data
structures have enabled the use of computers in various fields such as manufacturing, VLSI
design, reverse engineering, and restoration of artifacts. This new setting has established
Computer Aided Design as a major framework for designing and editing machine parts,
jewelry, archaeological findings, buildings, electronics and computers. We present a
^a E-mail addresses: [email protected].
^b E-mail addresses: [email protected].
framework for representing and editing CAD models for various applications based on two
main concepts:
Local characteristics, which are commonly called features. These characteristics may
determine structural properties such as connectivity and hierarchy of objects, geometric
properties such as dimensions, distances, angles, smoothness, inclusion, and other topological
relations, and finally functionality properties that are usually application dependent and
determine the behavior of the system when performing its required tasks.
Local or global constraints imposed on the model to enforce complex geometric
structures and advanced functionality. Such constraints may be part of a feature or span a
number of different features. Constraints are purely declarative, meaning that there is no
suggested enforcement procedure. The system determines, based on the application and the
user interaction, how to enforce the system of geometric constraints.
The rest of this chapter is structured as follows: Section 2 presents a comparative survey of
representation models for CAD applications. The hybrid feature-based and constraint-based
scheme is argued to be the most appropriate for a diverse collection of CAD applications.
Section 3 presents methods to tackle the bottleneck problem of geometric constraint solving.
Traditional methods are described and modern approaches that are not domain sensitive are
presented. Section 4 presents two example cases of applying the framework to two different
CAD applications. Section 5 offers conclusions.
2. Computer Aided Design Representation Schemes
There is a variety of geometric representations that can be used at different levels of CAD
applications. The suitable representation scheme for each application depends on the scope of
the application and its peculiarities. Some modeling types are simple and aim at providing
only an external representation of the object, whereas others aim at encapsulating and
providing additional knowledge and data, such as design intent, functionality, and editability.
In the following we study common modeling schemes used in CAD applications. An
object can be represented in the simple form of raw data, such as a point cloud corresponding
to points on the surface of the object. A widespread scheme in solid modeling is the Boundary
Representation (B-rep) model where the facets and edges that describe the boundary of a solid
are modeled using a connectivity graph and a collection of surface and edge patches. On the
other hand, Constructive Solid Geometry (CSG) and volume models handle objects as 3D
solids. There are also higher-level representation schemes that capture not only the shape of
the object but also provide information pertaining to design intent and functionality, which
can be used later on for re-parameterization and modification. We briefly describe each
scheme and evaluate its suitability for various CAD applications.
2.1. Raw Data
The most basic and simple way to represent a 3D object is as raw data. By raw data we mean
an unstructured collection of geometric primitives such as a point cloud or a range image.
Such data are usually produced directly from a 3D object scanning or 3D reconstruction
setting. The density of the data sets produced by these methods depends on the sampling rate
used to acquire information from the object’s surface. Also, very often the point clouds
obtained contain noisy data due to physical characteristics of the object or to limitations and
restrictions of the acquisition method used; however, processing methods have been
suggested that overcome this problem. The characteristic of this representation model is that
it describes the object as discrete data, i.e. points, without providing any information about
the connectivity, the topological relations among geometric primitives or the design intent.
This type of representation is mainly used in point-based modeling, e.g. [1], [2], and in
reverse engineering applications [3].
2.2. Boundary Representation (Brep)
A more common representation model in CAD applications is the boundary representation
model (Brep), which describes the edges and facets of the boundary of the object. This type of
model consists of a collection of surface patches. Surfaces can capture objects of complex and
freeform design. Thanks to advancements in computer graphics hardware we are able to
handle efficiently the CPU-intensive processing required by Brep. These factors have resulted
in the increased usage of this representation in a wide spectrum of applications. A Brep model
is often realized as a mesh of triangular or quadrilateral (and in general polygonal) planar or
higher degree surface facets.
Planar polygonal meshes (called polyhedral representations) are mostly suited for
rendering and virtual reality and not for CAD applications since they do not provide sufficient
detail. Often, other representation schemes are converted to polygonal representations for the
purpose of rendering. Polyhedral representations such as triangulations are also used in
reverse engineering applications, usually as intermediate representations during the re-
engineering process. A drawback of representing a 3D object with a polygonal mesh is that it
cannot capture design semantics, such as design intent, inter-part relations and overall
behavior. Also, model editing is only feasible in a local corrective sense. Smooth object
surfaces cannot efficiently and accurately be represented by a polygonal mesh, even when a
large number of polygons is used, since the polyhedral representation by definition cannot
accommodate G1 continuity. For example, to render areas of high curvature quite
accurately we need to increase the number of polygons and decrease significantly the facet
size.
Overall, polyhedral representation is not suitable for describing objects with specific
design characteristics and functionality, such as mechanical and industrial parts. Also it is not
appropriate for describing complex and detailed objects since then the large number of
polygons needed to sufficiently approximate the initial object makes the method unaffordable
both time-wise and space-wise.
Applications such as aesthetic and industrial engineering, reverse engineering and
jewellery design commonly use non-planar surfaces to capture the boundaries of complex
objects [4]. A Brep model may be constructed using NURBS (Non-Uniform Rational
B-Splines) or other parametric surface patches. This type of representation is useful in
applications where free-form surfaces are part of the repertoire of primitive geometric
entities. Brep can capture almost any type of object, such as mechanical parts and objects of
aesthetic design. Surfaces can be described using appropriate parametric representations. Brep
models make editing of local features feasible by interactively placing control points,
therefore modifying the shape or curvature of the object’s feature. However, Brep models on
their own do not capture higher design characteristics of the object such as functionality and
part relationships. The information provided through this type of model is limited and does
not provide tools for modifying parts of the model that affect the whole design. Therefore,
Brep models are used in combination with other techniques (e.g. features, constraints) to
obtain higher-level descriptions that correspond to more flexible and useful models that are
suitable for CAD applications. For instance, in [5], the authors present a beautification
process based on constraints which is performed on B-rep models constructed from reverse
engineering range data. B-rep models acquired by re-engineering can present various
inaccuracies and errors, therefore the authors suggest the beautification of the models by
describing topological regularities in terms of geometric constraints.
2.3. Volume Modeling
While surface raw data and Brep modeling schemes provide data concerning the boundary of
a model, constructive solid geometry (CSG) and volumetric models represent the objects as a
volume. This type of representation can be used for objects that B-rep cannot sufficiently
describe. For example, a Brep model cannot represent unambiguously a sphere containing a
hollow, whereas a volume model can easily capture such solids.
Constructive solid geometry (CSG) models are created by performing Boolean operations
on solid primitives, e.g. spheres, cones, cylinders and cubes. By construction, CSG models
represent objects that can be created from solid primitives; unless we use a very large number
of primitives we cannot use CSG to model higher-degree free-form objects. In general, the
CSG representation scheme is well suited for mechanical part design and for all applications
where the design history can be expressed as a tree of Boolean operations on geometric
primitives. Also, editing and local shape modification are performed by intervening in the
appropriate operation (internal tree node). Converting CSG models to renderable ones is
extremely difficult, and therefore CSG is commonly used in conjunction with Brep. In this case a
Brep model is always maintained and every modification is transformed into an incremental
Brep editing operation. Constraints may also be used in conjunction with CSG for performing
multiple internal node modifications in a single step.
Volume pixels (voxels) are used in a volumetric approach to 3D object representation. A
voxel is a geometric primitive and represents the smallest discrete volume used in this
representation scheme. Voxel-based representations are commonly used for visualizing
unstructured 3D volume data, such as data from scientific computing, medical imaging, etc.
Although used in early CAD/CAM settings, volumetric representations have been proven to
be very inefficient for computer aided editing, rendering and manufacturing. This
representation scheme may be used as redundant auxiliary information in CAD applications
[6] such as solid modeling, reverse engineering and feature-based and constraint-based
modeling for the purposes of physical modeling and simulation.
2.4. Higher-Level Representations in CAD
A current promising trend in computer-aided design is to use higher-level structures for
model representation. These structures are based on one of the former representation types in
combination with additional structural, topological or other information. A feature-based
representation scheme describes the object as a combination of features, which are surfaces or
solid parts with specific characteristics. A constraint-based representation scheme uses
geometric constraints enforced on the model and its features to obtain a more accurate
representation that captures designer requirements. The skeleton of a model can also be
considered as a higher-level CAD representation that can be used for specific operations such
as feature detection and extraction.
More specifically, the feature-based model is a representation scheme that is growing
more and more popular. The model is described by defining collections of feature elements
and relationships among them. The features are collections of points, surfaces or other
features. For example a commonly used feature type is a cross-section of a solid. Constraints
are applied to the features to create more accurate and robust models, but also for enforcing
global criteria such as tolerance and beautification. This type of model representation has
been established initially for manufacturing mechanical parts, where a library of features is
created and then relationships among feature elements are enforced. The feature-based
scheme is well suited to industrial design in general since it provides for advanced editability.
This is due to the knowledge encapsulated by the model concerning tolerances, constraints,
relationships and connectivity. For this reason, feature-based methods are often characterized
as knowledge-based. Their main objective is to exploit any knowledge and information
pertaining to design intent, functionality and construction process. Besides, this representation
scheme supports collaborative CAD, reverse engineering and VLSI applications. This type of
model also provides the user-designer with the capability of editing, redesigning and
reconstructing the original design, depending on her preferences and needs by tailoring the
model features [7].
A powerful higher-level structure for representing objects is the constraint-based scheme,
which is often used in combination with features [8]. This representation scheme is
particularly preferred in CAD applications where the objects being modeled, modified and
manufactured are of geometric or freeform design and must conform to constraints
determined locally on specific components or globally on the whole model [9]. Constraints
defined on a model or its individual components can refer to almost any characteristic, i.e.
geometric attributes, such as size and shape, topological characteristics, such as placement
and connectivity, functionality and behavior. Constraint-based models are widely used in
architecture, mechanical engineering, electronic design, aesthetic and industrial design, for
design, modeling or re-engineering. The types of constraints defined depend on the nature of
the CAD application. For example, in VLSI CAD a geometric constraint scheme may be used
in conjunction to feature-based or other graph-based connectivity modeling. Constraints are
imposed on each design feature used in the VLSI circuit, referring to the feature’s intra-
connectivity and its local characteristics (i.e. area, size, geometry). Constraints may also be
imposed to express inter-feature connectivity requirements. Finally, constraints are also
enforced globally on the circuit, and are targeted to optimize the overall placement and
routing of the features on the chip.
An object can also be represented by its skeleton. By skeleton we mean the closure of all
points that have more than one closest point on the shape boundary (for example the medial
axis transform). This representation provides the topology and shapes that exist in the object
and also reflects the symmetries of an object. Depending on the type of application the
skeleton is used for, it may be a 2D or 3D representation. For instance, in 3D the medial axis
transform produces a medial surface. The exact computation of the 3D skeleton is a
computationally intensive problem that returns a skeleton as complex as the object itself.
Therefore we usually seek an approximation. A skeleton representation scheme is used in
various CAD applications for object recognition and retrieval [10], animation [11] and other
solid modeling operations ([12], [13]). It is widely used in feature-based modeling, where it
can be employed to describe the shape of features, in feature detection and extraction
applications and shape deformation, for instance refer to [14] and [15].
3. Geometric Constraint Solving
Higher-level representations are powerful, accurate and user friendly. However, they have a
major bottleneck: all complicated functionality has been shifted to the solution of a large
nonlinear system of geometric constraints with multiple valid solutions. In this section we present
an overview of approaches to geometric constraint solving. We outline the most
representative methods and evaluate their behavior in terms of the major concerns faced in
CAD/CAM systems: solution selection, interactive speed, editability, handling of over and
underconstrained configurations and scope [16].
3.1. Numerical Constraint Solvers
In numerical constraint solvers, the constraints are translated into a system of algebraic
equations and are solved using iterative methods. To handle the exponential number of
solutions and the large number of parameters, iterative methods require sharp initial guesses.
Also, most iterative methods have difficulties handling overconstrained or underconstrained
instances. The advantage of these methods is that they have the potential to solve large
nonlinear systems that may not be solvable using any of the other methods. All existing
solvers more or less switch to iterative methods when the given configuration is not solvable
by the native method. This fact emphasizes the need for further research in the area of
numerical constraint solving.
Sketchpad [17] was the first system to use the method of relaxation as an alternative to
propagation. Relaxation is a slow but quite general method. The Newton-Raphson method has
been used in various systems [18] [19], and it proved to be faster than relaxation but it has the
problem that it may not converge or it may converge to an unwanted solution after a chaotic
behavior. For that reason, Juno [18] uses as initial state the sketch interactively drafted by the
user. However, Newton-Raphson is so sensitive to the initial guess [20], that the sketch
drafted must almost satisfy all constraints prior to constraint solving. A sophisticated use of
the Newton-Raphson method was developed in [21], where an improved way for finding the
inverse Jacobian matrix is presented. Furthermore, the idea of dividing the matrix of
constraints into submatrices as presented in the same work has the potential of providing the
user with useful information regarding the constraint structure of the sketch. Though this
information is usually quantitative and nonspecific, it may help the user in basic modi-
fications. To check whether a constraint problem is well-constrained, Chyz [22] proposes a
preprocessing phase where the graph of constraints is analyzed to check whether a necessary
condition is satisfied. The method is however quite expensive in time and it cannot detect all
the cases of singularity. An alternative method to Newton-Raphson for geometric constraint
solving is homotopy or continuation [23], which is argued in [24] to be more satisfactory in
typical situations where Newton-Raphson fails. Homotopy is global and exhaustive, and thus
slow compared to the local and fast Newton’s method [25]; however, it may be more
appropriate for CAD/CAM systems when constructive methods fail, since it may return all
solutions if designed carefully.
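To make the behavior discussed above concrete, the sketch below applies a generic Newton-Raphson iteration with a finite-difference Jacobian to a toy system of two distance constraints. It is an illustration of the general method only, not of any particular system cited here; note how the two initial guesses lead to two different valid solutions, which is exactly the solution selection issue raised above.

    import numpy as np

    def newton_solve(residuals, x0, tol=1e-10, max_iter=50, h=1e-7):
        """Generic Newton-Raphson for a square system of constraint equations.
        residuals(x) returns the constraint residuals (all zero at a solution);
        the Jacobian is approximated by finite differences.  Convergence
        depends strongly on the initial guess x0, as discussed in the text."""
        x = np.array(x0, dtype=float)
        for _ in range(max_iter):
            f = np.array(residuals(x), dtype=float)
            if np.linalg.norm(f) < tol:
                return x
            jac = np.empty((len(f), len(x)))
            for k in range(len(x)):
                xp = x.copy()
                xp[k] += h
                jac[:, k] = (np.array(residuals(xp)) - f) / h
            x = x - np.linalg.solve(jac, f)
        raise RuntimeError("Newton-Raphson did not converge")

    # Example: place point P so that |P - A| = 5 and |P - B| = 5,
    # with A = (0, 0) and B = (6, 0); the two roots are (3, 4) and (3, -4).
    A, B = (0.0, 0.0), (6.0, 0.0)
    constraints = lambda p: [(p[0] - A[0])**2 + (p[1] - A[1])**2 - 25.0,
                             (p[0] - B[0])**2 + (p[1] - B[1])**2 - 25.0]
    print(newton_solve(constraints, [3.0, 1.0]))    # converges to (3, 4)
    print(newton_solve(constraints, [3.0, -1.0]))   # converges to (3, -4)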
3.2. Constructive Constraint Solvers
This class of constraint solvers is based on the fact that most configurations in an engineering
drawing are solvable by ruler, compass and protractor or using other less classical repertoires
of construction steps. In these methods the constraints are satisfied in a constructive fashion,
which makes the constraint solving process natural for the user and suitable for interactive
debugging. There are two main approaches in this direction.
Rule-Constructive Solvers
Rule-constructive solvers use rewrite rules for the discovery and execution of the construction
steps. In this approach, complex constraints can be easily handled, and extensions to the
scope of the method are straightforward to incorporate [26]. Although it is a good approach
for prototyping and experimentation, the extensive computations involved in the exhaustive
searching and matching make it inappropriate for real world applications.
A method that guarantees termination, ruler and compass completeness and uniqueness
using the Knuth-Bendix critical pair algorithm is presented in ([27], [28]). This method can be
proved to confirm theorems that are provable under a given system of axioms [29]. A system
based on this method was implemented in Prolog. Aldefeld in [30] uses a forward chaining
inference mechanism, where the notion of direction of lines is imposed by introducing
additional rules, and thus restricting the solution space. A similar method is presented in [31],
where handling of overconstrained and underconstrained problems is given special
consideration. Sunde in [32] uses a rule-constructive method but adopts different rules for
representing directed and nondirected distances, giving flexibility for dealing with the
solution selection problem. In [33], the problem of nonunique solutions is handled by
imposing a topological order on three geometric objects. An elaborate description of a
complete set of rules for 2D geometric constraint solving can be found in [34]. In their work,
the scope of the particular set of rules is characterized. [35] presents an extension of the set of
rules of [34], and provides a correctness proof based on the techniques of [36].
Graph-Constructive Solvers
The graph-constructive approach has two phases. During the first phase the graph of
constraints is analyzed and a sequence of construction steps is derived. During the second
phase these construction steps are followed to place the geometric elements. These
approaches are fast and more methodical. In addition, conclusions characterizing the scope of
the method can be easily derived. A major drawback is that as the repertoire of constraints
increases the graph-analysis algorithm needs to be modified.
Fitzgerald [37] follows the method of dimensioned trees introduced by Requicha [38].
This method allows only horizontal and vertical distances and it is useful for simple
engineering drawings. Todd in [39] first generalized the dimension trees of Requicha. Owen
in [40] presents an extension of this principle that includes circularly dimensioned sketches.
DCM [41] is a system that uses some extension of Owen’s method. [42] presents an
elaborate graph-constructive method, with fast analysis and construction algorithms, and
extensions for handling classes of nonsolvable, underconstrained and consistently
overconstrained configurations.
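A simple way to see what the graph analysis phase checks is the classical counting condition for 2D point-and-constraint graphs: each point carries two degrees of freedom, each binary constraint removes one, and three degrees of freedom remain for the rigid-body motion of the sketch. The brute-force sketch below only illustrates that condition; it is not the analysis algorithm of any of the solvers cited above, which use far more efficient graph techniques.

    from itertools import combinations

    def counting_check(points, constraints):
        """Necessary (Laman-type) counting condition for a 2D constraint graph.
        A well-constrained sketch must satisfy |E| = 2|V| - 3, and no subset of
        points may be over-constrained (|E'| <= 2|V'| - 3 for every subgraph).
        The subgraph check here is exponential and purely illustrative."""
        V, E = len(points), len(constraints)
        if E != 2 * V - 3:
            return "under-constrained" if E < 2 * V - 3 else "over-constrained"
        for size in range(2, V):
            for subset in combinations(points, size):
                sub = set(subset)
                e_sub = sum(1 for (a, b) in constraints if a in sub and b in sub)
                if e_sub > 2 * size - 3:
                    return "over-constrained subgraph: " + str(sorted(sub))
        return "satisfies the counting condition"

    # Example: a triangle of three points with three pairwise distance constraints
    print(counting_check(["A", "B", "C"], [("A", "B"), ("B", "C"), ("A", "C")]))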
3.3. Propagation Methods
Propagation methods follow the approach met in traditional constraint solving systems. In this
approach, the constraints are first translated into a system of equations involving variables
and constants. The equations are then represented by an undirected graph which has as nodes
the equations, the variables and the constants, and whose edges represent whether a variable
or a constant appears in an equation. Subsequently, we try to direct the graph so as to satisfy
all the equations starting from the constants. To accomplish this, various propagation
techniques have been used, but none of them guarantees to derive a solution while at the same
time having a reasonable worst-case running time. For a review of these methods see [28]. In a
sense, the constructive constraint solvers can be thought of as a sub case of the propagation
method (fixed geometric elements for constants and variable geometric elements for
variables). However, constructive constraint solvers utilize domain specific information to
derive more powerful and efficient algorithms.
3.4. Symbolic Constraint Solvers
In symbolic solvers, the constraints are transformed into a system of algebraic equations which
is solved using methods from algebraic manipulation, such as Gröbner basis calculation [43]
or Wu’s method [44]. Although these methods are interesting from a theoretical viewpoint,
their practical significance is limited, since their time and space complexity is typically
exponential or even hyperexponential.
3.5. Hierarchical and Hybrid Approaches
A major result in the analysis of constraint graphs [45], in which an efficient method for
detecting dense constraint subgraphs is described, has enabled the solution of large systems of
geometric constraints in 2, 3 or more dimensions. By using this result we can build efficient
algorithms for solving arbitrary systems of geometric constraints. We first find a set of
minimal disjoint dense constraint subgraphs. Each subgraph is then reduced to a supernode of
high dimension and the method is applied recursively to the resulting graph. In this way we
build a hierarchy of constraint graphs that is treated bottom-up or top-down based on the
application. Inter-feature 3D constraints result in systems of 3D constraints. Such systems are
very hard to solve with graph-constructive methods since there is not even a necessary and
sufficient condition for well-constrainedness in 3D. By using the decomposition suggested by
this approach we may break down the large geometric constraint system into a multitude of
small systems with few variables. Such systems are usually easy to solve using global
optimization with topological constraints to narrow down the root selection process.
In this direction [46] has developed a novel method for placement and routing in VLSI by
constructing a circuit hierarchy by detecting dense connectivity graphs and then employing
global optimization algorithms for each sub-problem.
4. CAD Applications from a Feature-Based/Constraint-Based
Point of View
Since the models constructed by CAD applications are most often meant for manufacturing or
production in general, it is necessary that they are robust and accurate. Also, most
applications require that the model can be modified and re-engineered. The use of constraints
and features in such CAD applications is essential and is the only representation scheme that
sufficiently supports these requirements. In this section we will present two example
approaches that adopt this modeling scheme.
4.1. Parametric Feature Based Design in Manufacturing Systems
An example of CAD application where the feature- and constraint-based representation model
is most appropriate is parametric feature-based manufacturing [47].
Parametric modeling was commonly used for the construction of complex models on
which parameters were used to provide for subsequent customization. The parameters defined
during the design and modeling process are relative to the individual geometric characteristics
of the model or to the model as a whole. For example, the parameters can control
characteristics such as length, height, width and hole radius.
On the other hand, feature-based modeling is a representation scheme based on the
combination of individual feature components. In this context a feature is a unit that can be
defined as a connected set of geometric elements (i.e. a subpart) associated with attributes that
describe its shape and behavior, such as geometry, topology, functionality and connectivity
with other features. In traditional approaches each feature is linked to a set of local parameters
that control its attribute values. Here, the feature-based model is complemented through the
use of local and global constraints. The constraints are applied locally, in reference to the
parameter values or the geometric characteristics of the primitives of the features, to impose
design- or user-defined specifics such as hole size and pocket depth, and globally, in reference
to the connectivity and the inter-feature relations of the model.
Much work has been performed on the definition of features in relation to various CAD
applications. Features are often perceived as 3D solid components that can be classified into
feature libraries depending on their shape or geometry. This point of view is described for
instance in [48], where the authors present a library of features fit for manufacturing
applications, and in [49], where design features for machining are examined. In [50] features
are defined as pierced voxels that are used to create traditional pierced jewellery. However,
features can also be defined from surfaces, which is especially common in freeform design
applications. For instance, [51] examines freeform surface features, whereas [52] presents a
taxonomy of freeform features. Other work uses the notion of feature points and feature lines
([53], [54]) for applications usually related to data segmentation for reverse engineering, or
shape deformation and manipulation. Since the definition of a feature is not strict, Hoffmann
and Joan-Arinyo in [55] suggest the use of user-defined features in feature based modeling.
Parametric and feature based modeling is an essential component of current CAD design
systems. In traditional CAD systems, CSG and Brep models are created by adding and
subtracting parts in the model and by applying transformations and various design operations.
Design intent was not a concern in these systems and therefore precise editing that involved
structural and arbitrary topological modifications of parts of the model was almost impossible
without rebuilding the model from scratch. Editing a part of the model is feasible if the design
steps are undone until the model returns to the previous state, when the part was created. This
of course is possible if the design history of the model is recorded and it is obvious that even
though editing theoretically concerns a part of the model, ultimately the whole design process
is affected. Feature based CAD systems overcome this limitation by capturing design intent.
Since the models are constructed using parameters and features, local editing is possible
without necessarily affecting the whole model. Changes are propagated through the model
based on the parameters and constraints defined in the system and based on the attributes and
connectivity of the features. Feature-based constraint-based modeling systems provide
libraries of feature components to be used in the design process and some support user-
defined features. Applications such as custom design are feasible since components of models
can be combined or re-designed to satisfy user defined preferences or requirements.
Many commercial CAD modeling systems support parametric and/or feature-based
modeling. Systems such as PRO/Engineer [56], AUTOCAD [57], IRONCAD [58], CATIA
[59], Solidworks [60], SolidEdge [61] and Alibre design [62], which have been developed
mainly for mechanical engineering, manufacturing and industrial design applications, have
been integrated with parametric and/or feature based modeling capabilities. Architectural
Desktop and AUTOCAD are CAD systems used in architectural applications that support
parametric modeling. 3D Studio Max [63] and Maya [64] are parametric feature-based
modeling systems used for artwork and animation. There are also systems that have been
developed for specific CAD applications, such as jewelry, clothing and textile design, marine
applications and furniture.
The above modeling systems are very efficient for manufacturing and production
applications. However more freeform applications, such as aesthetic and custom design, are
still challenging even with these systems. An interesting case is jewellery design. A large
number of CAD systems for jewellery design are parametric feature-based. They provide
graphical interfaces with excellent rendering capabilities. The majority of these systems
provide built-in libraries of settings and cut gems and stones and advanced feature-based
design tools. Some systems provide advanced functionality that supports the use of builders
for recording design steps and for defining parameter values for parts to be used in the
process. Also, the majority of these systems have the capability of exporting models to rapid
prototyping machines. However, in most CAD systems for jewellery, designing is performed
manually using various tools and usually the design steps cannot be programmed to be
executed automatically and accurately. This means that each different piece of jewellery has
to be created basically from the beginning by hand, making custom design applications
difficult and time-consuming. Also these systems require that the user has designing skills or
knowledge of using CAD systems. In the following we will present an interesting example of
a jewellery application that is difficult to carry out with existing CAD systems: the construction
of traditional pierced jewellery.
Figure 1. Using a chisel to create carvings around a hole.
Figure 2. A structural element (feature).
In [50] ByzantineCAD, a feature-based CAD system suitable for the design of pierced
Byzantine jewellery, is presented. The system is automated and parametric, meaning that the
user-designer sets some parameter values and ByzantineCAD creates the jewellery model that
corresponds to the specified values. This provides the designer with the ability to rapidly
create custom-designed jewellery, based on the preferences of the customers such as
including their initials on a ring. ByzantineCAD introduces a feature-based and voxel-based
approach to designing jewellery, through the definition of elementary structural elements with
specific attributes and properties that are used as building blocks to construct complex pierced
designs.
More specifically, pierced Byzantine jewellery are gold jewels with pierced designs that
were made along the coastlines of the eastern Mediterranean Sea during the period 3rd–7th
century A.D. Their originality is due to the particular processing technique that is used for
their creation resulting in a special aesthetic effect. Pierced jewellery was created from thin
sheets of gold. The designs were engraved on these sheets of gold with a thin sharp tool. After
the outlining of the designs, holes following their shape were created and these were
decorated with triangular carvings, using an iron chisel.
Figure 3. Pierced voxel elements such as (a) are used as features to create complex solid plaques
representing designs, e.g. letters or words, that are sized and modified appropriately to construct
custom-designed jewellery (e.g. a ring).
In ByzantineCAD a feature library of carved, pierced voxel elements is defined in
accordance with the craftsmanship used in traditional Byzantine jewellery. The design of
pierced jewellery is made up of cylindrical holes that have carvings around them. Each hole
with the corresponding carvings around it is considered, for the purposes of reconstruction, as a
structural element (feature). Each feature is a solid made of a rectangular parallelepiped with
a cylindrical hole and the corresponding carvings around the hole (figures 1, 2). According to
the aesthetic rules that characterize traditional pierced jewellery, all structural elements have
the same size but differ in the position of the hole and the carvings around it. The hole can be
located either in the center of the parallelepiped or in the center of any of the four quarters.
Note that, in terms of computer aided design and manufacturing, the cylindrical hole can be
positioned anywhere in the rectangular parallelepiped; the above restriction follows from
careful interpretation of the traditional artistic patterns used. Attributes of these feature
elements are characteristics such as the number of carvings around the cylindrical hole, the
position of the hole in the parallelepiped, the directions of the carvings and more. A large
number of different structural elements can be created by a hole and various carvings and,
since not all of these feasible feature elements are valid for use in creating pierced designs,
restrictions concerning the carving directions are defined based on aesthetic and artistic rules.
These feature elements are combined like 3D building blocks to create complex carved
plaques representing pierced designs (Figure 3). The structural elements are placed side by
side, either on top, bottom, right or left of each other, and unioned into a new object. The
rules determining how the different features can be combined are defined by the designs to be
recreated. The construction of these plaques is constrained by the parameter values defined by
the user-designer in reference to characteristics of the plaque such as length and width. The
plaques are then used to create jewellery such as rings and necklace pendants. By
parameterizing the process of creating pierced jewellery, it is very easy to modify
characteristics of the jewellery such as the size and the designs represented.
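The following sketch illustrates, in Python, the flavor of this feature-based, parametric construction. The type and attribute names are hypothetical and do not reproduce ByzantineCAD's actual data model; the sketch only shows how structural elements obeying the placement rules described above can be combined, grid-wise, into a plaque.

    from dataclasses import dataclass
    from typing import Callable, List, Tuple

    # Allowed hole positions, following the aesthetic rule quoted above: the
    # centre of the parallelepiped or the centre of one of its four quarters.
    HOLE_POSITIONS = ("center", "top-left", "top-right", "bottom-left", "bottom-right")

    @dataclass
    class PiercedVoxel:
        """Hypothetical structural element (feature) of a pierced design: a
        rectangular parallelepiped with a cylindrical hole and triangular
        carvings around it."""
        hole_position: str                    # one of HOLE_POSITIONS
        carving_directions: Tuple[str, ...]   # e.g. ("N", "E", "S", "W")

        def __post_init__(self) -> None:
            if self.hole_position not in HOLE_POSITIONS:
                raise ValueError("hole position violates the aesthetic rules")

    def build_plaque(rows: int, cols: int,
                     choose: Callable[[int, int], PiercedVoxel]) -> List[List[PiercedVoxel]]:
        """Assemble a plaque as a grid of features placed side by side; the
        parametric design is driven by choose(row, col), which returns the
        feature for each cell (encoding, e.g., a letter or motif).  In the
        real system the solids would then be unioned into a single object."""
        return [[choose(r, c) for c in range(cols)] for r in range(rows)]

    # Example: a 2 x 3 plaque in which every cell has a centred hole with
    # carvings on all four sides.
    plaque = build_plaque(2, 3,
                          lambda r, c: PiercedVoxel("center", ("N", "E", "S", "W")))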
4.2. Feature-Based Modeling for Reverse Engineering
Reverse engineering aims to analyze a real object and to determine its characteristics and
mechanisms, with the further aim of reconstructing and remanufacturing it. The data concerning the
physical object can be obtained by various methods. A common method is using a 3D laser
scanner to obtain a point cloud corresponding to points on the surface of the scanned object.
Given this, a more specific definition of reverse engineering would be to define it as the
process of obtaining a geometric CAD model from measurements acquired by scanning an
existing physical model [65].
Reverse engineering is vital for various industries because the computer models acquired
help improve the quality and efficiency of designs and also speed up the manufacturing and
analysis process. In mechanical part engineering and manufacturing, reverse engineering aims
to replicate existing parts for which no CAD models exist. Also, it is possible to manufacture
objects for which the original CAD model no longer corresponds to the physical part that was
manufactured due to subsequent undocumented modifications made after the initial design
stage. Reverse engineering is applied in industrial design, such as automobile exterior parts
design. Stylists and artists very often create physical models of their concepts by using clay,
plaster or wood. These real-scale models are then used to create CAD models for
manufacturing the objects on an industrial scale. Also the CAD models provide the artists and
stylists with the ability to re-evaluate their designs, especially when they can easily re-design
or modify them as needed. Reverse engineering encourages conceptual design because the
designer creates an initial prototype, scans it and manipulates it as desired.
Re-engineering objects of freeform design is relatively more difficult and complex than
reconstructing mechanical parts. Mechanical parts usually have specific geometric
characteristics, such as symmetries and swept profiles that are fairly easy to detect and
parameterize. However, in the case of freeform objects, there are often features that are
difficult to extract. In the case of restricted mechanical parts, features libraries can be defined
and used for detecting features in the point cloud, whereas for freeform objects this is not
feasible.
Since reverse engineering applications aim at reproduction and manufacturing, the re-
engineering process must create highly accurate and robust CAD models. A characteristic of
the objects re-engineered is that they are usually parts of larger objects and therefore have to
fit and connect exactly with other parts, like pieces in a puzzle. For this reason, the models
created through the reverse engineering process must be well defined and constrained [66].
Also, in some cases, the model created is to be edited further, e.g. for custom design purposes, so a
simple B-rep model is not appropriate. Given this, it is natural that feature-based and
constraint-based models are used more and more in re-engineering applications.
A feature-based reverse engineering method was also used in [67] for reverse engineering
a mannequin for garment design. The basic concept in this method is to create a generic
mannequin model of a human torso, which is appropriately aligned with the 3D point cloud of
the desired human torso model, and the generic model is “fitted” to the point cloud by
matching up characteristic points of the models e.g. peaks. This method creates parameterized
models by exploiting the features of the object and by using them to constrain the fitting
process. It is an automated approach to reverse engineering human torsos that creates
parameterized models with good accuracy.
Constraint definition and application have been used in building reconstruction and in reverse
engineering objects of aesthetic design. Specifically, [68] examines how a priori knowledge
can be used to derive constraints to create more accurate models in architectural applications.
Relevant work is also found in [69]. In [70] the authors suggest a constraint-based approach
in reverse engineering for model beautification.
Figure 4. A point cloud of a screwdriver [74] (left) and its concavity intensity map (right).
A paradigm of feature-based/constraint-based re-engineering is the REFAB ([71], [48])
project, which uses a feature-based and constraint-based method to reverse engineer
mechanical parts. REFAB is a human interactive system where the 3D point cloud is
presented to the user, and the user selects from a list a feature that exists in the cloud,
specifies with the mouse the approximate location of the feature in the point cloud, and the
system then fits the specified feature to the actual point cloud data using an iterative
least-squares method. The authors place particular emphasis on the fitting of pockets, where the user draws
a profile of the pocket on the point cloud; the system then fits the profile to the data, and
the profile is then extruded to create the pocket. This feature-fitting process is made more
accurate by using constraints that are detected by the system, verified by the user and then
exploited to achieve a better fitting of the features according to the data. The system supports
constraints [72] such as parallelism, concentricity, perpendicularity and symmetry. The
constraints defined and used in REFAB seek to reduce the degrees of freedom associated with
the object as much as possible, so as to achieve high precision models in less time.
An interesting feature-based approach to re-engineering objects of freeform design is
presented in [73]. Re-engineering objects of freeform design is essential for supporting
custom design in a CAD model reconstruction system. It provides user-designers with the
capability to modify re-engineered CAD models according to their preferences and to
incorporate them into novel designs. For instance, in the case of jewellery re-engineering, the user-
designer might like to be able to modify the dimensions of a ring to produce one of larger
size, or be able to choose certain parts of the object to use them to create other pieces of
jewellery. To this end, one needs to exploit the features of the original model and the
relationships and constraints that hold among them. In [73] a generic and global feature
detection approach to reverse engineering point clouds of objects of freeform design is
presented. A method is presented for detecting and segmenting a point cloud into individual
subsets that correspond to features. This is achieved by using a point characteristic defined as
“concavity intensity” to decompose the point cloud into subsets (components) that correspond
to features of the physical object. The concavity intensity of a point corresponds to the
smallest distance from the point to the convex hull of the point cloud that does not pass through the point cloud.
This feature basically detects concave features in the object being reengineered. A concavity
intensity map is shown in figure 4. In the concavity intensity map the values are rendered
using greyscale, where black corresponds to points belonging to the convex hull, whereas
white corresponds to points that are farthest away from the convex hull. We can observe that
edges (rapid variations in concavity intensity), saddle points and extrema can be used to
partition the point cloud into components that can later be refined as the features of the object.
Figure 5. Feature extraction is performed by region growing.
The point concavity intensity values calculated are used to segment the point cloud into
feature components. A feature component is bounded by areas where abrupt changes in the
direction of the normal and/or rapid concavity intensity variations are observed. After
calculating the concavity intensities of all the points that form the point cloud, a region
growing method is applied to divide the point cloud into its components (figure 5). The
region growing method is based on two criteria (a minimal code sketch is given after the list):
i) the normal vectors of neighboring points belonging to the same region should form
an angle smaller than a threshold t, and
ii) the approximate gradients of the concavity intensity function in directions x, y and z
for neighboring points of the same region should maintain the same sign,
meaning that there are no zero crossings observed between them.
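The fragment below is a minimal sketch of such a region-growing segmentation, not the authors' implementation: it assumes that unit normals, approximate concavity-intensity gradients and a neighbour list for every point have already been computed, and that the angle threshold t is given in degrees.

import numpy as np

def grow_regions(normals, concavity_grad, neighbors, angle_threshold=15.0):
    """Split a point cloud into feature components by region growing.
    normals[i]        : unit normal of point i (3-vector)
    concavity_grad[i] : approximate gradient of the concavity intensity at point i
    neighbors[i]      : indices of the points neighbouring point i
    Returns one component label per point."""
    cos_t = np.cos(np.radians(angle_threshold))
    labels = -np.ones(len(normals), dtype=int)
    component = 0
    for seed in range(len(normals)):
        if labels[seed] != -1:
            continue
        labels[seed] = component
        stack = [seed]
        while stack:
            p = stack.pop()
            for q in neighbors[p]:
                if labels[q] != -1:
                    continue
                # criterion (i): normals of neighbouring points in one region
                # must form an angle smaller than the threshold t
                if np.dot(normals[p], normals[q]) < cos_t:
                    continue
                # criterion (ii): the concavity-intensity gradient must keep the
                # same sign in x, y and z (no zero crossing between p and q)
                if np.any(np.sign(concavity_grad[p]) * np.sign(concavity_grad[q]) < 0):
                    continue
                labels[q] = component
                stack.append(q)
        component += 1
    return labels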
5. Conclusion
We presented a survey of representation schemes for CAD applications. Then we introduced
a framework for representing and performing complex design operations in CAD systems.
This framework has evolved from the feature-based design paradigm augmented with
arbitrary intra-feature and inter-feature constraints and a hierarchical constraint analysis
approach for placing geometric elements. To demonstrate the employment of this framework
we presented two example applications that we have developed using this new design
concept.
References
[1] Cripps R.J., Algorithms to support point-based cadcam, International Journal of
Machine Tools and Manufacture 2003, 43(4), 425-432.
[2] Kobbelt L. and Botsch M., A survey of point-based techniques in computer graphic,
Computers and Graphics 2004, 28(6), 801-814.
[3] Hoppe H., Derose T., Duchamp T., McDonald J. and Stuetzle W., Surface
reconstruction from unorganized points, Computer Graphics 1992, 26(2), 71-78.
[4] Benko P., Martin R.R. and Varady T., Algorithms for reverse engineering boundary
representation models, Computer-Aided Design 2001, 33(11), 839-851.
[5] Langbein F.C., Marshall A.D. and Martin R.R., Choosing consistent constraints for
beautification of reverse engineered geometric models, Computer-aided Design 2004,
36(3), 261-278.
[6] Jense G.J., Voxel-based methods for CAD, Computer Aided Design 1989, 21(10), 528-
533.
[7] Hoffmann C.M. and Joan-Arinyo R., Erep: An editable high-level representation for
geometric design, Geometric Modeling for Product Realization, P. Wilson, M. Wozny ,
M. Pratt, eds., North Holland, 1993, 129-164.
[8] Benko P., Kos G., Varady T., Andor L. and Martin R.R., Constrained fitting in reverse
engineering, Computer-Aided Design 2002, 19(3), 173-205.
[9] Anderl R. and Mendgen R., Modelling with constraints: theoretical foundation and
application, Artificial Intelligence in Computer-Aided Design 1996, 28(3), 155-168.
[10] Cornea N.D., Demirci M.F., Silver D., Shokoufandeh A., Dickinson S.J. and Kantor
P.B., 3D Object Retrieval using Many-to-many Matching of Curve Skeletons, IEEE
International Conference on Shape Modeling and Applications (SMI) 2005, Boston
USA, 368-373.
[11] Bloomenthal J., Medial-based Vertex Deformation, Proceedings of the 2002 ACM
SIGGRAPH/Eurographics Symposium on Computer Animation 2002, San Antonio, TX,
USA.
[12] Sheehy D., Armstrong C. and Robinson D., Shape description by medial axis
construction, IEEE Transactions on Visualization and Computer Graphics 1996, 2(1),
62-72.
[13] Storti D.W., Turkiyyah G.M., Ganter M.A., Lim C.T. and Stal D.M., Skeleton-based
modeling operations on solids, Solid Modeling '97 1997, Atlanta GA USA.
[14] Lien J., Keyser J. and Amato N.M., Simultaneous Shape Decomposition and
Skeletonization, Proceedings of the ACM Syposium on Solid and Physical Modeling
2006. Cardiff, Wales, United Kingdom.
[15] Yoshizawa S., Belyaev A.G. and Seidel H-P., Free-form skeleton-driven mesh
deformation, Proceedings of the eighth ACM Symposium on Solid Modeling and
applications 2003, Seattle, Washington U.S.A.
[16] Fudos I., Constraint Solving for Computer Aided Design, Dept of Computer Sciences,
Purdue University, 1995, PhD Thesis.
[17] Sutherland I., Sketchpad, a man-machine graphical communication system. Proceedings
of the spring Joint Compo Conference, 1963.
[18] Nelson, G., Juno, a constraint-based graphics system, SIGGRAPH 1985, San Francisco
USA.
[19] Serrano D. and Gossard D., Combining mathematical models and geometric models in
CAE systems, Proc. ASME Computers in Eng. Conf 1986, Chicago USA.
[20] Beaty P.L., Fitzhorn P.A. and Herron G.J., Extensions in variational geometry that
generate and modify object edges composed of rational Bezier curves, Computer-Aided
Design 1994, 26(2), 98-108.
[21] Light R. and Gossard D., Modification of geometric models through variational
geometry, Computer Aided Design 1982, 14(4), 209-214.
[22] Chyz W., Constraint management for CSG, MIT, 1985, Master's Thesis.
[23] Allgower E.L. and Georg K., Continuation and path following, Acta Numerica 1993, 1-
64.
[24] Lamure H. and Michelucci D., Solving geometric constraints by homotopy, Proc. Third
Symposium on Solid Modeling and Applications 1995, Salt Lake City, USA.
[25] Morgan A., Solving polynomial systems using continuation for engineering and
scientific problems 1987, Prentice Hall Inc.
[26] Bruderlin B. and Roller D., Geometric constraint solving and applications 1998,
Springer Verlag.
[27] Bruderlin B., Constructing three-dimensional geometric objects defined by constraints,
In Workshop on Interactive 3D Graphics 1986.
[28] Sohrt W., Interaction with constraints in three-dimensional modeling, Dept of Computer
Science. The University of Utah, 1991, Master's Thesis.
[29] Bruderlin B., Using geometric rewrite rules for solving geometric problems
symbolically, Theoretical Computer Science 1993, 116, 291-303.
[30] Aldefeld, B., Variation of geometries based on a geometric-reasoning method,
Computer-Aided Design 1988, 20(3), 117-126.
[31] Suzuki H., Ando H. and Kimura F., Variation of geometries based on a geometric-
reasoning method, Computers & Graphics 1990, 14(2), 211-224.
[32] Sunde G., Specification of shape by dimensions' and other geometric constraints, in
Geometric modeling for CAD applications 1988, M. J. Wozny, H. W. McLaughlin, and
J. L. Encarnacao, eds., North Holland, IFIP, 199-213.
[33] Yamaguchi Y. and Kimura F., A constraint modeling system for variational geometry,
in Geometric Modeling for Product Engineering, M. J. Wozny, J.U. Turner and K.
Preiss, eds., 1990, Elsevier Science Publishers B.V. (North Holland), 221-233.
[34] Verroust A., Schonek F. and Roller D., Rule-oriented method for parameterized
computer-aided design, Computer Aided Design 1992, 24(10), 531-540.
[35] Joan-Arinyo R. and Soto A., A rule-constructive geometric constraint solver, Technical
Report LSI-95-25-R, 1995, Universitat Politecnica de Catalunya.
[36] Fudos I. and Hoffmann C.M., Correctness proof of a geometric constraint solver,
International Journal of Computational Geometry & Applications, 1996, 405-420
[37] Fitzgerald W.J., Using axial dimensions to determine the proportions of line drawings in
computer graphics, Computer Aided Design 1981, 13(6), 377-382.
[38] Requicha, A., Dimensioning and tolerancing, Technical report PADL TM-19,
Production Automation Project 1977, University of Rochester.
[39] Todd P., A k-tree generalization that characterizes consistency of dimensioned
engineering drawings, SIAM J DISC MATH 1989, 2(2), 255-261.
[40] Owen J.C., Algebraic solution for geometry from dimensional constraints, ACM Symp.
Found. of Solid Modeling 1991, Austin, TX.
[41] D-Cubed, Ltd., 68 Castle Street, Cambridge, CB3 OAJ, England. The Dimensional
Constraint Manager, June 1994, Version 2.7.
[42] Fudos I. and Hoffmann C.M., A graph-constructive method to solving systems of
geometric constraints, ACM Transactions on Graphics 1997, 16(2), 179-216.
[43] Buchberger B., Grobner Bases: An algorithmic method in polynomial ideal theory, in
Multidimensional Systems Theory, N.K. Bose, Editor, 1985, D. Reidel Publishing
Company, 184-232.
[44] Wu W.T., Basic principles of mechanical theorem proving in elementary geometries,
Journal of Automated Reasoning 1986, 2, 221-252.
[45] Hoffmann C.M., Lomonosov A., and Sitharam M., Finding solvable subsets of
constraint graphs, L. G. Smolka, editor, 1997, Springer Verlag. 463-477.
[46] Fudos I. and Markouzis D., A hierarchical feature-based approach to computer aided
placement and routing for VLSI, Technical Report TR-2007-09, 2007, Computer
Science Department, University of Ioannina.
[47] Shah J.J. and Mantyla M., Parametric and Feature-based CAD/CAM 1995, John Wiley
& Sons Inc.
[48] Thompson W.B., Owen J.C., James de St Germain H., Stark S.R. and Henderson T.C.,
Feature-based reverse engineering of mechanical parts, IEEE Transactions on Robotics
and Automation 1999, 12(1), 57-66.
[49] Lee J.Y. and Kim K., A feature-based approach to extracting machining features,
Computer-Aided Design 1998, 30(13), 1019-1035.
[50] Stamati V. and Fudos I., A parametric feature-based CAD system for reproducing
traditional pierced jewellery, Computer Aided Design 2005, 37(4), 431-449.
[51] Nyirenda P. J., Mulbagal M. and Bronsvoort W. F., Definition of freeform surface
feature classes, Computer Aided Design & Applications 2006, 3(5), 665-674.
[52] Fontana M., Giannini F. and Meirana M., A free form feature taxonomy, In Proceedings
of EUROGRAPHICS '99, 1999.
[53] Dobson G.T., Waggenspack Jr W.N. and Lamousin H.J., Feature based models for
anatomical data fitting, Computer Aided Design 1995, 27(2), 139-146.
[54] Gumhold S., Wang X. and MacLeod R., Feature extraction from point clouds,
Proceedings of the 10th International Meshing Roundtable, 2001.
[55] Hoffmann C.M. and Joan-Arinyo R., On user-defined features, Computer Aided Design
1998, 30(5), 321-332.
[56] PTC, PRO/ENGINEER, http://www.ptc.com.
[57] Autodesk, AUTOCAD, http://www.autodesk.com/autocad.
[58] IRONCAD, IRONCAD, http://www.ironcad.com.
[59] Dassault Systems, CATIA, http://www.3ds.com/products-solutions/plm-solutions/catia/overview.
[60] Dassault Systems, Solidworks, http://www.3ds.com/products-solutions/solidworks.
[61] Siemens - UGS PLM Software, Solid Edge, http://www.solidedge.com.
[62] Alibre Inc., Alibre Design, http://www.alibre.com.
[63] Autodesk, 3D Studio Max, http://www.autodesk.com/3dsmax.
[64] Autodesk, Maya, http://www.autodesk.com/maya.
[65] Varady T., Martin R.R. and Cox J., Reverse engineering of geometric models - an
introduction, Computer Aided Design 1997, 29(4), 253-330.
[66] Werghi N., Fisher R., Robertson C. and Ashbrook A., Object reconstruction by
incorporating geometric constraints in reverse engineering, Computer Aided Design
1999, 31(6), 363-399.
[67] Au C.K. and Yuen M.M.F., Feature-based reverse engineering of mannequin for
garment design, Computer-Aided Design 1999, 31(12), 751-759.
[68] Fisher R.B., Applying knowledge to reverse engineering problems, Computer-Aided
Design 2004, 36, 501-510.
[69] Cantzler, H., Improving architectural 3D reconstruction by constrained modeling,
Institute of Perception, Action and Behaviour, School of Informatics, University of
Edinburgh, 2003, PhD Thesis.
[70] Gao C.H., Langbein F.C., Marshall A.D. and Martin R.R., Local topological
beautification of reverse engineered models, Computer-Aided Design 2004, 36(13),
1337-1355.
[71] Thompson W.B., James de St Germain H., Henderson T.C. and Owen J.C., Constructing
high-precision geometric models from sensed position data, Proceedings of the 1996
ARPA Image Understanding Workshop, 1996.
[72] James de St. Germain H., Stark S.R., Thompson W.B. and Henderson T.C., Constraint
optimization and feature-based model construction for reverse engineering, ARPA
Image Understanding Workshop, 1997.
[73] Stamati V. and Fudos I., A feature based approach to re-engineering objects of freeform
design by exploiting point cloud morphology. Proceedings of the 2007 ACM symposium
on Solid and physical modeling 2007, Beijing China.
[74] Cyberware, Cyberware Rapid 3D Scanners - Desktop 3D Scanner Samples, 1999,
http://www.cyberware.com/products/scanners/desktopSamples.html
In: Computer Animation
Editors: J.S. Wright and L.M. Hughes, pp. 177-208
ISBN 978-1-60741-559-6
© 2010 Nova Science Publishers, Inc.
Chapter 8
COMPUTER AIDED GEOMETRIC DESIGN
WITH POWELL-SABIN SPLINES
Hendrik Speleers¹,², Paul Dierckx¹ and Stefan Vandewalle¹
¹ Katholieke Universiteit Leuven, Department of Computer Science, Belgium
² Research Assistant of the Fund for Scientific Research Flanders, Belgium
Abstract
Powell-Sabin splines are bivariate $C^1$-continuous quadratic splines defined on an
arbitrary triangulation. Their construction is based on a particular split of each trian-
gle in the triangulation into six smaller triangles. In this article we give an overview
of the properties of Powell-Sabin splines in the context of computer aided geometric
design. These splines can be represented in a compact normalized B-spline basis with
an intuitive geometric interpretation involving control triangles. Using these triangles
one can interactively change the shape of the splines in a predictable way. We describe
the simple subdivision rules for Powell-Sabin splines, and discuss some applications.
We consider a new efficient spline visualization technique based on subdivision. We
also look at two useful generalizations of the Powell-Sabin splines, i.e., QHPS splines
and NURPS surfaces. The QHPS splines are a hierarchical variant of Powell-Sabin
splines. They have very similar properties to the Powell-Sabin splines, and their hier-
archical nature allows a local refinement of the spline in a very straightforward way.
The NURPS surface is the rational extension of the Powell-Sabin spline. By means
of weights it gives extra degrees of freedom to the designer for the modelling of
surfaces.
1. Introduction
The ability to represent complex surfaces on the computer is important in a broad range
of applications [10]. Tensor product B-splines are today’s most commonly used surface
splines in computer aided geometric design (CAGD) packages. They are very attractive
because of their compact representation, the ease of implementation and their efficiency.
The B-spline control net of such a surface enables local adaptations in a geometrically
intuitive way. A definite drawback, however, is that they are restricted to regular meshes
on rectangular domains. Therefore, they are not well suited to represent strongly irregular
objects defined on arbitrary domains.
Instead of using the tensor product representation on rectangular domains, one can also
describe piecewise bivariate polynomial surfaces in terms of barycentric coordinates with
respect to a triangular domain. These triangular patches in Bernstein form are called Bézier
triangles [9]. In spite of the flexibility of a triangular mesh, the representation of complex
smooth surfaces can require a large number of Bézier triangles. Imposing smoothness con-
ditions between these patches results in a large number of non-trivial continuity relations
between the coefficients.
A number of authors studied the construction of smooth bivariate spline functions on
arbitrary triangulations. A major difficulty is to determine the dimension of such spline
spaces. In general, it is not possible to express the dimension in terms of the number of
vertices and triangles in the triangulation. There are some results for particular choices
of polynomial degree and smoothness [1, 3, 13], and for particular constrained triangula-
tions [11]. Yet, in general and especially for low degree polynomials the problem remains
open. One can overcome this problem by using so-called macro-elements, where each tri-
angle in the triangulation is split in a particular way. Well-known in the finite element
literature is the $C^1$-continuous cubic Clough-Tocher spline space [5]. For $C^1$-continuous
quadratic splines, Powell and Sabin [21] constructed an element by splitting all triangles
into six subtriangles.
Powell-Sabin splines can be compactly represented in a normalized basis. Dierckx [6]
discovered that the basis splines have an intuitive geometric interpretation by means of
control triangles. Windmolders [34] was the first to investigate the use of Powell-Sabin
splines in CAGD applications. Many new interesting properties and techniques have been
developed since, e.g., a triadic subdivision scheme. We will overview them in the context
of computer aided design and modelling. Section 2. is devoted to the Powell-Sabin spline
space. It recalls some basic concepts of Powell-Sabin splines and the construction of a suit-
able normalized basis. In section 3. we describe a subdivision scheme for the splines, and
we discuss some applications. The next sections cover two generalizations of Powell-Sabin
splines. Section 4. discusses a hierarchical variant of the splines, called QHPS splines. They
retain similar properties as the Powell-Sabin splines, but they are defined on a hierarchical
triangulation. These splines are very useful in an adaptive local refinement strategy. In
section 5. we consider NURPS surfaces, a rational extension of Powell-Sabin splines. They
give the designer more degrees of freedom by means of extra weights.
2. Powell-Sabin Splines
In this section we detail the theory of Powell-Sabin splines. We first recall some general
concepts of polynomials on triangles in their Bernstein-Bézier representation. Powell-Sabin
splines are defined on arbitrary triangulations refined with a particular split. We discuss a
normalized basis for this spline space, and we give an overview of their properties.
2.1. Polynomials on Triangles
Barycentric coordinates provide an elegant tool for defining points inside a triangle. Let $T(V_1, V_2, V_3)$ be a non-degenerate triangle. An arbitrary point $P$ in the plane of the triangle can be uniquely expressed in terms of the barycentric coordinates $\tau = (\tau_1, \tau_2, \tau_3)$ with respect to $T$, such that

$$P = \sum_{i=1}^{3} \tau_i V_i, \quad \text{and} \quad \tau_1 + \tau_2 + \tau_3 = 1. \qquad (2.1)$$

If the point $P$ lies inside the triangle $T$, then its barycentric coordinates are all positive. Consider two points in the plane of the triangle, i.e., $P_1$ and $P_2$. The barycentric direction $\delta = (\delta_1, \delta_2, \delta_3)$ of the vector $P_2 - P_1$ with respect to $T$ is defined as the difference of the barycentric coordinates of both points. If the Euclidean distance $\|P_2 - P_1\| = 1$, then $\delta$ is called a unit barycentric direction.

Let $\Pi_d$ denote the linear space of bivariate polynomials of total degree less than or equal to $d$. Any polynomial $p_d \in \Pi_d$ on triangle $T$ has a unique Bernstein-Bézier representation [9],

$$p_d(\tau) = \sum_{i+j+k=d} b_{ijk}\, B^d_{ijk}(\tau), \qquad (2.2)$$

with

$$B^d_{ijk}(\tau) = \frac{d!}{i!\,j!\,k!}\, \tau_1^i\, \tau_2^j\, \tau_3^k \qquad (2.3)$$

the Bernstein polynomials on the domain triangle $T$. The coefficients $b_{ijk}$ are called Bézier ordinates, and the Bézier domain points $\xi_{ijk}$ are defined as the points with barycentric coordinates $\left(\tfrac{i}{d}, \tfrac{j}{d}, \tfrac{k}{d}\right)$. By associating each Bézier ordinate $b_{ijk}$ with the Bézier domain point $\xi_{ijk}$, we can display the Bernstein-Bézier representation schematically as in Figure 1(a) for the case $d = 2$. The piecewise linear interpolant of the Bézier control points, defined as $(\xi_{ijk}, b_{ijk})$, is called the Bézier control net. This control net is tangent to the polynomial surface $z = p_d(\tau)$ at the three vertices of the triangle, and it mimics the shape of the Bernstein-Bézier surface. Figure 1(b) shows such a surface together with its control points and control net.

Polynomials in their Bernstein-Bézier representation (2.2) can be evaluated using the recursive de Casteljau algorithm [4], i.e.,

$$p_d(\tau) = b^d_{0,0,0}(\tau), \qquad (2.4a)$$

where

$$b^0_{i,j,k}(\tau) = b_{ijk}, \qquad i + j + k = d, \qquad (2.4b)$$
$$b^r_{i,j,k}(\tau) = \tau_1\, b^{r-1}_{i+1,j,k}(\tau) + \tau_2\, b^{r-1}_{i,j+1,k}(\tau) + \tau_3\, b^{r-1}_{i,j,k+1}(\tau), \qquad i + j + k = d - r, \quad r = 1, \ldots, d. \qquad (2.4c)$$

This algorithm is numerically stable and has many interesting properties [9]. Besides evaluation, the algorithm can also be used for computing derivatives and for obtaining continuity conditions on neighbouring triangular patches.
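As a small illustration of the recursion (2.4), the following sketch evaluates a quadratic Bernstein-Bézier polynomial on a triangle. The helper that computes barycentric coordinates and the example ordinates are our own additions, not taken from the chapter.

import numpy as np

def barycentric(P, triangle):
    """Barycentric coordinates of the planar point P with respect to a triangle
    given by its three corner points, following (2.1)."""
    T = np.asarray(triangle, dtype=float)
    A = np.vstack((T.T, np.ones(3)))              # rows: x-coords, y-coords, ones
    return np.linalg.solve(A, np.array([P[0], P[1], 1.0]))

def de_casteljau(bezier_ordinates, tau, d=2):
    """Evaluate the polynomial (2.2) by repeated convex combinations (2.4).
    bezier_ordinates maps (i, j, k) with i+j+k = d to the ordinate b_ijk."""
    b = dict(bezier_ordinates)
    for r in range(1, d + 1):
        b = {(i, j, d - r - i - j):
                 tau[0] * b[(i + 1, j, d - r - i - j)]
               + tau[1] * b[(i, j + 1, d - r - i - j)]
               + tau[2] * b[(i, j, d - r - i - j + 1)]
             for i in range(d - r + 1) for j in range(d - r + 1 - i)}
    return b[(0, 0, 0)]

# example: a quadratic evaluated at the centroid of its domain triangle
ordinates = {(2, 0, 0): 1.0, (0, 2, 0): 0.5, (0, 0, 2): 0.0,
             (1, 1, 0): 0.8, (1, 0, 1): 0.3, (0, 1, 1): 0.2}
tau = barycentric((1.0, 1.0), [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)])
value = de_casteljau(ordinates, tau)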
Figure 1. (a) Schematic representation of a quadratic bivariate polynomial by means of its Bézier ordinates $b_{ijk}$. (b) A quadratic Bernstein-Bézier polynomial with its control points and control net.
Representing complex surfaces requires the use of a large number of Bézier triangles.
Preserving a certain degree of continuity between all patches then results in a large set of
non-trivial relations between their Bézier ordinates. Therefore, one has looked for piecewise
polynomials with inherent global continuity conditions.
2.2. The Powell-Sabin Spline Space
Consider a simply connected subset $\Omega \subset \mathbb{R}^2$ with polygonal boundary $\partial\Omega$. Assume a conforming triangulation $\Delta$ of $\Omega$ is given, consisting of $t$ triangles $T_j$, with $j = 1, \ldots, t$, and having $n$ vertices $V_k$, with $k = 1, \ldots, n$. A triangulation is conforming if no triangle contains a vertex different from its own three vertices.

The Powell-Sabin (PS) refinement $\Delta^*$ of $\Delta$ partitions each triangle $T_j$ into six smaller triangles in the following way:

1. Choose an interior point $Z_j$ in each triangle $T_j$, so that if two triangles $T_i$ and $T_j$ have a common edge, then the line joining $Z_i$ and $Z_j$ intersects the common edge at a point $R_{ij}$ between its vertices.

2. Join each point $Z_j$ to the vertices of $T_j$.

3. For each edge of the triangle $T_j$
(a) which belongs to the boundary $\partial\Omega$: join $Z_j$ to an arbitrary point on that edge;
(b) which is common to a triangle $T_i$: join $Z_j$ to $R_{ij}$.

Figure 2(a) displays a triangulation with 8 triangles, and a corresponding PS refinement containing 48 triangles.

The space of piecewise quadratic polynomials on $\Delta^*$ with global $C^1$-continuity is called the Powell-Sabin spline space:

$$S^1_2(\Delta^*) := \left\{ s \in C^1(\Omega) : s|_{T^*_j} \in \Pi_2,\; T^*_j \in \Delta^* \right\}. \qquad (2.5)$$
Figure 2. (a) A PS refinement $\Delta^*$ (in dashed lines) of a given triangulation $\Delta$ (in solid lines). (b) The PS points (bullets) and a set of suitable PS triangles (shaded).
Each of the $6t$ triangles resulting from the PS refinement is the domain triangle of a quadratic Bernstein-Bézier polynomial. Powell and Sabin [21] proved that the following interpolation problem

$$s(V_k) = f_k, \quad \frac{\partial s}{\partial x}(V_k) = f_{x,k}, \quad \frac{\partial s}{\partial y}(V_k) = f_{y,k}, \qquad k = 1, \ldots, n, \qquad (2.6)$$

has a unique solution $s(x, y) \in S^1_2(\Delta^*)$ for any given set of $n$ $(f_k, f_{x,k}, f_{y,k})$-triplets. Hence, the dimension of the Powell-Sabin spline space $S^1_2(\Delta^*)$ equals $3n$.
2.3. A B-spline Representation
Dierckx [6] presented a geometric method to construct a normalized basis for the spline space $S^1_2(\Delta^*)$. Every Powell-Sabin spline can then be represented as

$$s(x, y) = \sum_{i=1}^{n} \sum_{j=1}^{3} c_{i,j}\, B^j_i(x, y). \qquad (2.7)$$

To obtain the basis functions $B^j_i(x, y)$, we associate with each vertex $V_i$ three linearly independent triplets $(\alpha_{i,j}, \beta_{i,j}, \gamma_{i,j})$, $j = 1, 2, 3$. These triplets are determined as follows:

1. For each vertex $V_i \in \Delta$ with Cartesian coordinates $(x_i, y_i)$, find the corresponding PS points. These points are the immediately surrounding Bézier domain points of $V_i$ in the PS refinement $\Delta^*$. The vertex $V_i$ itself is also a PS point. In Figure 2(b) the PS points are indicated as bullets.

2. For each vertex $V_i$, find a triangle $t_i(Q_{i,1}, Q_{i,2}, Q_{i,3})$ that contains all the PS points of $V_i$. These triangles $t_i$, $i = 1, \ldots, n$, are called PS triangles, and we denote their vertices as $Q_{i,j} = (X_{i,j}, Y_{i,j})$. Figure 2(b) shows a possible set of PS triangles.

3. The three linearly independent triplets $(\alpha_{i,j}, \beta_{i,j}, \gamma_{i,j})$, $j = 1, 2, 3$, are derived from the PS triangle $t_i$ of vertex $V_i$ as follows:
• $\alpha_i = (\alpha_{i,1}, \alpha_{i,2}, \alpha_{i,3})$ are the barycentric coordinates of $V_i$ with respect to $t_i$,
• $\beta_i = (\beta_{i,1}, \beta_{i,2}, \beta_{i,3})$ and $\gamma_i = (\gamma_{i,1}, \gamma_{i,2}, \gamma_{i,3})$ are the unit barycentric directions with respect to $t_i$, in the $x$- and $y$-direction respectively.

Practically, they can be computed as

$$\alpha_i = \left( \frac{1}{E}\begin{vmatrix} x_i & y_i & 1 \\ X_{i,2} & Y_{i,2} & 1 \\ X_{i,3} & Y_{i,3} & 1 \end{vmatrix},\; \frac{1}{E}\begin{vmatrix} X_{i,1} & Y_{i,1} & 1 \\ x_i & y_i & 1 \\ X_{i,3} & Y_{i,3} & 1 \end{vmatrix},\; \frac{1}{E}\begin{vmatrix} X_{i,1} & Y_{i,1} & 1 \\ X_{i,2} & Y_{i,2} & 1 \\ x_i & y_i & 1 \end{vmatrix} \right),$$

$$\beta_i = \left( \frac{Y_{i,2}-Y_{i,3}}{E},\; \frac{Y_{i,3}-Y_{i,1}}{E},\; \frac{Y_{i,1}-Y_{i,2}}{E} \right), \qquad \gamma_i = \left( \frac{X_{i,3}-X_{i,2}}{E},\; \frac{X_{i,1}-X_{i,3}}{E},\; \frac{X_{i,2}-X_{i,1}}{E} \right),$$

with

$$E = \begin{vmatrix} X_{i,1} & Y_{i,1} & 1 \\ X_{i,2} & Y_{i,2} & 1 \\ X_{i,3} & Y_{i,3} & 1 \end{vmatrix}.$$

The Powell-Sabin B-spline $B^j_i(x, y)$ is defined as the unique solution of the interpolation problem (2.6) with all $(f_k, f_{x,k}, f_{y,k}) = (0, 0, 0)$ except for $k = i$, where $(f_i, f_{x,i}, f_{y,i}) = (\alpha_{i,j}, \beta_{i,j}, \gamma_{i,j}) \neq (0, 0, 0)$. Figure 3 shows an example of three linearly independent Powell-Sabin B-splines corresponding to the same vertex.
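As a small illustration (our own sketch, not code from the chapter), the determinant formulas above translate directly into a routine that computes the three triplets from a vertex and the corners of its PS triangle.

import numpy as np

def ps_triplets(vertex, ps_triangle):
    """Compute (alpha, beta, gamma) of a vertex V_i from its PS triangle
    t_i(Q_{i,1}, Q_{i,2}, Q_{i,3}).  `vertex` is (x_i, y_i); `ps_triangle`
    is a 3x2 array with the corners Q_{i,j} = (X_{i,j}, Y_{i,j})."""
    Q = np.asarray(ps_triangle, dtype=float)
    V = np.asarray(vertex, dtype=float)
    X, Y = Q[:, 0], Q[:, 1]
    # E: determinant of the PS triangle (twice its signed area)
    E = np.linalg.det(np.column_stack((X, Y, np.ones(3))))
    # alpha: barycentric coordinates of V_i with respect to t_i
    alpha = np.empty(3)
    for j in range(3):
        M = np.column_stack((X, Y, np.ones(3)))
        M[j, :] = (V[0], V[1], 1.0)
        alpha[j] = np.linalg.det(M) / E
    # beta, gamma: unit barycentric directions in the x- and y-direction
    beta = np.array([Y[1] - Y[2], Y[2] - Y[0], Y[0] - Y[1]]) / E
    gamma = np.array([X[2] - X[1], X[0] - X[2], X[1] - X[0]]) / E
    return alpha, beta, gamma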
Choice of PS triangles. The set of PS triangles is not uniquely defined for a given PS
refinement. One possibility for their construction is to calculate triangles of minimal area,
the so-called optimal PS triangles [6]. Computationally, this problem leads to a quadratic
programming problem. An alternative (and easier to implement) solution is given in [31],
where the sides of the PS triangle are found by connecting two neighbouring PS points.
From a practical point of view, other choices may be more appropriate. A particular
choice of the PS triangles can, e.g., simplify the treatment of boundary conditions [28]. In
such a case it is better to construct PS triangles at the boundary vertices with one side tan-
gential and another normal to the boundary curve. For quasi-interpolation [20] the corners
of each PS triangle are preferred to be chosen on edges of the triangulation.
2.4. Properties of the B-spline Basis
The Powell-Sabin B-splines have some nice properties, which are very useful in CAGD
and approximation applications. In this section we review some of them. Later on, in
section 3., we will discuss in more detail subdivision rules for Powell-Sabin splines in the
representation (2.7).
Figure 3. (a) A given triangulation with PS refinement. (b)-(d) The three Powell-Sabin B-splines $B^j_i(x, y)$ corresponding to the central vertex $V_i$ and its PS triangle. The contour lines of the basis functions are depicted.
Local support. It is easy to see that each Powell-Sabin B-spline $B^j_i(x, y)$ has a local support, because the basis function is zero outside the molecule $M_i$ of vertex $V_i$. The molecule (also called 1-ring) is defined as the union of all triangles in the triangulation that contain $V_i$.
Convex partition of unity. Dierckx showed in [6] that the proposed basis forms a convex partition of unity on the domain $\Omega$, i.e.,

$$B^j_i(x, y) \geq 0, \quad \text{and} \quad \sum_{i=1}^{n} \sum_{j=1}^{3} B^j_i(x, y) = 1, \quad \text{for all } (x, y) \in \Omega. \qquad (2.8)$$

The fact that each PS triangle $t_i$ contains all PS points of vertex $V_i$ guarantees the positivity of the basis functions.
Stability. Maes et al. [19] proved that the basis is $L_\infty$-stable. For the max-norms

$$\|c\|_\infty = \max_{i,j} |c_{i,j}|, \quad \text{and} \quad \|s(x, y)\|_\infty = \max_{\Omega} |s(x, y)|,$$

they showed that for all choices of the coefficient vector $c$, one has that

$$K_1\, \|c\|_\infty \leq \|s(x, y)\|_\infty \leq K_2\, \|c\|_\infty, \qquad (2.9)$$

where $K_2 = 1$, and $K_1$ depends only on the smallest angle $\theta_\Delta$ in the triangulation $\Delta$ and on the size of the PS triangles. Moreover, the smaller the PS triangles the better (the larger) the stability constant. In [25] an adapted version of the proof is given, resulting in a sharper stability bound. The ratio $K_2/K_1$ yields an upper bound for the condition number of the basis. It reflects the influence of a change in the coefficients on the magnitude of the corresponding spline with respect to the $L_\infty$-norm. In [26, 32] the stability of the Powell-Sabin spline basis is proven in the more general $L_p$-norm with $1 \leq p \leq \infty$.

A stable local basis provides spline approximations of smooth functions with an optimal order [16].

Figure 4. A Powell-Sabin spline and its PS control triangles.
PS control triangles. Referring to the representation (2.7) for Powell-Sabin splines, we define control points as

$$\mathbf{c}_{i,j} = (Q_{i,j}, c_{i,j}). \qquad (2.10)$$

These points lead to PS control triangles $T_i(\mathbf{c}_{i,1}, \mathbf{c}_{i,2}, \mathbf{c}_{i,3})$, which are tangent to the spline surface $z = s(x, y)$ at the vertices $V_i$. The projections of the control triangles $T_i$ onto the $(x, y)$-plane are simply the PS triangles $t_i$. Using these control triangles a designer can interactively change the shape of a given Powell-Sabin spline locally in a predictable way. From property (2.8) it follows that the graph of the spline (2.7) lies inside the convex hull of its control points $\mathbf{c}_{i,j}$.

Figure 4 shows a Powell-Sabin spline surface together with the corresponding control triangles. The spline is taken from [7] and represents a smooth approximation of the function

$$\left[ \exp\bigl((x - 0.52)^2 + (y - 0.48)^2\bigr) - 0.95 \right]^{-1}$$

on the domain $[-1, 1] \times [-1, 1]$.
Figure 5. (a) PS refinement of a triangle $T(V_i, V_j, V_k)$, together with the PS triangle $t_i(Q_{i,1}, Q_{i,2}, Q_{i,3})$ associated with vertex $V_i$. (b) Schematic representation of the Bézier ordinates of a Powell-Sabin spline.
Other bases for the Powell-Sabin spline space can be found in the literature [2, 17]. Their
construction is based on so-called minimal determining sets. They are stable, but they do
not form a convex partition of unity and they have no geometric interpretation via control
triangles.
2.5. A Bernstein-Bézier Representation

For further manipulation (e.g. evaluation and differentiation) of a Powell-Sabin spline in the form (2.7), we can write the spline in a Bernstein-Bézier representation.

We consider a single domain triangle $T(V_i, V_j, V_k) \in \Delta$ with its PS refinement in $\Delta^*$. The other triangles in the triangulation can be treated in the same way. We assume that the points indicated in Figure 5(a) have the following barycentric coordinates:

$$V_i(1, 0, 0), \quad V_j(0, 1, 0), \quad V_k(0, 0, 1), \quad Z_{ijk}(z_i, z_j, z_k),$$
$$R_{ij}(\lambda_{ij}, \lambda_{ji}, 0), \quad R_{jk}(0, \lambda_{jk}, \lambda_{kj}), \quad R_{ki}(\lambda_{ik}, 0, \lambda_{ki}).$$

On each of the six triangles in $\Delta^*$ the Powell-Sabin spline is a quadratic polynomial, that can be represented in its Bernstein-Bézier formulation, i.e., with $d = 2$ in equations (2.2) and (2.3). The value of the corresponding Bézier ordinates is derived in [6]. The outcome is schematically represented in Figure 5(b), with

$$s_i = \alpha_{i,1}\, c_{i,1} + \alpha_{i,2}\, c_{i,2} + \alpha_{i,3}\, c_{i,3}, \qquad (2.11a)$$
$$u_i = L_{i,1}\, c_{i,1} + L_{i,2}\, c_{i,2} + L_{i,3}\, c_{i,3}, \qquad (2.11b)$$
$$v_i = L'_{i,1}\, c_{i,1} + L'_{i,2}\, c_{i,2} + L'_{i,3}\, c_{i,3}, \qquad (2.11c)$$
$$w_i = \tilde{L}_{i,1}\, c_{i,1} + \tilde{L}_{i,2}\, c_{i,2} + \tilde{L}_{i,3}\, c_{i,3}. \qquad (2.11d)$$

The values of $(\alpha_{i,1}, \alpha_{i,2}, \alpha_{i,3})$, $(L_{i,1}, L_{i,2}, L_{i,3})$, $(L'_{i,1}, L'_{i,2}, L'_{i,3})$ and $(\tilde{L}_{i,1}, \tilde{L}_{i,2}, \tilde{L}_{i,3})$ are found as the barycentric coordinates of the PS points $V_i$, $S_i$, $S'_i$ and $\tilde{S}_i$, respectively, with respect to the PS triangle $t_i(Q_{i,1}, Q_{i,2}, Q_{i,3})$. These points are depicted in Figure 5(a). Analogously, we can compute the values of $(s_j, u_j, v_j, w_j)$ and $(s_k, u_k, v_k, w_k)$. The other Bézier ordinates are derived from the inherent continuity conditions of the Powell-Sabin spline, e.g.,

$$r_k = \lambda_{ij}\, u_i + \lambda_{ji}\, v_j, \qquad (2.11e)$$
$$\theta_k = \lambda_{ij}\, w_i + \lambda_{ji}\, w_j, \qquad (2.11f)$$
$$\omega = z_i\, w_i + z_j\, w_j + z_k\, w_k. \qquad (2.11g)$$

In this Bernstein-Bézier representation the Powell-Sabin splines can be easily manipulated using the de Casteljau algorithm (2.4).
2.6. Parametric Powell-Sabin Surfaces
A parametric Powell-Sabin surface is defined as

$$\begin{cases} x = \sum_{i=1}^{n} \sum_{j=1}^{3} c^x_{i,j}\, B^j_i(u, v) \\ y = \sum_{i=1}^{n} \sum_{j=1}^{3} c^y_{i,j}\, B^j_i(u, v) \\ z = \sum_{i=1}^{n} \sum_{j=1}^{3} c^z_{i,j}\, B^j_i(u, v) \end{cases} \qquad (u, v) \in \Omega, \qquad (2.12)$$

or, compactly,

$$\mathbf{s}(u, v) = \sum_{i=1}^{n} \sum_{j=1}^{3} \mathbf{c}_{i,j}\, B^j_i(u, v), \qquad (2.13)$$

where the $\mathbf{c}_{i,j} = (c^x_{i,j}, c^y_{i,j}, c^z_{i,j})$ are again called control points. Referring to (2.7), the graph of a Powell-Sabin spline is a particular case of the parametric Powell-Sabin surface, notably with $x = u$ and $y = v$.

A parametric surface $\mathbf{s}(u, v)$ lies within the convex hull of its control points. We can associate a control triangle $T_i(\mathbf{c}_{i,1}, \mathbf{c}_{i,2}, \mathbf{c}_{i,3})$ with each vertex $V_i$ in the parameter domain. This triangle is tangent to the surface at $\mathbf{s}(V_i)$. Note that in the parametric setting the choice of the control points $\mathbf{c}_{i,j}$ is completely free, whereas for Powell-Sabin splines only the $z$-component of the control points can be chosen. Figure 6 depicts a torus modelled using a parametric Powell-Sabin surface with 16 control triangles.

Parametric Powell-Sabin surfaces can be easily manipulated by applying the algorithms for Powell-Sabin splines separately on the three components in (2.12).
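A point on a parametric Powell-Sabin surface is thus simply a convex combination of the control points. Assuming the basis function values at a parameter point have already been computed (e.g. via the Bernstein-Bézier form of section 2.5.), the component-wise evaluation of (2.13) reduces to a weighted sum; the function below is our own illustrative sketch, not code from the chapter.

import numpy as np

def evaluate_surface_point(basis_values, control_points):
    """Evaluate s(u, v) of (2.13) at one parameter point.
    basis_values[i, j]   : B^j_i(u, v), array of shape (n, 3)
    control_points[i, j] : 3D control point c_{i,j}, array of shape (n, 3, 3)
    Returns the surface point (x, y, z)."""
    B = np.asarray(basis_values, dtype=float)
    C = np.asarray(control_points, dtype=float)
    return np.einsum('ij,ijd->d', B, C)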
3. Spline Subdivision
A natural question that comes up in many applications is how to represent a spline function
on a refinement $\Delta_1$ of the given triangulation $\Delta_0$. A procedure to do that is called a sub-
division scheme. Windmolders and Dierckx [35] considered the subdivision problem for
uniform Powell-Sabin splines with a dyadic scheme. Recently, this was used to construct a
tangent subdivision scheme for parameter free surfaces [30].
However, the dyadic refinement scheme is generally not applicable for non-uniform
Powell-Sabin splines. Vanraes et al. [33] presented a global triadic subdivision scheme for
general Powell-Sabin splines, which was extended to a local adaptive scheme in [23]. First,
we explain the refinement strategy for a given triangulation, and then we determine the
corresponding PS control triangles of the new spline representation such that the original
spline surface is preserved.

Figure 6. A torus modelled by a parametric Powell-Sabin surface. The control triangles and the triangular mesh lines are shown.
3.1. Refinement Rules of the Triangulation
General subdivision for Powell-Sabin splines is based on the so-called $\sqrt{3}$ refinement scheme [14, 15]. Such a refinement proceeds as follows:

1. Split every triangle into three subtriangles by inserting a new vertex $V_{ijk}$ inside the old triangle $T(V_i, V_j, V_k)$, and connect it to the surrounding old vertices. For example, in the Powell-Sabin case the new vertex will be located at the interior point $Z_{ijk}$.

2. Flip each edge adjacent to two refined triangles of the original triangulation in order to rebalance the new triangulation. These edges now connect two new vertices instead of two original vertices.

These two steps are illustrated in Figure 7. From the construction it follows that the refined triangulation $\Delta^{\sqrt{3}}$ does not preserve the original edges except at the boundary. However, if the new vertex $V_{ijk}$ is chosen as the interior point $Z_{ijk}$ of the original PS refinement, the new edges of $\Delta^{\sqrt{3}}$ still coincide with the edges of that PS refinement. When we apply the $\sqrt{3}$ refinement scheme twice, we obtain a triadic split, as shown in Figure 8. Every original edge is trisected and each original triangle is split into nine subtriangles.

In order to make subdivision possible, the interior points of the PS refinement $\Delta^{\sqrt{3},*}$ of the new triangulation $\Delta^{\sqrt{3}}$ must be chosen such that $\Delta^{\sqrt{3},*}$ contains the edges of the original PS refinement $\Delta^*$. As can be seen in Figure 7(c), the new triangles in $\Delta^{\sqrt{3}}$ are bisected by an edge of $\Delta^*$. If their interior points are chosen on such an edge, we obtain a valid PS refinement $\Delta^{\sqrt{3},*}$. Figure 9 depicts a $\sqrt{3}$ refined triangle where the corresponding PS refinement is indicated with dashed lines.

Figure 7. Principle of $\sqrt{3}$ refinement. (a) PS refinement of two neighbouring triangles. (b) Place a new vertex at the position of the interior points and connect with the triangle corners. (c) Flip the edge adjacent to the two refined triangles. The dashed lines are part of the PS refinement.

Figure 8. Applying the $\sqrt{3}$ refinement scheme twice results in a triadic refinement.
3.2. The Construction of Refined Control Triangles
In this section we explain how to derive the B-spline coefficients of the new Powell-Sabin spline on the (locally) refined triangulation, when the B-spline coefficients were given on the original triangulation.

For the vertices $V_i$ in the original triangulation one can reuse the old PS triangles defined by their corner points $Q_{i,m}$, with $m = 1, 2, 3$. This is valid because any new PS point in the refined triangulation lies closer to the considered original vertex. However, it is better to determine a smaller PS triangle for improving the stability of the new Powell-Sabin spline. We will rescale the original PS triangle with an appropriate scalar $\omega_i$. The value of $\omega_i$ can be found by comparing the positions of the old and new PS points (see [33]). The corners of the new PS triangle $t^{\sqrt{3}}_i$ (see Figure 9) are then given by

$$Q^{\sqrt{3}}_{i,m} = \omega_i\, V_i + (1 - \omega_i)\, Q_{i,m}, \qquad m = 1, 2, 3,$$

and the corresponding coefficients $c^{\sqrt{3}}_{i,m}$ are calculated via the old coefficients $c_{i,m}$ as

$$c^{\sqrt{3}}_{i,1} = (\omega_i \alpha_{i,1} + 1 - \omega_i)\, c_{i,1} + \omega_i \alpha_{i,2}\, c_{i,2} + \omega_i \alpha_{i,3}\, c_{i,3}, \qquad (3.14a)$$
$$c^{\sqrt{3}}_{i,2} = \omega_i \alpha_{i,1}\, c_{i,1} + (\omega_i \alpha_{i,2} + 1 - \omega_i)\, c_{i,2} + \omega_i \alpha_{i,3}\, c_{i,3}, \qquad (3.14b)$$
$$c^{\sqrt{3}}_{i,3} = \omega_i \alpha_{i,1}\, c_{i,1} + \omega_i \alpha_{i,2}\, c_{i,2} + (\omega_i \alpha_{i,3} + 1 - \omega_i)\, c_{i,3}. \qquad (3.14c)$$

Figure 9. The PS triangles $t^{\sqrt{3}}_i(Q^{\sqrt{3}}_{i,1}, Q^{\sqrt{3}}_{i,2}, Q^{\sqrt{3}}_{i,3})$ and $t^{\sqrt{3}}_{ijk}(Q^{\sqrt{3}}_{ijk,1}, Q^{\sqrt{3}}_{ijk,2}, Q^{\sqrt{3}}_{ijk,3})$ associated with the vertices $V_i$ and $V_{ijk}$ in a $\sqrt{3}$ refined triangulation. The PS refinement is indicated with dashed lines.

The PS triangles $t^{\sqrt{3}}_{ijk}$ associated with the new vertices $V_{ijk}$ in the $\sqrt{3}$ refined triangulation are defined by

$$Q^{\sqrt{3}}_{ijk,1} = (V_{ijk} + V_i)/2, \quad Q^{\sqrt{3}}_{ijk,2} = (V_{ijk} + V_j)/2, \quad \text{and} \quad Q^{\sqrt{3}}_{ijk,3} = (V_{ijk} + V_k)/2,$$

as shown in Figure 9. The corresponding coefficients are computed as

$$c^{\sqrt{3}}_{ijk,1} = \tilde{L}_{i,1}\, c_{i,1} + \tilde{L}_{i,2}\, c_{i,2} + \tilde{L}_{i,3}\, c_{i,3}, \qquad (3.15a)$$
$$c^{\sqrt{3}}_{ijk,2} = \tilde{L}_{j,1}\, c_{j,1} + \tilde{L}_{j,2}\, c_{j,2} + \tilde{L}_{j,3}\, c_{j,3}, \qquad (3.15b)$$
$$c^{\sqrt{3}}_{ijk,3} = \tilde{L}_{k,1}\, c_{k,1} + \tilde{L}_{k,2}\, c_{k,2} + \tilde{L}_{k,3}\, c_{k,3}. \qquad (3.15c)$$

Remark that $Q^{\sqrt{3}}_{ijk,1}$ simply corresponds to the point $\tilde{S}_i$ in Figure 5(a), and equation (3.15a) refers to equation (2.11d). Likewise, the triplets $(\tilde{L}_{j,1}, \tilde{L}_{j,2}, \tilde{L}_{j,3})$ and $(\tilde{L}_{k,1}, \tilde{L}_{k,2}, \tilde{L}_{k,3})$ can be computed as the barycentric coordinates of $Q^{\sqrt{3}}_{ijk,2}$ and $Q^{\sqrt{3}}_{ijk,3}$, respectively, with respect to the PS triangles of the vertices $V_j$ and $V_k$.

The formulas (3.14)-(3.15) use only convex combinations of the old coefficients. As a consequence, this subdivision scheme is numerically stable.
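A compact sketch of the rules (3.14) for the old vertices is given below; it is our own illustration and assumes that the scaling factor $\omega_i$ has already been determined as in [33]. The rules (3.15) for the new vertices would additionally require the triplets $\tilde{L}$ of (2.11d).

import numpy as np

def rescale_ps_triangle(V, Q, c, omega):
    """Rescale the PS triangle of an old vertex V_i and update its three
    B-spline coefficients following (3.14).
    V     : old vertex (x_i, y_i)
    Q     : old PS triangle corners, shape (3, 2)
    c     : old coefficients (c_{i,1}, c_{i,2}, c_{i,3})
    omega : scaling factor omega_i"""
    V, Q, c = map(np.asarray, (V, Q, c))
    # new PS triangle corners: Q' = omega*V + (1 - omega)*Q (shrunk towards V_i)
    Q_new = omega * V + (1.0 - omega) * Q
    # barycentric coordinates alpha of V_i with respect to the old PS triangle
    A = np.column_stack((Q, np.ones(3)))          # rows (X_m, Y_m, 1)
    alpha = np.linalg.solve(A.T, np.array([V[0], V[1], 1.0]))
    # coefficient update: c'_m = sum_l (omega*alpha_l + (1 - omega)*delta_ml) c_l
    M = omega * np.tile(alpha, (3, 1)) + (1.0 - omega) * np.eye(3)
    return Q_new, M @ c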
Figure 10. (a) A Powell-Sabin spline with 7 control triangles. (b) The equivalent triadically
subdivided spline.
Applying those rules twice, one obtains the control triangles of the new vertices in a
triadically refined triangulation. Figure 10 illustrates the triadic subdivision scheme for a
given Powell-Sabin spline.
3.3. Applications
In a broad range of application domains spline subdivision is of interest. In this section we
discuss some applications of the PS subdivision scheme.
Visualization. A common application of spline subdivision is visualization. After a few
subdivision steps, the PS control triangles mimic the shape of the Powell-Sabin spline quite
well. These control triangles can be used to construct a wireframe that approximates the
spline surface. The question is how to connect the PS control triangles efficiently into a
single wireframe. We consider a few approaches.
The wireframe can be constructed by connecting the spline interpolation points $s(V_i)$,
in the same way as the vertices $V_i$ are connected in the domain triangulation. These points
are the tangent points of the PS control triangles to the spline surface. This approach was
suggested in [35]. Figure 11(a) shows such triangular meshes for the splines in Figure 10.
Another strategy is to use the Bézier control net, see Figure 1(b), of the Powell-Sabin
spline in its Bernstein-Bézier representation [6]. This control net has the advantage that it
forms a convex hull for the spline surface. It also converges more rapidly to the surface than
the previous wireframe. However, many triangles are needed in this representation, as can
be seen in Figure 11(b).
A fair compromise is the wireframe in Figure 11(c), constructed by connecting pro-
jections of some PS points into the PS control triangles in a particular way, as described
below. We first build the mesh in the parameter domain, and then we project the points in
this mesh into the corresponding PS control triangles. There are three types of patches in
such a mesh, as illustrated in Figure 12. The first type is obtained by constructing for each
vertex $V_i$ the smallest envelope polygon that contains all PS points associated with $V_i$. Note
that the corners of such a polygon are particular PS points. Then, these corner points are
connected into triangular and quadrilateral patches in the following way. For each triangle
in the domain triangulation ∆ we construct a triangular patch by connecting the three PS
points in the interior of the considered triangle. We connect the adjacent PS points along
each edge in ∆ to form a quadrilateral patch. The wireframe is then defined by the projec-
tions of the corners of these patches into the corresponding PS control triangles. Note that
these projections are just particular Bézier ordinates in the Bernstein-Bézier representation
of the Powell-Sabin spline. For instance, the Bézier ordinate $w_i$ is the projection of the PS
point $\tilde{S}_i$ into PS control triangle $T_i$, as shown in Figure 5.

Figure 11. Different wireframes for visualizing the Powell-Sabin splines in Figure 10. The wireframe can be obtained (a) by connecting the spline interpolation points $s(V_i)$, (b) by using the Bézier control net, (c) by connecting the projections of certain PS points into the corresponding PS control triangles, as explained in section 3.3.
Figure 12. (a) A triangulation with PS refinement and PS points. (b) The mesh used for
constructing a wireframe.
As can be seen in Figures 11(b)-(c), the edges of this wireframe coincide with
particular edges in the Bézier control net. The number of patches used is about 1/8 of
the number needed for the Bézier control net. Let n, t and e be the number of vertices,
triangles and edges, respectively, in the triangulation $\Delta$. Then, the Bézier control net needs
24t patches. The wireframe of Figure 11(c) only needs n + t + e patches, which amounts to
about 3t using Euler's formula (a short derivation is given below). Because of the reduced number of patches, this wireframe
is more visually pleasing than the Bézier control net. Note that at most four patches come
together at each vertex in this wireframe. It could be seen as a disadvantage that the patches in
this wireframe are polygons with different numbers of edges. Nevertheless,
many graphical libraries, such as OpenGL, can efficiently handle such a set of polygons.
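The patch count of about $3t$ can be verified with a short computation (our own derivation, neglecting boundary terms for a fine triangulation of a simply connected domain):

$$n - e + t = 1 \;\;\text{(Euler's formula)}, \qquad 3t = 2e - e_b \;\;\text{(each interior edge is shared by two triangles)}.$$

Neglecting the boundary edges $e_b$ gives

$$e \approx \tfrac{3}{2}t, \qquad n = 1 + e - t \approx \tfrac{1}{2}t, \qquad n + t + e \approx \tfrac{1}{2}t + t + \tfrac{3}{2}t = 3t,$$

compared with the $24t$ patches of the full Bézier control net.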
Modelling. The basis functions have a smaller support after subdivision. This gives the
designer more local control for manipulating the spline surface. By the local nature of the
$\sqrt{3}$ subdivision scheme, complex surfaces can be efficiently represented with a reasonable
amount of memory [23].
Approximation. Subdivision is useful in data fitting and finite element applications. The
original spline space is a subspace of the space obtained after subdivision. Hence, we
are guaranteed a better spline approximation in the refined space when a least squares
data fitting strategy is applied [7], or a Ritz-Galerkin finite element method is used for the
numerical solution of partial differential equations [24]. When we use an iterative method
to solve the linear systems that arise in these methods, the subdivided version of the optimal
spline approximation in the coarser spline space provides a good initial guess for the optimal
solution in the finer space.
Geometric multigrid methods can be used to solve partial differential equations in a
very efficient way. By means of a hierarchy of meshes, one can accelerate the convergence
of a basic iterative method. Using the Powell-Sabin subdivision scheme, a nested sequence
of triangulations can be easily created with natural intergrid transfer operators [26].
Multiresolution. Multiresolution techniques work with different levels of resolution.
Wavelets are functions that split data into different frequency components. Each compo-
nent is studied with a resolution matched to its scale. A common tool to develop wavelets
is the lifting scheme, where subdivision can be used as the prediction step. Such wavelets,
called Powell-Sabin spline wavelets, have been developed in [32, 38]. They are particularly
suitable for image/surface compression [18].
4. QHPS Splines
When an increased resolution is only required in a small part of the surface, the use of
global subdivision may lead to excessive computational and storage costs. In such a case, a
local (adaptive) subdivision scheme is recommended. Although the $\sqrt{3}$ refinement scheme
described in section 3.1. can be applied locally, a naive use may introduce poorly shaped
triangles at the boundary of the locally refined region. This problem can be dealt with
by using a refinement propagation strategy [23]. When a triangle fails to satisfy a certain
quality requirement, extra neighbouring triangles are refined. This results in an expansion of
the refined region. By the edge flipping step in the $\sqrt{3}$ refinement method, a narrow triangle
(with small angles) will be replaced by two better shaped triangles. At the boundary of the
domain, artificial vertices outside the domain can be inserted into the triangulation. The
proposed strategy is driven by one parameter that manages the trade-off between the mesh
quality and the refinement localization.
Such a specialized shape-improvement strategy could be avoided if non-conforming
triangulations were allowed. In this section we review the idea of [25], where Powell-
Sabin splines are adapted towards certain non-conforming triangulations. In particular, we
consider hierarchical triangulations which are obtained by partitioning an initial conforming
triangulation with a triadic split. A hierarchical triangulation gives rise to a set of nested
spline spaces. A hierarchical basis for such a space is constructed in a way that the basis
functions of the coarser spaces are retained in the basis of the finer space. In a so-called
quasi-hierarchical basis some of the coarse-level basis functions are replaced by finer-level
basis functions [12].
This section discusses QHPS splines, which are a hierarchical variant of Powell-Sabin
splines in a quasi-hierarchical basis representation [25].
4.1. The Hierarchical Powell-Sabin Spline Space
Consider a simply connected subset $\Omega \subset \mathbb{R}^2$ with polygonal boundary, and assume a conforming triangulation $\Delta_0$ of $\Omega$ is given. We construct a hierarchical triangulation $\Delta_H$ on $\Omega$ by partitioning successively subsets of triangles with a triadic split, starting from the initial triangulation $\Delta_0$. An example of such a triangulation is drawn in Figure 13(a) with solid lines. Here, $\Delta_0$ is the triangulation of Figure 2(a).

The hierarchical triangulation has a total of $n$ vertices. Of these vertices, $n_{nc}$ are non-conforming (or hanging) vertices. They are located on interiors of triangle edges. The remaining ones, i.e. $n_c = n - n_{nc}$, are called conforming vertices. In Figure 13(a), $\Delta_0$ consists of 8 triangles and 8 (conforming) vertices. The hierarchical triangulation in the figure consists of 16 triangles and 15 vertices ($n_c = 9$, $n_{nc} = 6$).
Figure 13. (a) A HPS refinement $\Delta^*_H$ (in dashed lines) of a given hierarchical triangulation $\Delta_H$ (in solid lines). (b) The QHPS points (bullets) and a set of suitable QHPS triangles (shaded).
To each hierarchical triangulation $\Delta_H$ we can associate a hierarchical mesh structure. It is the set of triangles $T^k$ that are generated during the construction of $\Delta_H$. The superscript $k$ of a triangle $T^k$ refers to the refinement level of that triangle, i.e., the minimal number of triadic refinement steps needed to construct the triangle. The triangles of the mesh structure that are part of $\Delta_H$ itself are called leaf triangles. We will denote by $\Delta^l_H$ the hierarchical triangulation corresponding to the part of the mesh structure containing all triangles whose level is $l$ or lower. Note that these mesh structures are nested:

$$\Delta_0 \subset \Delta^1_H \subset \Delta^2_H \subset \ldots \subset \Delta_H. \qquad (4.16)$$

We will use Powell-Sabin triadic splits, as in Figure 8(c), in the construction of a hierarchical triangulation $\Delta_H$. The PS refinements needed in the splitting process generate a particular refinement of $\Delta_H$ which partitions each triangle in $\Delta_H$ into six subtriangles. This refinement is called the hierarchical Powell-Sabin (HPS) refinement $\Delta^*_H$ of $\Delta_H$. Analogous to (4.16), the HPS refinement yields a nested structure of sets of triangles

$$\Delta^*_0 \subset \Delta^{1,*}_H \subset \Delta^{2,*}_H \subset \ldots \subset \Delta^*_H. \qquad (4.17)$$

In Figure 13(a) such a HPS refinement is drawn in dashed lines.
The space of piecewise quadratic polynomials on $\Delta^*_H$ with global $C^1$-continuity is called the hierarchical Powell-Sabin spline space:

$$S^1_{2,H}(\Delta^*_H) = \left\{ s_H \in C^1(\Omega) : s_H|_{T^*_j} \in \Pi_2,\; T^*_j \in \Delta^*_H \right\}. \qquad (4.18)$$

For this hierarchical spline space we considered an interpolation problem similar to (2.6) for Powell-Sabin splines [25]. Given a triplet $(f_k, f_{x,k}, f_{y,k})$ at each conforming vertex $V_k$ in the hierarchical triangulation $\Delta_H$, the interpolation problem

$$s_H(V_k) = f_k, \quad \frac{\partial s_H}{\partial x}(V_k) = f_{x,k}, \quad \frac{\partial s_H}{\partial y}(V_k) = f_{y,k}, \qquad k = 1, \ldots, n_c, \qquad (4.19)$$

has a unique solution $s_H(x, y) \in S^1_{2,H}(\Delta^*_H)$. It follows that the dimension of the hierarchical Powell-Sabin spline space is equal to $3n_c$.
4.2. A Quasi-hierarchical Powell-Sabin Spline Basis
The construction of a normalized basis for the spline space S
1
2,H
(∆

H
) is very similar to the
geometric approach for the Powell-Sabin basis, described in section 2.3.. A hierarchical
Powell-Sabin spline in its quasi-hierarchical representation is called a QHPS spline, and is
denoted as
s
QH
(x, y) =
n
c

i=1
3

j=1
c
i,j
B
j
i,QH
(x, y). (4.20)
We associate with each conforming vertex V_i in the hierarchical triangulation three linearly independent triplets (α_{i,j}, β_{i,j}, γ_{i,j}), j = 1, 2, 3. The QHPS B-spline B^j_{i,QH}(x, y) can then be determined as the solution of the interpolation problem (4.19) with all (f_k, f_{x,k}, f_{y,k}) = (0, 0, 0) except for k = i, where (f_i, f_{x,i}, f_{y,i}) = (α_{i,j}, β_{i,j}, γ_{i,j}) ≠ (0, 0, 0). The triplets (α_{i,j}, β_{i,j}, γ_{i,j}) are computed as follows:
1. For each conforming vertex V_i in the hierarchical triangulation ∆_H, identify the corresponding QHPS points. Let k be the smallest level of all triangles in ∆_H that contain vertex V_i. Denote ∆^k_H as the triangulation, consisting of triangles of at most level k, that appears during the construction of ∆_H. The QHPS points of V_i are defined as the midpoints of all edges in the HPS refinement ∆^{k,*}_H ending in V_i. The vertex V_i is also a QHPS point. In Figure 13(b) the QHPS points are indicated as bullets.
2. For each conforming vertex V_i, construct a triangle t_i(Q_{i,1}, Q_{i,2}, Q_{i,3}) containing all the QHPS points of V_i. The triangles t_i, i = 1, . . . , n_c, are called QHPS triangles. Figure 13(b) shows a possible set of QHPS triangles.
3. The three linearly independent triplets (α_{i,j}, β_{i,j}, γ_{i,j}), j = 1, 2, 3, are derived from the QHPS triangle t_i of a vertex V_i as follows (a small computational sketch is given after this list):
• α_i = (α_{i,1}, α_{i,2}, α_{i,3}) are the barycentric coordinates of V_i with respect to t_i,
• β_i = (β_{i,1}, β_{i,2}, β_{i,3}) and γ_i = (γ_{i,1}, γ_{i,2}, γ_{i,3}) are the coordinates of the unit barycentric directions, in x- and y-direction respectively, with respect to t_i.
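As an illustration of this construction, the following minimal Python sketch (using numpy; the function name and the tuple-based point representation are illustrative choices, not taken from the text) computes the triplets for one conforming vertex and its QHPS triangle. The barycentric coordinates solve a small linear system; the unit barycentric directions are obtained from the same system with zero-sum right-hand sides.

import numpy as np

def qhps_triplets(V, Q1, Q2, Q3):
    # Triplets (alpha, beta, gamma) of a conforming vertex V with respect
    # to a QHPS triangle t(Q1, Q2, Q3), as used in the interpolation
    # problem (4.19): alpha are the barycentric coordinates of V, and
    # beta, gamma are the barycentric coordinates of the unit directions
    # (1, 0) and (0, 1) in x and y.
    A = np.array([[Q1[0], Q2[0], Q3[0]],
                  [Q1[1], Q2[1], Q3[1]],
                  [1.0,   1.0,   1.0 ]])
    alpha = np.linalg.solve(A, np.array([V[0], V[1], 1.0]))
    beta  = np.linalg.solve(A, np.array([1.0, 0.0, 0.0]))  # direction coordinates sum to zero
    gamma = np.linalg.solve(A, np.array([0.0, 1.0, 0.0]))
    return alpha, beta, gamma

# Example: vertex at the origin, QHPS triangle enclosing its QHPS points.
alpha, beta, gamma = qhps_triplets((0.0, 0.0), (-1.0, -1.0), (2.0, -1.0), (-1.0, 2.0))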
Figure 14 illustrates how a QHPS B-spline associated with the central vertex in a given
triangulation changes when some of the triangles are triadically refined. The same QHPS
triangle is used in the three cases.
Note that if the hierarchical triangulation is obtained by global triadic splits, i.e., the
final triangulation is conforming, then the QHPS B-splines are just the classical Powell-
Sabin B-splines.
4.3. Properties of the QHPS B-spline Basis
The quasi-hierarchical B-splines have attractive properties similar to those of the classical Powell-Sabin B-splines. Unless stated otherwise, the proofs of the properties can be found in [25].
Figure 14. (a) Several triadically refined hierarchical triangulations. The HPS refinement is indicated with dashed lines. (b) Contour plots of a QHPS basis function associated with the central vertex for the corresponding meshes in (a). The first B-spline is the same as in Figure 3(b).
Local support. Each QHPS B-spline B^j_{i,QH}(x, y) is zero outside the molecule M^k_i of the corresponding vertex V_i in ∆^k_H, with k the smallest level of any triangle in ∆_H containing V_i.
Convex partition of unity. The QHPS basis splines are positive, and they sum up to one.
Stability. The quasi-hierarchical basis is strongly L_∞-stable, i.e., the stability constants K_1 and K_2 in inequality (2.9) are both independent of the number of levels in the hierarchical triangulation. In [27] we investigated the L_p-stability of the basis. It turns out that the QHPS basis is, in general, weakly L_p-stable, i.e., the constants K_1^{−1} and K_2 have at most a polynomial growth in the number of levels. However, this result can be improved for a broad class of hierarchical triangulations ∆_H. Suppose there exists an upper bound on the difference between the level numbers of any two triangles in ∆_H that lie inside the support of the same QHPS B-spline. If this bound is independent of the number of refinement levels in ∆_H, then the QHPS basis is strongly L_p-stable. More details can be found in [27]. Note that the classical Powell-Sabin B-splines are always strongly L_p-stable on a globally refined hierarchical triangulation.
QHPS control triangles. We can define control points as c_{i,j} = (Q_{i,j}, c_{i,j}) and control triangles as T_i(c_{i,1}, c_{i,2}, c_{i,3}). These triangles are tangent to the spline surface z = s(x, y) at the vertices V_i, and the graph of the QHPS spline lies inside the convex hull of these control points.
Subdivision. The non-conformity of the hierarchical triangulation allows a local triadic
refinement in a natural way. In addition, a QHPS spline can be locally subdivided on a given
set of triangles, using the same triadic PS subdivision rules as explained in section 3. The
locality of the subdivision scheme ensures that a QHPS spline surface can be adaptively
manipulated with a reasonable increase of dimension of the space.
In Figure 15(a) we subdivided the torus, shown in Figure 6, locally on two triangles,
and we adapted the QHPS control triangles of the newly introduced vertices in Figure 15(b).
Note that the support of each QHPS B-spline associated with one of these new vertices
stays within its original triangle. This is a very interesting property for surface editing. The
designer selects a triangle where the surface must be locally subdivided. Then, he/she can
freely morph the surface while only the part within the selected triangle will be affected.
The new QHPS surface in Figure 15(b) only consists of 18 control triangles, whereas a
globally subdivided Powell-Sabin spline would need 144 control triangles to represent the
same surface.
4.4. A Practical Implementation
In this section we show how the efficient algorithms for classical Powell-Sabin splines can
be used for working with quasi-hierarchical Powell-Sabin splines.
A QHPS spline can be represented on each leaf triangle by a particular Powell-Sabin
spline. To construct this Powell-Sabin spline, we have to determine the PS control trian-
gles corresponding to the three corner vertices of the considered triangle. The PS control
triangles associated with conforming vertices can be taken identical to the QHPS control
triangles of the QHPS spline. We can use subdivision to compute the control triangles for
the non-conforming vertices. The correct PS control triangles can be recursively obtained
while moving through the hierarchical mesh structure. When the leaf triangle is reached,
the corresponding Powell-Sabin spline on the considered triangle will then be fully defined.
In this way, an algorithm for QHPS splines can be straightforwardly reduced to the equiv-
alent algorithm for Powell-Sabin splines. We now present a generic QHPS algorithm [25].
It only needs an equivalent algorithm for classical Powell-Sabin splines.
Generic QHPS algorithm. Let △_H be a hierarchical mesh structure, and ∆ be a conforming triangulation. The generic algorithm qhps_algorithm(△_H) for QHPS splines is based on the equivalent algorithm ps_algorithm(∆) for classical Powell-Sabin splines. We suppose that each triangle in △_H and ∆ has access to the control triangles associated with its three corner vertices. For conforming vertices, their control triangles are already known in advance, i.e., the QHPS control triangles. For non-conforming vertices, their control triangles are computed during the algorithm.
function qhps_algorithm(hierarchical structure △_H)
    for all triangles T in the initial triangulation ∆_0 of △_H:
        qhps_local_algorithm(T, △_H)
    endfor
end

function qhps_local_algorithm(triangle T, hierarchical structure △_H)
    if T is not a leaf triangle:
        1. Let l be the level of T in the hierarchical structure.
        2. for all non-conforming vertices V_i of level (l + 1) situated at the interior of an edge of T:
               Calculate the control triangle T_i by triadic PS subdivision, using the control triangles of two corner vertices of T.
           endfor
        3. for all 9 subtriangles T_i of level (l + 1) in T:
               qhps_local_algorithm(T_i, △_H)
           endfor
    else
        ps_algorithm(T)
    endif
end
The algorithm requires that all triangles in the hierarchical mesh structure △_H are stored in memory. It is recommended to manage these triangles in the following way. Use an array that contains the triangles in the initial triangulation ∆_0. A triangle that is refined keeps references to each of its nine subtriangles. That ensures that all triangles in △_H are reachable. In order to navigate easily through the mesh structure it is also advisable that the triangles keep references to their (at most three) neighbouring triangles.
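The storage scheme and the recursive traversal above can be sketched in a few lines of Python. The class and attribute names, and the two callables ps_algorithm and subdivide_control_triangles, are illustrative placeholders: they stand for an application-specific Powell-Sabin algorithm and for the triadic PS subdivision rules of section 3.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Triangle:
    # One triangle of the hierarchical mesh structure: its level, the
    # control triangles of its three corner vertices, references to the
    # nine subtriangles of a triadic split (empty for a leaf triangle),
    # and references to the (at most three) neighbouring triangles.
    level: int
    control_triangles: list
    children: List["Triangle"] = field(default_factory=list)
    neighbours: List[Optional["Triangle"]] = field(default_factory=lambda: [None, None, None])

    @property
    def is_leaf(self) -> bool:
        return not self.children

def qhps_algorithm(initial_triangles, ps_algorithm, subdivide_control_triangles):
    # Walk the hierarchy; defer to the classical Powell-Sabin algorithm on
    # each leaf triangle, completing the control triangles of the
    # non-conforming vertices on the way down.
    def local(tri):
        if tri.is_leaf:
            ps_algorithm(tri)
        else:
            subdivide_control_triangles(tri)
            for child in tri.children:
                local(child)
    for tri in initial_triangles:
        local(tri)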
Using a different Powell-Sabin spline on each leaf triangle in the hierarchical triangula-
tion is usually not more time-consuming than considering a single Powell-Sabin spline on
a larger set of triangles. Indeed, most of the algorithms for Powell-Sabin splines run over
all triangles in the triangulation separately.
The QHPS splines can be evaluated and manipulated in a stable way. Only convex combinations are needed to convert the QHPS spline on each triangle to a Powell-Sabin spline, to represent these Powell-Sabin splines with Bernstein-Bézier polynomials, and to evaluate these polynomials via the de Casteljau algorithm.
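To illustrate the last step, the following small Python function (an illustrative sketch, not code from the text) evaluates a quadratic Bernstein-Bézier polynomial on a triangle with the de Casteljau algorithm; only convex combinations of the Bézier ordinates are used.

def de_casteljau_quadratic(b, tau):
    # b  : dict mapping the multi-indices (2,0,0), (1,1,0), (1,0,1),
    #      (0,2,0), (0,1,1), (0,0,2) to the Bezier ordinates
    # tau: barycentric coordinates (t1, t2, t3) of the evaluation point
    t1, t2, t3 = tau
    # First step: quadratic ordinates -> three intermediate (linear) ordinates.
    c100 = t1 * b[(2, 0, 0)] + t2 * b[(1, 1, 0)] + t3 * b[(1, 0, 1)]
    c010 = t1 * b[(1, 1, 0)] + t2 * b[(0, 2, 0)] + t3 * b[(0, 1, 1)]
    c001 = t1 * b[(1, 0, 1)] + t2 * b[(0, 1, 1)] + t3 * b[(0, 0, 2)]
    # Second step: linear ordinates -> value of the polynomial.
    return t1 * c100 + t2 * c010 + t3 * c001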
5. NURPS Surfaces
Rational surfaces, such as rational Bézier and NURBS surfaces, are commonly used tools in commercial computer aided design and computer graphics packages. Rational surface representations give a designer extra degrees of freedom compared to their non-rational counterparts through the weights that are associated with the control points. These rational surfaces are able to exactly represent patches on quadric surfaces, e.g., patches on the cone and the sphere. In this section we discuss the rational extension of Powell-Sabin splines, called NURPS surfaces. For more details we refer to the papers [29, 36, 37].

Figure 15. (a) Local QHPS subdivision applied on two triangles of the surface in Figure 6. The control triangles and the triangular mesh lines are shown. (b) Effect of moving the two new control triangles of the QHPS surface.
5.1. Rational Powell-Sabin Surfaces
Consider a conforming triangulation ∆ on a given domain Ω ⊂ R², with PS refinement ∆*. A non-uniform rational Powell-Sabin (NURPS) surface s(u, v) is defined as

s(u, v) = Σ_{i=1}^{n} Σ_{j=1}^{3} c_{i,j} φ^j_i(u, v),   (u, v) ∈ Ω,   (5.21)
with c_{i,j} = (c^x_{i,j}, c^y_{i,j}, c^z_{i,j}) the NURPS control points, and the blending functions

φ^j_i(u, v) = w_{i,j} B^j_i(u, v) / ( Σ_{i=1}^{n} Σ_{j=1}^{3} w_{i,j} B^j_i(u, v) ).   (5.22)
Here, B^j_i(u, v) are the normalized Powell-Sabin B-splines on ∆*, and w_{i,j} are positive weights. When all weights are constant, then (5.21) reduces to (2.13). The blending functions φ^j_i(u, v) in (5.22) have a local support and they form a convex partition of unity.
A NURPS surface in representation (5.21) can be seen as the 3D projection onto the Euclidian space of a 4D Powell-Sabin spline in the homogeneous space, i.e.,

s(u, v) = Σ_{i=1}^{n} Σ_{j=1}^{3} c^h_{i,j} B^j_i(u, v),   (5.23)

where the 4D homogeneous control points are given by

c^h_{i,j} = (c^{hx}_{i,j}, c^{hy}_{i,j}, c^{hz}_{i,j}, c^{hw}_{i,j}) = (w_{i,j} c^x_{i,j}, w_{i,j} c^y_{i,j}, w_{i,j} c^z_{i,j}, w_{i,j}).   (5.24)
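As a concrete illustration, the following Python sketch (using numpy; the function name is illustrative, and the Powell-Sabin basis values B^j_i(u, v) are assumed to be computed elsewhere) evaluates a NURPS surface point through the homogeneous representation (5.23)-(5.24): lift the control points to 4D, evaluate the polynomial spline, and project back.

import numpy as np

def eval_nurps(control_points, weights, ps_basis_values):
    # control_points : (n, 3, 3) array of Euclidian points c_{i,j}
    # weights        : (n, 3) array of positive weights w_{i,j}
    # ps_basis_values: (n, 3) array with the values B^j_i(u, v) at one (u, v)
    homogeneous = np.concatenate(
        [weights[..., None] * control_points, weights[..., None]], axis=-1)   # (5.24)
    s_h = np.tensordot(ps_basis_values, homogeneous, axes=([0, 1], [0, 1]))   # (5.23)
    return s_h[:3] / s_h[3]   # project back to the Euclidian space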
The next sections are devoted to some particular features of the NURPS surfaces. First,
we discuss the use of the control points and the weights in shape modelling with NURPS
surfaces. Then, we describe a subdivision scheme. We end with the NURPS representation
of some quadric surfaces.
5.2. Modelling with NURPS Surfaces
A designer has two types of freedom in the construction of a NURPS surface: the coefficients c_{i,j} and the weights w_{i,j}. They can be determined in a geometrically intuitive way by means of control triangles and weight points.
Control triangles. Similar to the Powell-Sabin splines, the influence of the coefficients in (5.21) on the NURPS surface can be intuitively interpreted via control triangles [36]. With each vertex V_i in the triangulation ∆, one can define the control triangle T_i(c_{i,1}, c_{i,2}, c_{i,3}). This triangle is tangent to the NURPS surface at the point

s(V_i) = α̂_{i,1} c_{i,1} + α̂_{i,2} c_{i,2} + α̂_{i,3} c_{i,3},   (5.25)

where

α̂_{i,j} = w_{i,j} α_{i,j} / ( Σ_{k=1}^{3} w_{i,k} α_{i,k} ).   (5.26)

Here, (α_{i,1}, α_{i,2}, α_{i,3}) are the barycentric coordinates of vertex V_i with respect to the PS triangle t_i.
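Formulas (5.25) and (5.26) amount to a weighted convex combination; a minimal Python sketch (names are illustrative) of the tangent point computation reads:

import numpy as np

def nurps_tangent_point(alpha, w, c):
    # alpha: (3,) barycentric coordinates of V_i w.r.t. the PS triangle t_i
    # w    : (3,) positive weights w_{i,j}
    # c    : (3, 3) control points c_{i,j} as rows of the control triangle T_i
    alpha_hat = w * alpha / np.dot(w, alpha)   # (5.26)
    return alpha_hat @ c                       # (5.25)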
Weight points. In [29] it is described how one can use so-called weight points as a design tool for handling the weights. Such a point is characterized by its position p_i and by a scaling factor K_i. The position is chosen as the tangent point of the control triangle T_i to the NURPS surface, i.e., p_i = s(V_i). The three weights w_{i,j}, with j = 1, 2, 3, are then uniquely defined by means of the barycentric coordinates (α̂_{i,1}, α̂_{i,2}, α̂_{i,3}) of p_i with respect to T_i, up to a positive factor K_i, via the formula

w_{i,j} = K_i α̂_{i,j} / α_{i,j}.   (5.27)
A designer can freely move the weight point within the control triangle T_i. Since its position is the tangent point of T_i to the NURPS surface, the effect of the movement will be intuitive to the designer. This is illustrated in the first two pictures of Figure 16. One can use the scaling factor K_i to determine the relative importance of the considered three weights with respect to the other weights. Indeed, K_i can be interpreted as a weighted mean of these weights, i.e.,

K_i = α_{i,1} w_{i,1} + α_{i,2} w_{i,2} + α_{i,3} w_{i,3}.   (5.28)

The larger the value of K_i, the more the NURPS surface will be attracted to the control triangle. A cusp can be simulated by reducing the scaling factor. The effect of changing the scaling factor is illustrated in the bottom two pictures of Figure 16. Similar effects can be obtained by rescaling (enlarging/reducing) the control triangles [37]. However, changing the factor K_i has the advantage that the designer can continue to work with the same control triangles.
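A small Python sketch (names are illustrative) of how the three weights follow from a chosen weight point according to (5.27), with (5.28) as a consistency check:

import numpy as np

def weights_from_weight_point(alpha, alpha_hat, K):
    # alpha    : (3,) barycentric coordinates of V_i w.r.t. the PS triangle t_i
    #            (assumed to have nonzero entries)
    # alpha_hat: (3,) barycentric coordinates of the weight point p_i w.r.t. T_i
    #            (assumed to sum to one)
    # K        : positive scaling factor K_i
    w = K * alpha_hat / alpha                 # (5.27)
    assert np.isclose(np.dot(alpha, w), K)    # (5.28): K_i is a weighted mean of the weights
    return w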
Let r_{i,j} be the intersection point of the line p_i − c_{i,j} and the edge of control triangle T_i opposite to c_{i,j}, then

r_{i,j} = ( α̂_{i,j′} c_{i,j′} + α̂_{i,j′′} c_{i,j′′} ) / ( α̂_{i,j′} + α̂_{i,j′′} ),   (5.29)

with j′ = 1 + (j mod 3) and j′′ = 1 + (j′ mod 3). Using these points, the ratio of two weights can be geometrically interpreted as the ratio of two lengths, i.e.,

w_{i,j′′} / w_{i,j′} = ( α_{i,j′} / α_{i,j′′} ) ‖r_{i,j} − c_{i,j′}‖ / ‖r_{i,j} − c_{i,j′′}‖,   (5.30)

and, using formula (5.27), as a ratio of two triangular areas, i.e.,

w_{i,j′′} / w_{i,j′} = ( α_{i,j′} / α_{i,j′′} ) A(p_i, c_{i,j}, c_{i,j′}) / A(p_i, c_{i,j′′}, c_{i,j}).   (5.31)

Figure 16. (a) Original NURPS surface. (b) The effect of moving the weight point p_i inside the control triangle. (c)-(d) Enlarging and reducing the scaling factor K_i. The considered weight point and control points are indicated with bullets.
5.3. NURPS Subdivision
We now adapt the subdivision scheme for Powell-Sabin splines towards NURPS surfaces. The construction of the refined triangulation ∆^{√3} and the choice of the PS triangles remain the same as described in section 3.
We can compute the control points of the subdivided NURPS surface via its homogeneous representation (5.23). For instance, the homogeneous control points c^{h,√3}_{ijk,m}, with m = 1, 2, 3, corresponding to the PS triangle t^{√3}_{ijk}, are then calculated as

c^{h,√3}_{ijk,1} = L̃_{i,1} c^h_{i,1} + L̃_{i,2} c^h_{i,2} + L̃_{i,3} c^h_{i,3},   (5.32a)

c^{h,√3}_{ijk,2} = L̃_{j,1} c^h_{j,1} + L̃_{j,2} c^h_{j,2} + L̃_{j,3} c^h_{j,3},   (5.32b)

c^{h,√3}_{ijk,3} = L̃_{k,1} c^h_{k,1} + L̃_{k,2} c^h_{k,2} + L̃_{k,3} c^h_{k,3}.   (5.32c)
The convex combinations are identical to the ones in formulas (3.15). Projecting the new control points (5.32) back to the Euclidian space yields

c^{√3}_{ijk,m} = ( c^{hx,√3}_{ijk,m} / c^{hw,√3}_{ijk,m} , c^{hy,√3}_{ijk,m} / c^{hw,√3}_{ijk,m} , c^{hz,√3}_{ijk,m} / c^{hw,√3}_{ijk,m} ).   (5.33)
It is well known that working in the homogeneous space can lead to numerical instabilities. When the weights vary greatly in magnitude, the coordinates c^{hr,√3}_{ijk,m}, with r = x, y, z, can become extremely large. Then, the calculations do not operate in the convex hull of the rational control points anymore, and numerical stability is endangered.
Figure 17. The domain triangle of a NURPS patch, together with its PS refinement and PS triangles.

Inspired by the idea behind the rational variant of the de Casteljau algorithm from Farin [8], we can improve the numerical stability by rearranging the calculations in order to avoid working in the homogeneous space. For instance, c^{√3}_{ijk,1} and its weight w^{√3}_{ijk,1} are computed in a stable way as follows:
w^{√3}_{ijk,1} = c^{hw,√3}_{ijk,1} = L̃_{i,1} w_{i,1} + L̃_{i,2} w_{i,2} + L̃_{i,3} w_{i,3},   (5.34a)

L̂_{i,m} = w_{i,m} L̃_{i,m} / w^{√3}_{ijk,1},   m = 1, 2, 3,   (5.34b)

c^{√3}_{ijk,1} = L̂_{i,1} c_{i,1} + L̂_{i,2} c_{i,2} + L̂_{i,3} c_{i,3}.   (5.34c)
Analogously, one can calculate the other control points in a numerically stable way.
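The rearranged computation (5.34) only ever forms convex combinations of weights and of Euclidian control points. A minimal Python sketch (names are illustrative; L̃ are the convex combination coefficients of formulas (3.15)):

import numpy as np

def stable_rational_combination(L_tilde, w, c):
    # L_tilde: (3,) convex combination coefficients
    # w      : (3,) positive weights of the three control points
    # c      : (3, 3) the three Euclidian control points as rows
    w_new = np.dot(L_tilde, w)        # (5.34a): new weight
    L_hat = w * L_tilde / w_new       # (5.34b): again convex coefficients
    c_new = L_hat @ c                 # (5.34c): new Euclidian control point
    return c_new, w_new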
5.4. Quadrics as NURPS Surfaces
In [37] closed formulas are derived for the control points of NURPS patches on a cylinder, a cone and a sphere. In this section we give another representation of such patches, resulting in a simpler choice of the control points. Further on, each triangular NURPS patch is defined on the equilateral domain triangle T(V_1, V_2, V_3) depicted in Figure 17, with the given PS refinement and PS triangles, defined by the points

Q_{1,1} = V_1,   Q_{1,2} = (V_1 + V_2)/2,   Q_{1,3} = (V_1 + V_3)/2,
Q_{2,1} = V_2,   Q_{2,2} = (V_2 + V_3)/2,   Q_{2,3} = (V_2 + V_1)/2,
Q_{3,1} = V_3,   Q_{3,2} = (V_3 + V_1)/2,   Q_{3,3} = (V_3 + V_2)/2.
The derivation of the corresponding control points is analogous to the one described in [37].
Cylinder. The cylinder

x² + y² = r²,   0 ≤ z ≤ h,   (5.35)

can be split into eight isometrical triangular segments. The control points and weights of the NURPS representation of such a patch are given in the following table.
i   c_{i,1}        w_{i,1}   c_{i,2}        w_{i,2}   c_{i,3}        w_{i,3}
1   (r, 0, 0)      1         (r, r, 0)      √2/2      (r, 0, h/2)    1
2   (0, r, 0)      1         (r, r, h/2)    √2/2      (r, r, 0)      √2/2
3   (r, 0, h)      1         (r, 0, h/2)    1         (r, r, h/2)    √2/2
The entire cylinder, constructed with eight NURPS patches, is shown in Figure 18(a).
Cone. The cone

x² + y² = ( r (h − z) / h )²,   0 ≤ z ≤ h,   (5.36)

can be split into four isometrical patches. The NURPS representation of such a patch is given in the following table.
i   c_{i,1}      w_{i,1}   c_{i,2}      w_{i,2}   c_{i,3}      w_{i,3}
1   (r, 0, 0)    1         (r, r, 0)    √2/2      (0, 0, h)    1
2   (0, r, 0)    1         (0, 0, h)    1         (r, r, 0)    √2/2
3   (0, 0, h)    1         (0, 0, h)    1         (0, 0, h)    1
The complete cone is depicted in Figure 18(b).
Sphere. Using only NURPS patches, it is not possible to describe the entire sphere

x² + y² + z² = 1.   (5.37)

Nevertheless, with 2n NURPS patches we can represent the sphere up to some small gaps. The maximal height of these gaps is equal to

h_g = 2 sin²θ / (cos²θ + 1),   (5.38)

where θ = π/n. Small gaps can be achieved by a small angle θ, but then a large number of patches has to be used. The control points of such a patch are shown in the table below.
i   c_{i,1}               w_{i,1}   c_{i,2}                 w_{i,2}   c_{i,3}               w_{i,3}
1   (cos θ, −sin θ, 0)    1         (1/cos θ, 0, tan²θ)     cos²θ     (cos θ, −sin θ, 1)    √2/2
2   (cos θ, sin θ, 0)     1         (cos θ, sin θ, 1)       √2/2      (1/cos θ, 0, tan²θ)   cos²θ
3   (0, 0, 1)             1         (cos θ, −sin θ, 1)      √2/2      (cos θ, sin θ, 1)     √2/2
An extension to a sphere with radius r is straightforward, i.e., by multiplying the coordinates of each control point by r. An incomplete NURPS representation of the sphere with
twelve patches is given in Figure 18(c). The hole filling strategy in [34] can be used to close
the gaps in the sphere approximately.
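For orientation, the gap height (5.38) can be evaluated directly; a short Python snippet (illustrative):

import math

def sphere_gap_height(n):
    # Maximal gap height h_g of the 2n-patch NURPS sphere, formula (5.38).
    theta = math.pi / n
    return 2 * math.sin(theta) ** 2 / (math.cos(theta) ** 2 + 1)

# For small theta, h_g behaves like theta^2, so doubling the number of
# patches reduces the gap height roughly by a factor of four.
for n in (4, 6, 8, 16):
    print(n, sphere_gap_height(n))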
Figure 18. A NURPS representation of (a) a cylinder, (b) a cone, and (c) a sphere. The patches are defined on the domain triangle in Figure 17. Neighbouring patches are shaded in different colours.
6. Conclusion
Powell-Sabin splines are C^1-continuous piecewise quadratic polynomials defined on an arbitrary conforming triangulation. These splines can be compactly represented in a stable normalized basis. The basis functions can be chosen in a flexible way by means of PS triangles. This leads to a natural definition of PS control triangles, that allow an interactive change of the shape of Powell-Sabin splines in a predictable way.

Powell-Sabin splines can be refined using the √3 subdivision scheme. The control points of the subdivided spline can be easily calculated in a stable way. Applying the scheme twice results in a triadic refinement. Subdivision has many applications. The subdivision scheme can be applied for an efficient visualization of the Powell-Sabin splines. The increased resolution can also be used for obtaining a more detailed approximation or for a local manipulation of the spline shape.
QHPS splines are a hierarchical variant of Powell-Sabin splines in a quasi-hierarchical
basis representation. They are defined on a hierarchical triangulation. Such a mesh is
obtained, starting from an initial conforming triangulation, by partitioning successively a
subset of triangles with a triadic split. The QHPS basis retains all advantages of the Powell-
Sabin B-splines. In addition, a local refinement of the QHPS spline can be performed in a
very natural way.
The rational extension of a Powell-Sabin spline surface is called a NURPS surface. In
the rational representation a weight is associated with each control point. These weights
can be used as extra degrees of freedom in the modelling of shapes by means of weight
points. The position of the tangent points within the control triangles leads to an intuitive
and graphical interpretation of the weights. NURPS surfaces are able to exactly represent
patches of quadric surfaces, as the cylinder, the cone and the sphere. Note that the quasi-
hierarchical setting is also applicable to the NURPS surfaces.
A generalization of Powell-Sabin splines to higher degrees and higher dimensions is not a trivial task. Some higher degree spline extensions on the Powell-Sabin split are considered in [2, 17]. Unfortunately, the proposed B-splines no longer have an immediate geometric interpretation similar to PS triangles, and they are more difficult to implement. A generalization of the Powell-Sabin split in more dimensions is discussed in [22, 39]. Each simplex in a tessellation has to be split into a large number of smaller simplices. They must satisfy a particular set of geometric constraints in order to achieve C^1-continuity. At the moment it is not clear whether these constraints can always be satisfied for an arbitrary tessellation. Nevertheless, we can conclude that Powell-Sabin splines exhibit many nice properties that make them suitable for many CAGD applications.
References
[1] P. Alfeld and L.L. Schumaker. The dimension of bivariate spline spaces of smoothness
r for degree d ≥ 4r + 1. Constr. Approx., 3:189–197, 1987.
[2] P. Alfeld and L.L. Schumaker. Smooth macro-elements based on Powell-Sabin trian-
gle splits. Adv. Comp. Math., 16:29–46, 2002.
[3] L.J. Billera. Homology of smooth splines: generic triangulations and a conjecture of
Strang. Trans. Amer. Math. Soc., 310:325–340, 1988.
[4] W. Boehm and A. Müller. On de Casteljau's algorithm. Comput. Aided Geom. Design, 16:587–605, 1999.
[5] R.W. Clough and J.L. Tocher. Finite element stiffness matrices for analysis of plates in
bending. In Proc. 1st Conference on Matrix Methods in Structural Mechanics, pages
515–545, Wright Patterson Air Force Base, Ohio, 1965.
[6] P. Dierckx. On calculating normalized Powell-Sabin B-splines. Comput. Aided Geom.
Design, 15(1):61–78, 1997.
[7] P. Dierckx, S. Van Leemput, and T. Vermeire. Algorithms for surface fitting using
Powell-Sabin splines. IMA J. Numer. Anal., 12:271–299, 1992.
[8] G. Farin. Algorithms for rational Bézier curves. Comput. Aided Design, 15:73–77, 1983.
[9] G. Farin. Triangular Bernstein-Bézier patches. Comput. Aided Geom. Design, 3:83–127, 1986.
[10] G. Farin. Curves and Surfaces for CAGD: A Practical Guide. Morgan Kaufmann
Publishers, San Francisco, fifth edition, 2002.
[11] G. Farin. Dimensions of spline spaces over unconstricted triangulations. J. Comput.
Appl. Math., 192:320–327, 2006.
[12] E. Grinspun, P. Krysl, and P. Schröder. CHARMS: a simple framework for adaptive simulation. ACM Trans. Graphics, 21(3):281–290, 2002.
[13] D. Hong. Spaces of bivariate spline functions over triangulation. Approx. Theory
Appl., 7:56–75, 1991.
[14] L. Kobbelt. √3-Subdivision. In Computer Graphics Proceedings, Annual Conference Series, pages 103–112. ACM SIGGRAPH, 2000.
[15] U. Labsik and G. Greiner. Interpolatory √3-subdivision. In Proc. 21st European Conference on Computer Graphics, volume 19 of Computer Graphics Forum, pages 131–138, Cambridge, 2000.
[16] M.J. Lai and L.L. Schumaker. On the approximation power of bivariate splines. Adv.
Comp. Math., 9:251–279, 1998.
[17] M.J. Lai and L.L. Schumaker. Macro-elements and stable local bases for spaces of
splines on Powell-Sabin triangulations. Math. Comp., 72:335–354, 2003.
[18] J. Maes and A. Bultheel. Stable multiresolution analysis on triangles for surface com-
pression. Electr. Trans. Numer. Anal., 25:224–258, 2006.
[19] J. Maes, E. Vanraes, P. Dierckx, and A. Bultheel. On the Stability of normalized
Powell-Sabin B-splines. J. Comput. Appl. Math., 170(1):181–196, 2004.
[20] C. Manni and P. Sablonnière. Quadratic spline quasi-interpolants on Powell-Sabin partitions. Adv. Comput. Math., 26:283–304, 2007.
[21] M.J.D. Powell and M.A. Sabin. Piecewise quadratic approximations on triangles.
ACM Trans. Math. Softw., 3:316–325, 1977.
[22] T. Sorokina and A.J. Worsey. A multivariate Powell-Sabin interpolant. Adv. Comput.
Math., in press, 2007.
[23] H. Speleers, P. Dierckx, and S. Vandewalle. Local subdivision of Powell-Sabin
splines. Comput. Aided Geom. Design, 23(5):446–462, 2006.
[24] H. Speleers, P. Dierckx, and S. Vandewalle. Numerical solution of partial differen-
tial equations with Powell-Sabin splines. J. Comput. Appl. Math., 189(1-2):643–659,
2006.
[25] H. Speleers, P. Dierckx, and S. Vandewalle. Quasi-hierarchical Powell-Sabin B-
splines. Technical Report 472, Dept. Computer Science, K.U. Leuven, 2006.
[26] H. Speleers, P. Dierckx, and S. Vandewalle. Multigrid methods with Powell-Sabin
splines. Technical Report 488, Dept. Computer Science, K.U. Leuven, 2007.
[27] H. Speleers, P. Dierckx, and S. Vandewalle. On the L_p-stability of quasi-hierarchical Powell-Sabin splines. Technical Report 492, Dept. Computer Science, K.U. Leuven, 2007.
[28] H. Speleers, P. Dierckx, and S. Vandewalle. Powell-Sabin splines with boundary con-
ditions for polygonal and non-polygonal domains. J. Comput. Appl. Math., in press,
2007.
[29] H. Speleers, P. Dierckx, and S. Vandewalle. Weight control for modelling with
NURPS surfaces. Comput. Aided Geom. Design, 24(3):179–186, 2007.
[30] E. Vanraes and A. Bultheel. A tangent subdivision scheme. ACM Trans. Graphics,
25:340–355, 2006.
[31] E. Vanraes, P. Dierckx, and A. Bultheel. On the choice of the PS-triangles. Technical
Report 353, Dept. Computer Science, K.U. Leuven, 2003.
[32] E. Vanraes, J. Maes, and A. Bultheel. Powell-Sabin spline wavelets. Int. J. Wav.
Multires. Inf. Proc., 2(1):23–42, 2004.
[33] E. Vanraes, J. Windmolders, A. Bultheel, and P. Dierckx. Automatic construction of control triangles for subdivided Powell-Sabin splines. Comput. Aided Geom. Design, 21(7):671–682, 2004.
[34] J. Windmolders. Powell-Sabin splines for computer aided geometric design. PhD
thesis, Dept. Computer Science, K.U. Leuven, 2003.
[35] J. Windmolders and P. Dierckx. Subdivision of uniform Powell-Sabin splines. Com-
put. Aided Geom. Design, 16:301–315, 1999.
[36] J. Windmolders and P. Dierckx. From PS-splines to NURPS. In A. Cohen, C. Rabut,
and L.L. Schumaker, editors, Proc. of Curve and Surface Fitting, Saint-Malo 1999,
pages 45–54. Vanderbilt University Press, 2000.
[37] J. Windmolders and P. Dierckx. NURPS for special effects and quadrics. In T. Lyche
and L.L. Schumaker, editors, Proc. of Mathematical Methods for Curves and Surfaces,
Oslo 2000, pages 527 – 534. Vanderbilt University Press, 2001.
[38] J. Windmolders, E. Vanraes, P. Dierckx, and A. Bultheel. Uniform Powell-Sabin
spline wavelets. J. Comput. Appl. Math., 154(1):125–142, 2003.
[39] A.J. Worsey and B. Piper. A trivariate Powell-Sabin interpolant. Comput. Aided Geom. Design, 5:177–186, 1988.
In: Computer Animation ISBN: 978-1-60741-559-6
Editors: J.S. Wright and L.M. Hughes, pp. 209-234 © 2010 Nova Science Publishers, Inc.
Chapter 9
AN ONTOLOGY OF COMPUTER-AIDED
DESIGN
Udo Kannengiesser
NICTA, Australia
John S. Gero
Krasnow Institute for Advanced Study and Volgenau School of
Information Technology and Engineering, George Mason University, USA,
and University of Technology, Sydney, Australia
Abstract
This chapter develops an ontology of computer-aided design, based on the function-
behaviour-structure (FBS) ontology. It proposes two complementary views of the process of
design. The object-centred view applies the FBS ontology to the artefact being designed.
Integrating an ontology of three “design worlds”, this view establishes a framework of
designing as a set of transformations between the function, behaviour and structure of the
design object, driven by interactions between the three design worlds. Building on this
framework, the process-centred view applies the FBS ontology to the activities defined by the
object-centred view. This increases the level of detail and provides a more well-defined set of
representations of these activities. Our ontological framework can be used to provide a better
understanding of the functionalities required of existing and future computer-aided design
support.
1. Introduction
The notion of computer-aided design can be understood as an umbrella term for approaches to
using computational tools to support human design activities. Its principal innovations to date
include tools for computer-aided drafting (CAD), engineering (CAE) and manufacturing
(CAM), which have been recognised as a significant technological achievement of the past
century (Weisberg 2000). These tools are now indispensable for practitioners in many design
domains.
Some research has focused on expanding computer support to activities carried out in the
early, conceptual stages of design. However, its impact on design practices and tool
development in industry has generally been rather low. We believe that one of the reasons is
that many of these approaches are based on an insufficient understanding of design. This
concerns activities that are carried out by human designers, which include producing and re-
interpreting drawings or sketches, and reflecting on current and previous design tasks. It has
been shown that these activities are important drivers of designing (Schön and Wiggins 1992;
Suwa et al. 1999; Suwa and Tversky 2002). Most traditional models of design are inadequate
because they do not explicitly account for these findings.
Progress on a more comprehensive understanding of designing has been made only
recently. Our situated function-behaviour-structure (FBS) framework (Gero and
Kannengiesser 2004) represents designing as a situated act that is driven by the interactions
between the designer and their environment. It uses a perspective that is oriented to the object
being designed, so that designing can be shown as a set of transformations between the
function, behaviour and structure of the artefact. The situated FBS framework has shown its
potential to enhance human understanding of designing. This chapter develops extensions to
this framework that provide a more detailed ontological basis on which computer-aided
design support can be built.
Section 2 presents the situated FBS framework and shows how it derives from an object-
centred view of designing driven by the interactions between three “design worlds”. Section
3, adopting a process-centred view of designing, applies the FBS ontology to the design
activities defined in Section 2. This adds a significant amount of detail and rigour to the
representation of each activity. Section 4 uses this view to derive a framework that specifies
the functions required of computational tools to support designing. Section 5 concludes the
chapter.
2. An Object-Centred Ontology of Design
2.1. An Ontology of Design Objects
Most design models and design ontologies focus on the artefact or object of design. The FBS
ontology distinguishes between three aspects of a design object (Gero 1990; Gero and
Kannengiesser 2004): function (F), behaviour (B) and structure (S).
2.1.1. Object Function
Function (F) of an object is defined as its teleology (“what the object is for”). For example,
some of the functions of a window include “to provide view”, “to provide daylight” and “to
provide rain protection”. Function represents the usefulness of the object for another system.
2.1.2. Object Behaviour
Behaviour (B) of an object is defined as the attributes that can be derived from its structure
(“what the object does”). Using the window example, behaviours include “thermal
conduction”, “light transmission” and “direct solar gain”. Behaviour provides operational,
measurable performance criteria for comparing different objects.
2.1.3. Object Structure
Structure (S) of an object is defined as its components and their relationships (“what the
object consists of”). The structure of physical objects includes their form (i.e., geometry and
topology) and material. More generally, form can be viewed as a description of an object’s
macro-structure, and material can be viewed as a shorthand description of the micro-structure.
In the window example, macro-structure (form) includes “glazing length” and “glazing
height”, and micro-structure (material) includes “type of glass”.
2.1.4. Relationships between Object Function, Behaviour and Structure
Humans construct relationships between function, behaviour and structure through experience
and through the development of causal models based on interactions with the object.
Specifically, function is ascribed to behaviour by establishing a teleological connection
between the human’s goals and observable or measurable effects of the object. There is no
direct relationship between function and structure. Behaviour is causally related to structure,
i.e. it can be derived from structure using physical laws or heuristics. This may require
knowledge about external effects (exogenous variables) and their interaction with the
artefact’s structure. In the window example, deriving the behaviour “light transmission”
requires considering external light sources.
2.2. An Ontology of Design Worlds
An aspect that has been ignored in most models of design relates to the interactions of the
designer and their environment. Designers perform actions in order to change their
environment. By observing and interpreting the results of their actions, they then decide on
new actions to be executed on the environment. The designers’ concepts may change
according to what they are “seeing”, which itself is a function of what they have done. One
may speak of an “interaction of making and seeing” (Schön and Wiggins 1992). This
interaction between the designer and the environment strongly determines the course of
designing. This idea is called situatedness, whose foundational concepts go back to the work
of Dewey (1896) and Bartlett (1932).
Gero and Kannengiesser (2004) have modelled situatedness by specifying three
interacting worlds: the external world, interpreted world and expected world, Figure 1(a).
2.2.1. The External World
The external world is the world that is composed of representations outside the designer or
design agent. The notion of “external” is meant in a conceptual sense rather than a physical
one. It denotes an environment that contains design artefacts made available for
interpretation.
Figure 1. Situatedness as the interaction of three worlds: (a) general model, (b) specialised model for design representations.
2.2.2. The Interpreted World
The interpreted world is the world that is built up inside the design agent in terms of sensory
experiences, percepts and concepts. It is the internal representation of that part of the external
world that the design agent interacts with. The interpreted world provides an environment for
analytic activities and discovery during designing.
2.2.3. The Expected World
The expected world is the world imagined actions of the design agent will produce. It is
the environment in which the effects of actions are predicted according to current goals and
interpretations of the state of the world.
2.2.4. Relationships between the Three Worlds
These three worlds are related through three classes of interaction. Interpretation
transforms variables that are sensed in the external world into sensory experiences, percepts
and concepts that compose the interpreted world. Focussing takes some aspects of the
interpreted world and uses them as goals for the expected world. Action is an effect which
brings about a change in the external world according to the goals in the expected world.
2.2.5. A More Detailed Framework of Design Interactions
Figure 1(b) presents a specialised view of the ontology of design worlds, with the design agent (described by the interpreted and expected world) located within the external world, and with general classes of design representations placed into this nested model. The set of expected design representations (Xe^i) corresponds to the notion of a design state space, i.e. the state space of all possible designs that satisfy the set of requirements. This state space can be modified during the process of designing by transferring new interpreted design representations (X^i) into the expected world and/or transferring some of the expected design representations (Xe^i) out of the expected world. This leads to changes in external design representations (X^e), which may then be used as a basis for re-interpretation changing the interpreted world. Novel interpreted design representations (X^i) may also be the result of memory (here called constructive memory), which can be viewed as a process of interaction among design representations within the interpreted world rather than across the interpreted and the external world.
Both interpretation and constructive memory are viewed as “push-pull” processes, i.e. the
results of these processes are driven both by the original experience (“push”) and by some of
the agent’s current interpretations and expectations (“pull”) (Gero and Fujii 2000). This
notion captures two ideas. First, interpretation and constructive memory have a subjective
nature, using first-person knowledge grounded in the designer’s interactions with their
environment (Bickhard and Campbell 1996; Clancey 1997; Ziemke 1999; Smith and Gero
2005). This is in contrast to static approaches that attempt to encode all relevant design
knowledge prior to its use. Anecdotal evidence in support of first-person knowledge is
provided by the common observation that different designers perceive the same set of
requirements differently (and thus produce different designs). And the same designer is likely
to produce different designs at later times for the same requirements. This is a result of the
designer acquiring new knowledge while interacting with their environment between the two
times.
Second, the interplay between “push” and “pull” has the potential to produce emergent
effects, leading to novel and often surprising interpretations of the same internal or external
representation. This idea extends the notion of biases that simply reproduce the agent’s
current expectations. Examples have been provided from experimental studies of designers
interacting with their sketches of the design object. Schön and Wiggins (1992) found that
designers use their sketches not only as an external memory, but also as a means to reinterpret
what they have drawn, thus leading the design in a surprising, new direction. Suwa et al.
(1999) noted, in studying designers, a correlation of unexpected discoveries in sketches with
the invention of new issues or requirements during the design process. They concluded that
“sketches serve as a physical setting in which design thoughts are constructed on the fly in a
situated way”. Guindon’s (1990) protocol analyses of software engineers, designing control
software for a lift, revealed that designing is characterised by frequent discoveries of new
requirements interleaved with the development of new partial design solutions. As Guindon
puts it, “designers try to make the most effective use of newly inferred requirements, or the
sudden discovery of partial solutions, and modify their goals and plans accordingly”.
2.3. The Situated Function-Behaviour-Structure Framework
Gero and Kannengiesser (2004) have combined the ontology of design artefacts (Section 2.1) with the ontology of design worlds (Section 2.2), by specialising the model of situatedness shown in Figure 1(b). In particular, the variable X, which stands for design representations in general, is replaced with the more specific representations F, B and S. This provides the basis of the situated FBS framework, Figure 2 (Gero and Kannengiesser 2004). In addition to using external, interpreted and expected F, B and S, this framework uses explicit representations of external requirements given to the designer by another agent (usually the customer). Specifically, there may be external requirements on function (FR^e), external requirements on behaviour (BR^e), and external requirements on structure (SR^e). The situated FBS framework also introduces the process of comparison between interpreted behaviour (B^i) and expected behaviour (Be^i), and a number of processes that transform interpreted structure (S^i) into interpreted behaviour (B^i), interpreted behaviour (B^i) into interpreted function (F^i), expected function (Fe^i) into expected behaviour (Be^i), and expected behaviour (Be^i) into expected structure (Se^i). Figure 2 uses the numerals 1 to 20 to label the resultant set of processes; however, it should be noted that they do not represent any order of execution.
The 20 processes can be mapped onto eight fundamental design steps (Gero 1990; Gero
and Kannengiesser 2004).
1. Formulation: consists of processes 1 – 10. It includes interpretation of external
requirements, given to the designer by a customer, as function, behaviour and
structure, via processes 1, 2 and 3. Requirements are also constructed as implicit
requirements generated from within the designer, using constructive memory
(processes 4, 5 and 6). Focussing transfers a subset of the (explicitly and
implicitly) required function, behaviour and structure into the expected world
(processes 7, 8 and 9). In summary, processes 1 – 9 represent activities that
populate the interpreted and expected worlds with design concepts, providing the
basis for subsequent transformations of these concepts. Process 10 transforms
expected function into additional expected behaviour. The set of expected
function, behaviour and structure, resulting from the formulation step, represents
the design state space. It includes all the variables and their ranges of values that
are relevant for the design task.
2. Synthesis: consists of process 11 to generate an instance of structure that is
expected to meet the required behaviour, and the externalisation of that structure
via process 12. This design step can be viewed as part of a search process
through the (previously formulated) state space of all possible instances of
structure.
3. Analysis: consists of interpretation of externalised structure (process 13) and the
derivation of behaviour from that structure (process 14).
4. Evaluation: consists of a comparison of expected behaviour and behaviour
derived through analysis (process 15).
5. Documentation: produces an external representation of the final design solution
for purposes of communicating that solution in terms of structure (process 12),
and, optionally, behaviour (process 17) and function (process 18).
6. Reformulation type 1: consists of focussing on different structures than
previously expected (process 9). Precursors of this process are the interpretation
of external structure (process 13), constructive memory of structure (process 6)
or the interpretation of new requirements on structure (process 3).
7. Reformulation type 2: consists of focussing on different behaviours than
previously expected (process 8). Precursors of this process are the derivation of
behaviour from structure (process 14), the interpretation of external behaviour
(process 19), constructive memory of behaviour (process 5) or the interpretation
of new requirements on behaviour (process 2).
8. Reformulation type 3: consists of focussing on different functions than
previously expected (process 7). Precursors of this process are the ascription of
function to behaviour (process 16), the interpretation of external function
(process 20), constructive memory of function (process 4) or the interpretation of
new requirements on function (process 1).
The numbering of the eight design steps, similar to the 20 processes, does not prescribe
any order of execution. While it may be expected for some routine design tasks to follow a
sequential execution of only the first five steps, it has been found that all three types of
reformulation frequently occur throughout the process of designing (McNeill et al. 1998).
Figure 2. The situated FBS framework.
The situated FBS framework represents designing independently of the domain of the
design and the specific methods used, and of the subject carrying out the process of
designing. What we have referred to as the “design agent” in the definition of the three design
worlds can be embodied by a human designer (or team of human designers), a computational
tool, or a combination of both.
3. A Process-Centred Ontology of Design
The object-centred ontology of design presented in Section 2 has been helpful for establishing
a basic understanding of design. Its emphasis on artefacts provides an intuitive, tangible
perspective, representing the process of designing as a gradual evolution of the design object
across three levels. The three-world model of design interactions, in which this representation
is embedded, is sufficiently rich to account for the phenomena of situatedness.
However, the object-centred ontology lacks sufficient detail and rigour to be useful for
comparing or developing different methods and computer support for designers. The key
ideas and semantics conveyed by the situated FBS framework are only informally expressed
using textual, natural-language descriptions such as in Sections 2.2 and 2.3. The graphical
model in Figure 2 does not fully capture these semantics. The mapping of the 20 processes
onto Gero’s (1990) eight fundamental design steps has added some more meaning by locating
these processes within typical phases of a design project. However, this mapping does not
completely capture all the semantics and is too informal to be used as an ontological
framework for computer-aided design. What is needed is an ontology that is process-centred,
treating design processes as first-class entities rather than derivatives of object-centred
constructs. This Section will present such an ontology, extending our recent work on an FBS
ontology of processes (Gero and Kannengiesser 2007).
3.1. An Ontology of Processes
Processes are usually understood as entities that are less tangible than (physical) objects.
Nonetheless, they can be represented using the same set of ontological constructs as used for
describing objects: function, behaviour and structure. To clearly distinguish between the
notations of the process-centred and the object-centred FBS ontology, we will use the indices
“p” for “process” and “o” for “object”.
3.1.1. Process Function
Function (F_p) of a process is ontologically no different to object function, as it is based on the observer's goals rather than on embodiment as an object or as a process. Instances of process functions are largely domain-dependent. However, most processes that we design and execute through actions have the general function of replacing an existing state of the world with a desired one.
3.1.2. Process Behaviour
Behaviour (B_p) of a process relates to attributes that allow comparison on a performance level as a basis for process evaluation. Typical process behaviours are speed, cost, amount of space required and accuracy. These behaviours can be specialised and/or quantified for instances of processes in particular domains.
3.1.3. Process Structure
Through an analogy with the structure of physical objects, we can distinguish between a macro- and a micro-structure (S_p) of processes.
Figure 3. The macro-structure of a process (i = input; t = transformation; o = output).
The macro-structure of a process includes three components and two relationships,
Figure 3.
The components are
• an input (i),
• a transformation (t) and
• an output (o).
The relationships connect
• the input and the transformation (i – t) and
• the transformation and the output (t – o).
Input (i) and output (o) represent properties of entities being transformed in terms of their
variables and/or their values. For example, the process of transportation changes the values
for the location of a (physical) object (e.g. the values of its x-, y- and z-coordinates). The
process of electricity generation takes mechanical motion as input and produces electrical
energy as output.
A common way to describe the transformation (t) of a process is in terms of a plan, a set
of rules or other procedural descriptions. A typical example is a software procedure that is
expressed in source code or as an activity diagram in the Unified Modeling Language (UML).
Such descriptions are often used to specify sub-components of the transformation.
The relationships between the three components of a process are usually uni-directional
from the input to the transformation and from the transformation to the output. For iterative
processes the t – o relationship is bi-directional to represent the feedback loop between the
output and the transformation.
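To make the macro-structure tangible, a minimal Python sketch (the class and attribute names are an illustration of ours, not part of the ontology's formal definition) of a process with an input, a transformation and an output component:

from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Process:
    # Macro-structure of a process: input (i), transformation (t), output (o).
    # The i-t and t-o relationships are realised by applying the
    # transformation to the input and storing the result as the output;
    # the micro-structure (how and by whom t is carried out) is left open.
    input: Any
    transformation: Callable[[Any], Any]
    output: Any = None

    def execute(self) -> Any:
        self.output = self.transformation(self.input)
        return self.output

# Example: a transportation process changing an object's location.
move = Process(input=(0.0, 0.0, 0.0), transformation=lambda p: (p[0] + 1.0, p[1], p[2]))
move.execute()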
The micro-structure or “material” of a process differs from the macro-structure because
its components and relationships cannot be distinguished (or are not relevant) at the same
level of abstraction. For example, it is not common to specify the (business process)
transformation “pay the supplier” in terms of more fine-grained activities (sub-components)
such as “log in to online banking system”, “fill out funds transfer form” and “click the submit
button”. This set of activities is best viewed as a micro-structure specified only through a
shorthand qualifier such as “using internet banking”. Micro-structure can also be associated
with the input and output components of a process. For example, a set of measuring data that
is the input of a statistical analysis process may be “materialised” through either digital or
paper-based media.
While micro-structure is clearly needed to carry out (“materialise”) a process, the
components and relationships of that micro-structure are not explicitly represented. This fits
with one of Merriam-Webster’s definitions of material as “the formless substratum of all
things which exists only potentially and upon which form acts to produce realities”.
The “formless substratum” of a transformation may reference not only processes but also
objects. It then denotes the entity or agent executing the transformation. In the “pay the
supplier” example, it is possible to specify “finance officer” or “purchasing department” as a
general descriptor for the executing agent. Including such references to agents (as “actors” or
“roles”) has become well-established in process modelling (Curtis et al. 1992).
Since micro-structure does not specify components and relationships, it can be embodied
by either (micro-) objects or (micro-) processes. In some instances, the micro-structure of
objects can refer to (micro-) processes rather than (micro-) objects. For example, the chemical
bonds (macro-relationships) between the atoms (macro-components) of a molecule are
realised by physical processes, according to the laws of quantum electrodynamics. A view of
the world as being based on processes rather than objects has generally been suggested in
process philosophy (Rescher 2006).
3.1.4. Relationships between Process Function, Behaviour and Structure
Relationships among F_p, B_p and S_p are constructed according to the same principles as described for F_o, B_o and S_o (see Section 2.1.4). Function is ascribed to behaviour based on associations of process performance with human goals. Behaviour can be derived from structure either directly or indirectly based on external effects. An example of directly derived behaviour is the speed of a process, as this depends exclusively on the macro-structure ("what kind of transformation is used on what input to produce what output?") and the micro-structure ("how and by whom is the transformation carried out using what input/output media?"). An example of indirectly derived behaviour is accuracy, which needs an external benchmark against which the output of the process is compared.
3.2. An Ontology of Design Processes
The FBS ontology of processes can be used to re-represent the object-centred description of the 20 design processes (presented in Section 2.3) as a process-centred one. Most parts of the object-centred model depicted in Figure 2 directly map onto the input and output components of process macro-structure (S_p). For example, S_p of process 14 in Figure 2 includes (interpreted) object structure (S_o) as input (i) and (interpreted) object behaviour (B_o) as output (o).¹ No specific information is given about transformation components, as this is available only at an instance level.

¹ Indices for "interpreted" have been omitted here to improve notational clarity.
Most of the semantics of the situated FBS framework can be captured by process function (F_p). Table 1 gives an overview of the structure and functions of each of the 20 design processes.
Table 1. Function (F_p) and macro-structure (S_p) of the 20 design processes

ID   Process class         (macro-) S_p              F_p
1    Interpretation        FR^e → F^i                1. transfer design concepts as intended;
2                          BR^e → B^i                2. re-interpret design concepts
3                          SR^e → S^i
4    Constructive memory   F^i → F^i                 1. retrieve design concepts as stored;
5                          B^i → B^i                 2. re-construct design concepts
6                          S^i → S^i
7    Focussing             F^i → Fe^i                construct function state space
8    Focussing             B^i → Be^i                construct behaviour state space
9    Focussing             S^i → Se^i                construct structure state space
10   Transformation        Fe^i → Be^i               construct behaviour state space
11   Transformation        Be^i → Se^i               generate values for design structure
12   Action                Se^i → S^e                1. communicate the design to others;
                                                     2. initiate reflective conversation
13   Interpretation        S^e → S^i                 1. transfer design concepts as intended;
                                                     2. re-interpret design concepts
14   Transformation        S^i → B^i                 1. analyse for performance expectations;
                                                     2. generate new design issues
15   Comparison            {Be^i, B^i} → decision    evaluate the design
16   Transformation        B^i → F^i                 generate new design issues
17   Action                Be^i → B^e                1. communicate the design to others;
18                         Fe^i → F^e                2. initiate reflective conversation
19   Interpretation        B^e → B^i                 1. transfer design concepts as intended;
20                         F^e → F^i                 2. re-interpret design concepts
Interpretation processes (1, 2, 3, 13, 19 and 20) can have two different functions. One
function is to transfer existing design concepts from one agent to another or the same agent
without a change of the initial meaning of these concepts. This involves bringing external
representations into a form that allows processing of these representations by the individual
design agent. The other function of interpretation is to re-interpret design concepts based on
existing ones. This generates design concepts and issues that are novel with respect to the
ones initially intended.
Constructive memory processes (4, 5 and 6) have a similar set of functions. One function
is to retrieve design concepts from some storage space in the same way as they were
experienced at the time of storage. While this may include some computation or
transformation, such as refinement or decomposition of design concepts, the results of this
process will all have a pre-defined relationship with the initial concepts. The other function of
the constructive memory processes is to re-construct and thereby modify existing design
concepts, which corresponds to the notion of reflection (Schön 1983).
Focussing processes (7, 8 and 9) have the function to construct the design state space.
This includes the construction of the initial design state space (maps onto the formulation
step) and subsequent modifications of that space (maps onto the reformulation steps).
Action processes (12, 17 and 18) can have two different functions. One function is to
communicate aspects of the design to other stakeholders (agents). Here, the notion of
communication is used in its traditional sense of sharing information, based on unambiguous
transfer of design concepts. The other function is to initiate reflective conversation, either
with other stakeholders (agents) or the initiator of the action process itself. In other words,
external representations are produced to be re-interpreted in new ways.
Processes 10, 11, 14 and 16 may be called “FBS_o transformations” based on their role as transformers between F_o, B_o and S_o. Process 10 has the function to construct the behaviour
state space, and process 11 has the function to generate values within the (previously
constructed) structure state space. Process 14 has two functions. One function is to analyse
the design with respect to current performance expectations. The other function is to generate
new design concepts that can be included as new issues in the current design task. This is also
the function of process 16. The comparison process (15) has the function to evaluate the
design, based on decision making informed by comparison of expected and interpreted design
performance.
It can be seen that some of the functions (F_p) – loosely speaking – relate to non-situated
and others to situated aspects of designing. Non-situated aspects are captured by those
functions that do not address the potential for change during designing. These are the
functions that involve “transfer” (in interpretation processes), “retrieval” (in constructive
memory processes) and “communication” (in action processes). Situated aspects of designing
describing the potential for change are captured by functions that involve “re-interpretation”
(in interpretation processes), “re-construction” (in constructive memory processes) and
“reflective conversation” (in action processes).
Table 1 does not include the behaviours (B_p) of the 20 processes. This is because, at the current level of abstraction, they are no different from the general process behaviours described in Section 3.1.2. This follows from the independence of our ontology from specific methods or design domains. No detailed information about structure (S_p) and exogenous effects is available to specialise or quantify general process behaviours (B_p) such as speed, accuracy and cost. An example of such detailed information would be process structures (S_p) that contain iterations (e.g., when using genetic algorithms (GAs) in design synthesis). In this case, the behaviour (B_p) “rate of convergence” could be derived as a specialisation of the behaviour (B_p) “speed”. However, as our aim here is to provide a general rather than an instance-specific ontology, different classes of design processes are distinguished only at the level of function (F_p) and structure (S_p).
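As an illustration of such a specialisation (a sketch added here under the assumption of an iterative, GA-like process structure; the function below is not taken from the chapter), a “rate of convergence” behaviour could be derived from the best objective value recorded at each iteration:

    def rate_of_convergence(best_per_iteration):
        """Average improvement of the best objective value per iteration
        (minimisation assumed); a specialisation of the general behaviour 'speed'."""
        if len(best_per_iteration) < 2:
            return 0.0
        total_improvement = best_per_iteration[0] - best_per_iteration[-1]
        return total_improvement / (len(best_per_iteration) - 1)

    # e.g. rate_of_convergence([10.0, 7.0, 5.5, 5.0]) == 5.0 / 3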
4. An Ontological Framework for Computer-Aided Design Support
The ontological view presented in Sections 2 and 3 has provided a detailed description of 20
distinct processes in designing. This is useful for enhancing our understanding of designing as a human activity. However, the ultimate aim of most research in design is to enhance the performance of this activity, both in terms of higher effectiveness and efficiency. The key to improving performance or behaviour (B_p) of designing is in the structure (S_p) it is derived from. This requires more detailed representations of structure (S_p) than presented in Table 1,
and mainly concerns micro-structure. Exploring the micro-level of process structure is a
general research theme that has been recognised in a number of other disciplines (Osterweil
2005).
Research in the micro-structure (S_p) of designing can be characterised loosely as either
method- or tool-oriented. Method-oriented approaches focus on process-centred
representations of micro-structure. These representations can be viewed as composing a new
macro-structure to be “materialised” by humans or tools. Tool-oriented design research
focuses on object-centred representations of micro-structure in terms of new design tools.
Computer-aided design research and development is clearly located in this field. Both
method- and tool-oriented research streams are complementary, as each of them often uses
results from the other.
Computer-aided design tools can themselves be regarded as design objects. Applying the
FBS ontology to these tools provides a schema for the characteristics that the tools must
exhibit to be useful in the process of designing. We will use the index “t” for “tool” to
distinguish the FBS view of tools from the FBS view of design objects and design processes.
The most essential characteristics of a tool relate to function (F_t), as they orient the specification of a tool’s behaviour (B_t) and structure (S_t) towards the required goals and context of use. Many of the functions of computer-aided design tools do not differ from those of any other software product. They include such general characteristics as usability, reliability, maintainability and others (ISO 2001). However, there are a number of functions that are specific to computer-aided design tools. These functions relate to the tools’ role as the “material” of design processes, and can generally be described as “to support design processes of class X”. For example, a general function (F_t) of a commercial CAD tool is “to support design processes of class X = documentation” (one of the fundamental design steps presented in Section 2.3). These functions can be further specialised using particular combinations of the FBS_p properties of the 20 design processes presented in Table 1. An example of a more specific function (F_t) of a CAD tool is “to support the process structure (S_p) Se_i → S_e in a way that achieves the process function (F_p) of communicating the design to others”.
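Such requirement statements can be generated mechanically from the process descriptions. The sketch below is illustrative only; it reuses the hypothetical DesignProcess records introduced after Table 1 and derives candidate tool functions F_t from a process’s S_p and F_p:

    def tool_functions(process):
        """Derive candidate tool functions F_t from a process's macro-structure S_p
        and functions F_p, phrased like the example requirement in the text."""
        return [
            "to support the process structure (S_p) {} in a way that achieves "
            "the process function (F_p) '{}'".format(process.structure, fp)
            for fp in process.functions
        ]

    # For the Action process 12 this yields, among others:
    # "to support the process structure (S_p) Se_i -> S_e in a way that achieves
    #  the process function (F_p) 'communicate the design to others'"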
The set of functions (F_t) derivable in this way can serve as high-level requirements for the development of new design tools. This approach makes research and development in computer-aided design look like a design process, generating computational models and architectures as the structure (S_t) of tools exhibiting certain behaviours (B_t) to achieve the required functions (F_t). The remainder of this Section will cast existing work on computer-aided design systems in this ontology, classifying that work based on tool functions (F_t) derived from the combinations of F_p and S_p shown in Table 1. This aims to provide an overview of the current range of both commercial software and academic proof-of-concept demonstrators. For this purpose, detailed descriptions of their behaviour (B_t) and structure (S_t) are not required. Readers may consult our references to the literature for more specific information.
Figure 4. Action in the situated FBS framework.
4.1. Computer-aided Design Support for Action
The notion of a “tool” has traditionally been viewed as a mechanism for humans to perform
actions. Computer-aided design tools can serve two possible functions (F_t) in their support of
action (see Table 1):
• to support communicating the design
• to support initiating reflective conversation
Figure 4 highlights processes 12, 17 and 18 in the situated FBS framework to represent
actions related to these two functions.
4.1.1. Support for Communicating the Design
• Se_i → S_e (process 12): The ability to generate representations of external object structure (S_o) is provided by commercial CAD systems. These tools produce 2-D or 3-D models and offer functionalities such as scaling, rotating and rendering to communicate different aspects of the object. The models generated by CAD systems are primarily used for data exchange with other designers, manufacturers or other stakeholders, or for providing input for tools that perform analyses of the designed objects. Communication across different tools has been recognised as an area of growing concern, as the tools generally use different languages (data formats) for representing object structure. A number of approaches address this problem by defining standardised product models, the best known of which are STEP and IFCs (Eastman 1999). Many CAD tools now have translators (called pre-processors) that map object structure onto a neutral format based on these standards. Some of our previous work was concerned with developing an agent-based approach to communicating product data in situations where no standard formats are available (Kannengiesser and Gero 2006; Kannengiesser and Gero 2007).
• Be_i → B_e (process 17): Virtual reality (VR) systems are increasingly used to generate 3-D objects in a place-like context that usually includes avatars representing potential users or stakeholders of the design. These tools support modelling not only the structure (S_o) but also the behaviour (B_o) of the designed object based on simulated interactions with avatars or other objects. Digital mock-ups (DMUs) are based on a similar concept, and are commonly used for the simulation of assembly operations or kinematics. Other tools that focus mainly on the communication of object behaviour (B_o) are those specialised in performing particular engineering analyses. Typical examples here include the representation of stresses and temperatures.
• Fe_i → F_e (process 18): There are currently no commercial tools specialised in generating formal representations of object function (F_o). This is mainly due to the lack of a commonly agreed representation language. In most cases, function is described informally using natural language expressions, usually based on verb-noun pairs (Jacobsen et al. 1991) that are also used in this chapter. These descriptions can be produced by general-purpose word processors and annotation mechanisms provided by CAD systems. Future tool support may result from recent work on more formal representations of function (Chandrasekaran and Josephson 2000; Stone and Wood 2000; Szykman et al. 2001; Deng 2002).
4.1.2. Support for Initiating Reflective Conversation
• Se_i → S_e (process 12): There are no commercial design tools that explicitly aim at supporting reflective conversation. However, there are some method-oriented approaches that may inform the development of such tools. For example, Jun and Gero (1997) have demonstrated how shapes can emerge by representing the same geometrical structure in different ways. Current CAD systems do not have this ability, as their representations are fixed through the way they store a design’s geometry in their database. An approach by Reymen et al. (2006) uses checklists and forms to stimulate designers to create textual descriptions of designs from multiple perspectives, at regular intervals during the process of design.
• Be_i → B_e (process 17): Reflective conversation at the behaviour level has not been well understood. However, the models of generating multiple representations described for the structure level can be applied when behaviour is represented using shapes. The notion of space, for example, can be viewed as a behaviour (derived from a walls-and-floor structure) that can be described geometrically.
• Fe_i → F_e (process 18): Apart from cases in which functions represent references to shapes, reflective conversation at the function level has not been well understood. Tool support for generating multiple, textual representations of function may be developed based on research in natural language semantics. For example, de Vries et al. (2005) explore the use of the WordNet lexicon (Miller 1995) to generate a graph of synonyms and other semantic relations from a given set of words.
4.2. Computer-aided Design Support for FBS_o Transformations and Evaluation
A number of research efforts have concentrated on tool support for performing those transformations and evaluations that have been viewed as fundamental in most traditional models of designing (e.g., Asimov (1962)). These include the transformations between the function, behaviour and structure of the design object, and evaluation based on comparing expected with “actual” behaviour. Figure 5 highlights processes 10, 11, 14, 15 and 16 to represent these activities.
Figure 5. FBS_o transformations and evaluation in the situated FBS framework.
• S_i → B_i (process 14): There is a wide range of commercial tools that support the derivation of object behaviour from design structure. These are commonly referred to as analysis tools or simulation tools. Most of them are based on the physical laws and principles established in the engineering sciences. Examples of design analyses for which there is automated support include finite element analysis, thermal analysis, energy analysis and kinematic analysis. Some tools, such as design optimisation tools and parametric CAD systems, provide automated support for the S_i → B_i transformation as part of a collection of transformations that also include evaluation (process 15) and the generation of object structure (process 11). These tools will be presented in more detail under the bullet points for processes 11 and 15 (below). The function of generating new design issues (see Table 1) is addressed by some CAD systems performing runtime analyses of the design, such as Design for X (DFX) analyses. Gero and Kazakov (1998) have developed a computational model of behaviour analogy where new behaviour variables are introduced into the target design based on structure similarity with the source design.
• Be_i → Se_i (process 11): Parametric CAD systems have been shown to significantly facilitate the creation of solid models (Shah and Mäntylä 1995), and many CAD vendors now offer parametric modelling features. These systems can be viewed as automating the process of computing an object structure once a set of parameters has been formulated for both structure and behaviour. Parametric CAD systems also allow for automated maintenance of parametric constraints (Sacks et al. 2004). This requires additional automation for analysing and evaluating the design for constraint violations, which can be mapped onto the transformation process S_i → B_i (process 14) and the evaluation process {Be_i, B_i} → decision (process 15) (see the sketch after this list). Design optimisation tools provide similar integrated functionalities supporting the same set of processes. They provide an extensive range of mechanisms to evolve object structure, including various deterministic and stochastic search methods (Papalambros and Wilde 2000).
• {Be_i, B_i} → decision (process 15): Automated support for this process is provided in a number of computer-aided design systems, as indicated above. Optimisation tools, in particular, incorporate sophisticated strategies for controlling the execution of alternative search paths, based on the performance of the current design candidate. Research on agent-based design systems addresses evaluation using conflict resolution mechanisms, which have been applied to instances of multi-objective design optimisation (Grecu and Brown 1996; Campbell et al. 1999).
• Fe_i → Be_i (process 10): Few systems have been developed that support the generation of object behaviours based on object function (Maiden and Sutcliffe 1992; Bhatta et al. 1994; Umeda et al. 1996). This is mainly due to the lack of a formal language to represent function.
• B_i → F_i (process 16): There has been no work to date on tool support for this process.
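The coupling of processes 14, 15 and 11 that parametric CAD systems and optimisation tools automate can be sketched as a simple loop. The toy example below is an assumption-laden illustration (a rectangular plate with two parameters and made-up expectations), not a description of any particular system:

    def derive_behaviour(structure):
        """Process 14 (S_i -> B_i): derive behaviour from the current structure."""
        return {"area": structure["width"] * structure["height"],
                "aspect_ratio": structure["width"] / structure["height"]}

    def evaluate(behaviour, expected):
        """Process 15 ({Be_i, B_i} -> decision): compare derived with expected behaviour."""
        return (behaviour["area"] >= expected["min_area"]
                and behaviour["aspect_ratio"] <= expected["max_aspect_ratio"])

    def regenerate_structure(structure, expected):
        """Process 11 (Be_i -> Se_i): generate new structure values after a failed evaluation."""
        scale = (expected["min_area"] / derive_behaviour(structure)["area"]) ** 0.5
        return {"width": structure["width"] * scale, "height": structure["height"] * scale}

    structure = {"width": 2.0, "height": 1.0}
    expected = {"min_area": 8.0, "max_aspect_ratio": 3.0}
    for _ in range(10):                      # bounded iteration for safety
        if evaluate(derive_behaviour(structure), expected):
            break
        structure = regenerate_structure(structure, expected)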
Figure 6. Focussing in the situated FBS framework.
4.3. Computer-aided Design Support for Focussing
There has been some work on tools to support focussing, the processes involved in the
formulation of a design state space. These tools are mainly based on decision-making
mechanisms that use various kinds of information. Figure 6 highlights processes 7, 8 and 9 to
represent focussing.
• S_i → Se_i (process 9): A number of computational approaches to focussing on object structure have been developed in the area of design optimisation. Some of this work uses information extracted from the current design. For example, Parmee’s (1996) cluster-oriented genetic algorithms (COGAs) identify high-performance regions within the current structure state space. These features are then used for focussing on different structure variables and constraints, to concentrate the search for an optimum design on particular areas within the original structure state space. Other work uses information learnt from previous design tasks. A tool developed by Schwabacher et al. (1998) extracts characteristics of previous optimisation results and uses them to formulate new optimisation problems. These characteristics include information such as optimal structure, mappings between structure and behaviour, infeasible behaviour and active constraints. This information is used to improve the problem formulation by reducing the structure state space (see the sketch after this list).
• B_i → Be_i (process 8): Some work has been done on focussing at the level of object behaviour, again mostly in the context of optimisation. Mackenzie and Gero (1987) have induced rules to detect certain features of Pareto optimal sets relating to curvature, sensitivity and other information. The rules use this information to reformulate the problem by carrying out focussing in a way that reduces the behaviour state space. Jozwiak’s (1987) approach uses learning to acquire knowledge of inactive constraints, which is then used to predict whether or not the constraints of the current optimisation task may be neglected.
• F_i → Fe_i (process 7): There has been no work to date on tool support for this process.
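In the same spirit, a focussing step that reduces the structure state space can be sketched as follows. This toy function merely shrinks each variable’s range around previously successful designs; it is an illustration of the general idea only, not the cited COGA or learning approaches:

    def focus_state_space(bounds, good_designs, margin=0.2):
        """Process 9 (S_i -> Se_i): narrow each variable's range to an interval
        around values found in previously successful designs."""
        focussed = {}
        for var, (lo, hi) in bounds.items():
            values = [d[var] for d in good_designs]
            pad = margin * (hi - lo)
            focussed[var] = (max(lo, min(values) - pad), min(hi, max(values) + pad))
        return focussed

    original = {"width": (0.0, 10.0), "height": (0.0, 5.0)}
    high_performers = [{"width": 6.1, "height": 2.0}, {"width": 6.8, "height": 2.3}]
    focussed = focus_state_space(original, high_performers)
    # e.g. focussed == {"width": (4.1, 8.8), "height": (1.0, 3.3)} (approximately)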
Figure 7. Interpretation in the situated FBS framework.
4.4. Computer-aided Design Support for Interpretation
Tools usually have some form of interface to receive and utilise input provided externally
either by humans or other tools. In computer-aided design, there are two possible functions
(F_t) for interpretation by tools:
• to support transferring design concepts as intended
• to support re-interpreting design concepts
Figure 7 highlights processes 1, 2, 3, 13, 19 and 20 to represent interpretation.
4.4.1. Support for Transfer of Design Concepts as Intended
• SR_e → S_i (process 3) and S_e → S_i (process 13): There has been considerable research in the computational interpretation of external object structure. The standardisation approaches to product modelling, mentioned in Section 4.1.1, provide the basis for the development of import mechanisms (called post-processors) that translate the standard models into the tool’s native format. Post-processors for STEP and IFC models are available in a number of commercial CAD/CAE/CAM systems. Another area of research is concerned with the interpretation of human sketches and freehand drawings by tools converting them into more exact graphical models or performing early design analyses (Taggart 1975; Gross 1996; Leclercq 2001).
• BR_e → B_i (process 2) and B_e → B_i (process 19): Most design tools dealing with object behaviour directly derive that behaviour from structure (process 14) rather than interpreting it externally (e.g., from other tools). As a result, not much work exists on tool support for the interpretation of object behaviour. However, recent approaches to interoperability aiming to standardise the representation of function and behaviour besides structure (Szykman et al. 2001) may lead to the development of tool translators that automate this process.
• FR_e → F_i (process 1) and F_e → F_i (process 20): Most tool support for the interpretation of object function is based on mechanisms of word recognition, given that many representations of function are described using natural language annotations. While there are a large number of general-purpose tools that provide interfaces for textual input (such as word processors or electronic whiteboards), only a few of them (e.g., the word generation system developed by de Vries et al. (2005)) offer more word-processing features than just editing. Future work on the interpretation of function can be expected to be driven by advances in both representing and reasoning about function, particularly in the area of design interoperability (Szykman et al. 2001).
4.4.2. Support for Re-Interpretation of Design Concepts
• SR_e → S_i (process 3) and S_e → S_i (process 13): Most research in re-interpretation has been done at the level of object structure. A system presented by Saund and Moran (1994) supports the creation of multiple interpretations of line drawings, by first decomposing and then reassembling elements of freehand drawings. The emerging shapes are then presented to the user for selection. A design agent capable of re-interpretation has been developed by Smith and Gero (2001) on the basis of Gero and Fujii’s (2000) “push-pull” model of situated cognition. This system has been able to learn new shapes over sequences of action and (re-)interpretation that are themselves the result of the agent’s modified experience.
• BR_e → B_i (process 2) and B_e → B_i (process 19): There has been no work to date on tool support for this process.
• FR_e → F_i (process 1) and F_e → F_i (process 20): There has been no work to date on tool support for this process.
4.5. Computer-aided Design Support for Constructive Memory
Most work on computer-aided design tools includes support for memory in some way. There
are two possible functions (F_t) related to this notion:
• to support retrieval of design concepts as stored
• to support re-construction of design concepts
Figure 8 highlights processes 4, 5 and 6 to represent constructive memory.
Figure 8. Constructive memory in the situated FBS framework.
4.5.1. Support for Retrieval of Design Concepts as Stored
• S_i → S_i (process 6): Research in using memory of object structure includes work on feature-based modelling. A number of CAD systems provide design databases, repositories or libraries to store design features, such as pockets, holes and slots. Their reuse can lead to significant gains in productivity in designing. Techniques of feature extraction from geometrical CAD models can be viewed as another example of retrieving design concepts, although they require some additional computation. Here, features are implicitly stored in the pre-defined mappings underpinning common extraction techniques such as graph matching, syntactic pattern recognition and shape grammars (Shah 1991).
• B_i → B_i (process 5): Some recent work on design repositories has concentrated on including properties related to behaviour (B_o) and function (F_o) of the design object (Szykman et al. 2001; Mocko et al. 2004). In addition, approaches to capturing and reusing design rationale have focused on appropriate representations of previous object behaviour to be accessible for guiding the generation of new object structure (Chandrasekaran et al. 1993).
• F_i → F_i (process 4): Simple retrieval of object function is best exemplified by work on storing and reusing function (F_o) hierarchies in design repositories (Szykman et al. 2001) or case bases (Navinchandra et al. 1991). Approaches to retrieving implicitly stored functions include work, mentioned earlier, on inferring word relations based on WordNet (de Vries et al. 2005). Other work focuses on the construction of sub-functions using decomposition knowledge encoded in grammars (Sridharan and Campbell 2005).
4.5.2. Support for Re-Construction of Design Concepts
• S_i → S_i (process 6), B_i → B_i (process 5) and F_i → F_i (process 4): The idea of generating design concepts by situated re-construction rather than static retrieval from previous experience is quite new in design research. As a result, very little work has been done towards the development of computational models and tools that support this process. However, a number of research demonstrators have shown both the feasibility and the potential benefits of future constructive memory tools. Examples include neural network implementations used for the design of mechanical assemblies (Liew and Gero 2004), design optimisation (Peng and Gero 2006) and the exchange of product data between design tools (Kannengiesser and Gero 2007). The majority of this work provides support for re-construction of design concepts at all three levels, comprising function (F_o), behaviour (B_o) and structure (S_o).
5. Conclusion
Designing comprises a rich set of activities that is only beginning to be fully
understood. Capturing these activities and defining them in a detailed framework is necessary
to advance our understanding of design. The ontological framework presented in this chapter
is a contribution to this aim. It extends our previous, object-centred work on representing the
process of designing by adding a process-centred view. This view is based on the direct
application of the FBS ontology to design activities, treating them as first-class entities with
their own function, behaviour and structure, and no longer as mere derivatives of object-
centred constructs. This provides a more structured description at a higher level of detail,
which has the potential to make our framework of situated designing more amenable to other
researchers.
We have shown that the process-centred ontology of designing also allows a set of requirements for tool support to be specified. This is based on the connection we established between the FBS view of design processes and the FBS view of design tools. Specifically, tools are viewed as artefacts whose functions (F_t) are specialised to supporting particular aspects of design processes, which themselves consist of combinations of function (F_p), behaviour (B_p) and structure (S_p). We have demonstrated how some of the outcomes of existing computer-aided design research and development can be mapped onto the 20 classes of design processes represented in this way. One result of our mappings is that a lack of tool support can be identified for a number of design activities. At the level of granularity presented in this chapter, this concerns activities of re-interpretation and re-construction of design concepts, and reasoning and focussing on object function (F_o).
Our ontology allows the research field of computer-aided design to be understood as the “materials science” of designing, concerned with creating and analysing tools to form appropriate “materials” of design processes at different levels of granularity. This is possible because the FBS ontology represents all design objects, tools and processes uniformly. Future research may use this ontology to create more fine-grained specifications of design tools. For example, different classes of feature extraction processes can be defined based on different classes of inputs (e.g., cubic, cylindrical or free-form shapes) and on different classes of transformations (e.g., graph matching, shape grammars, neural networks, etc.), and consequently different functions (F_t) of feature extraction tools can be derived. Information about specific process behaviour (B_p) and process function (F_p), at an instance level, can be added to derive more refined tool functions (F_t). Researchers and developers in computer-aided design can then identify specific gaps in the functions (F_t) of existing tools and generate the behaviour (B_t) and ultimately the structure (S_t) of new tools to close these gaps.
Acknowledgments
NICTA is funded by the Australian Government's Department of Communications,
Information Technology and the Arts, and the Australian Research Council through Backing
Australia's Ability and the ICT Research Centre of Excellence program.
References
Asimov, M: 1962, Introduction to Design, Prentice-Hall, Englewood Cliffs.
Bartlett, FC: 1932 reprinted in 1977, Remembering: A Study in Experimental and Social
Psychology, Cambridge University Press, Cambridge.
Bhatta, S, Goel, A and Prabhakar, S: 1994, Innovation in analogical design: A model-based
approach, in JS Gero and F Sudweeks (eds) Artificial Intelligence in Design ’94, Kluwer,
Dordrecht, pp. 57-74.
Bickhard, MH and Campbell, RL: 1996, Topologies of learning, New Ideas in Psychology
14(2): 111-156.
Campbell, MI, Cagan, J and Kotovsky, K: 1999, A-Design: An agent-based approach to
conceptual design in a dynamic environment, Research in Engineering Design 11(3):
172-192.
Chandrasekaran, B, Goel, AK and Iwasaki, Y: 1993, Functional representation as design
rationale, IEEE Computer 26(1): 48-56.
Chandrasekaran, B and Josephson, JR: 2000, Function in device representation, Engineering
with Computers 16(3-4): 162-177.
Clancey, WJ: 1997, Situated Cognition: On Human Knowledge and Computer
Representations, Cambridge University Press, Cambridge.
Curtis, B, Kellner, MI and Over, J: 1992, Process modeling, Communications of the ACM
35(9): 75-90.
Deng, YM: 2002, Function and behavior representation in conceptual mechanical design,
Artificial Intelligence for Engineering Design, Analysis and Manufacturing 16(5): 343-
362.
Dewey, J: 1896 reprinted in 1981, The reflex arc concept in psychology, Psychological
Review 3: 357-370.
Eastman, CM: 1999, Building Product Models: Computer Environments Supporting Design
and Construction, CRC Press, Boca Raton.
Gero, JS: 1990, Design prototypes: A knowledge representation schema for design, AI
Magazine 11(4): 26-36.
Gero, JS and Fujii, H: 2000, A computational framework for concept formation for a situated
design agent, Knowledge-Based Systems 13(6): 361-368.
Gero, JS and Kannengiesser, U: 2004, The situated function-behaviour-structure framework,
Design Studies 25(4): 373-391.
Gero, JS and Kannengiesser, U: 2007, A function-behavior-structure ontology of processes,
Artificial Intelligence for Engineering Design, Analysis and Manufacturing 21(4), in
press.
Gero, JS and Kazakov, V: 1998, Using analogy to extend the behaviour state space in creative
design, in JS Gero and ML Maher (eds) Computational Models of Creative Design IV,
Key Centre of Design Computing and Cognition, University of Sydney, Australia, pp.
113-143.
Grecu, DL and Brown, DC: 1996, Learning by single function agents during spring design, in
JS Gero and F Sudweeks (eds) Artificial Intelligence in Design ’96, Kluwer, Dordrecht,
pp. 409-428.
Gross, MD: 1996, The electronic cocktail napkin – a computational environment for working
with design diagrams, Design Studies 17(1): 53-69.
Guindon, R: 1990, Designing the design process: Exploiting opportunistic thoughts, Human-
Computer Interaction 5: 305-344.
ISO: 2001, Software Engineering – Product Quality – Part 1: Quality Model, ISO/IEC 9126-
1, International Organization for Standardization, Geneva, www.iso.ch
Jacobsen, K, Sigurjonsson, J and Jacobsen, O: 1991, Formalized specification of functional
requirements, Design Studies 12(4): 221-224.
Jozwiak, SF: 1987, Improving structural optimization programs using artificial intelligence
concepts, Engineering Optimization 12: 155-162.
Jun, HJ and Gero, JS: 1997, Representation, re-representation and emergence in collaborative
computer-aided design, in ML Maher, JS Gero and F Sudweeks (eds) Preprints Formal
Aspects of Collaborative Computer-Aided Design, Key Centre of Design Computing and
Cognition, University of Sydney, Australia, pp. 303-320.
Kannengiesser, U and Gero, JS: 2006, Towards mass customized interoperability, Computer-
Aided Design 38(8): 920-936.
Kannengiesser, U and Gero, JS: 2007, Agent-based interoperability without product model
standards, Computer-Aided Civil and Infrastructure Engineering 22(2): 80-97.
Leclercq, P: 2001, Programming and assisted sketching, in B de Vries, JP van Leeuwen and
HH Achten (eds) CAAD Futures 2001, Kluwer Academic Publishers, Dordrecht, pp. 15-
32.
Liew, P and Gero, JS: 2004, Constructive memory for situated design agents, Artificial
Intelligence for Engineering Design, Analysis and Manufacturing 18(2): 163-198.
Mackenzie, CA and Gero, JS: 1987, Learning design rules from decisions and performances,
Artificial Intelligence in Engineering 2(1): 2-10.
Maiden, NA and Sutcliffe, AG: 1992, Exploiting reusable specifications through analogy,
Communications of the ACM 35(4): 55-63.
McNeill, T, Gero, JS and Warren, J: 1998, Understanding conceptual electronic design using
protocol analysis, Research in Engineering Design 10(3): 129-140.
Miller, GA: 1995, WordNet: A lexical database for English, Communications of the ACM
38(11): 39-41.
Mocko, G, Malak, R, Paredis, C and Peak, R: 2004, A knowledge repository for behavioral
models in engineering design, Computers and Information Science in Engineering
Conference ’04, Salt Lake City, UT.
Navinchandra, D, Sycara, KP and Narasimhan, S: 1991, Behavioral synthesis in CADET, a
case-based design tool, IEEE Conference on Artificial Intelligence Applications, Miami
Beach, FL, pp. 217-221.
Osterweil, LJ: 2005, Unifying microprocess and macroprocess research, in M Li, B Boehm
and LJ Osterweil (eds) Unifying the Software Process Spectrum, Springer-Verlag, Berlin,
pp. 68-74.
Papalambros, P and Wilde, DJ: 2000, Principles of Optimal Design: Modeling and
Computation, Cambridge University Press, Cambridge.
Parmee, IC: 1996, Towards an optimal engineering design process using appropriate adaptive search strategies, Journal of Engineering Design 7(4): 341-362.
Peng, W and Gero, JS: 2006, Concept formation in a design optimization tool, in J van
Leeuwen and H Timmermans (eds) Innovations in Design Decision Support Systems in
Architecture and Urban Planning, Springer-Verlag, Berlin, pp. 293-308.
Rescher, N: 2006, Process Philosophical Deliberations, Ontos-Verlag, Frankfurt.
Reymen, IMMJ, Hammer, DK, Kroes, PA, van Aken, JE, Dorst, CH, Bax, MFT and Basten,
T: 2006, A domain-independent descriptive design model and its application to structured
reflection on design processes, Research in Engineering Design 16(4): 147-173.
Sacks, R, Eastman, CM and Lee, G: 2004, Parametric 3D modeling in building construction
with examples from precast concrete, Automation in Construction 13(3): 291-312.
Saund, E and Moran, TP: 1994, A perceptually-supported sketch editor, ACM Symposium on
User Interface Software and Technology, ACM Press, New York.
Schön, DA: 1983, The Reflective Practitioner: How Professionals Think in Action, Harper
Collins, New York.
Schön, DA and Wiggins, G: 1992, Kinds of seeing and their functions in designing, Design
Studies 13(2): 135-156.
Schwabacher, M, Ellman, T and Hirsh, H: 1998, Learning to set up numerical optimizations
of engineering designs, Artificial Intelligence for Engineering Design, Analysis and
Manufacturing 12(2): 173-192.
Shah, JJ: 1991, Assessment of features technology, Computer-Aided Design 23(5): 331-343.
Shah, JJ and Mäntylä, M: 1995, Parametric and Feature-Based CAD/CAM: Concepts,
Techniques, and Applications, John Wiley & Sons, New York.
Smith, GJ and Gero, JS: 2001, Interaction and experience: Situated agents and sketching, in
JS Gero and FMT Brazier (eds) Agents in Design 2002, Key Centre of Design Computing
and Cognition, University of Sydney, pp. 115-132.
Smith, GJ and Gero, JS: 2005, What does an artificial design agent mean by being ‘situated’?,
Design Studies 26(5): 535-561.
Sridharan, P and Campbell, MI: 2005, A study on the grammatical construction of function
structures, Artificial Intelligence for Engineering Design, Analysis and Manufacturing
19(3): 139-160.
Stone, RB and Wood, KL: 2000, Development of a functional basis for design, Journal of
Mechanical Design 122(4): 359-370.
Suwa, M, Gero, JS and Purcell, T: 1999, Unexpected discoveries and s-inventions of design
requirements: A key to creative designs, in JS Gero and ML Maher (eds) Computational
Models of Creative Design IV, Key Centre of Design Computing and Cognition,
University of Sydney, Sydney, Australia, pp. 297-320.
Suwa, M and Tversky, B: 2002, External representations contribute to the dynamic
construction of ideas, in M Hegarty, B Meyer and NH Narayanan (eds) Diagrams 2002,
Springer-Verlag, Berlin, pp. 341-343.
Szykman, S, Fenves, SJ, Keirouz, W and Shooter, SB: 2001, A foundation for interoperability
in next-generation product development systems, Computer-Aided Design 33(7): 545-
559.
Taggart, J: 1975, Sketching: An informal dialogue between designer and computer, in N
Negroponte (ed.) Reflections on Computer Aids to Design and Architecture, Petrocelli
Charter, New York, pp. 147-162.
Umeda, Y, Ishii, M, Yoshioka, M, Shimomura, Y and Tomiyama, T: 1996, Supporting
conceptual design based on the function-behavior-state modeler, Artificial Intelligence
for Engineering Design, Analysis and Manufacturing 10(4): 275-288.
de Vries, B, Jessurun, J, Segers, N and Achten, H: 2005, Word graphs in architectural design,
Artificial Intelligence for Engineering Design, Analysis and Manufacturing 19(4): 277-288.
Weisberg, DE: 2000, The electronic push, Mechanical Engineering 122(4): 52-59.
Ziemke, T: 1999, Rethinking grounding, in A Riegler, M Peschl and A von Stein (eds)
Understanding Representation in the Cognitive Sciences: Does Representation Need
Reality?, Plenum Press, New York, pp. 177-190.
INDEX
A
abstraction, 217, 220
accelerometers, 62
achievement, 209
actuation, viii, 129, 132, 133, 142
adaptation, 35
adaptations, 177
adjustment, 31
aesthetics, 151
age, 3
aggression, 119
Air Force, 206
algorithm, viii, 34, 71, 88, 89, 91, 92, 93, 96, 97,
99, 102, 104, 108, 109, 129, 130, 131, 138,
139, 141, 149, 150, 151, 163, 164, 165, 179,
186, 197, 198, 202, 206
amplitude, 70, 75, 77, 78, 123
anatomy, 132
anger, 115, 117
animations, vii, viii, ix, 55, 58, 85, 86, 87, 88, 90,
97, 103, 104, 105, 107, 108, 126, 129, 130,
145, 146
annotation, 223
anxiety, 119
articulation, 133
artificial intelligence, 233
assignment, 34
atoms, 218
Australia, 209, 231, 232, 233, 234
Austria, 60
automation, 225
B
background, 118
bandwidth, 85, 89
banking, 218
beams, 9, 11, 25, 29, 33
beautification, 160, 161, 170, 172, 175
beauty, 145
behavior, viii, 24, 29, 58, 78, 113, 118, 158, 159,
161, 162, 165, 232
Beijing, 175
Belgium, 177
bending, 206
Bible, 83
blocks, 86, 87, 88, 91, 96, 148, 151
bonds, 218
bone, 130
boredom, 117
Brazil, 113, 127
breakdown, 165
Britain, 55
building blocks, 168, 169
buttons, 27
C
calibration, 62
CAP, 4, 10
cast, 221
casting, 147
categorization, 133
Central Europe, 81
channels, 99
chicken, 92, 101, 102
China, 175
City, 143, 155, 173, 233
clarity, 41, 219
classes, 88, 164, 175, 212, 213, 220, 231
classification, 12, 114
cloning, 143
closure, 72, 74, 162
clustering, viii, 86, 90, 91, 92, 93, 94, 95, 96, 97,
100, 101, 102, 103, 104, 105, 106, 107, 108,
109
clusters, 91, 92, 93, 94, 96, 97, 98, 101, 102, 104,
105, 106, 107
codes, 4, 130
coding, viii, 86, 87, 88, 89, 90, 94, 97, 98, 99,
102, 106, 107
cognition, 229
coherence, 88, 89, 99
communication, 59, 69, 78, 79, 173, 220, 223
compatibility, 122
compensation, 90
competition, 60
complement, 2
complexity, viii, 86, 96, 132, 150, 156, 164
components, 59, 85, 88, 89, 94, 96, 97, 101, 105,
106, 108, 136, 161, 165, 166, 171, 172, 186,
193, 211, 217, 218, 219
comprehension, 80
compression, viii, 41, 86, 88, 89, 90, 91, 92, 94,
96, 97, 98, 99, 100, 101, 102, 104, 105, 106,
107, 108, 109, 110, 111, 193, 207
computation, viii, 57, 59, 63, 64, 65, 68, 69, 75,
77, 78, 79, 80, 94, 96, 129, 130, 136, 141, 148,
149, 162, 219, 230
computer graphics, vii, viii, 57, 63, 110, 129,
131, 146, 156, 157, 159, 174
computing, 58, 94, 127, 131, 148, 179, 225
conception, vii, 58
concrete, 234
conditioning, 20
conduction, 211
configuration, 20, 41, 54, 93, 125, 126, 146, 162,
164
conflict, 120, 225
conflict resolution, 225
Congress, 82
conjecture, 206
connectivity, 85, 86, 87, 88, 89, 90, 91, 92, 94,
96, 97, 104, 108, 109, 158, 159, 161, 165, 166
consensus, 114, 115, 121
conservation, 3, 12, 54
constraint-based design, ix, 157
construction, vii, ix, 1, 16, 73, 82, 130, 161, 163,
164, 165, 167, 169, 173, 175, 177, 178, 182,
185, 187, 194, 195, 200, 201, 208, 220, 230,
234
consumption, 2
continuity, 136, 159, 178, 179, 180, 186
contour, 78, 134, 183
control, ix, 24, 27, 28, 29, 32, 33, 64, 66, 68, 122,
125, 130, 131, 132, 133, 141, 152, 159, 165,
177, 178, 179, 180, 184, 185, 186, 187, 190,
191, 192, 197, 198, 199, 200, 201, 202, 203,
204, 205, 208, 213
convergence, 93, 137, 192, 220
correlation, 88, 98, 213
costs, 193
creativity, 61, 63
cues, 105, 106, 107
cultural differences, 127
culture, 2, 3, 121
curiosity, 115, 119
customers, 80, 167
Czech Republic, 81
D
damping, 23
dance, 100
data set, 138, 158
data structure, 90, 94, 150
database, 149, 223, 233
decay, 121
decision making, 220
decisions, 233
decoding, 98
decomposition, 87, 90, 91, 94, 95, 96, 97, 101,
105, 106, 107, 109, 132, 142, 165, 219, 230
definition, viii, 58, 69, 99, 113, 115, 122, 159,
166, 167, 169, 170, 205, 216
deformation, viii, 67, 76, 77, 78, 97, 129, 130,
131, 133, 135, 138, 139, 162, 166, 173
degradation, 90, 105
density, 158
derivatives, 179, 216, 231
designers, vii, viii, 54, 57, 63, 80, 210, 211, 213,
216, 223
detection, 97, 107, 161, 162, 171
deviation, 130, 135
diet, 3
differentiation, 146, 185
diffusion, 12, 56
dimensionality, 142
disappointment, 119
discipline, vii, 1
discrete data, 159
displacement, 89, 99
disposition, 118
distress, 119
distribution, 2, 12, 133
divergence, 39
dominance, 119
drawing, 15, 163
DynaFeX, viii, 113
E
ears, 123
ecstasy, 117
editors, 156, 208
Education, 56, 82
educational objective, 3
elastic deformation, 22, 28, 33
e-learning, 61
electricity, 217
electromagnetic, 59, 62, 63, 64
emotion, viii, 113, 114, 115, 116, 117, 118, 119,
120, 121, 122, 123, 124, 125, 126, 127, 133
emotional state, 115, 121, 123, 126
emotions, viii, 113, 114, 115, 116, 117, 118, 120,
121, 122, 123, 124, 125, 126, 133
employment, 172
encoding, 86, 88, 89, 90, 97, 99, 100, 102, 103,
105, 106
energy, 93, 95, 115, 217, 225
England, 174
entropy, 88, 147, 156
environment, viii, 54, 55, 57, 59, 63, 65, 70, 71,
72, 77, 79, 81, 82, 83, 119, 149, 210, 211, 212,
213, 232
estimating, 136
Europe, 55
European Union, 2
evolution, vii, 1, 2, 3, 146, 216
excitation, 122
execution, 92, 163, 214, 215, 225
expressiveness, 114
extraction, 161, 162, 171, 175, 230, 231
extrapolation, 116, 117
extrusion, 66, 149
F
Facial Action Coding, 130, 142
facial expression, viii, 87, 100, 113, 114, 115,
117, 120, 121, 122, 123, 124, 125, 126, 127,
129, 130, 131, 132, 133, 138, 139, 141, 142,
143, 144
facial muscles, 132, 133
family, 116, 117
FBS ontology, x, 209, 210, 216, 218, 221, 231
fear, 115, 117
feedback, 132, 217
FEM, 76, 77, 78, 80
finance, 218
finite element method, 192
flexibility, 163, 178
flight, 61, 62
floating, 85, 88
flour, vii, 1, 3, 4, 8, 9, 12, 22
fluid, viii, 57, 78, 79, 80
fluid dynamics, viii
food, 39
freedom, x, 63, 72, 171, 177, 178, 198, 200, 205
friction, 11
functional dynamics, ix, 145, 146, 148, 152, 153,
155
funds, 218
furniture, 61, 166
G
gene, ix, 177, 178
generalization, 126, 174, 205, 206
generation, viii, ix, 12, 57, 62, 63, 87, 107, 116,
117, 120, 122, 125, 126, 156, 157, 217, 225,
228, 230
goals, 211, 212, 213, 216, 218, 221
gold, 168
grains, 12, 22
graph, 91, 92, 95, 96, 97, 101, 105, 109, 147,
157, 158, 163, 164, 165, 184, 186, 197, 224,
230, 231
gravity, 22
Greece, 157
grief, 117
grounding, 234
grouping, 90, 91, 92, 95
groups, 17, 22, 98, 122, 123
growth, 118, 196
guidance, 81, 130
guidelines, 114, 126
H
Hamiltonian, 131, 133, 135, 137, 139, 141, 143
happiness, 118
height, 4, 41, 122, 148, 151, 152, 154, 165, 204,
211
homogeneity, 94
human activity, 221
humidity, 62
hybrid, 158
hypercube, viii, 113, 114, 117, 118, 126
I
ideal, 107, 146, 174
identification, 106
illumination, 13, 20, 34, 41, 135
illusion, 59
IMA, 206
image, viii, 3, 35, 36, 37, 38, 39, 41, 57, 60, 61,
81, 86, 91, 130, 134, 135, 136, 137, 138, 139,
141, 142, 143, 144, 147, 151, 152, 158, 193
image processing, viii, 57
imagery, 59
images, 3, 12, 15, 34, 37, 38, 41, 42, 54, 59, 60,
130, 132, 134, 135, 137, 138, 139, 141, 142,
143, 147
immersion, 59, 60
implementation, viii, 59, 63, 68, 73, 78, 80, 94,
96, 113, 123, 138, 146, 152, 177, 230
inclusion, 41, 158
independence, 220
India, 145
indices, 216
individual character, 118
industry, vii, 57, 210
inequality, 196
initial state, 20, 27, 162
insertion, 90
instability, 120
instruments, viii, 57, 63
integration, viii, 57, 63, 69, 80, 81
interaction, viii, 57, 61, 62, 63, 69, 78, 80, 81,
146, 152, 153, 158, 211, 212, 213
interactions, x, 130, 209, 210, 211, 213, 216, 223
interface, 63, 81, 82, 228
interference, 62, 70
internet, 110, 218
interoperability, viii, 113, 228, 233, 234
interval, 117
intervention, 130
inventions, 3
iris, 122
iron, 10, 25, 168
Italy, 57
iteration, 126
J
jaw, 123, 133
joints, 23, 25, 26, 27, 28, 68, 69, 70, 72, 73, 106
L
language, 2, 64, 121, 223, 224, 225, 228
laws, 27, 211, 218, 225
layering, 16
learning, 54, 82, 114, 227, 232
learning environment, 82
Least squares, 109
legend, 78, 80
lens, 17, 19, 20, 39
lifetime, 119
light transmission, 211
likelihood, 95
limitation, 63, 146, 152, 166
line, 17, 39, 60, 78, 115, 122, 152, 174, 180, 201,
228
linear systems, 192
linkage, 62, 67
links, 24, 27, 28, 70, 73, 75
localization, 193
logistics, 61, 82
love, 119
M
machinery, 4, 16, 54, 55
magnetic field, 62
maintenance, 61, 225
management, 173
manipulation, 81, 82, 153, 164, 166, 185, 205
manufacturing, 61, 82, 157, 160, 161, 165, 166,
169, 170, 209
mapping, 33, 34, 109, 126, 132, 146, 216
materials science, 231
matrix, 73, 75, 95, 96, 98, 100, 135, 136, 137,
138, 139, 142, 148, 162
measurement, 13, 16, 64
measures, 18, 99, 142, 148
mechanical properties, 130
mechanical stress, 76, 77
media, 218
Mediterranean, 168
memory, 3, 54, 85, 139, 156, 192, 198, 213, 214,
215, 219, 220, 229, 230, 233
memory processes, 219, 220
Miami, 233
Microsoft, 63, 64, 100, 145
microstructure, 218
military, 61
mixing, 62
model, viii, ix, 4, 13, 15, 16, 17, 24, 29, 34, 37,
39, 58, 63, 67, 73, 79, 80, 109, 113, 114, 115,
116, 117, 118, 120, 121, 122, 123, 126, 127,
129, 130, 131, 132, 133, 134, 135, 137, 138,
139, 140, 141, 142, 144, 146, 147, 151, 152,
158, 159, 160, 161, 165, 166, 167, 169, 170,
171, 175, 212, 213, 214, 216, 218, 225, 229,
233
modeling, vii, viii, 12, 16, 57, 63, 66, 69, 77, 78,
80, 82, 114, 118, 129, 131, 133, 142, 143, 144,
147, 156, 158, 159, 160, 161, 162, 165, 166,
173, 174, 175, 232, 234
models, viii, ix, 3, 12, 58, 68, 85, 108, 110, 114,
122, 129, 130, 131, 132, 133, 135, 142, 143,
144, 146, 147, 148, 155, 157, 158, 159, 160,
161, 165, 166, 167, 169, 170, 171, 172, 173,
175, 210, 211, 221, 222, 223, 224, 225, 228,
230, 233
money, 58
mood, 114, 115, 116, 118, 119
morphology, 175
motion, viii, ix, 57, 63, 69, 80, 85, 86, 87, 88, 89,
90, 91, 92, 94, 95, 96, 97, 98, 100, 101, 102,
105, 106, 107, 110, 111, 114, 121, 122, 129,
130, 131, 132, 133, 134, 135, 136, 137, 139,
141, 142, 144, 156, 217
motivation, 157
movement, 10, 13, 18, 20, 22, 24, 25, 27, 28, 29,
33, 62, 67, 68, 69, 71, 86, 121, 130, 131, 133,
150, 201
MRI, 81
multimedia, vii, 85
multiple interpretations, 228
muscles, viii, 114, 122, 129, 130, 133, 141
N
navigation system, 82
neural network, 132, 230, 231
neural networks, 231
nodes, 94, 164
noise, 90, 121
nucleus, vii, 1
O
observations, viii, 86, 97
obstruction, 152
occlusion, ix, 145, 146, 148, 149, 150, 151, 152,
153, 155
oil, 3
olive oil, 55
operator, 138
optimism, 119
optimization, 132, 136, 137, 147, 165, 175, 226,
233
orientation, 70, 73, 150
originality, 168
P
Pacific, 109
panoramic maps, 145, 146, 147, 153
parallel processing, 110
parallelism, 171
parameter, ix, 73, 74, 117, 122, 123, 125, 126,
130, 133, 134, 135, 136, 138, 139, 141, 143,
145, 146, 148, 155, 165, 167, 169, 186, 190,
193
parameter estimation, 141
parameter vectors, 138, 139, 141
parameters, viii, ix, 20, 34, 58, 70, 73, 74, 80, 86,
89, 118, 122, 123, 129, 130, 131, 132, 133,
135, 136, 137, 138, 139, 140, 141, 142, 162,
165, 166, 225
Pareto, 227
Pareto optimal, 227
partial differential equations, 192
particles, 12, 22
partition, 91, 92, 171, 183, 185, 196, 200
passive, 130, 133
path planning, 147
pattern recognition, 230
PCA, 89, 90, 104, 106, 135, 138
pelvis, 92
personality, 114, 115, 116, 119, 122
personality traits, 119
pessimism, 119
phonemes, 121, 122, 123
photographs, 4, 12, 15, 36, 39, 144
physical environment, 118, 119
physics, 27
planning, 156
plaque, 169
poor, 105
poor performance, 105
Powell-Sabin splines, ix, 177, 178, 182, 184, 186,
187, 191, 194, 197, 198, 199, 200, 201, 205,
206, 207, 208
power, 207
prediction, 86, 87, 89, 90, 91, 96, 97, 105, 107,
111, 193
pressure, 78, 79, 80
production, vii, 2, 3, 54, 55, 58, 165, 166
productivity, 230
program, 16, 18, 27, 54, 69, 231
programming, 64, 182
propagation, 162, 164, 193
protocol, 213, 233
prototype, ix, 63, 80, 145, 169
psychology, 232
pupil, 122
Q
quality standards, vii, 57
quantization, 88, 109
quantum electrodynamics, 218
R
radius, 165, 204
rain, 22, 210
range, ix, 24, 27, 29, 41, 54, 64, 95, 121, 141,
157, 158, 160, 177, 190, 221, 225
ray-tracing, 147
real time, 59, 62, 66, 70, 151
realism, viii, 15, 34, 39, 42, 57, 62, 63, 132
reality, vii, 1, 2, 54, 55, 59, 60, 61, 63, 65, 67, 69,
70, 71, 77, 78, 79, 80, 81, 82, 83, 159, 223
reason, vii, 1, 58, 114, 161, 162, 170
reasoning, 174, 228, 231
recall, 178
recognition, 70, 71, 123, 129, 143, 162
reconstruction, 2, 15, 16, 56, 88, 89, 90, 97, 99,
100, 101, 158, 168, 170, 171, 172, 175
recovery, vii, 1, 2, 12, 56
recreation, 13, 54
rectangular domains, 178
redundancy, 89
reference frame, 65, 69, 73, 77, 78, 79
reflection, 220, 233
region, viii, 92, 123, 125, 129, 130, 134, 136,
139, 141, 142, 143, 146, 148, 149, 150, 151,
152, 153, 155, 171, 172, 193
regulation, 54
regulations, 159
relationship, ix, 2, 101, 129, 130, 138, 211, 217,
220
relaxation, 162
reliability, 61, 221
relief, 9, 17
repair, 61
reproduction, 143, 170
Requirements, 214
residual error, 89, 97, 98, 142
residuals, 88
resolution, 33, 41, 60, 62, 63, 64, 193, 205
resources, 59, 86
returns, 162, 166
rings, 25, 27, 169
robotics, 61
routines, 65
routing, 161, 165, 174
S
sadness, 115, 117, 118
sampling, 108, 158
scaling, 201, 202, 222
schema, 221, 232
scientific computing, 160
search, 2, 214, 225, 226, 233
searching, 163
selecting, ix, 20, 66, 93, 145, 146
semantics, 147, 159, 216, 219, 224
sensation, 37
sensitivity, 227
sensors, 70, 80
sensory experience, 212
separation, 24, 122
shape, vii, ix, 57, 77, 78, 95, 114, 132, 133, 134,
142, 152, 158, 160, 161, 162, 165, 166, 168,
174, 177, 179, 184, 190, 200, 205, 230, 231
sharing, 220
simulation, viii, 23, 27, 57, 58, 63, 67, 68, 69, 70,
71, 72, 73, 77, 78, 79, 82, 83, 141, 160, 206,
223, 225
Singapore, 85, 129
skeleton, 161, 162
skills, 167
skin, viii, 114, 129, 130, 132, 133
smoothness, 90, 98, 99, 101, 158, 178, 206
snakes, 132
social development, vii, 1
software, vii, 20, 34, 54, 57, 63, 64, 68, 213, 217,
221
space, viii, ix, 23, 62, 64, 65, 66, 70, 73, 74, 80,
86, 88, 92, 95, 108, 113, 114, 116, 117, 118,
120, 121, 122, 123, 126, 136, 143, 145, 146,
148, 163, 164, 178, 179, 180, 181, 185, 192,
193, 194, 195, 197, 200, 202, 203, 213, 214,
217, 219, 220, 224, 226, 227, 232
Spain, 1, 3, 4, 55, 56
spatial information, 155
specialisation, 220
spectrum, 159
speech, viii, 113, 121, 123, 125, 126, 132, 144
speed, 12, 13, 18, 141, 142, 162, 169, 217, 218,
220
sports, 59
stability, 184, 188, 196, 202
stakeholders, 220, 223
standards, 223, 233
static geometry, 78
steel, 77
stimulus, 116, 117, 118, 121
stock, 6
storage, 12, 85, 193, 219
strategies, 68, 77, 78, 126, 225, 233
stress, 77, 78
structural modifications, 61
subgroups, 22, 23
surface area, 4
symbols, 98
symmetry, 171
synchronization, viii, 113, 125
synthesis, viii, 3, 39, 129, 131, 142, 144, 220,
233
T
taxonomy, 118, 166, 175
teeth, 122
teleconferencing, 143
teleology, 210
television, 59
temperature, 62, 78, 79, 80
thermal analysis, 225
thoughts, 213, 232
three-dimensional model, 147, 173
three-dimensional space, 155
threshold, 89, 97, 98, 101, 102, 104, 152, 172
timing, 54
tissue, 133
tones, 41
topology, 86, 91, 94, 97, 101, 162, 165, 211
torus, 186, 187, 197
tracking, viii, 59, 61, 62, 63, 65, 129, 130, 131,
132, 134, 135, 136, 138, 139, 140, 141, 142,
143, 144
trade-off, 90, 98, 193
tradition, 55
training, viii, 54, 60, 129, 130, 134, 135, 138,
139, 141, 142
transformation, 27, 33, 65, 73, 74, 75, 78, 79, 97,
133, 137, 148, 217, 218, 219, 225
transformation matrix, 73, 74, 97, 137, 148
transformations, x, 20, 69, 70, 75, 85, 90, 97, 98,
107, 166, 209, 210, 214, 220, 224, 225, 231
transition, 20, 54, 121, 126
transitions, 54
translation, 75, 78, 79, 133, 141
transmission, 25, 64, 85, 88, 99, 102
transmits, 25
transparency, 146, 150
transportation, 217
trees, 145, 164
triangulation, ix, 96, 147, 177, 178, 180, 181,
182, 183, 184, 185, 186, 187, 188, 189, 190,
191, 192, 193, 194, 195, 196, 197, 198, 199,
200, 201, 205, 207
trust, 115, 117
U
UNESCO, 2
uniform, 151, 186, 208
United Kingdom, 2, 82, 173
updating, 12
urban areas, ix, 145, 146, 147, 151, 155
V
validation, 126
variables, 164, 165, 211, 212, 214, 217, 225, 226
vector, 65, 73, 74, 75, 89, 104, 111, 117, 118,
120, 124, 125, 133, 134, 135, 136, 138, 139,
179, 184
velocity, 70, 79, 121
Venezuela, 56
vibration, 78
virtual actors, viii, 113
virtual reality, vii, 1, 54, 55, 59, 83, 159
vision, 17, 81, 146, 147, 156
visual system, 130
visualization, ix, 2, 35, 63, 67, 78, 80, 82, 121,
123, 148, 155, 157, 177, 190, 205
voicing, 121
W
Wales, 173
wavelet, 89, 90
wear, 64, 80
websites, 2
wheat, 12, 22
wind, 4, 5, 6, 7
windmill, vii, 1, 3, 4, 5, 6, 7, 10, 11, 13, 14, 15,
16, 17, 18, 20, 28, 34, 37, 39, 41, 54
windows, 6, 17, 20, 22, 133
wood, 34, 64, 169
word recognition, 228
workers, 59
workload, 91
Z
zinc, 4