Target Tracking

Published on January 2017

Detection and Tracking of Moving Objects from a Moving Platform

Gérard Medioni
Institute of Robotics and Intelligent Systems
Computer Science Department
Viterbi School of Engineering
University of Southern California

Problem Definition
• Scenario: rigidly moving objects + moving camera

• Goal
• Motion segmentation: motion regions / background area
• Tracking of multiple objects: consistent track(s) over time
• Geo-registration and Geo-tracking: Geo-referenced mosaic and tracks

Scenario example 1 – moving cameras

[Diagram. Input: moving cameras. Stages: image stabilization, motion segmentation, tracking. Output: mosaic + tracks]

Scenario example 2 - moving cameras with a map

[Diagram. Inputs: moving camera, map. Stages: image stabilization, geo-registration, motion segmentation, tracking, global data association. Output: geo-referenced mosaic + tracks]

Challenges & Applications
• Information sources
• Pixel colors + 2D coordinates
• Object model information (optional)

• Difficulties
• Camera motion
• 3D Static structures (parallax)
• Multiple moving objects

• Applications
• Video surveillance
• Video compression and indexing
• …

Outline
➢ Introduction
➢ 2D Motion segmentation
• Tracking of multiple moving objects
• Geo-registration and geo-tracking
• Summary and Discussion

Motion Segmentation – Overview
• Task: segment motion regions from the background

• Assumptions
• General camera motion
• Distant scene
• Textured background

Feature Extraction & Matching
• Salient parts of the scene
• Extraction
• Harris corners
• Multi-scale
• Multi-orientation
• Sub-pixel accuracy

• Matching
• Small inter-frame motion
• Gray-scale windows
• Cross correlation

• Large viewpoint change
• Gradient histogram
• Vector angle

Multiple Image Registration
• Frame motion model
• Assumptions:
• Small inter-frame motion
• Distant planar scene

• 2D affine transform

• Robust estimation
• Random Sample Consensus (RANSAC)
• Keep the model with the largest number of inliers

• Non-linear refinement over the inliers

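$$A\,p_1 = p_2$$

$$\begin{bmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} u_1 \\ v_1 \\ 1 \end{bmatrix} = \begin{bmatrix} u_2 \\ v_2 \\ 1 \end{bmatrix}$$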
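A minimal sketch of the robust estimation step, assuming matched feature points are already available as two lists of (u, v) pairs. The function names (`fit_affine`, `ransac_affine`), the iteration count and the inlier tolerance are illustrative choices, not taken from the slides. The non-linear refinement over the inliers is not shown.

```python
import random

def solve3(m, b):
    """Solve the 3x3 system m x = b by Cramer's rule; return None if singular."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det(m)
    if abs(d) < 1e-9:
        return None
    out = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = b[r]
        out.append(det(mc) / d)
    return out

def fit_affine(src, dst):
    """Exact 2D affine from 3 correspondences: rows (A11, A12, A13), (A21, A22, A23)."""
    m = [[u, v, 1.0] for u, v in src]
    row_u = solve3(m, [q[0] for q in dst])
    row_v = solve3(m, [q[1] for q in dst])
    if row_u is None or row_v is None:
        return None  # the three source points are collinear
    return tuple(row_u), tuple(row_v)

def apply_affine(a, p):
    (a11, a12, a13), (a21, a22, a23) = a
    return (a11 * p[0] + a12 * p[1] + a13, a21 * p[0] + a22 * p[1] + a23)

def ransac_affine(src, dst, iters=200, tol=1.0, seed=0):
    """Keep the affine model with the largest number of inliers."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iters):
        idx = rng.sample(range(len(src)), 3)
        model = fit_affine([src[i] for i in idx], [dst[i] for i in idx])
        if model is None:
            continue
        inliers = []
        for i, (p, q) in enumerate(zip(src, dst)):
            x, y = apply_affine(model, p)
            if (x - q[0]) ** 2 + (y - q[1]) ** 2 <= tol ** 2:
                inliers.append(i)
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

Each random sample of three matches determines the six affine parameters exactly, so a sample that contains a mismatch yields a model that few other matches agree with.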

Motion Segmentation
• Two-frame pixel-level segmentation?
• Segmentation within a temporal window
• Accumulate the pixels warped from adjacent frames
• K-Means to find the most representative pixel
• Frame differencing and thresholding: $|I_{\text{original}} - I_{\text{model}}| > \Delta I$

[Figure: temporal window from frame t-w to frame t+w; t: reference frame, w: half size of the window]
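A minimal sketch of segmentation within a temporal window, assuming the frames are grayscale and already warped into the reference frame. Reading "K-Means to find the most representative pixel" as a per-pixel two-cluster 1-D k-means is an assumption, and the names `representative_value` and `motion_mask` are hypothetical.

```python
def representative_value(samples, iters=10):
    """1-D two-cluster k-means; return the mean of the larger cluster."""
    c0, c1 = min(samples), max(samples)
    if c0 == c1:
        return float(c0)
    for _ in range(iters):
        g0 = [s for s in samples if abs(s - c0) <= abs(s - c1)]
        g1 = [s for s in samples if abs(s - c0) > abs(s - c1)]
        c0 = sum(g0) / len(g0)
        c1 = sum(g1) / len(g1)
    return c0 if len(g0) >= len(g1) else c1

def motion_mask(window, ref_index, delta):
    """window: list of equally sized frames (lists of rows) from t-w to t+w.

    A pixel of the reference frame is marked as moving when it differs
    from the background model by more than delta.
    """
    ref = window[ref_index]
    height, width = len(ref), len(ref[0])
    mask = [[False] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            model = representative_value([f[r][c] for f in window])
            mask[r][c] = abs(ref[r][c] - model) > delta
    return mask

# Toy window with w = 2: a bright object covers the middle pixel at frame t only.
window = [[[10, 10, 10]], [[10, 10, 10]], [[10, 200, 10]],
          [[10, 10, 10]], [[10, 10, 10]]]
mask = motion_mask(window, ref_index=2, delta=30)
```

Taking the larger cluster makes the model follow the value a pixel shows most often in the window, which is the background as long as objects keep moving.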

Experimental Results (1)

[Figure rows: original images; motion probability maps; initial detection results; tracking results]


Experimental Results (2)

[Figure rows: original images; motion probability maps; initial detection results; tracking results]

Experimental Results (3)

A synthesized video without motion regions

Outline
➢ Introduction
➢ 2D Motion segmentation
➢ Tracking of multiple moving objects
• Geo-registration and geo-tracking
• Summary and Discussion

Problem statement: multiple target tracking
• Input: foreground regions in each frame
• Output: trajectories with consistent track IDs
• Challenges:
• Noisy foreground regions
• Occlusions

Problematic underlying assumption
• One-to-one assumption
• One target can correspond to at most one observation
• One observation can be associated to at most one target
• Appropriate for point-like observations

• The underlying one-to-one assumption may not hold for visual tracking

[Figure panels: radar; UAV camera; stationary camera]

Related work

• (Pasula et al., 99) Gibbs sampling to compute joint DA
• MAP, multi-scan, uniform prior (no missing or false detection)

• (Cong et al., 04) Approximate association probabilities in JPDAF
• MMSE, MCMC outperforms JPDAF, one-scan/multi-scan

• (Sastry et al., 04) MCMC to compute joint DA with unknown number of targets
• MAP, multi-scan, outperforms MHT, considers temporal association only

• (F. Dellaert et al., 03) MCMC to SfM without correspondence
• MMSE, single scan, similar to JPDAF

• All of the above rely on the one-to-one assumption

• Our method: overcomes the one-to-one assumption
• MAP, multi-scan, considers both spatial and temporal association

Anatomy of the problem
• “Explain” foreground regions:

• It is hard at one frame without using any model information
• It is solvable if smoothness in motion and appearance is used

Explanation of foreground regions
• Two ways to explain foreground regions

• Precisely: labeling of foreground regions
• The label(s) of a pixel indicate the track ID
• Each pixel can have multiple labels to represent occlusions
• Accurate but expensive!

• Approximately: cover of foreground regions
• A set of shapes (rectangles)
• Each rectangle can overlap with others to represent occlusions
• Approximate but efficient!

Our formulation
• Given
• A set of noisy observations (foreground regions)

• Find
• A cover ω of foreground regions over time

• $\tau_k$ is a sequence of shapes (rectangles)

Solution space
• Solution space Ω is a collection of spatio-temporal covers of
observation Y.
• “Joint association event”

$$\omega = \{\tau_1, \tau_2, \ldots, \tau_K\}$$

• Two kinds of data association
• Spatial data association - change the cover at one instant
• Temporal data association - form consistent tracks

• Uncovered area belongs to false alarms

(a) Observations Y

(b) One possible cover of Y

Bayesian formulation
• MAP estimate

$$\omega^* = \arg\max_{\omega}\, p(\omega \mid Y)$$
$$p(\omega \mid Y) \propto p(Y \mid \omega)\, p(\omega)$$

• Prior model $p(\omega)$
• A small number of long tracks
• A track should overlap other tracks as little as possible, unless necessary

$$p(\omega) = p(L)\, p(K)\, p(O)$$

• Likelihood $p(Y \mid \omega)$
• Smoothness in both motion and appearance
• Areas of uncovered false alarms $p(F)$

$$p(Y \mid \omega) = p(F) \prod_{k=1}^{K} \prod_{i=1}^{|\tau_k|-1} L(\tau_k(t_{i+1}) \mid \tau_k(t_i))$$

• $L$ comprises a motion likelihood and an appearance likelihood

Motion and appearance likelihood

• Motion

$$x^k_{t+1} = A^k x^k_t + w, \qquad w \sim N(0, Q)$$
$$y^k_t = H^k x^k_t + v, \qquad v \sim N(0, R)$$

$$L_M(\tau_k(t_{i+1}) \mid \tau_k(t_i)) \equiv p(\tau_k(t_{i+1}) \mid \tau_k(t_i))$$

• Appearance

$$L_A(\tau_k(t_{i+1}) \mid \tau_k(t_i)) = \frac{1}{z_3} \exp\left(-\lambda_3\, D(\tau_k(t_i), \tau_k(t_{i+1}))\right)$$

• $D(\tau_k(t_i), \tau_k(t_{i+1}))$: Kullback-Leibler (KL) distance between two RGB color histograms
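A minimal sketch of the appearance likelihood, assuming each rectangle is given as a list of (r, g, b) pixels with values in 0..255. The bin count, the add-one smoothing and the use of a symmetrized KL divergence as the "KL distance" are assumptions made for illustration. The parameters `lam` and `z` stand in for λ3 and z3.

```python
import math

def rgb_histogram(pixels, bins=8):
    """Normalized joint RGB histogram with add-one smoothing (avoids log(0))."""
    hist = [1.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        hist[(r // step) * bins * bins + (g // step) * bins + (b // step)] += 1.0
    total = sum(hist)
    return [h / total for h in hist]

def kl_distance(p, q):
    """Symmetrized KL divergence between two normalized histograms."""
    return sum(pi * math.log(pi / qi) + qi * math.log(qi / pi)
               for pi, qi in zip(p, q))

def appearance_likelihood(pixels_a, pixels_b, lam=1.0, z=1.0):
    """L_A = (1/z) * exp(-lam * D), with D the histogram distance."""
    d = kl_distance(rgb_histogram(pixels_a), rgb_histogram(pixels_b))
    return math.exp(-lam * d) / z

# Toy regions: a red patch, a mostly red patch, and a blue patch.
red = [(250, 10, 10)] * 50
reddish = [(250, 10, 10)] * 45 + [(10, 10, 250)] * 5
blue = [(10, 10, 250)] * 50
```

With these toy regions, `appearance_likelihood(red, reddish)` is larger than `appearance_likelihood(red, blue)`, so linking similar-looking rectangles along a track is rewarded.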

MAP of full posterior p(ω |Y)
• Finding the MAP estimate of such a posterior is not trivial
• Even determining the parameters of such a posterior is not easy

$$p(\omega \mid Y) \propto \exp\{C_0 S_{len} - C_1 K - C_2 F - C_3 S_{olp} - C_4 S_{app} - S_{mot}\}$$

• MAP estimation is equivalent to minimizing an energy function.

• Solution to MAP:
• Sampling based method to avoid enumerating all possible solutions
• Two types of proposal moves (temporal and spatial moves)
• Symmetric temporal information

Markov Chain Monte Carlo
• Basic idea: construct a Markov chain which will converge to
the target distribution
• State of the Markov chain is defined in Ω
• Transition of the Markov chain is guided by a proposal distribution

• Metropolis-Hastings algorithm
• Propose a new state $\omega'$ from the previous state $\omega^{(i)}$

$$\omega' \sim q(\omega' \mid \omega^{(i)})$$

• Accept $\omega'$ with probability

$$\min\left(1, \frac{p(\omega')\, q(\omega^{(i)} \mid \omega')}{p(\omega^{(i)})\, q(\omega' \mid \omega^{(i)})}\right)$$

• Properties
• Only the local ratio $p(\omega')/p(\omega)$ has to be computed, not the global $p(\omega)$
• For MAP, only the current state and the best state so far have to be kept, not the whole chain

Metropolis-Hastings algorithm
1. Initialize $\omega^{(0)}$.
2. For $i = 0$ to $N-1$ ($N$ is the length of the Markov chain)
- Sample $u \sim U[0,1]$
- Propose $\omega' \sim q(\omega' \mid \omega^{(i)})$ ($q$ is called the proposal distribution)
- Compute $A(\omega^{(i)}, \omega') = \min\left(1, \frac{p(\omega')\, q(\omega^{(i)} \mid \omega')}{p(\omega^{(i)})\, q(\omega' \mid \omega^{(i)})}\right)$
- If $u < A(\omega^{(i)}, \omega')$ then $\omega^{(i+1)} = \omega'$, else $\omega^{(i+1)} = \omega^{(i)}$
Endfor

The chain $\{\omega^{(0)}, \ldots, \omega^{(N)}\}$ converges to $p(\omega)$ as $N \to \infty$.
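A minimal sketch of the Metropolis-Hastings loop used for MAP search, on a toy one-dimensional state space rather than on covers of foreground regions. The function name `metropolis_hastings_map`, the toy target `log_p` and the random-walk proposal are all assumptions for illustration. Working in log space means only the ratio of posteriors is needed, and only the current and best states are stored.

```python
import math
import random

def metropolis_hastings_map(log_p, propose, init, n_iters, seed=0):
    """Return the highest-scoring state visited by the chain.

    log_p(state) is an unnormalized log posterior.
    propose(state, rng) returns (new_state, log_q_forward, log_q_backward),
    i.e. log q(new | old) and log q(old | new).
    """
    rng = random.Random(seed)
    current, current_lp = init, log_p(init)
    best, best_lp = current, current_lp
    for _ in range(n_iters):
        cand, log_q_fwd, log_q_bwd = propose(current, rng)
        cand_lp = log_p(cand)
        # log of min(1, p(w') q(w | w') / (p(w) q(w' | w)))
        log_accept = min(0.0, cand_lp + log_q_bwd - current_lp - log_q_fwd)
        if rng.random() < math.exp(log_accept):
            current, current_lp = cand, cand_lp
            if current_lp > best_lp:
                best, best_lp = current, current_lp
    return best, best_lp

# Toy target on the integers 0..20 with its mode at 13.
def log_p(x):
    return -((x - 13) ** 2) / 8.0

# Symmetric random walk with wrap-around, so q(w' | w) = q(w | w') = 0.5.
def propose(x, rng):
    return (x + rng.choice([-1, 1])) % 21, math.log(0.5), math.log(0.5)

best, best_lp = metropolis_hastings_map(log_p, propose, init=0, n_iters=2000)
```

A tracker would replace `log_p` with the log posterior of a cover and `propose` with its own proposal moves; the loop itself stays the same.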

Two types of q(ω' | ω)
• Temporal moves and spatial moves to drive the Markov chain

• Data-driven proposal

$$q(\omega' \mid \omega) \to q(\omega' \mid \omega, D)$$

• Spatial moves are made only after enough temporal information is collected

• Symmetric temporal information
• Forward and backward (e.g. extension)
• Deal with occlusions at the very beginning

Temporal Moves
• Birth/Death
• Extension/Reduction
• Split/Merge
• Switch

Spatial Moves
• Segmentation/Aggregation
• Diffusion

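MCMC Data Association
1. Initialize $\omega^{(0)}$.
2. For $i = 0$ to $N-1$
- Sample $u \sim U[0,1]$
- If $i < \varepsilon \cdot N$, sample $\omega' \sim q_{\text{Temporal}}(\omega' \mid \omega^{(i)})$, else sample $\omega' \sim q_{\text{All}}(\omega' \mid \omega^{(i)})$
- Compute $A(\omega^{(i)}, \omega') = \min\left(1, \frac{p(\omega')\, q(\omega^{(i)} \mid \omega')}{p(\omega^{(i)})\, q(\omega' \mid \omega^{(i)})}\right)$
- If $u < A(\omega^{(i)}, \omega')$ then $\omega^{(i+1)} = \omega'$, else $\omega^{(i+1)} = \omega^{(i)}$
Endfor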

Determining Parameters
• Determine the parameters in the full posterior
• A casual setting can make the ground truth $p(\omega_{gt} \mid Y)$ much lower than the “solution”
• Take advantage of the property of MCMC

$$p(\omega \mid Y) \propto \exp\{C_0 S_{len} - C_1 K - C_2 F - C_3 S_{olp} - C_4 S_{app} - S_{mot}\}$$

• Degenerate $\omega_{gt}$ to $\omega'$ and require

$$\frac{p(\omega_{gt})}{p(\omega')} \geq 1$$

• This yields a linear program in the parameters:

$$A\,[C_0, C_1, C_2, C_3, C_4]^T \leq b, \qquad C_0, C_1, C_2, C_3, C_4 \geq 0, \qquad \max\,(C_0 + C_1 + C_2 + C_3 + C_4)$$

• Solved by linear programming (GNU Linear Programming Kit)
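One plausible reading of how the ratio condition becomes a linear constraint is sketched here. The notation $\Delta S = S(\omega_{gt}) - S(\omega')$ for each term is introduced only for this derivation and is not from the slides. Taking the logarithm of the ratio of the two posteriors gives an expression that is linear in the parameters:

```latex
\log \frac{p(\omega_{gt} \mid Y)}{p(\omega' \mid Y)}
  = C_0\,\Delta S_{len} - C_1\,\Delta K - C_2\,\Delta F
    - C_3\,\Delta S_{olp} - C_4\,\Delta S_{app} - \Delta S_{mot} \;\geq\; 0

\Longleftrightarrow\quad
-\,C_0\,\Delta S_{len} + C_1\,\Delta K + C_2\,\Delta F
    + C_3\,\Delta S_{olp} + C_4\,\Delta S_{app} \;\leq\; -\,\Delta S_{mot}
```

Under this reading, each degenerated solution $\omega'$ contributes one row of $A$ and one entry of $b$. The normalizing constant cancels in the ratio, which is why the constraint can be written without knowing the global posterior.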

Simulation experiments
• Settings
• K (unknown number) moving discs in 200x200
• Independent color appearance and motion
• Static occlusion and inter-occlusion
• False alarms

Original video

Tracking result

Simulation experiments
• Quantitative comparison
• MHT (I. Cox, 94), JPDAF (J. Kang, 03), temporal only
• STDA score in VACE-II eval
• Same motion and appearance likelihood
• Average over multiple sequences and multiple runs

FA=0, W=50, 10K MCMC iterations

K=5, W=50, 10K MCMC iterations

Simulation experiments
• Online implementation
• Sliding window W
• Initialize $\omega_t$ with $\omega^*_{t-1}$

Online vs. offline comparison T=1000

Real Scenarios

Experiments

CLEAR 320x240

Vivid-II 320x240

Experiments
• Can handle occlusion at the beginning by using symmetric
temporal information

Outline
➢ Introduction
➢ 2D Motion segmentation
➢ Tracking of multiple moving objects
➢ Geo-registration and geo-tracking
• Summary and Discussion

Geo-registration
• Use 2D homography to compensate inter-frame (2-view) motion
• Refine the homography between map and images

$$H_{i+1,M} = (H_{i,i+1})^{-1}\, H_{i,M}\, H_{\text{update}}$$

[Figure labels: $H_{i,i+1}$, $H_{i,M}$, $H_{i+1,M}$, $H_{\text{update}}$]
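A minimal sketch of propagating the image-to-map homography from frame i to frame i+1. The slides do not state the mapping convention; this sketch assumes column vectors and that $H_{a,b}$ maps coordinates in b to coordinates in a, which makes the product $(H_{i,i+1})^{-1} H_{i,M} H_{\text{update}}$ compose correctly. All function names are hypothetical.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("singular homography")
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]

def propagate(h_i_next, h_i_map, h_update):
    """H_{i+1,M} = inv(H_{i,i+1}) * H_{i,M} * H_update."""
    return matmul3(matmul3(inv3(h_i_next), h_i_map), h_update)

def apply_h(h, p):
    """Apply a homography to a 2D point (homogeneous normalization)."""
    x, y = p
    u = [h[r][0] * x + h[r][1] * y + h[r][2] for r in range(3)]
    return (u[0] / u[2], u[1] / u[2])

def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

# Toy case with pure translations and no refinement (H_update = identity).
h_i_next = translation(2.0, 0.0)    # frame i+1 -> frame i
h_i_map = translation(10.0, 5.0)    # map -> frame i
h_next_map = propagate(h_i_next, h_i_map, translation(0.0, 0.0))
```

In the toy case the map origin lands at (8, 5) in frame i+1: it sits at (10, 5) in frame i, and frame i+1 is shifted by (2, 0) relative to frame i.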


Geo-registration results

Geo-mosaicing 2000 frames on top of the reference frame.

Experimental results
• Results are shown on two UAV data sets
• Map is acquired from Google Earth®
• Geo-registration is performed every 50 frames
• Local data association (MCMCDA) window 50 frames

Geo-registration

Without geo-refinement

With geo-refinement

Experimental results

Experimental results

System implementation
• C++ implementation
• Xeon Dual Core P4 3.0GHz
• Preliminary time performance
Procedure                                     Time (seconds) on 320x240
Image registration                            ~0.25
Motion detection (moving cameras)             ~2 (CPU) / ~0.1 (GPU)
Object detection after motion segmentation    ~0.25
Geo-registration                              ~6 every 50 frames
Tracking                                      ~0.4
Total                                         ~1 (GPU)


Outline
➢ Introduction
➢ 2D Motion segmentation
➢ Tracking of multiple moving objects
➢ Geo-registration and geo-tracking
➢ Summary and Discussion

Summary & Discussion
• Detection and tracking in dynamic scenes
• Moving camera + rigid moving objects
• 2D motion segmentation and geometric analysis of background
• Spatial and temporal (2D+t) data association of moving objects
• Tracking with geo-registration

• Highlights
• Solutions to practical problems in the detection and tracking area

• Encouraging results and extensive applications

• Future directions
• Multi-view geometry + object recognition
• Automatic determination of applicable tasks

References

• Qian Yu and Gérard Medioni, "A GPU-based implementation of Motion Detection from a Moving Platform," to appear in IEEE Workshop on Computer Vision on GPU, in conjunction with CVPR'08

• Qian Yu and Gérard Medioni, "Integrated Detection and Tracking for Multiple Moving Objects using Data-Driven MCMC Data Association," IEEE Workshop on Motion and Video Computing (WMVC'08), 2008

• Qian Yu, Gérard Medioni and Isaac Cohen, "Multiple Target Tracking Using Spatio-Temporal Monte Carlo Markov Chain Data Association," IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07), 2007, pp. 1-8

• Qian Yu and Gérard Medioni, "Map-Enhanced Detection and Tracking from a Moving Platform with Local and Global Data Association," IEEE Workshop on Motion and Video Computing (WMVC'07), 2007

• Yuping Lin, Qian Yu and Gérard Medioni, "Map-Enhanced UAV Image Sequence Registration," Workshop on Applications of Computer Vision (WACV'07), 2007

• Qian Yu, Isaac Cohen, Gérard Medioni and Bo Wu, "Boosted Markov Chain Monte Carlo Data Association for Multiple Target Detection and Tracking," Proceedings of the 18th International Conference on Pattern Recognition (ICPR'06), Vol. 2, pp. 675-678

Q&A

Thank you!
