Optical Flow
Original course link: https://www.cc.gatech.edu/~hays/compvision/
Wikipedia definition: https://en.wikipedia.org/wiki/Optical_flow#Methods_for_determination
Baidu Baike definition: https://baike.baidu.com/item/%E5%85%89%E6%B5%81/7013666
Video and Motion
A video is a sequence of frames captured over time
Now our image data is a function of space (x, y) and time (t)
![](https://img.haomeiwen.com/i16468337/d4d77173ef31e64a.png)
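As a small illustration of "image data as a function of space and time", a grayscale video can be held in a 3-D array. This is only a sketch: the array layout `[t, y, x]`, the toy sizes, and the helper name `intensity` are assumptions made for the example, not something the course prescribes.

```python
import numpy as np

# Hypothetical toy video: 3 frames of 4x5 grayscale pixels, stored as video[t, y, x].
video = np.zeros((3, 4, 5), dtype=np.float32)
video[1, 2, 3] = 1.0  # one bright pixel at x=3, y=2 in frame t=1


def intensity(video, x, y, t):
    """Return I(x, y, t). The array is indexed [t, row=y, col=x]."""
    return video[t, y, x]


print(intensity(video, x=3, y=2, t=1))
```

Note the index order: the math writes I(x, y, t), while array storage usually puts time first and rows (y) before columns (x).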
Sometimes, motion is the only cue
Even “impoverished” motion data can evoke a strong percept
![](https://img.haomeiwen.com/i16468337/e70f7713af185e43.png)
Motion Estimation: Optical Flow
![](https://img.haomeiwen.com/i16468337/faa93d2dd851b172.png)
![](https://img.haomeiwen.com/i16468337/88f80d7f03056278.png)
Problem Definition
How do we estimate the motion of pixels from image I(x, y, t) to image I(x, y, t+1)?
![](https://img.haomeiwen.com/i16468337/b2aa947ec59ed637.png)
Key Assumptions:
• color constancy:
– a point in I(x,y,t) looks the same in I(x,y,t+1)
– For grayscale images, this is brightness constancy
• small motion:
– Points do not move very far
From these assumptions we obtain the optical flow constraints:
![](https://img.haomeiwen.com/i16468337/03f02c193199282a.png)
Brightness Constancy Constraint (equation):
![](https://img.haomeiwen.com/i16468337/f0d4ede07d03aac5.png)
Small Motion (u and v are less than 1 pixel, or smooth):
Taylor series expansion of I:
![](https://img.haomeiwen.com/i16468337/bb5a7fbd3ac5ddbf.png)
Combining the two equations, we have:
![](https://img.haomeiwen.com/i16468337/d8c2b212a53cf215.png)
![](https://img.haomeiwen.com/i16468337/44ba1aeb384bd8cb.png)
In the limit as u and v go to zero, this becomes exact:
![](https://img.haomeiwen.com/i16468337/65e72d827d42c435.png)
Brightness constancy constraint equation
![](https://img.haomeiwen.com/i16468337/73b4c520f0055f2b.png)
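For readers who cannot load the figures, the derivation can be written out in text form. This is the standard textbook form and is assumed, not verified, to match the images above; I_x and I_y denote the spatial derivatives of the image and I_t its temporal derivative.

```latex
\begin{align}
I(x+u,\; y+v,\; t+1) &= I(x, y, t)
  && \text{(brightness constancy)} \\
I(x+u,\; y+v,\; t+1) &\approx I(x, y, t) + I_x u + I_y v + I_t
  && \text{(first-order Taylor expansion)} \\
I_x u + I_y v + I_t &\approx 0
  && \text{(combining the two)} \\
\nabla I \cdot \begin{bmatrix} u & v \end{bmatrix}^{T} + I_t &= 0
  && \text{(exact as } u, v \to 0\text{)}
\end{align}
```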
How many equations and unknowns per pixel?
There is only one equation per pixel, but two unknowns (u, v).
How to get more equations for a pixel?
Use the spatial coherence constraint.
Spatial Coherence Constraint (equation):
Assume the pixel's neighbors have the same (u, v).
If we use a 5x5 window, we get 25 equations per pixel:
![](https://img.haomeiwen.com/i16468337/92f1ad3a5e1346ed.png)
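In text form, the 25 equations can be stacked into one linear system. This is a sketch of the usual notation; the labels p_1, ..., p_25 for the window pixels are an assumption of this example, and the signs follow the brightness constancy constraint I_x u + I_y v + I_t = 0.

```latex
\underbrace{\begin{bmatrix}
I_x(p_1) & I_y(p_1) \\
I_x(p_2) & I_y(p_2) \\
\vdots & \vdots \\
I_x(p_{25}) & I_y(p_{25})
\end{bmatrix}}_{A \;(25 \times 2)}
\underbrace{\begin{bmatrix} u \\ v \end{bmatrix}}_{d \;(2 \times 1)}
=
-\underbrace{\begin{bmatrix}
I_t(p_1) \\ I_t(p_2) \\ \vdots \\ I_t(p_{25})
\end{bmatrix}}_{-b \;(25 \times 1)}
\qquad\Longleftrightarrow\qquad A\,d = b
```

With 25 equations and 2 unknowns the system is overdetermined, so it is solved in the least-squares sense.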
The least-squares solution for d is given by:
![](https://img.haomeiwen.com/i16468337/171d033429ce136e.png)
![](https://img.haomeiwen.com/i16468337/230d70d3263a712e.png)
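The least-squares step can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the course's reference implementation: the function name `lucas_kanade_at`, the use of `np.gradient` (central differences) for I_x and I_y, and a plain frame difference for I_t are all choices made for this example.

```python
import numpy as np


def lucas_kanade_at(frame0, frame1, x, y, half=2):
    """Estimate (u, v) at pixel (x, y) from a (2*half+1)^2 window (5x5 by default).

    Assumes the window lies fully inside the image and the motion is small.
    """
    f0 = frame0.astype(np.float64)
    f1 = frame1.astype(np.float64)
    # Spatial derivatives; np.gradient returns (d/d_row, d/d_col) = (Iy, Ix).
    Iy, Ix = np.gradient(f0)
    # Temporal derivative approximated by the frame difference.
    It = f1 - f0
    win = (slice(y - half, y + half + 1), slice(x - half, x + half + 1))
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)  # 25 x 2
    b = -It[win].ravel()                                      # 25
    # Least-squares solution of A d = b; when A^T A is invertible this
    # equals the solution of the normal equations (A^T A) d = A^T b.
    d, *_ = np.linalg.lstsq(A, b, rcond=None)
    return d  # array([u, v])
```

`np.linalg.lstsq` is used instead of explicitly inverting A^T A so that the sketch still returns a (minimum-norm) answer when the window is degenerate; whether that answer is meaningful is exactly the solvability question discussed next.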
When is this solvable?
![](https://img.haomeiwen.com/i16468337/c57ac9a088883c6b.png)
If A^T A is not invertible (or is badly conditioned), the system has no unique, stable solution. This is the aperture problem.
![](https://img.haomeiwen.com/i16468337/dfbaa1d5561cae9d.jpg)
These are the same criteria used by the Harris corner detector.
![](https://img.haomeiwen.com/i16468337/99c236da9551edfe.png)
![](https://img.haomeiwen.com/i16468337/754a1a08dbaddea1.png)
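The solvability check can be sketched by looking at the two eigenvalues of A^T A for a window. The function names and the thresholds `min_eig` and `max_ratio` below are hypothetical values chosen for illustration; in practice they would need tuning for the image scale.

```python
import numpy as np


def structure_tensor_eigs(Ix, Iy):
    """Eigenvalues (ascending) of A^T A built from the window gradients Ix, Iy."""
    ATA = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                    [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    return np.linalg.eigvalsh(ATA)  # symmetric matrix, so real eigenvalues


def classify_window(Ix, Iy, min_eig=1e-2, max_ratio=10.0):
    """Rough flat / edge / corner label; thresholds are illustrative assumptions."""
    lam_small, lam_large = structure_tensor_eigs(Ix, Iy)
    if lam_large < min_eig:
        return "flat"    # both eigenvalues small: no gradient information
    if lam_small < min_eig or lam_large / lam_small > max_ratio:
        return "edge"    # one dominant direction: aperture problem
    return "corner"      # both large and comparable: flow is well determined
```

Only windows labeled "corner" give a well-conditioned system, which is why good features to track and Harris corners coincide.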
Errors in Lucas-Kanade
• A point does not move like its neighbors
– Use motion segmentation
• Brightness constancy does not hold
– Do an exhaustive neighborhood search with normalized correlation (tracking features, perhaps SIFT)
• The motion is large (larger than a pixel)
– Nonlinear: iterative refinement
– Local minima: coarse-to-fine estimation
Revisiting the small motion assumption
Is this motion small enough?
Probably not: it is much larger than one pixel.
How might we solve this problem?
Coarse-to-fine optical flow estimation
![](https://img.haomeiwen.com/i16468337/c3b6514da6075095.png)
![](https://img.haomeiwen.com/i16468337/41c790820a070119.png)
![](https://img.haomeiwen.com/i16468337/1d4e38e659a0d25d.png)
![](https://img.haomeiwen.com/i16468337/b34d235e1eb45956.png)
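The coarse-to-fine idea can be sketched for the simplest possible motion model. The code below is a deliberately reduced illustration, not the full algorithm from the figures: it assumes a single global translation for the whole image, builds the pyramid with 2x2 block averaging instead of a proper Gaussian blur, and warps with integer shifts via `np.roll` (which wraps around, so content near the borders is assumed negligible). All function names are hypothetical.

```python
import numpy as np


def global_lk(f0, f1):
    """One Lucas-Kanade step using the whole image interior as a single window."""
    Iy, Ix = np.gradient(f0)           # np.gradient returns (d/d_row, d/d_col)
    It = f1 - f0
    inner = (slice(2, -2), slice(2, -2))  # skip the border rows and columns
    A = np.stack([Ix[inner].ravel(), Iy[inner].ravel()], axis=1)
    b = -It[inner].ravel()
    d, *_ = np.linalg.lstsq(A, b, rcond=None)
    return d  # array([u, v])


def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (crude blur + subsample)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def coarse_to_fine_translation(f0, f1, levels=3):
    """Estimate one global (u, v), refining from the coarsest level to the finest."""
    pyr0, pyr1 = [f0.astype(np.float64)], [f1.astype(np.float64)]
    for _ in range(levels - 1):
        pyr0.append(downsample(pyr0[-1]))
        pyr1.append(downsample(pyr1[-1]))
    d = np.zeros(2)
    for g0, g1 in zip(reversed(pyr0), reversed(pyr1)):  # coarsest level first
        d = 2.0 * d                      # carry the estimate to the finer level
        shift = np.rint(d).astype(int)   # integer part of the current estimate
        # Warp g1 back toward g0 by undoing the integer shift (rows = v, cols = u).
        warped = np.roll(g1, shift=(-shift[1], -shift[0]), axis=(0, 1))
        # Only a small residual motion remains, so one LK step is reasonable.
        d = shift + global_lk(g0, warped)
    return d
```

At each level the remaining motion after warping is a fraction of a pixel to about one pixel, which is the regime where the small-motion assumption holds. A dense version would keep a per-pixel flow field, upsample it between levels, and warp with interpolation instead of integer shifts.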