Categories
code machine learning opencv programming python vision

Aligning faces with py opencv-dlib combo

This is my first attempt at using a Jupyter notebook to write a post; I hope it makes sense.
I’ve recently taught a class on generative models: http://hi.cs.stonybrook.edu/teaching/cdt450
In class we’ve manipulated face images with neural networks.
One important thing I found that helped is to align the images so the facial features overlap.
It helps the nets learn the variance in faces better, rather than waste their “representation power” on the shift between faces.
The following is some code to align face images using the excellent Dlib (python bindings) http://dlib.net. First I’m just using a standard face detector, and then, using the facial features extractor, I use that information for a complete alignment of the face.
After the alignment – I’m just having fun with the aligned dataset 🙂
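
The notebook code is not reproduced in this excerpt. As a rough sketch of the alignment idea only (not the author's code), the snippet below assumes the two eye centers have already been extracted from the landmarks, and builds a similarity transform that places them at fixed positions in the output image. The function name and the canonical eye layout (`eye_y`, `eye_dist`) are hypothetical choices for illustration.

```python
import numpy as np

def similarity_from_eyes(left_eye, right_eye, out_size=256, eye_y=0.4, eye_dist=0.3):
    """Return a 2x3 matrix (rotation + uniform scale + translation) that maps
    the two eye centers onto fixed positions in an out_size x out_size image.
    left_eye / right_eye are (x, y) points; "left" means left in the image."""
    left_eye = np.asarray(left_eye, dtype=np.float64)
    right_eye = np.asarray(right_eye, dtype=np.float64)
    # Hypothetical canonical layout: eyes on one horizontal line, centered in x
    dst_left = np.array([(0.5 - eye_dist / 2) * out_size, eye_y * out_size])
    dst_right = np.array([(0.5 + eye_dist / 2) * out_size, eye_y * out_size])
    # Vector between the eyes, before and after alignment
    src_vec = right_eye - left_eye
    dst_vec = dst_right - dst_left
    scale = np.linalg.norm(dst_vec) / np.linalg.norm(src_vec)
    angle = np.arctan2(dst_vec[1], dst_vec[0]) - np.arctan2(src_vec[1], src_vec[0])
    c, s = scale * np.cos(angle), scale * np.sin(angle)
    R = np.array([[c, -s], [s, c]])
    # Translation chosen so that the left eye lands exactly on its target
    t = dst_left - R @ left_eye
    return np.hstack([R, t[:, None]])

M = similarity_from_eyes((100, 120), (160, 100))
```

The resulting matrix can be passed to `cv2.warpAffine(img, M, (256, 256))`. With Dlib's 68-point predictor the eye centers would typically be the means of landmarks 36 to 41 and 42 to 47.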

Categories
code graphics opencv python vision work

Revisiting graph-cut segmentation with SLIC and color histograms [w/Python]

As part of the computer vision class I’m teaching at SBU I asked students to implement a segmentation method based on SLIC superpixels. Here is my boilerplate implementation.
This follows work I did a very long time ago (2010) on the same subject.
For graph-cut I’ve used PyMaxflow: https://github.com/pmneila/PyMaxflow, which is very easily installed by just pip install PyMaxflow
The method is simple:

  • Calculate SLIC superpixels (the SKImage implementation)
  • Use markings to determine the foreground and background color histograms (from the superpixels under the markings)
  • Set up a graph with a straightforward energy model: Smoothness term = K-L-Div between a superpixel histogram and each neighbor superpixel histogram; Match term = inf if the superpixel is marked as BG or FG, otherwise the K-L-Div between the superpixel histogram and the FG and BG histograms.
  • To find neighbors I’ve used Delaunay tessellation (from scipy.spatial), for simplicity. But a full neighbor finding could be implemented by looking at all the neighbors on the superpix’s boundary.
  • Color histograms are 2D over H-S (from the HSV)
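
For readers unfamiliar with the K-L divergence used in both energy terms above, here is a minimal NumPy sketch of it. It is an illustration only: the implementation in this post relies on OpenCV's `cv2.compareHist` with `cv2.HISTCMP_KL_DIV`, and the function name and `eps` guard below are assumptions of this sketch.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-10):
    """K-L divergence D(p || q) between two histograms (any shape).
    Both are normalized to sum to 1; eps guards against division by zero."""
    p = np.asarray(p, dtype=np.float64).ravel()
    q = np.asarray(q, dtype=np.float64).ravel()
    p = p / p.sum()
    q = q / q.sum()
    mask = p > 0  # bins with p == 0 contribute nothing to the sum
    return float(np.sum(p[mask] * np.log(p[mask] / np.maximum(q[mask], eps))))

d_pq = kl_divergence([0.5, 0.5], [0.9, 0.1])
d_qp = kl_divergence([0.9, 0.1], [0.5, 0.5])
```

The divergence is not symmetric (`d_pq` differs from `d_qp`), which is why a graph edge between two superpixels can carry a different capacity in each direction.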

import cv2
import numpy as np
import matplotlib.pyplot as plt
from skimage.segmentation import slic
from skimage.segmentation import mark_boundaries
from skimage.data import astronaut
from skimage.util import img_as_float
import maxflow
from scipy.spatial import Delaunay

# Calculate the SLIC superpixels, their histograms and neighbors
def superpixels_histograms_neighbors(img):
    # SLIC
    # start_label=0 (scikit-image >= 0.17) keeps the labels usable as array indices
    segments = slic(img, n_segments=500, compactness=20, start_label=0)
    segments_ids = np.unique(segments)
    # centers
    centers = np.array([np.mean(np.nonzero(segments==i),axis=1) for i in segments_ids])
    # H-S histograms for all superpixels
    # skimage images are RGB, so convert from RGB (not BGR)
    hsv = cv2.cvtColor(img.astype('float32'), cv2.COLOR_RGB2HSV)
    bins = [20, 20] # H = S = 20
    ranges = [0, 360, 0, 1] # H: [0, 360], S: [0, 1]
    colors_hists = np.float32([cv2.calcHist([hsv],[0, 1], np.uint8(segments==i), bins, ranges).flatten() for i in segments_ids])
    # neighbors via Delaunay tessellation
    tri = Delaunay(centers)
    return (centers,colors_hists,segments,tri.vertex_neighbor_vertices)

# Get superpixels IDs for FG and BG from marking
def find_superpixels_under_marking(marking, superpixels):
    fg_segments = np.unique(superpixels[marking[:,:,0]!=255])
    bg_segments = np.unique(superpixels[marking[:,:,2]!=255])
    return (fg_segments, bg_segments)

# Sum up the histograms for a given selection of superpixel IDs, normalize
def cumulative_histogram_for_superpixels(ids, histograms):
    h = np.sum(histograms[ids],axis=0)
    return h / h.sum()

# Get a bool mask of the pixels for a given selection of superpixel IDs
def pixels_for_segment_selection(superpixels_labels, selection):
    pixels_mask = np.where(np.isin(superpixels_labels, selection), True, False)
    return pixels_mask

# Get a normalized version of the given histograms (divide by sum)
def normalize_histograms(histograms):
    return np.float32([h / h.sum() for h in histograms])

# Perform graph cut using superpixels histograms
def do_graph_cut(fgbg_hists, fgbg_superpixels, norm_hists, neighbors):
    num_nodes = norm_hists.shape[0]
    # Create a graph of N nodes, and estimate of 5 edges per node
    g = maxflow.Graph[float](num_nodes, num_nodes * 5)
    # Add N nodes
    nodes = g.add_nodes(num_nodes)
    hist_comp_alg = cv2.HISTCMP_KL_DIV
    # Smoothness term: cost between neighbors
    indptr,indices = neighbors
    for i in range(len(indptr)-1):
        N = indices[indptr[i]:indptr[i+1]] # list of neighbor superpixels
        hi = norm_hists[i] # histogram for center
        for n in N:
            # skip neighbor ids outside the valid range [0, num_nodes-1]
            if (n < 0) or (n >= num_nodes):
                continue
            # Create two edges (forwards and backwards) with capacities based on
            # histogram matching
            hn = norm_hists[n] # histogram for neighbor
            g.add_edge(nodes[i], nodes[n], 20-cv2.compareHist(hi, hn, hist_comp_alg),
                       20-cv2.compareHist(hn, hi, hist_comp_alg))
    # Match term: cost to FG/BG
    for i,h in enumerate(norm_hists):
        if i in fgbg_superpixels[0]:
            g.add_tedge(nodes[i], 0, 1000) # FG - set high cost to BG
        elif i in fgbg_superpixels[1]:
            g.add_tedge(nodes[i], 1000, 0) # BG - set high cost to FG
        else:
            g.add_tedge(nodes[i], cv2.compareHist(fgbg_hists[0], h, hist_comp_alg),
                        cv2.compareHist(fgbg_hists[1], h, hist_comp_alg))
    g.maxflow()
    return g.get_grid_segments(nodes)

if __name__ == '__main__':
    img = img_as_float(astronaut()[::2, ::2])
    img_marking = cv2.imread("astronaut_marking.png")
    centers, colors_hists, segments, neighbors = superpixels_histograms_neighbors(img)
    fg_segments, bg_segments = find_superpixels_under_marking(img_marking, segments)
    # get cumulative BG/FG histograms, before normalization
    fg_cumulative_hist = cumulative_histogram_for_superpixels(fg_segments, colors_hists)
    bg_cumulative_hist = cumulative_histogram_for_superpixels(bg_segments, colors_hists)
    norm_hists = normalize_histograms(colors_hists)
    graph_cut = do_graph_cut((fg_cumulative_hist, bg_cumulative_hist),
                             (fg_segments, bg_segments),
                             norm_hists,
                             neighbors)
    plt.subplot(1,2,2), plt.xticks([]), plt.yticks([])
    plt.title('segmentation')
    segmask = pixels_for_segment_selection(segments, np.nonzero(graph_cut))
    cv2.imwrite("output_segmentation.png", np.uint8(segmask * 255))
    plt.imshow(segmask)
    plt.subplot(1,2,1), plt.xticks([]), plt.yticks([])
    img = mark_boundaries(img, segments)
    img[img_marking[:,:,0]!=255] = (1,0,0)
    img[img_marking[:,:,2]!=255] = (0,0,1)
    plt.imshow(img)
    plt.title("SLIC + markings")
    plt.savefig("segmentation.png",bbox_inches='tight',dpi=96)

Result

Categories
code graphics opencv vision

Laplacian Pyramid Blending with Masks in OpenCV-Python

A small example on how to do Laplacian pyramid blending with an arbitrary mask.
Enjoy
Roy

# adapted from http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_pyramids/py_pyramids.html
import cv2
import numpy as np

def Laplacian_Pyramid_Blending_with_mask(A, B, m, num_levels = 6):
    # assume mask is float32 [0,1]
    # generate Gaussian pyramid for A,B and mask
    GA = A.copy()
    GB = B.copy()
    GM = m.copy()
    gpA = [GA]
    gpB = [GB]
    gpM = [GM]
    for i in range(num_levels):
        GA = cv2.pyrDown(GA)
        GB = cv2.pyrDown(GB)
        GM = cv2.pyrDown(GM)
        gpA.append(np.float32(GA))
        gpB.append(np.float32(GB))
        gpM.append(np.float32(GM))
    # generate Laplacian Pyramids for A,B and masks
    lpA = [gpA[num_levels-1]] # the bottom of the Lap-pyr holds the last (smallest) Gauss level
    lpB = [gpB[num_levels-1]]
    gpMr = [gpM[num_levels-1]]
    for i in range(num_levels-1,0,-1):
        # Laplacian: subtract upscaled version of lower level from current level
        # to get the high frequencies
        # dstsize keeps the upscaled size equal to the current level when a dimension is odd
        size = (gpA[i-1].shape[1], gpA[i-1].shape[0])
        LA = np.subtract(gpA[i-1], cv2.pyrUp(gpA[i], dstsize=size))
        LB = np.subtract(gpB[i-1], cv2.pyrUp(gpB[i], dstsize=size))
        lpA.append(LA)
        lpB.append(LB)
        gpMr.append(gpM[i-1]) # also reverse the masks
    # Now blend images according to mask in each level
    LS = []
    for la,lb,gm in zip(lpA,lpB,gpMr):
        ls = la * gm + lb * (1.0 - gm)
        LS.append(ls)
    # now reconstruct
    ls_ = LS[0]
    for i in range(1,num_levels):
        size = (LS[i].shape[1], LS[i].shape[0])
        ls_ = cv2.pyrUp(ls_, dstsize=size)
        ls_ = cv2.add(ls_, LS[i])
    return ls_

if __name__ == '__main__':
    A = cv2.imread("input1.png",0)
    B = cv2.imread("input2.png",0)
    m = np.zeros_like(A, dtype='float32')
    m[:,A.shape[1]//2:] = 1 # make the mask half-and-half (integer division for the index)
    lpb = Laplacian_Pyramid_Blending_with_mask(A, B, m, 5)
    # the blend is float32; clip to [0,255] and convert before writing
    cv2.imwrite("lpb.png",np.uint8(np.clip(lpb, 0, 255)))

Categories
3d code opencv programming vision

Structure-from-Motion Toy Lib Upgrades to OpenCV 3

Hello again!
After a long hiatus I’m back with an update. Recently I’ve been upgrading the Structure-from-Motion Toy Library (https://github.com/royshil/SfM-Toy-Library/) to OpenCV 3.x from OpenCV 2.4.x.

Categories
code graphics opencv

Quickly: How to render a PDF to an image in C++?

Using Poppler, of course!
Poppler is a very useful tool for handling PDFs, as I’ve discovered lately. After I tried both muPDF and ImageMagick’s Magick++ and failed, Poppler stepped up to the challenge and paid off.
So here’s a small example of how to work the API (with OpenCV, naturally):

#include <iostream>
#include <fstream>
#include <sstream>
#include <memory>
#include <opencv2/opencv.hpp>
#include <poppler-document.h>
#include <poppler-page.h>
#include <poppler-page-renderer.h>
#include <poppler-image.h>
using namespace cv;
using namespace std;
using namespace poppler;

Mat readPDFtoCV(const string& filename,int DPI) {
    // load_from_file and create_page return owning raw pointers;
    // unique_ptr releases them on every return path
    unique_ptr<document> mypdf(document::load_from_file(filename));
    if(!mypdf) {
        cerr << "couldn't read pdf\n";
        return Mat();
    }
    cout << "pdf has " << mypdf->pages() << " pages\n";
    unique_ptr<page> mypage(mypdf->create_page(0)); // first page only
    if(!mypage) {
        cerr << "couldn't read first page\n";
        return Mat();
    }
    page_renderer renderer;
    renderer.set_render_hint(page_renderer::text_antialiasing);
    image myimage = renderer.render_page(mypage.get(),DPI,DPI);
    cout << "created image of  " << myimage.width() << "x"<< myimage.height() << "\n";
    Mat cvimg;
    // pass bytes_per_row() as the row step in case rows are padded;
    // copyTo makes a deep copy so cvimg outlives myimage
    if(myimage.format() == image::format_rgb24) {
        Mat(myimage.height(),myimage.width(),CV_8UC3,myimage.data(),myimage.bytes_per_row()).copyTo(cvimg);
    } else if(myimage.format() == image::format_argb32) {
        Mat(myimage.height(),myimage.width(),CV_8UC4,myimage.data(),myimage.bytes_per_row()).copyTo(cvimg);
    } else {
        cerr << "PDF format no good\n";
        return Mat();
    }
    return cvimg;
}

All you have to do is give it the DPI (say you want to render in 100 DPI) and a filename.
Keep in mind it only renders the first page, but getting the other pages is just as easy.
That’s it, enjoy!
Roy.

Categories
3d Augmented Reality code graphics Mapping opengl programming Tracking video vision

Bootstrapping planar AR and tracking without markers [w/code]

Years ago I wanted to implement PTAM. I was young and naïve 🙂
Well I got a few moments to spare on a recent sleepless night, and I set out to implement the basic bootstrapping step of initializing a map with a planar object – no known markers needed, and then tracking it for augmented reality purposes.

Categories
code Music opencv programming vision

Using Hidden Markov Models for staff line removal (in OMR) [w/code]

So lately I’m into Optical Music Recognition (OMR), and a central part of that is doing staff line removal. That is when you get rid of the staff lines that obscure the musical symbols to make recognition much easier. There are a lot of ways to do it, but I’m going to share with you how I did it (fairly easily) with Hidden Markov Models (HMMs), which will also teach us a good lesson on this wonderfully useful approach.

OMR has been around for ages, and if you’re interested in learning about it, [Fornes 2014] and [Rebelo 2012] are good summary articles.
The matter of Staff Line Removal has occupied dozens of researchers for as long as OMR has existed; [Dalitz 2008] gives a good overview. Basically the goal is to remove the staff lines that obscure the musical symbols, so they would be easier to recognize.

But, the staff lines are connected to the symbols, so simply removing them will cut up the symbols and make them hardly recognizable.
So let’s see how we could do this with HMMs.
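
The HMM model from the full post is not part of this excerpt. As a generic illustration of how an HMM decodes a sequence (not the author's staff-line model), here is a minimal Viterbi sketch over binary observations. The two-state toy model and all of its probabilities are made up for this example.

```python
import numpy as np

def viterbi(obs, log_start, log_trans, log_emit):
    """Most likely hidden state sequence for a discrete HMM (log domain).
    log_trans[i, j]: log P(state j at t | state i at t-1)
    log_emit[s, o]:  log P(observing symbol o | state s)"""
    n_states = log_start.shape[0]
    T = len(obs)
    score = np.empty((T, n_states))
    back = np.zeros((T, n_states), dtype=int)
    score[0] = log_start + log_emit[:, obs[0]]
    for t in range(1, T):
        # cand[i, j]: best score ending in state i at t-1, then moving to j
        cand = score[t - 1][:, None] + log_trans
        back[t] = np.argmax(cand, axis=0)
        score[t] = cand.max(axis=0) + log_emit[:, obs[t]]
    # Trace the best path backwards from the best final state
    path = np.empty(T, dtype=int)
    path[-1] = np.argmax(score[-1])
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# Toy model (all numbers are hypothetical): states tend to persist,
# and each state usually emits the symbol with the same index.
log_start = np.log([0.5, 0.5])
log_trans = np.log([[0.9, 0.1], [0.1, 0.9]])
log_emit = np.log([[0.8, 0.2], [0.2, 0.8]])

smoothed = viterbi([0, 0, 0, 1, 0, 0, 0], log_start, log_trans, log_emit)
kept = viterbi([0, 0, 1, 1, 1, 1, 0, 0], log_start, log_trans, log_emit)
```

In this toy setup an isolated `1` is treated as noise and smoothed away, while a longer run of `1`s survives. Using sequence context to decide what to keep is the property that makes HMMs attractive for this kind of problem.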

Categories
code graphics opencv vision

Run length encoding in OpenCV [w/code]

Sharing a simple code snippet for run-length encoding with OpenCV…
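
The snippet itself is not included in this excerpt. For reference, here is a minimal sketch of run-length encoding for a single image row, written with NumPy rather than the OpenCV code from the full post; the function names are hypothetical.

```python
import numpy as np

def run_length_encode(row):
    """Encode a 1D array as a list of (value, run_length) pairs."""
    row = np.asarray(row).ravel()
    if row.size == 0:
        return []
    # Indices where the value changes, i.e. where a new run starts
    change = np.flatnonzero(row[1:] != row[:-1]) + 1
    starts = np.concatenate(([0], change))
    lengths = np.diff(np.concatenate((starts, [row.size])))
    return [(row[s].item(), int(n)) for s, n in zip(starts, lengths)]

def run_length_decode(runs):
    """Inverse of run_length_encode."""
    if not runs:
        return np.array([])
    return np.concatenate([np.full(n, v) for v, n in runs])

runs = run_length_encode([0, 0, 255, 255, 255, 0])
restored = run_length_decode(runs)
```

A binary image can be encoded by applying this to each row, or to the flattened image.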

Categories
graphics opencv vision

Simplest Color Balance with OpenCV [w/code]

I came across an extremely simple color balancing algorithm here, and I thought I’d quickly port it to OpenCV.
Here’s the gist:
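
The gist is not reproduced in this excerpt. As a rough illustration only, and assuming the algorithm in question is the usual percentile-stretch kind (saturate a small fraction of the darkest and brightest pixels in each channel, then stretch the rest to the full range), a NumPy sketch could look like this. The function name and the `percent` parameter are assumptions, not the author's code.

```python
import numpy as np

def simple_color_balance(img, percent=1.0):
    """Per-channel contrast stretch of an H x W x C image.
    percent is the total share of pixels to saturate, split evenly
    between the dark and bright ends of each channel."""
    img = np.asarray(img)
    out = np.empty_like(img, dtype=np.uint8)
    for c in range(img.shape[2]):
        chan = img[:, :, c].astype(np.float64)
        lo, hi = np.percentile(chan, [percent / 2.0, 100.0 - percent / 2.0])
        if hi <= lo:
            # Flat channel: nothing to stretch, keep it unchanged
            out[:, :, c] = img[:, :, c]
            continue
        stretched = (np.clip(chan, lo, hi) - lo) * 255.0 / (hi - lo)
        out[:, :, c] = np.round(stretched).astype(np.uint8)
    return out

# A low-contrast test image whose values span only 50..150
ramp = np.arange(50, 151, dtype=np.uint8)
low_contrast = np.tile(ramp[None, :, None], (4, 1, 3))
balanced = simple_color_balance(low_contrast, percent=0.0)
```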

Categories
code ffmpeg graphics opencv video vision

Extending the hand tracker with snakes and optimizations [w/ code, OpenCV]

I wish to report a number of tweaks and additions to the hand silhouette tracker I posted a while back. First is the ability for it to “snap” to the object using a simple Active Snake method; another is a more advanced resampling technique (the older tracker always resampled after every frame); and there are a number of optimizations to increase the speed (the tracker now runs in real time on a single core).