This is my first attempt at writing a post in a Jupyter notebook; I hope it makes sense.
I’ve recently taught a class on generative models: http://hi.cs.stonybrook.edu/teaching/cdt450
In class we’ve manipulated face images with neural networks.
One important thing I found that helped is to align the images so the facial features overlap.
It helps the nets learn the variance in faces better, rather than waste their “representation power” on the shift between faces.
The following is some code to align face images using the excellent Dlib (python bindings) http://dlib.net. First I run a standard face detector, and then I use the output of the facial features extractor for a complete alignment of the face.
After the alignment – I’m just having fun with the aligned dataset 🙂
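To make the alignment idea concrete, here is a minimal, dependency-free sketch of one common approach: build a similarity transform (rotation, uniform scale, translation) that moves the two eye centers to fixed positions in the output image. The eye coordinates are assumed to come from a landmark extractor such as Dlib's; the function names, the output size and the target eye placement (`eye_y`, `eye_dist`) are illustrative assumptions, not values from the notebook.

```python
import math

def eye_alignment_matrix(left_eye, right_eye, out_size=128, eye_y=0.4, eye_dist=0.4):
    """Return a 2x3 affine matrix that levels the eye line and fixes the eye positions.

    left_eye / right_eye are (x, y) pixel coordinates in the source image.
    eye_y and eye_dist are fractions of out_size (illustrative defaults).
    """
    (lx, ly), (rx, ry) = left_eye, right_eye
    dx, dy = rx - lx, ry - ly
    angle = math.atan2(dy, dx)                        # current tilt of the eye line
    scale = (eye_dist * out_size) / math.hypot(dx, dy)
    # Scaled rotation by -angle: [[c, s], [-s, c]]
    c, s = scale * math.cos(angle), scale * math.sin(angle)
    # Translate so the midpoint between the eyes lands at the target location
    mx, my = (lx + rx) / 2.0, (ly + ry) / 2.0
    tx = out_size / 2.0 - (c * mx + s * my)
    ty = eye_y * out_size - (-s * mx + c * my)
    return [[c, s, tx], [-s, c, ty]]

def transform_point(M, p):
    """Apply a 2x3 affine matrix to a single (x, y) point."""
    x, y = p
    return (M[0][0] * x + M[0][1] * y + M[0][2],
            M[1][0] * x + M[1][1] * y + M[1][2])
```

With OpenCV available, a matrix like this can be converted to a float NumPy array and passed to cv2.warpAffine to produce the aligned crop. Every face then has its eyes at the same pixels, which is the overlap of facial features described above.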
Tag: opencv
As part of the computer vision class I’m teaching at SBU I asked students to implement a segmentation method based on SLIC superpixels. Here is my boilerplate implementation.
This follows work I did a very long time ago (2010) on the same subject.
For graph-cut I’ve used PyMaxflow: https://github.com/pmneila/PyMaxflow, which installs easily with just: pip install PyMaxflow
The method is simple:
- Calculate SLIC superpixels (the scikit-image implementation)
- Use markings to determine the foreground and background color histograms (from the superpixels under the markings)
- Set up a graph with a straightforward energy model: Smoothness term = K-L divergence between a superpixel’s histogram and its neighbor’s histogram, and Match term = inf if the superpixel is marked as BG or FG, otherwise the K-L divergence between the superpixel’s histogram and the FG and BG histograms.
- To find neighbors I’ve used Delaunay tessellation (from scipy.spatial), for simplicity. But a full neighbor finding could be implemented by looking at all the neighbors on the superpix’s boundary.
- Color histograms are 2D over H-S (from the HSV)
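The energy terms in the list above can be sketched in a few lines. This is a plain-Python illustration and not the boilerplate implementation itself: histograms are assumed to be flattened to 1D lists of bin counts, the `eps` smoothing (to avoid log of zero) is my assumption, and the helper names are hypothetical. How the costs are turned into graph edge capacities for PyMaxflow is deliberately left out.

```python
import math

def normalize(hist, eps=1e-8):
    """Turn bin counts into a smoothed probability distribution."""
    total = sum(hist)
    return [(h + eps) / (total + eps * len(hist)) for h in hist]

def kl_divergence(p, q, eps=1e-8):
    """K-L divergence D(p || q) between two histograms of equal length."""
    p, q = normalize(p, eps), normalize(q, eps)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def match_costs(hist, fg_hist, bg_hist, mark=None):
    """Return (cost of labeling FG, cost of labeling BG) for one superpixel.

    A user marking forces the label by making the other choice infinitely costly.
    """
    if mark == "fg":
        return (0.0, math.inf)
    if mark == "bg":
        return (math.inf, 0.0)
    return (kl_divergence(hist, fg_hist), kl_divergence(hist, bg_hist))

def smoothness_cost(hist_a, hist_b):
    """Pairwise term between two neighboring superpixels."""
    return kl_divergence(hist_a, hist_b)
```

Note that K-L divergence is not symmetric, so the argument order matters; a symmetrized variant would be an equally reasonable choice for the pairwise term.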
Result
A small example of how to do Laplacian pyramid blending with an arbitrary mask.
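For readers who want the idea before the code, here is a toy 1D version in plain Python, under simplifying assumptions: signals are lists of floats instead of images, the low-pass filter is a 3-tap [1, 2, 1]/4 kernel, and upsampling repeats samples and then blurs. It is a sketch of the general technique, not the implementation from the post. Each signal is split into band-pass (Laplacian) levels, the mask is blurred into a Gaussian pyramid, the bands are mixed level by level, and the result is collapsed back.

```python
def blur(x):
    """3-tap [1, 2, 1] / 4 low-pass filter with clamped edges."""
    n = len(x)
    return [(x[max(i - 1, 0)] + 2 * x[i] + x[min(i + 1, n - 1)]) / 4.0
            for i in range(n)]

def downsample(x):
    """Blur, then keep every other sample."""
    return blur(x)[::2]

def upsample(x, n):
    """Repeat samples up to length n, then blur to smooth the steps."""
    return blur([x[min(i // 2, len(x) - 1)] for i in range(n)])

def gaussian_pyramid(x, levels):
    pyramid = [x]
    for _ in range(levels):
        x = downsample(x)
        pyramid.append(x)
    return pyramid

def laplacian_pyramid(x, levels):
    pyramid = []
    for _ in range(levels):
        small = downsample(x)
        up = upsample(small, len(x))
        pyramid.append([a - b for a, b in zip(x, up)])  # band-pass detail
        x = small
    pyramid.append(x)                                   # low-pass residual
    return pyramid

def blend(a, b, mask, levels=3):
    """Blend a and b; mask is 1.0 where a should show and 0.0 where b should."""
    la, lb = laplacian_pyramid(a, levels), laplacian_pyramid(b, levels)
    gm = gaussian_pyramid(mask, levels)
    mixed = [[m * p + (1.0 - m) * q for p, q, m in zip(pa, pb, pm)]
             for pa, pb, pm in zip(la, lb, gm)]
    out = mixed[-1]
    for band in reversed(mixed[:-1]):                   # collapse coarse to fine
        out = [u + d for u, d in zip(upsample(out, len(band)), band)]
    return out
```

Because the mask is blurred more at coarser levels, low frequencies mix over a wide region while fine detail switches sharply at the mask edge, which is what hides the seam.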
Enjoy
Roy
Hello again!
After a long hiatus I’m back with an update. Recently I’ve been upgrading the Structure-from-Motion Toy Library (https://github.com/royshil/SfM-Toy-Library/) to OpenCV 3.x from OpenCV 2.4.x.
Using Poppler, of course!
Poppler is a very useful tool for handling PDF, as I’ve discovered lately. After I tried both MuPDF and ImageMagick’s Magick++ and failed, Poppler stepped up to the challenge and delivered.
So here’s a small example of how to work the API (with OpenCV, naturally):
#include <iostream>
#include <memory>
#include <string>
#include <opencv2/opencv.hpp>
#include <poppler-document.h>
#include <poppler-page.h>
#include <poppler-page-renderer.h>
#include <poppler-image.h>

using namespace cv;
using namespace std;
using namespace poppler;

// Render the first page of a PDF into an OpenCV Mat at the given DPI.
// Returns an empty Mat on failure.
Mat readPDFtoCV(const string& filename, int DPI) {
    // load_from_file() and create_page() hand back owning raw pointers,
    // so wrap them to avoid leaking on every return path.
    unique_ptr<document> mypdf(document::load_from_file(filename));
    if (!mypdf) {
        cerr << "couldn't read pdf\n";
        return Mat();
    }
    cout << "pdf has " << mypdf->pages() << " pages\n";

    unique_ptr<page> mypage(mypdf->create_page(0));
    if (!mypage) {
        cerr << "couldn't read first page\n";
        return Mat();
    }

    page_renderer renderer;
    renderer.set_render_hint(page_renderer::text_antialiasing);
    image myimage = renderer.render_page(mypage.get(), DPI, DPI);
    if (!myimage.is_valid()) {
        cerr << "couldn't render page\n";
        return Mat();
    }
    cout << "created image of " << myimage.width() << "x" << myimage.height() << "\n";

    int type;
    if (myimage.format() == image::format_rgb24) {
        type = CV_8UC3;  // channel order may need swapping (RGB vs. OpenCV's BGR)
    } else if (myimage.format() == image::format_argb32) {
        type = CV_8UC4;
    } else {
        cerr << "PDF format no good\n";
        return Mat();
    }

    // Wrap Poppler's buffer using its own row stride, then deep-copy it
    // so the Mat outlives the poppler::image.
    return Mat(myimage.height(), myimage.width(), type,
               myimage.data(), myimage.bytes_per_row()).clone();
}
All you have to do is give it the DPI (say you want to render at 100 DPI) and a filename.
Keep in mind it only renders the first page, but getting the other pages is just as easy: pass a different page index to create_page.
That’s it, enjoy!
Roy.
Years ago I wanted to implement PTAM. I was young and naïve 🙂
Well I got a few moments to spare on a recent sleepless night, and I set out to implement the basic bootstrapping step of initializing a map with a planar object – no known markers needed, and then tracking it for augmented reality purposes.
So lately I’m into Optical Music Recognition (OMR), and a central part of that is doing staff line removal. That is when you get rid of the staff lines that obscure the musical symbols to make recognition much easier. There are a lot of ways to do it, but I’m going to share with you how I did it (fairly easily) with Hidden Markov Models (HMMs), which will also teach us a good lesson on this wonderfully useful approach.
OMR has been around for ages, and if you’re interested in learning about it, [Fornes 2014] and [Rebelo 2012] are good summary articles.
The matter of Staff Line Removal has occupied dozens of researchers for as long as OMR has existed; [Dalitz 2008] gives a good overview. Basically the goal is to remove the staff lines that obscure the musical symbols, so they would be easier to recognize.
But the staff lines are connected to the symbols, so simply removing them will cut up the symbols and make them hardly recognizable.
So let’s see how we could do this with HMMs.
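As a warm-up, here is a generic Viterbi decoder in plain Python together with a toy two-state model. Everything about the toy is an assumption made for illustration: the states ("staff" and "symbol"), the observation alphabet ("thin" or "thick" vertical black run per column) and all the probabilities are invented, and this is not the model from the post. It only shows how an HMM prefers a consistent labeling over reacting to every noisy observation.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely state sequence for obs (log-domain Viterbi)."""
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]])
               for s in states}]
    back = []
    for o in obs[1:]:
        cur, ptr = {}, {}
        for s in states:
            # Best previous state to arrive from
            prev = max(states, key=lambda r: scores[-1][r] + math.log(trans_p[r][s]))
            cur[s] = (scores[-1][prev] + math.log(trans_p[prev][s])
                      + math.log(emit_p[s][o]))
            ptr[s] = prev
        scores.append(cur)
        back.append(ptr)
    # Trace back from the best final state
    path = [max(states, key=lambda s: scores[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

# Toy model (all numbers are made up for illustration)
STATES = ("staff", "symbol")
START = {"staff": 0.5, "symbol": 0.5}
TRANS = {"staff": {"staff": 0.9, "symbol": 0.1},
         "symbol": {"staff": 0.1, "symbol": 0.9}}
EMIT = {"staff": {"thin": 0.9, "thick": 0.1},
        "symbol": {"thin": 0.2, "thick": 0.8}}
```

With these numbers, a sustained run of "thick" columns is decoded as "symbol", while a single "thick" column surrounded by "thin" ones stays "staff", because two state switches cost more than one unlikely emission.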
I came across an extremely simple color balancing algorithm here, and I thought I’d quickly port it to OpenCV.
Here’s the gist:
I want to report a number of tweaks and additions to the hand silhouette tracker I posted a while back. The first is the ability for it to “snap” to the object using a simple Active Snake method, the second is a more advanced resampling technique (the older tracker always resampled after every frame), and the third is a number of optimizations to increase the speed (the tracker now runs in real time on a single core).