Update: check out my new post about this: https://www.morethantechnical.com/2012/10/17/head-pose-estimation-with-opencv-opengl-revisited-w-code/
Hi
Just wanted to share a small thing I did with OpenCV – Head Pose Estimation (sometimes known as Gaze Direction Estimation). Many people try to achieve this and there are a ton of papers covering it, including a recent overview of almost all known methods.
I implemented a very quick & dirty solution based on OpenCV’s internal methods that produced surprising results (I expected it to fail), so I decided to share. It is based on 3D-2D point correspondence and then fitting of the points to the 3D model. OpenCV provides a magical method – solvePnP – that does this, given some calibration parameters that I completely disregarded.
Here’s how it’s done
Hi
Been working hard at a project for school the past month, implementing one of the more interesting works I’ve seen in the AR arena: Parallel Tracking and Mapping (PTAM) [PDF]. This is a work by Georg Klein [homepage] and David Murray from Oxford University, presented at ISMAR 2007.
When I first saw it on YouTube [link] I immediately saw the immense potential – mobile markerless augmented reality. I thought I should get to know this work a bit more closely, so I chose to implement it as part of the advanced computer vision course given by Dr. Lior Wolf [link] at TAU.
The work is very extensive, and is clearly the result of deep research in the field, so I set out to achieve a few selected features: stereo initialization, tracking, and small-map upkeep. I chose not to implement relocalization and full map handling.
This post is kind of a tutorial for 3D reconstruction with OpenCV 2.0. I will show practical use of the functions in cvtriangulation.cpp, which are not documented and are in fact incomplete. Furthermore, I’ll show how to easily combine OpenCV and OpenGL for 3D augmentations, something that is only briefly described in the docs or online.
Here are the steps I took and the things I learned in the process of implementing the work.
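To give a feel for what the triangulation step in 3D reconstruction computes, here is a minimal NumPy sketch of linear (DLT) triangulation of one point from two views. It is a generic textbook formulation, not the code in cvtriangulation.cpp and not the PTAM implementation; the function name and the choice of NumPy are assumptions made for illustration.

```python
import numpy as np


def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of a single 3D point.

    P1, P2: 3x4 camera projection matrices of the two views.
    x1, x2: (u, v) image coordinates of the same point in each view.
    Returns the 3D point as a length-3 array.
    """
    P1 = np.asarray(P1, dtype=np.float64)
    P2 = np.asarray(P2, dtype=np.float64)
    # Each view contributes two linear constraints on the homogeneous point X.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```

With noisy correspondences the result is only a least-squares estimate in an algebraic sense, so it is usually refined afterwards.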
Update: A nice patch by yazor fixes the video mismatching – thanks! Also, a nice application by Zentium called “iKat” is doing some kick-ass mobile markerless augmented reality.
Hi All
It looks like it’s finally here – a way to grab the raw data of the camera frames on the iPhone OS 3.x.
Update: Apple officially supports this in iOS 4.x using AVFoundation, here’s sample code from Apple developer.
A gifted hacker named John DeWeese was nice enough to comment on a post from May ’09 with his method of hacking the APIs to get the frames. Though cumbersome, it looks like it should work, but I haven’t tried it yet. I promise to try it soon and share my results.
Way to go John!
Some code would be awesome…
Roy.
Hi
In the past few weeks I have been working hard on a few end-of-term projects at Uni. One of them, which I called “SmartHome”, is a home monitoring [link] application for the Embedded computing [link] course. In the course the students were given an education board based on the LPC2148 ARM7 MCU (NXP), made by Embedded Artists [link]. My partner Gil and I decided to work with ZigBee extension modules [link] to enable remote communication.
Here are the steps we took to bring this project to life.
I want to suggest a trick that worked for me. My workplace blocks most of the popular radio stations’ streaming sites in my country.
I can understand why they’re doing that, but hey – if you want to save bandwidth I suggest you block YouTube (not that I’m complaining…)
Well, I thought of a way to listen to my favorite radio station from work, by re-streaming it from my home. And it worked!
It can also work for you, in case your IT does not block by protocol, only by address.
So here’s how to do it:
Hi
I wanted to do the simplest recoloring/color-transfer I could find – and the internet is just a bust. Nothing free, good and usable available online… So I implemented the simplest color transfer algorithm in the world – Histogram Matching.
Here’s the implementation with OpenCV
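For readers who just want the gist of histogram matching, here is a minimal sketch in plain NumPy (not the OpenCV implementation from the full post). It assumes 8-bit images and matches each color channel independently; the function names are hypothetical.

```python
import numpy as np


def match_histogram(source, reference):
    """Remap an 8-bit single-channel image so that its histogram
    approximates the histogram of `reference`."""
    src_hist = np.bincount(source.ravel(), minlength=256)
    ref_hist = np.bincount(reference.ravel(), minlength=256)
    # Cumulative distribution functions of both images, in [0, 1].
    src_cdf = np.cumsum(src_hist) / source.size
    ref_cdf = np.cumsum(ref_hist) / reference.size
    # For every source level, pick the first reference level whose CDF
    # reaches the CDF value of that source level.
    lut = np.searchsorted(ref_cdf, src_cdf, side="left")
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[source]


def transfer_color(source, reference):
    """Apply histogram matching to each channel of an H x W x C uint8 image."""
    channels = [
        match_histogram(source[..., c], reference[..., c])
        for c in range(source.shape[-1])
    ]
    return np.dstack(channels)
```

Matching the channels independently is the crudest option; it can shift hues, so doing the same thing in a decorrelated color space is a common refinement.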
Links of the week
http://www.runnersworld.com/article/1,7124,s6-240-319–13001-0,00.html
Shoes tying hacks
http://www.engadget.com/2010/01/18/misa-digital-guitar-cuts-the-strings-brings-the-noise/
Very nice! A digital guitar…
http://www.newscientist.com/article/dn18036
An interesting concept – see-through walls w/ augmented reality
http://gizmodo.com/5452140/one-third-of-us-11+year+olds-have-cellphones
The “Youth market”’s little brother – the “Toddler market” – is booming
http://gizmodo.com/5451876/rumor-apple-iphone-os-40-features-detailed
Some goodies from iPhone OS 4 – where is video-pixel-bytes access already?!
http://lifehacker.com/5452786/memorize-now-helps-you-commit-long-passages-to-memory
I like! A helper webapp to memorize text
http://gizmodo.com/5452684/voice-band-iphone-app-converts-bah-ba-ba-bah-into—
This is awesome.
http://gizmodo.com/5453436/googles-html5-youtube-videos-dont-need-flash
YouTube without Flash: I tried it on Chrome, the video was choppy, volume control didn’t work properly and the progressive download & play made the position marker bounce around. But in the end, anything that replaces Flash, and Adobe’s reign over internet interactive animation, is good.
C ya’ll next week!
Roy.
Hi
Stuff I picked up on the web the last week:
http://www.billshrink.com/blog/nexus-one-vs-iphone-droid-palm-pre-total-cost-of-ownership/
Compare the leading smartphones on the market
http://www.techcrunch.com/2010/01/05/quantcast-mobile-web-apple-android/
Mobile web usage stats: iPhone 65%, Android 12%, RIM 9%
http://gizmodo.com/5442217/the-invisible-oled-laptop-to-end-all-laptops
A transparent screen – Cool? yes. Practical? Not so much.
http://www.techcrunch.com/2010/01/06/augmented-reality-vs-virtual-reality/
Augmented reality is officially more popular than virtual reality.
http://gizmodo.com/5439721/new-touchless-mobile-interface-could-eliminate-fingerprint-smudging-forever
You don’t need a mouse anymore (if you have a 154 frames-per-second camera, and very steady hands)
http://gizmodo.com/5442385/samsung-projector-phone-in-action
Samsung’s projector mobile phone in action at CES
http://weblogs.baltimoresun.com/news/technology/2010/01/apple_tablet_3d.html
Apple is putting proximity sensors on a new device to allow for 3D desktop manipulation.
http://gizmodo.com/5441682/att-sdk-for-dumbphones-announced
AT&T goes app-store on dumbphones, releases SDK for BREW
C y’all next week!
Roy
Just links [Links of the week]
Hi
Stuff I picked up on the web the last week:
http://www.techcrunch.com/2009/12/21/world-map-social-networks/
Never tired of infographics: World map of social networks
http://gizmodo.com/5429631/implausible-digital-forensics-in-tv-and-film-a-medley
Awe-some and then some. Enhance.
http://lifehacker.com/5431998/ribbit-app-delivers-voicemail-transcripts-to-your-iphone
Give Ribbit credit for the bold stand in front of G-Voice.
http://gizmodo.com/5428610/rumor-google-working-on-chrome-os+branded-netbook-with-one-secret-manufacturer
A G netbook! That’s what we’ve been missing! Not.
http://gizmodo.com/5428642/apple-patent-sees-you-computing-hands+free-in-3d
3D interface by Apple, based on position of user.
http://gizmodo.com/5433074/open-apps-on-a-virtual-iphone-thanks-to-augmented-reality
Orange Israel promoting iPhones in a cute way: iPhone inside iPhone with AR.
http://www.techcrunch.com/2009/12/23/confirmed-jajah-sold-207-million
JaJah sold to Telefonica (O2) – for 145 million euros!
See ya’ll next week!
Roy.
Leenky wiks! [Links of the week]
Hello people of high measure,
Forth are listed the hyperlinks thy humble servant hath collected in past days:
http://gizmodo.com/5423006/i-cant-stop-smiling-over-google-chromes-new-ad
Nice ad by G for Chrome!
http://www.techcrunch.com/2009/12/09/geoapi-creation/
Very interesting: huge database of geo-tagged information with API for developers.
http://gizmodo.com/5424468/mits-bidirectional-display-lets-you-control-objects-with-a-wave-of-your-hand
I saw this contraption in the lab and the guy demoed it for me. It’s strange looking, but an interesting concept.
http://gizmodo.com/5425146/the-real-google-phone-everything-is-different-now
A G phone!
This is only a fraction of the online buzz about…
http://gizmodo.com/5425012/the-pen-de-touch-for-driving-light-cycles
Air-Pen. Nice implementation. But is this how we’re going to interface with computers in the future? Don’t think so.
http://www.techcrunch.com/2009/12/14/4g-mobile-network-sweden-teliasonera/
Välkommen till Sverige – LTE! (“Welcome to Sweden – LTE!” in Swedish)
TeliaSonera launching LTE
http://www.techcrunch.com/2009/12/14/the-unofficial-google-text-to-speech-api/
Free Text-To-Speech from G! Hurrah!
http://gizmodo.com/5425874/fuse-what-your-next-touch-phone-is-going-to-feel-like
TAT are as usual a good indicator of future UI (Social AR…). This time: 3D UI, haptic interface.
http://gizmodo.com/5426963/the-android-market-is-getting-ready-to-explode
Some stats on the Android market, looks good.
http://www.techcrunch.com/2009/12/16/google-browser-size/
G seem very committed to making the web better. First tools for webmasters to make their sites faster, and now making sure the resolution fits.
http://gizmodo.com/5428174/shooting-challenge-anthropomorphism
Anthropomorphism: Look at the video (though it’s an ad), it will put a smile on your face!
And then go shoot some anthropomorphous objects.
http://gizmodo.com/5428233/microsoft-and-palm-treading-water-while-other-mobile-platforms-grow
Mobile OS stats (Feb-Oct): Apple and RIM skyrocket! WinMo, Symb, Palm, Android – stagnate.
Enjoy thy weekend!
Roy.