Clever math could enable a high-quality 3-D camera so simple, cheap and power-efficient that it could be incorporated into handheld devices.
When Microsoft’s Kinect — a device that lets Xbox users control games with physical gestures — hit the market, computer scientists immediately began hacking it. A black plastic bar about 11 inches wide with an infrared rangefinder and a camera built in, the Kinect produces a visual map of the scene before it, with information about the distance to individual objects. At MIT alone, researchers have used the Kinect to create a “Minority Report”-style computer interface, a navigation system for miniature robotic helicopters and a holographic-video transmitter, among other things.
Now imagine a device that provides more-accurate depth information than the Kinect, has a greater range and works under all lighting conditions — but is so small, cheap and power-efficient that it could be incorporated into a cellphone at very little extra cost. That’s the promise of recent work by Vivek Goyal, the Esther and Harold E. Edgerton Associate Professor of Electrical Engineering, and his group at MIT’s Research Lab of Electronics.
“3-D acquisition has become a really hot topic,” Goyal says. “In consumer electronics, people are very interested in 3-D for immersive communication, but then they’re also interested in 3-D for human-computer interaction.”
Andrea Colaco, a graduate student at MIT’s Media Lab and one of Goyal’s co-authors on a paper that will be presented at the IEEE’s International Conference on Acoustics, Speech, and Signal Processing in March, points out that gestural interfaces make it much easier for multiple people to interact with a computer at once — as in the dance games the Kinect has popularized.
“When you’re talking about a single person and a machine, we’ve sort of optimized the way we do it,” Colaco says. “But when it’s a group, there’s less flexibility.”
Ahmed Kirmani, a graduate student in the Department of Electrical Engineering and Computer Science and another of the paper’s authors, adds, “3-D displays are way ahead in terms of technology as compared to 3-D cameras. You have these very high-resolution 3-D displays that are available that run at real-time frame rates.
“Sensing is always hard,” he says, “and rendering it is easy.”