The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies

Researchers thought a great deal about which sensors to give a robot (cameras? lasers? sonar?) and how to interpret the reams of data they provide, but progress was slow. As a 2008 review of the topic summarized, SLAM “is one of the fundamental challenges of robotics . . . [but it] seems that almost all the current approaches can not perform consistent maps for large areas, mainly due to the increase of the computational cost and due to the uncertainties that become prohibitive when the scenario becomes larger.”
16
In short, sensing a sizable area and immediately crunching all the resulting data were thorny problems preventing real progress with SLAM. Until, that is, a $150 video-game accessory came along just two years after the sentences above were published.

In November 2010 Microsoft first offered the Kinect sensing device as an addition to its Xbox gaming platform. The Kinect could keep track of two active players, monitoring as many as twenty joints on each. If one player moved in front of the other, the device made a best guess about the obscured person’s movements, then seamlessly picked up all joints once he or she came back into view. Kinect could also recognize players’ faces, voices, and gestures and do so across a wide range of lighting and noise conditions. It accomplished this with digital sensors including a microphone array (which pinpointed the source of sound better than a single microphone could), a standard video camera, and a depth perception system that both projected and detected infrared light. Several onboard processors and a great deal of proprietary software converted the output of these sensors into information that game designers could use.
17
At launch, all of this capability was packed into a four-inch-tall device less than a foot wide that retailed for $149.99.

The Kinect sold more than eight million units in the sixty days after its release (more than either the iPhone or iPad) and currently holds the Guinness World Record for the fastest-selling consumer electronics device of all time.
18
The initial family of Kinect-specific games let players play darts, exercise, brawl in the streets, and cast spells à la Harry Potter.
19
These, however, did not come close to exhausting the system’s possibilities. In August of 2011 at the SIGGRAPH (short for the Association for Computing Machinery’s Special Interest Group on Computer Graphics and Interactive Techniques) conference in Vancouver, British Columbia, a team of Microsoft employees and academics used Kinect to “SLAM” the door shut on a long-standing challenge in robotics.

SIGGRAPH is the largest and most prestigious gathering devoted to research and practice on digital graphics, attended by researchers, game designers, journalists, entrepreneurs, and most others interested in the field. This made it an appropriate place for Microsoft to unveil what the Creators Project website called “The Self-Hack That Could Change Everything.”
*
20
This was KinectFusion, a project that used the Kinect to tackle the SLAM problem.

In a video shown at SIGGRAPH 2011, a person picks up a Kinect and points it around a typical office containing chairs, a potted plant, and a desktop computer and monitor.
21
As he does, the video splits into multiple screens that show what the Kinect is able to sense. It immediately becomes clear that if the Kinect is not completely solving the SLAM problem for the room, it’s coming close. In real time, Kinect draws a three-dimensional map of the room and all the objects in it, including a coworker. It picks up the word DELL pressed into the plastic on the back of the computer monitor, even though the letters are not colored and are only one millimeter deeper than the rest of the monitor’s surface. The device knows where it is in the room at all times, and even knows how virtual ping-pong balls would bounce around if they were dropped into the scene. As the technology blog Engadget put it in a post-SIGGRAPH entry, “The Kinect took 3D sensing to the mainstream, and moreover, allowed researchers to pick up a commodity product and go absolutely nuts.”
22

In June of 2011, shortly before SIGGRAPH, Microsoft had made available a Kinect software development kit (SDK) giving programmers everything they needed to start writing PC software that made use of the device. After the conference there was a great deal of interest in using the Kinect for SLAM, and many teams in robotics and AI research downloaded the SDK and went to work.

In less than a year, a team of Irish and American researchers led by our colleague John Leonard of MIT’s Computer Science and Artificial Intelligence Lab announced Kintinuous, a “spatially extended” version of KinectFusion. With Kintinuous, users could use a Kinect to scan large indoor volumes like apartment buildings and even outdoor environments (which the team scanned by holding a Kinect outside a car window during a nighttime drive). At the end of the paper describing their work, the Kintinuous researchers wrote, “In the future we will extend the system to implement a full SLAM approach.”
23
We don’t think it will be long until they announce success. When given to capable technologists, the exponential power of Moore’s Law eventually makes even the toughest problems tractable.

Cheap and powerful digital sensors are essential components of some of the science-fiction technologies discussed in the previous chapter. The Baxter robot has multiple digital cameras and an array of force and position detectors. All of these would have been unworkably expensive, clunky, and imprecise just a short time ago. A Google autonomous car incorporates several sensing technologies, but its most important ‘eye’ is a Cyclopean LIDAR (a combination of “LIght” and “raDAR”) assembly mounted on the roof. This rig, manufactured by Velodyne, contains sixty-four separate laser beams and an equal number of detectors, all mounted in a housing that rotates ten times a second. It generates about 1.3 million data points per second, which can be assembled by onboard computers into a real-time 3D picture extending one hundred meters in all directions. Some early commercial LIDAR systems available around the year 2000 cost up to $35 million, but in mid-2013 Velodyne’s assembly for self-navigating vehicles was priced at approximately $80,000, a figure that will fall much further in the future. David Hall, the company’s founder and CEO, estimates that mass production would allow his product’s price to “drop to the level of a camera, a few hundred dollars.”
24
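
The figures quoted for the Velodyne rig can be turned into a rough sense of how finely it samples the world. The short Python sketch below is only back-of-the-envelope arithmetic on the numbers in the text (sixty-four lasers, ten rotations per second, about 1.3 million points per second). It assumes, purely for illustration, that the points are spread evenly across lasers and across each rotation; the function names are ours, not Velodyne’s.

```python
# Figures quoted in the text for the roof-mounted LIDAR assembly.
LASERS = 64
ROTATIONS_PER_SECOND = 10
POINTS_PER_SECOND = 1_300_000


def points_per_rotation(points_per_second: float, rotations_per_second: float) -> float:
    """Data points gathered during one full 360-degree sweep."""
    return points_per_second / rotations_per_second


def points_per_laser_per_rotation(points_per_second: float,
                                  rotations_per_second: float,
                                  lasers: int) -> float:
    """Points contributed by a single laser in one sweep.

    Assumption: every laser contributes an equal share of the points.
    """
    return points_per_rotation(points_per_second, rotations_per_second) / lasers


def angular_spacing_degrees(points_per_second: float,
                            rotations_per_second: float,
                            lasers: int) -> float:
    """Approximate horizontal angle between consecutive points of one laser.

    Assumption: points are evenly spaced around the full circle.
    """
    per_laser = points_per_laser_per_rotation(
        points_per_second, rotations_per_second, lasers
    )
    return 360.0 / per_laser


if __name__ == "__main__":
    print(points_per_rotation(POINTS_PER_SECOND, ROTATIONS_PER_SECOND))
    print(points_per_laser_per_rotation(POINTS_PER_SECOND, ROTATIONS_PER_SECOND, LASERS))
    print(round(angular_spacing_degrees(POINTS_PER_SECOND, ROTATIONS_PER_SECOND, LASERS), 3))
```

Under these assumptions each sweep yields about 130,000 points, or roughly 2,000 per laser, which works out to a sample every fifth of a degree or so around the car.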

All these examples illustrate the first element of our three-part explanation of why we’re now in the second machine age: steady exponential improvement has brought us into the second half of the chessboard—into a time when what’s come before is no longer a particularly reliable guide to what will happen next. The accumulated doubling of Moore’s Law, and the ample doubling still to come, gives us a world where supercomputer power becomes available to toys in just a few years, where ever-cheaper sensors enable inexpensive solutions to previously intractable problems, and where science fiction keeps becoming reality.

Sometimes a difference in degree (in other words, more of the same) becomes a difference in kind (in other words, different than anything else). The story of the second half of the chessboard alerts us that we should be aware that enough exponential progress can take us to astonishing places. Multiple recent examples convince us that we’re already there.
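
The chessboard image rests on simple arithmetic: one grain of rice on the first square, with the amount doubling on each square after that. A minimal Python sketch of that arithmetic follows. The function names are ours, chosen for illustration only.

```python
def grains_on_square(square: int) -> int:
    """Grains on a given square (1-64) when square 1 holds one grain
    and each later square holds double the one before it."""
    return 2 ** (square - 1)


def total_through_square(square: int) -> int:
    """Total grains on squares 1 through `square` (a geometric series)."""
    return 2 ** square - 1


if __name__ == "__main__":
    first_half = total_through_square(32)
    whole_board = total_through_square(64)
    print(f"First half of the board: {first_half:,} grains")
    print(f"Square 33 alone:         {grains_on_square(33):,} grains")
    print(f"Whole board:             {whole_board:,} grains")
```

The first square of the second half holds more rice than the entire first half combined, which is why what came before is a poor guide to what comes next.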

* Since 2^9 = 512

* Multiplying 62.34 by 24358.9274 is an example of a floating point operation. The decimal point in such operations is allowed to ‘float’ instead of being fixed in the same place for both numbers.

* In this context, a “hack” is an effort to get inside the guts of a piece of digital gear and use it for an unorthodox purpose. A self-hack is one carried out by the company that made the gear in the first place.

“When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind.”

—Lord Kelvin

“Hey, have you heard about . . . ?”

“You’ve got to check out . . . ”

Questions and recommendations like these are the stuff of everyday life. They’re how we learn about new things from our friends, family, and colleagues, and how we spread the word about exciting things we’ve come across. Traditionally, such cool hunting ended with the name of a band, restaurant, place to visit, TV show, book, or movie.

In the digital age, sentences like these frequently end with the name of a website or a gadget. And right now, they’re often about a smartphone application. Both of the major technology platforms in this market—Apple’s iOS and Google’s Android—have more than five hundred thousand applications available.
1
There are plenty of “Top 10” and “Best of” lists available to help users find the cream of the smartphone app crop, but traditional word of mouth has retained its power.

Not long ago Matt Beane, a doctoral student at the MIT Sloan School of Management and a member of our Digital Frontier team, gave us a tip. “You’ve got to check out Waze; it’s amazing.” But when we found out it was a GPS-based app that provided driving directions, we weren’t immediately impressed. Our cars have navigation systems and our iPhones can give driving directions through the Maps application. We could not see a need for yet another how-do-I-get-there technology.

As Matt patiently explained, using Waze is like bringing a Ducati to a drag race against an oxcart. Unlike traditional GPS navigation, Waze doesn’t tell you what route to your destination is best in general; it tells you what route is best right now. As the company website explains:

The idea for Waze originated years ago, when Ehud Shabtai . . . was given a PDA with an external GPS device pre-installed with navigation software. Ehud’s initial excitement quickly gave way to disappointment—the product didn’t reflect the dynamic changes that characterize real conditions on the road. . . .

Ehud took matters into his own hands. . . . His goal? To accurately reflect the road system, state of traffic and all the information relevant to drivers at any given moment.
2

Anyone who has used a traditional GPS system will recognize Shabtai’s frustration. Yes, such systems know your precise location thanks to a network of twenty-four GPS satellites built and maintained by the U.S. government. They also know about roads—which ones are highways, one-way streets, and so on—because they have access to a database with this information. But that’s about it. The things a driver really wants to know about—traffic jams, accidents, road closures, and other factors that affect travel time—escape a traditional system. When asked, for example, to calculate the best route from Andy’s house to Erik’s, it simply takes the starting point (Andy’s car’s current location) and the ending point (Erik’s house) and consults its road database to calculate the theoretically “quickest” route between the two. This route will include major roads and highways, since they have the highest speed limits.

If it’s rush hour, however, this theoretically quickest route will not actually be the quickest one; with thousands of cars squeezing onto the major roads and highways, traffic speed will not approach, let alone eclipse, the speed limit. Andy should instead seek out all the sneaky little back roads that longtime commuters know about. Andy’s GPS knows that these roads exist (if it’s up-to-date, it knows about all roads), but doesn’t know that they’re the best option at eight forty-five on a Tuesday morning. Even if he starts out on back roads, his well-meaning GPS will keep rerouting him onto the highway.

Shabtai recognized that a truly useful GPS system needed to know more than where the car was on the road. It also needed to know where other cars were and how fast they were moving. When the first smartphones appeared he saw an opportunity, founding Waze in 2008 along with Uri Levine and Amir Shinar. The software’s genius is to turn all the smartphones running it into sensors that constantly upload their location and speed information to the company’s servers. As more and more smartphones run the application, therefore, Waze gets a more and more complete sense of how traffic is flowing throughout a given area. Instead of just a static map of roads, it also has always-current updates on traffic conditions. Its servers use the map, these updates, and a set of sophisticated algorithms to generate driving directions. If Andy wants to drive to Erik’s at 8:45 a.m. on a Tuesday, Waze is not going to put him on the highway. It’s going to keep him on surface streets where traffic is comparatively light at that hour.
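
The contrast between the two kinds of routing can be made concrete with a toy example. The Python sketch below is not Waze’s actual method, which the text describes only as “a set of sophisticated algorithms.” It swaps in a textbook shortest-path search (Dijkstra’s algorithm) over an invented five-road map between Andy’s house and Erik’s. Every road, distance, and speed here is a made-up assumption for illustration. The only difference between the two runs is whether travel time is computed from posted speed limits or from speeds reported by other drivers.

```python
import heapq

# Hypothetical road map: (from, to) -> (length in miles, speed limit in mph).
ROADS = {
    ("Andy", "OnRamp"): (1.0, 30.0),
    ("OnRamp", "OffRamp"): (10.0, 60.0),   # the highway
    ("OffRamp", "Erik"): (1.0, 30.0),
    ("Andy", "Village"): (5.0, 30.0),      # back roads
    ("Village", "Erik"): (5.0, 30.0),
}

# Hypothetical rush-hour speeds (mph) reported by phones on each road.
LIVE_SPEEDS = {
    ("OnRamp", "OffRamp"): 15.0,           # highway is jammed
    ("Andy", "Village"): 25.0,
    ("Village", "Erik"): 25.0,
}


def quickest_route(roads, start, goal, live_speeds=None):
    """Return (minutes, path) for the fastest route, or None if unreachable.

    Without live_speeds, every road is assumed to flow at its speed limit,
    as in a traditional GPS unit. With live_speeds, observed speeds replace
    the limits wherever a report exists.
    """
    live_speeds = live_speeds or {}
    neighbors = {}
    for (origin, dest), (miles, limit_mph) in roads.items():
        mph = live_speeds.get((origin, dest), limit_mph)
        neighbors.setdefault(origin, []).append((dest, 60.0 * miles / mph))

    # Dijkstra's algorithm: always expand the cheapest route found so far.
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        minutes, node, path = heapq.heappop(queue)
        if node == goal:
            return minutes, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, cost in neighbors.get(node, []):
            if nxt not in visited:
                heapq.heappush(queue, (minutes + cost, nxt, path + [nxt]))
    return None


if __name__ == "__main__":
    print("Speed limits only:", quickest_route(ROADS, "Andy", "Erik"))
    print("With live speeds: ", quickest_route(ROADS, "Andy", "Erik", LIVE_SPEEDS))
```

With speed limits alone, the search sends Andy onto the highway for a nominal fourteen-minute trip. Once the reported rush-hour speeds are applied, the same search keeps him on the back roads, because the jammed highway now costs far more time than its posted limit suggests.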

That Waze gets more useful to all of its members as it gets more members is a classic example of what economists call a network effect—a situation where the value of a resource for each of its users increases with each additional user. And the number of Wazers, as they’re called, is increasing quickly. In July of 2012 the company reported that it had doubled its user base to twenty million people in the previous six months.
3
This community had collectively driven more than 3.2 billion miles and had typed in many thousands of updates about accidents, sudden traffic jams, police speed traps, road closings, new freeway exits and entrances, cheap gas, and other items of interest to their fellow drivers.
