A heartbeat, a child’s first word, a photograph, a phone call, a Facebook update: all are drops in a vast ocean of data. “Big” data, then, is something of a misnomer. It’s colossal. From the beginning of recorded time until 2003, we created 5 billion gigabytes of data. In 2011 the same amount was created every two days. By 2013 that time will shrink to 10 minutes. So what are we learning from it all? Truths about our measured world and our measured selves. What follows is a look at both. The book from which this excerpt is drawn, The Human Face of Big Data, was created by Rick Smolan and Jennifer Erwitt and sponsored by EMC, Cisco, FedEx, Originate, Tableau, and VMware.
How words arrive Deb Roy, an associate professor at MIT’s Media Lab, is finding out how a child learns to speak in the most personal way: by observing his newborn son. Roy’s Human Speechome Project chronicled his son’s first three years at home, using 11 fisheye video cameras, 14 microphones, and thousands of feet of cable. This cutaway image is a composite taken by the cameras mounted in the ceiling of each room in Roy’s house. Though the system shut down each night while the baby slept, nearly all of his other sounds and movements were recorded, generating 200 gigabits of data each day. Every 40 days the files were hand-delivered to MIT: 250,000 hours’ worth, still being analyzed today.
The assassin: Tracked and caught In February 2011, Jaime Zapata, a special agent for U.S. Homeland Security, was gunned down on a highway in Mexico. U.S. law enforcement responded quickly. There were masses of data on the Mexican cartel suspected of the murder (tapped phone calls, e-mails, videos, and interrogation notes), but everything was in different formats and stored in different databases. Investigators turned to Palantir, a company founded by PayPal veterans. Palantir creates software tools that can rapidly integrate data from multiple sources into a single resource. A week after Zapata’s death, with help from Palantir, investigators found the chief suspect in the shooting: Zeta cartel member Julian Zapata Espinoza, a.k.a. El Piolin or Tweety Bird (pictured, yellow shirt). Palantir’s tools have since aided the arrests of 700 suspects and the confiscation of 467 kilos of cocaine, 64 pounds of methamphetamines, and 282 weapons.
A cornfield can be a billion points of information Jeff Hodel farms 6,000 acres of corn and soybeans near Roanoke, Ill. To feed a growing population (an estimated 9 billion worldwide by mid-century), he uses genetically modified seeds, which produce higher yields. The same companies that make the seeds, Monsanto and DuPont, are building hardware and software that help farmers plant and fertilize their crops with surgical precision. The goal is to have farmers double their per-acre yield by 2030, in part by monitoring their fields using off-the-shelf tools, such as an iPad. The Climate Corp., a company founded by former Google executives David Friedberg and Siraj Khaliq, provides crop insurance for farmers by analyzing 22 data sets for weather every six hours, calculating about 10,000 scenarios that could happen to a grower during the next two years. It runs 34 trillion different simulations that can be used to price hyper-local insurance rates for atmospheric calamity. When erratic rain caused one farmer’s corn crop to fail, Climate Corp. automatically paid him $45,000 for his losses. The farmer didn’t even need to file a claim.
When our heartbeat is not ours The cardiac defibrillator inside Hugo Campos transmits data to the manufacturer, which alerts his doctors if there is a problem with his heartbeat. He would like to see this data, too. But he cannot. Campos, one of a growing number of data-access activists, or “e-patients,” is challenging the defibrillator’s manufacturer with this question: Who has the right to own, control, and use the information the device collects? Medical professionals are understandably concerned about patients incorrectly interpreting data, or trying to adjust a device without understanding the way it works. Campos is concerned that people he may not know, such as scientists and doctors, have more information about him than he does. He has petitioned for access to his information; so far he has had no luck. If he and other data-access activists succeed, patients will receive access to their health data. For now, the ownership of information collected by the sensors in our bodies remains with the manufacturer.
Recognizing faces — not just human ones A team of German scientists at the Max Planck Institute for Evolutionary Anthropology and the Fraunhofer Institutes for Digital Media Technology and Integrated Circuits took human-facial-recognition technology and used it to track apes. By pairing it with audio recordings, the scientists could home in on specific animals. Scientists are now using the system to understand how the population of a chimpanzee colony shifts over time, adding to our knowledge of a chimp’s social networks.
Seeing the delivery system in real time The frenetic web of blue lines etched across the face of Midtown Manhattan traces the paths of Raju Hossain and his cohorts as they race their bikes down avenues and across streets to deliver pizzas on a Friday night. In the course of his eight-hour shift, Hossain will make at least 30 deliveries for Domino’s. The data visualization was created by attaching GPS devices to his bike and those of his fellow couriers and using satellite-based telemetry to record their travels. Speed is essential to the entire pizza production and delivery process. Just getting the ingredients together to make the pizza requires a rapid supply system that reaches across the country, converges upon a central facility in Connecticut, and then fans out to Domino’s franchises throughout the region. It’s a process duplicated not just in pizza shops but in thousands of other businesses across the world, all aimed at making the production and delivery of a product look inexpensive and effortless to the consumer.
Analytics isn’t just for adults Salim Sheikh, a 12-year-old from India, grew fascinated by the maps in his computer class. He found his native country and, running his eyes across the subcontinent, looked for Rishi Aurobindo Colony, his slum within Kolkata. It was not there. He checked Google. Again, it appeared as if Rishi Aurobindo simply did not exist. The colony’s absence from all maps prevented its residents from accessing vital government services, such as trash pickup, running water, and vaccinations. So Sheikh and his friends decided to change that. They began mapping their community. Sheikh charted where disease spread and where garbage piled up in the streets. The efforts of Sheikh and his friends raised polio vaccination rates from 40% to 80%. Sheikh and his team are now incorporating digital technology into their tracking and outreach. For now, the roads and details of their community cannot be found on any map other than the one they have created, though they are working to integrate their work into Google Earth.
Can life be created? J. Craig Venter, shown here in the greenhouse of his company, Synthetic Genomics, in La Jolla, Calif., is rewriting the building blocks of life. Venter plans to create bacteria, algae, and even plants. These customized organisms will carry out industrial tasks and displace fossil fuels. Exxon has invested $600 million in Venter’s efforts. But writing an organism’s genetic blueprint from scratch is proving the most difficult task yet for the man who first sequenced a human genome and later transferred a genome from one organism to another.
Hacking civics Call it a Peace Corps of geeks. That’s the phrase Jennifer Pahlka (in red shirt on the couch, surrounded by her 2012 fellows) uses for Code for America, the San Francisco nonprofit she founded and runs. Her goal is to connect young web designers and developers with what may be the least tech-savvy sector of American life: state and local governments. Pahlka brings together the same young people who help build companies such as Google and Apple, gives them an 11-month fellowship, and places them in municipalities across the nation, with the charge of bringing those community governments into the 21st century. In Boston, fellows designed an application that helps parents find the right public school for their kids. The app is being adopted by school systems across the country.
Polio, fought by cellphone The battlefield is northern Nigeria, where thousands of children go unvaccinated. Half of the world’s new polio cases appear here. A task force that includes the Bill & Melinda Gates Foundation and the Environmental Systems Research Institute is using satellite imaging, smartphones, and crowdsourcing to help fight the disease. Today, inoculation workers armed with 10,000 GPS-enabled cellphones fan out across the countryside, mapping their progress in real time and targeting their efforts toward polio hotspots.
The beating heart of big data The amount of information we absorb in a single day is more than someone living in the 16th century would have encountered in an entire lifetime. Nowhere is this more apparent than in New York’s Times Square, aglow with advertisements. Look to the left and it’s dusk; to the right it’s midmorning. Photographer Stephen Wilkes created the image by blending more than 1,400 separate photos taken over the course of 15 hours, a meticulous process that took him nearly three months.
City Watch Hidden a few blocks from Rio de Janeiro’s Copacabana beach is one of the most sophisticated crime-fighting centers on the planet. With 100,000 feet of fiber-optic cables, 300 LCD screens in 100 offices, and the biggest surveillance screen in Latin America (860 square feet), the Rio Ops Center is the envy of the world’s urban police forces. It integrates 90 different city operations to an unrivaled degree. Rio will host the 2014 World Cup and the 2016 Summer Olympics, so the coordination of tremendous amounts of real-time data pouring in from myriad sources will be essential.
Re-imagining retirement Energetic members of the Aqua Suns synchronized swim team of Sun City, Ariz., celebrate their retirement community’s 50th anniversary. These women epitomize active aging, which means continued participation in social, economic, cultural, spiritual, and civic affairs, not simply maintaining the ability to be physically active. Researchers at the Intel-GE Care Innovations Lab focus on “aging in place” devices, which will help seniors live at home as long as possible. The lab is prototyping a technology called the Magic Carpet, a carpet outfitted with sensors and accelerometers. For the first week the carpet simply learns a person’s typical routine (the “baseline”). After that, the system checks for abnormalities. Experts predict that in the not-too-distant future, inexpensive devices will be widely used to measure glucose levels, blood pressure, and other vital signs, serving as an early warning system that alerts their wearers to any troublesome change in their baseline. Such tools may play an important role in helping reduce the rising cost of health care around the globe.