Scientists, mathematicians, and entrepreneurs share what they’ve learned from—and about—big data on the TED Talk stage. We selected 11 for your viewing pleasure.
TED Talks are fabulous short lectures on important stories of our times. One of the more important stories of the 21st century is big data, the science of collecting and analyzing copious amounts of information to derive insights that are beyond the reach of simple analysis.
These insights can save lives. Or not. Big data is also used for things like recommending television shows for your binge-watching pleasure, so you never have to leave your apartment again.
We crunched multiple TED (Technology, Entertainment, Design) Talks on big data and spat out a few worthy of your attention. The most important insight comes from Sebastian Wernicke, who says, “No matter how powerful, data is just a tool.” And as we all know, tools can be used as weapons. Keep this in mind as you learn about big data’s great accomplishments, and its downright dirty ones.
Spoiler alert: As in “Soylent Green,” the secret ingredient of big data analysis turns out to be…people.
Here are 11 lessons data scientists and entrepreneurs learned from—and about—big data.
It is a truth universally acknowledged that big data results begin with big data collection.
Once upon the late 20th century, important surveys, such as one measuring the vaccination rate of Zambian children, were conducted on paper. But surveys cost time and money, which sometimes ran out, even before the papers could be transcribed (insert the sobs of academics here).
Joel Selanikio, a former employee of the Centers for Disease Control and Prevention and CEO of Magpi, describes the revolution that changed data collection: the PalmPilot. With this simple handheld computer, data gatherers could enter their facts directly into the device. Now that field researchers have access to cell phones, they can enter their data and quickly receive the results they need.
How quickly? What used to take two years, Selanikio says, now takes five minutes.
Big data tips its hat and rides off into the sunset.
An algorithm is a recipe for turning plain data into bite-size chunks of ready-to-ingest information. You may think that algorithms are true and scientific, but actually “algorithms are opinions embedded in code,” says mathematician and data scientist Cathy O’Neil. “We’re injecting biases into algorithms by choosing what data to collect.”
O’Neil explains how data has been used for evil rather than good, through algorithms she refers to as “weapons of math destruction.”
For example, she says, one school system fired underperforming teachers, yet the company crunching the numbers refused to provide the metrics by which these teachers were judged. Worse, two men with a similar arrest record were given two different sentencing recommendations: Although the white man had a felony and the black man did not, the black man received a longer prison sentence. She refers to this baked-in bias as “data laundering.”
So how do you avoid doing terrible things with data? O’Neil presents a checklist of four steps, her very own “algorithmic audit.” Consider consulting it. We need to keep data and the people who wield it fair and ethical.
Big data is an abundant renewable resource, and we need to tap into it if we want to save lives, says Monika Blaumueller, managing director of Blue-Mill.
She’s not kidding. An AI algorithm spotted an Ebola outbreak nine days before the World Health Organization announced it—time the organization could have used to swoop in with a quarantine and a whole lot of face masks.
Big data can be used to make aid more effective. It can predict healthcare triggers, such as temperature spikes or drops. It can rank the effectiveness of each preventative measure healthcare workers employ.
Not just can but should. Tap into your data like it’s a maple tree, people.
To paraphrase Spider-Man, with big data comes big data responsibility.
Marie Wallace is a technical strategist at IBM who created a system to analyze her company’s in-house social network. She details the philosophy that guided her and her colleagues, including privacy and autonomy.
Transparency appears to be an Infinity Stone of her success. By having access to analytics and knowing exactly how their data was being used, IBM employees could trust their company. Wallace even got buy-in from IBMers before sharing data with an advocacy group. As a result, IBM employees felt comfortable sharing more data.
This transparency is important, because, as Wallace says, the insights we generate can “exploit and manipulate us or enrich our lives.” She cites a more manipulative example: Social networks show you articles to make you feel insecure about your body, immediately followed by ads for diet products.
Wallace also calls on the world to embrace privacy. Spider-Man, who wants to keep his secret identity secret, would approve.
The datasphere has mathematicians and statisticians aplenty. What the world of big data really needs is humanities: sociology, rhetoric, philosophy, and ethics.
Susan Etlinger, a data analyst at Altimeter Group, learned this when her son was diagnosed with autism. His doctor told Etlinger that her son was developmentally delayed. Period. End of sentence. But, although his language skills were lacking, it didn’t mean her son was unintelligent.
When he was 4 years old, she found him typing on the computer, looking for images of “wimen.”
Not only had the boy taught himself to write, but he was also surfing the internet for women, an act more suited to a prepubescent child than to a preschooler.
As Etlinger learned, when assessments overvalue verbal communication and ignore other forms of intelligence, they overlook remarkable human accomplishments.
Note: A better title for this TED Talk would have been “Why we need critical thinking in big data.”
Tricia Wang has the coolest job title on the planet: technology ethnographer. Her job is cool, too. She advises companies on how people use technology. In 2009, she told Nokia what she had learned about cell phone usage: People will pay any amount of money for the new hotness, a smartphone.
In the age of the $1,000-plus flagship phone, it’s obvious now, but back then, not so much. Nokia’s data analysis didn’t reflect this insight. But Wang had spent months in China, watching even the poorest peasants save up for the coveted item.
Nokia rejected her findings. Look who’s laughing now.
In this talk, Wang explains that while big data is important, the subsequent decisions made with it are better when paired with “thick data.” Thick data refers to “stories and interactions that cannot be quantified” and helps you understand “the human narrative.”
Ignore her at your peril.
There is a wealth of medical data online, thanks to clinical trials, patents, and other publications. But as complexity specialist Gunjan Bhardwaj notes, you can’t acquire every bit of data, what with concerns over privacy as well as intellectual property rights.
Sick people need all the facts they can get to make informed medical decisions, so there needs to be a way to make this siloed data accessible.
Enter blockchain. Hey, don’t roll your eyes.
Bhardwaj says blockchain gives you a register of your transactions so the world knows that you created your data. With this proof that they created a particular piece of IP, scientists and research institutions may be more willing to share their knowledge with the rest of the world.
By marrying big data with blockchain, many patients may live happily ever after.
Companies can help solve problems like world hunger by donating their data, their technology, and their in-house scientists in a private-public partnership known as “data philanthropy.”
It really works. Mallory Freeman, a former leader of a supply chain optimization project at the United Nations World Food Programme, describes how a satellite company donated data, and from there, researchers could see how droughts were impacting crops. “With that,” Freeman says, “you can trigger aid funding before a crisis can happen.”
(Note to self: Use big data to predict winning lottery tickets.)
Freeman says it’s good for business, too. When companies put data scientists to work on projects, they can glean insights for humanitarian aid (how to bring people out of poverty in India, for example) as well as for their own benefit (such as learning more about their customers).
Keep this in mind, corporate philanthropists. Don’t just give dollars. Give data.
After receiving a disturbing text from a teenager, Nancy Lublin, the now-former CEO and founder of DoSomething.org, a nonprofit organization that provides youth services, built a crisis text line (CTL). Instead of making a telephone call others can potentially hear, a person who needs emotional support can simply text a counselor.
That’s the good news. The better news is that the CTL is leveraging its data to provide life-saving insights. “We know that if you text the words ‘numbs’ and ‘sleeve,’ there’s a 99 percent match for cutting,” says Lublin. “That algorithm in our hands means that an automatic pop-up says…‘Try asking one of these questions,’ to prompt the counselor…. It makes us more accurate.”
Other facts gleaned from data: The state with the most suicidal ideation is Montana, and the time of day teens are most likely to abuse substances is 5 a.m.
The best news is that the CTL has performed “2.41 active rescues a day,” according to Lublin. The life of someone you know may have been saved with the help of caring counselors—and big data.
Data is great at taking apart problems but not as good at putting them back together. The best tool for that job is the one you have in your noggin, says Sebastian Wernicke. And in this TED Talk, he proves it.
Let’s say a wealthy media company with millions of data points wanted to create a political TV show. If this were a choose-your-own-adventure story, the company would have two options: It could create eight pilots based on your data, offer the pilots for free, and greenlight the show with the most viewers, or…it could create just one pilot.
But this isn’t actually a CYOA story. It really happened, with two different media companies making different choices. After creating eight pilots, Amazon ultimately greenlit “Alpha House,” a comedy about four politicians; Netflix developed the political drama “House of Cards.” And although both decisions were data-driven, Amazon used data to decide what kind of show to make, while Netflix learned that the original British series was popular with fans of actor Kevin Spacey and director David Fincher.
Let the less-than-stellar performance of “Alpha House” be a warning to those who rely solely on data, says Wernicke: It’s the human experts who take the risks that reap the rewards.
Quantitative analyst Ben Wellington explains that every. single. agency. in New York codes its addresses differently, a nightmare for data scientists to capture and clean. He believes the city, the country, and the world need better standards for data. But that’s not the important part of his TED Talk. No, the important part (at least for New Yorkers) is about parking tickets. New York has released more than 2,000 datasets to the public, including records on fire hydrants. Wellington focused his data-scientific efforts there, and changed New York for the better.
Wellington created a heat map of all the parking tickets near fire hydrants, identifying the two biggest offenders: a pair of hydrants that together generated $55,000 a year in tickets. That’s money you could be spending on Pabst Blue Ribbon and lottery tickets.
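For the curious, the ranking step behind a map like that can be sketched in a few lines of Python. This is a minimal illustration and not Wellington's actual method: the ticket records, the location labels, and the `top_offenders` helper are all made up here, and a real analysis would start from the city's published dataset.

```python
from collections import defaultdict

# Hypothetical sample records: (hydrant location label, fine in dollars).
# These values are invented for illustration; they are NOT from NYC's data.
tickets = [
    ("hydrant_A", 115),
    ("hydrant_B", 115),
    ("hydrant_A", 115),
    ("hydrant_C", 115),
    ("hydrant_B", 115),
    ("hydrant_A", 115),
]

def top_offenders(records, n=2):
    """Return the n locations with the highest total fines, largest first."""
    totals = defaultdict(int)
    for location, fine in records:
        totals[location] += fine
    # Sort by total descending, then by label so ties come out in a stable order.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]

print(top_offenders(tickets))
```

Summing fines per location and sorting is all it takes to surface the worst spots; the heat map is simply a friendlier way to show the same totals.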
Wellington complained to the Department of Transportation, and then a New York City miracle happened: The DoT repainted those parking spots, and now drivers know not to park their cars in those particular danger zones. And it’s all down to a scientist who had data and a mouth and wasn’t afraid to use both. Thanks, big data.