Some powerful thoughts from Lucy Bernholz, one of the brightest minds in philanthropy, on how data will fundamentally reshape the social ecosystem -- and perhaps society as we know it.
Some smart thinking from the folks at the California Civic Innovation Project:
"Another example of pioneering efforts to engage citizens: The Office of New Urban Mechanics and San Francisco’s Office of Civic Innovation both organize hackathons, where developers come together to make applications based on city data. The apps made at hackathons aren’t always completely useful—or completely functional—but the process engages developers by making them think of new ways to use information. Shannon Spanhake, deputy innovation officer for San Francisco, said these events make citizens realize their government has interesting problems to solve, and they prompt governments to realize that citizens can identify and address new dilemmas. When a citizen creates an app that could have a powerful impact on the city, the city’s innovation experts help him build partnerships and turn the idea into a business.
Hackathons are a good deal for governments and citizens alike. By releasing data and encouraging citizens to interact with it, cities harness the expertise of their people. The people working with the data get a chance to have an impact on their city, and the products they create can lead to new businesses and economic opportunities.
So when cities launch new open data initiatives, they need to realize that opening the data to the public is just the first step. The most important work is in making everyone realize just how powerful open data can be.
And a bit more:
"Having recently moved to a new city, I turn to Yelp whenever I am in need of a new restaurant, store, or even a refrigerator repairman. But as I learned the hard way, while Yelp can help me find the best calamari, it won’t tell me whether the restaurant has been spanked by the local health department.
Meanwhile, Yelp is filled with reviews that detail horrifying sanitation conditions at restaurants and bed-bug infestations at hotels. And yet these businesses’ doors remain open because the local health department isn’t using the more than 30 million user reviews on Yelp to target their inspections. A small sampling:
...Around the world, Yelp and local governments collect complementary data, intended to help would-be customers make decisions about where to spend their money and what to put in their stomachs. Yet that information isn’t available in a central location, and that is creating a knowledge gap for consumers. And customers aren’t the only ones who can benefit from better crunching of ratings and reviews information. Building partnerships with companies that generate user content is an easy way for our cities to get free feedback on their services. What if government took a user-centered design approach to service delivery? They could use feedback provided on thousands of sites to identify what citizens want from their cities—and where to begin fixing things.
But if prediction is the truest way to put our information to the test, we have not scored well. In November 2007, economists in the Survey of Professional Forecasters — examining some 45,000 economic-data series — foresaw less than a 1-in-500 chance of an economic meltdown as severe as the one that would begin one month later. Attempts to predict earthquakes have continued to envisage disasters that never happened and failed to prepare us for those, like the 2011 disaster in Japan, that did.
The one area in which our predictions are making extraordinary progress, however, is perhaps the most unlikely field. Jim Hoke, a director with 32 years experience at the National Weather Service, has heard all the jokes about weather forecasting, like Larry David’s jab on “Curb Your Enthusiasm” that weathermen merely forecast rain to keep everyone else off the golf course. And to be sure, these slick-haired and/or short-skirted local weather forecasters are sometimes wrong. A study of TV meteorologists in Kansas City found that when they said there was a 100 percent chance of rain, it failed to rain at all one-third of the time.
But watching the local news is not the best way to assess the growing accuracy of forecasting (more on this later). It’s better to take the long view. In 1972, the service’s high-temperature forecast missed by an average of six degrees when made three days in advance. Now it’s down to three degrees. More stunning, in 1940, the chance of an American being killed by lightning was about 1 in 400,000. Today it’s 1 in 11 million. This is partly because of changes in living patterns (more of our work is done indoors), but it’s also because better weather forecasts have helped us prepare.
But forecasts are still subject to the vagaries of human nature and the biases of institutional incentive structures:
People don’t mind when a forecaster predicts rain and it turns out to be a nice day. But if it rains when it isn’t supposed to, they curse the weatherman for ruining their picnic. “If the forecast was objective, if it has zero bias in precipitation,” Bruce Rose, a former vice president for the Weather Channel, said, “we’d probably be in trouble.”
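The idea of a forecast with "zero bias in precipitation" can be made concrete with a calibration check: group days by the probability the forecaster stated and compare that with how often it actually rained. The sketch below is only an illustration. The function name `calibration_table` and the records are made up for the example and are not real forecast data, though the 100 percent bucket is built to mirror the Kansas City figure quoted above.

```python
from collections import defaultdict

def calibration_table(forecasts):
    """Group (stated_probability, it_rained) pairs by stated probability
    and return {stated_probability: observed_rain_frequency}."""
    counts = defaultdict(lambda: [0, 0])  # prob -> [rainy days, total days]
    for stated, rained in forecasts:
        counts[stated][0] += int(rained)
        counts[stated][1] += 1
    return {p: rainy / total for p, (rainy, total) in sorted(counts.items())}

# Made-up illustrative records (assumption), not real forecast data.
records = (
    [(0.2, True)] * 1 + [(0.2, False)] * 9    # said 20%, rained on 1 of 10 days
    + [(1.0, True)] * 4 + [(1.0, False)] * 2  # said 100%, rained on 4 of 6 days
)

for stated, observed in calibration_table(records).items():
    # A positive bias means the forecaster predicted rain more often than it fell.
    print(f"forecast {stated:.0%}: rained {observed:.0%} of the time, "
          f"bias {stated - observed:+.0%}")
```

A perfectly calibrated forecaster would show a bias of zero in every bucket. The incentive described above, where people forgive a false alarm but not a ruined picnic, would show up as a consistently positive bias.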
Because it is so important to understand these connections Asu Ozdaglar and I have recently created the MIT Center for Connection Science and Engineering, which spans all of the different MIT departments and schools. It's one of the very first MIT-wide Centers, because people from all sorts of specialties are coming to understand that it is the connections between people that is actually the core problem in making transportation systems work well, in making energy grids work efficiently, and in making financial systems stable. Markets are not just about rules or algorithms; they're about people and algorithms together.
Understanding these human-machine systems is what's going to make our future social systems stable and safe. We are getting beyond complexity, data science and web science, because we are including people as a key part of these systems. That's the promise of Big Data, to really understand the systems that make our technological society. As you begin to understand them, then you can build systems that are better. The promise is for financial systems that don't melt down, governments that don't get mired in inaction, health systems that actually work, and so on, and so forth.
The barriers to better societal systems are not about the size or speed of data. They're not about most of the things that people are focusing on when they talk about Big Data. Instead, the challenge is to figure out how to analyze the connections in this deluge of data and come to a new way of building systems based on understanding these connections.
With Big Data traditional methods of system building are of limited use. The data is so big that any question you ask about it will usually have a statistically significant answer. This means, strangely, that the scientific method as we normally use it no longer works, because almost everything is significant! As a consequence the normal laboratory-based question-and-answering process, the method that we have used to build systems for centuries, begins to fall apart.
Big data and the notion of Connection Science is outside of our normal way of managing things. We live in an era that builds on centuries of science, and our methods of building of systems, governments, organizations, and so on are pretty well defined. There are not a lot of things that are really novel. But with the coming of Big Data, we are going to be operating very much out of our old, familiar ballpark.
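The claim that with enough data almost any question has a statistically significant answer can be illustrated with a little arithmetic. The sketch below assumes a simple two-sample z-test with a known, equal standard deviation in both groups. That choice of test, the function name `two_sided_p_value`, and the numbers are assumptions made for illustration, not something taken from the passage. It holds a tiny effect fixed and shows how the p-value changes as the sample grows.

```python
import math

def two_sided_p_value(effect, sigma, n_per_group):
    """p-value of a two-sample z-test for a difference in means,
    assuming a known, equal standard deviation sigma in both groups."""
    standard_error = sigma * math.sqrt(2.0 / n_per_group)
    z = effect / standard_error
    # For a standard normal Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical numbers: a difference of 0.01 standard deviations,
# too small to matter in most practical settings.
effect, sigma = 0.01, 1.0
for n in (1_000, 100_000, 10_000_000):
    p = two_sided_p_value(effect, sigma, n)
    print(f"n per group = {n:>10,}: p = {p:.3g}")
```

With a thousand observations per group, the effect is nowhere near the conventional 0.05 threshold. With a hundred thousand it crosses it, and with ten million the p-value is vanishingly small. The effect itself never changed, which is why significance alone stops being a useful filter at this scale.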
Algorithms and a Lack of Theory: It is not only algorithms that can go wrong when a theory proves incorrect or the assumptions underlying the algorithm change. There are places where no theory exists at any level of consensus to be meaningful. The impact of education (and the effectiveness of various approaches), how innovation works, or what triggers a fad are examples of behaviors for which little valid theory exists--it’s not that plenty of opinion about various approaches or models is lacking, but that a theory, in the scientific sense, is nonexistent. For Big Data that means a number of things, first and foremost, that if you don’t have a working theory, you probably don’t know what data you need to test any hypotheses you may posit. It also means that data scientists can’t create a model because no reliable underlying logic exists that can be encoded into a model.
The Big Data Dream
Dirk Helbing seeks a system that is akin to Asimov’s Psychohistory as imagined in the Foundation series. In broad swaths, it would anticipate the future by linking social, scientific, and economic data. This system could be used to help advise world governments on the most salient choices to make.
Reading the article in Scientific American reminded me of a science fiction story by Tribble-inventor David Gerrold—When Harlie Was One. In this book, Harlie, which stands for Human Analog Robot Life Input Equivalents, decides that he needs answers, and that he isn’t sophisticated enough to solve his own problem and therefore keep the corporate interests that built him interested enough to keep him plugged in. So he designs a new computer, the Graphic Omniscient Device, or GOD, as a proof of his value. GOD will answer all questions submitted to it. Unfortunately, as the human engineers building GOD eventually realize, the processing capacity is so vast, that GOD will not be able to provide an answer to any question during the lifetime of a human. Harlie, of course, knew this all along. He needed the humans for three reasons: to keep him running, for engineering labor to build GOD, and to ask the questions that GOD will answer.
Given the woes of Europe, spending €1-billion on such a project will likely prove to be wasted money. We, of course, don’t have a mechanical futurist to evaluate that position, but we do have history. Whenever there is an existential problem facing the world, charlatans appear to dazzle the masses with feats of magic and wonder. I don’t see this proposal being anything more than the latest version of apocalyptic sorcery. It’s not that a big science project can’t yield interesting outcomes, but if you look at the Microelectronics and Computer Technology Corporation (MCC), late of Austin, Texas, we find Cyc, a system conceived at the beginning of the computer era, to combat Japan’s Fifth Generation Project as it supposedly threatened to out-innovate America’s nascent lead in computer technology. Although Cyc has yielded some use, it has not yet become the artificial human mind it was intended to be, able to converse naturally with anyone about the events, concepts, and objects in the world. And artificial intelligence, as imagined in the 1980s, has yet to transform the human condition.
As Big Data becomes the next great savior of business and humanity, we need to remain skeptical of its promises as well as its applications and aspirations.
Bill Gates articulates the core logic behind the excitement over data's potential to transform how we tackle public problems: measurement.
Harnessing steam power required many innovations, as William Rosen chronicles in the book "The Most Powerful Idea in the World." Among the most important were a new way to measure the energy output of engines and a micrometer dubbed the "Lord Chancellor" that could gauge tiny distances.
Such measuring tools, Mr. Rosen writes, allowed inventors to see if their incremental design changes led to the improvements—such as higher power and less coal consumption—needed to build better engines. There's a larger lesson here: Without feedback from precise measurement, Mr. Rosen writes, invention is "doomed to be rare and erratic." With it, invention becomes "commonplace."
In the past year, I have been struck by how important measurement is to improving the human condition. You can achieve incredible progress if you set a clear goal and find a measure that will drive progress toward that goal—in a feedback loop similar to the one Mr. Rosen describes.
This may seem basic, but it is amazing how often it is not done and how hard it is to get right. Historically, foreign aid has been measured in terms of the total amount of money invested—and during the Cold War, by whether a country stayed on our side—but not by how well it performed in actually helping people. Closer to home, despite innovation in measuring teacher performance world-wide, more than 90% of educators in the U.S. still get zero feedback on how to improve.
Big data has trouble with big problems. If you are trying to figure out which e-mail produces the most campaign contributions, you can do a randomized control experiment. But let’s say you are trying to stimulate an economy in a recession. You don’t have an alternate society to use as a control group. For example, we’ve had huge debates over the best economic stimulus, with mountains of data, and as far as I know not a single major player in this debate has been persuaded by data to switch sides.
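The e-mail case is tractable because each recipient can be assigned to a version at random, so the two groups differ only in the message they received. The sketch below simulates that setup. The function name `run_email_experiment`, the variant labels, and the response rates are hypothetical, chosen only to show the mechanics of random assignment.

```python
import random

def run_email_experiment(recipients, response_rates, seed=0):
    """Randomly assign each recipient to email 'A' or 'B' and simulate
    whether they contribute. Returns {variant: (contributions, sent)}."""
    rng = random.Random(seed)
    results = {"A": [0, 0], "B": [0, 0]}
    for _ in range(recipients):
        variant = rng.choice("AB")  # random assignment is the key step
        contributed = rng.random() < response_rates[variant]
        results[variant][0] += int(contributed)
        results[variant][1] += 1
    return {v: tuple(counts) for v, counts in results.items()}

# Hypothetical true response rates (assumption): 2% for A, 3% for B.
outcome = run_email_experiment(200_000, {"A": 0.02, "B": 0.03}, seed=1)
for variant, (contributions, sent) in outcome.items():
    print(f"email {variant}: {contributions} of {sent} contributed "
          f"({contributions / sent:.2%})")
```

Nothing comparable exists for a stimulus package: there is one economy, so there is no second group to receive the other treatment, and no amount of observational data fills in the missing control.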