By Eric Hornbeck
New York City collects vast amounts of data every day on the activities of its agencies and citizens. For the last several years it has posted reams of that data, from restaurant health inspections to 311 calls, on its open data portal. As the availability and use of that data have increased, it’s also become a source of profit for one particular group of New Yorkers: those on Wall Street. That for-profit use of government data has raised some concerns, but it’s largely been off the city’s radar. Instead, the city wants to make sure its data is used by even more users, including community organizations and nonprofits.
The city passed its open data law in 2012. Then-Mayor Michael Bloomberg’s administration touted the move as one that would spur innovation as technology-minded citizens and startups harnessed the raw data to bring unique services to citizens. Current Mayor Bill de Blasio’s administration has focused more on the portal’s good governance benefits, seeing it as a way to give citizens access to the data their city collects and for city agencies to work better together.
The city’s open data portal is part of a broader open data movement. While the ethos of other open government initiatives, such as the federal Freedom of Information Act and New York state’s Freedom of Information Law, is built around citizens asking for specific information, open data envisions a culture of affirmative disclosures. If a government agency collects the data, it should then make it freely and easily available to the public. This is usually done by posting data online in a machine-readable format that’s easy for programmers to harness.
But there is another player in the city also hungry for this kind of big data: investors. As The Wall Street Journal reported in September, companies are harnessing big data to sell to hedge funds and others looking for a leg up on their competitors in the market. The Journal mentions Guidepoint Global LLC, a “data hunter” that finds big data sources that it can sell to hedge funds and other investors. For example, data on credit card swipes from an electronic payments company could give insight into demand for a product. (Guidepoint did not respond to email messages seeking comment for this post.)
A similar phenomenon has played out with government data through FOIA requests. FOIA is often touted as a tool designed for journalists to keep government accountable. But the lion’s share of the requests might actually come from firms whose entire raison d’être is to make bulk FOIA requests on behalf of other companies.
But do these giveaways of government information to profit-making companies pervert the altruistic government transparency goals of open data initiatives?
Cathy O’Neil, a data researcher who has noted that open data can be used for nefarious ends, said that it’s hard to identify the benefit of releasing so much open data because it’s difficult to measure how open data projects help marginalized groups. She added that companies can marshal much greater resources to harness data than individuals ever could.
“[T]he private companies that make use of this stuff don’t talk about it,” she said in an email message.
The city, though, is less concerned with Wall Street freeloaders than with bringing the data to even more people. The open data law requires that any data first provided to a private company then be posted on the portal as well. The city’s data license agreement disclaims any warranty of the data’s completeness and allows anyone to use the data.
“[U]sers are free to use the data however they please – including making a (paid) application from out of the information,” Craig Campbell, a research fellow in the Mayor’s Office of Data Analytics, said in an email message.
Instead, the city’s focus is on increasing the amount of data posted on the portal and expanding the reach of open data beyond the financially or technologically motivated.
This year it added more data sets to the portal, including information on taxi trips and a searchable city budget. The city has also undertaken initiatives to find out how it can tailor the data to meet more citizens’ needs, especially those of small community-based organizations throughout the five boroughs that could make use of the city’s data.
“This is rooted in the open data portal as a product: we should make sure the product works for the users we already know about (internal city analysts, the civic tech community, companies), but should ALSO be expanding the people who are usually not included (K-12, underserved communities, community boards and non-profit groups, etc),” Campbell said.