From Homelessness To Human Trafficking, How A NYC Nonprofit Is Using Big Data To Make A Difference

This article is more than 9 years old.

In a world where big data is the buzzword of the day, companies are figuring out how to use and harness it to get the most ROI. One company is using it for something a tad more philanthropic.

SumAll.org is the non-profit side of SumAll.com, the data firm that is already helping the likes of Starbucks and National Geographic with their social media analytics. SumAll.org has undertaken a number of projects, such as analyzing human trafficking around the world and partnering with the city of New York to help combat homelessness and the city’s prostitution problem.

"We engage in a wide range of issues where we believe data can add significant value and deliver impact," SumAll Foundation CEO Stefan Heeke says. "Our sweet spot is data-rich, high-impact projects related to an issue of interest where we partner with an organization with access to operational data. Initially we engage in workshops to better understand the opportunities and learn about the issue. Scoping the project together with our partners and having a common understanding is key for a successful outcome."

Partners include the likes of the Clinton Foundation, with which they are working on the staggering statistics on prescription drug abuse. "We explore concepts of using 'big data' and real-time data to help with prescription drug abuse risk detection," Heeke says. "The fascinating opportunity here is to leverage untapped 'big data' sources for drug-related prevention and alerts."

The organization is also partnering with HumanitarianTracker.org to analyze data coming out of the ongoing Syrian crisis. With their analysis, the teams have been able to pull out insights such as the types of killings taking place and how they are changing over time, the rise in casualties in Aleppo, which have overtaken those in Damascus, and the number of female vs. male victims. They even created a tool to document the crisis thus far.

Other issues tackled by the foundation include school tracking (with a school in Haiti, EcoledeChoix.org, they are tracking individual student performance to help their partners get funding by proving impact) and literacy, exploring the impact of writing poetry on literacy outcomes. With the city of New York's Department of Homeless Services (DHS), SumAll is trying to improve homelessness prevention by targeting families that are likely to become homeless within 3-4 months. "We work with eviction data and shelter history, enhanced by open data, to create a model for predicting at-risk families and improving the outreach," Heeke tells me.

Big Data (Photo credit: Kevin Krejci)

Sounds great, but does it work? It's no secret that there's a lot of hype about big data, so much so that a recent article by Tim Harford in the FT asked "Big Data: are we making a big mistake?". In short, the author's point was that too often researchers treat as causation something that is "just" correlation. Whilst you can still make some valid assumptions based only on correlation, without knowing the cause of a certain phenomenon the risk of giving faulty explanations is high. The fact that one variable is associated with another doesn't, in itself, mean anything. You might think that Google searches about "flu" are correlated with the geographic spread of the disease (people who are ill and search for information) but, as Harford points out, it could just mean that the news, full of scary stories about flu in December 2012, provoked internet searches by people who were healthy. Or that Google’s own search algorithm moved the goalposts when it began automatically suggesting diagnoses when people entered medical symptoms. So, I asked Heeke: what are the limits of the Big Data approach and how can we make sure there are no misunderstandings?

"Big Data is a bit of a hype now," he says, "and may have created unrealistic expectations, but it's certainly not a mistake to leverage it responsibly. I have seen many cases where data has been transformational and well applied. The emphasis is on 'well applied'. There are risks of mishandling data and creating unrealistic expectations with careless handling. Data is an amazing tool but needs to be understood in the context of the issue. It's like flying: with properly licensed and trained pilots, aviation is safe".

As for making mistakes, "the word 'mistake'," the researcher adds, "suggests that we have a choice of using big data or not. I don't think we have that choice. We are enjoying the benefits of big data on a daily basis, often without realizing it. Predictive analytics is making many parts of our life more efficient and I wish there was more of it in the social sector. Big Data is a bit of a misnomer and suggests that quantity by itself is an advantage. Unfortunately, many big data sources are very noisy, e.g. Twitter feeds. Quality of data is more important than quantity".

What scientists (and journalists) should do is seriously question any dataset before investing time into analysis: is the data plausible? How was it collected? Is the data really useful and relevant to solve the problem? With that in mind and with a good understanding of the real-life context, you can put data to good use, and increase the chances of achieving your goals.