From the course: The Data Science of Nonprofit Service Organizations, with Barton Poulson

Overcoming the data-free environment

- [Instructor] Data scientists have done some amazing work both with and for nonprofits. For example, DataKind collaborated with the United Nations High Commissioner for Refugees, that's also the UN refugee agency, to scan satellite images of a South Sudanese refugee camp, and they used a machine learning algorithm to count the number of tents automatically. This allowed the refugee agency to better forecast and anticipate demand for tents, food, and water, and to reduce overcrowding in the camps. Then the business intelligence company Qlik, that's Q-L-I-K, partnered with nonprofits to create data visualization tools that could monitor the spread of Ebola in Africa. They were able to create alert systems that helped direct the allocation of what are really limited resources, and that in turn helped to ultimately control the spread of Ebola. And finally, the Akshaya Patra Foundation in India uses data science to optimize routes for delivering food to government schools across the country. It's the traveling salesman problem in real life, and this makes it possible for them to reach more people in less time at a lower cost, and provide greater good. All of this is because of the intelligent use of data science in the service of nonprofits.
There is, however, a bit of a catch, and it has to do with the scale of nonprofits. Specifically, here are a few important data points. It's true that nonprofits provide major services to their communities, but most nonprofits are in fact small organizations. A recent report by the Nonprofit Finance Fund, which is itself a nonprofit, drew on data from the Urban Institute's National Center for Charitable Statistics to show that while there are over 1.5 million registered nonprofits in the United States, 94% have annual revenue of less than $1 million. In fact, over 60%, three out of five, operate on less than a 10th of that, or about $100,000 per year.
So while there are big nonprofits like the Red Cross or the United Way, the vast majority operate on a scale that doesn't even reach the rounding error for revenue in the corporate world. But the work that nonprofits do is tremendously important, whether it be in healthcare, housing, justice, education, the arts, or really any of a million other topics. And while data science tends to be associated with massive tech companies like Amazon or Google, it turns out that nonprofits can benefit from a data-driven or data-informed approach, and from the insights of data science, just as much as their corporate cousins can. Now, in my own data science work I collaborate extensively with nonprofit organizations, and I've found that even a little data can go a very long way in helping them work more effectively, more efficiently, and even more creatively. It's also very gratifying work to do as a data scientist, 'cause you have the opportunity to genuinely make the world a better place.
But data science often takes a slightly different form in nonprofits, especially small nonprofits. I wanna talk about some of the differences and how you can tailor your approach to meet the needs of this very large percentage of small nonprofits that are working actively to improve their communities.
Now, the first thing to note is that all nonprofits gather data, and the reason they do that is because they work off of grants and major foundation gifts, and all of those have accountability and reporting requirements. So every nonprofit is going to gather the data that they need to make the report to the funding agencies. So for instance, a study by EveryAction, a research organization, found that 90% of nonprofits collect data. That makes sense because they need it for their reports, but they also found that nearly 50% of them say they don't really know how the data can or might impact their organization's work aside from the required reporting. So they're not using it.
It's there, they're gathering it, but they don't benefit from it, and what that creates is functionally a data-free environment, or a situation where there's not a deliberate attempt to get the full value out of the data that's available to them. They may know how many people they serve, how many people are on their email list, and the amount they received in donations the previous year, but even that information may exist in separate datasets used by different people. It might not be combined, or perhaps it's not mined for potential insights, and this is where data science can be enormously helpful.
There's a lot that can be done before you even get to data science per se. Based on my own experience working with nonprofits, you can get an amazing return on investment from some of the simplest practices in data-driven or data-informed decision making. Especially important for (mumbles) the funders. So the first one is really getting them up and running on spreadsheets. Getting their data into spreadsheets and making those spreadsheets available is the first step. So perhaps they combine information on their interactions with clients and their interactions with donors. Maybe they have CRM data. That's customer relationship management, something like Salesforce, which can potentially be free to nonprofits. And they can have that data there, but if they put it all in a spreadsheet, then they can combine it and use it the ways that they want. They can explore and visualize the data. They can sort and filter, and they can calculate sums and averages. They can make bar charts and line charts, they can make pivot tables. None of these things are complicated, but in my experience, they're probably gonna get you at least 90% of what you need in terms of useful data for making rational decisions within a service nonprofit.
You can also augment the data. You can get information from your web analytics. Something like Google Trends.
Which lets you look at the relative popularity of search terms across the country. You can get open data from government sources. You can conduct your own surveys. And the idea here is to get the data they have, stick it into spreadsheets that people have access to, a centralized source, combine your data sources, augment it with what you have, and then you're gonna have, really, the rough ingredients and the most important tools for making sense of that data. Now that spreadsheet data science, I'm convinced, is sufficient for a huge number of organizations that don't have the size or the funding to have dedicated data people.
On the other hand, you can also do data science proper, or capital D, capital S data science, with nonprofits. That can include things like doing text analytics, maybe sentiment analysis on social media. You can do predictive analytics on demand for the nonprofit's services so they can better prepare for fluctuations, both seasonal and weekly. They can also try categorizing clients. So this is a classical machine learning problem: is this a high-risk client or a low-risk client, if you're doing, say for instance, a health intervention? And you can actually create maps of the outcomes. All of these involve significant data science know-how, and they can be used to augment the very basic analyses that are done, say for instance, in spreadsheets, and together those can help nonprofits really get out of the data-free environment and start making good use of their data.
On the other hand, it is important to remember the constraints of nonprofits. First off, you need to make sure that the staff at these organizations, who are dedicated nonspecialists, are able to follow up on your analyses. That they know what it means, that they can do something with it. What that means, among other things, is very simple tools. That's why I advocate spreadsheets as a very first step. Even something like customer relationship management can be too much.
Salesforce is a very powerful tool. It's also a complicated tool. It's got a steep learning curve, and most of the organizations that I know really haven't been able to get past the first steps of Salesforce because they simply don't have the time to devote to it.
One way of dealing with those problems is to have data scientists volunteer, and there are several different ways you can do this. You can do a short-term data hackathon. I organize some of those here in Utah. You can do longer-term volunteering, such as an internship where you give a few hours a week to an organization. You can do pro bono work like lawyers do, or perhaps you can have in-kind donations of data science expertise from corporations to local nonprofits as a way of fulfilling their charitable giving obligations. You can also try providing free training on data science topics and resources to help them in their own work. Any of these are gonna be excellent ways that data scientists can work with nonprofits and help them get past the rut of doing only the required amount of data work for the reports, and start getting insights from what is really a huge amount of data, so that they can better provide the services that they're there for in the first place.
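To make the spreadsheet-level steps mentioned earlier a bit more concrete (combining sources, sorting, filtering, sums, averages, and a pivot table), here is a minimal sketch in plain Python. It is only an illustration, not something shown in the course: the records, the field names (`client_id`, `program`, `visits`, `zip`), and the helper functions `combine` and `pivot_sum` are all hypothetical, and a spreadsheet would do the same work with lookups and a built-in pivot table.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows standing in for two separate spreadsheets
# kept by different people in the same organization.
clients = [
    {"client_id": 1, "program": "housing", "visits": 4},
    {"client_id": 2, "program": "food", "visits": 7},
    {"client_id": 3, "program": "housing", "visits": 2},
    {"client_id": 4, "program": "food", "visits": 5},
]
intake = [
    {"client_id": 1, "zip": "84101"},
    {"client_id": 2, "zip": "84101"},
    {"client_id": 3, "zip": "84604"},
    {"client_id": 4, "zip": "84604"},
]


def combine(left, right, key):
    """Join two lists of rows on a shared key, like a spreadsheet lookup."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup.get(row[key], {})} for row in left]


def pivot_sum(rows, row_field, col_field, value_field):
    """Pivot table: sum value_field for each (row_field, col_field) pair."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row[row_field]][row[col_field]] += row[value_field]
    return {r: dict(cols) for r, cols in table.items()}


# Combine the two sources into one table.
combined = combine(clients, intake, "client_id")

# Sort (most visits first) and filter (one program only).
by_visits = sorted(combined, key=lambda r: r["visits"], reverse=True)
housing_only = [r for r in combined if r["program"] == "housing"]

# Sums and averages.
total_visits = sum(r["visits"] for r in combined)
average_visits = mean(r["visits"] for r in combined)

# Pivot: total visits by program (rows) and zip code (columns).
visits_pivot = pivot_sum(combined, "program", "zip", "visits")
```

The point of the sketch is that each step is a one-liner once the data sits in one combined table, which is the same reason a shared, centralized spreadsheet goes such a long way for a small organization.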
