Big data, little farmers markets Part 2: The minefield of analyzing Big Data

In the first installment of this series, I introduced the idea of Big Data and the Internet of Things (IoT), along with what social media has promised and what it has delivered. I promised some thoughts on analysis next. Here goes:

•Big Data is partly defined by its resistance to analysis. The volume, velocity and variety of Big Data make easy collection and analysis difficult. This story on the struggle among safe street advocates to find good data speaks to that issue.

•Big Data is probably more appealing to advertisers than to our often shadowy government at this point, but we should still keep an eye on both of them and on how they analyze and use Big Data.

•Lastly, as put so well by the author of Dataclysm: Who We Are (When We Think No One is Looking), much of behavioral science research is WEIRD research: its subjects come from White, educated, industrialized, rich and democratic populations. Big Data may help to offset that issue.

Markets already intersect with Big Data across many different sectors, such as health care, the public sector, agriculture and retail. So let’s think about how this could play out for markets:
What if a researcher took the total dollars spent on SNAP at markets and compared it on a map to grocery store SNAP sales, without adjusting for hours open, the number of goods or markets available, or the fixed costs of offering those goods? Or consider the decrease in organic certification among market vendors. What if that were shown as just a graph of the decrease year after year, without the analysis that many farmers say they feel they do not need certification while they sell directly to shoppers and are therefore able to explain their practices? What if those maps and graphs were what influenced policymakers?
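
To make the SNAP comparison concrete, here is a minimal sketch in Python. Every number and name in it is made up for illustration (the two outlets, their sales, and their hours are assumptions, not real SNAP data). It shows how a raw total and a total adjusted for hours open can tell very different stories.

```python
# Hypothetical weekly figures for illustration only; not real SNAP data.
outlets = [
    {"name": "Saturday market", "snap_dollars": 1200.0, "hours_open_per_week": 4},
    {"name": "Grocery store", "snap_dollars": 25200.0, "hours_open_per_week": 84},
]


def snap_dollars_per_hour(outlet):
    """Return SNAP sales divided by the hours the outlet was open that week."""
    return outlet["snap_dollars"] / outlet["hours_open_per_week"]


for outlet in outlets:
    # Print the raw total next to the hours-adjusted figure.
    print(
        outlet["name"],
        "| raw total:", outlet["snap_dollars"],
        "| per open hour:", round(snap_dollars_per_hour(outlet), 2),
    )
```

In this invented case the grocery store takes in 21 times the SNAP dollars, yet both outlets take in the same amount per hour they are open. Hours open is only one of the adjustments mentioned above; the same idea applies to the number of goods or outlets available.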

Some scenarios to ponder:

    •Market A (which runs on Saturday morning downtown) is asked by its city to participate in a traffic planning project that will offer recommendations for car-free weekend days in the city center. The city will also review the requirement for parking lots in every new downtown development and possibly recalibrate where parking meters are located. To do this, the city will add driving strips to the areas around the market to count the auto traffic and will monitor meter and parking lot use over the weekend. The market is being asked to have its farmers track their driving for all trips to the city and to ask shoppers to complete Dot Surveys about their weekend driving experiences to the market. Public transportation use will be gathered by university students.

    •Market B is partnering with an agricultural organization and other environmental organizations to measure the level of knowledge and awareness about farming in the greater metropolitan area. For one summer month, the market and other organizations will ask their supporters and farmers to use the hashtag #Junefarminfo on social media to share any news about markets, farm visits, gardening data or any other seasonal agricultural news.

    •Market C is working with its Main Street stores to understand shopping patterns by gathering data on average sales for credit and debit users. The Chamber of Commerce will also set up observation stations at key intersections to capture visual data on visitor behavior.

    •Market D has a grant with a health care corporation to offer incentives and will ask those voucher users to track their personal health care stats and their purchase and consumption of fresh foods. The users will get digital tools, such as cameras to record their meals and voice recorders to record their children’s opinions about the menus, and will upload those recordings to an online log along with health stats such as blood pressure and exercise regimen. That data will be compared to the larger Census population.

In all of these cases, the data to be collected crosses sectors and systems, meaning that no one entity has all of the raw data at its disposal at all times. That boils down to Analysis Issue #1.

In all of these cases, the data to be collected has many ways to be interpreted, based on which entity is interpreting the data. Analysis Issue #2

In most of these cases, the data collected requires some self-reporting. Analysis Issue #3

In some of these cases, privacy controls must be strictly managed and will affect how much analysis can be done. Analysis Issue #4

From the New York Times:
“The first thing to note is that although big data is very good at detecting correlations, especially subtle correlations that an analysis of smaller data sets might miss, it never tells us which correlations are meaningful.” (italics added) Analysis Issue #5

Check out this site for fun examples of how matching correlations don’t always add up to good conclusions.
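
For anyone who wants to see this with numbers, here is a small sketch in Python. Both yearly series are invented for illustration (the vendor counts and the parking ticket counts are assumptions, not real data), and the `pearson` helper is written out by hand so nothing beyond the standard library is needed. The two series have nothing to do with each other, but because both simply rise over time, the correlation comes out very high.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Made-up yearly numbers for two unrelated things that both happen to grow.
market_vendors = [20, 24, 27, 33, 38]
city_parking_tickets = [5100, 5900, 6800, 7500, 8600]

r = pearson(market_vendors, city_parking_tickets)
print("correlation:", round(r, 3))  # close to 1, yet neither causes the other
```

A number this close to 1 says only that the two lines move together. Deciding whether that means anything takes someone who knows the context, which is the point of Analysis Issue #5.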

The thing we should be able to agree on: all partners should be involved with the analysis and should receive access to the raw data. That means it is not enough for markets to participate in just the data collection piece. They need to be involved in the analysis; if they are not, the context of markets will be lost.
Yet we know that just collecting the data is a massive undertaking for low-capacity markets (even assuming some funding is offered in all of these cases for the partners to staff the collection of the data), before even adding in the time and effort it takes to analyze it. What might help is to have some analysis prepared ahead of time and to prepare the market community for participation.
1. This means that every market association, group of markets, or individual market should keep information about each market’s history, size, structure and staffing in separate PDFs. This, by the way, is a resource that Farmers Market Coalition (for whom I am a consultant) is working to pilot with one of its university partners, the University of Wisconsin, for their AFRI Indicators for Impact project. Hopefully, the Market Profile will be available online for all markets to test in 2015. Stay tuned!
2. Markets need to know the area’s current demographic and other relevant details. Check the Census to learn the larger population’s stats and make friends with real estate professionals to keep up on trends in the neighborhood.
3. Do a Dot Survey or Bean Poll a few times a year asking shoppers what zip code they live in, how they come to the market, and things like that, and keep track of that data. Maybe use a big dry wipe calendar on the wall to add all data collected?
4. Market boards and advisors should keep any data already collected, along with the Profile information, on hand so they can share it as needed in any meeting they happen to attend in their own professional lives.
5. When researchers do come to your market with an offer to help with data collection, be ready to ask for the data you want. How about asking for focus group data so that a market can begin to build “persona profiles” of those who come to the market? Or ask for added analysis of numbers that you think might be important for the market: those who know me have heard my song about finding a way to track the number of return SNAP shoppers, and how I think that metric is so useful for markets, possibly even more useful than total SNAP dollars, in terms of analysis.

6. Encourage city or county public health agencies to offer a semi-annual breakfast for those entities that work on community interventions (like markets, health clinics, social service entities, university programs, youth outreach, etc.) to share news about what they are seeing in their field and to share any data informally. If meetings are impossible, then a regular email would work. In other words, stay in touch with other data collection efforts in your community.
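
As a rough illustration of the return SNAP shopper metric mentioned in the list above, here is a sketch in Python. It assumes something markets may not have yet: a log where each SNAP transaction carries an anonymous shopper ID. The log, the IDs, and the `snap_summary` helper are all hypothetical, and the privacy concerns from Analysis Issue #4 would apply to any real version.

```python
from collections import defaultdict

# Hypothetical log: (market day, anonymous shopper ID, SNAP dollars spent).
transactions = [
    ("2015-06-06", "A", 20.0),
    ("2015-06-06", "B", 15.0),
    ("2015-06-13", "A", 10.0),
    ("2015-06-13", "C", 40.0),
    ("2015-06-20", "A", 10.0),
    ("2015-06-20", "B", 5.0),
]


def snap_summary(rows):
    """Summarize total SNAP dollars, unique shoppers, and return shoppers."""
    days_by_shopper = defaultdict(set)
    total_dollars = 0.0
    for market_day, shopper_id, dollars in rows:
        days_by_shopper[shopper_id].add(market_day)
        total_dollars += dollars
    # A return shopper is one seen on more than one distinct market day.
    return_shoppers = sum(1 for days in days_by_shopper.values() if len(days) > 1)
    return {
        "total_dollars": total_dollars,
        "unique_shoppers": len(days_by_shopper),
        "return_shoppers": return_shoppers,
    }


print(snap_summary(transactions))
```

In this made-up log the shopper who spent the most in a single visit never came back, while two of the three shoppers returned. The total dollar figure alone would hide that.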

I’ll end this post with some of the lovely words of Dataclysm author Christian Rudder, who was talking about the Vietnam Memorial’s physical self versus its online database self:

“A web page can’t replace granite. It can’t replace friendship or love or family either. But what it can do – as a conduit for our shared experience – is help us understand ourselves and our lives. The era of data is here; we are now recorded. That, like all change, is frightening, but between the gunmetal gray of the government and the hot pink of product offers we just can’t refuse, there is an open and ungarish way. To use data to know yet not manipulate, to explore but not to pry, to protect but not to smother, to see yet never expose, and, above all, to repay that priceless gift we bequeath to the world when we share our lives so that other lives may be better – and to fulfill for everyone that oldest of human hopes, from Gilgamesh to Ramses to today: that our names be remembered not only in stone but as part of memory itself.”
I think I’ll adopt that bit as my mantra.
