I’ve been following the panting exuberance of big data apostles for the past few years, rolling my eyes at most of it. Sure, it can be interesting, but maybe it’s my age showing when I say “so what?” to most of it.
What finally pushed me over the edge enough to comment on it was a tweet I saw from none other than our friends at SAP, saying (I’m paraphrasing) “you have lots of data, let SAP analyze it.” But of course. There’s gold in them there data… at least to SAP.
Just because we have lots of data, does it mean it has to be analyzed? What is the question, what is the problem, what is the opportunity being addressed? As Einstein said, “not everything that can be counted, counts.”
It has become easier and easier to create, capture, and store data. Just ask the NSA. Or Target. Sure, it doesn’t take up physical space, but it still takes time, equipment, and people to manage that inventory of data. And perhaps inventory it is… excess, over-produced inventory. What does lean tell us about inventory…?
Sure, there is potential value in linking disparate datasets to come up with interesting and perhaps profitable conclusions. Unfortunately the most common outcome I’ve seen is an increasing number of KPIs, dashboard indicators, and the like tied to data. And once again, an increasing number usually creates a dilution of focus. Just ask those companies that have a two-inch-thick strategic plan.
A year ago Rufus Pollock penned an article for The Guardian that described how the real revolution isn’t big data, but the increased accessibility to small data.
Just as we now find it ludicrous to talk of “big software” – as if size in itself were a measure of value – we should, and will one day, find it equally odd to talk of “big data”. Size in itself doesn’t matter – what matters is having the data, of whatever size, that helps us solve a problem or address the question we have.
Bingo. Just like the 5S of lean or micro meditation of mindfulness, data is just a tool. First identify the problem, question, or opportunity. Only then determine what data, and analysis, is necessary.
Meanwhile we risk overlooking the much more important story here, the real revolution, which is the mass democratisation of the means of access, storage and processing of data. This story isn’t about large organisations running parallel software on tens of thousand of servers, but about more people than ever being able to collaborate effectively around a distributed ecosystem of information, an ecosystem of small data.
Smaller data, more interconnected and accessible.
This next decade belongs to distributed models not centralized ones, to collaboration not control, and to small data not big data.
What size is your data?
Yatin Ubhaykar says
Analysis; Paralysis !
Human beings, especially the intelligent types, get mentally stimulated at the thought of exercising their gray matter. What better than lots of data for the brain to chew on? But then again, we are lost in the maze. What’s the objective? What are the customer WINS (wants, interest and desires)?
James Parnitzke says
Perfect… think you have nailed this one, and uncovered the true purpose of why this is happening. Not even sure if you got 20 practitioners in the same room you could even settle on a definition of “big data” that all could agree on and didn’t sound like “market-tecture” …
Yes, there are things I couldn’t have imagined being able to do as late as 10 years ago with the capability that is very much mainstream today. I still believe first principles in our craft have not changed or been repealed; the real challenge remains deeply understanding what we are trying to solve for in a meaningful way.
Thanks for sharing this insight with any of us fortunate enough to find your writing.
-jdp (Parnitzke)
Allen Bonde says
I love the topic, and as I mentioned in a recent article, we need to “focus on data value, not its bigness” if we are to provide useful insights to everyday users/consumers. Also I’m glad you referenced Pollock’s work since I’ve been a fan of his and OKFN for some time. You may also like to check out the piece I did for Forbes in Oct 2012 on the topic, and the small data definition I posted on my blog (based on research I directed at Digital Clarity Group last year) -> http://smalldatagroup.com/2013/10/18/defining-small-data/
cheers,
Allen (@abonde)
Jason Morin says
When training engineers and data analysts, I tell them to constantly ask “what problem are we trying to solve?” before jumping into the pool of data. It’s so easy to forget the objective because you’re having too much fun analyzing irrelevant data and making neat graphs for your own amusement.
I also teach that data analysis is like walking along an unknown wilderness trail. While it’s OK to veer off the path occasionally to explore, don’t go too far. If you think you’ve found a goldmine, come back and tell others before going any further. Otherwise, stay on the path. I term it “managed discovery.”