I’ve been following the panting exuberance of big data apostles for the past few years, rolling my eyes at most of it. Sure, it can be interesting, but maybe it’s my age showing when I say “so what?” to most of it.
What finally pushed me over the edge enough to comment on it was a tweet I saw from none other than our friends at SAP, saying (I’m paraphrasing) “you have lots of data, let SAP analyze it.” But of course. There’s gold in them there data… at least to SAP.
Just because we have lots of data, does it mean it has to be analyzed? What is the question, what is the problem, what is the opportunity being addressed? As a line often attributed to Einstein puts it, “not everything that can be counted, counts.”
It has become easier and easier to create, capture, and store data. Just ask the NSA. Or Target. Sure, it doesn’t take up physical space, but it still takes time, equipment, and people to manage that inventory of data. And perhaps inventory it is… excess, over-produced inventory. What does lean tell us about inventory…?
Sure, there is potential value in linking disparate datasets to come up with interesting and perhaps profitable conclusions. Unfortunately the most common outcome I’ve seen is an increasing number of KPIs, dashboard indicators, and the like tied to data. And once again, an increasing number usually dilutes focus. Just ask those companies that have a two-inch-thick strategic plan.
A year ago Rufus Pollock penned an article for The Guardian that described how the real revolution isn’t big data, but the increased accessibility of small data.
“Just as we now find it ludicrous to talk of ‘big software’ – as if size in itself were a measure of value – we should, and will one day, find it equally odd to talk of ‘big data’. Size in itself doesn’t matter – what matters is having the data, of whatever size, that helps us solve a problem or address the question we have.”
Bingo. Just like the 5S of lean or the micro meditation of mindfulness, data is just a tool. First identify the problem, question, or opportunity. Only then determine what data and analysis are necessary.
“Meanwhile we risk overlooking the much more important story here, the real revolution, which is the mass democratisation of the means of access, storage and processing of data. This story isn’t about large organisations running parallel software on tens of thousand of servers, but about more people than ever being able to collaborate effectively around a distributed ecosystem of information, an ecosystem of small data.”
Smaller data, more interconnected and accessible.
“This next decade belongs to distributed models not centralized ones, to collaboration not control, and to small data not big data.”
What size is your data?