By Kevin Meyer
I've been observing the growing love affair with "big data" with increasing skepticism, and perhaps even worry. Long-time readers know I've always been this way; in fact, one of my first posts, seven years ago, was on the misuse or overuse of data and algorithms: The False God of the Almighty Algorithm. Later posts demonstrated how a simple whiteboard can sometimes analyze and plan a situation, such as a factory floor, better than software. Years later I believe that more than ever.
Recently Dr. Art Langer of Columbia University has written a couple of articles that align with my concerns over an obsession with big data. In CIOs Shouldn't Let Big Data Rule Their Decision-Making he discusses how data can sometimes lead to the wrong conclusions.
Once you rely on big data, it tends to become omnipotent in the decision-making process.
We must remember that our markets are constantly changing and the variables that dominate decision-making are extremely complex—and there are many great decisions that are made by people that go on intuition and “gut-feel.” And there are many CEOs that consistently state that many of their decisions have little to do with logic.
CIOs beware—don’t forget the power of the human factors.
Data is a tool, like many others. Dr. Langer's most recent article, It's Not Just the Data, Stupid, expounds on this a bit more.
We are being overwhelmed with the use of digital data to make decisions. You can’t read much these days without immediately seeing buzzwords like “Big Data,” “Business Analytics,” or “Business Intelligence.” Discussing process issues is passé; how to deal with data is the “in” thing for business discussions, especially at board meetings.
More and more, I’m seeing other ways in which we’ve become increasingly reliant on data testing.
And once again we are losing sight of the context and perspective. Dr. Langer quotes Ron Sentz, VP of EMSI:
“The biggest limit to big data is our ability to interpret it. People need to understand why they are using data. What is the end goal?” Sentz said. “Data is also like an assembly of facts, which aren’t necessary the same thing as truth. If facts are poorly interpreted, it could lead to the wrong conclusions.”
One sentence in that quote bears repeating: Data is also like an assembly of facts, which aren't necessarily the same thing as truth.
We see that in so many areas: politics, climate change, marketing, and even when trying to schedule a complex factory floor. We get so sucked into data that we assume it is fact. We get so sucked into facts that we assume they are truth.
Our LeanBlog friend, Mark Graban, wrote about this a couple years ago, referencing a great photo from FailBlog.
Mark also references two quotes that I've both enjoyed and been troubled by:
"In God we trust, all others bring data." – Dr. W. Edwards Deming
"Data is of course important in manufacturing, but I place the greatest emphasis on facts." – Taiichi Ohno
The problems I see with people and organizations caught in big data, or just plain old data, are two-fold:
The desire, if not the requirement, that data be used in every decision creates paralysis. In the lean world we are taught to seek perfection, but sometimes we forget that the one thing more important than perfection is simply progress. I've been part of situations where the data is never quite good enough, the analysis of data never quite complete, the conclusions from analyses never quite solid. So nothing happens.
Equally dangerous is overconfidence in data and in the analyses and conclusions that follow. Data becomes facts becomes truth. The two articles by Dr. Langer that I referenced give several examples where this is not the case.
That is because truth requires perspective, context, reference, and understanding: knowing when outliers are relevant or irrelevant, when the dataset is complete or incomplete, and how to reach the correct, or necessary, conclusions when data is incomplete or inappropriate. Experience and intuition create relevance. The ability to create relevance is a core component of leadership.
Facts are not just data. Truth is not just facts. Don't forget the human factors.
Jim McPherson says
Man Kevin, you go away for a few days and come back and smack us with something like this! Great thought-provoking post! Jim.
Mark Graban says
I think the spirit of the Ohno quote is that, in modern terms, we can’t rely on data in reports, spreadsheets, ERP systems, etc. We often have to go to the gemba where we can see things with our own eyes… in this case, “fact” is observable and should be the same thing as “truth.”
I’m curious – what troubles you about the Ohno quote?
Kevin Meyer says
Mark – I understand where you’re coming from. Perhaps my concern actually deals with the “go to gemba” part. I see too many people think of “gemba” in very narrow terms – such as the factory floor. With larger business decisions it is more of a concept – where the value is created, taking place, etc. So there is danger in just saying “go to gemba to get the facts” and therefore “those facts are truth.” Why isn’t a process working? Many would go to the factory floor, observe the process data directly, and develop and execute plans to improve the process. All the while perhaps the process is simply no longer required by the customer because of a completely different process at a competitor. A business leader would go to gemba by also visiting the customer – seeing where ultimately value should have been created from the perspective of the customer. We need to be very cognizant of context and reference. Data on the shop floor may translate to facts from the perception of the shop floor, but they may not be truths about the business condition.
Mark Graban says
Right. For example, we can’t go to the inpatient unit (a gemba) to talk to nurses to get their opinions about what patients want. Observing and talking to patients might get closer to facts.
Likewise, going to the factory gemba to ask line workers what customers want is not fact either (it’s at best a biased sample). That’s why the brilliance of Toyota includes the chief engineer driving a minivan (or was it different minivans) all across the U.S. and Canada to experience things first hand and observe real customers at the store loading things into their vans, etc.
Dwayne Butcher says
The inherent problem with data is that it is a historical snapshot. And in a world that is moving faster and growing more complex, how much do we want to rely on historical data? Of course we shouldn’t abandon data, but simply see it as a piece of the overall puzzle being solved.
Robert Hawkins says
At least I know I am not alone. This has been a thorn in my side for years. I once worked at a medical company where the DP dept. spent all night printing data reports, sometimes with thousands of line items and well over 100 pages, and distributing them to management and supervision in baskets before dawn. Nearly every one of those reports went straight to the recycle bin, since no one understood or could use the data even if they had time to read it. It seems that not only do we depend too much on data without questioning it, but there are millions of people who do nothing but produce data and assume all recipients will want and use it. As I have often said, we are drowning in data, and there seems to be this logic that if we torture the data long enough it will confess.
The real issue for me is the big question, how reliable is the data and where did it come from?
David Hallsted says
How about making the data fit the goal? There are managers who instruct their employees on how to enter the ERP data to make the goal. The goal is made. Everyone gets the bonus. No one improves. Everyone knows it except the person at the top, who is in the dark about the workaround.
Wallpaper charts are a common use of data. Folk collect the data. Post the charts on the wall. No one looks at the data. The process repeats itself every month.
“Stop the line so that the line never stops” is an anti-data statement. It has taken me several years to understand why it is anti-data. The fact is that you have a quality problem. Stop. Fix the problem. The problem will go away.
Most places do the workaround. Gather the data. Look at the data in a batch report. Prioritize the data. Fix the biggest ones. Repeat the process.
The “stop line” is continuously improving the process.
The batch data is the bottleneck in the improvement process.
For me, data is over-processing waste.
I have yet to see a customer PO that wants a factory to collect data on why they cannot make it right the first time.
Go to gemba.
See the problems.
Fix the problems.
Stop collecting the data.
Shrikant Kalegaonkar says
Kevin,
You hit the nail on the head with your point that “we are losing sight of the context”.
Dr. Donald J. Wheeler summarizes Dr. Walter Shewhart’s rules on data in his book “Understanding Variation: The Key to Managing Chaos” with the first principle for understanding data:
— No data have meaning apart from their context —
Dr. W. Edwards Deming emphasized that “[t]he ultimate purpose of collecting the data is to provide a basis for action or a recommendation for action. The step intermediate…is prediction.”
With “Big Data”, what is the theory (hypothesis) that the data is being collected to compare against? That is, what is the purpose for collecting the data? At present it seems to be a wide net cast without purpose. Or, maybe I just don’t understand enough.
Wayne says
The seductive power of Internet research and data mining from one’s own system makes it difficult to get up and out of a cubicle setting and into the real world to observe with a keen eye the reality of things.
In my many years as a hands-on manufacturing engineer I have yet to find a shop floor that I could not improve with some simple device, system or fixture with almost instantaneous ROI.
And I’m no smarter than the average manufacturing guy.
Get out of your cubicle and walk the walk!
Craig Anderson says
Kevin,
Great article. For the record, Deming had this to say about managing by data:
“the most important figures that one needs for management are unknown or unknowable… but successful management must nevertheless take account of them.”
Deming’s alleged “In God we trust…” quote is unverified, as is his other alleged comment that “you can’t manage what you don’t measure.”
More on Deming and data:
http://curiouscat.com/deming/managewhatyoucantmeasure.cfm
Jon Miller says
Interesting discussion.
Human factors: read Daniel Kahneman. Humans make pretty good short-term decisions based on intuition, gut feel and subconscious analysis, and terrible long-term decisions when we don’t look at data, trends and process/system behavior with an understanding of statistics.
Ohno’s quote, from his book Toyota Production System, was in the context of 5-why root cause analysis and problem solving at Toyota. The exact translation of his words would be
“Although I value ‘data’ on the production floor, I place the most importance on ‘facts’. When a problem occurs, if the root cause analysis is insufficient, the focus of countermeasures can be off. That is why we ask ‘why?’ five times. This is the foundation of the scientific attitude of the Toyota system.”
Some readers have misunderstood that Mr. Ohno weighed go-see facts above data. This was not the case. He had a great grasp of production floor data, even to the point of having developed plant-to-plant productivity comparison tables on his own. These were indicators of process behavior which allowed him to respond when he saw deviations, by going to see and asking why.
Big data: let’s be careful not to throw out the value of data and statistical thinking because of how this buzzword is being used by dealers of big data products and services.
Michel Baudin says
A few months ago, I wrote the following post about the distinction between data, information, and knowledge: http://wp.me/p1UTIj-B9
AC says
It is about balance! You need both data and the intuition developed from understanding a system to make good decisions. Relying solely on “gut” feel can lead one astray if it is based on wrong assumptions. How do you check those assumptions? With data. Relying solely on data can lead one astray via analysis paralysis, or if the data does not make sense. How do you check the data? With intuition based on process/product knowledge. Yin and Yang….