Alan Matsumura and I had an excellent conversation earlier this month about the work he is starting up at SilverTrain. Part of the discussion centered on the unexpected problems that you run into when doing BI/information analytics work.
Suppose you work for Kraft. You’d like to know how many Oreos you sold last quarter. An innocent enough question and, seemingly, a simple one. If it seems simple, that only shows how little you’ve thought about the problems of data management.
Start with recipes. At the very least Kraft is likely to have a standard recipe and a kosher recipe (they do business in Israel). Are there other recipe variations, perhaps one substituting high fructose corn syrup for sugar? Do we add up all the recipe variations or do we keep track by recipe?
How about packaging variations? I’ve seen Oreos packaged in the classic three-column package, in packages of six, and in packages of two. I’ve seen them bundled as part of a Lunchables package. I’m sure other variations exist. Do we count the number of packages and multiply by the appropriate number of Oreos per package? Is there some system where we can count the number of Oreos we produced before they went into packages? If we can manage to count how many Oreos we made, how does that map to how many we will manage to sell?
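To make the package arithmetic concrete, here is a minimal Python sketch of the "count packages and multiply" approach. The package names, the cookies-per-package figures, and the sales numbers are all made up for illustration; they are not Kraft data. The point is only that any package type missing from the conversion table (the Lunchables bundle here) has to be surfaced rather than silently dropped.

```python
# Hypothetical conversion table: cookies per package type.
# These figures are invented for illustration, not real Oreo pack sizes.
COOKIES_PER_PACKAGE = {
    "three_column": 45,
    "six_pack": 6,
    "two_pack": 2,
}


def cookies_sold(package_sales):
    """Convert package-level sales into a cookie count.

    Returns the total for the package types we know how to convert,
    plus a sorted list of the package types we could not convert.
    """
    total = 0
    unknown = []
    for package_type, packages in package_sales.items():
        per_package = COOKIES_PER_PACKAGE.get(package_type)
        if per_package is None:
            # No agreed conversion rule yet, so report it instead of guessing.
            unknown.append(package_type)
            continue
        total += packages * per_package
    return total, sorted(unknown)


# Hypothetical quarterly sales, counted in packages.
sales = {
    "three_column": 1000,
    "six_pack": 500,
    "two_pack": 250,
    "lunchables_bundle": 300,
}

total, unknown = cookies_sold(sales)
print(total)    # cookies from the package types we can convert
print(unknown)  # package types still waiting for a business decision
```

Notice that the code cannot answer the Lunchables question. Someone in the business has to decide whether those cookies count, and at what rate.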
That may get us through standard Oreos. How do we count the Oreos with orange-colored centers sold at Halloween in the US? Green-colored ones sold for St. Patrick’s Day? Double Stuf Oreos? Double Stuf Oreos with orange-colored centers? Mini-bite size snak paks? Or my personal favorite: chocolate fudge covered Oreos. I just checked the official Oreo website at Nabisco. They identify 46 different versions of the Oreo and don’t appear to count Oreos packaged within another product (the Lunchables question).
That covers most of the relevant business reasons that make counting Oreos tricky. There are likely additional, technical reasons that will make the problem harder, not easier. The various systems that track production, distribution, and sales have likely been implemented at different times and may have slight variations in how and when they count things. Those differences need to be identified and then reconciled. Someone will have to discover and map the different codes and identifiers that each separate system uses for Oreos. And so on.
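The code-reconciliation chore might look something like the following Python sketch. Everything in it is hypothetical: the system names, the product codes, the canonical product names, and the quantities are invented, and it assumes (generously) that every system already reports in the same unit. It maps each system's local code to one shared product name and sets aside anything it cannot map.

```python
# Hypothetical cross-reference: (system, local code) -> canonical product.
# None of these codes are real identifiers.
CODE_MAP = {
    ("production", "P-1001"): "oreo_standard",
    ("distribution", "D-77"): "oreo_standard",
    ("sales", "SKU-4400"): "oreo_standard",
    ("sales", "SKU-4401"): "oreo_double_stuf",
}


def reconcile(records):
    """Total quantities per (canonical product, system).

    Records whose code has no mapping are returned separately so that
    someone can investigate them instead of losing them.
    """
    totals = {}
    unmapped = []
    for system, code, quantity in records:
        product = CODE_MAP.get((system, code))
        if product is None:
            unmapped.append((system, code))
            continue
        key = (product, system)
        totals[key] = totals.get(key, 0) + quantity
    return totals, unmapped


# Hypothetical records as (system, local code, quantity).
records = [
    ("production", "P-1001", 1200),
    ("distribution", "D-77", 1150),
    ("sales", "SKU-4400", 900),
    ("sales", "SKU-4401", 200),
    ("sales", "SKU-9999", 50),
]

totals, unmapped = reconcile(records)
for (product, system), quantity in sorted(totals.items()):
    print(product, system, quantity)
print("unmapped:", unmapped)
```

Even in this toy version the three systems disagree about standard Oreos, which is exactly the kind of difference that has to be explained (timing, returns, breakage, units) before anyone can quote a number.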
By the way, according to Wikipedia, over 490 billion Oreos have been sold since their debut in 1912. As for how many were sold last quarter, it depends.