A former colleague of mine always referred to his clients’ customer data as their crown jewels. He had a point. No one else knows as much about your customers’ behavior, attitudes, and preferences. If your customer data doesn’t seem valuable to you, imagine if your competitors had access to the very same information. What would they do with it?
There has been lots of talk about analytics as a source of competitive advantage. More recently, big data has promised to uncover untapped value and insights. However, have you thought more holistically about the resulting customer insights and intelligence? Used wisely, what you know about your customers can be a source of competitive advantage. It can help you increase market share by promoting the right product at the right time to the right person using the right channel. It can provide insights that enable you to improve marketing ROI, conversion rates, and conquesting. It can help you identify customers likely to defect, uncover what you need to do to retain them, and determine whether they are worth retaining based on their future lifetime value.
Finally, if you aren’t thinking about your “crown jewels”, I bet one of your competitors is. They may be able to purchase data about your customers from a third-party vendor and use it for conquesting. Even if you don’t think customer data is valuable, your competitors do, and they are willing to pay for it.
Big data has been a hot topic for several years, and for good reason. There is value in analyzing unstructured, high-volume, and massive data sets. However, when I interview candidates who say they want to be data scientists, they focus on the technology and techniques. They forget that the critical thinking and framework used for big data are also important, and that they are applicable to many types of analytic projects.
It comes down to some very fundamental questions:
- What problem am I trying to solve? Defining the problem up front will keep you grounded as interesting findings may lure you away from your goal.
- What data sources can I use? You want to consider multiple sources to triangulate your results and provide a richer picture of what is happening.
- Have I considered all the possible sources of bias? Bias of all sorts can skew results and must be considered and incorporated into your analysis plan.
- Do I need to use all the data available or will a sample be sufficient? There are times when it is not feasible or necessary to analyze all the data available. However, if you sample, you need to make sure that you are getting sufficient coverage and that your sampling is random.
- How can I validate my data? Validation must be part of your analytics plan, whether you validate one data set against another or at least compare your results to findings from other comparable projects.
- What analytic technique(s) are appropriate? Consider the pros and cons of various techniques and what would be most appropriate given the data and problem at hand.
While it is very tempting to dive straight into the data and analysis, spending time up front to answer these questions will help you be more efficient.
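To make the sampling question concrete, here is a minimal sketch of drawing a simple random sample and then checking coverage by comparing segment shares in the sample against the full data set. The customer records, the `region` field, and the helper names are hypothetical and only there for illustration; a real project would substitute its own data and segments.

```python
import random
from collections import Counter

def simple_random_sample(records, fraction, seed=0):
    """Draw a simple random sample without replacement (seeded for repeatability)."""
    rng = random.Random(seed)
    k = max(1, round(len(records) * fraction))
    return rng.sample(records, k)

def segment_shares(records, key):
    """Return the share of records falling into each value of `key`."""
    counts = Counter(r[key] for r in records)
    total = sum(counts.values())
    return {segment: n / total for segment, n in counts.items()}

# Hypothetical customer records: one in four customers is in the "north" region.
customers = [
    {"id": i, "region": "north" if i % 4 == 0 else "south"}
    for i in range(1000)
]

sample = simple_random_sample(customers, fraction=0.1)
full_shares = segment_shares(customers, "region")
sample_shares = segment_shares(sample, "region")

# Coverage check: print each segment's share in the full data and in the sample.
for segment, share in full_shares.items():
    print(segment, round(share, 3), round(sample_shares.get(segment, 0.0), 3))
```

If a segment is missing from the sample or its share is far from the full data, that is a sign the sample is too small or that a stratified approach may be worth considering.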
I am often asked, what is big data? It happens at holiday parties and even once after a funeral. Certainly, there have been large data sets before. So what is different now? Big data commonly refers to data that is so large that you cannot use the typical environments to store and manage it or the typical software to analyze it. In addition to volume, big data is often defined by velocity and variety. Velocity refers to the speed at which the data becomes available, and big data typically includes frequent inputs. Variety refers to the diversity of sources and formats, and big data typically contains unstructured data, which is not easily categorized or organized.
The volume of big data requires new thinking about where to put the data. Traditionally, companies kept their data in house, in a data warehouse on an internal server. Now some companies are turning to the cloud, both private and public, to house data because of its size. In addition, the cloud offers flexibility should the needed storage capacity grow.
Similarly, the volume and variety of the data may make it impractical to load the data into a database, because doing so would require assigning data elements to tables and fields. Some big data may not be easily structured. For example, it could be text messages from online customer service chats. In this case, companies might turn to a parallel programming framework such as MapReduce to process the data. This enables them to load all the data and then parse the text of the online service chats to identify the frequency of words used, for example, how many customers reported a problem with a particular part or described themselves as frustrated. However, you can’t use SAS or SPSS to analyze the data in a MapReduce environment. Further, data mining techniques may be more useful than classical statistics because of the nature of the problem to be solved. Thus, almost everything about big data requires rethinking data and analytic tools.
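The word-frequency idea can be sketched in a few lines. This is a single-machine illustration of the map and reduce steps in plain Python, not an actual MapReduce or Hadoop job; the chat messages and the function names are made up for the example.

```python
import re
from collections import Counter
from functools import reduce

def map_chat(message):
    """Map step: turn one chat message into per-word counts."""
    words = re.findall(r"[a-z']+", message.lower())
    return Counter(words)

def reduce_counts(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right

# Hypothetical customer service chat messages.
chats = [
    "I am frustrated, the hinge broke again",
    "The hinge is loose",
    "Thanks, problem solved",
]

# Each message is mapped independently, then the partial counts are combined.
totals = reduce(reduce_counts, map(map_chat, chats), Counter())

print(totals["hinge"], totals["frustrated"])
```

Because each message is mapped independently and the merge step does not depend on order, the same pattern can be spread across many machines, which is what makes it a fit for very large volumes of text.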
However, in the end, big data is like all data. It must generate value. Big data is meaningless unless it enables companies to increase revenue and/or reduce costs by enabling them to identify insights that were previously unavailable. The power of big data is that analysts can explore larger data sets that were impossible to analyze before and delve into unstructured data that was typically ignored because of its non-conforming format.