Skip to main content
What makes someone good at building predictive analytics?

Here are a few of the things a good analyst will excel at:

  • Dealing with data bias.
    Many modeling data sets have biases, missing data on some percentage of customers or an inadequate sample size. For instance, a data sample may be biased toward—that is, predominantly include data on—the kinds of people you currently have as customers. For the model to help you evaluate a broader range of customers, the analyst may have to employ techniques such as reject inference (or performance inference), which statistically infers the behavior of customer types that aren't in your data set.
  • Finding predictive data patterns.
    The more experienced the analyst, the more capable he or she is to interpret the nuances in the data and find the best predictive characteristics for your desired performance outcome. A good analyst will also know which technique to use for your data set and business problem.
  • Engineering the model.
    Modeling includes both letting the data speak and interpreting what the data says. During model development, the analyst often needs to "engineer," or fine-tune, the model to ensure it will address the identified business goal. The analyst’s knowledge of your business problem and industry comes into play. The analyst could substitute or remove any predictive characteristics that may be contentious, in order to address regulatory requirements or customer concerns. When necessary, an analyst can make the output of the model more transparent so businesses can communicate decisions to customers.
  • Validating the model.
    There are many techniques used to rate a model’s strength, such as divergence, K-S (Kolmogorov-Smirnov statistic) and ROC (receiver operating characteristic). The analyst must examine the right measures to see how well the model performs on an independent data sample—one that was not used to build the model. This indicates how the model would perform in practice. A good analyst can also prevent "over-fitting," which occurs when a model is so tuned to the specific data patterns in its development sample that it would not work as well on other data.

related posts