Interesting poll over on KDnuggets about data mining deployment. I was struck by a number of points:
About 40% publish a research paper and I suspect most of these do not deploy the results.
This probably reflects the large academic community in data mining.
Nearly half use the models they develop to change their business rules, and slightly more use some other technology to deploy models.
This was very encouraging. Large numbers of data miners and analysts are clearly taking more of an enterprise decision management approach these days. Previous "unofficial" surveys I have done had much lower numbers of deployments. Excellent.
Models were used in batch processes twice as often as in real time.
I think we are at a tipping point for this and will soon see it start to change. Most data mining is focused on operational decisions, and as these decisions become real-time and interactive, model execution will have to move away from batch as well.
Half of deployments were done using the data mining tool itself; the rest translated the model into some language for execution.
This could go either way at this point. The tools could get better at handling interactive scoring, in which case they might maintain this percentage. Or they could get better at generating code or rules, so that execution does not require the data mining tool at all. In general, the performance characteristics of an interactive process are very different from those of a data mining/analytic process, so using different but integrated technologies seems more likely to scale.
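To make the "translate the model into some language" option concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the coefficients, the field names `late_payments` and `years_as_customer`, and the `refer`/`approve` rule are made up, and no particular data mining tool exports in this format. The point is only that, once the model is reduced to plain code, the same function can serve a batch job and an interactive, one-record-at-a-time decision without the data mining tool being present.

```python
import math

# Hypothetical coefficients, as if exported from a data mining tool.
# The names and values are invented for illustration only.
EXPORTED_MODEL = {
    "intercept": -1.5,
    "weights": {"late_payments": 0.8, "years_as_customer": -0.2},
}


def score(record, model=EXPORTED_MODEL):
    """Logistic-regression score for one record, standard library only."""
    z = model["intercept"] + sum(
        weight * record[name] for name, weight in model["weights"].items()
    )
    return 1.0 / (1.0 + math.exp(-z))


def decide(record, threshold=0.5):
    """Illustrative business rule around the model: refer risky cases."""
    return "refer" if score(record) >= threshold else "approve"


def score_batch(records):
    """Batch use: score a whole list of records in one pass."""
    return [score(r) for r in records]
```

In a real-time setting `decide` would be called once per incoming transaction, while a nightly job would call `score_batch` over the full customer file. Whether generated code like this or the tool's own scoring engine is the better fit depends on the performance trade-off described above.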
Not a scientific survey but interesting.