When I attend industry conferences or speak with Chief Data Officers (CDOs) and Chief Analytics Officers (CAOs) of large financial institutions, one popular question that arises is, “What do you hear about open source analytics in other large banks? Is it ready for production?”
While I’ve encountered substantial growth of exploration and analytics development occurring in private and public clouds using open source analytics, I’ve also been a little surprised (in two ways) at the findings of these CDOs and CAOs when it comes to actually deploying analytics assets using open source.
First source of surprise: large financial institutions with whom I’ve spoken are getting nasty wake-up calls about failures of their selected open source platforms to provide robust, reliable results. The initial attraction was the price of open source tools; the subsequent feedback is a broader appreciation for total cost of ownership, which isn’t as attractive as they first anticipated.
Second source of surprise: that collectively, we didn’t see this coming.
It’s common for both analytics teams and technology teams to neglect planning how they will deploy these assets in the field until after they have completed their exploratory analytics and model development. That planning is even more important when working with relatively new platforms that have yet to be tested, validated and deployed.
- Chief Analytics Officer over Decision Strategy for a leading bank
After a four-year initiative to carefully consider the roadmap for open source analytics in the bank, they’ve decided that the total cost of ownership for open source analytic model development and deployment is higher than that of a single consolidated platform that both meets all their functional needs and scales for their requirements in development and deployment. They are mandating that all new model development, effective January 2017, take place on the SAS platform, and that all new model development and deployment projects on open source platforms be halted immediately.
- Chief Analytics Officer for top 10 property & casualty insurance carrier
Eighteen months ago, this company chose to standardize all new model development on Python and all model deployment on Scala. However, after substantial translation, deployment, refactoring and testing, they’ve found that none of their Scala model translations match the results from their Python model development. What’s worse, nobody in the company knows how to fix this problem. Unless they find a fix, they’ve invested 18 months in models that can’t be deployed in production.
- Chief Data Officer from leading credit card issuer
In a one-on-one conversation, this CDO and I weighed the merits of migrating to open source analytics from commercial-class platforms like SAS (where SAS is one of a few incumbent analytics platforms). Apart from the non-trivial licensing costs, there are the costs of migrating to the open source platforms to be considered, and the backlog required for migrating critical models to those platforms. More worrying to this CDO, however, is that if they support the desires of the analytics development team, they will need to support more than 20 code bases, all of which, due to their open source nature, are under continuous enhancement, many of them without any formal release management programs. The real cost to the organization, he fears, lies in supporting analytic assets whose code bases are not in lock step with the releases, or assets whose authors have left the company, or in the challenge of enabling cross-asset coordination among asset authors who all work in different languages and coding conventions. And that’s just for model development; this challenge is compounded when you consider internal release management and change control for migrating assets through test and production.
- Chief Data Officer from insurance services organization
When this CDO asks their development services organization to get specific about analytic asset deployment, the response is “We’ll use our standard deployment practices.” But when drilling to the next level of detail, specific practices and design patterns are vague or unavailable. Her fear is that the development team management believes analytics assets are the same as any other code base that will go through standard change management and migration, without understanding how these assets are unique, or how analytics SMEs need to be folded into the change management process.
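One concrete practice that speaks to the Python-to-Scala mismatch described above is a score parity check: run the same records through the development model and the deployed translation, then flag any row where the two scores disagree beyond a tolerance. The sketch below is only an illustration of that idea, not what any of these institutions actually did. The function name `compare_scores`, the tolerances, and the sample scores are all hypothetical.

```python
import math


def compare_scores(reference, candidate, rel_tol=1e-9, abs_tol=1e-12):
    """Return (row index, reference score, candidate score) for each row that disagrees.

    `reference` holds scores from the development model, `candidate` holds
    scores from the translated deployment model, in the same row order.
    The tolerances are illustrative assumptions, not recommended values.
    """
    if len(reference) != len(candidate):
        raise ValueError("score lists differ in length")
    return [
        (i, r, c)
        for i, (r, c) in enumerate(zip(reference, candidate))
        if not math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


# Hypothetical scores for the same three records from each implementation.
python_scores = [0.125, 0.5, 0.875]
scala_scores = [0.125, 0.5000001, 0.875]

mismatches = compare_scores(python_scores, scala_scores)
for row, ref, cand in mismatches:
    print(f"row {row}: development={ref!r} deployment={cand!r}")
```

Running a check like this on a fixed validation set at every translation step, rather than after 18 months of work, is one way to surface divergence while it is still cheap to diagnose.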
Want to know more about operationalizing analytics assets in production? Click here to check out our latest RedPaper, “Operationalizing Analytic Assets.”
And of course, download my new book, “Skate Where the Puck’s Headed: A Playbook for Scoring Big with Predictive Analytics.”