All that work down the tubes.
So your data science project that you worked on for six months got canned and you don’t know what to do. There are two distinct phases to recovering from such a mortal blow:
- Anger comes first.
Gnash your teeth. Weep. Loudly. And in a sobbing fashion. Lie on the ground and kick your feet like a toddler in a supermarket who has been told to put the packet of sweeties back on the shelf.
- Acceptance and personal growth.
Pick yourself up off the ground. Dust yourself down. Put on your Big Person Pants and get right back in the game. Because tomorrow is another day.
But why did your project get binned when you worked so hard on it?
The odds were against you from the start. A recent article in VentureBeat says that 87% of all data science projects never get put into production. You don’t have to be a mathematician to work out that this leaves just 13%, or a little better than 1 in every 10 DS projects, getting green-lit while the rest get flushed. Not exactly encouraging when you frame it like that, is it?
So what stops the rest making it beyond the Proof of Concept phase?
Sometimes your model just plain sucks. Sorry, them’s just cold hard facts. Maybe the data scientists involved just weren’t experienced (or good) enough to get the most out of the data they had in front of them.
Could be there was a lack of business knowledge amongst the DS team that made them ask the wrong question. It happens. Again, inexperience or not collaborating with the right subject matter experts at the project scoping stage means you've basically wasted your time. Wave it goodbye.
Or the data they had to stitch together into a Frankenstein’s Monster-style creation from disparate spreadsheets, handwritten notes and whispered conversations under bathroom stalls wasn’t quite good enough to wring anything meaningful back out of it.
I should have “You Work With What You’ve Got” tattooed on my arm after the number of times I’ve used it to account for less-than-desirable data situations over the years.
It’s one of life’s mockeries that the businesses with the most money available to spend on data science teams are inevitably the ones who have the most risk-averse senior management with the most stringent regulators breathing down their necks. Think banking, insurance and healthcare for obvious examples.
All rotten with money and all (quite rightly) shit scared of the heavy hand of regulatory authority coming down on them. In the choice between drawing a line through your project or getting hauled in front of a regulatory panel, there’s only ever going to be one winner. Sorry champ.
Shiny New Object Chasing Management.
A particular favourite of mine. They follow the trends in business magazines just long enough to kick projects off when the buzzwords are hot but not long enough to see them through to a proper conclusion.
Brought in to staff a Big Data team in Hadoop last year? Sorry pal, budget removed. We’re all doing AI this season, hadn’t you heard?
“Why did Jane get budget for her project and I didn’t get any for mine?” “Why did Tom get ten more staff and I had a headcount revision?” etc. etc. The average corporate boardroom has more to offer David Attenborough in terms of animalistic territory marking behaviour than the Amazon rainforest.
Back-stabbing. Talking behind backs. Power struggles. Coups. Counter-coups. It makes Game of Thrones look like a quiet family sitcom. With the inevitable result that someone wins their Game and someone loses. And it looks like this time it was your project. Them’s the breaks, kid.
The same thing happens a few levels down from the Cold War-esque boardroom bickering we just covered. This is shop floor, inter-department, inter-team rivalry and it’s just as cut-throat as the war the suits upstairs get into. Data Engineering don’t talk to Data Science. Data Science don’t talk to Data Management. Data Management don’t talk to Business Intelligence.
And no-one talks to I.T. because they are a bunch of basement-dwelling neckbeards with no standards of personal hygiene or interpersonal skills. Or so you heard Steve from Data Engineering say in the queue for coffee in the canteen.
Either way, we need a well-oiled machine working in unison. Instead we’ve got a bunch of unicyclists all off doing their own thing with no co-operation towards the bigger picture. Is it any wonder that even projects that get into production don’t end up delivering any of the benefits they initially promised? Hell no.
What can we do to increase our project’s chance of success?
The odds of winning a rollover lottery jackpot are somewhere in the tens of millions to one per ticket. But people still buy tickets. If our chance of data science project “success” is around 13% then why wouldn’t we try to increase our chances of being on that side of the equation?
Looking at the reasons for failure above, we see a few we can’t quite control at the operational level. Execs gonna exec until the cows come home so we’ll ignore that one. Hoping we’ve not got one of those business magazine, hype-chasing senior managers is another one for the wishlist but hard to control in real life.
Instead try these three points to increase your project’s chance of making it to the holy land of production deployment:
- Pick the right problem.
Banging your head off a brick wall from the get-go won’t get you anywhere. You have to be realistic about what you are trying to do and how it will ultimately be implemented by the business. If it’s a long shot right at the start, it’ll be a lot harder to keep bouncing it over the inevitable hurdles you’ll face to get it live further down the road.
- Start small, stay simple.
Not over-complicating the problem (just because you CAN doesn’t necessarily mean you SHOULD) is the main slogan to stick over your workstation. Simple doesn’t mean basic. It means easier to explain, easier to sell to your management and their management, and easier to implement. There are no prizes for taking the hardest road possible just to prove how smart you are. It’s a self-defeating policy right out of the blocks. Don’t do it.
- Get the right team together to tackle the problem at hand.
Teamwork makes the dream work. Uuggh, I feel dirty even typing that. But it’s true. Office politics and inter-team rivalries will always exist but try and work through them for the greater good. It would amaze absolutely no-one who has ever worked in a corporate environment that the biggest obstacles to success are to be found within your own organization. The "competition" don’t even come into it. Work to overcome that in your project and you’ll have a much higher chance of success.
And it’ll still probably not make the grade.
I don’t want to put a downer on your mood but the odds will still be against you. A 1 in 10 strike rate should be expected in an experimental area like data science where not every idea is going to survive the Proof of Concept stage.
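If you want to see what that strike rate means across a portfolio rather than a single project, here is a minimal back-of-the-envelope sketch. It assumes the 13% figure applies to every project and that projects succeed or fail independently of each other, which no real organization will strictly obey. The function names are made up for illustration.

```python
import math

# Assumption: the 13% success rate (100% - 87%) holds for every project,
# and projects succeed or fail independently of each other.
SUCCESS_RATE = 0.13


def chance_of_at_least_one(n_projects: int, p: float = SUCCESS_RATE) -> float:
    """Probability that at least one of n independent projects ships."""
    return 1 - (1 - p) ** n_projects


def projects_needed(target: float, p: float = SUCCESS_RATE) -> int:
    """Smallest number of projects giving at least `target` chance of one success."""
    return math.ceil(math.log(1 - target) / math.log(1 - p))


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(f"{n:>2} projects: {chance_of_at_least_one(n):.0%} chance at least one ships")
    print(f"Projects needed for a 90% chance of one success: {projects_needed(0.9)}")
```

Under those assumptions it takes about five attempts to get to a coin-flip chance of a single production deployment, which is a decent argument for running several small projects rather than betting six months on one.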
I’m seeing more and more despondency, though, from young analysts and data scientists when their long-term projects get shit-canned for whatever reason. I’ve been there myself, and one of the major learning experiences, even further on in our data careers, is that not everything is going to go into production. It’s not called Data “Science” for nothing.
Educating the bosses.
Another major problem seems to be a misunderstanding at executive level that every project should be a winner or else the whole discipline is a waste of time. That ultimately boils down to a lack of basic data literacy at senior levels. Data literacy is key to the success of any data-driven strategy and is often overlooked.
As modern data professionals we need to figure out how to educate our business leaders and bring them along with us as data equals. If we don’t, we’ll see more and more projects fail and the attention will turn elsewhere for the next magic bullet for business success. And that would be a missed opportunity for all of us: nerds, suits and pen-pushers alike.