Where do we get all these data scientists from?

AI and machine learning require software engineers who are ridiculously good at maths. Where are they going to come from?

As previously discussed, we live in an industry that is obsessed with trying to predict its direction of travel, and there are few predictions that get more airtime than that of AI.

There are two major structural problems with this prediction.

I’ve been writing software professionally for 25 years. The vast majority of the projects I’ve worked on have been line-of-business (LOB) apps, as they are for most professional developers. It’s as rare as hen’s teeth to find a LOB app that requires any form of AI, and applying AI to the vast majority of use cases is a stretch.

For example, let’s say I have a workflow where Alice writes a document to send to a customer (Chris), but Alice’s manager (Bob) has to approve it. What does AI do there? Does it look at the document and decide that Bob doesn’t have to approve it after all, reducing Bob’s workload? AI evangelists would say “yeah, sure!”, without considering any legal risks the company faces. What if Alice has made a mistake and sending that document to Chris contravenes some regulation such as GDPR? How does a business adapt to having software that can make decisions? What if that software was bought into the business? What if? What if…?

But just as our industry is obsessed with predicting things we cannot possibly know, it is also obsessed with trying to make problems fit the technology. Two sides of exactly the same coin.

Regardless of the fact that the demand side of AI, in the contexts we understand today, is being grossly overstated, there is a problem on the supply side too. Namely: there aren’t enough engineers to do this job.

As an industry, we have to adapt and rotate our skills. What was a big language one day becomes rarely used the next (VB, I’m looking at you, and today I’m fearful about C#).

A friend of mine, who had been writing software since he was a kid, used to say that programming was “all just functions and variables”. And he’s right. Whether it’s COBOL, or VB, or Python, or Scala, it is all just functions and variables. Once you get the basics down – take an input, do some work on it, create an output – all programming is the same basic churn.

Except that I wouldn’t hire a developer who had spent ten years exclusively writing games to write a LOB app for a company that manufactured sheds, for the simple reason that moving sprites around on a screen (is it even like that, isn’t it all polygons and shaders now?) is a world apart from processing invoices in a database.

Similarly, much as I sometimes look at a game and think I fancy writing something like it (Stardew Valley springs to mind – I think I could handle that), I’m aware I’d be very bad at it because, although I have the whole functions and variables thing down, I don’t know the first thing about writing games. But that’s OK, I’m not in the business of making games.

What outsiders to our industry need to understand is that data science is basically maths. It looks to model events that occur in the real world in some mathematical space, and then do something with that model to inform decisions. It’s entirely divorced from anything that a normal software engineer understands. Normal software engineers know that if you have a function to post an invoice and that invoice is in an invalid state, you throw an exception. Normal software engineers do not have the understanding of mathematics required to be any good at building AI solutions.
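To make the contrast concrete, here is a minimal sketch in Python. Every name in it (`Invoice`, `InvalidInvoiceError`, `post_invoice`, `fit_line`) is hypothetical and invented purely for illustration. The first function is the kind of rule a LOB developer writes all day; the second is about the simplest modelling step there is, a least-squares line fit, and it is already a different way of thinking.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Invoice:
    # Hypothetical invoice record, for illustration only.
    number: str
    amount: float
    posted: bool = False


class InvalidInvoiceError(Exception):
    """Raised when an invoice is not in a state that can be posted."""


def post_invoice(invoice: Invoice) -> Invoice:
    # Typical LOB logic: check the state, throw if invalid, otherwise do the work.
    if invoice.posted or invoice.amount <= 0:
        raise InvalidInvoiceError(f"Invoice {invoice.number} cannot be posted")
    invoice.posted = True
    return invoice


def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    # Ordinary least squares for y = slope * x + intercept.
    # Assumes xs contains at least two distinct values.
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar
```

Calling `post_invoice` on an invoice that has already been posted raises `InvalidInvoiceError`, and any developer can tell you why. Knowing whether the line returned by `fit_line` actually means anything for your data takes statistics, not just functions and variables.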

Of course, we have to consider here that as an industry we have always followed a trend towards greater componentisation. This is a good thing – I remember 20 years ago working on software that needed to send an SMS. The only way to do this at the time was to attach a physical mobile phone to the PC’s serial port and squirt commands to the phone. Today no one would do that – we’d use a web service to do it. So when we think about AI, it’s likely that we can lift a lot of the work up to services of the same kind.

Fundamentally though, you still have to understand the maths. And there just aren’t enough engineers around now, and aren’t enough coming out of education and training, to support any uptick in interest around AI.