Does statistics still have a place in the data scientist’s toolbox?

By Kaylea Haynes on October 22, 2018

When I first got asked to present at the Royal Statistical Society Leeds and Bradford's 50th anniversary event, with a presentation brief of “talk about using statistical models in data science”, I thought it would be an easy talk to prepare. After all, we use statistics every day as data scientists…don’t we?

I then started to doubt myself when I asked the team how and where they use statistics, and was met by many blank faces. Do we not do statistics?!

When I started to think in more depth about where we use statistics in data science, it became apparent that statistics, or at least its fundamental ideas, provides invaluable tools that we do use every single day. But we either take the statistical origins of a method for granted, or we call it “machine learning.”

Taking the former first: in areas such as exploratory data analysis, data cleaning (e.g. imputation) and model evaluation, we take the statistical methods we use for granted because they have simply become part of our everyday data science toolkit. Similarly, we might not apply techniques such as hypothesis testing rigorously, but knowing them helps us to set up unbiased experiments to test business actions.

We also have a bad habit of calling everything machine learning. Take regression, for example; this is originally a statistical method but is very often mistakenly categorised as a machine learning algorithm.

Overall, the key takeaways I got from the event, both from what I learnt while preparing my own talk and from the other presentations by Vinny Davies (School of Computing Science, University of Glasgow) and Owen Johnson (School of Computing, University of Leeds), are:

Don’t just use machine learning or AI models because that’s the “trendy” thing to do. If you can use a regression model then do that first before trying something complex – at the very least, it can be a starting point to compare more complex models to.
Statistics has a reputation for being basic, but it can be complex. The emphasis also differs: in machine learning it is generally more important to have methods that are fast to run and work in practice, whereas statistics focuses on proving that methods hold asymptotically.
Don’t take basic statistics for granted – exploratory data analysis is key to building the pathway for more sophisticated models.
Many machine learning methods are black boxes, and statistics helps us to understand and interpret the models.
Enthusiasm for using ML in sectors whose IT systems and human-generated data date back many years, such as the NHS, comes with the risk of unintended consequences that we need to be aware of.
Statistics is still very useful alongside machine learning and AI.
However, to keep statistics relevant and useful, statisticians need to get better at programming and sharing code online. If a new method is developed in the statistics literature but doesn’t have code available, then it is very unlikely to be used by a data scientist.
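
To make the first takeaway concrete, here is a minimal sketch of using a regression model as a baseline before reaching for something more complex. It is an illustration only, not the approach from the talk: the data is synthetic, the "complex" model is a simple k-nearest-neighbours average chosen purely for contrast, and all of the function names are made up for this example.

```python
import math
import random


def fit_linear(xs, ys):
    """Ordinary least squares for one feature; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict_linear(model, x):
    """Predict with a fitted (intercept, slope) pair."""
    intercept, slope = model
    return intercept + slope * x


def predict_knn(train_xs, train_ys, x, k=5):
    """A deliberately more flexible model: mean target of the k nearest points."""
    nearest = sorted(zip(train_xs, train_ys), key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Synthetic, made-up data: a linear trend plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 1.0) for x in xs]

# Hold out the last 50 points so both models are scored on unseen data.
train_xs, test_xs = xs[:150], xs[150:]
train_ys, test_ys = ys[:150], ys[150:]

baseline = fit_linear(train_xs, train_ys)
baseline_rmse = rmse(test_ys, [predict_linear(baseline, x) for x in test_xs])
knn_rmse = rmse(test_ys, [predict_knn(train_xs, train_ys, x) for x in test_xs])

print(f"linear baseline RMSE: {baseline_rmse:.3f}")
print(f"k-NN (k=5) RMSE:      {knn_rmse:.3f}")
```

If the more complex model cannot clearly beat the baseline on held-out data, the simpler and more interpretable regression is usually the better place to start.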
