Imagine that you’ve just managed to get your hands on a dataset from a clinical trial. Exciting! To help you get in character, I made up some data for you to look at:


Pretend that these datapoints map out the relationship between the treatment day (input “feature”) and the correct dosage of some miracle cure in milligrams (output “prediction”) that a patient should receive over the course of 60 days.

#The data:
(1,28)  (2,17)  (3,92)  (4,41)  (5,9)   (6,87)  (7,54) (8,3)   (9,78)  (10,67) (11,1)  (12,67) (13,78) (14,3) (15,55) (16,86) (17,8)  (18,42) (19,92) (20,17) (21,29) (22,94) (23,28) (24,18) (25,93) (26,40) (27,9)  (28,87) (29,53) (30,3)  (31,79) (32,66) (33,1)  (34,68) (35,77) (36,3)  (37,56) (38,86) (39,8)  (40,43) (41,92) (42,16) (43,30) (44,94) (45,27) (46,19) (47,93) (48,39) (49,10) (50,88) (51,53) (52,4)  (53,80) (54,65) (55,1)  (56,69) (57,77) (58,3)  (59,57) (60,86) ...

Now imagine that you’re treating a patient and it’s day 2. What dose do you suggest we use?


I really hope you answered “17mg” since this was definitely not supposed to be a trick question. How about day 4? 41mg? Yes indeedy!

Now, how would you build software to output the right doses on days 1–5? Would you try to use machine learning (ML)? In other words, would you try to find patterns in these data and try to turn them into a recipe (“model”) for going from inputs to outputs?

No, of course you wouldn’t! You’d get your software to do exactly what you’re doing: look the answer up in a table. That way, you’ll get the right answer 100% of the time for all 60 days. No need for patterns here and no need for machine learning either.
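
Here's a minimal sketch of that lookup approach in Python. The names `DOSES`, `DOSE_TABLE`, and `dose_for_day` are made up for illustration; the numbers are the 60 toy datapoints listed above. It assumes the table is the only source of truth, so any day outside it is an error rather than a guess.

```python
# Doses in mg for days 1..60, copied from the toy dataset above.
DOSES = [
    28, 17, 92, 41, 9, 87, 54, 3, 78, 67,
    1, 67, 78, 3, 55, 86, 8, 42, 92, 17,
    29, 94, 28, 18, 93, 40, 9, 87, 53, 3,
    79, 66, 1, 68, 77, 3, 56, 86, 8, 43,
    92, 16, 30, 94, 27, 19, 93, 39, 10, 88,
    53, 4, 80, 65, 1, 69, 77, 3, 57, 86,
]

# Build the lookup table: day number -> dose in mg.
DOSE_TABLE = {day: dose for day, dose in enumerate(DOSES, start=1)}


def dose_for_day(day):
    """Return the recorded dose for a day that is already in the table."""
    if day not in DOSE_TABLE:
        raise KeyError(f"No recorded dose for day {day}")
    return DOSE_TABLE[day]


# Look up the doses for days 1 through 5.
for day in range(1, 6):
    print(day, dose_for_day(day))
```

No patterns are learned here: the function only repeats what was recorded, and it refuses to answer for any day it has not seen.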

So, what sort of situation requires machine learning?


How about now? It’s day 61. What’s the right answer here?

Well, we’ve never seen data for day 61, so there’s no way we can look up the answer here. What can we do? Are we out of luck? Can machine learning help us?

That depends.

If there’s no pattern that connects the inputs with the outputs, forget it. In that case, nothing can help us… short of actual magic, which doesn’t exist (in case you thought machine learning was it). Give up now!

We must find a useful pattern

But if there is a pattern and if (that’s a big if!) we could find it, then we could try to apply it to day 61 to try to predict/guess the right answer. Perhaps machine learning might help us.

The pattern must generalize

The trouble is that it’s not enough for there to be a pattern in our data. That would be much too convenient. The pattern also has to be relevant beyond day 60. What if the conditions are fundamentally different on day 61, so the pattern doesn’t generalize? For all you know, maybe on day 61 all patients are fully cured or dead or on an incompatible medication. Then the pattern is no good to you.

Nonstationary universes

Let this sink in. If your data aren’t a useful window into tomorrow’s world (perhaps because a pandemic changed all the rules), it doesn’t matter how good your information was yesterday. If you live in an unstable corner of the universe, you’ll have a hard time justifying what we call ergodicity and stationarity assumptions. These roughly translate to “I believe that the rules haven’t changed.”

I’m not talking about the kind of nonstationarity that’s in the eye of the beholder (like when average prices appear to drift over time because you forgot to adjust for inflation). Dealing with gentle nonstationarity (when the rules are a predictable function of time) is what the field of time series analysis is all about.

I’m talking about the kind of violent nonstationarity that you can’t do anything about because your system’s rules are fundamentally different in a way you can’t predict from one period to the next. If your past data suddenly don’t apply at all to your nonstationary future, you’re not allowed to use yesterday to predict tomorrow with a straight face.


Maybe you’re in luck

But if there is a pattern and if this pattern is relevant to the new situation we find ourselves in, then we’re in business. We could go and find the pattern in the old data, make a recipe based on it, and then use that recipe to succeed on day 61 and beyond!

Finding patterns and using them is what machine learning is all about.

When to use machine learning

In applied machine learning (and AI), you’re not in the business of regurgitating memorized examples you’ve seen before (you don’t need ML for that, just look ’em up!); you’re here to learn.

Just repeat old answers? ML can do better! It succeeds on new examples.

Your mission? To build a solution that generalizes successfully (or pull the plug on your project). (What does “successfully” mean? I have a whole guide for you on that topic.)


In other words, your solution is no good if it can’t handle new examples it has never seen before. Not dramatically new examples that break all the rules of a stationary universe, but slight twists on the learned theme.

We’re not here to memorize like a parrot. We’re here to generalize to new situations. That’s the power and the beauty of machine learning.

If you haven’t seen this exact combination of input values before (day 61), what’s the right output answer? Well, maybe we can turn old patterns into a recipe that makes a decent guess.

For example, if you trained a cat/not-cat classifier from thousands of animal photos, you can ask it to tell you if a brand new photo has a cat in it, but you shouldn’t ask it to tell you whether a painting is in the Cubist style.

Is machine learning for you?

If you’re sick of hearing me call it a thing-labeler and an alternative approach to writing code, let me try putting it another way.

Machine learning is an approach to automating repeated decisions that involves algorithmically finding patterns in data and using these to make recipes that deal correctly with brand new data.

To know if machine learning is for you, I have three guides you might enjoy:

So, was there a useful pattern?

Still curious about day 61? Turns out there *is* a pattern in the toy data I made for this example. I know this because I put it there. I can even promise you that it generalizes to day The-Biggest-Number-You-Can-Think-Of-Plus-One because in these wildly nonstationary times, I find it luxuriously comforting to work with data that plays nice for a change.

#The data:
(1,28)  (2,17)  (3,92)  (4,41)  (5,9)   (6,87)  (7,54) (8,3)   (9,78)  (10,67) (11,1)  (12,67) (13,78) (14,3) (15,55) (16,86) (17,8)  (18,42) (19,92) (20,17) (21,29) (22,94) (23,28) (24,18) (25,93) (26,40) (27,9)  (28,87) (29,53) (30,3)  (31,79) (32,66) (33,1)  (34,68) (35,77) (36,3)  (37,56) (38,86) (39,8)  (40,43) (41,92) (42,16) (43,30) (44,94) (45,27) (46,19) (47,93) (48,39) (49,10) (50,88) (51,53) (52,4)  (53,80) (54,65) (55,1)  (56,69) (57,77) (58,3)  (59,57) (60,86) ...

For those who like a challenge, why don’t you try to see if your favorite machine learning algorithm can find the pattern and turn it into a useful recipe? (Answer at the bottom of this page.)

I also suspect that there might be more folks who get it with an analytics approach instead of using machine learning (see this to understand the difference, plus the clue I’ve just given you) but GLHF. May the best approach win!

If you’re keen to try ML, don’t forget to do things in the right order — here’s a step-by-step guide to help you out.

Thanks for reading!

If you’re keen to read more of my writing, most of the links in this article take you to my other musings. You can also enjoy audio versions here and my statistics video playlist here.


If you’re curious to see the answer for day 61, try running the R function that I used to generate the data (you can paste it in and run it online here).

# Here's the R code I used to generate the data:
doseFun <- Vectorize(function(x) {r <- round(93 * cos(x) ^ 2 + sqrt(exp(x/100))); return(r)})

# Output the result for day 61:
print(doseFun(61))

# Plot the deterministic function:
plot(x = 1:60, y = doseFun(1:60))
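
If you don't have R handy, here's a rough Python port of the same function. This is a sketch rather than the original code: `dose_fun` is just a renamed `doseFun`, it handles one day at a time instead of being vectorized, and it assumes Python's built-in `round` and R's `round` agree for these values (no exact .5 ties are expected from this formula).

```python
import math


def dose_fun(day):
    """Python port of the R doseFun above, for a single day."""
    return round(93 * math.cos(day) ** 2 + math.sqrt(math.exp(day / 100)))


# Compare the port against the first five published datapoints.
published = {1: 28, 2: 17, 3: 92, 4: 41, 5: 9}
for day, dose in published.items():
    print(day, dose, dose_fun(day))

# The dose for the first unseen day:
print(dose_fun(61))
```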

Because my function turned out to be deterministic, you could have gotten the right answer by analytics (plotting the graph and eyeballing it to notice the repeating pattern) and you didn’t really need machine learning here, though it can work anyway. It’s just not the most efficient way to go about things.
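
To make the eyeballing a little more concrete: cos² repeats every π, and 7π is about 21.99, so the doses should nearly repeat every 22 days (the square-root term only adds a slow drift). The sketch below checks that against the published datapoints. `SHIFT = 22` is an assumption read off that reasoning, not something the data hands you, and the variable names are made up for illustration.

```python
# Doses in mg for days 1..60, copied from the toy dataset above.
DOSES = [
    28, 17, 92, 41, 9, 87, 54, 3, 78, 67,
    1, 67, 78, 3, 55, 86, 8, 42, 92, 17,
    29, 94, 28, 18, 93, 40, 9, 87, 53, 3,
    79, 66, 1, 68, 77, 3, 56, 86, 8, 43,
    92, 16, 30, 94, 27, 19, 93, 39, 10, 88,
    53, 4, 80, 65, 1, 69, 77, 3, 57, 86,
]

SHIFT = 22  # assumed repeat length in days; 7 * pi is about 21.99

# Compare each day's dose with the dose SHIFT days earlier.
gaps = [abs(DOSES[i] - DOSES[i - SHIFT]) for i in range(SHIFT, len(DOSES))]
max_gap = max(gaps)
print("largest gap between a day and the day", SHIFT, "days earlier:", max_gap, "mg")

# Naive extrapolation: reuse the dose recorded SHIFT days before day 61.
day = 61
guess = DOSES[day - SHIFT - 1]  # the list is 0-indexed, days are 1-indexed
print("guess for day", day, "is", guess, "mg")
```

This is analytics, not machine learning: nothing is fitted, and the guess is only as good as the assumption that the repeat continues past day 60.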


I hope I haven’t done more harm than good by exposing you to that toy dataset. The danger is that you learn a very bad habit: failure to split your data and test your system properly.

Those of you who split the data and validated your solution before submitting it deserve an extra pat on the back. Your caution will serve you well!

Those of you who plotted/trained on the entire dataset may have gotten away with it… this time. The only reason you didn’t get sucker-punched by this cartoonish example is that the true underlying model was a simple pattern which could be extracted easily from the data. These are rare in practice, since your colleagues probably found all such low-hanging fruit decades ago. If you approach real-world data the way you just approached this toy example, you’ll get hurt. You can find more info about that in my article How to be an AI idiot.
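
For the curious, here's one way the split-and-test habit could look on the toy data. It is a sketch under loud assumptions: the "model" is just "repeat the dose from some number of days ago", the candidate lags (1 to 30) and the 40/20 split are arbitrary choices, and `mean_abs_gap` is a hypothetical helper. The point is the order of operations: pick the recipe using the training days only, then score it on days it never saw.

```python
# Doses in mg for days 1..60, copied from the toy dataset above.
DOSES = [
    28, 17, 92, 41, 9, 87, 54, 3, 78, 67,
    1, 67, 78, 3, 55, 86, 8, 42, 92, 17,
    29, 94, 28, 18, 93, 40, 9, 87, 53, 3,
    79, 66, 1, 68, 77, 3, 56, 86, 8, 43,
    92, 16, 30, 94, 27, 19, 93, 39, 10, 88,
    53, 4, 80, 65, 1, 69, 77, 3, 57, 86,
]

TRAIN_DAYS = 40  # days 1..40 are for finding the pattern; days 41..60 are held out


def mean_abs_gap(lag, first_day, last_day):
    """Mean |dose(day) - dose(day - lag)| for day in first_day..last_day."""
    gaps = [abs(DOSES[d - 1] - DOSES[d - lag - 1])
            for d in range(first_day, last_day + 1)]
    return sum(gaps) / len(gaps)


# Choose the repeat length using pairs of training days only.
candidate_lags = range(1, 31)
best_lag = min(candidate_lags,
               key=lambda lag: mean_abs_gap(lag, lag + 1, TRAIN_DAYS))

# Score the "repeat the dose from best_lag days ago" recipe on the held-out days.
test_error = mean_abs_gap(best_lag, TRAIN_DAYS + 1, len(DOSES))
print("lag chosen on training days:", best_lag)
print("mean absolute error on held-out days (mg):", test_error)
```

If the held-out error had come back large, that would have been the cue to distrust the recipe before anyone relied on it for day 61.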
