The Sweet Spot between Lazy Analytics and Overengineering

⏰ Reading Time: 7 minutes ⏰ 

“Everything should be made as simple as possible. But not simpler.”
- often attributed to Einstein

In analytics, I see two extremes over and over again.

On one end, there’s lazy analytics.

On the other end, there’s massive overengineering.

And if I’m being honest… I’ve personally done both.

Early in my career, I built churn prediction models with dozens of features, training data pipelines, model evaluation frameworks - the whole machine learning circus.

But I’ve also seen companies define churn like this:

“If a customer hasn’t bought anything in the last 90 days, they are churned.”

No analysis. No validation. Just a nice round number.

Both approaches miss the point.

And the lesson I learned over the years is simple:

Great analytics usually lives in the 80/20 middle.

Why This Matters

Data teams are full of smart people.

And smart people love solving complex problems.

But here’s the thing I sometimes struggle to admit:

Business impact isn’t driven by intellectual elegance.

It’s driven by ROI.

The goal of analytics isn’t to build the most sophisticated model.

The goal is to create the largest possible business impact with the least amount of resources.

And one of the most important levers in that equation is simple:

Don’t overbuild.

My Churn Modeling Rabbit Hole

Let me tell you about a mistake I made years ago.

I was working on customer churn prediction. Naturally, my instinct was:

“Let’s build a model.”

So I went down the classic data science path.

We collected dozens of variables:

  • Purchase frequency
  • Recency
  • Order values
  • Category preferences
  • Discount usage
  • Customer service interactions
  • Website behavior
  • And a bunch of engineered features on top of that

Then we trained models.

Tested different algorithms.

Iterated.

Weeks of work.

The result?

A model that predicted churn reasonably well.

But then something funny happened.

We compared the model to a simple rule-based heuristic.

Something like:

Customers who have been inactive longer than their typical purchase cycle are at risk.

And guess what.

The simple heuristic performed just as well as the complex model.

At a fraction of the cost and time.

That was a humbling moment.

Not because machine learning is useless - it absolutely isn’t.

But because I had violated a very important rule.

I skipped the 80/20 step.

The Other Extreme: Lazy Analytics

Now let’s look at the other side of the spectrum.

Many companies define churn like this:

  • 30 days without purchase → churn
  • 90 days without purchase → churn
  • 180 days without purchase → churn

Why those numbers?

Usually:

“Because that’s how we did it at my last company.” Or gut feeling.

This is what I call lazy analytics.

It’s easy. It’s fast.

But it’s also wrong most of the time.

Different businesses have very different purchasing patterns.

For some products, 30 days of inactivity is perfectly normal.

For others, 30 days means the customer is long gone.

And even for the same product, the differences can be huge across markets, customer segments, and so on.

So how do we improve this without building a complex machine learning model?

The 80/20 Way to Define Churn

Here’s a simple, data-driven method I often use.

  1. Look at all purchases for each repeat customer
  2. Calculate the maximum number of days between two purchases for each customer
  3. Plot the distribution of those intervals
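
Here’s a minimal sketch of those three steps in Python with pandas. The purchases table and its customer_id / order_date columns are placeholders - adapt them to your own schema.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical purchases table: one row per order, with customer_id and order_date
purchases = pd.read_csv("purchases.csv", parse_dates=["order_date"])
purchases = purchases.sort_values(["customer_id", "order_date"])

# 1. Keep only repeat customers (at least two purchases)
repeat = purchases.groupby("customer_id").filter(lambda g: len(g) > 1)

# 2. Maximum number of days between two consecutive purchases per customer
gaps = repeat.groupby("customer_id")["order_date"].diff().dt.days
max_gap = gaps.groupby(repeat["customer_id"]).max()

# 3. Plot the distribution of those maximum intervals
max_gap.plot(kind="hist", bins=50, title="Longest gap between purchases per customer")
plt.xlabel("Days")
plt.show()
```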

Now something interesting appears.

For example, you might see something like this:

  • After 32 days of inactivity, 60% of customers never buy again
  • After 65 days of inactivity, 80% of customers never return

An example chart looks like this (it's the actual chart we used):

[Chart: distribution of the longest gap between purchases per customer - the share of customers who never return after a given number of days of inactivity]

This gives you two useful markers:

Soft churn → 60% probability of not returning (after 32 days)

Hard churn → 80% probability of not returning (after 65 days)
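
One simple way to operationalize those two markers is to read them off as quantiles of the max_gap series from the sketch above. That’s my assumption about the calculation - the exact numbers behind the chart may have been derived differently:

```python
# Continuing from the hypothetical max_gap series above:
# the inactivity length that 60% / 80% of repeat customers never exceeded.
soft_churn_days = max_gap.quantile(0.60)  # e.g. roughly the 32-day marker above
hard_churn_days = max_gap.quantile(0.80)  # e.g. roughly the 65-day marker above

print(f"Soft churn: inactive for more than {soft_churn_days:.0f} days")
print(f"Hard churn: inactive for more than {hard_churn_days:.0f} days")
```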

These thresholds are:

  • Data-driven
  • Easy to calculate
  • Easy to explain to stakeholders
  • Easy to operationalize

And you can refine them further by segment.

Congratulations.

You just upgraded from lazy analytics to top-tier customer analytics - without writing a single line of machine learning code.

Why Data People Love Complexity

There’s a deeper psychological pattern here.

Data professionals are usually very smart (at least that’s what I like to tell myself 😅).

And smart people often enjoy solving difficult technical problems.

There’s nothing wrong with that.

But complexity can become a trap.

Complex solutions:

  • look impressive
  • signal technical competence
  • feel intellectually satisfying

But they often come with massive costs:

  • longer development cycles
  • harder maintenance
  • lower explainability
  • slower decision making

And in fast-growing companies and competitive environments, that’s a problem.

Because complexity consumes the one resource many companies don’t have (especially in today's AI-first world): time.

The 80/20 Rule for Data Teams

Over the years, I’ve developed a simple mental rule.

Whenever someone proposes a complex analytics solution, I ask:

Can we achieve at least 80% of the value with 10-20% of the effort?

And very often the answer is yes.

Here are a few other examples.

Instead of Data Mesh
→ allow business users controlled access to a BigQuery data foundation via Google Sheets.

Instead of marketing mix modeling
→ help marketing teams understand real customer journeys across channels.

Instead of perfect unit economics
→ get contribution margin per unit right and approximate the rest.

Instead of custom recommender systems
→ manually build cross-sell logic for the top products that generate most of your revenue.
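
For that last one, here’s a rough sketch of what “manual” cross-sell logic could look like in pandas. The order_lines table, its column names, and the top-10 cut-off are all illustrative assumptions:

```python
import pandas as pd

# Hypothetical order-line table: one row per (order_id, product_id) with its revenue
lines = pd.read_csv("order_lines.csv")

# 1. The handful of products that generate most of the revenue
top_products = (
    lines.groupby("product_id")["revenue"].sum()
    .sort_values(ascending=False)
    .head(10)
    .index
)

# 2. Which other products most often appear in the same baskets
baskets = lines.groupby("order_id")["product_id"].apply(set)

def co_purchased(product, baskets, k=3):
    """Return the k products most frequently bought together with `product`."""
    counts = {}
    for items in baskets:
        if product in items:
            for other in items - {product}:
                counts[other] = counts.get(other, 0) + 1
    return sorted(counts, key=counts.get, reverse=True)[:k]

cross_sell = {p: co_purchased(p, baskets) for p in top_products}
print(cross_sell)
```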

Is this perfect?

No.

But it often delivers 80% of the value with 20% of the complexity.

And that’s exactly what companies need in competitive market environments.

The Decision Rule I Use Today

Whenever I face an analytics problem, I go through three steps.

1️⃣ Start with the simplest possible heuristic

2️⃣ Validate whether it gets you most of the way

3️⃣ Only add complexity if the simple solution clearly fails

Most teams skip step one. I certainly did.

I used to jump straight into building the “perfect” solution.

But the best analytics teams I know follow a different philosophy:

Build less. Deliver more.

The Bottom Line

Lazy analytics produces bad decisions.

Overengineered analytics produces no decisions.

The real impact lies somewhere in between.

The sweet spot is the 80/20 middle.

Start simple.

Validate quickly.

And only add complexity when it’s truly justified.

Because the goal of analytics isn’t sophistication.

The goal is ROI.

Cheers,
Sebastian

Join 3,000+ readers

Subscribe for weekly tips on building impactful data teams in the AI-era


Whenever you need me, here's how I can help you:

Data Strategy Masterclass: 🏭 From dashboard factory to strategic partner ♟️

A digital, self-paced masterclass for growth-oriented data leaders who want to level up their careers by building impactful data teams in the AI-age. 📈

Learn and apply the frameworks that I used to win stakeholder trust, earn a seat in the boardroom, and lead with impact in 40+ companies across all continents.

Knowledge Base

Free content to help you on your journey to create massive business impact with your data team and become a trusted and strategic partner of your stakeholders and your CEO.