Start by Not Touching the Data

Most people open a new dataset like a raccoon opening a trash can – fast, noisy, and with no plan other than to find something shiny. 

You see it in every hiring cycle: the candidate gets the link, their eyes glaze over with excitement, there is a short session of Analysis Paralysis (Hey there!), and then they immediately start writing code.

They want to show off. 

They want to find that oh-so-sweet correlation.

They want to be the hero who finds the “secret” in the numbers.

But that frantic impulse is exactly why so many take-home assignments collapse before the first JOIN is even executed. You can brute-force your way to a chart, but if your foundation is shaky, your conclusion will be nothing more than fantasy dressed up as insight.

When you receive a brief, your first move should be total silence. 

Really, that’s what I do – always.

Put your hands behind your head and read the prompt twice. 

Don’t just scan it – interrogate it. 

Circle the sentences (or color them, you know, whatever works with your technology) that describe the actual business decision that needs to be made.

Ask yourself: “If I deliver the perfect answer, what will change tomorrow?” Will a budget move? Will a feature be killed? Will a strategy pivot? If you can’t answer that cleanly, you just aren’t ready to query yet, and that’s FINE.

Only once you have this internal clarity do you sketch three things on paper:

  1. What success looks like

  2. What could plausibly go wrong

  3. What assumptions you are already making

This is the line between being a button-pusher and being a professional data person. 

When you finally do open the data, your first queries should be “boring” by design: ranges of dates, missing values, duplicates, and basic distributions. 

Think of this as looking both ways before crossing the street – it won’t win you any awards, but it keeps you alive analytically.
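To make “boring” concrete, here is a minimal sketch of those first checks in plain Python. Everything in it is an assumption for illustration: the sample rows, the column names (order_id, order_date, channel, amount), and the boring_checks helper are all hypothetical, not taken from any real assignment.

```python
from collections import Counter
from datetime import date

# Hypothetical sample rows standing in for a take-home dataset.
ROWS = [
    {"order_id": "1", "order_date": "2024-01-03", "channel": "web", "amount": "19.90"},
    {"order_id": "2", "order_date": "2024-01-05", "channel": "app", "amount": ""},
    {"order_id": "2", "order_date": "2024-01-05", "channel": "app", "amount": ""},
    {"order_id": "3", "order_date": "2024-02-11", "channel": "web", "amount": "5.00"},
]


def boring_checks(rows, date_col, key_col, category_col):
    """Return the date range, missing-value counts, duplicate keys,
    and a basic distribution for one categorical column."""
    # Date range: parse ISO dates, skipping empty values.
    dates = [date.fromisoformat(r[date_col]) for r in rows if r[date_col]]

    # Missing values: count empty strings or None per column.
    missing = Counter(
        col for r in rows for col, value in r.items() if value in ("", None)
    )

    # Duplicates: any key that appears more than once.
    key_counts = Counter(r[key_col] for r in rows)
    duplicates = {key: n for key, n in key_counts.items() if n > 1}

    # Basic distribution: how often each category value occurs.
    distribution = Counter(r[category_col] for r in rows)

    return {
        "date_range": (min(dates), max(dates)) if dates else None,
        "missing": dict(missing),
        "duplicate_keys": duplicates,
        "distribution": dict(distribution),
    }


report = boring_checks(ROWS, "order_date", "order_id", "channel")
for name, value in report.items():
    print(name, value)
```

None of this is clever, and that is the point: a duplicated order or a column that is half empty is much cheaper to find here than after the first chart.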

When I’m building tasks for XP Lab’s users, I make sure our assignments intentionally punish speed without thinking.

Slow thinking isn’t laziness; it is professional hygiene. 

And it is exactly what I want for you.
