Mathieu Sawatzky's Field Notes

Trust - Knowing When to Throw Data Away

April 2026

Data Analysis, Data Engineering

There’s an instinct in data work to keep everything.

More data feels safer. More complete. More “correct.”

But this month pushed back on that idea.

Some fields looked fine at first glance, but the deeper we looked, the more they broke down. Inconsistent, partially filled, or just wrong often enough to create doubt.

At a certain point, the question changed.

Not “can we use this?” but “should we?”

And the answer was no.

Those fields were cleared entirely and intentionally replaced with empty values. It felt strange at first. Like throwing away information. But the result was a system that people could actually trust.

Because bad data doesn’t just sit quietly. It leaks into decisions, dashboards, and assumptions.

And in that sense, removing unreliable data isn’t a loss of information.

It’s an increase in clarity.
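As a rough illustration of what "replaced with empty values" can look like, here is a minimal Python sketch. The field names and sample rows are hypothetical and exist only for the example; the notes above don't say which fields were affected or how the real cleanup was done.

```python
from typing import Any

# Hypothetical field names, chosen only for illustration.
UNRELIABLE_FIELDS = {"referral_source", "legacy_region"}


def blank_unreliable(record: dict[str, Any], unreliable: set[str]) -> dict[str, Any]:
    """Return a copy of the record with untrusted fields set to None.

    The keys are kept so the shape of the data does not change;
    only the values are discarded.
    """
    return {key: (None if key in unreliable else value) for key, value in record.items()}


# Made-up sample rows with the kind of inconsistency described above.
rows = [
    {"id": 1, "amount": 120.0, "referral_source": "web??", "legacy_region": "N"},
    {"id": 2, "amount": 75.5, "referral_source": "", "legacy_region": "north "},
]
cleaned = [blank_unreliable(row, UNRELIABLE_FIELDS) for row in rows]
```

Keeping the keys and emptying the values makes the decision visible: anyone reading the data sees an explicit blank rather than a number they might be tempted to trust.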

Speed - See the Forest, Not Just the Tree

March 2026

ETL, Data Engineering

There’s a point in a project where everything works… but something still feels off.

That showed up this month in an ETL pipeline. It was clean, correct, and reliable… but just slow enough to make every run feel a little heavy.

So I stepped back and asked a different question: what is each tool actually good at?

Python is great for orchestration, cleaning, and shaping data. The database is built to ingest and store it efficiently.

That’s when it hit me. I had been so focused on building the pipeline that I lost sight of the system as a whole, and ended up asking one tool to do the job of the other.

Switching to bulk inserts let each part of the system do what it was designed for, and everything changed. Minutes became seconds, and the pipeline went from something that worked to something that felt right.

This was the perfect reminder that good systems aren’t just built; they’re composed.

When each tool is used for its strengths, performance improves, complexity drops, and the whole system becomes easier to reason about.
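To make the bulk-insert idea concrete, here is a small sketch using Python's built-in sqlite3 module. The database, table, and column names are assumptions for the example; the notes above don't name the actual database or driver, and many databases have their own bulk-load paths that would be the better fit in practice.

```python
import sqlite3

INSERT_SQL = "INSERT INTO readings (sensor_id, value) VALUES (?, ?)"


def load_row_by_row(conn, rows):
    """One INSERT and one commit per row: correct, but slow at volume."""
    for row in rows:
        conn.execute(INSERT_SQL, row)
        conn.commit()


def load_bulk(conn, rows):
    """Hand the whole batch to the database inside a single transaction."""
    with conn:  # commits once on success, rolls back on error
        conn.executemany(INSERT_SQL, rows)


# Hypothetical table and data, used only to show the two loading styles.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor_id INTEGER, value REAL)")
rows = [(i % 10, i * 0.5) for i in range(10_000)]
load_bulk(conn, rows)
row_count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
```

Python still prepares and shapes the rows, but the database decides how to ingest them, which is the division of labor described above.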

Algorithms - The Structure Shapes the Solution

February 2026

Algorithms, Data Structures

One of the biggest takeaways from my algorithm design course so far is that most algorithms are not something to memorize. They emerge naturally from the properties of the data structure being used. When you understand the structure, the algorithm often becomes obvious: it is what you get when the problem’s constraints meet the structure’s guarantees.

A few ways this shows up:

Choosing the right structure often reduces a complex algorithm to a simple operation repeated efficiently.
Many algorithms succeed because they exploit guarantees such as ordering, hierarchy, or locality.
Good algorithm design is often the art of turning a problem into one that a known structure already solves well.

In that sense, algorithm design feels less like inventing clever tricks and more like discovering the structure already hiding inside the problem.
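Two small illustrations, using only Python's standard library. The function names are my own and the examples are deliberately simple; they are a sketch of the idea, not material from the course.

```python
import bisect
import heapq


def contains_sorted(sorted_values, target):
    """Membership test that relies on the ordering guarantee of a sorted list.

    Because the list is sorted, binary search can discard half of the
    remaining candidates at every step.
    """
    i = bisect.bisect_left(sorted_values, target)
    return i < len(sorted_values) and sorted_values[i] == target


def k_smallest(values, k):
    """Return the k smallest items in ascending order.

    A heap guarantees the minimum is always at the root, so the
    "algorithm" is just: pop the root k times.
    """
    heap = list(values)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(min(k, len(heap)))]


found = contains_sorted([1, 3, 5, 7], 5)
smallest_three = k_smallest([9, 2, 7, 4, 1], 3)
```

In both cases the code is short because the structure carries the work: ordering makes the search possible, and the heap property makes "repeat one simple operation" enough.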

Data Engineering - The Hidden Work

January 2026

Data Engineering, Analytics

Most of the work in real data systems is not modeling or visualization. It is fixing the data.

Common issues:

inconsistent data types
historical schema drift
missing values
duplicated entries
improperly entered data

A large part of a data engineer’s job is building pipelines that normalize messy operational data into something analyzable. Clean data isn’t glamorous, but it’s the foundation everything else sits on.
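A minimal sketch of what one normalization step might look like in plain Python. The column names (customer_id, customer, amount), the sample rows, and the rule for skipping bad rows are all assumptions made for the example, not a description of any particular pipeline.

```python
def normalize(records):
    """Coerce types, mark missing values as None, and drop duplicate ids."""
    seen_ids = set()
    cleaned = []
    for raw in records:
        # Schema drift: older rows are assumed to use "customer"
        # instead of "customer_id" (hypothetical column names).
        raw_id = raw.get("customer_id", raw.get("customer"))
        try:
            customer_id = int(str(raw_id).strip())
        except ValueError:
            continue  # improperly entered id: skip rather than guess
        if customer_id in seen_ids:
            continue  # duplicated entry
        seen_ids.add(customer_id)

        try:
            amount = float(raw.get("amount"))
        except (TypeError, ValueError):
            amount = None  # missing or unparseable value
        cleaned.append({"customer_id": customer_id, "amount": amount})
    return cleaned


# Made-up rows covering the issue types listed above.
messy = [
    {"customer_id": "101", "amount": "12.50"},    # numbers stored as text
    {"customer": 102, "amount": None},            # old column name, missing value
    {"customer_id": " 101 ", "amount": "12.50"},  # duplicate of the first row
    {"customer_id": "n/a", "amount": "3"},        # improperly entered id
]
clean = normalize(messy)
```

Each branch maps to one of the issues in the list, which is most of what this kind of work is: naming every way the data can be wrong and deciding, explicitly, what to do about it.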

Programming - Systems Thinking from Biology

December 2025

Systems Design

My background in kinesiology shapes how I think about software systems.

Biological systems work through interacting subsystems such as:

circulatory system
nervous system
muscular system

Software architectures behave similarly. A good system is not one giant component. It’s a collection of smaller components that communicate clearly.

Healthy systems share traits:

clear boundaries
feedback loops
redundancy
resiliency/adaptability
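A few of these traits can be sketched in a handful of lines of Python. Everything here (the RedundantSource class, the fetcher functions, the key) is hypothetical and only meant to show the shape of the idea: small components behind a narrow interface, a backup path, and a signal that reports how each part is doing.

```python
from typing import Callable, Optional

# Clear boundary: a component is anything that maps a key to a value.
Fetcher = Callable[[str], str]


class RedundantSource:
    """Tries each fetcher in order and records failures as a feedback signal."""

    def __init__(self, fetchers: list[Fetcher]) -> None:
        self.fetchers = fetchers
        self.failures = [0] * len(fetchers)

    def fetch(self, key: str) -> Optional[str]:
        for index, fetcher in enumerate(self.fetchers):
            try:
                return fetcher(key)
            except Exception:
                # Feedback: count the failure, then fall through to the next source.
                self.failures[index] += 1
        return None  # every source failed


def flaky_primary(key: str) -> str:
    """Stand-in for a component that is currently down."""
    raise ConnectionError("primary unavailable")


def backup(key: str) -> str:
    """Stand-in for a redundant component."""
    return f"backup:{key}"


source = RedundantSource([flaky_primary, backup])
result = source.fetch("user-42")
```

The caller never needs to know which component answered. The failure counts are the simplest possible feedback signal; a fuller system might use them to reorder or retire sources.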