Sarah Floris

Lead ML Engineer

Hugging Face contributor · Microsoft QKit contributor · 80K+ on LinkedIn · 6.7K+ on Substack · 2K+ on Medium

Most AI education focuses on training models.

That is not the hard part anymore.

In practice, the harder problems often show up after the model exists: getting systems to run reliably, securely, repeatedly, and at scale.

My path into AI started in theoretical chemistry. At the University of Washington, I worked on quantum mechanics simulations and high-performance computing systems, which was my first experience building and debugging large computational workflows.

In graduate school, I joined DIRECT, a program focused on data-intensive research and data science. That is where I began working seriously with machine learning.

When I moved into industry, I started seeing a recurring gap between machine learning experiments and production systems.

Teams could build models, run notebooks, and create promising demos. But production introduced a different set of problems.

Pipelines failed. Inference systems broke. Data changed. Deployments took longer than expected. Monitoring was incomplete. Ownership was often unclear.

Over time, my work became focused on that gap: helping turn machine learning experiments into systems that could be run, maintained, and improved over time.

I have worked across MLOps, ML infrastructure, and data platforms at companies including EquipmentShare, Zwift, Clinical Trial Media, Qumulo, Blueprint Technologies, and Redapt-Attunix. That work has included deploying low-latency inference systems, building standardized deployment templates, implementing cloud data platforms, optimizing large-scale pipelines, and helping teams adopt AI in practical ways.

That experience shaped how this program is designed.

The goal is not only to teach AI concepts or model training. It is to help learners understand the infrastructure, workflows, deployment patterns, and operational habits that make AI systems usable in real environments.

The program is built around the kinds of problems that show up in practice: broken pipelines, unreliable deployments, changing data, unclear interfaces, and systems that behave differently outside the notebook.

My teaching philosophy is simple: start with the failure, understand why it happens, then build the principle from scratch and apply it in a new context.

That is how the material is structured.

Students first see what can go wrong. Then they learn the concept. Then they implement it, test it, and apply it through a project.

The aim is not to make AI feel simple. It is to make the work clearer, more practical, and easier to reason about when things break.