<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>ML in Production | Gabriel Humpire-Mamani</title><link>https://gabrielhumpire.github.io/tags/ml-in-production/</link><atom:link href="https://gabrielhumpire.github.io/tags/ml-in-production/index.xml" rel="self" type="application/rss+xml"/><description>ML in Production</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Mon, 15 Jan 2024 00:00:00 +0000</lastBuildDate><image><url>https://gabrielhumpire.github.io/media/icon_hu_645fa481986063ef.png</url><title>ML in Production</title><link>https://gabrielhumpire.github.io/tags/ml-in-production/</link></image><item><title>End-to-End Computer Vision: From Requirements to Deployed System</title><link>https://gabrielhumpire.github.io/post/project-management/</link><pubDate>Mon, 15 Jan 2024 00:00:00 +0000</pubDate><guid>https://gabrielhumpire.github.io/post/project-management/</guid><description>&lt;p&gt;After 15 years working in computer vision - through a PhD, multiple industry roles, and a few startups - I&amp;rsquo;ve shipped systems that went from whiteboard sketch to production deployment. This post is about what that journey actually looks like, and the lessons that only come from doing it end-to-end.&lt;/p&gt;
&lt;h2 id="requirements-are-the-hardest-part"&gt;Requirements Are the Hardest Part&lt;/h2&gt;
&lt;p&gt;In research, the problem is given to you. In production, you have to discover it. &amp;ldquo;Detect anomalies in CT scans&amp;rdquo; is not a requirement. &amp;ldquo;Flag studies with a sensitivity of at least 92% while keeping the false-positive rate below 5% per study, with results available within 3 minutes of scan acquisition, running on a single GPU workstation&amp;rdquo; - that is a requirement.&lt;/p&gt;
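&lt;p&gt;One way to keep a requirement like that honest is to encode it as a machine-checkable acceptance test. A minimal sketch in Python; the class and field names are illustrative, not from any real project:&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AcceptanceCriteria:
    """Quantified targets mirroring the CT-scan example (names illustrative)."""
    min_sensitivity: float = 0.92   # flag at least 92% of true positives
    max_fp_rate: float = 0.05       # false-positive rate per study stays below 5%
    max_latency_s: float = 180.0    # results available within 3 minutes

    def passes(self, sensitivity, fp_rate, latency_s):
        # Each check compares a measured value against its budget.
        return (sensitivity >= self.min_sensitivity
                and self.max_fp_rate > fp_rate
                and self.max_latency_s >= latency_s)
```

&lt;p&gt;Running a check like this in CI against every model candidate makes the requirement the contract, rather than whatever the prototype happened to score.&lt;/p&gt;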
&lt;p&gt;Every project I&amp;rsquo;ve worked on has been redefined at least once after the first prototype. Building for changeability is more valuable than building for the original spec.&lt;/p&gt;
&lt;h2 id="the-prototype-trap"&gt;The Prototype Trap&lt;/h2&gt;
&lt;p&gt;A prototype that achieves 94% accuracy on your held-out test set is not a product. The questions that matter:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;How does it perform on data from a different scanner model?&lt;/li&gt;
&lt;li&gt;What happens when the input is corrupted or incomplete?&lt;/li&gt;
&lt;li&gt;How does a clinician or operator actually interact with the output?&lt;/li&gt;
&lt;li&gt;What is the failure mode, and is it safe?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Medical imaging taught me this faster than any other domain. A false negative in cancer screening is not an acceptable failure mode.&lt;/p&gt;
&lt;h2 id="iteration-is-the-work"&gt;Iteration Is the Work&lt;/h2&gt;
&lt;p&gt;The drone delivery project I led at Embention is a good example. The first detection model worked well in controlled lab conditions. Outdoor lighting, vibration, variable altitude, and regulatory edge cases forced five major architecture revisions before we had something deployable. Each revision was cheaper than the last because we had built good evaluation tooling early.&lt;/p&gt;
&lt;p&gt;The 30% latency reduction we eventually achieved did not come from a clever algorithm - it came from profiling, identifying that the bottleneck was preprocessing, and rewriting that stage in C++ with CUDA.&lt;/p&gt;
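&lt;p&gt;The profiling itself does not need heavy tooling. A sketch of a per-stage timing harness of the kind that surfaces such bottlenecks; the stage functions here are illustrative stand-ins for a real pipeline:&lt;/p&gt;

```python
import time

def profile_pipeline(stages, data):
    """Run each (name, fn) stage in order, recording wall-clock time per stage."""
    timings = {}
    for name, fn in stages:
        start = time.perf_counter()
        data = fn(data)
        timings[name] = time.perf_counter() - start
    return data, timings

# Illustrative stages; a breakdown like this is what showed that
# preprocessing, not the model, dominated end-to-end latency.
stages = [
    ("preprocess", lambda frame: [v / 255.0 for v in frame]),
    ("inference",  lambda frame: sum(frame)),
]
result, timings = profile_pipeline(stages, list(range(1000)))
slowest = max(timings, key=timings.get)
```

&lt;p&gt;Only once a breakdown like this points at a stage is it worth paying the engineering cost of a C++/CUDA rewrite.&lt;/p&gt;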
&lt;h2 id="deployment-is-a-feature"&gt;Deployment Is a Feature&lt;/h2&gt;
&lt;p&gt;Models don&amp;rsquo;t deploy themselves. Containerisation (Docker), portable model export and hardware-specific optimisation (ONNX, TensorRT), monitoring, and rollback procedures are engineering work that needs to be planned from day one - not bolted on at the end.&lt;/p&gt;
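&lt;p&gt;Rollback in particular deserves a concrete trigger, not a judgement call at 2 a.m. A minimal sketch with hypothetical threshold and window values; a real system would key this to whatever metrics its monitoring stack already collects:&lt;/p&gt;

```python
def should_roll_back(error_rates, threshold=0.02, window=5):
    """Return True when the mean error rate over the last `window`
    batches exceeds `threshold` (both defaults are placeholders)."""
    recent = error_rates[-window:]
    if len(recent) == 0:
        return False
    return sum(recent) / len(recent) > threshold
```

&lt;p&gt;Averaging over a window rather than reacting to a single bad batch keeps the trigger robust to one-off input corruption.&lt;/p&gt;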
&lt;p&gt;The most valuable skill I have developed is being fluent in both the research side (what the model can learn) and the engineering side (what it will cost to run). That fluency is what makes end-to-end delivery possible.&lt;/p&gt;</description></item></channel></rss>