I build production ML systems, data platforms, and fraud/risk solutions that connect modeling to business decisions.
In practice, that means designing ML platforms, data pipelines, and decision-support systems that can be trusted in production. I choose the modeling approach that best fits the problem, whether that is gradient boosting, Bayesian analysis, causal inference, deep learning, or a clear visualization.
A collection of work across ML engineering, data science, and analytics.
A hands-on workshop demonstrating how to build a feature store for time-series forecasting using Mage for orchestration and Feast as the feature store. Presented at DataDay 2025 in Monterrey, Mexico.
End-to-end data pipeline on Iowa Liquor Sales (30M+ rows) covering the full data lifecycle: concurrent ingestion from the SODA API, transformation with Polars, persistent storage in DuckDB, geospatial analytics in R, and time-series forecasting with sktime. Accompanied by a hands-on tutorial article.
Production-ready microservice predicting bike rentals with a CatBoost regressor. Features Pydantic request validation, automated unit/integration/inference tests with pytest, Docker containerization, and a CI/CD pipeline via GitHub Actions. Accompanied by a tutorial article on software engineering practices for data scientists.
Deep dive into 12 years of Iowa liquor sales (~28M invoices). Covers STL decomposition, isolation forest anomaly detection (via PyCaret), and Bayesian estimation (BEST) to answer two questions: does inventory diversity drive sales, and did the pandemic increase alcohol consumption? Built in R with Python interop through reticulate and published as a reproducible Quarto document.
End-to-end fraud prevention platform built as five microservices orchestrated with Docker Compose, including a FastAPI data generator powered by the Synthetic Data Vault (SDV), Apache Kafka for event streaming, a Mage pipeline that consumes transactions and scores them with an ML model, and a Streamlit risk viewer (backed by DuckDB over Parquet) where analysts review high-risk cases.
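As a rough illustration of the score-then-review step described above, here is a minimal sketch in plain Python. It is not the project's code: the `Transaction` fields, the `toy_score` function, and the `review_threshold` value are all hypothetical stand-ins, and the Kafka consumer and the real model are left out.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    # Hypothetical fields; the real platform's schema may differ.
    transaction_id: str
    amount: float


def route_transactions(transactions, score_fn, review_threshold=0.8):
    """Score each transaction and split the batch into two lists:
    auto-approved items and a queue for analyst review."""
    approved, review_queue = [], []
    for tx in transactions:
        score = score_fn(tx)
        if score >= review_threshold:
            review_queue.append((tx, score))
        else:
            approved.append((tx, score))
    # Show the riskiest cases to analysts first.
    review_queue.sort(key=lambda pair: pair[1], reverse=True)
    return approved, review_queue


def toy_score(tx):
    # Stand-in for an ML model: larger amounts look riskier, capped at 1.0.
    return min(tx.amount / 1000.0, 1.0)


batch = [
    Transaction("t1", 40.0),
    Transaction("t2", 950.0),
    Transaction("t3", 5000.0),
]
approved, review_queue = route_transactions(batch, toy_score)
```

In a setup like this, the threshold is the main lever: lowering it sends more cases to analysts, raising it automates more decisions.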
Sharing knowledge at conferences across Latin America.
Talk exploring the intersection of fraud prevention, machine learning, and human-in-the-loop software design patterns, with a focus on keeping analysts as part of the system for better outcomes.
Hands-on workshop building a feature store for multi-series forecasting on Iowa Liquor Sales data. Covers ETL pipelines in Mage, feature definitions and point-in-time correct training datasets in Feast, DuckDB as the offline store, and online materialization to SQLite for model serving.
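To make "point-in-time correct" concrete, here is a minimal sketch of the idea in plain Python. It does not use Feast's API and ignores details such as feature expiry; the function name, the tuple layouts, and the sample store IDs and values are all made up for illustration. For each training row, it attaches the most recent feature value observed at or before that row's timestamp, so no information from the future leaks into training.

```python
from bisect import bisect_right
from datetime import datetime


def point_in_time_join(entity_rows, feature_rows):
    """entity_rows: iterable of (entity_id, event_ts).
    feature_rows: iterable of (entity_id, feature_ts, value).
    Returns (entity_id, event_ts, value) with the latest value whose
    feature_ts <= event_ts, or None if no such value exists."""
    # Group feature observations per entity, sorted by timestamp.
    by_entity = {}
    for entity_id, ts, value in sorted(feature_rows, key=lambda r: (r[0], r[1])):
        times, values = by_entity.setdefault(entity_id, ([], []))
        times.append(ts)
        values.append(value)

    joined = []
    for entity_id, event_ts in entity_rows:
        times, values = by_entity.get(entity_id, ([], []))
        # Index of the first feature timestamp strictly after event_ts.
        idx = bisect_right(times, event_ts)
        joined.append((entity_id, event_ts, values[idx - 1] if idx else None))
    return joined


# Hypothetical weekly sales feature for one store.
features = [
    ("store_1", datetime(2024, 1, 1), 100.0),
    ("store_1", datetime(2024, 1, 8), 120.0),
]
entities = [
    ("store_1", datetime(2024, 1, 5)),    # sees the Jan 1 value only
    ("store_1", datetime(2024, 1, 8)),    # sees the Jan 8 value
    ("store_1", datetime(2023, 12, 31)),  # nothing known yet
    ("store_2", datetime(2024, 1, 5)),    # unknown entity
]
training_rows = point_in_time_join(entities, features)
```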
Open to collaborations, interesting problems, and good coffee.