Posts
All the articles I've posted.
-
Building a Robust Multi-Format Data Ingestion Pipeline with Auditability and Versioning
How we designed and implemented a scalable ingestion pipeline with support for CSV, Parquet, and JSON, built-in schema validation, and timestamped snapshots, forming the foundation for a reproducible AI/ML workflow.
-
Production-Grade Data Cleaning Workflow for the Titanic Dataset (pandas, 2025)
A modern, production-ready pandas workflow for cleaning the Titanic dataset, covering path setup, missing-data handling, and categorical encoding with best practices.
-
If I Had to Start Over Again
A personal note on why this blog exists — to document learnings, share back, and make the journey easier for those walking a similar path.