MLOps

Status: 🚧 Coming soon. Chapters are being written.

MLOps is the discipline of running ML systems in production reliably. It's where data engineering, software engineering, and ML meet. If you've ever shipped a model and then watched it silently rot, this is what you needed to know.

What this section will cover

  • The ML lifecycle: research → training → serving → monitoring → retraining
  • Experiment tracking: MLflow, Weights & Biases, Neptune
  • Data versioning: DVC, LakeFS, Pachyderm
  • Model registries and reproducibility
  • Serving: REST endpoints, batch inference, streaming, BentoML, KServe, Ray Serve, vLLM for LLMs
  • Monitoring: drift, performance decay, fairness, latency, cost
  • CI/CD for ML: testing data, models, pipelines
  • Feature stores: Feast, Tecton
  • LLMOps specifics: prompt versioning, eval pipelines, cost tracking, observability via LangSmith

A consolidated MLOps track lands next.