A full-cycle MLOps project for FAANG stock price forecasting. Built to showcase core MLOps concepts like reproducibility, monitoring, CI/CD, and cloud readiness, all within a containerized setup. The project runs locally using Docker Compose and can be extended to cloud setups.
Stock prices for high-growth tech companies (FAANG) are volatile, influenced by global economic trends, sector-specific developments, and investor sentiment. Accurately forecasting such prices is challenging but valuable for many downstream applications.
This project tackles the challenge of building a fully reproducible, deployable, and monitored forecasting pipeline that predicts FAANG stock closing prices using Linear Regression. Our goal was not just to build a model, but to design an MLOps-ready system that could be trained, deployed, and monitored in both local and cloud environments.
Develop a predictive model to forecast FAANG closing prices. Implement MLOps best practices: reproducibility, CI/CD, monitoring, containerization. Enable seamless deployment through Docker Compose with potential cloud migration.
- Data Exploration & Preparation: Collected historical FAANG stock price data; cleaned missing values, handled outliers, and engineered features (lag variables, rolling averages).
- Model Selection & Training: Chose Linear Regression for its interpretability and quick iteration; trained with Scikit-Learn and logged experiments and metrics in MLflow.
- Deployment: Built a FastAPI service for real-time predictions; containerized each service (model, monitoring, API) with Docker.
- Monitoring: Used Evidently to detect data drift and monitor prediction quality; stored monitoring metrics in PostgreSQL and visualized them in Grafana.
- CI/CD & Automation: Automated builds and tests with GitHub Actions; used a Makefile for reproducible local workflows.
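The feature-engineering step above (lag variables and rolling averages) can be sketched with pandas; the column names and the synthetic price series here are illustrative stand-ins for the real FAANG data:

```python
import numpy as np
import pandas as pd

# Synthetic closing-price series standing in for downloaded FAANG data.
rng = np.random.default_rng(0)
prices = pd.DataFrame({"close": 100 + rng.normal(0, 1, 60).cumsum()})

# Lag variables: the close price 1, 2, and 3 days back.
for lag in (1, 2, 3):
    prices[f"close_lag_{lag}"] = prices["close"].shift(lag)

# Rolling average over a 5-day window.
prices["roll_mean_5"] = prices["close"].rolling(window=5).mean()

# Shifting and rolling leave NaNs in the first rows; drop them.
features = prices.dropna().reset_index(drop=True)
print(features.shape)  # (56, 5): 4 incomplete leading rows removed
```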
```
faang-mlops/
├── orchestration/
│   ├── mlflow_pipeline/      # MLflow tracking & model training
│   ├── fastapi_app/          # FastAPI app for model serving
│   ├── monitoring/           # Evidently + Grafana setup
│   ├── data/                 # Reference and current CSV data for monitoring
│   ├── requirements.txt      # Python dependencies
│   └── Dockerfiles           # Each service has its own Dockerfile
├── docker-compose.yml        # Service orchestration
├── .env                      # Environment variables (not committed)
├── Makefile                  # Workflow automation
└── README.md
```
- MLflow: Model training, experiment tracking, and versioning
- FastAPI: REST API for model inference
- Evidently: Data and performance monitoring
- Grafana + PostgreSQL: Visualization of monitoring metrics
- Docker + Docker Compose: Container orchestration
- Pandas & Scikit-Learn: Data manipulation & ML modeling
Model: Linear Regression
Metrics: 6.7
Delivered an MLOps-ready stack that is fully containerized, version-controlled, and monitored.
Objective: Train a time-series regression model on FAANG stock data
- Collected and preprocessed FAANG historical stock data
- Performed feature engineering (rolling averages, lags)
- Trained a linear regression model using Scikit-Learn
- Logged metrics and model to MLflow
Key files:
- `orchestration/mlflow_pipeline/train.py`
- `orchestration/mlflow_pipeline/config.yaml`
- `orchestration/mlflow_pipeline/utils.py`
- Chose linear regression for interpretability
- Used MLflow to version both metrics and artifacts
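A minimal sketch of the training step, using synthetic data and illustrative feature names in place of the real pipeline; the MLflow calls at the end are shown as comments since they need a running tracking setup:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Synthetic stand-in for the preprocessed FAANG data.
rng = np.random.default_rng(42)
close = pd.Series(100 + rng.normal(0, 1, 200).cumsum())
df = pd.DataFrame({
    "close": close,
    "lag_1": close.shift(1),
    "roll_mean_5": close.rolling(5).mean(),
}).dropna()

# Chronological split: never shuffle a time series.
split = int(len(df) * 0.8)
X, y = df[["lag_1", "roll_mean_5"]], df["close"]
model = LinearRegression().fit(X[:split], y[:split])

rmse = float(np.sqrt(mean_squared_error(y[split:], model.predict(X[split:]))))
print(f"test RMSE: {rmse:.3f}")

# In the actual pipeline these values are versioned in MLflow, roughly:
# mlflow.log_metric("rmse", rmse)
# mlflow.sklearn.log_model(model, "model")
```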
Objective: Serve the trained model via REST API
- Loaded the latest MLflow model from the local `mlruns/` directory
- Created FastAPI endpoints for healthcheck and prediction
- Containerized with Docker
Key files:
- `orchestration/fastapi_app/main.py`
- `orchestration/fastapi_app/Dockerfile`
make build-fastapi
make run-fastapi

Objective: Track data drift and prediction quality using Evidently
- Compared `reference.csv` (baseline) vs `current.csv`
- Generated HTML and JSON monitoring reports
- Saved metrics to PostgreSQL
- Visualized in Grafana
Key files:
- `orchestration/monitoring/monitor.py`
- `orchestration/monitoring/grafana/` (dashboard config)
- `docker-compose.yml` (services)
Challenges:
- PostgreSQL SSL errors (solved by matching container names)
- Resource-intensive on low-spec machines
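The core idea behind the reference-vs-current comparison can be shown with a simplified stand-in: a two-sample Kolmogorov-Smirnov test on one feature. This is not Evidently's API, only an illustration of the statistical check a drift report runs per column:

```python
import numpy as np
from scipy.stats import ks_2samp

# Synthetic closes playing the roles of reference.csv and current.csv,
# with the "current" window deliberately shifted to simulate drift.
rng = np.random.default_rng(1)
reference = rng.normal(100, 5, 500)
current = rng.normal(110, 5, 500)

stat, p_value = ks_2samp(reference, current)
drift_detected = bool(p_value < 0.05)
print(f"KS stat={stat:.3f}, p={p_value:.2e}, drift={drift_detected}")
```

In the real pipeline, Evidently aggregates such per-feature results into HTML/JSON reports, and the metrics land in PostgreSQL for Grafana.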
Objective: Ensure stability of the FastAPI prediction pipeline
- Wrote a unit test for the `/predict` route using pytest
- Added pre-commit hooks for linting
Key files:
- `orchestration/fastapi_app/test_main.py`
- `.pre-commit-config.yaml`
Objective: Enable reproducible development and deployments
- Added Makefile for repeatable workflows
- Defined GitHub Actions for lint, test, and build
Key files:
- `Makefile`
- `.github/workflows/ci.yml`
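A minimal sketch of what the workflow file can contain; the job name, `make` targets, and Python version are assumptions, not the repo's actual configuration:

```yaml
# Sketch of .github/workflows/ci.yml (illustrative, not the repo's file)
name: ci
on: [push, pull_request]
jobs:
  lint-test-build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r orchestration/requirements.txt
      - run: make lint        # assumed Makefile target
      - run: make test        # assumed Makefile target
      - run: docker compose build
```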
- Container isolation ensures reproducibility
- Evidently + Grafana provides powerful monitoring with minimal setup
- MLflow simplifies experiment tracking and version control
- Modular development aids debugging and future extensions
- Use a lighter model (e.g., LightGBM or Ridge) for better performance
- Add support for cloud deployment (e.g., Render or LocalStack)
- Extend monitoring to support concept drift and multivariate alerts
- Improve model retraining pipeline with DAG (e.g., Mage or Airflow)
Kiriinya Antony MLOps | Data Engineering | Forecasting Systems
LinkedIn | GitHub | Nairobi, Kenya
# Build all services
make build-all            # or: docker compose up (from the root folder, faang-mlops/)
# Start the stack (MLflow + FastAPI + Monitoring)
make up
# Open dashboards:
# MLflow: http://localhost:5000
# FastAPI: http://localhost:8000/docs
# Grafana: http://localhost:3000 (admin/admin)