Step-by-Step Guide to Deploying Machine Learning Models with FastAPI and Docker

You’ve trained your machine learning model, and it’s performing great on test data. But here’s the truth: a model sitting in a Jupyter notebook isn’t helping anyone. It’s only when you deploy it to production that real users can benefit from your work.

In this article, we’ll build a diabetes progression predictor on a sample dataset from scikit-learn and take it from raw data all the way to a containerized API that’s ready for the cloud.

By coding along with this tutorial, you’ll end up with:

  • A trained Random Forest model that predicts diabetes progression scores
  • A REST API, built using FastAPI, that accepts patient data and returns predictions
  • A fully containerized application ready for deployment

Let’s get started.

🔗 Link to the code on GitHub.

Setting Up Your Development Environment

Before we start coding, let’s get your dev environment ready. You’ll need:

  • Python 3.11+ (though 3.9+ works fine, too)
  • Docker installed and running
  • Basic familiarity with Python and APIs (I’ll explain the non-trivial parts)

Project Structure

Here’s how we’ll organize everything in the project directory:
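A layout along these lines works well (the names here are just one reasonable choice; the model file is created by the training script in the next section):

```
.
├── app/
│   ├── __init__.py
│   └── main.py
├── model/
│   └── diabetes_model.joblib
├── train_model.py
├── requirements.txt
└── Dockerfile
```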

Installing Dependencies

Let’s create a clean virtual environment:
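```bash
python -m venv venv
source venv/bin/activate  # on Windows: venv\Scripts\activate
```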

Now install the required libraries:
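```bash
pip install fastapi "uvicorn[standard]" scikit-learn joblib numpy
```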

Building a Machine Learning Model for Predicting Diabetes Progression

Let’s start by creating our machine learning model. Create train_model.py:
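We’ll need scikit-learn for the data and the model, and joblib to save the trained model to disk:

```python
# train_model.py
from pathlib import Path

import joblib
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
```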

We’ve chosen Random Forest because it’s robust, handles different feature scales well, and gives us feature importance insights.

Let’s load and explore our diabetes dataset:
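```python
# Load the bundled diabetes dataset (the features come pre-normalized)
data = load_diabetes()
X, y = data.data, data.target

print(f"Dataset shape: {X.shape}")
print(f"Features: {data.feature_names}")
print(f"Target range: {y.min():.0f} to {y.max():.0f}")
```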

The diabetes dataset is a collection of 442 patient records with 10 physiological features. The target is a quantitative measure of disease progression one year after baseline: higher numbers indicate more advanced progression.

Output:
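```
Dataset shape: (442, 10)
Features: ['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
Target range: 25 to 346
```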

Now let’s prepare our data:
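```python
# Hold out 20% of the records for evaluation; fix the seed for reproducibility
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(f"Training samples: {X_train.shape[0]}")
print(f"Test samples: {X_test.shape[0]}")
```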

The 80/20 split gives us enough training data while reserving a solid test set. Using random_state=42 ensures reproducible results.

Output:
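```
Training samples: 353
Test samples: 89
```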

Time to train our model:
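```python
# 100 trees, with depth capped to limit overfitting on this small dataset
model = RandomForestRegressor(
    n_estimators=100,
    max_depth=10,
    random_state=42,
)
model.fit(X_train, y_train)
```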

We’ve set max_depth=10 to prevent overfitting on this relatively small dataset. With 100 trees, we get good performance without excessive computation time.

Let’s evaluate our model:
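```python
# Evaluate on the held-out test set
predictions = model.predict(X_test)

r2 = r2_score(y_test, predictions)
rmse = np.sqrt(mean_squared_error(y_test, predictions))

print(f"R² score: {r2:.3f}")
print(f"RMSE: {rmse:.2f}")
```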

The R² score tells us what proportion of the variance in disease progression our model explains. Anything above 0.4 is pretty good for this dataset!

Output:

Finally, let’s save our trained model:
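The file name here is our choice; just keep it consistent with the API code later on:

```python
# Persist the trained model; the API will load this file at startup
Path("model").mkdir(exist_ok=True)
joblib.dump(model, "model/diabetes_model.joblib")
print("Model saved to model/diabetes_model.joblib")
```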

Run this script to train your model:
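```bash
python train_model.py
```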

You should see output showing your model’s performance and confirmation that it’s been saved.

Creating the FastAPI Application

Now for the exciting part: turning our model into a web API.

If you haven’t already, create the app directory and an empty __init__.py file:
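```bash
mkdir -p app
touch app/__init__.py
```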

Now create app/main.py with our API code:
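Start with the imports:

```python
# app/main.py
import joblib
import numpy as np
from fastapi import FastAPI
from pydantic import BaseModel
```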

FastAPI uses Pydantic for request validation, meaning it automatically validates incoming data and returns clear error messages if something’s wrong.

Let’s define our input data structure:
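Here’s a sketch (the example values are roughly the first record of the dataset):

```python
class PatientData(BaseModel):
    """The ten normalized features from the diabetes dataset."""

    age: float
    sex: float
    bmi: float  # body mass index
    bp: float   # average blood pressure
    s1: float   # blood serum measurements s1 through s6
    s2: float
    s3: float
    s4: float
    s5: float
    s6: float

    model_config = {
        "json_schema_extra": {
            "examples": [
                {
                    "age": 0.038, "sex": 0.051, "bmi": 0.062, "bp": 0.022,
                    "s1": -0.044, "s2": -0.035, "s3": -0.043, "s4": -0.003,
                    "s5": 0.02, "s6": -0.018,
                }
            ]
        }
    }
```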

The example values help API users understand the expected input format. Note that the diabetes dataset features are already normalized.

Next, we initialize the FastAPI app and load the trained model:
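The model path must match wherever your training script saved it:

```python
app = FastAPI(
    title="Diabetes Progression Predictor",
    description="Predicts diabetes progression one year after baseline",
    version="1.0",
)

# Load the model once at startup rather than on every request
model = joblib.load("model/diabetes_model.joblib")
```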

Finally, let’s create our prediction endpoint:
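The cut-offs in the helper below are rough, illustrative thresholds we chose based on the target’s 25 to 346 range; adjust them to whatever makes sense for your use case:

```python
def interpret_score(score: float) -> str:
    # Rough cut-offs based on the target's observed range (25 to 346)
    if score < 100:
        return "below-average progression"
    elif score < 200:
        return "average progression"
    return "above-average progression"


@app.post("/predict")
def predict(patient: PatientData):
    # Assemble the features in the same order the model was trained on
    features = np.array([[
        patient.age, patient.sex, patient.bmi, patient.bp,
        patient.s1, patient.s2, patient.s3, patient.s4,
        patient.s5, patient.s6,
    ]])
    score = float(model.predict(features)[0])
    return {
        "predicted_progression_score": round(score, 2),
        "interpretation": interpret_score(score),
    }
```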

The interpretation function helps make our API more user-friendly by providing context for the numerical predictions.

Let’s also add a health check endpoint:
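```python
@app.get("/health")
def health():
    # Lets load balancers and container orchestrators verify the service is up
    return {"status": "healthy"}
```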

Testing the API Locally

Before containerizing, let’s test our API locally. Run the following command from your project’s root directory:
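```bash
uvicorn app.main:app --reload
```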

Open your browser to http://localhost:8000/docs and you’ll see FastAPI’s interactive documentation. Try making a prediction with the example data.

You can also test with curl:
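```bash
curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{"age": 0.038, "sex": 0.051, "bmi": 0.062, "bp": 0.022,
       "s1": -0.044, "s2": -0.035, "s3": -0.043, "s4": -0.003,
       "s5": 0.02, "s6": -0.018}'
```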

This should give you a result shaped like the following (the exact score depends on your trained model):
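```json
{
  "predicted_progression_score": 182.4,
  "interpretation": "average progression"
}
```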

Containerizing with Docker

Now let’s package everything into a Docker container. First, create requirements.txt:
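The versions below are illustrative; pin the ones you actually trained with, especially scikit-learn, since a saved model should be loaded by the same version that saved it:

```
fastapi==0.110.0
uvicorn[standard]==0.29.0
scikit-learn==1.4.2
joblib==1.3.2
numpy==1.26.4
```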

We’ve pinned specific versions to ensure consistency across environments.

Now create the Dockerfile:
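```dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so this layer is cached between code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code and the trained model into the image
COPY app/ app/
COPY model/ model/

EXPOSE 8000

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```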

The slim image keeps our container small, and --no-cache-dir prevents pip from storing cached packages, further reducing size.

Build your Docker image:
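We’re calling the image diabetes-predictor here, but any name works:

```bash
docker build -t diabetes-predictor .
```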

Run the container:
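```bash
docker run -p 8000:8000 diabetes-predictor
```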

Your API is now running in a container! Test it the same way as before.

Publishing to Docker Hub

Now that your containerized API is working locally, let’s share it with the world through Docker Hub. This makes cloud deployment seamless: most cloud platforms can pull images directly from Docker Hub.

Setting Up Docker Hub

First, you’ll need a Docker Hub account if you don’t have one:

  1. Go to hub.docker.com and sign up
  2. Choose a username you’re happy with. It’ll be part of your image URLs

Logging Into Docker Hub

From your terminal, log into Docker Hub:
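```bash
docker login
```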

You’ll be prompted for your Docker Hub username and password. Enter them carefully. This creates an authentication token that lets you push images.

Tagging Your Image

Before pushing, we need to tag our image with your Docker Hub username. Docker uses a specific naming convention:
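```bash
docker tag diabetes-predictor your-username/diabetes-predictor:v1.0
```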

Replace your-username with your actual Docker Hub username. The v1.0 is a version tag. It’s good practice to version your images so you can track changes and roll back if needed.

Let’s also create a latest tag, which many deployment platforms use by default:
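```bash
docker tag diabetes-predictor your-username/diabetes-predictor:latest
```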

Check your tagged images:
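```bash
docker images
```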

You should see three entries: your original image and the two newly tagged versions.

Pushing to Docker Hub

Now let’s push your image to Docker Hub:
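```bash
docker push your-username/diabetes-predictor:v1.0
docker push your-username/diabetes-predictor:latest
```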

The first push might take a few minutes as Docker uploads all the layers. Subsequent pushes should be substantially faster.

You can verify everything works by pulling and running your published image:
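```bash
# Stop the local container first so port 8000 is free, then:
docker pull your-username/diabetes-predictor:latest
docker run -p 8000:8000 your-username/diabetes-predictor:latest
```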

Test the API again to make sure everything still works. If it does, your model is now publicly available and ready for cloud deployment.

Wrapping Up

Congratulations! You’ve just built a complete machine learning deployment pipeline:

  • Trained a robust Random Forest model on medical data
  • Created a working REST API with FastAPI
  • Containerized the application with Docker

Your model is now ready for cloud deployment! You could deploy this to AWS ECS, Fargate, Google Cloud, or Azure.

Want to take it further? You can consider adding the following:

  • Authentication and rate limiting
  • Model monitoring and logging
  • Batch prediction endpoints

You now have all the basics to deploy any machine learning model to production. Happy coding!
