DevOps
18 min read

Setting Up a FastAPI Server That Actually Works

Skip the tutorials that break in production. Here's how to set up FastAPI properly from day one, from someone who made every mistake first.

Tibbe & Brett

I fell in love with FastAPI in 2019. Coming from years of Django and Flask, it felt like the future of Python web development had finally arrived.

Automatic API documentation? Type hints that actually do something? Async support that doesn't make you want to cry? Sign me up.

But like every shiny new framework, FastAPI tutorials get you running in 5 minutes and then abandon you when it's time to deploy to production. I learned this the hard way when my first FastAPI service fell over under real load.

Here's everything I wish someone had told me before I deployed my first FastAPI server.

Why I Switched to FastAPI (And Why You Should Too)

I spent six years building APIs in Django REST Framework. DRF is solid, but it's heavy. For simple APIs, you're pulling in an entire ORM, admin interface, templating system, and middleware stack you'll never use.

Flask felt lighter, but then you spend weeks gluing together extensions for validation, documentation, authentication, and database connections. Every project becomes a custom framework.

FastAPI hit the sweet spot: lightweight like Flask, but batteries-included for API development. Plus, the automatic OpenAPI documentation meant I could finally stop maintaining API docs by hand.

The performance didn't hurt either. FastAPI benchmarks closer to Node.js and Go than traditional Python frameworks. For APIs that actually get used, this matters.

The Setup That Doesn't Lie to You

Most FastAPI tutorials show you this:

from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def read_root():
    return {"Hello": "World"}

# Then: uvicorn main:app --reload

This works great for demos. It's catastrophic for production.

Here's what a real FastAPI project structure looks like:

your-api/
├── app/
│   ├── __init__.py
│   ├── main.py              # Application factory
│   ├── config.py            # Settings management
│   ├── dependencies.py      # Dependency injection
│   ├── models/
│   │   ├── __init__.py
│   │   └── user.py          # Pydantic models
│   ├── routers/
│   │   ├── __init__.py
│   │   ├── auth.py          # Authentication endpoints
│   │   └── users.py         # User management endpoints
│   ├── services/
│   │   ├── __init__.py
│   │   └── user_service.py  # Business logic
│   └── database/
│       ├── __init__.py
│       └── connection.py    # Database setup
├── requirements.txt
├── requirements-dev.txt     # Development dependencies
├── docker-compose.yml
├── Dockerfile
├── gunicorn.conf.py         # Production server config
└── alembic/                 # Database migrations
    └── versions/

I know it looks like overkill for a simple API. Trust me on this one. I've refactored too many "simple" FastAPI apps that grew into unmaintainable messes.

gunicorn.conf.py: The File That Saves Your Sleep

Every FastAPI tutorial tells you to use uvicorn main:app --reload. That's great for development. For production, it's a recipe for 2 AM phone calls.

Uvicorn is a single-process server. If it crashes, your entire API goes down. If you get a traffic spike, it can't scale. If you have a memory leak, it'll eventually consume all available RAM.

Gunicorn with Uvicorn workers gives you process-based concurrency and automatic restarts:

# gunicorn.conf.py
import multiprocessing

# Server socket
bind = "0.0.0.0:8000"
backlog = 2048

# Worker processes
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = "uvicorn.workers.UvicornWorker"
worker_connections = 1000
max_requests = 1000          # Restart workers after 1000 requests
max_requests_jitter = 50     # Add some randomness
preload_app = True           # Load application before forking workers

# Timeouts
timeout = 30                 # Worker timeout
keepalive = 2               # Keep-alive connections

# Logging
accesslog = "-"             # Log to stdout
errorlog = "-"              # Log to stderr
loglevel = "info"
access_log_format = '%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s" %(D)s'

# Process naming
proc_name = 'fastapi-server'

# Server mechanics
daemon = False
pidfile = '/tmp/gunicorn.pid'
user = None
group = None
tmp_upload_dir = None

# SSL (if needed)
# keyfile = "/path/to/key.pem"
# certfile = "/path/to/cert.pem"

Why these settings matter:

  • max_requests: Prevents memory leaks from killing workers
  • preload_app: Saves memory when you have multiple workers
  • timeout = 30: Your database queries will sometimes be slow
  • workers = cpu_count * 2 + 1: Good starting point for I/O-heavy workloads

I learned these values through painful trial and error. Start with this configuration and adjust based on your specific load patterns.

Configuration That Doesn't Suck

FastAPI tutorials usually stuff configuration into global variables or environment variables loaded with os.getenv(). This works until you need different settings for development, testing, and production.

Use Pydantic's BaseSettings. It's not just nice to have - it's essential:

# app/config.py
from pydantic_settings import BaseSettings
from typing import Optional

class Settings(BaseSettings):
    # Database
    database_url: str
    database_pool_size: int = 20
    database_pool_overflow: int = 0
    
    # Security
    secret_key: str
    access_token_expire_minutes: int = 30
    refresh_token_expire_minutes: int = 60 * 24 * 7  # 7 days
    
    # CORS
    cors_origins: list[str] = ["http://localhost:3000"]
    
    # External APIs
    stripe_secret_key: Optional[str] = None
    sendgrid_api_key: Optional[str] = None
    
    # Application
    debug: bool = False
    app_name: str = "FastAPI App"
    app_version: str = "1.0.0"
    
    # Logging
    log_level: str = "INFO"
    
    # Redis
    redis_url: Optional[str] = None
    
    class Config:
        env_file = ".env"
        case_sensitive = False

# Global settings instance
settings = Settings()

This gives you:

  • Type validation: Pydantic ensures your config values are the right type
  • Environment variables: Automatically loads from .env files
  • Defaults: Sensible fallbacks for optional settings
  • IDE support: Full autocomplete and type checking

I can't tell you how many production issues I've prevented just by having proper configuration validation.

Application Factory Pattern

Don't create your FastAPI app at the module level. Use the application factory pattern:

# app/main.py
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.middleware.trustedhost import TrustedHostMiddleware
import logging

from .config import settings
from .routers import users, auth
from .database import engine

def create_application() -> FastAPI:
    application = FastAPI(
        title=settings.app_name,
        version=settings.app_version,
        debug=settings.debug,
        docs_url="/docs" if settings.debug else None,  # Disable docs in production
        redoc_url="/redoc" if settings.debug else None,
    )
    
    # Security middleware
    application.add_middleware(
        TrustedHostMiddleware, 
        allowed_hosts=["localhost", "*.yourdomain.com"]
    )
    
    # CORS middleware
    application.add_middleware(
        CORSMiddleware,
        allow_origins=settings.cors_origins,
        allow_credentials=True,
        allow_methods=["*"],
        allow_headers=["*"],
    )
    
    # Include routers
    application.include_router(users.router, prefix="/api/v1", tags=["users"])
    application.include_router(auth.router, prefix="/api/v1", tags=["auth"])
    
    # Health check endpoint
    @application.get("/health")
    async def health_check():
        return {
            "status": "healthy",
            "version": settings.app_version,
            "environment": "production" if not settings.debug else "development"
        }
    
    return application

app = create_application()

This pattern makes testing easier, allows for different configurations per environment, and keeps your app initialization clean and predictable.

Database Connections That Don't Leak

FastAPI works great with async ORMs like SQLAlchemy 2.0. But connection management in async Python is tricky. Get it wrong, and you'll leak connections until your database refuses new requests.

Here's the database setup I use in production:

# app/database/connection.py
from sqlalchemy.ext.asyncio import create_async_engine, async_sessionmaker, AsyncSession
from contextlib import asynccontextmanager

from ..config import settings

# Create engine with connection pooling
engine = create_async_engine(
    settings.database_url,
    pool_size=settings.database_pool_size,
    max_overflow=settings.database_pool_overflow,  # the engine kwarg is max_overflow
    pool_pre_ping=True,      # Validate connections before use
    pool_recycle=3600,       # Recycle connections after 1 hour
    echo=settings.debug,     # Log SQL queries in development
)

# Session factory (async_sessionmaker is the 2.0-style factory for AsyncSession)
async_session = async_sessionmaker(
    engine,
    expire_on_commit=False
)

@asynccontextmanager
async def get_session():
    async with async_session() as session:
        try:
            yield session
            await session.commit()
        except Exception:
            await session.rollback()
            raise
        finally:
            await session.close()

Key points:

  • pool_pre_ping: Tests connections before using them (prevents stale connection errors)
  • pool_recycle: Prevents connection timeout issues with long-running processes
  • Context manager: Ensures connections are always properly closed
  • Automatic rollback: On exceptions, changes are rolled back automatically

Dependency Injection Done Right

FastAPI's dependency injection is one of its best features. But most examples show simple cases. Here's how to handle complex dependencies:

# app/dependencies.py
from typing import AsyncGenerator

from fastapi import Depends, HTTPException, status
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from sqlalchemy.ext.asyncio import AsyncSession
import jwt  # PyJWT

from .database.connection import get_session
from .models.user import User  # adjust to wherever your User model actually lives
from .config import settings

security = HTTPBearer()

async def get_db_session() -> AsyncGenerator[AsyncSession, None]:
    async with get_session() as session:
        yield session

async def get_current_user(
    token: HTTPAuthorizationCredentials = Depends(security),
    db: AsyncSession = Depends(get_db_session)
):
    try:
        payload = jwt.decode(token.credentials, settings.secret_key, algorithms=["HS256"])
        user_id = payload.get("sub")  # JWT "sub" claims are strings
        if user_id is None:
            raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED)
    except jwt.PyJWTError:
        raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED)
    
    # Fetch user from database
    user = await db.get(User, int(user_id))
    if user is None:
        raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED)
    
    return user

# Role-based dependencies
async def get_admin_user(
    current_user: User = Depends(get_current_user)
):
    if not current_user.is_admin:
        raise HTTPException(
            status_code=status.HTTP_403_FORBIDDEN,
            detail="Admin access required"
        )
    return current_user

Now your endpoints can declare exactly what they need:

@router.get("/users/me")
async def get_current_user_info(
    current_user: User = Depends(get_current_user)
):
    return current_user

@router.delete("/users/{user_id}")
async def delete_user(
    user_id: int,
    admin_user: User = Depends(get_admin_user),
    db: AsyncSession = Depends(get_db_session)
):
    # Only admins can delete users
    user = await db.get(User, user_id)
    await db.delete(user)
    return {"message": "User deleted"}
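The decode path above assumes something minted the token in the first place. Here's a hedged sketch of the matching encode side using PyJWT; create_access_token is my helper name, and in the real app the secret would come from settings.secret_key:

```python
from datetime import datetime, timedelta, timezone

import jwt  # PyJWT

SECRET_KEY = "change-me"  # illustrative; load from settings.secret_key in practice


def create_access_token(user_id: int, expires_minutes: int = 30) -> str:
    # "sub" must be a string per RFC 7519, which is why the decode side
    # converts it back to an int before the database lookup.
    payload = {
        "sub": str(user_id),
        "exp": datetime.now(timezone.utc) + timedelta(minutes=expires_minutes),
    }
    return jwt.encode(payload, SECRET_KEY, algorithm="HS256")
```

PyJWT validates the exp claim automatically on decode, so expired tokens fall into the PyJWTError branch of get_current_user.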

Error Handling That Actually Helps

FastAPI's default error responses are developer-friendly but terrible for production APIs. Users don't need stack traces and internal error details.

Create custom exception handlers:

# In your main.py
from fastapi import Request, HTTPException
from fastapi.responses import JSONResponse
import logging

logger = logging.getLogger(__name__)

@app.exception_handler(HTTPException)
async def http_exception_handler(request: Request, exc: HTTPException):
    return JSONResponse(
        status_code=exc.status_code,
        content={
            "error": {
                "code": exc.status_code,
                "message": exc.detail,
                "type": "http_error"
            }
        }
    )

@app.exception_handler(Exception)
async def general_exception_handler(request: Request, exc: Exception):
    logger.error(f"Unhandled exception: {exc}", exc_info=True)
    return JSONResponse(
        status_code=500,
        content={
            "error": {
                "code": 500,
                "message": "Internal server error",
                "type": "server_error"
            }
        }
    )

The Docker Configuration That Works

Most FastAPI Docker tutorials create massive images and run as root. Here's a production-ready Dockerfile:

# Multi-stage build for smaller images
FROM python:3.11-slim as builder

WORKDIR /app

# Install system dependencies needed for building
RUN apt-get update && apt-get install -y \
    build-essential \
    gcc \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements and install Python dependencies
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

# Production stage
FROM python:3.11-slim

# Create non-root user
RUN groupadd -r appuser && useradd -r -g appuser appuser

WORKDIR /app

# Copy Python dependencies from builder stage
COPY --from=builder /root/.local /home/appuser/.local
ENV PATH=/home/appuser/.local/bin:$PATH

# Copy application code
COPY --chown=appuser:appuser . .

# Switch to non-root user
USER appuser

# Expose port
EXPOSE 8000

# Health check (python:3.11-slim ships without curl, so use the stdlib instead)
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" || exit 1

# Run with Gunicorn
CMD ["gunicorn", "app.main:app", "--config", "gunicorn.conf.py"]

Logging That Saves Your Sanity

Structured logging is critical for FastAPI applications. You need to correlate requests across multiple services and debug issues in production.

# app/logging_config.py
import logging
import sys
from pythonjsonlogger import jsonlogger  # pip install python-json-logger

def setup_logging():
    # Create formatter
    formatter = jsonlogger.JsonFormatter(
        fmt='%(asctime)s %(name)s %(levelname)s %(message)s',
        datefmt='%Y-%m-%d %H:%M:%S'
    )
    
    # Configure root logger
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)
    
    # Console handler
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(formatter)
    logger.addHandler(handler)
    
    # Suppress noisy loggers
    logging.getLogger("sqlalchemy.engine").setLevel(logging.WARNING)
    logging.getLogger("uvicorn.access").setLevel(logging.WARNING)

Then in your main.py startup (newer FastAPI versions prefer a lifespan handler, but on_event still works):

import logging

from .logging_config import setup_logging

logger = logging.getLogger(__name__)

@app.on_event("startup")
async def startup_event():
    setup_logging()
    logger.info("Application startup complete")
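To actually correlate requests across services, stamp every log record with a per-request ID. A stdlib-only sketch using contextvars and a logging filter; request_id_var and handle_request are illustrative names, and in the real app an ASGI middleware would set the variable before calling the endpoint:

```python
import contextvars
import logging
import uuid

# Holds the current request's ID. contextvars are task-local, so
# concurrent async requests don't clobber each other's IDs.
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id_var.get()
        return True  # never drops records, only annotates them


def handle_request(message: str) -> str:
    # Stand-in for middleware: set the ID, do work, log along the way.
    request_id_var.set(uuid.uuid4().hex)
    logging.getLogger("app").info(message)
    return request_id_var.get()
```

Add `%(request_id)s` to the JSON formatter fields and attach the filter with `logger.addFilter(RequestIdFilter())`, and every log line from a request carries the same ID.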

Testing That Actually Tests Things

FastAPI makes testing easy with its test client, but most examples only test happy paths. Here's how I structure comprehensive API tests:

# tests/conftest.py
import pytest
import pytest_asyncio  # async fixtures need the pytest-asyncio plugin
from fastapi.testclient import TestClient
from sqlalchemy.ext.asyncio import create_async_engine, async_sessionmaker

from app.main import create_application
from app.dependencies import get_db_session
from app.database.models import Base  # adjust to wherever your declarative Base lives
from app.config import Settings

@pytest.fixture
def test_settings():
    return Settings(
        database_url="sqlite+aiosqlite:///:memory:",
        secret_key="test-secret-key",
        debug=True
    )

@pytest_asyncio.fixture
async def test_db():
    engine = create_async_engine("sqlite+aiosqlite:///:memory:")
    async_session = async_sessionmaker(engine, expire_on_commit=False)
    
    async with engine.begin() as conn:
        await conn.run_sync(Base.metadata.create_all)
    
    yield async_session
    
    await engine.dispose()

@pytest.fixture
def client(test_settings, test_db):
    app = create_application()
    
    async def override_get_db_session():
        async with test_db() as session:
            yield session
    
    app.dependency_overrides[get_db_session] = override_get_db_session
    
    return TestClient(app)

Performance Monitoring

FastAPI is fast, but you still need to monitor performance in production. Add request timing middleware:

# app/middleware.py
import time
import logging
from fastapi import Request

logger = logging.getLogger(__name__)

# Note: `app` doesn't exist in this module. Register this inside
# create_application() with: application.middleware("http")(add_process_time_header)
async def add_process_time_header(request: Request, call_next):
    start_time = time.perf_counter()
    
    response = await call_next(request)
    
    process_time = time.perf_counter() - start_time
    response.headers["X-Process-Time"] = str(process_time)
    
    # Log slow requests
    if process_time > 1.0:  # Requests taking more than 1 second
        logger.warning(
            f"Slow request: {request.method} {request.url} took {process_time:.2f}s"
        )
    
    return response

My FastAPI Production Checklist

Before I deploy any FastAPI service, I go through this checklist:

Configuration:

  • ✅ Using Pydantic Settings with proper validation
  • ✅ Environment-specific configuration files
  • ✅ Secrets stored in environment variables, not code

Database:

  • ✅ Connection pooling configured
  • ✅ Migrations set up (Alembic)
  • ✅ Database indexes on frequently queried columns

Security:

  • ✅ HTTPS enforced
  • ✅ CORS properly configured
  • ✅ Input validation on all endpoints
  • ✅ Rate limiting (with slowapi or similar)

Monitoring:

  • ✅ Health check endpoints
  • ✅ Structured logging
  • ✅ Performance monitoring
  • ✅ Error tracking (Sentry or similar)

Deployment:

  • ✅ Running with Gunicorn, not Uvicorn directly
  • ✅ Docker image built with non-root user
  • ✅ Resource limits set (memory, CPU)

What I Learned the Hard Way

After three years of running FastAPI in production, here are the lessons that cost me the most time:

Async doesn't mean faster. If you're not doing I/O-heavy operations, sync code might actually be faster. Don't make everything async just because you can.

Connection pooling is critical. I've seen FastAPI apps that create new database connections for every request. This kills performance under load.

Validate everything. Pydantic models are great, but validate at the service layer too. Users will send you data that passes schema validation but breaks your business logic.

Monitor from day one. FastAPI apps fail in subtle ways. Set up proper logging and monitoring before you need it.

Test error cases. Your API will receive malformed JSON, missing headers, and invalid authentication tokens. Test these scenarios.
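A quick sketch of that validate-everything lesson: the request below passes Pydantic's schema checks but can still violate business rules, which the service layer has to catch (TransferRequest and validate_transfer are illustrative names, not from the project above):

```python
from pydantic import BaseModel


class TransferRequest(BaseModel):
    # Schema validation: right types, required fields. That's all
    # Pydantic can promise.
    from_account: str
    to_account: str
    amount_cents: int


def validate_transfer(req: TransferRequest, balance_cents: int) -> None:
    # Business rules the schema can't express
    if req.from_account == req.to_account:
        raise ValueError("cannot transfer to the same account")
    if req.amount_cents <= 0:
        raise ValueError("amount must be positive")
    if req.amount_cents > balance_cents:
        raise ValueError("insufficient funds")
```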

Is FastAPI Ready for Production?

Absolutely. I've run FastAPI services handling millions of requests per day. The performance is excellent, the developer experience is fantastic, and the ecosystem is mature.

But like any framework, FastAPI requires production-grade configuration and operational practices. Don't let the simple tutorials fool you - there's a big difference between a demo API and a production service.

Take the time to set things up properly from the beginning. Your future self (and your users) will thank you when your API stays up under load and you're not debugging production issues at 2 AM.

FastAPI is the future of Python web APIs. Just make sure you deploy it like it matters.

Ready to build production-grade FastAPI services?

We help companies architect and deploy FastAPI applications that scale. From proper async patterns to production deployment pipelines.
