Module 7 · 14 min read

07 - Advanced Validation and Serialization

Pydantic patterns for endpoints: discriminated unions, nested models, custom serializers, and response models.

#pydantic #validation #serialization #fastapi #api-design

1. Response Models: Controlling the Output

FastAPI uses response_model to:

  1. Filter fields: only what the model declares is returned
  2. Validate output: the response is guaranteed to honor the contract
  3. Document OpenAPI: the schema is generated automatically

from datetime import datetime

from fastapi import FastAPI
from pydantic import BaseModel, EmailStr

app = FastAPI()

# Internal model (includes sensitive fields)
class UserInDB(BaseModel):
    id: int
    email: EmailStr
    hashed_password: str  # Never expose this!
    created_at: datetime
    is_admin: bool

# Response model (public fields only)
class UserResponse(BaseModel):
    id: int
    email: EmailStr
    created_at: datetime

@app.get("/users/{user_id}", response_model=UserResponse)
async def get_user(user_id: int) -> UserInDB:
    user = await db.get(UserInDB, user_id)  # db: data-access layer assumed to exist
    return user  # FastAPI automatically filters out hashed_password and is_admin
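The filtering effect can be approximated outside FastAPI with plain Pydantic. The sketch below is an assumption about the observable result, not FastAPI's internal code: it re-validates the internal object's data against the response model, which drops every field the response model does not declare. Emails are typed as str here only to avoid the optional email-validator dependency.

```python
from datetime import datetime, timezone

from pydantic import BaseModel


class UserInDB(BaseModel):
    id: int
    email: str  # str instead of EmailStr to keep the sketch dependency-free
    hashed_password: str
    created_at: datetime
    is_admin: bool


class UserResponse(BaseModel):
    id: int
    email: str
    created_at: datetime


stored = UserInDB(
    id=1,
    email="ana@example.com",
    hashed_password="not-a-real-hash",
    created_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
    is_admin=True,
)

# Pydantic ignores extra keys by default, so only declared fields survive.
public = UserResponse.model_validate(stored.model_dump())
public_payload = public.model_dump(mode="json")
```

The resulting payload contains only id, email, and created_at, which is the guarantee response_model gives you at the endpoint level.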

response_model_exclude and response_model_include

@app.get(
    "/users/{user_id}/admin",
    response_model=UserInDB,
    response_model_exclude={"hashed_password"},  # Exclude specific fields
)
async def get_user_admin(user_id: int) -> UserInDB:
    return await db.get(UserInDB, user_id)

@app.get(
    "/users/{user_id}/minimal",
    response_model=UserInDB,
    response_model_include={"id", "email"},  # Only these fields
)
async def get_user_minimal(user_id: int) -> UserInDB:
    return await db.get(UserInDB, user_id)
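These two options behave much like the include and exclude arguments of Pydantic's model_dump. A minimal sketch of that equivalent, using a hypothetical Account model that is not part of the original examples:

```python
from pydantic import BaseModel


class Account(BaseModel):  # hypothetical model, for illustration only
    id: int
    email: str
    hashed_password: str
    is_admin: bool


account = Account(id=7, email="ana@example.com", hashed_password="x", is_admin=False)

# Same idea as response_model_exclude={"hashed_password"}
without_secret = account.model_dump(exclude={"hashed_password"})

# Same idea as response_model_include={"id", "email"}
minimal = account.model_dump(include={"id", "email"})
```

Note that include/exclude do not change the generated OpenAPI schema, so a dedicated response model usually documents the contract more accurately.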

2. Nested Models and Relationships

from pydantic import BaseModel, EmailStr, Field

class Address(BaseModel):
    street: str
    city: str
    country: str = "ES"
    postal_code: str = Field(pattern=r"^\d{5}$")

class Company(BaseModel):
    name: str
    tax_id: str
    address: Address

class UserWithCompany(BaseModel):
    id: int
    email: EmailStr
    company: Company | None = None
    addresses: list[Address] = Field(default_factory=list, max_length=5)

# The request body accepts the nested structure
@app.post("/users", response_model=UserWithCompany)
async def create_user(user: UserWithCompany) -> UserWithCompany:
    # user.company.address.city is fully typed (when company is not None)
    return await user_service.create(user)

Deep Validation with model_validator

from pydantic import BaseModel, model_validator

# OrderItem is assumed to be defined elsewhere; Address is the model above
class OrderCreate(BaseModel):
    items: list[OrderItem]
    shipping_address: Address
    billing_address: Address | None = None
    use_shipping_for_billing: bool = False

    @model_validator(mode='after')
    def validate_addresses(self) -> 'OrderCreate':
        if not self.use_shipping_for_billing and not self.billing_address:
            raise ValueError('billing_address required when use_shipping_for_billing is False')

        if self.use_shipping_for_billing:
            self.billing_address = self.shipping_address

        return self

3. Discriminated Unions in APIs

For endpoints that accept several payload types:

from typing import Literal, Annotated, Union
from pydantic import BaseModel, EmailStr, Field

# Different notification types
class EmailNotification(BaseModel):
    type: Literal['email'] = 'email'
    to: EmailStr
    subject: str
    body: str
    cc: list[EmailStr] = []

class SMSNotification(BaseModel):
    type: Literal['sms'] = 'sms'
    phone: str = Field(pattern=r'^\+\d{10,15}$')
    message: str = Field(max_length=160)

class PushNotification(BaseModel):
    type: Literal['push'] = 'push'
    device_token: str
    title: str = Field(max_length=50)
    body: str = Field(max_length=200)
    data: dict[str, str] = {}

# Union discriminated by the 'type' field
Notification = Annotated[
    Union[EmailNotification, SMSNotification, PushNotification],
    Field(discriminator='type')
]

@app.post("/notifications")
async def send_notification(notification: Notification) -> dict:
    # Pydantic parses the payload into the matching type automatically
    match notification:
        case EmailNotification():
            return await email_service.send(notification)
        case SMSNotification():
            return await sms_service.send(notification)
        case PushNotification():
            return await push_service.send(notification)
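The same union can be exercised without a running server through Pydantic's TypeAdapter. This is a minimal sketch with only two of the three variants (a simplification for brevity); it shows that the discriminator selects the concrete class and that an unknown tag is rejected with a dedicated error type.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter, ValidationError


class SMSNotification(BaseModel):
    type: Literal["sms"] = "sms"
    phone: str = Field(pattern=r"^\+\d{10,15}$")
    message: str = Field(max_length=160)


class PushNotification(BaseModel):
    type: Literal["push"] = "push"
    device_token: str
    title: str = Field(max_length=50)


Notification = Annotated[
    Union[SMSNotification, PushNotification],
    Field(discriminator="type"),
]

adapter = TypeAdapter(Notification)

# The "type" value alone decides which model validates the payload.
sms = adapter.validate_python({"type": "sms", "phone": "+34600111222", "message": "Hola"})

# An unknown tag fails fast instead of being tried against every variant.
try:
    adapter.validate_python({"type": "fax", "number": "123"})
    unknown_tag_error = None
except ValidationError as exc:
    unknown_tag_error = exc.errors()[0]["type"]
```

Because only the selected variant is validated, error messages point at the intended model rather than listing failures for every member of the union.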

4. Custom Serialization

Aliases for camelCase APIs

from datetime import datetime

from pydantic import BaseModel, ConfigDict, Field
from pydantic.alias_generators import to_camel

class UserResponse(BaseModel):
    model_config = ConfigDict(
        alias_generator=to_camel,
        populate_by_name=True,  # Accepts both snake_case and camelCase
    )

    user_id: int
    first_name: str
    last_name: str
    created_at: datetime

# JSON output from FastAPI (responses are serialized by alias by default):
# {"userId": 1, "firstName": "John", "lastName": "Doe", "createdAt": "..."}
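Outside FastAPI, aliases are only used on output when you ask for them. The sketch below uses a hypothetical two-field UserOut model to show both input spellings being accepted and the difference between model_dump() and model_dump(by_alias=True).

```python
from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel


class UserOut(BaseModel):  # hypothetical name, for illustration only
    model_config = ConfigDict(alias_generator=to_camel, populate_by_name=True)

    user_id: int
    first_name: str


# Both spellings are accepted on input thanks to populate_by_name=True.
from_camel = UserOut.model_validate({"userId": 1, "firstName": "John"})
from_snake = UserOut(user_id=1, first_name="John")

# model_dump() uses field names unless by_alias=True is passed.
snake_dump = from_camel.model_dump()
camel_dump = from_camel.model_dump(by_alias=True)
```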

Custom Serializers

from decimal import Decimal

from pydantic import BaseModel, field_serializer, model_serializer

class Product(BaseModel):
    name: str
    price: Decimal
    tags: set[str]

    @field_serializer('price')
    def serialize_price(self, value: Decimal) -> str:
        return f"${value:.2f}"

    @field_serializer('tags')
    def serialize_tags(self, value: set[str]) -> list[str]:
        return sorted(value)  # Set → sorted list

# model_dump(): {"name": "...", "price": "$29.99", "tags": ["a", "b"]}
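A serializer defined this way also runs for model_dump() in Python mode, which may not be what internal callers want. One option is the when_used argument of field_serializer, sketched here with a hypothetical PricedItem model: the formatted string appears only in JSON-mode output, while Python-mode dumps keep the Decimal.

```python
from decimal import Decimal

from pydantic import BaseModel, field_serializer


class PricedItem(BaseModel):  # hypothetical model, for illustration only
    name: str
    price: Decimal

    # Apply the formatting only when serializing to JSON.
    @field_serializer("price", when_used="json")
    def serialize_price(self, value: Decimal) -> str:
        return f"${value:.2f}"


item = PricedItem(name="Mug", price=Decimal("29.990"))

python_dump = item.model_dump()             # price stays a Decimal
json_dump = item.model_dump(mode="json")    # price becomes a formatted string
```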

Conditional Serialization

class UserProfile(BaseModel):
    id: int
    email: EmailStr
    phone: str | None = None
    internal_notes: str | None = None  # Admins only

    @model_serializer(mode='wrap')
    def serialize(self, handler, info) -> dict:
        data = handler(self)

        # Drop internal_notes unless the serialization context marks an admin
        if not info.context or not info.context.get('is_admin'):
            data.pop('internal_notes', None)

        return data

# Usage (the context argument requires Pydantic 2.7 or later)
profile = UserProfile(id=1, email="a@b.com", internal_notes="VIP customer")
profile.model_dump()  # Without internal_notes
profile.model_dump(context={'is_admin': True})  # With internal_notes

5. Generic Pagination

from typing import Generic, TypeVar, Sequence

from fastapi import Query
from pydantic import BaseModel, computed_field

T = TypeVar('T')

class PaginatedResponse(BaseModel, Generic[T]):
    items: Sequence[T]
    total: int
    page: int
    page_size: int

    @computed_field
    @property
    def total_pages(self) -> int:
        # Ceiling division without floats
        return (self.total + self.page_size - 1) // self.page_size

    @computed_field
    @property
    def has_next(self) -> bool:
        return self.page < self.total_pages

    @computed_field
    @property
    def has_prev(self) -> bool:
        return self.page > 1

# Typed endpoint (user_repo is assumed to be defined elsewhere)
@app.get("/users", response_model=PaginatedResponse[UserResponse])
async def list_users(
    page: int = Query(1, ge=1),
    page_size: int = Query(20, ge=1, le=100),
) -> PaginatedResponse[UserResponse]:
    users = await user_repo.list(page=page, page_size=page_size)
    total = await user_repo.count()

    return PaginatedResponse(
        items=users,
        total=total,
        page=page,
        page_size=page_size,
    )
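A quick worked example makes the computed fields concrete. The sketch below uses a shortened generic model named Page (a hypothetical stand-in for PaginatedResponse) parametrized with int: with 45 total items and 20 per page, the ceiling division yields (45 + 19) // 20 = 3 pages, so page 2 still has a next page.

```python
from typing import Generic, Sequence, TypeVar

from pydantic import BaseModel, computed_field

T = TypeVar("T")


class Page(BaseModel, Generic[T]):  # hypothetical, trimmed PaginatedResponse
    items: Sequence[T]
    total: int
    page: int
    page_size: int

    @computed_field
    @property
    def total_pages(self) -> int:
        return (self.total + self.page_size - 1) // self.page_size

    @computed_field
    @property
    def has_next(self) -> bool:
        return self.page < self.total_pages


# Parametrizing with int makes Pydantic validate every item as an int.
second_page = Page[int](items=[21, 22, 23], total=45, page=2, page_size=20)

# Computed fields are included in the serialized output.
dumped = second_page.model_dump()
```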

6. Validating Query Parameters

from datetime import datetime
from enum import Enum
from typing import Annotated, Literal

from fastapi import Depends, HTTPException, Query
from pydantic import BaseModel, Field, ValidationError, model_validator

class SortOrder(str, Enum):
    asc = "asc"
    desc = "desc"

class UserFilters(BaseModel):
    """Filters as a Pydantic model instead of individual parameters."""
    search: str | None = Field(None, min_length=2, max_length=100)
    role: Literal['admin', 'user', 'guest'] | None = None
    is_active: bool | None = None
    created_after: datetime | None = None
    created_before: datetime | None = None
    sort_by: str = Field('created_at', pattern=r'^[a-z_]+$')
    sort_order: SortOrder = SortOrder.desc

    @model_validator(mode='after')
    def validate_date_range(self) -> 'UserFilters':
        if self.created_after and self.created_before:
            if self.created_after >= self.created_before:
                raise ValueError('created_after must be before created_before')
        return self

# Dependency that parses query params into the model
async def parse_user_filters(
    search: Annotated[str | None, Query(min_length=2, max_length=100)] = None,
    role: Annotated[Literal['admin', 'user', 'guest'] | None, Query()] = None,
    is_active: Annotated[bool | None, Query()] = None,
    created_after: Annotated[datetime | None, Query()] = None,
    created_before: Annotated[datetime | None, Query()] = None,
    sort_by: Annotated[str, Query(pattern=r'^[a-z_]+$')] = 'created_at',
    sort_order: Annotated[SortOrder, Query()] = SortOrder.desc,
) -> UserFilters:
    try:
        return UserFilters(
            search=search,
            role=role,
            is_active=is_active,
            created_after=created_after,
            created_before=created_before,
            sort_by=sort_by,
            sort_order=sort_order,
        )
    except ValidationError as exc:
        # A ValidationError raised by hand-built models is not converted to a 422 automatically
        raise HTTPException(422, detail=[err["msg"] for err in exc.errors()])

@app.get("/users")
async def list_users(
    filters: Annotated[UserFilters, Depends(parse_user_filters)],
    pagination: Annotated[PaginationParams, Depends()],  # PaginationParams is assumed to be defined elsewhere
):
    return await user_service.list(filters=filters, **pagination.__dict__)

7. File Uploads with Validation

from pathlib import Path

from fastapi import File, Form, HTTPException, UploadFile
from pydantic import BaseModel, ValidationError, field_validator

ALLOWED_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.gif'}
MAX_FILE_SIZE = 5 * 1024 * 1024  # 5 MB

class ImageUpload(BaseModel):
    """Model that validates the file's metadata."""
    filename: str
    content_type: str
    size: int

    @field_validator('filename')
    @classmethod
    def validate_extension(cls, v: str) -> str:
        ext = Path(v).suffix.lower()
        if ext not in ALLOWED_EXTENSIONS:
            raise ValueError(f'Extension {ext} not allowed')
        return v

    @field_validator('content_type')
    @classmethod
    def validate_content_type(cls, v: str) -> str:
        if not v.startswith('image/'):
            raise ValueError('Must be an image')
        return v

    @field_validator('size')
    @classmethod
    def validate_size(cls, v: int) -> int:
        if v > MAX_FILE_SIZE:
            raise ValueError(f'File too large (max {MAX_FILE_SIZE // 1024 // 1024}MB)')
        return v

@app.post("/upload")
async def upload_image(
    file: UploadFile = File(...),
    description: str | None = Form(None, max_length=500),
):
    # Note: this reads the whole file into memory before the size check
    content = await file.read()
    try:
        # Validate the metadata with Pydantic
        metadata = ImageUpload(
            filename=file.filename or "unknown",
            content_type=file.content_type or "application/octet-stream",
            size=len(content),
        )
    except ValidationError as exc:
        # Without this, the error would surface as a 500 instead of a 422
        raise HTTPException(422, detail=[err["msg"] for err in exc.errors()])

    # Process the valid file (storage is assumed to be defined elsewhere)
    path = await storage.save(content, metadata.filename)
    return {"path": path, "size": metadata.size}
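Because the checks live in a Pydantic model, they can be verified without uploading anything. This is a reduced sketch (the content_type field is omitted, and ImageMeta and failing_fields are hypothetical names) that reports which fields failed for a given filename and size.

```python
from pathlib import Path

from pydantic import BaseModel, ValidationError, field_validator

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}
MAX_FILE_SIZE = 5 * 1024 * 1024  # 5 MB


class ImageMeta(BaseModel):  # hypothetical, reduced version of ImageUpload
    filename: str
    size: int

    @field_validator("filename")
    @classmethod
    def validate_extension(cls, v: str) -> str:
        if Path(v).suffix.lower() not in ALLOWED_EXTENSIONS:
            raise ValueError("extension not allowed")
        return v

    @field_validator("size")
    @classmethod
    def validate_size(cls, v: int) -> int:
        if v > MAX_FILE_SIZE:
            raise ValueError("file too large")
        return v


def failing_fields(filename: str, size: int) -> list[str]:
    """Return the names of the fields that failed validation, in field order."""
    try:
        ImageMeta(filename=filename, size=size)
    except ValidationError as exc:
        return [str(err["loc"][0]) for err in exc.errors()]
    return []
```

Keep in mind that extension and content type are both client-supplied; inspecting the file's actual bytes is a separate concern that this sketch does not cover.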

8. Partial Updates (PATCH)

from fastapi import HTTPException
from pydantic import BaseModel, EmailStr

class UserUpdate(BaseModel):
    """Every field is optional for PATCH."""
    email: EmailStr | None = None
    first_name: str | None = None
    last_name: str | None = None
    is_active: bool | None = None

@app.patch("/users/{user_id}")
async def update_user(
    user_id: int,
    updates: UserUpdate,
) -> UserResponse:
    # Only the fields the client actually sent (an explicit null is kept)
    update_data = updates.model_dump(exclude_unset=True)

    if not update_data:
        raise HTTPException(400, "No fields to update")

    return await user_service.update(user_id, update_data)

Difference between exclude_unset and exclude_none

data = UserUpdate(email="new@email.com", first_name=None)

data.model_dump()
# {'email': 'new@email.com', 'first_name': None, 'last_name': None, 'is_active': None}

data.model_dump(exclude_unset=True)
# {'email': 'new@email.com', 'first_name': None}  ← first_name was sent explicitly

data.model_dump(exclude_none=True)
# {'email': 'new@email.com'}  ← Drops every None

9. Validating Headers and Cookies

from typing import Annotated

from fastapi import Cookie, Depends, Header
from pydantic import BaseModel

class RequestContext(BaseModel):
    """Context extracted from the request headers."""
    request_id: str
    user_agent: str
    accept_language: str
    trace_id: str | None = None

# Header() maps x_request_id to the X-Request-Id header (underscores become hyphens)
async def get_request_context(
    x_request_id: Annotated[str, Header()],
    user_agent: Annotated[str, Header()],
    accept_language: Annotated[str, Header()] = "en",
    x_trace_id: Annotated[str | None, Header()] = None,
) -> RequestContext:
    return RequestContext(
        request_id=x_request_id,
        user_agent=user_agent,
        accept_language=accept_language,
        trace_id=x_trace_id,
    )

@app.get("/context")
async def get_context(
    ctx: Annotated[RequestContext, Depends(get_request_context)],
    session_id: Annotated[str | None, Cookie()] = None,
):
    return {"context": ctx.model_dump(), "session_id": session_id}

10. OpenAPI Schema Customization

from typing import Literal

from pydantic import BaseModel, ConfigDict, Field

# OrderItem is assumed to be defined elsewhere
class CreateOrderRequest(BaseModel):
    """
    Request to create a new order.

    The system checks available inventory before confirming.
    """
    model_config = ConfigDict(
        json_schema_extra={
            "examples": [
                {
                    "customer_id": 123,
                    "items": [{"product_id": 1, "quantity": 2}],
                    "notes": "Gift wrap please",
                }
            ]
        }
    )

    customer_id: int = Field(
        ...,
        description="ID of the customer placing the order",
        examples=[123, 456],
    )
    items: list[OrderItem] = Field(
        ...,
        min_length=1,
        max_length=50,
        description="List of products to order",
    )
    notes: str | None = Field(
        None,
        max_length=500,
        description="Additional shipping notes",
    )
    priority: Literal['normal', 'express'] = Field(
        'normal',
        description="Shipping priority",
    )

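11. Performance: orjson

For high-throughput APIs, you can replace the default JSON response class with ORJSONResponse (this requires the orjson package):

from datetime import datetime

from fastapi import FastAPI
from fastapi.responses import ORJSONResponse
from pydantic import BaseModel, field_serializer

app = FastAPI(default_response_class=ORJSONResponse)

# Or per endpoint
@app.get("/fast", response_class=ORJSONResponse)
async def fast_endpoint():
    return {"data": large_dataset}  # large_dataset is assumed to be defined elsewhere

# Pydantic v2 already serializes in Rust (pydantic-core), so model_dump_json()
# does not need an orjson override. json_encoders is deprecated in v2;
# use a field_serializer to control how a type is written instead.
class FastModel(BaseModel):
    created_at: datetime  # illustrative field

    @field_serializer('created_at')
    def serialize_created_at(self, value: datetime) -> str:
        return value.isoformat()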

Conclusion

Validation in FastAPI/Pydantic covers more of the request/response cycle than schema libraries such as Zod or Joi do on their own:

  1. Response models: automatic filtering of sensitive fields
  2. Discriminated unions: automatic parsing into the correct type
  3. Generics: parametrized models (PaginatedResponse[T])
  4. Serializers: full control over the JSON output
  5. OpenAPI: documentation generated automatically

Senior pattern: Define dedicated models for each operation (CreateX, UpdateX, XResponse) instead of reusing a single model. This gives you flexibility and clearly documents each endpoint's contract.

In the next chapter, we will dig into SQLAlchemy 2.0 and async persistence patterns.