The Most Misunderstood Django Optimization


A few weeks ago, I was debugging what looked like a simple performance issue.

An endpoint returning a list of organizations was slower than expected. Nothing dramatic — just slower than it should be.

I opened Django Debug Toolbar and saw this:

10 similar queries detected

That confused me.

I was already using prefetch_related() on the many-to-many relation. There were no obvious loops. And yet the database was being hit once per row.

Here’s the simplified version of what I had.

The Setup

This was a fairly standard Django + DRF flow.

Imagine these models:

from django.db import models

class Organization(models.Model):
    name = models.CharField(max_length=255)
    categories = models.ManyToManyField(
        "Category",
        through="OrganizationCategory",
        related_name="organizations",
    )


class Category(models.Model):
    name = models.CharField(max_length=255)


class OrganizationCategory(models.Model):
    organization = models.ForeignKey(Organization, on_delete=models.CASCADE)
    category = models.ForeignKey(Category, on_delete=models.CASCADE)
    is_primary = models.BooleanField(default=False)

    class Meta:
        constraints = [
            models.UniqueConstraint(
                fields=["organization"],
                condition=models.Q(is_primary=True),
                name="only_one_primary_category",
            )
        ]

Each organization can have multiple categories, but only one can be marked as primary.
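To see what that constraint enforces, here is a minimal sketch using the standard-library sqlite3 module instead of Django. On backends that support partial indexes, a conditional UniqueConstraint becomes roughly a partial unique index like the one below. The table name, column layout, and the `link` helper are simplified assumptions for illustration, not what Django generates verbatim.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE organization_category (
        organization_id INTEGER NOT NULL,
        category_id INTEGER NOT NULL,
        is_primary INTEGER NOT NULL DEFAULT 0
    );
    -- Uniqueness applies only to rows where is_primary is true.
    CREATE UNIQUE INDEX only_one_primary_category
        ON organization_category (organization_id)
        WHERE is_primary = 1;
    """
)


def link(organization_id, category_id, is_primary):
    """Hypothetical helper: insert one through-table row.

    Returns False if the partial unique index rejects the row.
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO organization_category VALUES (?, ?, ?)",
                (organization_id, category_id, int(is_primary)),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Any number of non-primary rows are accepted for an organization, but a second primary row for the same organization is rejected.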

In the viewset, I defined the queryset with prefetch_related():

def get_queryset(self):
    queryset = Organization.objects.prefetch_related("categories")
    return queryset

So at the database layer, I was explicitly telling Django: “Fetch all related categories in one additional query and attach them to each organization.”

Then, in the serializer — which runs after the queryset has been evaluated and objects have been instantiated — I had this:

from rest_framework import serializers

class OrganizationSerializer(serializers.ModelSerializer):

    # Other fields
    # ...
    primary_category = serializers.SerializerMethodField()

    class Meta:
        model = Organization
        fields = [
            # Other fields
            # ...
            "primary_category",
        ]

    def get_primary_category(self, obj):
        # Simplified: real code would return serializable data
        # (for example the category name), not a model instance.
        return obj.categories.filter(
            organizationcategory__is_primary=True
        ).first()

Notice what’s happening here:

  • The queryset optimization lives in the view layer.

  • The filtering logic lives in the serializer layer.

They look consistent.
They are not.

What Was Actually Happening

Even though I used prefetch_related("categories"), the serializer was still triggering extra queries.

Why?

Because prefetch_related() only caches the result of:

obj.categories.all()

But this line:

obj.categories.filter(...)

creates a new queryset.

Django cannot apply that filter to the prefetched in-memory cache. So it executes a fresh SQL query. Once per object.

That’s your N+1.
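The mechanism can be sketched with a toy model in plain Python. This is not Django's implementation. `ToyRelatedManager`, `toy_prefetch`, and `run_demo` are hypothetical names, and the "queries" are just entries appended to a list. It only illustrates the rule described above: all() can be served from the prefetched cache, while filter() always goes back to the data source.

```python
class ToyRelatedManager:
    """Toy stand-in for a related manager (NOT Django's real code)."""

    def __init__(self, rows, query_log):
        self._rows = rows            # pretend these rows live in the database
        self._query_log = query_log  # shared list recording each "query"
        self._prefetched = None      # filled in by toy_prefetch()

    def all(self):
        if self._prefetched is not None:
            return list(self._prefetched)  # served from memory, no query
        self._query_log.append("SELECT (all)")
        return list(self._rows)

    def filter(self, **conditions):
        # A filter describes a different query, so the cache is ignored.
        self._query_log.append(f"SELECT (filter {conditions})")
        return [
            row for row in self._rows
            if all(row.get(key) == value for key, value in conditions.items())
        ]


def toy_prefetch(managers, query_log):
    """One bulk 'query' that fills the cache of every manager."""
    query_log.append("SELECT (bulk prefetch)")
    for manager in managers:
        manager._prefetched = list(manager._rows)


def run_demo(n_orgs=10):
    """Return the query count after the all() calls and after the filter() calls."""
    log = []
    rows = [
        {"name": "Health", "is_primary": True},
        {"name": "Education", "is_primary": False},
    ]
    managers = [ToyRelatedManager(rows, log) for _ in range(n_orgs)]
    toy_prefetch(managers, log)
    for manager in managers:
        manager.all()
    after_all = len(log)
    for manager in managers:
        manager.filter(is_primary=True)
    return after_all, len(log)
```

With ten toy organizations, the all() calls add nothing beyond the single bulk prefetch, while the filter() calls add one query each.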

The Fix

Instead of filtering inside the serializer, filter during prefetch.

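from django.db.models import Prefetch

def get_queryset(self):
    # Filter inside the prefetch, so the filtering happens once for the whole page.
    primary_category_prefetch = Prefetch(
        "categories",
        queryset=Category.objects.filter(
            organizationcategory__is_primary=True
        ),
        # to_attr stores the result as a plain list on each organization.
        to_attr="primary_category_cached",
    )
    return Organization.objects.prefetch_related(primary_category_prefetch)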

Then in the serializer:

def get_primary_category(self, obj):
    if not obj.primary_category_cached:
        return None
    return obj.primary_category_cached[0]

Now there was no filtering inside the serializer and no N+1. Debug Toolbar showed a constant number of queries, regardless of how many organizations were returned.
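To make the difference in query counts concrete, here is a sketch that uses only the standard-library sqlite3 module rather than the Django ORM. The table names, `QueryCounter`, `per_row`, and `bulk` are hypothetical, and the SQL only approximates the shape of what the ORM would issue. `per_row` mirrors filtering inside the serializer; `bulk` mirrors filtering once for all organizations and grouping in memory.

```python
import sqlite3


def build_demo_db(n_orgs=10):
    """In-memory database: each organization gets one primary and one secondary category."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE organization (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE organization_category (
            organization_id INTEGER, category_id INTEGER, is_primary INTEGER
        );
        INSERT INTO category VALUES (1, 'Health'), (2, 'Education');
        """
    )
    for org_id in range(1, n_orgs + 1):
        conn.execute("INSERT INTO organization VALUES (?, ?)", (org_id, f"Org {org_id}"))
        conn.execute("INSERT INTO organization_category VALUES (?, 1, 1)", (org_id,))
        conn.execute("INSERT INTO organization_category VALUES (?, 2, 0)", (org_id,))
    return conn


class QueryCounter:
    """Runs SELECTs and counts them, standing in for Debug Toolbar."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def select(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


PRIMARY_SQL = (
    "SELECT oc.organization_id, c.name FROM category c "
    "JOIN organization_category oc ON oc.category_id = c.id "
    "WHERE oc.is_primary = 1 AND oc.organization_id IN ({placeholders})"
)


def per_row(db):
    """Filter inside the loop: one extra query per organization."""
    orgs = db.select("SELECT id, name FROM organization ORDER BY id")
    result = {}
    for org_id, name in orgs:
        rows = db.select(PRIMARY_SQL.format(placeholders="?"), (org_id,))
        result[name] = rows[0][1] if rows else None
    return result


def bulk(db):
    """Filter once for all organizations, then group in memory."""
    orgs = db.select("SELECT id, name FROM organization ORDER BY id")
    org_ids = [org_id for org_id, _ in orgs]
    if not org_ids:
        return {}
    placeholders = ", ".join("?" for _ in org_ids)
    rows = db.select(PRIMARY_SQL.format(placeholders=placeholders), org_ids)
    primary_by_org = dict(rows)  # organization_id -> primary category name
    return {name: primary_by_org.get(org_id) for org_id, name in orgs}
```

Both functions return the same mapping. With ten organizations, `per_row` issues eleven SELECTs and `bulk` issues two, and only the first number grows with the size of the list.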

The Real Lesson

prefetch_related() is not magic.

It does one thing well:
It loads related objects in bulk and stores them in memory.

But the moment you call .filter() on that relation (or any other method that builds a new queryset, such as .exclude() or .order_by()), you’re back to the database.

The deeper lesson is this:

Your queryset and your serializer must agree.

If your serializer needs filtered data, shape the data at the queryset level — not per object.