Query Optimization¶
SQLAlchemy¶
1. joinedload()¶
Problem solved: N+1 query problem for ForeignKey / One-to-One
Loads related objects using a JOIN in the same query.
Generated SQL Query¶
SELECT
books.id AS books_id,
books.title AS books_title,
books.author_id AS books_author_id,
authors_1.id AS authors_1_id,
authors_1.name AS authors_1_name
FROM books
LEFT OUTER JOIN authors AS authors_1
ON authors_1.id = books.author_id;
Equivalent to Django select_related()
Use when:
- Relation is many-to-one
- Dataset size is moderate
2. selectinload()¶
Problem solved: N+1 query problem for collections
Loads relations using IN queries.
from sqlalchemy.orm import selectinload
session.query(Author).options(
selectinload(Author.books)
).all()
Generated SQL Queries¶
Query 1: Load the Parents
Query 2: Load the Children (The "IN" Query)
SELECT books.id AS books_id, books.title AS books_title, books.author_id AS books_author_id
FROM books
WHERE books.author_id IN (1, 2, 3);
Equivalent to Django prefetch_related()
Use when:
- One-to-many
- Many-to-many
- Large result sets
3. subqueryload()¶
Problem solved: Alternative solution for N+1 queries
Uses a subquery to fetch related data.
Use when:
selectinloadperforms poorly- Database optimizer prefers subqueries
4. lazyload()¶
Problem solved: Prevent unnecessary eager loading
Forces a relationship to load only when accessed.
Use when:
- Relation is rarely used
- You want a minimal initial query
5. noload()¶
Problem solved: Completely disable relationship loading
Use when:
- Relationship is never needed
- Optimizing heavy queries
6. contains_eager()¶
Problem solved: Avoid duplicate joins when manually writing joins
from sqlalchemy.orm import contains_eager
session.query(Book).join(Book.author).options(
contains_eager(Book.author)
)
Use when:
- You already wrote a JOIN
- You want SQLAlchemy to populate the relationship
7. load_only()¶
Problem solved: Over-fetching columns
Loads only specific columns.
Improves:
- Network performance
- Memory usage
8. defer()¶
Problem solved: Delay loading large columns
The large column loads only when accessed.
9. undefer()¶
Problem solved: Force loading of a deferred column
10. with_entities()¶
Problem solved: Avoid loading full ORM objects
Returns tuples instead of full models.
Use when only a few fields are needed.
11. exists()¶
Problem solved: Efficient existence checks
Equivalent to Django .exists().
12. yield_per()¶
Problem solved: Memory issues with large queries
Streams results in batches.
Use when processing millions of rows.
13. bulk_save_objects()¶
Problem solved: Slow bulk inserts
Much faster than normal ORM inserts.
14. bulk_insert_mappings()¶
Problem solved: Very large insert performance
Bypasses ORM object creation.
15. execution_options(stream_results=True)¶
Problem solved: Memory usage for huge queries
16. aliased()¶
Problem solved: Self-joins or multiple references to the same table
from sqlalchemy.orm import aliased
manager = aliased(User)
session.query(User).join(manager, User.manager_id == manager.id)
17. with_for_update()¶
Problem solved: Race conditions / concurrent updates
Locks rows during the transaction.
18. scalars()¶
Problem solved: Simplify result extraction
Returns model objects directly.
19. select()¶
Problem solved: Modern SQLAlchemy querying (2.0 style)
20. relationship(lazy="selectin")¶
Problem solved: Default N+1 prevention at the model level
Automatically prefetches related data.
Summary¶
| Function | Problem Solved |
|---|---|
joinedload() |
N+1 queries (FK / many-to-one) |
selectinload() |
N+1 queries (collections) |
subqueryload() |
N+1 alternative |
lazyload() |
Prevent unnecessary loading |
noload() |
Disable relation loading |
contains_eager() |
Manual join optimization |
load_only() |
Over-fetching columns |
defer() |
Large column loading |
yield_per() |
Memory for large results |
bulk_insert_mappings() |
Fast inserts |
aliased() |
Self-joins |
with_for_update() |
Race conditions |