Python SQLAlchemy: ORM and Core Usage
Key Insights
- SQLAlchemy offers two distinct APIs: Core for SQL-centric operations with maximum control, and ORM for object-oriented database interactions with automatic relationship management
- Core is typically faster for bulk operations and reporting queries, while ORM excels at complex object graphs and business logic encapsulation
- Choose Core for data pipelines and analytics, ORM for application development with complex domain models, and don’t be afraid to mix both in the same project
Introduction to SQLAlchemy Architecture
SQLAlchemy is Python’s most powerful database toolkit, offering two complementary approaches to database interaction. SQLAlchemy Core provides a SQL abstraction layer that lets you write database-agnostic queries using Python expressions. SQLAlchemy ORM sits on top of Core, adding object-relational mapping capabilities that transform database rows into Python objects.
The choice between Core and ORM isn’t binary. Core excels when you need precise control over SQL generation, bulk operations, or data warehouse queries. ORM shines when modeling complex business domains with relationships, inheritance, and rich object behaviors. Most production applications use both.
Here’s the basic setup:
```python
# Installation
# pip install sqlalchemy
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

# Create engine - connection pool to your database
engine = create_engine(
    'postgresql://user:password@localhost/mydb',
    echo=True,        # Log all SQL (disable in production)
    pool_size=10,
    max_overflow=20
)

# For ORM operations
Session = sessionmaker(bind=engine)
```
SQLAlchemy Core: Low-Level Database Operations
Core gives you full control over SQL generation while maintaining database portability. You define tables using Python objects, then construct queries using an expression language that compiles to optimized SQL.
```python
from sqlalchemy import MetaData, Table, Column, Integer, String, Text, DateTime, ForeignKey
from sqlalchemy import select, insert, update, delete
from datetime import datetime

metadata = MetaData()

users_table = Table(
    'users',
    metadata,
    Column('id', Integer, primary_key=True),
    Column('username', String(50), nullable=False, unique=True),
    Column('email', String(120), nullable=False),
    Column('created_at', DateTime, default=datetime.utcnow)
)

posts_table = Table(
    'posts',
    metadata,
    Column('id', Integer, primary_key=True),
    Column('user_id', Integer, ForeignKey('users.id'), nullable=False),
    Column('title', String(200), nullable=False),
    Column('content', Text),
    Column('created_at', DateTime, default=datetime.utcnow)
)

# Create all tables
metadata.create_all(engine)

# Execute queries with Core
with engine.connect() as conn:
    # Insert using the expression language
    stmt = insert(users_table).values(
        username='alice',
        email='alice@example.com'
    )
    result = conn.execute(stmt)
    conn.commit()

    # Select with filters
    stmt = select(users_table).where(users_table.c.username == 'alice')
    result = conn.execute(stmt)
    user = result.fetchone()
    print(f"User: {user.username}, Email: {user.email}")
```
Core’s expression language composes queries safely (parameters are always bound, never interpolated into SQL strings) and provides excellent IDE autocomplete. The Table.c attribute gives you access to columns for building WHERE clauses, JOINs, and ORDER BY statements.
SQLAlchemy ORM: Declarative Models
The ORM transforms your database into a collection of Python classes. Each table becomes a class, each row becomes an instance, and relationships become object attributes. This approach feels natural for application development.
```python
from sqlalchemy.orm import declarative_base, relationship
from sqlalchemy import Column, Integer, String, Text, DateTime, ForeignKey
from datetime import datetime

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    username = Column(String(50), nullable=False, unique=True)
    email = Column(String(120), nullable=False)
    created_at = Column(DateTime, default=datetime.utcnow)

    # Relationship - one user has many posts
    posts = relationship('Post', back_populates='author', cascade='all, delete-orphan')

    def __repr__(self):
        return f"<User(username='{self.username}')>"

class Post(Base):
    __tablename__ = 'posts'

    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey('users.id'), nullable=False)
    title = Column(String(200), nullable=False)
    content = Column(Text)
    created_at = Column(DateTime, default=datetime.utcnow)

    # Relationship - each post belongs to one user
    author = relationship('User', back_populates='posts')

    def __repr__(self):
        return f"<Post(title='{self.title}')>"

# Create tables
Base.metadata.create_all(engine)

# Use the ORM
session = Session()

# Create and persist objects
user = User(username='bob', email='bob@example.com')
post = Post(title='First Post', content='Hello SQLAlchemy!', author=user)
session.add(user)
session.add(post)
session.commit()

# Navigate relationships
print(f"Author: {post.author.username}")
print(f"Posts by {user.username}: {[p.title for p in user.posts]}")
```
The relationship() function creates bidirectional links between objects. The back_populates parameter ensures both sides stay synchronized. The cascade option controls what happens to posts when a user is deleted.
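To make the cascade behavior concrete, here is a small runnable sketch (in-memory SQLite, trimmed-down versions of the models above, illustrative names) showing that deleting a user also deletes that user's posts:

```python
from sqlalchemy import create_engine, Column, Integer, String, ForeignKey
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    username = Column(String(50), nullable=False, unique=True)
    posts = relationship('Post', back_populates='author',
                         cascade='all, delete-orphan')

class Post(Base):
    __tablename__ = 'posts'
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey('users.id'), nullable=False)
    title = Column(String(200), nullable=False)
    author = relationship('User', back_populates='posts')

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

user = User(username='carol')
user.posts.append(Post(title='Orphans beware'))
session.add(user)            # cascade='all' saves the post along with the user
session.commit()
post_count = session.query(Post).count()

session.delete(user)         # delete-orphan removes the user's posts too
session.commit()
remaining = session.query(Post).count()
```

Without the cascade option, deleting the user would instead try to null out post.user_id, which fails here because the column is non-nullable.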
CRUD Operations Comparison
Here’s how the same operations look in both APIs:
```python
# CREATE - Core
with engine.connect() as conn:
    stmt = insert(users_table).values(username='charlie', email='charlie@example.com')
    conn.execute(stmt)
    conn.commit()

# CREATE - ORM
session = Session()
user = User(username='charlie', email='charlie@example.com')
session.add(user)
session.commit()

# READ - Core
with engine.connect() as conn:
    stmt = select(users_table).where(users_table.c.username.like('ch%'))
    results = conn.execute(stmt).fetchall()

# READ - ORM
users = session.query(User).filter(User.username.like('ch%')).all()

# UPDATE - Core
with engine.connect() as conn:
    stmt = update(users_table).where(
        users_table.c.username == 'charlie'
    ).values(email='newemail@example.com')
    conn.execute(stmt)
    conn.commit()

# UPDATE - ORM
user = session.query(User).filter_by(username='charlie').first()
user.email = 'newemail@example.com'
session.commit()

# DELETE - Core
with engine.connect() as conn:
    stmt = delete(users_table).where(users_table.c.username == 'charlie')
    conn.execute(stmt)
    conn.commit()

# DELETE - ORM
user = session.query(User).filter_by(username='charlie').first()
session.delete(user)
session.commit()
```
Core requires explicit transaction management with commit(). ORM uses the Unit of Work pattern—changes accumulate in the session and flush to the database on commit(). This batching improves performance and ensures consistency.
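One way to observe the Unit of Work directly is to count pending objects and watch when SQL actually reaches the driver. This sketch uses an in-memory SQLite database and a cursor-execute event hook purely for observation (names are illustrative):

```python
from sqlalchemy import create_engine, Column, Integer, String, event
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    username = Column(String(50), nullable=False)

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

insert_statements = []

@event.listens_for(engine, 'before_cursor_execute')
def track_inserts(conn, cursor, statement, parameters, context, executemany):
    # Observation only: record INSERT statements as they reach the driver
    if statement.lstrip().upper().startswith('INSERT'):
        insert_statements.append(statement)

for name in ('dee', 'dennis', 'mac'):
    session.add(User(username=name))

pending = len(session.new)                  # objects staged, no SQL emitted yet
sql_before_commit = len(insert_statements)  # still zero at this point
session.commit()                            # the unit of work flushes all INSERTs here
sql_after_commit = len(insert_statements)
```

How many cursor executions the flush produces depends on the SQLAlchemy version and dialect (newer releases can batch the inserts), but every one of them happens inside commit(), not at add() time.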
Advanced Querying Techniques
Complex queries showcase where each API excels:
```python
# JOIN with aggregation - Core
from sqlalchemy import func

with engine.connect() as conn:
    stmt = select(
        users_table.c.username,
        func.count(posts_table.c.id).label('post_count')
    ).select_from(
        users_table.join(posts_table, users_table.c.id == posts_table.c.user_id)
    ).group_by(users_table.c.username).having(func.count(posts_table.c.id) > 5)
    results = conn.execute(stmt).fetchall()

# Same query - ORM
from sqlalchemy.orm import joinedload

results = session.query(
    User.username,
    func.count(Post.id).label('post_count')
).join(User.posts).group_by(User.username).having(
    func.count(Post.id) > 5
).all()

# Eager loading to avoid N+1 queries
users_with_posts = session.query(User).options(
    joinedload(User.posts)
).all()

# This executes ONE query with a JOIN, not N+1 queries
for user in users_with_posts:
    print(f"{user.username}: {len(user.posts)} posts")

# Lazy loading (default) - loads on access
user = session.query(User).first()
# SQL query happens here when accessing .posts
posts = user.posts

# Subquery loading for one-to-many
from sqlalchemy.orm import subqueryload

users = session.query(User).options(subqueryload(User.posts)).all()
```
Understanding loading strategies is critical for ORM performance. joinedload uses a LEFT OUTER JOIN, subqueryload issues two queries, and selectinload (SQLAlchemy 1.2+) uses an IN clause. Choose based on your data cardinality.
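For example, selectinload can be demonstrated in isolation. This sketch (in-memory SQLite, minimal models, illustrative names) loads two users and all their posts with an IN-based second query rather than one lazy query per user:

```python
from sqlalchemy import create_engine, Column, Integer, String, ForeignKey
from sqlalchemy.orm import (
    declarative_base, relationship, sessionmaker, selectinload,
)

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    username = Column(String(50))
    posts = relationship('Post')

class Post(Base):
    __tablename__ = 'posts'
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey('users.id'))
    title = Column(String(200))

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.add_all([
    User(username='erin', posts=[Post(title='a'), Post(title='b')]),
    User(username='finn', posts=[Post(title='c')]),
])
session.commit()
session.expunge_all()  # clear the identity map so the loads below are real

# Two statements total: SELECT users, then SELECT posts WHERE user_id IN (...)
users = session.query(User).options(selectinload(User.posts)).all()
counts = {u.username: len(u.posts) for u in users}
```

With echo=True on the engine you can watch the second query carry the IN clause, which is usually the friendliest strategy for one-to-many collections of moderate size.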
Best Practices and Performance Tips
Connection pooling is crucial for production applications:
```python
from sqlalchemy.pool import QueuePool

engine = create_engine(
    'postgresql://user:pass@localhost/db',
    poolclass=QueuePool,
    pool_size=20,        # Persistent connections
    max_overflow=10,     # Additional connections under load
    pool_pre_ping=True,  # Verify connections before use
    pool_recycle=3600    # Recycle connections every hour
)
```
Query optimization requires measurement:
```python
from sqlalchemy import event
from sqlalchemy.engine import Engine
import time
import logging

logging.basicConfig()
logger = logging.getLogger('sqlalchemy.engine')
logger.setLevel(logging.INFO)

@event.listens_for(Engine, "before_cursor_execute")
def before_cursor_execute(conn, cursor, statement, parameters, context, executemany):
    conn.info.setdefault('query_start_time', []).append(time.time())

@event.listens_for(Engine, "after_cursor_execute")
def after_cursor_execute(conn, cursor, statement, parameters, context, executemany):
    total = time.time() - conn.info['query_start_time'].pop(-1)
    logger.info(f"Query took {total:.4f}s: {statement[:100]}")
```
Use Core when you need:
- Bulk inserts/updates (often dramatically faster than per-object ORM flushes)
- Complex reporting queries
- Data migration scripts
- Maximum control over SQL generation
Use ORM when you have:
- Complex domain models with inheritance
- Rich business logic in model methods
- Need for relationship traversal
- Standard CRUD operations
For database migrations, integrate Alembic:
```shell
pip install alembic
alembic init migrations
# Edit alembic.ini with your database URL
# Create a migration: alembic revision --autogenerate -m "Add users table"
# Apply it: alembic upgrade head
```
SQLAlchemy’s dual nature is its strength. Start with ORM for rapid development, optimize hot paths with Core, and use both where appropriate. The APIs interoperate seamlessly—you can drop to Core within ORM queries using session.execute() with Core statements. This flexibility makes SQLAlchemy the definitive choice for Python database work.
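A minimal sketch of that interoperation, assuming an in-memory SQLite database and an illustrative model: a Core-style UPDATE and SELECT issued through an ORM Session via session.execute():

```python
from sqlalchemy import create_engine, Column, Integer, String, select, update
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    username = Column(String(50))
    email = Column(String(120))

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.add(User(username='grace', email='old@example.com'))
session.commit()

# A Core-style UPDATE executed through the ORM session;
# User.__table__ is the Table object behind the mapped class
users_table = User.__table__
session.execute(
    update(users_table)
    .where(users_table.c.username == 'grace')
    .values(email='new@example.com')
)
session.commit()

# A Core-style SELECT through the same session and transaction
email = session.execute(
    select(users_table.c.email).where(users_table.c.username == 'grace')
).scalar_one()
```

Both statements run on the session's connection, so they participate in the same transaction as any pending ORM work.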