Repository Pattern: Data Access Abstraction

Key Insights

The Repository Pattern creates a clean separation between business logic and data access, making your codebase more testable and maintainable—but only when applied thoughtfully to domains with genuine complexity.
Pairing repositories with the Unit of Work pattern enables transactional consistency across multiple aggregates without leaking database concerns into your domain layer.
The pattern shines in testing scenarios, allowing you to swap real database implementations for in-memory fakes, but beware of creating “leaky abstractions” that expose ORM-specific behaviors through your repository interfaces.

The Problem with Direct Data Access

Most developers have inherited a codebase where database queries are scattered across controllers, services, and even view models. You find SELECT statements in HTTP handlers, Entity Framework queries duplicated across twenty different files, and business logic tangled with connection string management.

This tight coupling creates several concrete problems. When you need to switch from SQL Server to PostgreSQL, you’re touching hundreds of files. When you want to add caching, you’re modifying business logic. When you write unit tests, you’re spinning up actual databases or giving up entirely.

The Repository Pattern addresses this by introducing a single, well-defined boundary between your domain logic and your data access infrastructure.

What is the Repository Pattern?

Martin Fowler defines a repository as mediating “between the domain and data mapping layers using a collection-like interface for accessing domain objects.” The key insight is treating your data store as if it were an in-memory collection.

Your business logic shouldn’t care whether users are stored in PostgreSQL, MongoDB, or a flat file. It should simply ask for users and receive them.

Here’s the foundational interface:

using System.Linq.Expressions; // Expression<Func<T, bool>>

public interface IRepository<T> where T : class
{
    Task<T?> GetByIdAsync(int id);
    Task<IEnumerable<T>> GetAllAsync();
    Task<IEnumerable<T>> FindAsync(Expression<Func<T, bool>> predicate);
    Task AddAsync(T entity);
    void Update(T entity);
    void Remove(T entity);
}

This generic interface provides basic CRUD operations while staying largely agnostic about the underlying storage mechanism (the Expression-based FindAsync does assume a provider that can interpret LINQ expressions). Your domain services depend on this abstraction, not on Entity Framework’s DbContext or on a raw SqlConnection queried through Dapper.

Anatomy of a Repository Implementation

A production repository typically consists of three components: the interface contract, the concrete implementation, and the entity models it manages.

Let’s build a complete user repository:

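using System.Linq.Expressions;
using Microsoft.EntityFrameworkCore; // Include, ToListAsync, AnyAsync, ...

public interface IUserRepository : IRepository<User>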
{
    Task<User?> GetByEmailAsync(string email);
    Task<IEnumerable<User>> GetActiveUsersAsync();
    Task<bool> EmailExistsAsync(string email);
}

public class UserRepository : IUserRepository
{
    private readonly ApplicationDbContext _context;

    public UserRepository(ApplicationDbContext context)
    {
        _context = context;
    }

    public async Task<User?> GetByIdAsync(int id)
    {
        return await _context.Users
            .Include(u => u.Profile)
            .FirstOrDefaultAsync(u => u.Id == id);
    }

    public async Task<IEnumerable<User>> GetAllAsync()
    {
        return await _context.Users.ToListAsync();
    }

    public async Task<IEnumerable<User>> FindAsync(
        Expression<Func<User, bool>> predicate)
    {
        return await _context.Users.Where(predicate).ToListAsync();
    }

    public async Task<User?> GetByEmailAsync(string email)
    {
        return await _context.Users
            .FirstOrDefaultAsync(u => u.Email.ToLower() == email.ToLower());
    }

    public async Task<IEnumerable<User>> GetActiveUsersAsync()
    {
        return await _context.Users
            .Where(u => u.IsActive && u.LastLoginDate > DateTime.UtcNow.AddDays(-30))
            .OrderByDescending(u => u.LastLoginDate)
            .ToListAsync();
    }

    public async Task<bool> EmailExistsAsync(string email)
    {
        return await _context.Users
            .AnyAsync(u => u.Email.ToLower() == email.ToLower());
    }

    public async Task AddAsync(User entity)
    {
        await _context.Users.AddAsync(entity);
    }

    public void Update(User entity)
    {
        _context.Users.Update(entity);
    }

    public void Remove(User entity)
    {
        _context.Users.Remove(entity);
    }
}

Notice how the repository encapsulates query complexity. GetActiveUsersAsync() contains business rules about what “active” means. If that definition changes, you modify one method, not fifty call sites.
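
One way to keep that definition in a single place is to express the rule as a reusable expression that repository methods hand to Where. The sketch below is only an illustration under stated assumptions: UserRules, ActiveSince, and UserRulesDemo are hypothetical names, the User class is a stand-in with just the two properties the rule reads, and the cutoff is passed in rather than read from the clock so the rule can be checked deterministically.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Stand-in for the article's User entity; only the fields the rule reads.
public class User
{
    public bool IsActive { get; set; }
    public DateTime LastLoginDate { get; set; }
}

// Hypothetical home for the "active user" rule.
public static class UserRules
{
    // A user counts as active if flagged active and seen after the cutoff.
    public static Expression<Func<User, bool>> ActiveSince(DateTime cutoff) =>
        u => u.IsActive && u.LastLoginDate > cutoff;
}

public static class UserRulesDemo
{
    // Applies the rule to an in-memory sequence. An EF-backed repository
    // could pass the same expression to _context.Users.Where(...) instead.
    public static int CountActive(IEnumerable<User> users, DateTime cutoff) =>
        users.AsQueryable().Count(UserRules.ActiveSince(cutoff));
}
```

With this shape, a repository method would compute the cutoff (for example DateTime.UtcNow.AddDays(-30)) and pass it in, so the rule itself stays free of clock access.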

Repository Pattern vs. Direct ORM Usage

Here’s where I’ll be direct: not every application needs the Repository Pattern.

If you’re building a simple CRUD application with Entity Framework, you already have both patterns: each DbSet<T> behaves like a repository, and the DbContext acts as the unit of work. Adding another abstraction layer on top provides minimal benefit and maximum ceremony.

Compare these two approaches:

// Direct DbContext usage
public class UserController : ControllerBase
{
    private readonly ApplicationDbContext _context;

    public UserController(ApplicationDbContext context)
    {
        _context = context;
    }

    [HttpGet("{id}")]
    public async Task<ActionResult<User>> GetUser(int id)
    {
        var user = await _context.Users.FindAsync(id);
        return user == null ? NotFound() : Ok(user);
    }
}

// Repository-based approach
public class UserController : ControllerBase
{
    private readonly IUserRepository _userRepository;

    public UserController(IUserRepository userRepository)
    {
        _userRepository = userRepository;
    }

    [HttpGet("{id}")]
    public async Task<ActionResult<User>> GetUser(int id)
    {
        var user = await _userRepository.GetByIdAsync(id);
        return user == null ? NotFound() : Ok(user);
    }
}

For this trivial example, the repository adds nothing. Use repositories when you have complex query logic that benefits from encapsulation, when you need to swap data access implementations, when your domain logic requires isolation from infrastructure concerns, or when testability without a database is a hard requirement.

Unit of Work: The Repository’s Partner

Repositories handle individual aggregates, but real applications often need to modify multiple aggregates atomically. The Unit of Work pattern coordinates these changes.

using Microsoft.EntityFrameworkCore.Storage; // IDbContextTransaction

public interface IUnitOfWork : IDisposable
{
    IUserRepository Users { get; }
    IOrderRepository Orders { get; }
    IProductRepository Products { get; }
    
    Task<int> SaveChangesAsync();
    Task BeginTransactionAsync();
    Task CommitTransactionAsync();
    Task RollbackTransactionAsync();
}

public class UnitOfWork : IUnitOfWork
{
    private readonly ApplicationDbContext _context;
    private IDbContextTransaction? _transaction;

    public IUserRepository Users { get; }
    public IOrderRepository Orders { get; }
    public IProductRepository Products { get; }

    public UnitOfWork(ApplicationDbContext context)
    {
        _context = context;
        Users = new UserRepository(context);
        Orders = new OrderRepository(context);
        Products = new ProductRepository(context);
    }

    public async Task<int> SaveChangesAsync()
    {
        return await _context.SaveChangesAsync();
    }

    public async Task BeginTransactionAsync()
    {
        _transaction = await _context.Database.BeginTransactionAsync();
    }

    public async Task CommitTransactionAsync()
    {
        if (_transaction != null)
        {
            await _transaction.CommitAsync();
            await _transaction.DisposeAsync();
            _transaction = null;
        }
    }

    public async Task RollbackTransactionAsync()
    {
        if (_transaction != null)
        {
            await _transaction.RollbackAsync();
            await _transaction.DisposeAsync();
            _transaction = null;
        }
    }

    public void Dispose()
    {
        _transaction?.Dispose();
        _context.Dispose();
    }
}

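Now your services can coordinate complex operations. EF Core already wraps a single SaveChangesAsync call in its own transaction when the provider supports one, so the explicit transaction below earns its keep mainly when an operation spans several saves: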

public async Task ProcessOrderAsync(int userId, List<OrderItem> items)
{
    await _unitOfWork.BeginTransactionAsync();
    
    try
    {
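        // GetByIdAsync returns null for unknown ids, so fail fast instead of dereferencing null.
        var user = await _unitOfWork.Users.GetByIdAsync(userId)
            ?? throw new InvalidOperationException($"User {userId} not found.");
        var order = new Order(user, items);
        
        await _unitOfWork.Orders.AddAsync(order);
        
        foreach (var item in items)
        {
            var product = await _unitOfWork.Products.GetByIdAsync(item.ProductId)
                ?? throw new InvalidOperationException($"Product {item.ProductId} not found.");
            product.DecrementStock(item.Quantity);
        }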
        
        await _unitOfWork.SaveChangesAsync();
        await _unitOfWork.CommitTransactionAsync();
    }
    catch
    {
        await _unitOfWork.RollbackTransactionAsync();
        throw;
    }
}

Testing Benefits and Mocking Strategies

The real payoff comes during testing. With repository interfaces, you can test business logic without touching a database.

using Moq;
using Xunit;

public class UserServiceTests
{
    [Fact]
    public async Task RegisterUser_WithExistingEmail_ThrowsException()
    {
        // Arrange
        var mockRepo = new Mock<IUserRepository>();
        mockRepo.Setup(r => r.EmailExistsAsync("test@example.com"))
            .ReturnsAsync(true);

        var mockUnitOfWork = new Mock<IUnitOfWork>();
        mockUnitOfWork.Setup(u => u.Users).Returns(mockRepo.Object);

        var service = new UserService(mockUnitOfWork.Object);

        // Act & Assert
        await Assert.ThrowsAsync<DuplicateEmailException>(
            () => service.RegisterAsync("test@example.com", "password"));
    }
}
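
The test refers to a UserService and a DuplicateEmailException that are not shown here. A minimal sketch of what they might look like follows, assuming the constructor and RegisterAsync signature implied by the test. The interfaces and the User class are trimmed copies holding only the members this sketch touches, and the return type and exception message are illustrative guesses rather than the article's actual implementation.

```csharp
using System;
using System.Threading.Tasks;

// Trimmed stand-ins for the article's types: only what this sketch uses.
public class User
{
    public int Id { get; set; }
    public string Email { get; set; } = "";
}

public interface IUserRepository
{
    Task<bool> EmailExistsAsync(string email);
    Task AddAsync(User entity);
}

public interface IUnitOfWork
{
    IUserRepository Users { get; }
    Task<int> SaveChangesAsync();
}

public class DuplicateEmailException : Exception
{
    public DuplicateEmailException(string email)
        : base($"A user with email '{email}' already exists.") { }
}

public class UserService
{
    private readonly IUnitOfWork _unitOfWork;

    public UserService(IUnitOfWork unitOfWork) => _unitOfWork = unitOfWork;

    public async Task<User> RegisterAsync(string email, string password)
    {
        if (string.IsNullOrWhiteSpace(password))
        {
            throw new ArgumentException("Password is required.", nameof(password));
        }

        // The branch the test exercises: the repository reports a duplicate.
        if (await _unitOfWork.Users.EmailExistsAsync(email))
        {
            throw new DuplicateEmailException(email);
        }

        // Password handling is deliberately left out of this sketch; a real
        // service would hash the password before persisting anything.
        var user = new User { Email = email };
        await _unitOfWork.Users.AddAsync(user);
        await _unitOfWork.SaveChangesAsync();
        return user;
    }
}
```

Because the service touches only the two interfaces, the mock setup in the test is all it needs; no DbContext is constructed anywhere.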

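For tests that exercise several repository calls together, a hand-written in-memory fake is often easier to work with than a mock (true integration tests should still run against a real database):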

public class InMemoryUserRepository : IUserRepository
{
    private readonly List<User> _users = new();
    private int _nextId = 1;

    public Task<User?> GetByIdAsync(int id)
    {
        return Task.FromResult(_users.FirstOrDefault(u => u.Id == id));
    }

    public Task AddAsync(User entity)
    {
        entity.Id = _nextId++;
        _users.Add(entity);
        return Task.CompletedTask;
    }

    // ... other methods
}
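
For reference, a fuller fake could be written once against the generic contract. The following is a minimal sketch, assuming entities expose an integer key through a hypothetical IHasId interface (not part of the article's model). IRepository<T> is repeated so the block compiles on its own, and Tag is a throwaway entity invented for the usage example. Predicates are compiled and run in memory, so this fake will accept expressions that a real database provider might fail to translate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Threading.Tasks;

// Same contract as earlier in the article, repeated for self-containment.
public interface IRepository<T> where T : class
{
    Task<T?> GetByIdAsync(int id);
    Task<IEnumerable<T>> GetAllAsync();
    Task<IEnumerable<T>> FindAsync(Expression<Func<T, bool>> predicate);
    Task AddAsync(T entity);
    void Update(T entity);
    void Remove(T entity);
}

// Hypothetical interface so the fake can read and assign keys.
public interface IHasId
{
    int Id { get; set; }
}

public class InMemoryRepository<T> : IRepository<T> where T : class, IHasId
{
    private readonly List<T> _items = new();
    private int _nextId = 1;

    public Task<T?> GetByIdAsync(int id) =>
        Task.FromResult(_items.FirstOrDefault(e => e.Id == id));

    // Return a copy so callers cannot mutate the backing list.
    public Task<IEnumerable<T>> GetAllAsync() =>
        Task.FromResult<IEnumerable<T>>(_items.ToList());

    // Compile the expression and evaluate it with LINQ to Objects.
    public Task<IEnumerable<T>> FindAsync(Expression<Func<T, bool>> predicate) =>
        Task.FromResult<IEnumerable<T>>(_items.Where(predicate.Compile()).ToList());

    public Task AddAsync(T entity)
    {
        entity.Id = _nextId++; // mimic a database-generated key
        _items.Add(entity);
        return Task.CompletedTask;
    }

    // Entities are held by reference, so changes are already visible.
    public void Update(T entity) { }

    public void Remove(T entity) => _items.Remove(entity);
}

// Throwaway entity used only to exercise the fake.
public class Tag : IHasId
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}
```

A test can then construct new InMemoryRepository<Tag>(), add a few entities, and assert on what GetByIdAsync or FindAsync returns, with no mocking library involved.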

Common Pitfalls and Best Practices

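Avoid leaky abstractions. If your repository interface exposes IQueryable<T>, you’ve largely defeated the purpose: callers can compose arbitrary queries that one provider translates and another rejects at runtime, and query logic escapes the repository again. Accepting raw Expression predicates, as the generic FindAsync above does, is a milder form of the same leak.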

Don’t create generic repositories for everything. The generic IRepository<T> is a starting point, not a destination. Domain-specific repositories with meaningful method names (GetActiveUsersAsync()) are more valuable than generic ones (FindAsync(x => x.IsActive)).

Use the Specification Pattern for complex queries. When query criteria become complex or need to be composed dynamically, specifications provide a clean solution:

public interface ISpecification<T>
{
    Expression<Func<T, bool>> Criteria { get; }
    List<Expression<Func<T, object>>> Includes { get; }
}

public class ActivePremiumUsersSpecification : ISpecification<User>
{
    public Expression<Func<User, bool>> Criteria =>
        u => u.IsActive && u.SubscriptionTier == SubscriptionTier.Premium;

    public List<Expression<Func<User, object>>> Includes =>
        new() { u => u.Profile, u => u.Subscriptions };
}

// Repository method (ApplyIncludes is a custom extension method, not part of EF Core)
public async Task<IEnumerable<User>> FindAsync(ISpecification<User> spec)
{
    return await _context.Users
        .Where(spec.Criteria)
        .ApplyIncludes(spec.Includes)
        .ToListAsync();
}
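
ApplyIncludes in the repository method is not something EF Core provides, so it has to be written by hand. A minimal sketch, assuming EF Core's Include extension and a hypothetical SpecificationQueryExtensions class to hold the helper, could look like this:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using Microsoft.EntityFrameworkCore;

// Hypothetical helper class; the name is an assumption for illustration.
public static class SpecificationQueryExtensions
{
    public static IQueryable<T> ApplyIncludes<T>(
        this IQueryable<T> query,
        IEnumerable<Expression<Func<T, object>>> includes)
        where T : class
    {
        foreach (var include in includes)
        {
            // Each call chains another eager-loaded navigation onto the query.
            query = query.Include(include);
        }

        return query;
    }
}
```

Keeping the loop in one extension method means every repository applies a specification's includes the same way, and only this helper needs to reference EF Core directly.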

The Repository Pattern isn’t about adding layers for the sake of architecture. It’s about creating clear boundaries that make your code easier to test, maintain, and evolve. Apply it where it solves real problems, and skip it where it doesn’t.