The Repository Pattern Explained

By Mike on 22 February 2012

Update: This is a much better, less technical explanation.

Made popular by Domain-Driven Design, the Repository pattern is one of my favourite design patterns, and I use it almost everywhere. What can I say? I like a clear separation between database access and the rest of the application, and that is the purpose of the Repository pattern.

Separation of concerns is something you want in every real-life application, and one of the most common culprits in bad code is database access happily mingled all over the place. Of course, this is encouraged by the database-centric approach that still dominates the developer mindset, so you need stronger discipline to keep database access isolated.

In DDD, you start with the domain, while database access (in fact, all the persistence details) is ironed out later. This means you are free to ignore anything persistence related, but that doesn't mean there is no persistence. Instead, persistence is viewed as a collection where domain objects are sent to die... sorry, to be persisted, and from which they are retrieved. Note that the important thing is to think of the repository as a collection, not to actually implement one (for example, .NET has an ICollection interface, which has nothing to do with a repository). This approach isn't specific to DDD; it can be used in any application, as it is a design pattern after all.

In a nutshell, the Repository pattern means abstracting the persistence layer and masking it as a collection. This way the application doesn't care about databases and other persistence details; it only deals with the abstraction (which is usually coded as an interface). The Repository also acts as a facade, simplifying the use of the persistence layer. The application just calls the methods of a repository in order to store or retrieve objects.

Even if you don't do DDD, the objects you send to or get from the repositories are the business/domain objects and NOT database-related objects. This means that the repository implementation has knowledge of the business objects and knows how to recreate them.

Using the Repository pattern doesn't mean there is only one repository in an application. There can be as many as needed; usually a repository deals with a specific context. For example, an application can have an OrdersRepository, a UsersRepository and an AdminRepository. It is good practice to code against an abstraction, which is why you'll find many examples that define a repository interface and not a class. Working against an interface makes it easy to have multiple implementations, which helps a lot with testing or changing implementation details.

As an example, let's see how we can apply the pattern to a blog engine. When dealing with posts I can use this interface:

public interface IPostsRepository
{
    void Save(Post mypost);
    Post Get(int id);
    PaginatedResult<Post> List(int skip, int pageSize);
    PaginatedResult<Post> SearchByTitle(string title, int skip, int pageSize);
}

Post is a domain type; it isn't related to any Post class that may be defined in the persistence layer if you're using an ORM. Remember that the Repository talks to the application using only the types the application knows about. Any class defined for ORM usage is hidden in the persistence layer, as it's simply an implementation detail.
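
Here is a minimal sketch of that separation. Everything in it is an assumption made for illustration: the members of Post, the PostRow class standing in for whatever an ORM would map to a table, and the PostMapper helper. None of these come from a real library; the point is only that the row type and the mapping stay inside the persistence layer.

```csharp
using System;

// Domain type: what the application sees. Its members are assumed here;
// the article does not list them.
public class Post
{
    public Post(int id, string title, string body)
    {
        if (string.IsNullOrWhiteSpace(title))
            throw new ArgumentException("A post needs a title.", nameof(title));
        Id = id;
        Title = title;
        Body = body;
    }

    public int Id { get; }
    public string Title { get; }
    public string Body { get; }
}

// Persistence type: a hypothetical row shape an ORM might map to a table.
// It is internal, so it never leaks out of the persistence layer.
internal class PostRow
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
    public string Body { get; set; } = "";
    public DateTime LastModifiedUtc { get; set; }
}

// Hypothetical mapper, used only by the repository implementation.
internal static class PostMapper
{
    // Rebuilds the domain object from what was stored.
    public static Post ToDomain(PostRow row) => new Post(row.Id, row.Title, row.Body);

    // Flattens the domain object into the shape the storage wants.
    public static PostRow ToRow(Post post) => new PostRow
    {
        Id = post.Id,
        Title = post.Title,
        Body = post.Body,
        LastModifiedUtc = DateTime.UtcNow
    };
}
```

A repository implementation would call ToRow before handing data to the ORM and ToDomain on the way back, so the rest of the application only ever sees Post.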
   
We see methods for saving a post and for retrieving a post. The Save method is smart enough to detect whether it's a new post or a modified one. PaginatedResult<Post> is a C# generic type; here it means that the List and SearchByTitle methods return a list of Posts along with pagination details.
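
To make this concrete, here is one possible in-memory sketch. The article doesn't show PaginatedResult<T>, the members of Post, or how Save tells new posts from modified ones, so all of those are assumptions: PaginatedResult<T> is given an Items list and a TotalCount, and an Id of 0 is taken to mean "not saved yet".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the domain type.
public class Post
{
    public int Id { get; set; }              // 0 means "not saved yet" (assumption)
    public string Title { get; set; } = "";
}

// Assumed shape of the paginated result: one page plus the overall count.
public class PaginatedResult<T>
{
    public PaginatedResult(IReadOnlyList<T> items, int totalCount)
    {
        Items = items;
        TotalCount = totalCount;
    }

    public IReadOnlyList<T> Items { get; }
    public int TotalCount { get; }
}

public interface IPostsRepository
{
    void Save(Post post);
    Post Get(int id);
    PaginatedResult<Post> List(int skip, int pageSize);
    PaginatedResult<Post> SearchByTitle(string title, int skip, int pageSize);
}

// Illustrative implementation backed by a dictionary instead of a database.
public class InMemoryPostsRepository : IPostsRepository
{
    private readonly Dictionary<int, Post> _posts = new Dictionary<int, Post>();
    private int _nextId = 1;

    public void Save(Post post)
    {
        if (post.Id == 0) post.Id = _nextId++;   // new post: assign an id
        _posts[post.Id] = post;                  // insert or overwrite
    }

    // Returns null when the id is unknown (assumption; the article doesn't say).
    public Post Get(int id) => _posts.TryGetValue(id, out var post) ? post : null;

    public PaginatedResult<Post> List(int skip, int pageSize) =>
        Page(_posts.Values, skip, pageSize);

    public PaginatedResult<Post> SearchByTitle(string title, int skip, int pageSize) =>
        Page(_posts.Values.Where(p =>
            p.Title.IndexOf(title, StringComparison.OrdinalIgnoreCase) >= 0), skip, pageSize);

    // Orders by id, counts all matches, then cuts out the requested page.
    private static PaginatedResult<Post> Page(IEnumerable<Post> source, int skip, int pageSize)
    {
        var all = source.OrderBy(p => p.Id).ToList();
        return new PaginatedResult<Post>(all.Skip(skip).Take(pageSize).ToList(), all.Count);
    }
}
```

The caller never branches on "insert or update"; it just calls Save, and the implementation decides.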
   
When testing, I can mock or stub this interface without needing to implement real database access. The application works only with the interface (which is injected where needed), without caring how things are actually persisted.
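
As a rough illustration, the sketch below injects the repository into a consumer through its constructor and swaps in a hand-written stub for testing. RenamePostHandler and StubPostsRepository are hypothetical names invented for this example, and the shapes of Post and PaginatedResult<T> are assumed, since the article doesn't show them.

```csharp
using System;
using System.Collections.Generic;

// Assumed shapes; the article does not define these types.
public class Post
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
}

public class PaginatedResult<T>
{
    public PaginatedResult(IReadOnlyList<T> items, int totalCount)
    {
        Items = items;
        TotalCount = totalCount;
    }

    public IReadOnlyList<T> Items { get; }
    public int TotalCount { get; }
}

public interface IPostsRepository
{
    void Save(Post post);
    Post Get(int id);
    PaginatedResult<Post> List(int skip, int pageSize);
    PaginatedResult<Post> SearchByTitle(string title, int skip, int pageSize);
}

// Hypothetical application code: it knows the interface and nothing else.
public class RenamePostHandler
{
    private readonly IPostsRepository _posts;

    public RenamePostHandler(IPostsRepository posts) => _posts = posts;

    // Returns false when the post does not exist.
    public bool Rename(int id, string newTitle)
    {
        var post = _posts.Get(id);
        if (post == null) return false;
        post.Title = newTitle;
        _posts.Save(post);
        return true;
    }
}

// Hand-written stub for tests: knows one post and records what gets saved.
public class StubPostsRepository : IPostsRepository
{
    private readonly Post _known;

    public StubPostsRepository(Post known) => _known = known;

    public List<Post> Saved { get; } = new List<Post>();

    public void Save(Post post) => Saved.Add(post);

    public Post Get(int id) => id == _known.Id ? _known : null;

    // Not needed by the code under test, so left unsupported.
    public PaginatedResult<Post> List(int skip, int pageSize) =>
        throw new NotSupportedException();

    public PaginatedResult<Post> SearchByTitle(string title, int skip, int pageSize) =>
        throw new NotSupportedException();
}
```

A test can then build the handler with new StubPostsRepository(...), call Rename, and check the stub's Saved list, with no database anywhere in sight.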
   
There is no rule or format for defining a repository. You define it according to what you need from it. But most of the time you'll need to Save or Get objects in different ways.
   
A word about using transactions with a repository. Since the Repository is mostly used in a DDD context, I'll refer to that case. When you need to execute different operations as a unit, that's where the Unit of Work pattern comes into play, and there are discussions about whether the UoW is part of the Repository or vice versa. However, in DDD, the repository should save only Aggregate Roots, which are themselves a consistent unit. This means you don't need explicit transaction handling, because saving an aggregate root implicitly means that everything belonging to it is persisted as one transaction.
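
A small sketch of what "saving only the aggregate root" can look like. The Order/OrderLine aggregate and both repository types are hypothetical, chosen only because an OrdersRepository was mentioned earlier. The in-memory version stands in for a real store; a database-backed Save would presumably write the order and its lines inside one transaction.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical aggregate: Order is the root, OrderLine only exists inside it.
public class OrderLine
{
    public OrderLine(string product, int quantity)
    {
        Product = product;
        Quantity = quantity;
    }

    public string Product { get; }
    public int Quantity { get; }
}

public class Order
{
    private readonly List<OrderLine> _lines = new List<OrderLine>();

    public Order(Guid id) { Id = id; }

    public Guid Id { get; }
    public IReadOnlyList<OrderLine> Lines => _lines;

    // All changes to the lines go through the root, which guards the rules.
    public void AddLine(string product, int quantity)
    {
        if (quantity <= 0) throw new ArgumentOutOfRangeException(nameof(quantity));
        _lines.Add(new OrderLine(product, quantity));
    }
}

// The repository exposes only the root; a line cannot be saved on its own.
public interface IOrdersRepository
{
    void Save(Order order);
    Order Get(Guid id);
}

// In-memory stand-in for a real store.
public class InMemoryOrdersRepository : IOrdersRepository
{
    private readonly Dictionary<Guid, OrderLine[]> _store = new Dictionary<Guid, OrderLine[]>();

    public void Save(Order order)
    {
        // The whole aggregate is snapshotted and replaced in one assignment:
        // either every line is stored or none is.
        _store[order.Id] = order.Lines.ToArray();
    }

    // Rebuilds the aggregate from the stored snapshot; null if unknown.
    public Order Get(Guid id)
    {
        if (!_store.TryGetValue(id, out var lines)) return null;
        var order = new Order(id);
        foreach (var line in lines) order.AddLine(line.Product, line.Quantity);
        return order;
    }
}
```

The calling code adds lines to the order and calls Save once; it never opens or commits a transaction itself.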

Comments (8)

Jack
11 January 2013 #

The repository pattern is an anti-pattern: http://www.youtube.com/watch?v=0tlMTJDKiug

Admin
11 January 2013 #

Yeah... I'm tired of this (Ayende) stuff. Call it however you want, it's important to keep business layer and persistence layer separated. Also, I really like decoupled (maintainable) code.

Of course, Ayende tries to promote his RavenDB, where OMG no repositories are required. Well, as with NHibernate, Entity Framework or any other db solution (with lots of cool videos of how easy it is to do [insert trivial CRUD stuff]), it doesn't invalidate the repository pattern. Those solutions are great for working with the db, especially in cases where the business object matches the database object, but they are STILL a Persistence detail.

Eric
14 December 2013 #

Mmmm...I watched the video Jack mentioned...I think the crux of the argument was that we shouldn't have abstractions for the sake of abstractions. I found the most useful part after the 55min mark. I've seen some of his blogs on the subject and even he (Ayende) would concede that it's horses for courses. I personally learned something from that video in the same way that I'm learning from SapiensWorks.

Jack
11 January 2013 #

I don't see a clear argument for using repositories like ancient stored-procedure-style programming. Everything you tried, the ORM already does.

Have you actually read the DDD book? Look at the definition of a Repository and you will recognize that it is exactly what today's ORMs do.

DDD was written around 2002, and Hibernate had just started in 2001, followed by NHibernate.

This has nothing to do with RavenDB.
And if you have cross-cutting concerns, the repository pattern is being used wrong anyway.

dg
8 February 2013 #

Jack, that's all great and well if your domain is mapping to and ONLY to a relational database... what if you switch ORM tools or frameworks? Are you going to go through all of your application services/command classes and refactor that code? Or perhaps you could do it in one spot -- the repository. And what if your application changes, and you wind up having to pull from a remote application, or maybe you just want to run some unit tests on the rest of your code without accessing a database? That's when you would bootstrap your repositories with appropriate strategies. Different environments often require different measures.

Admin
8 February 2013 #

Jack, an ORM will always return Persistence Entities not Domain Objects. They might be very similar or even the same in some cases, but that's it. And personally, I hate building Domain Behavior around infrastructure details.

The purpose of an ORM is to map objects to tables. That's it. And if you're using a NoSql db or getting/sending data from a remote service, well... no ORM for you.

I know that probably nothing I say will convince you, but I see great value in the Repository pattern, and I see only trouble when the ORM or a db implementation is mistaken for the repository.

Kirk Rasmussen
26 March 2014 #

Hi Mike,

I love your posts on DDD. They have helped clarify my thoughts on attempting DDD through the years. I have to admit that I have been tricked by the siren song into using the persistent objects as the domain model in the past. As you said, as the complexity of a project increases, the problem of “serving two masters” always crops up. I totally buy the argument that domain and persistent models (and UI models for that matter) should be separated. Yes, it’s a bit of extra work, but you gain some architectural advantage so that each tier can evolve independently.

One thing I cannot quite wrap my head around in the separation of ORM and Domain objects is the concept of an ID. Unless you have a clearly identifiable IMMUTABLE business key, I'm not sure what the best strategy is here. Otherwise the ID exists only as an implementation detail of the persistence technology.

Here in your example below for repository interface:

Post Get(int id);

Is the ID a true business key? Should every domain object that expects to be persisted assume an external ID must be provided, e.g. UUID, Integer, or String? Do we just accept that we cannot have “100% pure” Domain objects? I would be very interested to hear your thoughts.

Thanks and great work!

Admin
27 March 2014 #

Hi Kirk, maybe I haven't chosen the best example. In my apps I always use a guid as the domain object id, which, of course, uniquely identifies the entity and is not generated by the database.

You can use a business-related id (such as an internal code); the only problem is you don't know if that id format will change in the future. A guid is a bit of a compromise, as it stays compatible with future id changes. However, there's nothing preventing you from having an overload for Get which takes the internal code as the argument.

So you'll end up with two Get methods, one by guid and another by internal code. I don't say string (although it can be valid as well), because that code should be a value object (maybe implemented as a struct) to ensure we're dealing with valid values.

Of course, this is not a rule, but IMO it's easier to start with guids and then add the domain-specific id format later. Most of the time, that format is for the UI only anyway.
