PHPnews.io

DDD and your database

Written by Matthias Noback on May 13, 2020

The introduction of Domain-Driven Design (DDD) to a larger audience has led to a few really damaging ideas among developers, like this one (maybe it's more a sentiment than an idea):

Data is bad, behavior is good. The domain model is great, the database awful.

(We're not even discussing CRUD in this article, which apparently is the worst of the worst.)

By now many of us feel ashamed of using an ORM alongside a "DDD domain model", of putting some mapping configuration in it, or of doing things inside our entities (or do you call them aggregates?) just to make them easily serializable to the database.

Infrastructure code in your entities

We want our domain model to consist of pure objects with only "domain concerns", which leads us to reinvent assertion libraries just because "those are in /vendor". We think that mapping configuration is infrastructure code, whether you use annotations or write your mapping code yourself.

In a previous article (Is all code in vendor infrastructure code?) I've shared my definition of infrastructure code and we discussed a more practical definition as well: infrastructure code is code that is not sufficiently isolated to be tested with a unit test. Let's use this definition for domain models that are prepared for persistence, that are "ORM-ready" so to speak. In order to know if we have properly separated infrastructure code from domain code, we only have to check if we can exercise the model's behavior in a unit test. Let's show Michael Feathers's definition of a unit test once more:

A test is not a unit test if:

  • It talks to the database
  • It communicates across the network
  • It touches the file system
  • It can't run at the same time as any of your other unit tests
  • You have to do special things to your environment (such as editing config files) to run it.

So to test a domain model's behavior with a unit test it shouldn't need an actual database, a network connection, a file system, etc. No "special setup" should be required. You should be able to instantiate the object and call a method on it, just like that.

Entities should be testable in isolation

Let's look at some common cases where people might be worried that they have infrastructure code in their domain model. What about this code:

/**
 * @Entity
 * @Table(name="todo_items")
 */
final class ToDoItem
{
    /**
     * @Id
     * @Column(type="integer")
     * @GeneratedValue
     */
    private int $id;

    /**
     * @Column(type="string")
     */
    private string $description;

    public function __construct()
    {
    }

    public function setDescription(string $description): void
    {
        $this->description = $description;
    }

    // ...
}

Does this code pass the test? It does: you can run it without a database or any other external dependency, and without preparing the context in a special way. In a unit test you can just do this:

$toDoItem = new ToDoItem();
$toDoItem->setDescription('The description');

// ...

So none of this code is infrastructure code, although those annotations are definitely meant to support persistence and you could say this class has "infrastructure concerns".
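As a slightly fuller illustration, here is a minimal, self-contained sketch of such a check. It assumes a `description()` getter that is not part of the original class (it exists only so there is something to observe), and it uses a plain exception instead of a test framework so that it runs with nothing but PHP 7.4 or later:

```php
<?php
declare(strict_types=1);

/**
 * @Entity
 * @Table(name="todo_items")
 */
final class ToDoItem
{
    /**
     * @Id
     * @Column(type="integer")
     * @GeneratedValue
     */
    private int $id;

    /**
     * @Column(type="string")
     */
    private string $description;

    public function setDescription(string $description): void
    {
        $this->description = $description;
    }

    // Hypothetical getter, added only so the check below can observe the state.
    public function description(): string
    {
        return $this->description;
    }
}

// No database, network, or file system is involved: the annotations are
// docblock comments, which PHP itself ignores unless an ORM reads them.
$toDoItem = new ToDoItem();
$toDoItem->setDescription('The description');

if ($toDoItem->description() !== 'The description') {
    throw new RuntimeException('Unexpected description');
}

echo "ToDoItem can be exercised without any infrastructure\n";
```

In a real project the same check would normally live in a test class of whatever test framework you use; the point is only that nothing besides the class itself has to be loaded or configured.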

What about the following example, which has hand-written mapping code in it?

final class ToDoItem
{
    private int $id;

    private string $description;

    public function __construct()
    {
    }

    public function setDescription(string $description): void
    {
        $this->description = $description;
    }

    public function getState(): array
    {
        return [
            'id' => $this->id,
            'description' => $this->description
        ];
    }

    public static function fromState(array $state): self
    {
        $instance = new self();
        $instance->id = (int)$state['id'];
        $instance->description = (string)$state['description'];

        return $instance;
    }

    public static function getTableName(): string
    {
        return 'todo_items';
    }

    // ...
}

It's not a great model. But again, you can just create an instance of this class, call any method on it, and make some assertions. No external dependencies or special setup are needed. So far we haven't seen any example of infrastructure code, even though we saw table names, column names, and column types.

Testing persistence

It's good to note that unit tests don't prove that your object can actually be persisted. And it would be quite useless to unit-test the mapping code, e.g. getState() and fromState(). These methods only serve the purpose of persisting. So instead of unit-testing the mapping code you should prove that an object can actually be persisted (as far as your object philosophy allows you to say such a thing). This is something you'd demonstrate with an integration test. It would test the repository and prove that you can save an object, get it back, and that the result still behaves in the same way as the original (or is "equal" to it). See also Test-driving repository classes - Part 2: Storing and retrieving entities.
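To make the idea of such an integration test concrete, here is a rough sketch that round-trips the hand-mapped ToDoItem through a real (in-memory) SQLite database. Several things are assumptions made for illustration and do not come from the original code: the `PdoToDoItemRepository` class, the `create()` named constructor, the table layout, and the availability of the `pdo_sqlite` extension.

```php
<?php
declare(strict_types=1);

final class ToDoItem
{
    private int $id;
    private string $description;

    // Hypothetical named constructor; the original class elides this part.
    public static function create(int $id, string $description): self
    {
        $instance = new self();
        $instance->id = $id;
        $instance->description = $description;

        return $instance;
    }

    public function getState(): array
    {
        return ['id' => $this->id, 'description' => $this->description];
    }

    public static function fromState(array $state): self
    {
        $instance = new self();
        $instance->id = (int)$state['id'];
        $instance->description = (string)$state['description'];

        return $instance;
    }

    public static function getTableName(): string
    {
        return 'todo_items';
    }
}

// Hypothetical repository that uses the mapping methods to talk to the database.
final class PdoToDoItemRepository
{
    private PDO $connection;

    public function __construct(PDO $connection)
    {
        $this->connection = $connection;
    }

    public function save(ToDoItem $item): void
    {
        $statement = $this->connection->prepare(
            'INSERT INTO ' . ToDoItem::getTableName()
            . ' (id, description) VALUES (:id, :description)'
        );
        $statement->execute($item->getState());
    }

    public function getById(int $id): ToDoItem
    {
        $statement = $this->connection->prepare(
            'SELECT id, description FROM ' . ToDoItem::getTableName() . ' WHERE id = :id'
        );
        $statement->execute(['id' => $id]);
        $state = $statement->fetch(PDO::FETCH_ASSOC);
        if ($state === false) {
            throw new RuntimeException('ToDoItem not found: ' . $id);
        }

        return ToDoItem::fromState($state);
    }
}

// This talks to a database, so by Feathers's definition it is not a unit test.
$connection = new PDO('sqlite::memory:');
$connection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$connection->exec('CREATE TABLE todo_items (id INTEGER PRIMARY KEY, description TEXT NOT NULL)');

$repository = new PdoToDoItemRepository($connection);
$original = ToDoItem::create(1, 'The description');
$repository->save($original);
$fromDatabase = $repository->getById(1);

if ($fromDatabase->getState() !== $original->getState()) {
    throw new RuntimeException('The item did not survive the round trip');
}

echo "Saved and reloaded a ToDoItem\n";
```

An in-memory SQLite database keeps the sketch short; to prove anything about production you would run the same test against the database engine you actually deploy.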

But to the question "is my domain model decoupled enough (from the database, from infrastructure)?", the answer is yes. Which for me means that some approaches that I've seen in the wild are true examples of over-engineering. There are, I'm afraid, many projects that have the rule that an entity should be a plain old (PHP, Java, etc.) object that has nothing to do with an ORM or the database in general. Then they have the rule that for every entity they need an extra object (sometimes called an "infrastructure entity"). They then write some code to map the domain entity to the infrastructure entity, and finally they save it to the database using the ORM.
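For readers who haven't come across this setup, the following sketch shows roughly what it tends to look like. The names `ToDoItemRecord` and `ToDoItemMapper` are invented for this illustration, as are the constructor and getters on the domain entity; real projects differ in the details.

```php
<?php
declare(strict_types=1);

// The "pure" domain entity, free of any persistence hints.
final class ToDoItem
{
    private int $id;
    private string $description;

    public function __construct(int $id, string $description)
    {
        $this->id = $id;
        $this->description = $description;
    }

    // Getters that exist mainly so the mapper can read the state.
    public function id(): int
    {
        return $this->id;
    }

    public function description(): string
    {
        return $this->description;
    }
}

/**
 * The extra "infrastructure entity" that carries the ORM mapping instead.
 *
 * @Entity
 * @Table(name="todo_items")
 */
final class ToDoItemRecord
{
    /**
     * @Id
     * @Column(type="integer")
     */
    public int $id;

    /**
     * @Column(type="string")
     */
    public string $description;
}

// The mapping code that copies every field in both directions.
final class ToDoItemMapper
{
    public function toRecord(ToDoItem $item): ToDoItemRecord
    {
        $record = new ToDoItemRecord();
        $record->id = $item->id();
        $record->description = $item->description();

        return $record;
    }

    public function toDomain(ToDoItemRecord $record): ToDoItem
    {
        return new ToDoItem($record->id, $record->description);
    }
}

$mapper = new ToDoItemMapper();
$item = new ToDoItem(1, 'The description');
$roundTripped = $mapper->toDomain($mapper->toRecord($item));

// Loose comparison: two objects of the same class with equal property values.
if ($roundTripped != $item) {
    throw new RuntimeException('Mapping lost information');
}

echo "Mapped a ToDoItem to a record and back\n";
```

Note that in this sketch a new field on ToDoItem would have to be added in three places: the domain entity, the record class, and both directions of the mapper.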

If you do this, it becomes really cumbersome to evolve your domain model in any way, which is a strong signal that this is not a good design at all. I think you should be prepared for change rather than be afraid of it. Maybe the reason these projects introduce separate infrastructure entities is that they want to localize the changes that would be needed if they ever had to migrate to a different ORM. I must admit that the previous code samples have some ORM-specific code in the entity classes themselves. This makes a migration project a bit more involved, and you'd definitely have to touch many classes to do it. However, the amount of time spent evolving the model is likely much more than the amount of time spent migrating to a new ORM. Which for me makes the debate end in favor of combining the two responsibilities in one object.

Some will say: but that's not following the Single Responsibility Principle (SRP). To be honest, I rarely find that a good argument in the first place. Everybody gets to use their own favorite definition of it, and nobody ever agrees on it. So I'm not going into any debate that brings up SRP.

Actually, if you're migrating to a different ORM, having a good set of integration tests will save you, and it doesn't really matter where the code is that needs to be migrated. The integration tests can then serve as contract tests showing that an alternative implementation of the repository implements the contract of the repository interface correctly (this means you also have to have that interface, which will be a big win anyway). A simple contract test for a repository could look something like this:

$repository = ...;

$repository->save($entity);

$fromDatabase = $repository->getById($entity->id());

self::assertEquals($entity, $fromDatabase);

This is just the beginning; you'd also have to test entities with different states and different numbers of child entities, etc. But once you have these tests, you can replace the repository implementation and check that the alternative implementation can do the same things as the original one did. In other words: it is a correct implementation of the contract of the repository abstraction.
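One possible way to reuse such a contract test across implementations is to write it once against the interface and feed it each implementation in turn. The sketch below assumes a simplified ToDoItem with a constructor and an `id()` method, and two made-up implementations (`InMemoryToDoItemRepository` and `SerializingToDoItemRepository`) that stand in for "the original" and "the alternative"; a real suite would plug in the ORM-backed repositories instead.

```php
<?php
declare(strict_types=1);

final class ToDoItem
{
    private int $id;
    private string $description;

    public function __construct(int $id, string $description)
    {
        $this->id = $id;
        $this->description = $description;
    }

    public function id(): int
    {
        return $this->id;
    }
}

interface ToDoItemRepository
{
    public function save(ToDoItem $entity): void;

    public function getById(int $id): ToDoItem;
}

// First stand-in implementation: keeps copies of the objects themselves.
final class InMemoryToDoItemRepository implements ToDoItemRepository
{
    /** @var array<int, ToDoItem> */
    private array $items = [];

    public function save(ToDoItem $entity): void
    {
        $this->items[$entity->id()] = clone $entity;
    }

    public function getById(int $id): ToDoItem
    {
        if (!isset($this->items[$id])) {
            throw new RuntimeException('ToDoItem not found: ' . $id);
        }

        return clone $this->items[$id];
    }
}

// Second stand-in implementation: stores serialized strings and rebuilds objects.
final class SerializingToDoItemRepository implements ToDoItemRepository
{
    /** @var array<int, string> */
    private array $rows = [];

    public function save(ToDoItem $entity): void
    {
        $this->rows[$entity->id()] = serialize($entity);
    }

    public function getById(int $id): ToDoItem
    {
        if (!isset($this->rows[$id])) {
            throw new RuntimeException('ToDoItem not found: ' . $id);
        }

        return unserialize($this->rows[$id]);
    }
}

// The contract test itself only knows about the interface.
function verifyRepositoryContract(ToDoItemRepository $repository): void
{
    $entity = new ToDoItem(1, 'The description');
    $repository->save($entity);
    $fromStorage = $repository->getById($entity->id());

    // Loose comparison: same class and equal property values.
    if ($fromStorage != $entity) {
        throw new RuntimeException(get_class($repository) . ' violates the contract');
    }
}

foreach ([new InMemoryToDoItemRepository(), new SerializingToDoItemRepository()] as $repository) {
    verifyRepositoryContract($repository);
}

echo "All repository implementations honor the contract\n";
```

With a test framework, the same idea is often expressed as an abstract test class whose subclasses only provide the repository under test.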

Reconsider using Active Record

If you're looking for a domain model that can be tested in isolation (using a proper unit test, so no database, no special context), then Active Record is not really an option. Calling methods on such objects is likely to trigger database calls. Unless they don't. And then it might be an acceptable compromise.

The thing that gets in the way of achieving isolation with Active Record models is that they act as services too. save(), delete(), getById(); these are all service responsibilities that are tacked onto an entity, which is obviously not a service. Separating these responsibilities can be a good solution. For example, introduce a repository interface, indicating that this is an object that calls out to something outside the application, and then an implementation that does a double dispatch to the Active Record (AR) entity, like so:

final class ToDoItemActiveRecordRepository implements ToDoItemRepository
{
    public function save(ToDoItem $entity): void
    {
        $entity->save();
    }
}

This approach only makes sense if it makes your model more testable in isolation, which in my opinion is a huge advantage. But as always it might not look like such a big advantage to you. In that case it may actually not be an advantage, or you could experiment with it and find out more.
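To give an impression of what that isolation might buy you, here is a small sketch. It assumes a stand-in Active Record class whose `save()` throws (because this sketch has no database), a hypothetical `AddToDoItem` application service, and a hypothetical `InMemoryToDoItemRepository` used only in tests; none of these come from a specific framework.

```php
<?php
declare(strict_types=1);

// Stand-in for an Active Record model. In a real framework save() would
// run a query; here it throws so that accidental calls become visible.
class ToDoItem
{
    private string $description;

    public function __construct(string $description)
    {
        $this->description = $description;
    }

    public function description(): string
    {
        return $this->description;
    }

    public function save(): void
    {
        throw new LogicException('This sketch has no database to save to');
    }
}

interface ToDoItemRepository
{
    public function save(ToDoItem $entity): void;
}

// Production implementation: forwards the call to the Active Record entity.
final class ToDoItemActiveRecordRepository implements ToDoItemRepository
{
    public function save(ToDoItem $entity): void
    {
        $entity->save();
    }
}

// Test implementation: remembers what was saved, never touches a database.
final class InMemoryToDoItemRepository implements ToDoItemRepository
{
    /** @var ToDoItem[] */
    private array $saved = [];

    public function save(ToDoItem $entity): void
    {
        $this->saved[] = $entity;
    }

    /** @return ToDoItem[] */
    public function all(): array
    {
        return $this->saved;
    }
}

// Hypothetical application service that depends on the interface only.
final class AddToDoItem
{
    private ToDoItemRepository $repository;

    public function __construct(ToDoItemRepository $repository)
    {
        $this->repository = $repository;
    }

    public function add(string $description): void
    {
        $this->repository->save(new ToDoItem($description));
    }
}

$repository = new InMemoryToDoItemRepository();
(new AddToDoItem($repository))->add('The description');

if (count($repository->all()) !== 1) {
    throw new RuntimeException('Expected exactly one saved item');
}

echo "The service ran without calling ToDoItem::save()\n";
```

The service and the entity's behavior can now be exercised without a database, while ToDoItemActiveRecordRepository itself is covered by an integration test.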

Conclusion

In conclusion: don't let pointless dogmatism ruin your domain model and your ability to evolve it. Make sure you can test your model in isolation, but feel free to add any configuration or code to it that makes it easy to persist.
