November 27, 2009

Separation of Concerns and Replication

In an integration scenario that involves the replication of items from one system (the master) to another (the slave) the direction of the communication is an interesting issue.

First, there is the question, whether the initial creation of a replicated item in the slave should be performed by a request from the master to the slave or by a request from the slave to the master. The former will often be chosen when there is a business rule involved that should trigger the replication (e.g. replicate all employees in the slave system whose contract ends in 6 months). The latter is more likely to be used in a scenario where the slave replicates all new items in the master, in other words, where the slave keeps the overall set of items in sync.

Second there is the question how updates of the items in the master are made available in the slave. This is the classic distinction between polling and publish/subscribe: Should the master notify the client of updates or should the slave poll for updates?

In addition, there is sometimes a desire to notify the master about changes the slave makes to the replicated items. The intention of this scenario is not the bidirectional synchronization of master and slave but reflects the master's interest in any changes the slave makes.

What is the guiding principle to make an informed design decision regarding the direction of communication in such a replication scenario? The answer is separation of concerns with the goal of simplicity and avoiding unnecessary coupling. This leads to the question which of the systems should for which communication play the server role and which one should play the client role?

Every communication is initiated to achieve some goal and the correct separation of concerns puts the component that acts to achieve the goal into the client role and the other component into the server role.

Applied to the above scenarios this means that for initial replication and subsequent updates the slave system should act as a client and the master should act as a server. All the necessary resources to be provided by the sever (e.g. providing list of all items that match a business rule) can be derived.

Applied to the scenario where the master is interested in updates from the slave it means that the master should act as the client because it is the master that has the goal (the slave does not care at all). This leads to certain resources that need to be exposed by the slave.

I have seen at least three production systems that had this issue and all of them did it wrong and would have greatly benefited in terms of simplicity and loose coupling from following the above principled design path.

Related is Roy's post Paper tigers and hidden dragons.

No comments:

Post a Comment