Graphinder – Application architecture

In a previous post I mentioned that I had settled on Graphinder’s application architecture.
As I’m now slowly moving on to the other services that make up the whole infrastructure, it seems wise to wrap things up and pinpoint possible points of failure or misuse. I’ll also share a little insight into my plans for putting this architecture in place on Azure, but wider coverage of that topic will come in the next article.

Application architecture overview

To familiarize you with my concept right from the start, I guess the wisest idea is to begin with a graphical visualization.
As it’s still evolving (though not as dramatically as before), please keep in mind that it’s far from ideal in representing every aspect of the architecture. It would be impossible to fit it all, neatly organized, into the small space of a diagram.

Application architecture diagram

Alrighty. We have three web applications I’m currently developing for Graphinder (WorkerApi, GatewayApi and Web) and one that is currently on hold (Users).
For anyone who’s at least a little into the microservices idea, one thing will look odd here. There is no service directory (or registry – I’ve seen many names around the web) and my services are not designed to use service discovery.
Why not? Well, I’m really, really new to microservices. I’ve said before that I’m quite new to Reactive Extensions, but compared with my experience with microservices, I’d have to call myself an Rx.NET pro. And I’m not.
Step by step, over the next iterations of the project, I’m gonna improve the whole architecture, but hey, first things first. Let’s at least deliver a minimal project by the end of May!

Services communication flow

Since I have more than one communication flow from the frontend down to the database, I’ve separated responsibilities in the algorithms domain:

  • Algorithms.GatewayApi – handles classical requests such as “get me a data set” or “accept a new data set and persist it”; it is also the point for requesting problem solutions, manages the queue of requests and is the point of registration for new WorkerApi instances; it also knows about the SignalR hubs that will accept live progress reports from workers
  • Algorithms.WorkerApi – works on a problem received from the gateway; persists the current state of the problem being worked on; sends notifications to the address given by the gateway; knows nothing about its surroundings except algorithms_db and its parent GatewayApi

Example workflows

  • User posts a new solution-finding request → Web application calls Algorithms.GatewayApi with the request data → GatewayApi enqueues the request and calls back with what has been done → Web informs the user what has been done
  • User opens the view for an algorithm currently being worked on → Web connects the user to the requested SignalR hub ↔ Worker keeps posting progress to the hub so that the user has feedback on what’s going on
  • User requests historical data of a completed solution finding → Web calls GatewayApi for archival data → GatewayApi takes the data from algorithms_db and returns it to Web → Web displays the archival data to the user

The list of possible scenarios for this design is long, but I hope you get the idea of why it has been split up this way.
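
To make the first workflow a bit more tangible, here is a minimal in-memory sketch written in Python for brevity. It is not the actual GatewayApi: the `Gateway` class, its method names, the worker address and the round-robin dispatch are all assumptions made up for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from itertools import count


@dataclass
class SolveRequest:
    request_id: int
    problem: str


@dataclass
class Gateway:
    """Hypothetical stand-in for Algorithms.GatewayApi (in-memory only)."""
    workers: list = field(default_factory=list)   # registered WorkerApi addresses
    queue: deque = field(default_factory=deque)   # pending solve requests
    _ids: count = field(default_factory=lambda: count(1))

    def register_worker(self, address: str) -> None:
        self.workers.append(address)

    def enqueue(self, problem: str) -> dict:
        request = SolveRequest(next(self._ids), problem)
        self.queue.append(request)
        # This acknowledgement is what Web would relay back to the user.
        return {"request_id": request.request_id, "status": "queued"}

    def dispatch(self):
        """Hand the oldest request to a registered worker, if both exist."""
        if not self.queue or not self.workers:
            return None
        request = self.queue.popleft()
        # Assumed policy: simple round-robin over registered workers.
        worker = self.workers[(request.request_id - 1) % len(self.workers)]
        return worker, request


gateway = Gateway()
gateway.register_worker("http://worker-1.internal")  # hypothetical address
ack = gateway.enqueue("travelling-salesman")
assignment = gateway.dispatch()
```

The point of the split shows up even in this toy version: Web only ever sees the acknowledgement, while the worker only ever sees the request it was handed.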

Infrastructure concepts

Since I’m going to communicate over unsecured HTTP throughout the architecture, I need to put some sort of environment isolation in place.
I’ve decided to put all services into a separate virtual network, provided out of the box by Azure.
The only valid public endpoint for accessing the whole application will be the HTTPS (443) port on the Web application.
Since I will cover the whole Azure configuration in the next post, I will leave the rest for then.

Points of interest

  • Since the whole architecture is strongly encapsulated, at least one more public endpoint will be needed for other applications reaching the services, e.g. mobile applications and other web apps
  • If I wanted to add integration with other vendors’ services, I would need to decide, depending on my requirements, whether the Web application should be the point of connection to them or whether I should provide a small service inside the infrastructure for this
  • If I wanted to connect to any on-premises applications I have (or ones in a different Azure subscription), I would need to think about a Site-to-Site VPN configuration or an additional endpoint for that (VPN gateway cost vs performance)

And that’s it for today. Let me know what you think of my current design.

Graphinder – Queries and Commands vs Repository pattern

While planning the data access layer for the Graphinder project, I found myself really frustrated with the Repository pattern, which I had followed like a mantra for quite some time and which has been promoted by many businesses and by Microsoft itself. But since Entity Framework 6 already provides a mix of the repository and unit of work patterns, why would I build yet another abstraction over it?
That’s when I stumbled upon the concept of Queries and Commands, known especially from CQRS (Command Query Responsibility Segregation). I would like to emphasize here that I’m not 100% into CQRS. I like to think of myself as someone who picks what’s best for me from any pattern or approach, newer or older. I’m nobody’s blind follower.

What’s wrong with Repository pattern

Repository is a good old pattern that has been around for some time already. Its purpose is quite simple:

The repository mediates between the data source layer and the business layers of the application. It queries the data source for the data, maps the data from the data source to a business entity, and persists changes in the business entity to the data source. A repository separates the business logic from the interactions with the underlying data source or Web service.

So we have mediating, we have mapping (see the post about persistence models: Domain models vs Persistence models) and we have separation. And we do agree with those points, right? Right.
But things are not so simple out there. Repositories often end up as big, bloated almost-god-objects with lots of single-use methods, often spanning hundreds and hundreds of lines. When we see such a class in the domain/business layer, what do we think first? Decomposition!

Why has no such thought been born here? Why encourage things like this (example exaggerated on purpose):
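
A hypothetical sketch of what such a repository tends to look like, written in Python for brevity; the `User` fields and every method name are invented for illustration:

```python
from dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str
    city: str
    debt: float


class UserRepository:
    """Hypothetical bloated repository: one single-use method per screen."""

    def __init__(self, users):
        self._users = list(users)

    def get_by_id(self, user_id):
        return next((u for u in self._users if u.id == user_id), None)

    def get_by_city(self, city):
        return [u for u in self._users if u.city == city]

    def get_with_debt_over(self, amount):
        return [u for u in self._users if u.debt > amount]

    def get_by_city_with_debt_over(self, city, amount):
        return [u for u in self._users if u.city == city and u.debt > amount]

    def get_by_city_with_debt_over_ordered_by_name(self, city, amount):
        matches = self.get_by_city_with_debt_over(city, amount)
        return sorted(matches, key=lambda u: u.name)

    # ...and so on, one more method for every new filter combination.
```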

Of course we can go with a query method that takes a single query-params object with a million properties inside it. And we can keep going like this. But why?

Queries and Commands – decomposition

Why not go the other way round? When I see such a big object with big responsibilities, I think decomposition.
I don’t care if it is the most adored pattern in the world. From the moment I look at it, I can see how a great idea goes wrong much too often on the implementation side. Why wouldn’t I go for something like this instead:
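
As a rough sketch of the decomposed alternative (Python, hypothetical names; a plain list stands in for the Entity Framework context), each query becomes a small class of its own:

```python
from dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str
    city: str
    debt: float


class UsersInDebtByCityQuery:
    """One query, one responsibility. `context` is any iterable of users
    here; in the article's setting it would be the EF context instead."""

    def __init__(self, context):
        self._context = context

    def execute(self, city, min_debt):
        matches = (u for u in self._context if u.city == city and u.debt > min_debt)
        return sorted(matches, key=lambda u: u.name)


users = [
    User(1, "Zoe", "Oslo", 120.0),
    User(2, "Adam", "Oslo", 300.0),
    User(3, "Eve", "Rome", 500.0),
]
result = UsersInDebtByCityQuery(users).execute("Oslo", 100.0)
```

A new filter combination now means a new small class, not another method on an ever-growing one.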

Can’t a simple insert operation be like this instead:
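
A minimal sketch of such an insert command, in Python and with hypothetical names; `InMemoryContext` is only a stand-in that makes the example runnable, not a real ORM context:

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    debt: float


class AddUserCommand:
    """Hypothetical insert command; `context` only needs add() and save_changes()."""

    def __init__(self, context):
        self._context = context

    def execute(self, name, debt):
        # The only logic worth testing here is input validation.
        if not name or not name.strip():
            raise ValueError("name must not be empty")
        if debt < 0:
            raise ValueError("debt must not be negative")
        self._context.add(User(name.strip(), debt))
        self._context.save_changes()


class InMemoryContext:
    """Stand-in for a real ORM context, used only to make the sketch runnable."""

    def __init__(self):
        self.pending, self.saved = [], []

    def add(self, user):
        self.pending.append(user)

    def save_changes(self):
        self.saved.extend(self.pending)
        self.pending.clear()


context = InMemoryContext()
AddUserCommand(context).execute("Adam", 300.0)
```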

I know what you would ask now: how do I test it then?
The question is, what exactly would you like to test here, except maybe a null or empty string or an invalid debt value (which you can still test here)? Or maybe it’s the ORM you want to test? But let’s assume you really want to. What now?
You could simply create an abstraction, i.e. IUserContext instead of DbContext, and expose it either through the ctor or through a public property. You can mock it now, you can stub it now.
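
The same idea can be sketched in Python with the standard library’s `unittest.mock`. The `ClearDebtCommand` class and the `find`/`save_changes` members of the context are assumptions made for this example, not the members of the real IUserContext:

```python
from unittest.mock import Mock


class ClearDebtCommand:
    """Hypothetical command that depends only on an abstract context."""

    def __init__(self, context):
        self._context = context  # injected through the constructor

    def execute(self, user_id):
        user = self._context.find(user_id)
        if user is None:
            raise LookupError(f"user {user_id} not found")
        user["debt"] = 0.0
        self._context.save_changes()


# In a test, the real context is replaced by a stub and the calls are verified.
stub_context = Mock(spec=["find", "save_changes"])
stub_context.find.return_value = {"id": 7, "debt": 300.0}

ClearDebtCommand(stub_context).execute(7)

stub_context.find.assert_called_once_with(7)
stub_context.save_changes.assert_called_once_with()
```

No database is touched, and the command’s own logic (lookup, mutation, save) is still fully covered.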

So, how about replaceability here?
It has been said that swapping databases in business applications is largely a myth.
I would say that yes: it is quite rare. But it doesn’t stop you from being prepared!
With the rise of NoSQL solutions, it will also be much easier to play around and test new technologies.
But hey, we’ve already got it hidden from both the domain and the services layer, haven’t we? The caller doesn’t care what is called up there. And that’s our goal.
As for replacing… Well, provided we don’t want to go through each query and command class to make the actual swap, we might want to expose the context and just inject it as a property through any mainstream IoC container.
As an example, Autofac offers it out of the box:
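
In Autofac itself this is the `PropertiesAutowired()` registration option. The toy container below is not Autofac’s API; it is only a minimal Python sketch, with hypothetical class names, of what property injection does and why it makes the swap a one-line change:

```python
from typing import get_type_hints


class Container:
    """Toy container illustrating property injection; not Autofac's API."""

    def __init__(self):
        self._factories = {}

    def register(self, service_type, factory):
        self._factories[service_type] = factory

    def resolve(self, component_type):
        instance = component_type()
        # Fill every annotated attribute whose type has a registration.
        for name, hint in get_type_hints(component_type).items():
            if hint in self._factories:
                setattr(instance, name, self._factories[hint]())
        return instance


class UserContext:
    """Hypothetical context; swap its registration to swap the data store."""
    store = "sql"


class DocumentUserContext(UserContext):
    store = "nosql"


class GetUsersQuery:
    context: UserContext  # injected as a property, not through the constructor


container = Container()
container.register(UserContext, DocumentUserContext)  # the only line to change
query = container.resolve(GetUsersQuery)
```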

Let’s sum it up then:
I guess I’ve just unsubscribed myself from Repository fan club.

Graphinder – Domain models vs Persistence models

As we’ve moved on from the domain logic of optimization problems and the algorithms for solving them, I’d like to start a series of a few posts about approaches to working with the persistence layer. Today I’d like to elaborate a little on the differences between, and usage of, domain models and persistence models (also called ‘entities’ in the ORM world). But first of all, let’s start with a few words about each model type: what they are and what we use them for. In each example I will assume that we’re using Entity Framework 6 as the ORM of choice.

Domain models vs Persistence models – what and what for?

Domain models

Generally speaking, these are models that represent the real world and its objects as closely as possible. In both pure DDD and in OOP in general, we would see such models taking responsibility for some of the business logic that concerns them (provided that SRP is not broken, of course). While this approach is quite popular and is the closest to the OOP paradigm, with the rise of ORMs the so-called Anemic Models concept has emerged. In that approach, models are mostly property bags and their business logic is often placed higher, in the Service Layer. I strongly oppose such an approach, but there are possibly cases (like the RAD approach) where it makes sense, wherever high reusability is encouraged.

Persistence models

While in many cases the domain tends to be simple and easy to persist as-is, it also tends to grow in complexity during the application life cycle.
Persistence models attempt to solve that problem by abstracting away the model representation required for persistence, which would often be considered at least inappropriate, if not wrong, in the business logic world. When the domain is simple, we can often just map properties one-to-one, with some ORM-specific changes to meet persistence requirements. That’s also the moment when we think of merging the two models into one. But should we?

Domain models vs Persistence models – to separate or not?

When models in both the domain and the persistence layer seem to be identical, an idea is born in the developer’s head: let’s merge them!
That might be a good idea whenever there is a strong tendency to model data one-to-one or whenever development is Database-First driven, not Domain-First driven. In organizations that have at least a few DB admins/developers or analysts who model the domain with the client using ready-to-convert UML diagrams, this approach is seen quite often.

Simple domain

Let’s take a look at two sample models that look almost the same and seem to be ideal candidates for merging.
They might come from a Database-First approach, or the domain may be so simple that it’s a one-to-one mirror of the persistence layer.
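
A minimal sketch of such a pair, written in Python rather than as EF6 entities. The `Employee` example, its fields and the `to_entity` mapper are assumptions chosen for illustration:

```python
from dataclasses import dataclass, field


class Employee:
    """Domain model: guards its own invariants (fields are hypothetical)."""

    def __init__(self, name, salary):
        if not name:
            raise ValueError("name must not be empty")
        if salary < 0:
            raise ValueError("salary must not be negative")
        self._name = name
        self._salary = salary
        self._subordinates = []

    @property
    def name(self):
        return self._name

    @property
    def salary(self):
        return self._salary

    @property
    def subordinates(self):
        return tuple(self._subordinates)  # read-only view for callers

    def add_subordinate(self, employee):
        if employee is self:
            raise ValueError("an employee cannot report to themselves")
        self._subordinates.append(employee)


@dataclass
class EmployeeEntity:
    """Persistence model: a plain property bag the ORM can fill freely."""
    id: int = 0
    name: str = ""
    salary: float = 0.0
    subordinates: list = field(default_factory=list)


def to_entity(employee):
    """Map a domain object (and its subordinates) to persistence models."""
    return EmployeeEntity(
        name=employee.name,
        salary=employee.salary,
        subordinates=[to_entity(s) for s in employee.subordinates],
    )


boss = Employee("Ada", 9000.0)
boss.add_subordinate(Employee("Bob", 4000.0))
entity = to_entity(boss)
```

The shapes are nearly identical, which is exactly what makes merging tempting; the difference is that only the domain model refuses to exist in an invalid state.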

As persistence models, as opposed to domain models, can enter an invalid state, we might not even bother with encapsulation, as they won’t ever leave the Persistence Layer. For Entity Framework purposes, we also need to change the Subordinates signature to virtual ICollection so that EF6 can create proxies for N-to-Many relationships.
Now if we think of merging those two models into one, that would be a no-brainer:

A must-have that has been applied here is a parameterless constructor, which Entity Framework will use to populate properties through reflection.

A not so simple domain

Now let’s think of a problem straight from the Graphinder project. Let’s take our Simulated Annealing algorithm and all the possible cooling strategies it can use. I actually need to map an interface to the DB here:

But how do I map it to the database? I cannot treat each implementation as a separate entity (I mean, I can, but it won’t make sense) as there won’t be any difference between them. ICoolingStrategy looks like this:

So... what can I do?
I can introduce some sort of factory based on a persisted enum defining the type of strategy, so that it can be recreated on deserialization, i.e.:
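
A minimal sketch of that factory in Python. The `cool_down` method, the two concrete strategies and their constants are assumptions; the real ICoolingStrategy and its implementations may look different:

```python
from enum import Enum
from typing import Protocol


class CoolingStrategy(Protocol):
    """Assumed shape of ICoolingStrategy: map a temperature to the next one."""

    def cool_down(self, temperature: float) -> float: ...


class LinearCooling:
    def cool_down(self, temperature):
        return max(temperature - 1.0, 0.0)


class GeometricCooling:
    def cool_down(self, temperature):
        return temperature * 0.9


class CoolingStrategyType(Enum):
    """The only thing persisted: an integer column identifying the strategy."""
    LINEAR = 1
    GEOMETRIC = 2


_STRATEGIES = {
    CoolingStrategyType.LINEAR: LinearCooling,
    CoolingStrategyType.GEOMETRIC: GeometricCooling,
}


def create_cooling_strategy(persisted_value: int) -> CoolingStrategy:
    """Rebuild the stateless strategy from the value read from the database."""
    return _STRATEGIES[CoolingStrategyType(persisted_value)]()


strategy = create_cooling_strategy(2)
```

Because the strategies carry no state, the enum value is all that has to round-trip through the database.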

Now I can recreate the appropriate strategy whenever needed.
But the strategies I use in algorithms are stateless. What if I have several implementations of an interface that have state?
Well, if the only difference is in logic, I could go with the Table per Hierarchy (TPH) approach in Entity Framework and let it create a so-called ‘Discriminator’ column, or even flatten the hierarchy to one entity myself.
But what if the implementations differ throughout the hierarchy?
Well, then the only valid way would be to go with the Table per Type (TPT) approach, to have both flexibility and reusability preserved in case anything changes.
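
To show what the discriminator idea boils down to, here is a hand-rolled sketch using Python’s built-in `sqlite3` module. It only mimics the table layout that Table per Hierarchy produces; the table name, columns and stateful strategy classes are assumptions, and Entity Framework would do the materialization for you:

```python
import sqlite3


class LinearCooling:
    def __init__(self, step):
        self.step = step


class GeometricCooling:
    def __init__(self, ratio):
        self.ratio = ratio


connection = sqlite3.connect(":memory:")
# Table per Hierarchy: one table for all subtypes, nullable subtype columns,
# plus a discriminator column naming the concrete type of each row.
connection.execute(
    "CREATE TABLE cooling_strategies ("
    "id INTEGER PRIMARY KEY, discriminator TEXT NOT NULL, step REAL, ratio REAL)"
)
connection.executemany(
    "INSERT INTO cooling_strategies (discriminator, step, ratio) VALUES (?, ?, ?)",
    [("LinearCooling", 1.5, None), ("GeometricCooling", None, 0.9)],
)


def load_strategies(conn):
    """Materialize each row as the subtype its discriminator names."""
    loaded = []
    rows = conn.execute(
        "SELECT discriminator, step, ratio FROM cooling_strategies ORDER BY id"
    )
    for discriminator, step, ratio in rows:
        if discriminator == "LinearCooling":
            loaded.append(LinearCooling(step))
        elif discriminator == "GeometricCooling":
            loaded.append(GeometricCooling(ratio))
        else:
            raise ValueError(f"unknown discriminator: {discriminator}")
    return loaded


strategies = load_strategies(connection)
```

With Table per Type, each subtype would instead get its own table holding only its own columns, joined to a shared base table by key.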

Domain models vs Persistence models – when to use what?

After doing a lot of research, I came to the conclusion that, whenever possible, you should go with both and even introduce ViewModels as an additional representation used only by views. If you find yourself flexible enough without the Persistence Model approach, make sure that you don’t throw Domain models around everywhere; make a clear separation.

Finally, things to keep in mind when making the decision:

  • Even if you decide to skip any of the model types, make sure that the domain has no idea how to persist itself – it’s the persistence layer’s responsibility!
  • Make sure that you don’t throw the same model around throughout the whole application. An entity might be good for the persistence layer, but throwing it into your views can end up in some mumbo jumbo code!