Programming and architectural paradigms

Sharing a database is the greatest sin when you architect Microservices, yet Space-Based Architecture is built around shared data. How do these approaches coexist? Do Microservices make any sense if blatantly violating their rules still results in successful projects?

Another programming paradox holds a clue. There was C. Then there came C++ to kill C. Then we’ve got Rust to kill C++. Now we have C, C++, and Rust, all of them alive and kickin’.

Technologies are specialized #

When a new technology emerges, it must show its superiority over existing mature methods. In most cases that is achieved by specialization. Is a car superior to a donkey? It depends. Probably yes, when there are good roads, plenty of gas and spare parts. A car is narrowly specialized, thus some areas have successfully adopted cars, while others still rely on donkeys.

The same holds true for programming languages and architectures. C is good when you work close to hardware and need complete control over whatever happens in the system. C++ is great at partitioning business logic, but it lost the simplicity of its predecessor. Rust will likely shine in communication libraries, which are often targeted by hackers, though we have yet to see its wide adoption. Hence the usefulness (and choice) of a tool or programming language depends on the circumstances.

Let’s turn our attention to your average code. It often mixes together:

  • Object-oriented programming that divides the application into a tree of loosely interacting pieces.
  • Functional programming, with the output of one function becoming the input to another, method chaining included.
  • Procedural programming, where multiple functions access the same set of data, which also happens inside classes whose many methods operate on their private data members.
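
The three styles listed above can be put side by side in a minimal sketch. All names here (`Cart`, `discounted`, `only_positive`, `stock`, `reserve`, `restock`) are hypothetical and chosen only for illustration; prices are kept in integer cents to avoid rounding noise.

```python
from dataclasses import dataclass, field


# Object-oriented: a component owns its data and exposes only behavior.
@dataclass
class Cart:
    _prices_cents: list[int] = field(default_factory=list)

    def add(self, price_cents: int) -> None:
        self._prices_cents.append(price_cents)

    def total(self) -> int:
        return sum(self._prices_cents)


# Functional: the output of one function becomes the input of the next.
def only_positive(prices: list[int]) -> list[int]:
    return [p for p in prices if p > 0]


def discounted(prices: list[int], percent: int) -> list[int]:
    return [p * (100 - percent) // 100 for p in prices]


# Procedural: several functions read and modify the same shared data.
stock = {"apple": 3}


def reserve(item: str) -> None:
    stock[item] -= 1


def restock(item: str, amount: int) -> None:
    stock[item] += amount


cart = Cart()
cart.add(1000)
cart.add(2500)
oo_total = cart.total()

functional_total = sum(discounted(only_positive([1000, -5, 2500]), 10))

reserve("apple")
restock("apple", 5)
```

Note that `Cart` callers never touch the price list, the two functions in the chain keep no state, and `reserve` and `restock` are coupled only through the `stock` dictionary they both modify.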

Each programming paradigm fits its own kind of tasks. Moreover, the same three approaches reemerge at the system level:

Object-oriented (centralized, shared nothing) paradigm – orchestration #

Almost every software project is too complex for a programmer to keep all the details of its requirements and implementation in their mind. Nevertheless, those details must be written down and run as code.

The good old way out of the trouble is called divide and conquer. The global task is divided into several subtasks, and each subtask is subdivided again and again – till the resulting pieces are either simple enough to solve directly or too messy to allow for further subdivision. Basically, we need to split our domain’s control, logic, and data into a single hierarchy of moderately sized components.

We have heard a lot about keeping logic and data together: an object (or actor, or module, or service – no matter what you call it) must own its data to ensure its consistency and hide the complexity of the component’s internals from its users. If the encapsulation of an object’s data is violated, the object’s code can neither trust nor restructure that data. On the other hand, if the data is bound to the logic that deals with it, the entire thing becomes a useful black box one does not need to look into to operate.

Adding control to the blend is more subtle, but no less crucial than the encapsulation discussed above. If an object commands another thing to do something, it must receive the result of the delegated action to know how to proceed with its own task. Returning control after the action is conducted enables separation of high-level supervising (orchestration, integration) logic from low-level algorithms which it drives, adding depth to the structure.

Paradigms - Object-oriented
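
The role of returned control can be illustrated with a small sketch. It assumes a hypothetical order flow with two low-level components (`Inventory`, `Payments`) and a high-level `OrderOrchestrator`; none of these names come from the text. Each component owns its data, and the orchestrator decides how to proceed based on the result of every delegated action.

```python
class Inventory:
    """Low-level component: owns the stock data."""

    def __init__(self, stock: dict[str, int]) -> None:
        self._stock = dict(stock)

    def reserve(self, item: str) -> bool:
        if self._stock.get(item, 0) > 0:
            self._stock[item] -= 1
            return True
        return False

    def release(self, item: str) -> None:
        self._stock[item] = self._stock.get(item, 0) + 1


class Payments:
    """Low-level component: owns the customer's balance."""

    def __init__(self, balance: int) -> None:
        self._balance = balance

    def charge(self, amount: int) -> bool:
        if amount <= self._balance:
            self._balance -= amount
            return True
        return False


class OrderOrchestrator:
    """High-level logic: drives the components and reacts to their results."""

    def __init__(self, inventory: Inventory, payments: Payments) -> None:
        self._inventory = inventory
        self._payments = payments

    def place_order(self, item: str, price: int) -> str:
        if not self._inventory.reserve(item):
            return "out of stock"
        if not self._payments.charge(price):
            # The returned failure lets the orchestrator undo the reservation.
            self._inventory.release(item)
            return "payment declined"
        return "confirmed"


shop = OrderOrchestrator(Inventory({"book": 1}), Payments(balance=30))
first = shop.place_order("book", 20)
second = shop.place_order("book", 20)
```

In this sketch the first order is confirmed and the second is rejected because the only copy is already reserved. Neither `Inventory` nor `Payments` knows about the other; the use case lives entirely in the orchestrator.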

The ability to address complex domains by reducing the whole to self-contained pieces makes object-oriented design ubiquitous. This paradigm, when applied to distributed systems, gives birth to Microservices, Orchestrated Services, and Service-Oriented Architecture.

Paradigms - Object-oriented - Variants

Functional (decentralized, streaming) paradigm – choreography #

Sometimes you don’t need that level of fine-tuning for the behavior of the system you build – it operates as an assembly line with high throughput and little variance: its logic is made of steps that resemble work stations along a conveyor belt through which identically structured pieces of data flow, just like goods on the belt. In that case there is very little to control: if an item is good, it goes further, otherwise it just falls off the line. Here the control resides in the graph of connections, the domain logic is subdivided, while the data is copied between the components.

Paradigms - Functional
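
A conveyor belt of this kind can be sketched with Python generators. The stage names (`parse`, `keep_positive`, `square`) and the input data are made up for illustration. Each stage knows nothing about its neighbors, and the only place that encodes the flow is the `pipeline` function that connects them.

```python
from typing import Iterable, Iterator


def parse(lines: Iterable[str]) -> Iterator[int]:
    for line in lines:
        try:
            yield int(line.strip())
        except ValueError:
            continue  # a bad item simply falls off the line


def keep_positive(numbers: Iterable[int]) -> Iterator[int]:
    for number in numbers:
        if number > 0:
            yield number


def square(numbers: Iterable[int]) -> Iterator[int]:
    for number in numbers:
        yield number * number


def pipeline(lines: Iterable[str]) -> Iterator[int]:
    # The graph of connections: this wiring is the only "control" there is.
    return square(keep_positive(parse(lines)))


result = list(pipeline(["3", "oops", "-2", " 4 "]))
```

Here `result` holds the squares of the two valid positive inputs. Adding, removing, or reordering a stage changes only the wiring in `pipeline`, which is where the simplicity of the approach comes from.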

Functional or pipelined design is famous for its simplicity and high performance, as the majority of processing steps can be scaled. However, its straightforward application lacks the depth needed for handling complex processes, which would translate into webs of relations between hundreds of functions present at the same level of design. It is also inefficient for choose-your-own-adventure-style (control) systems, where too many conveyor belts, each too short, would be required, negating the paradigm’s benefits. And it may not be the right tool for making small changes in large sets of data, as you’ll likely need to copy the whole dataset between the constituent functions.

In distributed systems the functional paradigm is disguised as Choreographed Event-Driven Architecture, Data Mesh, and various batch or stream processing [DDIA] Pipelines.

Paradigms - Functional - Variants

Procedural (data-centric) paradigm – shared data #

The final approach is integration through data. There are cases where the domain data and business logic differ in structure – you cannot divide your project into objects because each of the many pieces of its logic needs to access several (seemingly unrelated) parts of its data.

Paradigms - Data-centric

In the data-centric paradigm logic and data are structured independently. In procedural programming, as in the object-oriented paradigm, control is implemented inside the logic, making the logic layer hierarchical (orchestrated). Another, much less common, option relies on Observer [GoF] to provide data change notifications, resulting in decentralized (choreographed) application logic:

Paradigms - Data-centric - Notifications
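
The notification-based variant can be sketched as a shared store that calls back its observers whenever a value changes. `SharedData`, its methods, and the seat-reservation keys are hypothetical names used only for this illustration; a real system would also need to handle concurrency and notification cycles.

```python
from collections import defaultdict
from typing import Any, Callable


class SharedData:
    """Shared data with Observer-style change notifications."""

    def __init__(self) -> None:
        self._values: dict[str, Any] = {}
        self._observers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, key: str, callback: Callable[[Any], None]) -> None:
        self._observers[key].append(callback)

    def get(self, key: str, default: Any = None) -> Any:
        return self._values.get(key, default)

    def set(self, key: str, value: Any) -> None:
        self._values[key] = value
        for callback in list(self._observers[key]):
            callback(value)


data = SharedData()
log: list[str] = []


# Two independent pieces of logic; neither calls the other directly.
def update_free_seats(reserved: list[str]) -> None:
    data.set("free_seats", data.get("total_seats") - len(reserved))


def announce_when_full(free: int) -> None:
    if free == 0:
        log.append("train is full")


data.subscribe("reserved_seats", update_free_seats)
data.subscribe("free_seats", announce_when_full)

data.set("total_seats", 2)
data.set("reserved_seats", ["1A"])
data.set("reserved_seats", ["1A", "1B"])
```

After the second reservation the store holds zero free seats and the log contains a single "train is full" entry. The flow of the application emerges from the subscriptions rather than from any supervising function.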

The data-centric approach works well for moderately-sized projects with a stable data model (such as reserving seats on trains or playing a game of chess). The best-known distributed data-centric architectures include Services with a Shared Database and Space-Based Architecture.

Paradigms - Data-centric - Variants

Composite cases #

The three programming paradigms tend to collaborate:

  • An ordinary class is object-oriented on the outside but procedural inside: each of its methods can access any of its private data members. Moreover, code inside methods may chain function calls, locally applying the functional paradigm.
  • Cell-Based Architecture tends to use choreography (pub/sub) between Cells [DEDS] and orchestration or communication via a shared database inside them.
  • A system of Services (or Space-Based Architecture) may be integrated through both Orchestrator and Shared Database (or processing grid and data grid, correspondingly).
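
The first case in the list, an ordinary class, may look like the following sketch. The `Account` class is a hypothetical example: its callers see an object, its methods share private state procedurally, and one method chains functions over that state.

```python
class Account:
    """Object-oriented outside: callers use methods, never the fields."""

    def __init__(self) -> None:
        self._balance = 0
        self._history: list[int] = []

    # Procedural inside: both methods modify the same private data.
    def deposit(self, amount: int) -> None:
        self._balance += amount
        self._history.append(amount)

    def withdraw(self, amount: int) -> None:
        self._balance -= amount
        self._history.append(-amount)

    @property
    def balance(self) -> int:
        return self._balance

    # Functional locally: filter, sort, and slice are chained.
    def largest_deposits(self, count: int) -> list[int]:
        deposits = filter(lambda amount: amount > 0, self._history)
        return sorted(deposits, reverse=True)[:count]


account = Account()
account.deposit(50)
account.deposit(20)
account.withdraw(30)
account.deposit(70)
top_two = account.largest_deposits(2)
```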

Reality is more complex #

We have reviewed a few cases directly supported by common programming languages. However, there is a wide variety of possible combinations of (at least) the following dimensions, each making a unique programming paradigm:

  • Synchronous (method calls) vs asynchronous (messaging), with closely related:
    • Imperative vs reactive.
    • Blocking vs non-blocking.
  • Centralized (orchestrated) vs decentralized (choreographed) flow.
  • Shared data (tuple space) vs shared nothing (messaging).
  • Commands (actors) vs notifications (agents).
  • One-to-one (channels) vs many-to-one (mailboxes) vs one-to-many (multicast) vs many-to-many (gossip) communication.
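
As an illustration of the first dimension only, the sketch below contrasts a synchronous method call with asynchronous messaging through a mailbox. It uses Python's standard `queue` and `threading` modules; the `double` function, the `worker`, and the `None` stop message are assumptions made for this example.

```python
import queue
import threading


def double(value: int) -> int:
    return 2 * value


# Synchronous: the caller blocks until the result is returned.
sync_result = double(21)

# Asynchronous: the caller posts a message to a mailbox and moves on;
# the reply arrives later through a separate queue.
mailbox: "queue.Queue[int | None]" = queue.Queue()
replies: "queue.Queue[int]" = queue.Queue()


def worker() -> None:
    while True:
        message = mailbox.get()
        if message is None:  # hypothetical stop message
            break
        replies.put(double(message))


thread = threading.Thread(target=worker)
thread.start()
mailbox.put(21)
mailbox.put(None)
thread.join()
async_result = replies.get()
```

Both paths compute the same value. What differs is who waits and when: the mailbox decouples the sender from the receiver in time, and any number of senders could post to it, which is the many-to-one case from the last bullet.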

Some of the combinations look impossible or impractical, others are narrowly specialized and thus uncommon, while many more are commonplace. Discussing all of them would require insights from people who have used them in practice, and that would take a dedicated book.

Summary #

We have deconstructed the most common programming paradigms into their driving forces and shown how those forces shape distributed architectures:

  • An object-oriented system relies on hierarchical decomposition of a complex domain, just like SOA and Orchestrated (Micro-)Services do.
  • Functional programming streams data through a sequence of transformations, which is the idea behind Choreographed Event-Driven Architecture and Data Mesh.
  • Procedural style lets any piece of logic access the entire project’s data, resembling Space-Based Architecture and Services with a Shared Database.

Now let’s examine each of these approaches in depth:

CC BY Denys Poltorak. Editor: Lars Noodén. Download from Leanpub or GitHub. Powered by odt2wiki and Hugo Book.