Enterprise Technical Debt: The Lurking Cancer of Your Distributed Systems

What is Enterprise Technical Debt?

Before talking about technical debt, we should define the term. Technical debt is anything that makes the maintenance, upgradability or integration of a system more difficult. What makes technical debt so insidious is that it’s hidden in the code base, waiting to go off like a bomb at an inopportune time. But let’s face it, when is there ever a great time for code to stop working? Not all forms of technical debt are created equal. Some forms are relatively benign and may simply represent an inefficiency, such as using more resources than absolutely required on a server.

With this definition in mind, we can have a better conversation about the difference between what I call project technical debt and enterprise technical debt. Project technical debt is the technical debt created in the code base of a single system. Enterprise technical debt is where two or more systems create an inefficiency that may impact multiple systems throughout the business. The obvious reason that enterprise technical debt is worse than project technical debt is that instead of one application going down, we now have multiple applications potentially impacted by an issue. This presents a much higher risk and a potentially greater impact, because we’ve increased the number of points of failure.

System Integration Types

In my observation, there are many development teams that still haven’t conceptualized the notion of application integration as opposed to data integration. In the grander scheme of things this is a discussion about distributed system architecture, but for now, let’s just focus on the two fundamental integration types I call data and application integration.

I look at application and data integration as two fundamentally different things. Data integration is where some process acting directly on behalf of a system retrieves information from external sources and updates the application’s data store with it. For example, in a sales system customer information may be synchronized from an external data source such as a customer master list. This integration is low-level manipulation of information. The data integration application acts as a mediator to the destination system, and it is very likely to have intimate knowledge of the target system’s data schema. When the target system has no application API, this becomes a necessity, but the risk of exposure to a changing data model is low. This is because the data integration application only serves the target system and is, in theory, known and created by that system’s development team.
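To make the sales example concrete, here is a minimal sketch of a data integration job, assuming an in-memory SQLite database stands in for the sales system's data store. The `customer` table, its columns, the `CUSTOMER_MASTER` list and the function names are all hypothetical. Note that the job knows the target schema intimately, which is acceptable only because it belongs to the same team that owns that schema.

```python
import sqlite3

# Hypothetical customer master list; a real job would read it from an external source.
CUSTOMER_MASTER = [
    {"id": 1, "name": "Acme Corp", "region": "East"},
    {"id": 2, "name": "Globex", "region": "West"},
]


def create_sales_store():
    """Create the sales system's own data store (assumed schema, for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, region TEXT NOT NULL)"
    )
    return conn


def sync_customers(conn, master_rows):
    """Insert new master records and overwrite existing ones, keyed by id.

    Returns the number of customers in the sales store after the sync.
    """
    with conn:  # commit on success, roll back if any row fails
        conn.executemany(
            "INSERT OR REPLACE INTO customer (id, name, region) "
            "VALUES (:id, :name, :region)",
            master_rows,
        )
    return conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

Running `sync_customers(create_sales_store(), CUSTOMER_MASTER)` loads both records, and running the sync again with a changed name for an existing id updates that row in place rather than duplicating it.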

When we move to application integration, we’re talking about a new level of complexity where two systems must now communicate with each other. Depending on the design of those systems, the information exchanged between the two may have varying degrees of importance to the operation of each system. This starts to get us into a conversation about micro-service style architecture. In theory, a micro-service is able to operate independently of other services. The micro-service style of architecture decouples the data layer dependencies such that if another system were to go down, the dependent micro-service would still be able to operate for a period of time. How long a micro-service can operate in a partitioned state without the flow of information from other systems is a business decision.
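One way to picture "operating for a period of time" is a service that keeps its own copy of the data it needs and serves that copy while the other system is unreachable, up to a limit the business has agreed on. The sketch below is only an illustration under that assumption. The class and exception names are hypothetical, and `max_staleness_seconds` stands in for the business decision.

```python
import time


class UpstreamUnavailable(Exception):
    """Raised by the fetch function when the other system cannot be reached."""


class StaleDataError(Exception):
    """Raised when no sufficiently fresh local copy exists."""


class CachedCustomerLookup:
    """Serve customer data from a local copy while the upstream system is down."""

    def __init__(self, fetch, max_staleness_seconds, clock=time.monotonic):
        self._fetch = fetch                          # callable that talks to the other system
        self._max_staleness = max_staleness_seconds  # the business-defined tolerance
        self._clock = clock                          # injectable so the behaviour can be tested
        self._cache = {}                             # customer_id -> (value, fetched_at)

    def get(self, customer_id):
        try:
            value = self._fetch(customer_id)
        except UpstreamUnavailable:
            # Partitioned: fall back to the local copy if it is still fresh enough.
            if customer_id not in self._cache:
                raise StaleDataError(f"no local copy of customer {customer_id}")
            value, fetched_at = self._cache[customer_id]
            if self._clock() - fetched_at > self._max_staleness:
                raise StaleDataError(f"local copy of customer {customer_id} is too old")
            return value
        # Upstream answered: refresh the local copy and return the live value.
        self._cache[customer_id] = (value, self._clock())
        return value
```

The design choice worth noticing is that the staleness limit is a constructor argument rather than a constant buried in the code, so the number can be set by whoever owns the business risk.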

Domain Violation

This is where we get into what I label as enterprise technical debt. When two or more applications directly depend on each other to function in a tightly bound fashion (usually at the data layer), changes in the architecture and data schema of one application can break another. The interesting thing is that I often see this happen simply because the easiest thing to do is reach into another application’s database and pull out the information needed. This is an anti-pattern that I call domain violation. When one application bypasses the API of another application, there is no contract between the two, which leaves the dependent application in a more fragile state. Just like people shouldn’t be able to walk into your house without your permission and eat your food, external systems shouldn’t be able to directly query another system’s data schema. The problem is only exacerbated when multiple applications use the same pattern, creating a daisy chain of system dependencies.
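The difference between going through a contract and committing a domain violation can be sketched in a few lines. Everything here is hypothetical: `CustomerService` plays the owning application, `get_contact` is its API, and `read_email_directly` is an outside application reaching into the owner's database. An in-memory SQLite database is assumed purely for illustration.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerContact:
    """The contract: the only shape outside applications should depend on."""
    customer_id: int
    email: str


class CustomerService:
    """The owning application. Only this class should know the schema."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._migrated = False
        with self._db:
            self._db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT)")
            self._db.execute("INSERT INTO customer VALUES (1, 'ops@example.com')")

    def get_contact(self, customer_id):
        """API call: the schema can change behind it without breaking callers."""
        if self._migrated:
            sql = "SELECT customer_id, primary_email FROM contact WHERE customer_id = ?"
        else:
            sql = "SELECT id, email FROM customer WHERE id = ?"
        row = self._db.execute(sql, (customer_id,)).fetchone()
        return CustomerContact(customer_id=row[0], email=row[1])

    def migrate_schema(self):
        """An internal refactoring the owning team is entitled to make."""
        with self._db:
            self._db.execute(
                "CREATE TABLE contact (customer_id INTEGER PRIMARY KEY, primary_email TEXT)"
            )
            self._db.execute("INSERT INTO contact SELECT id, email FROM customer")
            self._db.execute("DROP TABLE customer")
        self._migrated = True


def read_email_directly(db, customer_id):
    """Domain violation: another application querying the owner's tables."""
    row = db.execute("SELECT email FROM customer WHERE id = ?", (customer_id,)).fetchone()
    return row[0]
```

Before `migrate_schema()` runs, both paths return the same email address. Afterwards `get_contact(1)` still works, while `read_email_directly` raises `sqlite3.OperationalError` because the table it silently depended on no longer exists.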

When one application bypasses the API of another application, it silently takes on the responsibility of handling changes to the other application’s schema and functionality. But as we well know, in production two systems maintained by two different teams are rarely in such close collaboration that each team is aware of every upcoming breaking change. From an enterprise perspective, this is a horrible situation to be in. It puts development teams in a position where they don’t want to touch the code base out of fear it will break. Every large enterprise has legacy applications that no one wants to touch. Often the people maintaining such an application are three layers removed from the team that originally created it. Yet this application may be a core component of business operations. The scary thing about the situation is that, most often, business stakeholders don’t know what a fragile state their key systems are in.
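One way to make that fragility visible is to write down which systems read directly from which other systems' databases, then ask what a single schema change could break. The sketch below assumes such an inventory already exists. The system names in `DIRECT_DB_READS` are invented for illustration, and the traversal is an ordinary breadth-first search rather than any particular tool's method.

```python
from collections import deque

# Hypothetical inventory: each system maps to the systems whose databases it reads directly.
DIRECT_DB_READS = {
    "billing": ["sales"],
    "reporting": ["billing"],
    "shipping": ["sales"],
    "hr_portal": [],
}


def impacted_by(changed_system, direct_db_reads):
    """Return every system that could break, directly or through the daisy
    chain, when changed_system alters its schema."""
    # Invert the inventory: source system -> systems that read from it.
    readers = {}
    for reader, sources in direct_db_reads.items():
        for source in sources:
            readers.setdefault(source, []).append(reader)

    impacted = set()
    queue = deque([changed_system])
    while queue:
        current = queue.popleft()
        for reader in readers.get(current, []):
            if reader != changed_system and reader not in impacted:
                impacted.add(reader)
                queue.append(reader)  # follow the chain to this reader's own readers
    return sorted(impacted)
```

With this inventory, `impacted_by("sales", DIRECT_DB_READS)` returns `['billing', 'reporting', 'shipping']`. The reporting system never touches the sales database, yet it is exposed through billing.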

How We Can Address The Problem

Your next logical question should be: how does a company manage enterprise technical debt? Managing this kind of debt is a proactive activity. The problems are lurking within your business system architecture like a cancer, and one day you may be surprised to walk into the office and not be able to do business. This is obviously a very undesirable situation to be in. The true crime is that if the failure was preventable and you did nothing about it, the wound is self-inflicted.

The challenge is that the business may not see the value in committing time and resources to this kind of effort. The other part of the challenge is that the business could come back and ask why this wasn’t done right the first time. Now IT must defend the original effort, and as such may choose to say nothing instead. Unfortunately, many business stakeholders look at software like a house: a fixed unit of work that is done once it is completed. This is in fact a very poor model of software development. Software development is a journey, and a system will be in constant change over time to meet the current needs of the business. Businesses that run exactly the same way today as they did 20 years ago will most likely face significant market challenges from new, technologically advanced competitors.

The information systems of today’s businesses need to keep up with changing times. Therefore, the applications you are building today should focus on supportability rather than speed of initial implementation. As part of that supportability, each application should adhere to an integration approach that allows systems to be updated independently with minimal risk of breaking other systems. This is a necessity in an enterprise with an ever-growing set of systems spanning heterogeneous platforms.

Spending the time to understand the current state of your enterprise applications and creating plans to remediate potential risks is the only way to stay ahead of enterprise technical debt. As IT budgets are minimized and headcount is reduced, this can be a very difficult thing to do. You may even get away with doing nothing for a while, potentially years or even decades. In the end, the exposure to risk depends on the type of business and the systems involved. You can hope your key business systems continue humming like finely tuned machines, or you can be proactive and avert potential disaster down the road. The choice is yours.



Categories: General, Project Delivery
