If I were to ask you how healthy your technology system is, you probably wouldn’t be able to give me a straight answer. For one thing, every company defines system health differently, perhaps based on a set of pre-existing KPIs or performance metrics that indicate certain goals have been met.
The truth is that no one really knows for sure how healthy their technology environment is today, or how much it will be able to support tomorrow. All they know is what they see in the large quantities of data being collected as technology systems process more and more business transactions and data every day.
The goal of my business, Fortified, and my book, Era of Abundance in Tech: How IT Leaders Can Find Efficiencies to Drive Business Value, is to help companies change the way they think about their technology systems. Just having a system up and running (availability) is not enough; we need to instrument new KPIs that measure the true health of systems. Yes, data folks do a pretty good job of using application performance monitoring (APM) and other tools to show plenty of performance and latency (how long an operation takes) charts and graphs. But ask them about the health of their system today, and what they need to do over the next year to keep it healthy as the business and its data grow, and you’d likely be met with a blank stare.
As technologists, if we are supporting and optimizing technology systems and measuring their performance, we need to be able to answer these questions. The way to do this is by taking steps to accurately measure the health, efficiency, and financial impact of your technology, what I refer to as the three key pillars of a successful technology mindset.
1. Measuring System Health
Analysis of the health and performance of a server is necessary to ensure it is operating and processing transactions optimally. The right type of ongoing workload analytics can lead to more efficient and productive use of server resources, optimize cloud costs, and provide increased system reliability and stability.
System health means more than how your system is running today. It means understanding how your system will be running one-to-three years from now, and how to identify system issues before they lead to a critical malfunction. This means building code and applications not just to solve issues today, but to run efficiently over time.
2. Efficiency as a KPI
A popular tenet of the tech industry is, “If it’s not broken, don’t fix it.” In my experience, unless something is causing an outage, no one goes back to make it more efficient; I don’t know of a single application I’ve worked on in which the team has ever done so. Why? Because they’re too busy putting out fires or developing new features. No one is focused on making the application better.
Most processes were not even designed with efficiency in mind. We need to change that from the base level and bring efficiency metrics upstream as developers are designing and building the system. Once the system is in production, we need a new set of efficiency metrics to identify when efficient processes become inefficient, because this often happens when the amount of data grows and the code is now scanning much more than it was designed to – taking the long way to the destination much like driving a car up a mountain in high gear.
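One way to picture such an efficiency metric is the ratio of rows a query scans to rows it actually returns, compared against a baseline captured when the process was known to be efficient. The sketch below is only an illustration under that assumption; the names `QuerySample`, `scan_ratio`, and `has_regressed`, and the tolerance of 2x, are hypothetical and not taken from any particular monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class QuerySample:
    """One observation of a query's work (hypothetical structure)."""
    rows_scanned: int
    rows_returned: int


def scan_ratio(sample: QuerySample) -> float:
    """Rows scanned per row returned; lower means less wasted work."""
    # Guard against division by zero when a query returns nothing.
    return sample.rows_scanned / max(sample.rows_returned, 1)


def has_regressed(baseline: QuerySample, current: QuerySample,
                  tolerance: float = 2.0) -> bool:
    """Flag the query when its scan ratio exceeds the baseline by `tolerance` times."""
    return scan_ratio(current) > tolerance * scan_ratio(baseline)


# Example: the query returns the same 100 rows, but the table has grown,
# so it now scans 50,000 rows instead of 1,000.
baseline = QuerySample(rows_scanned=1_000, rows_returned=100)
current = QuerySample(rows_scanned=50_000, rows_returned=100)
print(has_regressed(baseline, current))
```

The point of comparing against a baseline rather than a fixed threshold is that the same code can be efficient at one data volume and wasteful at another, which is exactly the drift described above.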
3. Financial Impact
Sure, everyone has financial KPIs that aim to justify the cost of their tools, but I believe there are interdependencies between technology and financial performance that are not fully understood and therefore not fully optimized in most organizations. It’s not just what you spend, but the impact, or value, of what you get out of that spend in terms of systems or applications that often falls by the wayside when we gauge performance.
We are not truly measuring the impact of technology today. When we promote code to production, we want to know the promotion was successful and then we move on. But we should be asking:
- What was the net performance and system impact of the change?
- How did the change impact the cost to run the application in the cloud?
- Did this change increase the percent growth of the data, thereby impacting the capacity forecast over the next year?
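To make the last question concrete, here is a minimal sketch of how a change in data growth rate shifts a capacity forecast. It assumes simple compound monthly growth, which is a simplification; the function names and the figures in the example are hypothetical.

```python
import math
from typing import Optional


def forecast_size_gb(current_gb: float, monthly_growth_pct: float,
                     months: int) -> float:
    """Projected data size after `months`, assuming compound monthly growth."""
    return current_gb * (1 + monthly_growth_pct / 100) ** months


def months_until_full(current_gb: float, monthly_growth_pct: float,
                      capacity_gb: float) -> Optional[int]:
    """Whole months until the data reaches capacity, or None if it never does."""
    if current_gb >= capacity_gb:
        return 0
    if monthly_growth_pct <= 0:
        return None
    growth = 1 + monthly_growth_pct / 100
    return math.ceil(math.log(capacity_gb / current_gb) / math.log(growth))


# Example: a release doubles the monthly growth rate from 2% to 4%
# on a 500 GB database with 1,000 GB of provisioned storage.
before = months_until_full(500, 2, 1000)
after = months_until_full(500, 4, 1000)
print(f"Runway before the change: {before} months, after: {after} months")
```

Running a comparison like this alongside each production promotion would turn "was the deployment successful?" into "what did the deployment do to our runway?"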
Along with data on how the technology is performing, CFOs and business owners want to know where their money is going. In other words, if I am making improvements to the system, what are the financial impacts of those improvements? There must be a way to access these figures in real time, definitively, and from within the system itself.
It’s not common practice in the industry to calculate the Total Cost of Ownership (TCO) for your technology environment or the hundreds of applications your team is supporting, just as you would if you bought a new car. (Yes, I use a lot of car analogies.) But wouldn’t it be nice to know the real cost of supporting each application over time?
My vision is to enable the science of workload analytics to include diving into every system that runs an enterprise, every computer that’s in every data center or in the cloud, and for each one, understand:
- Is the server healthy?
- Is the server rightsized?
- Is the application workload efficient?
- What is the total cost of the applications today and in five years?
- What could the cost savings be if the application were 20 percent more efficient?
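The last two questions lend themselves to simple arithmetic. The sketch below rests on two loudly stated assumptions that are mine, not an industry standard: the run cost grows at a constant annual rate, and "20 percent more efficient" means the application consumes 20 percent fewer billable resources for the same work. The function names and dollar figures are hypothetical.

```python
def total_cost(annual_cost: float, years: int,
               annual_growth_pct: float = 0.0) -> float:
    """Cumulative cost over `years`, with cost growing by a fixed rate each year."""
    growth = 1 + annual_growth_pct / 100
    return sum(annual_cost * growth ** year for year in range(years))


def efficiency_savings(annual_cost: float, years: int,
                       efficiency_gain_pct: float,
                       annual_growth_pct: float = 0.0) -> float:
    """Savings if the application consumed `efficiency_gain_pct` fewer resources."""
    # Assumption: cost scales linearly with resource consumption.
    reduced_cost = annual_cost * (1 - efficiency_gain_pct / 100)
    return (total_cost(annual_cost, years, annual_growth_pct)
            - total_cost(reduced_cost, years, annual_growth_pct))


# Example: an application that costs $100,000 a year to run today,
# growing 10% a year, evaluated over five years.
five_year_cost = total_cost(100_000, 5, annual_growth_pct=10)
savings = efficiency_savings(100_000, 5, 20, annual_growth_pct=10)
print(f"Five-year cost: ${five_year_cost:,.0f}")
print(f"Savings at 20% more efficient: ${savings:,.0f}")
```

A real TCO model would also include licensing, staffing, and support costs, but even this rough version gives a team a dollar figure to attach to an efficiency project.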
These are the types of questions we should be asking to change our mindset about how we measure technology in order to see real results for our systems, not just today, but well into the future.