In my previous blog, I discussed three events creating a perfect storm for mainframe capacity planning:
- Digital disruption
- Increasingly complex hardware and software choices
- Mainframe skills “brain drain”
As promised in that post, here are steps you can take to address each event and ensure accurate and timely capacity planning for mainframes.
Digital Disruption
Tightly align to business applications in planning and reporting. Replace plans and reports organized around service classes or technologies (such as “CICS Regions”) with views of the applications that are critical to the business, such as “Online Banking,” which will likely combine usage and consumption across several technology stacks (CICS, DB2, IMS, MQ).
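To make that concrete, here is a minimal sketch of application-aligned rollup reporting. The mapping table, workload names, and sample records are all hypothetical; the point is simply that one business-application view aggregates consumption from several technology stacks.

```python
from collections import defaultdict

# Hypothetical mapping from (technology stack, workload) to business application.
APP_MAP = {
    ("CICS", "CICSPROD"): "Online Banking",
    ("DB2",  "DB2PROD"):  "Online Banking",
    ("MQ",   "MQBANK"):   "Online Banking",
    ("IMS",  "IMSCLAIM"): "Claims Processing",
}

def rollup_by_application(records):
    """records: iterable of (stack, workload, cpu_seconds) tuples."""
    totals = defaultdict(float)
    for stack, workload, cpu_seconds in records:
        app = APP_MAP.get((stack, workload), "Unmapped")
        totals[app] += cpu_seconds
    return dict(totals)

sample = [("CICS", "CICSPROD", 1250.0), ("DB2", "DB2PROD", 830.0),
          ("MQ", "MQBANK", 95.0), ("IMS", "IMSCLAIM", 410.0)]
print(rollup_by_application(sample))
# {'Online Banking': 2175.0, 'Claims Processing': 410.0}
```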
Recognize the difference between disruptive change and noise. Effective planning begins with a firm understanding of where you are. Determining that requires filtering out noise with machine learning and analytics.
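As an illustration, here is a minimal noise-filtering sketch using a rolling median and median absolute deviation. It assumes hourly utilization samples; real solutions would apply richer machine learning models.

```python
import statistics

def filter_noise(samples, window=24, k=3.0):
    """Replace points more than k deviations from a rolling median."""
    cleaned = []
    for i, x in enumerate(samples):
        history = samples[max(0, i - window):i] or [x]
        med = statistics.median(history)
        # Median absolute deviation; guard against a zero spread.
        mad = statistics.median(abs(h - med) for h in history) or 1.0
        cleaned.append(med if abs(x - med) > k * mad else x)
    return cleaned
```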
Be prepared to rapidly evaluate impacts of changes and alternatives. Business events happen much more rapidly in today’s environment; even the pace of change is changing, according to the 2018 Mainframe Survey from BMC. A capacity plan that takes weeks or months to develop cannot support such a dynamic environment. Capacity planning should be able to respond quickly to even subtle changes.
Apply sensitivity analysis to manage uncertainty. The disruption hitting mainframes introduces a level of uncertainty that has not been seen before. One way to account for uncertainty in the plan is to run many different scenarios across the range of uncertainty, to identify “breaking points” in activity that require capacity adjustments.
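A minimal sketch of that scenario sweep follows. The MIPS figures, growth range, and 85% utilization threshold are all illustrative assumptions.

```python
def find_breaking_point(base_mips, capacity_mips, growth_rates, threshold=0.85):
    """Sweep workload growth scenarios; return the first that breaches the threshold."""
    for growth in growth_rates:
        utilization = base_mips * (1 + growth) / capacity_mips
        if utilization > threshold:
            return growth, utilization
    return None

growths = [g / 100 for g in range(0, 101, 5)]  # 0% .. 100% in 5% steps
result = find_breaking_point(base_mips=6000, capacity_mips=9000, growth_rates=growths)
print(result)  # (0.3, 0.866...) -> ~30% growth breaches the 85% threshold
```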
Complex Hardware and Software Options
Rationalize hardware costs versus performance. While MIPS and CPU utilization have been the default for expressing capacity needs, the requirements of a demanding user population make responsiveness and availability the top measures of success. Capacity plans must include the performance of specific business applications as key indicators of whether the plan can deliver on business services.
With a responsiveness focus, use capacity analysis to identify underlying resource constraints, and then evaluate a range of hardware responses. In some cases, the underlying conditions may show that an assumed option (such as a CPU upgrade) will not fix the performance shortfall, while another option (such as a peripheral upgrade) will deliver the required performance.
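The sketch below illustrates the point with a simple M/M/1 queueing approximation (response time R = S / (1 - U)). The service times and utilizations are invented, but they show how an I/O-bound system barely benefits from a CPU upgrade.

```python
def mm1_response(service_time, utilization):
    """M/M/1 approximation: response time grows sharply as utilization nears 1."""
    return service_time / (1.0 - utilization)

def txn_response(cpu_util, io_util, cpu_svc=0.002, io_svc=0.008):
    """Transaction response = CPU component + I/O component (seconds)."""
    return mm1_response(cpu_svc, cpu_util) + mm1_response(io_svc, io_util)

baseline    = txn_response(cpu_util=0.70, io_util=0.90)  # I/O-bound system
cpu_upgrade = txn_response(cpu_util=0.45, io_util=0.90)  # faster CPU, same disks
io_upgrade  = txn_response(cpu_util=0.70, io_util=0.55)  # faster peripherals

print(f"baseline:    {baseline * 1000:.1f} ms")     # ~86.7 ms
print(f"CPU upgrade: {cpu_upgrade * 1000:.1f} ms")  # ~83.6 ms: barely moves
print(f"I/O upgrade: {io_upgrade * 1000:.1f} ms")   # ~24.4 ms: fixes the shortfall
```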
Right-size software cost options. An increasing menu of software licensing options creates a bewildering range of choices, from Country Multiplex Pricing to various forms of software containers. In many cases, the best way to determine the lowest ongoing cost and optimal capacity options is to analyze and optimize current costs and capacity. Then evaluate the impact of an assumed future state, such as moving work into a software container, on both its own capacity requirements and on other work.
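As one concrete piece of that analysis, here is a minimal sketch of the rolling four-hour average (R4HA) MSU calculation that underlies IBM sub-capacity pricing options. The hourly MSU samples are hypothetical.

```python
def peak_r4ha(msu_by_hour, window=4):
    """Return the peak rolling four-hour average over a series of hourly MSU samples."""
    peaks = [sum(msu_by_hour[i:i + window]) / window
             for i in range(len(msu_by_hour) - window + 1)]
    return max(peaks)

hourly_msu = [310, 295, 480, 520, 505, 470, 330, 300]
print(f"Peak R4HA: {peak_r4ha(hourly_msu):.0f} MSU")  # 494: the billing-relevant peak
```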
Address volatility with rapid scenario planning. This is closely related to managing uncertainty, but the emphasis for volatility is on “rapid” evaluation as well as multiple scenarios. It is also possible to be proactive about volatility (even though it sounds like an oxymoron). Apply sensitivity analysis to plan inputs that might become highly volatile, and again, develop the range of potential performance results.
Support a wide range of what-if alternatives. To support a dynamic and volatile environment, capacity planning needs to accommodate a wide range of “what-if” scenarios, ranging from workload increases, decreases, and changes, to CPU and peripheral options, to modifying the configuration of existing resources, to adding or removing systems, LPARs, and workloads.
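One way to keep such diverse alternatives manageable is a common scenario structure that any evaluation step can consume. The sketch below is an assumption about how such a structure might look, not a description of any particular product.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Scenario:
    name: str
    workload_growth: dict = field(default_factory=dict)  # workload -> fractional change
    cpu_upgrade: Optional[str] = None                    # e.g. target processor model
    add_lpars: list = field(default_factory=list)
    remove_workloads: list = field(default_factory=list)

scenarios = [
    Scenario("baseline"),
    Scenario("holiday peak", workload_growth={"Online Banking": 0.40}),
    Scenario("consolidation", remove_workloads=["LegacyBatch"], add_lpars=["LPAR5"]),
]
```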
Brain Drain
Adopt solutions with engineered-in SME intelligence. A skilled capacity planner will take certain actions when analyzing performance and capacity needs. As skilled capacity planners leave the workforce, a capacity planning solution with built-in subject matter expert (SME) knowledge will help newer technicians develop answers and avoid major errors.
Automate, automate, automate. Capacity planning may be the best IT example of the “cobbler’s children” dilemma: IT applies automation to so many aspects of IT Operations Management, yet capacity planners still spend substantial time on manual tasks. Automating standard tasks frees technicians to work on higher-value activities and mitigates the risks of manual work. Even more importantly, automating actions such as creating models every day ensures that capacity planning can respond quickly to the disruptive events that may require adjustments to the plan.
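For example, a daily model build can be a small unattended job. In this minimal sketch, the data extract and model fit are stand-in placeholders for whatever your tooling provides; the pattern of scheduled daily runs is the point.

```python
import datetime
import json
import random

def load_smf_summary(day):
    # Stand-in for a real SMF/RMF data extract: returns 24 hourly CPU% samples.
    return [random.uniform(40, 90) for _ in range(24)]

def fit_capacity_model(samples):
    # Stand-in for a real model build: just a baseline mean and peak here.
    return {"mean": sum(samples) / len(samples), "peak": max(samples)}

def build_daily_model():
    today = datetime.date.today()
    model = fit_capacity_model(load_smf_summary(today))
    with open(f"capacity-{today:%Y%m%d}.json", "w") as f:
        json.dump(model, f)

if __name__ == "__main__":
    build_daily_model()  # schedule this to run unattended once per day
```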
Eliminate programming. In many organizations, capacity planning relies on home-grown tools created in SAS, SQL, or Excel macros. Such programming consumes substantial time to create and maintain, introduces potential for errors, and cannot respond quickly enough to the volatility of modern environments. In addition, any of that programming that runs on the mainframe adds load to the very resource being optimized.
Gain insights with machine learning analytics. Volatility makes it difficult to identify a basis for planning, and it changes the characteristics of what “normal” processing and usage look like. Machine learning and analytics can keep up with that volatility, providing a clearer understanding of “normal” and of how much variability to expect.
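As a simple stand-in for what such analytics compute, the sketch below derives a “normal” band (rolling mean plus or minus two standard deviations) from hypothetical utilization samples; production solutions would use far richer models.

```python
import random
import statistics

def normal_band(samples, window=24, k=2.0):
    """Return (low, high) 'normal' bounds for each point after the first window."""
    bands = []
    for i in range(window, len(samples)):
        hist = samples[i - window:i]
        mu = statistics.fmean(hist)
        sigma = statistics.stdev(hist)
        bands.append((mu - k * sigma, mu + k * sigma))
    return bands

# Hypothetical hourly CPU% samples: a stable base with random variability.
cpu = [60 + random.gauss(0, 5) for _ in range(72)]
low, high = normal_band(cpu)[-1]
print(f"latest 'normal' band: {low:.1f}% .. {high:.1f}%")
```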
What next? Evaluate your mainframe capacity planning approach and see how many of these steps you have in place. Identify shortfalls and create a plan to implement improvements. Remember the words of management guru Peter Drucker: “Long range planning does not deal with future decisions, but with the future of present decisions.” Storm-proof your capacity planning by taking these steps, and you can have confidence in the future of the present decisions you make.