Defining a cloud solution by the skills you actually have
Here’s a scenario you may have experienced: The consensus of the cloud architecture team is that a “cloud-native” solution is the best approach to a new business application that will define the next generation of the company. Containers and container orchestration will lead the way to a solution that’s going to span multiple clouds and even existing data center-based platforms.
The decision was not made lightly. The team worked hard to understand the core business problems and objectively evaluated available solutions, architectural patterns, and accepted best practices to find the solution that best fits the requirements. In this case, it’s containers and container orchestration tech.
Of course, any good solution should include a skills matrix that defines the skills needed, the gap between those skills and the ones employees currently have, and the training and recruiting needed to close that gap. Then the solution can be built, tested, and operated. This is where the bad news starts.
The HR team looks at the new skills requirements you provided and says recruiting for those specific skills will take much more time than stated in your plan. Also, the cost of these skills, even for those who will be internally trained, will be much higher than budgeted, and the people with those skills will need to be paid more to ensure they don’t leave for more lucrative opportunities.
As I often say, with enough money and time, most things are possible. Of course, businesses don’t have unlimited money or time, so they consider fallback solutions. They build a solution that’s not optimized to meet the requirements but can be done on time, within budget, with the skills they already have or can find in a timely manner.
So, state-of-the-art cloud-native architecture is traded for lower-order or more pervasive technology where the skills already exist or can be found more easily. Lacking a name for this, I’ve been calling it “skills availability–defined architecture.”
We’ve all made similar compromises. During the pandemic, shortages of durable goods caused by supply chain issues meant that we often did not get our first-choice dishwasher and settled for our second or third choice, if indeed we could find and afford one. It’s much the same concept here, and it’s really another way to create technical debt.
This is a bit of a quandary for those of us who promote fully optimized cloud computing solutions but keep running into business realities.
There is no easy resolution to this issue. Many will extend the project time horizon to accommodate the additional time needed to find the skills or train internal teammates. They’ll also add 50% more budget to cover recruiters, trainers, and salaries that are likely to run 30% to 40% higher than standard (plus hiring bonuses to attract and retain the talent you need).
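To make that arithmetic concrete, here is a minimal Python sketch. The 50% budget uplift and the 30% to 40% salary premium come from the paragraph above; the dollar figures and function names are hypothetical, chosen only for illustration.

```python
# Illustrative only: the salary and budget inputs below are made-up numbers.
# Percentages are kept as whole integers to avoid floating-point rounding.

def premium_salary_range(standard_salary, low_pct=30, high_pct=40):
    """Return the (low, high) salary after applying the skills premium."""
    low = standard_salary * (100 + low_pct) // 100
    high = standard_salary * (100 + high_pct) // 100
    return low, high

def padded_budget(original_budget, uplift_pct=50):
    """Return the budget after adding the uplift for recruiting and training."""
    return original_budget * (100 + uplift_pct) // 100

# Hypothetical inputs: a $120,000 standard salary and a $2,000,000 plan.
salary_low, salary_high = premium_salary_range(120_000)
new_budget = padded_budget(2_000_000)

print(salary_low, salary_high)  # 156000 168000
print(new_budget)               # 3000000
```

Hiring bonuses are left out because the paragraph gives no figure for them; they would sit on top of these numbers.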
Although technology groups are usually more than willing to open their pocketbooks to build a solution the right way, delays and budget overruns often don’t make it past CEOs, CFOs, and boards of directors. You can see their point: A delay in deploying a critical system could cause millions in lost revenue and a much lower value overall. Budget overruns remove profit and result in lower earnings per share, which in many cases lowers the company’s stock price.
Many cloud development projects that started in 2022, both large and small, are facing this conundrum and realizing there are no easy solutions. The needs of the business have to take priority, and that may mean a compromise between what should be built and what can be built.
Your only option is to arm yourself with information. What skills are out there? How much do they cost versus what an HR person may think? What will it take to train existing teammates, and how will you retain them once you do?
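Those questions map onto the skills matrix described earlier. As a rough sketch, and assuming you can estimate how many people you need, how many you have, and how many you could train in time, the gap per skill might be tallied like this in Python. The skill names, headcounts, and field names are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SkillNeed:
    skill: str
    required: int   # people the solution needs with this skill
    available: int  # people on staff who already have it
    trainable: int  # existing staff who could be trained in time

def plan_gap(need):
    """Split the shortfall for one skill into training and recruiting."""
    gap = max(need.required - need.available, 0)
    to_train = min(gap, need.trainable)  # train first, up to capacity
    to_recruit = gap - to_train          # recruit whatever remains
    return {"skill": need.skill, "gap": gap,
            "train": to_train, "recruit": to_recruit}

# Hypothetical headcounts for illustration only.
needs = [
    SkillNeed("container orchestration", required=6, available=1, trainable=3),
    SkillNeed("multicloud networking", required=3, available=3, trainable=0),
]
plan = [plan_gap(n) for n in needs]

for row in plan:
    print(row)
```

A large recruit count for a scarce skill is the early warning that the plan's timeline and budget deserve a second look before they reach HR.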
Make sure you understand what the options truly are, and watch out for perceptions that may not be correct. Then you can make the best choices for the business. The result may not be the solution you would have liked, but it will be an informed compromise.