An IT Parable

September 29th, 2017

There was once a CIO with some money to spend (you can already tell this is a fictional account). He went to three project managers and gave them each a million dollars to implement an IT project that would add value to the organization.

The first project manager hired a consulting firm and developed a set of best practices for change management across the IT organization and even included the customer offices in the organization in the discussions. In the end, they had a change management toolset that was adopted across the organization with full buy-in from all stakeholders (again, fiction, right?).

The second project manager assembled a team of crack developers and ran a flawless agile project, producing a social software package that integrated with Facebook, Twitter, YouTube and your internet enabled microwave to reach out to the organization’s consistently underrepresented twenty-something demographic.

Finally, the third project manager developed a plan to replace some of the organization’s aging IT infrastructure. He wrote a grant to the local power company and personally staffed a bake sale to fund the replacement of their energy-hungry servers, and he ran a pilot project deploying fiber-based networking between key pieces of infrastructure, improving the performance of key services and earning him a Nobel Prize in IT Management (because, like, that totally ought to exist, right?).

Upon completion of their projects, the project managers came to the CIO. He asked them to explain how they had added value to the company. The first project manager said, “The change management practices and tools we put in place have dramatically reduced the number of change-related outages and improved overall communication both within the IT organization and between the IT organization and the customer offices.”

The second project manager said, “I increased our market share in the 18-24 demographic by 130% and the 25-32 market by 295%. Our Facebook app is being used by over two hundred thousand individuals in just the first two months, and we project a user base of over a million by year’s end.”

The third project manager said, “I improved our company’s green image and I project we’ll save over $75,000 in utility costs in the first year alone. Our dedicated fiber channel initiative has improved the reliability of key infrastructure and will be rolling out across the enterprise in the next two years.”

And they all lived happily ever after…

No, wait…

Why don’t projects usually go this way? Let’s look at each project.

Project one was a change management project. This type of project generally requires more time talking about policy and practices than it actually takes to do the technical implementation. These projects require a balance: involving enough viewpoints to ensure the output of the project meets the intended need, without stacking the project with so many viewpoints that they can’t all possibly be met. It is critical in these kinds of projects to have a highly engaged sponsor because there will invariably be conflicts between the priorities of the different stakeholders. It is especially critical for these kinds of projects to have clearly defined outputs (quantitative) and outcomes (qualitative) and for the sponsor to prioritize them. Without this kind of guidance, these types of projects tend to fail.

Project two was an innovation project. Most organizations take on these kinds of projects from time to time. One of the challenges with these types of IT projects is that there are a lot of unknowns, perhaps because the initiative is on the bleeding edge and there aren’t any best practices for the tools being implemented or the practices being defined. It is critical that these types of projects are allowed to happen and just as critical that some of them be allowed to fail — in fact, leadership should expect that there will be failures in innovation projects because the level of unknowns tends to make them more risky. These types of projects are perhaps the best fit for agile development methodologies because they allow a team to “fail fast”, looking for the riskiest parts of the project and trying to tackle them as early as possible in the project schedule. Note that fail fast doesn’t necessarily mean that the project will fail, just that a particular aspect of the project may fail. But early identification of failures provides time for fixes that can make the project successful.

Finally, project three is about addressing technical debt — aging infrastructure. The first challenge for these kinds of projects is getting them rolling in the first place. It’s easy to let entropy kick in or overlook technical debt. There are always other projects that are more interesting or can present a clearer return on investment. Projects that address technical debt deliver more subtle value, but they are all about building capacity — either by improving the performance of existing tools or enabling the enterprise to expand. Recognizing the value of these types of projects requires strategic thinking, considering the long-term cost of technical debt and the value of addressing it.

There are certainly plenty of other reasons why IT projects fail, but these are some of the most common I have run into.

The Industrialization of IT

October 4th, 2014

A while back, maybe 15-20 years ago, if you wanted an information system to support a specific business use, it was a reasonable proposition to build it yourself. It might have been built on a mainframe, as a client/server application or even a web application. But in the past 15 years, more and more commercial-off-the-shelf (COTS) products have become available, such that there is something for just about every industry and niche. Now, the common wisdom is that it makes more sense to implement one of these COTS packages rather than building your own. There are a number of reasons this may be true, but the goal of this essay is not to discuss whether this is a valid assumption, but to address the impact of this change in the IT industry.

This trend has led to what many have referred to as the industrialization of IT. Think about it, 20 years ago, if you worked supporting enterprise applications, you were a craftsperson, building and maintaining a custom-built piece of business machinery. There may have been a sense of pride about what was built and those who supported the system knew every nook and cranny, if not individually, then collectively.

The challenge comes when you replace that custom machinery with a commercial package. What are those craftspeople to do? Without much thought, management often expects them to simply embrace their new role as knob-tweakers of this new machinery. There certainly are opportunities for creativity, but much of that creativity is aimed at compensating for the shortcomings of the software rather than truly meeting the needs of the customer.

There are three primary theoretical advantages of COTS solutions:

1. You don’t have to start from scratch – These solutions are generally built around industry best practices, so whether you’re implementing a new finance system, a system for room scheduling or a new ticketing system, you’ll be starting with a system that theoretically implements those best practices.

2. You don’t have to make your own updates – With the pace of change in the industry, if you build your own solutions, you will need to update them regularly to keep pace. For example, if you built your own web application ten years ago, you’ll need to retool it to be responsive so it works on mobile platforms. If you purchased a COTS solution, hopefully your vendor has provided, or shortly will provide, an update that makes it responsive.

3. They take fewer resources to maintain – This is the most tenuous of the advantages, because the choices you make may play a large role in how much effort it takes to maintain. If you implement the solution with little to no modification and limit efforts to make changes on an ongoing basis, you might, might, lower your maintenance costs.

It’s really this last theoretical advantage that’s the sticking point. If you customize your COTS solution and can’t manage to contain maintenance costs, you absolutely must turn your craftspeople into knob-tweakers. If, on the other hand, you can limit the amount of customization and actually lower your ongoing support costs, your IT craftspeople can continue to practice their craft in new, interesting ways, solving problems that we couldn’t address previously.

That’s the critical decision point in managing the industrialization of IT. If we can’t actually lower our support costs with these tools to free up our craftspeople for innovation, what’s the point really?

Defining an IT Problem

September 23rd, 2014

I previously wrote a little bit about defining IT problems, but I was recently asked to deliver a few hour-long sessions on requirements gathering, and problem statements were among the topics I covered. I needed to put together something more formal than my short blog post and something that would be memorable. I needed a mnemonic device and I couldn’t find one out there, so I made up my own:

Got a problem? Think SPASM OW!

That’s right, here’s how it works:

Size
Process
Audience
Symptoms
Metrics

Outcomes
Who, What, When, Where, Why

Here are some more details:

Size – How big is the problem? How many people are affected? How much money is it costing us? How much time is it taking? Quantify the size of the problem in a measurable way.

Process – Are there business processes impacted by or causing the problem? Name the processes. Later, when you’re defining the rest of your requirements, it will be important to be able to reference them. This will also help you maintain a sense of the scope of the problem. If a process is not indicated in the problem statement, it’s out of scope.

Audience – Think about who the audience for the problem statement is. The implementation team will need to be able to read it. Perhaps there is a governance board that will need to approve the project. Make sure your problem statement is accessible to all audiences.

Symptoms – How do you know you have a problem? You have already defined the measurable size of the problem, which is part of the symptoms, but think about how the problem presents itself. Sometimes I like to write what I call anti use cases, which walk through scenarios in which people are not able to successfully complete a task. Describe their activities and focus on the pain points.

Metrics – How will you know when the problem is solved? You can start with the metrics represented by the size factors you defined, and realize that those will be your benchmark. But there will likely be other ways you can measure the problem or your success when you solve a problem. Ideally, you can take a snapshot of those metrics at the current time and later use those metrics to measure the effectiveness of the solution.

Outcomes – It’s not all about quantitative measures, though. Sometimes there are qualitative factors to be considered as well: how will solving the problem improve someone’s job satisfaction? Or will solving the problem allow you to provide other services you don’t currently have time for?

Who, What, When, Where, Why – So it’s a little bit of a cop-out, but this is our catch-all. If the previous concepts didn’t surface all of the factors of the problem, walk through your five Ws and make sure you’ve got everything covered. Some of these may be redundant with other concepts, but that’s fine; just make sure you’re covering all of your bases.
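
SPASM OW is just a checklist, but if it helps to see it as a concrete structure, here is a minimal sketch in Python. The field names, the gaps() helper and the help desk example are mine, purely for illustration; they are not part of the mnemonic itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ProblemStatement:
    """A SPASM OW checklist captured as a structure (illustrative only)."""
    size: str                 # quantified scope: people affected, dollars, hours
    processes: List[str]      # named business processes in scope
    audience: List[str]       # who needs to be able to read and approve this
    symptoms: List[str]       # how the problem presents itself (anti use cases)
    metrics: List[str]        # benchmarks to capture now and re-measure later
    outcomes: List[str]       # qualitative improvements you expect
    five_ws: Dict[str, str] = field(default_factory=dict)  # catch-all: who/what/when/where/why

    def gaps(self) -> List[str]:
        """Return the names of any elements that are still empty."""
        return [name for name, value in vars(self).items() if not value]

# A hypothetical example: a help desk backlog problem
stmt = ProblemStatement(
    size="About 40 tickets per week wait more than 3 business days",
    processes=["Help desk triage"],
    audience=["Implementation team", "IT governance board"],
    symptoms=["Callers re-open tickets because no one follows up"],
    metrics=["Median time to first response", "Ticket re-open rate"],
    outcomes=["Less firefighting for tier-1 staff"],
)
print(stmt.gaps())  # ['five_ws'], a reminder to walk through the five Ws
```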

There’s a starting point for you. Think SPASM OW!

Not All Data is Created Equal

July 27th, 2013

If you search the internet for definitions of the word data you’ll find a wide variety of definitions to choose from. But I like this definition from Wikipedia:

Values of qualitative or quantitative variables, belonging to a set of items

But where does data come from? Does it just naturally exist and we wrote it down? No, data is the result of a measurement, observation or decision. This means that the reliability and usefulness of the data are dependent on the measurement or observation tools. This leads us to observation one:

The quality of the data is dependent on how you designed your yardstick.

Also note that the Wikipedia definition indicates that it belongs to a set of items, i.e. there is more than one (data is plural, after all) and there is at least some limited context based on the set it belongs to. When we struggle with questions about “big data” we need to think not only about where the data came from and how it was measured or recorded, but we also need to know how it relates to other data. This leads us to observation two:

How you use data is dependent on your understanding of its context.

From the perspective of a data user, some due diligence to understand these concepts is crucial to effectively supporting decision making. If you are designing an information system, it is critical that you consider these two factors in the design of the system. It is also important that you communicate these two factors to those who will be consuming the data.
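
As a loose illustration of both observations, here is a small sketch (the names, fields and values are hypothetical) of the difference between recording a bare value and recording a value along with the context of how it was measured:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# A bare number says nothing about the yardstick or the set it belongs to.
raw_value = 72.4

# Recording how and when the value was measured preserves the context a
# consumer needs in order to decide whether and how to use it.
@dataclass
class Measurement:
    value: float
    unit: str             # the yardstick: what was actually measured
    instrument: str       # how it was measured, and therefore how reliable it is
    dataset: str          # the set of items this value belongs to
    recorded_at: datetime

reading = Measurement(
    value=72.4,
    unit="degrees Fahrenheit",
    instrument="lobby thermostat, calibrated 2013-01-15",
    dataset="building-temperature-log",
    recorded_at=datetime(2013, 7, 27, 9, 30, tzinfo=timezone.utc),
)
```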

What Could Possibly Go Wrong?

May 3rd, 2013

After a recent service failure at work, I commented to a colleague that, far too often, we ask the question “What could possibly go wrong?” rhetorically. Instead, we should always be considering the risk of any implementation or change. What could possibly go wrong? It is a question that deserves an answer in the form of a list, not just a self-confident shrug.

Architecture in Balance

March 6th, 2013

I find myself in meetings making that hand gesture a lot lately — you know, the one where you hold your hands out to the sides and mimic a scale? It’s because I’ve spent a lot of time talking about the competing priorities when making architectural decisions for systems. For example:

  • Do we let the load balancer handle the traffic routing to all web servers because it’s good at that or do we avoid that because it’s a potential single point of failure?
  • How much refactoring of code is enough? Refactoring makes it easier to test your code and has the potential to make it more maintainable because you may be able to make a change in one place. But too much indirection in code can negate that maintenance value.
  • Does it make sense for our Mac users to run VMs on their own machines, which offers more flexibility for them and less support cost to our admins? Or do we run a central Windows VM server for all non-Windows users who need to use certain Windows-only software?
  • How secure does that server need to be? Can anyone log in? Or do they need to be on the VPN? Or can you only access it if you physically walk into the server room?

You get the idea. We make decisions daily about these kinds of things, whether we do so consciously or not. What guides these decisions is important to our success as IT professionals.

Informing Your Decision

A while back, I ran across this chapter from a Microsoft guide to application architecture. It outlines thirteen common quality attributes (-itys) of software applications, but the majority of them are equally applicable to any IT architecture:

  • Conceptual Integrity
  • Maintainability
  • Reusability
  • Availability
  • Interoperability
  • Manageability
  • Performance
  • Reliability
  • Scalability
  • Security
  • Supportability
  • Testability
  • Usability

When you run into a situation where you’re struggling to make an architectural decision, a useful approach is to review this list of quality attributes and identify which ones are priorities for the service you are implementing. It may also be useful to ensure you understand the priorities within the constraints of the project management triangle for your initiative:

  • Cost
  • Scope
  • Schedule

Example

Let’s look at example two from above, the code refactoring question, where the following qualities primarily come into play. The pluses and minuses after each quality indicate whether refactoring of code benefits or detracts from that factor:

  • Maintainability – making changes to code that is factored well is generally less likely to introduce new bugs because changes will likely be isolated to small parts of the code. (+)
  • Reusability – generally, the more granular the code is the more reusable it will be (+)
  • Supportability – more indirection in code may make support — tracking down bugs — more difficult when a developer needs to lift the hood on the system. (-)
  • Testability – the more granular the code the easier it generally is to test, thus making it easier to prevent regression with changes. (+)
  • Schedule – generally, doing refactoring right over the course of a project will take some more time — although it often pays for itself during development. At worst, it’s a negative during the development cycle and at best probably a net-zero gain. (-)

Once you have assembled this list for a given system implementation, you can review the factors that are most important.
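
To make the refactoring trade-off concrete, here is a small, hypothetical sketch (the invoice example is mine, not from the Microsoft guide): the first version mixes parsing and arithmetic in one function, while the second extracts a helper that can be tested and reused on its own, at the cost of one more level of indirection.

```python
from typing import List, Tuple

# Before: parsing and arithmetic are tangled together, so a unit test or a
# bug fix has to deal with both concerns at once.
def total_invoice_before(lines: List[str]) -> float:
    total = 0.0
    for line in lines:
        quantity, price = line.split(",")
        total += int(quantity) * float(price)
    return total

# After: the parsing step is factored out. Each piece is easier to test and
# reuse in isolation, but a reader now has one more function to follow.
def parse_line(line: str) -> Tuple[int, float]:
    quantity, price = line.split(",")
    return int(quantity), float(price)

def total_invoice_after(lines: List[str]) -> float:
    return sum(quantity * price for quantity, price in map(parse_line, lines))

# Both versions agree on the result; the difference is in maintainability,
# testability and the small amount of indirection the second introduces.
assert total_invoice_before(["2,3.50", "1,10.00"]) == total_invoice_after(["2,3.50", "1,10.00"])
```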

This Is IT: The Refrigerator

March 3rd, 2013

Fridge as IT

A few weeks back, my historian-stay-at-home-mom-craft-diva-sunday-school-teacher wife was feeling overwhelmed with the amount of work she had to do. So she took action to get herself organized and sent me this picture.

This is a fine example of hidden information technology. Whether you write on your fridge with a dry erase marker or just use magnets to stick important documents to it (a check you need to deposit, for example), you are using your refrigerator as information technology. Sure, there are lots of great productivity tools out there. My wife even uses Cultured Code’s Things on her iPhone to keep track of her tasks on a day-to-day basis. But sometimes nothing beats a dry erase marker for unloading your brain.

Why is it important to recognize these invisible technologies? Because it’s critical to understand how we process information on a day-to-day basis, whether in our personal or professional lives. If we want to do a better job of handling information, it’s a good idea to assess how we currently manage it (or don’t). Recognizing the less-than-obvious information technologies around the house or in the workplace gives us a good baseline for what we can optimize. That said, in the context of the home, perhaps the fridge-as-whiteboard is already pretty optimized if you use it to track information that is specific to the home as a geographic location.

Managing Technical Debt

January 23rd, 2012

I was reading this blog post by Gartner Analyst David Norton this morning and it got me to thinking that I’m not sure we have a good definition of technical debt to work with. I actually just had a conversation about this at a design review last week, and it was clear that we did not all have a common understanding of what technical debt means. We have a new set of mobile applications that we want to get out the door ASAP and it’s new ground for us. We haven’t done a ton of work to define best practices here, so we’re building the plane as we’re flying it. This is a good example of a situation where projects tend to accumulate technical debt.

At some point I’m hopeful that we’ll lay out a reference architecture for building web-based mobile applications. I’m actually leading the charge on defining that right now. At that point, we’ll likely have a number of applications already in the hopper and, chances are, most of them will not fit the best practices we’ve defined. Right now, we already know this is the case, but we don’t know what it would mean to pay off that debt until we have those best practices in place. Once they’re nailed down, we know what that debt looks like. We then have to decide if we’re going to pay it off. I think that’s a key point here. Technical debt is often discussed as something that must be paid off, and I don’t think that’s true.

Say you implement something before there’s an internal standard for it. It could be a new application, a new network install, a new desktop configuration. That only formally becomes technical debt once the standard is defined. Whether we pay off that debt depends on the technical interest rate, as it were. If having this non-standard implementation in place increases support costs on an ongoing basis, it makes sense to pay off that debt sooner rather than later. For example, if all but a few of your workstations are centrally managed, you have to exert additional effort on an ongoing basis to keep those outliers up to date. Paying off the technical debt in a timely manner will be a worthwhile practice. The challenge is the non-standard implementations that don’t have obvious ongoing expenses (and obvious is an important term here).

Take the mobile application. If it’s a simple application and we’re not going to be making updates to it, that debt mostly isn’t noticeable. The place where recognizing the debt becomes important is when it comes to the risk that a non-standard implementation brings. Even if we don’t make any changes to the application itself, there will be implications when it comes to infrastructure upgrades, for example. This application may require special testing above and beyond that which is required of other services. So it’s important to think carefully when considering what technical debt your services may be carrying. Consider documenting the debt as you accumulate it rather than simply accepting that accumulation. It will make it much easier to manage.
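
As a rough sketch of what documenting debt as you go might look like, here is one way to record an entry along with its “interest rate”. The fields, the numbers and the rule of thumb are hypothetical, purely to show the shape of the record:

```python
from dataclasses import dataclass

@dataclass
class TechnicalDebtItem:
    description: str               # what was done non-standardly, and why
    incurred_in: str               # the project or release that introduced it
    annual_interest_hours: float   # rough ongoing cost: extra testing, patching, workarounds
    payoff_estimate_hours: float   # rough one-time cost to bring it up to standard

    def worth_paying_off(self, horizon_years: float = 2.0) -> bool:
        """Crude rule of thumb: pay the debt off if the interest over the
        planning horizon exceeds the one-time cost of retiring it."""
        return self.annual_interest_hours * horizon_years > self.payoff_estimate_hours

mobile_app_debt = TechnicalDebtItem(
    description="Mobile app built before the reference architecture existed",
    incurred_in="2012 mobile pilot",
    annual_interest_hours=12.0,   # e.g., extra regression testing on infrastructure upgrades
    payoff_estimate_hours=80.0,
)
print(mobile_app_debt.worth_paying_off())  # False: the interest alone doesn't justify a rework yet
```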

Also consider the possibility of a strategic default on your technical debt — declaring technical bankruptcy, as it were. This certainly isn’t always an option, but given the opportunity to retire a service that carries significant technical debt, one should carefully weigh:

  • the cost of the ongoing debt relative to the value the service offers the organization, versus
  • the replacement cost of a new, debt-free service.

Perhaps one of the biggest challenges of identifying technical debt is that everything has maintenance costs. In IT, there is no set-it-and-forget-it. This leaves a blurry line between an acceptable level of ongoing maintenance expense and true technical debt. As such, it would be beneficial to catalog not just known technical debt during the development phase of a new service (missing tests, non-standard implementations, missing documentation) but also potential areas of as-yet-undiscovered technical debt. This could be something like poor performance in the production environment.

In the end, the first goal should be to be proactive about identifying technical debt instead of waiting for it to surprise us.

When Technology Won’t Help

January 9th, 2012

I keep a copy of this cartoon posted outside my office:

P.S. Mueller - Technology Will Save You

It’s a reminder to myself and everyone who visits that technology is not our savior and a large portion of our problems will only be solved when we think about the root cause of a given problem. This point was driven home to me yet again while reading a great paper on UI design for multiple platforms by Gartner Analyst Ray Valdes. This quote was the one that really jumped out at me:

Although the user experience of many enterprise applications is, at best, mediocre and often user hostile, this dismal state is rarely a direct result of technology; therefore, new technologies are unlikely to fix the problems.

This gets to the heart of the problem that we, as IT professionals, don’t focus enough on identifying the problem we’re trying to solve. If we always start with a clear definition of the problem we’re trying to solve, we’re much more likely to add value to the larger organization.

Your Best Guess

September 2nd, 2011

I have spent a lot of time in meetings where people sit around guessing:

  • How will they respond to the proposal?
  • What will happen to the system under load?
  • What do they want in the report?
  • What does that requirement in the spec mean?

There is a lot of guessing involved in the IT profession. Some of it is valuable and some of it isn’t. What I would like to provide in just a few paragraphs is some guidance on when guessing is appropriate and when it’s a waste of time.

What’s the Question?

The first thing to consider is that when you’re guessing about something, there’s a question with an answer you don’t know. So when you find yourself stuck guessing, whether it’s by yourself or in a meeting with others, step back and figure out what the question is that needs to be answered. If there’s someone who can answer the question in a timely manner, assign someone the action item to follow up with that person and move on. If it’s unclear who can answer the question, or if the person with the answer isn’t readily available, focus your effort on identifying the person with the authority to answer the question or on how to expedite the request for an answer.

To the extent that the answer to the question at hand will have a significant impact on the implementation of an IT solution and timeliness is essential, focus on the most likely answers to the question and the next steps required to address each of them. These next steps become potential action items that individuals can act on as soon as the answer is known, without reconvening the group.

Who Can Answer the Question?

There are certainly times you can’t identify a person who will know a discrete answer to a question. Common situations include performance, system architecture, and time estimates for complex tasks, where pinning down an exact answer is essentially impossible. That said, even in these cases you should be able to identify someone who can help make an informed decision. For example, if you’re trying to decide on the scale of a server farm but don’t know how the servers will perform under load, you need to identify someone who can provide you with some additional information. Remember that the more information you have, the better your chance of making the right decision.

In the context of project management, guessing is a risk management technique. Your guess is based on context. You’re guessing because you don’t have all the necessary data, but you do have some of it. So use the context you have to inform your guess.

When to Stop Guessing

When is guessing a waste of time? When you get hung up on it. Uncertainty breeds anxiety and often leads to uncontrolled over-thinking. The most common unproductive guessing I see is when that uncontrolled over-thinking takes the form of iterative guessing — going through the same guesses again and again rather than deciding how to get the answer to the question. When you find a meeting devolving into wild guessing, take control by bringing the conversation back to the question at hand.

To sum up: when you’re guessing it’s because you don’t have enough information to make a decision. Don’t spend any more time than necessary postulating what that missing information might be… figure out how to get your hands on it in a timely manner.