Scheduling planned changes to minimise overall impact and risk

The ITSM Change Management process manages the lifecycle of changes in an IT environment so that the negative impact on the supported business, and the time, cost and overall risk of the changes, are minimised. The IT environment can be considered a collection of items, such as network routers and web servers, where the proper functioning of one item often depends on the proper functioning of others. These dependencies can be represented as a directed graph. The graph can also be used to trace the effects, on all items, of any single item failing or not being in an up-to-date state. These global effects can then be used to estimate the cost of an item not being in its preferred state.
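As an illustration of how such a dependency graph could be used to trace the effects of a single item failing, here is a minimal Java sketch. The class name, the method names and the idea of a fixed cost per hour for each item are assumptions made for this example only; they are not part of the problem statement.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of the item dependency graph (illustration only). */
final class DependencyGraph {
    // dependents.get(x) holds the items that depend directly on x.
    private final Map<String, List<String>> dependents = new HashMap<>();

    /** Records that 'item' needs 'dependsOn' to function properly. */
    void addDependency(String item, String dependsOn) {
        dependents.computeIfAbsent(dependsOn, k -> new ArrayList<>()).add(item);
    }

    /** Returns the failed item plus every item that depends on it, directly or indirectly. */
    Set<String> affectedBy(String failed) {
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        seen.add(failed);
        queue.add(failed);
        // Breadth-first search along the "is depended on by" edges.
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String d : dependents.getOrDefault(current, Collections.<String>emptyList())) {
                if (seen.add(d)) {
                    queue.add(d);
                }
            }
        }
        return seen;
    }

    /** Assumed cost model: sum of the per-hour costs of all affected items. */
    double outageCostPerHour(String failed, Map<String, Double> costPerHour) {
        double total = 0.0;
        for (String item : affectedBy(failed)) {
            total += costPerHour.getOrDefault(item, 0.0);
        }
        return total;
    }
}
```

For example, if a web server and a mail server both depend on a router, and a shop application depends on the web server, then `affectedBy("router")` returns all four items, while `affectedBy("web-server")` returns only the web server and the shop application. A real cost model would probably be richer than a simple sum, for instance weighting items by the business services they support.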

From time to time changes have to be made to items in the IT environment. These might be software or hardware updates or replacements. Changes sometimes fail, and a failed change may cause the item to fail. A single change consists of a series of tasks, and the dependencies between these tasks can be represented as a directed graph (a different graph from the one above, which shows the dependencies between items). This graph defines the order in which the tasks have to be carried out. If a task goes according to plan, the time and other resources required to complete it are assumed to be known. There is a probability (which we also assume to be known) that a task fails, and if that occurs a roll back plan is executed to return the item to its initial state. From such a change graph we should be able to work out the time the change will take if it is successful, and the distribution of the times it will take if it is unsuccessful.
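The calculation described above can be sketched in Java under some simplifying assumptions that are not part of the problem statement: the tasks are executed one at a time in an order that respects the change graph, tasks fail independently of each other, a failure is detected only when the task finishes, and rolling back takes the sum of the roll back times of all tasks started so far. All names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: outcomes of one change executed task by task. */
final class ChangeOutcomes {

    /** One task of a change; the fields are illustrative assumptions. */
    static final class Task {
        final String name;
        final double minutes;            // time needed if the task works
        final double rollbackMinutes;    // time needed to undo this task
        final double failureProbability; // assumed known, between 0 and 1

        Task(String name, double minutes, double rollbackMinutes, double failureProbability) {
            this.name = name;
            this.minutes = minutes;
            this.rollbackMinutes = rollbackMinutes;
            this.failureProbability = failureProbability;
        }
    }

    /** One possible end state of the change. */
    static final class Outcome {
        final boolean success;
        final double probability;
        final double totalMinutes;

        Outcome(boolean success, double probability, double totalMinutes) {
            this.success = success;
            this.probability = probability;
            this.totalMinutes = totalMinutes;
        }
    }

    /** Lists every failure outcome (one per task that can fail) and the success outcome. */
    static List<Outcome> outcomes(List<Task> tasksInExecutionOrder) {
        List<Outcome> result = new ArrayList<>();
        double reachProbability = 1.0; // probability that all earlier tasks succeeded
        double elapsed = 0.0;          // forward time spent so far
        double rollback = 0.0;         // time needed to undo everything started so far
        for (Task t : tasksInExecutionOrder) {
            elapsed += t.minutes;
            rollback += t.rollbackMinutes;
            double failHere = reachProbability * t.failureProbability;
            if (failHere > 0.0) {
                result.add(new Outcome(false, failHere, elapsed + rollback));
            }
            reachProbability *= 1.0 - t.failureProbability;
        }
        result.add(new Outcome(true, reachProbability, elapsed));
        return result;
    }

    /** Expected total duration over all outcomes. */
    static double expectedMinutes(List<Outcome> outcomes) {
        double sum = 0.0;
        for (Outcome o : outcomes) {
            sum += o.probability * o.totalMinutes;
        }
        return sum;
    }
}
```

If tasks can run in parallel, the time for a successful change would instead be the length of the longest path through the change graph, and the failure outcomes would depend on which tasks are in progress when the failure occurs. Modelling that properly is part of the first phase of the project.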

Once a collection of changes has been planned for items in the environment, the changes have to be scheduled and implemented in what is called a change window. Change windows are determined by when it is convenient to make changes (e.g. changes are often best made at weekends or at night) and by the dates by which changes must be made (for contractual, security or performance reasons). If the implementation of a planned change fails at any point during the change window, a roll back plan will be executed to get the system back to its initial state. A conditional plan therefore needs to be available to deal with change failures.

This project is about scheduling the planned changes subject to a set of constraints (the human resources who will implement the changes, deadlines, priorities and so on). The objective of the schedule is to minimise the overall impact and risk, and it has to take into account the possibility of a change failing during execution.
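To make the scheduling problem more concrete, the following Java sketch shows one very simple baseline: a greedy, earliest-deadline-first assignment of changes to change windows. It is not a proposed solution, only a possible starting point for comparison. It assumes that windows are numbered from 0 and all have the same length, that a single team carries out changes one after another within a window, and that each change is booked for its worst-case duration (including roll back) so that a failed change can still be rolled back inside its window. All class, field and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical greedy baseline for placing changes into change windows. */
final class GreedyWindowScheduler {

    /** A planned change; the fields are illustrative assumptions. */
    static final class Change {
        final String name;
        final double worstCaseMinutes; // duration if it fails late and is rolled back
        final int deadlineWindow;      // index of the last acceptable window
        final int priority;            // larger means more important

        Change(String name, double worstCaseMinutes, int deadlineWindow, int priority) {
            this.name = name;
            this.worstCaseMinutes = worstCaseMinutes;
            this.deadlineWindow = deadlineWindow;
            this.priority = priority;
        }
    }

    /** Maps each change name to a window index, or to -1 if it cannot be placed by its deadline. */
    static Map<String, Integer> schedule(List<Change> changes, int windowCount, double windowMinutes) {
        double[] remaining = new double[windowCount];
        Arrays.fill(remaining, windowMinutes);

        // Earliest deadline first; ties are broken by higher priority.
        List<Change> ordered = new ArrayList<>(changes);
        ordered.sort(Comparator.comparingInt((Change c) -> c.deadlineWindow)
                .thenComparing(Comparator.comparingInt((Change c) -> c.priority).reversed()));

        Map<String, Integer> assignment = new LinkedHashMap<>();
        for (Change c : ordered) {
            int chosen = -1;
            int last = Math.min(c.deadlineWindow, windowCount - 1);
            // Take the earliest window that still has room for the worst case.
            for (int w = 0; w <= last; w++) {
                if (remaining[w] >= c.worstCaseMinutes) {
                    chosen = w;
                    remaining[w] -= c.worstCaseMinutes;
                    break;
                }
            }
            assignment.put(c.name, chosen);
        }
        return assignment;
    }
}
```

This baseline ignores the cost of an item not being in its preferred state, staff availability and the dependencies between items, so it does not minimise impact or risk in any real sense. It only shows the kind of input and output a scheduler for this problem might have.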

Project deliverables

In the first phase, the candidate will be expected to understand the problem and build a model of it. This will mainly be an exercise in understanding and structuring the problem. In the second phase, the candidate will research and study the scheduling techniques, algorithms and heuristics that could be applied to this problem, and propose a solution. If time permits, the candidate could start building a prototype of the solution. Java would be a suitable language for this, though other languages could be used if preferred.

Required skills

Scientific background: operations research experience, some mathematical aptitude, good communication and programming skills