There has been a lot of talk lately about digital twins. Put simply, a digital twin uses data and predictive modelling to predict how a system (software, hardware or both) will react under particular circumstances. The significant benefit is that the digital twin is based on the actual live system, not one just like it. For example, the car you purchased five years ago may have been one of thousands built exactly the same, but now, given how you've driven it, stored it, washed it, cared for it, fuelled it and so on, it is totally unique among its peers. To accurately model how your car might react if it braked hard, took a corner at speed or accelerated quickly, you would be more confident in the results if the data used in the model came from your actual car rather than one fresh out of the factory.
This is the idea behind the 'digital twin', and on paper it's brilliant: an exact replica of your system to try things on. This is reportedly how NASA modelled potential solutions to the major problems encountered by Apollo 13.
Establishing a digital twin, however, can be a big ask. For a software system it is not just a case of setting up a new environment: the twin should hold exactly the same data as the live system, all the time, which creates problems of privacy, storage, migration, maintenance and cost. Trying something on the twin to see how it behaves is one thing, but once you are done you need to revert the twin to the new current state of the live system, not necessarily the state you started in. For anything involving hardware, or requiring data gathered via sensors, the challenge increases because you need confidence that your data truly represents the live system: how accurate are the sensors, are they in the right place, and are they collecting the correct data?
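That resync-rather-than-rollback workflow can be made concrete in a few lines of code. The sketch below is purely illustrative: the `LiveSystem` and `DigitalTwin` classes and their state fields are hypothetical, invented here to show the shape of the idea, not any real digital-twin framework.

```python
import copy

class LiveSystem:
    """Stand-in for the real system; its state keeps changing."""
    def __init__(self):
        self.state = {"mileage": 50_000, "brake_wear": 0.3}

    def drive(self, km):
        self.state["mileage"] += km
        self.state["brake_wear"] += km * 1e-5


class DigitalTwin:
    """Hypothetical twin that mirrors a live system's state."""
    def __init__(self, live):
        self.live = live
        self.sync()

    def sync(self):
        # Re-copy the live system's *current* state. After an experiment
        # we resync forward; we do not roll back to where we started.
        self.state = copy.deepcopy(self.live.state)

    def simulate_hard_braking(self, events):
        # Run the what-if on the twin only; the live system is untouched.
        self.state["brake_wear"] += 0.01 * events
        return self.state["brake_wear"]


live = LiveSystem()
twin = DigitalTwin(live)

predicted = twin.simulate_hard_braking(events=5)  # experiment on the twin

live.drive(1000)   # meanwhile the real system has moved on
twin.sync()        # discard the experiment; mirror the new live state

assert twin.state == live.state                   # twin matches live again
assert live.state["brake_wear"] < predicted      # experiment never touched live
```

The key design point is that `sync()` always copies the live system's latest state, which is exactly why the twin ends an experiment in a different place from where it began it.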
The increasing use of digital twins in predictive modelling is a fascinating subject and one with far-reaching implications for practically everything. Its success, however, will depend on the quality of the data supplied, the accuracy of the modelling applied and the maintainability of the twin. Three aspects it would be wise not to overlook or misunderstand.