Your ship runs into an unexpected collision: is its software resilient enough to cope?
This article is based on material by Donna A. Dulo of Icarus Interstellar, a leading mathematician, software scientist, and systems engineer at the US Department of Defense. Read more about Icarus Interstellar in the Discovery News article.
As your starship races through the galaxy at the speed of light, you notice a barely visible surge on the onboard range sensors. The closer you get to the source, the stronger the signals become, and each of them is heading toward you in tight formation. Anxiously, you and your crew take your stations and realize the worst: you are facing a large armada of Borg cubes and spheres.
Fortunately, you manage to steer the ship away from a serious collision by maneuvering through a small, barely noticeable gap identified while the navigation plan was being prepared, and the ship escapes unharmed. You had to deviate a little from your course, but the ship and crew are safe.
As soon as you begin to recalculate your route, you detect another signal. The ship's life support module has failed because of a software error that occurred during the collision avoidance maneuver. The software has damaged the crew's life support systems, and you realize that the ship will run out of breathable air within the next 24 hours. The backup system is no help: the backup hardware runs the same software routines. The rescue equipment provides 48 hours of breathable air, and the portable units on board carry air kits rated for 8 hours.
You send your best computer scientists and software engineers to the engine room to diagnose the problem. They report that it will take at least four days to isolate and fix the errors in the several hundred million lines of code that control the ship's life support systems.
Your situation is now critical. You request a status report on the non-critical systems and urgently dispatch a team of programmers. Then you wait, knowing that the lives of everyone on board are in the hands of the software development team.
The scenario above shows how vital software is on a long voyage. A natural question arises: which is the greater enemy, a fleet of space villains or a weakness in the ship's software?
For anyone familiar with the complexity of software systems, the answer is obvious: the software poses the greater danger.
Interstellar travel requires a self-sufficient ship and crew, able to resolve the most serious engineering problems quickly. The exponential complexity and fragility inherent in deployed software make it one of the weakest links in long-term survival aboard an interstellar ship. Imagine a fully operational spacecraft with hundreds of millions of lines of code and tens or even hundreds of thousands of variables and states. Finding a single faulty line of code is almost impossible in an emergency, even with the most advanced automated testing procedures. The stress of the situation, combined with the inherent difficulty of the mathematical logic and the sheer volume of code, will strain even the best engineering teams.
Just as you thought everything through in advance for the Borg encounter, with contingency plans and escape routes, safety planning for the software of a long-duration spacecraft is possible. This planning must take place while the ship is being developed as well as during its interstellar operations. The key lies in a new engineering paradigm called "resilience," which is readily applicable to software design and development.
On a long-duration space mission the software will be pushed to its limits, yet failure is not an option.
The software, and the crew members who use it, must be resilient in order to cope with every critical situation and maintain safety. Resilience engineering emerged as a discipline in the mid-2000s as a way to reduce failures in complex systems, even those built with sound engineering practice. Applied to software, it concerns how people cope with the complexity of a software system so that they can succeed quickly even in the most difficult situations. It focuses on the system's ability to adapt to constantly changing conditions so that positive control of the system is maintained and failure is avoided. Alongside the system's own adaptability, the human element is needed to adapt further as conditions change. This human-machine combination brings a new approach to safety: it gives people the means to understand and anticipate what the system is doing, making them a proactive part of its safe operation.
There are two aspects of software resilience engineering: resilience built in through a sound, safety-oriented development process, and the real-time operation of the software with a positive human-in-the-loop response. The whole approach rests on the idea that safety is a core value, together with a constant expectation that the software may fail.
With this safety-focused attention, the human operator helps shift the risk equation in the system's favor and break the causal chain of cascading software failures, while also reducing the fragility of the system. The result is safer, more survivable, and more predictable software, operating in collaboration with users who take full part in its processes and evolution.
Resilience engineering methods continue to emerge. They focus on redundant software logic, adaptive intervention methods, and intelligent analysis, among many other sound engineering techniques. These engineering structures are backed by a sound human element, with operational management protocols designed around the crew's ability to adapt to changing conditions and mitigate even the most complex software emergencies. By coupling resilient software development methods with capable technical leadership and a team trained to manage software emergencies, a complex system gains the ability to survive a catastrophic failure, which helps prevent the total loss of the crew.
In our example, the backup life support system failed because it ran the same code as the primary system, so in the same situation the backup failed too. A more resilient design would give the backup a different software package and a different set of algorithms to do the same job, making the system as a whole more robust.
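The idea of a backup that does the same job with different logic can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the article: the function names, the safe pressure band, and the choice of a mean for the primary routine and a median for the backup. The point is only that an input which defeats one routine (here, a single wildly wrong sensor reading) need not defeat an independently designed one.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical safe band for cabin oxygen partial pressure, in kPa (assumed values).
SAFE_MIN_KPA = 19.0
SAFE_MAX_KPA = 23.0


def primary_o2_estimate(readings: Sequence[float]) -> float:
    """Primary routine: arithmetic mean of the sensor readings."""
    return sum(readings) / len(readings)


def backup_o2_estimate(readings: Sequence[float]) -> float:
    """Backup routine: median, written independently of the primary logic."""
    ordered = sorted(readings)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def resilient_estimate(
    readings: Sequence[float],
    routines: Sequence[Callable[[Sequence[float]], float]],
) -> Tuple[str, float]:
    """Try each routine in order; return the first plausible (name, value)."""
    for routine in routines:
        try:
            value = routine(readings)
        except Exception:
            continue  # this routine crashed; fall through to the next one
        if SAFE_MIN_KPA <= value <= SAFE_MAX_KPA:
            return routine.__name__, value
    # No routine produced a plausible answer: hand the problem to the crew.
    raise RuntimeError("all life support routines failed; alert the crew")


if __name__ == "__main__":
    routines = [primary_o2_estimate, backup_o2_estimate]
    # One sensor reports a wild value; the mean is thrown off, the median is not.
    print(resilient_estimate([21.0, 21.2, 500.0], routines))
```

With the faulty reading of 500.0, the primary routine's mean falls outside the assumed safe band, so the monitor falls through to the backup, whose median is unaffected. If every routine fails, the sketch raises an error rather than guessing, which is where the trained crew described in the article would take over.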
A resilient system, and its software in particular, is also more modular and mathematically provable, which offers more viable ways to adapt, refactor, and repair it. Reduced complexity and more standardized software and algorithmic structures provide further assurance of resilience.
Then the human element of the resilient system comes into play. After the life support failure, the crew on duty immediately switches the system to the backup components, which run a different set of software routines built on entirely different mathematical logic.
All crew members are trained in the nuances of the ship's hardware and software, and are expected to understand every type of computational error and how to deal with it. Some time later, the software engineering team sets to work repairing the fault in the logic of the primary software routines, while the backup system functions flawlessly.
The repair task is simpler because the software is more modular, easy to decompose hierarchically, and carefully documented in its design, its architecture, and its mathematically proven structures. The team is supplemented by second-tier software engineers: crew members with the necessary development training who normally perform other duties and who support the highly qualified primary software team.
The scenario was well rehearsed in training, and each team member knows their role: coder, verifier, mathematician, tester, or implementer. In a systematically organized engineering effort, a new set of logic is developed and coded for the primary system. Within two days it is verified and put into operation. Once the fix is deployed with the full participation of the crew, the ship returns to its normal operating posture.
By applying resilience to both software development and the real-time operation of the spacecraft, the crew can increase the ship's survivability even when serious software problems arise. With cutting-edge theories and methodologies for resilient software development in place, the ship will have the tools and the trained crew to carry out in-depth software operations safely.
Resilience methods can also be applied to other technologies and to shipboard operations, creating a holistic safety culture that improves the ship's overall chances of survival.
Resilience will thus make for a long-lived ship, destined to travel through a galaxy of endless possibilities for present and future generations, even when the odds against the Borg look hopeless.
The mission of Icarus Interstellar is to advance spacecraft research for both crewed and uncrewed vehicles. Software will be a huge part of these future systems, and the study of resilience will help achieve the ultimate goals: first to reach the stars, and then to travel between them as an interstellar civilization.