Christmas is upon us… oh no! Christmas music, Christmas decoration, Christmas jingles, Christmas terror everywhere, but the MOVE-II team has come to your rescue with an almost non-Christmasy Christmas special on dependability! If you would rather tune in to yet another incarnation of Jingle Bells, please follow this link and stop reading right here. The rest of this article will be rather technical, with a focus on radiation, electronics and computer design.
Dependable computer design for space use has up until now primarily relied upon radiation-tolerant special-purpose hardware. These components are mainly targeted towards aerospace and spaceflight applications with vast budgets and long-term projects. Such components are usually significantly more expensive than commercial off-the-shelf (COTS) hardware, as the cost of space electronics and software is usually dwarfed by launch, testing and validation costs. To achieve radiation tolerance, designs primarily rely upon increased structural widths in the silicon, in addition to specialized manufacturing techniques and materials. Thus, these components usually require more energy and offer less compute power than consumer hardware, due to decreased clock frequencies and smaller memory sizes. Especially for nanosatellites, the size and price of these components are prohibitive, often making their use entirely infeasible. Therefore, nanosatellite computing has historically taken two paths: very simple on-board computers (OBCs) based on one or a few microcontrollers, and very complex custom-tailored systems.
The first category of OBC designs was sufficient for the very first nanosatellites, which often were mere proof-of-concept devices, earning them the mean nickname of “space debris with antennae”. Such extremely simple systems are insufficient for more advanced missions with higher compute power, reliability, data storage and lifetime requirements, especially those with scientific or commercial objectives. First-MOVE followed this principle of extreme simplicity; however, even such a comparatively straightforward OBC design became a challenge to implement.
Many current nanosatellite OBCs follow a federated approach to enable more powerful systems. They consist of many microcontrollers of different models and brands, to keep cost and energy consumption low and to provide the various required interfaces. Thereby, federated OBCs can accommodate numerous different kinds of sensors and drive various other components. However, such designs also tend to become highly complex and error-prone, requiring custom-written firmware for each microcontroller. As each controller within such an architecture depends on its neighbors to fulfill its role, the failure of a single controller can cripple the entire system. Many modern nanosatellites thus also introduce basic redundancy functionality to counter controller failure. These measures further increase complexity and require additional development manpower.
Recent miniaturized satellite development shows a rapid increase in available compute performance and storage capacity, but also in system complexity. CubeSats have proven to be both versatile and efficient for various use-cases, thus have also become platforms for an increasing variety of scientific payloads and even commercial applications. Such satellites also require an increased level of dependability in all subsystems compared to educational satellites, due to prolonged mission duration and computing burden. Consequently, nanosatellite computing will evolve away from federated clusters of specialized microcontrollers, a development that could also be observed with larger spacecraft over the past decades. Instead, more powerful, centralized general purpose processors will cover a wider range of responsibilities. Thereby, overall spacecraft complexity can be reduced and efficiency improved, while each individual processor’s complexity increases. In contrast to First-MOVE, the MOVE-II OBC must enable a sufficiently high degree of dependability and failure tolerance to enable future scientific missions.
Once reliable software is available for a given OBC, the system must also ensure computational correctness. Nanosatellite design thus seeks to replace space-qualified components with more readily available hardware, preferably in industrial or military grade variants. To satisfy scientific and commercial objectives, miniaturized satellites will also require increased data storage capacity for scientific data. Thus, many such satellites have begun fielding a small but integrity-critical core system storage for software, and a dedicated mass memory for pre-processing and caching payload-generated data. Unfortunately, traditional hardware-centered approaches to achieving dependability of these components, especially radiation hardening, can also drastically increase cost, weight, complexity and energy consumption while decreasing overall performance. Therefore, such solutions (shielding, simple modular redundancy, triple modular redundancy (TMR)) are often infeasible for miniaturized satellite design and unsuitable for nanosatellites. Also, hardware-based error detection and correction (EDAC) becomes less and less effective when applied to modern high-density electronics, due to diminishing returns with fine structural widths. As a result of these concepts’ limited applicability, nanosatellite design is challenged by ever-increasing long-term dependability requirements.
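To make the TMR idea mentioned above a bit more tangible, here is a minimal software sketch of a bitwise majority voter. This is only an illustration of the principle, not the MOVE-II design: the function name `tmr_vote` and the use of three byte strings as "replicas" are assumptions made for the example. In hardware, the voter would sit behind three redundant circuits instead.

```python
def tmr_vote(a: bytes, b: bytes, c: bytes) -> bytes:
    """Return the bitwise majority of three equally long replicas.

    Hypothetical helper for illustration: each output bit takes the value
    that at least two of the three input replicas agree on.
    """
    if not (len(a) == len(b) == len(c)):
        raise ValueError("all replicas must have the same length")
    # (x & y) | (x & z) | (y & z) is 1 exactly where two or more inputs are 1
    return bytes((x & y) | (x & z) | (y & z) for x, y, z in zip(a, b, c))


# Usage: one bit is flipped in two different replicas at different positions
original = b"MOVE-II"
replica_a = bytearray(original)
replica_a[0] ^= 0x10          # simulated upset in the first replica
replica_c = bytearray(original)
replica_c[3] ^= 0x01          # simulated upset in the third replica
recovered = tmr_vote(bytes(replica_a), original, bytes(replica_c))
```

The voter masks any error that hits only one replica per bit position. It also shows why TMR is costly: everything has to be stored or computed three times.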
Electronics in space face numerous design restrictions, most notably extreme temperature variations and the absence of an atmosphere for heat dissipation. More subtle, however, is the impact of radiation in the spacecraft's operating environment. For CubeSats, this first and foremost is the near-Earth environment, with interplanetary missions being the upcoming exception. About 20% of all anomalies aboard satellites can be attributed to high-energy particles from the sources depicted in the figure below.
Particles originating from Earth’s radiation belts, the Van-Allen belts, consist mostly of trapped protons and electrons. Galactic cosmic rays from beyond our solar system are mostly protons, whereas various other high-energy particles are ejected by the Sun during Solar Particle Events (SPEs).
Therefore, depending on the orbit of the spacecraft and the occurrence of SPEs, satellite electronics will be penetrated by a mixture of high-energy protons, electrons and heavy ions. Physical shielding using aluminium or other material can reduce certain radiation effects. However, sufficient protection would require a spacecraft to dedicate unreasonable additional mass to shielding.
Furthermore, in low Earth orbit (LEO), the radiation bombardment increases while transiting the South Atlantic Anomaly (SAA). Earth’s magnetic field experiences a local, height-dependent dip within the SAA, due to an offset of the spin axis from the magnetic axis. In this zone, a satellite and its electronics will experience an increase in proton flux of up to a factor of 10^4 (for energies > 30 MeV). This flux increase results in a rapid growth of bit errors and other upsets in a satellite’s command and data handling (CDH) subsystem. In the case of MOVE-II, the full functionality of the CDH subsystem is required at all times, as scientific measurements are to be conducted by one of the successor satellite’s possible future payloads.
Different data storage and compute logic technologies vary in the energy threshold necessary to induce an effect and in the type of effect caused. For example, the most important radiation-induced phenomena in memory are:
- Single Event Effects (SEE), local ionization from protons or heavy ions
- Total Ionizing Dose (TID), the cumulative effect of charge trapping in the oxide of electronic devices
- Displacement Damage due to structural displacement in crystalline components of electronic hardware.
Other types of SEEs, the destructive ones being the most relevant, are well described in the literature as well. Specialized manufacturing techniques such as Silicon-on-Insulator (resulting in the growth of an isolating crystal layer surrounding logic embedded into the substrate) can drastically reduce the impact of these radiation events and minimize their effects. However, the use of proprietary hardware produced using specialized manufacturing techniques is usually not an option aboard most academic nanosatellites.
For data storage, there is more that can be done, as information can be stored in a variety of different ways: as a radiation-vulnerable charge or voltage level, magnetically, and also as a phase differential. Hence, some (in non-IT terms) novel memory technologies (e.g. MRAM, PCM) have shown inherent radiation tolerance against bit-flips, or Single Event Upsets (SEUs), due to their data storage mechanism. Due to a shifting voltage threshold in floating gate cells caused by the total ionizing dose, flash memories become more susceptible to bit errors the more highly they are scaled. Highly scaled flash memories are also more prone to SEUs causing shifts in the threshold voltage profile of one or more storage cells. All these memory technologies are sensitive to Single Event Functional Interrupts (SEFIs), which can affect blocks, banks or entire circuits due to particle strikes in the peripheral circuitry.
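For readers who have not seen error correction in action, the following sketch shows how a single bit-flip, as caused by an SEU, can be located and repaired with a classic Hamming(7,4) code. This is a textbook example chosen for brevity, not the coding scheme used on MOVE-II; the function names are made up for this illustration, and real memories use much longer and stronger codes.

```python
def hamming74_encode(nibble: int) -> int:
    """Encode 4 data bits into a 7-bit codeword (illustrative textbook code).

    Bit positions are numbered 1..7; positions 1, 2 and 4 hold parity bits,
    positions 3, 5, 6 and 7 hold the data bits.
    """
    bits = [0] * 8                                  # index 0 is unused
    bits[3] = (nibble >> 0) & 1
    bits[5] = (nibble >> 1) & 1
    bits[6] = (nibble >> 2) & 1
    bits[7] = (nibble >> 3) & 1
    bits[1] = bits[3] ^ bits[5] ^ bits[7]           # covers positions 1,3,5,7
    bits[2] = bits[3] ^ bits[6] ^ bits[7]           # covers positions 2,3,6,7
    bits[4] = bits[5] ^ bits[6] ^ bits[7]           # covers positions 4,5,6,7
    return sum(bits[p] << (p - 1) for p in range(1, 8))


def hamming74_decode(word: int):
    """Return (nibble, syndrome); syndrome is the repaired position or 0."""
    bits = [0] + [(word >> (p - 1)) & 1 for p in range(1, 8)]
    syndrome = 0
    for p in range(1, 8):
        if bits[p]:
            syndrome ^= p                           # XOR of positions of set bits
    if syndrome:
        bits[syndrome] ^= 1                         # flip the faulty bit back
    nibble = bits[3] | (bits[5] << 1) | (bits[6] << 2) | (bits[7] << 3)
    return nibble, syndrome


# Usage: flip the bit at position 6 (bit index 5) of a stored codeword
stored = hamming74_encode(0b1011)
upset = stored ^ (1 << 5)
value, position = hamming74_decode(upset)
```

Note that this code only copes with one flipped bit per codeword. Two flips in the same word would be "corrected" into the wrong value, which hints at why simple EDAC struggles once single particle strikes start affecting several neighboring cells.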
As you would expect, there is no single universal solution to handling all of these issues; instead, we must make use of all protective approaches available. As an academic nanosatellite team, the MOVE-II team of course is not rich, and thus many approaches based upon specialized manufacturing and proprietary components are beyond our reach. Using COTS hardware, neither component-level measures nor hardware or software measures alone can guarantee sufficient system consistency. However, hybrid solutions can increase reliability drastically while introducing negligible or no additional complexity. Software-driven fault detection, isolation and recovery (FDIR) from hardware errors is a proven approach also within space-borne computing, though it is seldom implemented on nanosatellites. A broad variety of measures capable of enhancing or enabling FDIR for on-board electronics exists, especially for data storage. Combined hardware and software measures can drastically increase system dependability, and the design of the MOVE-II on-board computer is largely based on this approach. Of course, this requires a developer to approach computer design from a different perspective, thinking at a logical system level instead of taking the component-based approach commonly used in space computer engineering. To the right is one idea of how you can visualize such a system-level view, in which every single block represents a functional logical element of your computerized system, each of which must be protected from faults using architectural and programmatic means.
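To give a rough feeling for what software-driven FDIR can look like for data storage, here is a small sketch that keeps several copies of a data block together with a CRC32 checksum. On every read it detects corrupted copies, isolates them, and recovers them from a healthy copy. The class `RedundantStore` and its layout are purely hypothetical and are not taken from the MOVE-II software or from FTRFS; in particular, the sketch simply assumes the stored checksums themselves stay intact.

```python
import zlib


class RedundantStore:
    """Hypothetical in-memory store with checksum-protected copies of one block."""

    def __init__(self, payload: bytes, copies: int = 3):
        crc = zlib.crc32(payload)
        self.copies = [bytearray(payload) for _ in range(copies)]
        self.crcs = [crc] * copies      # assumed to be uncorrupted in this sketch

    def read(self):
        """Return (payload, repaired), where repaired lists the fixed copies."""
        good = None
        bad = []
        # Detection: recompute each copy's checksum and compare
        for i, (data, crc) in enumerate(zip(self.copies, self.crcs)):
            if zlib.crc32(data) == crc:
                if good is None:
                    good = i
            else:
                bad.append(i)           # Isolation: remember the faulty copy
        if good is None:
            raise IOError("all copies are corrupted")
        # Recovery: overwrite every faulty copy with a healthy one
        for i in bad:
            self.copies[i] = bytearray(self.copies[good])
        return bytes(self.copies[good]), bad


# Usage: simulate a bit-flip in the second copy, then read twice
store = RedundantStore(b"telemetry frame")
store.copies[1][0] ^= 0x04
first_payload, first_repaired = store.read()
second_payload, second_repaired = store.read()
```

The first read reports and repairs the damaged copy, and the second read finds all copies healthy again. The same detect, isolate and recover pattern can be applied per block, per file or per memory page, depending on how fine-grained the protection has to be.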
You can find more information on this topic in research papers which the CDH team has published over the past two years:
- Enabling dependable data storage for miniaturized satellites. 29th AIAA/USU Conference on Small Satellites, 2015. by Fuchs, C.M.
- A Fault-Tolerant Radiation-Robust Mass Storage Concept for Highly Scaled Flash Memory. In Data Systems in Aerospace (DASIA) Conference. by Fuchs, C. M., Trinitis, C., Appel, N., & Langer, M.
- FTRFS: A Fault-Tolerant Radiation-Robust Filesystem for Space Use. In Architecture of Computing Systems–ARCS 2015 (pp. 96-107). Springer International Publishing. by Fuchs, C. M., Langer, M., & Trinitis, C.
Ultimately, dependability is not just relevant to electronics and computerized systems, which tend to make up the majority of a nanosatellite. All components, functions and subsystems of a CubeSat have to become more reliable, with a higher level of survivability and thus an increased lifetime, to be suitable for upcoming missions and future miniaturized satellite applications.
Oh, one last thing: last time I did a webpage for MOVE-II, people really appreciated the awesome graphics. So here is a Christmas tree for you. I told you this special would be ALMOST Christmas free, not fully, though.