In the development of embedded software, observability is mostly used reactively, such as during post-hoc debugging or through log analysis. It is high time for a proactive approach that is capable of capturing even sporadic errors early on.
Critical systems: As software takes over more and more functions in cars, it becomes ever more important to monitor closely and understand what the program code is doing.
The increasing complexity of embedded systems presents traditional development and testing methods with new challenges. A clear example of this is Volvo's new electric flagship model, the EX90, which is equipped with advanced technologies such as LiDAR sensors, NVIDIA's computer system, and Qualcomm’s "Snapdragon Cockpit Platform," all of which have to be integrated with numerous internally developed software components. Originally planned for 2022, delivery was delayed until early September 2024 due to production issues. The first vehicles even rolled off the assembly line without functioning LiDAR sensors. These delays, at least partly attributable to the increased system complexity, incurred costs in the billions and underscore the challenges in developing and verifying software for modern embedded systems.
In response to these challenges, the concept of Continuous Observability has been developed. In software systems, Observability refers to the ability to understand internal system states by analyzing outputs such as logs, system traces, and memory dumps. Traditionally, Observability has been reactive: developers would employ debugging tools only when problems arose. However, this approach is insufficient for addressing sporadic issues or problems that occur outside of the development lab.
Continuous Observability adopts a proactive approach: data collection is enabled by default, and reporting is automated. This ensures that diagnostic data are always available for analysis when problems occur. This approach spans the entire product lifecycle, from early software development to on-site deployment. By combining automated and comprehensive Observability, hidden errors, performance issues, and system anomalies can be detected early on. This expedites problem resolution and enhances software quality.
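The principle of "collection enabled by default, reporting automated" can be illustrated with a minimal sketch. It is written in Python for readability and models the logic only; it is not firmware and not the API of any particular tool. The names AlwaysOnRecorder, run_with_observability, and the upload callback are hypothetical, chosen for this illustration.

```python
from collections import deque
from dataclasses import asdict, dataclass
import json
import time


@dataclass
class TraceEvent:
    timestamp: float  # seconds from the injected clock (assumed monotonic)
    source: str       # hypothetical name of the task or module
    message: str


class AlwaysOnRecorder:
    """Keeps the most recent events in a fixed-size ring buffer."""

    def __init__(self, capacity=256, clock=time.monotonic):
        # deque(maxlen=...) silently drops the oldest entry when full,
        # so recording can stay enabled with bounded memory use.
        self._events = deque(maxlen=capacity)
        self._clock = clock

    def record(self, source, message):
        self._events.append(TraceEvent(self._clock(), source, message))

    def report(self, reason):
        """Serialize the buffered history together with the trigger reason."""
        return json.dumps({
            "reason": reason,
            "events": [asdict(event) for event in self._events],
        })


def run_with_observability(recorder, task, upload):
    """Run task; on any exception, send a report automatically, then re-raise."""
    try:
        return task()
    except Exception as exc:
        upload(recorder.report(reason=repr(exc)))
        raise
```

Because the buffer is always recording, the events leading up to a sporadic failure are already captured when the failure occurs; nobody has to reproduce the problem with a debugger attached.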
A key driver for the implementation of Continuous Observability is the increasing complexity of embedded systems. Modern embedded systems often integrate various software functions through multithreading and asynchronous events, leading to unpredictable fluctuations in software execution. This complexity significantly complicates verification and debugging, as testing every possible scenario becomes unrealistic. Even after deployment, embedded software typically contains about three errors per 1,000 lines of code. This environment makes it crucial to have a robust system in place that can continuously monitor and report on system health and operational anomalies, thus enabling timely detection and correction of issues.
Edge connectivity further amplifies system complexity and simultaneously exposes systems to cyber threats. Paradoxically, however, the increasing connectivity also offers potential solutions through Over-the-Air (OTA) updates and remote observability. The combination of observability and OTA updates is revolutionizing the way developers address the challenges of software complexity. By enabling remote observability, developers can continuously monitor and diagnose systems from afar, catching potential issues before they cause significant impact. This capability is crucial in environments where systems are not easily accessible. Conversely, OTA updates allow for swift and seamless delivery of fixes and improvements, reducing downtime and enhancing the functionality of devices without requiring physical access. This integrated approach not only mitigates the risks associated with increased connectivity but also leverages its advantages to maintain and improve system reliability and security dynamically.
A typical example: Software-defined vehicles
The trend towards Software-Defined Vehicles (SDVs) illustrates a shift in the development of embedded systems. SDVs redefine automotive architecture by placing software at the core of vehicle functionality. Unlike traditional vehicles that use independent electronic modules, SDVs rely on centralized high-performance computers that organize functions into software layers. This approach enables Over-the-Air (OTA) software updates, the dynamic deployment of features, and a continuous connection to external systems. This shift not only enhances the adaptability and upgradeability of the vehicle but also significantly impacts how vehicles interact with drivers and the environment. For instance, continuous updates can improve vehicle performance, introduce new features, and address security vulnerabilities without the need for physical modifications. Furthermore, the centralized computing approach simplifies hardware architecture and reduces the complexity and weight of the vehicle, potentially leading to improved efficiency and lower manufacturing costs. Additionally, the permanent connectivity allows for better integration with smart city infrastructures and V2X (vehicle-to-everything) communication systems, further enhancing the vehicle's capabilities and safety features. Overall, the move towards SDVs represents a fundamental transformation in how vehicles are designed, manufactured, and maintained, increasingly aligning them with the broader digital ecosystem.
Date: 08.12.2025
While Software-Defined Vehicles (SDVs) indeed offer undeniable benefits such as rapid innovation and reduced hardware complexity, they also introduce new challenges. Frequent updates, diverse configurations, and the continuous development of software create an environment where it becomes nearly impossible to thoroughly test every potential scenario throughout a vehicle's lifecycle. This conflict between traditional, hardware-oriented approaches and the agile, software-defined future of SDVs places enormous pressure on developers to find a balance between safety, performance, and innovation. The challenge is further compounded by the need to ensure robust cybersecurity measures, as the increased connectivity and reliance on software amplify vulnerabilities to cyber threats. Ensuring the reliability of software under various real-world conditions and protecting against potential security breaches become critical tasks. Developers must employ advanced testing methodologies like simulation and virtual testing environments, continuous integration and deployment (CI/CD) practices, and perhaps most importantly, rigorous real-world testing regimes. Implementing systems for monitoring and rapid response to issues as they arise, such as OTA updates for patching vulnerabilities, is also essential. Ultimately, the transition to SDVs not only revolutionizes the automotive industry’s product but also its processes, requiring a systemic change in how vehicles are engineered, from conception through to after-sales support. This requires a cultural shift within organizations, geared towards embracing continuous learning and adaptation, to truly harness the potential of SDVs while mitigating their risks.
In this context, Continuous Observability becomes indispensable. It provides persistent insights into system behavior, helping developers ensure that new software updates, feature rollouts, and system configurations do not compromise core functions such as safety or performance. The trend towards SDVs extends beyond the automotive industry and influences sectors such as aerospace, industrial automation, and medical technology, making Continuous Observability a cross-industrial necessity. This approach enables proactive management of systems by allowing developers to continuously monitor and analyze data and metrics in real-time. Continuous Observability facilitates early detection of anomalies, performance bottlenecks, and security vulnerabilities, allowing for immediate corrective actions. It integrates seamlessly into the development life cycle, supporting a more agile and responsive approach to system management and maintenance. Moreover, as devices and systems become increasingly interconnected and reliant on software, the role of Continuous Observability in ensuring the reliability and security of these technologies cannot be overstated. This makes it a critical component in the ongoing digital transformation across various industries, empowering organizations to navigate the complexities of modern technological ecosystems more effectively.
Stage set for Observability-Driven Development
Observability-Driven Development (ODD) is emerging as a key strategy to cope with the increasing complexity of embedded systems and fits seamlessly into Percepio's Continuous Observability concept. ODD incorporates observability into every phase of the development cycle, providing real-time insights into system behavior from early software development to field deployment, with software tracing as a central tool. By integrating observability from the outset, developers can gain a deeper understanding of how their code behaves under various conditions, which is crucial for identifying and rectifying issues early in the development process. This proactive approach helps in optimizing performance, enhancing security, and ensuring system reliability. When ODD is supplemented with continuous monitoring of software performance metrics, systems can autonomously identify abnormal behaviors and potential failure risks in runtime software. This "self-aware" capability is particularly valuable in systems where safety and uptime are critical, such as in automotive, aerospace, medical devices, and industrial automation. This methodological shift not only aids in addressing immediate issues more effectively but also contributes to a more robust and resilient software architecture. As a result, Observability-Driven Development is proving to be an invaluable approach in the software engineering landscape, particularly as systems continue to grow in complexity and scale.
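How a system might identify abnormal runtime behavior on its own can be sketched as follows. This is an illustrative assumption, not a description of any vendor's implementation: the class name ExecutionTimeMonitor, the window size, and the four-sigma threshold are all hypothetical choices. The monitor keeps a rolling baseline of a task's execution times and flags a sample that lies far above it.

```python
from collections import deque
from statistics import mean, pstdev


class ExecutionTimeMonitor:
    """Flags execution times that are far above the recent baseline."""

    def __init__(self, window=50, min_samples=10, sigma=4.0):
        self._history = deque(maxlen=window)  # rolling baseline of normal samples
        self._min_samples = min_samples       # no verdicts until the baseline is filled
        self._sigma = sigma                   # allowed deviation in standard deviations

    def observe(self, duration_us):
        """Return True if duration_us is abnormal relative to recent history."""
        abnormal = False
        if len(self._history) >= self._min_samples:
            baseline = mean(self._history)
            spread = pstdev(self._history)
            # Floor the spread at 1 % of the baseline so that a perfectly
            # steady task does not flag noise-level jitter.
            limit = baseline + self._sigma * max(spread, 0.01 * baseline)
            abnormal = duration_us > limit
        if not abnormal:
            # Outliers are kept out of the baseline so they cannot mask later ones.
            self._history.append(duration_us)
        return abnormal
```

A usage example: after feeding the monitor a few dozen samples around 100 µs, a call such as `monitor.observe(250)` returns True, while ordinary jitter such as `monitor.observe(102)` does not. In a real device the flagged sample would typically trigger an alert together with the buffered trace data.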
Furthermore, the continuous observability of system performance allows software development teams to remain flexible even after a product is launched. In industries where failure costs are high—such as automotive, medical technology, and aerospace—ODD thus provides a valuable layer of security that prevents costly recalls, downtime, or performance losses. As system complexity increases, ODD ensures that developers have the necessary insights to maintain control over their products, regardless of the frequency of software updates. This flexibility is crucial in rapidly evolving technological landscapes, where the ability to respond and adapt to new challenges and requirements can define the success or failure of products. Continuous Observability enhances the capacity for continuous improvement and iteration, enabling teams to refine and optimize their systems based on real-world performance data. This adaptive approach not only improves product quality and safety but also enhances customer satisfaction by ensuring that systems are robust, reliable, and up to date with the latest features and security measures.
Ineffective management of the complexity of modern embedded systems can lead to serious consequences, especially in industries where system reliability is a top priority. Several prominent examples highlight this:
A leading automaker had to recall over a million vehicles delivered between 2020 and 2022. A flaw in the software and sensor combination of the Occupant Classification System (OCS) could potentially affect airbag deployment. This issue remained undetected during testing, leading to significant financial losses and damage to the brand's reputation.
In 2018, a medical technology giant faced an FDA-mandated recall of insulin pumps. A potential cybersecurity vulnerability could have allowed hackers to take control of the pumps via remote access.
The discovery of the XZ Utils backdoor in 2024, which targeted OpenSSH on Linux systems, revealed a new dimension of cyber threats: compromised open-source software components. This carefully orchestrated supply chain attack injected malicious code through the build system, leaving the human-readable source code looking unchanged. The discovery was made incidentally through runtime measurements that highlighted discrepancies in software execution times. This incident underscores the vulnerabilities inherent in open-source ecosystems, particularly in the supply chain, where attackers can exploit multiple entry points. To mitigate such risks, organizations need robust security measures, including:
1. Enhanced Scrutiny of Build Systems: Ensuring that the processes and tools used for compiling and deploying software are secure and monitored continuously.
2. Comprehensive Auditing: Regularly auditing the source code and build processes for any anomalies that might suggest tampering or malicious alterations.
3. Implementing Software Composition Analysis (SCA): Tools that can identify and manage open-source components within a project to ensure they are up-to-date and free from known vulnerabilities.
4. Adopting Observability and Monitoring: Utilizing advanced observability tools to detect any unusual behavior or discrepancies in execution that could indicate the presence of a backdoor or other malicious code.
5. Education and Awareness: Training developers and relevant personnel on the importance of security in the software development lifecycle and keeping them informed about the latest in cyber threat intelligence.
The XZ Utils backdoor serves as a harsh reminder of the evolving capabilities of cyber adversaries and the need for continuous vigilance and improvement in cyber defenses, especially concerning open-source software.
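The kind of execution-time discrepancy mentioned above can be checked systematically rather than noticed by chance. The following sketch is a hypothetical illustration and not a reconstruction of how the backdoor was actually found: it compares the median latency of the same operation measured on a known-good reference build and on a candidate build. The function name timing_regression and the 20 % tolerance are assumptions made for this example.

```python
from statistics import median


def timing_regression(reference_ms, candidate_ms, tolerance=0.20):
    """Compare the median latencies of two builds.

    Returns (flagged, relative_change). flagged is True when the candidate's
    median exceeds the reference median by more than `tolerance`.
    """
    if not reference_ms or not candidate_ms:
        raise ValueError("both sample sets must be non-empty")
    ref = median(reference_ms)
    if ref <= 0:
        raise ValueError("reference median must be positive")
    # The median is used instead of the mean so that a single slow
    # outlier in either sample set does not decide the verdict.
    change = (median(candidate_ms) - ref) / ref
    return change > tolerance, change
```

A flagged result does not prove tampering; it only marks a build whose behavior has changed without an obvious explanation and therefore deserves a closer look.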
These cases highlight that traditional testing and debugging methods are no longer sufficient to cope with the complexity of today’s connected, software-driven devices. Software issues in critical systems have tangible, often costly consequences. Since developers cannot possibly foresee every potential operating condition, it is impossible to test every edge case before market launch. This reality necessitates a shift towards more dynamic and adaptive testing and monitoring strategies that can accommodate the unforeseen and rapidly evolving scenarios that modern software environments face. Some advanced approaches include:
1. Continuous Integration and Continuous Deployment (CI/CD): This approach allows for ongoing testing and deployment of software changes, enabling teams to identify and address issues more rapidly and iteratively.
2. Shift-Left Testing: Integrating testing early and often in the software development lifecycle, so that developers focus on quality from the beginning rather than treating it as a final step.
3. Automated Testing: Using automated testing tools to handle repetitive tasks and to simulate a wide range of operating conditions, which might be impractical or impossible to replicate manually.
4. Chaos Engineering: Deliberately introducing disturbances into systems in controlled environments to understand how they react and how resilient they are, helping to anticipate and mitigate potential failures in real-world operations.
5. Artificial Intelligence and Machine Learning: Employing AI and ML to predict potential failure points and automate problem-solving processes, enhancing the capability to manage and rectify software anomalies efficiently.
6. Enhanced Observability: Implementing comprehensive observability tools that provide insights not just into what happened but also why it happened, offering an in-depth view into the system's internal state.
These advanced methodologies, while requiring upfront investment and cultural adjustment within development teams, are vital for ensuring the reliability, safety, and security of complex software systems in a cost-effective manner. The goal is to evolve from reactive problem-solving to a more proactive, preventive approach in software quality assurance.
At this juncture, Continuous Observability and Observability-Driven Development (ODD) play crucial roles. By implementing ODD, developers can detect and rectify problems early on before they lead to widespread recalls or delays. Continuous Observability provides ongoing insights into system behavior in real-world environments. This enables the detection of anomalies, optimization of performance, and assurance of safety and reliability of products even after delivery. The benefits of Continuous Observability and ODD include:
1. Early Detection and Resolution: Observability tools can identify potential issues early in the development cycle, allowing teams to address them before they escalate into more significant problems.
2. Real-time Monitoring: Continuous tracking of a system’s operational data helps developers understand how the software behaves under various conditions. This real-time data is invaluable for quick diagnostics and enhancements.
3. Feedback Loop Enhancement: Continuous Observability strengthens the feedback loop in software development. Insights gained from production are fed back into the development and testing phases, improving the quality of subsequent releases.
4. Predictive Analytics: By analyzing trends and patterns from the data collected, teams can predict potential failures before they occur, allowing preemptive action to be taken.
5. Improved User Experience: With ODD, software teams can continuously refine the user experience based on real user interactions, leading to products that better meet user needs and expectations.
6. Compliance and Reporting: Continuous Observability helps in maintaining compliance with regulations by providing historical and real-time data necessary for audits and reports.
Implementing these methodologies requires an investment in advanced monitoring tools, training for development teams, and a cultural shift towards embracing proactive and preventive software quality measures.
However, the long-term benefits in terms of reduced downtime, fewer emergency patches, and enhanced user satisfaction make these investments worthwhile.
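The predictive use of collected data mentioned above can be made concrete with a small sketch. Assume, purely for illustration, that a device reports its free heap memory at regular intervals; a least-squares trend line then gives a rough estimate of how many further intervals remain before the memory runs out. The function name samples_until_exhaustion and the choice of free heap as the metric are hypothetical.

```python
def samples_until_exhaustion(free_bytes, floor=0):
    """Estimate the remaining sampling intervals until a declining metric hits `floor`.

    Fits a least-squares line through equally spaced samples. Returns None
    when the trend is flat or rising, i.e. no exhaustion is predicted.
    """
    n = len(free_bytes)
    if n < 2:
        raise ValueError("need at least two samples")
    xs = range(n)
    x_mean = (n - 1) / 2
    y_mean = sum(free_bytes) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, free_bytes))
    slope = sxy / sxx
    if slope >= 0:
        return None
    intercept = y_mean - slope * x_mean
    crossing = (floor - intercept) / slope  # sample index where the line reaches floor
    # Count from the most recent sample (index n - 1); never report a negative value.
    return max(0.0, crossing - (n - 1))
```

For the samples 1000, 900, 800, 700 the fitted line loses 100 bytes per interval and reaches zero seven intervals after the last sample. A linear fit is the simplest possible model; real memory leaks are often bursty, so such an estimate is a prompt for investigation rather than a precise forecast.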
In industries such as automotive, medical technology, and aerospace, where the cost of failure is extraordinarily high, the benefits of Observability-Driven Development (ODD) and Continuous Observability prove crucial. Whether it's about preventing a life-threatening malfunction of a medical device or averting a widespread recall in the automotive industry, the capability to continuously monitor, analyze, and enhance software performance throughout the entire product lifecycle becomes an indispensable component of modern development strategies. The stakes in these industries are particularly high due to the potential impact on human safety and significant financial risks associated with system failures. For example:
1. Automotive Industry: Vehicle recalls not only cost manufacturers billions in direct expenses but also significantly damage brand reputation and customer trust. Continuous Observability can detect anomalies in vehicle systems early, potentially preventing malfunctions that would require recalls.
2. Medical Technology: In this sector, device reliability can directly impact patient health. Continuous monitoring ensures that any deviations in device behavior can be quickly identified and rectified, maintaining the stringent reliability standards needed in healthcare.
3. Aerospace: The complexity and cost of aerospace systems make failures exceptionally risky and expensive. Continuous Observability allows for the detailed monitoring of spacecraft and aircraft systems to ensure any potential issues are identified and addressed during the flight or even beforehand during testing and development stages.
Implementing these systems involves integrating robust monitoring infrastructure, setting up detailed analytics to understand vast amounts of data collected, and training teams to respond swiftly to insights provided by these systems.
The investment in such technologies leads to safer, more reliable products that comply with the high standards expected in these critical fields and reduces the long-term costs associated with potential failures and recalls. This proactive approach not only saves resources but also reinforces consumer and regulatory trust, providing a competitive edge in sensitive markets.
The path to the future
With Tracealyzer for development, Detect for system testing, and DevAlert for field deployment, Percepio offers a comprehensive continuous observability solution.
(Image: Percepio)
ODD requires a "shift-left" approach, where observability is planned from the beginning to support all subsequent phases of product development and maintenance. This is not only a concern for software quality assurance and test management but is becoming a key factor in the performance and competitiveness of product development companies.
In an era where the complexity of embedded systems is continuously increasing and we are moving towards an age of "software-defined everything," the question is no longer whether developers should adopt Continuous Observability and ODD practices, but when and how they can implement them most effectively.
Percepio's comprehensive Continuous Observability solution addresses these challenges effectively. The portfolio, which includes Tracealyzer for development, Detect for system testing, and DevAlert for deployment, covers the entire product lifecycle. These tools provide real-time insights into the behavior of complex embedded software, enabling developers to accelerate debugging, enhance product quality, and minimize deployment risks. The scalability of the solution, from small IoT nodes to powerful multicore SoCs, exemplifies how Continuous Observability can be implemented in embedded systems to effectively meet the challenges of increasing complexity. Each tool in Percepio’s suite serves a crucial role:
1. Tracealyzer: Offers deep visualization capabilities that allow developers to trace the real-time execution of embedded software, presenting a clear view of system behavior that aids in optimizing software performance and functionality.
2. Detect: Facilitates rigorous testing by identifying anomalies and performance issues during the software testing phase, ensuring any potential problems are addressed before deployment.
3. DevAlert: Provides a real-time diagnostics solution that integrates with deployed systems to continuously monitor and alert developers to issues post-deployment, thereby enhancing ongoing maintenance and updates.
This integrated approach not only ensures that each phase of the development and deployment process is well-supported but also adapts to the specific needs and complexities of varying types of embedded systems. By incorporating these advanced tools into their development strategies, organizations can significantly reduce time-to-market and improve reliability and safety of their products, making Percepio’s solution a valuable asset in the rapidly advancing field of embedded systems.