In the rapidly evolving field of autonomous systems, the safety and reliability of system components are fundamental requirements. These components are often vulnerable to complex and unforeseen environments, making naturalistic edge-case generation essential for enhancing system resilience. This paper presents GENESIS-RL, a novel framework that leverages system-level safety considerations and reinforcement learning techniques to systematically generate naturalistic edge cases. By simulating challenging conditions that mimic real-world situations, our framework aims to rigorously test the safety and reliability of the entire system. Although demonstrated within an autonomous driving application, our methodology is adaptable across diverse autonomous systems. Our experimental validation, conducted on a high-fidelity simulator, underscores the overall effectiveness of this framework.
DRL Problem formulation: Our formulation follows the Markov Decision Process (MDP) framework; we define the state space, action space, and reward as follows. The state space encompasses all conceivable states, including permutations of the parametric knobs, the system's behaviors, other actors, and features of the world. This state representation captures the dynamics of the world and the DRL agent's action inputs, and is conveyed through information obtained by the system. The action space is the set of all possible actions $a_t$ available to the agent, corresponding to the adjustments the agent can make to the parametric knobs within the simulation. To ensure that the changes introduced by the DRL agent lead to scenes that are natural and realistic, we impose constraints on the extent of modification possible at each step. Specifically, we limit the maximum percentage change that can be applied to any parametric knob by the DRL agent in a single action. This measure prevents extreme, unrealistic variations in conditions, thereby maintaining the realistic nature of the simulated scenes while still challenging the system under test.
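To make the per-step constraint concrete, the following is a minimal sketch of how a bounded knob update might look, assuming hypothetical knob names (rain intensity, fog density, sun altitude) and an assumed maximum per-step change of 10% of each knob's range; the actual knobs and bound used in GENESIS-RL may differ.

```python
import numpy as np

# Hypothetical parametric knobs and their valid ranges (assumed for illustration).
KNOB_RANGES = {
    "rain_intensity": (0.0, 1.0),
    "fog_density": (0.0, 1.0),
    "sun_altitude_deg": (-10.0, 90.0),
}
MAX_STEP_FRACTION = 0.10  # assumed bound: at most 10% of a knob's range per action


def apply_action(knobs: dict, action: np.ndarray) -> dict:
    """Apply a DRL action (values in [-1, 1]) to the parametric knobs.

    Each action dimension is scaled so that a single step can move a knob by at
    most MAX_STEP_FRACTION of its full range, keeping scene changes gradual and
    the resulting conditions naturalistic.
    """
    new_knobs = {}
    for (name, (low, high)), a in zip(KNOB_RANGES.items(), action):
        max_delta = MAX_STEP_FRACTION * (high - low)
        delta = float(np.clip(a, -1.0, 1.0)) * max_delta
        new_knobs[name] = float(np.clip(knobs[name] + delta, low, high))
    return new_knobs
```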
GENESIS-RL framework components: To effectively implement our DRL formulation, we designed a framework consisting of the following components: the DRL agent, the initial scene generator, the simulator, the system under test, and the reward calculator.
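The sketch below illustrates one way these five components could interact over a single episode. All class and method names are illustrative assumptions rather than the framework's actual API.

```python
from dataclasses import dataclass


@dataclass
class EpisodeResult:
    total_reward: float
    num_steps: int


def run_episode(agent, scene_generator, simulator, system_under_test,
                reward_calculator, max_steps: int = 100) -> EpisodeResult:
    """One episode: the agent perturbs the scene via the parametric knobs,
    the simulator renders it, the system under test acts, and the reward
    calculator scores how challenging the resulting scene was."""
    scene = scene_generator.sample()        # initial scene (weather, actors, layout)
    state = simulator.reset(scene)          # observation conveyed through the system
    total_reward = 0.0
    for _ in range(max_steps):
        action = agent.act(state)           # adjustment to the parametric knobs
        state, sim_info = simulator.step(action)
        system_output = system_under_test.run(state)
        reward = reward_calculator.score(system_output, sim_info)
        agent.observe(state, action, reward)
        total_reward += reward
    return EpisodeResult(total_reward=total_reward, num_steps=max_steps)
```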
Training the DRL agent: The DRL agent is trained through interactions with the environment, where it observes the states, applies actions, and receives rewards. The training process involves iterative episodes of simulation, during which the agent refines its policy to maximize the cumulative reward, effectively learning to identify and create challenging scenarios for the system.
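As a rough illustration of this iterative process, the loop below reuses the run_episode sketch above and assumes a generic policy-gradient-style agent with an update() method; the paper does not fix a particular DRL algorithm, so this is a hedged outline rather than the actual training procedure.

```python
def train(agent, scene_generator, simulator, system_under_test,
          reward_calculator, num_episodes: int = 1000):
    """Iteratively run episodes and refine the agent's policy to maximize
    cumulative reward, i.e., to learn scene perturbations that challenge
    the system under test."""
    history = []
    for episode in range(num_episodes):
        result = run_episode(agent, scene_generator, simulator,
                             system_under_test, reward_calculator)
        agent.update()                      # refine policy from collected transitions
        history.append(result.total_reward)
        if (episode + 1) % 50 == 0:
            recent = history[-50:]
            print(f"episode {episode + 1}: mean reward {sum(recent) / len(recent):.3f}")
    return history
```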