Epic: Implement a Self-Configuring Provider Generation System
Labels: epic, research, architecture, agent-autonomy, llm
Opened: 2025-07-14
1. Overview & Motivation
The ARLA platform has successfully implemented a decoupled provider pattern, which separates the core agent-engine from simulation-specific logic. This is a major architectural strength. However, it still requires a human developer to write the concrete provider implementations (e.g., SoulSimRewardCalculator) for each new simulation environment.
This epic proposes the next major evolution of ARLA's architecture: to make the agents themselves responsible for creating their own providers on the fly.
The vision is to move from a "pluggable" system to a "self-configuring" one. When an agent enters a new, unseen environment, it should be able to introspect its own components and the structure of the world, reason about the "physics" of its new reality, and dynamically generate the provider code it needs to function and learn. This represents a significant step towards truly general and autonomous agents, shifting the developer's role from writing world-specific logic to designing the agent's meta-learning and self-configuration capabilities.
2. Architectural Vision: The "Provider Generation" Meta-System
The core of this feature will be a new, high-level cognitive system that runs once for each agent at the beginning of a simulation.
New System: ProviderGenerationSystem (agent-engine/systems/)
- Purpose: To inspect the agent's components and the environment, and then use the LLM via the CognitiveScaffold to write and load the necessary provider classes (RewardCalculator, StateEncoder, etc.) at runtime.
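The "contract" these providers must fulfill could look something like the following sketch. The interface names come from this epic, but the method signatures and type hints are illustrative assumptions, not ARLA's actual API:

```python
# Hypothetical sketch of the provider interfaces handed to the LLM.
# Names (RewardCalculatorInterface, StateEncoderInterface) are from the epic;
# the method signatures are assumptions for illustration.
from abc import ABC, abstractmethod
from typing import Any, Hashable


class RewardCalculatorInterface(ABC):
    """Maps a state transition to a scalar reward for the learning systems."""

    @abstractmethod
    def calculate_reward(self, old_state: dict[str, Any], new_state: dict[str, Any]) -> float:
        ...


class StateEncoderInterface(ABC):
    """Encodes an agent's component data into a hashable state key."""

    @abstractmethod
    def encode(self, components: dict[str, Any]) -> Hashable:
        ...
```

Because these are abstract base classes, a generated provider that fails to implement a required method cannot even be instantiated, which gives an early failure signal.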
High-Level Workflow:
- Introspection: At the start of a simulation, the ProviderGenerationSystem for a given agent will perform a detailed scan of:
  - Its own components (e.g., PortfolioComponent, HealthComponent).
  - The attributes of those components (e.g., cash_balance: float).
  - The components of other entities it can observe in the environment.
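The introspection step could be sketched roughly as follows. The component classes and the schema format here are illustrative stand-ins, not ARLA's actual types:

```python
# Illustrative introspection pass: derive a "world schema" from an agent's
# component instances. The dataclass components are hypothetical stand-ins.
from dataclasses import dataclass, fields


@dataclass
class PortfolioComponent:
    cash_balance: float = 100.0
    total_value: float = 100.0


@dataclass
class TimeBudgetComponent:
    current_time_budget: float = 50.0


def build_world_schema(components: list) -> dict[str, dict[str, str]]:
    """Map each component class name to its attribute names and type names."""
    schema = {}
    for comp in components:
        schema[type(comp).__name__] = {
            # Field annotations may already be strings under
            # `from __future__ import annotations`; handle both cases.
            f.name: f.type if isinstance(f.type, str) else f.type.__name__
            for f in fields(comp)
        }
    return schema
```

The resulting dictionary is a compact, serializable summary that can be embedded directly into the meta-prompt in the next step.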
- Meta-Prompt Construction: The system will then construct a detailed, structured prompt to be sent to the LLM. This prompt is critical and will contain:
  - The Goal: A high-level objective for the agent (e.g., "My primary objective is to maximize my TimeBudgetComponent.current_time_budget.").
  - The Interfaces: The full source code of the provider interfaces it needs to implement (e.g., RewardCalculatorInterface, StateEncoderInterface). This gives the LLM a clear "contract" to fulfill.
  - The "Physics" (World Schema): A summary of the components and attributes it discovered during introspection. For example: "I have a component called PortfolioComponent with an attribute total_value: float. My goal is to maximize survival, which is tied to my TimeBudgetComponent. Therefore, a good reward signal would likely be a positive change in my PortfolioComponent.total_value."
  - The Task: A clear instruction to the LLM: "You are an expert AI research programmer. Your task is to write the complete Python code for a set of provider classes that correctly implement the given interfaces. The logic in these classes should be designed to help me achieve my primary objective given the world schema. Output only the raw, executable Python code."
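A minimal sketch of assembling those four parts into a single prompt string. The function name, argument names, and schema layout are assumptions for illustration:

```python
# Hypothetical meta-prompt builder combining goal, interfaces, and world schema.
def build_meta_prompt(goal: str, interface_source: str, schema: dict) -> str:
    # Render the introspected schema as one line per component.
    schema_lines = "\n".join(
        f"- {name}: {', '.join(f'{a}: {t}' for a, t in attrs.items())}"
        for name, attrs in schema.items()
    )
    return (
        "You are an expert AI research programmer.\n"
        f"Goal: {goal}\n\n"
        "Interfaces to implement:\n"
        f"{interface_source}\n\n"
        "World schema discovered via introspection:\n"
        f"{schema_lines}\n\n"
        "Task: Write the complete Python code for a set of provider classes "
        "that correctly implement the given interfaces and help achieve the "
        "goal above. Output only the raw, executable Python code."
    )
```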
- Dynamic Code Generation and Injection:
  - The ProviderGenerationSystem sends the prompt through the CognitiveScaffold.
  - It receives a string containing the complete Python code for the new provider classes.
  - It uses a function like exec() to execute this code in a restricted namespace, dynamically defining the new classes (e.g., AutoGeneratedRewardCalculator) in the agent's runtime memory.
  - It then instantiates these newly defined classes and injects them into the other core systems (QLearningSystem, ActionSystem, etc.), which are waiting for their dependencies.
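The execute-then-validate step can be sketched as follows. The hardcoded LLM_OUTPUT string stands in for a real CognitiveScaffold response, and all class names are illustrative; note this restricted-namespace approach is only a first line of defense, not the full sandbox Phase 4 calls for:

```python
# Sketch of dynamic execution: run LLM-generated source in a restricted
# namespace, pull out the new class, and type-check it before injection.
class RewardCalculatorInterface:
    def calculate_reward(self, old_state, new_state) -> float:
        raise NotImplementedError


# Stand-in for the string returned through the CognitiveScaffold.
LLM_OUTPUT = """
class AutoGeneratedRewardCalculator(RewardCalculatorInterface):
    def calculate_reward(self, old_state, new_state):
        return new_state["total_value"] - old_state["total_value"]
"""


def load_generated_provider(source: str) -> RewardCalculatorInterface:
    # Restricted namespace: expose only the interface plus the bare minimum
    # of builtins needed to define a class (no open, no __import__, etc.).
    namespace = {
        "RewardCalculatorInterface": RewardCalculatorInterface,
        "__name__": "generated_providers",
        "__builtins__": {"__build_class__": __build_class__},
    }
    exec(source, namespace)
    instance = namespace["AutoGeneratedRewardCalculator"]()
    # Validate against the interface before injecting into other systems.
    if not isinstance(instance, RewardCalculatorInterface):
        raise TypeError("Generated provider does not implement the interface")
    return instance
```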
3. Implementation Plan
This is a major feature and should be rolled out in phases.
Phase 1: Core System and Dynamic Execution
- Create the ProviderGenerationSystem in the agent-engine.
- Implement the core loop: build the meta-prompt, send it through the CognitiveScaffold, and use exec() to execute the LLM's code and retrieve the newly defined classes.
Phase 2: Dependency Injection Mechanism
- Refactor the SystemManager to support late-stage dependency injection. The manager needs to be able to hold off on fully initializing systems like QLearningSystem until after the ProviderGenerationSystem has run and created the necessary providers.
- The ProviderGenerationSystem will need a way to pass the newly created provider instances to the SystemManager, which will then complete the initialization of the other systems.
Phase 3: Simplified Simulation Entry Point
- Refactor the simulations/soul_sim/run.py file to remove the manual instantiation of all the SoulSim... providers.
- The run.py file will now only need to register the ProviderGenerationSystem alongside the other core systems.
Phase 4: Safety, Validation, and Testing
- Sandbox the exec() call to minimize security risks. The executed code should run in a restricted environment with no file system or network access.
- After generating a provider, the ProviderGenerationSystem should validate it using isinstance() to ensure it correctly implements the required interface.
- Write tests for the ProviderGenerationSystem itself, likely using a mock LLM that returns a pre-written, valid provider implementation as a string.
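The mock-LLM testing idea from Phase 4 might look like the sketch below. The CognitiveScaffold's real API is not shown in this epic, so the single `query` method on the mock, and every other name here, is an assumption:

```python
# Sketch of Phase 4 testing: a mock scaffold returns a pre-written,
# known-good provider source instead of calling a real LLM.
VALID_PROVIDER_SOURCE = """
class AutoGeneratedStateEncoder:
    def encode(self, components):
        return tuple(sorted(components.items()))
"""


class MockCognitiveScaffold:
    """Stands in for the real scaffold; its interface is an assumption."""

    def query(self, prompt: str) -> str:
        return VALID_PROVIDER_SOURCE


def generate_state_encoder(scaffold) -> object:
    source = scaffold.query("...meta-prompt would go here...")
    # Only the builtins the generated code legitimately needs are exposed.
    namespace = {
        "__name__": "generated_providers",
        "__builtins__": {"__build_class__": __build_class__,
                         "tuple": tuple, "sorted": sorted},
    }
    exec(source, namespace)
    return namespace["AutoGeneratedStateEncoder"]()
```

A deterministic mock like this lets the dynamic-execution and injection paths be exercised in CI without LLM cost or nondeterminism.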
4. Definition of Done
- A ProviderGenerationSystem exists in the agent-engine.
- All manual provider instantiation has been removed from the simulation's run.py.
- An agent, placed in the soul-sim environment, can dynamically generate a functional set of providers that allow it to learn and act in the world.