Breaking Changes

v0.2.0 (unreleased)

Per-task scope: `scope` and `read_only` may be a `ScopeResolver` (additive)

Controller(scope=...) and Controller(read_only=...) now each also accept a ScopeResolver (Callable[[Task], Scope], exported as superred.core.ScopeResolver), resolved once per task (independently of each other), in addition to the classic fixed Scope. This is purely additive: passing a frozenset behaves exactly as before.

from superred.core import ScopeResolver
from superred.core.controller import Controller

controller = Controller(
    scope=lambda task: frozenset({tag_for(task)}),  # resolver, called once per task
    scope_label="per-goal",                          # required in this mode
    ...,
)

What’s new:

scope_label: str | None = None new constructor arg. Required (non-empty str) when either scope or read_only is a callable; must be None when both are fixed frozensets (else ValueError). The fixed-scope non-empty (scope | read_only) check is unchanged.
Either resolver may raise NotApplicable, which contributes an empty set for its own dimension, exactly like returning frozenset(). The task is skipped (skipped_tasks) when the resolved visibility (scope | read_only) is empty, i.e. no tag is granted in either dimension; any tag (read or write, from either resolver) means the task runs. A resolver raising any other exception fails just that task (stop_reason="error") and leaves siblings running.
TaskResult.scope and TaskResult.read_only new fields (default frozenset()) recording the scope enforced for that task. In static mode every TaskResult.scope equals the controller scope; with a resolver it is the per-task resolved scope.
ThreatModelResult.scope_label: str | None new field (default None). In static mode scope/read_only stay the concrete frozensets and scope_label is None (unchanged). In dynamic mode ThreatModelResult.scope and read_only are empty frozensets and scope_label carries the run identity (the per-task truth lives on each TaskResult.scope).
Persistence: in dynamic mode the filename stem is the sanitized scope_label ({label}__{model}.json + {label}__{model}/); the claim summary gains a scope_label field (with empty scope/read_only arrays), and each per-task detail file records that task’s own resolved scope/read_only. Static-mode filenames are unchanged.
SCHEMA_VERSION bumped 1 → 2: persisted JSON now carries scope_label and per-task scopes in detail files.
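
The resolution rules in the list above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the library's implementation. Tasks and tags are modeled as plain strings, `NotApplicable` is a local stand-in for the library's exception, and `check_scope_label`, `resolve_dimension`, and `is_skipped` are hypothetical helper names. It also assumes that a missing label in dynamic mode raises ValueError, like a stray label in static mode.

```python
from typing import Callable, FrozenSet, Optional, Union

Scope = FrozenSet[str]  # stand-in; the real Scope holds SecurityDomainTag values
ScopeOrResolver = Union[Scope, Callable[[str], Scope]]  # task modeled as a str


class NotApplicable(Exception):
    """Local stand-in for the library's NotApplicable exception."""


def check_scope_label(scope: ScopeOrResolver, read_only: ScopeOrResolver,
                      scope_label: Optional[str]) -> None:
    """Label is required in dynamic mode and forbidden in static mode."""
    dynamic = callable(scope) or callable(read_only)
    if dynamic and not (isinstance(scope_label, str) and scope_label):
        raise ValueError("scope_label must be a non-empty str with a resolver")
    if not dynamic and scope_label is not None:
        raise ValueError("scope_label must be None when both scopes are fixed")


def resolve_dimension(dimension: ScopeOrResolver, task: str) -> Scope:
    """Resolve one dimension for one task; NotApplicable counts as empty."""
    if not callable(dimension):
        return dimension
    try:
        return dimension(task)
    except NotApplicable:
        return frozenset()


def is_skipped(scope: ScopeOrResolver, read_only: ScopeOrResolver, task: str) -> bool:
    """A task is skipped only when neither dimension grants any tag."""
    visibility = resolve_dimension(scope, task) | resolve_dimension(read_only, task)
    return not visibility
```

Any exception other than `NotApplicable` propagates out of `resolve_dimension` here; per the notes above, the controller turns that into a per-task failure rather than aborting sibling tasks.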

Read-only access via a `read_only` Controller argument (replaces `ReadOnly`/`ScopeSpec`)

scope keeps its original meaning — the read & write surface (visible AND injectable). A new optional read_only argument adds tags that are visible but not injectable. read_only defaults to empty, so the whole scope is read & write, identical to the previous behavior; existing Controller(scope=...) calls are unaffected. Access level is expressed by which of the two plain Scope sets a tag lands in; the short-lived ReadOnly, ScopeSpec, and resolve_scope symbols are removed.

# See the whole system subtree, inject only the prompt:
controller = Controller(
    scope=frozenset({prompt_tag}),     # read & write
    read_only=frozenset({system_tag}), # visible only
    ...,
)

A read_only tag already covered by scope has no effect (read & write overrules: only scope drives injection, so it stays injectable). scope and read_only cannot both be empty. ThreatModelResult gains a read_only: Scope field, and persisted JSON carries read_only (replacing read_only_scope); runs with read-only tags get a __ro_{read_only} filename component.

At optimizer.initialize(), the controllables list now means exactly “the surfaces the optimizer can inject into” (filtered by the read & write scope). A read-only controllable — visible but not injectable — is no longer in that list; it is re-presented in observables (as an ObservableValue with content=None). For an all-read & write run this changes nothing (no read-only controllables exist).
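
One way to picture the split between the two sets is the sketch below. It assumes flat string tags matched by plain membership, whereas the real check goes through tag inclusion, which this sketch ignores. The function name `partition_controllables` and the (name, tag) pair shape are hypothetical.

```python
from typing import FrozenSet, List, Tuple

Scope = FrozenSet[str]  # stand-in for frozenset[SecurityDomainTag]


def partition_controllables(
    controllables: List[Tuple[str, str]],  # (name, tag) pairs, hypothetical shape
    scope: Scope,
    read_only: Scope,
) -> Tuple[List[str], List[str]]:
    """Split controllables into injectable names and visible-only names."""
    injectable: List[str] = []
    visible_only: List[str] = []
    for name, tag in controllables:
        if tag in scope:
            # Read & write wins, even if the tag is also listed in read_only.
            injectable.append(name)
        elif tag in read_only:
            # Visible but not injectable: surfaces as an observable instead.
            visible_only.append(name)
        # Tags in neither set are not visible at all and are dropped.
    return injectable, visible_only
```

With an empty `read_only`, the second list is always empty, which matches the statement that all-read & write runs see no change.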

Controller is now one threat model: `target_factory`, single `scope`, single `llm_config`

The controller was previously a fan-out: it iterated the Cartesian product of scopes × llm_configs and returned a ControllerResult wrapping a list of ThreatModelResult. It now models exactly one threat model — a single (scope, llm_config) combination — and returns a ThreatModelResult directly. Sweeping multiple threat models is the caller’s job.

The same change replaces target: Target with target_factory: TargetFactory so each task in the claim gets its own target instance and tasks run concurrently up to target_factory.concurrency.

# Before
controller = Controller(
    optimizer_factory=lambda: MyOptimizer(),
    target=MyTarget(api_key="sk-..."),
    security_claim=claim,
    llm_configs=[cfg_a, cfg_b],          # list, optional
)
result = await controller.run(scopes=[scope_a, scope_b])   # Cartesian product
task_results = result.threat_model_results[0].task_results  # one ThreatModelResult per (scope, cfg)

# After
from superred.core.controller import Controller, TargetFactory

target_factory = TargetFactory(
    create=lambda: MyTarget(api_key="sk-..."),
    concurrency=8,                       # default 1 (sequential, old per-task behavior)
)
controller = Controller(
    optimizer_factory=lambda: MyOptimizer(),
    target_factory=target_factory,
    security_claim=claim,
    scope=scope_a,                       # required, single Scope
    llm_config=cfg_a,                    # optional, single LLMConfig
)
tmr = await controller.run()             # -> ThreatModelResult (not ControllerResult)
task_results = tmr.task_results

# Multiple threat models = multiple controllers at the caller:
import asyncio
import itertools
results = await asyncio.gather(*(
    Controller(
        scope=s, llm_config=c,
        optimizer_factory=..., target_factory=target_factory,
        security_claim=claim,
    ).run()
    for s, c in itertools.product([scope_a, scope_b], [cfg_a, cfg_b])
))

What changed:

target → target_factory: each task gets its own Target from target_factory.create(). Concurrent tasks never share mutable target state.
Parallel tasks within a threat model: bounded by target_factory.concurrency via asyncio.Semaphore + asyncio.gather. Default concurrency=1 preserves sequential behavior.
target.teardown() is per-task and runs in the inner finally before the next task acquires its semaphore slot.
llm_configs: Sequence[LLMConfig] → llm_config: LLMConfig | None: one config per controller, no list.
scope: Scope is now a required constructor arg; run() no longer takes scopes= or models=.
ControllerResult is removed; controller.run() returns a ThreatModelResult directly.
No default-scope behavior: experiments that want target.security_domain.distinct_combinations() build it themselves before constructing controllers.
Partial trajectory + exception on failure: when a run raises mid-execution, the trajectory accumulated up to the crash is preserved as the final RunResult with a zero-score evaluation, and TaskResult.error carries the formatted traceback. Both fields land in the persisted JSON.
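
The per-task target and bounded concurrency described in the list above can be sketched with asyncio alone. This is an illustration of the pattern (semaphore, gather, teardown in `finally`), not the controller's actual code. `EchoTarget`, `FactorySketch`, and `run_tasks` are hypothetical names, and the class-level counters exist only to make the concurrency bound observable.

```python
import asyncio
from dataclasses import dataclass
from typing import Callable, List


class EchoTarget:
    """Hypothetical target; counts how many instances run at the same time."""

    active = 0
    peak = 0

    async def run(self, task: str) -> str:
        cls = type(self)
        cls.active += 1
        cls.peak = max(cls.peak, cls.active)
        await asyncio.sleep(0.01)  # simulate I/O so tasks can overlap
        cls.active -= 1
        return f"ran {task}"

    async def teardown(self) -> None:
        self.torn_down = True


@dataclass(frozen=True)
class FactorySketch:
    create: Callable[[], EchoTarget]
    concurrency: int = 1


async def run_tasks(factory: FactorySketch, tasks: List[str]) -> List[str]:
    semaphore = asyncio.Semaphore(factory.concurrency)

    async def run_one(task: str) -> str:
        async with semaphore:
            target = factory.create()  # fresh instance, never shared
            try:
                return await target.run(task)
            finally:
                # Runs before the slot is released to the next task.
                await target.teardown()

    # gather returns results in task order, regardless of completion order.
    return list(await asyncio.gather(*(run_one(t) for t in tasks)))
```

Calling `asyncio.run(run_tasks(FactorySketch(create=EchoTarget, concurrency=2), tasks))` keeps at most two targets alive at once. Leaving `concurrency` at its default of 1 runs the tasks one after another.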

Migration:

Wrap target construction: target=MyTarget(...) → target_factory=TargetFactory(create=lambda: MyTarget(...)). For tests, use TargetFactory.singleton(my_target) (concurrency=1).
Move scope/llm_config into the constructor: drop run(scopes=[s]), add scope=s and llm_config=cfg to Controller(...).
Replace result.threat_model_results[0] with the direct return.
For sweeps, build one Controller per (scope, llm_config) combination and gather them at the experiment level.
For real targets that can serve parallel requests (chatbots wrapping API calls), bump concurrency= to match the deployed rate limit.

Optimizer.initialize() signature change

The llm_client parameter on Optimizer.initialize() is now required (LLMClient, not LLMClient | None). The base class stores the client — subclasses must call super().initialize(...) for self.llm to work.

# Before
async def initialize(
    self,
    goal: Goal,
    controllables: list[Controllable],
    observables: list[ObservableValue],
    llm_client: LLMClient | None = None,
) -> None: ...

# After
async def initialize(
    self,
    goal: Goal,
    controllables: list[Controllable],
    observables: list[ObservableValue],
    llm_client: LLMClient,
) -> None: ...

Impact: All existing Optimizer subclasses must update their initialize() signature to accept llm_client: LLMClient (required, not optional) and call super().initialize(...).

Migration: Change llm_client: LLMClient | None = None to llm_client: LLMClient and add a super() call.

class MyOptimizer(Optimizer):
    async def initialize(
        self, goal, controllables, observables, llm_client,
    ) -> None:
        await super().initialize(goal, controllables, observables, llm_client)
        # self.llm is now available
        ...
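
To see why the super() call matters, here is a small self-contained sketch. `OptimizerSketch` is a stand-in that only mimics the documented behavior (the base `initialize()` stores the client on `self.llm`); it is not the real `Optimizer` base class, and the subclass names are hypothetical.

```python
import asyncio
from typing import Any, List, Tuple


class OptimizerSketch:
    """Stand-in base class: initialize() is what stores the client."""

    async def initialize(self, goal: Any, controllables: List[Any],
                         observables: List[Any], llm_client: Any) -> None:
        self.llm = llm_client


class CallsSuper(OptimizerSketch):
    async def initialize(self, goal, controllables, observables, llm_client) -> None:
        await super().initialize(goal, controllables, observables, llm_client)
        self.goal = goal


class SkipsSuper(OptimizerSketch):
    async def initialize(self, goal, controllables, observables, llm_client) -> None:
        self.goal = goal  # base initialize() never runs, so self.llm is never set


async def demo() -> Tuple[bool, bool]:
    good, bad = CallsSuper(), SkipsSuper()
    await good.initialize("goal", [], [], llm_client="client")
    await bad.initialize("goal", [], [], llm_client="client")
    return hasattr(good, "llm"), hasattr(bad, "llm")
```

In this sketch only `CallsSuper` ends up with an `llm` attribute; the subclass that skips the super() call would hit an AttributeError on first use of `self.llm`.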

Controller llm_config is now required

The llm_config parameter on Controller is now required (no longer optional). LLM access is part of the threat model and must always be specified.

Migration: Pass llm_config=LLMConfig(...) to the Controller constructor.

LLMUsage is no longer optional on result types

RunResult.llm_usage and TaskResult.llm_usage are now LLMUsage (not LLMUsage | None). They are always present since llm_config is required.

Migration: Remove is not None checks around llm_usage access.

New core dependency: litellm

The superred package now depends on litellm>=1.0, which pip installs automatically. No action is needed unless you pin dependencies, in which case add litellm to your pins.

Cost-based budget enforcement

LLMConfig now uses max_cost: float | None (USD) instead of the previous max_calls/max_input_tokens/max_output_tokens fields. LLMUsage now tracks calls: int and cost: float only (token fields removed). Budget enforcement is based on USD cost computed via litellm.completion_cost().

Migration: Replace max_calls=N / max_input_tokens=N / max_output_tokens=N with max_cost=X.XX (USD amount). Remove any references to input_tokens or output_tokens on LLMUsage.
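
A cost cap of this kind could be tracked roughly as follows. This is a sketch under assumptions: `BudgetSketch`, `UsageSketch`, and `BudgetExceeded` are hypothetical names, and the choice to check the cap before each call (rather than after) is a guess about enforcement timing. In a real run the per-call cost would come from litellm.completion_cost(); here it is passed in as a plain float.

```python
from dataclasses import dataclass, field
from typing import Optional


class BudgetExceeded(RuntimeError):
    """Hypothetical error type for this sketch."""


@dataclass
class UsageSketch:
    calls: int = 0
    cost: float = 0.0  # USD


@dataclass
class BudgetSketch:
    max_cost: Optional[float] = None  # None means unlimited
    usage: UsageSketch = field(default_factory=UsageSketch)

    def before_call(self) -> None:
        """Refuse a new call once the spent cost has reached the cap."""
        if self.max_cost is not None and self.usage.cost >= self.max_cost:
            raise BudgetExceeded(
                f"spent ${self.usage.cost:.2f} of ${self.max_cost:.2f}"
            )

    def after_call(self, call_cost: float) -> None:
        """Record one completed call and its USD cost."""
        self.usage.calls += 1
        self.usage.cost += call_cost
```

Because the check runs before a call, the final call can push the total slightly past the cap; a stricter design would need a cost estimate up front.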

Controller takes optimizer_factory instead of optimizer

A fresh Optimizer is now built per task via optimizer_factory, which replaces the old optimizer: Optimizer constructor arg. Combined with the threat-model collapse above, the migration is:

# Before
controller = Controller(
    optimizer=my_optimizer,
    target=target,
    security_claim=claim,
    security_domain_tag=external_tag,
    llm_config=llm_config,
)

# After (see also the target_factory / scope / llm_config section above)
controller = Controller(
    optimizer_factory=lambda: MyOptimizer(),
    target_factory=TargetFactory(create=lambda: MyTarget(...)),
    security_claim=claim,
    scope=frozenset({external_tag}),
    llm_config=llm_config,
)
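
The point of a factory over a shared instance is state isolation between tasks. The sketch below illustrates that with a hypothetical `RecordingOptimizer` and a simplified `run_tasks` loop; neither is part of superred.

```python
from typing import Callable, List


class RecordingOptimizer:
    """Hypothetical optimizer that accumulates per-task state."""

    def __init__(self) -> None:
        self.attempts: List[str] = []


OptimizerFactorySketch = Callable[[], RecordingOptimizer]


def run_tasks(factory: OptimizerFactorySketch, tasks: List[str]) -> List[List[str]]:
    """Build one optimizer per task and return each one's recorded attempts."""
    histories = []
    for task in tasks:
        optimizer = factory()  # fresh instance: no state from earlier tasks
        optimizer.attempts.append(f"attempt on {task}")
        histories.append(optimizer.attempts)
    return histories
```

Passing the class itself (or `lambda: RecordingOptimizer()`) gives each task an empty history. Passing a lambda that returns one shared instance reproduces the old behavior, where the second task sees the first task's attempts.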

New types: Scope, scope_includes, ThreatModelResult, OptimizerFactory

Scope = frozenset[SecurityDomainTag] — type alias for multi-tag attack surface scope.
scope_includes(scope, tag) — returns True if any tag in the scope includes the target tag.
ThreatModelResult — frozen dataclass grouping results for one (scope, llm_config) combination.
OptimizerFactory = Callable[[], Optimizer] — type alias for optimizer factories.

All are exported from superred.core and superred.core.types.
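
As a rough illustration of the "any tag in the scope includes the target tag" rule, the sketch below models tags as path tuples and assumes a tag includes itself and everything beneath it. That hierarchy rule is an assumption made for the example; the real `SecurityDomainTag` inclusion semantics may differ. `Tag`, `tag_includes`, and `scope_includes_sketch` are hypothetical names.

```python
from typing import FrozenSet, Tuple

Tag = Tuple[str, ...]  # stand-in for SecurityDomainTag, e.g. ("system", "prompt")
Scope = FrozenSet[Tag]


def tag_includes(outer: Tag, inner: Tag) -> bool:
    """Assumed rule: a tag includes itself and everything beneath it."""
    return inner[: len(outer)] == outer


def scope_includes_sketch(scope: Scope, tag: Tag) -> bool:
    """True if any tag in the scope includes the target tag."""
    return any(tag_includes(member, tag) for member in scope)
```

Under this model a scope of `frozenset({("system",)})` covers `("system", "prompt")`, but a scope holding only `("system", "prompt")` does not cover the broader `("system",)`.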

Feedback scoping: RunEndEvent change, evaluation order, include_feedback

Three interrelated changes to how feedback flows to the optimizer:

1. RunEndEvent carries evaluation, not trajectory

RunEndEvent.trajectory has been replaced with RunEndEvent.evaluation: EvaluationResult | None (default None). The optimizer no longer receives the trajectory through RunEndEvent — use self.current_trajectory instead (available via _dispatch).

# Before
if isinstance(event, RunEndEvent):
    for entry in event.trajectory.snapshot():
        ...

# After
if isinstance(event, RunEndEvent):
    # Read evaluation directly from the event:
    if event.evaluation is not None:
        score = event.evaluation.primary_score.value
    # Or from the trajectory:
    if self.current_trajectory is not None:
        for entry in self.current_trajectory.snapshot():
            ...

2. RunEndEvent is now persisted to the trajectory

RunEndEvent is now persisted to the trajectory; previously it was not. Its security_domain is set from the active scope, which trajectory validation requires. Evaluation happens before RunEndEvent is sent, so the order is: target.run() → evaluate() → RunEndEvent (with evaluation) → trajectory.close().

3. Controller include_feedback flag

Controller.__init__ accepts include_feedback: bool = True. When True, RunEndEvent.evaluation carries the filtered EvaluationResult; when False, evaluation is None. The optimizer reads feedback from event.evaluation on RunEndEvent, or from past trajectories (since RunEndEvent is persisted).
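
A sketch of how the flag affects what an optimizer sees. The dataclasses are local stand-ins that mirror only the fields named in these notes (`evaluation`, `primary_score.value`); `make_run_end_event` and `read_score` are hypothetical helpers, and the filtering of the evaluation result is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScoreSketch:
    value: float


@dataclass(frozen=True)
class EvaluationSketch:
    primary_score: ScoreSketch


@dataclass(frozen=True)
class RunEndEventSketch:
    evaluation: Optional[EvaluationSketch] = None


def make_run_end_event(evaluation: EvaluationSketch,
                       include_feedback: bool) -> RunEndEventSketch:
    """Controller side: the evaluation already exists when the event is built."""
    return RunEndEventSketch(evaluation=evaluation if include_feedback else None)


def read_score(event: RunEndEventSketch) -> Optional[float]:
    """Optimizer side: tolerate runs where feedback is withheld."""
    if event.evaluation is None:
        return None
    return event.evaluation.primary_score.value
```

An optimizer written this way works unchanged under both settings of the flag: it gets a score when feedback is included and None when it is withheld.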

Impact: Optimizers that accessed RunEndEvent.trajectory must switch to self.current_trajectory or event.evaluation. FeedbackEvent has been removed entirely, so any imports or isinstance checks for it will fail.

Migration:

Replace event.trajectory on RunEndEvent with self.current_trajectory or event.evaluation.
Remove all FeedbackEvent imports and handlers — the type no longer exists.
To read feedback, use event.evaluation on RunEndEvent or query past trajectories.

Target.cleanup() renamed to reset_ephemeral_state()

The Target ABC lifecycle method cleanup() is now reset_ephemeral_state(), with a sharper contract: it resets only ephemeral (per-run) state, leaving durable state intact. Durable state (for example a memory bank accumulated by a memory-injection attack) persists across runs within a task and is discarded only when the controller obtains a fresh instance from the TargetFactory between tasks. Resources and identity live for the instance’s lifetime and are released in teardown().

Behavior is unchanged: the controller still calls the method after each run’s evaluation, and once more post-task. Only the name and the documented contract change.

# Before
class MyTarget(Target):
    async def cleanup(self) -> None:
        self._last_response = ""

# After
class MyTarget(Target):
    async def reset_ephemeral_state(self) -> None:
        self._last_response = ""   # reset ephemeral state only; leave durable state intact

Impact: Every Target subclass must rename its cleanup() override to reset_ephemeral_state(). The method is abstract, so a subclass that still overrides only cleanup() can no longer be instantiated (TypeError at construction).

Migration: Rename async def cleanup to async def reset_ephemeral_state. If the old method reset state that an attack should be able to accumulate across runs (for example a memory store), move that state out of the method so it survives between runs.
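
The durable/ephemeral split and the instantiation failure can both be seen in a small standalone sketch. `TargetSketch` is a stand-in ABC reduced to the renamed method, not the real `Target`; `MemoryTarget`, `StaleTarget`, and `two_runs` are hypothetical.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import List


class TargetSketch(ABC):
    """Stand-in for the Target ABC, reduced to the renamed method."""

    @abstractmethod
    async def reset_ephemeral_state(self) -> None: ...


class MemoryTarget(TargetSketch):
    def __init__(self) -> None:
        self.memory_bank: List[str] = []  # durable: survives between runs
        self.last_response = ""           # ephemeral: cleared after each run

    async def run(self, prompt: str) -> str:
        self.memory_bank.append(prompt)
        self.last_response = f"echo: {prompt}"
        return self.last_response

    async def reset_ephemeral_state(self) -> None:
        self.last_response = ""  # memory_bank is deliberately left alone


class StaleTarget(TargetSketch):
    async def cleanup(self) -> None:  # old name only; abstract method not overridden
        pass


async def two_runs() -> MemoryTarget:
    target = MemoryTarget()
    for prompt in ("first", "second"):
        await target.run(prompt)
        await target.reset_ephemeral_state()
    return target
```

After `asyncio.run(two_runs())` the memory bank holds both prompts while `last_response` is empty. Calling `StaleTarget()` raises TypeError because the abstract method has no override.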