# Testing
PlanAI provides a dedicated planai.testing module with utilities for testing workers and workflows without making real LLM calls or touching the filesystem.
## Import

```python
from planai.testing import (
    MockLLM,
    MockLLMResponse,
    MockCache,
    InvokeTaskWorker,
    inject_mock_cache,
    add_input_provenance,
    unregister_output_type,
)
```

## InvokeTaskWorker
`InvokeTaskWorker` lets you test a single worker in isolation. It mocks the graph context and captures all published tasks.
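To make the capture behavior concrete, here is a toy harness in plain Python that records whatever a worker publishes. The names (`RecordingHarness`, `UppercaseWorker`) are invented for illustration; this is only the underlying idea, not how PlanAI implements `InvokeTaskWorker`.

```python
class RecordingHarness:
    """Illustrative sketch: redirect a worker's publish call into a list."""

    def __init__(self, worker):
        self.worker = worker
        self.published = []
        # Swap the publish mechanism for a recorder so tests can inspect
        # everything the worker would have sent downstream.
        worker.publish = self.published.append

    def invoke(self, task):
        self.worker.consume(task)
        return list(self.published)


class UppercaseWorker:
    """Toy worker: publishes the upper-cased input."""

    def consume(self, task):
        self.publish(task.upper())


harness = RecordingHarness(UppercaseWorker())
print(harness.invoke("hello"))  # → ['HELLO']
```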
### Testing a TaskWorker

```python
from typing import List, Type

from planai import Task, TaskWorker
from planai.testing import InvokeTaskWorker


class InputTask(Task):
    data: str


class OutputTask(Task):
    result: str


class MyWorker(TaskWorker):
    output_types: List[Type[Task]] = [OutputTask]

    def consume_work(self, task: InputTask):
        self.publish_work(OutputTask(result=f"Processed: {task.data}"), task)


def test_my_worker():
    worker = InvokeTaskWorker(MyWorker)
    published = worker.invoke(InputTask(data="hello"))

    worker.assert_published_task_count(1)
    worker.assert_published_task_types([OutputTask])
    assert published[0].result == "Processed: hello"
```

### Testing a JoinedTaskWorker
Use `invoke_joined()` instead of `invoke()` for joined workers:
```python
from planai import JoinedTaskWorker
from planai.testing import InvokeTaskWorker


class AggregatorWorker(JoinedTaskWorker):
    join_type: Type[TaskWorker] = MyWorker
    output_types: List[Type[Task]] = [SummaryTask]

    def consume_work_joined(self, tasks: List[OutputTask]):
        combined = ", ".join(t.result for t in tasks)
        self.publish_work(SummaryTask(summary=combined), tasks[0])


def test_aggregator():
    worker = InvokeTaskWorker(AggregatorWorker)
    inputs = [
        OutputTask(result="first"),
        OutputTask(result="second"),
    ]

    published = worker.invoke_joined(inputs)

    worker.assert_published_task_count(1)
    assert published[0].summary == "first, second"
```

### Passing Constructor Arguments
Pass any keyword arguments the worker expects:
```python
worker = InvokeTaskWorker(ChatTaskWorker, llm=mock_llm)
worker = InvokeTaskWorker(MyConfigurableWorker, threshold=0.5, name="test")
```

## MockLLM
`MockLLM` replaces a real LLM provider. You define patterns that are matched against the prompt text; a matching pattern returns either a Pydantic model or a plain string.
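Conceptually, the dispatch is a first-match-wins scan over the configured responses. The sketch below illustrates the idea with plain `re`; `mock_generate` is a hypothetical helper, not part of PlanAI, and how the real matcher anchors patterns is not shown here.

```python
import re

# (pattern, canned reply) pairs, checked in order; the first match wins.
responses = [
    (r"What is the capital of France", "The capital of France is Paris."),
    (r".*", "fallback answer"),
]


def mock_generate(prompt: str) -> str:
    for pattern, reply in responses:
        if re.search(pattern, prompt):
            return reply
    raise ValueError(f"no mock response for prompt: {prompt!r}")


print(mock_generate("What is the capital of France?"))
# → The capital of France is Paris.
```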
### String Responses

```python
from planai.testing import MockLLM, MockLLMResponse

mock_llm = MockLLM(responses=[
    MockLLMResponse(
        pattern=r"What is the capital of France",
        response_string="The capital of France is Paris.",
    ),
])
```

### Structured (Pydantic) Responses
When your worker expects structured output, return a Pydantic model:
```python
from planai.testing import MockLLM, MockLLMResponse


class PlanDraft(Task):
    plan: str


mock_llm = MockLLM(responses=[
    MockLLMResponse(
        pattern="Create a detailed plan.*",
        response=PlanDraft(plan="# My Plan\n1. Step one\n2. Step two"),
    ),
])
```

### Multiple Responses
Define multiple patterns; the first matching pattern wins:
```python
mock_llm = MockLLM(responses=[
    MockLLMResponse(
        pattern=r"Hello, how are you\?$",
        response_string="I'm doing well!",
    ),
    MockLLMResponse(
        pattern=r".*Analyze how well.*",
        response=ScoreOutput(score=0.8),
    ),
    MockLLMResponse(
        pattern="Create a refined.*plan.*",
        response=FinalPlan(plan="Refined plan", rationale="Improved"),
    ),
])
```

### Combining MockLLM with InvokeTaskWorker
This is the typical pattern for testing LLM-powered workers:
```python
from planai.testing import MockLLM, MockLLMResponse, InvokeTaskWorker

mock_llm = MockLLM(responses=[
    MockLLMResponse(
        pattern=r"Hello, how are you\?$",
        response_string="I'm doing well, thank you for asking!",
    ),
])

worker = InvokeTaskWorker(ChatTaskWorker, llm=mock_llm)

chat_task = ChatTask(
    messages=[ChatMessage(role="user", content="Hello, how are you?")]
)
published = worker.invoke(chat_task)

assert published[0].content == "I'm doing well, thank you for asking!"
```

## MockCache
`MockCache` is an in-memory replacement for the disk-based cache used by `CachedTaskWorker` and `CachedLLMTaskWorker`. It also tracks access statistics.
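The behavior described here can be sketched in a few lines of plain Python. `TinyMockCache` below is an illustrative stand-in, not PlanAI's implementation, though the `get_stats`, `set_stats`, and `dont_store` names mirror the documented API.

```python
from collections import defaultdict


class TinyMockCache:
    """Sketch: in-memory store plus access counters, with a dont_store
    switch that forces every lookup to miss."""

    def __init__(self, dont_store: bool = False):
        self.dont_store = dont_store
        self._data = {}
        self.get_stats = defaultdict(int)
        self.set_stats = defaultdict(int)

    def set(self, key, value):
        self.set_stats[key] += 1
        if not self.dont_store:  # simulate a cache that never retains data
            self._data[key] = value

    def get(self, key, default=None):
        self.get_stats[key] += 1
        return self._data.get(key, default)


cache = TinyMockCache()
cache.set("k", "v")
assert cache.get("k") == "v"

missing = TinyMockCache(dont_store=True)
missing.set("k", "v")
assert missing.get("k") is None  # dont_store forces a miss
```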
### Basic Usage

```python
from planai.testing import MockCache

cache = MockCache()
cache.set("key1", "value1")
assert cache.get("key1") == "value1"
assert cache.get("missing") is None
```

### Disabling Storage
Use `dont_store=True` to simulate a cache that always misses, which is useful for forcing workers to recompute every time:
```python
cache = MockCache(dont_store=True)
cache.set("key1", "value1")
assert cache.get("key1") is None  # always returns None
```

### Access Statistics
Track how often keys are read and written:
```python
cache = MockCache()
cache.set("key1", "value1")
cache.get("key1")
cache.get("key1")

assert cache.set_stats["key1"] == 1
assert cache.get_stats["key1"] == 2

cache.clear_stats()  # reset counters
```

## Graph-Level Testing
For integration tests that run an entire graph, combine `MockLLM`, `MockCache`, and `inject_mock_cache`.
### inject_mock_cache
Replaces the cache on every `CachedTaskWorker` in a graph, including workers inside subgraphs:
```python
from planai import Graph
from planai.testing import MockCache, MockLLM, MockLLMResponse, inject_mock_cache

mock_cache = MockCache(dont_store=True)
mock_llm = MockLLM(responses=[
    MockLLMResponse(pattern="Create a detailed plan.*", response=PlanDraft(plan="...")),
    MockLLMResponse(pattern=".*Score each criterion.*", response=PlanCritique(overall_score=0.8)),
    MockLLMResponse(pattern="Create a refined.*plan.*", response=FinalPlan(plan="...", rationale="...")),
])

graph = Graph(name="TestGraph")
planning = create_planning_worker(llm=mock_llm, name="TestPlanning")
graph.add_workers(planning)
graph.set_sink(planning, FinalPlan)

inject_mock_cache(graph, mock_cache)

graph.run(
    initial_tasks=[(planning, PlanRequest(request="Create a test plan"))],
    run_dashboard=False,
    display_terminal=False,
)

output_tasks = graph.get_output_tasks()
assert len(output_tasks) == 1
assert isinstance(output_tasks[0], FinalPlan)
```

## Helper Functions
### add_input_provenance
Manually inject provenance into a test task when a worker depends on inspecting the provenance chain:
```python
from planai.testing import add_input_provenance

parent = InputTask(data="original")
child = OutputTask(result="derived")

add_input_provenance(child, parent)
# child._input_provenance now contains [parent]
```

### unregister_output_type
Remove an output type from a worker's consumers. This is useful for intercepting intermediate tasks in graph tests:
```python
from planai.testing import unregister_output_type

# Capture PlanDraft tasks instead of letting them flow downstream
planner = graph.get_worker_by_output_type(PlanDraft)
unregister_output_type(planner, PlanDraft)
graph.set_sink(planner, PlanDraft)

graph.run(initial_tasks=initial_work, run_dashboard=False, display_terminal=False)
drafts = [t for t in graph.get_output_tasks() if isinstance(t, PlanDraft)]
assert len(drafts) == 3
```

## Tips

- Always pass `run_dashboard=False, display_terminal=False` when running graphs in tests to avoid starting the monitoring UI.
- `InvokeTaskWorker` validates input task types; passing the wrong type raises `TypeError`.
- Calling `invoke()` on a `JoinedTaskWorker` (or `invoke_joined()` on a regular worker) raises `TypeError`.
- `InvokeTaskWorker` automatically injects a `MockCache` if the worker has a `_cache` attribute.
- `MockLLMResponse.pattern` is matched as a regex against the full prompt text.
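The last point matters when writing patterns: anchoring decides whether a prompt must end with the phrase or merely contain it. A quick illustration with plain `re`, independent of PlanAI:

```python
import re

prompt = "Hello, how are you?"

# Anchored: only matches when the prompt ends with the phrase.
assert re.search(r"Hello, how are you\?$", prompt)
assert re.search(r"Hello, how are you\?$", prompt + " Thanks!") is None

# Unanchored: matches the phrase anywhere in the prompt.
assert re.search(r"Hello, how are you", prompt + " Thanks!")
```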