POML Toolkit
Overview
The POML Toolkit is a PHP-based orchestration engine designed to manage complex prompt workflows using a custom XML-based language called POML (Prompt Orchestration Markup Language). Its primary purpose is to enable developers to define, parse, and execute multi-step AI prompt chains that integrate with language models, custom tools, conditional logic, and state management. This toolkit is particularly useful for building AI-driven applications where prompts need to be sequenced, variables extracted from responses, and decisions made based on runtime conditions.
Key Features
- Prompt Execution: Send prompts to AI models via connectors (e.g., dummy for testing or HTTP for real APIs like OpenAI).
- Tool Integration: Call PHP functions as tools within the workflow, with parameter resolution from state.
- Conditional Logic: Support for if/else steps with simple condition evaluation (e.g., `{{var}} == 'value'`).
- Variable Extraction: Automatically extract values from JSON responses into state variables.
- State Management: Persistent state across steps for variable sharing and result tracking.
- Error Handling: Custom exceptions for engine and parser errors.
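As a rough illustration of the variable extraction feature, the sketch below decodes a JSON response and reads one top-level key into state. The helper name `extractTopLevel` and the plain array standing in for state are assumptions made for this example, not the toolkit's actual API.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (not part of the toolkit): returns the value of one
// top-level key from a JSON object, or null if the key is absent.
function extractTopLevel(string $json, string $key): mixed
{
    $data = json_decode($json, true);

    // Fail loudly on malformed JSON instead of silently storing null.
    if (json_last_error() !== JSON_ERROR_NONE) {
        throw new RuntimeException('Invalid JSON response: ' . json_last_error_msg());
    }

    if (!is_array($data) || !array_key_exists($key, $data)) {
        return null;
    }

    return $data[$key];
}

// A plain array stands in for the toolkit's state here.
$state = [];
$state['user_status'] = extractTopLevel('{"name": "John Doe", "status": "active"}', 'status');

echo $state['user_status'], "\n"; // prints "active"
```

Only top-level keys are looked up, which matches the simple `path="status"` form used in the example workflow.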
Architecture
The toolkit follows a modular design with the following core components:
- Steps (Implement StepInterface): Abstract interface for executable steps.
  - `PromptStep`: Executes a prompt via a connector and extracts variables.
  - `ToolStep`: Invokes a registered PHP callable with resolved parameters.
  - `ConditionalStep`: Evaluates a condition and executes then/else branches.
- Orchestration: Container for a sequence of steps parsed from POML XML.
- Connectors (Implement ModelConnectorInterface): Interfaces for AI model interactions.
  - `DummyConnector`: For testing, returns predefined responses.
  - `HttpConnector`: For real HTTP-based API calls (e.g., to OpenAI).
- PomlParser: Parses POML XML using SimpleXML, validating structure and building step objects.
- OrchestrationEngine: Central executor that registers connectors/tools, runs the orchestration, and manages state via `StateManager`.
- StateManager: Handles variable resolution (e.g., `{{var}}`), storage, and retrieval.
- Exceptions: `EngineException` for runtime errors, `ParserException` for XML issues.
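To make the `{{var}}` resolution concrete, here is a minimal sketch of a state container that stores variables and substitutes placeholders by plain string replacement. `SketchStateManager` and its method names are hypothetical stand-ins; the real `StateManager` may expose a different interface.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the toolkit's StateManager; names are assumptions.
final class SketchStateManager
{
    /** @var array<string, mixed> */
    private array $vars = [];

    public function set(string $name, mixed $value): void
    {
        $this->vars[$name] = $value;
    }

    public function get(string $name, mixed $default = null): mixed
    {
        return $this->vars[$name] ?? $default;
    }

    // Replaces every {{name}} placeholder whose variable holds a scalar value.
    // Unknown or non-scalar variables are left untouched in the template.
    public function resolve(string $template): string
    {
        $map = [];
        foreach ($this->vars as $name => $value) {
            if (is_scalar($value)) {
                $map['{{' . $name . '}}'] = (string) $value;
            }
        }

        return $map === [] ? $template : strtr($template, $map);
    }
}

$state = new SketchStateManager();
$state->set('user_status', 'active');

echo $state->resolve("{{user_status}} == 'active'"), "\n"; // prints "active == 'active'"
```

Because substitution is textual, the resolved values are never evaluated as code; they only become part of the resulting string.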
Dependencies
From composer.json:
- PHP ^8.0 (strict types enforced).
- Autoloading via PSR-4 for Poml\Toolkit\ namespace mapping to src/.
No external libraries are required beyond PHP's built-in extensions (SimpleXML for parsing and, if the HTTP connector is used, cURL).
Architecture Diagram
graph TD
A[Poml XML Input] --> B[PomlParser]
B --> C[Orchestration Object with Steps]
C --> D[OrchestrationEngine]
D --> E[Register Connectors and Tools]
E --> F[Run Orchestration]
F --> G[StateManager for Variables and Results]
G --> H[Execute Steps: Prompt, Tool, Conditional]
H --> I[Connectors: Dummy or HTTP]
I --> J[Tools: PHP Callables]
J --> G
H --> K[Variable Extraction from JSON]
K --> G
L[Conditional Evaluation] --> M[Then/Else Branches]
M --> H
Implemented Changes and Improvements
This version includes several enhancements based on best practices for robustness and maintainability:
- Autoloader Improvements: Switched to PSR-4 compliant autoloading for better namespace organization and Composer integration.
- Error Handling Refactor: Introduced specific exceptions (`EngineException`, `ParserException`) instead of generic ones. For example, a missing connector or tool throws `EngineException`; invalid XML throws `ParserException` with detailed libxml errors.
- Step Refactoring: `StepInterface` now enforces a clean `execute(StateManager $state): mixed` contract. Subclasses like `PromptStep` handle JSON decoding safely with `json_last_error()` checks. `ConditionalStep` uses regex for simple condition parsing (e.g., equality checks) with a fallback to truthy evaluation.
- Optimizations: Parsing uses SimpleXML with internal error suppression and clear/reset of the libxml error buffer. Engine execution avoids recursion depth issues by processing steps linearly; conditional branches are executed sequentially without loops. Variable resolution in `StateManager` is inferred (its source is not shown here) to use string replacement for efficiency.
These changes improve reliability, reduce boilerplate, and enhance debuggability.
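The regex-plus-fallback approach to conditions might look roughly like the following. `evaluateCondition` is a hypothetical helper written for this README: it assumes `{{var}}` placeholders have already been resolved and handles only `==` against a quoted literal, so the real `ConditionalStep` may behave differently.

```php
<?php
declare(strict_types=1);

// Hypothetical helper; assumes placeholders were resolved beforehand.
function evaluateCondition(string $resolved): bool
{
    // Matches the simple form: <left> == 'value' (single or double quotes).
    $pattern = '/^\s*(.*?)\s*==\s*([\'"])(.*)\2\s*$/';

    if (preg_match($pattern, $resolved, $m) === 1) {
        // Plain string comparison between the left side and the quoted literal.
        return $m[1] === $m[3];
    }

    // Fallback: PHP truthiness of the trimmed string ("" and "0" are false).
    return (bool) trim($resolved);
}

var_dump(evaluateCondition("active == 'active'"));   // bool(true)
var_dump(evaluateCondition("inactive == 'active'")); // bool(false)
var_dump(evaluateCondition('yes'));                  // bool(true)
var_dump(evaluateCondition(''));                     // bool(false)
```

Keeping the grammar this small avoids a full expression parser, at the cost of rejecting anything beyond a single equality check.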
Installation Instructions
- Prerequisites: Ensure PHP 8.0+ is installed with the SimpleXML extension enabled.
- Clone or Download: Place the project in your workspace (e.g., `c:/Users/kuasa/Documents/VScode/poml`).
- Install Dependencies: Run `composer install`. This sets up the autoloader; no additional packages are needed.
- Configuration:
  - For testing: No setup required (uses DummyConnector).
  - For real AI models: Set environment variables, e.g., `export OPENAI_API_KEY=your_key_here` (or use `putenv()` in PHP).
  - Edit connectors in code if needed (e.g., register `HttpConnector` with API URL and key).
- Execution:
  - Run the example: `php example.php`
  - This parses a sample POML XML, registers a tool and connector, and executes the workflow, printing outputs.
If issues arise (e.g., missing extensions), check PHP error logs.
Usage Examples
The example.php demonstrates a full workflow: generating JSON, extracting a variable, and conditionally calling a tool.
<?php
declare(strict_types=1);
require 'vendor/autoload.php';
use Poml\Toolkit\OrchestrationEngine;
use Poml\Toolkit\Parsing\PomlParser;
use Poml\Toolkit\Connectors\DummyConnector;
use Poml\Toolkit\Connectors\HttpConnector;
echo "--- POML Toolkit Advanced Example ---\n\n";
// 1. Define a tool (a simple PHP function)
function send_notification(string $message): string {
    $output = "NOTIFICATION SENT: {$message}\n";
    echo $output;
    return "Sent successfully";
}
// 2. Define the complex POML workflow
$pomlXml = <<<'XML'
<?xml version="1.0"?>
<poml>
    <prompt model="json-generator">
        <message>Generate a JSON object with a user 'name' and a 'status' of 'active'.</message>
        <!-- Extract the status field from the JSON response into a variable called 'user_status' -->
        <variable name="user_status" from="json" path="status"/>
    </prompt>
    <!-- Check the variable we just extracted -->
    <if condition="{{user_status}} == 'active'">
        <tool function="notify">
            <param name="message">User is active, proceeding to next step.</param>
        </tool>
        <else>
            <tool function="notify">
                <param name="message">User is not active, aborting.</param>
            </tool>
        </else>
    </if>
</poml>
XML;
// 3. Set up the Orchestration Engine
$engine = new OrchestrationEngine();
$parser = new PomlParser();
// 4. Register tools and connectors
$engine->registerTool('notify', 'send_notification');
// Use a DummyConnector that returns a valid JSON for this example.
class JsonDummyConnector extends DummyConnector {
    public function execute(string $prompt): string {
        return '{"name": "John Doe", "status": "active"}';
    }
}
$engine->registerConnector('json-generator', new JsonDummyConnector());
/*
To use a real HTTP endpoint, you would do this instead:
$apiKey = getenv('OPENAI_API_KEY'); // It's best practice to use environment variables
$apiUrl = 'https://api.openai.com/v1/chat/completions';
$engine->registerConnector(
'gpt-4',
new HttpConnector($apiUrl, 'gpt-4', $apiKey)
);
*/
// 5. Parse and run the orchestration
try {
    echo "Parsing POML...\n";
    $orchestration = $parser->parse($pomlXml);
    echo "Running orchestration...\n\n";
    $finalResult = $engine->run($orchestration);
    echo "\nOrchestration finished.\n";
    echo "Final Result (from 'last_result'): " . print_r($finalResult, true) . "\n";
} catch (\Exception $e) {
    echo "\nAn error occurred: " . $e->getMessage() . "\n";
    echo "Trace: \n" . $e->getTraceAsString() . "\n";
}
?>
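The expected output shows the parsing and run progress messages, the `NOTIFICATION SENT: User is active, proceeding to next step.` line printed by the tool (since the dummy connector returns a status of `active`), and the final result.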
Best Practices
- Environment Variables: Always use `getenv()` for sensitive data like API keys to avoid hardcoding.
- Input Validation: In custom tools, validate parameters to prevent injection attacks (e.g., sanitize strings).
- Error Handling: Wrap `run()` in try-catch to handle `EngineException` or `ParserException` gracefully.
- Testing: Use `DummyConnector` for unit tests; mock responses for integration.
- Scalability: For complex workflows, break into multiple orchestrations; monitor state size to avoid memory issues.
- POML Design: Keep XML concise; use meaningful variable names; test conditions thoroughly.
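For the environment-variable practice, one possible pattern is to fail early when a required key is missing rather than passing an empty value to a connector. The helper `requireEnv` and the variable name `POML_SKETCH_KEY` are made up for this sketch.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns the value of an environment variable or
// throws if it is unset or blank.
function requireEnv(string $name): string
{
    $value = getenv($name);

    if ($value === false || trim($value) === '') {
        throw new RuntimeException("Environment variable {$name} is not set.");
    }

    return $value;
}

// Stand-in for a variable that would normally be exported in the shell.
putenv('POML_SKETCH_KEY=demo-key');

$apiKey = requireEnv('POML_SKETCH_KEY');
echo "API key loaded\n";
```

The returned value could then be handed to a connector constructor, as in the commented-out `HttpConnector` registration in the usage example.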
Security and Vulnerability Notes
- Resolved Vulnerabilities:
  - Connector Errors: Missing connectors, previously unhandled, now throw `EngineException`, preventing silent failures.
  - Parsing Issues: XML parsing uses `libxml_use_internal_errors(true)` to suppress warnings and throws `ParserException` on invalid structure, rejecting malformed input at parse time.
  - Tool Invocation: Parameters are resolved but not executed as code; use strict types in tools to avoid type errors.
- Corrected Inefficiencies:
  - Parsing Optimization: SimpleXML is lightweight; no recursive parsing depth issues. Variable extraction is top-level only for speed.
  - Condition Evaluation: Regex-based for simple ops, avoiding full expression parsers; no infinite loops as branches are linear.
- Security Considerations:
  - Input Validation: POML XML should be trusted or sanitized externally; `resolve()` in StateManager replaces `{{var}}` but doesn't execute code. Extend it with filters if needed.
  - API Keys: Never commit keys; use `.env` files (add to `.gitignore`).
  - JSON Handling: `json_decode` with error checks prevents malformed response crashes.
  - Recommendations: Run in isolated environments for untrusted POML; audit tools for side effects; use HTTPS for `HttpConnector`.
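The parsing behavior described above (suppress libxml warnings, collect them, and raise a dedicated exception) can be sketched as follows. `SketchParserException` and `parseXmlStrict` are hypothetical names used only here; the toolkit's `PomlParser` and `ParserException` may be structured differently.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the toolkit's ParserException.
final class SketchParserException extends RuntimeException
{
}

// Parses an XML string and converts libxml errors into one exception.
function parseXmlStrict(string $xml): SimpleXMLElement
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    try {
        $root = simplexml_load_string($xml);

        if ($root === false) {
            $messages = array_map(
                static fn (LibXMLError $e): string => sprintf('line %d: %s', $e->line, trim($e->message)),
                libxml_get_errors()
            );
            throw new SketchParserException('Invalid XML: ' . implode('; ', $messages));
        }

        return $root;
    } finally {
        // Clear the error buffer and restore the caller's libxml setting.
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
    }
}

$root = parseXmlStrict('<poml><prompt model="json-generator"/></poml>');
echo $root->getName(), "\n"; // prints "poml"
```

Restoring the previous libxml setting in `finally` keeps the helper from changing error behavior for other code in the same process.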
For contributions or issues, please refer to the code structure in src/.