evaluate() is the only function on the SDK’s hot path. Every governed tool call goes through it. This page traces what it does, in order.
Step by step
DLP scan (optional)
If DLP is enabled, the detector scans metadata.input (and args/kwargs) for patterns. The detection is summarized into dlp_detected, dlp_severity, and dlp_types and added to the request. Cost: ~50µs for regex; ~1ms for Presidio.
Build the EvaluationRequest
The SDK assembles { tool_name, agent_id, ...metadata, ...dlp_fields }. agent_id is always the JWT-bound id; passing a different value is silently ignored.
Evaluate against the bundle
The in-process evaluator walks each policy in the bundle. For each policy, it walks the rules top to bottom. The first matching rule wins; its effect becomes the result. If no rule matches, the policy’s defaultEffect applies. If multiple policies match, deny wins (any deny short-circuits the bundle).
Build the AuditEvent
The SDK packs { ts, agentId, sessionId, toolName, decision, policyId, policyVersion, latencyMs, metadata, traceId? }. Deny code and reason go into metadata.denyCode / metadata.denyReason. DLP detection goes into metadata.dlp.
Upload trace (if attached)
If a TraceContext was passed, the SDK appends a ToolCallMessage and uploads the running messages array to Rubric. The returned traceId and position are attached to the audit event.
Enqueue audit event
The event goes into the audit sink’s in-memory queue. The flush thread ships events to Rubric in batches, flushing once a batch holds 100 events or 1 second has passed. Your code does not wait on the network.
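The evaluation step is the part most worth internalizing, so here is a minimal sketch of its semantics: first matching rule wins inside a policy, and any deny wins across the bundle. The Rule and Policy classes, the matches predicate, and the tuple return value are hypothetical stand-ins, not the SDK's real types. The sketch also assumes every policy in the bundle applies to the request.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Rule:
    id: str
    effect: str                      # "allow" or "deny"
    matches: Callable[[dict], bool]  # hypothetical predicate over the request


@dataclass
class Policy:
    id: str
    default_effect: str              # stands in for the policy's defaultEffect
    rules: List[Rule] = field(default_factory=list)


def evaluate_policy(policy: Policy, request: dict) -> Tuple[str, Optional[str]]:
    """Walk rules top to bottom; the first match decides the effect."""
    for rule in policy.rules:
        if rule.matches(request):
            return rule.effect, rule.id
    return policy.default_effect, None  # no rule matched


def evaluate_bundle(bundle: List[Policy], request: dict):
    """Return (decision, policy_id, rule_id); any deny short-circuits."""
    first_allow = None
    for policy in bundle:
        effect, rule_id = evaluate_policy(policy, request)
        if effect == "deny":
            return "deny", policy.id, rule_id
        if first_allow is None:
            first_allow = ("allow", policy.id, rule_id)
    # An empty bundle falls through to default-allow with no policy id.
    return first_allow or ("allow", None, None)


bundle = [
    Policy("p-read", "allow",
           [Rule("r1", "allow", lambda r: r["tool_name"].startswith("read_"))]),
    Policy("p-no-delete", "allow",
           [Rule("block-delete", "deny", lambda r: r["tool_name"] == "delete_file")]),
]

denied = evaluate_bundle(bundle, {"tool_name": "delete_file"})
allowed = evaluate_bundle(bundle, {"tool_name": "read_file"})
```

Here denied is attributed to p-no-delete even though p-read allowed the call first, which is the "deny wins" behavior in miniature.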
Latency
The synchronous portion of evaluate() is dominated by:
- Pure-Python evaluator: ~50–200µs depending on bundle size.
- Native (Rust) evaluator (pip install rubric-app[runtime]): ~5–20µs.
- DLP scan: ~50µs (regex), ~1ms (Presidio).
- Trace upload (if attached): 2–5ms; runs synchronously to give you the traceId for the audit event. If you don’t need traces, omit the parameter.
Without traces, evaluate() finishes in well under 1ms on a modern laptop. With traces, it’s bounded by your network RTT to Rubric.
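To check these numbers on your own hardware, you can time the call directly. The sketch below is a generic harness built on time.perf_counter_ns. The fake_evaluate function is a placeholder, not the SDK call; swap in your real evaluate() invocation with whatever arguments your integration uses.

```python
import statistics
import time


def time_call(fn, *args, repeat=1000, **kwargs):
    """Return per-call latencies in microseconds over `repeat` invocations."""
    samples = []
    for _ in range(repeat):
        start = time.perf_counter_ns()
        fn(*args, **kwargs)
        samples.append((time.perf_counter_ns() - start) / 1_000)
    return samples


def summarize(samples):
    """Report the median and an approximate 99th percentile."""
    ordered = sorted(samples)
    p99_index = min(len(ordered) - 1, int(len(ordered) * 0.99))
    return {
        "p50_us": statistics.median(ordered),
        "p99_us": ordered[p99_index],
    }


def fake_evaluate(tool_name):
    # Placeholder for the real evaluate() call.
    return {"decision": "allow", "tool": tool_name}


stats = summarize(time_call(fake_evaluate, "read_file"))
print(stats)
```

Measure with and without a TraceContext attached if you use traces, since the upload is the one component that depends on the network.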
What goes in the audit event
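As an illustration, the fields named in the "Build the AuditEvent" step can be laid out as a plain dictionary. This is a sketch, not the SDK's serializer: the build_audit_event helper, the timestamp format, the inner layout of metadata.dlp, and every example value are assumptions.

```python
import time


def build_audit_event(agent_id, session_id, tool_name, decision,
                      policy_id, policy_version, latency_ms,
                      metadata=None, trace_id=None):
    """Assemble an audit event using the field names listed on this page."""
    event = {
        "ts": time.time(),          # timestamp format is an assumption
        "agentId": agent_id,
        "sessionId": session_id,
        "toolName": tool_name,
        "decision": decision,
        "policyId": policy_id,
        "policyVersion": policy_version,
        "latencyMs": latency_ms,
        "metadata": dict(metadata or {}),
    }
    if trace_id is not None:        # traceId is optional
        event["traceId"] = trace_id
    return event


# Hypothetical deny with a DLP detection attached.
event = build_audit_event(
    agent_id="agent-123",
    session_id="sess-456",
    tool_name="delete_file",
    decision="deny",
    policy_id="p-no-delete",
    policy_version=3,
    latency_ms=0.12,
    metadata={
        "denyCode": "block-delete",
        "denyReason": "Deleting files is not permitted",
        # Inner keys of metadata.dlp are assumed for illustration.
        "dlp": {"detected": True, "severity": "high", "types": ["email"]},
    },
)
```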
When evaluation fails
The SDK is conservative: if anything goes wrong inside evaluate(), you get an EvaluationResult with the default-allow decision rather than an exception. Specifically:
- No bundle loaded yet (cold-start race): default-allow, result.matchedPolicyId = None.
- Bundle parse error: same; default-allow, error logged.
- DLP detector raises: the detector is treated as no-detection, error logged.
- Trace upload fails: the audit event is still enqueued without the traceId, error logged.
If you want a deny-by-default posture, set spec.defaultEffect: deny in your policy and let the evaluator return that explicitly.
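The "result instead of exception" behavior can be pictured as a thin wrapper around the evaluator. This is only a sketch of the pattern: the evaluate_fail_open name, the evaluator argument, and the decision field are hypothetical; matchedPolicyId is the only name taken from this page.

```python
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger("evaluate_sketch")


@dataclass
class EvaluationResult:
    decision: str                          # field name is an assumption
    matchedPolicyId: Optional[str] = None


def evaluate_fail_open(evaluator, bundle, request):
    """Return a default-allow result instead of raising."""
    if bundle is None:
        # Cold-start race: nothing loaded yet.
        return EvaluationResult("allow")
    try:
        return evaluator(bundle, request)
    except Exception:
        # Log the error and fall back to default-allow.
        logger.exception("evaluation failed; returning default-allow")
        return EvaluationResult("allow")


def broken_evaluator(bundle, request):
    raise ValueError("bundle parse error")


cold_start = evaluate_fail_open(broken_evaluator, None, {"tool_name": "x"})
parse_error = evaluate_fail_open(broken_evaluator, {"policies": []}, {"tool_name": "x"})
```

Both calls come back as an allow with matchedPolicyId set to None, so a caller cannot tell them apart from the result alone; the log is where the difference shows up.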
Result shape
reason and code are populated for denies (from the matched rule’s id and description). For allows they’re None.
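A rough picture of that shape, as a sketch: matchedPolicyId, reason, and code are the names used on this page, while the decision field, the from_rule helper, and the mapping of code to the rule's id and reason to its description are assumptions about how the sentence above should be read.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EvaluationResult:
    decision: str                          # "allow" or "deny"; name assumed
    matchedPolicyId: Optional[str] = None
    code: Optional[str] = None             # assumed: the matched rule's id
    reason: Optional[str] = None           # assumed: the matched rule's description


def from_rule(policy_id, rule_id, rule_description, effect):
    """Populate code and reason only for denies; allows leave them as None."""
    if effect == "deny":
        return EvaluationResult("deny", policy_id,
                                code=rule_id, reason=rule_description)
    return EvaluationResult("allow", policy_id)


deny = from_rule("p-no-delete", "block-delete",
                 "Deleting files is not permitted", "deny")
allow = from_rule("p-read", "r1", "Reads are fine", "allow")
```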