INFORMATIVE | DRAFT | Documentation Governance

Evaluation Dimensions

1. Purpose

This document defines the axes (dimensions) used to evaluate MPLP conformance.

Each dimension answers a specific question about the evidence. Together, they provide a complete picture of conformance.

2. Evaluation Axes

2.1 Schema Validity

Question: Do all objects pass JSON Schema validation?

| Requirement | Evidence | Pass Criteria |
|---|---|---|
| Context valid | mplp-context.schema.json | No validation errors |
| Plan valid | mplp-plan.schema.json | No validation errors |
| Trace valid | mplp-trace.schema.json | No validation errors |
| Confirm valid | mplp-confirm.schema.json | No validation errors |

Evaluation Method: Run mplp validate against all exported artifacts.

2.2 Lifecycle Completeness

Question: Is the Context → Plan → Trace chain complete?

| Requirement | Evidence | Pass Criteria |
|---|---|---|
| Context exists | Context object | context_id present |
| Plan linked | Plan.context_id | References valid Context |
| Trace linked | Trace.context_id, Trace.plan_id | References valid Context and Plan |
| Steps traced | Trace.segments[] | Every executed step has a segment |

Evaluation Method: Traverse the evidence chain and verify that all links resolve.
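The traversal above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the evidence objects are already loaded as dictionaries, that the Plan carries its own identifier under a hypothetical `plan_id` key, and that each segment names the step it covers under a hypothetical `step_id` key. The list of executed steps is supplied by the caller.

```python
def check_lifecycle(context, plan, trace, executed_step_ids):
    """Return a list of broken links; an empty list means the dimension passes."""
    problems = []

    context_id = context.get("context_id")
    if not context_id:
        problems.append("Context has no context_id")

    if plan.get("context_id") != context_id:
        problems.append("Plan.context_id does not reference the Context")
    if trace.get("context_id") != context_id:
        problems.append("Trace.context_id does not reference the Context")

    # Assumption: the Plan exposes its identifier as "plan_id".
    if trace.get("plan_id") != plan.get("plan_id"):
        problems.append("Trace.plan_id does not reference the Plan")

    # Assumption: each segment records the step it covers as "step_id".
    traced = {seg.get("step_id") for seg in trace.get("segments", [])}
    for step_id in executed_step_ids:
        if step_id not in traced:
            problems.append(f"Executed step {step_id} has no segment")

    return problems
```

Returning a list of problems rather than a bare boolean keeps the pass/fail result (`not problems`) while still telling the evaluator which link failed to resolve.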

2.3 Governance Gating

Question: Are high-risk actions gated by Confirm?

| Requirement | Evidence | Pass Criteria |
|---|---|---|
| Gated steps identified | Plan.steps[].requires_confirm | Steps marked when needed |
| Confirm objects exist | Confirm objects | One per gated step |
| Decisions recorded | Confirm.decisions[] | Status is approved or rejected |
| Execution blocked | Trace segments | Gated steps not executed until approved |

Evaluation Method: Cross-reference Plan steps with Confirm objects.
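A minimal sketch of this cross-reference follows. It assumes that steps, Confirm objects, and segments all share a hypothetical `step_id` key, that each decision carries a `status` field, and that the last entry in `decisions` is the authoritative one. It also simplifies "not executed until approved" to "a gated step with a non-skipped segment must end in an approved decision"; comparing decision and segment timestamps would be a stricter check.

```python
def check_gating(plan, confirms, trace):
    """Return a list of gating violations; an empty list means the dimension passes."""
    problems = []

    # Assumption: Confirm objects point at their step through "step_id".
    confirms_by_step = {}
    for confirm in confirms:
        confirms_by_step.setdefault(confirm.get("step_id"), []).append(confirm)

    # Any segment that was not skipped counts as an execution attempt.
    executed = {
        seg.get("step_id")
        for seg in trace.get("segments", [])
        if seg.get("status") != "skipped"
    }

    for step in plan.get("steps", []):
        if not step.get("requires_confirm"):
            continue
        step_id = step.get("step_id")

        matches = confirms_by_step.get(step_id, [])
        if len(matches) != 1:
            problems.append(f"{step_id}: expected one Confirm, found {len(matches)}")
            continue

        statuses = [d.get("status") for d in matches[0].get("decisions", [])]
        if not statuses or any(s not in ("approved", "rejected") for s in statuses):
            problems.append(f"{step_id}: decision missing or not approved/rejected")

        # Assumption: the last recorded decision is the authoritative one.
        if step_id in executed and (not statuses or statuses[-1] != "approved"):
            problems.append(f"{step_id}: executed without approval")

    return problems
```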

2.4 Trace Integrity

Question: Can execution be reconstructed from Trace?

| Requirement | Evidence | Pass Criteria |
|---|---|---|
| Timestamps present | segment.started_at, segment.finished_at | ISO-8601 format |
| Order determinable | Timestamps | No logical conflicts |
| Parent-child valid | segment.parent_span_id | References valid parent |
| Status recorded | segment.status | One of: completed, failed, skipped |

Evaluation Method: Reconstruct the timeline from the Trace and verify its logical consistency.
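The per-segment checks in the table can be sketched as below. The `span_id` key is an assumption (the table names only `parent_span_id`), and "no logical conflicts" is reduced here to "a segment does not finish before it starts". Note that `datetime.fromisoformat` accepts only a subset of ISO-8601 on Python versions before 3.11, so a stricter parser may be needed in practice.

```python
from datetime import datetime

VALID_STATUSES = {"completed", "failed", "skipped"}


def parse_iso(value):
    # A trailing "Z" is rejected before Python 3.11, so normalize it to an offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def check_trace_integrity(trace):
    """Return a list of integrity problems; an empty list means the dimension passes."""
    problems = []
    segments = trace.get("segments", [])

    # Assumption: each segment identifies itself under "span_id".
    span_ids = {seg.get("span_id") for seg in segments}

    for seg in segments:
        span_id = seg.get("span_id")

        try:
            started = parse_iso(seg["started_at"])
            finished = parse_iso(seg["finished_at"])
            if finished < started:
                problems.append(f"{span_id}: finished_at precedes started_at")
        except (KeyError, AttributeError, TypeError, ValueError):
            problems.append(f"{span_id}: missing or malformed timestamp")

        parent = seg.get("parent_span_id")
        if parent is not None and parent not in span_ids:
            problems.append(f"{span_id}: parent_span_id {parent} does not resolve")

        if seg.get("status") not in VALID_STATUSES:
            problems.append(f"{span_id}: invalid status {seg.get('status')!r}")

    return problems
```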

2.5 Failure Bounding

Question: Do failures produce recoverable states?

| Requirement | Evidence | Pass Criteria |
|---|---|---|
| Failures recorded | Trace.segments[].status = 'failed' | Explicit failure status |
| Recovery attempted | Recovery events or segments | Retry, skip, or rollback recorded |
| Terminal state clear | Plan.status or Trace.status | Final status unambiguous |
| No orphaned state | All objects | Terminal states are final |

Evaluation Method: Identify failed segments and verify that each is followed by recovery or clean termination.
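One way to express "recovery or clean termination" is sketched below. How recovery is recorded is not specified here, so the sketch assumes a hypothetical `recovery` field on the failed segment holding one of the three actions named in the table. The set of terminal statuses is likewise a placeholder and should be replaced with whatever the schemas actually define.

```python
RECOVERY_ACTIONS = {"retry", "skip", "rollback"}

# Hypothetical terminal values for Trace.status; not taken from the schemas.
TERMINAL_STATUSES = {"completed", "failed"}


def check_failure_bounding(trace):
    """Return a list of unbounded failures; an empty list means the dimension passes."""
    problems = []

    # Assumption: a failed segment records its recovery action under "recovery".
    unrecovered = [
        seg for seg in trace.get("segments", [])
        if seg.get("status") == "failed"
        and seg.get("recovery") not in RECOVERY_ACTIONS
    ]

    status = trace.get("status")

    # An unrecovered failure is acceptable only if the Trace itself ends as failed.
    if unrecovered and status != "failed":
        problems.append(
            f"{len(unrecovered)} failed segment(s) with neither recovery nor failed termination"
        )

    if status not in TERMINAL_STATUSES:
        problems.append(f"Trace.status {status!r} is not a terminal state")

    return problems
```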

2.6 Version Declaration

Question: Is the protocol version correctly declared?

| Requirement | Evidence | Pass Criteria |
|---|---|---|
| Protocol version | meta.protocolVersion | Present in all objects |
| Schema version | meta.schemaVersion | Present in all objects |
| Version match | All objects | Consistent across evidence pack |

Evaluation Method: Extract the version fields from all objects and verify that they are consistent.
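A minimal sketch of this check, assuming the evidence pack has been loaded as a list of dictionaries that each carry a `meta` object with the two fields named in the table:

```python
def check_versions(objects):
    """Return a list of version problems; an empty list means the dimension passes."""
    problems = []
    seen = set()

    for index, obj in enumerate(objects):
        meta = obj.get("meta", {})
        protocol = meta.get("protocolVersion")
        schema = meta.get("schemaVersion")

        if not protocol:
            problems.append(f"object {index}: meta.protocolVersion missing")
        if not schema:
            problems.append(f"object {index}: meta.schemaVersion missing")
        if protocol and schema:
            seen.add((protocol, schema))

    # More than one distinct pair means the evidence pack is inconsistent.
    if len(seen) > 1:
        problems.append(f"inconsistent versions: {sorted(seen)}")

    return problems
```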

3. Dimension-to-Class Mapping

Each conformance class requires passing specific dimensions:

Dimension | L1 | L2 | L3
Schema Validity
Lifecycle Completeness
Governance Gating
Trace Integrity
Failure Bounding
Version Declaration

4. Evaluation Weight

All dimensions are binary (pass/fail). There is no weighting or scoring.

A single failure in any required dimension results in a NON-CONFORMANT verdict for that class.
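The binary rule can be sketched as a simple conjunction over the dimensions a class requires. The set of required dimensions is passed in by the caller; the set used in the usage example is purely illustrative and does not reflect the actual dimension-to-class mapping. The `CONFORMANT` label is also an assumption, chosen as the counterpart of `NON-CONFORMANT`.

```python
def class_verdict(required_dimensions, results):
    """Return the verdict for one conformance class.

    required_dimensions: names of the dimensions the class requires.
    results: mapping of dimension name to True (pass) or False (fail).
    A required dimension that is absent from results counts as a failure.
    """
    passed = all(results.get(name, False) for name in required_dimensions)
    return "CONFORMANT" if passed else "NON-CONFORMANT"


# Illustrative only: this required set is hypothetical, not the real mapping.
example_required = {"Schema Validity", "Version Declaration"}
example_results = {"Schema Validity": True, "Version Declaration": False}
example_verdict = class_verdict(example_required, example_results)
```

Because there is no weighting, a failure in a dimension the class does not require has no effect on that class's verdict.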

5. Future Dimensions

The following dimensions are not evaluated in v1.0.0 but may be added in future versions:

| Future Dimension | Question | Status |
|---|---|---|
| Multi-agent coherence | Do agents coordinate correctly? | Planned for v1.1 |
| Performance bounds | Are timeouts respected? | Under consideration |
| Security boundary | Is sandboxing enforced? | Under consideration |

Scope: Defines 6 evaluation dimensions for v1.0.0
Exclusions: Scoring, weighting, future dimensions