Evaluation
How Prisma evaluates LLM output quality
Evaluation is the process of scoring LLM outputs against quality criteria. Prisma supports automated metrics, human review, and a hybrid of the two.
Evaluation Types
- Automated — Run predefined or custom metrics against outputs
- Human-in-the-Loop — Route outputs to human reviewers for manual scoring
- Hybrid — Combine automated pre-screening with human review for edge cases
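The hybrid type can be pictured as a simple routing step: an automated metric scores every output, confident results are handled automatically, and only the uncertain middle band goes to a human. The sketch below is illustrative only. `Output`, `route_hybrid`, the `low`/`high` thresholds, and the toy metric are hypothetical names and values chosen for this example, not part of Prisma's API.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


# Hypothetical record type for this sketch; not a Prisma class.
@dataclass
class Output:
    id: str
    text: str


def route_hybrid(
    outputs: List[Output],
    metric: Callable[[str], float],
    low: float = 0.3,   # assumed threshold: at or below this, reject automatically
    high: float = 0.7,  # assumed threshold: at or above this, accept automatically
) -> Tuple[List[Output], List[Output], List[Output]]:
    """Split outputs into (accepted, rejected, needs_human_review)."""
    accepted, rejected, review = [], [], []
    for out in outputs:
        score = metric(out.text)
        if score >= high:
            accepted.append(out)
        elif score <= low:
            rejected.append(out)
        else:
            # Edge case: the automated score is inconclusive, so a human decides.
            review.append(out)
    return accepted, rejected, review


# Toy stand-in for an automated metric: longer answers score higher, capped at 1.0.
def toy_metric(text: str) -> float:
    return min(len(text.split()) / 10, 1.0)
```

Under these assumptions, widening the gap between `low` and `high` sends more outputs to reviewers, while narrowing it relies more heavily on the automated score.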
Metrics
Metrics are scoring functions that assess output quality. Prisma includes built-in metrics for:
- Hallucination detection
- Relevance scoring
- Toxicity detection
- Custom policy compliance
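
To make "scoring function" concrete, here is a minimal sketch of what a custom policy compliance metric might look like. It assumes a metric is a callable that takes an output string and returns a score between 0 and 1, with higher meaning better. That convention, along with the names `Metric`, `banned_phrase_compliance`, and `evaluate`, is an assumption made for illustration and does not reflect Prisma's actual interface.

```python
from typing import Callable, Dict, List

# Assumed convention for this sketch: a metric maps an output string
# to a score in [0, 1], where higher is better.
Metric = Callable[[str], float]


def banned_phrase_compliance(banned: List[str]) -> Metric:
    """Build a hypothetical policy metric: 0.0 if any banned phrase appears, else 1.0."""
    lowered = [phrase.lower() for phrase in banned]

    def metric(output: str) -> float:
        text = output.lower()  # case-insensitive substring match
        return 0.0 if any(phrase in text for phrase in lowered) else 1.0

    return metric


def evaluate(output: str, metrics: Dict[str, Metric]) -> Dict[str, float]:
    """Run every named metric against one output and collect the scores."""
    return {name: fn(output) for name, fn in metrics.items()}


# Usage: score a single output against one custom policy metric.
metrics = {"policy": banned_phrase_compliance(["guaranteed returns"])}
scores = evaluate("Past performance varies.", metrics)
```

Keeping each metric as a plain function means built-in and custom checks could be combined in one dictionary and scored the same way.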

