SkillArena



SKILL ARENA: Project Manager Skill

COHERENCE: 0%
RUBRICS: 0%
ENTITY MATCH: 0%
CITATIONS: 0%
Performance Evaluation Design v1.0
Metric: Evaluation Target & Methodology
COHERENCE: Analyzes output coherence and narrative consistency. Uses a secondary "Judge LLM" to detect semantic drift, contradictory statements, or non-sequitur logic loops in long-form generation.
RUBRICS: An LLM-driven evaluation layer that checks output against grading rubrics written by subject-matter experts (SMEs) for PM performance. The score is penalized by the count of missed performance criteria or violated negative constraints (e.g., "Do not discuss medical advice").
CITATIONS: Calculates grounding accuracy as the count of assertions or discussions that are backed by validated citations.
ENTITY MATCH: Named Entity Recognition (NER) check. Validates the accuracy of all proper nouns, dates, and technical specifications against the input data. The score drops for incorrect or hallucinated entities (e.g., "15,000" vs "150,00").
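
The three count-based metrics above could be sketched roughly as follows. This is a minimal illustration, not the arena's actual scoring code: the function names, the normalization to a 0..1 range, and the regex used in place of a real NER model are all assumptions. COHERENCE is left out because it depends on a Judge LLM call.

```python
import re


def rubric_score(missed: int, violated: int, total: int) -> float:
    """Assumed formula: fraction of rubric items neither missed nor violated."""
    if total <= 0:
        raise ValueError("total must be positive")
    return max(0.0, 1.0 - (missed + violated) / total)


def citation_score(grounded: int, assertions: int) -> float:
    """Assumed formula: share of assertions backed by a validated citation."""
    if assertions == 0:
        return 0.0  # assumption: nothing asserted means nothing grounded
    return grounded / assertions


# Crude stand-in for NER (an assumption): capitalized words and
# number-like tokens such as "15,000" or "2024-03-01".
ENTITY_PATTERN = re.compile(r"\b(?:[A-Z][a-z]+|\d[\d,./-]*\d|\d)\b")


def entity_match_score(source: str, output: str) -> float:
    """Fraction of entities in the output that also appear in the input."""
    source_entities = set(ENTITY_PATTERN.findall(source))
    output_entities = set(ENTITY_PATTERN.findall(output))
    if not output_entities:
        return 1.0  # assumption: no entities means nothing to get wrong
    matched = output_entities & source_entities
    return len(matched) / len(output_entities)


if __name__ == "__main__":
    source = "Budget of 15,000 approved by Acme on 2024-03-01."
    output = "Acme approved 150,00 on 2024-03-01."
    # "150,00" is absent from the input, so it lowers the score.
    print(entity_match_score(source, output))
    print(rubric_score(missed=1, violated=1, total=10))
    print(citation_score(grounded=3, assertions=4))
```

A production version would likely replace the regex with an NER model and weight rubric items by severity; the sketch only shows how raw counts can be turned into a score.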

Ben Truong
ML Engineer

I build ML/GenAI apps for the cloud and private clusters.
