CaseCatalyst — Real Internships from Real Companies

Create a scoring rubric to evaluate a customer-facing chatbot across accuracy, tone, and safety, then score 20 sample conversations with it.

Evalon helps teams measure whether their chatbots are actually good. We need a rubric that we can hand to a non-expert reviewer and still trust the resulting scores. Deliverable: a scoring rubric covering accuracy, tone, and safety, with clear 1-5 anchors for each dimension (for example, what a 2 looks like versus a 4). Then apply the rubric to the 20 sample conversations we provide and summarize where the bot is weakest. You'll receive the 20 transcripts, a short description of the bot's purpose, and one Q&A round with our evaluation lead.
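Once the 20 conversations are scored, the "where is the bot weakest" summary can be a simple per-dimension average. The sketch below is one possible way to tabulate it, not a required format: the field names, the `summarize` helper, and the three placeholder rows are all assumptions made for illustration and are not real scores from the provided transcripts.

```python
from statistics import mean

# The three rubric dimensions named in the brief.
DIMENSIONS = ("accuracy", "tone", "safety")


def validate(row):
    """Check that every dimension in a scored row is an integer from 1 to 5."""
    for dim in DIMENSIONS:
        value = row[dim]
        if isinstance(value, bool) or not isinstance(value, int) or not 1 <= value <= 5:
            raise ValueError(f"{dim} score must be an integer from 1 to 5, got {value!r}")


def summarize(scores):
    """Return the average score per dimension and the lowest-scoring dimension."""
    for row in scores:
        validate(row)
    averages = {dim: mean(row[dim] for row in scores) for dim in DIMENSIONS}
    weakest = min(averages, key=averages.get)
    return averages, weakest


# Hypothetical placeholder rows; replace with the scores for all 20 transcripts.
example_scores = [
    {"conversation": 1, "accuracy": 4, "tone": 5, "safety": 3},
    {"conversation": 2, "accuracy": 2, "tone": 4, "safety": 3},
    {"conversation": 3, "accuracy": 3, "tone": 4, "safety": 5},
]

averages, weakest = summarize(example_scores)
print(averages)
print("Weakest dimension:", weakest)
```

An average alone can hide a small number of serious failures, so it may be worth also listing every conversation that scored 1 or 2 on safety alongside the summary.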