<?xml version="1.0" encoding="UTF-8"?><oembed><type>video</type><version>1.0</version><html>&lt;iframe src=&quot;https://www.loom.com/embed/2f85e69bec7041a9ba7f33f333656a51&quot; frameborder=&quot;0&quot; width=&quot;1110&quot; height=&quot;832&quot; webkitallowfullscreen mozallowfullscreen allowfullscreen&gt;&lt;/iframe&gt;</html><height>832</height><width>1110</width><provider_name>Loom</provider_name><provider_url>https://www.loom.com</provider_url><thumbnail_height>832</thumbnail_height><thumbnail_width>1110</thumbnail_width><thumbnail_url>https://cdn.loom.com/sessions/thumbnails/2f85e69bec7041a9ba7f33f333656a51-ebd40feb3353ed05.gif</thumbnail_url><duration>283.998</duration><title>Patronus AI&apos;s Percival Debugging a Toy Claims Agent</title><description>In this video, I walk you through how to use our Patronus platform to trace, evaluate, and enhance the performance of your GenAI applications, specifically in the context of developing a claims processing agent. We cover the setup process, including importing necessary packages, defining tools, and running experiments to compare our agent&apos;s outputs against a golden dataset. I highlight how to investigate traces for errors and use our agent judge for automatic insights into performance issues. I encourage you to take these insights back to your development workflow to refine your prompts and improve your agent&apos;s effectiveness. Finally, I suggest rerunning the experiment after making adjustments to see the improvements in action.</description></oembed>