<iframe src="https://www.loom.com/embed/2f85e69bec7041a9ba7f33f333656a51" frameborder="0" width="1110" height="832" webkitallowfullscreen mozallowfullscreen allowfullscreen></iframe>

**Patronus AI's Percival Debugging a Toy Claims Agent** (Loom video, 4:44)

In this video, I walk through how to use the Patronus platform to trace, evaluate, and improve the performance of your GenAI applications, in the context of building a claims-processing agent. We cover the setup: importing the necessary packages, defining tools, and running experiments that compare the agent's outputs against a golden dataset. I show how to investigate traces for errors and how to use the agent judge for automatic insights into performance issues. Take these insights back to your development workflow to refine your prompts and sharpen your agent's behavior, then rerun the experiment to see the improvements in action.