[Video: "Introducing Judges: Enhancing AI Response Quality Monitoring" (Loom, 3:20)]
https://www.loom.com/embed/0f0e4a992df44057a31fc479acdcf032

In this video, I introduced Judges, a new capability from LaunchDarkly that automatically evaluates the responses generated by AI models in our applications. Judges score outputs against metrics such as relevance, accuracy, and toxicity, letting us monitor quality over time. I explained how to attach multiple Judges to an AI Config and customize them to fit business needs, including controlling costs by adjusting sampling percentages in different environments. I emphasized the importance of tracking these metrics to detect regressions and compare variations effectively. Please consider how you can implement Judges in your workflows to improve AI response quality.