{"type":"video","version":"1.0","html":"<iframe src=\"https://www.loom.com/embed/6f604ce9c4604c35822debdf8b326ec3\" frameborder=\"0\" width=\"1416\" height=\"1062\" webkitallowfullscreen mozallowfullscreen allowfullscreen></iframe>","height":1062,"width":1416,"provider_name":"Loom","provider_url":"https://www.loom.com","thumbnail_height":1062,"thumbnail_width":1416,"thumbnail_url":"https://cdn.loom.com/sessions/thumbnails/6f604ce9c4604c35822debdf8b326ec3-532369ac46e47c06.gif","duration":315.744,"title":"Backtesting Prompts for Improved Outputs 🔍","description":"In this video, I walk you through a more complex evaluation process using a new dataset populated with production data from our prompt. We created a backtest evaluation called \"backtest chef\" to analyze how changes in our prompt affect outputs, specifically focusing on the inclusion of ingredients in a bulleted list. I demonstrate how to compare the old and new responses using a diff column to highlight the differences. I encourage you to explore these changes and consider how this method can help us bootstrap datasets effectively. Please take a moment to review the outputs and think about how we can apply this in our future work."}