{"type":"video","version":"1.0","html":"<iframe src=\"https://www.loom.com/embed/f67000afe9184892aa08a3bbcb194892\" frameborder=\"0\" width=\"1920\" height=\"1440\" webkitallowfullscreen mozallowfullscreen allowfullscreen></iframe>","height":1440,"width":1920,"provider_name":"Loom","provider_url":"https://www.loom.com","thumbnail_height":1440,"thumbnail_width":1920,"thumbnail_url":"https://cdn.loom.com/sessions/thumbnails/f67000afe9184892aa08a3bbcb194892-53cae458864e1af6.gif","duration":221.555,"title":"Benchmarking Long-Term Memory Systems for Agents","description":"Hey everyone, I’m excited to share my work on Memory Bench, a universal benchmarking tool for long-term memory systems in agents. The current landscape is fragmented with numerous benchmarks and providers, making it difficult to answer fundamental questions about memory back-ends. Our approach focuses on three universal operations: adding, retrieving, and deleting memory, while acknowledging the diverse semantics across providers. I encourage you to take a look at our research snapshot, as it highlights the semantic gaps that impact fair comparisons and why we report more than just accuracy. Cheers!"}