{"type":"video","version":"1.0","html":"<iframe src=\"https://www.loom.com/embed/0f77fc9251da4c528fa19cd9e81f5d74\" frameborder=\"0\" width=\"2446\" height=\"1834\" webkitallowfullscreen mozallowfullscreen allowfullscreen></iframe>","height":1834,"width":2446,"provider_name":"Loom","provider_url":"https://www.loom.com","thumbnail_height":1834,"thumbnail_width":2446,"thumbnail_url":"https://cdn.loom.com/sessions/thumbnails/0f77fc9251da4c528fa19cd9e81f5d74-29d55bb6ce5d9f5f.gif","duration":146.763,"title":"Cloud Build, Mantine, Go, sqlite Benchmarking and Model Evaluation","description":"I deployed our OpenAI benchmarking app on GCP using Cloud Run, and I sorted the auth issues by refreshing the OpenAI key. I selected a model like GPT 5.4 and ran a standard test, then reviewed metrics like prompt processing rate, time to first token, decode rate, and full total time. I also added more benchmarks, made model selection more robust, and switched to Cloud Code for security checks. In full evaluation mode I ran MMLU Pro with 20 questions and got 80 percent accuracy. No specific viewer action was requested."}