OpenTelemetry-based evaluation dashboard for bilingual AI agents
Customer support for e-commerce queries
Model: GPT-4-Turbo
Version: 1.2.0
Languages: English, Arabic
Date: 11/15/2024

| Test Case | Language | Score | Latency (ms) | Status |
|---|---|---|---|---|
| Product Return Request | English | | 1240 | Pass |
| Shipping Delay Complaint | English | | 1180 | Fail |
| طلب استرجاع منتج (Product Return Request) | Arabic | | 1320 | Pass |
| Product Availability Check | English | | 890 | Pass |
| استفسار عن طرق الدفع (Payment Methods Inquiry) | Arabic | | 1050 | Pass |
| Discount Code Issue | English | | 1420 | Pass |
| Order Status Inquiry | English | | 980 | Pass |
| شكوى جودة المنتج (Product Quality Complaint) | Arabic | | 1280 | Pass |

| Total Tests | Passed | Failed | Avg Score |
|---|---|---|---|
| 8 | 7 | 1 | 90.8% |
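
The summary counts can be cross-checked against the per-test rows. The following is a minimal sketch, not the dashboard's own aggregation code: the `RESULTS` list is transcribed by hand from the results table, and the helper names `summarize` and `summarize_by_language` are hypothetical. Per-test scores do not appear in the table, so the 90.8% average score is not recomputed here.

```python
from collections import defaultdict
from statistics import mean

# Rows transcribed from the results table:
# (test case, language, latency in ms, status).
# Per-test scores are missing from the table, so they are omitted.
RESULTS = [
    ("Product Return Request", "English", 1240, "Pass"),
    ("Shipping Delay Complaint", "English", 1180, "Fail"),
    ("طلب استرجاع منتج", "Arabic", 1320, "Pass"),
    ("Product Availability Check", "English", 890, "Pass"),
    ("استفسار عن طرق الدفع", "Arabic", 1050, "Pass"),
    ("Discount Code Issue", "English", 1420, "Pass"),
    ("Order Status Inquiry", "English", 980, "Pass"),
    ("شكوى جودة المنتج", "Arabic", 1280, "Pass"),
]


def summarize(rows):
    """Return counts, pass rate, and mean latency for a list of result rows."""
    total = len(rows)
    passed = sum(1 for _, _, _, status in rows if status == "Pass")
    return {
        "total": total,
        "passed": passed,
        "failed": total - passed,
        "pass_rate": passed / total if total else 0.0,
        "mean_latency_ms": mean(lat for _, _, lat, _ in rows) if rows else 0.0,
    }


def summarize_by_language(rows):
    """Group rows by the language column and summarize each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[1]].append(row)
    return {lang: summarize(group) for lang, group in groups.items()}


if __name__ == "__main__":
    print(summarize(RESULTS))
    for lang, stats in summarize_by_language(RESULTS).items():
        print(lang, stats)
```

On the rows above this reproduces the reported 8 total, 7 passed, and 1 failed, which corresponds to a pass rate of 87.5% and a mean latency of 1170 ms. The single failure is an English test case; all three Arabic test cases pass.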