LongMemEval benchmarking#80

Draft
danmichaeljones wants to merge 11 commits into main from dan/engram_longmemeval

@danmichaeljones danmichaeljones commented Jun 26, 2026

Collaborator
  • Loading: polling is now optional. Loading is async and runs concurrently across users: item i is added for all users concurrently, then item i+1, and so on, so items within a single user are still added in order.
  • Benchmark: EngramDSPyAgent now supports async.
  • Adds a separate script to poll progress after submitting, so you can check in on progress later. Its --tenants flag lists tenants split into done and still running.
  • Small changes to the answer prompt; memories are now sorted by time to help temporal resolution.
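
For reviewers, the loading order described in the first bullet could be sketched roughly as below. This is a minimal illustration, not the code in this PR: `add_item`, `load_all`, and the sample histories are hypothetical names, and the real ingestion call is replaced by a stand-in that only records what it was asked to add.

```python
import asyncio
from itertools import zip_longest


async def add_item(user_id: str, item: str, log: list) -> None:
    """Hypothetical stand-in for the real ingestion call (an assumption)."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    log.append((user_id, item))


async def load_all(histories: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Add item i for every user concurrently, then item i+1, and so on."""
    log: list[tuple[str, str]] = []
    users = list(histories)
    # zip_longest yields one "round" per position i; users with shorter
    # histories are padded with None and skipped for that round.
    for round_items in zip_longest(*histories.values()):
        await asyncio.gather(
            *(
                add_item(user, item, log)
                for user, item in zip(users, round_items)
                if item is not None
            )
        )
        # gather returns only when every add in this round has finished,
        # so round i+1 never starts before round i completes.
    return log


histories = {"alice": ["a0", "a1", "a2"], "bob": ["b0", "b1"]}
log = asyncio.run(load_all(histories))
```

Waiting on each round before starting the next is what keeps the per-user order intact while still overlapping work across users. The cost is that every round is as slow as its slowest user.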

@orca-security-eu orca-security-eu Bot left a comment

Orca Security Scan Summary

| Status | Check | Issues by priority |
| --- | --- | --- |
| Passed | Infrastructure as Code | high 0, medium 0, low 0, info 0 |
| Passed | SAST | high 0, medium 0, low 0, info 0 |
| Passed | Secrets | high 0, medium 0, low 0, info 0 |
| Passed | Vulnerabilities | high 0, medium 0, low 0, info 0 |
