Project Releases
This page links to code repositories, datasets, model releases, and evaluation frameworks as they become available over the course of this research.
Shadowdark QA Bench One
A set of question-answer pairs for the Shadowdark RPG, spanning multiple categories, used to gauge how well a model retains knowledge of the game's rules.
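As a rough illustration of how a categorized QA set like this might be scored, the sketch below computes exact-match accuracy per category. It is not the benchmark's actual format or scoring method: the field names (`id`, `category`, `answer`), the category labels, and the sample records are all hypothetical placeholders.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def accuracy_by_category(items, predictions):
    """Return {category: fraction of exact-match answers}.

    items: list of dicts with hypothetical keys "id", "category", "answer".
    predictions: dict mapping item id to the model's answer string.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for item in items:
        total[item["category"]] += 1
        # A missing prediction is scored as a wrong answer.
        predicted = predictions.get(item["id"], "")
        if normalize(predicted) == normalize(item["answer"]):
            correct[item["category"]] += 1
    return {category: correct[category] / total[category] for category in total}


# Placeholder data only (assumed shape, not taken from the benchmark).
items = [
    {"id": "q1", "category": "combat", "answer": "answer a"},
    {"id": "q2", "category": "combat", "answer": "answer b"},
    {"id": "q3", "category": "magic", "answer": "answer c"},
]
predictions = {"q1": "Answer A", "q2": "something else", "q3": "answer  c"}

scores = accuracy_by_category(items, predictions)
print(scores)
```

Exact match is the simplest possible criterion and is used here only to keep the sketch short; free-form rules answers would likely need a more forgiving comparison.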
Upcoming Releases
- GM-Eval Framework: A set of metrics and evaluation scenarios for testing how LLMs perform as Game Masters.
- Rules Engine: Custom tools for LLMs to interact with TTRPGs.