Model-Based Testing for Dungeons & Dragons (loskutoff.com)
105 points by Firfi 4 days ago | 74 comments



jaen 1 day ago

Maybe the content is great, but the AI writing style is really grating with its staccato sentences and faux-"profoundness". Can't bear it any more, stopped reading.

"You’re not checking logic. You’re checking shape.". Ugh.


The way things are headed, people with the ability to write on their own are going to be the hottest job in the 2030s.

Said the same thing about desktop publishing in '92. The "hottest skill" became table stakes within a decade, then automated out entirely. Human judgment persisted, but not in the job titles anyone predicted.
krapp 1 day ago

You think there will still be writing jobs for human beings at all by then?

AI will be so normalized across culture that any raw, unfiltered human expression will read as gross and unprofessional to most people.


Tangent, but.. It must’ve picked up the faux profoundness on LinkedIn. Those posts I find truly unreadable. It half seriously makes me think anyone being able to post anything was a bad move.
nhayes 1 day ago

Fair point on the style, but I skimmed past it fine and found the actual content useful. Writing quality matters less to me than whether the technique works, and this one does. We've used similar state-machine approaches for testing game logic and it catches real bugs.
heelix 1 day ago

Converting DnD rules and edge cases was always a bit of fun and became my "hello world" as I was learning stuff.

Years back, I worked at a company where the agreement required them to review any personal application that I created for a year or so after I left. I was super happy to send them iterations of my DM'ing tools - written for Java (micro edition), WinCE, Palm, and any other mobile gadgets I could get my hands on.

Around the 4th application I sent, the pharmaceutical company released me from the non-compete clause. I've always wondered if they were required to try and run the applications.


You should sell those as a suite of tools for people in similar situations. The Palm one in particular should make for fun.

One of my biggest issues with playing DND is that I never fully understood the rules. I'd play with people who had been playing for years, and they didn't explain things very well, and that made it hard to play. Hopefully, this will help with that.
jghn 1 day ago

> I never fully understood the rules

I played from the early 80s through early 90s. Mostly AD&D 1e but earlier on the red/blue boxes and later on 2e.

Recently I've taken to reading r/adnd for nostalgia reasons. One thing became abundantly clear real fast: no one I ever played with ever truly understood the rules. Even the "rules lawyers" among us. And I played with a large variety of people from different friend groups, to different game shops, and even some smaller cons.

We understood the key details for the parts we actually used, but we weren't intentionally avoiding the rest, we just didn't understand that they existed. There's just so much minutiae in those rule books.

This also makes me chuckle when I see newer players come into r/adnd as part of the OSR movement. Because they *do* seem to assume that all of these rules were commonly applied. But my anecdata would say otherwise. I originally assumed that these newcomers to the old rules would be playing a game I found alien as they'd be bringing in newer sensibilities, but instead I suspect I'd find it alien as they're more likely to be sticklers for the full ruleset!


Same experience mostly. I distinctly remember trying and then ignoring Unarmed combat and Psionics as cumbersome nonsense. Encumbrance was only enforced for a huge hoard, range penalties were usually irrelevant, and any sort of weapon factor was out. Most important was hp, AC, attacks/round, spells, magic items, STR, DEX, and saving throws.

House rules are part of the appeal of the game. RPGs are supposed to be something that you make your own, to whatever degree you want. D&D did have some really incoherent rules that accreted over time though... I grew up with AD&D (the gold book), and I don't think anyone ever used all the rules from that.

When you look at the DM's guide to the game, one of the very first rules it teaches is that fun trumps being a stickler for the rules, and that the DM is free, even encouraged, to bend and break rules for a better plot.

D&D has a strong narrative aspect when you look at the published adventure modules. There are usually plenty of characters to interact with in some way or another and some quests can be solved entirely by following the breadcrumbs offered up through them. But the DM needs to role-play all of these characters and do a lot of improv to make this work. This isn't so easy.

Also, combat in D&D is a slog. Whereas turn taking outside combat is rather fast and loose, the game turns into this enormous ceremony once the words "roll initiative" are spoken. The effect is that combat can take up a lot of playtime relative to the non-combat role playing, while often also leading to less overall quest progress per time.


This is one of the biggest issues with DnD in general. It's also one of the reasons behind the simplicity of the Shadowdark[1] RPG.

Shadowdark not only has much simpler (and fewer) rules, it also has a lot less world building. This encourages the DM and the players to create their own fantasies, rather than adhering to the races described in the (MASSIVE) DnD manual.

[1]: https://www.thearcanelibrary.com/pages/shadowdark


There's a community and play-style called OSR or "old school renaissance," that recreates versions of the earliest editions of D&D, and encourages a style of play that's heavily oriented around few rules and the DM making quick decisions/rulings on the spot, rather than lots of rules and lots of time spent mining the rulebooks. In fact, the expression is "rulings over rules." This might appeal to you.
jghn 1 day ago

There's a dichotomy here that I have always found amusing. To me, the older style of play felt crunchier, despite there being less of a rules focus. The most common style of play back then was more of a dungeon crawl, closer to "roll playing", low fantasy, usually lower level, murder hobos were very common, and all of that.

Whereas today's game is far more complicated rules-wise by most measures, yet it tends to be more storytelling & *role* playing focused: flower-y, superhero-y, high fantasy.


My old thief in high school: "I use my 'Appraisal' skill on the situation..." :-D

DM: "Umm... not very good..." (became a running joke)

Firfi 1 day ago

That's the plan! D&D combat can be a slog sometimes, and when it is, that kills a lot of fun for me as someone who prefers a story-first approach. I'd really just ask a chatbot about this or that rule, or have a list of weighted actions presented to me on my turn. That's where I'm heading - a good spec is hopefully what should enable that direction. Hopefully...

There are other RPGs with light rules that are WAY more fun than D&D. I've been playing "Blades in the Dark" recently, where the players run heists in a victorian ghost industrial city. It's an absolute delight.

D&D is better as a video game. Try Baldur's Gate. It has the side benefit of teaching you the rules if you ever want to jump into a local game.


Baldur's Gate 3, the video game, taught me DnD. Video games where you can go at your own pace are a nice option.

Agreed. I'm onboarding a couple of new players and see the issues again and again. I'm dropping the overall proficiency score as it just confuses things. Skills and abilities just take a while to become second nature, though.
esseph 1 day ago

Proficiency scores are HUGE and extremely important, especially for classes like rogue, bard, etc who rely on them so much - especially for non-combat roleplaying ability or reflex actions.

You could consider playing Shadowdark with new players instead. It's much more friendly to new players.
neal 1 day ago

Dropping proficiency entirely might make onboarding easier but it really does nerf certain classes hard. I had a new rogue player who felt completely outclassed once we hit tier 2. What worked better for me was just deferring the math -- let them roll, I'd add the bonus silently until they felt ready to track it themselves.

D&D is a bit like Monopoly in that very few people play by the rules as written and instead most tables play by a semi-unique/regional subset of the rules and with a mixture of house rules and DM preferences. Especially people who have been playing for years, not only have they had more time to house rule and build DM opinions, but they also may have seen multiple versions of the rules over that time and interacted with a wider variety of other tables.

To some extent, this is weirdly a good thing: if you want strictly enforced rules, you may just want to play a videogame instead. D&D succeeds best as a social lubricant enabling a framework in which social gaming (roleplaying) can happen to be "fun". Rarely is strictly following rules "fun", especially socially with friends; the rules in D&D are meant to be guideposts and tools for enough structure that people that want structure find comfort and enough flexibility that "fun" isn't lost in the process.

Which is a long way to say that you probably aren't going to learn the right lessons from a well fuzzed computer spec of the rules, you probably are going to learn more lessons asking the people you play with what rules they find important, to explain things you feel you don't understand, and to suggest which chapters in which books to try to read to best improve your understanding for that group. At the end of the day, if the table seems too hard to play at you might also just be playing with the wrong group, especially if you aren't having fun.

esseph 1 day ago

Watch a few different popular gaming sessions on YouTube. Tons to choose from.

It's probably way different than you expect (and will be different between DMs).

Firfi 4 days ago

Dungeons & Dragons rules are a spec spanning thousands of pages, not formalized, but thoroughly tested by the community. Moving them to a formal specification language (Quint) was an obvious next step. It worked and proved to also be a great LLM self-checker.

Fantastic, I'd been daydreaming about doing similar for a while!

Do I understand correctly that the Quint code is not needed 'at runtime', that it's there for model-based testing of the XState implementation?

Firfi 1 day ago

Right. Quint is not used at runtime and is not supposed to be. It's a strong testing layer. But there's much more to it. My bigger idea is to generate whatever implementation from it, hopefully with an agentic loop - the MBT test is a natural feedback harness for leaving coding to run overnight. So dnd-rust at some point, maybe? If someone develops a game, they would be able to generate the core logic in Rust for Bevy, in C# for Unity, in (whatever is used there) for Godot. That's in an ideal world.
mmm51 4 days ago
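To make the "feedback harness" idea concrete, here is a minimal sketch of model-based testing in plain Python. It is not Quint and not the XState code from the article: the hit-point rule, `MAX_HP`, `model_step`, `Character` and `check_conformance` are all hypothetical names made up for illustration. The reference model is a pure function, the implementation is a stateful class, and the harness replays random action traces through both and reports the first disagreement.

```python
import random

MAX_HP = 20  # hypothetical character maximum, chosen for the sketch


def model_step(hp: int, action: tuple[str, int]) -> int:
    """Reference model: pure function from (state, action) to next state."""
    kind, amount = action
    if kind == "damage":
        return max(0, hp - amount)
    if kind == "heal":
        # Assumption for this sketch: healing caps at MAX_HP.
        return min(MAX_HP, hp + amount)
    raise ValueError(f"unknown action: {kind}")


class Character:
    """Implementation under test: mutable, written independently of the model."""

    def __init__(self) -> None:
        self.hp = MAX_HP

    def apply(self, action: tuple[str, int]) -> None:
        kind, amount = action
        if kind == "damage":
            self.hp -= amount
            if self.hp < 0:
                self.hp = 0
        elif kind == "heal":
            self.hp = min(MAX_HP, self.hp + amount)
        else:
            raise ValueError(f"unknown action: {kind}")


def check_conformance(traces: int = 200, length: int = 30, seed: int = 0):
    """Replay random traces through model and implementation.

    Returns None if they always agree, otherwise a tuple describing
    the first step where they diverged.
    """
    rng = random.Random(seed)
    for _ in range(traces):
        hp = MAX_HP
        impl = Character()
        for step in range(length):
            action = (rng.choice(["damage", "heal"]), rng.randint(1, 12))
            hp = model_step(hp, action)
            impl.apply(action)
            if impl.hp != hp:
                return (step, action, hp, impl.hp)
    return None


result = check_conformance()
```

If an agent (or a person) breaks `Character.apply`, `check_conformance` returns the diverging step instead of `None`, which is the signal such a loop would iterate on.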

So D&D players have been doing formal verification for 50 years without knowing it.
rjmill 1 day ago

Someone please explain the grapple leapfrog example and why that "exploit" is interesting. If my players tried that, I'd happily let them use their full turns to do some crazy trapeze act across the battlefield.

And then I'd remind them that they could have just dashed normally.

Moreover, how do the new rules close the "exploit"? You can still move 30ft while carrying someone. (60/2 - 30 vs 60 - 30*2) How is that difference meaningful in this case?
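The arithmetic in that parenthetical can be checked with a small sketch. It assumes, as the comment's numbers suggest, that the older rule halves your speed while dragging a grappled creature and the newer one charges 1 extra foot per foot moved; the function names are hypothetical.

```python
def reach_with_halved_speed(speed_ft: int, dashes: int = 1) -> int:
    """Feet covered in a turn if dragging halves speed (assumed older reading)."""
    budget = speed_ft * (1 + dashes)  # base movement plus one Dash
    return budget // 2


def reach_with_extra_cost(speed_ft: int, dashes: int = 1, extra_per_ft: int = 1) -> int:
    """Feet covered if each foot costs 1 extra foot (assumed newer reading)."""
    budget = speed_ft * (1 + dashes)
    return budget // (1 + extra_per_ft)


old_reach = reach_with_halved_speed(30)
new_reach = reach_with_extra_cost(30)
print(old_reach, new_reach)  # 30 30
```

Under those assumptions both readings give the same 30 ft for a 30 ft speed with one Dash, which is the point of the question.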

(Also, wouldn't you need something like rogue's dash-as-a-bonus-action to grapple and dash on the same turn?)

The article is pretty interesting overall but this example mystifies me. Am I missing something obvious?


Yeah, both players were either rogues or tabaxi (although feline swiftness isn’t dashing)

This is also directly why I don’t like D&D. It is way too combat focused and video gamey. If your combat system is so complex that people find (or even feel that they need to find) “exploits” in it then your system probably sucks. So many class features are purely combat focused completely ignoring the actual roleplaying part of role playing games.

Also the “counter chaining” feels odd to me, is this something that actually happens? Like people waste spellslots counterspelling a counterspell?

pigpop 1 day ago

From my limited experience, many players and DMs seem to get things backwards in exactly the way you're describing. They take the rulebook as the starting point or the "controls" for the game and since combat is the most detailed they tend to focus on that to the exclusion of other parts of the game. I've always viewed the rules as a way of settling disputes or uncertainty instead, so you start from the role playing and only resort to rules when you need fair adjudication or clarification on complicated situations. i.e. don't give me quotes from the rulebook, tell me what your character does and we'll work it out as part of the story.
e28eta 1 day ago

re: grapple leapfrog, it links to this question: https://rpg.stackexchange.com/q/136964

Maybe the AI used the accepted answer (with 4 votes vs the next with 39) and then mangled things from there?

re: counter chaining, I think so. I spent some time watching Critical Role and iirc they liked to counterspell a counterspell.
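For what a counterspell chain does mechanically, here is a minimal sketch. It assumes every counterspell automatically succeeds against the thing it targets (no checks or saves) and that the chain resolves last-cast-first; `original_spell_resolves` is a hypothetical helper, not anything from the article's spec.

```python
def original_spell_resolves(counterspells: int) -> bool:
    """Resolve a chain of counterspells cast on top of one original spell."""
    # Bottom of the stack is the original spell; each counterspell targets
    # whatever was cast immediately before it.
    stack = ["spell"] + ["counterspell"] * counterspells
    while stack:
        top = stack.pop()
        if top == "spell":
            return True  # nothing left above it, so it goes off
        # A resolving counterspell negates the item directly beneath it.
        stack.pop()
    return False  # the original spell was negated


outcomes = [original_spell_resolves(n) for n in range(4)]
print(outcomes)  # [True, False, True, False]
```

Under these assumptions only the parity of the chain matters: an even number of counterspells lets the original spell through, an odd number stops it.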


> If your combat system is so complex that people find (or even feel that they need to find) “exploits” in it then your system probably sucks.

Couple of things.

1. People will try to find exploits in just about any system. That's kind of part of the fun.

2. If the difficulty curve sucks in a particular D&D campaign - that's the DM's fault, not the system's. There are plenty of tools at the DM's disposal to make campaigns less combat focused or to be more lenient to players.


Eh, I don’t find it fun because if you can break the combat then you either decide not to play “optimally” or the GM has to purposefully create situations to fuck with you specifically, which is just antagonistic.

I don’t know how you got to difficulty curve.


I have a couple of players that aggressively press for edge cases all the time. I encourage it, as it gives me the chance to push back with "ok, that's fine on flat ground but you're in thick underbrush," which seems to be more immersive and encourages more roleplaying. Fun stuff.

This stuff can get as complicated as you want it to. I question the business value proposition though (i.e., how much fun it is to be this precise).

That said, I would pay good money to look at the source code of some of the production MTG rule systems.

https://mtg.fandom.com/wiki/Layer

https://media.wizards.com/2026/downloads/MagicCompRules%2020...


This is way less silly than it sounds. D&D is full of weird control-flow edge cases.

As someone who is trying to re-create the Pokémon system, I am running into similar issues. There are many things going on in a single "turn", especially with abilities that can change pretty much any of the game rules.

You are not the only one trying to do it. There are others, in other programming languages (Pokemon Showdown is one that is already implemented, but it uses TypeScript with dependencies and I wanted to avoid those issues). What programming language did you intend to use?

I intended to do it as a C library (which would then be available for other C programs to call). I know many of the rules of Pokemon but not all of the cases, and then there is working out the data structures to use, etc. I also wanted to make the rules customizable (and to implement all generations, although perhaps only some of them will be implemented at first and others later) and I have some ideas about that.

I would hope that some people can work on something together.


I would love to see a model of Mythras/Runequest

The "Grapple Leapfrog" is like the peasant railgun, and I think the "real" solution would be a recognition that order of conflict resolution in real time is not the same as ordering linear activities in game time.

The peasant railgun has always annoyed me because real world physics have never been modelled by DnD.

So yes, your peasant railgun spear would fire at the speed of light in reality. But the game simulation doesn't care. The last peasant in the line throws the spear for 1d8 damage.

Firfi 1 day ago

In my view, edge cases like these (I think the peasant railgun is even mentioned in the DM handbook) are more of a community problem than the game's. If it can be called a problem, of course - some tables enjoy those shenanigans, some don't.

What players and DMs forget more often than not is the wording somewhere at the start of the DM book: the DM can overrule any rule. [to facilitate the game mood and direction that the table has agreed upon] [and a larger overarching problem is probably that there's often no such agreement before the game]

cdunn 1 day ago

Minor correction: it's usually called "Rule 0" but the actual phrasing in the DMG is more like "the DM is the final arbiter." The term "Rule 0" is mostly a community invention, IIRC. Doesn't change your point though, the principle is the same either way.

> [and a larger overarching problem is probably that there's often no such agreement before the game]

Agreed

I personally think Rule 0 enables bad DMs a lot more frequently than good ones. I think it's a bad rule.


This is so cool, I'll definitely be playing with it over the weekend. I meant to put Quint and D&D together along similar lines before but never found the time, so I love to see this coming alive from someone else <3
not_ai 1 day ago

I think this is fantastic. I recently started playing DnD with a local group and can’t wait to dive into this to better understand the mechanics.

If you need formal verification for your D&D group that meets once a week, you have problems that LLMs will never solve for you.

Great
krapp 1 day ago

I don't understand how "exploits" and "edge cases" can exist in a narrative-driven game where the DM can always just say "cut the shit" if they don't like what the players are doing. Or let it happen for rule of cool. At the end of the day the rules are whatever the DM says they are, and don't have to be rules as written.

Even combat can have a narrative element (and it should, to be fun.) There are rules yes but the game isn't supposed to be this rigid.


Shit like this results from a severe misunderstanding of what's enjoyable in a table-top RPG. It's not a fucking video game.

Agreed, people should only enjoy the features of it that I enjoy the way I like to enjoy them. Enjoying it the wrong way is at best stupid, possibly even evil.
pigpop 1 day ago

I agree, tabletop RPGs should be 2/3 Role Playing and only 1/3 Game.

This is why I don't play TTRPGs much anymore. If I wanted to really get into character and roleplay my heart out I'd join an improv group...

I liked the crunchy bits of TTRPGs like charting dungeons and running kingdoms and shit. Can't find many groups interested in that anymore.

jghn 1 day ago

It's why I haven't tried to get back into TTRPGs. I suspect that even if I were to join in on an OSR group that the play style would still be closer to the modern improv group aesthetic.

"I fixed the stupidly complicated tabletop game with an even more complicated piece of software! Now you can be confused and not have to look at human faces while you furiously type away at the computer in order to prove to the DM that the ridiculous loophole that you found in the rules is actually logically consistent"
esafak 1 day ago

Yet another specification language! And it also has a new sibling for distributed protocols: https://quint-lang.org/choreo

Any opinions on this one for software development?

cedar29 2 days ago

The interesting challenge in model-based testing is always the oracle problem — what counts as correct behavior? D&D rules are underspecified enough that your model might just encode one person's interpretation. Tretmans' conformance testing work from the early 2000s is still the clearest framing of this I know, and it applies here pretty directly.