(This is a recap/review of the 2023 MIT Mystery Hunt, which happened this month. Puzzles, solutions, and solving stats can currently be found here. This recap will contain spoilers.)
Before I start talking about length issues, after posting Part 1, I remembered something else I loved about the pre-AI portion of the Hunt: the automated vote-on-a-response team interactions were brilliant. They were a great way to enforce a team bonding experience and immediately advance the story without requiring the constructing team to invest live person-hours, and I think my team laughed out loud much more than we would in a typical live interaction. The idea was great and the writing was great. I’m torn because this felt really specific to this year’s theme and artistic design, but at the same time I want everyone to steal it.
Now.
In the preamble to a recent Hunt writeup, a member of Cardinality amusingly said, “I am not a titan of the community and I will not share anecdotes about how this puzzle reminds me of this meta back in the 1926 Mystery Hunt where they gave us 3 rocks which we had to bang together in the right way to make the correct fires.” Of course not! That’s MY job. So let’s start with a brief history (at least within my time spent with Mystery Hunt, which is up to 25 years now) of Hunts that were too f***ing long.
If you analyze Mystery Hunts that went long from at least one team’s perspective, you’ll find that they generally fall into one of two categories:
- Hunts where one or more metapuzzles end up being killers and block teams’ progress for disproportionate periods of time. Let’s call these mettlenecks (short for meta bottlenecks).
- Hunts where the act of solving puzzles to get to the metapuzzles is so overwhelming that the construction team has to modify Hunt procedures so that even the teams in contention get credit for them without solving them entirely on their own. I don’t have as catchy a name for these, but I’m going with forcefeeds.
I’ve been on the constructing end of my share of mettlenecks, including 2009 (Zyzzlvaria, where different teams got stuck on different one-meta-left situations, which made for an exciting finish), 2019 (Holiday Forest, where the last two metas were very hard to solve without almost all of the feeder answers), and 2005 (Normalville, where one team had nothing left to solve but one metapuzzle for 24 hours).
The interesting thing about a mettleneck is that often it gets bad reviews from the lead teams that spent a lot of time staring at the same meta (which is booooooring), but disproportionately good reviews from casual/middle-tier teams, because they’re able to proceed through a lot of the Hunt before a winner is announced, as there’s not much bottlenecking before the bottleneck. I do think this is still a negative result despite positive side effects, because while it’s good for a majority of teams, you don’t want anybody to work hard solving your Hunt and then end up frustrated.
Forcefeeds, on the other hand, seem to rotate into Mystery Hunt periodically. There have been a lot of jokes recently about years that end in threes, because three of the most notable forcefeeds are 2003 (The Matrix, which was ahead of its time in that it would NOT seem too long today), 2013 (Coin Heist), and now 2023 (MATE? Puzzle Factory? Relentless AI Assault?). One of the other most notorious examples was 2004 (Time Bandits), which managed to follow up Matrix with something even more forcefeedy. The same team that won in 2003 also won in 2013 and went in determined NOT to repeat mistakes of the past, and I feel that was a success. I was on the writing team for 2014 and not 2004, but I can’t take credit for the changes… the leadership of that team made very good choices, and I mostly served in an advisory role. I did insist that we keep our endgame operating in full for as many teams as possible once the coin was found, even though it was a pain in the ass, and I’m very glad we did so. 2008 (Murder Mystery) had some forcefeed elements as well, though I don’t remember the details as vividly.
The hallmark of a forcefeed Hunt is that at some point, the construction team realizes things are not proceeding on pace, and that something has to be done. In 2003/2004, that involved hinting puzzles liberally when teams reached even the smallest bit of resistance; in 2003 we were actually assigned a dedicated in-HQ hinter for an extended period of time. In 2013 and 2023, the constructing team took things one step further and gave out a large quantity of what I call “nukes,” the ability to get free answers for puzzles without any idea of how to solve them. I want to be clear that given the pace of both Hunts, this practice was probably necessary, since the Hunt would have gone well past Monday if teams were going to solve what they were expected to solve. But I want to highlight why you don’t want to find yourself in a situation where this is necessary.
When Setec first earned a couple of nukes, we met and talked about our strategies for using them, and on a related note, what we wanted to get out of Hunt. I said that my priority was solving metapuzzles and opening rounds, because that’s what I find exciting. Several other team members agreed. But cut to 24 hours later, and opening rounds wasn’t fun anymore, because we weren’t doing it by solving puzzles… we were doing it by giving up on puzzles and pressing buttons. When I saw the notifications that Ascent and Conjuri’s Quest had opened, I had a “meh” feeling I’ve never had when opening new rounds before, because I didn’t feel like we’d earned access.
This is why I didn’t have fun with the AI rounds. It seemed pointless to work on a puzzle when we could just as easily flip a switch and make it disappear. At one point we opened Flooded Caves, which is a set of seventeen Cave logic puzzle variants. I love abstract logic puzzles, and one of our captains, Tanis, asked me if I was going to solve this, or if we should nuke it. We counted the puzzles and realized I’d probably spend the rest of Hunt solving it, or worse, I’d spend hours on it and then we’d give up and get the answer for free anyway. We nuked it immediately. We basically spent Sunday looking at puzzles, deciding whether they seemed approachable enough to bother with, and often deciding no. This wasn’t just us… the Hunt stats indicate that the entire Ascent ROUND (meta not included) had 18 successful solves, and 147 nukes. Teams didn’t complete this round. They took an elevator that went past it.
Even when we’re stuck on a meta I usually enjoy Mystery Hunt, but this year’s Sunday afternoon was the second time I remember legitimately not having fun. The other time was 2013, and with both data points in hand, I now assume the free answers are to blame. Solving feels pointless when puzzles are spontaneously combusting around you.
So what do I think Teammate could have done to avoid this? It’s easy for me or for anyone to criticize from afar, because writing Hunt is time-consuming and sometimes thankless work, and tuning/pacing Hunt is extremely difficult (as I stated above, I’ve been on teams that have messed it up, though more frequently through problematic metapuzzles). But we grow as a community by sharing insight, and I’ve been around long enough to have a lot of perspective, so here are some of my observations.
Many individual puzzles were bigger than they should have been. I already mentioned the seventeen caves above that caused me not to attempt to solve any of them. I also mentioned the number 147, which coincidentally happens to be the number of morals you had to identify in Moral of the Story, after finding a message in 147 typos. It’s unlikely that any single solver wants to do anything 147 times. Hunt is, of course, a team activity, so you probably won’t have a single solver doing it. But that still means you’re devoting multiple people to stare at a single puzzle for an extended period of time. When you have lots of puzzles that are really big (the one with the quiz bowl questions also comes to mind, which looked like a fun idea iterated way too many times for me to want to solve), you are spreading non-giant teams thin, and the vast majority of teams solving Hunt are non-giant.
Testing may have needed to take into account that most teams are not like Teammate. This is very similar to what I said in 2013, when I felt like Manic Sages wrote the perfect Hunt to be solved by Manic Sages… who were unfortunately the only team not solving that year. After the solutions were posted, I saw a lot of people reference the authors’ note for Terminal, which begins, “This puzzle being solvable at all was honestly a huge surprise to me.” That is a MASSIVE red flag, and the note goes on to explain that the puzzle was made harder because testsolvers got better and better at solving clues. I am curious how long this process took, and whether it was considered that in practice, this would only be one of many puzzles teams were contending with at once. For what it’s worth, we thought Terminal was a fun idea, and we had at least a dozen people put hours into trying to solve it. After expending those hours, we still had less than half the grid filled. We nuked it. Moral: Don’t make puzzles harder because your testers are surprisingly brilliant; your testers won last year’s Mystery Hunt. (There are similar “let’s make it harder” stories from 2004 and 2013, which is not a coincidence.)
Number of puzzles is not a good gauge of size/difficulty. The number of puzzles in Hunt has oscillated, but overall it’s grown close to linearly over the last few decades… that might be okay, because solving resources and ability are also growing. But the definition of what a puzzle is is also growing; the puzzles I wrote in 2000/2002/2005 would barely qualify in the modern era. And with really chunky online hunts like Galactic and Teammate and QoDE and Silph pushing boundaries throughout the year, people’s expectations for how long a puzzle can take and be reasonable are expanding. The problem with this is that if the number of puzzles is O(n), and the size of a puzzle is O(n), the total size of the Hunt is actually O(n^2), which is a terrifying rate of growth. The team that won has been reported publicly to have over 160 people, though someone on the team reported that they had about 170 unique solvers, and only 120 of them were active solvers. ALL THREE OF THOSE NUMBERS ARE TOO BIG. And it is vital that TFKA…TTBNL does not write a Hunt with a team of that size in mind, because few of those teams exist (and in my opinion, none should).
The first part of the Hunt did not feel friendly to casual solvers. I was genuinely surprised at wrap-up when Teammate said one of their goals was to support casual teams, because I found the first round and meta to be much less accessible than the intro phases of recent Hunts. The first puzzle I worked on, Museum Rules, immediately subverted the expectation that the copy-to-clipboard feature would consistently work, and the aha was very challenging (and even once you got it you still had a bunch of nontrivial superimposing to do). Apples Plus Bananas required two ahas (you need PLUs and you need to get prime totals) and turned into a significant logic puzzle if you assumed those things, which you could not necessarily confirm before solving. I think these are tough puzzles as an experienced solver, and I can’t imagine how an MIT frosh who wandered in and wanted to see what all the puzzling was about would navigate them. As for the meta, I recognized what to do with it quickly because I’ve solved many Anglers/Numberlinks. Most newbies haven’t. I encountered very few easy puzzles in this Hunt in general, and I was surprised that the ones I did find most approachable weren’t generally at the beginning.
The recap over at Fort & Forge predicts that my Part 2 thesis will be that Teammate should have cut Part 3 of the Hunt entirely. Actually that was my Part 1 thesis, so maybe I was too subtle about it? As it stands, if Part 1 and Part 2 were going to be what they ended up being, then yes, I think Part 3 should have been cut (or more specifically, replaced with a single round or more involved endgame that introduced and neatly resolved the multiple-AI story). I think if Teammate really wanted to have Part 3, all the parts needed to be smaller, in terms of number of rounds, number of puzzles, and the size/complexity of the puzzles themselves. I get the desire to mess with structure… Zyzzlvaria had a second phase that was all about messing with structure, and while Holiday Forest only had one structural innovation, carrying it out to the extent we wanted caused us to include more puzzles than we should have. But there were lots of fun structural things in the Museum and Factory, and frankly, I found the Innovations and Factory Floor metas far more interesting than anything I saw in the AI answer format gimmicks or metas. But even if you really like Part 3, I don’t think it’s defensible to say that Parts 1, 2, and 3 all fit into this Hunt as is, because the stats show that teams didn’t solve Part 3.
At the beginning of this rant, I pointed out that forcefeeding is a pattern that occurred ten and twenty years ago. One team responded to it by making things even bigger (and arguably more poorly edited). Another team–okay, the same team ten years later–looked at the issues and intentionally tried to fix them. I’m a little worried that, because this year’s winners are a giant team, they will all want to contribute and will write something that can only be solved by giant teams. Giant Hunts are what cause first-time constructors to be scared of winning; they assume they also have to write something giant. You don’t. Setec has had a writing philosophy for years to “write for the middle.” During Wrap-Up, someone said Mystery Hunt is growing and growing, and so it needs money or it will die. Donating to Hunt is a good thing, but please understand that if it does not keep growing, it will not die. In fact, it could use some shrinking and breathing room.
Despite my criticisms, thank you to Teammate for making something creative and cool. I thought the last Teammate Hunt excelled, especially for an online puzzlehunt, in terms of cohesive puzzles, art, and story, and I found the same to be true for the portion of this Hunt I was able to enjoy before time intervened and puzzles started disintegrating all around us. I know from experience that I can have a lot of fun with your puzzles and structures, and I hope you recover enough to write another Teammate Hunt soon. But since you effectively just wrote 1.5 Mystery Hunts, I won’t blame you if you need a break.