Natural Language Puzzling

Welcome back, Puzzlers! And welcome back to host Will Shortz!

Let's take a look at this week's challenge from the NPR Sunday Puzzle:

This week's challenge comes from listener Jim Vespe, of Mamaroneck, N.Y. Think of a major American corporation of the past (two words, 15 letters altogether). Change the last three letters in the second word and the resulting phrase will name something that will occur later this year. What is it?

Oof, this one feels like a total freebie, and not to brag, but I got this right away after considering just a few possibilities.

The major American corporation of the past doesn't narrow things down much, but knowing that it must be two words, 15 letters altogether would definitely help us filter down a list if we had one handy.

The something that will occur later this year feels like the best place to start, because I can't think of a lot of possibilities. I don't think it would be something that occurs every year.

That leaves special events for 2024. It's safe to assume that this would be something that your average NPR listener would be aware of---not some obscure event or gathering.

The next World Cup is in 2026, so that's not relevant. Likewise, we already had a major eclipse.

The Summer Olympics take place this year in Paris. This is a major election year in the USA. Can you think of variations on one of these that might give us a solution? How about other major, newsworthy events happening this year?

I ran this challenge by Microsoft Copilot (using GPT-3.5, I believe), and while it made some wrong assumptions, it got close enough that it would lead a reasonable person to the correct answer. I also passed it to Google ~~Bard~~ Gemini, but the results were nowhere near the correct answer.

I'm going to pass on writing a full script for solving this one since it's so easy, but let's walk through a hypothetical NLP approach here.

This is a pretty standard word puzzle format for the Sunday Puzzle.

There exists some string (2 words, 15 letters) among List A (American corporations of the past) such that when the last 3 letters are changed, the resulting string can be found among List B (something that will occur later this year). In other words, take a thing from List A, transform it, get a thing from List B.

In an NLP approach, the main challenge would be gathering List A and List B and ensuring that they are reasonably complete. These are both open classes, meaning there is no complete list, because we could always keep digging and find something else. This is in contrast to closed classes, which are things like U.S. states or Academy Award Best Picture winners.

I would suggest that we search the web for a ready-made list of American corporations, and/or ask an LLM to provide us a list. Alternatively, as we've done in the past, we can use a pretrained (L)LM like BERT or GPT-2 (both available in Python libraries) to fill in a blank for us with the 100 or 200 most likely candidates: "In the past, major American corporations like ________ provided employees with a wide range of benefits." We know these items need to be 2 words, 15 letters total, so we remove any items that don't fit that pattern. A little cleaning and formatting and we have List A.
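
To make the filtering step concrete, here's a minimal sketch of the "2 words, 15 letters" check. The function names and the candidate strings are made up for illustration (they are not real corporations and not the solution), and I'm assuming we count only alphabetic characters as letters.

```python
def letter_count(phrase: str) -> int:
    """Count alphabetic characters only, ignoring spaces and punctuation."""
    return sum(ch.isalpha() for ch in phrase)


def fits_pattern(phrase: str, n_words: int = 2, n_letters: int = 15) -> bool:
    """True if the phrase has exactly n_words words and n_letters letters."""
    return len(phrase.split()) == n_words and letter_count(phrase) == n_letters


# Hypothetical raw candidates, as they might come back from a web list or an LM.
candidates = [
    "Northern Widgets",          # 2 words, 8 + 7 = 15 letters: keep
    "Acme",                      # 1 word: drop
    "Tiny Firm",                 # 2 words, only 8 letters: drop
    "Big Example Holdings Inc",  # 4 words: drop
    "Acme Corporation",          # 2 words, 4 + 11 = 15 letters: keep
]

list_a = [c for c in candidates if fits_pattern(c)]
print(list_a)
```

The same `fits_pattern` check would work for List B, since the transformed phrase keeps the same shape.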

For List B, I would turn to the web again. Possibly we could do a web search and find a list of major events taking place this year. We could ask Copilot. We could also just brainstorm our own list (but in doing so the solution would probably be so obvious we don't need an NLP approach). Let's assume we manage to find or assemble a list of plausible candidates. Again, we'll want to filter our list to only include strings that give us 2 words, 15 letters total.

The next element is the transformation: change the last three letters in the second word. In a brute force approach, we could indeed iterate through all 26^3 (17,576) possible three-letter endings, but that's unnecessary at this stage. Instead, we should iterate through List A, then iterate through List B, looking for a string match between ItemA[:-3] and ItemB[:-3]. In other words, we're comparing the strings but ignoring the final 3 letters. If we find such a match, our Python script would print it out, at which point we can see how the 3 letters were changed.
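
Here's roughly what that comparison could look like. This is only a sketch with toy data: `normalize` and `find_matches` are names I'm inventing here, the list items are placeholders rather than real candidates, and I'm assuming we want to lowercase everything and skip pairs that are identical (since nothing would have changed).

```python
def normalize(phrase: str) -> str:
    """Lowercase and collapse whitespace so comparisons are consistent."""
    return " ".join(phrase.lower().split())


def find_matches(list_a, list_b, n=3):
    """Return (a, b) pairs that agree on everything except the final n characters."""
    matches = []
    for a in list_a:
        for b in list_b:
            na, nb = normalize(a), normalize(b)
            # Equal prefixes after dropping n characters also implies equal length.
            if na != nb and len(na) > n and na[:-n] == nb[:-n]:
                matches.append((a, b))
    return matches


# Toy lists, already filtered to 2 words / 15 letters.
list_a = ["Northern Widgets", "Acme Corporation"]
list_b = ["Northern Widgeon", "Annual Bake-Off"]

matches = find_matches(list_a, list_b)
for a, b in matches:
    print(f"{a} -> {b}  (last three letters: {a[-3:]} -> {b[-3:]})")
```

The nested loop is fine for lists of a few hundred items. For much larger lists, one option would be to index List B by its prefix in a dictionary and look each List A prefix up directly.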

I suspect we'd only find one solution, but let's take this a step further. Let's assume, for the sake of challenge, that this would result in thousands of potential solutions--too many to review manually. How could we sort through those potential solutions?

Again, I would suggest we use a language model here. This time, instead of asking the model to fill in a blank for us, we're going to have the script fill in a blank with each of our potential solutions and then rank them according to a probability (or perplexity) score. In fact, we used such an approach just last week. A good sentence template to use here might be: "Many people are excited because the ________ will take place later this year." So our script will use the remaining items from List B and tell us how well each one fits in the context of our template.
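
As a rough illustration of the ranking idea, the sketch below fills the template with each candidate and sorts by a score. To keep it self-contained, the scorer is a tiny add-one-smoothed unigram model built from a made-up two-line corpus; that is a stand-in I'm assuming for demonstration, not the pretrained model you'd actually want. All function names and the candidate phrases are hypothetical.

```python
import math
from collections import Counter

# "{}" stands in for the ________ blank in the sentence template.
TEMPLATE = "Many people are excited because the {} will take place later this year."


def tokenize(text):
    """Very rough tokenizer: lowercase, split on spaces, strip edge punctuation."""
    tokens = (w.strip(".,!?").lower() for w in text.split())
    return [t for t in tokens if t]


def make_unigram_scorer(corpus):
    """Build a toy scorer returning the average log-probability per token."""
    counts = Counter(tok for line in corpus for tok in tokenize(line))
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen words

    def score(sentence):
        toks = tokenize(sentence)
        logp = sum(math.log((counts[t] + 1) / (total + vocab)) for t in toks)
        return logp / len(toks)  # higher is better (lower perplexity)

    return score


def rank_candidates(candidates, score_fn, template=TEMPLATE):
    """Fill the template with each candidate and sort from best to worst fit."""
    scored = [(score_fn(template.format(c)), c) for c in candidates]
    return [c for _, c in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Made-up corpus and candidates, purely for demonstration.
toy_corpus = [
    "The town parade will take place this year.",
    "People are excited about the town parade.",
]
score_fn = make_unigram_scorer(toy_corpus)
ranked = rank_candidates(["zebra quorum", "town parade"], score_fn)
print(ranked)
```

Because `rank_candidates` only needs a function that maps a sentence to a number, a wrapper around a pretrained model such as GPT-2 could be passed in place of the toy scorer without changing the rest of the script.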

What do you think of this hypothetical approach? Did you also find this puzzle easier than usual? I'll be back after the Thursday submission deadline to share my solution.

--Levi King

Update

Here's my solution:

See you next week!

Monday, April 22, 2024

Major American corporation of the past
