Happy Friday, Puzzlers! I'm later in the week than usual, but I've got a puzzle breakdown and solution for you. Let's look at this week's NPR Sunday Puzzle:
Yes, it comes from listener Ethan Kane (ph) of Albuquerque, N.M. Name a famous TV personality of the past. Drop the second letter of this person's last name, and phonetically, the first and last names together will sound like a creature of the past. What celebrity is this? So again, a famous TV personality of the past. Drop the second letter of this person's last name, and say it out loud. The first and last names together will sound like a creature of the past. What celebrity is it?
This is a familiar format--take an item from Class A, apply a transformation, and get back an item from Class B.
In this case, we have an extra layer of representation to deal with--the speech sounds of the two items in question. In other words, we're dealing with both orthography and phonology.
So here's what we need to solve this puzzle:
- T: a list of candidate names for the TV personality of the past
- C: a list of candidate "creatures of the past"
- a pronunciation dictionary: we'll have the spellings of the personalities and creatures, but we need this resource to get the phonetic spellings so we can compare them
- a script, which will:
    - load the data (list of TV personalities, list of creatures, pronunciation dictionary)
    - split personalities and creatures into separate strings that we can query in the pronunciation dictionary:
        - "Bob Hope" --> ["bob", "hope"]
        - "Saber-toothed tiger" --> ["saber", "toothed", "tiger"]
        - "Dodo" --> ["dodo"]
    - query these separate strings for pronunciations:
        - "bob" --> "B AA1 B"
        - "hope" --> "HH OW1 P"
        - "saber" --> "S EY1 B ER0"
        - "toothed" --> "T UW1 TH T"
        - "tiger" --> "T AY1 G ER0"
    - rejoin the pronunciation strings and normalize (remove the numeric stress markers and ensure correct spacing):
        - ["B AA1 B", "HH OW1 P"] --> "B AA B HH OW P"
        - ["S EY1 B ER0", "T UW1 TH T", "T AY1 G ER0"] --> "S EY B ER T UW TH T T AY G ER"
        - Note: we do this because stress can be quite variable anyway, and we probably want to relax that constraint here.
    - iterate through the person pronunciations, then through the creature pronunciations, looking for a match and printing out any results.

I'll start with this approach and revisit if necessary. For example, I can see that we might want to reduce the double \T T\ to a single \T\ in the phonetic spelling of "saber-toothed tiger," since a contextual phonotactic rule would normally apply here in running speech anyway. So that's something to revisit if I'm striking out.
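The steps above can be sketched in Python. To be clear, this is a toy version, not my actual script: a tiny hand-coded dictionary stands in for the full pronunciation dictionary (CMUdict in the real thing), and the function names here are my own inventions for illustration.

```python
import re

# Toy stand-in for the pronunciation dictionary (ARPABET with stress digits).
PRON = {
    "bob": "B AA1 B",
    "hope": "HH OW1 P",
    "saber": "S EY1 B ER0",
    "toothed": "T UW1 TH T",
    "tiger": "T AY1 G ER0",
}

def tokenize(name):
    # "Saber-toothed tiger" --> ["saber", "toothed", "tiger"]
    return re.findall(r"[a-z]+", name.lower())

def normalize(pron):
    # Strip the numeric stress markers: "B AA1 B" --> "B AA B"
    return re.sub(r"\d", "", pron)

def phonetic_spelling(name):
    # Join the per-word pronunciations into one normalized string;
    # return None if any word is missing from the dictionary.
    words = tokenize(name)
    if not all(w in PRON for w in words):
        return None
    return " ".join(normalize(PRON[w]) for w in words)

def degeminate(pron):
    # Optional relaxation: collapse doubled phones ("T T" --> "T"),
    # as running speech would.
    phones = pron.split()
    if not phones:
        return pron
    out = [phones[0]]
    for p in phones[1:]:
        if p != out[-1]:
            out.append(p)
    return " ".join(out)

people = ["Bob Hope"]                 # stand-in for T, the personality list
creatures = ["Saber-toothed tiger"]   # stand-in for C, the creature list

for person in people:
    for creature in creatures:
        p, c = phonetic_spelling(person), phonetic_spelling(creature)
        if p and c and (p == c or degeminate(p) == degeminate(c)):
            print(person, "<-->", creature)
```

The `degeminate` comparison is the relaxed matching mentioned above; the strict `p == c` check alone may be too brittle.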
I've done a bit of a "draw the rest of the owl" trick here, of course, because much of the challenge is simply coming up with T and C, our lists of candidate TV personalities and creatures. I tried using BERT in masking mode to fill in blanks like "The paleontologists discovered a rare complete [MASK] skeleton last month" in hopes of generating a list of suitable creatures, but I found it really difficult to tune these prompts in a way that produced good answers without a lot of bad ones. We've had success with BERT's masked mode before, like in this puzzle, but this time I pivoted to an LLM. In the end, for the creatures list, I found a few lists online and cobbled together a list of only about 30 creatures; for the list of TV personalities, I ran a few ChatGPT queries and eventually had a list of about 250 names.
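For the record, the masked-prompt experiment looked roughly like this: a sketch using the Hugging Face transformers fill-mask pipeline. I'm not reproducing my exact prompts or model choice here, so bert-base-uncased is an assumption.

```python
# Sketch of the BERT masked-prompt approach (which didn't pan out).
# Assumes the Hugging Face transformers library is installed;
# bert-base-uncased is a stand-in for whichever checkpoint was used.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
prompt = "The paleontologists discovered a rare complete [MASK] skeleton last month."
for pred in fill(prompt, top_k=10):
    # Each prediction carries a token string and a score; in practice many
    # of the high-scoring tokens aren't creatures at all, which was the problem.
    print(f"{pred['token_str']}\t{pred['score']:.3f}")
```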
You can see my work on GitHub; you'll also want my list of TV personalities (the list of creatures is much shorter and hard-coded). As these puzzles are always one-off solutions, I rarely optimize the script for efficiency--it's simply a race to get a solution. This script could benefit from some TLC (feel free to push your changes), but I'm happy we got a solution. The deadline for submissions has passed, so click below if you just want to see my answer. See you next week!