Now that the submission deadline has passed, it's time to post the solution. Spoiler warning: solution below! First, let's take another look at this week's Sunday Puzzle:
This week's challenge is a spinoff of my on-air puzzle, and it's a little tricky. Think of a hyphenated word you might use to describe a young child that sounds like three letters spoken one after the other.
Okay, we'd better step back and look at the on-air challenge. Here is the description:
Every answer today is a word or name that sounds like it starts with two spoken letters of the alphabet.
Example: Wanting what other people have --> ENVIOUS (N-V-ous)
The full on-air challenge has more examples, and you can read or listen at the link above.
So we need a list of candidate words, then we can check that they fit the rules: hyphenated, describes a young child, sounds like three letters spoken one after the other.
This week, I didn't come up with a script that gives a solution outright, but I did write a script to help me generate a list of words and then narrow it down to a number that I could manually search through.
My script uses word2vec: I give it a seed word, and it gives me the top k most similar words, where similarity is based on the contexts in which the words appear. Given that we have some specific form requirements (3 syllables, hyphenated), we probably want to generate a really large list of candidates, then filter it down to only those matching the form requirements. In other words, we are prioritizing recall over precision here. This means we need quite a few seed words.
I struggled to come up with seed words on my own, so I decided to use BERT in masking mode: I gave it a few "fill in the blank" phrases, and it returned predictions for the blanks:
- 'that little kid is such a [MASK]'
- 'play with the [MASK] toddler'
- etc.
This gave me seed words such as 'active', 'adorable', 'adventurous', 'angel', 'angry', 'animated', 'annoying', 'anxious', 'athletic', 'beauty', 'blessing', etc.
The seed words I used are coded into the script, so you can see the full list there if you'd like. The list is 125 words long.
I didn't pursue the phonemic dictionary approach that I discussed in the preview post. Instead, I went for a more "quick and dirty" approach. I generated a list of candidates from the seed words. Then I filtered out any results without a hyphen. My first attempt resulted in over 10,000 hyphenated words. This was too many to manually read through to find a solution, especially because it really takes a moment to read each one and consider whether the pronunciation "sounds like three letters spoken one after another."
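To make the "generate, then keep only hyphenated words" step concrete, here is a minimal sketch. It is not the actual script: the function name is made up for illustration, and it assumes each model exposes a gensim-style `most_similar(word, topn=...)` method that returns `(word, score)` pairs and raises `KeyError` for words it doesn't know.

```python
def hyphenated_candidates(models, seeds, topn=300):
    """Collect neighbors of every seed word and keep only hyphenated ones.

    `models` is any iterable of objects with a gensim-style
    most_similar(word, topn=...) method (an assumption of this sketch).
    """
    candidates = set()
    for model in models:
        for seed in seeds:
            try:
                neighbors = model.most_similar(seed, topn=topn)
            except KeyError:
                # The seed is missing from this model's vocabulary; skip it.
                continue
            for word, _score in neighbors:
                word = word.lower()
                # Require a hyphen somewhere inside the word, not just at an edge.
                if "-" in word.strip("-"):
                    candidates.add(word)
    # A sorted list is easier to skim by eye than an unordered set.
    return sorted(candidates)
```

Collecting into a set means a word suggested by several seeds only shows up once, which matters when the goal is a list short enough to read by hand.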
In skimming through this long alphabetical list of candidates, I realized that some spans of words could be pruned out smartly with just a little more work, simply based on the first couple of letters. For example, no letter has a "name" that is pronounced with a consonant cluster. I decided to go through the alphabet, letter by letter, and for each letter list all the letter sequences that might possibly start a spelling that sounds like that letter. Then I could filter out any candidates that don't start with one of these allowed sequences. For example:
- A: "a", "ei"
- B: "be", "bi"
- C: "ce", "ci", "se", "si"
- D: "de", "di", "dy"
- etc.
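A prefix filter along these lines might look like the sketch below. The table holds only the four letters shown above, so it is deliberately incomplete (a real run would need an entry for every letter), and the names `LETTER_ONSETS` and `prune` are hypothetical, not taken from my script.

```python
# Partial table: only the A-D examples listed above. With just these entries,
# words starting like any other letter name are rejected too.
LETTER_ONSETS = {
    "A": ("a", "ei"),
    "B": ("be", "bi"),
    "C": ("ce", "ci", "se", "si"),
    "D": ("de", "di", "dy"),
}

# Flatten to one tuple, since str.startswith accepts a tuple of prefixes.
ALLOWED_PREFIXES = tuple(
    prefix for prefixes in LETTER_ONSETS.values() for prefix in prefixes
)


def prune(candidates, prefixes=ALLOWED_PREFIXES):
    """Keep candidates whose spelling starts with an allowed letter-name onset."""
    return [word for word in candidates if word.lower().startswith(prefixes)]


if __name__ == "__main__":
    # "stone-cold" starts with a consonant cluster, so it is dropped.
    print(prune(["big-boned", "stone-cold", "denim-clad"]))
```

This check only looks at the start of the word, so it can rule candidates out but never confirm one; everything that survives still has to be read aloud and judged by ear.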
I had the script run word2vec for my seed words with four different pre-trained models. This is probably overkill, but it's better to cast a wide net here. The rationale for using different models is that some are trained on newspaper text, others are trained on Twitter, etc. There might be words that appear on Twitter but not in news articles, or vice versa. With each of these four models providing the top 300 most similar words to each of the 125 seed words, and after running the results through the filters described above, 477 words remained. Most of them make no sense at all, and very few of them come anywhere close to sounding like three letters. Here are 20 of the most reasonable from that list, and the solution is among them:
- beauty-queen
- big-boned
- bitter-sweet
- cared-for
- cat-eyed
- city-born
- curly-headed
- cutie-pie
- denim-clad
- diva-esque
- gimlet-eyed
- girl-child
- pencil-thin
- pet-loving
- pie-eyed
- pink-cheeked
- pint-sized
- pixie-like
- teddy-bear
- teeny-bopper
Did you spot the solution in that list? If so, you probably let out a little groan, right? I'm glad I didn't go the full phonemic dictionary route, because I wouldn't have caught the solution that way. The instructions did mention that this one would be tricky, but using a Greek letter is a bit of a stretch!
Thanks for reading, Puzzlers. I'll see you for the next one!
--Levi King