Tuesday, January 26, 2021

Sound it out (Preview)

New week, new Sunday Puzzle:

This week's challenge is a spinoff of my on-air puzzle, and it's a little tricky. Think of a hyphenated word you might use to describe a young child that sounds like three letters spoken one after the other.

Okay, we'd better step back and look at the on-air challenge. Here is the description:

Every answer today is a word or name that sounds like it starts with two spoken letters of the alphabet.

Example: Wanting what other people have --> ENVIOUS (N-V-ous)

The full on-air challenge has more examples, and you can read or listen at the link above.

The weekly challenge is slightly different, however. Let's break it down.

hyphenated word you might use to describe a young child

As usual, we need a list of candidate words. They must be hyphenated and appropriate for describing a young child. In the past we've used word2vec: we give it a "seed" word and it gives us the top k similar words. We can brainstorm some seed words and do this again... precocious, mischievous, baby-faced, etc.

sounds like three letters spoken one after the other

Whew, okay, this looks tricky. What we really need is a phonemic dictionary (often referred to as a phonetic dictionary, but that's technically a misnomer). We'd also need a list of the phonemic transcriptions of each letter of the alphabet. We would query the phonemic dictionary with each of our candidate words to get its phonemic transcription, then check that the transcription contains exactly three syllables and that each of those syllables appears in our list of letter transcriptions. I'm not sure what's out there in terms of free, publicly available phonemic dictionaries, so I'll have to poke around a little.
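Here is a minimal sketch of that check, under some loud assumptions. The letter names below are typed by hand in ARPAbet-style symbols (the notation used by the CMU Pronouncing Dictionary, which is one freely available option), they reflect one plausible American pronunciation, and the function name spell_out is made up for this post. The sketch matches whole letter names rather than counting syllables, and it takes a phoneme list directly; a real dictionary lookup would have to supply that list, with any stress digits stripped first.

```python
# Letter names in ARPAbet-style symbols, typed by hand (an assumption, not
# pulled from any dictionary). W is left out on purpose.
LETTER_NAMES = {
    "A": "EY", "B": "B IY", "C": "S IY", "D": "D IY", "E": "IY",
    "F": "EH F", "G": "JH IY", "H": "EY CH", "I": "AY", "J": "JH EY",
    "K": "K EY", "L": "EH L", "M": "EH M", "N": "EH N", "O": "OW",
    "P": "P IY", "Q": "K Y UW", "R": "AA R", "S": "EH S", "T": "T IY",
    "U": "Y UW", "V": "V IY", "X": "EH K S", "Y": "W AY", "Z": "Z IY",
}


def spell_out(phonemes, n_letters=3):
    """Return every way to read `phonemes` as exactly `n_letters` letter names."""
    results = []

    def walk(pos, chosen):
        if len(chosen) == n_letters:
            # Only accept the split if it used up the whole transcription.
            if pos == len(phonemes):
                results.append("-".join(chosen))
            return
        for letter, name in LETTER_NAMES.items():
            parts = name.split()
            if phonemes[pos:pos + len(parts)] == parts:
                walk(pos + len(parts), chosen + [letter])

    walk(0, [])
    return results


# Hand-typed phoneme sequences, for illustration only.
print(spell_out("EH N V IY".split(), n_letters=2))  # the N-V of the on-air example
print(spell_out("K EY JH IY B IY".split()))         # three letter names in a row
print(spell_out("B EY B IY".split()))               # "baby": no match
```

An exact match like this is strict. Real transcriptions often reduce vowels, so a looser comparison (for example, treating unstressed vowels as interchangeable) would probably be needed before running this over actual dictionary entries.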

But what if we can't come up with a phonemic dictionary? It's not all bad. The way I read it, this rule does mean that the target word contains exactly three syllables. That's helpful. And how many possible combinations of three letters are there?

Well, we have three positions, with 26 possibilities for each position, so that's 26^3 = 17,576.

But we can do better than that. Can all 26 letters work here? There's at least one exception: w. Its name, "double-u," is three syllables all by itself, which is a problem here. So that leaves us with 25^3 = 15,625. Well, that's a little better, but still too many to work through manually. With a smaller number, perhaps we could just generate all those possibilities, then sit down and read through them out loud until we find a pronunciation that makes sense. And in fact, maybe we could reduce the total number by thinking more carefully about the 25 letter pronunciations; we may determine that certain sequences do not occur in English.
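For what it's worth, generating those possibilities takes only a few lines. This is just a sketch of the brute-force list; the variable names are made up here.

```python
from itertools import product
from string import ascii_uppercase

# Every letter except W, whose name is too long to fit.
letters = [c for c in ascii_uppercase if c != "W"]

# All ordered three-letter sequences, repeats allowed.
triples = ["-".join(t) for t in product(letters, repeat=3)]

print(len(letters), len(triples))  # 25 15625
print(triples[:3])
```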

More likely, without a phonemic dictionary, we'd approach from the other end: generate a list of candidate words, keep only those with hyphens, then read through them manually to find one that fits. Given that we expect three syllables, we could probably filter our candidates further by number of letters first. I would guess, for example, that the target word contains between 5 and 11 letters.
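That filter might look something like the sketch below. The candidate list is a hand-typed stand-in for whatever word2vec returns, the function name keep_candidate is hypothetical, and the 5 to 11 range is only my guess from above.

```python
def keep_candidate(word, min_letters=5, max_letters=11):
    """Keep hyphenated words whose letter count (hyphens excluded) is in range."""
    n_letters = sum(ch.isalpha() for ch in word)
    return "-" in word and min_letters <= n_letters <= max_letters


# Hand-typed stand-ins for real candidates, for illustration only.
candidates = ["precocious", "mischievous", "baby-faced",
              "happy-go-lucky", "well-behaved"]

shortlist = [w for w in candidates if keep_candidate(w)]
print(shortlist)  # ['baby-faced', 'well-behaved']
```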

Good luck, Puzzlers! I'll see you on Friday, and I hope to have a solution by then!

--Levi King
