Happy Tuesday, Puzzlers. Let's get cracking on the latest Sunday Puzzle from NPR:
This week's challenge comes from listener Ari Carr, of Madison, Wis. Name a form of musical composition. If you say the word quickly, you'll name something, in two words, that you might buy in a music store. What is it?
Oh boy, this one seems a little more challenging than the string-manipulation puzzles we often deal with here. The good news is that we'll probably want to dip a little deeper into our NLP toolbox to solve it.
Let's break down what we need and how we might obtain it here:
- C: a list of forms of musical composition;
  - I'm assuming this means things like concerto, aria, song, hymn, symphony.
  - How do we get this list?
    - We might find one on Wikipedia or elsewhere online; it's worth searching for.
    - We can generate one by querying Word2Vec for words similar to concerto, etc.
    - We can use BERT or SBERT in mask mode, where it fills in a blank ("[MASK]"). So we'd feed it some sentences like:
      - The prolific composer wrote dozens of [MASK] in his lifetime.
      - I could hear an upbeat [MASK] playing softly over the stereo downstairs.
- B: a list of things one can buy in a music store, in two words;
  - Potential pitfalls:
    - What's a "music store"? A record store? A musical instrument store? I think we'd better assume it could be either.
    - "in two words": This looks tricky, because it could mean terms like "bass guitar" but it could also include things like "a guitar".
  - We can try the Word2Vec and SBERT methods for this list too (I need to check that I can get two-word terms from these tools).
- "If you say the word (in C) quickly, you'll name something, in two words" in B.
  - This is the transformation function, like we see in about 90% of these puzzles.
  - This time, however, the transformation applies not to the spelling string but to its corresponding pronunciation string. In other words, we need a pronunciation dictionary.
  - Here's an example of a past post where we've used the CMU Pronouncing Dictionary. I plan to use it again for this puzzle.
  - Regarding the "say the word quickly" bit, I plan to simply ignore the vowels in the pronunciation strings, so we'll also want a function like strip_vowels(pronunciation).
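Here is a minimal sketch of what a strip_vowels helper might look like. It assumes pronunciations arrive as lists of ARPAbet phonemes with stress digits on the vowels, which is the format the CMU Pronouncing Dictionary uses. The sample pronunciation is typed by hand for illustration and should be checked against the real dictionary.

```python
# ARPAbet vowel symbols as used by the CMU Pronouncing Dictionary.
# In the dictionary, vowels carry a trailing stress digit (0, 1, or 2).
ARPABET_VOWELS = {
    "AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
    "EY", "IH", "IY", "OW", "OY", "UH", "UW",
}


def strip_vowels(pronunciation):
    """Return the consonant phonemes of a pronunciation, in order."""
    consonants = []
    for phoneme in pronunciation:
        base = phoneme.rstrip("012")  # drop the stress marker, if any
        if base not in ARPABET_VOWELS:
            consonants.append(base)
    return consonants


# Hand-typed, CMUdict-style pronunciation (illustrative only).
guitar = ["G", "IH0", "T", "AA1", "R"]
print(strip_vowels(guitar))  # ['G', 'T', 'R']
```

Note that ER counts as a vowel here, so the "r" sound in an unstressed "-er" disappears along with it; whether that helps or hurts the matching is worth checking on real data.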
That covers the setup. Assuming we have the resources above, let's look at how we'd use them to arrive at a solution. Here's some pseudo code:
- for x in C:
  - qp_x = pronunciation[x] # "qp" for "quick pronunciation"
  - qp_x = strip_vowels(qp_x)
  - for y in B:
    - qp_y = pronunciation[y]
    - qp_y = strip_vowels(qp_y)
    - if qp_y == qp_x:
      - print(x, y)
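The pseudocode above could be fleshed out along these lines. This is a sketch under several assumptions: `pronunciations` is a hypothetical word-to-phoneme-list mapping with one pronunciation per word (the real CMU dictionary can list several), a two-word term is pronounced by concatenating its words, and terms containing unknown words are skipped. It also indexes B by consonant skeleton instead of nesting the loops, which finds the same matches. The helper names and the toy data are made up for illustration and have nothing to do with the puzzle's real lists.

```python
from collections import defaultdict

VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}


def strip_vowels(phonemes):
    """Keep only consonant phonemes (stress digits removed)."""
    return [p.rstrip("012") for p in phonemes if p.rstrip("012") not in VOWELS]


def quick_pronunciation(term, pronunciations):
    """Consonant skeleton of a term, or None if any word is unknown."""
    phonemes = []
    for word in term.lower().split():
        if word not in pronunciations:
            return None
        phonemes.extend(pronunciations[word])
    return tuple(strip_vowels(phonemes))


def find_matches(C, B, pronunciations):
    """Return (x, y) pairs whose consonant skeletons are identical."""
    by_skeleton = defaultdict(list)
    for y in B:
        qp_y = quick_pronunciation(y, pronunciations)
        if qp_y:  # skip unknown terms and vowel-only terms
            by_skeleton[qp_y].append(y)
    matches = []
    for x in C:
        qp_x = quick_pronunciation(x, pronunciations)
        if qp_x:
            matches.extend((x, y) for y in by_skeleton.get(qp_x, []))
    return matches


# Toy data, hand-typed in CMUdict style (not the puzzle's real lists).
pronunciations = {
    "maybe": ["M", "EY1", "B", "IY0"],
    "may": ["M", "EY1"],
    "be": ["B", "IY1"],
    "bass": ["B", "EY1", "S"],
    "guitar": ["G", "IH0", "T", "AA1", "R"],
}
print(find_matches(["maybe"], ["may be", "bass guitar"], pronunciations))
# [('maybe', 'may be')]
```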
It's possible we'll get some false positives here, so we'll print out any matches and see if they make sense. If we don't get any matches, we may want to relax the matching. Instead of looking for a perfect match, maybe we want something more like an edit distance. I'll probably keep it simple: check that most of the phonemes (minus vowels) in x also appear in y (and vice versa).
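One way the relaxed check might look, as a sketch and not necessarily what the final solution will use: count the consonant phonemes the two skeletons share (ignoring order) and require that the shared count covers most of each side. The function names and the 0.75 threshold are hypothetical placeholders.

```python
from collections import Counter


def overlap_score(qp_x, qp_y):
    """Shared-phoneme fraction, measured against the longer skeleton."""
    if not qp_x or not qp_y:
        return 0.0
    # Multiset intersection: each phoneme counts as often as both sides have it.
    shared = sum((Counter(qp_x) & Counter(qp_y)).values())
    return min(shared / len(qp_x), shared / len(qp_y))


def is_loose_match(qp_x, qp_y, threshold=0.75):
    """True if most phonemes in each skeleton also appear in the other."""
    return overlap_score(qp_x, qp_y) >= threshold


# Skeletons differing by one extra consonant still pass at 0.75.
print(overlap_score(["S", "P", "R", "T"], ["S", "P", "R", "T", "S"]))  # 0.8
print(is_loose_match(["S", "P", "R", "T"], ["G", "T", "R"]))  # False
```

Because this ignores phoneme order, it can let anagram-like false positives through; an order-aware measure such as edit distance would be stricter.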
Good luck! I'll be back with the solution and my approach implemented in Python after the Thursday deadline for submissions.
--Levi