Natural Language Puzzling: March 2024

Wednesday, March 27, 2024

Element names spelled with element symbols

Hi Puzzlers! Let's tackle this week's Sunday Puzzle from NPR.

This week's challenge comes to us from Mae McAllister, from Bath, in the United Kingdom. As you may know, each chemical element can be represented by a one or two-letter symbol. Hydrogen is H, helium is He, and so on. McAllister points out that there are two commonly known elements whose names each can be spelled using three other element symbols. Name either one.

Looks easy, right? The vocabulary of element symbols (H, He, etc.) is a closed class, which makes things much easier. I'm not certain, but for now we'll assume that "commonly known elements" means the element names as they appear on the periodic table, which is also a closed class.

Let's break down the components we need here:

element_dict: We'll need a dictionary like {"helium": "He", ... } in order to make sure that we don't try to spell an element name using its own symbol. We can also use this dictionary to derive these lists, S and N:
S: the list of element symbols

['H', 'He', 'Li', 'Be', ... ]

N: the list of element names

['hydrogen', 'helium', 'lithium', ... ]
Note that the name must be spell-able with exactly 3 element symbols, meaning the name has between 3 and 6 letters (because symbols are 1 or 2 letters). This means we should remove any elements with a name longer than 6 letters.
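
To make the setup concrete, here's a minimal sketch of deriving S and N and applying the length filter. The element_dict below is a hand-typed subset for illustration only (an assumption on my part); a real run would need an entry for every element on the periodic table.

```python
# Illustrative subset only; the full dictionary would cover every element.
element_dict = {
    "hydrogen": "H",
    "helium": "He",
    "lithium": "Li",
    "beryllium": "Be",
    "neon": "Ne",
    "xenon": "Xe",
}

# S: every element symbol, in dictionary order.
S = list(element_dict.values())

# N: only names of 3 to 6 letters, since three symbols of
# 1 or 2 letters each cannot spell anything longer or shorter.
N = [name for name in element_dict if 3 <= len(name) <= 6]

print(S)
print(N)
```

With this subset, only helium, neon and xenon survive the length filter.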

check_symbols(element_name): For the main loop of our script, we'll want to check each name in N with a function that takes an element name and determines whether it can be spelled from the vocabulary in S.

We know that the solution uses exactly 3 element symbols to spell an element name, and that the symbols used are from other elements.

For example, we can't solve by spelling xenon with Xe, No, N because Xe is the symbol for xenon.
So we'll use Python's itertools.permutations to get all permutations of three symbols. Then, for each element name we process, we can immediately toss any permutation that contains the element's own symbol by checking element_dict.

Then we iterate through the subset of permutations for each element name. Each time we find a match, we'll replace the symbol substring with a matching number of underscores (1 or 2), and if we find a permutation of symbols that replaces all the letters in the name with underscores, we've found a solution.

It's important we use permutations rather than combinations here, because some symbols may have overlapping letters, so the results of replacing all the substrings can differ depending on the order of the symbols.
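
Here's a rough sketch of that check. Note that it takes a shortcut compared to the underscore replacement described above: it simply joins each ordered triple of symbols and compares the result to the name. The tiny element_dict and the exclude_own flag are assumptions added for illustration, so treat this as one possible shape for check_symbols rather than the actual script.

```python
from itertools import permutations

# Illustrative subset only; the full dictionary would cover every element.
element_dict = {
    "hydrogen": "H",
    "helium": "He",
    "nitrogen": "N",
    "oxygen": "O",
    "neon": "Ne",
    "xenon": "Xe",
    "nobelium": "No",
}

def check_symbols(element_name, element_dict, exclude_own=True):
    """Return the first ordered triple of symbols spelling element_name, or None."""
    own = element_dict.get(element_name)
    for triple in permutations(element_dict.values(), 3):
        # Skip any triple that uses the element's own symbol.
        if exclude_own and own in triple:
            continue
        # Order matters: joining ('Xe', 'No', 'N') differs from ('N', 'No', 'Xe').
        if "".join(triple).lower() == element_name:
            return triple
    return None

# Without the own-symbol rule, xenon can be spelled; with it, it cannot.
print(check_symbols("xenon", element_dict, exclude_own=False))
print(check_symbols("xenon", element_dict))
```

One caveat: itertools.permutations never repeats a symbol within a triple, so a spelling that needs the same symbol twice would be missed by this sketch.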

You can find my Python implementation here, and I'll be back after the Thursday deadline to share my solution. Good luck!

March 30, 2024 Update

The Thursday deadline has passed, so here's my solution:

See you next week!

Monday, March 18, 2024

Two Trees

It's Monday, and you know what that means! Let's tackle yesterday's NPR Sunday Puzzle:

This week's challenge: Our challenge comes from Emma Meersman of Seattle, Washington: Take two three-letter tree names and combine them phonetically to get a clue for a type of fabric, then change one letter in that word to get something related to trees. Your answer should be the two tree names you started with.

This looks like another tricky one. Let's break it down.

T: a list of three-letter tree names

oak, ash, elm, fir ...

"combine them phonetically"

We'll need a phonological dictionary and probably a phonetic model
The dictionary will map the orthography of each tree name to a phonetic spelling (i.e., IPA or SAMPA)

We've solved puzzles like this before using the Carnegie Mellon University Pronouncing Dictionary, and we'll likely use it again here. Here's an example.

The model will take a phonetic spelling and map it to a list of possible matching (orthographic) spellings
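
As a toy illustration of the dictionary-plus-reverse-lookup idea, here's a sketch that maps a tree name to its phonemes and then back to every spelling sharing that pronunciation. The PRONUNCIATIONS table is a small hand-typed stand-in (ARPAbet-style, stress markers dropped), not the real CMU dictionary, and the function name is made up for this example.

```python
from collections import defaultdict

# Hand-typed stand-in for a pronouncing dictionary (assumption, not the CMU data file).
PRONUNCIATIONS = {
    "oak": ("OW", "K"),
    "ash": ("AE", "SH"),
    "fir": ("F", "ER"),
    "fur": ("F", "ER"),
    "yew": ("Y", "UW"),
    "you": ("Y", "UW"),
    "ewe": ("Y", "UW"),
}

# Reverse index: phoneme sequence -> all spellings with that pronunciation.
spellings_by_sound = defaultdict(list)
for word, phonemes in PRONUNCIATIONS.items():
    spellings_by_sound[phonemes].append(word)

def homophones(word):
    """Return other spellings that share the word's pronunciation, sorted."""
    phonemes = PRONUNCIATIONS[word]
    return sorted(w for w in spellings_by_sound[phonemes] if w != word)

for tree in ["oak", "fir", "yew"]:
    print(tree, homophones(tree))
```

For the puzzle itself we'd go a step further and concatenate the phoneme sequences of two trees before looking for matching spellings, but the lookup works the same way.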

c: "a clue for a type of fabric"

This is quite vague, but I think we could plug in a language model here and score candidates. Among other models, we've used the Hugging Face GPT-2 implementation and the PyTorch pretrained BERT for problems like this. This time I'll be using BERT, feeding it template sentences with candidate words filling in the blanks, e.g.:

What kind of fabric is _______ ?
Is herringbone more _______ than twill?
We can describe textiles as plain weave, satin weave or using other characteristics, like _______ fabrics.
I'm looking for a _______ purse for my sister, so should I choose denim, leather, houndstooth or linen?

The model will produce a probability score for each sentence, so we'll get an average for the candidate words and print the best scoring candidates out for us to review.
NOTE: It's not super clear from the way this puzzle is worded, but the clue here is not the "type of fabric" per se, but rather a clue that would lead us to it. For example, if the clue were "flax fiber", this could lead us to "linen", which we would then take as the input to the "something related to trees" portion of the puzzle.
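
The template scoring described above boils down to "fill in the blank, score each sentence, average, sort". Here's a sketch of just that plumbing. The scorer is deliberately a dummy (it is NOT BERT and its numbers mean nothing); in practice you'd pass in a function that returns the language model's score for a sentence. All names here are hypothetical.

```python
from statistics import mean

# Two of the templates, with {} marking the blank.
TEMPLATES = [
    "What kind of fabric is {} ?",
    "Is herringbone more {} than twill?",
]

def rank_candidates(candidates, templates, score_sentence):
    """Average each candidate's score over all templates, best first."""
    averages = {
        word: mean(score_sentence(t.format(word)) for t in templates)
        for word in candidates
    }
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)

def dummy_score(sentence):
    """Placeholder scorer (NOT a language model): shorter sentences score higher."""
    return 1.0 / len(sentence)

for word, score in rank_candidates(["wool", "corduroy"], TEMPLATES, dummy_score):
    print(f"{word}: {score:.4f}")
```

Keeping the scorer as a parameter means the same ranking code can be reused with a different model or a different set of templates.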

s: "something related to trees"

c above should lead us to a type of fabric
If we change one letter from the type of fabric, we get something related to trees.
I think we'll need to manually make the leap from the clue to the type of fabric, at which point we'll mentally do some letter swapping to confirm that we get something related to trees.
If we can't get this mentally, we can simply iterate through each letter in the word, then iterate through each letter of the alphabet and make the change.

We'll check an English vocabulary to keep only the words that appear there.
Then we'll pass the candidates through our BERT model, much like we did above:

_______ always makes me think of trees.
Arborists study everything about trees, including their distribution, reproduction, pathologies, ________, timber, growth rates, etc.
The topic of ______ came up in my Forestry class when we discussed trees.
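
The letter-swapping step above (before the BERT pass) is simple enough to sketch. This assumes the vocabulary is just a Python set of lowercase words; the four-word vocabulary below is a toy stand-in, and the function name is made up for this example.

```python
from string import ascii_lowercase

def one_letter_variants(word, vocabulary):
    """Return vocabulary words that differ from word in exactly one position."""
    variants = set()
    for i, original in enumerate(word):
        for letter in ascii_lowercase:
            if letter == original:
                continue  # swapping a letter for itself changes nothing
            candidate = word[:i] + letter + word[i + 1:]
            if candidate in vocabulary:
                variants.add(candidate)
    return sorted(variants)

# Toy vocabulary for illustration; a real run would load a full English word list.
toy_vocabulary = {"linen", "liner", "lines", "lemon"}
print(one_letter_variants("linen", toy_vocabulary))
```

Each word of length n produces at most 25 × n candidates, so this step is cheap even with a large vocabulary.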

I'm doubtful that we can write a script to perfectly solve this one by spitting out a single solution, because it's a pretty complex problem. This is going to be more of a HITL (human in the loop) approach, and we may have to run the script iteratively and manually review the output at each stage. You can view my script here on GitHub. Okay Puzzlers, good luck and I'll be back after the submission deadline with my solution!

March 23, 2024 Update

The deadline has passed, so here's my solution:

See you next week!

Monday, March 11, 2024

Body parts

Note: The Natural Language Puzzling blog is wishing a speedy recovery for Puzzle Master (and fellow IU alum) Will Shortz. Rest up and get well, Will!

It's Monday, so let's tackle yesterday's Sunday Puzzle from NPR:

This week's challenge: Take a body part, add a letter at beginning and end to get another body part, then add another letter at beginning and end to get something designed to affect that body part.

Wow, this one's pretty complicated! What do we need to solve this problem?

B: a list of body parts

based on the puzzle text, I think it's safe to assume each item in this list is a single word

get_pairs(B): For each item in B, we can check every other item in B to see if the first item appears as a substring within the second item;

if so, and if the second item has exactly one additional letter before and one after the first item, we return this pair as a candidate for the solution
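
A minimal sketch of get_pairs, assuming B is a plain list of lowercase single-word strings (the short list below is only for illustration):

```python
def get_pairs(body_parts):
    """Return (inner, outer) pairs where outer is inner plus one letter on each end."""
    pairs = []
    for inner in body_parts:
        for outer in body_parts:
            # outer must be exactly two letters longer, with inner in the middle.
            if len(outer) == len(inner) + 2 and outer[1:-1] == inner:
                pairs.append((inner, outer))
    return pairs

# Toy list for illustration; a real run would use a much longer list of body parts.
B = ["ear", "heart", "arm", "chin", "hip"]
print(get_pairs(B))
```

Checking the length and the middle slice directly is a bit stricter than a general substring test, which is what we want here.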

add_letters(<string>): a function that takes a word (in our case it's a body part) and adds a letter to the beginning and end

we'll brute force this, so the function iterates through every ordered pair of letters (one to prepend, one to append)
it will check a vocabulary to see if the resulting string is a word

if not, we abandon the string
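
Here's one way add_letters could look, assuming the vocabulary is passed in as a set of lowercase words (that second parameter and the toy vocabulary are assumptions for this sketch):

```python
from itertools import product
from string import ascii_lowercase

def add_letters(word, vocabulary):
    """Return every vocabulary word formed by adding one letter to each end of word."""
    results = []
    # 26 x 26 = 676 ordered pairs, including doubled letters like ('s', 's').
    for first, last in product(ascii_lowercase, repeat=2):
        candidate = first + word + last
        if candidate in vocabulary:
            results.append(candidate)
    return results

# Toy vocabulary for illustration only.
toy_vocabulary = {"heart", "learn", "pearl", "years", "tear"}
print(add_letters("ear", toy_vocabulary))
```

I've used itertools.product rather than itertools.permutations here because the letter added at the front may be the same as the one added at the end.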

V: a vocabulary

"something designed to affect that body part" is a little too vague to narrow this down further, especially since we don't even know what "that body part" is going to be

score(): we'll need a scoring function to determine if the final string makes any sense as "something to affect that body part"

Note that at this point, we'll have the body part that we're looking to match with a "something"; let's call this body part b
I'm thinking we'll just use a BERT model in masking mode; i.e., we'll use a few sentence templates like:

I got a new ____ for my <b>
I was reading about the effect of _____ on the <b> yesterday

The model will give us a score for each of these sentences with the candidate words filled in, so we'll take the average score and expect that the solution will be the highest scoring candidate or at least among the highest scoring.

That's going to be my approach. Do you have other ideas? I'll be back with my solution (or lack thereof) after the Thursday submission deadline. In the meantime, you can try my (partially working) Python script if you want some help.

March 16, 2024 Update

The deadline has passed, so here's my solution:

If you tried my script, you'll notice that it didn't produce a single, definitive solution. However, it does include the solution among a list of potential solutions. The tricky thing about this solution is that it isn't a single word, so we have to tokenize the string into two words, and my script wasn't really expecting this or equipped to handle it.

See you next week!

Monday, March 04, 2024

Nobel Peace Prize winners

It's Monday, so let's take a shot at yesterday's Sunday Puzzle from NPR:

This week's challenge comes from listener Anjali Tripathi of Los Angeles, California. Take the last name of a Nobel Peace Prize winner. Remove the middle three letters and duplicate the last two letters to get the first name of a different Nobel Peace Prize winner. What are those two names? Again, take a Nobel Peace Prize winner's last name, remove the middle three letters and duplicate the last two letters, get the first name of another Nobel Peace Prize winner.

If you solve the Sunday Puzzle regularly, you're probably thinking this one will be easy.

Why? Closed class! These puzzles generally ask us to take a thing from one class, perform some transformation on it and get a thing from some other class. In some cases, these classes are open: e.g., English girls' names or hit pop songs. In this case, we have a closed class--we can easily look up the list of Nobel Peace Prize winners and be certain that the solution is in that list.

So how do we solve this? We need:

P: a list of all Nobel Peace Prize winners
f: a function to take the last name of a winner, remove the middle three letters and duplicate the last two letters, and return the string

Easy peasy--we put together a Python script to iterate through the list of names, take the last name and apply the transformation, then check the result against the first names in the list.
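
A sketch of f and the loop around it might look like this. A couple of assumptions on my part: winners are given as (first name, last name) tuples, and a last name only has a well-defined "middle three letters" if its length is odd and at least 5. The function names are made up for this example.

```python
def transform(last_name):
    """Remove the middle three letters, then duplicate the last two letters."""
    n = len(last_name)
    # An even-length or very short name has no unambiguous middle three letters.
    if n < 5 or n % 2 == 0:
        return None
    start = (n - 3) // 2
    trimmed = last_name[:start] + last_name[start + 3:]
    return trimmed + trimmed[-2:]

def solve(winners):
    """Return (source winner, target winner) pairs that satisfy the puzzle."""
    by_first_name = {first.lower(): (first, last) for first, last in winners}
    matches = []
    for first, last in winners:
        candidate = transform(last.lower())
        target = by_first_name.get(candidate)
        if target is not None and target != (first, last):
            matches.append(((first, last), target))
    return matches

# "abcdefg" loses "cde", leaving "abfg", and then gains a second "fg".
print(transform("abcdefg"))
```

To run the full search, pass solve the complete list of laureates as (first, last) tuples.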

I've uploaded my script to GitHub, here. I'll be back after the submission deadline to share my solution. Good luck!

March 8, 2024 Update

The deadline has passed, so here's my solution:

See you next week!
