Lightroom fails to normalize unicode on searches, so that seemingly identical search strings give different results.
I was searching for all photos with the word "Médano" in the caption. It turns out that there are (at least?) two ways to encode this word in Unicode (UTF-8). Lightroom treats these two encodings as different words, and doesn't return one when you search for the other. This is a subtle issue, but those who are using non-English languages will probably run into it sooner or later.
The two ways to "spell" Médano are "Médano" and "Médano" (I have no idea if these are distinct once pasted into this forum).
You can see the difference using the hexdump command:
~$ echo 'Médano' | hexdump -C
00000000 4d 65 cc 81 64 61 6e 6f 0a |Me..dano.|
00000009
~$ echo 'Médano' | hexdump -C
00000000 4d c3 a9 64 61 6e 6f 0a |M..dano.|
00000008
The difference is explained at Wikipedia. I don't fully understand all of this, but the first form seems to be what I get when I copy from a pdf, and the second form is what I get when I type the word using the standard MacOS English keyboard.
I guess my only suggestion is that if some searches involving accented characters are not giving you the results you expect, this might be what is going on.
John Ellis adds:
The difference between the two versions of é is that one is a single Unicode character, Latin Small Letter E with Acute (U+00E9), whereas the other is a combination of two characters, Latin Small Letter E (U+0065) followed by a Combining Acute Accent (U+0301). On Mac, you can type the former by holding down "e" and then selecting the accented version from the popup menu.
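For anyone who wants to check this outside Lightroom, here is a minimal Python sketch of the two forms and of how normalizing both sides makes them compare equal. It uses only the standard `unicodedata` module; the helper name `same_text` is made up for illustration, and nothing here reflects how Lightroom itself compares strings.

```python
import unicodedata

# Escape sequences keep the two forms distinct however this text gets copied.
composed = "M\u00e9dano"     # é as the single code point U+00E9
decomposed = "Me\u0301dano"  # plain e followed by combining acute accent U+0301

def same_text(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(composed == decomposed)               # False: the raw code points differ
print(same_text(composed, decomposed))      # True: equal once both are normalized
print(composed.encode("utf-8").hex(" "))    # 4d c3 a9 64 61 6e 6f
print(decomposed.encode("utf-8").hex(" "))  # 4d 65 cc 81 64 61 6e 6f
```

The byte sequences match the hexdump output above, minus the trailing newline (0a) that echo adds.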
John R. Ellis, Champion
I've figured out the problem but it's the strangest thing - essentially I found that if I copied and pasted the keyword from a photo which had it set, rather than the one I had in the smart filter already, it works. I then added the version I had in the smart filter as a keyword and Lightroom appeared to add the exact same keyword twice. I have copied both variants here:
La Quinetière <-- Broken
La Quinetière <-- Works
They look identical (and where I've copied them here I guess they might be) but to Lightroom at least, they are not identical. I have since established that there are in fact two different ways to represent the è character in Unicode, which look outwardly identical but use different code points: a single precomposed character (U+00E8), or a plain e followed by a combining grave accent (U+0300):
https://apps.timwhitlock.info/unicode/inspect?s=La+Quineti%C3%A8re
https://apps.timwhitlock.info/unicode/inspect?s=La+Quinetie%CC%80re
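If you would rather inspect a keyword locally than paste it into a web tool, something along these lines should work. It is only a sketch: `describe` is a name invented here, and the two sample strings are built from escape sequences rather than copied from Lightroom.

```python
import unicodedata

def describe(text: str) -> list[str]:
    """Hypothetical helper: list each code point in text with its Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]

precomposed = "Quineti\u00e8re"  # è as the single code point U+00E8
combining = "Quinetie\u0300re"   # plain e followed by combining grave accent U+0300

for label, word in (("precomposed", precomposed), ("combining", combining)):
    print(label, len(word), "code points")
    for line in describe(word):
        print("  ", line)
```

The combining form comes out one code point longer, which is an easy way to tell which variant a given keyword uses.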
Alan Harper
John R. Ellis, Champion
I just verified that both smart collections and the Library Filter bar treat the two representations of "é" as different characters.