I wrote previously about my project to display images of artworks owned by the Art Institute of Chicago (ARTIC): https://workshop88.com/oldblog/index.php/2023/01/14/the-adafruit-pyportal-and-modifying-adafruit-libraries/. The challenge was that, unlike the API used by the project that inspired it, the ARTIC’s REST API does not return the full URL of each image file. I solved it by patching the source code of the PyPortal library. The fix was clunky, and it stopped me from using that PyPortal for any other project.
Aha!
Then, inspiration struck. I had an underutilized Raspberry Pi W and decided to implement a web server on it that could take the response from the ARTIC, massage it, and forward it on to the PyPortal. So rather than talking directly to the ARTIC, the PyPortal talks to the Raspi, which talks to the ARTIC. I used Bottle to build the web server.
Enhance the original
To spice up this second version, I implemented a “Happy Birthday, <artist>” system. The PyPortal asks the Raspi for the URL of the nth image to be displayed. The Raspi looks up which artists have birthdays today and asks ARTIC if it owns any pieces by those artists. From that search results list, the Raspi constructs the URL to the nth image. It puts the image URL, the artist’s name, the artwork’s title, and the dimensions of the image into a dictionary structure and sends a JSON response back to the PyPortal. The PyPortal downloads the image and displays it on the screen.
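As a minimal sketch of that last step, the Raspi's reply could be assembled like this. The key names, the `build_payload` helper, the default width, and the IIIF URL pattern are all assumptions for illustration, not the code running on my Raspi; check the ARTIC API documentation for the current image URL format.

```python
import json

# Assumed IIIF base URL; verify against the ARTIC API docs before relying on it.
IIIF_BASE = "https://www.artic.edu/iiif/2"


def build_image_url(image_id, width=320):
    """Turn an ARTIC image_id into a full image URL (assumed IIIF pattern)."""
    return f"{IIIF_BASE}/{image_id}/full/{width},/0/default.jpg"


def build_payload(artwork, width, height):
    """Pack the fields the PyPortal needs into a JSON string.

    `artwork` is one record from an ARTIC search result; the output key
    names are hypothetical.
    """
    payload = {
        "image_url": build_image_url(artwork["image_id"], width),
        "artist": artwork["artist_title"],
        "title": artwork["title"],
        "width": width,
        "height": height,
    }
    return json.dumps(payload)
```

The PyPortal then only has to parse one small, flat JSON object and fetch `image_url`.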
New things to learn
For this project to work, the Raspi had to compose a query for the ARTIC that returned only pieces by the desired artists. It would be poor form to display Pearson’s works on Beardson’s birthday. The prior project didn’t require such high precision. “/artworks/search/?q=impressionism” was adequate to get a lot of very beautiful images. For greater precision, I was going to have to learn complex queries using Elasticsearch’s Query DSL. The engineering staff at the ARTIC was very responsive in providing support, but I still struggled. I eventually found an online course on Udemy.com covering the complete Elasticsearch stack. Section 3 of the course covered the Query DSL. I signed up for a free trial and completed the training.
Learning DSL allowed me to develop the following process:
- PyPortal request to Raspi: Get the nth image’s URL
- Raspi table lookup: What artists were born today?
- Raspi request to ARTIC: Perform a query to get those artists’ ARTIC IDs
- Raspi to ARTIC: Perform a query to get the URL of the nth image by those artists
- Raspi to PyPortal: Massage ARTIC’s response and send the modified image URL to PyPortal
- PyPortal: Download and display the image
- Rinse/Repeat
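The table lookup in the second step can be sketched as a dictionary keyed by month and day. The `BIRTHDAYS` table and the `artists_born_on` helper are hypothetical names, and the entries are only sample data for illustration; the real table is much larger.

```python
import datetime

# Hypothetical birthday table keyed by (month, day); sample entries only.
BIRTHDAYS = {
    (9, 3): ["Louis Sullivan"],
    (9, 5): ["Maurice Quentin de La Tour", "Caspar David Friedrich"],
}


def artists_born_on(day=None):
    """Return the artists whose birthday falls on `day` (default: today)."""
    day = day or datetime.date.today()
    return BIRTHDAYS.get((day.month, day.day), [])
```

Keying on `(month, day)` ignores the birth year, which is all a "Happy Birthday" display needs.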
Here is what the Raspi uses to find artist IDs:
criteria = {
    "query": {
        "bool": {
            "should": [
                {"match_phrase": {"title": {"query": artist_name, "slop": 1}}},
                {"match_phrase": {"alt_titles": {"query": artist_name, "slop": 1}}},
            ],  # should
            "minimum_should_match": 1,
        },  # bool
    },  # query
}  # criteria
The query takes the artist’s name and tries to find that name in the title field. It also looks for that name in the alt_titles field. The ‘slop: 1’ setting allows the matched words to be one position out of place, which is enough to skip a middle initial, so ‘Louis Sullivan’ (the name in the DailyArtFixx.com birthday database) is a close enough match to ‘Louis H. Sullivan’ (the name used by the ARTIC).
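Wrapping the query in a small function makes it easy to reuse for each birthday artist. This is only a sketch: `build_artist_query` is a hypothetical helper name, and exposing `slop` as a parameter is my assumption, not something the project needs.

```python
def build_artist_query(artist_name, slop=1):
    """Build the artist-lookup query for one name.

    Searches both the primary name (title) and the aliases (alt_titles),
    and requires at least one of the two phrase matches to succeed.
    """
    fields = ("title", "alt_titles")
    return {
        "query": {
            "bool": {
                "should": [
                    {"match_phrase": {field: {"query": artist_name, "slop": slop}}}
                    for field in fields
                ],
                "minimum_should_match": 1,
            }
        }
    }
```

Calling `build_artist_query("Louis Sullivan")` yields the same structure as the literal dictionary, with the name filled in.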
The alt_titles field is necessary because the artist Maurice Quentin de La Tour (DailyArtFixx.com) has all of the following aliases:
"Maurice Q. de Latour", "Maurice Quentin La Tour", "Maurice-Quentin de La Tour", "Maurice-Quentin de La Tour", "Maurice-Quentin de la Tour", "Maurice Quentin De La Tour", "Maurice Quentin Delatour", "Maurice Quentin de Latour", "Maurice Quentin de La Tour"
Misspellings are also a risk. Taddeo Zuccari is also known as Taddeo Zuccaro. Hopefully, the alt_titles field will catch this. If it doesn’t, there is still a “fuzziness” factor that can be added to the query. Fuzziness can handle individual letter substitutions, insertions, deletions and transpositions. The greater the fuzziness factor, the more errors it can tolerate, at the risk of including someone who should not be included. Combining fuzziness and slop makes for a very complex query, and since alt_titles seems to be working, I only use 1 degree of slop and skip fuzziness.
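For reference, here is roughly what a fuzzy clause looks like. I do not use this in the project, and I have not tested it against the ARTIC, so treat it as an assumption-laden sketch. In Elasticsearch, fuzziness is a parameter of the match query rather than match_phrase, which is part of why mixing it with slop gets complicated. `build_fuzzy_clause` is a hypothetical helper name.

```python
def build_fuzzy_clause(artist_name, fuzziness=1):
    """Build a fuzzy `match` clause for the title field (untested sketch).

    `fuzziness` is the number of single-character edits tolerated per word.
    The "and" operator requires every word of the name to match, which
    keeps the fuzzy search from matching on a single shared word.
    """
    return {
        "match": {
            "title": {
                "query": artist_name,
                "fuzziness": fuzziness,
                "operator": "and",
            }
        }
    }
```

With a fuzziness of 1, ‘Taddeo Zuccari’ and ‘Taddeo Zuccaro’ differ by a single letter and would be treated as a match.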
If the Raspi finds artist_name, it extracts that artist’s unique ARTIC ID. I run the above query for each artist having a birthday today and put all the found IDs into a list.
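A minimal sketch of that loop follows. The `search` argument is a hypothetical callable that sends the artist query and returns the decoded JSON response; I am assuming the response holds its hits in a "data" list with an "id" on each hit, and that the first hit is the right artist.

```python
def collect_artist_ids(artist_names, search):
    """Look up each name and collect the IDs of the artists that were found.

    `search(name)` is a stand-in for the real HTTP request to the ARTIC.
    Names with no hits are skipped rather than treated as errors.
    """
    ids = []
    for name in artist_names:
        response = search(name)
        hits = response.get("data", [])
        if hits:
            # Assumption: the top-ranked hit is the artist we want.
            ids.append(hits[0]["id"])
    return ids
```

Passing the search function in as an argument keeps the loop testable without a network connection.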
I then loop over all the IDs and build something like the following query to find all the ARTIC artworks. The filter clauses ensure that the ARTIC has an image of the artwork and that there is a title and artist name associated with it. In the following example, I use IDs to find all artworks by two artists: Maurice Quentin de La Tour and Caspar Friedrich.
criteria = {
    "query": {
        "bool": {
            "should": [
                {"match": {"artist_id": 34563}},
                {"match": {"artist_id": 34185}},
            ],
            "minimum_should_match": 1,
            "filter": [
                {"exists": {"field": "image_id"}},
                {"exists": {"field": "artist_title"}},
                {"exists": {"field": "title"}},
            ],
        },
    },
}
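Since the number of birthday artists changes from day to day, the query has to be generated from the ID list. One way to do that is sketched below; `build_artworks_query` is a hypothetical helper name.

```python
def build_artworks_query(artist_ids):
    """Build the artworks query for any number of artist IDs.

    An artwork qualifies if it matches at least one artist ID and has an
    image, an artist name, and a title.
    """
    required_fields = ("image_id", "artist_title", "title")
    return {
        "query": {
            "bool": {
                "should": [{"match": {"artist_id": i}} for i in artist_ids],
                "minimum_should_match": 1,
                "filter": [{"exists": {"field": f}} for f in required_fields],
            }
        }
    }
```

`build_artworks_query([34563, 34185])` produces the same structure as the two-artist example.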
Every time the PyPortal asks for the nth image, the Raspi sends back data from the nth image in this list.
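Picking the nth item can be sketched as follows. Wrapping around with the modulo operator when n runs past the end of the list is my assumption about sensible behavior, as is returning None when no artworks were found; `nth_artwork` is a hypothetical helper name.

```python
def nth_artwork(artworks, n):
    """Return the nth artwork, wrapping around at the end of the list.

    Returns None when the list is empty (for example, when the ARTIC owns
    nothing by today's birthday artists).
    """
    if not artworks:
        return None
    return artworks[n % len(artworks)]
```

With wraparound, the PyPortal can keep counting upward forever and the display simply cycles through the day's artworks.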
While there are plenty of opportunities for refactoring and bulletproofing the code, I consider the project complete and a success. It is running continuously in my workshop. Happy Birthday, artists!
Final thoughts
The brains of the project were moved from the PyPortal into the Bottle web server. I can change the server and add any number of parallel query schemes — all the works in the Narcissa Niblack Thorne miniatures gallery, all artwork associated with the Caravaggio exhibit, images of cats, stained glass, architectural models, etc. What would you like to see?