archival face recognition for fun and nonprofit

Andromeda · Uncategorized · February 5, 2021

In 2019, Dominique Luster gave a super good Code4Lib talk about applying AI to metadata for the Charles "Teenie" Harris collection at the Carnegie Museum of Art: more than 70,000 photographs of Black life in Pittsburgh. They experimented with solutions to various metadata problems, but the one that's stuck in my head since 2019 is the face recognition one. It sure would be cool if you could throw AI at your digitized archival photos to find all the instances of the same person, right? Or automatically label them, given that any of them are labeled correctly?

Sadly, because we cannot have nice things, the data sets used for pretrained face recognition embeddings are things like lots of modern photos of celebrities, a corpus which wildly underrepresents 1) archival photos and 2) Black people. So the results of the face recognition process are not all that great.

I have some extremely technical ideas for how to improve this (ideas which, weirdly, some computer science PhDs I've spoken with haven't seen in the field), so I would like to experiment with them. But first I must invent the universe, er, set up a data processing pipeline. Three steps here:

1. Fetch archival photographs;
2. Do face detection (draw bounding boxes around faces and crop them out for use in the next step);
3. Do face recognition.

For step 1, I'm using DPLA, which has a super straightforward and well-documented API and an easy-to-use Python wrapper, DPyLA (which, despite not having been updated in a while, works just fine with Python 3.6, the latest version compatible with some of my dependencies).

For step 2, I'm using mtcnn, because I've been following this tutorial.

For step 3, face recognition, I'm using the steps in the same tutorial, but purely as a proof of concept: the results are garbage, because archival photos from mid-century don't actually look anything like modern-day celebrities. (Neural net: "I have 6% confidence this is Stevie Wonder!" How nice for you.) Clearly I'm going to need to build my own corpus of people, which I have a plan for (i.e. I spent some quality time thinking about numpy) but haven't yet implemented.

So far the gotchas have been:

Gotcha 1: If you fetch a page from the API and assume you can treat its contents as an image, you will be sad. You have to treat it as a raw data stream and interpret that as an image, thusly:

    import io

    import requests
    from PIL import Image

    # fetch the raw bytes rather than letting anything try to parse the page
    response = requests.get(url, stream=True)
    response.raw.decode_content = True
    image = Image.open(io.BytesIO(response.content))

This code is, of course, hilariously lacking in error handling, despite fetching content from a cesspool of untrustworthiness, aka the internet. It's a first draft.

Gotcha 2: You see code snippets to convert images to pixel arrays (suitable for AI ingestion) that look kinda like this: np.array(image).astype('uint8'). Except they say astype('float32') instead of astype('uint8'). I got a creepy photonegative effect when I used floats.
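For concreteness, here's roughly what the detect-and-crop step (step 2) looks like with mtcnn plus the uint8 conversion. This is a sketch rather than my actual pipeline code, and the 224x224 crop size is borrowed from the input size the tutorial's recognition models expect, not something I've committed to:

    import numpy as np
    from mtcnn import MTCNN
    from PIL import Image

    def extract_faces(image, required_size=(224, 224)):
        # mtcnn wants an RGB pixel array; uint8, per Gotcha 2, unless you
        # enjoy creepy photonegatives
        pixels = np.array(image.convert('RGB')).astype('uint8')
        detector = MTCNN()
        faces = []
        for result in detector.detect_faces(pixels):
            # each detection is a dict whose 'box' is [x, y, width, height];
            # coordinates occasionally come back negative, hence the max()
            x, y, width, height = result['box']
            x, y = max(0, x), max(0, y)
            crop = pixels[y:y + height, x:x + width]
            # resize each crop to the size the recognition step will expect
            faces.append(np.array(Image.fromarray(crop).resize(required_size)))
        return faces

Each crop that comes out of this is what gets handed to the face recognition step.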
Gotcha 3: Although PIL was happy to manipulate the .pngs fetched from the API, it was not happy to write them to disk; I needed to convert formats first (image.convert('RGB')).

Gotcha 4: The suggested keras_vggface library doesn't have a Pipfile or requirements.txt, so I had to manually install keras and tensorflow. Luckily the setup.py documented the correct versions. Sadly, that tensorflow version is only compatible with Python up to 3.6 (hence the comment about DPyLA compatibility above). I don't love this, but it got me up and running, and it seems like an easy enough part of the pipeline to rip out and replace if it's bugging me too much.

The plan from here, not entirely in order, and subject to change as I don't entirely know what I'm doing until after I've done it:

- Build my own corpus of identified people
  - This means the numpy thoughts, above
  - It also means spending more quality time with the API to see if I can automatically apply names from photo metadata rather than having to spend too much of my own time manually labeling the corpus
- Decide how much metadata I need to pull down in my data pipeline and how to store it
- Figure out some kind of benchmark and measure it (sketched below)
- Try out my idea for improving recognition accuracy
- Benchmark again
- Hopefully celebrate awesomeness
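I haven't written the benchmark yet, but the shape I have in mind is roughly "embed every labeled face, then check how often each face's nearest neighbor is actually the same person." A back-of-the-envelope sketch, assuming the embeddings come from keras_vggface's resnet50 model the way the tutorial's do; the function names and the nearest-neighbor scoring here are placeholders for illustration, not decisions:

    import numpy as np
    from keras_vggface.vggface import VGGFace
    from keras_vggface.utils import preprocess_input
    from scipy.spatial.distance import cosine

    # pooling='avg' yields one embedding vector per face crop
    model = VGGFace(model='resnet50', include_top=False,
                    input_shape=(224, 224, 3), pooling='avg')

    def embed(face_crops):
        # face_crops: list of 224x224x3 arrays from the detection step;
        # version=2 is the preprocessing that matches the resnet50 weights
        samples = preprocess_input(np.asarray(face_crops, dtype='float32'), version=2)
        return model.predict(samples)

    def benchmark(embeddings, labels):
        # fraction of faces whose nearest (non-self) neighbor shares their label
        hits = 0
        for i, emb in enumerate(embeddings):
            distances = [cosine(emb, other) if j != i else np.inf
                         for j, other in enumerate(embeddings)]
            hits += labels[int(np.argmin(distances))] == labels[i]
        return hits / len(embeddings)

The point of benchmarking twice is to see whether a number like this budges once I've tried my accuracy idea.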