Coding Adventure: Author Mapper
Let's get mapping
Goodreads has been a guilty pleasure of mine for the past few months. I read books at, you know, a busy college student's pace, maybe one to three a month, but I love to scroll, so I frequently find myself exploring books on the app and reading their reviews. Books I've heard of, books I haven't; books I find referenced online, casually brought up in conversation, or mentioned in other books (Foucault's Pendulum has been a bottomless source of these, since its protagonist is literally in the publishing industry); books recommended to me by the app's own comically terrible algorithm (would youuuu like to read another one of those TikTok romantasy books with identical moody gothic fantasy cover art, all titled like "An X of Y and Z" because they wish they were A Court of Thorns and Roses? Actually, I'm quite tempted, but that's neither here nor there).
So I have a huge to-read list now. It's longer than my previously-read list. It spans many genres, everything from graphic novels to serious literary classics. Like any good physical library, it also includes things I'm not realistically going to read, but which look awfully nice on the shelf, like The Decline and Fall of the Roman Empire and Hegel's Phenomenology of Spirit (I looked at the public domain version you can read on Wikipedia's Wikisource, and got to about the third paragraph of the preface before throwing in the towel).
The habit isn't new, of course. When I was first getting into music (in terms of albums, that is; I've always been into music in a broader sense), I was making a similar list on RateYourMusic - it's just easier to look at an album's cover art and genre and add it to the end of "To listen to (Length: 220 albums)" than to go through the work of giving it a serious critical listen. The difference is that RateYourMusic has a map.
You get a dot when you give a rating to a release by an artist with a known birth location (for a solo artist) or origin (for a group). This is probably my favorite feature of any website I've been a regular user of, like, ever. It's so much fun to fill up over time. Not only is it a status symbol to have an impressively filled-out map, it's a great motivator to explore music from other parts of the world. You really don't know how much you're unconsciously avoiding until you see a whole continent blank. I discovered Marijata when I was looking for more map points in West Africa, and Jorge Ben Jor, a prolific Brazilian musician who has become another personal favorite, when I was trying to add dots in Brazil.[1]
I decided to extend this whole map idea to Goodreads in my latest project. Maybe I'd have something to do with the data from that long, long to-read list - would be a leg up over RateYourMusic, which only tracks listened and rated albums.
Since Goodreads lets you export a CSV file (basically a spreadsheet) of your book data, I knew early on that the pipeline would involve looking up the authors from that file in some sort of database, then putting the resulting coordinates on a map. To rephrase, we are looking at three primary tasks for the program after the user gets and uploads their spreadsheet:
1. Process the spreadsheet for a list of unique authors.
2. Turn author names into birth coordinates.
3. Put birth coordinates on a map.
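For the curious, the first task can be sketched in a few lines of Python. This is not the site's actual code, just a minimal illustration. It assumes the export has "Author", "Additional Authors", and "Exclusive Shelf" columns, and that co-authors should count too; the function name and the sample rows are made up for the example.

```python
import csv
import io

def unique_authors(csv_text, include_to_read=True):
    """Return a sorted list of unique author names from a Goodreads-style export."""
    authors = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Optionally skip books that are only on the to-read shelf.
        if not include_to_read and row.get("Exclusive Shelf") == "to-read":
            continue
        # Primary author plus any comma-separated co-authors (assumed column names).
        names = [row.get("Author") or ""]
        names += (row.get("Additional Authors") or "").split(",")
        for name in names:
            # Collapse runs of whitespace so "Umberto  Eco" and "Umberto Eco" match.
            cleaned = " ".join(name.split())
            if cleaned:
                authors.add(cleaned)
    return sorted(authors)

# Made-up sample rows in the assumed column layout.
sample = """Title,Author,Additional Authors,Exclusive Shelf
Foucault's Pendulum,Umberto Eco,William Weaver,read
The Name of the Rose,Umberto  Eco,,read
Phenomenology of Spirit,Georg Wilhelm Friedrich Hegel,,to-read
"""
print(unique_authors(sample, include_to_read=False))
```

Using a set takes care of deduplication, and the `include_to_read` flag is one simple way to separate books you have actually read from the ones you merely intend to.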
Getting the unique author list out of the spreadsheet was trivial. My primary challenge was the second point: 'precise birthplaces of many people of varying importance throughout history' is an inherently hard system to implement. RateYourMusic's parent company Sonemic does it at scale, storing a big bespoke database of artists' birthplaces based on user-submitted (and moderator-verified) information, as well as a whole geographic system that stores coordinates of towns. Every time someone submits an artist who was born in a rural small town, especially in a non-English-speaking country, the database moderators have to research the location. Obviously it is not feasible to do this as one developer.
I decided to use Wikidata to get the birthplaces instead. Connected to Wikipedia, it provides a database of standard-formatted biographical information about basically every person with a Wikipedia article, including birth location. Unfortunately, the API is incredibly slow, presumably because, like Wikipedia, it is free and open source and funded by donations. It's definitely the best you can get for free though!
Its API uses SPARQL (SPARQL Protocol And RDF Query Language), a query language for graph data that looks a bit like SQL (Structured Query Language) but is not actually a dialect of it. The (very condensed) SPARQL call I make for each author looks like this:
This searches for the given name (entityId) on Wikidata and takes the first result that's listed with a birthplace (P19) that has exact coordinates (P625). After the shown step, the program originally filtered the results down to people listed as 'writers, authors, playwrights, and poets,' but I took this step out after I realized it was eliminating (among other things) musician autobiographies and philosophy by people only listed as 'philosophers,' and there was no way to include every possible category of person who might write a book. Now it accepts any valid result. Authors who share a name with more famous non-authors get shafted (sorry, Scott Alexander the Bay Area Rationalist who wrote SSC and Unsong, you're listed at the location of Scott Alexander the American-British screenwriter from Big Eyes (2014)...) but it's not as common a case as I expected.
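To make the lookup concrete, here is a rough Python sketch of a query along those lines. It is not the query the site runs. It assumes the public Wikidata Query Service endpoint and its MWAPI "EntitySearch" service for the name search, and every function name here is made up for illustration. The one detail worth noticing is that Wikidata returns coordinates as a "Point(longitude latitude)" string, longitude first.

```python
import json
import re
import urllib.parse
import urllib.request

WDQS_URL = "https://query.wikidata.org/sparql"  # public endpoint (assumed, not from the post)

def build_query(name):
    """Build a SPARQL query for the top search hit that has a geolocated birthplace."""
    # Escape backslashes and double quotes so the name is a valid SPARQL string literal.
    safe = name.replace("\\", "\\\\").replace('"', '\\"')
    return f"""
SELECT ?person ?placeLabel ?coord WHERE {{
  SERVICE wikibase:mwapi {{
    bd:serviceParam wikibase:endpoint "www.wikidata.org";
                    wikibase:api "EntitySearch";
                    mwapi:search "{safe}";
                    mwapi:language "en".
    ?person wikibase:apiOutputItem mwapi:item.
    ?rank wikibase:apiOrdinal true.
  }}
  ?person wdt:P19 ?place.    # P19: place of birth
  ?place wdt:P625 ?coord.    # P625: coordinate location
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
ORDER BY ?rank
LIMIT 1
"""

def parse_point(wkt):
    """Turn a WKT literal like 'Point(8.6167 44.9)' into a (lat, lon) pair."""
    match = re.search(r"Point\(([-+0-9.eE]+) ([-+0-9.eE]+)\)", wkt)
    if match is None:
        raise ValueError(f"unexpected coordinate literal: {wkt!r}")
    lon, lat = float(match.group(1)), float(match.group(2))  # WKT puts longitude first
    return lat, lon

def first_birthplace(response):
    """Pull (place name, lat, lon) out of a SPARQL JSON response, or None if it is empty."""
    bindings = response["results"]["bindings"]
    if not bindings:
        return None
    row = bindings[0]
    lat, lon = parse_point(row["coord"]["value"])
    return row["placeLabel"]["value"], lat, lon

def lookup(name):
    """Send one query over the network; defined but not called in this sketch."""
    params = urllib.parse.urlencode({"query": build_query(name), "format": "json"})
    request = urllib.request.Request(
        WDQS_URL + "?" + params,
        headers={"User-Agent": "author-mapper-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request, timeout=60) as reply:
        return first_birthplace(json.load(reply))
```

Because there is no occupation filter, a sketch like this has exactly the weakness described above: whoever ranks first in the search and has a geolocated birthplace wins, author or not.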
From there, I plugged the location names and coordinates into a Leaflet.js map. I based the graphic design of the website on a previous work, Checkbox Nightmare, which I recently updated (new article coming soon!). I also added a small set of default data so you can see the full functionality without having a Goodreads account.
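One way to hand results like these to Leaflet is as GeoJSON, which its L.geoJSON layer can read directly. The sketch below is an assumption about the data shape, not a description of how the site does it, and the function name and sample point are illustrative. The main trap is coordinate order: GeoJSON wants longitude first, while Leaflet's own latitude/longitude pairs go the other way.

```python
import json

def to_feature_collection(places):
    """Convert (author, place, lat, lon) tuples into a GeoJSON FeatureCollection dict."""
    features = []
    for author, place, lat, lon in places:
        features.append({
            "type": "Feature",
            # GeoJSON positions are [longitude, latitude], the reverse of Leaflet's LatLng.
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"author": author, "birthplace": place},
        })
    return {"type": "FeatureCollection", "features": features}

# Illustrative sample point; the coordinates are approximate.
collection = to_feature_collection([("Umberto Eco", "Alessandria", 44.9, 8.6167)])
print(json.dumps(collection, indent=2))
```

Keeping the author and birthplace in each feature's properties means the map side can build its popups without a second lookup.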
I decided to make seeing the to-read list an optional toggle. It makes the map impressively crowded, but might not reflect your actual reading habits.
Overall, I'm satisfied with the design of the website, though I wish there was some way to make the Wikidata requests return faster without spamming them. After Author Mapper, I made another toy website (to be shown in another article) and made a major update to Checkbox Nightmare with many, many new puzzles (ditto).
Try it yourself: https://author-mapper.vercel.app/
Add me on Goodreads: https://www.goodreads.com/user/show/163630091
Jaffeelabs returns in two weeks!
[1] Of course, this system has its issues, the primary one being that it turns albums into commodities (if you sit through this music, you get another dot!) rather than experiences (authentically engaging with unfamiliar styles on their own terms), but the average RateYourMusic user is an Anthony Fantano worshiper, so they're already most of the way there.