For much of my first few years of birding in Singapore, I wondered how the data from earlier eras in local ornithology could be made available for more people to benefit from: whether early and late dates, major hotspots, and past trends in the local avifauna, among other useful information, could be compiled in an accessible format for the community at large. After all, this small country has always been blessed with higher-than-average observer coverage relative to its surrounding regions. The data could fill major gaps in our understanding of Asian avifauna, and be greatly beneficial to interested local birdwatchers as well.
So when I first saw that the Nature Society (Singapore) (NSS) had made its old editions of Singapore Avifauna from 1987 through 2010 available on its website, it was immediately clear that this could be a great resource for the birding community in Singapore if it could be consolidated into a more readily accessible format. Manually scrolling through hundreds of reports to find out how many times a species has been recorded locally, or in which months it appears most frequently, would not be practical. Rather, the data would need to be in a spreadsheet, and easily searchable, for the information to be most useful.
As Keita discussed last week, eBird has recently established itself as the most widely used citizen science database for avian records. Many studies have referenced data stored in eBird to examine trends and achieve important conservation outcomes. It also reached a major milestone, 1 billion bird observations, in May this year, reinforcing its position as a powerful tool for conservationists and casual birders alike to share observations and further broaden our collective knowledge. It was clear to me that putting these important records from Singapore Avifauna (SINAV) on eBird was the best way to make them as impactful as possible.
In collaboration with Singapore’s eBird reviewer Martin Kennewell, and after obtaining permission from NSS to consolidate and upload the records, I set to work designing a program to extract species names, observation counts, dates, and locations, as well as observer names for proper credit, from all the volumes of Singapore Avifauna available on the NSS website. I divided the process into two steps: converting the PDFs (stored as images) into text, then extracting the important details from that text. This allowed me to upload over 27,000 individual observations: around 23,000 from Singapore and 4,000 from Malaysia and Indonesia.
The uploaded records now make up around 75% of all eBird records up to 1990, and over 25% of all records up to 2010 (the last year that Singapore Avifauna was published).
I can’t say that this journey was always smooth. One of the biggest challenges I faced was resolving old locations, with old place names, to current landmarks or points on the map which could be uploaded to eBird. With Martin’s expertise and support, I was able to resolve most of these, and sent a further few to NSS for their review. Additionally, some older editions of SINAV were missing; this was an era with limited technological access, and keeping track of documents was admittedly more challenging than it is today. Older records also often lacked specific counts: because birds that have now become rare once numbered in the dozens or even hundreds, observers may not always have made the effort to count them accurately.
See, for example, the entry for Sanderling in the report for February 1987.
This species is now barely an annual visitor; in the 80s, 90s, and even the early 21st century, counts in the double digits were regular. It’s hard to believe this count of 100 Sanderlings, along with exceptional counts of other shorebirds, was just over 15 years ago.
Sometimes the work was tiring and it became difficult to continue, but I always knew the reward of making all these sightings accessible was worth the effort. In the end, after probably 100 hours of work, I uploaded the data to the NSS Records eBird account.
The excerpts below are from Volume 19 (Jan-Mar) of Singapore Avifauna, published in 2005. I’ve used this example to highlight how the process of digitizing reports into individual records works. For January, February, and March of 2005, Blue-crowned Hanging-Parrots were recorded a total of 12 times. Since the report is split by month, the species appears three times, once in each month's section.
These three entries are then converted into text with optical character recognition and combined into one overall entry for the period covered by the report, in this case January to March:
BLUE-CROWNED HANGING-PARROT Loriculus galgulus 1 over Dairy Farm Road, 17/1 (LKS) and 18/1 (LKS) and 4 at Malcolm Park, 30/1 (NK/LKS/FR/IR/JR). At Botanic Gardens, 7 were counted on 4/2 (LKS) and 5 on 28/2 (AF/LKS). Also 3 over Nee Soon, 8/2 (LKS), 1 at MacRitchie Reservoir, 16/2 (LKS) and 28/2 (AF/LKS), and 1 over Dairy Farm Road, 24/2 (LKS). 1 heard at the foot of Bukit Timah, 12/3 (LKS), 1 flying over Dairy Farm Road, 22/3 (LKS) and 3 at Sime Road, 27/3 (LKS).
The individual records are then separated by looking for “sets” comprising the four important pieces of information for each record: count, date, location, and observer names. Of these, the most challenging to parse out is the location. In this case, there’s no extraneous information that we need to ignore, so it seems relatively straightforward to just use the leftover text as the location. But sometimes, sightings are associated with lengthy descriptions and the location needs to be extracted from that description – so I had to use natural language processing to pick out the location.
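As a rough illustration of this separation step (a minimal sketch, not the actual program used for this project), the Blue-crowned Hanging-Parrot entry can be split in Python by treating each "date (observers)" pair as an anchor and reading the count and location from the text just before it. All names here (`Sighting`, `parse_entry`, the regular expressions) are hypothetical, and simple pattern matching stands in for the natural language processing needed for messier entries. The sketch assumes that a missing count or location is carried over from the previous sighting, as in "17/1 (LKS) and 18/1 (LKS)".

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Body of the combined Jan-Mar 2005 entry quoted above (species header omitted).
ENTRY_TEXT = (
    "1 over Dairy Farm Road, 17/1 (LKS) and 18/1 (LKS) and 4 at Malcolm Park, "
    "30/1 (NK/LKS/FR/IR/JR). At Botanic Gardens, 7 were counted on 4/2 (LKS) "
    "and 5 on 28/2 (AF/LKS). Also 3 over Nee Soon, 8/2 (LKS), 1 at MacRitchie "
    "Reservoir, 16/2 (LKS) and 28/2 (AF/LKS), and 1 over Dairy Farm Road, "
    "24/2 (LKS). 1 heard at the foot of Bukit Timah, 12/3 (LKS), 1 flying over "
    "Dairy Farm Road, 22/3 (LKS) and 3 at Sime Road, 27/3 (LKS)."
)

# Anchor: "day/month (OBSERVER/INITIALS)", e.g. "30/1 (NK/LKS/FR/IR/JR)".
ANCHOR = re.compile(r"(\d{1,2})/(\d{1,2})\s*\(([A-Z/]+)\)")
# Count: the first standalone integer in the text leading up to an anchor.
COUNT = re.compile(r"\b(\d+)\b")
# Location: whatever follows "at" or "over", up to the next comma.
# (Assumption: enough for this entry; real reports need more than this.)
LOCATION = re.compile(r"\b(?:at|over)\s+([^,]+),", re.IGNORECASE)


@dataclass
class Sighting:
    count: Optional[int]
    location: Optional[str]
    day: date
    observers: List[str]


def parse_entry(text: str, year: int) -> List[Sighting]:
    """Split one species entry into individual sightings."""
    sightings: List[Sighting] = []
    count: Optional[int] = None
    location: Optional[str] = None
    cursor = 0
    for anchor in ANCHOR.finditer(text):
        # Text between the previous anchor and this one.
        lead = text[cursor:anchor.start()]
        cursor = anchor.end()
        count_match = COUNT.search(lead)
        if count_match:
            count = int(count_match.group(1))
        location_match = LOCATION.search(lead)
        if location_match:
            location = location_match.group(1).strip()
        # Dates in the report are day/month; the year comes from the volume.
        day = date(year, int(anchor.group(2)), int(anchor.group(1)))
        sightings.append(Sighting(count, location, day, anchor.group(3).split("/")))
    return sightings


sightings = parse_entry(ENTRY_TEXT, 2005)
for s in sightings:
    print(s.day.isoformat(), s.count, s.location, "/".join(s.observers))
```

Anchoring on the date and observer initials works here because that pair has the most regular shape in the text, so everything else can be read relative to it. Entries with longer free-text descriptions would defeat the simple "at/over ... comma" location pattern, which is where a more capable language-processing step becomes necessary.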
For the example above, the following 12 sightings would then be uploaded to eBird. This checklist shows how the first record (17 Jan) would appear in eBird’s outputs.
This project is mostly complete, and with it, thousands of bird observations recording hundreds of species have now been placed somewhere they can be accessed by researchers and amateurs alike. As more people come forward to contribute their sightings and share their knowledge, we can make more meaningful progress in conserving our valuable local wildlife.
This piece was written with the help of comments and advice from the Singapore Birds Project team (Dillen, Francis, Keita, Movin, and Sandra). My project drew on over a hundred reports made available by NSS on its website; their permission to take it on also made it possible. I also appreciate Martin’s contribution to many aspects of my project, including location-matching and manual approval/rejection of the uploaded records.
Sullivan, B. L., Aycrigg, J. L., Barry, J. H., Bonney, R. E., Bruns, N., Cooper, C. B., … Kelling, S. (2014). The eBird enterprise: An integrated approach to development and application of citizen science. Biological Conservation, 169, 31–40. https://doi.org/10.1016/j.biocon.2013.11.003
Sullivan, B. L., Wood, C. L., Iliff, M. J., Bonney, R. E., Fink, D., & Kelling, S. (2009). eBird: A citizen-based bird observation network in the biological sciences. Biological Conservation, 142(10), 2282–2292. https://doi.org/10.1016/j.biocon.2009.05.006