Merlin Is Magical, but It Still Makes Mistakes

As the popularity of the app’s Sound ID feature grows, so do concerns about how imperfect artificial intelligence impacts a trove of scientific data.
Photo: Luke Franke/Audubon

As a volunteer reviewer for eBird, Tim Carney’s role is to review bird observations logged in multiple counties in Maryland to ensure they are as accurate as possible. In a typical migration season, he might receive up to 50 reports of uncommon species that require him to email users for additional documentation.

But in the past couple of years, Carney says, his workload has grown dramatically. Dubious reports have poured in without sufficient evidence to support them. Carney says that’s because more birders have been attributing their identifications to Merlin Sound ID, a tool created, like eBird, by the Cornell Lab of Ornithology.

For the most part, the Merlin app’s Sound ID feature, launched in 2021, is a birder’s dream. Activate it, and it transforms the bird sounds it hears into images, called spectrograms, that depict pitch and volume. The app then renders a real-time species identification, using artificial intelligence trained to read those images. The experience is so seamless, people have taken to calling Sound ID “Shazam for birds.”
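For readers curious what "turning sound into an image" involves, here is a minimal sketch in Python. It is not Merlin's code: the sample rate, frame size, test tone, and the `spectrogram` function are all assumptions chosen for illustration. It slices a signal into short frames and measures how much energy each frame holds at each pitch, which is the grid of numbers a spectrogram image is drawn from.

```python
import cmath
import math

def spectrogram(samples, frame_size, hop):
    """Return one list of frequency-bin magnitudes per frame (hypothetical helper)."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Taper the frame edges with a Hann window to reduce spectral leakage.
        windowed = [
            s * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        # Naive discrete Fourier transform: fine for a tiny demo, slow for real audio.
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames

# Assumed values for illustration only; none of these come from Merlin.
SAMPLE_RATE = 8000   # samples per second
FRAME_SIZE = 64      # samples per frame
TONE_HZ = 2000       # a steady synthetic whistle

samples = [math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE) for n in range(256)]
spec = spectrogram(samples, FRAME_SIZE, hop=32)

# The loudest bin in the first frame should sit at the pitch of the tone.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(len(spec), "frames;", "peak at", peak_hz, "Hz")
```

Each row of `spec` is one slice of time and each column one band of pitch. Plotting the values as brightness would give the kind of image a sound-identification model could be trained on.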

Yet, impressive as the tool is, Merlin Sound ID can make mistakes. And when eBird users rely solely on the technology to make identifications, reviewers are swamped with unexpected and sometimes questionable observations. The potential consequences go beyond mere irritation: The possibility of misidentifications slipping through has experts concerned for the integrity of eBird’s data, a high-quality resource that is valuable not only for birders but also for science and conservation.

When Cornell launched Sound ID, the idea was to create a space separate from eBird where beginners could learn the complexities of birding by ear. Merlin project manager Drew Weber says the feature “is a way of providing a safe playground for people to get acquainted with bird identification,” and to help them build up skills to eventually contribute to community science.

Some Merlin-based submissions to eBird, however, have raised eyebrows in the birding community. Reports of unusual species automatically populate the platform’s rare bird alerts, which are then emailed to users in the area. As a result, any errors are highly visible, and birders have taken to social media to point them out. For instance, a Little Ringed Plover (native to Europe) in Arkansas and a Plush-crested Jay (native to South America) in a Michigan backyard, both misidentified by Merlin, were listed in the past few months.

It’s the slip-ups involving native species, however, that most worry experts. The Philadelphia Vireo, for instance, is an uncommon migrant over much of North America, but its song is extremely similar to that of the more common Red-eyed Vireo. Even experienced birders have trouble telling the two apart, says Wisconsin eBird reviewer Jason Thiele. And Philadelphia Vireos identified by Merlin have significantly increased the number of submissions eBird reviewers are seeing for the more elusive species. Carney, the reviewer in Baltimore, has seen a huge spike in reports just in the past year, forcing him to spend more time tracking down evidence.

It’s not yet clear if or how these reports have affected eBird’s data quality. Jenna Curtis, project leader at eBird, says the team at Cornell is looking into what role Merlin might play in contributing to bias in the database, or in removing it. In fact, it’s likely that birders have historically under-detected Philadelphia Vireos and that Merlin is helping to correct that oversight, according to Weber. The Merlin team has also noticed increased detections of high-pitched species like Tennessee Warbler, Blackpoll Warbler, and Golden-crowned Kinglet, Weber notes, which are likely underrepresented in eBird’s data because they’re easily missed by birders with high-frequency hearing loss.

When Merlin Sound ID does make a mistake, it’s often an issue of audio length. The app analyzes sound in three-second intervals, but making an accurate identification sometimes requires a longer snippet of birdsong. Confidently distinguishing Philadelphia from Red-eyed Vireos by ear, for instance, requires tuning in to subtle differences in the cadence of their songs that unfurl over time. Merlin also struggles with mimics like Northern Mockingbirds: By analyzing vocalizations in short intervals, the tool often ends up identifying the species that the bird is mimicking. “Merlin doesn’t have that kind of memory currently, but that’s something we can investigate in the future,” says Weber.
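The effect of judging sound a few seconds at a time can be illustrated with a short Python sketch. Only the three-second interval comes from Weber's description. The assumption that windows do not overlap, the `split_into_windows` helper, and the per-window labels are all invented for illustration and do not reflect Merlin's actual output.

```python
from collections import Counter

WINDOW_SECONDS = 3.0  # interval length described in the article

def split_into_windows(duration_s, window_s=WINDOW_SECONDS):
    """Return (start, end) times of consecutive windows.

    Assumes non-overlapping windows; the real app may slice audio differently.
    """
    windows = []
    start = 0.0
    while start < duration_s:
        windows.append((start, min(start + window_s, duration_s)))
        start += window_s
    return windows

# A ten-second recording becomes four independent slices.
windows = split_into_windows(10.0)

# Invented labels for a mockingbird running through its repertoire.
# Each slice is judged alone, with no memory of the slices before it.
labels = ["Northern Cardinal", "Blue Jay", "Carolina Wren", "Northern Cardinal"]

for (start, end), label in zip(windows, labels):
    print(f"{start:4.1f}s to {end:4.1f}s: {label}")

# Tallying the slices yields three species, none of them the actual singer.
tally = Counter(labels)
print(tally.most_common())
```

A listener who hears all ten seconds recognizes one bird cycling through imitations. A tool that scores each slice separately has no way to make that connection, which is the kind of memory Weber describes as missing.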

Experts still encourage birders to submit the species they hear to eBird, as long as they can make a confident identification. This means also seeing the bird if possible, especially for easy-to-mix-up species. Curtis and Weber also urge anyone who submits Merlin-based observations to upload sound recordings from the app along with their eBird checklists. Those files not only give reviewers further evidence to check, but also train Merlin’s algorithm to make more accurate identifications in the future. If submitting a longer recording, users should include a timestamp in the notes to specify when the vocalization in question happens, says Carney. eBird also provides a set of guidelines for using Merlin. Thiele gave his own set of similar tips in a post last year.

The Merlin and eBird teams are also making changes within the app, designed to improve how the two platforms interact. Merlin now reminds users to turn on their phone’s location services to narrow down possible identifications to birds that occur in that area. The team has also created resources to help make it easier to upload Merlin recordings to eBird.

Navigating these issues will be an ongoing learning experience as Merlin Sound ID evolves, Curtis notes. “It’s funny,” she says, “you think about a paper field guide, the way most people learned how to bird up until recently: there’s no pop-up messaging. There’s no banner there to tell you that you should exercise caution.” Technology is pushing birding in exciting new directions, but it’s not quite a substitute for learning to identify birds the old-fashioned way: by spending time in the field and learning from other birders.