Our understanding of bird song, a model system for animal communication and the neurobiology of learning, depends critically on making reliable, validated comparisons between the complex multidimensional syllables that are used in songs. However, most assessments of song similarity are based on human inspection of spectrograms, or computational methods developed from human intuitions. Using a novel automated operant conditioning system, we collected a large corpus of zebra finches' (Taeniopygia guttata) decisions about song syllable similarity. We use this dataset to compare and externally validate similarity algorithms in widely used, publicly available software (Raven, Sound Analysis Pro, Luscinia). Although these methods all perform better than chance, they do not closely emulate the avian assessments. We then introduce a novel deep learning method that can produce perceptual similarity judgements trained on such avian decisions. We find that this new method outperforms the established methods in accuracy and more closely approaches the avian assessments. Inconsistent (hence ambiguous) decisions are a common occurrence in animal behavioural data; we show that a modification of the deep learning training that accommodates these leads to the strongest performance. We argue this approach is the best way to validate methods to compare song similarity, that our dataset can be used to validate novel methods, and that the general approach can easily be extended to other species.

How do birds hear the differences between their songs? This question matters because the study of bird song, a model system for the neurobiology of learning and animal communication, depends critically on our ability to assess the similarity of songs. Traditionally, researchers compare sounds by human assessment, or use computational methods based on human intuitions about similarity. However, neither approach is connected to birds' own perception of sound similarity. Here, using a novel automated operant conditioning system, we recorded many thousands of acoustic judgments of similarity from zebra finches, and used this perceptual decision data for the first time to train a deep learning system. The trained system outperforms other computational methods for the task of making the same judgments as birds. This algorithm to compare song similarity, together with the potential of extending the general approach to other species, places the study of bird song on a firmer footing.

doi.org/10.1371/journal.pcbi.1012329
PLOS Computational Biology

Released under the CC-BY 4.0 ("Attribution 4.0 International") License

Zandberg, L., Morfi, V., George, J. M., Clayton, D. F., Stowell, D., & Lachlan, R. F. (2024). Bird song comparison using deep learning trained from avian perceptual judgments. PLOS Computational Biology, 20(8), e1012329. doi:10.1371/journal.pcbi.1012329