Multi-view Tracking, Re-ID, and Social Network Analysis
of a Flock of Visually Similar Birds in an Outdoor Aviary
Abstract
The ability to capture detailed interactions among individuals in a social group is foundational to our study of animal behavior and neuroscience. Recent advances in deep learning and computer vision are driving rapid progress in methods that can record the actions and interactions of multiple individuals simultaneously. However, many social species, such as birds, live deeply embedded in a three-dimensional world. This world introduces additional perceptual challenges, such as occlusions, orientation-dependent appearance, large variation in apparent size, and poor sensor coverage for 3D reconstruction, that are not encountered by applications studying animals that move and interact only on 2D planes. Here we introduce a system for studying the behavioral dynamics of a group of songbirds as they move throughout a 3D aviary. We study the complexities that arise when tracking a group of closely interacting animals in three dimensions and introduce a novel dataset for evaluating multi-view trackers. Finally, we analyze captured ethogram data and demonstrate that social context affects the distribution of sequential interactions between birds in the aviary.
Dataset
Results
Our tracking pipeline (shown at the top) produces good qualitative tracks for a variety of motion sequences. (a) Examples of detected bird instances with variations in pose, shape, lighting, scale, occlusion, and motion blur. (b) Example of a successful short track (56 frames) followed by its 2D projections in 3 different views. Colors indicating the camera views are consistent with those in the above figure. The green cube/circle marks the start 3D/2D position and the red cube/circle marks the end position. Dots in the 2D images become smaller as the bird moves further away and larger as it moves closer. (c) Example of a successful long track (375 frames). During flight, the individual hops on the wall and briefly pauses for 1-2 seconds. Examples in (b) and (c) are from video segments drawn from different days, demonstrating variable time of day and lighting.
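To illustrate how a 3D track maps to the 2D dots described in (b), here is a minimal sketch of projecting track points into one camera view and scaling the dot size with distance. It assumes an idealized pinhole camera with no rotation and no lens distortion. The `Camera` class, the intrinsics, and the `base_radius` and `ref_depth` parameters are hypothetical names chosen for illustration, not the calibration or plotting code used for the figures.

```python
from dataclasses import dataclass


@dataclass
class Camera:
    """Hypothetical pinhole camera at `position`, looking down the world +z axis."""
    fx: float          # focal length in pixels (x)
    fy: float          # focal length in pixels (y)
    cx: float          # principal point (x)
    cy: float          # principal point (y)
    position: tuple    # camera centre in world coordinates (x, y, z)


def project(cam, point):
    """Project a 3D world point to pixel coordinates plus its depth."""
    x = point[0] - cam.position[0]
    y = point[1] - cam.position[1]
    z = point[2] - cam.position[2]
    if z <= 0:
        raise ValueError("point is behind the camera")
    u = cam.fx * x / z + cam.cx
    v = cam.fy * y / z + cam.cy
    return u, v, z


def dot_radius(depth, base_radius=20.0, ref_depth=1.0):
    """Dot radius shrinks in inverse proportion to depth (assumed scaling)."""
    return base_radius * ref_depth / depth


def project_track(cam, track):
    """Return a list of (u, v, radius) for every 3D point of a track."""
    projected = []
    for point in track:
        u, v, depth = project(cam, point)
        projected.append((u, v, dot_radius(depth)))
    return projected


if __name__ == "__main__":
    cam = Camera(fx=1000.0, fy=1000.0, cx=640.0, cy=360.0, position=(0.0, 0.0, 0.0))
    track = [(0.0, 0.0, 2.0), (0.5, 0.0, 4.0)]  # bird moving away from the camera
    for u, v, r in project_track(cam, track):
        print(f"u={u:.1f} v={v:.1f} radius={r:.1f}")
```

A real multi-view setup would also apply each camera's rotation and distortion model. The sketch only shows why a bird moving away from a camera is drawn with a smaller dot.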
Bird re-identification results. We use a ResNet50 network supervised with triplet and ID losses to predict the identity of perched birds.
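As a rough illustration of how a triplet loss and an ID loss can be combined, the sketch below computes both on plain Python lists. Several things here are assumptions rather than details from this page: the ID loss is taken to be softmax cross-entropy over identity classes, the triplet distance is Euclidean, and the `margin` and `id_weight` values are placeholders. The embeddings and logits would in practice come from the ResNet50 backbone, which is not modeled here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge loss pulling same-identity embeddings together and pushing others apart."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def id_loss(logits, label):
    """Softmax cross-entropy for one sample (assumed form of the ID loss)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def combined_loss(anchor, positive, negative, logits, label,
                  margin=0.3, id_weight=1.0):
    """Sum of the triplet loss and the weighted ID loss (weighting is an assumption)."""
    return (triplet_loss(anchor, positive, negative, margin)
            + id_weight * id_loss(logits, label))


if __name__ == "__main__":
    anchor, positive, negative = [0.0, 0.0], [0.1, 0.0], [1.0, 1.0]
    logits, label = [2.0, 0.5, -1.0], 0  # scores for three hypothetical bird identities
    print(combined_loss(anchor, positive, negative, logits, label))
```

The triplet term shapes the embedding space so that crops of the same bird lie close together, while the ID term trains a classifier over known individuals. The two are commonly summed during training.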
Using our dataset, we analyzed the birds’ social network and investigated how their behavior depends on social context. We show that interaction transition probabilities differ between pair-bonded (a, n = 163 transitions) and non-pair-bonded (b, n = 187 transitions) males and females.
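A transition probability of this kind can be estimated by counting how often one interaction type directly follows another and normalizing each row. The sketch below does this for first-order transitions. The behavior labels (`"song"`, `"approach"`, `"chase"`) and the example sequence are hypothetical and are not taken from the ethogram data.

```python
from collections import Counter, defaultdict


def transition_counts(sequences):
    """Count first-order transitions (current -> next) across labeled sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return counts


def transition_probabilities(sequences):
    """Row-normalize the transition counts into conditional probabilities."""
    probs = {}
    for current, nxt_counts in transition_counts(sequences).items():
        total = sum(nxt_counts.values())
        probs[current] = {nxt: c / total for nxt, c in nxt_counts.items()}
    return probs


if __name__ == "__main__":
    # Hypothetical interaction sequence for one male-female pair.
    pair_sequences = [["song", "approach", "song", "chase"]]
    for current, row in transition_probabilities(pair_sequences).items():
        print(current, row)
```

Computing one such table for pair-bonded dyads and another for non-pair-bonded dyads gives the two distributions being compared. Testing whether they differ statistically is a separate step not shown here.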
Acknowledgements
We are grateful to Henry Korpi, Ana Alonso, Greg Forkin, and Marcelina Martynek for helpful discussions and their many contributions to the annotations in the dataset. We gratefully acknowledge support through the following grants: National Science Foundation IOS-1557499, National Science Foundation MRI 1626008, National Science Foundation NCS-FO 2124355.