We present the Eindhoven Wildflower Dataset (EWD) together with a PyTorch object detection model that classifies and counts wildflowers. EWD was collected over two entire flowering seasons and annotated by experts. It contains 2,002 top-view images of flowering plants captured "in the wild" in five landscape types (roadsides, urban green spaces, cropland, weed-rich grassland, marshland). It holds a total of 65,571 annotations for 160 species belonging to 31 families of flowering plants, and serves as a reference dataset for automating wildflower monitoring and for object detection in general. To ensure consistent annotations, we define species-specific floral count units and provide extensive annotation guidelines. The presented baseline model, trained on a balanced subset of EWD, achieves a 0.82 mAP (@IoU > 0.50) score and is, to the best of our knowledge, superior in its class. Our approach enables automated quantification of wildflower richness and abundance, which helps in understanding and assessing natural capital, and encourages the development of standards for AI-based wildflower monitoring. The annotated EWD dataset and the code to train and run the baseline model are publicly available.
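As an illustration of the two quantities the abstract refers to, the sketch below computes the intersection over union (IoU) of two bounding boxes, which is the overlap measure behind the "@IoU > 0.50" criterion, and tallies detections per species above a confidence threshold. It is a minimal sketch and not the published baseline code: the box format `(x1, y1, x2, y2)`, the detection tuple layout, the function names, the `min_score` value, and the placeholder labels `species_a` and `species_b` are all assumptions made for illustration.

```python
from collections import Counter


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_by_species(detections, min_score=0.5):
    """Count detections per label, keeping those with score >= min_score.

    `detections` is a hypothetical list of (label, score, box) tuples;
    the real model output format may differ.
    """
    return Counter(label for label, score, _ in detections if score >= min_score)


# Hypothetical detections for one image (placeholder labels and values).
detections = [
    ("species_a", 0.91, (10, 10, 50, 50)),
    ("species_a", 0.42, (60, 60, 90, 90)),   # below threshold, not counted
    ("species_b", 0.77, (100, 20, 140, 70)),
]
counts = count_by_species(detections)
```

Under this convention, a predicted box counts as a match for an annotated box of the same species only when their IoU exceeds 0.50; mAP then averages precision over recall levels and over species.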

doi.org/10.1371/journal.pone.0302958
PLOS ONE

Released under the CC-BY 4.0 ("Attribution 4.0 International") License


Schouten, G., Michielsen, B. S. H. T., & Gravendeel, B. (2024). Data-centric AI approach for automated wildflower monitoring. PLOS ONE, 19(9). doi:10.1371/journal.pone.0302958