A COCO-Formatted Instance-Level Dataset for Plasmodium Falciparum Detection in Giemsa-Stained Blood Smears
By: Frauke Wilm, Luis Carlos Rivera Monroy, Mathias Öttl and more
Potential Business Impact:
Helps doctors find malaria faster with computers.
Accurate detection of Plasmodium falciparum in Giemsa-stained blood smears is an essential component of reliable malaria diagnosis, especially in developing countries. Deep learning-based object detection methods have demonstrated strong potential for automated malaria diagnosis, but their adoption is limited by the scarcity of datasets with detailed instance-level annotations. In this work, we present an enhanced version of the publicly available NIH malaria dataset, with detailed bounding box annotations in COCO format to support object detection training. We validated the revised annotations by training a Faster R-CNN model to detect infected and non-infected red blood cells, as well as white blood cells. Cross-validation on the original dataset yielded F1 scores of up to 0.88 for infected cell detection. These results underscore the importance of annotation volume and consistency, and demonstrate that automated annotation refinement combined with targeted manual correction can produce training data of sufficient quality for robust detection performance. The updated annotation set is publicly available via GitHub: https://github.com/MIRA-Vision-Microscopy/malaria-thin-smear-coco.
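To illustrate what instance-level annotations in COCO format look like, here is a minimal Python sketch. It is not taken from the released dataset: the file name, category names, image size, and box coordinates are all hypothetical placeholders. The only parts assumed from the COCO convention itself are the top-level keys (`images`, `categories`, `annotations`) and the `[x, y, width, height]` bounding box layout. The sketch counts instances per class and converts a box to corner format, which detection frameworks such as torchvision's Faster R-CNN expect.

```python
from collections import Counter

# Hypothetical miniature COCO-style annotation dictionary.
# All names and numbers below are illustrative, not values from the dataset.
coco = {
    "images": [
        {"id": 1, "file_name": "smear_0001.png", "width": 1280, "height": 960},
    ],
    "categories": [
        {"id": 1, "name": "infected_rbc"},
        {"id": 2, "name": "uninfected_rbc"},
        {"id": 3, "name": "wbc"},
    ],
    "annotations": [
        # COCO bounding boxes are [x_min, y_min, width, height] in pixels.
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [100.0, 120.0, 40.0, 42.0], "area": 1680.0, "iscrowd": 0},
        {"id": 2, "image_id": 1, "category_id": 2,
         "bbox": [200.0, 220.0, 38.0, 40.0], "area": 1520.0, "iscrowd": 0},
        {"id": 3, "image_id": 1, "category_id": 2,
         "bbox": [300.0, 80.0, 41.0, 39.0], "area": 1599.0, "iscrowd": 0},
        {"id": 4, "image_id": 1, "category_id": 3,
         "bbox": [500.0, 400.0, 70.0, 72.0], "area": 5040.0, "iscrowd": 0},
    ],
}


def count_instances(coco_dict):
    """Count annotated instances per category name."""
    id_to_name = {c["id"]: c["name"] for c in coco_dict["categories"]}
    return Counter(id_to_name[a["category_id"]] for a in coco_dict["annotations"])


def xywh_to_xyxy(bbox):
    """Convert a COCO [x, y, w, h] box to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


counts = count_instances(coco)
for name, n in sorted(counts.items()):
    print(f"{name}: {n}")

print(xywh_to_xyxy(coco["annotations"][0]["bbox"]))
```

For a real annotation file, the same functions would apply to the dictionary returned by `json.load` on that file. The actual category names and ids should be read from its `categories` list, not assumed.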