New York Smells: A Large Multimodal Dataset for Olfaction
By: Ege Ozguroglu, Junbang Liang, Ruoshi Liu, and more
Potential Business Impact:
Teaches computers to smell and identify things.
While olfaction is central to how animals perceive the world, this rich chemical sensory modality remains largely inaccessible to machines. One key bottleneck is the lack of diverse, multimodal olfactory training data collected in natural settings. We present New York Smells, a large dataset of paired image and olfactory signals captured "in the wild." Our dataset contains 7,000 smell-image pairs from 3,500 distinct objects across indoor and outdoor environments, with approximately 70× more objects than existing olfactory datasets. Our benchmark has three tasks: cross-modal smell-to-image retrieval; recognition of scenes, objects, and materials from smell alone; and fine-grained discrimination between grass species. Through experiments on our dataset, we find that visual data enables cross-modal olfactory representation learning, and that our learned olfactory representations outperform widely-used hand-crafted features.
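The abstract does not spell out how "visual data enables cross-modal olfactory representation learning," but the paired smell-image setup suggests a CLIP-style contrastive alignment of the two modalities. The sketch below is an illustrative assumption, not the authors' method: the encoder architectures, the 64-dimensional gas-sensor vector, the InfoNCE objective, and all names (SmellEncoder, ImageEncoder, contrastive_loss) are hypothetical stand-ins.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmellEncoder(nn.Module):
    """Maps a raw gas-sensor reading (assumed 64-dim) into a shared embedding space."""
    def __init__(self, sensor_dim=64, embed_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(sensor_dim, 512), nn.ReLU(),
            nn.Linear(512, embed_dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

class ImageEncoder(nn.Module):
    """Stand-in for a vision backbone: projects precomputed image features into the same space."""
    def __init__(self, feat_dim=768, embed_dim=256):
        super().__init__()
        self.proj = nn.Linear(feat_dim, embed_dim)

    def forward(self, x):
        return F.normalize(self.proj(x), dim=-1)

def contrastive_loss(smell_emb, image_emb, temperature=0.07):
    """Symmetric InfoNCE: each smell should match its paired image, and vice versa."""
    logits = smell_emb @ image_emb.t() / temperature
    targets = torch.arange(len(logits), device=logits.device)
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2

# Toy training step on random stand-in data; real pairs would come from the dataset.
smell_enc, image_enc = SmellEncoder(), ImageEncoder()
opt = torch.optim.Adam(list(smell_enc.parameters()) + list(image_enc.parameters()), lr=1e-4)
sensor_readings = torch.randn(32, 64)   # batch of gas-sensor vectors (hypothetical format)
image_features = torch.randn(32, 768)   # batch of precomputed image features
loss = contrastive_loss(smell_enc(sensor_readings), image_enc(image_features))
opt.zero_grad(); loss.backward(); opt.step()

# Smell-to-image retrieval (the first benchmark task): rank candidate images
# by cosine similarity to a query smell embedding.
with torch.no_grad():
    query = smell_enc(sensor_readings[:1])
    gallery = image_enc(image_features)
    ranking = (query @ gallery.t()).argsort(descending=True)
```

Under this assumed setup, the recognition and fine-grained discrimination tasks would reuse the learned smell embeddings with a lightweight classifier on top, which is one plausible way the learned features could be compared against hand-crafted ones.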
Similar Papers
SMELLNET: A Large-scale Dataset for Real-world Smell Recognition
Artificial Intelligence
AI learns to identify smells the way a nose does.
Diffusion Graph Neural Networks and Dataset for Robust Olfactory Navigation in Hazard Robotics
Robotics
Helps robots navigate by smell in hazardous settings.
Data Augmentation via Latent Diffusion Models for Detecting Smell-Related Objects in Historical Artworks
CV and Pattern Recognition
Finds smell-related objects in old paintings.