The Kvasir-Capsule Dataset

The largest gastrointestinal PillCAM dataset.
Also available as an OSF repository with
file browsing and as an OSF preprint.

 ~112 400 (jpg, avi)

Artificial intelligence (AI) is predicted to have profound effects on the future of video capsule endoscopy (VCE) technology. The potential lies in improving anomaly detection while reducing manual labour. However, medical data is often sparse and unavailable to the research community, and qualified medical personnel rarely have time for the tedious labelling work. In this respect, we present Kvasir-Capsule, a large VCE dataset collected from examinations at Bærum Hospital in Norway.

Kvasir-Capsule consists of 118 videos from which we can generate 4,820,739 image frames. We have labelled and medically verified 44,228 frames with a bounding box around detected anomalies from 13 different classes of findings. In addition to these labelled images, there are 2,785,829 unlabelled frames included in the dataset. Initial experiments demonstrate the potential benefits of AI-based computer-assisted diagnosis systems for VCE. However, they also show that there is considerable room for improvement, and the Kvasir-Capsule dataset can play a valuable role in developing better algorithms so that VCE technology can reach its full potential.


@article{Smedsrud2021,
  title = {{Kvasir-Capsule, a video capsule endoscopy dataset}},
  author = {Smedsrud, Pia H and Thambawita, Vajira and Hicks, Steven A and Gjestang, Henrik and Nedrejord, Oda Olsen and N{\ae}ss, Espen and Borgli, Hanna and Jha, Debesh and Berstad, Tor Jan Derek and Eskeland, Sigrun L and Lux, Mathias and Espeland, H{\aa}vard and Petlund, Andreas and Nguyen, Duc Tien Dang and Garcia-Ceja, Enrique and Johansen, Dag and Schmidt, Peter T and Toth, Ervin and Hammer, Hugo L and de Lange, Thomas and Riegler, Michael A and Halvorsen, P{\aa}l},
  doi = {10.1038/s41597-021-00920-z},
  journal = {Scientific Data},
  number = {1},
  pages = {142},
  volume = {8},
  year = {2021}
}

Labeled Images

In total, the dataset contains 44,228 labeled images stored using the PNG format. The images can be found in the images folder. The class of each image corresponds to the folder in which it is stored. For example, the ’polyp’ folder contains all polyp images, and the ’Angiectasia’ folder contains all images of Angiectasia. The number of images per class is not balanced, which is a common challenge in the medical field because some findings occur more often than others. This poses an additional challenge for researchers, since methods applied to the data should also be able to learn from a small amount of training data. The labeled images represent 13 different classes of findings. Furthermore, the labeled image data includes bounding box coordinates, which can be found in the metadata.csv file.
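Because the class label is encoded in the folder name, the class imbalance mentioned above can be inspected directly from the directory layout. The following is a minimal sketch, not part of the dataset tooling: the function name `count_images_per_class` and the set of accepted file extensions are assumptions made for illustration, and the root folder is assumed to be the images folder described above.

```python
from collections import Counter
from pathlib import Path

# Assumption: file extensions treated as images; adjust to match your copy.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def count_images_per_class(images_root):
    """Return a Counter mapping each class folder name to its image count.

    Every direct subfolder of `images_root` is treated as one class,
    following the folder-per-class layout described in the text.
    """
    root = Path(images_root)
    counts = Counter()
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Count image files anywhere below the class folder.
        counts[class_dir.name] = sum(
            1
            for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

Calling `count_images_per_class("images").most_common()` would then list the classes from most to least frequent, which gives a quick view of how uneven the distribution is before choosing a training strategy.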

Labeled Videos

The dataset contains a total of 44 labeled videos containing different findings and landmarks. This corresponds to approximately 19 hours of video and 2,034,910 video frames that can be converted to images if needed. Each video has been manually assessed by a medical professional working in the field of gastroenterology, which resulted in a total of 44,228 annotated findings.

Unlabeled Videos

In total, the dataset contains 74 unlabeled videos, corresponding to approximately 25 hours of video and 2,785,829 video frames.
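As a quick consistency check, the labeled and unlabeled video figures can be added up and compared with the overall totals of 118 videos and 4,820,739 frames quoted in the overview. The sketch below uses only numbers stated on this page; the dictionary and function names are illustrative, and the implied frame rate is only a rough estimate because the durations are given as approximate hours.

```python
# Figures quoted in the text for the two video subsets.
LABELED = {"videos": 44, "frames": 2_034_910, "hours": 19}
UNLABELED = {"videos": 74, "frames": 2_785_829, "hours": 25}


def combined(key):
    """Sum one quantity over the labeled and unlabeled subsets."""
    return LABELED[key] + UNLABELED[key]


def implied_fps(subset):
    """Rough frames-per-second estimate from the approximate duration."""
    return subset["frames"] / (subset["hours"] * 3600)


total_videos = combined("videos")   # 118
total_frames = combined("frames")   # 4,820,739

print(f"videos: {total_videos}, frames: {total_frames:,}")
print(f"labeled ~{implied_fps(LABELED):.1f} fps, "
      f"unlabeled ~{implied_fps(UNLABELED):.1f} fps")
```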

Terms of use

Kvasir-Capsule is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source. This means that in all documents and papers that use or refer to the Kvasir-Capsule dataset or report experimental results based on the dataset, a reference to the related article needs to be added: PREPRINT:

Additionally, one should provide a link to the Creative Commons license and indicate if changes were made. The images or other third-party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit

Ethics approval

In this study, we used fully anonymized data approved by the Privacy Data Protection Authority. The study was exempted from approval by the Regional Committee for Medical and Health Research Ethics - South East Norway. Furthermore, we confirm that all experiments were performed in accordance with the relevant guidelines and regulations of the Regional Committee for Medical and Health Research Ethics - South East Norway, and the GDPR.


Email pia (_at_) simula (_dot_) no if you have any questions about the dataset and our research activities. We always welcome collaboration and joint research!