Updated June 2021

Annotation tools for building datasets

A list of open-source annotation tools for labeling data from across the web.

Use this form to add new tools to the list.

Subscribe to get updates when new datasets and tools are released.
Name License
LabelImg is a graphical image annotation tool. It is written in Python and uses Qt for its graphical interface. Annotations are saved as XML files in PASCAL VOC format, the format used by ImageNet.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
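As a rough illustration of what a PASCAL VOC-style XML annotation looks like and how it might be read back, here is a minimal sketch using only the Python standard library. The sample file name, image size, and box coordinates are made up for illustration, and `parse_voc` is a hypothetical helper, not part of LabelImg.

```python
import xml.etree.ElementTree as ET

# Illustrative PASCAL VOC-style annotation. The file name, image size,
# label, and coordinates are invented for this example.
SAMPLE_VOC_XML = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc(xml_text):
    """Return (filename, (width, height), [(label, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))
    boxes = []
    # Each <object> element holds one labeled bounding box.
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        boxes.append((
            obj.findtext("name"),
            int(bndbox.findtext("xmin")),
            int(bndbox.findtext("ymin")),
            int(bndbox.findtext("xmax")),
            int(bndbox.findtext("ymax")),
        ))
    return root.findtext("filename"), (width, height), boxes


if __name__ == "__main__":
    print(parse_voc(SAMPLE_VOC_XML))
```

Real annotation files usually carry more fields (for example folder, pose, or difficulty flags); the sketch reads only the ones needed to recover labeled boxes.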
Labelme is a graphical image annotation tool inspired by http://labelme.csail.mit.edu. It is written in Python and uses Qt for its graphical interface.
GPL
GNU General Public License v3.0 - Permissions of this strong copyleft license are conditioned on making available complete source code of licensed works and modifications, which include larger works using a licensed work, under the same license. Copyright and license notices must be preserved. Contributors provide an express grant of patent rights.
Label Studio is an open source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats. It can be used to prepare raw data or improve existing training data to get more accurate ML models.
Apache
Apache License 2.0 - A permissive license whose main conditions require preservation of copyright and license notices. Contributors provide an express grant of patent rights. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
CVAT is a completely redesigned and reimplemented version of the Video Annotation Tool from Irvine, California. It is a free, online, interactive video and image annotation tool for computer vision. It is being used by our team to annotate millions of objects with different properties. Many UI and UX decisions are based on feedback from a professional data annotation team.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
doccano is an open source text annotation tool for humans. It provides annotation features for text classification, sequence labeling, and sequence to sequence tasks. You can create labeled data for sentiment analysis, named entity recognition, text summarization, and so on. Just create a project, upload data, and start annotating. You can build a dataset in hours.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
An open source annotation and labeling tool for image and video assets. VoTT is a React + Redux Web application, written in TypeScript. This project was bootstrapped with Create React App. Features include: the ability to label images or video frames; an extensible model for importing data from local or cloud storage providers; an extensible model for exporting labeled data to local or cloud storage providers. VoTT helps facilitate an end-to-end machine learning pipeline.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
A client-side JavaScript component to display and interact with audio waveforms in the browser. Peaks.js was developed by BBC R&D to allow users to make accurate clippings of audio content in the browser, using a backend API that serves the waveform data.
GPL
GNU General Public License v3.0 - Permissions of this strong copyleft license are conditioned on making available complete source code of licensed works and modifications, which include larger works using a licensed work, under the same license. Copyright and license notices must be preserved. Contributors provide an express grant of patent rights.
2k
A free and open source tool that aims to significantly reduce the time of labeling in object detection projects. No installation is required, all you need is a browser. Make Sense is online, but we care about your privacy and we don't send your photos anywhere.
GPL
GNU General Public License v3.0 - Permissions of this strong copyleft license are conditioned on making available complete source code of licensed works and modifications, which include larger works using a licensed work, under the same license. Copyright and license notices must be preserved. Contributors provide an express grant of patent rights.
NeuroNER is a program that performs named-entity recognition (NER).
Not found
License information not found
GUI for marking bounding boxes of objects in images for training YOLO v3 and v2 neural networks
Unlicense
A license with no conditions whatsoever which dedicates works to the public domain. Unlicensed works, modifications, and larger works may be distributed under different terms and without source code.
The Universal Data Tool (UDT) is an open-source web or downloadable tool for labeling data for use in machine learning or data processing systems. The Universal Data Tool supports Computer Vision and Natural Language Processing (including Named Entity Recognition and Audio Transcription) workflows.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
brat is a web-based tool for text annotation; that is, for adding notes to existing text documents. brat is designed in particular for structured annotation, where the notes are not freeform text but have a fixed form that can be automatically processed and interpreted by a computer.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
COCO Annotator is a web-based image annotation tool designed for versatility and for efficiently labeling images to create training data for image localization and object detection. It provides many distinct features, including the ability to label an image segment (or part of a segment), track object instances, label objects with disconnected visible parts, and efficiently store and export annotations in the well-known COCO format. The annotation process is delivered through an intuitive and customizable interface and provides many tools for creating accurate datasets.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
Software that allows you to manually and quickly annotate images in directories. The method is pseudo-manual because it uses OpenCV's marker-based watershed algorithm. The general idea is to manually provide the markers with brushes and then launch the algorithm.
LGPL
GNU Lesser General Public License v3.0 - Permissions of this copyleft license are conditioned on making available complete source code of licensed works and modifications under the same license or the GNU GPLv3. Copyright and license notices must be preserved. Contributors provide an express grant of patent rights. However, a larger work using the licensed work through interfaces provided by the licensed work may be distributed under different terms and without source code for the larger work.
1k
A web based labeling tool for creating AI training data sets (2D and 3D). The tool has been developed in the context of autonomous driving research. It supports images (.jpg or .png) and point clouds (.pcd). It is a Meteor app developed with React, Paper.js and three.js.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
976
FIAT enables image data annotation, data augmentation, data extraction, and result visualisation/validation. Annotate images for image classification, optical character reading (digit classification, letter classification), ... Extract data into different formats (Caffe LMDB, OpenCV Cascade Classifiers, Tesseract ...) with data augmentation (resizing, noise in translation / rotation / scaling, pepper noise, Gaussian noise, rectangle merging, line extraction ...).
GPL
GNU General Public License v2.0 - The GNU GPL is the most widely used free software license and has a strong copyleft requirement. When distributing derived works, the source code of the work must be made available under the same license. There are multiple variants of the GNU GPL, each with different requirements.
921
A web based tool to label images for objects that can be used to train dlib or other object detectors.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
738
Label images and video for Computer Vision applications
Apache
Apache License 2.0 - A permissive license whose main conditions require preservation of copyright and license notices. Contributors provide an express grant of patent rights. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
732
YEDDA (previously SUTDAnnotator) is developed for annotating chunks, entities, and events in text (almost all languages, including English and Chinese), symbols, and even emoji. It supports shortcut annotation, which makes annotating text by hand extremely efficient: the user only needs to select a text span and press a shortcut key, and the span is annotated automatically. It also supports a command annotation model, which annotates multiple entities in batch, and supports exporting annotated text into sequence text.
Apache
Apache License 2.0 - A permissive license whose main conditions require preservation of copyright and license notices. Contributors provide an express grant of patent rights. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
722
This is the official PyTorch reimplementation of Polygon-RNN++ (CVPR 2018).
Not found
License information not found
673
sloth is a tool for labeling image and video data for computer vision research.
GPL
GNU General Public License v3.0 - Permissions of this strong copyleft license are conditioned on making available complete source code of licensed works and modifications, which include larger works using a licensed work, under the same license. Copyright and license notices must be preserved. Contributors provide an express grant of patent rights.
587
A semi-automatic image annotation tool that helps you annotate images by suggesting annotations for 80 object classes using a pre-trained model
Apache
Apache License 2.0 - A permissive license whose main conditions require preservation of copyright and license notices. Contributors provide an express grant of patent rights. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
486
JavaScript image annotation tool based on image segmentation. Label image regions with the mouse. Written in vanilla JavaScript, with a require.js dependency (packaged). Pure client-side implementation of image segmentation.
BSD
BSD 3-Clause "New" or "Revised" License - A permissive license similar to the BSD 2-Clause License, but with a 3rd clause that prohibits others from using the name of the project or its contributors to promote derived products without written consent.
452
Curve is an open-source tool to help label anomalies on time-series data. Curve is designed to support plugins, so one can equip Curve with customized and powerful functions to help label effectively.
Apache
Apache License 2.0 - A permissive license whose main conditions require preservation of copyright and license notices. Contributors provide an express grant of patent rights. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
424
LOST (Label Object and Save Time) is a flexible web-based framework for semi-automatic image annotation. It provides multiple annotation interfaces for fast image annotation. LOST is flexible because it lets you run user-defined annotation pipelines in which different annotation interfaces, tools, and algorithms can be combined in one process. It is web-based because the whole annotation process is visualized in your browser. You can quickly set up LOST with Docker on your local machine or run it on a web server to make an annotation process available to your annotators around the world. LOST lets you organize label trees, monitor the state of an annotation process, and annotate inside the browser. LOST was especially designed to model semi-automatic annotation pipelines to speed up the annotation process. Such semi-automatic annotation can be achieved by using AI-generated annotation proposals that are presented to an annotator inside the annotation tool.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
418
A JavaScript interface for annotating and labeling audio files.
BSD
BSD 3-Clause "New" or "Revised" License - A permissive license similar to the BSD 2-Clause License, but with a 3rd clause that prohibits others from using the name of the project or its contributors to promote derived products without written consent.
327
Tool for labeling a single point cloud or a stream of point clouds.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
305
A semantic annotation platform offering intelligent assistance and knowledge management. The annotation of specific semantic phenomena often requires compiling task-specific corpora and creating or extending task-specific knowledge bases. Presently, researchers require a broad range of skills and tools to address such semantic annotation tasks.
Apache
Apache License 2.0 - A permissive license whose main conditions require preservation of copyright and license notices. Contributors provide an express grant of patent rights. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
261
A tool used to annotate 3D boxes in point clouds. Point clouds in KITTI-bin format are supported. The annotation format is the same as the Apollo 3D format.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
236
Anafora (pronounced "a-nuh-FOUR-uh", /ænəˈfɔɹə/) is a new annotation tool written at the University of Colorado by Wei-te Chen and Will Styler. Anafora is designed to be a lightweight, flexible annotation solution which is easy to deploy for large and small projects.
Apache
Apache License 2.0 - A permissive license whose main conditions require preservation of copyright and license notices. Contributors provide an express grant of patent rights. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
225
WebAnno is a general purpose web-based annotation tool for a wide range of linguistic annotations including various layers of morphological, syntactical, and semantic annotations. Additionally, custom annotation layers can be defined, allowing WebAnno to be used also for non-linguistic annotation tasks. WebAnno is a multi-user tool supporting different roles such as annotator, curator, and project manager. The progress and quality of annotation projects can be monitored and measured in terms of inter-annotator agreement. Multiple annotation projects can be conducted in parallel.
Apache
Apache License 2.0 - A permissive license whose main conditions require preservation of copyright and license notices. Contributors provide an express grant of patent rights. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
221
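To make "inter-annotator agreement" concrete, the sketch below computes Cohen's kappa for two annotators who labeled the same items. This is one common agreement measure chosen here as an assumption; the listing does not say which measures WebAnno itself reports, and `cohen_kappa` is a hypothetical helper written for this example.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label everywhere.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Made-up labels for four tokens.
    annotator_1 = ["PER", "PER", "ORG", "ORG"]
    annotator_2 = ["PER", "ORG", "ORG", "ORG"]
    print(cohen_kappa(annotator_1, annotator_2))
```

A value of 1 means perfect agreement and 0 means agreement no better than chance given each annotator's label frequencies.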
An infinitely customizable image annotation library built on React
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
213
Anomaly detection labeling tool, specifically for multiple time series (one time series per category).
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
212
A scalable open-source annotation web tool brought by Berkeley DeepDrive. Features: support for various types of annotations on both images and videos; innovative features and a user-friendly interface; improved speed through semi-automated annotations; support for concurrent annotation sessions and progress monitoring; access through a web browser without installation.
BSD
BSD 3-Clause "New" or "Revised" License - A permissive license similar to the BSD 2-Clause License, but with a 3rd clause that prohibits others from using the name of the project or its contributors to promote derived products without written consent.
195
A JavaScript image annotation library. Add drawing, commenting and labeling functionality to images in Web pages with just a few lines of code.
BSD
BSD 3-Clause "New" or "Revised" License - A permissive license similar to the BSD 2-Clause License, but with a 3rd clause that prohibits others from using the name of the project or its contributors to promote derived products without written consent.
195
This is a collaborative online tool for labeling image data. The Imagetagger is a database with integrated tools to create and manage image data and related labels. It was designed for the RoboCup to create training data for neural networks and evaluation data for diverse object recognition methods. Therefore cooperative labeling of the same data set, flexible further use of the images and labels and the option to share the data had to be made possible.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
183
Fast and efficient BBox annotation for your images in YOLO, and now, VOC/COCO formats!
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
174
Image annotation: provides a simple GUI for marking bounding boxes of objects in images for training YOLO v3 and v2. Object detection: a built-in image detector can automatically annotate the detected objects in the images. Search by tags: you can browse and search tagged images in the tags view. Localization: supports English, Simplified Chinese, and Traditional Chinese, with expandable support for other languages. Private space: a password-protected space where you can hide non-public image sources. UWP: support for the Universal Windows Platform (UWP); you can click this link to view it in the Windows app store. Due to the development cost of the UWP version, it will not be updated with the desktop version.
GPL
GNU General Public License v2.0 - The GNU GPL is the most widely used free software license and has a strong copyleft requirement. When distributing derived works, the source code of the work must be made available under the same license. There are multiple variants of the GNU GPL, each with different requirements.
164
SMART is an open source application designed to help data scientists and research teams efficiently build labeled training datasets for supervised machine learning tasks.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
140
VGG Image Annotator (VIA) is a simple, standalone manual annotation tool for image, audio, and video. VIA runs in a web browser and does not require any installation or setup. The complete VIA software fits in a single self-contained HTML page of less than 400 kilobytes that runs as an offline application in most modern web browsers. VIA is an open source project based solely on HTML, JavaScript, and CSS (no dependency on external libraries). VIA is developed at the Visual Geometry Group (VGG) and released under the BSD 2-Clause license, which allows it to be used for both academic projects and commercial applications.
BSD
BSD 2-Clause “Simplified” License - A permissive license that comes in two variants, the BSD 2-Clause and BSD 3-Clause. Both have very minute differences to the MIT license.
139
MUltiple VIdeos LABelling tool is a manual annotation tool to help you label videos for computer vision, machine learning, deep learning, and AI applications. With MuViLab you can annotate hours of videos in just a few minutes!
Non-commercial
MuViLab is freely available for non-commercial use and may be redistributed under these conditions.
136
LabelD is a quick and easy-to-use image annotation tool, built for academics, data scientists, and software engineers to enable single track or distributed image tagging. LabelD supports both localized, in-image (multi-)tagging, as well as image categorization.
AGPL
GNU Affero General Public License v3.0 - Permissions of this strongest copyleft license are conditioned on making available complete source code of licensed works and modifications, which include larger works using a licensed work, under the same license. Copyright and license notices must be preserved. Contributors provide an express grant of patent rights. When a modified version is used to provide a service over a network, the complete source code of the modified version must be made available.
129
DeepLabel is a cross-platform tool for annotating images with labelled bounding boxes. A typical use-case for the program is labelling ground truth data for object-detection machine learning applications. DeepLabel runs as a standalone app and compiles on Windows, Linux and Mac.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
120
VIAME is a computer vision application designed for DIY AI including object detection, object tracking, image mosaicing, stereo measurement, image/video search, image/video annotation, rapid model generation, and tools for the evaluation of different algorithms. Originally targeting marine species analytics, it now contains many common algorithms and libraries, and is also useful as a generic computer vision library.
BSD
BSD 3-Clause "New" or "Revised" License - A permissive license similar to the BSD 2-Clause License, but with a 3rd clause that prohibits others from using the name of the project or its contributors to promote derived products without written consent.
118
TagEditor is a desktop application (requires Windows 10, 64-bit) that allows you to quickly annotate text with the help of the spaCy library.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
104
A lightweight web-based tool for annotating word sequences.
Research
Research and Academic Use License
103
Image annotation tool by comma.ai
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
102
A web app to play, visualize, and annotate your audio files for machine learning
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
90
EVA is a web-based tool for efficient annotation of videos and image sequences. It is a re-design of BeaverDam with additional tracking capabilities. The annotation is done on a bounding box level and the labels can be exported in YOLO or Pascal VOC format.
BSD
BSD 2-Clause “Simplified” License - A permissive license that comes in two variants, the BSD 2-Clause and BSD 3-Clause. Both have very minute differences to the MIT license.
85
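Since several tools in this list export bounding boxes in YOLO or Pascal VOC format, the following sketch shows how one might convert a VOC-style pixel box into the normalized `class x_center y_center width height` line commonly used for YOLO training data. The function names and the sample numbers are hypothetical, and any given tool's exact export layout may differ.

```python
def voc_to_yolo(box, image_size):
    """Convert (xmin, ymin, xmax, ymax) in pixels to normalized
    (x_center, y_center, width, height) relative to the image size."""
    xmin, ymin, xmax, ymax = box
    img_w, img_h = image_size
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def yolo_line(class_id, box, image_size):
    """Format one annotation as a YOLO-style text line."""
    values = voc_to_yolo(box, image_size)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


if __name__ == "__main__":
    # Made-up box on a 400x400 image, class index 0.
    print(yolo_line(0, (100, 100, 300, 200), (400, 400)))
```

The VOC form stores absolute pixel corners, while the YOLO form stores the box center and size as fractions of the image, so the image dimensions are needed for the conversion.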
The Wearables Development Toolkit - a development environment for activity recognition applications with sensor signals
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
75
A collaborative framework for annotating medical datasets using crowdsourcing.
Apache
Apache License 2.0 - A permissive license whose main conditions require preservation of copyright and license notices. Contributors provide an express grant of patent rights. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
63
OpenLabeler is an open source application for annotating objects. It can generate the PASCAL VOC format XML annotation file for artificial intelligence and deep learning training.
Apache
Apache License 2.0 - A permissive license whose main conditions require preservation of copyright and license notices. Contributors provide an express grant of patent rights. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
61
An Amazon Mechanical Turk turn-key segment tool. Turkey lets you easily create a web UI on Amazon Mechanical Turk to crowd-source image annotation data. Its main functions include: customizing the annotation modes and class labels on a per-image basis; importing previous annotations generated by either another human or an algorithm; zoom-in, zoom-out, delete, undo, and reset.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
56
KNOSSOS is a software tool for the visualization and annotation of 3D image data and was developed for the rapid reconstruction of neural morphology and connectivity.
GPL
GNU General Public License v3.0 - Permissions of this strong copyleft license are conditioned on making available complete source code of licensed works and modifications, which include larger works using a licensed work, under the same license. Copyright and license notices must be preserved. Contributors provide an express grant of patent rights.
53
MAE
MAE is a lightweight, general-purpose natural language annotation tool
GPL
GNU General Public License v3.0 - Permissions of this strong copyleft license are conditioned on making available complete source code of licensed works and modifications, which include larger works using a licensed work, under the same license. Copyright and license notices must be preserved. Contributors provide an express grant of patent rights.
51
Time series annotation library
Not found
License information not found
48
Pixie is a GUI annotation tool which provides the bounding box, polygon, free drawing and semantic segmentation object labelling.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
46
An in-browser app for labeling audio clips at random, using Docker and Flask.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
31
This is a collaborative online tool for labeling image data.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
31
This is a tool used to create 6D labels for 2D images. 6D labelling means that a 3D object is fitted onto its projection in a 2D picture. It is 6D because fitting the 3D bounding box requires 3 translational inputs (x, y, z) and 3 rotational degrees. It is designed to create labels for singleshotpose and betapose, but should be easily extendable to other 6D label formats.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
9
The Bio-Image Indexing and Graphical Labelling Environment (BIIGLE) is a web service for the efficient and rapid annotation of still images and videos.
MIT
MIT License - A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
5
© 2021 Nikola Plesa | Privacy | Datasets | Annotation tools
hello@datasetlist.com