TECH OFFER

Gamified Data Annotation Platform for Supervised Machine Learning

KEY INFORMATION

TECHNOLOGY CATEGORY:

Infocomm - Artificial Intelligence
Infocomm - Data Processing

TECHNOLOGY READINESS LEVEL (TRL):

TRL8

LOCATION:

Singapore

ID NUMBER:

TO174642

Technology Readiness Level

TRL	Physical Sciences & Engineering	Healthcare (Pharmaceutical)	Healthcare (Medtech)	Healthcare (Diagnostics)	Simplified
1	Basic principles observed	Basic principles observed	Basic principles observed	Basic principles observed	Proof-of-Concept
2	Technology concept formulated	Technology concept formulated	Technology concept formulated	Technology concept formulated	Proof-of-Concept
3	Experimental proof of concept	Experimental proof of concept in vitro and in vivo research model	Experimental proof of concept in vitro and in vivo research models	Experimental proof of concept in vitro	Proof-of-Concept
4	Technology validated in lab	Proof of concept in vitro and in vivo research models	Proof of concept in vitro and in vivo research models	Proof of concept in vitro and in vivo research models	Prototype in Lab
5	Technology validated in relevant environment	Non-clinical and pre-clinical research studies, & initial demonstration of feasibility and efficacy	Product Development Plan
6	Technology demonstrated in relevant environment	Phase I clinical trials	Phase I clinical trials
7	System prototype demonstration in operational environment	Phase 2 clinical trials	Clinical safety and effectiveness trials in operational environment	Clinical validation in 1 site	Prototype in Live Environment
8	System complete and qualified	Phase 3 clinical trials	Overall risk-benefit Trials
9	Actual system proven in operational environment	Pharmaceutical can be distributed or marketed	Medical device can be distributed or marketed	Clinical validation in multi-site	Ready-to-Market

TECHNOLOGY OVERVIEW

Machine Learning (ML) is a sub-field of Artificial Intelligence (AI) in which a machine learns from data without being explicitly programmed. However, before a machine can perform even the simplest AI tasks, e.g. differentiating between images of an elephant and a tiger, it has to be trained on images of both animals. In supervised learning, training data must be properly labelled or annotated by a human so that the machine can extract the relevant features and produce an ML model that serves its intended purpose. Data annotation therefore plays an important role in producing robust, accurate ML algorithms for video analytics, natural language processing, and audio processing. However, many organisations embarking on supervised learning have difficulty gaining access to high-quality labelled datasets, known as ground truth data, because much of the available data is low-quality, expensive or unstructured.

This technology offer is a mobile application-based data platform that enables companies to obtain high-quality annotated data. It decentralises data collection and data annotation by breaking tasks into manageable, bite-sized chunks for optimal annotation performance, and crowdsources or outsources the annotation work to a pool of data taggers via a mobile application. Labelling quality is maintained through a gamification system and a series of built-in verification procedures, including AI-assisted pre-filtering and collective human quality control.

TECHNOLOGY FEATURES & SPECIFICATIONS

This technology offer comprises a mobile application for data taggers to participate in crowd-sourced annotation and a web portal that serves as a control panel for organisations to submit and track the progress of their annotation tasks. A proprietary system for data quality assurance consists of the following:

Data pre-processing
Gamification and levelling (mobile application)
Clustering algorithms to filter outliers
AI-assisted filters to detect anomalies
Human quality control
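
The offer does not disclose how its quality assurance steps are implemented. As an illustration only, collective human quality control is often approximated by majority voting across several taggers, with low-agreement items escalated to a reviewer. The sketch below assumes that approach; the function name `consensus_label`, the `min_agreement` threshold and the sample data are hypothetical and are not taken from the platform.

```python
from collections import Counter


def consensus_label(votes, min_agreement=0.6):
    """Return (label, agreement, accepted) for the votes cast on one item.

    Assumption: each tagger submits one label per item, and an item is
    accepted only if the share of the most common label reaches min_agreement.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    label, count = Counter(votes).most_common(1)[0]
    agreement = count / len(votes)
    return label, agreement, agreement >= min_agreement


# Hypothetical votes from several data taggers for two images.
votes_by_item = {
    "img_001": ["tiger", "tiger", "tiger", "elephant"],
    "img_002": ["tiger", "elephant"],
}

for item_id, votes in votes_by_item.items():
    label, agreement, accepted = consensus_label(votes)
    status = "accepted" if accepted else "needs human review"
    print(f"{item_id}: {label} ({agreement:.0%}) -> {status}")
```

A threshold of this kind trades throughput for quality: raising it sends more items to human review, lowering it accepts more labels automatically.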

The mobile application has the following features:

Built-in gamification rewards data taggers for generating high-quality labelled data
Simplified, micro-job structure within an anywhere, anytime annotation tool

The web portal has the following features:

Manage the upload and distribution of raw data
Data annotation workload is split algorithmically, down to the basic unit of each annotation task
Download labelled/annotated data in various commonly used formats; customisable for new formats
Library of ready-to-use datasets
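
The splitting algorithm itself is not described in this offer. A minimal sketch of the general idea, assuming raw items are simply grouped into fixed-size micro-job batches, is shown below; `split_into_microjobs` and `batch_size` are hypothetical names chosen for illustration.

```python
def split_into_microjobs(item_ids, batch_size=10):
    """Group raw data items into small batches that can be handed to taggers.

    Assumption: every item is an independent annotation unit, so batches can
    be formed by slicing the list in order.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        item_ids[start:start + batch_size]
        for start in range(0, len(item_ids), batch_size)
    ]


# Hypothetical upload of 25 images split into micro-jobs of up to 10 items.
uploaded = [f"img_{n:03d}" for n in range(25)]
jobs = split_into_microjobs(uploaded, batch_size=10)
print([len(job) for job in jobs])
```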

Supports common data annotation formats:

YOLO Darknet TXT
TensorFlow CSV
COCO JSON
PASCAL VOC XML
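
These formats encode the same bounding box differently. As commonly used, PASCAL VOC stores pixel corners (xmin, ymin, xmax, ymax), COCO stores pixel [x, y, width, height] from the top-left corner, and YOLO Darknet stores a class index followed by the box centre and size normalised by the image dimensions. The sketch below converts one box between these conventions; the helper names are hypothetical, and the platform's own export code is not shown in this offer.

```python
def voc_to_coco(xmin, ymin, xmax, ymax):
    """Pixel corners -> COCO-style [x, y, width, height] in pixels."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]


def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Pixel corners -> YOLO-style (cx, cy, w, h), each normalised to 0..1."""
    cx = (xmin + xmax) / 2 / img_w
    cy = (ymin + ymax) / 2 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return cx, cy, w, h


def yolo_line(class_id, box, img_w, img_h):
    """Format one object as a line of a YOLO Darknet TXT label file."""
    cx, cy, w, h = voc_to_yolo(*box, img_w, img_h)
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


# Hypothetical box in a 640x480 image, given as (xmin, ymin, xmax, ymax).
box = (100, 50, 300, 250)
print(voc_to_coco(*box))
print(yolo_line(0, box, img_w=640, img_h=480))
```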

Audio annotations are captured in an SRT (subtitle) file, while classification type annotations are saved to a CSV file.
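
An SRT file is a sequence of numbered cues, each with a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range followed by the text. The sketch below shows how transcribed audio segments could be serialised in that layout; the `(start, end, text)` segment structure and the function names are assumptions made for illustration.

```python
def srt_timestamp(seconds):
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Serialise (start_seconds, end_seconds, text) tuples as SRT cues."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)


# Hypothetical transcription of two audio segments.
segments = [
    (0.0, 2.5, "Good morning."),
    (2.5, 6.0, "Where is the nearest train station?"),
]
print(to_srt(segments))
```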

POTENTIAL APPLICATIONS

The data labelling/annotation platform bridges the gap between people who have the time to deal with unstructured data and the organisations that do not. The data platform supports the labelling/annotation of various formats of unstructured data:

Image (Bounding Box, Image Classification, Polygonal Bounding, Image-to-text transcription):
Build robust detection, background/foreground segmentation, or image classification AI models supported by high-quality annotated data

Text (Entity Extraction, Intent Recognition, Sentiment Analysis, Text Classification):
Imbue chatbots with enhanced natural language processing capabilities, with the ability to understand region-specific intent and discern the user's sentiment (positive, neutral, negative)

Audio (Audio Transcription, Sound Classification, Audio Translation):
Eliminate accent bias and improve audio/conversational AI with a wider range of vocabulary, in multiple languages

Video (Bounding Boxes, Polygonal Bounding, Subtitling):
Accelerate computer vision model development (model training and testing) for person and object detection, with accurately labelled ground truth data

Market Trends & Opportunities

The need for data annotation increases with AI/ML growth. The data annotation market is expected to grow from US$1.35B in 2020 to US$8.2B in 2028, with the highest CAGR of 31.1% projected for the period 2021 to 2030.
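
As a quick arithmetic check, the compound annual growth rate implied by the two endpoint figures can be computed with the standard formula (end / start) ** (1 / years) - 1. The sketch below uses only the US$1.35B (2020) and US$8.2B (2028) values quoted above and yields roughly 25% per year. The 31.1% figure is quoted for a different period (2021 to 2030), and this offer does not state its source, so the two numbers are not directly comparable.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Endpoint figures quoted in the text, in US$ billions.
implied = cagr(start_value=1.35, end_value=8.2, years=2028 - 2020)
print(f"Implied CAGR 2020-2028: {implied:.1%}")
```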

Unique Value Proposition

This technology has the following benefits:

Crowd-sources dataset collection and reduces the time required for ground truth data annotation
Technology companies can collect and annotate raw data that support their own core AI products, especially when it is inefficient/expensive to maintain and/or scale a team of full-time data annotators.
Non-technology organisations that have access to large amounts of visual or unstructured data will be able to monetise their annotated datasets or use annotated data to support in-house AI/ML projects that relate to their respective industry.
Proprietary quality control system ensures annotated data quality is maintained
Addresses contextual, localised data annotation needs, e.g. region-specific translation of a native language or annotation of local landmarks

The technology owner is interested in collaborating with organisations to test-bed its existing competencies, and with deep-tech companies to develop data generation and AI augmentation capabilities across various sectors and industries.
