Class AI — A tool to improve the efficiency of data labeling processes

William Law
6 min read · Dec 31, 2019

With new discoveries in AI arriving almost daily, its applications keep expanding. Take AI Dungeon 2, an open-source text adventure game that uses OpenAI’s GPT-2 (an NLP model) to generate effectively limitless open-ended storylines.

Or take self-driving cars. Companies like Uber are researching how to predict the intent of drivers from signal cues, and how to build more efficient algorithms that fuse data from different sensors to give the computer a more robust, detailed input.

In both of these cases, the machine learning model needs to learn from data (which is why I excluded the breakthroughs in reinforcement learning, which mostly don’t rely on labeled datasets). The performance of the model depends on the quality of the data you give it, and roughly follows this formula:

Higher-quality data = higher accuracy/performance from the model

However, cleaning and processing data takes more time and resources than a lot of companies can spare. It’s not surprising that they outsource this work, which is why startups like Scale AI, Labelbox, and Playment exist to solve this problem: providing high-quality training data for AI.

William Law

swe // trading — prev: @MLHacks, eng @ early-stage startups | Twitter @wlaw_