Classification of content into appropriate categories, including keyword tagging and categorization of images, product descriptions or websites. Extraction of particular words or phrases to determine whether content is positive, negative or neutral.
Content assessment and analysis
Review of sponsored listings against specific guidelines set by the client, as well as scoring the quality of machine-translated segments or fixing their errors to produce natural, error-free translations.
The Gengo experience
We provide high-quality data for machine learning at scale
With over nine years of know-how in providing AI training data, Gengo is the trusted platform for leading global tech companies for natural language, speech, communication and multilingual projects. Our crowd of more than 25,000 highly skilled and specialized language professionals is located across the globe and available 24/7, providing access to a huge volume of data across all languages and file types.
Working with Gengo
Our multilingual workforce will take care of your data for complete ease of use
Automatic job distribution
Our advanced crowd system allows projects to be automatically distributed to qualified contributors to begin work immediately.
Streamlined platform and communications
Our innovative and fully automated platform offers seamless project and crowd management, with cost-effective pricing and uncomplicated interactions for on-time delivery.
Advanced quality check system
Our established quality assurance system includes built-in validation, worker spot-checking and a worker seniority system to ensure high-quality data.
Three simple ways to get high-quality data
Send your file to one of our personal account managers.
Link directly with our API for high-volume data and a seamless tech approach.
Or tell us your requirements and we’ll create the data from scratch to suit your specific business needs.
Customer case studies
With access to a huge volume of data in numerous language-related categories, we’ve generated extensive data sets for some of the world’s top companies
Machine translation retraining
Sourced over 200,000 Japanese-to-English segments to train deep-learning machine translation models.
Created a customized team and process to select the top three most informative comments across hundreds of forum posts.
Audio speech analysis
Evaluated hundreds of machine-generated speech samples to identify problematic pronunciation and errors and to determine overall naturalness.