Appen optimizes the cost and quality of a massive data project.
Result: 70% cost reduction and 30% increase in accuracy
"eBay has completed more than 15 different types of high-value projects through Appen over the past year."
Founded in 1995, eBay Inc. is a global technology corporation. The company manages eBay.com, an online marketplace that connects millions of buyers and sellers worldwide.
Cost-Effective Use of Human Brainpower
eBay features millions of product taxonomies that originate from users all over the globe. The job of ensuring its constantly changing online catalog is optimally organized at all times can’t be done with computer algorithms alone. Human brainpower is needed to accurately categorize the products and create product taxonomies from the user’s standpoint.
In addition to product categorization, eBay needed an efficient method of finding a product’s Global Trade Item Number (GTIN). This unique 12- to 14-digit identifier is often missing from posted product descriptions, and the necessary task of finding it is complicated by the lack of a central repository of GTINs.
The company tried to leverage offshore teams in several low-cost destinations to complete these processes, but the traditional outsourcing model posed several challenges with respect to cost, scalability, and accuracy.
Microtasking + Automated Quality Control
Appen’s platform took huge amounts of eBay product information and broke it down into microtasks that were completed online by thousands of individuals collaborating across the globe. This massive collaborative effort ensured the enormous data task was accurately completed in record time and at low cost.
With Appen managing the workflow and checking the accuracy of the data returned, the final results sold eBay on the benefits of crowdsourcing. eBay has completed more than 15 different types of high-value projects through Appen over the past year.
To improve the product categorization algorithm, eBay and Appen collaborated to design a machine learning workflow. Through crowdsourcing, they were able to accomplish the job faster, at a lower cost, and with the best possible accuracy level. The workflow design, still in use for ongoing classification jobs, presents a contributor with a Product Image and a Product Title, along with one to six possible classifications.
To minimize the impact of first-response bias, the order of the possible classifications is randomized each time it is presented.
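The randomized presentation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Appen's implementation; the function name present_options and the sample category labels are hypothetical.

```python
import random

def present_options(options, rng=None):
    """Return a shuffled copy of the candidate classifications.

    `options` is the list of one to six possible classifications.
    `rng` is an optional random.Random instance (an assumption added
    here so the behaviour can be made reproducible in tests).
    """
    rng = rng or random
    shuffled = list(options)   # copy, so the caller's list is untouched
    rng.shuffle(shuffled)      # new order on every presentation
    return shuffled

# Hypothetical example: each call may show the options in a different order.
candidates = ["Cell Phones", "Phone Cases", "Chargers", "Headsets"]
print(present_options(candidates))
```

Shuffling a copy on every presentation means no single classification consistently occupies the first position, which is what limits first-response bias.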
In addition, Appen created a geographic filter to ensure that the project was completed by native English speakers, who make up eBay’s core customer base and are therefore more familiar with eBay’s product taxonomy.
eBay and Appen also designed an exhaustive search process for product GTINs. Appen contributors searched for GTINs for a specific product (based on product title, product type, and image) through a variety of reliable channels. Appen then compared the contributors’ responses, ensuring that each piece of data was verified and accurate.
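One way to picture the comparison step is sketched below in Python. It is an assumption-laden illustration, not Appen's actual process: it first discards responses that fail the standard GS1 check-digit test, then accepts a GTIN only if enough contributors reported the same code. The function names and the min_votes threshold are hypothetical.

```python
from collections import Counter

def gtin_check_digit_ok(code):
    """Validate a GTIN using the standard GS1 mod-10 check digit.

    Only the 12- to 14-digit lengths mentioned in the text are accepted.
    """
    if not (code.isascii() and code.isdigit()) or len(code) not in (12, 13, 14):
        return False
    digits = [int(c) for c in code]
    body, check = digits[:-1], digits[-1]
    # Working from the right of the body, weights alternate 3, 1, 3, 1, ...
    total = sum(d * (3 if i % 2 == 0 else 1)
                for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check

def agreed_gtin(responses, min_votes=2):
    """Return the GTIN most contributors agree on, or None.

    `min_votes` is an assumed threshold, not a figure from the case study.
    """
    valid = [r.strip() for r in responses if gtin_check_digit_ok(r.strip())]
    if not valid:
        return None
    code, votes = Counter(valid).most_common(1)[0]
    return code if votes >= min_votes else None

# Two contributors agree; the third response has a bad check digit.
print(agreed_gtin(["036000291452", "036000291452", "036000291453"]))
```

The check-digit test catches typing errors cheaply, while the vote count guards against a well-formed GTIN that belongs to the wrong product.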
The Superior Accuracy of an Iterative Process
Appen’s large online contributor pool completed product categorization five times faster than a traditional outsourcing team, and with a vastly higher accuracy rate.
Global Trade Item Number (GTIN) collection was performed 10 times faster with a comparable accuracy rate. Appen’s platform collected multiple responses for each product to minimize the impact of any individual error or response. Appen also increased the number of judgments collected on products that are difficult to categorize or for which it is difficult to find GTINs. This means that when there was disagreement, Appen collected judgments until a reliable answer was found.
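The idea of collecting judgments until a reliable answer emerges can be sketched as a simple loop. The Python below is a minimal illustration under stated assumptions: the case study does not give the minimum or maximum number of judgments or the agreement level used, so min_judgments, max_judgments, and min_agreement are hypothetical parameters, as is the get_judgment callback that stands in for asking one more contributor.

```python
from collections import Counter

def collect_until_reliable(get_judgment, min_judgments=3,
                           max_judgments=9, min_agreement=0.7):
    """Request judgments one at a time until the leading answer is reliable.

    Returns (answer, judgments). `answer` is None if no answer reaches
    `min_agreement` before `max_judgments` responses have been collected.
    All thresholds are illustrative assumptions.
    """
    judgments = []
    while len(judgments) < max_judgments:
        judgments.append(get_judgment())
        if len(judgments) >= min_judgments:
            answer, votes = Counter(judgments).most_common(1)[0]
            if votes / len(judgments) >= min_agreement:
                return answer, judgments
    return None, judgments

# Easy item: contributors agree immediately, so only three judgments are used.
easy = iter(["Shoes", "Shoes", "Shoes"])
print(collect_until_reliable(lambda: next(easy)))

# Harder item: early disagreement triggers a fourth judgment.
hard = iter(["Shoes", "Boots", "Shoes", "Shoes"])
print(collect_until_reliable(lambda: next(hard)))
```

Easy products stop at the minimum number of judgments, so the extra cost is spent only on the items where contributors disagree.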
eBay needed a solution that was optimized for both quality and cost and found that solution with Appen. Appen provided the quality and accuracy of an in-house team but at a lower cost and in a highly scalable manner.
“As more and more projects with Appen graduate to production, different groups (engineering, product management, quality engineering) across the organization are seeing the benefits of crowdsourcing and are keen on embracing this new paradigm.”