Note: This work was completed at Amazon. Internal systems and organizational details have been generalized where appropriate to respect confidentiality.
One of the leading causes of mistranslations is poor-quality source content. Garbage in, garbage out. At Amazon, that meant errors amplified across dozens of languages, millions of products, and billions of words. The question wasn’t whether to fix the problem; it was whether fixing it at the source was even possible inside an ecosystem fragmented across dozens of content management platforms.
The Problem
Product copy, marketing campaigns, help center articles, and UI strings were authored across dozens of systems, each with its own quirks and workflows. Maintaining quality across that fragmentation was nearly impossible. High-volume teams like Prime Video and Retail found review costly and unenforceable: feedback required offline handoffs through spreadsheets, Slack threads, and support tickets that were slow, error-prone, and hard to track. Meanwhile, 75% of smaller content teams had no review process at all.
The consequence was predictable but expensive: wasted cycles, duplicated effort, and inconsistent quality at scale. The challenge was how to build a system that could enforce editorial review at the source, unify practices across diverse CMSs, and do it without slowing down delivery.
What We Learned
Our initial research revealed that reviewers across Amazon teams shared common working patterns and pain points that would shape how the system needed to function.
Most editorial reviewers worked in bursts, scanning large blocks of content quickly and making small corrections on the fly. They valued autonomy and speed over structure, and preferred tools that let them fix things directly rather than document issues for someone else to resolve. As one editor put it:
“By the time I explain it, I could’ve just fixed it.”
We also saw a sharp difference between editorial and legal or compliance reviews. Editorial reviews were about clarity, tone, and readability. Legal reviews required granular records, formal approvals, and a slower structured process. Treating these two types as interchangeable created frustration on both sides. The solution was to decouple them entirely—design lightweight, flexible workflows tailored to how editorial reviewers actually work, and route compliance into its own process.
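To make the decoupling concrete, here is the shape of it in a quick sketch (the type names are hypothetical, not the internal schema): an editorial review produces an edit, while a compliance review produces a record. One data shape can’t honestly serve both.

```typescript
// Hypothetical types sketching why the two review modes can't share a
// workflow: editorial feedback *is* the fix; compliance feedback is a record.

interface EditorialFix {
  stringId: string;
  before: string;
  after: string; // the correction itself is the entire artifact
}

interface ComplianceDecision {
  submissionId: string;
  approver: string;
  decision: "approved" | "rejected";
  rationale: string; // formal, auditable record
  decidedAt: Date;
}
```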
The System We Built
Grounded in the principle of trusting editors and designing for speed, the solution embedded editorial review at the very start of the translation process—before content reached any CMS, before it touched any translation pipeline. This made review unavoidable. Every submission flowed through a single entry point, giving reviewers the autonomy to fix issues directly while ensuring nothing slipped through unreviewed.
The workflow surfaced new submissions automatically, provided a centralized dashboard to track them, and opened each in a workspace built for review and editing. Large sets of UI strings were organized into a table with the source text and context notes. An edit mode activated all text for editing, allowing reviewers to click anywhere and make immediate changes. Once the content was ready, a single submit action pushed it into translation workflows and final publishing.
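To give that workflow a concrete shape, here is a minimal sketch of its core model. Every name is hypothetical and stands in for more involved internal types, but the gating logic mirrors the design: edits apply directly in place, and a single submit action is the only door into translation.

```typescript
// Minimal sketch of the review workflow's core model (names hypothetical).

type ReviewStatus = "awaiting_review" | "in_review" | "submitted";

interface UIString {
  id: string;
  sourceText: string;   // editable in the review workspace
  contextNote?: string; // shown alongside the string in the table
}

interface Submission {
  id: string;
  sourceSystem: string; // which CMS authored the content
  strings: UIString[];
  status: ReviewStatus;
}

// Reviewers fix text in place; the edit is the review.
function applyEdit(sub: Submission, stringId: string, newText: string): Submission {
  return {
    ...sub,
    strings: sub.strings.map((s) =>
      s.id === stringId ? { ...s, sourceText: newText } : s
    ),
  };
}

// One submit action gates entry into translation: nothing moves
// downstream until it has passed through review.
function submitForTranslation(sub: Submission): Submission {
  if (sub.status !== "in_review") {
    throw new Error(`Cannot submit from status "${sub.status}"`);
  }
  return { ...sub, status: "submitted" };
}
```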
The design required letting go of a strong early assumption. We went in thinking stricter approval flows were the answer; what users showed us was that they needed trust, flexibility, and clarity. By designing for how they actually worked, we improved the experience and generated cleaner data that could fuel future automation.
The Right Kind of Editing Power
For the MVP, we faced a tradeoff in how much editing power to deliver. Simple in-app editing worked for small reviews but became click-heavy at scale. Exporting to Excel gave power users bulk-editing capability but added so much overhead it slowed daily tasks. User feedback made it clear that neither addressed the full range of needs—the deeper issue was the absence of true inline editing.
Without inline editing, every correction broke the reviewer’s flow. They needed to scan large volumes of text, spot issues in context, and click directly into the table to make quick edits, with search and filters to keep them moving fast.
We achieved this by integrating a third-party data grid into our internal framework. I built a functional prototype using real review data. It made the tradeoffs visible in a way that static specs couldn’t, and it aligned design, PM, and engineering faster than any discussion would have. While the solution introduced some technical complexity, it delivered the in-context editing experience that reviewers actually needed, balancing speed of delivery with long-term usability.
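For illustration, here is a toy, self-contained version of that inline-editing interaction. The production build wrapped a third-party data grid inside our internal framework; this stand-in uses a plain React table and hypothetical names, but the click-anywhere-to-edit, filter-to-keep-moving shape is the same.

```tsx
// Toy sketch of the inline-editing interaction (names hypothetical; the
// production version used a third-party data grid, not a hand-rolled table).
import { useState } from "react";

interface Row {
  id: string;
  sourceText: string;
  contextNote: string;
}

export function ReviewTable({ initialRows }: { initialRows: Row[] }) {
  const [rows, setRows] = useState(initialRows);
  const [filter, setFilter] = useState("");

  // Edits land directly in state on blur: click, fix, move on.
  const commitEdit = (id: string, text: string) =>
    setRows((rs) => rs.map((r) => (r.id === id ? { ...r, sourceText: text } : r)));

  const visible = rows.filter((r) =>
    r.sourceText.toLowerCase().includes(filter.toLowerCase())
  );

  return (
    <div>
      {/* Search keeps reviewers moving through large string sets */}
      <input
        placeholder="Filter strings…"
        value={filter}
        onChange={(e) => setFilter(e.target.value)}
      />
      <table>
        <tbody>
          {visible.map((r) => (
            <tr key={r.id}>
              {/* Clicking a cell makes it editable in place; no modal, no export */}
              <td
                contentEditable
                suppressContentEditableWarning
                onBlur={(e) => commitEdit(r.id, e.currentTarget.textContent ?? "")}
              >
                {r.sourceText}
              </td>
              <td>{r.contextNote}</td>
            </tr>
          ))}
        </tbody>
      </table>
    </div>
  );
}
```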
Outcomes
With Prime Video as one of the first adopters, reviewers cut small daily review cycles by 40%, enabling them to increase volume without adding overhead. Seven organizations onboarded in the first wave.
Authors gained visibility across the content lifecycle. Instead of wondering what happened after hitting “submit,” they could track progress through review and translation. That clarity reduced confusion and eliminated duplicate work.
For the organization, this meant less churn in translation, more consistent messaging, and a scalable path for onboarding partner teams. And because the system captured structured feedback inside natural workflows, it laid the groundwork for AI-driven improvement loops—turning everyday human fixes into training signals for the future.
Reflection
What stuck with me wasn’t just the speed gains. It was how much better the system worked when we respected people’s instincts.
We went in assuming stricter controls were the answer; users showed us trust was. The more the system matched how they actually worked (fast, autonomous, in-context), the better the experience got and the cleaner the data it produced.
The annotation system being stripped from this page is a good parallel. Sometimes the right move is to reduce friction first, get the work done, and add the structural intelligence later. Editorial review at Amazon made the same argument: do the thing that scales, design it so it doesn’t hurt, and the data takes care of itself.
Trust users, and they’ll give you the data your system needs.