{"id":84059,"date":"2024-05-22T17:59:14","date_gmt":"2024-05-22T15:59:14","guid":{"rendered":"https:\/\/phrase.com\/?p=84059"},"modified":"2024-10-28T16:56:48","modified_gmt":"2024-10-28T15:56:48","slug":"understanding-phrase-quality-performance-score-phrase-qps-and-auto-lqa-how-they-unlock-hyperautomation-on-the-phrase-localization-platform","status":"publish","type":"post","link":"https:\/\/phrase.com\/blog\/posts\/understanding-phrase-quality-performance-score-phrase-qps-and-auto-lqa-how-they-unlock-hyperautomation-on-the-phrase-localization-platform\/","title":{"rendered":"Understanding Phrase Quality Performance Score (Phrase QPS) and Auto LQA:  How they Unlock Hyperautomation on the Phrase Localization Platform"},"content":{"rendered":"\n<div id=\"acf\/text-block_fce23e1b2391eb65901766efc14fd6cf\" class=\"pxblock pxblock--text alignfull spacing--default bg--white\">\n\n\t\n\t<div class=\"container\">\n\t\t<div class=\"wysiwyg animate-in\">\n\t\t\t<p><span style=\"font-weight: 400;\">In this article, I focus on highlighting the rationale behind the development of our two Phrase-proprietary automated translation quality technologies: <\/span><span style=\"font-weight: 400;\">Phrase QPS <\/span><span style=\"font-weight: 400;\">and <\/span><span style=\"font-weight: 400;\">Auto LQA<\/span><span style=\"font-weight: 400;\">. 
<\/span><\/p>\n<p><span style=\"font-weight: 400;\">I explore how these two technologies, both built on the MQM (Multidimensional Quality Metrics) framework, are designed to work in tandem to unlock <\/span><a href=\"https:\/\/phrase.com\/blog\/posts\/how-to-hyperautomate-business-growth-five-key-strategic-insights-from-our-expert-panel-webinar\/\"><span style=\"font-weight: 400;\">hyperautomation<\/span><\/a><span style=\"font-weight: 400;\"> capabilities for our enterprise customers.<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">Automated Translation Quality Visibility Unlocks Hyperautomation<\/span><\/h3>\n<p>&nbsp;<\/p>\n<p><span style=\"font-weight: 400;\">The advances in neural <a href=\"https:\/\/phrase.com\/solutions\/machine-translation\/\">machine translation (MT)<\/a> and large language models (LLMs) in recent years have been nothing short of breathtaking.<br \/>\n<\/span><\/p>\n<div class=\"iframe\"><iframe loading=\"lazy\" title=\"Accelerating Hyperautomation\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/NY5xM8q7iJA?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><\/div>\n<p><span style=\"font-weight: 400;\">For many of the major language pairs and use cases, state-of-the-art MT is now both accurate and fluent. However, despite such impressive strides, the use of MT alone still carries risks that are not tolerable for most enterprise applications. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Human translators are still required to perform post-editing and review, but this approach is time-consuming and costly. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Alternatively, an organization may opt to employ MT without any human input at all. 
In doing so, they gamble on potentially embarrassing errors and misinterpretations that could be very damaging.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Enterprises seeking to streamline translation and localization workflows face a conundrum: how to adopt and accelerate automation without compromising quality or risking seriously flawed translations. <\/span><span style=\"font-weight: 400;\">The <\/span><a href=\"https:\/\/phrase.com\/platform\/\"><span style=\"font-weight: 400;\">Phrase Localization Platform<\/span><\/a><span style=\"font-weight: 400;\"> already offers workflows that support both human post-editing of MT and full automation with MT. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">However, we understand the need for more sophisticated workflows that allow our customers to maximize the value of MT while optimizing the trade-off between automation and quality risk. <\/span><span style=\"font-weight: 400;\">The solution lies in harnessing cutting-edge technology to provide visibility into translation quality at scale.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-77003\" src=\"https:\/\/phrase.com\/wp-content\/uploads\/2024\/03\/Phrase_QPS_in_Suite-300x211.jpg\" alt=\"Phrase QPS empowers linguists to efficiently identify and address issues while working in the TMS and Strings editors, leading to enhanced efficacy in content review.\" width=\"733\" height=\"515\" srcset=\"https:\/\/phrase.com\/wp-content\/uploads\/2024\/03\/Phrase_QPS_in_Suite-300x211.jpg 300w, https:\/\/phrase.com\/wp-content\/uploads\/2024\/03\/Phrase_QPS_in_Suite-1024x719.jpg 1024w, https:\/\/phrase.com\/wp-content\/uploads\/2024\/03\/Phrase_QPS_in_Suite-768x539.jpg 768w, https:\/\/phrase.com\/wp-content\/uploads\/2024\/03\/Phrase_QPS_in_Suite-1536x1078.jpg 1536w, https:\/\/phrase.com\/wp-content\/uploads\/2024\/03\/Phrase_QPS_in_Suite.jpg 1710w\" 
sizes=\"(max-width: 733px) 100vw, 733px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">The goal is to enable organizations to automatically detect and address low-quality translations efficiently within the translation workflow process, minimizing the need for extensive human intervention. <a href=\"https:\/\/phrase.com\/phrase-quality-technologies\/quality-performance-score\/\">Phrase QPS was designed specifically to address this need.<\/a><\/span><\/p>\n<p><span style=\"font-weight: 400;\">Phrase QPS assigns quality scores at the segment level, which are then aggregated to the document and job level. Workflow \u201cGating\u201d decision-points are then implemented to support two major complementary decisions:<\/span><\/p>\n<ol>\n<li><strong>At the job-level: is a translated job of sufficient quality to be completed without further human editing or review?\u00a0<\/strong><\/li>\n<li><strong>At the segment-level, for jobs that are sent to human editing: which segments are of sufficiently high quality and can be \u201cblocked\u201d from human editing and correction?<\/strong><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400;\">Crucially, Phrase QPS is designed to operate seamlessly across diverse translation scenarios. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">This not only includes machine-translated content but also human-edited MT and traditional human translation. This versatility ensures that enterprises can maintain rigorous quality standards across all of their localization workflows, regardless of the translation method employed.<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">Phrase QPS, Auto LQA and MQM\u2026 and How they work in Tandem<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">The most reliable process for assessing the quality of translations has long been a human-expert-intensive process known as Linguistic Quality Assessment (LQA). 
Human LQA has evolved, with increasing adherence in recent years to the <\/span><a href=\"https:\/\/themqm.org\/\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">Multidimensional Quality Metrics (MQM) framework<\/span><\/a><span style=\"font-weight: 400;\">. This approach has been further consolidated by the recent release of <\/span><a href=\"https:\/\/www.iso.org\/standard\/80701.html\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">ISO-5060<\/span><\/a><span style=\"font-weight: 400;\">, an ISO standard based on MQM.<br \/>\n<\/span><\/p>\n<p><span style=\"font-weight: 400;\">MQM is a comprehensive framework for assessing translation quality over several dimensions. It takes into account different quality requirements such as fluency, adequacy and adherence to terminology and categorizes them into error types. This offers users a structured way of evaluating the accuracy and quality of translated content.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-77046\" src=\"https:\/\/phrase.com\/wp-content\/uploads\/2024\/03\/Phrase_Orchestrator_QPS-4-300x211.jpg\" alt=\"Harnessing the power of generative AI, Auto LQA (Language Quality Assessment) offers an in-depth and instantaneous assessment of the quality of already localized content.\" width=\"728\" height=\"512\" srcset=\"https:\/\/phrase.com\/wp-content\/uploads\/2024\/03\/Phrase_Orchestrator_QPS-4-300x211.jpg 300w, https:\/\/phrase.com\/wp-content\/uploads\/2024\/03\/Phrase_Orchestrator_QPS-4-1024x719.jpg 1024w, https:\/\/phrase.com\/wp-content\/uploads\/2024\/03\/Phrase_Orchestrator_QPS-4-768x539.jpg 768w, https:\/\/phrase.com\/wp-content\/uploads\/2024\/03\/Phrase_Orchestrator_QPS-4-1536x1078.jpg 1536w, https:\/\/phrase.com\/wp-content\/uploads\/2024\/03\/Phrase_Orchestrator_QPS-4.jpg 1710w\" sizes=\"(max-width: 728px) 100vw, 728px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Beyond its value in providing consistent and 
structured evaluation feedback, MQM enables improvements to the various stages of translation, leading to consistency and dependability in translation quality assessment. Its multidimensional nature allows for a nuanced understanding of translation performance.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This makes it a useful tool for translators, researchers and stakeholders involved in the localization and language industry. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Human LQA under the MQM framework is typically carried out by trained linguists or evaluators who systematically go through translated content against MQM's predefined quality criteria. Evaluators compare the target text with its source in order to identify, mark up, and categorize translation errors and discrepancies, along with their severity.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">An overall score for the entire translation job is then calculated using a severity-weighted scoring mechanism. This aggregates the various issues identified during evaluation and annotation of the translated segments into an overall MQM score. Performing LQA within the MQM framework enables a rigorous examination of translation quality that can inform stakeholders' decisions on improving their localization process.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Due to its cost and speed limitations, Human LQA has long been restricted to offline quality assessments on small samples of content. Yet many enterprises allocate major portions of their localization budgets to LQA. 
<\/span><\/p>\n<p><span style=\"font-weight: 400;\">With the recent advent of LLMs, full automation of Human LQA, at impressive levels of accuracy, has now become possible.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">For this purpose, <a href=\"https:\/\/phrase.com\/generative-ai-localization\/\">we developed Auto LQA<\/a>.<br \/>\n<\/span><\/p>\n<div style=\"width: 1920px;\" class=\"wp-video\"><!--[if lt IE 9]><script>document.createElement('video');<\/script><![endif]-->\n<video class=\"wp-video-shortcode\" id=\"video-84059-1\" width=\"1920\" height=\"1080\" preload=\"metadata\" controls=\"controls\"><source type=\"video\/mp4\" src=\"https:\/\/phrase.com\/wp-content\/uploads\/2024\/05\/AutoLQA-animated-cut.mp4?_=1\" \/><a href=\"https:\/\/phrase.com\/wp-content\/uploads\/2024\/05\/AutoLQA-animated-cut.mp4\">https:\/\/phrase.com\/wp-content\/uploads\/2024\/05\/AutoLQA-animated-cut.mp4<\/a><\/video><\/div>\n<p>&nbsp;<\/p>\n<p>This LLM-based capability is fully automated and is orders of magnitude faster and less costly than Human LQA.<\/p>\n<p><span style=\"font-weight: 400;\">It can be used on its own for use cases where automated analysis is deemed sufficient, or as an automated \u201cpre-annotator\u201d for Human LQA (similar to the way MT is used in conjunction with human MT post-editing).<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Once Auto LQA has annotated all segments in a job, an MQM score can be calculated algorithmically, in exactly the same way as is typically done with human LQA.<\/span><\/p>\n<h3><strong>So if we now have Auto LQA, why do we still need Phrase QPS? <\/strong><\/h3>\n<p><span style=\"font-weight: 400;\">In an ideal world, one AI component would indeed be capable of doing both, and we expect that to emerge in the not-too-distant future. 
<\/span><\/p>\n<p><span style=\"font-weight: 400;\">While Auto LQA represents a significant leap forward, its utility is currently still hindered by the relatively slow speed and high cost of LLMs, which limit its scalability. QPS addresses this challenge by using a smaller, faster and less costly AI model.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This model predicts the quality score that would be assigned by an MQM annotator (or by Auto LQA), without generating the detailed annotation. <\/span><span style=\"font-weight: 400;\">While slightly less accurate than its human and Auto LQA counterparts, QPS meets the crucial requirements of speed and cost-effectiveness.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Phrase QPS is trained on Human LQA MQM annotations supplemented by synthetic data generated through Auto LQA, and then refined through human-corrected iterations. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">This ensures that Phrase QPS balances precision and scalability. Furthermore, the fact that QPS and Auto LQA are separate and independent AI models presents an opportunity for mutual validation. <\/span><span style=\"font-weight: 400;\">This enhances the overall reliability of both models (more about that in a future article!).<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">The Bottom Line<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">To sum up, the field of estimating translation quality is undergoing significant changes due to technological advances and methodological improvements. Auto LQA and Phrase QPS represent two innovative pathways toward achieving automation in evaluation, each with its own set of trade-offs.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Phrase QPS scales translation quality visibility, unlocking new levels of translation hyperautomation at a measurable level of quality risk. 
<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Auto LQA significantly reduces LQA costs while simultaneously generating valuable data required for training an accurate Phrase QPS model.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">New assessment processes that combine human and automated judgments can help achieve reliable and scalable TQA (Translation Quality Assessment) that leads to streamlined localization processes and effective multilingual communication.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This innovation signifies a critical turning point in the market, revolutionizing automation and scalability within localization. It lays the groundwork for hyperautomation, where content is seamlessly processed through various AI and machine learning techniques and workflows. This will transform both the efficiency and precision of multilingual content creation, a vital component for managing the growing volume of content in today&#8217;s globally connected world.<\/span><\/p>\n\t\t<\/div>\n\t<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Discover how Phrase QPS and Auto LQA are revolutionizing translation quality and unlocking hyperautomation. In this insightful article, you&#8217;ll learn about cutting-edge technologies built on the MQM framework, designed to enhance translation quality visibility and streamline localization workflows. 
Understand how these innovations balance automation with quality risk, reduce costs, and improve efficiency, enabling enterprises to achieve reliable and scalable Translation Quality Assessment (TQA).<\/p>\n","protected":false},"author":73,"featured_media":84085,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_stopmodifiedupdate":false,"_modified_date":"","_searchwp_excluded":"","footnotes":""},"categories":[41,42],"class_list":["post-84059","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-machine-translation","category-phrase-and-beyond"],"acf":[],"_links":{"self":[{"href":"https:\/\/phrase.com\/wp-json\/wp\/v2\/posts\/84059"}],"collection":[{"href":"https:\/\/phrase.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/phrase.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/phrase.com\/wp-json\/wp\/v2\/users\/73"}],"replies":[{"embeddable":true,"href":"https:\/\/phrase.com\/wp-json\/wp\/v2\/comments?post=84059"}],"version-history":[{"count":15,"href":"https:\/\/phrase.com\/wp-json\/wp\/v2\/posts\/84059\/revisions"}],"predecessor-version":[{"id":85237,"href":"https:\/\/phrase.com\/wp-json\/wp\/v2\/posts\/84059\/revisions\/85237"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/phrase.com\/wp-json\/wp\/v2\/media\/84085"}],"wp:attachment":[{"href":"https:\/\/phrase.com\/wp-json\/wp\/v2\/media?parent=84059"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/phrase.com\/wp-json\/wp\/v2\/categories?post=84059"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}