
Ismenia Miranda
Ismenia Miranda in Allariz, Spain.

There’s a lot of noise around the implementation of AI into translation workflows, but what’s really happening right now?

The truth is that AI technologies, though maturing quickly, are not yet self-sustaining enough to replace linguists (though I suppose I’m no fortune teller). Nothing made this clearer to me than the work I completed as a Spanish terminology intern with the American Red Cross.

What Is Terminology Management?

Terminology work combines various domains and subjects, particularly linguistics, translation studies, and philosophy. My internship work was a combination of terminography and terminology management. Terminography addresses the philosophical aspects of language, specifically how concepts and terms relate to one another and, consequently, how they should be categorized based on these relationships. For example, the American Red Cross, National Red Cross Societies, and International Red Cross are all admissible terms referring broadly to the same entity. I had to assess whether the relation between these terms was partitive (one is a part of the other) or generic (one is a type of the other) and, therefore, whether they could or should constitute a single entry in a termbase.

Terminology management, on the other hand, is the practical maintenance and creation of terminological databases and systems once terms have been established and goals have been set.

Sounds pretty straightforward, right? Well, it’s not. One of the primary challenges is that most commercially available translation management systems (TMS) and translation environment tools (TEnTs) only provide basic fields for structuring terminological resources, such as term, language, locale, part of speech, definition, use case, context, domain, and/or term status. This may suffice for a freelancer working on a short-term project but not necessarily for larger-scale operations.
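To make the "basic fields" concrete, here is a minimal sketch of what one termbase entry might look like as a data structure, using only the fields listed above. The class name, field names, and the sample Spanish term are my own illustrative choices, not the structure of any particular TMS.

```python
from dataclasses import dataclass


@dataclass
class TermEntry:
    """One entry in a minimal termbase, using the basic fields most
    TMS/TEnT tools expose. All names here are illustrative."""
    term: str
    language: str             # e.g., "es"
    locale: str               # e.g., "es-US"
    part_of_speech: str
    definition: str
    domain: str
    status: str = "proposed"  # e.g., "proposed", "approved", "deprecated"
    use_case: str = ""        # e.g., "public-facing", "internal"
    context: str = ""         # example sentence showing the term in use


# Hypothetical sample entry (not from an actual Red Cross termbase):
entry = TermEntry(
    term="socorrista",
    language="es",
    locale="es-US",
    part_of_speech="noun",
    definition="Person trained to provide first aid or rescue services.",
    domain="disaster response",
    status="approved",
    use_case="public-facing",
)
```

A flat record like this is exactly what larger organizations outgrow: it has no way to express concept relations (partitive vs. generic) or to link multiple admissible terms to one concept, which is why proprietary termbases and elaborate spreadsheets appear.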

That’s why international organizations like the World Intellectual Property Organization (WIPO) build their own proprietary termbases. WIPO keeps a comprehensive record of terms, concepts, and concept relations in the 10 Patent Cooperation Treaty (PCT) languages to support the activities of its Translation Division. Full records for the terms can be accessed via the concept map, where users can also find detailed information like usage type, term reliability, and context. WIPO’s termbase is available to the public via WIPO Pearl—truly a gem in terms of large-scale terminology management. If you have time, I recommend playing around with their concept maps.

Screenshot of WIPO Pearl concept map search for “artificial intelligence.”
Screenshot of WIPO Pearl full record for the term “generative AI” accessed from the concept map for “artificial intelligence.”

What AI Does Well—and What It Doesn’t

For the work we needed to complete with the American Red Cross, the capabilities of our TMS were not enough, so the organization created elaborate spreadsheets for different use cases and departments.

Where the TMS did excel was in its integration of AI for term extraction. Conventional automatic term extraction is based on frequency and uniqueness relative to a corpus. We used AI to compile a list of potential terms for a termbase more efficiently, and with higher quality, than purely statistical extraction allows. This reduced the time spent in the termbase cleanup phase and got translation resources to translators more quickly, potentially decreasing the incidence of term-related errors.
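The "frequency and uniqueness relative to a corpus" approach can be sketched in a few lines. Below is a crude single-word version of that idea, assuming nothing about any real extractor: words are ranked by how much more frequent they are in a domain text than in a general reference text. Real extractors also handle multi-word units and use more robust statistics.

```python
import math
import re
from collections import Counter


def extract_terms(domain_text, reference_text, top_n=5):
    """Rank candidate single-word terms by how much more frequent they
    are in the domain corpus than in a reference corpus (a crude
    'termhood' score based on relative frequency)."""
    def tokenize(text):
        return re.findall(r"[a-záéíóúñ]+", text.lower())

    dom = Counter(tokenize(domain_text))
    ref = Counter(tokenize(reference_text))
    dom_total, ref_total = sum(dom.values()), sum(ref.values())
    scores = {}
    for word, count in dom.items():
        p_dom = count / dom_total
        # Add-one smoothing so unseen reference words don't divide by zero.
        p_ref = (ref.get(word, 0) + 1) / (ref_total + len(dom))
        scores[word] = p_dom * math.log(p_dom / p_ref)
    ranked = sorted(scores.items(), key=lambda item: -item[1])
    return [word for word, _ in ranked[:top_n]]


# Toy corpora for illustration only:
domain = "shelter volunteers distribute blankets at the shelter and the shelter site"
reference = "the volunteers and the people at the site and the"
extract_terms(domain, reference, top_n=3)  # "shelter" should rank first
```

A purely statistical ranking like this surfaces frequent domain words but knows nothing about concepts, audiences, or term status, which is where both the AI extractor and the human cleanup described below come in.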

The predictive capabilities of AI tools are what separate them from traditional statistical models. Though exactly how AI models process language is largely a black box, even for professionals, we understand that these models use an extensive quantity of word vectors, layers, dimensions, and other data analytics (integrating strategies from natural language processing, machine learning, and deep learning) to produce information that is more comprehensive and sophisticated than classic automation.

If there is one thing I’ve learned from working in AI, it’s that it’s similar to working with big data. Translation memories (TMs) are data, and terminology extraction is basically data analysis. To use AI term extractors well, you must understand the type of language you are looking for and what purpose it serves. It is not enough to simply have a list of sample terms; that would give the model too much freedom in how it assesses data patterns, and we want it to be as precise as possible. This is where the importance of linguistic knowledge becomes undeniable, and no one has more linguistic knowledge than a great translator. This is central to the argument for using augmented intelligence in translation workflows.

It’s important to note that even with the AI term extractor, I still had to do termbase cleanup. In large operations such as the American Red Cross, use case is an integral aspect of how terms are classified and categorized. For instance, whether a concept is being described among stakeholders or to the public affects the appropriate term for that concept, as with “client” vs. “member” for recipients of a service. Considering the nature of public-facing documents vs. internal communications, the need for plain language during natural disasters, and the constantly changing terms used to describe marginalized communities, it is evident that a person with a deep understanding of the pragmatic applications of language is still necessary. Even so, it was less cleanup than I’ve done using conventional frequency-based automatic term extraction.
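The use-case-driven term choice described above can be pictured as a simple lookup keyed by concept and audience. The mapping below (which synonym goes with which audience) is my own hypothetical illustration of the “client” vs. “member” distinction, not the Red Cross’s actual classification.

```python
# Hypothetical mapping: which synonym a concept takes for a given audience.
TERM_BY_AUDIENCE = {
    ("service recipient", "internal"): "client",
    ("service recipient", "public-facing"): "member",
}


def pick_term(concept: str, audience: str) -> str:
    """Return the audience-appropriate term for a concept, falling back
    to the concept label itself if no preference has been recorded."""
    return TERM_BY_AUDIENCE.get((concept, audience), concept)


pick_term("service recipient", "public-facing")  # "member"
pick_term("service recipient", "internal")       # "client"
```

Even this toy version makes the human's role visible: someone with pragmatic knowledge of the audiences has to decide what goes into the mapping in the first place, and keep it current as terminology evolves.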

Terminology Work Is a Promising Niche in the Language Industry

I would encourage translators or translation students concerned about the integration of AI into translation workflows to look into terminology work. Internships at nonprofit organizations can be a fantastic way to build experience. If these internships are unpaid or low-paid, Middlebury Institute students can apply for experiential learning funding.

I think you will also find that contributing your time and efforts to organizations that improve the lives of people in your community will be fruitful beyond pure professional development. At the Red Cross, I was mentored by knowledgeable linguists, terminologists, and program and project managers. The experience made me more prepared to dive into the shifting language services landscape.

My internship experience with the American Red Cross informs the work I currently do as a generative AI associate with , where I work primarily in data creation, preparation, and evaluation for multilingual projects. Data is at the heart of AI projects, and terminology is certainly one of the most data-centric aspects of the translation process. Working in terminology builds competency in organizing and creating high-quality structured data, which can ultimately be used to train AI models (though this depends on a variety of other factors as well). If you are a person who enjoys data and information processing, this is certainly an intersection of language services and technology worth investigating.