Testing AI: risk factors

5 March 2024 | 6 min

In today’s connected world, AI translators such as DeepL and Google Translate have become indispensable. I have already put these translators through their paces in the previous chapter. However, as with all technologies, there are certain risks involved in using AI translators. In this chapter, I conclude my blog series and take a closer look at these risk factors and how they can be minimized.

When language changes

Language is constantly changing. Not only the style but also the rules of the German language have changed considerably in recent years. New words and phrases have been introduced (the German Youth Word of the Year 2023: “goofy”), while others have become outdated and less common. The influence of technology, social media and cultural exchange has led to a rapid evolution of language. In addition, grammar and spelling have evolved to reflect changes in society and communication.

For these trends and developments to be reflected in translations, AI translators must be kept up to date, above all their training data. The system uses the training data to learn how to handle unknown texts and interpret them in a way that takes into account both the context and the nuances of the original. If this data is not current, the system may produce outdated translations.
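
As a minimal sketch of what such a freshness check could look like, the following snippet flags training examples whose collection date exceeds an assumed staleness threshold. The example records, field names and the five-year threshold are purely illustrative assumptions, not part of any real translator’s pipeline.

```python
from datetime import date

# Hypothetical training examples; real corpora would carry metadata like this.
training_data = [
    {"source": "Das ist cringe.", "target": "That is cringe.", "collected": date(2023, 9, 1)},
    {"source": "Guten Tag!", "target": "Good day!", "collected": date(2015, 4, 12)},
]

# Assumption: examples older than five years count as "stale" and need review.
FRESHNESS_THRESHOLD_DAYS = 5 * 365

def stale_examples(data, today=None):
    """Return all examples older than the freshness threshold."""
    today = today or date.today()
    return [ex for ex in data if (today - ex["collected"]).days > FRESHNESS_THRESHOLD_DAYS]

for ex in stale_examples(training_data):
    print(f"Stale example ({ex['collected']}): {ex['source']!r}")
```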

The right way to deal with user feedback

So-called feedback loops enable systems to learn from mistakes and improve their performance over time. For example, if an AI translator translates a text incorrectly, a user can mark this translation as “bad” and suggest an alternative “correct” translation. The system can then use the suggested translation to adapt and improve its internal models. Through this process of continuous improvement, the AI translator autonomously refines its skills over time and becomes more accurate. However, it is important to note that the quality of the feedback – i.e. whether a translation is marked as good or bad – and the accuracy of the suggested “correct” translation have a significant impact on this process. Incorrect or misleading feedback, whether intentional or not, could lead the system to interpret incorrect translations as correct, repeat these errors in future translations or even embed offensive content.

To prevent this from happening, feedback should be carefully checked and moderated, for example by a combination of automated systems and human reviewers. In this way, the risk of incorrect or misleading feedback degrading the performance or accuracy of the AI translator can be minimized.
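
To make the idea concrete, here is a minimal sketch of such a two-stage check: an automated filter rejects obvious spam before the remaining suggestions reach human reviewers. The record layout, the blocklist pattern and the example feedback are all hypothetical.

```python
import re

# Hypothetical feedback records: the machine translation plus the user's correction.
feedback_queue = [
    {"source": "Ich bin müde.", "mt": "I am tired.", "rating": "bad", "suggestion": "I'm exhausted."},
    {"source": "Hallo!", "mt": "Hello!", "rating": "bad", "suggestion": "BUY CHEAP PILLS http://spam.example"},
]

# Toy automated filter; a real system would use far more robust abuse detection.
BLOCKLIST = re.compile(r"https?://|buy cheap", re.IGNORECASE)

def needs_human_review(item):
    """Automated first pass: obvious spam is rejected, the rest goes to a human."""
    return not BLOCKLIST.search(item["suggestion"])

accepted_for_review = [f for f in feedback_queue if needs_human_review(f)]
print(f"{len(accepted_for_review)} of {len(feedback_queue)} suggestions forwarded to human reviewers")
```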

Results must be checked

Overfitting can occur in any AI system. It occurs when a model is fitted so closely to the training data that it loses flexibility and performs poorly on new, unknown data. Instead of learning to recognize the underlying patterns, the system effectively memorizes the training examples. This leads to inaccurate predictions and poor overall model performance.

To ensure that this behavior is detected during a system test, a separate, independent test data set should be used that was not (and must not be) used for training. I already drew attention to this topic in the previous chapter.
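
As an illustration, the following sketch trains a deliberately simple toy classifier and compares its accuracy on the training data with its accuracy on a held-back test set; a large gap between the two is a classic warning sign of overfitting. The model, the synthetic data and the 0.1 gap threshold are stand-ins, not the setup used for the translator tests in this series.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; the point is the evaluation protocol, not the model.
X, y = make_classification(n_samples=500, n_features=20, random_state=42)

# Hold back an independent test set that is never used for training.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)

# A large gap between training and test accuracy suggests overfitting.
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
if train_acc - test_acc > 0.1:
    print("Warning: possible overfitting - the model memorizes the training data.")
```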

Attack on personal data

As previously mentioned, training data is required for an AI system to function at all. However, it should be ensured that this data does not contain any personal or private information, such as names, addresses, telephone numbers or other identifying characteristics. If such data is nevertheless used to train an AI system, there is a risk that the privacy of the data subjects will be violated, because an attacker could potentially gain access to this data and misuse it for malicious purposes.

To prevent targeted advertising, identity theft or even blackmail through the use of an AI system, all data associated with the AI should be anonymized – especially the training data. Data cleansing should also be part of the feedback loop; otherwise, there is a risk that the information entered by users flows into the system unfiltered.
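
A minimal sketch of such an anonymization step might look like the following, where simple regular expressions replace e-mail addresses and phone numbers with placeholder tokens before the text is stored or used for training. The patterns are deliberately crude assumptions; real systems would rely on dedicated PII detection or named-entity recognition, since names like “Max” in the example slip through a purely pattern-based filter.

```python
import re

# Toy regex-based scrubber; production systems would use dedicated PII/NER tooling.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d /-]{6,}\d"),
}

def anonymize(text):
    """Replace detected identifiers with placeholder tokens before storage or training."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(anonymize("Contact Max at max.mustermann@example.com or +49 170 1234567."))
# -> Contact Max at [EMAIL] or [PHONE].  (the name "Max" is not caught by regexes)
```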

Further risks

The aspects mentioned so far represent only a small selection of the risks. Other risks include:

  • Lack of accessibility: People with disabilities may find it difficult to use the systems – especially those with visual or hearing impairments. It is important that AI systems are designed with accessibility in mind so that all users can access and operate them.
  • Loss of context: Human communication contains subtle nuances. If these are not captured, this can lead to misunderstandings and miscommunication, especially with complex and sensitive topics.
  • Dependence on internet connection: Many AI systems require a stable internet connection in order to be used at all. This can be an obstacle for users in areas with poor or no internet connection.
  • Lack of cultural sensitivity: AI translators should take cultural differences and specific regional expressions into account to avoid unpleasant or offensive translations.
  • User dissatisfaction: If AI systems do not meet user expectations, this can lead to dissatisfaction. Contributing factors include inaccurate translations, slow response times, a lack of support for dialects, jargon or specialist terminology, and a cumbersome user interface.
  • Lack of interpretability of the results: In machine learning, how AI systems arrive at their results is not always clear. This can confuse users, especially when a translation is inaccurate or unexpected. It is therefore important to make the systems transparent and interpretable so that user trust and understanding are fostered.

Assessment of risks

[Figure: Risk matrix comparing the factors, sorted by order of occurrence in the article.]

At the end of my analysis, I compared and weighed all aspects in terms of their impact and probability of occurrence. This makes it possible to prioritize the risks in a targeted manner. It is important to note that this presentation is greatly simplified and reflects only my personal perception, based on the experience I gained during my journey through the topic of “testing AI”.
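
For readers who want to reproduce such a prioritization, here is a small sketch that ranks risks by the classic impact-times-probability score. The scores below are invented for illustration only and do not reproduce the article’s actual risk matrix.

```python
# Hypothetical scores (1 = low, 3 = high); purely illustrative values.
risks = {
    "outdated training data": {"impact": 2, "probability": 2},
    "misleading user feedback": {"impact": 3, "probability": 2},
    "overfitting": {"impact": 2, "probability": 1},
    "exposure of personal data": {"impact": 3, "probability": 1},
}

# Rank risks by impact x probability, highest priority first.
ranked = sorted(risks.items(), key=lambda kv: kv[1]["impact"] * kv[1]["probability"], reverse=True)
for name, r in ranked:
    print(f"{name}: priority {r['impact'] * r['probability']}")
```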

A journey comes to an end

In view of the risks described above, it is important that the development and use of AI systems are carried out responsibly and with the utmost care. This applies to AI translators as well as to other systems that use artificial intelligence. For other systems, however, many more risks need to be considered; further information can be found in the ISTQB certification for testing artificial intelligence. By taking these risk factors into account, developers can realize the full potential of AI translators while safeguarding user privacy and translation quality.

Concluding words

This chapter marks the end of my journey through the world of AI testing. Looking back, the first experiments with the “Teachable Machine” from the first chapter helped me a lot to better understand how machine learning works. The analysis of the AI-specific quality characteristics and the subsequent integration of the test cases in ALM Octane in the following chapters, together with the risk analysis, form a well-rounded conclusion. I remain curious to see what interesting AI topics the future holds in store.
