
Internship | Effect of translation on representation

TNO

Soesterberg, Utrecht, Netherlands | Full-time | June 16, 2026
Apply Now

Vacancy Description

About this position

In lower-resource languages such as Dutch, it can be challenging to find enough high-quality data to train large language models. Collecting data ethically, without scraping copyrighted material, makes this scarcity even more acute. As a result, data from other sources is used and transformed: texts are translated, adapted, shortened or lengthened. These transformations are often necessary, but they also raise important concerns about (cultural) representation and bias. They can introduce different norms and values and cause social groups, identities and cultural concepts to be represented inaccurately. This matters most in sensitive contexts such as recruitment, education and public services, where misrepresentation can have real-world consequences.

In this research internship, the central question is: How does the use of translated data affect representation in language models for Dutch?

<...
