Should accents be “erased” by artificial intelligence?

In June 2022, the company SANAS announced that it had raised $32 million to develop technologies based on artificial intelligence whose purpose is to remove accents. The platform was due to launch in September 2022, not without generating interest, curiosity and a stir across both the English-speaking and French-speaking worlds.

Such software plunges us into a dystopia where technology makes the differences, identity traits and cultures of individuals disappear. However, the idea is not new: the 2018 film Sorry to Bother You already addressed the accent of the African American population in a satire about call centers.

Sorry to Bother You movie trailer – Universal Pictures UK

So how do you actually remove an accent? Between utopia and dystopia, why might developing an artificial intelligence capable of “removing” accents be more of a problem than a solution? What more than an audio track do we remove when we neutralize an accent?

How artificial intelligence can silence an accent

An accent can be defined as a bundle of cues, often oral (vowels, consonants, intonation, etc.), that feed the more or less conscious formation of hypotheses about a speaker’s geographic, social, or linguistic origin. An accent may be described, among other things, as “regional” or “foreign”, labels that refer to different sets of representations.

Detecting an accent is possible because a number of sound characteristics appear homogeneous among speakers of the same language, geographical area or social group, as Philippe Boula de Mareüil points out.

These start-ups’ technologies are often a black box, and there is little concrete information about the tools used to “remove” an accent. The methods vary, however, and mainly aim to partially transform the structure of the sound wave so that specific acoustic cues move toward a perceptual norm.

In this way, one can alter the timbre of certain vowels, the realization of consonants, or even parameters such as rhythm, intonation or stress, according to the expected perceptual targets.

At the same time, as many voice parameters as possible are preserved so that the original speaker’s voice can be reproduced, in the manner of “voice cloning”, which can lead to “voice deepfake” scams. These technologies make it possible to separate what belongs to language from what belongs to the voice.


Automatic, real-time speech processing poses technological difficulties, the main problem being the quality of the audio signal to be processed. Still, there are various solutions based on deep learning and neural networks, as well as on large speech corpora, which allow better handling of the uncertainties in the signal.

When it comes to foreign languages, Sylvain Detey, Lionel Fontan and Thomas Pellegrini identify several challenges in developing these technologies, namely which standard to use when comparing a speaker’s production with what is expected, and what role speech corpora can play in setting these targets. No particularly promising answers have emerged so far.

The myth of the neutral accent

However, identifying an accent is not limited to acoustic cues alone. Donald L. Rubin was able to show that listeners can recreate the impression of a perceived accent simply by associating faces of supposedly different origins with voices.

Similarly, in the absence of these other cues, speakers are not so good at recognizing accents that they do not hear regularly or that they imagine stereotypically, for example by assuming that German is full of consonants.

Wanting to remove accents in order to counteract the social impact of accent discrimination comes down to asking what a “neutral” accent is. Yet every variation in pronunciation carries representations.

Médéric Gasquet-Cyrus, described by the media as a “Marseille specialist”, even points out that the so-called “Parisian” accent is itself an accent. In French, the accent known as “standard” developed on the basis of sociologically dominant groups: the Parisian bourgeoisie, the media (radio, TV) and the well-off middle classes, for example.

Tour de France of regional accents and linguistic discrimination – France24

For several years, a collective of researchers has been trying to define the contours of a reference French based on the similarities shared by all the varieties spoken across the French-speaking world. The “Phonology of Contemporary French” project has thus made accents available for the general public to listen to.

It should also be noted that the value placed on an accent (strong, soft, romantic, harsh) varies greatly across individuals, eras, and social groups. The Hungarian philologist Iván Fónagy showed in his work The Living Voice: Essays on Psychophonetics that people tend to ascribe the same properties to sounds: /r/ as a lively sound, /i/ as small, /u/ (spelled “ou” in French) as opulent, etc.

Delete or keep, the chicken or the egg?

In sociology, Wayne Brekhus raises the need to look at the invisible by engaging with both the marked and the unmarked, that is, the accent and what passes for no accent. This leads to a re-examination of the power relations between individuals and of the way we homogenize the marked: the person who (according to others) has an accent.

We are also led to ask how new technologies can make us more “actor” or “actress” than “machine”, says Catherine Pascale, by taking part in the creation of an eco-ethical framework.

Removing an accent means privileging a dominant accent type while overlooking the other factors involved in how that accent is perceived and in the emergence of linguistic discrimination. Removing the accent does not remove the discrimination. On the contrary, an accent helps to identify a person, and thus takes part in phenomena of humanization, group affiliation and even empathy: the accent is quite ancient.

While technological advances in artificial intelligence and deep learning offer society untapped potential, they can also lead to a dystopia in which dehumanization pushes into the background the political and societal role of coexistence and of the diversity that resonates within it, as set out in the UNESCO Universal Declaration on Cultural Diversity.

Rather than hiding accents, it seems necessary to make recruiters aware of how accents can contribute to client satisfaction, and for politicians to take up the issue. Although the National Assembly took a strong step in 2020 by voting for a text banning discrimination based on accent, La Provence notes that the Senate does not seem to see it that way: two years later, the text is still not on its agenda.

This article is produced by The Conversation and hosted by 20 Minutes.
