They used real chats without permission to train a chatbot, and had to withdraw it over hate speech and leaked personal data



Science of Love is an app developed by a company called Scatter Lab which, in exchange for a small payment, allows young South Koreans to evaluate, in theory, the affection their partners show them by analyzing transcripts of their conversations on KakaoTalk, a popular messaging platform in their home country.



The app itself is nothing special... what matters is what happened with the data of the 100,000 people who downloaded it: Scatter Lab used it to train, through deep learning, a chatbot called 'Lee Luda', which posed as a twenty-something South Korean K-pop fan.



Lee Luda was quite successful thanks to its generally realistic interactions with users: between the last week of 2020 and the first week of this year, 750,000 people talked to 'her' on Facebook Messenger.



The problem is that, from time to time, users received offensive comments. And we are not talking (only) about messages of a sexual nature, but also about insults directed at lesbians, transgender people, Black people, people with disabilities, and menstruating women.












No, this is not like the Microsoft chatbot case



You might think that this is not so relevant either: after all, we all remember the famous case of Tay, the Microsoft chatbot that also had to be withdrawn from circulation for similar reasons (hate messages, inappropriate language, etc.).



The difference is that, where Tay was 'trained' live by users who knew they were interacting with a chatbot (and who, in many cases, were explicitly trying to sabotage Microsoft's experiment), Lee Luda had been trained on real conversations between real people who had no idea their data would be put to that use.



In the words of Namkoong Whon, CEO of Kakao Games:




"It is society who needs to reflect after this incident, not artificial intelligence. It is not as if Lee Lu-da has said things outside of society: he is just a character who was trained from conversations made between people between 10 and 20 years".













A huge personal data leak



Of course, there is a second problem: the data had been transferred without the consent of the people affected.



And, to make matters worse, the conversations had not been anonymized, so in some cases the chatbot answered explicit questions from users with real personal data (bank account numbers, postal addresses, telephone numbers).
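To see how basic the missing safeguard was: a minimal sketch of the kind of redaction pass that could scrub chat transcripts before training. This is purely illustrative, not Scatter Lab's actual pipeline; the regex patterns, placeholders, and function names are assumptions.

```python
import re

# Illustrative patterns (assumptions, not Scatter Lab's actual rules):
# South Korean mobile numbers (e.g. 010-1234-5678), email addresses,
# and long digit runs that could be bank account numbers.
PATTERNS = [
    (re.compile(r"\b01[016789][-\s]?\d{3,4}[-\s]?\d{4}\b"), "[PHONE]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
    (re.compile(r"\b\d{10,14}\b"), "[ACCOUNT]"),
]

def redact(message: str) -> str:
    """Replace likely personal identifiers in a chat message with placeholders."""
    for pattern, placeholder in PATTERNS:
        message = pattern.sub(placeholder, message)
    return message

if __name__ == "__main__":
    sample = "Send it to 010-1234-5678 or pay into 1002345678901."
    print(redact(sample))  # Send it to [PHONE] or pay into [ACCOUNT].
```

Pattern-based redaction like this is a crude floor, not true anonymization (names and addresses need more sophisticated detection), but even this step would have kept phone and account numbers out of the chatbot's replies.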



It was later discovered that, in addition, the AI's training data had even been published in a GitHub repository, further increasing the public exposure of the data.



Now, of course, Scatter Lab is embroiled in litigation for violating its users' data privacy. Not to mention that negative reviews in recent weeks have sunk the app's rating on the Google Play Store.



Via | The Register