People trust deepfake faces more than real faces, study finds
A study of three experiments conducted by researchers from Lancaster University and UC Berkeley has found that AI-synthesised (deepfake) faces are indistinguishable from real faces and that people rate the former as more trustworthy.
A report of the study, authored by Sophie J Nightingale and Hany Farid, was published in the Proceedings of the National Academy of Sciences of the United States of America.
For the study, the researchers used 400 synthetic faces generated by StyleGAN2, ensuring equal representation across gender (200 women and 200 men), estimated age (faces that appear to span a range of ages from childhood to older adulthood), and race (100 Black, 100 Caucasian, 100 East Asian and 100 South Asian). To reduce the effect of outside cues, the researchers only used images with a uniform background and no clear rendering artefacts.
For each of the 400 synthetic faces, the researchers collected a matching face (in terms of overall appearance, race, gender etc.) from the face database used in StyleGAN2. A neural network extracted a low-dimensional representation of each synthetic face, which was then compared against the database of real faces to find the most similar face in each case.
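The matching step described above amounts to a nearest-neighbour search over face representations. The following is a minimal sketch of that idea, not the researchers' actual pipeline: the three-dimensional vectors, the face names and the use of Euclidean distance are all assumptions made for illustration, and the function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(query, database):
    """Return the name of the database entry whose vector is closest to query."""
    return min(database, key=lambda name: euclidean(query, database[name]))


# Hypothetical low-dimensional representations of real faces
# (illustrative values only, not data from the study).
real_faces = {
    "real_001": [0.10, 0.80, 0.30],
    "real_002": [0.90, 0.10, 0.40],
    "real_003": [0.20, 0.70, 0.35],
}

# Hypothetical representation of one synthetic face.
synthetic_face = [0.18, 0.72, 0.33]

match = most_similar(synthetic_face, real_faces)
print(match)
```

In practice the representations would come from a face-recognition network and have many more dimensions, but the selection logic stays the same: the real face with the smallest distance to the synthetic face is taken as its match.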
In the first experiment, 315 participants classified 128 of the 800 faces as real or synthesised, one by one. The participants achieved an average accuracy of just 48.2 per cent, which is close to the chance performance of 50 per cent.
For real faces, accuracy depended on a significant interaction between gender and race. Mean accuracy was higher for male East Asian faces than for female East Asian faces. It was also higher for male White faces than for female White faces. The study found no such significant interaction between race and gender for synthetic faces.
In the second experiment, 219 new participants classified 128 faces taken from the 800 faces, but this time with training and trial-by-trial feedback. Average accuracy improved a little, to 59 per cent, but there was no improvement in accuracy over time despite the trial-by-trial feedback. The average accuracy was 59.3 per cent for the first set of 64 faces and 58.8 per cent for the next set of 64.
The distribution of participant accuracy for experiment 1 and experiment 2. (Image credit: PNAS)
The third experiment was designed to ascertain whether there is a difference in perceived trustworthiness between synthetic and real faces. A total of 223 participants rated the trustworthiness of 128 faces taken from the same set of 800 faces on a scale of one to seven (one for very untrustworthy and seven for very trustworthy).
Perceived trustworthiness ratings for experiment 3. (Image credit: PNAS)
At the end of the experiment, the average rating for real faces was 4.48, compared to 4.82 for synthetic faces. Even though the difference is only 7.7 per cent, it is statistically significant (t(222) = 14.6, P < 0.001): a large t-value together with a very small p-value means a gap this size is unlikely to be due to chance.
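The 222 degrees of freedom reported with the t-value are one fewer than the 223 participants, which is consistent with a test on per-participant differences between synthetic and real ratings. The sketch below shows how such a paired t-statistic is computed; the four pairs of ratings are made-up illustrative numbers, not data from the study, and the function name is hypothetical.

```python
import math
import statistics


def paired_t(real_means, synthetic_means):
    """Paired t-statistic and degrees of freedom for per-participant mean ratings."""
    # One difference per participant: synthetic rating minus real rating.
    diffs = [s - r for r, s in zip(real_means, synthetic_means)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Sample standard deviation of the differences (n - 1 denominator).
    sd_diff = statistics.stdev(diffs)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical mean ratings from four participants (illustration only).
real = [4.0, 4.5, 5.0, 4.2]
synthetic = [4.5, 4.8, 5.4, 4.4]

t_value, dof = paired_t(real, synthetic)
print(f"t({dof}) = {t_value:.2f}")
```

With 223 participants the same calculation yields 222 degrees of freedom. The t-value grows when the average difference is large relative to how much it varies between participants, which is why a modest gap in average ratings can still be highly significant.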
The four most trustworthy faces (top) and four least trustworthy faces (bottom). S – Synthetic. R – Real. The number is the average trustworthiness rating given on a scale of 1 – 7. (Image credit: PNAS)
Women's faces were rated significantly more trustworthy than men's, with an average rating of 4.94 compared to 4.36 for male faces. There was also a small effect of Black faces being rated more trustworthy than South Asian faces. There was no other significant effect across race in the third experiment.