Tuesday, 20 December 2022

AI worries from MIT

Following the story noticed at reference 1, yet another story yesterday from MIT at reference 2 about one of what seem to be the many dangers of AI. With another one noticed at reference 3.

This one is mainly about language generation software, sometimes grouped under the heading 'large language models', with something called ChatGPT - a newly released chunk of software of this sort from OpenAI - to be found at references 4 and 5. The sort of software into which Meta, aka Facebook, has also poured a huge amount of money.
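By way of illustration - and this is just a sketch of mine, not anything from the MIT piece - ChatGPT itself is worked through a web page, but OpenAI also sell programmatic access to their underlying text models. Something along the following lines, assuming their openai Python library and a paid-for API key (the model name and prompt here being illustrative):

    import openai

    # An API key from OpenAI is needed; kept out of the source code in real life.
    openai.api_key = "sk-..."

    # Ask one of the text-completion models to write something plausible.
    response = openai.Completion.create(
        model="text-davinci-003",  # one of OpenAI's text models of late 2022
        prompt="Write a short, confident paragraph about the history of the telephone.",
        max_tokens=200,
        temperature=0.7,  # some randomness; higher means more invention
    )

    print(response["choices"][0]["text"].strip())

The point being that a few lines of this sort will produce fluent text about more or less anything - whether or not that text is true.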

The problem seems to be that while in the beginning it might just have been a bunch of good-natured geeks working on the interesting & tricky problem of getting computers to talk sense, we now have accessible software which, using all the stuff out on the Internet as a foundation, can generate text about more or less anything you care to name. Text which might read plausibly enough, but which might be complete rubbish or worse. It seems that the software is cued by what it finds out there, and to the extent that it finds rubbish, it is a case of rubbish in, rubbish out. And given that the software never rests, there is going to be lots of it.

It also seems that the ability to generate plausible text has outstripped the ability to distinguish true from false, or good from bad. Or, indeed, the ability to distinguish human text from computer text.
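That said, there is some geekery on the detection problem, the Google paper at reference 8 being an example. One crude approach - my sketch, not theirs - is to score a passage with a language model, on the grounds that machine-generated text tends to be unusually predictable. Assuming the Hugging Face transformers library (the people at reference 6) and the small, freely downloadable GPT-2 model:

    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    # A small, freely available model is enough for scoring.
    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    def perplexity(text: str) -> float:
        # Mean cross-entropy of the text under the model, exponentiated.
        # A low score means the model finds the text easy to predict.
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            loss = model(ids, labels=ids).loss
        return float(torch.exp(loss))

    print(perplexity("The cat sat on the mat and looked out of the window."))
    # A suspiciously low score is one hint - not proof - of machine authorship.

The title of reference 8 suggests, a touch paradoxically, that such automatic detection works best on exactly the fluent text that fools human readers.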

So all this computer generated rubbish, possibly inciting all kinds of bad people to do bad things, is pouring onto the Internet, where it not only gets read by the bad, the ignorant and the unwary, but also acts as seed corn for the next generation of even worse rubbish. Which sounds pretty awful, but I have not come across any proper statistics about all this. Where are all the statisticians who are trying to keep score?

Notwithstanding, it looks to me as if a heavy dose of regulation and supervision is needed - although I can't see the governments of either the UK or the US taking effective action. Rather, they will hide under the blankets of free speech, the need to keep the nanny - not to say surveillance - state at bay, and the need to allow enterprising types to be enterprising. To try and make an honest buck.

For an easier-going, if no less scary, take on all this, one can always watch the television drama series noticed at reference 9. Given that it must have been conceived well over five years ago, the BBC have done rather well.

PS 1: somewhere along the way, I came across the people at reference 6. But I am no longer sure whether they are among the good guys or the bad guys.

PS 2: the Chinese, with their centralised, surveillance state, have simply forbidden the use of this technology for bad purposes. With the state in charge of the interpretation and enforcement of the relevant regulations. So 'the Chinese Cyberspace Administration has banned deepfakes that are created without their subject’s permission and that go against socialist values or disseminate illegal and harmful information'. This in the context of faking up videos rather than text. We are pointed to reference 7, the top of which is snapped above.

References

Reference 1: https://psmv5.blogspot.com/2022/12/lensa.html.

Reference 2: The Algorithm - Melissa Heikkilä, MIT Technology Review - 2022. What I suppose to be a regular column. Issue of 19th December.

Reference 3: https://psmv5.blogspot.com/2022/12/foreign-parts-continued.html.

Reference 4: https://openai.com/blog/chatgpt/.

Reference 5: https://openai.com/research/.

Reference 6: https://huggingface.co/.

Reference 7: http://www.cac.gov.cn/2022-12/11/c_1672221949318230.htm.

Reference 8: Automatic Detection of Generated Text is Easiest when Humans are Fooled - Daphne Ippolito, Daniel Duckworth, Chris Callison-Burch, Douglas Eck, Google - 2020. https://aclanthology.org/2020.acl-main.164.pdf. An example of geekery from Google.

Reference 9: https://en.wikipedia.org/wiki/The_Capture_(TV_series).
