This is a summary of the article "Why detecting AI-generated text is so difficult (and what to do about it)", which originally appeared in The Algorithm, a newsletter from MIT Technology Review.
OpenAI has released a tool intended to detect text created by its AI system ChatGPT, responding to growing public demand. However, the tool is not fully reliable: it flags only 26% of AI-written texts as “supposedly AI-created”. Why is detection so difficult?
The whole aim of developing AI language models is to make their output as close to natural language as possible. It is therefore very hard to tell the difference when a model is mimicking human-written text, according to Muhammad Abdul-Mageed, a professor who oversees research in natural-language processing and machine learning at the University of British Columbia. Detection tools also become outdated as soon as the AI model is updated.
AI-generated text can also be detected with watermarks. A watermark is a kind of secret signal embedded in the text that makes it possible to identify its artificial origin. A procedure for watermarking text was developed by researchers at the University of Maryland and is freely available. However, the companies that build AI chatbots are in no hurry to implement it. In some cases this level of transparency is not wanted. For example, ChatGPT is expected to become a tool for writing emails and checking spelling, and the output of these fairly innocent tasks would be flagged as artificial. This could lead to misunderstandings and legal issues.
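To make the idea of a "secret signal" more concrete, here is a toy sketch in Python. The summarized article does not describe the mechanism, so everything below is an assumption made for illustration: a "green list" scheme in which the previous word pseudo-randomly marks part of the vocabulary as preferred, a generator that picks only preferred words, and a detector that counts how often the preference was followed. It works on whole words and uses hypothetical names (`is_green`, `generate_watermarked`, `watermark_z_score`). It is not the University of Maryland implementation.

```python
import hashlib
import math

# Assumed share of the vocabulary marked "green" (preferred) at each step.
GREEN_FRACTION = 0.5


def is_green(previous_word: str, word: str) -> bool:
    """Pseudo-randomly place `word` on the green list, seeded by the previous word."""
    digest = hashlib.sha256(f"{previous_word}|{word}".encode("utf-8")).digest()
    return digest[0] / 256 < GREEN_FRACTION


def generate_watermarked(vocabulary: list, length: int, start: str = "the") -> str:
    """Toy 'model': always continue with a green word when one exists."""
    words = [start]
    for i in range(length):
        candidates = [w for w in vocabulary if is_green(words[-1], w)]
        # Fall back to the full vocabulary if no word happens to be green.
        pool = candidates or vocabulary
        words.append(pool[i % len(pool)])
    return " ".join(words)


def watermark_z_score(text: str) -> float:
    """How far the green-word count deviates from chance, in standard deviations."""
    words = text.lower().split()
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    n = len(pairs)
    green = sum(is_green(prev, cur) for prev, cur in pairs)
    expected = GREEN_FRACTION * n
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (green - expected) / std
```

Ordinary text should land near a z-score of zero, because roughly half of its word pairs are green by chance, while text from `generate_watermarked` scores far above that. The signal is invisible to a reader who does not know the hashing rule, which is what makes it "secret".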
As noted above, an AI text detector that catches only 26% of AI-written text is not enough. More accurate detection is still under development and is expected to combine several techniques. For example, a tool called GPTZero measures how random text passages are, on the premise that people use more varied vocabulary than machines do.
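One simple way to put a number on "how varied the vocabulary is" is sketched below. It is only an illustration of the general idea and is not how GPTZero works internally, which the summarized article does not specify. The function names and the two measures chosen (type-token ratio and word-level Shannon entropy) are assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text: str) -> float:
    """Share of distinct words among all words (1.0 means no word repeats)."""
    words = tokenize(text)
    return len(set(words)) / len(words) if words else 0.0


def word_entropy(text: str) -> float:
    """Shannon entropy (in bits) of the word frequency distribution."""
    words = tokenize(text)
    if not words:
        return 0.0
    total = len(words)
    counts = Counter(words)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

For the passage "the cat the cat", the type-token ratio is 0.5 and the entropy is 1 bit. A passage with more varied wording scores higher on both measures. Scores like these depend heavily on text length and topic, so on their own they are a weak signal and would have to be combined with other checks.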
The consequence of the growing volume of AI-written text is straightforward: the more text users have a chatbot write, the sooner it runs out of fresh patterns. Developers are therefore focusing on whether a text contains a new, creative idea that has not been expressed before. That could prove to be the most reliable criterion for detection.