OpenAI, a leading artificial intelligence (AI) research laboratory, is developing a watermark system that would let it identify text created by its ChatGPT AI. The system is intended to stop people from taking content the AI models generate and misrepresenting it as their own work.
The watermark security feature could make it easier for professors and teachers to identify students who use text generators like OpenAI's GPT for their essays and creative content.
How ChatGPT generates text
Understanding how ChatGPT generates text is crucial to understanding how OpenAI's watermarking tool works. These systems interpret text as strings of "tokens," which can be words, punctuation marks, or word fragments. At each step, the system produces a probability distribution over the next token to output, accounting for all tokens that have already been generated.
For systems hosted by OpenAI, such as ChatGPT, OpenAI's server performs the sampling of the next token according to that distribution once the distribution has been formed.
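The distribution-then-sample loop described above can be sketched in a few lines of Python. This is a minimal illustration only; the vocabulary and probabilities are made up, and a real model scores tens of thousands of tokens at each step.

```python
import random

def sample_next_token(distribution):
    """Sample the next token from a probability distribution.

    `distribution` maps candidate tokens to probabilities, as a language
    model might produce after scoring its vocabulary against all tokens
    generated so far.
    """
    tokens = list(distribution)
    weights = [distribution[t] for t in tokens]
    return random.choices(tokens, weights=weights, k=1)[0]

# Hypothetical distribution over the token following "The sky is":
dist = {"blue": 0.7, "clear": 0.2, "falling": 0.1}
print(sample_next_token(dist))  # usually "blue", sometimes "clear" or "falling"
```

The key point for watermarking is that this sampling step is the one place where the server gets to make a choice, which is exactly where a watermark can be injected.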
Why would OpenAI want to watermark work created by ChatGPT?
The OpenAI chatbot has caught the attention of netizens after demonstrating a penchant for answering difficult queries, producing poetry, resolving coding conundrums, and waxing lyrical on a variety of philosophical subjects.
While ChatGPT is really entertaining and helpful, there are clear ethical issues with the system. Like many text-generating tools before it, ChatGPT could potentially be used to create convincing phishing emails and plagiarized essays. Additionally, ChatGPT's factual inconsistency as a tool for answering questions caused programming Q&A website Stack Overflow to temporarily block responses from the AI.
How does the "watermark" work?
Scott Aaronson, a guest researcher at OpenAI, stated during a presentation at the University of Texas that OpenAI's watermarking tool functions as a "wrapper" over existing text-generating systems, using a cryptographic algorithm operating at the server level to "pseudorandomly" choose the next token. Even though the text produced this way looks random to a casual observer, anyone with access to the cryptographic key could theoretically detect the watermark.
Empirically, it appears that a few hundred tokens are sufficient to provide a solid indication that the text was produced by an AI system. In theory, you could even take a lengthy book and determine which passages most likely originated from the system and which passages most likely did not.
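Aaronson has not published implementation details, so the following is only a speculative sketch of how keyed pseudorandom sampling and detection could work, loosely based on his public description. Everything here is an illustrative assumption: the function names, the scoring rule, and the use of HMAC-SHA256 as a stand-in for whatever cryptographic function OpenAI actually uses.

```python
import hashlib
import hmac
import math

SECRET_KEY = b"known-only-to-the-provider"  # stand-in for the provider's key

def prf(key, context, token):
    """Keyed pseudorandom value in [0, 1) derived from (context, token)."""
    msg = ("|".join(context) + "||" + token).encode()
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def watermarked_choice(distribution, context, key=SECRET_KEY):
    """Pick the token t maximizing r_t ** (1 / p_t), where r_t is the keyed
    pseudorandom value and p_t the model's probability. The choice still
    tracks the model's distribution on average, but it is deterministic
    given the key, so a detector can recompute it."""
    return max(distribution,
               key=lambda t: prf(key, context, t) ** (1.0 / distribution[t]))

def detection_score(tokens, key=SECRET_KEY):
    """Average -log(1 - r_t) over the tokens. For text unrelated to the key
    the r_t behave like uniform draws and the average hovers near 1.0;
    watermarked text, whose chosen tokens have biased-high r_t, scores
    noticeably higher. A few hundred tokens make the gap reliable."""
    total = 0.0
    for i, t in enumerate(tokens):
        r = prf(key, tokens[:i], t)
        total += -math.log(1.0 - r)
    return total / len(tokens)
```

Sliding this detector over a long document would give the passage-level attribution described above: windows of a few hundred tokens that score well above the baseline most likely came from the watermarked system.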
Limitations to the system
The concept of watermarking text produced by AI is not new. Previous attempts, most of them rule-based, relied on tricks like word alterations and synonym replacements. However, OpenAI's approach looks to be one of the first cryptography-based answers to the problem, outside of theoretical studies released by the German research institution CISPA in March.
Aaronson declined to provide any information regarding the watermarking prototype when reached for comment, but he did mention that he plans to co-author a research article in the near future. Additionally, OpenAI simply stated that watermarking was one of the "provenance approaches" they were investigating to identify work produced by the AI.
Giving out the key (which only OpenAI holds) for free would keep OpenAI from benefiting financially from it. Worse, putting the key in everyone's hands would let people find workarounds or strip the watermark entirely, leaving OpenAI in a difficult position.
We will have to wait and see whether OpenAI or someone else can come up with a solution that works well for all parties concerned. Still, it is notable that watermarking is only one of several approaches OpenAI is examining to deal with the issue.