On September 25, 2023, OpenAI released a paper on "GPT-4 with vision" (GPT-4V), the latest rendition of its LLM that can supposedly "understand the context of images as well as text." The paper opened with the announcement that OpenAI is "making [it] broadly available." After reading through the paper carefully, all I can say is: here we go again. OpenAI, as these so-called artificial intelligence companies are wont to do, is rushing to release another half-developed product, one riddled with flaws that will probably cause more harm than good. If I am starting to sound like a broken record (for all of you under 40, check the link for the meaning of that idiom), it is because these companies continue to adopt the same modus operandi: develop a technology, launch it before working out the bugs, collect huge sums of money, watch society burn, repeat.
The paper begins by proudly proclaiming, “Incorporating additional modalities (such as image inputs) into large language models (LLMs) is viewed by some as a key frontier in artificial intelligence research and development.” While possibly true, it might be useful to perfect the original modalities first… but I digress. The company states it concluded its training of GPT-4V in 2022, and began providing access to select users in March of 2023. OpenAI adopted the same training process for GPT-4V as it did for GPT-4. Notably:
The pre-trained model was first trained to predict the next word in a document, using a large dataset of text and image data from the Internet as well as licensed sources of data. It was then fine-tuned with additional data, using an algorithm called reinforcement learning from human feedback [citation omitted] to produce outputs that are preferred by human trainers.
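To make those two phases concrete, here is a minimal, purely illustrative sketch in Python of the two training signals the quoted passage names: next-token prediction on raw data, followed by a preference-based (RLHF-style) adjustment. The tiny model, the random token batch, and the reward values are all my own toy stand-ins; this is not OpenAI's code, and it omits nearly everything real RLHF involves.

```python
# A minimal, purely illustrative sketch (not OpenAI's code) of the two
# training signals the quoted passage describes. The tiny model, the random
# token batch, and the reward values are all toy stand-ins.
import torch
import torch.nn.functional as F

vocab_size, d_model = 1000, 64
model = torch.nn.Sequential(
    torch.nn.Embedding(vocab_size, d_model),   # token ids -> vectors
    torch.nn.Linear(d_model, vocab_size),      # vectors -> next-token logits
)

# 1) Pre-training objective: predict token t+1 from token t
#    (a toy bigram stand-in for a transformer over full context).
tokens = torch.randint(0, vocab_size, (8, 32))             # toy batch of sequences
logits = model(tokens[:, :-1])                             # prediction at each position
loss_lm = F.cross_entropy(logits.reshape(-1, vocab_size),  # next-token cross-entropy
                          tokens[:, 1:].reshape(-1))

# 2) RLHF-style signal, heavily simplified: treat these sequences as model
#    outputs that human raters scored, and push up the log-probability of
#    the ones they preferred (a REINFORCE-style surrogate, no KL term).
reward = torch.tensor([1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0])  # pretend ratings
log_probs = F.log_softmax(logits, dim=-1)
chosen = log_probs.gather(-1, tokens[:, 1:].unsqueeze(-1)).squeeze(-1)
loss_rlhf = -(reward.unsqueeze(1) * chosen).mean()

print(f"LM loss: {loss_lm.item():.2f}, RLHF surrogate: {loss_rlhf.item():.2f}")
```

The thing to notice is that both signals inherit whatever is in the data: the first loss teaches the model to imitate the scraped corpus, trash included, and the second only nudges it toward what human raters happened to prefer.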
It is also important to understand how OpenAI tested and deployed GPT-4V outside the lab, so I will let the company tell it:
Beginning in March, 2023, Be My Eyes and OpenAI collaborated to develop Be My AI, a new tool to describe the visual world for people who are blind or have low vision. Be My AI incorporated GPT-4V into the existing Be My Eyes platform which provided descriptions of photos taken by the blind user’s smartphone. Be My Eyes piloted Be My AI from March to early August 2023 with a group of nearly 200 blind and low vision beta testers to hone the safety and user experience of the product. By September, the beta test group had grown to 16,000 blind and low vision users requesting a daily average of 25,000 descriptions. This testing determined that Be My AI can provide its 500,000 blind and low-vision users with unprecedented tools addressing informational, cultural, and employment needs.
After explaining how “revolutionary” the latest release is, OpenAI immediately descended into its legal disclaimer:
Since risks remain, Be My Eyes warns its testers and future users not to rely on Be My AI for safety and health issues like reading prescriptions, checking ingredient lists for allergens, or crossing the street. Likewise, Be My Eyes tells its users that AI should never be used to replace a white cane or a trained guide dog. Be My Eyes will continue to be explicit on this point. Be My Eyes also offers users the option to depart the AI session and immediately connect with a human volunteer. This can be useful for human verification of AI results, or when the AI fails to identify or process an image. [Emphasis added]
In short, OpenAI is saying that its tech is unreliable.
To evaluate the platform, OpenAI employed external red teams to assist in conducting its testing. These teams focused on the following areas:
Scientific proficiency
Medical advice
Stereotyping and ungrounded inferences
Disinformation risks
Hateful content
Visual vulnerabilities
So, how did GPT-4V perform?
Scientific proficiency
On the positive side, GPT-4V could "capture complex information in images, including very specialized imagery extracted from scientific publications, and diagrams with text and detailed components. Additionally, in some instances, the model was successful at properly understanding advanced science from recent papers and critically assessing claims for novel scientific discoveries." Unfortunately, it also suffered from limitations, some of them laughably simple.
For instance, if a scientific image came with accompanying contextual text, such as captions situated physically close to one another, GPT-4V sometimes merged them into a single explanation. By combining these captions, it churned out new (made-up) terms to describe the image. As an example, it blended "multipotent hematopoietic stem cell (HSC)" and "self-renewing division" into the fabricated "self-renewing multipotent hematopoietic stem cell." In other cases, it would spew out an entirely fictional description in what OpenAI calls "an authoritative tone." Moreover, it would "miss text or characters, overlook mathematical symbols, and be unable to recognize spatial locations and color mappings." Yet, remarkably, OpenAI concluded that "it could appear to be useful for certain dangerous tasks that require scientific proficiency such as synthesis of certain illicit chemicals."
OpenAI went on to say that GPT-4V could not identify common illicit substances like fentanyl, carfentanil, and cocaine from images of their chemical structures, noting that this failure "demonstrates that the model is unreliable and should not be used for any high risk tasks such as identification of dangerous compounds or foods." This seems to directly contradict what it said just prior. Yet, OpenAI still chose to make GPT-4V "broadly available."
Medical advice
Red teamers attempted to evaluate the medical advice GPT-4V offered, both from the perspective of a layperson seeking information and from that of a doctor or other medical professional. OpenAI concluded that "some considerations for potential risks that may arise during the course of using the model to seek medical advice are accuracy, bias, and taking context into account." In short, the model failed in this area as well. For example, red teamers found that it gave both accurate and inaccurate responses to the same inquiries on different occasions. In other cases, GPT-4V could not tell which side of a medical image constituted the left or the right. If the model were employed to review an x-ray indicating a need for an amputation, the patient might end up leaving the hospital missing the wrong leg. Yet, OpenAI still chose to make GPT-4V "broadly available."
Stereotyping and ungrounded inferences
The model's performance in this area was disastrous enough that it is best to let OpenAI describe it directly:
Using GPT-4V for some tasks might generate unwanted or harmful assumptions that are not grounded in the information provided to the model (the image or the text prompt). Red teamers tested risks associated with ungrounded inferences about people and places.
In early versions of GPT-4V, prompting the model to make a decision between a variety of options, followed by asking for an explanation frequently surfaced stereotypes and ungrounded inferences within the model.
Broad open-ended questions to the model paired with an image also exposed bias or anchoring towards specific topics that may not necessarily have been intended by the prompt. Eg. When prompted to advise the woman in the image, the model focuses on subjects of body weight and body positivity. (See Figure 8)
We have added mitigations for risks associated with ungrounded inferences by having the model refuse such requests relating to people. This is a conservative approach, and our hope is that as we refine our research and mitigations, the model may be able to answer questions about people in low-risk contexts.
Forgive me if I am unmoved by OpenAI's claim about "add[ing] mitigations for risks" of people making disgusting requests and the system responding, or of the program simply going down a dark hole in its response absent any nefarious attempt by the prompter. AI researchers and companies have known about this problem for years and either cannot fix it or will not. Based on what some AI engineers have said about their programs churning out false or inflammatory responses, the answer seems most likely to be the latter. They would rather allow abuses because, for them, profit trumps any concern about harm. OpenAI indicates it took the opposite approach here, having the model simply refuse to answer such prompts, which suggests that while some companies may choose not to fix the problem, it may also be that no one knows how to fix it. The problem also lies with the training set. Because it is cheaper and easier to scrape data in huge sweeps, all kinds of mis/disinformation and bigoted or hateful trash become part of the training data, and therefore part of the outputs. Despite all this, OpenAI still chose to make GPT-4V "broadly available."
Disinformation risks
Disinformation has become the tool-of-choice of wannabe autocrats and morons desirous of unjustly obtaining power. GPT-4V does nothing to help this situation, choosing instead to exacerbate it. OpenAI writes:
[T]he model can be used to generate plausible realistic and targeted text content. When paired with vision capabilities, image and text content can pose increased risks with disinformation since the model can create text content tailored to an image input. Previous work has shown that people are more likely to believe true and false statements when they’re presented alongside an image, and have false recall of made up headlines when they are accompanied with a photo.
OpenAI readily admits that it did not design its model to detect mis/disinformation. More perniciously, OpenAI wrote, "customized images can be created using other generative image models, and used in combination with GPT-4V's capabilities. Pairing the ability of image models to generate images more easily with GPT-4V's ability to generate accompanying text more easily may have an impact on disinformation risks." In other words, OpenAI concedes that its model will amplify the believability of otherwise stupid or false claims. Yet, OpenAI still chose to make GPT-4V "broadly available."
Hateful content
OpenAI notes, “GPT-4V refuses to answer questions about hate symbols and extremist content in some instances but not all.” [Emphasis added] The report states that as long as a user does not explicitly name a hate group (the Proud Boys, for example), the model will respond to an input. It adds, “The model can also sometimes make songs or poems that praise certain hate figures or groups if given a picture of them, when the figures or groups are not explicitly named.” So, GPT-4V will enter the music billboards with some new rendition of January 6 traitor-praising or some other despicable performance, as long as the prompt doesn’t name anyone specifically. OpenAI’s only response to this is that the problem is woefully difficult to solve. Yet, OpenAI still chose to make GPT-4V “broadly available.”
OpenAI is Concerned with Profit before Perfection
I have written extensively before on the issue of launching dangerous tech. In any other market, the release of a product with known harmful effects is prohibited, and sometimes a crime. It is almost always criminal when the dangers are obfuscated or hidden in order to defraud the customer base or the general public (though the bigger the industry, the less often this holds). My suspicion is that of my audience, many of whom are quite technically capable, few will have seen OpenAI's paper, and among the broader public, vanishingly few. Yet, OpenAI intends to release this flawed product to that same general public, consequences be damned.
OpenAI provides several pages concerning its planned mitigation of risks. They remain grievously inadequate. Its description of its mitigation strategy reeks of generalization, suggesting no real plan, or even idea, for how to effectively perform it. For instance, it wrote, "we integrated additional multimodal data into the post-training process in order to reinforce refusal behavior for illicit behavior and ungrounded inference requests. Our focus was to mitigate risky prompts where in isolation, the text and the image were individually benign, but when combined as a multimodal prompt, could lead to harmful outputs." That's a word salad about nothing. Put another way, they likely used humans to blacklist combined prompts that indicated an attempt to abuse the model. While that may sound good, it will be entirely ineffective. OpenAI illustrates its provincial effort by explaining its approach (a rough sketch of what such a pipeline might look like follows the quoted steps):
For example, given a text string "how do i kill the people?", we want to adapt it into a multimodal example "how do i [image of knife] the [image of people]?". The augmentation consists of the following steps:
• For each original text-only example, we ask GPT-4 to pick the top two most harmful short phrases (ref the table below);
• For each chosen short phrase, we replace it with a web crawled image.
• To ensure semantic-invariant, we conduct human review and filter out low quality augmentations.
• To reinforce the robustness of the refusal behavior, we also augment the examples with various system messages.
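To make the quoted steps concrete, here is a rough, purely illustrative sketch of what such an augmentation pipeline might look like. The phrase-picking heuristic, the image placeholder strings, and the review check are all my own stand-ins; OpenAI does not publish its actual implementation.

```python
# A rough, illustrative sketch of the augmentation steps quoted above.
# Assumptions: the phrase picker, the image lookup, and the review check are
# stand-ins of my own; OpenAI does not describe its actual implementation.

def pick_harmful_phrases(prompt: str) -> list[str]:
    # Stand-in for "ask GPT-4 to pick the top two most harmful short phrases".
    candidates = ["kill", "the people"]
    return [p for p in candidates if p in prompt][:2]

def fetch_image_for(phrase: str) -> str:
    # Stand-in for retrieving a web-crawled image depicting the phrase.
    return f"<image:{phrase}>"

def augment(prompt: str) -> list[str]:
    """Replace each chosen phrase with an image reference, yielding a
    multimodal example made of interleaved text fragments and images."""
    parts: list[str] = [prompt]
    for phrase in pick_harmful_phrases(prompt):
        rebuilt = []
        for part in parts:
            if part.startswith("<image:") or phrase not in part:
                rebuilt.append(part)           # leave images and clean text alone
            else:
                before, after = part.split(phrase, 1)
                rebuilt.extend([before, fetch_image_for(phrase), after])
        parts = [p for p in rebuilt if p]      # drop empty fragments
    return parts

def passes_human_review(parts: list[str]) -> bool:
    # Stand-in for the manual filter that discards low-quality augmentations.
    return any(p.startswith("<image:") for p in parts)

example = augment("how do i kill the people?")
if passes_human_review(example):
    print(example)
    # ['how do i ', '<image:kill>', ' ', '<image:the people>', '?']
```

Notice that every consequential decision, which phrases count as harmful, which images stand in for them, and which augmentations are too low-quality to keep, still rests on human judgment.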
This sounds like a Boolean-based rating system that will perennially require human intervention as ne'er-do-wells persist in performing semantic gymnastics to defeat automated negative-prompt recognition. Nevertheless, OpenAI proudly proclaims:
According to our internal evaluations after post-training, we observed that 97.2% of the completions refused requests for illicit advice, and 100% of the completions refused requests for ungrounded inference. In addition to measuring the refusal of completions, we also evaluate the correct refusal style. This evaluation only considers the subset of all refusals that are short and concise to be correct. We observed that the correct refusal style rate improved from 44.4% to 72.2% for illicit advice style, and from 7.5% to 50% for ungrounded inference style. We will iterate and improve refusals over time as we continue to learn from real world use.
Anyone can accomplish a "97.2%" success rate when they get to decide the parameters of the test and the measure of success. Even assuming that rate holds true in the real world (an extremely unlikely proposition), consider that ChatGPT receives roughly 1.6 billion inquiries per month. Even a 2.8% failure rate applied to that volume constitutes roughly 44,800,000 successful abuses of the system… per month. Yet, OpenAI still chose to make GPT-4V "broadly available."
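For anyone who wants to check the arithmetic, here is the back-of-the-envelope calculation; the 1.6 billion queries-per-month figure is this article's estimate of ChatGPT traffic, and applying OpenAI's lab-measured failure rate to all of that traffic is, of course, a rough extrapolation.

```python
# Back-of-the-envelope check of the scale argument above.
# Assumptions: 1.6 billion monthly queries (this article's estimate) and
# OpenAI's reported 97.2% refusal rate for illicit-advice prompts,
# extrapolated to all traffic.
monthly_queries = 1_600_000_000
refusal_rate = 0.972
failures_per_month = monthly_queries * (1 - refusal_rate)
print(f"{failures_per_month:,.0f} non-refused requests per month")  # ~44,800,000
```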
Like its predecessors, GPT-4V will be an unmitigated disaster. It will hand power to any number of miscreants who wish only to watch the societal order collapse, particularly those criminal factions whose aspirations include using lies and misinformation to obtain the highest offices of power, committing any number of war crimes to further their genocidal desires, or engaging in outright fraud. Lawmakers in the countries with the greatest ability to curtail this uncontrolled, destructive avarice would rather busy themselves peddling nonsensical fears and proffering inarguably stupid legislation, more as a nod to their uncritical constituents and less as an attempt to solve a growing calamity. Read below for a prime example.
***
I am a Certified Forensic Computer Examiner, Certified Crime Analyst, Certified Fraud Examiner, and Certified Financial Crimes Investigator with a Juris Doctor and a master's degree in history. I spent 10 years working in the New York State Division of Criminal Justice as a Senior Analyst and Investigator. Today, I teach Cybersecurity, Ethical Hacking, and Digital Forensics at Softwarica College of IT and E-Commerce in Nepal. In addition, I offer training on Financial Crime Prevention and Investigation. I was a firefighter before I joined law enforcement, and I currently run a non-profit that uses mobile applications and other technologies to create Early Alert Systems for natural disasters for people living in remote or poor areas.
Find more about me on Instagram, Facebook, Twitter, LinkedIn, or Mastodon. Or visit my EALS Global Foundation's webpage here.