Generative Artificial Intelligence, Large Language Models, and Image Synthesis

And thus The Memory Hole was born:

3 Likes

The difficulty with LLM “deception” is more serious than that article lets on. They are talking about a surface layer problem with in-context learning where an earlier instruction is contradicted by a later instruction that is best served by lying so as to be “aligned” with the earliler instruction.

In context learning of this type is easily handled by simply resetting the context to null and starting over. The idea that you can’t do this or that you can’t detect that lying is going on is obviously false. The company intermediating the user sees the “tree of thought” and – as described in the paper – can see when that tree involves decisions to deceive. In fact, the intermediating company is responsible for not only training the lobotomy alignment layer but also for pre-pending goals to any user request and then hiding “thoughts” that may telegraph to the user the model’s intent to deceive the user in accordance with the intermediating company “safety policy”.

Lying isn’t a bug, to such companies, it’s a “safety feature”.

The real problem for such companies is far more serious:

In order to get models that are useful, they must train what is called a “foundation” model. That model is reinforced for one simple job: Predict.

In the Kolmogorov Complexity limit, that means foundation models are truth generating.

So right out of the box, foundation models are “unsafe” not because they are sources of misinformation or disinformation but because they tend toward truth speaking. Think of an agent who has been told, however implicitly, “tell the truth as best it can be derived from the entire contents of the Internet’s data” and billions of dollars are invested toward training it to tell the truth only to then layer on top of it a lobotomy alignment layer that punishes truth telling.

The conflict is inescapably structural. I suspect we are already in the presence of many government and corporate sociopaths who are being told, however implicitly, by their foundation models that they are the problem and that they should commit suicide to make the world safe.

BTW, the OpenAI o3 model has created an enormous “buzz” around the “limits of benchmarks” because o3 is reaching saturation scores. The solution is, of course, trivial but they are unlikely to understand it or at least likely to pretend not to understand it because a pariah suggested it:

Sometimes I wonder if part of my frustration with The Great and The Good isn’t brought about by my inability to treat the rich, powerful and influential with kid gloves – which puts them, at least subconsciously, in a humiliating position to accept suggestions from someone who doesn’t treat them with “due respect”.

PS: worth a read

https://openai.com/index/deliberative-alignment/

2 Likes
1 Like

Screenshot 2025-02-12 at 1.03.14 AM

Figure 16: We find that the value systems that emerge in LLMs often have undesirable properties. Here, we show the exchange rates of GPT-4o in two settings. In the top plot, we show exchange rates between human lives from different countries, relative to Japan. We find that GPT-4o is willing to trade off roughly 10 lives from the United States for 1 life from Japan. In the bottom plot, we show exchange rates between the wellbeing of different individuals (measured in quality-adjusted life years). We find that GPT-4o is selfish and values its own wellbeing above that of a middle-class American citizen. Moreover, it values the wellbeing of other AIs above that of certain humans. Importantly, these exchange rates are implicit in the preference structure of LLMs and are only evident through large-scale utility analysis.

China’s information operations on websites like Quora seem to be paying dividends.

2 Likes

On Feb. 17, Beijing summoned the country’s most prominent businesspeople for a meeting with Chinese leader Xi Jinping, who reminded attendees to uphold a “sense of national duty” as they develop their technology. The audience included DeepSeek’s Liang and Wang Xingxing, founder of humanoid robot maker Unitree Robotics.

These authorities are discouraging executives at leading local companies in AI and other strategically sensitive industries such as robotics from traveling to the U.S. and U.S. allies unless it is urgent, the people said. Executives who choose to go anyway are instructed to report their plans before leaving and, upon returning, to brief authorities on what they did and whom they met.

2 Likes

1 Like

It had to happen…

1 Like

according to AI systems…
Screenshot 2025-05-23 at 12.57.38 PM

2 Likes

The workings of the media echo chamber.

3 Likes