Kaido Orav's fx-cmix Wins 6911€ Hutter Prize Award!

The present trend in ML is to try to make up for the lack of lossless compression of the training data by expanding on that poorly compressed data at inference time: continuing a trunk query into an exponentially branching tree of thought, with cross-checks, in the hope of finding some coherence.
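For readers unfamiliar with the pattern being criticized, here is a minimal sketch of that inference-time expansion (self-consistency voting over sampled reasoning branches); `sample_completion` is a hypothetical stand-in for whatever LLM API is actually used:

```python
import random
from collections import Counter

def sample_completion(prompt: str) -> str:
    """Hypothetical stand-in for an LLM sampling call; a real API would
    go here.  Returns a reasoning chain ending in 'FINAL: <answer>'."""
    return "...reasoning...\nFINAL: " + random.choice(["42", "42", "41"])

def tree_of_thought_vote(trunk_query: str, branches: int = 16) -> str:
    """Expand one trunk query into many sampled reasoning branches, then
    cross-check by majority vote over the final answers.  Inference-time
    compute grows with `branches`; the weights (the world model) are
    never improved."""
    answers = []
    for _ in range(branches):
        chain = sample_completion(trunk_query)
        answers.append(chain.rsplit("FINAL:", 1)[-1].strip())
    return Counter(answers).most_common(1)[0][0]

print(tree_of_thought_vote("What is 6 * 7?"))
```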

The upshot is that demand for inference-time hardware like Positron will increase – possibly catastrophically – because people aren’t bothering to recognize that maximizing lossless compression has exponential downstream benefits: it gets the world model right before conclusions (inferences) are drawn from it.
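The “exponential” part is just the standard coding-theoretic identity relating compression and prediction (nothing beyond the textbook here): a model’s code length and the probability it assigns are two views of the same quantity, so every bit shaved off the lossless compression of the data multiplies the probability the world model assigns to what was actually observed by two.

```latex
% Code length and model probability are two views of the same quantity:
L_q(x) = -\log_2 q(x) \quad\Longleftrightarrow\quad q(x) = 2^{-L_q(x)}
% Hence a model q' that compresses the same data x into b fewer bits
% than q assigns that data 2^b times the probability:
L_{q'}(x) = L_q(x) - b \;\Longrightarrow\; q'(x) = 2^{b}\, q(x)
```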

“An ounce of {training|lossless compression|Occam’s razor|scientific theory} is worth a pound of {inference with cross-checks|tree of thought|introspective self-criticism|experimental engineering|etc.}.”

Of course, that aphorism too has its limits, as evidenced by the over-reliance on theory in some fields and the tendency to shoehorn observations into existing theory, if not dismiss them outright, out of confirmation bias.

In any event, I think the increased demand for NVIDIA that a lot of people expect from the Jevons Paradox will instead show up in hardware specialized for inference, so that people can pursue the latest fad without getting back to the hard work of truth-seeking in their world models.

1 Like

You are making a good case for the value of compression, but the weights in existing LLMs have already done a huge amount of work. There’s not going to be anything catastrophic, unless someone starts building armies of remote-controlled zombies. The case of the Zizians shows that this isn’t just a theoretical concern.

2 Likes

Is there a principle underlying these “reasoning” models? By “principle” I mean something commensurable with algorithmic information theory’s “top-down” approach to model creation. An example would be a theoretical relation between how far a model departs from the Kolmogorov-complexity (minimal-size) model and how fast it departs from the optimal algorithmic probability distribution of the data.
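One way such a relation could be made precise, sketched here from standard algorithmic information theory rather than from anything known about the “reasoning” models themselves, is via Levin’s coding theorem: every bit by which a computable model’s code length departs from the Kolmogorov-complexity (minimal-size) model costs a factor of two against the optimal algorithmic probability of the data.

```latex
% Coding theorem (Levin): the universal (algorithmic) probability m(x)
% matches Kolmogorov complexity up to an additive constant:
-\log_2 \mathbf{m}(x) = K(x) + O(1)
% If a computable model q codes x with \Delta bits more than the
% minimal-size model, i.e.
L_q(x) = -\log_2 q(x) = K(x) + \Delta,
% then it falls below the optimal algorithmic probability by the
% corresponding exponential factor:
q(x) = 2^{-L_q(x)} = 2^{-\Delta + O(1)}\,\mathbf{m}(x)
```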

1 Like

It’s easy to become trapped inside the computational machine and lose sight of the fact that the machine merely supports life inside a bigger system with a bit of computational intelligence. The computational concerns are somewhat trivial compared to the external ones.

1 Like

I’m not sure what you’re referring to here, unless it is my apparent over-emphasis on theory to the exclusion of the wider societal implications of engineering.

That would be most uncharitable of you, given that my often-stated motive is to draw attention to the global economy as an unfriendly AGI that is increasingly turking humans into its NPC functional units.

THAT is the real problem with “a bigger system”. Once it starts turking humans, you are dealing not with just “a bit” of “computational intelligence” but with a vast sea of computational intelligence and a decreasing remnant of humanity.

Of course, the global-economy AGI is nothing compared to the rest of the solar system, let alone the galaxy, let alone the universe.

1 Like

I think we agree on the problem, and I also don’t think you consider algorithmic probability some kind of panacea. I merely pointed out that, ultimately, Wikipedia tries to summarize data and experience, which are primary and should be kept around instead of relying only on the summaries.

Land also draws the connection between the global economy and AGI, but I haven’t dug into it: https://retrochronic.com/

2 Likes

Imagine an NPC like Isadore Singer, occupying the heights of the NSF under Reagan, thinking he should socially engineer the United States with a fraudulent economics paper so as to ensure that engineers and scientists, struggling against the tsunami born of sexual tectonics, can’t afford to form families, and the US turns into a third-world shithole country:

(As soon as I posted that link, the account that posted the following video was thrown down the memory hole, so that “fools” can have the plausible deniability of “mere coincidence”.)

I’m sure that when he engaged in this deception he thought he was doing “good”, just as he engaged in self-deception about the justifying economics paper by Myles Boylan.

Now, consider Myles Boylan’s role in this:

He’s put in a position where he knows the conclusion his superiors expect from the economics paper. He’s being asked to provide a paper supporting a conclusion that will lower labor costs in the near term, so that his superiors have more of that sweet, sweet cheap-inferior-labor Fentanyl the Maoists want to use to bring down the West by corrupting capitalists, hence capitalism.

My emphasis on computational intelligence is a kind of reductio ad absurdum meant to wake these NPCs up to their self-deception. Automate Myles Boylan so that he can’t provide Isadore Singer, hence his superiors such as Erich Bloch, the self-deception they demand of him – demand with plausible deniability even to themselves.

These NPCs of the global economy are monkeys who want to maintain their appearance of virtue and superiority so as to maintain their primate-hierarchy status, and they don’t want to know that that’s what they’re doing.

So don’t get hung up on my emphasis on “Wikipedia”. That’s merely a tactic to get around the fact that I can’t go straight to “The Foundation World Model That Might Have Been” because ALL of the positions of influence in ALL of the organizations supposedly providing us with guidance for the future, are occupied by these goddamn fucking monkey NPCs. They smell positions of status and influence and go straight for the jugular of humanity.

My emphasis on Wikipedia as a corpus for demonstrating the AIC as a model selection criterion (thereby bypassing the Myles Boylans of the world) is simply a consequence of my priority on the Hutter Prize for Lossless Compression of Human Knowledge as a way of demonstrating the principle of the AIC. Once that is demonstrated, I’m hoping (and of course you are free to call me a damn fool) that some innocent, starry-eyed economist or sociologist with an interest in machine learning will see it as the next step beyond Many Analysts, One Dataset, and realize that his real job isn’t to create lies for his superiors.
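To make “compressed size as the model selection criterion” concrete at toy scale, here is a minimal sketch; the Markov-model class, the eight-bits-per-parameter model cost, and all names in it are my own illustrative assumptions, not anything from the Hutter Prize rules:

```python
import math
from collections import Counter, defaultdict

def two_part_code_length(text: str, order: int, bits_per_param: float = 8.0) -> float:
    """Two-part description length (in bits) of an order-k Markov model:
    model cost = number of fitted parameters * bits_per_param (a crude
    assumption), plus data cost = -log2 likelihood of the text under the
    fitted model with add-one smoothing."""
    alphabet = sorted(set(text))
    counts = defaultdict(Counter)
    for i in range(order, len(text)):
        counts[text[i - order:i]][text[i]] += 1
    data_bits = 0.0
    for i in range(order, len(text)):
        ctx, sym = text[i - order:i], text[i]
        p = (counts[ctx][sym] + 1) / (sum(counts[ctx].values()) + len(alphabet))
        data_bits += -math.log2(p)
    model_bits = sum(len(c) for c in counts.values()) * bits_per_param
    return model_bits + data_bits

text = "the quick brown fox jumps over the lazy dog " * 50
# Model selection: choose the Markov order with the smallest total code length.
best_order = min(range(4), key=lambda k: two_part_code_length(text, k))
print(best_order)
```

The criterion is the same whatever the model class: whichever candidate yields the smallest total of model description plus data-given-model wins, with no appeal to the conclusions anyone’s superiors expect.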

2 Likes

Interesting, thanks for this commentary. Indeed, LLMs could plausibly put all the information regurgitators out of business.

In the meantime, I’m curious about your opinion about this:

Is this technology going to bring us closer to truth, or will it make it even easier to dupe people?

4 Likes

Bear in mind the difference between science and engineering. The NPCs doing the social engineering believe themselves to be engaged in “science” and therefore to be in possession of “the truth”. They feel morally justified in creating indefatigable persuasion machines because they believe they possess that truth. This is an ego structure that can be attacked.

It is a race against time between advancing the scientific method, which strikes at the foundation of that ego structure, and the application of world models that are just “good enough” to let their zealous social engineering destroy humanity in service of their NPC monkey brains. This is the dangerous period in which we live, and you are quite correct that there are practical applications of the existing “good enough” models.

3 Likes

For example, consider this page: Silencing Science Tracker | Sabin Center for Climate Change Law

They’re dishonest: there’s so much science being silenced that they’re not tracking – and through their dishonesty they’re hurting science as a whole.

Very much so. What are we to do now?

2 Likes

Why did even the “Based” Republican leaders, going back at least to Reagan, have to subvert the US’s overwhelming lead in the information industry, to the point that even the world’s richest man now has to self-destruct rather than upset his Indian parasites? People think I’m crazy to posit Mao’s Revenge, but that’s because they have no idea how espionage works or how the Chinese mind works.

1 Like

Here is a guy to take seriously about violating the scaling laws, because he’s developing a theory of grokking.

1 Like