How To Nuke Large Language Model Bias?

jabowery · 2 March 2023 02:20

The Applied Brain Research guys appear to have a breakthrough in language modeling described in this video seminar. I didn’t want to talk about this video until I had given them a chance to enter the Hutter Prize, which I suggested they do during the seminar – but its been nearly 6 months and I haven’t heard anything from them. They outperform transformers on language modeling benchmarks, and are technically able to scale up more easily and economically in terms of CUDA cores and VRAM.

I’ve been watching the guys at ABR ever since they published their first paper on Legendre Memory Units a few years ago because of the parsimony of their models measured in terms of parameters – as well as their orientation toward dynamical systems. Both of these are big advantages in terms of the Algorithmic Information Criterion for causal model selection. This is the most promising of the emerging language modeling technologies.