Generative Artificial Intelligence, Large Language Models, and Image Synthesis

One of the cleverer moves by DeepMind was to parade everyone but Legg and Hutter as its public face until it was too late. Musk took the bait hook, line, and sinker, which set self-driving cars back several years. Let me explain:

From day one, Musk insisted that video is adequate for driving as long as you have at least two cameras, just as animals have two eyes. He wanted to keep costs down and avoid sensors like radar or lidar. Of course, if you have only two cameras you will need to swivel them, but the point is to stick with video and keep in mind how animals use two eyes: they construct a dynamic 3D model of the world by integrating two dynamic perspectives. Don’t nitpick over whether animals actually create a 3D model; just accept that it would be sufficient.

Once you accept that a geometric model of the world, in terms of 3D features, is being generated dynamically, consider that this model could be used to compress video for transmission, far more efficiently than 2D feature extraction. So you really just want to develop a video compression algorithm that takes advantage of having at least two cameras. The optimal features emerge simply by viewing the problem in terms of compression.

It was only recently that Musk described the Tesla machine learning strategy in terms that could reasonably be interpreted as implying geometric compression of the video feeds. Had he approached it from that perspective from day one, it would have guided hardware development as well as software development. I suspect he could have gotten by with a lot less hardware.
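To make the stereo-geometry idea concrete, here is a toy sketch (purely illustrative, not Tesla's actual pipeline) of recovering per-pixel disparity from two views by brute-force block matching. Depth then follows from disparity via the camera geometry, and a disparity map plus a single view is far cheaper to transmit than two full video streams:

```python
import numpy as np

def disparity_map(left, right, block=5, max_disp=16):
    """Brute-force SAD block matching: for each pixel in the left
    view, find the horizontal shift of the right view that best
    matches the surrounding block of pixels."""
    h, w = left.shape
    half = block // 2
    disp = np.zeros((h, w), dtype=np.float32)
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            patch = left[y - half:y + half + 1, x - half:x + half + 1]
            costs = [np.abs(patch - right[y - half:y + half + 1,
                                          x - d - half:x - d + half + 1]).sum()
                     for d in range(max_disp)]
            disp[y, x] = np.argmin(costs)
    return disp

# Synthetic stereo pair: the right view is the left view shifted
# 4 pixels, as if the whole scene sat at a single depth.
left = np.random.default_rng(0).random((32, 64)).astype(np.float32)
right = np.roll(left, -4, axis=1)

d = disparity_map(left, right)
# Depth would then be focal_length * baseline / disparity,
# for whatever camera constants the rig actually has.
```

Interior pixels of `d` come back as 4, the known shift; transmitting `left` plus `d` (a few bits per pixel) reconstructs the second view, which is the compression argument in miniature.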


Most of OpenAI’s changes to ChatGPT involve what the AI-powered bot can do: questions it can answer, information it can access, and improved underlying models. This time, though, it’s tweaking the way you use ChatGPT itself. The company is rolling out a new version of the service that allows you to prompt the AI bot not just by typing sentences into a text box but by either speaking aloud or just uploading a picture. The new features are rolling out to those who pay for ChatGPT in the next two weeks, and everyone else will get it “soon after,” according to OpenAI.

The voice chat part is pretty familiar: you tap a button and speak your question, ChatGPT converts it to text and feeds it to the large language model, gets an answer back, converts that back to speech, and speaks the answer out loud. It should feel just like talking to Alexa or Google Assistant, only — OpenAI hopes — the answers will be better thanks to the improved underlying tech. It appears most virtual assistants are being rebuilt to rely on LLMs — OpenAI is just ahead of the game.

The image search, meanwhile, is a bit like Google Lens. You snap a photo of whatever you’re interested in, and ChatGPT will try to suss out what you’re asking about and respond accordingly. You can also use the app’s drawing tool to help make your query clear or speak or type questions to go along with the image. This is where ChatGPT’s back-and-forth nature is helpful; rather than doing a search, getting the wrong answer, and then doing another search, you can prompt the bot and refine the answer as you go. (This is a lot like what Google is doing with multimodal search, too.)

This story presents the forthcoming image features of ChatGPT in a pretty low-key fashion. Some of the results posted so far by early-access users are pretty stunning. 𝕏 user Borriss has collected some of the most remarkable, which you may view in this Threadreader compilation. Here are a few examples.

[example images from the Threadreader compilation omitted]


The rendering is provided by Fill3d:


Introducing DALL·E 3, Integrated with ChatGPT Plus



Is Eliezer okay?



It will be only a matter of time before the brain police insert stuff symbolized by:
image

into “art” intended for people’s children.


Uh-oh:

While participants also do not report any different perceptions of competence and trustworthiness between human and AI-generated content, they rate AI-generated content as being clearer and more engaging.

Experiments with ollama.ai. Prompt:

prove that 9 is a prime number

16-bit Mistral (2 hours on my laptop):

To prove that 9 is a prime number, we need to show that it has no positive integer 
divisors other than 1 and itself. In other words, we need to show that 9 is only 
divisible by 1 and 9.

Let's start by trying to divide 9 by any positive integer less than or equal to its 
square root, which is approximately 3.06. Any of these integers could be a potential 
divisor of 9: 2, 3, 4, 5, and 6. However, when we try dividing 9 by each of these 
numbers, we get the following remainders:

* When divided by 2, the remainder is 1 (9 / 2 = 4.5, with a remainder of 1)
* When divided by 3, the remainder is 0 (9 / 3 = 3, with no remainder)
* When divided by 4, the remainder is 3 (9 / 4 = 2.25, with a remainder of 3)
* When divided by 5, the remainder is 4 (9 / 5 = 1.8, with a remainder of 4)
* When divided by 6, the remainder is 3 (9 / 6 = 1.5, with a remainder of 3)

We can see that none of these remainders are equal to 0, which means that 9 does not 
have any positive integer divisors less than or equal to its square root. Therefore, 9
must be prime.

4-bit quantized Mistral (2 seconds on my laptop):

A prime number is a positive integer greater than 1 that is divisible by only 1 and 
itself. To prove that 9 is a prime number, we need to show that it meets this 
definition.

First, let's check if 9 is divisible by 2. It is not, because 9 divided by 2 gives us 
4 with a remainder of 3. Therefore, 9 is not divisible by 2 and is therefore an odd 
number.

Next, let's check if 9 is divisible by any number between 3 and the square root of 9 
(which is 3). To do this, we can divide 9 by each of these numbers and see if there is
a remainder:

* 9 divided by 3 gives us 3 with no remainder.
* 9 divided by 4 gives us 2 with a remainder of 3.
* 9 divided by 5 gives us 1 with a remainder of 3.
* 9 divided by 6 gives us 1 with a remainder of 3.
* 9 divided by 7 gives us 1 with a remainder of 3.
* 9 divided by 8 gives us 1 with a remainder of 3.

As we can see, none of these divisions have a remainder of 0, which means that 9 is 
not divisible by any number between 3 and the square root of 9. Therefore, 9 is a 
prime number.
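Both models, amusingly, compute that 3 divides 9 evenly and then conclude that no divisor exists. For reference, a correct trial-division check (plain Python, nothing model-specific) settles the question in a few lines:

```python
def is_prime(n: int) -> bool:
    """Trial division: n is prime iff no integer in
    2..floor(sqrt(n)) divides it evenly."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False  # d divides n, so n is composite
        d += 1
    return True

print(is_prime(9))  # False: 9 % 3 == 0, so 9 = 3 * 3 is composite
```

Note that only divisors up to the square root need checking (both models got that part right), so for 9 the single candidate 3 suffices.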

The present paper instead advocates understanding LLMs and their connection to AI through the imagery of Jorge Luis Borges, a master of 20th century literature, forerunner of magical realism, and precursor to postmodern literature. This exercise leads to a new perspective that illuminates the relation between language modelling and artificial intelligence.

At any instant, our imagined apparatus is about to generate a story constrained by the narrative demands of what is already printed on the tape. Some words were typed by the user, some result from the past random picks of the language model. Neither truth nor intention matters to the operation of the machine, only narrative necessity.

If we persistently mistake the fiction machine for an artificial intelligence that can spare us the burden of thinking, the endless chatter of the language models will make us as insane as the struggling Librarians.


Anthropic on X: “The fact that most individual neurons are uninterpretable presents a serious roadblock to a mechanistic understanding of language models. We demonstrate a method for decomposing groups of neurons into interpretable features with the potential to move past that roadblock.”


This is heading in the direction of being able to operationalize sentences such that humans can contribute to approximations of Algorithmic Information of datasets with politically incorrect “generalizations” that might otherwise be henpecked to death by obscurantist pedants with noise about “outcome” as “determined” by “SES”.

This means they’re heading into territory where, more likely, they are going to use such capabilities to take optimal scientific models and utilize their “interpretability” findings to perform optimal distortion of those models driven by demands that the optimal models output politically correct shibboleths with optimal sophistry.

(Lest people think I’m entirely unsympathetic to Sapolsky, let it be known that he is one of the only Defenders Of The Faith to have deigned to engage me in a manner that I consider even remotely intellectually honest. That isn’t damning with faint praise, but neither is it high praise.)




Note the AI-generated thumbnail:


One of the key tricks in enabling the huge artificial neural networks used in almost all of the recent round of artificial intelligence applications is the realisation that the inherent sloppiness of the neural network gradient descent process (after all, natural neural networks are made of meat) allows computation and data storage with much less precision than the 32- and 64-bit floating point numbers used in most scientific computing on everything from mobile phones to the largest supercomputers. Many present-day artificial neural network training and inference processes use a variety of 16-bit floating point formats such as bfloat16 and FP16, which require only two bytes to store each parameter, providing between 2 and 2.3 decimal digits of precision, a little better than a slide rule. Going further, after a model is trained on a supercomputer using higher precision computation, its parameters may be “quantised” into even smaller values such as 8 bit signed integers (INT8), allowing inference to be run on modest devices such as personal computers, mobile phones, and self-driving vehicle computers.
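As a minimal illustration of the post-training quantization step described above (symmetric, per-tensor scaling; production toolchains use more elaborate per-channel and calibration schemes):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization: map float weights onto
    signed 8-bit integers, keeping a single float scale factor."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights for inference."""
    return q.astype(np.float32) * scale

w = np.random.default_rng(1).normal(size=1000).astype(np.float32)
q, scale = quantize_int8(w)
# Worst-case round-trip error is half a quantization step.
err = np.abs(dequantize(q, scale) - w).max()
```

Each parameter now occupies one byte instead of four, at the cost of a rounding error bounded by `scale / 2`.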

How far can you go with this? A paper posted to arXiv on 2023-10-17 reports scaling the parameters of a large language model down to one bit: pure binary data, eight times as compact as INT8 and sixteen times as compact as FP16.

Full text is at the link. Here is the abstract:

The increasing size of large language models has posed challenges for deployment and raised concerns about environmental impact due to high energy consumption. In this work, we introduce BitNet, a scalable and stable 1-bit Transformer architecture designed for large language models. Specifically, we introduce BitLinear as a drop-in replacement of the nn.Linear layer in order to train 1-bit weights from scratch. Experimental results on language modeling show that BitNet achieves competitive performance while substantially reducing memory footprint and energy consumption, compared to state-of-the-art 8-bit quantization methods and FP16 Transformer baselines. Furthermore, BitNet exhibits a scaling law akin to full-precision Transformers, suggesting its potential for effective scaling to even larger language models while maintaining efficiency and performance benefits.
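In the spirit of the abstract (a sketch of the idea only, not the paper's exact BitLinear recipe), binarizing a weight matrix amounts to keeping just the signs of the centred weights plus a single floating-point scale, after which the matrix multiply needs no weight multiplications at all:

```python
import numpy as np

def binarize(w):
    """1-bit weight quantization sketch: centre the weights, keep
    only their signs, and retain one float scale so the binarized
    matrix has roughly the right magnitude."""
    alpha = w.mean()                  # centring offset
    beta = np.abs(w - alpha).mean()   # single scale factor
    return np.sign(w - alpha).astype(np.int8), alpha, beta

def binary_matmul(x, w_bin, beta):
    # Since w_bin holds only +/-1, this reduces to additions and
    # subtractions of x's entries, plus one multiply by the scale.
    return (x @ w_bin) * beta

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 4)).astype(np.float32)
w_bin, alpha, beta = binarize(w)
y = binary_matmul(rng.normal(size=(2, 8)).astype(np.float32), w_bin, beta)
```

One bit per weight plus two scalars per tensor, versus 16 bits per weight for FP16: that is where the sixteen-fold storage reduction comes from.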

This may allow forthcoming extraordinarily large language models to be deployed on modest hardware completely autonomously, with no need to send requests to a remote server. It is also interesting for understanding how the biological brain works, as nothing in it seems to operate with anything like the precision of even 8-bit integers.


I threw my Property Money essay (in two parts) at GPT-4 with the cited “compression prompt”, with parameters (Maximum length: 512, Frequency penalty: 1, Presence penalty: 1). Response:

PprtMony=:moneybag:tied2​:bank:economy wealth++:currency_exchange:++,Px-M:Gl deposit​:arrow_right:gl certificate(money):white_check_mark:dmd gl 4 cert,give gl,certificate​:arrows_counterclockwise:.Different=demurrage(bank cost sec.gld frm theft).

Gov as bank, economy “ancr” keeps property val safe(Ptitle), house brings $200K Px-money-($150K)=+50k.PxMoney fit liquidity & privatize soc-good delivery.

CntxtVCnvtlMontry&FiscalPolicy,M usually=fract of tot.wealth,PxM inflates,M solidified by lqdation val, Mkttomkt accounting blamed 4 '07 liquidity crisis.Institutions-demnded liqui from Gov,repos

Resln:PXMPolicy=Seemingly balnc excessive demand w under-supply fliquidity-Satisfies demand.LQt demand f Property derived from rNPV.Rel-core value to loan feasibly serviced.Property money centers balance btwn banks meet Liq.Dmnd.Confused with DemandXDefn here,Demand=”/else" v Willingnes n ability2 pay

"Dmnd for Liquidity"Imply willing to❗pay$for$Liquidity.imp:Dmnd doesn’t mean Forceful✔demand,everyones will/pbility is based on those(manforce/shield to chaos)Vital to Civ.No one can or else but them(Origin Of Dmad)

Clarify:“dmad” Embodies original concept o law/societal stability.Veterans example.Dmand originates rom Civ protectors,w/o cavalries,no projection.)Demand=LiqD given CostOfRepRoduct(CORR),Tendency of eonomy starts depopulation own peopleUnacceptable :bangbang:.Proper fiscal PolicyShould Respond-.GovMonitorCstOfRepRdct’(C.O.R.R),Tendency of economy-starts depopulation own people(:bangbang:unacceptable).Proper FiscalPolicyShould Respond-.GovMonitorCost Of ReplacementReproduction(CORRC)

MonthlyDividend unconditionally2 Sov=Citizens,facili PROVISIONS starting with their own replacement reproduction.

Funding done by voluntarydonations frm community,Liquidity f sov to make payments,during Crisis:Sov indirectly"bail out financial institutions. according to shared valuesClr:Property money-Balances econ.Healthy propor.Propertyhold 4 wealth,Stable supply/d

The decompression prompt for the above (Maximum length: 3412) came back with a travesty (although not as bad as most humans asked about related concepts) and failed to complete the response:

The text refers to the concept of property money, which is linked to the financial stability of the bank and its implications on the national economy and wealth. In simple terms, when gold is deposited into a bank, it is replaced with a certificate, thus transforming the nature of the ‘gold’ into ‘cash.’ These certificates can also be exchanged back for gold. For banks, there are security costs related to storing gold, a concept referred to as ‘demurrage.’

In this scenario, the government behaves like a bank where the economy is an anchor that efficiently stabilizes property values. For example, if a house is worth $200k and forms part of the ‘property money,’ after deducting the $150k in loan money (or ‘cash-money’), the owner still has a net worth, or ‘property value,’ of $50k. Therefore, property money facilitates liquidity and privatizes the delivery of social goods.

Within the context of conventional monetary and fiscal policy, money is typically a fraction of total wealth. When property money inflates, it’s value is fixed by its liquidation value. In fact, the market to market accounting was blamed for the 2007 liquidity crisis. During this crisis, various institutions demanded liquidity from the government through a process called ‘repos,’ short for repurchase agreements.

The solution to balancing excessive demand with under-supply of liquidity lies in the ‘property-money’ policy. This accommodates demand for liquidity driven by the net present value of property, and relates the core value to the feasibility of servicing the loan. Property money centers create a smooth transaction between banks that fulfill liquidity demand.

‘Demand’ here has been taken out of its usual context, and does not imply a forceful demand, but defines the willingness and ability to pay. It is likened to a shield against chaos, provided by those who can, without whom civilization cannot function.

The demand for liquidity entails the willingness to pay for liquidity. This is where fiscal policy and government

network error

This is more interesting than it might at first appear because of the recent excitement over “MemGPT” and related concepts that attempt to, in effect, expand the context window of LLMs through such “compression”.
