Time for Lofstrom's Server Sky?

It’s 1997 again, only instead of the Netscape interface to the World Wide Web, it’s the large language model front end to an explosion of power-hungry matrix multipliers “in the cloud”.

Growth in generative AI demand, beyond the world’s nameplate capacity for electrical generation, is likely to outpace that indicated by the following thematic map:

Yes, the hysteria that hit in 1997 was largely fluff. So is the current large language model hysteria. Even so, there will, at the very least, be a temporary peak-load electric power price rise in the next few years. It may even be a price shock. Moreover, there is good reason to believe that even if there are great advances in the efficiency of AI systems, the tax structure favoring rentiers over labor will result in a voracious AI appetite for energy. Rentiers will be literally throwing money at anyone who can, in turn, throw money at increasing AI capacity.

What people are missing here is that, just as the cost of improving scientific models goes vertical (asymptotic), so, too, does the cost of improving any model of the world, including generative AI models. The reason for this is that generative AI models, in the final analysis, rely on scientific models of the world (world models) in the general sense of the natural sciences. Moreover, just as squeezing that last bit of accuracy out of a scientific model pays huge dividends, so, too, will squeezing that last bit of accuracy out of generative AI models pay huge dividends. “An ounce of prevention…” is a gross understatement when it comes to quality in making intelligent decisions. This will sustain the explosion in demand for electricity beyond the present terminal hysteria.

So Keith Lofstrom’s approach in Server Sky starts to look like it may be inevitable and a lot sooner than 2030, which is when it might otherwise seem reasonable to expect it to be deployed at scale. The basic idea is to mass produce, and deploy in high orbit, constellations of “thinsats”:

image

Each thinsat weighs about 5 grams. It incorporates solar cells, computation, telecommunications, a heat-sink radiator and, interestingly, quasi-reactionless electrochromic thrusters for station keeping. Here is a diagram of thinsat version 6:

By “electrochromic” Lofstrom means a solid state device that varies between reflective and dark/opaque, to work like an integrated set of solar sails at a very small scale. Light pressure only. No fuel needed.
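For a feel of the magnitudes involved, here is a back-of-the-envelope sketch of light pressure on a thinsat. The 5 gram mass comes from the description above and the solar constant (roughly 1361 W/m² near Earth) is a standard figure, but `ASSUMED_AREA_M2` is a made-up placeholder, not a Server Sky specification, and the surface is assumed to face the Sun squarely.

```python
# Rough light-pressure estimate for a thinsat (illustrative sketch only).
SOLAR_CONSTANT_W_M2 = 1361.0        # approximate sunlight intensity near Earth
SPEED_OF_LIGHT_M_S = 299_792_458.0
THINSAT_MASS_KG = 0.005             # "about 5 grams", from the post
ASSUMED_AREA_M2 = 0.02              # ASSUMPTION: placeholder, not a Server Sky spec


def light_pressure(reflectivity: float) -> float:
    """Pressure (N/m^2) on a flat surface facing the Sun.

    reflectivity = 0.0 is fully absorbing (dark), 1.0 is a perfect mirror.
    """
    if not 0.0 <= reflectivity <= 1.0:
        raise ValueError("reflectivity must be between 0 and 1")
    return (1.0 + reflectivity) * SOLAR_CONSTANT_W_M2 / SPEED_OF_LIGHT_M_S


def acceleration(reflectivity: float,
                 area_m2: float = ASSUMED_AREA_M2,
                 mass_kg: float = THINSAT_MASS_KG) -> float:
    """Acceleration (m/s^2) from sunlight falling on the given area."""
    return light_pressure(reflectivity) * area_m2 / mass_kg


if __name__ == "__main__":
    dark = acceleration(0.0)
    mirror = acceleration(1.0)
    print(f"dark:            {dark:.2e} m/s^2")
    print(f"mirror:          {mirror:.2e} m/s^2")
    print(f"steering margin: {mirror - dark:.2e} m/s^2")
```

A perfect mirror feels twice the push of a perfectly dark surface, so switching patches between the two states gives a small differential force to steer with. Real surfaces are neither perfectly dark nor perfectly reflective, so the usable margin would be smaller than this idealized figure.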

Quoting Lofstrom:

Server Sky is speculative. The most likely technical showstopper is radiation damage. The most likely practical showstopper is misunderstanding. Working together, we can fix the latter.

Radiation damage risk brings to mind Gallium Arsenide’s:

  • Relative radiation resistance
  • Utility in solar cells
  • Utility in telecommunications (i.e., with Earth and other thinsats)
  • High electron mobility for computation
  • Dielectric advantages for smaller integrated circuits (which is a big reason Seymour Cray favored GaAs)

Of course, GaAs fabs have a lot of catching up to do if they are going to compete with whatever radiation shielding might be available to silicon thinsats, but if the GaAs fab guy I talked to at the demise of Cray Computer Corp. is to be believed, the potential is there for a revolution in electronics, given enough hysterical investment capital to overcome The Hardware Lottery.


Has Mr. Lofstrom done much with the idea in the recent past?
The wiki’s RecentChanges page suggests desultory work over the last year.

Regardless, thank you for mentioning his project.

Why did you write this?

So Keith Lofstrom’s approach in Server Sky starts to look like it may be inevitable and a lot sooner than 2030, which is when it might otherwise seem reasonable to expect it to be deployed at scale.

What makes 2030 a reasonable date for large-scale deployment?


I can’t speak for Lofstrom, as I’m not involved with his project, but my expectation is that Server Sky, as with so many other risk technology investments, languished due to the way rentiers have rigged the tax system since at least 1913. Lofstrom is aging along with the rest of us.

Events like the DotCon bubble happen only rarely: events that shake capital loose from the rentiers to rain down on guys like Musk. That’s why I’m very pissed at Musk for not advocating replacing the 16th Amendment with a tax on liquidation value of net assets. He’s “one of us” in a very important sense and we need more “Musks” to be in a position to harvest all the relatively low-hanging fruits that are rotting on the trees due to capital market failure. He has a responsibility to so advocate, not only to himself (since he’d be far wealthier) but to his own vision of the future of humanity, which, one would presume, would be to “manufacture” a lot more Musks. He could advocate it as easily as falling off a log.

Having said that, the “generative AI” bubble is another black swan to the rentiers and has the potential to repeat the DotCon pattern:

Bubble shakes a torrent of capital loose from rentiers →
Pops leaving a few Musks with capital →
Rentiers sort through the wreckage for network effects →
Resurgence of rentier investments targeting network effects.

It’s really in this last phase that sustainable demand by rentiers for AI computation capacity goes through the ceiling – and into something like Server Sky.

Could this all happen by 2030? Seems plausible to me. Could GaAs (or GaN?) fab tech plow through the learning curve by then? Less plausible but still possible since there is substantial overlap between Silicon and Ga fab, and the Ga tech has been not-entirely languishing for 25 years.

Rentiers really, really, really want to lower labor costs and at the same time neutralize young men who may come after them with guillotines. The next generation (post bubble pop) AI does that for them.


I think the main thing that makes the Server Sky model seem dated as an alternative to present-day cloud computing silos is the energy available in each thinsat for computation. In the era of the build-out of the Web, the energy requirement for each server was minimal, and one could implement that with a solar powered, energy efficient server or, when that was not enough, as many as it took with a load balancer front-end.

Today, however, we’ve tilted back to the energy-hungry model of server computing as in those days of yore where we imagined the Columbia river diverted to cool massive racks of ECL consuming 20% of the electricity of the North American continent.

Today, they call it “training and inference”. If we go down the road where it will take thousands of times the computation to satisfy a client query compared to whatever the search engines are doing today, it’s difficult to see how thinsats could provide that. Distributing the load doesn’t help because the computation done in inference requires massive, fast local memory and is murdered by long latency on memory accesses.


Nowadays Nvidia is advertising 900GBps local network bandwidth, which also favors centralizing load. Latency also plays a big part in existing “throw money at it” Transformer machine learning models.

But if one looks past the bubble to a maturing AI industry’s recovery, I expect “inference” (servicing user queries) will increasingly separate from “training” (model generation), with inference staying close to the user and training going where the energy is. So, with appropriate caveats from a guy with a cloudy crystal ball:

  1. People are already starting to take data efficiency seriously, which will necessitate a revolution in the understanding of foundation model generation.
  2. This will wake people up to Algorithmic Information approximation as the most principled information criterion for model selection. This will, in part, be driven by the shift to much smaller data sets (data efficiency) of much higher quality. This will in turn make lossless compression (Algorithmic Information approximation) a lot more economic as an approach to model generation. In part, it will also be because successes in using AIC will leak along with personnel into startups. Ilya already is making noises like he “gets it”.
  3. Deploying these smaller models to serve user queries will rely heavily on localized (ground-based) computation and caching*. Inference is, in a very real sense, conditional decompression – contingent on user queries. Nonvolatile and fast cache storage is getting cheaper in both capital and operational cost (Moore’s law seems to be holding on there unlike in CPUs). So I don’t expect serving user queries to require nearly as much of a bite out of electrical supplies as model generation.
  4. Competition for better models (quality assurance) will be where the black swan energy demand comes from. Already, people are becoming impatient with models that require a lot of corrective iteration (tree of thought – federation of agents “conversing” to converge on critical thinking). QA will only intensify as the pressure to deploy models in mission-critical areas requires that contracts include bonding for liquidated damages for errors and omissions by the AI “consultants”.
  5. What form the space-based solar collectors/heat radiators take is to be determined, and at least in scale it will likely look very different from the 5-gram Server Sky units. But I don’t see a limit to the computational requirements by global competition for higher quality “foundation” models – particularly as the global economy bootstraps into cis-lunar space if not beyond. At some point, the problems with turning lunar silicon into machine learning and solar power infrastructure will be overcome by the economics of and synergy with better intelligence.

* Refreshing query caches on model updates to the query servers will potentially require a lot of compute. That could be addressed with selective eager-evaluation to precompute likely queries followed by high bandwidth downloads from space to refresh earth-side query server cache memories.
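The selective eager-evaluation idea in the footnote can be sketched as a toy cache. Everything here is hypothetical: the `QueryCache` class, the `refresh` method and the popularity heuristic are invented for illustration, and a single `refresh` call stands in for both the space-side precomputation and the download to the ground.

```python
from collections import Counter
from typing import Callable, Dict


class QueryCache:
    """Toy earth-side cache: serves stored answers, tracks query popularity."""

    def __init__(self, model: Callable[[str], str]) -> None:
        self.model = model
        self.answers: Dict[str, str] = {}
        self.hits: Counter = Counter()

    def ask(self, query: str) -> str:
        """Answer a query, running the model only on a cache miss."""
        self.hits[query] += 1
        if query not in self.answers:
            self.answers[query] = self.model(query)
        return self.answers[query]

    def refresh(self, new_model: Callable[[str], str], top_n: int) -> None:
        """On a model update, drop stale answers and eagerly recompute
        only the top_n most frequently asked queries."""
        self.model = new_model
        likely = [q for q, _ in self.hits.most_common(top_n)]
        self.answers = {q: new_model(q) for q in likely}


if __name__ == "__main__":
    cache = QueryCache(lambda q: f"v1:{q}")
    for q in ["a", "a", "b", "c", "a", "b"]:
        cache.ask(q)
    cache.refresh(lambda q: f"v2:{q}", top_n=2)
    print(sorted(cache.answers))   # the two most popular queries, precomputed
    print(cache.ask("c"))          # a rare query is computed lazily on demand
```

The trade-off is the one the footnote hints at: a larger `top_n` costs more compute and download bandwidth on each model update, but leaves fewer queries to be answered from scratch on the ground.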


Speaking of “throwing money at”:

Kuwait is looking to use 700,000 Nvidia B-100 chips for an AI compute cluster using a gigawatt of power. This will likely scale to Zettaflops of compute.
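A quick sanity check on those figures, using only the numbers quoted above (700,000 chips, one gigawatt) plus the definition of a zettaflop as 10^21 operations per second. The helper name `per_chip` is just for illustration, and an even split across chips is a simplifying assumption.

```python
CHIPS = 700_000          # chip count quoted in the post
CLUSTER_POWER_W = 1e9    # "a gigawatt", from the post
ZETTAFLOP = 1e21         # 10**21 floating-point operations per second


def per_chip(total: float, chips: int = CHIPS) -> float:
    """Spread a cluster-wide total evenly across the chips."""
    if chips <= 0:
        raise ValueError("chips must be positive")
    return total / chips


if __name__ == "__main__":
    print(f"power budget per chip: {per_chip(CLUSTER_POWER_W):,.0f} W")
    print(f"FLOPS per chip for 1 zettaflop: {per_chip(ZETTAFLOP):.2e}")
```

That works out to roughly 1.4 kW per chip, a budget that has to cover cooling, networking and everything else in the facility, and about 1.4 petaflops per chip to reach a single zettaflop. Whether a given chip sustains that depends on the number format and utilization, which the announcement as quoted does not specify.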


In the meantime, EU:


From the aforelinked EU “supercomputer” article:

Last month, the bloc also announced what it branded a “Large AI grand challenge”: A competition geared toward European AI startups “with experience in large-scale AI models” that aims to select up to four promising homegrown startups that will get a total of 4 million hours of supercomputing access to support development of foundational models. A €1 million prize pot is also earmarked for distributing to the winners — which are expected to release their developed models under an open source license for noncommercial use, or through publishing their research findings, per the Commission.

So, Dr. Evil’s plan is to get the EU to allocate one million Euros to a prize with criteria so subjective that his Evil Minions, conveniently placed on the AI Supercomputing Prize Judging Committee, can then award The Prize to Mr. Bigglesworth!

Switzerland and Great Britain are the only places in Europe that have folks with any clue as to the value of The Hutter Prize – the opposite of a “Large AI” prize – and neither place is in the EU.

If the EU were serious about “AI Supercomputing”, the word “Supercomputing” might hold out some hope for them: One reason Marcus Hutter chose to disallow hardware accelerators and even multicore algorithms in the Hutter Prize is that The Hardware Lottery has so distorted the field of AI research (as opposed to development) that a vast array of alternate approaches to approximating Algorithmic Information of a dataset are being ignored. General purpose CPUs are among the least biased priors one can practically apply in said research, and that means “Supercomputing”, as it has been generally defined since the days of the Cray-1 (optimized for short vectors), evokes a lot less biased incentive for AI research than mere matrix multipliers with activation functions.
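As a toy illustration of scoring by compressed size, the sketch below treats three general-purpose compressors from the Python standard library as stand-in “models” and picks whichever describes the data in the fewest bytes. This is not the Hutter Prize procedure: the prize also counts the size of the decompressor program, and serious entries are purpose-built predictors, not off-the-shelf codecs. The function names `score` and `best_model` are invented here.

```python
import bz2
import lzma
import zlib
from typing import Callable, Dict, Tuple

Codec = Tuple[Callable[[bytes], bytes], Callable[[bytes], bytes]]

# Stand-in "models": each is a (compress, decompress) pair.
CODECS: Dict[str, Codec] = {
    "zlib": (lambda d: zlib.compress(d, 9), zlib.decompress),
    "bz2": (lambda d: bz2.compress(d, 9), bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def score(data: bytes) -> Dict[str, int]:
    """Compressed size in bytes per codec; smaller means a better 'model'."""
    sizes: Dict[str, int] = {}
    for name, (compress, decompress) in CODECS.items():
        packed = compress(data)
        if decompress(packed) != data:   # the score only counts if lossless
            raise ValueError(f"{name} did not round-trip")
        sizes[name] = len(packed)
    return sizes


def best_model(data: bytes) -> str:
    """Name of the codec that yields the shortest lossless description."""
    sizes = score(data)
    return min(sizes, key=sizes.get)


if __name__ == "__main__":
    sample = b"the quick brown fox jumps over the lazy dog " * 500
    print(score(sample))
    print("winner:", best_model(sample))
```

The appeal of this kind of criterion is that the score is a single objective number, checkable by anyone who runs the decompressor, which is the opposite of a prize judged on subjective criteria.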

BTW: Interest in the Hutter Prize is picking up a bit, which I guess isn’t too surprising since it is one way for young guys to objectively stand out from the crowd where resume fraud is increasingly vicious given the incentives of “throw money at it” hysteria.
