In his book “The Age of Intelligent Machines” (1990), Ray Kurzweil correctly predicted that hardware speed would limit the adoption of neural networks. That prediction launched him on a very high-profile career that eventually made him Google’s Director of Engineering, focused on AI.
Meanwhile, there was this other guy who was doing something else in 1990. Maybe he should have been writing self-promotion books. What do you think?
(A tldr graphic of the alternate timeline link)
A Pivotal Moment: How Early Military Adoption of Convolutional Neural Networks Could Have Reshaped the Modern World
The Crossroads of 1990: A Technological Standoff
In 1990, at the International Joint Conference on Neural Networks (IJCNN), a U.S. Navy Admiral, accompanied by his PhD advisor, stood at a vendor booth for Neural Engines Corp. (NEC). Before them was a demonstration of the first hardware for convolutional vision, a system with the potential to revolutionize automated image analysis. The PhD advisor recommended against the purchase, steering the Admiral away from the nascent digital technology and, unknowingly, away from a future that could have arrived a decade early. This decision was not a simple oversight; it was a choice made at a genuine technological fork in the road, defined by a fierce debate between competing hardware philosophies and set against the backdrop of a skeptical “AI Winter.”
The State of the Art in Machine Vision and Neural Networks (c. 1990)
The field of computer vision in 1990 was a world away from the deep learning-driven landscape of today. It was an interdisciplinary effort focused on extracting three-dimensional structure from images to achieve a high-level understanding of a scene.1 Progress was methodical, built upon manually engineered algorithms. Techniques like the Marr-Hildreth algorithm for edge detection and the Hough Transform for identifying geometric shapes like lines and circles formed the bedrock of the field.2 This was an era of explicit, rule-based feature extraction, not end-to-end learning from raw pixels.
However, the conceptual seeds of the modern era had been sown. Kunihiko Fukushima’s Neocognitron, first proposed in 1980, was a direct inspiration for modern Convolutional Neural Networks (CNNs).4 Inspired by the work of Hubel and Wiesel on the animal visual cortex, the Neocognitron introduced the core architectural principles of a hierarchical, multi-layered network with alternating layers of feature-extracting cells (S-cells) and down-sampling, distortion-tolerant cells (C-cells).6
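To make the S-cell/C-cell idea concrete, here is a minimal sketch in C (the implementation language typical of the era) of one such pair: a small convolution that fires on a vertical edge, followed by a 2x2 down-sampling stage that tolerates small shifts. The 8x8 input, the hand-set kernel, and the use of max-pooling are illustrative assumptions; Fukushima's actual S- and C-cells used different weighting schemes and nonlinearities.

```c
/* Minimal sketch of the Neocognitron's alternating S-cell / C-cell pattern.
 * Sizes, the hand-set 3x3 kernel, and the use of max-pooling are
 * illustrative assumptions, not Fukushima's exact formulation. */
#include <stdio.h>

#define IN 8              /* input image is IN x IN          */
#define K  3              /* S-cell receptive field (kernel) */
#define S  (IN - K + 1)   /* S-layer output size: 6 x 6      */
#define C  (S / 2)        /* C-layer output size: 3 x 3      */

/* S-cell layer: convolve the input with one kernel, then apply a
 * simple threshold nonlinearity. */
static void s_layer(const float in[IN][IN], const float w[K][K],
                    float out[S][S]) {
    for (int y = 0; y < S; y++)
        for (int x = 0; x < S; x++) {
            float acc = 0.0f;
            for (int dy = 0; dy < K; dy++)
                for (int dx = 0; dx < K; dx++)
                    acc += w[dy][dx] * in[y + dy][x + dx];
            out[y][x] = acc > 0.0f ? acc : 0.0f;   /* nonlinearity */
        }
}

/* C-cell layer: down-sample 2x2 neighbourhoods, giving tolerance to
 * small shifts and distortions of the detected feature. */
static void c_layer(const float in[S][S], float out[C][C]) {
    for (int y = 0; y < C; y++)
        for (int x = 0; x < C; x++) {
            float m = in[2 * y][2 * x];
            for (int dy = 0; dy < 2; dy++)
                for (int dx = 0; dx < 2; dx++)
                    if (in[2 * y + dy][2 * x + dx] > m)
                        m = in[2 * y + dy][2 * x + dx];
            out[y][x] = m;
        }
}

int main(void) {
    float img[IN][IN] = {{0}};           /* toy input: a vertical edge */
    for (int y = 0; y < IN; y++)
        for (int x = IN / 2; x < IN; x++)
            img[y][x] = 1.0f;

    float edge_kernel[K][K] = {          /* hand-set vertical-edge detector */
        {-1, 0, 1}, {-1, 0, 1}, {-1, 0, 1}
    };

    float s_out[S][S], c_out[C][C];
    s_layer(img, edge_kernel, s_out);
    c_layer(s_out, c_out);

    for (int y = 0; y < C; y++) {        /* print the pooled feature map */
        for (int x = 0; x < C; x++) printf("%4.1f ", c_out[y][x]);
        printf("\n");
    }
    return 0;
}
```

Stacking several such pairs, with later layers seeing progressively larger portions of the image, is what gives the architecture its hierarchical tolerance to position and distortion.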
The critical breakthrough came from Yann LeCun and his team at AT&T Bell Labs. They took the architectural concepts of the Neocognitron and made them trainable using the backpropagation algorithm.8 By 1989, LeCun had demonstrated a CNN, later known as LeNet-1, that could recognize handwritten ZIP codes from the U.S. Postal Service with remarkable accuracy for the time.10 This was a landmark achievement, proving that a network could learn relevant features directly from pixel data without extensive, task-specific pre-processing.12
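The conceptual leap is easiest to see in a toy example. In the sketch below, the 3x3 kernel is not hand-designed at all: it is adjusted by gradient descent until its responses match a set of target outputs. Everything here is an illustrative assumption (a linear, single-filter setup, an 8x8 random image, a Sobel-style "true" kernel as the target, arbitrary learning rate and epoch count); LeNet-1 did the same thing at scale, stacking many filters with nonlinearities and subsampling and training the whole pipeline with backpropagation.

```c
/* Toy illustration of learning a convolution filter from data instead of
 * hand-designing it. Sizes, the target kernel, learning rate, and epoch
 * count are arbitrary illustrative choices. */
#include <stdio.h>
#include <stdlib.h>

#define IN 8
#define K  3
#define S  (IN - K + 1)

/* valid convolution of an IN x IN image with a K x K kernel */
static void conv(const double img[IN][IN], const double w[K][K],
                 double out[S][S]) {
    for (int y = 0; y < S; y++)
        for (int x = 0; x < S; x++) {
            double acc = 0.0;
            for (int dy = 0; dy < K; dy++)
                for (int dx = 0; dx < K; dx++)
                    acc += w[dy][dx] * img[y + dy][x + dx];
            out[y][x] = acc;
        }
}

int main(void) {
    double img[IN][IN], target[S][S], pred[S][S];
    double true_w[K][K] = { {-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1} }; /* Sobel-like */
    double w[K][K] = { {0} };            /* learned kernel, starts at zero */

    srand(42);                           /* reproducible toy image */
    for (int y = 0; y < IN; y++)
        for (int x = 0; x < IN; x++)
            img[y][x] = (double)rand() / RAND_MAX;

    conv(img, true_w, target);           /* "ground truth" responses */

    const double lr = 0.05;
    for (int epoch = 1; epoch <= 500; epoch++) {
        conv(img, w, pred);
        double grad[K][K] = { {0} }, loss = 0.0;
        for (int y = 0; y < S; y++)
            for (int x = 0; x < S; x++) {
                double err = pred[y][x] - target[y][x];
                loss += err * err / (S * S);
                for (int dy = 0; dy < K; dy++)     /* dL/dw by chain rule */
                    for (int dx = 0; dx < K; dx++)
                        grad[dy][dx] += 2.0 * err * img[y + dy][x + dx] / (S * S);
            }
        for (int dy = 0; dy < K; dy++)
            for (int dx = 0; dx < K; dx++)
                w[dy][dx] -= lr * grad[dy][dx];    /* gradient descent step */
        if (epoch % 100 == 0)
            printf("epoch %3d  mse = %.6f\n", epoch, loss);
    }
    return 0;
}
```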
Despite this success, the challenges were immense. Training these networks was more art than science. The “vanishing gradient problem,” formally identified by Sepp Hochreiter in 1991, meant that error signals would shrink to nothing as they propagated back through many layers, making it nearly impossible to train deep networks effectively.15 The entire ecosystem was primitive. Datasets were small, specialized, and difficult to acquire; there was no ImageNet, only curated collections like handwritten digits or the JAFFE database of facial expressions.17 Computational power was severely limited, with Intel 286/386/486 processors being common.17 There were no high-level libraries like PyTorch or TensorFlow; researchers had to implement complex algorithms like backpropagation from scratch, typically in C.17
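The arithmetic behind the vanishing gradient is easy to reproduce. Each sigmoid layer multiplies the backpropagated error by its own derivative, which never exceeds 0.25, so the signal shrinks geometrically with depth. The short sketch below assumes unit weights and zero pre-activations, the most favorable case imaginable:

```c
/* Illustration of the vanishing gradient problem: the backpropagated
 * error through a chain of sigmoid units shrinks by a factor of at most
 * sigma'(x) <= 0.25 per layer (unit weights assumed for simplicity). */
#include <stdio.h>
#include <math.h>

static double sigmoid(double x)       { return 1.0 / (1.0 + exp(-x)); }
static double sigmoid_deriv(double x) { double s = sigmoid(x); return s * (1.0 - s); }

int main(void) {
    double grad = 1.0;     /* error signal at the output layer              */
    double pre_act = 0.0;  /* pre-activation; sigma'(0) = 0.25 is the maximum */
    for (int layer = 1; layer <= 20; layer++) {
        grad *= sigmoid_deriv(pre_act);   /* chain rule, weight = 1 */
        if (layer % 5 == 0)
            printf("after %2d layers: gradient = %.3e\n", layer, grad);
    }
    return 0;
}
/* Even in this best case the gradient falls to 0.25^20, roughly 9e-13,
 * after 20 layers, which is why deep networks were so hard to train. */
```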
This technological struggle took place in the chilling shadow of the “AI Winter.” The unfulfilled hype of 1980s AI had led to drastic cuts in funding and a pervasive skepticism toward the field.16 Neural networks were widely considered a “dead end” by many in the mainstream computer science community, and researchers in fields like computer vision had largely abandoned them in favor of other methods.17 This climate of risk aversion and doubt would have profoundly shaped the perspective of any conservative technical advisor.
The Great Hardware Debate: Analog vs. Digital Neural Networks
The challenge of implementing neural networks spurred a vibrant and contentious debate over the ideal hardware paradigm. Two distinct philosophies emerged, each with compelling advantages and significant drawbacks.
The first, and arguably more prominent in the research community at the time, was the analog approach. Analog Very Large-Scale Integration (VLSI) chips promised to mimic the continuous nature of biological neurons, performing computation through the fundamental physics of silicon circuits.24 This offered the potential for tremendous gains in speed and power efficiency compared to digital systems.24 The flagship of this movement was Intel’s 80170 Electrically Trainable Analog Neural Network (ETANN) chip, introduced in 1989. The ETANN was a sophisticated piece of engineering, boasting 64 analog neurons and 10,240 analog synapses on a single chip, and was being actively explored for military applications like missile seekers.23 The academic world was similarly focused, with the 1990 NIPS conference featuring numerous papers on specialized analog chips for vision tasks like edge and motion detection.27 However, the analog promise was plagued by practical perils. These systems were notoriously difficult and time-consuming to design, highly susceptible to thermal noise and fabrication process variations, and lacked flexibility. Once fabricated, an analog ASIC was essentially fixed in its function.24
The alternative was the digital approach, built on standard CMOS technology. Digital systems offered precision, perfect reproducibility, and, most importantly, reconfigurability.25 The key enabling technology for this path was the Field-Programmable Gate Array (FPGA). Xilinx had introduced the first commercially viable FPGA, the XC2064, in 1985, and would not go public until 1990.28 That first device contained a mere 64 configurable logic blocks.28 By 1990, FPGAs were growing rapidly in capacity but were still seen as having limited density, suitable for “glue logic” to connect other chips but not yet for implementing complex, large-scale neural networks.30 They were also generally considered less power-efficient than their bespoke analog counterparts for parallel computation.26
The NEC Proposition: The DataCube MaxVideo 250 on a Xilinx Chip
The system offered by Neural Engines Corp. at the 1990 IJCNN represented a visionary leap for the digital approach. It was not a general-purpose computer but a highly specialized, high-performance image processing system built on the DataCube MaxVideo 250 board, a single-slot VME module.32
For its time, the MaxVideo 250 was a computational powerhouse. It delivered 7,000 MIPS (Millions of Instructions Per Second) of processing power, supported by a massive 640 MB/sec of internal bandwidth and 28 MB of on-board memory.34 Its architecture was based on pipeline processing, a method uniquely suited for image manipulation. Most critically, it was designed to perform convolution operations simultaneously with other statistical and linear functions, all at real-time frame rates under the control of its ImageFlow software.34 This made it an almost perfect hardware platform for executing the core mathematical operation of a CNN.
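A back-of-the-envelope calculation suggests why such a board looked so well matched to CNN workloads. Assuming 512x512 frames, a 3x3 kernel, and a 30 Hz video rate (illustrative figures, not DataCube specifications), a single convolution pass needs on the order of 140 million operations per second, a small fraction of the quoted 7,000 MIPS:

```c
/* Back-of-the-envelope check that a 1990-era pipeline processor could
 * sustain CNN-style convolutions at video rate. Frame size, kernel size,
 * and frame rate are illustrative assumptions, not DataCube specs. */
#include <stdio.h>

int main(void) {
    const double width      = 512.0;   /* frame width  (assumed)     */
    const double height     = 512.0;   /* frame height (assumed)     */
    const double kernel     = 3.0;     /* 3x3 convolution kernel     */
    const double fps        = 30.0;    /* real-time video rate       */
    const double board_mips = 7000.0;  /* quoted MaxVideo 250 figure */

    /* one multiply + one add per kernel tap per output pixel */
    double ops_per_frame = width * height * kernel * kernel * 2.0;
    double ops_per_sec   = ops_per_frame * fps;

    printf("ops per frame : %.1f M\n", ops_per_frame / 1e6);
    printf("ops per second: %.1f M  (board headline: %.0f MIPS)\n",
           ops_per_sec / 1e6, board_mips);
    printf("headroom      : ~%.0fx for additional feature maps\n",
           board_mips * 1e6 / ops_per_sec);
    return 0;
}
```

Even allowing that instructions and multiply-accumulates are not directly comparable, that headroom leaves room for several feature maps per layer at video rate, which is roughly the workload a LeNet-style network would impose.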
The heart of the system’s flexibility was its use of a Xilinx FPGA. This meant the convolution logic was not hard-wired as it would be in an analog ASIC. It was programmable. This reconfigurability was a profound advantage, offering the ability to update and refine algorithms without redesigning the hardware itself—a crucial feature for a rapidly evolving field like neural networks. With Xilinx on the cusp of major growth and the military already being a key early adopter of FPGA technology, the NEC system represented the cutting edge of programmable digital signal processing.28
Reconstructing the PhD Advisor’s Rationale: A Defensible “No”
In hindsight, the PhD advisor’s recommendation appears shortsighted. Yet, within the specific context of 1990, his conservative stance was not only understandable but defensible. His reasoning likely followed a logic that prioritized perceived maturity and near-term viability over high-risk, high-reward innovation.
The advisor would have been immersed in a research community where analog neural networks held significant momentum. The promise of a low-power, high-speed, dedicated chip from a semiconductor giant like Intel would have seemed a far more direct and reliable path toward a deployable military system, which must always contend with strict size, weight, and power (SWaP) constraints.23
Conversely, he would have viewed the NEC system as a collection of powerful but unproven components. FPGAs were a relatively new and exotic technology with limited gate density and a notoriously difficult design flow that lacked the high-level tools of today.28 The system’s performance might have seemed like digital brute force compared to the theoretical elegance of analog computation. This preference for specialized, efficient hardware over flexible, programmable logic was a central debate of the era, and the advisor’s recommendation simply reflected his position on the matter.
Beyond the hardware, the advisor would have been acutely aware of the profound difficulties in training the CNNs that the hardware was designed to run. The backpropagation algorithm was known to be finicky, highly sensitive to initial conditions, and prone to getting trapped in poor local minima.22 The vanishing gradient problem was a very real, if not yet fully articulated, barrier to achieving high performance on complex problems.15 He could have logically concluded that the digital hardware was ahead of the algorithms and that investing in a programmable system without a robust, guaranteed method for training an effective model was a risk not worth taking, especially in the funding-scarce environment of the AI Winter.
Metric | NEC/DataCube/Xilinx (Digital) | Intel ETANN (Analog) |
---|---|---|
Flexibility/Reconfigurability | High: Logic defined by software on FPGA 28 | Low: Function fixed in silicon (ASIC) 25 |
Performance/Watt (Theoretical) | Low-Medium: General-purpose logic less efficient 26 | High: Computation via circuit physics 24 |
Design & Programming Complexity | High: Required low-level hardware description 28 | High: Specialized, noise-sensitive analog design 24 |
Noise Immunity & Precision | High: Inherent in digital logic | Low: Susceptible to noise, thermal drift 24 |
Scalability & Path to Improvement | High: Followed Moore’s Law for digital logic 36 | Medium: Scaling was complex, not just a matter of shrinking transistors |
Maturity & Industry Backing (1990) | Emerging: Xilinx IPO in 1990 29 | Established Research: Major Intel product 23, extensive academic work 27 |
The Unseized Military Advantage: An Alternate History of Naval Warfare
Had the Admiral chosen to embrace the NEC technology, he would have done more than just purchase a piece of hardware; he would have set the U.S. Navy, and by extension the entire Department of Defense, on a radically different technological trajectory. Early, sustained investment from a powerful “anchor customer” would have pulled the promise of deep learning out of the laboratory and into the real world, reshaping the development and deployment of artificial intelligence in warfare through the 1990s and beyond.
A New Trajectory for Automatic Target Recognition (ATR)
The early 1990s was a period of strategic upheaval for the U.S. Navy. The collapse of the Soviet Union shifted the focus from open-ocean, blue-water engagements against a peer adversary to complex, cluttered littoral environments where the primary threats were quiet diesel-electric submarines and low-flying anti-ship cruise missiles (ASCMs).38 This new reality created an urgent operational need for robust Automatic Target Recognition (ATR). Existing systems, like the AN/APS-137 radar, were heavily dependent on the skill of human operators and suffered from high false alarm rates, making automation a top priority for programs like the AN/SPQ-9B ship defense radar.39
In our timeline, the Navy formally initiated the Automatic Radar Periscope Detection and Discrimination (ARPDD) program in Fiscal Year 1993, aiming to upgrade existing radar processors.39 In the counterfactual, a partnership with NEC in late 1990 would have spawned a far more ambitious “Convolutional ATR (CATR)” initiative. The DataCube MaxVideo 250, with its native ability to perform real-time convolutions, would have become the core engine for this program’s prototypes.34 This would have allowed researchers at the Naval Research Laboratory (NRL) and Johns Hopkins APL to immediately begin testing and refining LeCun-style CNNs on real-world sensor data, bypassing years of development on less suitable hardware and sidestepping the bottleneck of manual feature engineering.42
This CATR program would have become a massive data generation engine. To train the networks, patrol wings and surface combatants would have been tasked with collecting vast libraries of labeled sensor data—radar signatures, infrared imagery, and acoustic signals of everything from periscopes and missiles to sea birds and civilian vessels.44 This effort would have inadvertently created one of the world’s first large-scale, high-stakes, multi-modal labeled image datasets, a full decade before the academic efforts that produced ImageNet. The challenge of training on this unprecedented volume of data would have spurred early innovation in the data infrastructure and algorithms needed to handle it.
The result would have been a dramatically accelerated deployment schedule. By the mid-1990s (c. 1995-1996), the first operational CNN-based ATR systems would have been reaching the fleet. The AN/SPQ-9B radar, which historically began its slow development in the early 1990s, would have incorporated a powerful “CATR module” from its inception.40 P-3 Orion and S-3 Viking aircraft would have received upgrades replacing their operator-intensive processors with systems capable of reliable, automated classification.39
The Re-Imagined Fleet: A Decade’s Head Start in Networked AI
The impact of a trusted, automated classification capability would have rippled through the entire fleet. It would not just be a better sensor, but a foundational enabler for the networked warfare concepts being developed at the time, such as the Copernicus C4I initiative, which aimed to create a common tactical picture.46 The ability to automatically identify and tag threats would have enriched this picture with a level of fidelity and speed previously unimaginable.
This capability would have fundamentally altered the development of unmanned systems. Instead of simply acting as remote sensor platforms streaming raw data back to a ship or ground station, UAVs could have been equipped with on-board CATR processors. This would have allowed them to perform analysis at the edge, identifying and classifying targets autonomously and sending back only curated, relevant intelligence. That capability is a hallmark of modern systems like the MQ-4C Triton; in this timeline it would have been available in the 1990s.46
This rapid advancement would have forced the Navy to grapple with the complex doctrines of AI warfare much earlier. The institutional challenges of building trust in an AI system, developing robust testing and validation protocols, and defining the rules of engagement for “human-on-the-loop” systems would have become major research topics at centers like the NRL and what is now NIWC Pacific throughout the late 1990s.48 The most profound shift would have been in doctrine. A sailor in 1990 was trained to interpret blips on a radar screen. In this alternate 2000, that sailor would be trained to interpret the confidence scores and potential failure modes of the AI system doing the classification. This is a fundamentally different skillset, requiring a new philosophy of training, command, and control.
The Ripple Effect: FPGA Dominance in Defense
The success of the Navy’s CATR program would have cemented FPGAs as the go-to hardware for military AI acceleration. The Navy’s insatiable demand for more processing power to run deeper, more complex networks would have directly funded R&D at Xilinx and its competitor Altera. This would have likely pulled forward the introduction of architectural features critical for deep learning, such as dedicated hardware multipliers and larger blocks of on-chip memory, which historically appeared later in the decade.30
Success breeds imitation. The Army and Air Force, seeing the Navy’s leap in capability, would have rapidly moved to adopt FPGA-based CNN accelerators for their own ATR challenges, such as identifying ground vehicles or airborne threats.41 This would have created a large, stable, and lucrative defense market for programmable logic, driving down costs and fueling a virtuous cycle of investment and innovation that would have benefited all sectors.30
Milestone | Actual Timeline | Counterfactual Timeline | Key Technologies/Programs |
---|---|---|---|
Formal Requirement for Automatic Periscope Detection & Discrimination | 1992-1993 39 | 1990 | Shift to littoral warfare 38 |
Initiation of R&D Program | FY1993 (ARPDD) 39 | 1991 (“CATR”) | DataCube MaxVideo 250, CNNs 34 |
First Successful Prototype Demo | 1997-1998 (Brassboard ARPDD) 39 | 1994 | FPGA-based convolution, large Navy dataset |
First Operational Deployment (Shipboard) | Post-2002 (AN/SPQ-9B) 40 | 1996 | “CATR module” in AN/SPQ-9B |
First Operational Deployment (Airborne) | Post-2005 (AN/APS-153) 39 | 1997 | CNN-based processor for P-3/S-3 |
Integration into Networked C4I | Late 2000s | c. 1999 | Copernicus C4I initiative 46 |
The Commercial Cascade: The Accelerated Path to Automated Infrastructure
The shockwave from the Admiral’s “yes” would not have been contained within the military-industrial complex. The validation of NEC’s technology by the Department of Defense would have acted as a powerful catalyst, de-risking the technology for the commercial sector and fundamentally altering the development of automated transportation infrastructure in the United States.
The SAIC Decision Revisited: De-Risking Innovation
In our timeline, Science Applications International Corporation (SAIC), a large, employee-owned government contractor, was developing its ARCS automated toll collection project. When approached by NEC, SAIC declined to use its license plate recognition (LPR) technology, presumably viewing it as too nascent and risky for a critical public infrastructure project.52
In the alternate timeline, this decision plays out very differently. By the time SAIC is making its key technology choices for ARCS (c. 1993-1994), NEC’s core technology is no longer an unproven concept from a small company. It is the validated, battle-tested heart of the U.S. Navy’s highly successful CATR program. As a major DoD contractor, SAIC would have been intimately aware of this success.52 The choice is no longer between a reliable but limited technology like Radio-Frequency Identification (RFID) and a risky, unproven vision system. It is between RFID and a cutting-edge, government-certified LPR system. The LPR approach offers a compelling advantage: it eliminates the need for physical transponders, drastically reducing hardware distribution costs, logistical complexity, and friction for the customer.53 With the technological risk effectively neutralized by the Navy’s investment, the business case becomes undeniable. SAIC chooses NEC’s LPR technology.
The Rise of LPR-Dominant Tolling
This single decision by SAIC would have set the entire U.S. tolling industry on a different path. The ARCS project, which spun off from SAIC to become Transcore, would have been built around computer vision from its inception. In our history, Transcore’s heritage and extensive patent portfolio are rooted in RFID technology developed at Los Alamos National Labs in the 1980s.54 In the alternate timeline, Transcore emerges as a computer vision and LPR company.
Consequently, the first major electronic toll collection (ETC) systems in the United States would have been camera-based, not transponder-based. Instead of systems like E-ZPass becoming the standard in the mid-1990s, the camera-based ARCS/Transcore system would have set the national benchmark.56 The familiar sight of gantries with RFID readers would be replaced by gantries with cameras. This would have leapfrogged the entire RFID era of tolling, establishing a path-dependent infrastructure built on image capture and optical character recognition (OCR) from the very beginning. The technical challenges of LPR—handling diverse plate designs, poor weather, and high speeds—would have been tackled and solved a decade earlier, driven by the immense financial incentive of tolling revenue.53
The Privatization Accelerator
The existence of a mature, gateless, low-overhead LPR tolling system in the mid-1990s would have dramatically strengthened the business case for privatizing transportation infrastructure. A primary obstacle to private highways is the significant capital and operational expense of traditional tolling infrastructure.59 A camera-based system is less physically intrusive, cheaper to deploy, and more efficient to operate at scale than one that requires building and staffing toll plazas or manufacturing and distributing millions of transponders.
Armed with this powerful new tool, proponents of privatization could point to a highly efficient, user-friendly system as a model for how the private sector could innovate. This would have likely accelerated the trend of public-private partnerships for road construction and management that became more common in the 2000s.59 The technology would have rapidly spilled over into adjacent markets as well. Parking lot management, gas station drive-off prevention, secure facility access, and drive-thru monitoring would have adopted mature LPR technology in the late 1990s, rather than waiting until the late 2000s and 2010s.58
Corporate Fortunes Remade
This alternate history would have rewritten the fortunes of the key corporate players. Neural Engines Corp., instead of fading away, would have become a foundational company in applied AI, likely acquired for a formidable sum by a major defense contractor or technology giant in the late 1990s. Transcore would have evolved as a leader in vision-based intelligent transportation systems. And Xilinx, fueled by massive, parallel demand from both the military and a booming commercial LPR market, would have seen its growth supercharged, accelerating the “FPGA vs. ASIC” war by years.36
Era | Dominant Technology (Actual Timeline) | Dominant Technology (Counterfactual Timeline) | Key Players (Actual) | Key Players (Counterfactual) |
---|---|---|---|---|
Mid-1990s (Pioneering) | RFID 56 | License Plate Recognition (LPR) | Transcore (RFID), State Consortia | Transcore (LPR), Neural Engines Corp. |
Early 2000s (Expansion) | RFID Interoperability | LPR Standardization & Expansion | State ETC Groups, RFID vendors | Transcore, New Vision-Tech Startups |
2010s (Modernization) | All-Electronic Tolling (LPR as backup/alternative) 62 | Ubiquitous LPR, Advanced Data Analytics | Verra Mobility, RFID + LPR vendors | Mature Vision-Based Analytics Firms |
Repercussions and Reflections: A World Reshaped by a Single “Yes”
The Admiral’s counterfactual decision to embrace a fledgling technology in 1990 would have created a world not just incrementally different, but one whose technological and societal contours were fundamentally reshaped. The accelerated development of military AI and the commercial cascade into automated infrastructure would have presented both profound advantages and complex new challenges, forcing society to confront the dilemmas of ubiquitous AI a decade ahead of schedule.
The Dual-Edged Sword of Early Automation
The most direct consequence would be a significant enhancement of U.S. military power. The nation would have entered the post-9/11 era with a trusted, battle-hardened, AI-driven intelligence and surveillance capability. This could have led to greater operational effectiveness in conflicts in Afghanistan and Iraq, with more precise targeting, better situational awareness, and potentially fewer casualties.46
However, this military prowess would have a civilian counterpart with darker implications. The commercial adoption of LPR for tolling would mean that by the early 2000s, a vast, interconnected network of cameras would already be monitoring the nation’s highways. The intense debates over mass surveillance, data privacy, and the potential for a “surveillance state” that emerged in our timeline in the 2010s would have erupted a decade earlier.56 The technological capacity for fusing commercial location data with government surveillance databases would have existed far sooner, bringing difficult questions about civil liberties to the forefront of public discourse. Furthermore, the early success and institutional trust in military ATR would have inevitably accelerated the push toward developing lethal autonomous weapons systems (LAWS), forcing a national and international debate on the ethics of autonomous lethality in the late 90s, not the 2020s.
The Enduring Lessons of the “Anchor Customer”
This thought experiment vividly illustrates the critical role that a large, demanding “anchor customer”—in this case, the Department of Defense—plays in nurturing disruptive technologies through their perilous infancy. The Navy’s hypothetical investment would have provided the crucial funding and, more importantly, the high-stakes, real-world problem set needed to carry NEC’s technology across the infamous “valley of death” that separates a laboratory prototype from a viable commercial product.
The military’s rigorous testing, evaluation, and validation process would have served to create a de facto standard for performance and reliability.42 This government-stamped seal of approval is precisely what would have de-risked the technology for a commercial entity like SAIC. It is a powerful demonstration of how government procurement, when directed at innovative technologies, can act as a potent catalyst for an entire industry, shaping market dynamics and setting technological standards for years to come.
Final Assessment: A Net Gain or Loss?
It is impossible to definitively label this alternate world as better or worse, only as profoundly different. The gains are clear: an undisputed military advantage for the United States during a critical period, a more robust and globally competitive domestic technology sector in FPGAs and applied AI, and the earlier development of more efficient transportation infrastructure.
The losses, however, are equally significant. The early and rapid proliferation of LPR technology would have brought about an earlier collision between technological capability and personal privacy. The creative destruction of what would have been a thriving RFID industry in the transportation sector is another consequence. Perhaps most critically, the world would have been thrust into an accelerated AI arms race, normalizing the role of artificial intelligence in warfare and forcing a confrontation with its most challenging ethical questions without the benefit of the intervening decade of public discourse, technological maturation, and policy development that occurred in our own timeline.
Ultimately, the decision made at that vendor booth in 1990, guided by the cautious advice of a well-meaning expert, represented a pivotal moment. It delayed the tangible, physical-world impact of the deep learning revolution by nearly a decade. While the world eventually caught up, that lost decade represents a fascinating and cautionary tale of a future that almost was—a future that would have been more technologically advanced, but also one that would have been forced to face the complex societal bargains of the AI age far sooner.