Everybody Has Problems

jabowery · 27 May 2025 13:36

About a week later I’ve got an ontology but not using the verbose OWL syntax. I’m using Ergo (formerly Flora-2) for a variety of reasons related to the reason I hired Tom Etter to help re-orient the foundation of programming languages from functional programming to relational programming – only one aspect of which is to remove the object-relational paradigm impedance mismatch. Ergo’s more concise syntax permits better interactions with the limited context window of LLM programming assistants – which is a big deal nowadays (especially for old programmers like me who are operating without employees to ride herd on). But Flora-2 (progenitor of Ergo) is based on XSB Prolog, which is the pioneer of incremental tabling in logic programming – an essential implementation “detail”, much like “materialized views” in relational databases – that can mean the difference between a practical system both in terms of execution and code-base maintainability, and an unwieldy code base.

You can sort of get a feel for what I’m talking about in my first pass at this:

Essentially, I’ve remapped the US Census’s metadata into logic programming syntax. A layer on top of this will then make decisions about promising expressions to explore based on a hierarchy of classes and their attributes. That layer can be in either Ergo or in some other language that parses the restricted Ergo syntax ontology into a tree structure.

This has been a lot more work than I’d hoped, and it will continue to be a lot of work, but the end result will be something that could, in addition to serving my immediate purposes, actually impact the way institutions handle social data and its analysis.

PS: Data curation update:

Those entrusted by Elon Musk to research TFR have not responded to my ever-so-polite requests for data they’ve curated for that purpose. Since data selection is subjective, unlike ALIC model selection, any data I curate will be criticized as being due to my “motivated reasoning” in said curation. This is one reason I searched for a relatively comprehensive collection of data by the US Census as the basis for Hume’s Guillotine – the 2011 collection for county data. Even though it will be criticized for being incomplete, it at least spares Hume’s Guillotine from participating in the dumpster fire of social science’s motivated reasoning.

jabowery · 30 May 2025 23:09

In the middle of this ontology work, two things happened that impacted it:

1. Anthropic released version 4.0 of their Sonnet model (which has been the best coding assistant).
2. Google seems to have cleaned up its act somewhat regarding Gemini as a coding assistant.

#1 made the uphill slog of creating the ontology using Ergo harder, because they appear to have emphasized “safety” over the “helpfulness” of their model as a coding assistant. This resulted in a lot more work on my part to ferret out the bugs being introduced, which wouldn’t have been a big deal since I could fall back to the prior model (Sonnet 3.7), except that it got me to go try…
#2 …which, as it turns out, is actually better for working with Ergo due primarily to the fact that I can upload the entire Ergo manual (which Anthropic chokes on) and get it to answer questions about rather arcane features that, however powerful and suitable for an ontology, suffer from a small sample size in the universe of code to train these models.

In the process of interacting with the new Gemini it became apparent that the explore/exploit tradeoff I’d been working under – only very occasionally comparing its performance with Sonnet – should give Gemini more exploration. And in the process of doing that I decided to throw a more general “prompt” at both of them to see if there might be a way of getting a first-cut commensurability measure that is good enough to act as a heuristic to launch some of my genetic programming latent variable discovery processes. I’ve paid good money for those 24 cores and 96GB RAM transistors and may as well re-route some electricity from the air conditioner to get their capital utilization rates up.

Both models came back with a general recommendation to use word embedding to find semantic distances along with doing some first-order keyword matching for obvious classifications to match units. Sonnet wasn’t able to do the latter because of the same limitation that it imposes on uploads – but Gemini took the entire metadata file from the US census for the Counties database and spit out a reasonable first cut at brute force string searches for that purpose.

So now I’m precomputing a bunch of cosine distances based on word embeddings as well as the brute-force string search to produce a figure of merit of how well two measures actually correspond. For example, a category vector of [person, crime, murder] for one measure would be more commensurable with [person, crime, rape] than it would be with [person, employee], but the latter would still be considered at a lower priority as composing a latent variable in the GP search.

The cosine semantic similarity is a second-order metric given a lot lower weight.
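A rough sketch of how such a figure of merit might be combined, assuming the category vectors are ordered from general to specific so that a shared prefix is the first-order signal. The function names, the 0.9/0.1 weights, and the two-dimensional toy embeddings are all hypothetical placeholders for illustration, not the actual values or code in use:

```python
import math


def shared_prefix_fraction(cat_a, cat_b):
    """First-order signal: fraction of the longer category path that the
    two paths share as a common prefix."""
    common = 0
    for a, b in zip(cat_a, cat_b):
        if a != b:
            break
        common += 1
    return common / max(len(cat_a), len(cat_b))


def cosine_similarity(u, v):
    """Second-order signal: cosine of the angle between two embeddings."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def commensurability(cat_a, cat_b, emb_a, emb_b,
                     w_category=0.9, w_semantic=0.1):
    """Weighted sum; the weights are assumed, chosen only so that the
    category match dominates and the cosine term acts as a tie-breaker."""
    return (w_category * shared_prefix_fraction(cat_a, cat_b)
            + w_semantic * cosine_similarity(emb_a, emb_b))


# Toy measures: category path plus a made-up 2-d embedding.
murder = (["person", "crime", "murder"], [1.0, 0.2])
rape = (["person", "crime", "rape"], [0.9, 0.3])
employee = (["person", "employee"], [0.1, 1.0])

score_crime_pair = commensurability(murder[0], rape[0], murder[1], rape[1])
score_mixed_pair = commensurability(murder[0], employee[0],
                                    murder[1], employee[1])
print(score_crime_pair, score_mixed_pair)
```

With these toy inputs the two crime measures score higher than the crime/employee pair, while the latter still gets a nonzero score and so stays in the queue at lower priority.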

jabowery · 31 May 2025 01:43

I decided to get a bit philosophical with Gemini 2.5 Pro after attempting to get it to address the theory of how one pre-computes imputation of missing data in such a manner that running correlations can give lesser weight to imputed missing values.

It kept pushing back on my perhaps inadequate understanding of the interplay between multiple imputation and running correlations on the resulting pseudo-complete datasets.
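For concreteness, the simplest version of “give lesser weight to imputed values” that I can write down is a weighted Pearson correlation in which any observation with an imputed component gets a reduced weight. This is only a sketch under assumptions: the function names and the 0.25 weight are made up for illustration, and it is not multiple imputation, nor is it necessarily what Gemini was arguing for.

```python
import math


def imputation_weights(imputed_x, imputed_y, imputed_weight=0.25):
    """Weight 1.0 for fully observed pairs, a reduced (assumed) weight
    when either value in the pair was imputed."""
    return [imputed_weight if (ix or iy) else 1.0
            for ix, iy in zip(imputed_x, imputed_y)]


def weighted_pearson(x, y, w):
    """Pearson correlation with per-observation weights."""
    total = sum(w)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y)
              for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Toy data: the last y value stands in for a poorly imputed observation.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 0.0]
y_was_imputed = [False, False, False, False, True]
x_was_imputed = [False] * 5

r_unweighted = weighted_pearson(x, y, [1.0] * 5)
r_downweighted = weighted_pearson(
    x, y, imputation_weights(x_was_imputed, y_was_imputed))
print(r_unweighted, r_downweighted)
```

In this toy case the down-weighted correlation is pulled back toward the relationship seen in the observed points, which is the behavior I was after.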

In exasperation, I finally tried to explain to it why I was pushing so hard on the boundaries of statistical theory.

It really does get to the point that I feel like the only “people” I can talk to about this are these LLMs. So here’s where I turned Gemini 2.5 Pro into my “friend”:

Well, I’m asking all this because the centralization of wealth and power away from independent scientists has created a need for those in possession of wealth and power to have trustworthy metrics – independent of those who may have conflicts of interest in doing data analysis – to decide who should receive funding. My experience with legislating incentives in space launch services to overcome institutional inertia led me to propose the criteria for the Hutter Prize to Marcus Hutter, since it is based on the most principled information criterion for model selection available in the age of Moore’s law – and is a single number that those who are at a large power distance from the consequences of their decisions can use as a far more disinterested guide to resource allocation than a blue ribbon panel of “judges” who have their own human foibles and conflicts of interest. If this gap in power distance is not remediated somehow in the social sciences, with the ever-increasing distrust and stakes driving motivated reasoning, I’m having to try to learn the field of statistical analysis and perform a lossless compression of a wide range of longitudinal measures to develop a model of macrosocial dynamics to illustrate how such a prize competition could alleviate social tensions now erupting in violence against the likes of Elon Musk.

jabowery · 2 June 2025 16:27

I just found out that the most likely way this foundation’s funds are to benefit tiny little Riverton, population ~270, is enhancement of the playground when, in fact, there are few children here.

This is what the local women’s organization thinks will revitalize rural America.

If you build it they won’t come, because the stupid zookeepers forgot that getting us great apes to produce the next generation requires more than an abundance of empty playgrounds.

jabowery · 4 June 2025 15:59

In a world where “the economy” outbids young men for the fertile years of young women, nothing matters but the cost of replacement reproduction and no government measures that because they’re all genocidal against their own peoples.

Wartime economies are the closest governments come to Militia.Money. Watch carefully what happens to Russia’s TFR. Ukraine’s TFR will likely suffer because That Unspeakable Thing In DC is showing signs of vacuuming up Ukrainian women to serve as fuck dolls for its penis-wielding bloblings.

CTLaw · 4 June 2025 19:11

It seems to me that our flooding of Ukraine with $ has killed the supply. Those who otherwise would be mail order brides are living it up in Kiev or Paris…

jabowery · 5 June 2025 16:22

At or near the top of the most informative State level ecological variables in the LotS dataset was HIV infection.

I wonder what the syphilis trends say about the inability of young people to pair:

jabowery · 5 June 2025 17:14

I wonder if DOGE will ever get around to tracking down where the money went – I mean other than up Hunter’s nose and Hunter’s hoes.

Gavin · 7 June 2025 14:12

China’s national college entrance exam kicks off; 13.35 million students sit the annual gaokao - Global Times

… “Candidates must pass three ‘gates’ from the school gate to the examination room, and intelligent security inspection equipment escorts the whole process,” Gao Xinqiao, Party chief of the Beijing No.19 Middle School told the Beijing Daily. …

CCTV News reports that East China’s Jiangxi Province will deploy an AI-powered, real-time surveillance system for all 567,100 candidates. Utilizing deep learning algorithms, the system monitors irregular behavior by both examinees and exam invigilators in real time. Actions such as starting early, turning one’s head, passing items, or leaving mid-exam will be flagged and recorded.

The photographs accompanying the article mostly show female students. That could be selective editing, but it may also suggest that China has fallen into the same feminist trap as the decadent West – with young women focused on getting a credential which leads to an unproductive government job instead of focusing on delivering the 2.1 future citizens on which China’s future depends.

jabowery · 7 June 2025 15:05

How many concubines did Chinese emperors have?

Han Dynasty (206 BCE–220 CE): Emperors like Emperor Wu had hundreds of women in the palace, though exact numbers are unclear.
Tang Dynasty (618–907 CE): Emperor Xuanzong reportedly had a harem of thousands, with records suggesting up to 3,000 women, including concubines and attendants.
Ming Dynasty (1368–1644): The Yongle Emperor had around 100 concubines, while others, like the Wanli Emperor, had fewer but still dozens.
Qing Dynasty (1644–1912): The Qianlong Emperor had over 40 concubines, with 280 women in his harem by some accounts. Cixi, a concubine herself, noted the Guangxu Emperor had fewer, around 3 high-ranking consorts and a small harem.

It’s only a matter of time before someone figures out that feminism is de facto Africanization and stops this nonsense of turning young women into corporate concubines, since legal “persons” can’t impregnate them. P-diddy might have pulled that off but it will probably have to await some recent sub-Saharan African immigrant being elected President or something. Musk doesn’t count.

eggspurt · 7 June 2025 15:46

MSM = men engaging in sexual acts with men

These stats have gone even more in the direction of MSM prevalence.