Roswell System Narrative

2023 September 23

Disabled screensaver (blanking) for the kelvin account.  We'll rely on
the Power Management settings to handle this.

Reformatted the 3 Tb drive I used for the doomed Windows 11 mirror
backup for use as a mirror backup of both SSDs.
    super
    umount /dev/sda2
    fdisk /dev/sda
    p           # Shows NTFS formatting
    o           # New DOS partition table
    n           # New partition
    p           # Primary
    1           # Partition 1
                # Start (press Enter to accept the default)
                # End (press Enter to accept the default)
    w           # Write to drive
This creates a Linux partition filling the entire drive.  Unplug and
plug back in.
    mkfs -t ext4 -L Roswell_Backup /dev/sda1
    fsck -f /dev/sda1
Unplug and plug and it mounts successfully.

Downloaded and installed Balena Etcher .deb package from:
    https://github.com/balena-io/etcher/releases
This is a utility to write ISO images to USB drives.
The dpkg -i balena-etcher_1.18.11_amd64.deb installation left unresolved
dependencies.  To fix these, I ran:
    apt-get install -f
which installed the dependencies and then completed configuration of
Balena Etcher.  Etcher may be found in the Accessories section of the
applications menu.

Downloaded Rescuezilla from:
    https://rescuezilla.com/
and flashed onto the Kingston 32 Gb USB drive I last used for the Xubuntu
install ISO.

Restarted and booted into Rescuezilla.  Selected English.  A big black
box appears, filling around 3/4 of the screen, and then nothing happens.
The system is totally dead, and even Ctrl-Alt-Del doesn't reset it.
The keyboard is still showing the boot colour pattern.  Had to power
cycle machine.  This happened two times.

So much for that.

Moving right along, let's try Redo Rescue:
    http://redorescue.com/

When I try booting from its USB stick, I get a blue screen with
"Security failure" and a peekaboo message that disappears before I can
read it.  This happens despite trying every way I could imagine to
create the USB boot drive, wasting more than three hours on the process.

It may be that we have to disable Secure Boot.  You do this by booting
with F2 and scrolling down (note, mouse scroll wheel does not work here)
to Secure Boot, where you can turn off "Enable Secure Boot".

It still boots Xubuntu and Windows 11 with Enable Secure Boot turned
off.

And...with Secure Boot turned off, it boots into Redo Rescue from the
USB stick.

Guess what?  After a few seconds, Redo Rescue hangs with the big black
box on the screen just like Rescuezilla.  I'm beginning to get the
idea that somebody up there really doesn't like the idea of customers
making all-inclusive bare metal restore backups with free and open
source tools.

OK, two batters have come to the plate and both have ignominiously
struck out.  Batter up!   This time, trudging from the on deck circle
to the plate is an old timer, Clonezilla:
    https://clonezilla.org/
a venerable free and open source program for disc cloning and bare
metal backup.  In fact, Rescuezilla is a graphical user interface
bolted on to Clonezilla.  Clonezilla is known for its primitive
text-mode user interface but also for broad compatibility, so maybe it
will be able to break this no-hitter.  It also runs from a USB stick,
including its own Linux distribution to back up with discs completely
idle.  I downloaded and made a USB stick with Rufus on Windows and
booted it.  It came up OK, so I decided to see if it was able to
boot with Secure Boot turned on.  I went back to the BIOS and enabled
it and, sure enough, Clonezilla booted just fine.  I feel so secure now.

Now I plugged in the 3 Tb USB drive we formatted at the start of
this long day and started an image backup of all of the partitions
on which Windows 11 is installed, specifying verification after
the backup.  This ran to completion with no problems, creating a
directory on the external drive called Windows_2023-09-23-18.img
containing the image files and metadata that Clonezilla uses to
restore the partitioning, boot loader, etc. on a bare metal restore.
The partition dumps mirror only occupied space (it understands most
file system structures) and the data it dumps are compressed with gzip.
The total dump of the Windows drive is 39 Gb.

I then mirrored the partitions on the Linux drive.  This also completed
successfully, creating a directory called Linux_2023-09-23-19.img which
is 16 Gb.

As Clonezilla does not do incremental backups, every dump is a full
mirror, but at present we have plenty of space on the backup drive
for lots of dumps until we can get a smarter backup solution running.
The advantage of the mirror is that it permits complete restoration
in case of disaster, while most "smart backup" tools require extensive
preliminary work bringing up the system before restoring the backup.

Transferred the Wallpaper archive to both the Linux and Windows side
and set up wallpaper to make it obvious which we're running.

At the end of a very long day, we now have mirror backups of both
operating system installations.  This should have taken about an hour
to accomplish.  It took around ten.  The Clonezilla boot is installed
on the Kingston 32 Gb drive.  I will reserve it for that until we sort
out things further.

When you search for "Google Chrome" in the pre-installed Microsoft Edge
browser, it puts up a big panel at the top of the results saying
"There's no need to download a new web browser."  Then, when you ignore
it and click the Download Google Chrome result, it puts a pop-up on top
of the Google Chrome page saying:
    Microsoft Edge runs on the same technology as Chrome
    with the added trust of Microsoft.
and a button saying "Browse securely now" that takes you to a puff page
about the advantages of Edge.  "Added trust of Microsoft, eh?"  Sounds
like some anti-trust action is in order here.

Then, when you download the Chrome installer, another big banner appears
at the top of the page with the same "added trust" message as the pop-up.
This one has nothing to clearly distinguish it from the content which
is provided by Google on the page.

Oh my God!  After downloading the Chrome installer, I looked for it in
the Download folder and it was nowhere to be found.  I displayed the
Downloads panel in Edge, and it shows ChromeSetup.exe with a line
drawn through it and a message below, "Removed".  Let's try downloading
it again.

This time I clicked on the download before it disappeared, and a panel
appeared at the right of the Edge screen with a multiple choice question
asking why I wanted to install another browser.  This is just ludicrous.
This intrusion has a heading "We love having you!"  "Detestation of
everything Microsoft" is not among the choices.

Finally, Google Chrome is installed and I can throw Microsoft off the
Edge.

Somehow, Windows spontaneously switched to a "Light" theme in which
window bars, etc. are a sickly salmon pink.  I switched Settings/
Personalization/Colors/Choose your mode to "Dark" to get rid of it.
I don't particularly like dark mode, but that pink makes me bilious.

When I launched the Alienware Command Center, it said additional
components were required.  I gave it permission to download, and it
proceeded to do so.  I gave it permission to install.  Naturally,
the "Alienware OC Controls Application" installer popped up *under*
all other open windows.  When I found it, I clicked Install.  This
popped up yet another Install dialogue, which I clicked.  Finally, it
said OC controls installed.

I have had excellent results handling multiple bootable images on a single USB stick with Ventoy. You might find it helpful, too.

2023 September 24

Downloaded Basemark GPU benchmark from:
    https://www.basemark.com/benchmarks/basemark-gpu/
and installed in ~/linuxtools/basemarkgpu-1.2.3.  You run it with:
    ./basemarkgpu
Dies with:
    libva error: /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so init failed
So much for that.

Installed:
    apt-get install vainfo
Reports:
    libva info: Found init function __vaDriverInit_1_14
    libva info: va_openDriver() returns 0
    vainfo: VA-API version: 1.14 (libva 2.12.0)
    vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 22.3.1 ()

To mount the Windows 11 C: drive under Linux, proceed as follows.
Use:
    lsblk
to list installed block devices.  Find the drive with all of the
(6 in my case) Windows partitions, and identify the big one which
will be the NTFS C: drive, here:
    nvme1n1     259:1    0   1.9T  0 disk
    ├─nvme1n1p1 259:4    0   500M  0 part
    ├─nvme1n1p2 259:5    0   128M  0 part
    ├─nvme1n1p3 259:6    0   1.8T  0 part
    ├─nvme1n1p4 259:7    0   1.4G  0 part
    ├─nvme1n1p5 259:8    0    15G  0 part
    └─nvme1n1p6 259:9    0   1.1G  0 part
in which nvme1n1p3 is the C: drive.  This will have device file name
/dev/nvme1n1p3.  Get its UUID with:
    blkid /dev/nvme1n1p3
    /dev/nvme1n1p3: LABEL="OS" BLOCK_SIZE="512" UUID="560E427A0E4252E3"
        TYPE="ntfs" PARTLABEL="Basic data partition"
        PARTUUID="cbe2166a-6083-45e0-924a-920746ad2471"
Create a mount point for the Windows drive:
    mkdir /win
Add an entry to /etc/fstab to mount it read-only.
    # Windows 11 C: Drive
    UUID=560E427A0E4252E3 /win	ntfs ro 0 0

You can now mount it with:
    mount /win
and it will be automatically mounted on subsequent boots.
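
As a minimal sketch of the steps above, the fstab line can be generated
rather than typed by hand.  The helper name fstab_ntfs_ro is my own
invention, not part of any package; it only prints the entry, leaving
the edit of /etc/fstab to be done manually as super-user.  The UUID can
be obtained with "blkid -s UUID -o value /dev/nvme1n1p3", which prints
just the value.

```shell
#!/bin/sh
# Sketch: build a read-only NTFS /etc/fstab entry for a given UUID.
# fstab_ntfs_ro is a hypothetical helper name, used here for illustration.
fstab_ntfs_ro() {
    uuid=$1         # e.g. output of: blkid -s UUID -o value /dev/nvme1n1p3
    mountpoint=$2   # e.g. /win (must already exist)
    printf 'UUID=%s %s ntfs ro 0 0\n' "$uuid" "$mountpoint"
}

# Example, using the UUID reported by blkid above:
fstab_ntfs_ro 560E427A0E4252E3 /win
```

The output can be reviewed and then appended to /etc/fstab.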

Phil Turmel suggested Ventoy for managing booting from ISOs on USB
sticks, avoiding the need to re-format and dedicate a USB stick to
each ISO.  That sounded excellent, so I gave it a try.

Downloaded Ventoy from:
    https://www.ventoy.net/en/download.html
and unpacked into:
    ~/linuxtools/ventoy-1.0.95
Launched with:
    super
    cd ~/linuxtools/ventoy-1.0.95
    ./VentoyGUI.x86_64
Inserted new Lexar 8 Gb USB drive.
Performed install of version 1.0.95--successful.

After the installation, the drive is formatted with:
    sda           8:0    1   7.5G  0 disk
    ├─sda1        8:1    1   7.5G  0 part /media/kelvin/Ventoy
    └─sda2        8:2    1    32M  0 part
The big /media/kelvin/Ventoy partition is formatted as exfat.  This
is where you copy the .iso files to be installed.

Downloaded Memtest86+ from:
    https://memtest.org/
and unpacked mt86plus_6_20_64.iso, which I copied to the Ventoy
partition.

Booting Ventoy requires disabling Secure Boot.  With it disabled,
Clonezilla boots correctly, but Memtest86+ hangs in an apparent
CPU loop.

So far, the main problem with Ventoy is that it requires disabling
Secure Boot even if the ISO you're loading (such as Clonezilla)
supports it.  Now, I recognise that Secure Boot is nothing but a
scam to require people who make operating systems which are an
alternative to the Microsoft Trust to pay them a Microsoft Tax, but
at the same time I prefer things that run on a stock system
without disabling stuff in the BIOS, so I'm torn.  After struggling
to get Windows 11, Xubuntu, and Clonezilla all booting in secure mode,
I'm inclined to stay with that configuration for the time being.

Downloaded and installed Phoenix Firestorm for Second Life on the
Linux side and unpacked in ~/linuxtools.  It will be accessed the
same way through symbolic links as we do on Hayek and Ragnar.

With Firestorm under Linux and the window sized to around 7/8 of the
full screen of 2560x1600, I get around 160 frames per second on
Fourmilab Island.  This is sensitively dependent on window size: with a
window around 1/4 of the screen, the frame rate is 273 frames per second.

Installed:
    apt-get install texlive-xetex
This is the version of TeX/LaTeX which supports UTF-8.

Installed:
    apt-get install fonts-linuxlibertine
This is the Unicode font collection used by XeTeX.

Installed:
    apt-get install texlive-pstricks
This is needed to include EPS in LaTeX documents with the
"graphicx" package.

Installed:
    apt-get install traceroute
    apt-get install nmap
    apt-get install spell
    apt-get install exif
    snap install audacity
    apt-get install nedit
    apt-get install ibritish
    apt-get install units
    apt-get install meld
    apt-get install cadaver
    apt-get install xdaliclock
    apt-get install npm         # Installs most of the nodejs complex

Huh. Works with Secure Boot for me. :man_shrugging:

Ah, I turn that on again afterwards.

Makes HAL seem compliant:

2023 September 25

Added an /etc/hosts entry for ragnar.  This is semi-ephemeral since
Ragnar's IP is assigned via DHCP, but it's just too convenient not to
have while it lasts.

On the Windows side:

Got rid of the stupid "Windows spotlight" picture on the stupid Lock
screen via Settings/Personalization/Lock screen.  Selected a custom
picture from Pictures/Wallpaper, changed Lock screen status to None,
and unchecked "Get fun facts, ... on your lock screen".  An idiot
lock screen I can't turn off isn't where I go to look for "fun".

Installed PowerShell 7.3.7.0:
Launched existing PowerShell as Administrator:
    PS C:\Windows\system32> winget search Microsoft.PowerShell
    Name               Id                           Version Source
    ---------------------------------------------------------------
    PowerShell         Microsoft.PowerShell         7.3.7.0 winget
    PowerShell Preview Microsoft.PowerShell.Preview 7.4.0.5 winget
    PS C:\Windows\system32> winget install --id Microsoft.Powershell --source winget
    Found PowerShell [Microsoft.PowerShell] Version 7.3.7.0
    This application is licensed to you by its owner.
    Microsoft is not responsible for, nor does it grant any licenses to, third-party packages.
    Downloading https://github.com/PowerShell/PowerShell/releases/download/v7.3.7/PowerShell-7.3.7-win-x64.msi
    Successfully verified installer hash
    Starting package install...
    Successfully installed

Pinned PowerShell 7 to the Taskbar.  $PSVersionTable reports version
7.3.7 and claims we're running Microsoft Windows 10.0.22621.  Apparently
it's so powerful it can even travel backwards in time.

The user home account Windows 11 created for me on install is the idiot
C:\Users\kelvi.  Querying how to change this results in a bunch of
scary Microsoft bullshit that boils down to "Don't try it".

Verified that I can SSH log in to Ragnar from Cygwin by specifying IP
address and password.  This will help transferring cut and paste stuff
for this log.

Changed screen power off settings in Settings/System/Power & battery to:
    Plugged in, turn off screen             15 minutes
    Plugged in, put device to sleep         25 minutes
It then nagged me that having the two times different "results in higher
carbon emissions".  Shut up, Greta.

Back on Linux.

Moved Roswell to main development desk.  Disconnected auxiliary screen
(Philips 24 inch, 1920x1080 60 Hz) from DisplayPort adaptor on Ragnar
and connected directly to HDMI port on Roswell.

To enable automatic configuration of displays when an auxiliary display
is connected or disconnected:
    Settings/Display/Advanced/Connecting Displays
        Check "Configure new displays when connected"
If needed, disconnect and reconnect in order to configure.

Reconfigured to side by side displays.

Set wallpaper for second display.  You move the Display settings dialogue
to the screen for which you wish to set its properties.  When setting
wallpaper, you select the directory where it lives, then choose from the
images it finds in that directory.

Added direct IP address entries to /etc/hosts for aws, ag, and sc.
Updated IP address to access Ragnar via WiFi.

Booted into Windows.

To set up second display, go to Settings/Display and select "Extend
display".  Until you do this, default is mirror display on both
monitors.

To set different wallpaper for each display, choose the wallpaper as
usual and then right click on the wallpaper image and select which
display should show it.  If you left click, it will be shown on both.

Booted back into Linux.  Came up with dual screen as set up before.

Connected the Fourmilab_Mirror drive to the USB hub.  It mounted on:
    /dev/sdf1       7.3T  3.0T  4.0T  43% /media/kelvin/Fourmilab_Mirror
I have not yet confirmed if we can boot with it connected or what
Windows will make of it if it's connected when we bring it up.

Created symbolic links for easy access to Fourmilab_Mirror:
    super
    ln -s /media/kelvin/Fourmilab_Mirror/kelvin/juno/home/kelvin /juno
    ln -s /media/kelvin/Fourmilab_Mirror/kelvin/hayek/home/kelvin /hayek

As of 2023-09-26 at 00:06 UTC, Roswell is the primary development machine
at Bleakleigh.

Trying to fix stupid 12 hour U.S. time in date command.

According to:
    https://askubuntu.com/questions/1238397/ubuntu-server-20-04-time-format-24-hours-on-shell-with-date-command
tried:
    super
    localectl set-locale LC_TIME="en_GB.UTF-8"
Log out, then log back in, and it's fixed.  I actually just did:
    su - kelvin
to test rather than logging all the way out and back in.
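
A quick way to see which locale a given shell is actually using for
time formatting is sketched below.  It relies on the standard
precedence (LC_ALL overrides LC_TIME, which overrides LANG); the
function name effective_lc_time is hypothetical.  A session started
before the change keeps its old environment, which is presumably why a
fresh login via "su - kelvin" was enough to see the fix.

```shell
#!/bin/sh
# Sketch: report which locale governs time formatting in this shell.
# Precedence: LC_ALL overrides LC_TIME, which overrides LANG.
# effective_lc_time is a hypothetical helper name.
effective_lc_time() {
    if [ -n "${LC_ALL:-}" ]; then
        echo "$LC_ALL"
    elif [ -n "${LC_TIME:-}" ]; then
        echo "$LC_TIME"
    else
        echo "${LANG:-C}"       # nothing set at all means the C locale
    fi
}

effective_lc_time
```

If this prints en_GB.UTF-8, "date" should show 24 hour time.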

2023 September 26

To enable Compose key for special characters:
    Settings/Keyboard/Layout
uncheck "Use system defaults".
Set Compose key to "Right alt".
Bob's your uncle.

Installed:
    apt-get install wmctrl
This is used by ~/bin/Sb to set modes on the clock window.

After running overnight, the system locked up with a black screen and
nothing but a blinking underline cursor at the top left.  Ctrl-Alt-Del
performed a normal Linux shutdown and clean reboot.  During the hang
the fan was running in "Alienware space heater" mode.

Plugged in Ethernet cable.  It got its address normally from DHCP and
connected just fine.  SSH login to the WiFi address still works.
Having the Ethernet jack on the left side means a stiff Category 5
cable uses up a substantial amount of desk space.  A more supple cable
or a right-angle plug becomes very attractive.

After the reboot, the system didn't mount the external USB drive.  It
only mounted it after I unplugged the USB cable and plugged it back in.

Installed:
    apt-get install xsel
This is used by "ctwit" to copy its output to the X copy/paste selection
buffer.

Changed the toolbar display format for the Clock widget to
"%a, %Y-%m-%d %H:%M", for example, "Tue, 2018-03-06 22:20".

To make window buttons on the taskbar (toolbar) at the top group by
type:
    Right click on a vacant space in the top bar.
    Select Panel/Panel Preferences.  In that pop-up, select Items tab.
    Double click "Window Buttons".   In the resulting pop-up, under
    "Behaviour", select Window grouping: "Always".
Wasn't that easy?  See:
    https://unix.stackexchange.com/questions/652948/xfce-how-to-disable-taskbar-items-grouping

Installed:
    apt-get install syncthing syncthing-gtk

Opened a browser window to:
    http://127.0.0.1:8384/
In Actions/Settings, set a GUI password:
    User: kelvin    Password: usual password
    Use HTTPS for GUI.
Killed and restarted syncthing.  Logged back in with browser.  Used the
user ID and password I set above.

Our device id is:
    roswell
    REDACTED

Here is how to control those infuriating Snap update popups:
    https://snapcraft.io/docs/keeping-snaps-up-to-date
You can set when it checks for updates with:
    super
    snap set system refresh.timer=sun1,05:00
which will check on the first Sunday of the month at 05:00
(I believe this is local time).  The time specifications
are documented at:
    https://snapcraft.io/docs/timer-string-format
You can see the current settings with:
    snap refresh --time

Configured Perl for installation of modules from CPAN by
logging in as "root":
    su - root
    
    perl -MCPAN -e "shell"
    install Bundle::CPAN
    o conf commit
We will always install Perl modules from the root account for
system-wide access.

Installed:
    apt-get install perl-doc
This installs "perldoc" and the base Perl documentation.

Installed:
    apt-get install figlet
This is the ASCII art label maker we use for sections in Nuweb
programs.

Installed Skype with:
    snap install skype
It used to be a package in the Ubuntu repository, but is now available
only as a snap.  Made a test call: it appears to work.

Installed Zoom client with:
    snap install zoom-client
It appears to work, but Virtual Background doesn't seem to work without
a green screen; if you don't have one, it does funny things.

Installed totem video player:
    apt-get install totem

Installed audio utility:
    apt-get install sox

Installed audio/video million-blade Swiss Army Knife:
    apt-get install ffmpeg

Catastrophe!  I decided to make my weekly visit to the Second Life
Server group using Phoenix Firestorm viewer on Linux.  Everything was
fine in logging on and travelling to the venue with Fourmilab Rocket as
I usually do.  The meeting was fine until, at 38 minutes after the hour,
one of the Linden hosts suggested the attendees teleport to an
experimental region running a new development version of the simulator
to stress test it with many simultaneous teleports in and also with a
heavy load of script execution.  I went to the destination with no
trouble and, to do my part in the stress test, fired up Chaos Butterfly
    https://marketplace.secondlife.com/p/Fourmilab-Chaos-Butterfly/24053377
in worn mode with particle effect trails.  About ten seconds into the
test, my machine completely locked up.  The cursor disappeared within
the Firestorm window and when I moved it outside, I could see it but
was unable to click on anything.  From Ragnar, I could ping the machine
and log in via SSH, but attempting to killall Firestorm accomplished
nothing.

Finally, I decided to reboot with "shutdown -r now", and Roswell
immediately dropped the SSH login and went to the Xubuntu shutdown
screen...where it remained for ten minutes.  I eventually lost
patience and tried to power cycle the computer, but it would not
power down!  Pressing the power button did nothing.  Was AGI in the
driver's seat?

I disconnected all external cables to the machine with no change.  I
disconnected the power supply and, apart from the power button changing
colour from blue to yellow, no change.  Now, this machine does not have
a removable battery, so was I doomed to waiting for it to run down?

Finally, after about 20 minutes, the machine spontaneously powered down.
I reconnected everything (live dangerously!) and powered back up.  It
came back up normally and apparently everything was tickety-boo.  Of
course I lost all of my open windows and browser tabs which had to be
manually re-established.

I went back to Second Life with Firestorm, but I did not try the Chaos
Butterfly test this time.

What is going on?  I have no idea.  Perhaps the RTX 4090 graphics
processor support in Linux has "a few rough edges" which were triggered
by the particle effect trails Chaos Butterfly was emitting.  I shall
have to test this on Firestorm for Windows to see what happens there,
but I will defer the experiment until a time I'm better prepared for
a total, can't power down, lockup.

At the moment, it's back up and behaving OK.  We'll see....

Based on:
    https://postimg.cc/image/vm7vbujv9/
I created a new "application" under Settings/Session and Startup called
"no caps" which executes:
    setxkbmap -option ctrl:nocaps
which is supposed to disable the Caps Lock key.  We'll see if it works at the
login, as that's the only time it's executed.

I spoke too quickly.  After the reboot following the lock-up, the
external USB drive, once again, was not mounted. My guess is that when
it has shut down due to inactivity, it doesn't detect the system coming
back up and/or doesn't come back to life quickly enough for the
automounter to see and mount it.  Once again, unplugging and replugging
the USB cable brought it back to life.
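
A possible alternative to walking over and replugging the cable is
sketched below.  This is untested against the failure described here:
if the kernel has not enumerated the drive at all, the by-label link
will not exist and the mount will simply fail.  The helper name
ensure_mounted is hypothetical.

```shell
#!/bin/sh
# Sketch: mount a labelled drive if the automounter missed it.
# ensure_mounted is a hypothetical helper; label and mount point are
# supplied by the caller.
ensure_mounted() {
    label=$1
    mountpoint_dir=$2
    if mountpoint -q "$mountpoint_dir"; then
        echo "already mounted: $mountpoint_dir"
    else
        # udisksctl asks the same service the desktop automounter uses
        udisksctl mount -b "/dev/disk/by-label/$label"
    fi
}
```

Assuming the drive in question is the one labelled Fourmilab_Mirror,
usage would be:
    ensure_mounted Fourmilab_Mirror /media/kelvin/Fourmilab_Mirror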

Does not holding power button for 20-30 sec power down the machine?

Interesting that you seem to feel comfortable installing snaps. After years of promises of great permission controls, I still do not understand who the publishers of these snaps are and what permissions these snaps require.

Every time before now. But this time holding down the power button did absolutely nothing. I held it down for more than a minute on two occasions. Before, it only took about 5 seconds.

All of the snaps I install are from the Snapcraft repository. I assume that the publishers of high visibility applications like Skype and Zoom would take immediate action if somebody was posting rogue versions. In any case, more and more Linux applications are shipping as Snap only. I suspect the Snap feature’s allowing developers to push updates to users motivates them to use Snap.

The promise of snaps was wonderful – binary portability across various Linux distributions, sandboxing and fine-grained permission control. Binary portability seems to be getting there, but at the expense of compromising security. Snapstore does not seem to allow a user to review security capabilities of a snap prior to installation. As a result, the majority of applications are not motivated to emphasize security and are not sandboxed at all.

Until this is resolved, running a snap-based installation in a VM with very limited time- and area- access to the host filesystem seems like the only viable solution.

2023 September 27

Installed:
    apt-get install cpuid
I needed this for my screed about Intel instruction set
incompatibilities.

The Google Chrome browser comes up with its own screwball override of
the Xfce window controls.  This means you lose the ability to send the
window to another workspace or make it visible on all workspaces.  To
restore this, you have to right click on the window title area and
select something like "Use system window border".  I don't recall
precisely what it was since, once clicked, it was gone forever and you
cannot (or at least I cannot figure out how to) reset it to the
original (stupid) mode.  Chromium, being less grabby, does not do this.

The system was well-behaved today.  I used it entirely as my main
development machine for posts and maintenance on the Scanalyst site
and it worked without any problems at all.  At this writing, uptime is
28 hours and I have had no need to boot into Windows.

I really love this 45.7 cm diagonal 2560x1600 165 Hz (!) display on the
desktop.  It makes me yearn for more on the auxiliary screen, which is
61 cm but a mere (!) 1920x1080 60 Hz which I bought originally as the
primary screen for my Raspberry Pi 400.  Everything looks so much
*larger* when you drag it to the auxiliary screen.

2023 September 28

The /home/kelvin/.config/GIMP directory was set to root:root ownership,
presumably because the first time I used GIMP I was su to root.  I
fixed it with:
    super
    cd ~/.config
    chown -R kelvin:wheel GIMP/
This should dispense with all the scary warnings every time I run GIMP.

Installed Steam:
    super
    snap install steam
Immediately after being launched after installation, it updated itself
with a 36 Mb download.  Isn't it great that snaps are always up to
date?

Logged in and, of course, had to go through the dance of E-mail
verification of a recognition code.  Imagine how frictionless our lives
could be if we could dispense with that continent of a third of a
billion grifters and layabouts who render our whole world a low trust
society.

To test Steam installation, purchased, downloaded, and installed (on
the Linux side) "The Battle of Polytopia".
    https://en.wikipedia.org/wiki/The_Battle_of_Polytopia
If it's good enough for Elon, it's good enough for me.

Installed:
    apt-get install ncal
This is what they're calling the Unix "cal" program these days.

High weirdness: I had left the system idle for around two hours and
when I returned, the fan was blasting away full-tilt, but emitting air
around room temperature, not perceptibly heated.  I ran "top" and it
showed the system 99% idle.  There was no obvious reason for the fan to
be running, but there it was.  After saving what I was doing, I
rebooted the system, noting that it had been up for 2 days and 2 hours
before the "shutdown -r now" was issued.

The shutdown appeared normal but when the Alienware start-up screen
appeared it seemed to be stuck there for around a minute (while usually
it would enter the boot loader selection in a second or two).  It
then spontaneously jumped into a System Diagnostic screen where it
reported running tests on all of the system's myriad fans.  All of these
tests passed, and it displayed a Continue button.  I pressed it, and the
boot loader appeared, whence I booted into Xubuntu and everything
seems fine.  The fan is not running now and the system is behaving.
I have no idea what this is all about, but just like when you find one
grey alien living in your basement, there's probably more you haven't
yet discovered.  We'll see....

2023 September 29

After leaving the machine idle overnight, I found it upon waking to be
stuck in the "Alienware space heater mode" I originally encountered on
2023-09-26.  Once again, the main screen was blank except for a
blinking underline cursor at the top left and the auxiliary screen was
totally blank.  The fan was going full tilt and kicking out the joules.

This time I was in a better position to investigate, so I first
determined that I could ping the system from Ragnar and then log in
via SSH.  Nothing appeared out of the ordinary on the remote login.
I then ran "top", which reported (elided after the top three items):
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
   1100 root      20   0       0      0      0 R 100.0   0.0 396:57.24 nvidia-modeset/kthread_q
   1971 root      20   0   28.4g 157396 101948 R  99.7   0.2 406:02.89 Xorg
   3519 kelvin    20   0   33.2g 229564 137124 S   1.3   0.4  68:09.97 chrome

Clearly, something related to the Nvidia graphics card had locked up,
and took the X window system along with it, running up a total of six
and a half hours of CPU time (presumably on different cores/threads)
each.  That explains the heat, especially if the Nvidia GPU was in on
the act.
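
For future lock-ups, the check above can be reduced to a one-liner run
over SSH.  The following is a sketch only: filter_hogs is a hypothetical
helper, the 90% default threshold is arbitrary, and it assumes top's
default column layout, in which field 1 is the PID, field 9 is %CPU,
and field 12 is the command.

```shell
#!/bin/sh
# Sketch: pick out runaway processes from top's batch output.
# filter_hogs is a hypothetical helper.  It reads top output on standard
# input and prints PID, %CPU, and command for rows at or above the
# threshold, skipping header lines (whose first field is not a number).
filter_hogs() {
    threshold=${1:-90}
    awk -v t="$threshold" \
        '$1 ~ /^[0-9]+$/ && $9 + 0 >= t { print $1, $9, $12 }'
}

# Typical use, with one batch-mode snapshot from top:
#   top -b -n 1 | filter_hogs 90
```

Fed the listing above, this would print only the nvidia-modeset and
Xorg lines.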

I then tried a variety of things, both of my own devising and
recommended by Web searches.
    kill 1100           # Kill nvidia-modeset
        #   Nothing happens
    kill -9 1100        # Really kill nvidia-modeset
        #   Nothing happens
    #   https://www.shellhacks.com/restart-x-server-ubuntu-linux/
    pkill X
        #   Nothing happens
    Ctrl+Alt+F1
        #   Nothing happens
    kill 1971
        #   Nothing happens
    kill -9 1971
        #   Xorg stopped, fan stopped.  Now Top shows:
            PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
           1100 root      20   0       0      0      0 R 100.0   0.0 409:15.31 nvidia-modeset/kthread_q
           1101 root      20   0       0      0      0 R 100.0   0.0   1:09.84 nvidia-modeset/deferred_close_kthr+
          30208 root      20   0    9944   5376   4864 R 100.0   0.0   1:09.83 gpu-manager
              1 root      20   0  168272  12288   7936 S   0.0   0.0   0:02.73 systemd
        #   Screen continues to show blinking cursor
    xinit
        waiting for X server to begin accepting connections .
        ..
        ..
    startx
        #   Fan starts again. Top shows
            PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
           1100 root      20   0       0      0      0 R 100.0   0.0 412:42.68 nvidia-modeset/kthread_q
          30230 root      19  -1    2772   1280   1280 R 100.0   0.0   2:31.66 Xorg.wrap
           1101 root      20   0       0      0      0 R 100.0   0.0   4:37.20 nvidia-modeset/deferred_close_kthr+
          30208 root      20   0    9944   5376   4864 R 100.0   0.0   4:37.19 gpu-manager
          29980 root      20   0       0      0      0 I   0.3   0.0   0:00.16 kworker/u64:3-events_freezable_pow+

Finally, I resort to:
    shutdown -r now
        #   Nothing happens
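
The polite-then-forceful kill sequence used above can be wrapped up as
in the sketch below (kill_escalate is a hypothetical helper and the
grace period is arbitrary; it is meant to be run as super-user).  Note
that, as far as I know, kernel threads such as nvidia-modeset/kthread_q
(the ones showing zero VIRT and RES in top) do not act on signals sent
from user space, which would be consistent with "kill -9 1100" doing
nothing.  This helper can therefore only be expected to work on ordinary
processes such as Xorg.

```shell
#!/bin/sh
# Sketch: ask a process to exit, then force it if it ignores the request.
# kill_escalate is a hypothetical helper; run it with enough privilege
# to signal the target, otherwise the existence checks are meaningless.
kill_escalate() {
    pid=$1
    grace=${2:-5}                               # seconds to wait after SIGTERM
    kill -TERM "$pid" 2>/dev/null || true
    while [ "$grace" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0  # process is gone
        sleep 1
        grace=$((grace - 1))
    done
    kill -KILL "$pid" 2>/dev/null || true
    sleep 1
    # Succeed only if the process no longer exists
    ! kill -0 "$pid" 2>/dev/null
}
```

For example, "kill_escalate 1971 10" would give Xorg ten seconds to
exit before sending SIGKILL.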

Then, it's finger on the power button, and after around 30 seconds, it
powers down.  I turn it back on and reboot into Xubuntu.  When the
login screen appears, I try to log into my account and after entering
the password, the screen goes blank, and after around ten seconds, the
login screen appears again.  I repeat this experiment sufficient times
to convince myself it isn't going to "get better" (and justify a
diagnosis of insanity), and try, as a lark, logging in with the
"installation" account I created whilst installing the system.  That
works, although the auxiliary screen remains black and the cursor and
windows are confined to the main screen.

I try logging in with my main account from Ragnar via SSH, and at the
end of the login messages I espy with my little eye:
    /usr/bin/xauth:  /home/kelvin/.Xauthority not writable, changes will be ignored
    X11 connection rejected because of wrong authentication.
Ahhhh, the good old .Xauthority file getting into the act again.  It
turns out to have its ownership set to root:root, explaining the
message and perhaps the inability to log in from the window system
console.  As super-user, I deleted the .Xauthority file.  Then, when I
tried SSH logins from Ragnar, I got:
    xauth:  timeout in locking authority file /home/kelvin/.Xauthority
I then discovered there were files named:
    -rw------- 2 kelvin wheel    0 Sep 29 13:47 .Xauthority-c
    -rw------- 2 kelvin wheel    0 Sep 29 13:47 .Xauthority-l
which have the distinct odour of lock files, so I deleted both of them.
I then tried an:
    xauth generate :0
but it wouldn't let me do that, presumably because I was not logged in
from the X windows display ":0".

Then I tried logging out from "installation" and logging in with my
account on Roswell's main screen, and this time it let me log in.
In the process, it created a new ~/.Xauthority file owned by me.
Now, I could log in from Ragnar via SSH with no errors or warnings
and run programs that open windows via X forwarding.
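
The two conditions found above (an .Xauthority file owned by the wrong
user, and leftover "-c" and "-l" lock files) can be checked for before
trying to log in.  What follows is only a sketch: check_xauth is a
hypothetical helper, not something from the system, and it assumes the
GNU "stat -c %U" form for reading a file's owner.

```shell
#!/bin/sh
# Sketch only: check_xauth is a hypothetical helper.
# Usage: check_xauth HOME_DIR EXPECTED_OWNER
# Prints one line per problem found and returns 1 if there were any.
check_xauth() {
    home_dir=$1
    want_owner=$2
    bad=0
    auth="$home_dir/.Xauthority"

    # A file owned by somebody else (root:root in the case above)
    # cannot be rewritten by xauth at login time.
    if [ -e "$auth" ]; then
        owner=$(stat -c %U "$auth")      # GNU stat assumed
        if [ "$owner" != "$want_owner" ]; then
            echo "bad owner: $auth is owned by $owner"
            bad=1
        fi
    fi

    # Leftover lock files make xauth time out waiting for the lock.
    for lock in "$auth-c" "$auth-l"; do
        if [ -e "$lock" ]; then
            echo "stale lock: $lock"
            bad=1
        fi
    done
    return $bad
}
```

For example, "check_xauth /home/kelvin kelvin" run as super-user would
report either problem without changing anything.  Deleting the files
remains a manual step.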

So far, so (not so) good, but the auxiliary screen remained stygian.
I tried unplugging the HDMI connector and plugging it back in, and
every time I changed the state of the connector, the Display settings
panel popped up, but even when plugged in, it showed only the Laptop
screen, not the auxiliary screen.  On a guess, I tried connecting the
screen to the DisplayPort jack through a DisplayPort to HDMI adaptor
dongle, and it behaved exactly the same.

Looking at the "dmesg" output when I plugged in the display revealed
some bullshit about:
    module verification failed signature and/or required key missing
which smelled like Microsoft's monopoly enforcement gang up to their
old tricks.

At this point, I tried booting into Windows 11 and, sure enough, the
auxiliary screen worked just fine, both on the DisplayPort and direct
HDMI connections.  I left it back on the HDMI port.

Booting back into Xubuntu it didn't, of course, work, but I was able
to confirm that the "failed signature" messages occurred on a clean
boot with the display attached.

Further research revealed a report:
    https://askubuntu.com/questions/1230924/ubuntu-20-04-does-not-recognize-second-monitor
that turning off the "Secure boot" option in the BIOS configuration
would fix this problem.  I restarted the system, disabled secure boot,
and CAZART!, the second screen came to life and there were no wacko
failed signature messages in dmesg.
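
A quick way to confirm whether the signature complaints are present
after a given boot is to count them in the kernel log.  This is a
minimal sketch: count_sig_failures is a hypothetical helper which
simply matches the "module verification failed" text quoted above.

```shell
#!/bin/sh
# Sketch only: count_sig_failures is a hypothetical helper.
# Reads kernel log text on standard input and prints the number of
# lines reporting a module signature verification failure.
count_sig_failures() {
    # grep -c prints the count; it exits non-zero when the count is
    # zero, so "|| true" keeps that case from looking like an error.
    grep -c 'module verification failed' || true
}
```

Typical use would be "dmesg | count_sig_failures" (dmesg may require
super-user privilege, depending on configuration).  If the mokutil
package is installed, "mokutil --sb-state" reports whether Secure Boot
is currently enabled.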

Now, why this driver and auxiliary screen, which has been working in
the Xubuntu installation since 2023-09-25, should suddenly start to
fail with a driver signature problem when I have installed precisely
zero updates to the Linux system since then is a total mystery.
Perhaps it's checking the driver signature against some external
signature repository which has gone silent or is now sending bogus
signatures or rejecting valid drivers--I have no idea.  But for the
foreseeable future, "Secure boot" is going to remain disabled.  This
was enough crap for me, no thank you very much, Microsoft.

While I was at it, and with all the reboots and stuff required to
re-establish all of my open windows and tabs, I took the opportunity
to apply all pending updates to Windows 11 and Xubuntu.  None appeared
relevant to the Nvidia driver problems chronicled here.

Thus were forfeit, forever, five hours in which I had hoped to perform
some useful work.

Plugged the external USB drive into one of the USB connectors on the
left side of the laptop.  Perhaps it will run faster there than on
the hub, which I suspect is limited to USB 2.0.

Added the ability to back up Roswell to the Fourmilab_Mirror external
USB drive.  I added a "roswell" directory to my home directory on that
drive, created three subdirectories (owned by root:root) for the
backup mirrors:
    boot
    root
    win
and placed a shell script, Mirror_roswell, adapted from the
hayek/Mirror_hayek script to perform the backup.  The script backs up
the root directory (ignoring any mounts of other file systems within
it), the /boot directory (including the /boot/efi VFAT file system,
which is a separate mount), and the C:\ drive of the Windows 11
SSD, which we mount as /win, as described on 2023-09-24.  Note that
since this is an NTFS file system, the files within it contain
attributes and metadata which are not backed up by our Linux rsync
procedure, but all of the data are there to be restored.  We do not
back up the arcane Windows boot partition and other Microsoft demonic
ephemera, as trying to restore them from Linux would be a futile
endeavour.  (We'll rely on the Clonezilla backups [see 2023-09-23]
should we need to restore them.)
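
The Mirror_roswell script itself is not reproduced here.  As a rough
sketch of the three mirrors just described, and assuming standard rsync
options only (-a for archive mode, -x to stay on one file system,
--delete to remove files which have vanished from the source), the
commands might look like the ones printed below.  The function name
mirror_commands and the destination argument are hypothetical, and the
commands are printed, not run, so they can be inspected first.

```shell
#!/bin/sh
# Sketch only: not the actual Mirror_roswell script.
# Usage: mirror_commands DEST_DIR
# Prints (does not run) one rsync command per mirror directory.
mirror_commands() {
    dest=$1
    # Root: -x stops rsync from descending into other mounted file
    # systems (/win, /boot/efi, the backup drive itself, and so on).
    echo "rsync -a -x --delete / $dest/root/"
    # /boot: no -x here, so the separately mounted /boot/efi VFAT
    # file system is copied along with it.
    echo "rsync -a --delete /boot/ $dest/boot/"
    # Windows C:\ drive, mounted on /win.  NTFS-specific attributes
    # and metadata are not carried over by rsync.
    echo "rsync -a --delete /win/ $dest/win/"
}
```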

The external USB mirror backup drive does, indeed, appear to run much
faster connected directly to a USB 3.0 port on the laptop than to the
USB hub.

The space occupied by the mirror of Roswell is:
    226M    boot
    4.0K    Mirror_roswell
    37G     root
    57G     win
Note that we can dramatically reduce the size of the "win" mirror by
excluding the following ephemeral files:
    hiberfil.sys            Hibernation dump file
    pagefile.sys            Paging file
    swapfile.sys            Swap file
I have not yet added this refinement.

2023 September 30

There were no fan-fare events overnight and everything was fine.  We'll
have to see whether disabling secure boot fixes what are apparently
Nvidia driver induced hangs.

Added code to the Mirror_roswell script on the external USB drive to
exclude the huge ephemeral system files in the root directory of the
/win drive:
    --exclude={"win/hiberfil.sys","win/pagefile.sys","win/swapfile.sys"}
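
The brace list above is a bash feature: the shell expands it into three
separate --exclude options before rsync ever sees it.  Should the
script need to run under a plain POSIX sh, the same options can be
built with a small loop.  This is a sketch, and exclude_opts is a
hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: exclude_opts is a hypothetical helper.
# Usage: exclude_opts PREFIX NAME...
# Prints one --exclude=PREFIX/NAME option per line.
exclude_opts() {
    prefix=$1
    shift
    for name in "$@"; do
        # The option text is passed as an argument so that the
        # printf format string does not itself begin with "-".
        printf '%s%s/%s\n' '--exclude=' "$prefix" "$name"
    done
}
```

Calling "exclude_opts win hiberfil.sys pagefile.sys swapfile.sys"
prints the same three options the brace expansion produces, one per
line.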

Here is information about these Windows files:
    hiberfil.sys
        https://winaero.com/windows-11-hibernation-enable-disable-delete-hiberfil-sys-file/
    swapfile.sys pagefile.sys
        https://theitbros.com/swapfile-sys-windows-10/
Note that in many cases these files will be sparse, so an rsync mirror
of them won't occupy as much space as they appear to consume in an
ls output.  But, as the system gets more and more used, they may grow
into their footprint.

The document about swapfile.sys goes on and on about "UWP applications"
without, of course, ever defining the acronym.  Here is a Microsoft
document:
    https://learn.microsoft.com/en-us/windows/uwp/get-started/universal-application-platform-guide
which goes on and on without really defining it either, except as
another bullshit initiative to complicate the lives of those stupid
enough to develop software to run on their legacy platforms.

Huh. I may have to try this. I have occasional hangs that appear to be related to external monitor changes while suspended.


2023 October 1

The display and fan were well-behaved once again last night.

Reviewing the disc partitions of the Windows drive as seen from Linux:
    lsblk
        nvme1n1     259:1    0   1.9T  0 disk
        ├─nvme1n1p1 259:3    0   500M  0 part
        ├─nvme1n1p2 259:4    0   128M  0 part
        ├─nvme1n1p3 259:6    0   1.8T  0 part /win
        ├─nvme1n1p4 259:7    0   1.4G  0 part
        ├─nvme1n1p5 259:8    0    15G  0 part
        └─nvme1n1p6 259:9    0   1.1G  0 part
Viewing these partitions with fdisk, we see:
    Device              Start        End    Sectors  Size Type
    /dev/nvme1n1p1       2048    1026047    1024000  500M EFI System
    /dev/nvme1n1p2    1026048    1288191     262144  128M Microsoft reserved
    /dev/nvme1n1p3    1288192 3964237823 3962949632  1.8T Microsoft basic data
    /dev/nvme1n1p4 3964237824 3967082495    2844672  1.4G Windows recovery environment
    /dev/nvme1n1p5 3967082496 3998539775   31457280   15G Windows recovery environment
    /dev/nvme1n1p6 3998539776 4000772095    2232320  1.1G Windows recovery environment

Added code to the roswell/Mirror_roswell script on the external USB
drive to back up the EFI boot partition on the Windows SSD.  This is
a VFAT partition which we mount dynamically for the backup.  This
partition is only backed up when running the script on Roswell with
the external drive plugged into a local USB port.
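
The code added to Mirror_roswell is not shown here.  One way to do
such a mount-on-demand backup is sketched below, assuming the EFI
partition is /dev/nvme1n1p1 as in the fdisk listing above.  The names
run and backup_efi, the DRY_RUN variable, and the destination directory
are all hypothetical.  With DRY_RUN set, the commands are only printed.

```shell
#!/bin/sh
# Sketch only: not the code actually added to Mirror_roswell.

# Run a command, or just print it when DRY_RUN is non-empty.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: backup_efi DEVICE DEST_DIR   (needs super-user privilege)
# Mounts DEVICE read-only on a temporary directory, mirrors it into
# DEST_DIR with rsync, then unmounts and removes the mount point.
backup_efi() {
    dev=$1
    dest=$2
    mnt=$(mktemp -d) || return 1
    run mount -o ro "$dev" "$mnt" &&
        run rsync -a --delete "$mnt/" "$dest/"
    rc=$?
    run umount "$mnt"
    rmdir "$mnt"
    return $rc
}
```

Mounting read-only means the backup cannot disturb the Windows boot
files, and the temporary mount point leaves nothing behind in the
normal file system layout.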

Installed:
    snap install evince
This is the PDF viewer I use on Linux.

2023 October 2

The system ran overnight without tripping the light fan-dango.  This is
now the third consecutive night without a fan hang.

Installed:
    apt-get install whois

There remains something whacked out with Firestorm viewer for Second
Life run on the Linux side.  When I go to a near-idle region like my
own, or some place like Denby for the Server Meeting, all is well, but
when I go to, say, London City, between 25% and 75% of the screen is
white with only a little of the avatar's texture poking through.  Frame
rate, however, remains high and there is no obvious lag.  I have not
seen this when running Firestorm under Windows.

All right, after further investigation, this appears to be an artefact
of our old bugaboo, "Advanced Lighting Model".  If I turn it on, the
weird rendering in London City seems to clear up.  I also set the
graphics performance to High, which should be easy for the RTX4090
which is supposed to be handling the graphics.  I see around 200 FPS
at Fourmilab Island with these settings.

I have spent most of the day on regular development and participation
in Scanalyst and upgrading Agora to Bitcoin Core 25.0 and making it my
main blockchain host.  During this I have used Roswell as the
development machine with no problems.

2023 October 3

Ran overnight with no video hangs or fantastic blowouts.  This makes
four in a row.

Made a backup on the 256 Gb USB stick of all of the top level
directories and Mirror_* scripts from the external USB Fourmilab_Mirror
drive.  This can be used to initialise a second mirror drive when I get
back to Fourmilab.

Once again, during the Second Life server group meeting, during a
region crossing stress test, Firestorm on Linux hung up, blocking
cursor access to other windows.  After trying various things to no
avail, I logged in from Ragnar via ssh successfully and tried to kill
and killall Firestorm, both regular and -9--neither did anything.  At
this point, I proceeded to a "shutdown -r now", and the system went
into the Xubuntu shutdown spinner and stayed there for around ten
minutes until I did a power cycle.

When it came back up, it jumped into the fan diagnostic as before and,
after around a minute, declared the fans OK and rebooted.  I booted
into Windows and concluded the Second Life meeting with Firestorm on
that side.  There were a number of Second Life crashes, but nothing
that brought Firestorm or the system down.

At the end of the Second Life meeting, I noticed that Windows was
displaying an icon saying it needed to be restarted "to install
updates".  I restarted it, and it restarted normally, but there was
no indication of updates being installed.  But at least the icon
went away.

I then booted back into Xubuntu, restored all of my windows and browser
tabs, and proceeded as before the excitement.