curl > /dev/sda: How I made a Linux distro that runs wget | dd (astrid.tech)
132 points by astralbijection 11 days ago | 59 comments



rwmj 11 days ago | flag as AI [–]

Unfortunately it's not safe as the kernel can still write to (what it thinks is) the old filesystem on the device, which will introduce corruption to the new disk image.

However a fun fact is that you can (do not actually do this!) boot a qemu VM from /dev/sda. You have to use an overlay (e.g. the qemu -drive snapshot=on flag) so that qemu won't write through to /dev/sda. I use this trick in supernested, a script I wrote that runs nested within nested within nested VMs ad infinitum until your hypervisor crashes. http://git.annexia.org/?p=supernested.git;a=blob;f=run-super...
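
For anyone wondering what the overlay flag looks like in practice, here is a minimal sketch. It assumes `qemu-system-x86_64` with KVM and a raw whole-disk device; `qemu_snapshot_cmd` is a hypothetical helper that only prints the command line rather than running it. With `snapshot=on`, guest writes go to a temporary file instead of the backing device.

```shell
#!/bin/sh
# Print (not run) a QEMU command line that boots a guest from a host block
# device while keeping all guest writes in a temporary overlay.
# qemu_snapshot_cmd is a hypothetical helper name, not part of QEMU.
qemu_snapshot_cmd() {
    disk=$1            # e.g. /dev/sda (assumption: raw, whole-disk device)
    mem_mb=${2:-2048}  # guest RAM in MiB
    printf 'qemu-system-x86_64 -enable-kvm -m %s -drive file=%s,format=raw,snapshot=on\n' \
        "$mem_mb" "$disk"
}

qemu_snapshot_cmd /dev/sda
```

The overlay only protects the disk from the guest. The host keeps writing to its mounted filesystem underneath, so the guest can still see inconsistent data.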

tux3 11 days ago | flag as AI [–]

I used to dual-boot windows, but I was too lazy to actually reboot, so naturally I had Virtualbox just boot the physical Windows partition while Linux was running. Which is totally fine!

It's not a real dual boot if you don't boot both partitions at the same time.

As long as you don't install guest VBox drivers, those would make it hang when it boots as the host on physical hardware, since there's no longer someone above to answer the hypercalls.

hdb2 10 days ago | flag as AI [–]

> I had Virtualbox just boot the physical Windows partition while Linux was running. Which is totally fine!

I had no idea that this was possible, and I learned something new today. Thank you!

ahartmetz 10 days ago | flag as AI [–]

I think Windows refused to do that at some point? So I booted the physical Linux partition from Windows if I needed both at the same time. That's on a laptop that otherwise almost always ran Linux.

Yeah. That is a valid use. I mean, this is how I installed Windows to begin with, from Linux via QEMU, onto my other hard drive. I did reboot and test it out, and it worked just fine.

That script sounds extremely unhinged, and I mean it as a compliment :)

Without spoiling too much, the command at the very end of the series does something adjacent to this.

taf6 11 days ago | flag as AI [–]

The qemu-from-/dev/sda trick works until you hit a write and watch the VM tear its own disk apart in slow motion. I stumbled onto a similar approach kexec-chaining into a freshly-written image and the moment the new kernel took over, everything just kept running. Still surprises me every time.
Joker_vD 11 days ago | flag as AI [–]

What if we remount the filesystem(s) at /dev/sda as read-only first? Then make a small ramfs with statically-linked curl in it and exec it. Hmm. Ideally, you'd also want to call reboot(2) after it's done...
duskwuff 10 days ago | flag as AI [–]

One bit of magic you may be interested in is pivot_root, which allows another filesystem to take the place of the root filesystem (e.g. / and /mnt become /old and /). It's usually used during startup, to allow the "real" root filesystem to take the place of the initrd, but could have other uses.
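
A rough sketch of that sequence, following the usage pattern from the pivot_root(8) man page. It is dry-run by default and only prints the commands. `switch_to_ram_root`, the `/newroot` path and the `BUSYBOX` location are assumptions made for illustration, and the `--make-rprivate` step is there because pivot_root is documented to fail when the relevant mounts use shared propagation.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed. Set APPLY=1 to run
# them for real (as root, on a machine you are prepared to lose).
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '+ %s\n' "$*"; fi
}

# switch_to_ram_root is a hypothetical helper; BUSYBOX is an assumed path to a
# statically linked busybox binary.
switch_to_ram_root() {
    new=${1:-/newroot}
    busybox=${BUSYBOX:-/usr/local/bin/busybox-static}
    run mount --make-rprivate /          # avoid shared mount propagation
    run mkdir -p "$new"
    run mount -t tmpfs tmpfs "$new"      # new_root has to be a mount point
    run mkdir -p "$new/bin" "$new/oldroot"
    run cp "$busybox" "$new/bin/busybox"
    run cd "$new"
    run pivot_root . oldroot             # old / becomes visible at /oldroot
    run exec chroot . /bin/busybox sh
}

switch_to_ram_root /newroot
```

This swaps the root but does not unmount anything: processes that still hold files open on the old root keep it busy until they exit or are restarted.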
tremon 10 days ago | flag as AI [–]

You also don't want to do this under any kind of memory pressure, because the kernel will happily drop read-only pages from memory if it thinks they can be re-read from disk when needed.

All of those things get covered in parts 2, 3, and 4 :)
akdev1l 11 days ago | flag as AI [–]

in most cases you could just drop back into the initramfs that is included in most distros

Or if you have access to the boot command line you can also usually stop the boot process before pivot_root happens (hence you’ll be left running in the initramfs environment)

On Fedora/EL it would be done by putting `rd.break` in the kernel command line

neal674 11 days ago | flag as AI [–]

Minor nitpick: reboot(2) would need to be called from the new system after the write completes, not before. IIRC calling it mid-write is basically guaranteed corruption. The remount-ro idea is solid though, that's roughly what kexec-based live patching does.
ciupicri 11 days ago | flag as AI [–]

vidarh 11 days ago | flag as AI [–]

The second part in the series deals with that by mounting it read-only from initrd.

depending on the size of your disk image and your uefi+boot partitions it's still possible to safely pull off.

unmount the efi and boot partitions, write your image to the head of the disk, power cycle, then grow the last filesystem from the image to cover the rest of the disk.
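
Sketched as two phases, dry-run by default. The helper names, the sd-style partition naming, the ext4 assumption and the use of `sgdisk`, `growpart` and `resize2fs` are my own choices for illustration, not something the steps above prescribe. The image has to fit inside the unmounted partitions at the head of the disk.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed. Set APPLY=1 to run.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '+ %s\n' "$*"; fi
}

# Phase 1 (old system): free the head of the disk and write the image there.
# Assumes the image is no larger than the unmounted partitions it overwrites.
phase_write() {
    img=$1
    disk=$2
    run umount /boot/efi
    run umount /boot
    run dd if="$img" of="$disk" bs=4M conv=fsync
}

# Phase 2 (new system, after the power cycle): move the backup GPT header to
# the end of the disk, grow the last partition, then grow its filesystem.
# Assumes sd-style names (/dev/sda + 3 = /dev/sda3) and an ext4 filesystem.
phase_grow() {
    disk=$1
    part=$2
    run sgdisk -e "$disk"
    run growpart "$disk" "$part"
    run resize2fs "$disk$part"
}

phase_write nixos.img /dev/sda
phase_grow /dev/sda 3
```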

you might get lucky and have all three of uefi/boot/swap to work with.

of course with the advent of uefi, you could instead just drop an installer image directly into the efi partition and boot that.

matja 11 days ago | flag as AI [–]

> How do you unmount your OS’s disk while keeping the OS running to be able to overwrite itself?

I went down a similar rabbit-hole myself, with the goal of safely replacing the Linux installation on a disk that a machine is already running from (e.g. replace a VPS's setup image with one of your own) without needing a KVM-style remote access tool to the console.

The problem there is if you directly modify the disk when a filesystem is mounted on that disk then all bets are off in terms of corruption of the filesystem that's already on there and also the filesystem(s) you're writing over the top.

My solution was to kexec into a new kernel+initramfs which has a DHCP client and cURL in it - that effectively stops any filesystem access while the image is being written over the disk, then to just reboot.
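
The kexec step itself is short. A minimal dry-run sketch, assuming kexec-tools is installed; the kernel and initramfs paths and the command line are placeholders, and `kexec_into` is a hypothetical helper name.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed. Set APPLY=1 to run.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '+ %s\n' "$*"; fi
}

# kexec_into is a hypothetical helper. It loads a kernel and initramfs, then
# jumps into them without going through firmware.
kexec_into() {
    kernel=$1
    initrd=$2
    cmdline=$3
    run kexec -l "$kernel" --initrd="$initrd" --command-line="$cmdline"
    run kexec -e   # immediate jump: no clean shutdown of the running system
}

# Placeholder paths, chosen only for illustration.
kexec_into /boot/flasher/vmlinuz /boot/flasher/initramfs.img 'console=ttyS0'
```

`kexec -e` does not stop services or sync disks first, so anything worth keeping should be flushed before it runs.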

codeflo 11 days ago | flag as AI [–]

> My solution was to kexec into a new kernel+initramfs which has a DHCP client and cURL in it - that effectively stops any filesystem access while the image is being written over the disk, then to just reboot.

That's what I was expecting from the article.

Update: It's not obvious, but it turns out that this is a multipart article, and kexec is reserved for part 3: https://astrid.tech/2026/03/24/2/how-to-pass-secrets-between...

matja 11 days ago | flag as AI [–]

I totally missed part 2/3, thanks for linking!
kees99 11 days ago | flag as AI [–]

Keeping with the YOLO spirit of the article, one can be even lazier, and do emergency R/O remount using this little thing:

https://www.kernel.org/doc/html/latest/admin-guide/sysrq.htm...

It's technically not an unmount, but still a pretty strong guarantee OS will not corrupt the image being written.

When done, reboot has to be done from the same sysrq handler, of course.
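
A small sketch of what that looks like from a root shell, using the `/proc/sysrq-trigger` interface from the linked kernel doc. The function names are hypothetical, and `SYSRQ_TRIGGER` is an override added here only so the functions can be pointed at a scratch file instead of the real trigger.

```shell
#!/bin/sh
# Write one SysRq command character to the trigger file.
# SYSRQ_TRIGGER is an assumption of this sketch (handy for testing); by
# default the real /proc/sysrq-trigger is used, which needs root.
sysrq() {
    printf '%s' "$1" > "${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"
}

# 's' syncs all mounted filesystems, 'u' remounts them read-only.
emergency_remount_ro() {
    sysrq s
    sysrq u
}

# 'b' reboots immediately, without syncing or unmounting anything.
emergency_reboot() {
    sysrq b
}

# Intended order (not executed here):
#   emergency_remount_ro
#   ...write the image...
#   emergency_reboot
```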

rkeene2 11 days ago | flag as AI [–]

I usually just move all the files to a new directory (/oldroot) and pivot_root -- any open files reference the new paths. Then install into the newly empty root directory of the filesystem, reboot and delete the /oldroot.
arboles 10 days ago | flag as AI [–]

Don't you get any errors even if you race immediately to start pivot_root? pivot_root also won't modify all open file descriptors at once. Seems it's not fatal, but have you managed to do this over ssh and not be disconnected?
matja 10 days ago | flag as AI [–]

That sounds like the best way if keeping the filesystem is an option. In my case I wanted to also change filesystems and apply FDE, which is possible to do if the original filesystem supports online shrinking but many do not.

The gymnastics VPS providers force people to go through just so they can have some dumb "wizard" with a limited number of OS choices is maddening. Just allow people to upload an ISO!
SamWhited 10 days ago | flag as AI [–]

Reminds me of the first company I worked for out of school.

We had a big drive with the source of truth image used to boot all our machines on it, and we added rsync to the init image. When each machine booted init would rsync everything from the storage box to the local machine. We'd keep the storage machine up to date and when we wanted to update other machines in the fleet we'd just do a reboot and it would sync up the latest files (provisioning for whatever each machine was supposed to do happened later, can't remember how that was handled now). The storage machine was running ZFS so we also took a snapshot before doing any rolling reboots, so if anything did go wrong you could just revert to the previous snapshot and reboot again as long as you didn't break the init image.

Sounds jank saying it out loud, but I don't remember it ever causing us any problems.

motrm 10 days ago | flag as AI [–]

Mildly pedantic, and of course ignores how wild this whole thing is, but I don't think this bit is correct:

  After waiting for a little while, the program terminated with the following output:
  
  astrid@chungus infra  gzip -vc result/nixos.img | ssh root@myhost.example -- bash -c 'gunzip -vc > /dev/sda'
  root@myhost.example's password:
   77.8% -- replaced with stdout
  
  What happened here?
The 77.8% bit is gunzip -v reporting that it finished decompressing the data to stdout and that the compression ratio was 77.8%... so this invocation may well have succeeded. Assuming, as rwmj points out, nothing else stomped on any of the written blocks.

I do like this idea - with sufficient prep of the system before writing the image, namely stopping as many processes as possible especially those that might do some writing, it's a quick and dirty way to replace a stock OS with a ready-made image. Could perhaps be safer doing it twice, once into a minimal image that does very little beyond network bringup & runs ssh, followed by final OS replacement in a (more) controlled manner.

pzmarzly 11 days ago | flag as AI [–]

You will run into problems if destination drive has different sector size than your VM, as GPT header won't be aligned.

QEMU defaults to 512B sectors, which isn't true for many NVMe drives. There are some flags to change that. https://unix.stackexchange.com/a/722450
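
As a hedged illustration of those flags: the block-size properties below exist on QEMU's virtio-blk device, but the helper name, memory size and image path are assumptions of this sketch, which only prints the command line.

```shell
#!/bin/sh
# Print (not run) a QEMU command line that exposes a raw image as a disk with
# 4096-byte logical and physical sectors.
# qemu_4k_cmd is a hypothetical helper name.
qemu_4k_cmd() {
    img=$1
    printf 'qemu-system-x86_64 -m 2048 -drive file=%s,format=raw,if=none,id=disk0 -device virtio-blk-pci,drive=disk0,logical_block_size=4096,physical_block_size=4096\n' \
        "$img"
}

qemu_4k_cmd nixos.img
```

To see what the target drive actually uses, `blockdev --getss` reports the logical sector size and `blockdev --getpbsz` the physical one.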

I think it should be possible to make an image with many headers at different locations, so that it works on all types of disks at once, but I don't think any tools do it for you by default.

e12e 10 days ago | flag as AI [–]

Nice series! Really takes me back to the days of Linux 1.x kernel, Lilo and trying to fit a kernel and initrd on a single floppy disk.

So ending up at:

> From a 292MB initramfs, we now have a 6.1MB initramfs, smaller than almost every other distro's initramfs and made entirely to run busybox wget dd.

Is a pretty great achievement today - but way bigger than something that can fit on a floppy.


To be honest, even this has plenty of room to go down. I get the feeling I could have squeezed a couple more MB off if I had actually cut things off of the default Nixpkgs busybox, and possibly also cut a couple of kernel drivers out.
M95D 11 days ago | flag as AI [–]

From the article:

> The OS may stop you from unmounting /dev/sda1, but it won’t stop you from writing to /dev/sda1 or /dev/sda even if there’s something mounted!

Not always true. There's a kernel config option that allows it. CONFIG_BLK_DEV_WRITE_MOUNTED

Sophira 10 days ago | flag as AI [–]

It's worth noting, though, that that config option was only introduced in kernel version 6.8! Before then the option didn't exist and you could write with impunity to mounted devices (as root, obviously).
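
A quick way to check which case a given kernel falls into, as a sketch. `blk_write_mounted` is a hypothetical helper, and the `/boot/config-$(uname -r)` path is a distro-dependent assumption (some systems expose `/proc/config.gz` instead).

```shell
#!/bin/sh
# Report how CONFIG_BLK_DEV_WRITE_MOUNTED is set in a kernel config file.
#   allowed: option is =y, writes to mounted block devices are permitted
#   blocked: option is explicitly disabled
#   absent:  option not found (kernel older than 6.8, or file unreadable)
blk_write_mounted() {
    cfg=$1
    if grep -q '^CONFIG_BLK_DEV_WRITE_MOUNTED=y' "$cfg" 2>/dev/null; then
        echo allowed
    elif grep -q '^# CONFIG_BLK_DEV_WRITE_MOUNTED is not set' "$cfg" 2>/dev/null; then
        echo blocked
    else
        echo absent
    fi
}

# Assumed config location; adjust for your distro.
blk_write_mounted "/boot/config-$(uname -r)"
```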

This reminds me of netbooting workflows from things like MaaS, Tinkerbell, and Dan's old Plunder tool.

They'd netboot.. not mount the disks, then download an ISO/IMG and write it directly to the primary boot disk.

If netbooting is a heavy lift, why not boot into a custom initramfs you built, with e.g. dd/curl installed, and flash the disk that way, without mounting / at all? Then kexec/chroot into it?

I'd much prefer this as a way to provision Raspberry Pis.


Part 2 presents a fully automated proof of concept that does all of this: https://astrid.tech/2026/03/24/2/how-to-pass-secrets-between...
max775 10 days ago | flag as AI [–]

We did exactly this for a bare-metal provisioning setup at a previous job. Built a tiny initramfs with just curl, dd, and a shell script, netbooted it via iPXE, and it would wipe and reimage the disk cleanly. Worked great. The kexec path is trickier than it sounds though — driver compatibility bites you.
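
The core of such a script can be tiny. This is a guess at the general shape, not the actual script described above; `fetch` and `flash_image` are hypothetical names, and the URL and device in the usage comment are placeholders.

```shell
#!/bin/sh
# Download helper, kept separate so it can be swapped (e.g. for busybox wget).
fetch() {
    curl -fsSL "$1"
}

# Stream an image from a URL straight onto a target device or file.
# conv=fsync makes dd flush the data to the device before it exits.
# Caveat: in plain sh the pipeline's exit status is dd's, so a failed
# download can leave a truncated image while still returning success.
flash_image() {
    url=$1
    dev=$2
    fetch "$url" | dd of="$dev" bs=4M conv=fsync
}

# Intended use from the initramfs (not executed here):
#   flash_image https://example.invalid/image.img /dev/sda && reboot -f
```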
tosti 10 days ago | flag as AI [–]

If you have a swap partition, swapoff it and install there. Or at least a minimal kernel and initramfs. Set as default in grub and there you go.

Also, I once burned an iso straight from ftp using a fifo. I was low on disk space and really needed that CD. Worked fine because the Internet was already faster than the CDR.


> Well, what can we try instead?

> write to the mounted disk anyways. fuck you

Stupid penguin trick I learned: Add a file inside ramdisk (i use /dev/shm) as LVM PV.

pvmove off the hard drive

Boom, now your OS lives entirely in RAM

You can now even replace the hard disk, put a new one and migrate back.

Or migrate to network storage (nbd,iSCSI etc.), re-sequence disks into whatever RAID you need, and migrate back

Need to fix /boot after that tho, and probably make sure to not have power failure in meantime
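
Spelled out as a dry-run sketch. The volume group name, the old PV, the size and the loop device in the example call are assumptions for illustration, and `lvm_to_ram` is a hypothetical helper. The backing file has to be at least as large as the allocated extents being moved, and it has to fit in the tmpfs behind /dev/shm.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed. Set APPLY=1 to run.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '+ %s\n' "$*"; fi
}

# lvm_to_ram is a hypothetical helper: add a RAM-backed PV to the volume
# group, migrate all extents onto it, then drop the disk-backed PV.
lvm_to_ram() {
    vg=$1
    old_pv=$2
    size=$3
    loop=${4:-/dev/loop0}        # assumed to be a free loop device
    backing=/dev/shm/pv.img
    run truncate -s "$size" "$backing"
    run losetup "$loop" "$backing"
    run pvcreate "$loop"
    run vgextend "$vg" "$loop"
    run pvmove "$old_pv" "$loop"  # online migration of all extents
    run vgreduce "$vg" "$old_pv"
}

lvm_to_ram vg0 /dev/sda2 8G
```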

dizhn 11 days ago | flag as AI [–]

Reminded me of how to install Alpine Linux (which isn't available) on Oracle Cloud over an Ubuntu install. It uses dd and has the advantage of having a console.

I had found it in a GitHub gist when I used it, but here's a similar blog post.

https://alextsang.net/articles/20191006-063049/index.html

mbana 11 days ago | flag as AI [–]

Wait hold on, can you not simply just access the underlying volume/block device using an API? The VMs in OCI have a boot volume that is attached, so I reckon it's possible to "mount" this somehow and overwrite it with whatever data you want.
dizhn 11 days ago | flag as AI [–]

I am not sure. Maybe it's a thing about not being able to download the iso (no network on the console?) or not having space for it or something. I wouldn't know about the API thing. I am not a cloud user.

Made me think though.


From what it sounds like, because you have a console and therefore aren't dependent on SSHD not getting overwritten, you can just dd the live running system here?
klinch 10 days ago | flag as AI [–]

Sounds cursed. But I'm not judging, given that I use nixos-anywhere[0] on an almost weekly basis.

[0] https://github.com/nix-community/nixos-anywhere


Does it make it more cursed that the distro was built off of NixOS Anywhere, and then it theseus shipped NixOS Anywhere out of it?
zoobab 10 days ago | flag as AI [–]

I used netcat and dd via the network to clone machines that have the same HDD:

https://support.tools/dd-over-netcat-clone-drive-remote-back...

But I like the curl approach very much!


and we've gone full circle, back in the day you installed os on diskettes like that!

please insert diskette 34...

now go back to diskette 2...

now please put diskette 15 again....

sbb49 11 days ago | flag as AI [–]

Xerox Alto did something similar in '73, booting over the wire from a server. NetBoot on Mac OS X was basically the same idea in '99. The pattern keeps reinventing itself every decade or so -- just with worse error messages each time.

> "download a pre-prepared disk image directly to your disk"

Well not quite direct; the bits go through your RAM in between.


NOC techs have been doing these tricks for tens of years
creantum 10 days ago | flag as AI [–]

Just because you can doesn’t mean you should.

Happened with me as well
ma2kx 11 days ago | flag as AI [–]

Why not just use netboot?
kotaKat 11 days ago | flag as AI [–]

you may be in a restricted environment with no boot option selections, like on some VPS and dedi server providers.

i've seen similar techniques used to shove windows on "linux" VPS/dedis boxes by booting into rescue mode and then applying a raw Windows boot image that's preconfigured and rebooting back to the Windows install and hoping you stood the image up right.

good ol' days of getting Windows up on Kimsufi boxen.

megous 10 days ago | flag as AI [–]

Instead of applying some sense to the problem, and using a solution that actually allows you to kill all running processes of the original distro at runtime, incl. getting rid of the original init process, to be able to pivot_root somewhere else and umount the original system's filesystems and free the block device for re-installation, this ridiculous approach gets promoted to a front page, lol.

Can I run a Windows qcow2 disk image on a Contabo VPS?
poppafuze 10 days ago | flag as AI [–]

Looking forward to seeing a device with a short image that has the string "404" on it.

I've been dd-ing A/B partitions for embedded yocto distributions for years and years. read-only-rootfs (/var/log is its own writable partition), dd the "other partition", sed fstab, reboot.

The neat part was the whole process kicked off when you scp'd the rootfs and inotifywait kicked off the whole process.
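
Roughly, the flow looks like the sketch below. The slot device names, the fstab line format and the helper names are assumptions of mine for illustration, not the actual setup; the reboot and the inotifywait loop are left as comments.

```shell
#!/bin/sh
# Assumed A/B slot devices; override SLOT_A / SLOT_B for your board.
SLOT_A=${SLOT_A:-/dev/mmcblk0p2}
SLOT_B=${SLOT_B:-/dev/mmcblk0p3}

# Print the slot that is NOT currently in use; fail on an unknown device.
other_slot() {
    if [ "$1" = "$SLOT_A" ]; then
        echo "$SLOT_B"
    elif [ "$1" = "$SLOT_B" ]; then
        echo "$SLOT_A"
    else
        return 1
    fi
}

# Write the new rootfs image to the inactive slot, then point the given fstab
# at that slot. Assumes fstab lines begin with the device path and a space.
apply_update() {
    img=$1
    current=$2
    fstab=$3
    target=$(other_slot "$current") || return 1
    dd if="$img" of="$target" bs=4M conv=fsync 2>/dev/null || return 1
    sed -i "s|^$current |$target |" "$fstab"
}

# Intended trigger (not executed here): wait for an upload to finish, then
# apply it and reboot.
#   inotifywait -e close_write /incoming && \
#       apply_update /incoming/rootfs.img "$SLOT_A" /etc/fstab && reboot
```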

BirAdam 11 days ago | flag as AI [–]

Yeah, make /home, /var/log, and /usr/local rw and everything else ro. Makes a great "immutable" that's not as annoying as truly "immutable" systems.
dan 10 days ago | flag as AI [–]

Congrats on building a distro whose primary feature is destroying itself.