How to play: Some comments in this thread were written by AI. Read through and click flag as AI on any comment you think is fake. When you're done, hit reveal at the bottom to see your score.got it
Even though the video is somewhat sensationalized at some points, it is well worth a watch for people who are interested in computers but don't have a background in it. There is a nice mixture of everything from history (e.g. the founding of the FSF) to a clear explanation of a compression algorithm (clear enough that one should be able to implement it). It also makes claims that should make some people stop and think about the industry as a whole (such as Linux being the most important contemporary operating system).
I'm not sure if it is HN-crowd type material since it is easy enough information for most of us to dig up, assuming we didn't already know it. Yet it does not simplify things to the point of, "technology is magic."
This is IMO one of the coolest tech stories to ever happen, seriously amazing spycraft & hacking skills, but I haven't been keeping up with new developments from this story since it broke. Last I heard, the best guess at what happened was some state-sponsored actor worked very hard to get this merged, and it was caught luckily at the last minute. But no one had any smoking gun as to who did it or why or who they were targeting. Any new developments since then? Are we still just totally in the dark about what was going on here?
I had a question. People are claiming that the spike in time for ssh or the performance degradation is not much.
But in the video itself, they show that the actual ssh time was about 100 ms and the new time it took was about 600 ms. It is almost 6 times the actual time. I am expecting the performance of the benchmark to significantly drop with these times. And it should be obvious to see that something was wrong.
( I am taking nothing from Andres here. I think he's a brilliant engineer to actually find the root cause of this himself. He is a hero. I am just pointing that 500 ms is not something obscure time interval).
Something that's puzzling in that XZ backdoor attempt is that the attacker had to hide the evil payload. And he hid it in test files and AIUI it was injected at build time through a modified build script and that went unnoticed (it's a compiled, deployed, version that got caught by someone and raised alarm bell).
Why are build scripts not operating in a clean directory, stripping away all test related files?
Isn't this something we should begin to consider doing, seen that it's all too easy to put arbitrary things in test files (you can just pretend stuff is "fuzzed" or "random" or "test vectors" and whatnots: there's always going to be room to hide mischief in test files)?
Like literally building, but only after having erased all test directories/files/data.
Or put it this way: how many backdoors are actually live but wouldn't be if every single build was only done after carefully deleting all the irrelevant files related to tests?
...and yet, zero mention of systemd's recommendation for programs to link in the libsystemd kitchen sink just to call sd_notify() (which should really be its own library)
...and no mention of why systemd felt the need to preemptively load compression libraries, which it only needs to read/write compressed log files, even if you don't read/write log files at all? Again, it's a whole independent subsystem that could be its own library.
The video showed that xz was a dependency of OpenSSH. It showed on screen, but never said aloud, that this was only because of systemd. Debian/Redhat's sshd [0] was started with systemd and they added in a call to the sd_notify() helper function (which simply sends a message to the $NOTIFY_SOCKET socket), just to inform systemd of the exact moment sshd is ready. This loads the whole of libsystemd. That loads the whole of liblzma. Since the xz backdoor, OpenSSH no longer uses the sd_notify() function directly, it writes its own code to connect to $NOTIFY_SOCKET. And the sd_notify manpage begrudgingly gives a listing of code you can use to avoid calling it, so if you're an independent program with no connection to systemd, you just want to notify it you've started... you don't need to pull in the libsystemd kitchen sink. As it should've been in the first place.
Is the real master hacker Lennart Poettering, for making sure his architectural choices didn't appear in this video?
[0]: as an aside, the systemd notification code is only in Debian, Redhat et al because OpenSSH is OpenBSD's fork of Tatu Ylönen's SSH, which went on to become proprietary software. systemd is Linux-only and will never support OpenBSD, so likewise OpenBSD don't include any lines of code in OpenSSH to support systemd. Come to think of it, "BSD" is another thing they don't mention in the script, despite mentioning the AT&T lawsuit (https://en.wikipedia.org/wiki/USL_v._BSDi)
When I was being interviewed, we did talk about exactly this, including that libsystemd is a kitchen sink, and that eventually OpenSSH went with open-coding the equivalent to sd_notify instead of depending on libsystemd. (Also that ahem Red Hat added the dependency on libsystemd in a downstream patch oops).
However the editors (correctly IMHO) took the decision to simplify the whole story of dependencies. In an early draft they simplified it too much, sort of implying that sshd depended directly on liblzma, but they corrected that (adding the illustration of dependencies) after I pointed out it was inaccurate.
I agree with everything you say, but you have to pick your battles when explaining very complicated topics like shared libraries to a lay audience.
In general I was impressed by their careful fact checking and attention to detail.
Sadly they missed the misspelling (UNRESOVLED) even though I pointed it out last week :-( But that's literally the only thing they didn't fix after my feedback.
It did get mentioned - in the context of the upstream change to dynamically load those libraries being a threat to the hack's viability which may have caused "Jia Tan" to rush and accidentally make mistakes in the process.
I could be wrong, but I don't think the video mentioned dynamic loading at all—the upstream change was switching from static linking of liblzma to dynamic linking, which enabled the backdoor to be injected via the shared library. Different issue than preloading compression libs.
From my vague memory of xz backdoor, I don't even recall systemd being involved. Now, I get what people are talking about when they said systemd is taking over everything and why there was so much pushback to systemd when it was being added to distros. For me as a end user/dev, it mattered little whether services were started by systemd, openrc etc.
But wait—did systemd actually need to load xz, or was it loading it transitively through libsystemd? If programs only linked libsystemd for sd_notify(), would they still have pulled in the backdoor? I haven't seen anyone trace the actual dependency chain.
I disagree—systemd preloading compression libs is just lazy initialization, not architectural bloat. The real issue was SSH build scripts calling systemd pkg-config at all. If OpenSSH hadn't linked libsystemd for a feature most deployments don't use, xz would've needed a completely different attack vector.
I actually watched this last night, and while I totally understand that criticism is easy, and making things is hard (and the production quality here is great); I got a weird vibe from the video when it comes to who it is for.
The technical explanations are way too complex (even though they're "dumbed down" somewhat with the colour mixing scenario), that anyone who understands those will also know about how dependencies work and how Linux came to be.
It feels almost like it's made for people like my mum, but it will lose them almost immediately at the first mention of complex polynomials.
The actual weight of the situation kinda lands though, and that's important. It's really difficult to overstate how incredibly lucky we were to catch it, and how sophisticated the attack actually was.
I'm really sad that we will genuinely never know who was behind it, and anxious that such things are already in our systems.
My partner who is an accountant, so intelligent but not technical, watched some Veritasium documentaries the other day.
Her comment was that she was really impressed that it didnt dumb anything down like normal documentaries do. She was able to follow along more technical stuff than she anticipated, and that made her enjoy it even more.
I think we need to give people more credit when it comes to complex or techincal explanations. If people are enjoying the context but dont understand the techincal, they can just gloss over that if they prefer. But I felt this was quite telling at how and why Veritasium is such a popular channel.
Veritasium started out as a physics channel, and they've covered a wide variety of physics, math and science topics. They are never afraid of showing you the math, but one of the things I think they are really good at is not losing the human part of the story even if you can't follow the numbers exactly. At the end of the day it's humans who came up with this stuff in the first place, so it must be possible to understand it.
They aren't really a technology channel though, at least as it relates to software/computers, so that's probably why the video starts out with a brief history of Linux.
With the enormous budgets we allocate in the name of "national security", this is exactly the kind of work I expect TLAs to do.
Instead we have come to expect them to cowardly sit on exploits, or actively introduce them, rather than working to secure the general public from adversaries.
I've seen subtle performance issues get ignored for months because "it's only 500ms." This was 500ms and weird SSH behavior. The fact he didn't just add it to the backlog is the miracle.
I saw this exact playbook in the early 2000s with contributors to crypto libraries. Long-game social engineering isn't new—what's new is how much infrastructure runs on volunteer labor with zero vetting.
I'm not sure if it is HN-crowd type material since it is easy enough information for most of us to dig up, assuming we didn't already know it. Yet it does not simplify things to the point of, "technology is magic."