Any Cloudflare employees reading this, your network map has a few PoPs missing from it https://www.cloudflare.com/network/ notably, Perth (PER) Australia. Hobart (HBA) Australia. Wellington (WLG), New Zealand. Christchurch (CHC), New Zealand. Nausori (SUV), Fiji.
> Despite our practice of deploying Linux patch updates every two weeks, we remained vulnerable because a month-old mainline fix had yet to be backported to our primary kernel line.
Hopefully a wake-up call to those who believe older distro LTS kernels are getting all the security fixes, as Canonical and Red Hat would want you to believe.
Would love to learn more about their internal behavioural detection program.
> One of the first things our security team did was confirm that our existing endpoint detection would catch this exploit. Our servers run behavioral detection that continuously monitors process execution patterns. It doesn't rely on knowing about specific vulnerabilities; it watches for anomalous behavior across the fleet.
If they're already running a custom Linux kernel build, why did they have AF_ALG enabled? Seems the perfect situation to limit features to only those actually being used.
* Get list of modules from Puppet's facts, confirm module isn't used anywhere (it wasn't)
* `install algif_aead /bin/false` in /etc/modprobe.d/disable-algif.conf
* Run a check using exploit code to check it is no longer working
I imagine CF runs more stuff that could use it, but apparently it's not an often-used API.
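The checklist above could be verified with something like the following. This is only a minimal sketch, not the commenter's actual tooling: the function names `module_install_blocked` and `loaded_modules` are hypothetical, and the functions work on text you have already read from a modprobe.d file and from `/proc/modules`. It does not replace running the exploit check from the last step.

```python
def _normalize(name):
    # modprobe treats '-' and '_' in module names as equivalent.
    return name.replace("-", "_")


def module_install_blocked(conf_text, module):
    """Return True if conf_text contains an `install <module> /bin/false`
    (or `/bin/true`) line, which makes modprobe refuse to load the module."""
    wanted = _normalize(module)
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        if (len(parts) >= 3 and parts[0] == "install"
                and _normalize(parts[1]) == wanted
                and parts[2] in ("/bin/false", "/bin/true")):
            return True
    return False


def loaded_modules(proc_modules_text):
    """Parse the text of /proc/modules into a set of loaded module names."""
    return {line.split()[0]
            for line in proc_modules_text.splitlines() if line.strip()}


if __name__ == "__main__":
    conf = "install algif_aead /bin/false\n"
    print(module_install_blocked(conf, "algif_aead"))
```

Note that the `install` override only affects future loads through modprobe; a module that is already loaded stays loaded, which is why the sketch also parses `/proc/modules`.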
This is an interesting post from Cloudflare, as usual, but it's not clear to me why they would have been vulnerable to CopyFail. Did I miss the point in this blog where that's addressed? What triggered the threat hunting and mitigation exploit? At what points in their architecture were they reliant on Linux user-based access control?
Yeah, both honestly. We ran into this at a previous job where the bigger concern was lateral movement from compromised internal services, not just external shells. If one service gets popped, you don't want it escalating on the same host. Defense-in-depth regardless of exposure.
> Linux kernel build based on the community's Long-Term Support (LTS)
CopyFail only highlights why companies want LTS. If there was a supported kernel built prior to 2017, most large companies would still be on that version, avoiding this issue altogether.
The corporate mindset is usually "never upgrade unless there is new hardware needed or critical software failure". All CopyFail did was reinforce that mindset.
I wonder if CopyFail will cause enterprises to put pressure on the Linux Foundation to maintain an "ultra LTS" where it is supported for 20 years?
> CopyFail only highlights why companies want LTS. If there was a supported kernel built prior to 2017, most large companies would still be on that version, avoiding this issue altogether.
Sadly, that's not really how it works for, say, Red Hat. They routinely backport features while keeping whatever "stable" version number on the kernel. We even had the displeasure of them backporting a bug... the same bug to 2 different RHEL versions.
The backport bug is somehow worse than the original. At least upstream you get one fix. Red Hat gives you the same broken code labeled "stable" across three release streams simultaneously.
The "Hunting for Exploitation" section is unclear to me: "The exploit leaves a distinctive trace in kernel logs when it runs." Hmm. Wouldn't a system with a compromised kernel also log exactly what the attacker wanted logged?
Also, isn't 48 hours prior to the disclosure a very narrow window? I wonder if their logs don't go back further or if there was another reason to look back only two days.
The attack itself creates the logs, which - reading between the lines - are shipped to a central log server. A compromised server might not send any new indicators to the logs, but existing logs moved off device would still be available.
I'd like to know what those distinctive traces are, which is also missing :(
Your exploit would have to get root and kill/exploit the logging daemon near instantly, else the log will already be sent to remote before you can change it locally
Minor correction: the compromised kernel wouldn't necessarily control what's already been written to the log. The exploit triggers the log before privilege escalation completes, so you'd need to retroactively scrub it. Though yeah, a sufficiently clever exploit could account for that.
I guess the hope is the kernel has been able to successfully transmit that log message to the immutable central logging infra before it gets compromised.
Although, given the tendency for endpoint logging agents to buffer messages to reduce their network chattiness, I do wonder if a fast-acting exploit could dump that buffer before it manages to be transmitted.
I don't think any of the agents are complex enough to immediately transmit permission elevation log messages over the regular background noise.
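For illustration, here is a rough sketch of what "transmit urgent messages immediately" could look like in a buffering agent. Everything in it is an assumption: `BufferedForwarder`, its `send` callback, and the `urgent_markers` substrings are hypothetical and not taken from any real logging agent or from Cloudflare's setup. The marker strings would have to match whatever trace the exploit actually leaves, which the post does not disclose.

```python
from typing import Callable, List, Sequence


class BufferedForwarder:
    """Batches log lines, but flushes at once when a line looks urgent."""

    def __init__(self, send: Callable[[List[str]], None],
                 urgent_markers: Sequence[str], max_buffer: int = 100):
        self._send = send                    # ships one batch off-host
        self._urgent = tuple(urgent_markers)  # hypothetical substrings to match
        self._max_buffer = max_buffer
        self._buffer: List[str] = []

    def _is_urgent(self, message: str) -> bool:
        return any(marker in message for marker in self._urgent)

    def log(self, message: str) -> None:
        self._buffer.append(message)
        # Flush early for urgent lines so they do not wait in the local buffer.
        if self._is_urgent(message) or len(self._buffer) >= self._max_buffer:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._send(batch)


if __name__ == "__main__":
    shipped = []
    fwd = BufferedForwarder(shipped.append, urgent_markers=("URGENT",))
    fwd.log("routine message")
    fwd.log("URGENT: placeholder for a suspicious kernel message")
    print(shipped)
```

The trade-off is the one discussed above: an early flush shrinks the window in which an attacker with root could discard the buffer, at the cost of more network traffic whenever a marker matches.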
this is a technical dive into how cloudflare responded, not a confirmation that they responded
for whatever reason, unknown to me, hn automatically strips "how" from the start of titles. i can't remember ever seeing a title where this was an improvement.
Slashdot did the same thing with "Ask:" and "Poll:" prefixes back in the early 2000s. Some worked, some didn't. Editors have always made these calls and they're always wrong about half the time. The survivors just fade into background noise.