46 points by gpi 29 days ago | 9 comments
stouset 29 days ago
Every single time I’ve seen companies use LTS versions of operating systems, upgrading has been more painful and has consumed at least double the overall time that upgrading regularly would have.

Plus it inevitably happens that teams will need something only available on newer platforms. With LTS OSes, this becomes an enormous PITA. Someone needs a newer version of some thing, that needs newer versions of system libraries, that turns out to use a new kernel feature, and now you’re in the hell of trying to figure all that out.

The only solution I have ever found is: upgrade everything constantly. Yes, it consumes time. But that time will get consumed anyway at some point down the line, and you’re better off doing it now. It’ll take less time and you’ll be able to deliver the new features your teams are asking for.

The only time I’d ever consider an LTS anything at this point is for early-stage startups that just cannot pay that cost.

mystifyingpoi 29 days ago
I'm a big fan of the opposite, nuclear approach - never upgrade. Always build the new system/cluster/database/whatever on the side, test it as long as you like, then flip the traffic over. It is both much easier and a hell of a lot harder (at different times in the lifecycle), but in the end it pays off. It's not always possible, but when it is, I'm on it.
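
Concretely, in a Kubernetes setup the final flip can be a one-liner once the green side has soaked (the Service and label names here are made up, a sketch only):

    # repoint the live Service from the old (blue) workloads to the new (green) ones
    kubectl patch service my-app -p '{"spec":{"selector":{"track":"green"}}}'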
stouset 29 days ago
This is the approach I prefer when able. But you should still be exercising it regularly. You’re still upgrading to new versions; you’re just not upgrading live systems.

At least with k8s, this approach is sadly often tough.

synack 29 days ago
LTS makes sense for services with a finite lifecycle that are meant to be replaced at EOL, not upgraded.
SahAssar 29 days ago
How often do you know that far ahead? If someone asked me to guarantee that a service will be replaced in 4 years, that'd be hard. Ask me about 12 years and it's basically impossible.
notatoad 29 days ago
if it's a project for your employer that you have staff in charge of maintaining, then it's probably a bad choice. but for projects delivered by contractors who provide a fixed service and maintenance term, it's reasonable to provide a fixed end date for the project as a whole, and that's probably less than 12 years.
stouset 29 days ago
I can agree with this.
dralley 29 days ago
You CAN upgrade RHEL (or CentOS Stream, or Alma, or Ubuntu) every 3 years if you want. Some people do it. Nobody forces you to live out the entire lifecycle on one release.

Some people appreciate having the option to pick and choose when the transition happens though.

stouset 29 days ago
Funny, because we’ve been on RHEL. After one round of calcification, upgrading our datacenters for each subsequent RHEL release took years of concerted effort. We actually lost ground every year.
hedora 29 days ago
I've supported others that use RHEL. If you like software archeology, it's the OS for you:

> This bug was fixed in 2002, then regressed in 2005 and fixed again in 2007 in mainline, but in RHEL, they botched the '02 and '05 back port, so '02 breaks it, but '05 doesn't. Now, let me tell you about secondary effects of patches in unrelated subsystems. We have three interesting "type A" examples introduced in ...

> ... and that's why it's completely acceptable to run half the fortune 500 off of Linux 2.6.19-RHEL.... in 2025.

> What, you're running 2.6.20? What a hoot!!! We like to kid. You're not serious? You are? You work at the local nuclear plant?

> backs out of conference room slowly, then bolts down the hallway screaming "run for your lives!" once they have a clear view of the elevator

dralley 29 days ago
> What, you're running 2.6.20? What a hoot!!! We like to kid. You're not serious? You are? You work at the local nuclear plant?

The local nuclear power plant is hopefully not connected to the internet. It shouldn't be surprising that such systems run old software.

hedora 29 days ago
Not all mission critical bugs are security holes.
dralley 29 days ago
Is a system that has been working fine for 10 years more, or less, likely to have a mission-critical bug than a system rebuilt on 4-month-old code?

HN tends to think a lot about what's good practice for hyperscalers with massive profit margins and capital expenditures, and not as much about what's good for industries where margins are thin and downtime has massive real-world consequences.

What kind of software architecture do you suggest for, say, the embedded OS on a bus-sized, $200 million ASML EUV lithography tool? Do you really think it's a great idea to pull every update without recertification to the control systems of that nuclear power plant, or the system that renders MRIs at the radiologist?

I'm not saying let them rot for decades, but caution is prudent sometimes.

stouset 29 days ago
FWIW I 100% concede that there are uses for LTS systems. For things where active development means releasing a new product that customers replace their old one with, go for it. For systems not under active development like a nuclear plant control system, go for it.

I don’t think those audiences make up a significant mass of HN readership, so my comments aren’t targeted at them. For your SaaS company with 20 services and growing? You will have less pain from constantly upgrading than you will from adopting LTS releases.

fujinghg 29 days ago
We jumped something like 8 years of CentOS versions without too many issues.

It depends how badly you fucked up in the first place really.

bityard 29 days ago
This is a good point. The distance between versions can matter, but it usually matters a lot less than the eng/ops processes and culture of the specific organization.

For my part, every time I have done a major OS distribution upgrade, roughly 70% of my time was spent fixing all of the things that were deployed into a highly fragile state to begin with, 25% on direct upgrade-related issues (dependencies, config, behavior changes), and 5% on actually performing the upgrade.

OptionOfT 29 days ago
I'm also in favor of constant updates. When you update relatively soon after a release, the internet's memory is very fresh: the errors you hit related to the versions you're updating from/to are the same ones everyone else just hit, so answers are easy to find.
koolba 29 days ago
Not too soon though. There’s a sweet spot for this where you’re not the guinea pig, but it’s recent enough that the target you’re upgrading to has not yet changed.

Once you’re more than one major version back, you risk hitting all kinds of issues.

solatic 29 days ago
> The only time I’d ever consider an LTS anything at this point is for early-stage startups that just cannot pay that cost.

You're touching on the deeper truth here. Ultimately you are forced to touch production all the time for at least security updates. The question is whether you have people on staff who know the system, at all levels of the stack. If you do, then the LTS route brings only pain, because you will have a fundamental divide between the people who need the latest features and the people who know the specific LTS release.

But if you have people who treat the lower level like a black box, and nobody on staff knows how it works, then LTS helps you derisk that black box by restricting updates to security fixes only, with nothing that is supposed to break existing behavior. Then, when upgrading LTS releases, you treat the LTS upgrade as a project in and of itself (full QA, the whole nine yards) instead of day-to-day maintenance running on autopilot.

Most SaaS stacks run by startups / grow-ups should have people familiar with every level of the stack that they're building on top of. If you don't, build on top of a managed service, and be familiar with the managed service's APIs instead.

worthless-trash 29 days ago
From my limited experience, I don't see many containerized apps end up using newer syscalls. We're only recently getting io_uring (with its array of security-defeating side effects) in containers.
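
For anyone curious, the gate here is the container runtime's seccomp profile; a fragment that explicitly denies the io_uring syscalls would look roughly like this (OCI/Docker profile syntax, sketch only):

    {
      "names": ["io_uring_setup", "io_uring_enter", "io_uring_register"],
      "action": "SCMP_ACT_ERRNO"
    }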

Business software moves much slower than consumer software; I expect that LTS ends up sucking less when they can control the userspace around their application.

inahga 29 days ago
OTOH, off-the-shelf software distributed for Kubernetes IME does tend to pick up new control-plane features fairly regularly.

I’ve historically tracked n-1 or n-2 of k8s, and it is not infrequent that we can’t take a helm chart update until we update k8s.
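
Charts can make that pin explicit via the kubeVersion constraint in Chart.yaml, which is usually where the update blocks; the version below is illustrative:

    # Chart.yaml
    kubeVersion: ">=1.28.0-0"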

tapoxi 29 days ago
eBPF potentially? There's a lot of security tooling out there (Wiz agent etc) that would assume a newer kernel on the node.
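
A quick check for whether a node can run the modern CO-RE-style eBPF agents is whether the kernel ships BTF, which older LTS kernels often don't:

    # CO-RE eBPF tooling generally wants kernel BTF
    ls /sys/kernel/btf/vmlinux && uname -r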
worthless-trash 28 days ago
Yeah, eBPF is another one of those speculative execution improvement tools.

I haven't had a chance to look at Wiz, but I would assume they make their own kernel module that inspects things based on a very slim ABI requirement.

Their customer base (according to their page) is Fortune 500 companies; most of those only move off unsupported releases when they are forced to, so it is unlikely they use new syscalls.

packetlost 29 days ago
I've found LTS releases of Ubuntu to be less prone to breakages when going LTS=>LTS vs LTS=>non-LTS=>LTS, but only slightly.

That being said, the one time I've had a bootloader broken badly enough that I was reading GRUB code was after a simple `apt dist-upgrade` on an LTS release, Ubuntu 22.04, so there's that.

These days I think my preference is for an immutable core + userspace package management, or going all in on something like NixOS. I'd really love to see a distro that takes this approach to its logical conclusion using OverlayFS or something (but strictly not containers).

lexicality 29 days ago
On the other hand, every time I've been in an "update everything constantly" situation there have been extremely painful outages every 6 months because people put dramatically too much faith in their unit tests to make updating "quick and painless".

Personally I go for a recurring calendar entry every few months that blocks off a few days to very carefully read all the patch notes before updating.

wink 28 days ago
counterpoint: I've been running my own mail server for 20 years, and until a while ago it was: install Ubuntu LTS, install the needed packages, don't touch the machine except for security upgrades until the LTS runs out, then (after skipping one LTS) move to the next one for another 4-5 years.

Yes, not a company, but I can imagine many more scenarios like this.

2OEH8eoCRo0 29 days ago
Those "LTS" distros will often backport support for new HW.
remram 29 days ago
This is usually done by just shipping new kernels. E.g. Ubuntu's "hardware enablement" (hwe) kernels are just new kernel releases, not old kernels with backported drivers.
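
e.g. on a 22.04 install, opting into the newer kernel series is a one-liner:

    # switch from the GA kernel to the rolling HWE kernel series
    sudo apt install --install-recommends linux-generic-hwe-22.04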
eduction 29 days ago
Honestly this is why I'm intrigued by Fedora Server, which at first I took to be some kind of sick joke, but having used regular Fedora long enough, I now see that it's good to be in the regular practice of handling upgrades. Like you said, it keeps those muscles from atrophying.
mdaniel 29 days ago
As a frame of reference, if one were currently on a 12-year-old Kubernetes release you'd be running (charitably) 1.0.0, because this is the earliest snap of kubernetes.io I could find, and it doesn't even have a version number: https://web.archive.org/web/20141114180626/http://kubernetes... [1]

So, I hear them about "but I wanna run 1.32 on my PBX for-evvvvvah", but I don't think that choice is going to do what they expect. Hell, I'd gravely wonder whether containerd would survive a 12-year-old release cycle.

1: chasing the API docs, one can see the 12-year-old control plane managed three resource types: Pods, ReplicationControllers, and Services https://web.archive.org/web/20141013032652/http://cdn.rawgit...

Palomides 29 days ago
now that it's 12 years later, maybe kubernetes is more feature complete, stable, and possible to maintain at one version
driggs 29 days ago
An LTS server is an appliance. There is so much bedrock software out there that the younger HN crowd may have never considered, unsexy yet absolutely critical. This is targeted at those who need to prioritize long-term stability above all else.
pm90 29 days ago
I'm guessing there is market demand for such a product. I can see a few midsized companies not wanting to deal with k8s upgrades until they go public or get acquired.

My gut reaction is that this is a bad idea. Anyone going this route should instead invest in a process to regularly upgrade Kubernetes and run it anywhere, as opposed to being tied down to Canonical. I know the judgement of a lot of great infra folks will be overridden by business folks in the name of stability though… sigh.

colechristensen 29 days ago
No, sometimes you just want things to work.

Upgrading takes work. Upgrades break things. Upgrades mean constant retraining.

At some point, you get to thinking "we've got better things to do than upgrade kubernetes", lots of businesses do. It's the kind of software that can be "done".

If you've got something that works, you can indeed be happy with keeping it the same for a dozen years.

mystifyingpoi 29 days ago
This. I would never upgrade Kubernetes if I could get away with it. I don't use any of the new shiny features; I'd be happy if I could stay at 1.21 forever.
rwmj 29 days ago
There's definitely demand for these long lifecycles. Red Hat (the company I work for) recently extended RHEL7 support until 2028, which will be about 14 years after it was first released: https://www.redhat.com/en/blog/announcing-4-years-extended-l...
bityard 29 days ago
A large enterprise org that I am close to still has almost 600 CentOS 7 hosts. The good news is that this is a small fraction of their fleet. The bad news is that they have standardized on RHEL 8 everywhere else and there are currently no plans to upgrade to RHEL 9...
dralley 29 days ago
Pour one out for the poor engineers that will be saddled with maintaining that commitment. Doesn't sound like a fun time.
soperj 29 days ago
Depends on the person. I've been maintaining the same software for 16 years now, I know the code base, changes are easy to estimate accurately, and I enjoy it.
doctorpangloss 29 days ago
Don't use Canonical's Kubernetes support, because microk8s is a disaster.

Here is a stylized post of what your experience is going to be when you ask for support from Canonical for Kubernetes in microk8s: https://github.com/canonical/microk8s/issues/3227#issuecomme...

They are really talented guys but microk8s is a fiasco.

mkesper 29 days ago
Yes, tried that once. Just use k3s and you get the stock Kubernetes experience with fewer resources.
develatio 29 days ago
It’s already a huge pain to update once every year; imagine updating after 12.
znpy 29 days ago
> It’s already a huge pain to update once every year, imagine updating it after 12

you don't.

A 12-year LTS means that over 12 years you get security fixes.

after 12 years you don't upgrade; you do a full reinstall/replatform.

rdtsc 29 days ago
You can still upgrade every year, or even every month. This doesn't stop you from upgrading; it just doesn't force you to, because the version never falls out of support.
solatic 29 days ago
I can imagine an architecture where companies run their stateless services on some cloud-managed EKS/GKE/AKS and their stateful databases on this LTS Kubernetes distribution. Upgrade the stateless service clusters once a year by blue-greening the entire cluster and flipping DNS, and treat database migrations as a once-a-decade project.
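
The cutover itself can be a single DNS change once the green cluster is ready (sketch with Route53; the zone ID and names are made up):

    # repoint app traffic at the freshly built cluster's load balancer
    aws route53 change-resource-record-sets --hosted-zone-id Z123EXAMPLE \
      --change-batch '{"Changes": [{"Action": "UPSERT", "ResourceRecordSet": {
        "Name": "app.example.com", "Type": "CNAME", "TTL": 60,
        "ResourceRecords": [{"Value": "green-cluster-lb.example.com"}]}}]}'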

Not to mention all the shops shipping appliances that run on top of Kubernetes, like Chick-fil-A running a Kubernetes cluster in every restaurant.

I have no doubt there's a market here.

SahAssar 29 days ago
That would need similar LTS support for the databases too. Also, if you are migrating the databases once in a decade, is that really something where you want/need k8s? Why not run straight on the OS at that point?
solatic 29 days ago
No, not necessarily. You have databases like Kafka and Elasticsearch that can benefit greatly from elastically adding additional nodes as needed. You now have Kubernetes Operators for lots of databases that help with quick HA failover when the underlying node fails. But Kubernetes itself is fairly mature at this point, and there's much less likelihood of some ground-breaking feature arriving that needs a new version of Kubernetes... The only one that comes to mind is in-place container resource resizing without a restart: https://kubernetes.io/docs/tasks/configure-pod-container/res...
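
Per those docs, the resize is a patch against the running pod, roughly like this (pod/container names hypothetical; needs the InPlacePodVerticalScaling feature gate):

    # bump CPU on a running container without restarting the pod
    kubectl patch pod my-db-0 --subresource resize --patch \
      '{"spec":{"containers":[{"name":"db","resources":{"requests":{"cpu":"2"},"limits":{"cpu":"2"}}}]}}'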

This is separate from the problem of needing to gradually roll pods (and their disks) to new nodes for every node in the fleet to accommodate a Kubernetes version rollout (i.e. with immutable nodes), which is an incredible pain for large database fleets and more or less not realistic to carry out quickly for critical security fixes. Having LTS nodes with LTS kubelet that you restart in place is much easier.
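
For scale, that rollout is a cordon/drain/replace loop over every node, with stateful pods re-replicating data before the next node can go:

    # the per-node dance, multiplied by the size of the fleet
    kubectl cordon node-42
    kubectl drain node-42 --ignore-daemonsets --delete-emptydir-data
    # ...replace the node image, wait for replicas to catch up, move on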

SahAssar 29 days ago
So who provides 12 years of LTS for those database versions and operators? Who provides 12 years of LTS for the clients of those databases?
zzyzxd 29 days ago
How much value can LTS provide? If you care about stability, just stick with stable API versions and you will likely be fine. For instance Pod[core/v1] has been around for a decade.

If you go with an LTS version, you rely on the vendor to backport security patches, you block your engineers from exploring new features, and if you use 3rd-party operators/controllers, good luck getting LTS support from all of them!

In the enterprise market, people have been dealing with weird proprietary software for decades; finally there's a decent piece of open source software with sane version-skew and deprecation policies that strives to make upgrades less painful. If you still can't do this upgrade once every year or two, you likely won't be able to do it in 12 years.

I imagine this might make sense if you are a non-tech company hoping to buy software from a vendor and never worry about it. But in that case, Kubernetes is the wrong abstraction layer for your business.