What should we call the current builds?

That's a horrible way to manage releases when new devices are released/supported constantly.

How would you manage it? Make a release whenever a new device was added?

[quote="dlang, post:61, topic:261"] there are no release criteria set for
LEDE/OpenWRT/Linux-kernel/Rsyslog/many others other than 'X amount of time
has passed, so it's time to make a release containing all the work that has
been done by anyone since the last release (unless a particular patch causes
too much grief) [/quote]

That's a horrible way to manage releases when new devices are released/supported constantly.

I would agree with you, except for the fact that it seems to work better than
anything else that has been tried. And there have been a lot of people trying
different things.

All attempts to define milestones and make sure they all work before a release
have not only resulted in far less frequent releases, but also far more buggy
releases. This applies not only to other Free and Open Source projects, but also
companies releasing proprietary projects.

I'll point out that the Agile development model is exactly this time-based
approach to software development.

The Linux Kernel development has been working this way for many years now, and
that is a perfect example of a case where new stuff is coming out all the time
that people want support for (systems, software features, plug-in devices, etc.)

IMHO, the thing that makes the kernel work so well is that they have an
extremely aggressive release schedule, a new kernel is released about every 2
months. This means that the differences between one release and the next are
'relatively' small (if you can call 10K separate changes small :-), and that if
there is any question of whether a fix is buggy or not, the answer is to not include
that fix in the release until it gets more testing "after all, it's only a two
month delay until the next release". Also, if a bug does get through, people can
revert to the prior version or implement a work-around to survive until the next
release.

David Lang

I suggested a few possible solutions earlier for the problem of new devices not having a stable release (and, more importantly, LuCI); frequent releases (not necessarily as frequent as each time a new device is added) would be one of them.

Unless time = a few months, I'd argue that there're better release criteria than "some time has passed", at least for LEDE.

I'm sure there're some features or things missing in OpenWrt/LEDE and that's why the development continues. Wasn't proper IPv6 support one of the things driving CC? So what's the next feature driving the next release of OpenWrt/LEDE? Whatever the project, unless it's feature complete, there's always a feature which can define the release criteria for the next milestone (or two). If why/what is asked enough times I'm sure we'd get to the underlying feature set and that would define the next milestone.

Wasn't having infrastructure in place to support first LEDE release one of the first goals of the fork? Wouldn't then having all infrastructure in place for the release warrant a first release? Is infrastructure in place now?

So you're continuing with generalizations... All attempts? So you're speaking on behalf of the whole IT industry and all companies? Granted, I don't know OSS sector that well (after many years working for the industry leading software vendor and startups I've worked for an OSS company for a few months before it went bust shortly after 9/11), but I'd bet that with Ubuntu and CentOS, even with the date-driven releases they still define the features they want to pack into next release and pursue those (so that the RC would include features which are critical bugs-free on the specific date). Outside of the OSS tho, all projects I've worked on were feature-driven. With one of them, we were releasing about every two weeks, supporting 2 dozen locales for millions of users. I wouldn't say it was bug-free, but I assure you, the product was no more buggy because of the feature-driven releases.

No, it isn't, it is the iteration-based with tiny iteration cycles approach. As each iteration covers a "feature", Agile is a granular feature-based approach to software development.

I'd be perfectly happy if there was a new LEDE release every 2 months. We would probably be on the pre-release track then, with distinctive URL for snapshots, where we have originally started. :wink:

Unless time = a few months, I'd argue that there're better release criteria than "some time has passed", at least for LEDE.

ok, what are they?

I'm sure there're some features or things missing in OpenWrt/LEDE and that's
why the development continues. Wasn't proper IPv6 support one of the things
driving CC? So what's the next feature driving the next release of
OpenWrt/LEDE? Whatever the project, unless it's feature complete, there's
always a feature which can define the release criteria for the next milestone
(or two). If why/what is asked enough times I'm sure we'd get to the
underlying feature set and that would define the next milestone.

actually, you have this backwards, IPv6 was included in CC because it was
hammered out in that timeframe, CC wasn't held up waiting for IPv6 support (and
a lot of the IPv6 development was done externally via projects like CeroWRT)

Wasn't having infrastructure in place to support first LEDE release one of the
first goals of the fork? Wouldn't then having all infrastructure in place for
the release warrant a first release? Is infrastructure in place now?

the infrastructure is not yet in place, and yes, getting it in place would be a
good time for the first release (since it's hard to have releases without it)

So you're continuing with generalizations... All attempts?

pretty much.

So you're speaking on behalf of the whole IT industry and all companies?

well, this is why Agile development is such a popular topic, because it works
better.

but I'd bet that with Ubuntu and CentOS, even with the date-driven releases
they still define the features they want to pack into next release and pursue
those (so that the RC would include features which are critical bugs-free on
the specific date).

nope,

CentOS includes whatever was in RedHat. That is their reason for existing. They
make no decisions about what is going to be in CentOS. Their release criteria is
that they have been able to remove the branding from RHEL and get it to compile.

Ubuntu does strict time-based releases, they take what's ready and release it.
They may try to have some things they work on ready in time, but if they aren't
ready, they don't delay the release to make them ready (Ok, I think they delayed
a release once, but it wasn't to finish a feature, it was to finish squashing
bugs and they got quite a black eye for the slippage)

No, it isn't, it is the iteration-based with tiny iteration cycles approach.
As each iteration covers a "feature", Agile is a granular feature-based
approach to software development.

no, one of the key things about properly running agile is that if a feature has
a problem and isn't going to be ready, that feature is supposed to be dropped
from that cycle and the release continues to happen. This shows that the
timeline is more critical than the feature list.

I'd be perfectly happy if there was a new LEDE release every 2 months. We
would probably be on the pre-release track then, with distinctive URL for
snapshots, where we have originally started. :wink:

release schedule for Linux distributions is a balance between multiple things

  1. having it be frequent enough to include new features/support
  2. how many versions the project needs to support for its users
  3. how old the oldest software is that the project is going to have to support
  4. how frequently users are going to be required to upgrade

the Linux kernel is the aggressive end of things, it has very frequent releases,
but it only supports the current release [1]. This means that there is no old
software for the kernel devs to have to support, but also that users have to
upgrade frequently or not get their support from the linux kernel developers

Fedora makes a release every 6 months, but only supports a release for 12
months, which means that users are required to upgrade every release, or go some
time running unsupported software (they can't upgrade every other release
because you can't do an upgrade immediately, you have to test and phase your
upgrades)

Ubuntu makes a release every 6 months, but with 18 months of support, so you can
upgrade every other release if you want. They also have a separate support track
where every fourth release is designated as a "Long Term Support" release and
supported for 5 years
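
To make the arithmetic behind these support windows concrete, here is a rough Python sketch. It takes the figures quoted above at face value (6-month cadence, 12 or 18 months of support, an LTS every fourth release supported for 5 years); the function name `slack_months` is made up for illustration and is not anything the distributions publish.

```python
def slack_months(release_interval, support_length, step):
    """Months of support left on your release when the one `step` releases later ships.

    All arguments are in months, except `step`, which counts releases.
    """
    return support_length - step * release_interval


# Figures as quoted in the post above (an assumption, not checked against
# the distributions' current policies).
print("Fedora, every release:      ", slack_months(6, 12, 1))
print("Fedora, every other release:", slack_months(6, 12, 2))
print("Ubuntu, every other release:", slack_months(6, 18, 2))
print("Ubuntu, LTS to LTS:         ", slack_months(6, 60, 4))
```

A result of zero means there is no time left to test and phase an upgrade before support runs out, which is the Fedora every-other-release situation described above.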

RedHat is the closest to what you are wanting to do, but it's becoming far more
time based, and it has a support nightmare (although it is getting people to pay
them for the support, so it's not really the same model)

We don't yet know what the intended LEDE release schedule and support length is.
We know that it is going to be time based, and that they have been unhappy with
how long the OpenWRT timeframe is.

We also don't know how many versions the developers will support.

David Lang

[1] well, almost. every once in a while some kernel developers volunteer to take
an older kernel and backport a subset of fixes to it, these are the so-called
'-stable' kernels

But that is what we've been working on all along: start a stable branch and automate the release infrastructure so that releasing is a matter of running two commands.

I disagree. Past experiences in OpenWrt have shown that defining arbitrary goals only leads to delays. Such a model only works once you have something comparable to a technical steering committee and a governance framework which makes defined goals binding. This simply isn't the case at the moment where people semi-randomly work on their own particular subsystems with only a very loose architectural oversight based on informal agreements.

The goal is to make point release intervals short and regular enough that there is no need to rush things to meet a deadline or to delay releases to wait for things.

No, the lack of a release was driving CC. But since its inception was so painful and took so long, the IPv6 support happened to be finished before.

I don't think there's any killer feature driving the progress of the entire project. Development currently is mainly driven by external contributions (device support patches, bug fixes) and polishing/extending/fixing existing sub systems as well as overall performance improvements.

I can't think of any suitable one. Here are some examples:

  • Device tree support for ar71xx? That is months to years away and would delay the release forever.
  • Kernel 4.9? That will invalidate all stabilization work invested into the tree so far.
  • New device? New devices are added one to two times per week so on which one should we wait and why only on that and not others?
  • Bug count below 50 ? Various issues in the tracker are eternal bugs which are unlikely to be fixed within a foreseeable time frame.

To return the question: what exactly would you define as release criteria?

So far you seem to worry about new devices first, but given that releases happen every two to three months, is there an actual problem? In the worst case, a new device is added shortly after a release, which would imply a maximum waiting time of two to three months until the next release branch is started. But even in this situation there is nothing preventing us from cherry-picking the device support patches into the current release branch and making a YY.MM.N+1 point release within a day.

The hardware is in place, and the release setup has been doing build testing for a few days.

I still do not understand the distinctive URL approach. Can you give a concrete mock up example using made up version numbers and explain exactly what you would link where in which phase?

I have hijacked part of this discussion to create a new topic called "Criteria for first LEDE stable release?" and moved it to the For Developers category.

On a related note: Can we please have an archive of snapshots, possibly dated, possibly with "latest" pointing to the last build?

It's incredibly annoying to install a snapshot, return to it a few hours later to install a package, and then notice that there has been a new build in the meantime and the kernel IDs don't match anymore.

After the releases have been built, it's just a matter of disk space, no?

I'm thinking about something like
/snapshots/20161213/
/snapshots/20161214/
/snapshots/20161215/
/snapshots/latest/ -- in this case redirecting to /snapshots/20161215/
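
For illustration only, here is a minimal Python sketch of how such a dated archive could be pruned. It assumes directory names in the YYYYMMDD form shown above and a made-up `keep` count; `plan_retention` is a hypothetical helper, not part of any existing LEDE tooling, and it only computes a plan without touching the disk.

```python
def plan_retention(snapshot_dirs, keep):
    """Decide which dated snapshot directories to keep and which to delete.

    Returns (kept, removed, latest). Names that are not 8 digits (such as
    'latest' itself) are ignored. YYYYMMDD names sort chronologically.
    """
    dated = sorted(n for n in snapshot_dirs if len(n) == 8 and n.isdigit())
    kept = dated[-keep:] if keep > 0 else []
    removed = dated[:-keep] if keep > 0 else dated
    latest = kept[-1] if kept else None  # what /snapshots/latest/ would point to
    return kept, removed, latest


dirs = ["20161213", "20161214", "20161215", "latest"]
kept, removed, latest = plan_retention(dirs, keep=2)
print("keep:  ", kept)
print("delete:", removed)
print("latest ->", latest)
```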

about 70G/snapshot per the earlier discussion.

how many snapshots do you want?

remember, there can be more than one snapshot per day.

David Lang
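
As a back-of-the-envelope check, the storage cost can be sketched in a few lines of Python. The 70 GB figure is the one quoted above; the builds-per-day and retention values are made-up inputs, and the sketch assumes every retained snapshot is a full copy with nothing shared between builds.

```python
SNAPSHOT_SIZE_GB = 70  # per-snapshot size quoted in the discussion


def archive_size_gb(builds_per_day, days_retained, snapshot_size_gb=SNAPSHOT_SIZE_GB):
    """Total space needed if every build in the retention window is a full copy."""
    return builds_per_day * days_retained * snapshot_size_gb


# Hypothetical retention windows, with one and two builds per day.
for days in (1, 3, 7):
    print(f"{days} day(s): {archive_size_gb(1, days)} GB at 1 build/day, "
          f"{archive_size_gb(2, days)} GB at 2 builds/day")
```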

So disk space is hardly the issue, at least on a root server you should have the whole (terabyte sized) hard disk available, no?

Again, it's just about the practicality of not having to fully furnish a snapshot installation at the exact time the snapshot has been downloaded. A few days should already be plenty, a week would be glorious.

Then date and time seems quite impractical. Maybe use the revision number for the subdirectory? I'm guessing a download would always go to "latest", there's little to no reason to point to a specific ephemeral snapshot build, so it's pretty much only a matter of opkg.conf pointing to the directory of its specific build.

This is not easily doable atm as targets are built randomly at random intervals. So the latest ar71xx/generic build may be dated 20161215 while the most recent x86/generic build would be 20161213.

I see. And dating/tagging the targets would become a bit messy I guess, especially with the packages outside the target directories. Humph.

Actually we just need to retain / fix the core packages directory holding the kmods. The non-kernel, non-target specific feed packages can remain at their current location.

You'd need something like:

  • snapshots/
    • r1234/
      • targets/
        • ar71xx/
          • generic/
            • packages/ (this is the package directory holding core packages like kmods or libc)
      • packages/ -> ../packages/ (this is where a versioned build would look for non-core packages)
    • r1235/
      • targets/
        • x86/
          • generic/
            • packages/ (this is the package directory holding core packages like kmods or libc)
      • packages/ -> ../packages/ (this is where a versioned build would look for non-core packages)
    • packages/ (the phase2 build artifacts which do not contain any kernel specific things)
      • mips_24kc/ (files used by ar71xx/generic)
      • i386_i486/ (files used by x86/generic)

This approach has the downside though that "latest" depends on the target and that not every target will be available as build at every revision.

I don't think this can be fixed using directory structure alone, it would need some external indexer script creating symlink trees or offering convenient HTTP redirects (e.g. something like http://get.lede-project.org/snapshot/ar71xx/generic -> 301 http://downloads.lede-project.org/snapshots/r1234/targets/ar71xx/generic)
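
A rough sketch of what the lookup inside such an indexer could look like, in Python. It assumes the per-revision layout from the tree above and an in-memory map of which targets were built at which revision; `latest_redirect` and the `builds` data are hypothetical, and a real script would have to scan the download tree and emit the actual 301 response.

```python
import re


def revision_number(name):
    """Return the integer in a directory name like 'r1234', or None."""
    match = re.fullmatch(r"r(\d+)", name)
    return int(match.group(1)) if match else None


def latest_redirect(builds, target, subtarget,
                    base="http://downloads.lede-project.org/snapshots"):
    """Return the URL of the newest revision that contains target/subtarget.

    `builds` maps a revision directory name to the set of
    (target, subtarget) pairs built at that revision.
    """
    candidates = [
        (revision_number(rev), rev)
        for rev, targets in builds.items()
        if revision_number(rev) is not None and (target, subtarget) in targets
    ]
    if not candidates:
        return None  # this target has never been built
    _, newest = max(candidates)
    return f"{base}/{newest}/targets/{target}/{subtarget}"


# Mirrors the example tree: each revision only holds the targets built then.
builds = {
    "r1234": {("ar71xx", "generic")},
    "r1235": {("x86", "generic")},
}
print(latest_redirect(builds, "ar71xx", "generic"))
print(latest_redirect(builds, "x86", "generic"))
```

Because the lookup is done per target, "latest" can resolve to r1234 for ar71xx/generic and r1235 for x86/generic at the same time, which is the downside mentioned above turned into a feature of the redirect.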

You forget that LEDE also relies on mirroring.

70 GB, especially of changing content, might not sound like a lot if you're dealing with a single system (even less with a home system, where spinning disk space is cheap), but once you need to deal with rented servers, it's getting more difficult.

On the one hand most hosters only have their fixed offers, so just adding larger or more disks simply isn't possible (or at least not for a reasonable price).

On the other hand each mirror run, even leaving the traffic aside, requires quite a lot of I/O and CPU load. Yes, you can try to offer more intelligent rsync scripts, provided with deeper knowledge of the mirror structure (like Debian tries with ftpsync), but getting mirror admins to accept your custom code is a hard sell (even more so keeping it updated). Not to forget that most mirrors don't just mirror one project but several different ones, so 70 GB times $x snapshots suddenly becomes a much larger burden.

Coming to the last question, how many snapshot backups would really be enough - one, two, a week's worth of snapshots? This would certainly help someone who just installed a snapshot, but got hit by a mirror refresh. But is this really a realistic scenario (although it definitely happens)?

Take a new(ish) user, who might install a snapshot, play around without installing packages (well, at least nothing more involved than installing luci) for a couple hours/ days, before starting to try installing more involved features (many of those probably needing kernel modules, like ipsec, netfilter, etc.). Keeping a fixed, but limited, amount of snapshots isn't that likely to help this type of user - only a fixed stable release would.

A more experienced user tends to have the target configuration in mind, chances are that it's more of an upgrade situation, rather than a session of install, test and experiment. Normally this kind of user, if they don't just build the firmware themselves (or use imagebuilder), only need opkg once, directly after flashing the firmware upgrade. Mirror syncs shouldn't be too painful for these users.

If it's not more than that, that's pretty trivial. I don't even think that redirecting /latest/* to a (cgi) script that is looking for the latest revision and redirecting to /revision/* would cause noteworthy load on the server. The "latest" directory is really only used to download the images, opkg would link to the respective revision directory, which is the point in the first place.

I did not forget about mirroring, but I don't see it as an issue. Everything that is newly built has to be transferred anyway, everything that is old and retained would never have to be re-transferred.

Really the only reason is that kernel-specific packages would be retained for a few days. And opkg does not (yet) pull from a CDN/round robin, it does not go to the mirrors, does it (CMIIW, really!)

I am that user. And yes, it happens. It's a problem actually. If you want to add anything kernel-related to your system even the next day, and that happens quite frequently to me, you are stuck with the decision between an incomplete system or a complete sysupgrade with all it entails.

With the additional bonus that you can step back one or two revisions if there's a showstopping problem with the current one.

I'm not trying to make it seem more dramatic than it actually is. If it can be fixed with some slight modifications to the file system, I'm all for it. If it's a hassle and way too complicated, eh, ignore my request.

Well, I'm going to eat my own hat. If that (the above) is the case then frequent (2, 3 months tops) scheduled releases look like the only sane way for LEDE. It would be great to have a page where you could compare the changes between any two releases, for people contemplating whether to refresh their router firmware or not.

Before we get back to snapshot URLs -- what would the release numbers look like, though? Assuming the first release is 17.01, what happens in April? Would it be 17.04 and so on, with the minor dot-releases like 17.01.01 reserved for when a critical issue (heartbleed bug) is fixed?

If that's the case I would suggest that snapshot URLs be /17.01/snapshots/ before the 17.01 release, switching to /17.04/snapshots/ after the 17.01 release. The /17.01/snapshots/ URL itself can be redirected to the 17.01 release and/or to a page saying you're trying to reach a resource for an obsolete snapshot. opkg repo requests can be redirected to the 17.01 release or might as well return 404, unless there's a reason why that's worse than a kernel conflict. If the releases are date-driven then we'll know far in advance what the next URLs are going to be.

The goal is to bring people from the obsolete snapshots (possibly installed because there was no stable release for their device or the feature they wanted wasn't released yet) to releases and to bring some clarity/fight the stale snapshot links in forums and on third-party web-sites.
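
To show how that URL scheme would behave over time, here is a small Python sketch. The YY.MM naming and the /YY.MM/snapshots/ prefix come from the proposal above; the planned release dates and the helper names are invented for the example, since no schedule has been announced.

```python
from datetime import date


def release_name(planned):
    """Format a planned release date as YY.MM, e.g. 2017-01-15 -> '17.01'."""
    return f"{planned.year % 100:02d}.{planned.month:02d}"


def snapshot_prefix(today, planned_releases):
    """Return the snapshot URL prefix for the next release that is not out yet."""
    for planned in sorted(planned_releases):
        if today < planned:
            return f"/{release_name(planned)}/snapshots/"
    return None  # nothing further is scheduled


# Hypothetical schedule: the real dates are not known.
schedule = [date(2017, 1, 15), date(2017, 4, 15)]
print(snapshot_prefix(date(2016, 12, 20), schedule))  # before the 17.01 release
print(snapshot_prefix(date(2017, 2, 1), schedule))    # after the 17.01 release
```

With date-driven releases the schedule list is known in advance, so the next prefix can be published (and old ones redirected) without waiting for the release itself.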

I don't think there's any killer feature driving the progress of the entire
project. Development currently is mainly driven by external contributions
(device support patches, bug fixes) and polishing/extending/fixing existing
sub systems as well as overall performance improvements.

Well, I'm going to eat my own hat. If that (the above) is the case then
frequent (2, 3 months tops) scheduled releases look like the only sane way
for LEDE. Would be great to have a page where you could compare the changes
between any two releases for people contemplating whether to refresh their
router firmware or not.

it's exactly this type of documentation that is why very frequent releases with
a small team are a problem :slight_smile:

it's a balance between:

  1. frequent releases are good for users who want the latest

  2. frequent releases are bad for users who just want their existing stuff to
    keep running and don't want to be bothered with an 'unneeded' upgrade

  3. applying fixes to multiple branches (so that people running old versions get
    'important' fixes) is expensive in manpower

  4. it takes manpower to do all the checks and final stabilization, that impacts
    the development of new features, support for new devices, bugfixes

  5. supporting multiple versions takes time, and can be extremely frustrating
    (spending many hours working to find a bug, only to track it down and realize it
    was fixed in a newer version..)

LEDE/OpenWRT are very small teams, they need to be careful about what they
commit to doing or they could cripple the project.

I'd love to see releases every few months, but if that means that only the
latest release is supported, and users are expected to upgrade their routers
every few months, that may be a bit much.

Before we get back to snapshot URLs -- what would the release numbers look like
though? Assuming the first release is 17.01, what happens in April? Would it be
17.04 and so on, with the minor dot-releases like 17.01.01 reserved for when a
critical issue (heartbleed bug) is fixed?

If that's the case I would suggest that snapshot URLs be /17.01/snapshots/
before the 17.01 release and switching to 17.04/snapshots/ after 17.01
release. The /17.01/snapshots/ itself can be redirected to 17.01 release
and/or redirected to page saying you're trying to reach resource for an
obsolete snapshot. opkg repo requests can be redirected to 17.01 release or
might as well return 404, unless there's a reason why it's worse than kernel
conflict. If the releases are date-driven then we'll know far in advance what
the next URLs are going to be.

The goal is to bring people from the obsolete snapshots (possibly installed
because there was no stable release for their device or the feature they
wanted wasn't released yet) to releases and to bring some clarity/fight the
stale snapshot links in forums and on third-party web-sites.

there are snapshots created as part of the process to finalize a release, or to
backport fixes to a release, which could reasonably be called 17.01/snapshot,
and snapshots of the current development, which are something completely
different.

Now, if LEDE/OpenWRT were to move to a very rapid release cycle like the kernel
uses, there may not end up being very many people who use the trunk/snapshot
development.

But I don't think the projects are ready to go that far yet.

David Lang

Really the only reason is that kernel-specific packages would be retained
for a few days. And opkg does not (yet) pull from a CDN/round robin, it does
not go to the mirrors, does it (CMIIW, really!)

I am that user. And yes, it happens. It's a problem actually. If you want to
add anything kernel-related to your system even the next day, and that
happens quite frequently to me, you are stuck with the decision between an
incomplete system or a complete sysupgrade with all it entails.

With the additional bonus that you can step back one or two revisions if
there's a showstopping problem with the current one.

I'm not trying to make it seem more dramatic than it actually is. If it can be
fixed with some slight modifications to the file system, I'm all for it. If
it's a hassle and way too complicated, eh, ignore my request.

I think the better answer to your need is that instead of pulling a random
snapshot, you really should be building your own images. That way you not only
have the ability to add packages, but you can build any arbitrary version of
things to work around bugs (and getting a user who has hit a bug to bisect
things to find out what patch caused the bug is INCREDIBLY valuable to the
developers, especially when hardware support is involved)

the daily/etc snapshots should not be viewed as something to install on a router
and run, but rather as a sort of smoke test, "does LEDE/OpenWRT have a chance of
doing what I want", if so, you should then build your own image to use.

If you aren't comfortable with the idea of building your own image, you should
use a release (and yes, that means that you shouldn't be using LEDE yet)

David Lang

David, I see your argument. Actually, I am completely comfortable with rolling my own releases although I prefer not to because I'm a lazy bum. In all seriousness though, self-built images have their own set of problems, but that's another discussion. I agree, anyone seriously considering the cutting edge development should be able and willing to build images himself.

Which is all the more an argument for my position. Exactly this user will run into a frustrating problem if his "smoke test" lasts longer than to the next build cycle. Which, obviously, is not predictable and can range anywhere between a few days and a few hours. We can't even say "take note, you only got x hours to test it" because he might not even have that.

Case in point: If I pulled and installed LEDE snapshot on one of my ar1xx routers when you wrote your previous post, by the time I wrote this reply, I already wouldn't be able to install kmod-sched-cake anymore.

I'm not saying it is a huge problem. I am saying if it can be remedied easily, with a few twists on the directory structure, it should be considered.

But above all, I agree with you completely: LEDE should have a release sooner rather than later.

How about having three snapshot directories

latest (changed whenever)

weekly (built once a week and stable for a week)

monthly (built once a month and stable for a month)

the fun with this is that you will sometimes get a weekly/monthly build that is
unusable for someone. I think the answer is to just let it be broken and tell
the person to try the next faster build or roll their own.

David Lang
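
A minimal Python sketch of how a build job might decide which of these three directories to refresh. The channel names come from the list above; treating "weekly" as 7 days and "monthly" as 30 days, and the `channels_to_refresh` helper itself, are assumptions made only for this example.

```python
from datetime import datetime, timedelta

# How long each channel stays frozen before it may be replaced (assumed values).
CHANNEL_MAX_AGE = {
    "latest": timedelta(0),        # changed whenever a build finishes
    "weekly": timedelta(days=7),   # stable for a week
    "monthly": timedelta(days=30), # stable for (roughly) a month
}


def channels_to_refresh(now, last_refreshed):
    """Return the channels whose contents are old enough to be replaced."""
    due = []
    for channel, max_age in CHANNEL_MAX_AGE.items():
        previous = last_refreshed.get(channel)
        if previous is None or now - previous >= max_age:
            due.append(channel)
    return due


last = {
    "latest": datetime(2016, 12, 15, 6, 0),
    "weekly": datetime(2016, 12, 10, 0, 0),
    "monthly": datetime(2016, 12, 1, 0, 0),
}
print(channels_to_refresh(datetime(2016, 12, 15, 12, 0), last))
print(channels_to_refresh(datetime(2016, 12, 17, 0, 0), last))
```

Note that the sketch refreshes a channel with whatever build is current at that moment, broken or not, which matches the "just let it be broken" policy suggested above.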