"opkg list" with 10MB free memory gives "out of memory"

So, what do we have now?

opkg update works if you remove /tmp/opkg-* (or just don't update lists if you already updated them once after the last reboot).

opkg list does not work, BUT somehow it works with LuCI. Maybe LuCI does it "with its own scripts" too?

Are you sure that zcat ... | sort ... | grep ... | sed ..., and then routing the sorted output back to itself, would take more than 10M? (I don't know how to check this.)

(If the only problem is sorting, then outputting it through sort would be enough; but I suspect that even this overkill with four separate programs, which should certainly be enough for anything, still would not take that much RAM.)

(Actually, yes, if all those lists only take less than 1M, then even storing only their names for sorting would not take 10 times more. Even if there is some serious 10x compression involved.)
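
One rough way to check this on the device itself is to watch MemFree in /proc/meminfo while the pipeline runs. The sketch below is only an approximation under stated assumptions: mem_drop and mem_free_kb are hypothetical helper names, the numbers are system-wide (so anything else running at the same time skews them), and it only samples as fast as the loop can spin.

```shell
#!/bin/sh
# Hypothetical helpers, not part of opkg or OpenWrt.

# Current MemFree value in kB, as reported by the kernel.
mem_free_kb() {
    awk '/^MemFree:/ {print $2}' /proc/meminfo
}

# mem_drop CMD...: run CMD in the background and print, in kB, the largest
# drop in MemFree that was observed while it was running.
mem_drop() {
    start=$(mem_free_kb)
    low=$start
    "$@" >/dev/null 2>&1 &
    pid=$!
    # A running process has a VmRSS line in its status file; an exited
    # (or zombie) one does not, which ends the polling loop.
    while grep -q '^VmRSS:' "/proc/$pid/status" 2>/dev/null; do
        cur=$(mem_free_kb)
        if [ "$cur" -lt "$low" ]; then low=$cur; fi
    done
    wait "$pid"
    echo $((start - low))
}
```

For example, mem_drop sh -c 'find /tmp/opkg-lists -type f | xargs -r gzip -cd | sort' prints a rough upper estimate, in kB, of what that pipeline cost while it ran.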

Seems correct, at least for 19.07 and snapshot. I didn't find how it does this in 18.06. See this file: it's the script that LuCI uses to control opkg in master/snapshot (and it is much clearer) https://github.com/openwrt/luci/blob/master/applications/luci-app-opkg/root/usr/libexec/opkg-call#L14
This is the version of the "opkg control commands file" used in 19.07, with the same line highlighted (it is written like that because it is shell commands embedded in Lua code, which looks like garbage, hence why it was split out later) https://github.com/openwrt/luci/blob/openwrt-19.07/applications/luci-app-opkg/luasrc/controller/opkg.lua#L33
As seen in the highlighted line, the LuCI web interface gets the list of packages by reading the package lists directly with

find "${lists_dir:-/tmp/opkg-lists}" -type f '!' -name '*.sig' | xargs -r gzip -cd

This works fine even when called manually from the SSH command line, after you have run "opkg update" to actually download the package lists.

It will just print the contents of a compressed package manifest on the console, then print another, and another, until it has printed all of them. I assume other logic in LuCI filters and sorts all that text and decides what to show in the actual interface, but this is all written in Lua and I don't understand it that much.
But since the list is shown across multiple pages in the LuCI interface, and moving from one page to another takes time, I think this sorting is only partial (it only needs to know what to show on the current page, not everything).

No, what I said is to just not sort them globally at all, as any global sort would need to load all descriptions in RAM, or be very inefficient (i.e. sort the package names first, then fetch each package description when it is time to print them on console).

Each repository's package list is already sorted alphabetically; my script would just read it and filter out anything that isn't the package name and description.

The result would be something like

core repository packages
-package1: description
-package2: description
-package3: description

luci repository packages 
-package10: description
-package11: description
-package12: description

and so on.
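
A minimal sketch of that idea, assuming the lists are gzip-compressed files in /tmp/opkg-lists named after their repository (as on 19.07) and that each entry has "Package:" and "Description:" fields. pkg_list_by_repo is a hypothetical name, and only the first line of each description is printed.

```shell
#!/bin/sh
# Hypothetical helper: print "name: description" per repository, with no
# global sort, so only one line at a time has to be held in memory.
pkg_list_by_repo() {
    dir=${lists_dir:-/tmp/opkg-lists}
    for list in "$dir"/*; do
        [ -f "$list" ] || continue
        case "$list" in *.sig) continue ;; esac   # skip signature files
        echo "${list##*/} repository packages"
        gzip -cd "$list" | awk '
            /^Package: /     { name = substr($0, 10) }
            /^Description: / { print "-" name ": " substr($0, 14) }
        '
        echo
    done
}
```

Since nothing is sorted, the order inside each group is whatever order the repository's own list uses.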

I mean, I don't think loading all descriptions in RAM would take too much RAM. Certainly no more than the whole less-than-megabyte those lists occupy. And even that is already better than what we have now. So if opkg uses more RAM than it would take to get the whole list and sort it with sort, then it's a "bad design" problem (or just an implementation bug), and not a technical limitation due to some complex graphs, and could be rewritten to do just that.

As far as I understand, the problem with opkg update is that for some reason it also does more than it should - it seems to look into existing lists for no reason. And I suppose those two problems actually have a similar source, because in 18.06, the error message was more specific (I don't remember exactly), and it drove me to suspect that. So I've commented in that bug report, adding the "opkg list problem" too, as it seems to be connected.

I don't think they would include a separate script as an official solution, but you could post it somewhere (maybe in that bug report) for those who want it (it's just a script, you don't need it to be in the repos, users can just copy it from the page).

I've tried to look for the actual opkg source, but... I couldn't find it. %) I'll probably be unable to even read it, but I'm curious anyway. Do you know where the source for opkg is?

The package makefile tells where the source is, see https://github.com/openwrt/openwrt/blob/master/package/system/opkg/Makefile#L16

It's not on GitHub (which mirrors only the build system and the packages); it's on the actual OpenWrt git server, in its own repo https://git.openwrt.org/?p=project/opkg-lede.git;a=summary
Click on "tree" to show a "files" view, and then you can navigate the source folder. Or git clone it locally with the URL listed on that page.

I mean, I don't think loading all descriptions in RAM would take too much RAM. Certainly no more than the whole less-than-megabyte those lists occupy.

Eeeh, it's borderline: the package lists are compressed with gzip. Uncompressed, that 1MB becomes around 3MB.
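
To see the actual numbers on a given device, something like the following sketch can be used. lists_size is a hypothetical name, and the sketch assumes every non-.sig file in the lists directory is gzip-compressed.

```shell
#!/bin/sh
# Hypothetical helper: total size of the package lists as stored
# (compressed) and as they would be once unpacked.
lists_size() {
    dir=${lists_dir:-/tmp/opkg-lists}
    packed=$(( $(find "$dir" -type f '!' -name '*.sig' | xargs -r cat | wc -c) ))
    unpacked=$(( $(find "$dir" -type f '!' -name '*.sig' | xargs -r gzip -cd | wc -c) ))
    echo "compressed: $packed bytes, uncompressed: $unpacked bytes"
}
```

Nothing is written to disk or kept in RAM here; the unpacked data is streamed straight into wc.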

Actually, if we could rewrite most of opkg's functionality with bash (even maybe calling opkg for some complex functions that actually work, like installing and removing packages), then I would suggest proposing a new "apkg" package that would do the same, but actually work.

I took a look at the sources. The interesting parts are:

  • In opkg.c , there are opkg_update_package_lists() and opkg_list_packages() . They do some things with hashes and sorting, but as far as I understand, those functions are "public API", and they are not called from anywhere in opkg itself. Instead, opkg does the same things a little differently in another file:

  • opkg_cmd.c - here we can see what opkg does for "update" (opkg_update_cmd()) and "list" (opkg_list_find_cmd()). All "list" has is sorting, which could probably be rewritten and optimized for memory usage (or piped through sort?). "Update" does not output anything before it's killed, and it only fails when the lists are already downloaded, so maybe it is worth poking at the section where it looks at directories, before the first output. (I'm not a C programmer, so I can barely get an idea of what it's doing, but I don't see anything suspicious.)

Maybe it's already fixed in master?.. I've already shipped my router, but let's leave all this here for anyone who stumbles upon this thread later.

That's entirely possible, but I'm really not interested enough to do it myself, as it would take a while to get something good enough, and even longer to convince the OpenWrt developers that it is a good idea to have this script in the default images for low-RAM devices (because if it is not in the default images, it's kind of a catch-22 situation). That's more effort than I can put into something like this.

I have been building images with all packages included for a long while now, either with the Image Builder or by compiling them from source.

So for my devices opkg runs only in the build system (to assemble the firmware image), never on the device.
This is not the first time opkg has blown up because it wastes RAM; jo-philipp has had to optimize it many times over the years as the available space shrinks and shrinks.

I'm happy enough with the current resolution: don't update more than once, use LuCI for the lists. (And this is too much for me to take on, too.) But if somebody is interested in fixing this, there are two things to try: either try fixing opkg, or create a wrapper for it. (It does not have to be in the default image; initial opkg update works fine with less than 2M RAM available, and opkg install does, too.)
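
The "don't update more than once" part can be turned into a tiny wrapper. This is only a sketch: opkg_update_once is a hypothetical name, and it assumes that any file under /tmp/opkg-lists means the lists were already downloaded since the last reboot.

```shell
#!/bin/sh
# Hypothetical wrapper: run "opkg update" only when no package lists have
# been downloaded yet, since a second update is what runs out of memory.
opkg_update_once() {
    dir=${lists_dir:-/tmp/opkg-lists}
    if [ -n "$(find "$dir" -type f 2>/dev/null | head -n 1)" ]; then
        echo "package lists already present in $dir, skipping update"
    else
        opkg update
    fi
}
```

To force a refresh anyway, remove the lists directory first and then call the function again.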

Oops. On 18.06.8, opkg list does not work (but opkg update does work multiple times, at least on some devices), and LuCI does not display the package list either! So I had to find a solution.

Here is a basic script that implements the following commands:

  • list - lists all packages' names
  • search - shows all packages with <package> in their names
  • show - shows detailed information about the package

This script only takes a few hundred kilobytes to run, and only a few seconds.

You can save this as a file, or simply copy and paste the desired function into your ssh shell and run that function by name (for example, pkg_show wget). I've made functions very small and easy to copy.

#!/bin/sh

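# Print the "Package: <name>" line of every package in the downloaded lists, sorted.
pkg_list() {
    if [ -n "$1" ]; then echo "ERROR: Too many arguments" >&2; return 1; fi
    find "${lists_dir:-/tmp/opkg-lists}" -type f '!' -name '*.sig' | \
        xargs -r gzip -cd | grep "^Package: " | sort
}

# Like pkg_list, but keep only packages whose name matches the regex $1.
# The pattern is anchored after the field name so that it cannot match "Package: " itself.
pkg_search() {
    if [ -z "$1" ]; then echo "ERROR: Not enough arguments" >&2; return 1; fi
    find "${lists_dir:-/tmp/opkg-lists}" -type f '!' -name '*.sig' | \
        xargs -r gzip -cd | grep "^Package: .*$1" | sort
}

# Print the whole entry (up to the next blank line) of the package named exactly $1.
pkg_show() {
    if [ -z "$1" ]; then echo "ERROR: Not enough arguments" >&2; return 1; fi
    find "${lists_dir:-/tmp/opkg-lists}" -type f '!' -name '*.sig' | \
        xargs -r gzip -cd | sed -n "/^Package: $1$/,/^$/p"
}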

case "$1" in
    list)
        shift
        pkg_list "$@"
        ;;
    search)
        shift
        pkg_search "$@"
        ;;
    show)
        shift
        pkg_show "$@"
        ;;
    *)
        echo "USAGE:"
        echo "$0 list"
        echo "$0 search <package>"
        echo "$0 show <package>"
        ;;
esac

This is basically ALL opkg should do for those commands. And I wrote this in an hour of googling, not being very familiar with bash or sed. It could use a few tweaks to extend its functionality (like matching several packages at once, which I was too lazy to google), but I can't imagine how hard you would have to try to make this require 10M of RAM. %)
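
As for matching several packages at once, one possible approach is to let awk do the matching in a single pass over the lists. This is a sketch, not a tested replacement for "opkg info": pkg_show_many is a hypothetical name, and it assumes package names contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: show the full entries of several packages in one pass.
pkg_show_many() {
    if [ $# -eq 0 ]; then echo "ERROR: Not enough arguments" >&2; return 1; fi
    names="$*"
    find "${lists_dir:-/tmp/opkg-lists}" -type f '!' -name '*.sig' | \
        xargs -r gzip -cd | \
        awk -v names="$names" '
            # Build a lookup table of the requested package names.
            BEGIN { n = split(names, a, " "); for (i = 1; i <= n; i++) want[a[i]] = 1 }
            # Each "Package:" line decides whether the entry that follows is printed.
            /^Package: / { show = (substr($0, 10) in want) }
            show
        '
}
```

For example, pkg_show_many wget curl prints both entries while still reading every list only once.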

According to the bug report ( https://bugs.openwrt.org/index.php?do=details&task_id=2734 ), "Opkg was already heavily patched and modified to use less RAM, even introducing a rather slow multi-pass list parsing algorithm", so they "don't see a solution for this problem", and so apparently we shouldn't wait for a fix.

UPDATE: regarding other opkg commands. Not only does opkg update fail to simply call wget to download the packages if they are already downloaded (which would also give us the benefit of using HTTPS - see https://arstechnica.com/information-technology/2020/03/openwrt-is-vulnerable-to-attacks-that-execute-malicious-code/ ), but opkg install also fails to check their checksums (see https://blog.forallsecure.com/uncovering-openwrt-remote-code-execution-cve-2020-7982 ).

The vulnerability in that article is patched. What's your point?

I mean that it seems like rewriting opkg in bash would be an improvement, and a huge one at that, because apparently this is not the only case where opkg shines with its inability to do the most trivial things like parsing text properly while already being "heavily patched to use less RAM".

While trying to figure out the problem, all I've seen was mostly a lot of answers like "there is a reason for that, those lists are huge, they probably don't fit in the memory!". Now I've actually confirmed that those tasks are done in three shell lines, and use way less memory while invoking several separate programs, so this is definitely possible. Either opkg does something really poorly, or it does something it's not supposed to do; so don't believe those "there must be a reason" assumptions.

UPDATE: I don't mean to say opkg is bad all the way through - the really complex stuff actually works well. I'm talking specifically about "the most trivial things", literally parsing text, which it seems to have a whole history of problems with. Shell makes that part very easy, so I'm wondering why they didn't use shell for it. And then they say that this is an unsolvable problem!..

Good job!

You might want to send a PR on GitHub, or at least an email to the OpenWrt devel mailing list, to discuss this solution with the actual developers (@jow mostly).

I suspect that this is not a case of "impossible" but of "it's a huge PITA to do text manipulation in C, so nobody wants to do it".

The vulnerability in that article is patched. What's your point?

That if it were a shell script using tools that are already on board, it would likely not have had those issues. As it is, opkg is a very "hot" piece of code that nobody wants to touch or even look at. That's usually a bad thing.
Shell scripts in OpenWrt are commonly checked, run through the ShellCheck tool, and reworked by many people.

I don't think this is PR-worthy - it's just a proof of concept, and there is nowhere to PR it, since it's not a patch for some existing package. This thread is referenced in the bug discussion, so whoever needs it will find it. If someone ever actually wants to implement more of the existing functionality with bash wrappers, they should probably create a new package and propose its inclusion in the OpenWrt project.

And I don't think this would interest the original opkg developers; augmenting an existing C-based package with bash wrappers is not a good idea. I think either someone would create a new package, which is how those things are supposed to work, or the original developers will finally believe that this problem is either a bug or bad design, and not technical limitations, and just fix opkg itself. I was just trying to prove that this is possible and not very hard. :slight_smile:

There is now a new version of opkg, 2020-05-07, that should decrease the memory usage. It is available both in master and in 19.07.

Changes in it:
https://git.openwrt.org/?p=project/opkg-lede.git;a=shortlog

Instead of building a complete package dependency tree internally, use a
lightweight list data structure to gather and sort package names, version and
descriptions.

This reduced the "opkg list" memory footprint on x86/64 from ~8MB to ~1.5MB.

...So they really did build a complete package dependency graph just to show names and descriptions! %)

Thanks for the good news! No need for those scripts anymore.

I suspect that the same thing could be applied to opkg update - maybe it also builds a complete dependency graph of already downloaded packages just to delete them and download them again. %)

I have an rt305x with 32MB of RAM, and I see similar behaviour: the "free" RAM is around 6MB before opkg update, and 8MB after it. At this stage opkg list returns OOM. But if I run opkg update again, then it works, and so does opkg list. This happens with the public 18.06.8 and with an image I built trying to save 1MB of RAM (flash is not a concern on my 8MB device); I observe the same for both the CLI and LuCI.

My opkg is 2020-01-25-c09fe209-1. Is that the one mentioned in previous posts? (opkg upgrade opkg leaves it the same.)

Yes, it's the same version as mine.