Using git repo as source doesn't retain .git in file tree after packing

svet · January 19, 2022, 5:26pm

As part of some work to port Rust software to OpenWrt (more details in the thread "Rust-lang (rustc/cargo) for OpenWrt - testing needed", post #64 by dpape), I'm encountering the following issue:

In my Makefile (example) the source is set to a git repo. The repo as well as submodules are checked out just fine, to a temporary directory (./tmp/dl/rust-1.58.0). Then they go through the Packing checkout... step, which creates the archive ./dl/rust-1.58.0.tar.xz. That archive is then in turn unpacked to ./build_dir/hostpkg/rust-1.58.0/, where the build process takes place.

However, along the way the .git subdirectory gets lost. It exists (of course) in the temporary directory where the git tree is checked out, but is not packed into the .tar.xz archive which is actually used for the build. Is this by design? Is it possible to have .git be retained?

The Rust build process assumes that it is run from a checked-out git repo, and for example runs the git command in order to get some details, like commit hashes. But since .git appears to get thrown out during the Packing checkout... step, the above fails. What options do I have?

rpavlik · January 20, 2022, 3:03am

What packagers for other distros usually do (in general, not sure about Rust specifically) is patch out those git calls and instead supply the known data. You could try looking at the Debian packaging for Rust; it's likely to have solved this problem in a useful way.
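
A rough sketch of what "supply the known data" could look like, written as a small shell helper rather than as an actual patch to the Rust build scripts. The variable name RUST_COMMIT_HASH and the function name get_commit_hash are made up for illustration; the real build scripts would need their own equivalent.

```shell
# Hypothetical helper: prefer a hash supplied by the packager,
# fall back to git only when a repository is actually available.
# RUST_COMMIT_HASH is an assumed name, not something Rust reads by default.
get_commit_hash() {
    if [ -n "${RUST_COMMIT_HASH:-}" ]; then
        # Value handed in from outside, e.g. by the package Makefile.
        printf '%s\n' "$RUST_COMMIT_HASH"
    elif rev=$(git rev-parse HEAD 2>/dev/null); then
        # Running inside a real checkout, so asking git still works.
        printf '%s\n' "$rev"
    else
        # No .git and nothing supplied: report a placeholder instead of failing.
        printf 'unknown\n'
    fi
}

RUST_COMMIT_HASH=abc123
get_commit_hash   # prints abc123
```

The point of the fallback order is that the same script keeps working in a normal git checkout, while a build from an unpacked tarball only needs the variable to be set.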

Grommish · January 20, 2022, 3:10am

From $TOPDIR/include/download.mk:

# Only intends to be called as a submethod from other DownloadMethod
define DownloadMethod/rawgit
        echo "Checking out files from the git repository..."; \
        mkdir -p $(TMP_DIR)/dl && \
        cd $(TMP_DIR)/dl && \
        rm -rf $(SUBDIR) && \
        [ \! -d $(SUBDIR) ] && \
        git clone $(OPTS) $(URL) $(SUBDIR) && \
        (cd $(SUBDIR) && git checkout $(VERSION) && \
        git submodule update --init --recursive) && \
        echo "Packing checkout..." && \
        export TAR_TIMESTAMP=`cd $(SUBDIR) && git log -1 --format='@%ct'` && \
        rm -rf $(SUBDIR)/.git && \
        $(call dl_tar_pack,$(TMP_DIR)/dl/$(FILE),$(SUBDIR)) && \
        mv $(TMP_DIR)/dl/$(FILE) $(DL_DIR)/ && \
        rm -rf $(SUBDIR);
endef
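
To see the effect of the rm -rf $(SUBDIR)/.git line in isolation, here is a small sketch using a stand-in directory instead of a real clone. The directory and file names (pkg-1.0, src/main.rs) are invented for the example, and plain tar is used in place of dl_tar_pack.

```shell
# Build a throwaway tree that looks like a checkout (fake .git, one source file).
work=$(mktemp -d)
mkdir -p "$work/pkg-1.0/.git" "$work/pkg-1.0/src"
echo 'ref: refs/heads/master' > "$work/pkg-1.0/.git/HEAD"
echo 'fn main() {}' > "$work/pkg-1.0/src/main.rs"

# Same order as in rawgit: drop .git first, then pack whatever is left.
rm -rf "$work/pkg-1.0/.git"
tar -C "$work" -cf "$work/pkg-1.0.tar" pkg-1.0

# List the archive contents: the source file is there, .git is not.
listing=$(tar -tf "$work/pkg-1.0.tar")
printf '%s\n' "$listing"
```

Anything that needs git metadata therefore has to be captured before the rm step, the way TAR_TIMESTAMP is taken from git log just above it.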

svet · January 20, 2022, 10:25am

Thanks both! In particular thanks @Grommish for pointing out where it is that the .git gets removed. I'm still wrapping my head around the different inter-dependencies (i.e. magic) of the OpenWrt build process.

On balance, would you recommend that I:
A) Patch the Rust build script to hard-code (or similar) the commit hashes that it would normally obtain via calls to git;
OR
B) Override the DownloadMethod/rawgit from download.mk and remove the rm -rf $(SUBDIR)/.git line (not even sure how I would override that but maybe there's a way?)

(there are some in-between options also, but the above seem like the main ones)

By the way, I briefly looked at the Rust packaging CI, but I think what I'm describing is a non-issue in that case. I.e. there's nothing in that process (which appears to be run as a Github Action) that would strip out the .git subdir in the first place, the way that the OpenWrt build does it.

slh · January 20, 2022, 12:20pm

.git is removed intentionally, for many reasons: size, source tarball hygiene, license compliance (only the current state has to be considered, not potential past indiscretions), the fact that all source tarballs need to be retained on OpenWrt's archive mirror (GPL compliance), and more. Removing that step is not gonna fly. Therefore the only solution is the one outlined by @rpavlik.

svet · January 20, 2022, 1:38pm

Thanks @slh that's clear.

Also to be specific, I was just talking about the process for building the Rust toolchain (host package) locally - this would in principle not result in a tarball that would be published elsewhere, or that would be part of the compiled firmware. But I'm also coming to understand the point that there needs to be equivalence between this "locally-produced" source tarball and one that's potentially - down the line - hosted in an OpenWrt repo. And I agree that there are good reasons to strip .git out in general.

I'll see how to make the patching approach work.

svet · January 20, 2022, 6:09pm

I've patched the build script and it's fine - but so far the commit hash is hard-coded in the patch file, which isn't ideal. I'd like for the hash to be set in the Makefile, and then passed into the build script as an environment variable.

However I've come across a total newbie problem, which is that it's not clear to me how to set environment variables for the build. I naively assumed that variables set in the Makefile are present in the build environment, but that doesn't appear to be the case. I've not been able to easily find anything on this in the documentation or this forum. Any pointers?

Grommish · January 20, 2022, 6:21pm

CONFIGURE_VARS or TARGET_CONFIGURE_OPTS is probably what you are looking for. Although, if you can find a way to add a bin path to the fakeroot $PATH, let me know.
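
For background on why a variable set in the Makefile does not show up in the build by itself: a child process only sees variables that are exported or placed as VAR=value in front of the command. The sketch below shows this in plain shell. It assumes that variables like CONFIGURE_VARS end up as such VAR=value prefixes on the configure command, which is worth verifying in the OpenWrt include/ makefiles; RUST_COMMIT_HASH is again a made-up name.

```shell
# RUST_COMMIT_HASH is a hypothetical name used only for this illustration.
# The child shell prints the variable, or "unset" if it cannot see it.
show='printf "%s" "${RUST_COMMIT_HASH:-unset}"'

unset RUST_COMMIT_HASH
RUST_COMMIT_HASH=abc123                  # plain variable, not exported
plain=$(sh -c "$show")                   # the child process does not inherit it

unset RUST_COMMIT_HASH
prefixed=$(RUST_COMMIT_HASH=abc123 sh -c "$show")   # VAR=value prefix: inherited

echo "plain:    $plain"
echo "prefixed: $prefixed"
```

The first call reports unset and the second reports abc123, which matches the behaviour described in the question: setting the value is not enough, it has to be handed to the command that runs the build script.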