Missing kernel stack protector on x86_64/glibc is no longer necessary

Forgive the long post....

I noticed recently that my x86_64 glibc build (master) did not have kernel stack-smashing protection enabled. Since most distros ship with it enabled by default, I started wondering why. My Ubuntu 20.04 kernel has CONFIG_STACKPROTECTOR_STRONG enabled, so there's no technical reason why it can't be used.

Looking at config/Config-build.in, it turns out there is explicit code to disable it in the kernel when we're not using musl and the target is x86 (i386 or x86_64). Moreover, there is code to enforce the use of gcc's standalone libssp for userspace stack protection:

	choice
		prompt "User space Stack-Smashing Protection"
		depends on USE_MUSL
		default PKG_CC_STACKPROTECTOR_REGULAR
		help
		  Enable GCC Stack Smashing Protection (SSP) for userspace applications
		config PKG_CC_STACKPROTECTOR_NONE
			bool "None"
		config PKG_CC_STACKPROTECTOR_REGULAR
			bool "Regular"
			select GCC_LIBSSP if !USE_MUSL
			depends on KERNEL_CC_STACKPROTECTOR_REGULAR
		config PKG_CC_STACKPROTECTOR_STRONG
			bool "Strong"
			select GCC_LIBSSP if !USE_MUSL
			depends on KERNEL_CC_STACKPROTECTOR_STRONG
	endchoice

	choice
		prompt "Kernel space Stack-Smashing Protection"
		default KERNEL_CC_STACKPROTECTOR_REGULAR
		depends on USE_MUSL || !(x86_64 || i386)
		help
		  Enable GCC Stack-Smashing Protection (SSP) for the kernel
		config KERNEL_CC_STACKPROTECTOR_NONE
			bool "None"
		config KERNEL_CC_STACKPROTECTOR_REGULAR
			bool "Regular"
		config KERNEL_CC_STACKPROTECTOR_STRONG
			bool "Strong"
	endchoice

The commit messages that accompany this code are 5 years old and 2 years old respectively. A lot has changed since then.

commit bf82deff7069599c9f130f5bb0222acd171fd19d
Author: Felix Fietkau <nbd@openwrt.org>
Date:   Sun Aug 2 07:40:12 2015 +0000

    build: disable kernel stack protector support for i386/x86_64
    
    When stack protector support is disabled in libc (always the case for
    !musl), gcc assumes that it needs to use __stack_chk_guard for the stack
    canary.
    This causes kernel build errors, because the kernel is only set up to
    handle TLS stack canaries.
    
    Signed-off-by: Felix Fietkau <nbd@openwrt.org>
    
    SVN-Revision: 46543

commit 241e6dd3e92c4f215b8ac75379a4b5aeaeb92171
Author: Julien Dusser <julien.dusser@free.fr>
Date:   Sun Jan 7 18:47:21 2018 +0100

    build: cleanup SSP_SUPPORT configure option
    
    Configure variable SSP_SUPPORT is ambiguous for packages (tor, openssh,
    avahi, freeswitch). It means 'toolchain supporting SSP', but for toolchain
    and depends it means 'build gcc with libssp'.
    
    Musl no longer uses libssp (1877bc9d8f), it has internal support, so
    SSP_SUPPORT was disabled leading some package to not use SSP.
    
    No information why Glibc and uClibc use libssp, but they may also provide
    their own SSP support. uClibc used it own with commit 933b588e25 but it was
    reverted in f3cacb9e84 without details.
    
    Create an new configure GCC_LIBSSP and automatically enable SSP_SUPPORT
    if either USE_MUSL or GCC_LIBSSP.
    
    Signed-off-by: Julien Dusser <julien.dusser@free.fr>

So I started to modify the build system to see if I could get kernel stack protection enabled. However, removing the line depends on USE_MUSL || !(x86_64 || i386) didn't help. The kernel config itself disables the stack protection options at configure/compile time.

After a bit of digging, the reason for this is that the check in scripts/gcc-x86_64-has-stack-protector.sh in the kernel build directory fails. This script does the following check:

#!/bin/sh
# SPDX-License-Identifier: GPL-2.0

echo "int foo(void) { char X[200]; return 3; }" | $* -S -x c -c -m64 -O0 -mcmodel=kernel -fno-PIE -fstack-protector - -o - 2> /dev/null | grep -q "%gs"

Performing the same check on the gcc compiled with the openwrt toolchain results in the following code:

	.file	""
	.text
	.globl	foo
	.type	foo, @function
foo:
.LFB0:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register 6
	subq	$208, %rsp
	movq	__stack_chk_guard(%rip), %rax
	movq	%rax, -8(%rbp)
	xorl	%eax, %eax
	movl	$3, %eax
	movq	-8(%rbp), %rdx
	xorq	__stack_chk_guard(%rip), %rdx
	je	.L3
	call	__stack_chk_fail
.L3:
	leave
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE0:
	.size	foo, .-foo
	.ident	"GCC: (OpenWrt GCC 9.3.0 r13242+9-e04ff3c7cc) 9.3.0"
	.section	.note.GNU-stack,"",@progbits

whereas running the same check with my Ubuntu gcc produces the following output:

	.file	""
	.text
	.globl	foo
	.type	foo, @function
foo:
.LFB0:
	.cfi_startproc
	endbr64
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register 6
	subq	$208, %rsp
	movq	%gs:40, %rax
	movq	%rax, -8(%rbp)
	xorl	%eax, %eax
	movl	$3, %eax
	movq	-8(%rbp), %rdx
	xorq	%gs:40, %rdx
	je	.L3
	call	__stack_chk_fail
.L3:
	leave
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE0:
	.size	foo, .-foo
	.ident	"GCC: (Ubuntu 9.3.0-10ubuntu2) 9.3.0"
	.section	.note.GNU-stack,"",@progbits
	.section	.note.gnu.property,"a"
	.align 8
	.long	 1f - 0f
	.long	 4f - 1f
	.long	 5
0:
	.string	 "GNU"
1:
	.align 8
	.long	 0xc0000002
	.long	 3f - 2f
2:
	.long	 0x3
3:
	.align 8
4:

So it's clear why the check fails and also why it was disabled in the first place. The two compilers load the stack canary referred to in the commit above in different ways: the OpenWrt compiler emits movq __stack_chk_guard(%rip), %rax, while the host Ubuntu compiler emits movq %gs:40, %rax.

The kernel considers one of these "safe" and the other not (libssp is a userspace library, and its __stack_chk_guard symbol is not available in kernel space), so it disables stack protection.

The root cause, then, is that the OpenWrt compiler uses a different stack-smashing protection mechanism. A quick look at the configure options of the Ubuntu gcc shows that it is not built with libssp. So the key difference, and the one that causes the kernel build to error out on x86_64, is the set of configure options used for the toolchain gcc.
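To make the difference between the two listings easy to spot, here is a minimal shell sketch that classifies assembly output by where the canary is loaded from. The function name canary_style and its three labels are made up for illustration; only the two patterns it looks for (%gs, which is what the kernel's check greps for, and __stack_chk_guard) come from the listings above.

```shell
#!/bin/sh
# canary_style: read x86 assembly on stdin and report how the stack
# canary is accessed. Name and output labels are illustrative only.
canary_style() {
    asm=$(cat)
    if printf '%s\n' "$asm" | grep -q '%gs'; then
        # %gs-relative canary: the form the kernel's check accepts
        echo "segment"
    elif printf '%s\n' "$asm" | grep -q '__stack_chk_guard'; then
        # canary read from the global __stack_chk_guard symbol
        echo "global"
    else
        # no stack protector code found in the input
        echo "none"
    fi
}
```

Piping the assembly from the check's test compile into canary_style should report "global" for the first listing above and "segment" for the second.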

It also turns out that glibc now supports -fstack-protector in the libc code itself (as does musl). From the configure options for glibc 2.31, the current toolchain default, we can see that it does:

glibc compile options

‘--enable-stack-protector’
‘--enable-stack-protector=strong’
‘--enable-stack-protector=all’
Compile the C library and all other parts of the glibc package (including the threading and math libraries, NSS modules, and transliteration modules) using the GCC -fstack-protector, -fstack-protector-strong or -fstack-protector-all options to detect stack overruns. Only the dynamic linker and a small number of routines called directly from assembler are excluded from this protection.

So there is no reason to use libssp in openwrt. gcc's libssp is a separate, standalone implementation of stack protection used if the libc variant does not support ssp.

At configure time, gcc checks if the target libc implementation provides the __stack_chk_fail symbol. If the target libc does not have the symbol, then gcc will add -lssp_nonshared -lssp to the linker command when any stack protector compiler option is set. So libssp is only actually needed if the libc implementation does not have any stack protection functionality, which current versions of glibc do have.

So I modified the toolchain's glibc common.mk to add the following:
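A quick way to look at the same symbol by hand is sketched below. This is not gcc's configure test, just a small helper that reads a dynamic symbol listing (for example the output of nm -D run on the target libc) and reports whether __stack_chk_fail is defined there. The function name libc_provides_ssp is hypothetical.

```shell
#!/bin/sh
# libc_provides_ssp: read an nm-style symbol listing on stdin and
# succeed (exit status 0) if __stack_chk_fail is defined in it.
# The function name is illustrative, not part of any toolchain.
libc_provides_ssp() {
    # The last field is the symbol name (possibly with an @version
    # suffix); the field before it is the type letter. "U" marks an
    # undefined reference, which does not count as providing the symbol.
    awk '$NF ~ /^__stack_chk_fail(@|$)/ && $(NF-1) != "U" { found = 1 }
         END { exit !found }'
}
```

For example, something like nm -D /lib/libc.so.6 | libc_provides_ssp && echo "libc has its own SSP support" could be run on the target, assuming nm is available there.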

diff --git a/toolchain/glibc/common.mk b/toolchain/glibc/common.mk
index 768ff19060..b908afc50f 100644
--- a/toolchain/glibc/common.mk
+++ b/toolchain/glibc/common.mk
@@ -39,7 +39,6 @@ ifeq ($(ARCH),mips64)
   endif
 endif
 
-
 # -Os miscompiles w. 2.24 gcc5/gcc6
 # only -O2 tested by upstream changeset
 # "Optimize i386 syscall inlining for GCC 5"
@@ -61,6 +60,8 @@ GLIBC_CONFIGURE:= \
                --without-cvs \
                --enable-add-ons \
                --$(if $(CONFIG_SOFT_FLOAT),without,with)-fp \
+                 $(if $(CONFIG_PKG_CC_STACKPROTECTOR_REGULAR),--enable-stack-protector=yes,) \
+                 $(if $(CONFIG_PKG_CC_STACKPROTECTOR_STRONG),--enable-stack-protector=strong,) \
                --enable-kernel=4.14.0
 
 export libc_cv_ssp=no

and removed the dependencies on GCC_LIBSSP in Config-build.in, so that enabling userspace stack protection no longer forces the use of --enable-libssp. I then rebuilt the toolchain with --disable-libssp.

This has the desired result: the code produced by the OpenWrt compiler now uses the same %gs-based canary as the host compiler on my Ubuntu dev box:

	.file	""
	.text
	.globl	foo
	.type	foo, @function
foo:
.LFB0:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register 6
	subq	$208, %rsp
	movq	%gs:40, %rax
	movq	%rax, -8(%rbp)
	xorl	%eax, %eax
	movl	$3, %eax
	movq	-8(%rbp), %rdx
	xorq	%gs:40, %rdx
	je	.L3
	call	__stack_chk_fail
.L3:
	leave
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE0:
	.size	foo, .-foo
	.ident	"GCC: (OpenWrt GCC 9.3.0 r13242+9-e04ff3c7cc) 9.3.0"
	.section	.note.GNU-stack,"",@progbits

--disable-libssp just disables the build of the libssp library, but gcc still actually supports the -fstack-protector* set of options. glibc's configure detects that these options are supported and enables building the stack protector functions in glibc itself, selecting which type via configure options shown above.

Setting all the hardening options to on in menuconfig, I proceeded to do a full system build and install.

What do you know. I have a fully hardened openwrt_x86_64_glibc variant. It boots and runs just fine.

Runtime checks show that the stack protector features are indeed enabled. I wrote a small two-line program, compiled with default CFLAGS, that does a gets() into a 10-byte buffer to check the userspace stack protection. I also verified the presence of the kernel stack protection (which now matches that of my Ubuntu 20.04 system) via /proc/config.gz. Output below:

root@openwrt:~# uname -a
Linux openwrt 5.4.41 #0 SMP Thu May 14 21:12:59 2020 x86_64 GNU/Linux

root@openwrt:~# cat /etc/openwrt_release                  
DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='SNAPSHOT'
DISTRIB_REVISION='r13242+9-e04ff3c7cc'
DISTRIB_TARGET='x86/64'
DISTRIB_ARCH='x86_64'
DISTRIB_DESCRIPTION='OpenWrt SNAPSHOT r13242+9-e04ff3c7cc'
DISTRIB_TAINTS='no-all glibc busybox'

root@openwrt:~# zcat /proc/config.gz | grep STACKPROTECTOR
CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
CONFIG_HAVE_STACKPROTECTOR=y
CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
CONFIG_STACKPROTECTOR=y
CONFIG_STACKPROTECTOR_STRONG=y

root@openwrt:~# check-stack-protector
hjkalsdhssaldhjlsadh0o247uu032u4231pjkl;s
*** stack smashing detected ***: terminated
Aborted

root@openwrt:~# ls /lib/libc*
/lib/libc-2.31.so  /lib/libc.so.6  /lib/libcrypt-2.31.so  /lib/libcrypt.so.1
root@openwrt:~#
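For anyone who wants to repeat the kernel-side check on their own build, here is a minimal sketch that boils the grep above down to a single word. The function name kernel_ssp_level and its output labels are made up for illustration; the two config symbols it looks for are the ones shown in the output above.

```shell
#!/bin/sh
# kernel_ssp_level: read a kernel config on stdin and print the
# stack protector level it selects. Name and labels are illustrative.
kernel_ssp_level() {
    cfg=$(cat)
    if printf '%s\n' "$cfg" | grep -q '^CONFIG_STACKPROTECTOR_STRONG=y$'; then
        echo "strong"
    elif printf '%s\n' "$cfg" | grep -q '^CONFIG_STACKPROTECTOR=y$'; then
        echo "regular"
    else
        # symbol unset, commented out, or missing entirely
        echo "none"
    fi
}
```

On a system that exposes /proc/config.gz, zcat /proc/config.gz | kernel_ssp_level prints "strong" for the config shown above.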

So, I'd be happy to submit a PR, however this is a change that affects the toolchain, the kernel and all userspace programs, so while the changes to the build system itself are relatively trivial, the implications are not. And so it merits some discussion first. Comments solicited.

In this day of default hardening, and especially in a network-exposed appliance, is there any reason that an x86_64 build should be running with no kernel stack protection and with gcc's libssp instead of the built-in SSP in glibc 2.31?

You will reach a broader audience if you post this to the openwrt-devel mailing list.

https://lists.openwrt.org/mailman/listinfo/openwrt-devel

Posted to the list already. Thanks @tmomas