These operate differently: software offloading still uses the Linux networking stack, just "less" of it (or more rarely), while hardware offload happens completely hidden from the network stack. So I would assume that with hardware offload active, software offloading will not help anymore, but that should be easy to try.
Sidenote: most hardware offloads are considerably less general than Linux's network stack. The result is that as long as you stay within their capability envelope, they will do what you show: high throughput with little CPU load. But they can be a bit fickle, and even a small change in configuration might result in worse performance. That said, accelerators are typically tailored to typical use-cases, so they will likely work well for most users.
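
If you want to try the comparison, here is a rough sketch (Python, assuming it can run on the router or on any Linux box doing the forwarding) that measures how busy the CPU is while a throughput test runs. It reads the aggregate `cpu` line of `/proc/stat` twice and reports the busy share and the softirq share, since packet forwarding in the Linux stack mostly shows up as softirq time. The function names are my own, not part of any tool.

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of jiffy counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def cpu_shares(before, after):
    """Return (busy, softirq) as fractions of total CPU time between two samples."""
    delta = [a - b for a, b in zip(after, before)]
    # Columns: user nice system idle iowait irq softirq steal [guest guest_nice]
    # Guest time is already counted in user/nice, so only sum the first eight.
    total = sum(delta[:8])
    if total <= 0:
        return 0.0, 0.0
    idle = delta[3] + delta[4]  # idle + iowait
    softirq = delta[6]
    return (total - idle) / total, softirq / total


def read_sample(path="/proc/stat"):
    """Read the current aggregate CPU counters."""
    with open(path) as f:
        return parse_cpu_line(f.readline())


def measure(seconds=10.0, path="/proc/stat"):
    """Sample CPU counters, wait, sample again; run the speed test meanwhile."""
    before = read_sample(path)
    time.sleep(seconds)
    after = read_sample(path)
    return cpu_shares(before, after)
```

Call `measure(30)` while a speed test is running, once for each setting (no offload, software offload, hardware offload, both), and compare the numbers together with the throughput you get. If hardware offload really bypasses the stack, I would expect the softirq share to be small and to stay about the same whether software offloading is enabled or not, but that is exactly the assumption to check.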