IPsec differences between devices: is kmod-crypto-ctr the problem?

@cotequeiroz

crypto-ctr is built into the kernel for ipq40xx, because it has the qce hw-crypto module built-in, so it pulls crypto-ctr as a dependency. ramips & ath79 do not have that, so it is built as a module.

The patch adding the CRYPTO_ALG_KERN_DRIVER_ONLY should not cause too much trouble. The flag is used to identify a hw-accelerated algorithm: kernel-driver-only means something that is only available through the kernel--such as DMA access by the hw-crypto engine, not some CPU instruction that userspace can use directly.
The flag is perhaps being used somewhere to select, or filter out, an algorithm--the openssl devcrypto engine, for example, uses the flag to only enable kernel-hw-accelerated algorithms--so that the qce driver was not being used before the patch, but is now, and is not working properly. It's not clear to me which versions you have tested, but the 18.06 branch does not have that patch, so test it to find out if it works.

It may be worth checking out and comparing the contents of /proc/crypto for each device. It will show you available crypto algorithms, among other things, the driver & module being used, and their priority. You may verify ctr presence there, by looking for an algorithm named ctr(aes).
Cheers
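
For anyone who would rather script the /proc/crypto comparison than eyeball it, here is a minimal sketch in C. It assumes the "key : value" layout with "name" listed before "driver" in each entry, as in the listings quoted in this thread; the function name find_crypto_alg and the buffer sizes are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Count the entries in /proc/crypto-style text whose "name" field equals
 * `name`, and copy the "driver" value of the first such entry into `drv`.
 * Returns 0 when the algorithm is not registered (drv is then empty).
 * Hypothetical helper, not part of any kernel or OpenWrt tool.
 */
static int find_crypto_alg(const char *text, const char *name,
                           char *drv, size_t drvlen)
{
    char line[160], key[32], val[96];
    int matches = 0, in_match = 0;
    const char *p = text;

    assert(text != NULL && name != NULL && drv != NULL && drvlen > 0);
    drv[0] = '\0';
    while (*p != '\0') {
        size_t len = strcspn(p, "\n");          /* length of this line */
        size_t n = len < sizeof line - 1 ? len : sizeof line - 1;
        size_t k;

        memcpy(line, p, n);
        line[n] = '\0';
        p += len;
        if (*p == '\n')
            p++;

        /* Each field looks like "key   : value"; skip anything else. */
        if (sscanf(line, " %31[^:]: %95[^\n]", key, val) != 2)
            continue;
        /* Drop the padding between the key and the colon. */
        for (k = strlen(key); k > 0 && key[k - 1] == ' '; k--)
            key[k - 1] = '\0';

        if (strcmp(key, "name") == 0) {
            in_match = strcmp(val, name) == 0;
            matches += in_match;
        } else if (in_match && matches == 1 && strcmp(key, "driver") == 0) {
            snprintf(drv, drvlen, "%s", val);
        }
    }
    return matches;
}
```

To use it on a device, read /proc/crypto into a buffer (fopen plus fread is enough) and call find_crypto_alg(buf, "rfc3686(ctr(aes))", drv, sizeof drv); a return value of 0 means the algorithm is not currently listed.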

Thank you. That is super useful already. I'll test with 18.06 soon, but for now running with 18.04

more /proc/crypto | grep ctr

on the ipq40xx gives:

name         : ctr(aes)
driver       : ctr-aes-qce

and on the ramips the same command gives:

name         : rfc3686(ctr(aes))
driver       : rfc3686(ctr(aes-generic))
module       : ctr
name         : ctr(aes)
driver       : ctr(aes-generic)
module       : ctr

It's the rfc3686(ctr(aes)) that I care about.

My guess from the code is that the qce code is adding ctr(aes) but not rfc3686(ctr(aes)), and yet the whole ctr module is not being compiled. What I can't figure out is why, and whether it is possible to mix kernel-hw-accelerated algorithms and kernel modules (ideally I wouldn't just disable hardware acceleration) to add rfc3686(ctr(aes)) support.

Any guidance really valued.

I'm running a self-compiled build of branch v18.06.4, which I can confirm doesn't contain the patch, so it isn't the patch that is stopping ctr.ko from being built.

I see that the qce driver sets the ctr-mode blocksize to 16 (confirm it with /proc/crypto), when the correct one should be 1 (it's a stream cipher). ctr.c checks for this before creating a rfc3686 instance. I'm not sure if this is enough to make it work, since other drivers publish rfc3686 separately.
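
To make the check concrete, here is a simplified model of it in plain C. This is not the kernel code: struct ctr_alg_info and rfc3686_can_wrap are hypothetical names, and the additional 16-byte ivsize condition is my recollection of ctr.c rather than something stated above, so treat it as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of what a ctr(aes) driver advertises. */
struct ctr_alg_info {
    const char *driver;       /* e.g. "ctr-aes-qce" */
    unsigned int blocksize;   /* the "blocksize" line in /proc/crypto */
    unsigned int ivsize;      /* the "ivsize" line in /proc/crypto */
};

/*
 * Model of the precondition described above: the rfc3686 wrapper only
 * accepts an inner CTR algorithm that presents itself as a stream cipher
 * (blocksize 1). The 16-byte IV requirement is an assumption.
 */
static bool rfc3686_can_wrap(const struct ctr_alg_info *alg)
{
    assert(alg != NULL);
    return alg->ivsize == 16 && alg->blocksize == 1;
}
```

With the values qce currently publishes (blocksize 16, ivsize 16) the function returns false; with blocksize 1 it returns true, which is the whole point of the patch that follows.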

Apply this as target/linux/ipq40xx/patches-4.14/182-crypto-qce-fix-ctr-blocksize.patch and recompile your image:

--- a/drivers/crypto/qce/ablkcipher.c
+++ b/drivers/crypto/qce/ablkcipher.c
@@ -299,7 +299,7 @@ static const struct qce_ablkcipher_def ablkcipher_def[] = {
                .flags          = QCE_ALG_AES | QCE_MODE_CTR,
                .name           = "ctr(aes)",
                .drv_name       = "ctr-aes-qce",
-               .blocksize      = AES_BLOCK_SIZE,
+               .blocksize      = 1,
                .ivsize         = AES_BLOCK_SIZE,
                .min_keysize    = AES_MIN_KEY_SIZE,
                .max_keysize    = AES_MAX_KEY_SIZE,

Ensure that /proc/crypto shows the blocksize for ctr(aes) to be 1 after applying the patch.

I don't have the hardware, so I won't be able to check this on my own. I caught the bug by looking at the output of /proc/crypto that @chunkeey posted here

@jeff, @chunkeey can you also confirm if the patch works? I just mean that /proc/crypto should show blocksize=1 for ctr(aes) with the ctr-aes-qce driver.

I don't necessarily think that this will bring up rfc3686 support on its own (I'm just hoping, really); it may require more effort, but the block size should be fixed, so, I will send patches to openwrt and upstream after confirmation.

I forgot, as part of testing, we need to ensure it encrypts right. Please use openssl with the devcrypto engine enabled, and the AES-128-CTR cipher enabled as well, then run:

# echo '0123456789abcdefgh' | openssl aes-128-ctr -e -nopad -iv 0f0e0d0c0b0a09080706050403020100 -a -K 000102030405060708090a0b0c0d0e0f
EJjLoYB5bd88Jp2+D8r8DCDOrg==

PS: This should preferably be done in 19.07 or master, using openssl 1.1.1, so we can confirm this is actually using the qce driver with openssl engine -t -c -pre DUMP_INFO -- if not, try to confirm it via /proc/interrupts.

Thanks. I'll apply the patch and run the test over the next couple of days.

I used the patch on master (191c3e49b9) which didn't work until I modified

@@ -299,7 +299,7 @@ static const struct qce_ablkcipher_def ablkcipher_def[] = {

to become

@@ -291,7 +291,7 @@

Once applied, the patch seemed to work and the blocksize became 1 (instead of 16, which I verified was the previous result):

name         : ctr(aes)
driver       : ctr-aes-qce
module       : kernel
priority     : 300
refcnt       : 1
selftest     : passed
internal     : no
type         : ablkcipher
async        : yes
blocksize    : 1
min keysize  : 16
max keysize  : 32
ivsize       : 16
geniv        : <default>

However rfc3686 still doesn't appear.

I was also unable to run the test, because it prompted for a password and when provided with one completed with no obvious output.

Running openssl engine -t -c -pre DUMP_INFO produced the following (which doesn't look good)

3069834596:error:260AC089:engine routines:int_ctrl_helper:invalid cmd name:crypto/engine/eng_ctrl.c:87:
3069834596:error:260AB089:engine routines:ENGINE_ctrl_cmd_string:invalid cmd name:crypto/engine/eng_ctrl.c:255:
     [ unavailable ]

I forgot that, in order to minimize the use of ioctl, openssl will always send a full 16-byte block when performing AES-CTR, so it ends up not being a good test for this. I will provide instructions to run openssl nonetheless.

It seems your openssl is not configured to use /dev/crypto. You must have libopenssl-devcrypto installed first. Then, edit your /etc/ssl/openssl.cnf and add these lines right before the first [section] line (it's usually [ new_oids ]). Technically, the openssl_conf=openssl_conf line needs to be placed in the first unnamed section; the other lines can go elsewhere:

openssl_conf=openssl_conf

[openssl_conf]
engines=engines

[engines]
devcrypto=devcrypto

[devcrypto]
default_algorithms = ALL

Then, try the openssl engine command again.

PS:
These instructions are for openssl 1.1.1. At least here, it does not ask for a password; are you using 1.0.2?

I used the patch on master (191c3e49b9) which didn't work until I modified

Just to be clear: changing the offset is fine. It won't affect the outcome.

Thanks for the guidance.

The test now runs, but produces a different result to yours:

# echo '0123456789abcdefgh' | openssl aes-128-ctr -e -nopad -iv 0f0e0d0c0b0a09080706050403020100 -a -K 000102030405060708090a0b0c0d0e0f
EJjLoYB5bd88Jp2+D8r8DEfB8w==

and:

openssl engine -t -c -pre DUMP_INFO

gives:

(dynamic) Dynamic engine loading support
[Failure]: DUMP_INFO
3069621604:error:260AC089:engine routines:int_ctrl_helper:invalid cmd name:crypto/engine/eng_ctrl.c:87:
3069621604:error:260AB089:engine routines:ENGINE_ctrl_cmd_string:invalid cmd name:crypto/engine/eng_ctrl.c:255:
     [ unavailable ]
(devcrypto) /dev/crypto engine
Information about ciphers supported by the /dev/crypto engine:
Cipher DES-CBC, NID=31, /dev/crypto info: id=1, driver=cbc-des-qce (hw accelerated)
Cipher DES-EDE3-CBC, NID=44, /dev/crypto info: id=2, driver=cbc-3des-qce (hw accelerated)
Cipher BF-CBC, NID=91, /dev/crypto info: id=3, driver=cbc(blowfish-generic) (software)
Cipher CAST5-CBC, NID=108, /dev/crypto info: id=4, CIOCGSESSION (session open call) failed
Cipher AES-128-CBC, NID=419, /dev/crypto info: id=11, driver=cbc-aes-qce (hw accelerated)
Cipher AES-192-CBC, NID=423, /dev/crypto info: id=11, driver=cbc-aes-qce (hw accelerated)
Cipher AES-256-CBC, NID=427, /dev/crypto info: id=11, driver=cbc-aes-qce (hw accelerated)
Cipher RC4, NID=5, /dev/crypto info: id=12, CIOCGSESSION (session open call) failed
Cipher AES-128-CTR, NID=904, /dev/crypto info: id=21, driver=ctr-aes-qce (hw accelerated)
Cipher AES-192-CTR, NID=905, /dev/crypto info: id=21, driver=ctr-aes-qce (hw accelerated)
Cipher AES-256-CTR, NID=906, /dev/crypto info: id=21, driver=ctr-aes-qce (hw accelerated)
Cipher AES-128-ECB, NID=418, /dev/crypto info: id=23, driver=ecb-aes-qce (hw accelerated)
Cipher AES-192-ECB, NID=422, /dev/crypto info: id=23, driver=ecb-aes-qce (hw accelerated)
Cipher AES-256-ECB, NID=426, /dev/crypto info: id=23, driver=ecb-aes-qce (hw accelerated)

Information about digests supported by the /dev/crypto engine:
Digest MD5, NID=4, /dev/crypto info: id=13, driver=md5-generic (software), CIOCCPHASH capable
Digest SHA1, NID=64, /dev/crypto info: id=14, driver=sha1-qce (hw accelerated), CIOCCPHASH capable
Digest RIPEMD160, NID=117, /dev/crypto info: id=102, driver=rmd160-generic (software), CIOCCPHASH capable
Digest SHA224, NID=675, /dev/crypto info: id=103, driver=sha224-generic (software), CIOCCPHASH capable
Digest SHA256, NID=672, /dev/crypto info: id=104, driver=sha256-qce (hw accelerated), CIOCCPHASH capable
Digest SHA384, NID=673, /dev/crypto info: id=105, driver=sha384-generic (software), CIOCCPHASH capable
Digest SHA512, NID=674, /dev/crypto info: id=106, driver=sha512-generic (software), CIOCCPHASH capable

[Success]: DUMP_INFO
 [DES-CBC, DES-EDE3-CBC, AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CTR, AES-192-CTR, AES-256-CTR, AES-128-ECB, AES-192-ECB, AES-256-ECB]
     [ available ]

I can confirm that I'm on OpenSSL 1.1.1d 10 Sep 2019.

I've started adding debug statements inside ctr.c, which show that crypto_ctr_alloc is executed but crypto_rfc3686_create is not.

This may be an indication that the patch did not work, or that something else is wrong here. I still have my doubts that openssl works 100% with AES-CTR. Have you tried the openssl line without the devcrypto engine (if you don't want to uninstall it, just comment out the openssl_conf=openssl_conf line, and check that DUMP_INFO does not return anything), and with the engine, but without my patch applied?

It encrypts the first (full) block correctly, and gets the second (partial) wrong.

openwrt # echo 'EJjLoYB5bd88Jp2+D8r8DEfB8w==' | base64 -d | openssl aes-128-ctr -d -nopad -iv 0f0e0d0c0b0a09080706050403020100 -K 000102030405060708090a0b0c0d0e0f
0123456789abcdefgWopenwrt #

It may be the IV not getting updated correctly (my biggest concern about it not working in openssl). If that's the case, then even if the second block is complete, it will get it wrong. Try this:

# echo '0123456789abcdefghijklmnopqrstuvwxyz' | openssl aes-128-ctr -e -nopad -iv 0f0e0d0c0b0a09080706050403020100 -a -K 000102030405060708090a0b0c0d0e0f
EJjLoYB5bd88Jp2+D8r8DCDOzc8eMg0qAcLgnjpNdCnMp4HG5A==

I'm just guessing, but perhaps the rfc3686 algorithms are only alloc/created if they are requested. My x86_64 machine does not show them in /proc/crypto either, and it uses the standard aesni drivers in linux-4.19.

I'll take another look at this when I get some time.

Thanks.

With patch and with openssl_conf=openssl_conf:

# echo '0123456789abcdefghijklmnopqrstuvwxyz' | openssl aes-128-ctr -e -nopad -iv 0f0e0d0c0b0a09080706050403020100 -a -K 000102030405060708090a0b0c0d0e0f
EJjLoYB5bd88Jp2+D8r8DCDOzc8eMg0qAcLgnjpNdClX0YDovg==
# echo 'EJjLoYB5bd88Jp2+D8r8DCDOzc8eMg0qAcLgnjpNdClX0YDovg==' | base64 -d | openssl aes-128-ctr -d -nopad -iv 0f0e0d0c0b0a09080706050403020100 -K 000102030405060708090a0b0c0d0e0f
0123456789abcdefghijklmnopqrstuvwxyz

and with patch but without openssl_conf=openssl_conf:

# echo '0123456789abcdefghijklmnopqrstuvwxyz' | openssl aes-128-ctr -e -nopad -iv 0f0e0d0c0b0a09080706050403020100 -a -K 000102030405060708090a0b0c0d0e0f
EJjLoYB5bd88Jp2+D8r8DCDOzc8eMg0qAcLgnjpNdCnMp4HG5A==
# echo 'EJjLoYB5bd88Jp2+D8r8DCDOzc8eMg0qAcLgnjpNdCnMp4HG5A==' | base64 -d | openssl aes-128-ctr -d -nopad -iv 0f0e0d0c0b0a09080706050403020100 -K 000102030405060708090a0b0c0d0e0f
0123456789abcdefghijklmnopqrstuvwxyz

so both produce reversible, but different, results. Will try without patch later

Got it. It's the handling of the IV of the last block. It doesn't seem to be the IV increment, as the second block is OK. Here's a comparison of the hexdump of the ciphertexts:

# echo 'EJjLoYB5bd88Jp2+D8r8DCDOzc8eMg0qAcLgnjpNdClX0YDovg==' | base64 -d | hexdump -C
00000000  10 98 cb a1 80 79 6d df  3c 26 9d be 0f ca fc 0c  |.....ym.<&......|
00000010  20 ce cd cf 1e 32 0d 2a  01 c2 e0 9e 3a 4d 74 29  | ....2.*....:Mt)|
00000020  57 d1 80 e8 be                                    |W....|
00000025
# echo 'EJjLoYB5bd88Jp2+D8r8DCDOzc8eMg0qAcLgnjpNdCnMp4HG5A==' | base64 -d | hexdump -C
00000000  10 98 cb a1 80 79 6d df  3c 26 9d be 0f ca fc 0c  |.....ym.<&......|
00000010  20 ce cd cf 1e 32 0d 2a  01 c2 e0 9e 3a 4d 74 29  | ....2.*....:Mt)|
00000020  cc a7 81 c6 e4                                    |.....|
00000025
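
To put a number on where the two dumps above part ways, here is a small sketch that compares them byte by byte. The byte arrays are copied from the hexdumps; the helper name first_diff and the array names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AES_BLOCK_SIZE 16

/* Ciphertext produced through the devcrypto engine (first hexdump). */
static const uint8_t ct_devcrypto[37] = {
    0x10, 0x98, 0xcb, 0xa1, 0x80, 0x79, 0x6d, 0xdf,
    0x3c, 0x26, 0x9d, 0xbe, 0x0f, 0xca, 0xfc, 0x0c,
    0x20, 0xce, 0xcd, 0xcf, 0x1e, 0x32, 0x0d, 0x2a,
    0x01, 0xc2, 0xe0, 0x9e, 0x3a, 0x4d, 0x74, 0x29,
    0x57, 0xd1, 0x80, 0xe8, 0xbe
};

/* Ciphertext produced without the engine (second hexdump). */
static const uint8_t ct_software[37] = {
    0x10, 0x98, 0xcb, 0xa1, 0x80, 0x79, 0x6d, 0xdf,
    0x3c, 0x26, 0x9d, 0xbe, 0x0f, 0xca, 0xfc, 0x0c,
    0x20, 0xce, 0xcd, 0xcf, 0x1e, 0x32, 0x0d, 0x2a,
    0x01, 0xc2, 0xe0, 0x9e, 0x3a, 0x4d, 0x74, 0x29,
    0xcc, 0xa7, 0x81, 0xc6, 0xe4
};

/* Offset of the first differing byte, or len if the buffers are equal. */
static size_t first_diff(const uint8_t *a, const uint8_t *b, size_t len)
{
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < len; i++)
        if (a[i] != b[i])
            break;
    return i;
}
```

first_diff(ct_devcrypto, ct_software, 37) gives offset 32; dividing by AES_BLOCK_SIZE puts the first mismatch at the start of the third block (index 2), the trailing partial one, while the two full blocks agree.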

Now if I feed just the last block using the same IV, without incrementing it by 2, I get the same sequence as you have. I just need to figure out who's at fault here.

# echo 'wxyz' | openssl aes-128-ctr -e -nopad -iv 0f0e0d0c0b0a09080706050403020100 -K 000102030405060708090a0b0c0d0e0f | hexdump -C
00000000  57 d1 80 e8 be                                    |W....|
00000005

This is what should have been done. Notice the 02 at the end of the IV:

echo 'wxyz' | openssl aes-128-ctr -e -nopad -iv 0f0e0d0c0b0a09080706050403020102 -K 000102030405060708090a0b0c0d0e0f | hexdump -C
00000000  cc a7 81 c6 e4                                    |.....|
00000005

The reverse works because it uses the same (wrong) procedure.
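
For reference, the counter handling that the correct result implies can be sketched as below. It assumes the whole IV is treated as one big-endian counter that advances by one per AES block, which matches the ...0102 IV used above for the third block; ctr_add is a hypothetical helper, not code taken from openssl or cryptodev.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Add nblocks to a big-endian counter of ivlen bytes, propagating the
 * carry from the last byte towards the first. Any overflow past the
 * first byte is silently dropped (the counter wraps around).
 */
static void ctr_add(uint8_t *iv, size_t ivlen, uint32_t nblocks)
{
    uint64_t carry = nblocks;
    size_t i = ivlen;

    assert(iv != NULL);
    while (i > 0 && carry != 0) {
        i--;
        carry += iv[i];
        iv[i] = (uint8_t)carry;   /* keep the low 8 bits in place */
        carry >>= 8;              /* pass the rest to the next byte */
    }
}
```

Starting from 0f0e0d0c0b0a09080706050403020100 and calling ctr_add(iv, 16, 2) leaves the IV ending in 01 02, which is the value the third block should have been encrypted with.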

I can confirm that the rfc3686() algorithms are only instantiated, and thus appear in /proc/crypto upon request. Sorry, I'm too lazy to write a standalone AF_ALG app to request it, but I do have a convoluted test for you to perform :wink:.
First, apply this: https://github.com/openwrt/openwrt/pull/1547.patch, which will change the afalg engine to support more algorithms; then patch it to use rfc3686(ctr(aes)) for AES-CTR:

--- a/package/libs/openssl/patches/600-e_afalg-rewrite-of-AF_ALG-engine.patch
+++ b/package/libs/openssl/patches/600-e_afalg-rewrite-of-AF_ALG-engine.patch
@@ -438,9 +438,9 @@ index 7f62d77e5b..79c8b2406c 100644
 +#ifndef OPENSSL_NO_RC4
 +    { NID_rc4, 1, 16, 0, EVP_CIPH_STREAM_CIPHER, "arc4" },
 +#endif
-+    { NID_aes_128_ctr, 16, 128 / 8, 16, EVP_CIPH_CTR_MODE, "ctr(aes)" },
-+    { NID_aes_192_ctr, 16, 192 / 8, 16, EVP_CIPH_CTR_MODE, "ctr(aes)" },
-+    { NID_aes_256_ctr, 16, 256 / 8, 16, EVP_CIPH_CTR_MODE, "ctr(aes)" },
++    { NID_aes_128_ctr, 16, 128 / 8, 16, EVP_CIPH_CTR_MODE, "rfc3686(ctr(aes))" },
++    { NID_aes_192_ctr, 16, 192 / 8, 16, EVP_CIPH_CTR_MODE, "rfc3686(ctr(aes))" },
++    { NID_aes_256_ctr, 16, 256 / 8, 16, EVP_CIPH_CTR_MODE, "rfc3686(ctr(aes))" },
 +#if 0                            /* Not yet supported */
 +    { NID_aes_128_xts, 16, 128 / 8 * 2, 16, EVP_CIPH_XTS_MODE, "xts(aes)" },
 +    { NID_aes_256_xts, 16, 256 / 8 * 2, 16, EVP_CIPH_XTS_MODE, "xts(aes)" },

Now compile openssl, opkg install libopenssl-afalg, and disable (or uninstall) libopenssl-devcrypto, so that it does not interfere. To disable it, comment out its line under [engines]; you might as well add afalg=afalg, and configure it just like devcrypto. You can do all that at once with sed -i -e s/devcrypto/afalg/g /etc/ssl/openssl.cnf. The patchset also changes the kmod-crypto-user package (to expose the drivers being used), so make sure to install your freshly built kmod-crypto-user package first. lsmod | grep crypto_user should show the module installed--notice that crypto_user, ironically, was not part of kmod-crypto-user in openwrt.


Then run the openssl dump_info on the afalg engine:

openssl engine -t -c -pre DUMP_INFO afalg

You may get a harmless error the first time you run it (probably the algorithm was not available to the kernel when the info was collected):

Cipher AES-128-CTR, NID=904, AF_ALG info: name=rfc3686(ctr(aes)),  driver=**unreliable info** (acceleration status unknown)

Now, you should have the rfc3686 algorithms showing in /proc/crypto:

# cat /proc/crypto  | egrep rfc
name         : rfc3686(ctr(aes))
driver       : rfc3686(ctr-aes-neonbs)

You may run the tests you ran above, but if you do not configure the afalg engine to be used instead of devcrypto (just need to globally replace 'devcrypto' with 'afalg' in openssl.cnf), make sure to add -engine afalg to the openssl commands:

echo '0123456789abcdefghijklmnopqrstuvwxyz' | openssl aes-128-ctr -e -nopad -iv 0f0e0d0c0b0a09080706050403020100 -a -K 000102030405060708090a0b0c0d0e0f -engine afalg

I'll wait for your feedback.

Edited to add the paragraph about installing libopenssl-afalg.

Hopefully this answers the same question in an easier way. I've gone back to my original ipsec configuration (which failed) and run it on your patched build, and it now works. I can also see my debug statements in ctr.c being hit, so crypto_rfc3686 is being created on demand.

I'll go back and regression test with an unpatched version of the latest build to prove it is the patch that fixes it.

It does! :smiley:
All I needed was for you to instantiate the algorithm, and ipsec will do that. Now we need to actually prove it works for all cases.

See if you can document your tests, please. I will try to upstream the changes, but will not have the hardware to run tests myself. I don't foresee needing to show so much why it does not work, as it is apparent from the code itself, but we need to show that there are no side-effects due to the hardware not being capable of handling partial blocks, for example. Whatever code the engine runs inside is unknown.

OpenSSL always sends a full AES block to the kernel, even if len < 16, so it does not count. Perhaps it's best that I do write an AF_ALG test-case proving this, :thinking: but this is not happening this week for sure... See if you can document encryption/decryption of partial blocks (len < 16 bytes).

As for the openssl engine failure, it may be due to cryptodev relying on the kernel driver to update the IV, but I can't find out whether this is actually required or expected behavior. Documentation is sparse, and each driver seems to do things differently, so it may be a coincidence that CBC works like that, perhaps because everybody is using the same logic, but not for CTR. I'll have to look further into it. I'll probably just have openssl handle the IV for all CTR cases, instead of relying on cryptodev to do it. The code for it is already there anyway, disabled by #if defined(COP_FLAG_WRITE_IV).

It seems like I did write that test before the weekend:

#include <stdint.h>
#include <sys/uio.h>
#include <sys/syscall.h>
#include <sys/socket.h>
#include <linux/if_alg.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

struct afalg_ctx_st {
  struct sockaddr_alg sa;
  int sfd, bfd;
};
typedef struct afalg_ctx_st afalg_ctx;

#ifndef USE_RFC3686
# define ALG_NAME "ctr(aes)"
# define IV_LEN 16
# define KEY_LEN 16
static char IV[] = { 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09, 0x08, 0x07,
                     0x06, 0x05, 0x04, 0x03, 0x02, 0x01, 0x00 };
static char KEY[] = { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
                      0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f };
#else
# define ALG_NAME "rfc3686(ctr(aes))"
# define IV_LEN 8
# define KEY_LEN 20
static char IV[] = { 0x07, 0x06, 0x05, 0x04, 0x03, 0x02, 0x01, 0x00 };
static char KEY[] = { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
                      0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x10, 0x11,
                      0x12, 0x13 };
#endif

static int CipherInit(afalg_ctx *ctx, char *alg_name, char *key,
                      size_t keylen, char *iv, size_t ivlen, int enc)
{
  struct sockaddr_alg sa;
  struct msghdr msg = { 0 };
  struct cmsghdr *cmsg;
  struct af_alg_iv *aiv;
  struct iovec iov;
  int op = enc ? ALG_OP_ENCRYPT : ALG_OP_DECRYPT;
  size_t set_op_len = sizeof op;
  size_t set_iv_len = offsetof(struct af_alg_iv, iv) + ivlen;
  char buf[CMSG_SPACE(set_op_len) + CMSG_SPACE(set_iv_len)];

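  ctx->bfd = ctx->sfd = -1;  /* so the err path never closes an unset fd */
  memset(&sa, 0, sizeof sa);
  sa.salg_family = AF_ALG;
  strcpy(sa.salg_type, "skcipher");
  /* sa was zeroed above, so leaving the last byte alone keeps the name terminated */
  strncpy(sa.salg_name, alg_name, sizeof sa.salg_name - 1);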
  if (( ctx->bfd = socket(AF_ALG, SOCK_SEQPACKET, 0)) < 0) {
    perror("Failed to open socket");
    goto err;
  }

  if (bind(ctx->bfd, (struct sockaddr *)&sa, sizeof sa) < 0) {
    perror("Failed to bind socket");
    goto err;
  }
  /* use the key passed in by the caller rather than the global KEY */
  if (setsockopt(ctx->bfd, SOL_ALG, ALG_SET_KEY, key, keylen) < 0) {
    perror("Failed to set key");
    goto err;
  }
  if ((ctx->sfd = accept(ctx->bfd, NULL, 0)) < 0) {
    perror("Socket accept failed");
    goto err;
  }
  memset(&buf, 0, sizeof buf);
  msg.msg_control = buf;
  /* set op */
  msg.msg_controllen = CMSG_SPACE(set_op_len);
  cmsg = CMSG_FIRSTHDR(&msg);
  cmsg->cmsg_level = SOL_ALG;
  cmsg->cmsg_type = ALG_SET_OP;
  cmsg->cmsg_len = CMSG_LEN(set_op_len);
  memcpy(CMSG_DATA(cmsg), &op, sizeof op);
  /* set IV */
  msg.msg_controllen += CMSG_SPACE(set_iv_len);
  cmsg = CMSG_NXTHDR(&msg, cmsg);
  cmsg->cmsg_level = SOL_ALG;
  cmsg->cmsg_type = ALG_SET_IV;
  cmsg->cmsg_len = CMSG_LEN(set_iv_len);
  aiv = (void *)CMSG_DATA(cmsg);
  aiv->ivlen = ivlen;  /* honor the ivlen argument instead of the IV_LEN macro */
  memcpy(aiv->iv, iv, ivlen);

  iov.iov_base = NULL;
  iov.iov_len = 0;
  if (sendmsg(ctx->sfd, &msg, 0) < 0) {
    perror("sendmsg: Failed to set op, iv");
    goto err;
  }
  return 1;
err:
  if (ctx->bfd >= 0)
    close(ctx->bfd);
  if (ctx->sfd >= 0)
    close(ctx->sfd);
  ctx->bfd = ctx->sfd = -1;
  return 0;
}

static int CipherUpdate(afalg_ctx *ctx, char *out, size_t *outl,
                        const char* in, size_t inl)
{
  struct msghdr msg = { 0 };
  struct cmsghdr *cmsg;
  struct iovec iov;
  ssize_t nbytes;
  int ret = 1;

  iov.iov_base = (void *)in;
  iov.iov_len = inl;
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  if ((nbytes = send(ctx->sfd, in, inl, MSG_MORE)) != (ssize_t) inl) {
    fprintf(stderr, "CipherUpdate: sent %zd bytes != inl %zu\n", nbytes, inl);
    if (nbytes <= 0)
      return 0;
    ret = 0;
  }
  if ((nbytes = read(ctx->sfd, out, (size_t) nbytes)) != (ssize_t) inl) {
    fprintf(stderr, "CipherUpdate: read %zd bytes != inl %zu\n", nbytes, inl);
    if (nbytes < 0)
      return 0;
    ret = 0;
  }
  if (outl != NULL)
    *outl = (size_t) nbytes;
  return ret;
}

static int CipherFinal(afalg_ctx *ctx)
{
   close(ctx->sfd);
   close(ctx->bfd);
   ctx->bfd = ctx->sfd = -1;
   return 1;
}

static char *CipherHex(unsigned char *text, unsigned int text_len)
{
  char *res = malloc(text_len * 3 + 1);
  char *res_ptr = res;
  unsigned char *text_ptr = text;

  if (res == NULL)
    return NULL;
  *res = '\0';  /* keep the result a valid string even when text_len is 0 */
  for (unsigned int i = 0; i < text_len; i++) {
    snprintf(res_ptr, 4, "%02hhx ", *(text_ptr++));
    res_ptr += 3;
  }
  return res;
}

static int do_enc(char *cipher, char *key, size_t keylen, char* iv,
                  size_t ivlen, const char *text, size_t len, int enc, int n)
{
  afalg_ctx ctx;
  char *cipher_hex;
  char cipher_out[1024];
  int roundlen = len / n;

  if (roundlen == 0)  /* fewer bytes than rounds: avoid a zero-step loop */
    roundlen = len;

  if (!CipherInit(&ctx, cipher, key, keylen, iv, ivlen, enc)) {
    fprintf(stderr, "Error in CipherInit\n");
    return -1;
  }

  for (int i = 0; i < len; i += roundlen) {
    if (i + roundlen > len)
       roundlen = len - i;
    if(!CipherUpdate(&ctx, cipher_out, NULL, text + i, roundlen)) {
      fprintf(stderr, "Error in CipherUpdate\n");
      return -1;
    }
    cipher_hex = CipherHex((unsigned char *)cipher_out, roundlen);
    printf("%36.*s - %s\n", roundlen, text + i,
           cipher_hex ? cipher_hex : "(out of memory)");
    free(cipher_hex);
  }
  if(!CipherFinal(&ctx)) {
    fprintf(stderr, "Error in CipherFinal_ex\n");
    return -1;
  }
  return 0;
}

int main(int argc, char **argv)
{
  char text[] = "0123456789abcdefghijklmnopqrstuvwxyz";

  for (int i=2; i < sizeof text; i += 2) {
    do_enc(ALG_NAME, KEY, KEY_LEN, IV, IV_LEN, text, i, 1, 2);
    printf("\n");
  }
  do_enc(ALG_NAME, KEY, KEY_LEN, IV, IV_LEN, text, sizeof text - 1, 1, 1);
  return 0;
}

Here's my run -- results are the same in my x86_64 desktop and in my WRT3200ACM:

$ ./staging_dir/toolchain-arm_cortex-a9+vfpv3_gcc-7.4.0_musl_eabi/bin/arm-openwrt-linux-gcc test_afalg_cipher.c -o arm-test_afalg_cipher && scp arm-test_afalg_cipher root@wrt:/root && ssh root@wrt /root/arm-test_afalg_cipher
arm-openwrt-linux-gcc: warning: environment variable 'STAGING_DIR' not defined
arm-openwrt-linux-gcc: warning: environment variable 'STAGING_DIR' not defined
arm-openwrt-linux-gcc: warning: environment variable 'STAGING_DIR' not defined
./staging_dir/toolchain-arm_cortex-a9+vfpv3_gcc-7.4.0_musl_eabi/lib/gcc/arm-openwrt-linux-muslgnueabi/7.4.0/../../../../arm-openwrt-linux-muslgnueabi/bin/ld: skipping incompatible /usr/lib/libc.so when searching for -lc
./staging_dir/toolchain-arm_cortex-a9+vfpv3_gcc-7.4.0_musl_eabi/lib/gcc/arm-openwrt-linux-muslgnueabi/7.4.0/../../../../arm-openwrt-linux-muslgnueabi/bin/ld: skipping incompatible /usr/lib/libc.a when searching for -lc
arm-test_afalg_cipher                                                                               100%   34KB   3.7MB/s   00:00
                                   0 - 10
                                   1 - 76

                                  01 - 10 98
                                  23 - 75 95

                                 012 - 10 98 cb
                                 345 - 74 92 91

                                0123 - 10 98 cb a1
                                4567 - 73 93 92 92

                               01234 - 10 98 cb a1 80
                               56789 - 72 90 93 9d 4c

                              012345 - 10 98 cb a1 80 79
                              6789ab - 71 91 9c 9c 14 3c

                             0123456 - 10 98 cb a1 80 79 6d
                             789abcd - 70 9e 9d c4 17 3d 04

                            01234567 - 10 98 cb a1 80 79 6d df
                            89abcdef - 7f 9f c5 c7 16 3a 05 22

                           012345678 - 10 98 cb a1 80 79 6d df 3c
                           9abcdefgh - 7e c7 c6 c6 11 3b 06 23 06

                          0123456789 - 10 98 cb a1 80 79 6d df 3c 26
                          abcdefghij - 26 c4 c7 c1 10 38 07 2c 07 d8

                         0123456789a - 10 98 cb a1 80 79 6d df 3c 26 9d
                         bcdefghijkl - 25 c5 c0 c0 13 39 08 2d 04 d9 fd

                        0123456789ab - 10 98 cb a1 80 79 6d df 3c 26 9d be
                        cdefghijklmn - 24 c2 c1 c3 12 36 09 2e 05 de fc 82

                       0123456789abc - 10 98 cb a1 80 79 6d df 3c 26 9d be 0f
                       defghijklmnop - 23 c3 c2 c2 1d 37 0a 2f 02 df ff 83 39

                      0123456789abcd - 10 98 cb a1 80 79 6d df 3c 26 9d be 0f ca
                      efghijklmnopqr - 22 c0 c3 cd 1c 34 0b 28 03 dc fe 9c 38 4b

                     0123456789abcde - 10 98 cb a1 80 79 6d df 3c 26 9d be 0f ca fc
                     fghijklmnopqrst - 21 c1 cc cc 1f 35 0c 29 00 dd e1 9d 3b 4a 75

                    0123456789abcdef - 10 98 cb a1 80 79 6d df 3c 26 9d be 0f ca fc 0c
                    ghijklmnopqrstuv - 20 ce cd cf 1e 32 0d 2a 01 c2 e0 9e 3a 4d 74 29

                   0123456789abcdefg - 10 98 cb a1 80 79 6d df 3c 26 9d be 0f ca fc 0c 20
                   hijklmnopqrstuvwx - d3 b6 92 d7 82 ae 07 d5 a1 b2 c7 2c 04 f9 02 e2 3e

                  0123456789abcdefgh - 10 98 cb a1 80 79 6d df 3c 26 9d be 0f ca fc 0c 20 ce
                  ijklmnopqrstuvwxyz - d2 b5 93 d0 83 ad 06 ca a0 b1 c6 2b 05 fa 03 ed 3f b4

0123456789abcdefghijklmnopqrstuvwxyz - 10 98 cb a1 80 79 6d df 3c 26 9d be 0f ca fc 0c 20 ce cd cf 1e 32 0d 2a 01 c2 e0 9e 3a 4d 74 29 cc a7 81 c6

The results mean that the IV is increased by one for each call, even when there are fewer bytes than one AES block, and even if the next call would still be under the same block. Notice that when the first block is full (3rd partial block from the bottom), the ciphertexts match the one-shot case.
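
One way to check that reading against the table is to XOR plaintext and ciphertext bytes, since in CTR mode that recovers the keystream byte that was used. The sketch below does this for two pairs taken from the run above; keystream_byte and ctr_blocks_used are illustrative helpers, and the once-per-started-block rule is only my interpretation of this output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In CTR mode, ciphertext = plaintext XOR keystream, so XOR undoes it. */
static uint8_t keystream_byte(char plain, uint8_t cipher)
{
    return (uint8_t)((uint8_t)plain ^ cipher);
}

/*
 * Number of counter values one request of len bytes appears to consume,
 * assuming the counter advances once for every started 16-byte block.
 */
static size_t ctr_blocks_used(size_t len)
{
    assert(len <= SIZE_MAX - 15);
    return (len + 15) / 16;
}
```

In the 1+1 split, '1' encrypts to 76 in the second call; in the one-shot run, 'g' (the first byte of the second block) encrypts to 20. Both give keystream byte 0x47, so the second call started on the next counter value. The same holds for the 17+17 split: 'h' to d3 and the one-shot 'w' to cc both give 0xbb, the first keystream byte of the third block, matching ctr_blocks_used(17) == 2.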

Your run does not necessarily need to match mine--and it probably won't because of the IV handling by the qce hardware--but the results should be the same on your openwrt before and after the patch is applied.

qce hw-crypto breaks ipsec; disabling CRYPTO_HW is needed.