Support for RTL838x based managed switches

So, I figured out how it works, on the XGS1010-12 at least :slight_smile: But first, about how optional it is: it is somewhat optional. If you look at the guarded block here, https://gitlab.com/olliver/openwrt/realtek_sdk/-/blob/xgs1250/loader/u-boot-2011.12/common/cmd_bootm.c#L1614, you'll see that this function does indeed have a guard around it. However, not defining the function breaks the actual boota code, as it relies on the function existing. I couldn't find a stub.

Anyway, what I tried before was setting the bootpartition variable manually and doing a savesys. I could not make that work, not sure why, but using boota 1 actually wrote this value to the sysenv partition (just as I did manually) and it worked.

Well, the vendor also supports tftpboot. While there's no web interface or reset switch to trigger it, there IS a serial port, there IS a console, there IS a U-Boot shell, and we might have XYZ-modem (haven't tried it). Not sure if this is the 'intended way', but you can/could update the firmware (U-Boot also has the upgrade command, right?). So if there had been a critical problem that required a firmware upgrade, Zyxel has at least ONE option.

Doing boota 1 btw, will from now on always boot 1, and boota 0 will always boot 0 again. It stores it in the sysenv :wink:

Ready for feedback & testing ...

I'm trying to set up my IP via DHCP. However, while I see the requests coming in on my dnsmasq-dhcp server, the switch refuses to use the offered lease. Not sure if this is a dnsmasq issue or a patch set issue ...
Here's the log output, any thoughts?

Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 available DHCP range: 192.168.163.10 -- 192.168.163.250
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 vendor class: udhcp 1.35.0
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 client provides name: OpenWrt
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 DHCPDISCOVER(enp0s18f2u1u3) 00:e0:4c:00:00:00 
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 tags: enp0s18f2u1u3
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 DHCPOFFER(enp0s18f2u1u3) 192.168.163.15 00:e0:4c:00:00:00 
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 requested options: 1:netmask, 3:router, 6:dns-server, 12:hostname, 
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 requested options: 15:domain-name, 28:broadcast, 42:ntp-server, 
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 requested options: 121:classless-static-route
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 bootfile name: /fitImage
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 server name: bootstrap-01
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 next server: 192.168.163.1
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 sent size:  1 option: 53 message-type  2
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 sent size:  4 option: 54 server-identifier  192.168.163.1
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 sent size:  4 option: 51 lease-time  5m
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 sent size:  4 option: 58 T1  2m30s
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 sent size:  4 option: 59 T2  4m22s
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 sent size:  4 option: 28 broadcast  192.168.163.255
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 sent size:  4 option: 42 ntp-server  192.168.163.1
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 sent size:  4 option:  6 dns-server  192.168.163.1
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 sent size:  4 option:  3 router  192.168.163.1
Jul 29 19:40:38 dnsmasq-dhcp[63]: 1973555716 sent size:  4 option:  1 netmask  255.255.255.0

and my uci network config

config interface 'lan'
	option device 'switch.1'
	option proto 'dhcp'
	option ip6assign '60'

btw, changing the static IP using uci or vi :stuck_out_tongue: and restarting the network is fine ...

Maybe this is related to the initramfs boot, or to it having been set up with a static IP during boot (due to the initramfs nature, of course). I didn't check before (dumb), but after a reboot (while running from flash now) it works fine.

If anybody has an XGS1210-12 or XGS1250-12 and can do some testing (needs UART, nukes all flash), I have a build that gives us a bit more 'oomph'.

root@OpenWrt:/# df -h
Filesystem                Size      Used Available Use% Mounted on
/dev/root                 3.5M      3.5M         0 100% /rom
...
/dev/mtdblock6            8.4M    364.0K      8.1M   4% /overlay

As for the XGS1250-12, did you find a way to work around this issue?

What specifically? As my build contains all of @anon13997276 's patches :stuck_out_tongue:

I can't update the original post, btw, so here's a link to the actual files instead :stuck_out_tongue: https://nextcloud.schinagl.nl/s/KeNEyGiKWtg8KTt If anybody can try them on a 1250-12 or 1210-12, that'd be nice.

It also has to be mentioned that U-Boot has to be told to boot 'something else' now. To do so, from either the U-Boot shell or using fw_setenv: setenv bootcmd 0xb4100000.

One thing I've noticed: the switch doesn't boot by itself yet.

HTP log info get: htpModeIf=1, htpBreakIf=9, hour=133, entry=8195, round=12 !
HTP Test log full!

which is probably related to some missing partitions. As the 1010-12 doesn't have the trailing log partition, it may rely on some other partition instead; some further investigation is needed on this one.

Booting manually with run bootcmd is still possible without issue, however. To be continued.

What do you mean specifically? You mention a network regression, but I cannot find the details.

Remember the log partition being wiped on the XGS1250-12 with the initial images? That's what I meant. If @olliver's images claim the remainder of the flash (like your earlier images did, more or less), I don't expect them to behave any differently on the XGS1250-12, because the bootloader will balk without that partition.

I've updated my post to reflect this; I'll have to read into the code next to see what this self-test does, when it gets triggered, and where it tries to write (partition name, or absolute location).

That doesn't make or break the test-ability, but indeed, causes some headaches.

https://github.com/openwrt/openwrt/pull/10289

I have now updated this PR with a new dgs-1210 family device tree and SoC change for the dgs-1210-10p.
So now I need reviewers!

I'm looking for an exuberantly cheap 838x (and 839x) target for testing. It can even be partially broken (housing, broken Ethernet pins, etc.). Any thoughts?

Good offers regularly appear for the various (used) ZyXEL gs1900 devices, they are a bit marginal on flash space (IMAGE_SIZE := 6976k), but otherwise an easy buy (all members of the gs1900 family are realtek based, no fun with different h/w revisions).

Depends where you are situated. I was unable to find any reasonably priced zyxel devices in Australia.

I ended up settling for a Netgear gs310tp (~$200AUD).

Some of the ZyXEL GS1900 models are available from Amazon; the GS1900-8 at $69 is the least expensive.

Depending on your location I might be able to offer you one from my collection. Just PM me and I can tell you some of the options.

Thanks all, I'm based in the Netherlands, and indeed used gear is very expensive right now. I really just need to poke and prod various pins to see what's going on, and to ensure I don't break anything with software :slight_smile:

Tweakers tends to have nice stuff once in a while. E.g. like this GS1900-24E which is supported?

Oh nice, I kinda gave up on Tweakers V/A as there tends to be little interesting stuff generally. It crossed my mind though.

To be honest, however, for the 838x I want something super small and super cheap, as I'll likely never use it. If this were an 839x one, I would have bought it though!

https://tweakers.net/aanbod/3063162/zyxel-gs1900-48.html looks interesting however!

So I figured it out, well, partially: I haven't figured out WHY this happens, but it is indeed the partitions.

So what happens is: whenever 'abort' is called (no idea yet as to why), we call sys_htp_run_case(), which triggers the built-in self-test. The self-test in itself is harmless; actually, I think it only tries to check the health of the log, which lives at the end of the flash. The nasty bit is that just before the logs there is also a region on which the write test is executed, so we really should not be touching that either.

As such, the fix is easy, but at the cost of 128k of flash.

			partition@fe0000 {
				/*
				 * This partition gets written to by the
				 * U-Boot built-in realtek HTP flash write
				 * test! Do not use for critical data!
				 */
				label = "HTP_logs";
				reg = <0xfe0000 0x20000>;
			};

I'll try to figure out why this test is being run at all; but in the end, it doesn't matter. With the realtek U-Boot these bytes are 'off-limits'.
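
For anyone wanting to double-check the numbers, here is a minimal sketch of the arithmetic behind that 128k reservation. The 16 MiB flash size is an assumption on my side (it is not stated in this thread), and the helper names are made up for illustration.

```python
KIB = 1024
MIB = 1024 * KIB

# Assumption: the SPI NOR flash is 16 MiB; the thread does not state the size.
FLASH_SIZE = 16 * MIB
HTP_START = 0xFE0000   # unit address of the partition@fe0000 node
HTP_SIZE = 128 * KIB   # the "128k" given up to the HTP test


def partition_end(start: int, size: int) -> int:
    """Return the first byte offset past the end of the partition."""
    return start + size


def fits_in_flash(start: int, size: int, flash_size: int = FLASH_SIZE) -> bool:
    """True if the partition lies entirely inside the flash."""
    return start >= 0 and size > 0 and partition_end(start, size) <= flash_size


if __name__ == "__main__":
    print(hex(HTP_SIZE))                            # 0x20000
    print(hex(partition_end(HTP_START, HTP_SIZE)))  # 0x1000000
    print(fits_in_flash(HTP_START, HTP_SIZE))       # True
```

Under that assumption, a 128k partition starting at 0xfe0000 ends exactly at the 16 MiB boundary, which matches the logs living at the very end of the flash.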

So it's a little more complicated, it seems, which is made worse because I don't have the actual source for my particular device (U-Boot for the 1010-12) and Zyxel has not yet responded. I'll send a new request for the code though ...

When looking at the code for the XGS1210-12 and XGS1010-12, we see that the logs are stored, as I said above, at the end of the partition (https://gitlab.com/olliver/openwrt/realtek_sdk/-/blob/xgs1250/sdk/system/uboot/cmd/uboot_func.c#L3545). However, it looks like Zyxel, in all its wisdom, is storing this data at the start of the 'runtime2' partition on the XGS1010-12. Because the above check fails when these bytes are read incorrectly, we get the same behavior as @Borromini mentioned earlier ...

Which means that the XGS1010-12 is, for the time being, artificially limited to 8MiB firmware slots, due to a check being done on these bytes :frowning:

I was hoping to avoid having to re-do U-Boot; because I'm not even sure we have all sources needed to compile U-Boot; and replacing U-boot is always 'scary'.

I might consider taking the XGS1210-12 U-Boot; but I'd need someone to donate it :slight_smile: @RaylynnKnight can probably help here. Which sadly means, more effort to support the device; but I suppose that's a risk we'd have to be willing to take ... Stupid vendor crap :frowning:

Small update: the hw self-test runs the flash write test at address 0xb48f0000, which is actually near the end of the actual firmware partition (0xb4300000-0xb497ffff) O.O

[round:31]  Flash: fill pattern(0x80000000) from 0xb48f0000 to 0xb4900000 passed!
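
To make the overlap concrete, here is a small sketch that checks the write-test window from the log line against the firmware partition range quoted above. Treating the log's "to" address as exclusive is my assumption, and the function name is hypothetical.

```python
FW_FIRST = 0xB4300000    # first byte of the firmware partition (from the post)
FW_LAST = 0xB497FFFF     # last byte of the firmware partition (inclusive)
TEST_START = 0xB48F0000  # "from" address in the HTP log line
TEST_END = 0xB4900000    # "to" address in the HTP log line, assumed exclusive


def window_inside(start: int, end_excl: int, first: int, last: int) -> bool:
    """True if [start, end_excl) lies completely within [first, last]."""
    return first <= start and end_excl - 1 <= last


WINDOW_SIZE = TEST_END - TEST_START            # bytes overwritten by the test
OFFSET_IN_FW = TEST_START - FW_FIRST           # where the window starts in the partition
FW_SIZE = FW_LAST + 1 - FW_FIRST               # total firmware partition size
TAIL_AFTER_WINDOW = FW_SIZE - OFFSET_IN_FW - WINDOW_SIZE

if __name__ == "__main__":
    print(window_inside(TEST_START, TEST_END, FW_FIRST, FW_LAST))  # True
    print(hex(WINDOW_SIZE))        # 0x10000 (64 KiB)
    print(hex(OFFSET_IN_FW))       # 0x5f0000
    print(hex(FW_SIZE))            # 0x680000 (6.5 MiB)
    print(hex(TAIL_AFTER_WINDOW))  # 0x80000 (512 KiB)
```

So under these assumptions the test scribbles over a 64 KiB window that sits fully inside the firmware partition, with 512 KiB of partition left after it.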

So, I looked more into the previously mentioned abortboot() function, which is actually a normal U-Boot function. What Realtek did, however, is hack in their self-test check, which, due to their sloppiness, is already evident from the alignment alone :slight_smile:

What does that mean? U-Boot checks if bit 28 is set (because they don't do 0 != htpModeIf or 0 > htpModeIf); anyway, that's bit 28 at flash address 0x810000, which is in the middle of the flash. So one could argue that any of the self-test commands etc. are at risk of corrupting or wiping flash, but by ensuring bit 28 is always cleared, this would never happen, so the rest of the flash could be used, except for the fact that, again, this is in the middle of the flash :S So for the XGS1010-12 this makes things annoying. Once I get the XGS1210-12 bootloader, I'll experiment some more, as I think the restrictions are 'better' due to these restrictions coming from the bootloader ... I'll put my findings in the wiki :frowning:
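
A rough sketch of what such a bit-28 test boils down to, in Python rather than the actual U-Boot C. The function names are hypothetical, and how the vendor code reads the word (width, endianness) is not covered here. One thing worth noting: erased NOR flash typically reads back as all ones, so an erased or wiped word would presumably have bit 28 set.

```python
HTP_FLAG_OFFSET = 0x810000   # flash offset mentioned in the post (informational only)
HTP_MODE_BIT = 28
HTP_MODE_MASK = 1 << HTP_MODE_BIT   # 0x10000000
WORD_MASK = 0xFFFFFFFF              # treat the flag as a 32-bit word


def htp_requested(word: int) -> bool:
    """Hypothetical helper: True if bit 28 of the 32-bit word is set."""
    return bool(word & HTP_MODE_MASK)


def clear_htp_bit(word: int) -> int:
    """Hypothetical helper: return the word with bit 28 cleared."""
    return word & ~HTP_MODE_MASK & WORD_MASK


if __name__ == "__main__":
    erased = 0xFFFFFFFF                 # typical content of erased NOR flash
    print(htp_requested(erased))        # True
    print(hex(clear_htp_bit(erased)))   # 0xefffffff
    print(htp_requested(clear_htp_bit(erased)))  # False
```

If that reading is right, it would fit the earlier observation that the self-test kicks in once the relevant region has been wiped, but I have not verified this against the vendor code.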

P.S. Can we tell OpenWrt to use the second partition as storage? That would alleviate the pain. What name/ih-magic would it have to have?