Topic: Everything you need to know about broadcom hardware (Part 1)

Inside pretty much any home router or access point you'll find the following
- flash chip (2M, 4M or somewhat rarely 8M)
- ram (4x the amount of flash)
- cpu (mips; provided by a broadcom 47xx or 5352)
- 6 port vlan managed switch (adm6996l, or more commonly the broadcom "roboswitch")
- wifi (broadcom 43xx based)

Chances are that almost all of that functionality will come from one or two Broadcom chips. The ram and flash are the exception.

Depending on the device you could have as little as 2/8 (flash/ram) or as much as 8/32, but by far the most common combination is 4/16; the flash is probably an Intel chip.

The flash chip can be represented as a large block of contiguous space:

[ start of flash ...... end of flash ]

There is no ROM to boot from; at power up the CPU begins executing the code at the very start of flash. Luckily this isn't the firmware or we'd be in real danger every time we reflashed. Boot is actually handled by a section of code we tend to refer to as the boot loader. In Broadcom devices this is CFE -- "Common Firmware Environment"; think of it like the BIOS in your computer.

(note - wrt54g v1.x hardware actually used another boot loader called "PMON"; it wasn't until the wrt54g v2.0 that they switched to CFE. Both provide the exact same functionality)

[ CFE ] [ firmware ....... ] [ NVRAM ]

(there's no actual partitions, just hard coded locations)

The job of the boot loader is to initialize the memory and other hardware and then begin booting the firmware. In most cases there's a recovery mechanism that allows you to reflash the firmware so that a bad flash doesn't render the device useless. CFE does this through the use of a TFTP server; this can be triggered by the firmware failing its checksum, by the boot_wait variable, or via CFE's serial console command line.

If you dig into the "firmware" section you'll find a trx. A trx is just an encapsulation, which looks something like this -

[ HDR0 ][ length ][ crc32 ][ flags ][ pointers ][ data ... ]

"HDR0" is a magic value to indicate a trx header; the rest are 4 byte unsigned values, followed by the actual contents. In short, it's a block of data with a length and a checksum. So, our flash usage actually looks something like this:

[ CFE ][ trx containing firmware ][ NVRAM ]

Except that the firmware is generally pretty small and doesn't use the entire space between CFE and NVRAM:

[ CFE ][ trx firmware  ][ unused ][ NVRAM ]
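To make the "hard coded locations" idea concrete, here is a rough sketch that computes where each region would start. The sizes used (256K for CFE, 64K for NVRAM, a 1.5M trx on a 4M flash) are made-up illustration values, not figures from any particular device, and `flash_layout` is a hypothetical helper.

```python
from dataclasses import dataclass

KIB = 1024


@dataclass
class Region:
    name: str
    start: int  # byte offset from the start of flash
    size: int   # length in bytes


def flash_layout(flash_size, cfe_size, nvram_size, trx_len):
    """Return the regions in flash order; NVRAM sits at the very end."""
    fw_space = flash_size - cfe_size - nvram_size
    if trx_len > fw_space:
        raise ValueError("trx does not fit between CFE and NVRAM")
    return [
        Region("CFE", 0, cfe_size),
        Region("trx", cfe_size, trx_len),
        Region("unused", cfe_size + trx_len, fw_space - trx_len),
        Region("NVRAM", flash_size - nvram_size, nvram_size),
    ]


# Hypothetical sizes, chosen only for illustration.
layout = flash_layout(4096 * KIB, 256 * KIB, 64 * KIB, 1536 * KIB)
for r in layout:
    print(f"{r.name:7s} 0x{r.start:06x} .. 0x{r.start + r.size:06x}")
```

The only real point is that the firmware region is whatever is left between the two fixed ends, and the trx rarely fills it.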

(Note that the <model>.bin files are nothing more than the generic trx file with an additional header prepended to identify the model. The model information gets verified by the vendor's upgrade utilities and the remaining data -- the trx -- gets written to the flash. When upgrading from within openwrt remember to use the trx file.)
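For anyone who wants to poke at an image, here is a minimal sketch of reading the trx header described above and of locating the trx inside a <model>.bin. Several details are assumptions on my part and not stated in this post: little-endian fields, exactly three pointers, and a checksum that covers everything after the crc32 field and is stored as a CRC32 without the final bit inversion. The helper names and the "W54G" vendor header are invented for the example.

```python
import struct
import zlib

TRX_MAGIC = b"HDR0"
# magic, length, crc32, flags, then three pointers (count and byte order assumed)
TRX_HEADER = struct.Struct("<4sIII3I")
CRC_START = 12  # assumption: the checksum covers everything after the crc32 field


def trx_crc(data: bytes) -> int:
    # Assumption: ordinary CRC32, stored without the final inversion.
    return ~zlib.crc32(data) & 0xFFFFFFFF


def find_trx(image: bytes) -> int:
    """Return the offset of the trx inside a vendor .bin by searching for the magic."""
    off = image.find(TRX_MAGIC)
    if off < 0:
        raise ValueError("no HDR0 magic found")
    return off


def parse_trx(trx: bytes) -> dict:
    magic, length, crc, flags, *pointers = TRX_HEADER.unpack_from(trx)
    if magic != TRX_MAGIC:
        raise ValueError("not a trx")
    return {
        "length": length,
        "flags": flags,
        "pointers": pointers,
        "crc_ok": trx_crc(trx[CRC_START:length]) == crc,
    }


def build_trx(payload: bytes, pointers=(28, 0, 0)) -> bytes:
    """Build a toy trx so the parser has something to chew on."""
    tail = struct.pack("<I3I", 0, *pointers) + payload  # flags, pointers, data
    length = 12 + len(tail)
    return struct.pack("<4sII", TRX_MAGIC, length, trx_crc(tail)) + tail


# A made-up vendor header in front of a toy trx.
bin_image = b"W54G" + bytes(28) + build_trx(b"kernel-and-rootfs")
trx = bin_image[find_trx(bin_image):]
info = parse_trx(trx)
```

Searching for the magic is a shortcut; it would be fooled by a vendor header that happens to contain "HDR0". Changing any byte after the crc32 field makes `crc_ok` come back false, which is exactly the condition that sends CFE into recovery.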

So what exactly is the firmware?

The boot loader really has no concept of filesystems; it pretty much assumes that the start of the trx data section is executable code. So, at the very start of our firmware is the kernel. But just putting a kernel directly onto flash is quite boring and consumes a lot of space, so we compress the kernel with a heavy compression known as LZMA. Now the start of firmware is code for an LZMA decompressor:

[lzma decompress][lzma compressed kernel]

Now, the boot loader boots into an LZMA program which decompresses the kernel into memory and executes it. It adds a second to the bootup time, but it saves a large chunk of flash space. (And if that wasn't amusing enough, it turns out the boot loader does know gzip compression, so we gzip compressed the LZMA decompression program)

Immediately following the kernel is the filesystem. We use squashfs for this because it's a highly compressed readonly filesystem -- remember that altering the contents of the trx in any way would invalidate the crc, so we put our writable data in a jffs2 partition outside the trx. This means that our firmware looks like this:

[trx (gzip'd lzma decompress)(lzma'd kernel)(squashfs filesystem)]

And the entire flash usage looks like this -

[CFE][trx (gz'd lzma)(lzma'd kernel)(squashfs)][ jffs2 filesystem ][NVRAM]

That's about as tight as we can possibly pack things into flash.
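As a rough illustration of how the three pieces end up back to back inside the trx, the sketch below packs a gzip'd loader, an LZMA'd kernel and a filesystem image and records where each one starts. `pack_firmware`, the 28-byte header size and the three-pointer convention are assumptions for the example; it also ignores any alignment or padding a real image builder might apply, and the inputs are placeholder bytes, not a real kernel or squashfs.

```python
import gzip
import lzma

HEADER_SIZE = 28  # assumed trx header size: magic, 3 fields, 3 pointers


def pack_firmware(loader: bytes, kernel: bytes, rootfs: bytes):
    """Return (payload, pointers); pointers are offsets from the start of the trx."""
    parts = [
        gzip.compress(loader),                            # the boot loader understands gzip
        lzma.compress(kernel, format=lzma.FORMAT_ALONE),  # unpacked by the loader at boot
        rootfs,                                           # squashfs image, already compressed
    ]
    pointers, pos = [], HEADER_SIZE
    for part in parts:
        pointers.append(pos)
        pos += len(part)
    return b"".join(parts), pointers


# Placeholder inputs standing in for the real loader, kernel and squashfs.
loader = b"loader-code" * 50
kernel = bytes(65536)
rootfs = b"sqsh" + bytes(128)
payload, pointers = pack_firmware(loader, kernel, rootfs)
print(len(loader) + len(kernel) + len(rootfs), "->", len(payload), pointers)
```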

Why squashfs+jffs2?

System bootup is as follows -
- kernel boots from squashfs and runs /etc/preinit
- /etc/preinit runs /sbin/mount_root
- mount_root mounts the jffs2 partition (/jffs) and combines it with the squashfs partition (/rom) to create a new virtual root filesystem (/)
- bootup continues with /sbin/init

Both squashfs and jffs2 are compressed filesystems using LZMA for the compression. Squashfs is a readonly filesystem while jffs2 is a writable filesystem with journaling and wear leveling. Since squashfs is a readonly filesystem, it doesn't need to align the data, allowing it to pack the files tighter for 20-30% savings over a jffs2 filesystem.

Our job when writing the firmware is to put as much common functionality on squashfs while not wasting space with unwanted features. Additional features can always be installed onto jffs2 by the user. The use of mini_fo means that the filesystem is presented as one large writable filesystem to the user with no visible boundary between squashfs and jffs2 -- files are simply copied to jffs2 when they're written.
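The copy-on-write behaviour can be modelled in a few lines. This is only a toy model of the idea, not how mini_fo is implemented; `OverlayRoot` is a hypothetical class, the two layers are plain dicts, and file deletion is not modelled at all.

```python
class OverlayRoot:
    """Toy model: a read-only layer (squashfs) under a writable one (jffs2)."""

    def __init__(self, rom: dict):
        self.rom = dict(rom)  # /rom: never modified
        self.jffs = {}        # /jffs: starts out empty after a reflash

    def read(self, path):
        # The writable layer wins if it holds a copy of the file.
        if path in self.jffs:
            return self.jffs[path]
        return self.rom[path]

    def write(self, path, data):
        # Writes never touch squashfs; the file lands on jffs2.
        self.jffs[path] = data

    def restore(self, path):
        # Dropping the jffs2 copy exposes the original squashfs version again.
        self.jffs.pop(path, None)


root = OverlayRoot({"/etc/banner": "stock banner"})
root.write("/etc/banner", "my banner")
after_write = root.read("/etc/banner")
root.restore("/etc/banner")
after_restore = root.read("/etc/banner")
```

Note that the squashfs copy is still there after the write, still taking up space; the jffs2 copy merely hides it.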

It's not all without side effects however -

The fact that we pack things so tightly in flash means that if the firmware ever changes, the size and location of the jffs2 partition also changes, potentially wiping out a large chunk of jffs2 data and corrupting the filesystem. To deal with this, we've implemented a policy that after each reflash the jffs2 data is reformatted. The trick to doing that is a special value, 0xdeadc0de; when this value appears in a jffs2 partition, everything from that point to the end of the partition is wiped. So, hidden at the end of the firmware images, is the value 0xdeadc0de, positioned such that it becomes the start of the jffs2 partition.
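The marker rule is simple enough to sketch. The byte order of the marker on flash is an assumption here, `wipe_from_marker` is a hypothetical name, and a real implementation would presumably work in erase-block units and not byte by byte.

```python
MARKER = bytes.fromhex("deadc0de")  # byte order on flash is an assumption
ERASED = 0xFF                       # erased flash reads back as all ones


def wipe_from_marker(partition: bytearray) -> bool:
    """Erase everything from the marker to the end of the partition."""
    pos = partition.find(MARKER)
    if pos < 0:
        return False  # no marker: leave the existing jffs2 data alone
    partition[pos:] = bytes([ERASED]) * (len(partition) - pos)
    return True


# Freshly flashed: the marker sits at the start of what is now the jffs2 partition.
jffs2 = bytearray(MARKER + b"stale data left over from the previous install")
wiped = wipe_from_marker(jffs2)
```

Because the marker lands at the very start of the partition after a reflash, the whole partition comes out erased, which is what a fresh jffs2 expects.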

The fact we use a combination of compressed and partially readonly filesystems also has an interesting effect on package management. In particular, you need to be careful what packages you update. While the ipkg util is more than happy to install an updated package on jffs2, it's unable to remove the original package from squashfs; the end result is that you slowly start using more and more space until the jffs2 partition is filled. The ipkg util really has no idea how much space is available on the jffs2 partition since it's compressed, and so it will blindly keep going until the ipkg system crashes -- at that point you have so little space you probably can't even use ipkg to remove anything.
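Why free space can't be predicted on a compressed filesystem is easy to demonstrate: two files of identical length can cost wildly different amounts of flash. In this sketch zlib is only a stand-in for whatever compressor the filesystem uses, and `stored_size` is a hypothetical helper.

```python
import os
import zlib


def stored_size(data: bytes) -> int:
    """Bytes consumed after compression; zlib stands in for the filesystem's compressor."""
    return len(zlib.compress(data))


config_like = b"option enabled 1\n" * 4096    # highly repetitive, compresses well
binary_like = os.urandom(len(config_like))     # random bytes, effectively incompressible

print(len(config_like), stored_size(config_like), stored_size(binary_like))
```

A package manager that only knows the uncompressed size of what it's about to install has no way to tell which of these two cases it is in.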

Can we switch the filesystem to be entirely jffs2?

Yes, it's technically possible, but a bit of a mess to actually pull off. The firmware has to be loaded as a trx file, which means that you have to put the jffs2 data inside of the trx. But, as I said above, the trx has a checksum, meaning that if you ever change that data, you invalidate the checksum. The solution is that you install with the jffs2 data contained within the trx, and then change the trx boundaries at runtime. The end result is a single jffs2 partition for the root filesystem. Why someone would want to do it is beyond me; it takes more space, and while it would allow you to upgrade the contents of the filesystem you would still be unable to replace the kernel (outside of the filesystem), meaning that it's not a seamless upgrade between releases. Having squashfs gives you a failsafe mechanism where you can always ignore the jffs2 partition and boot directly off squashfs, or restore files to their original squashfs versions.

I used to have a trick where I could convert a squashfs install to a jffs2 install at runtime by copying all the data onto the squashfs partition and changing the partition boundaries. I never really had much use for the util -- not to mention it required a rather large flash to store both squashfs and jffs2 copies of the root during transition -- so support for it was dropped.

As for the proper ways to recover a "bricked" router -

failsafe -
OpenWrt has a builtin failsafe mode which will attempt to bypass almost all configuration in favor of hardcoded defaults, resulting in a router that boots up as 192.168.1.1 with few if any services running. From this state you can telnet in and fix any problems you may have with the filesystem or configuration.

boot_wait -
The single best thing you can do is have boot_wait set, meaning that all you have to do is TFTP a new firmware. At one time the reflashing instructions included an exploit for the Linksys firmware that set the boot_wait variable; as time progressed and Linksys eventually fixed the bug (after several failed attempts) we found that people were flashing to other firmwares for the sole purpose of setting boot_wait so they could reflash to OpenWrt. We figured this was somewhat pointless and altered the instructions to indicate that you could safely reflash to OpenWrt without setting boot_wait.

JTAG -
It's one of those amazingly useful things that allows you to recover from pretty much anything that doesn't involve a hardware failure. While the JTAG can technically be used to watch every instruction and register as the system boots, the recovery software only uses it for DMA access to the flash chip, making it somewhat a blind recovery mechanism.

The biggest mistake people seem to make with JTAG is the "wipe everything and reload CFE" approach; they either can't find the correct CFE version after wiping the device, or they reflash with a CFE which is incompatible with their device. You should always try to use the CFE version that came with the device rather than attempting to replace it with some random CFE you found on the internet.

Second mistake - embedded within CFE is a set of NVRAM defaults to be used if the NVRAM partition is missing. This means that in most cases you can just wipe everything but CFE and it'll happily boot, recreate NVRAM and start waiting for a firmware via TFTP. In some cases however, the embedded defaults (in the CFE shipped with the device) don't match the actual hardware and CFE will fail to boot. This is why we have the warnings not to wipe NVRAM. To recover from this situation you need either the original NVRAM contents, or a version of CFE with the correct defaults.

Serial -
Serial consoles are great, there's just one problem - the routers run on 3.3v and a normal PC serial port puts out +/-12v, easily frying a router. This means that a level shifter such as a max233 is required, and adding the ICs and caps required is beyond the ability of most users -- luckily there's a shortcut. Most cellphones are either USB or 3.3v serial, so the data cable for a 3.3v cellphone can be used to make an easy and professional looking serial console connection. You only need to identify and connect 4 wires (vcc, rx, tx, gnd) -- and if your cable uses a pl2303 you can skip the vcc connection.

Serial console allows you to interact with the CFE command line, watch the kernel boot, and get console access to linux. This is probably the only way you'll ever get any meaningful feedback about the device boot up.

LEDs -
Most people assume the LEDs on the front are deterministic, and that from which LEDs are lit you can instantly tell if the hardware is working or where it crashed in bootup. This unfortunately isn't the slightest bit true.

- Power LED. The biggest mistake people make here is "my power led is blinking, what does that mean?". There's an assumption that if the LED is blinking there must be software turning the LED on and off, and that it must mean something. The blinking is actually done in hardware; software only has the ability to set the LED "on" or "blink" -- it defaults to blink on power up and isn't set to on until after the firmware boots. If the LED is on then you know the firmware booted; blinking really doesn't tell you much.

- Switch LEDs. The second common mistake is "the switch still works". Of course the switch still works, it's a separate piece of hardware and the LEDs are wired directly to it. The only useful bit of information you can get is "all the switch LEDs are lit". When the switch chip is reset, all of the ports will light up (even if no devices are connected) for about a second; this happens at power up and again as the firmware boots and reprograms the switch. If they stay lit, you're either a moron for not noticing the ports are actually in use, or someone has broken/shorted the switch chip. You can also notice reboot loops by watching for the switch reset.

- Diag/DMZ LED. Controlled by OpenWrt (diag module) to indicate bootup.

- Wifi. Controlled by the wifi driver; trivia - the wifi driver can also reset the power led in certain situations.

....

Stupid things people do -

Pin shorting -

In the past we used to suggest that people short a few pins of the flash; when CFE booted and attempted to perform the CRC32 there would be a flash read error which would change the outcome of the CRC and the resulting failure would force CFE into recovery mode. It's a great trick, but over the years we've learned that people are idiots and will take that as an invitation to poke, mangle and short just about every pin on the device based on some irrational belief that if they find the right pin everything will magically work again. You do not want someone paranoid at the thought of breaking the device scraping up every single electrical connection on the device -- it never ends well, and generally results in the flash chip or the router being damaged in the process.

- frying a chip (worst case)
- lifting/breaking electrical connections
- permanently shorting (best case)

The best case is that they simply bent a pin and you can easily bend it back - providing you can find it.

Depending on which pins are shorted/broken, it may be possible to access CFE but not to access the rest of the flash. Meaning CFE boots fine but can't read or write the firmware. This can be confirmed by JTAG.

Wrong CFE version -
Loading the wrong CFE version can also lead to devices which boot into CFE but are unable to write to the flash, or are unable to initialize the networking.

And yes, there are actually a few obscure versions that require the firmware to be named "code.bin" or a specific port to be used. Unfortunately nobody can remember exactly which devices, leading to all sorts of superstition.

Re: Everything you need to know about broadcom hardware (Part 1)

Great!

could you write a similar document about fon2100?


Re: Everything you need to know about broadcom hardware (Part 1)

This is absolutely fantastic.  Can you shed some light on BCM43xx, wlc, nvram, and proprietary driver in your next installment?

Re: Everything you need to know about broadcom hardware (Part 1)

Fantastic!
Learn a lot.

Re: Everything you need to know about broadcom hardware (Part 1)

very good, to the point and pragmatic document mbm, Thank you

So the CFE is more than just a simple mover of data and decompressor? To even sustain execution it must be setting up a lot of parameters on the Broadcom CPU: interrupts, timers, prescaler. Then Ethernet (for the tftp client) and IP stack and a serial driver (for the console). Speaking of serial, is it a bit-banged serial or does the mobo actually have a uart? And if it is based on a uart then we are probably talking about setting up handlers and interrupts just for the serial console. All of this packed into the CFE code? Wow.

Also I think the next step would be to publish a write-up on moving into the practical application of this great conceptual high-level summary. In other words, now that we know how the hardware works and how it allocates the images, let's see how the images are made, loaded and executed. I am talking about writing and compiling code, making the trx image, the structure of the executables (the package discussion comes here, it is a departure from what traditional PC-Linux exhibits) and ultimately how one writes the great "Hello World" program in c++, compiles it (as a package maybe), puts it into a trx, flashes the trx to the router and from an ssh console prompt types ./hello and voila, the console says "Hello World" in response.

I think this will complete the excellent start of this thread. If I knew what to write I would love to have a clean 1-2-3 WiKi for those who are less than intimate with openWrt.

~B

Re: Everything you need to know about broadcom hardware (Part 1)

good job

7 (edited by PopOpen 2011-04-15 00:23:19)

Re: Everything you need to know about broadcom hardware (Part 1)

Great Info and amusing as well!

Perhaps you're the right person who might know something about this:

I am trying to understand the Firmware distributed with the Pirelli Routers, which is made by Jungo modified OpenRG FW. As most Pirelli routers are using Broadcom chips, they probably also use Broadcom-like firmware as you have described above. The problem is that in the Pirelli case the FW images used to be distributed as a .img files, but now they are only found in .rmt versions.  This is a problem because it means that we can not install OpenWrt FW images from the web interface. These image-types seem to differ only by a 0x200+ byte header, with a non trivial (non-text/info) 0x20 byte hash. I suspect that this is a CRC32 or SHA1 hash.

But which is it, and how do I calculate it? 

While Googling, I did find a very small note, mentioning that old Broadcom .trx images use:
* Little Endian CRC32
* "Pre-inverted" (?)
* "No length" info... 
Is this true and what does it mean? It's unclear to me, even after checking WikiPedia...

I have created a thread about this here: "Help! To understand, extract and create Pirelli firmware images".

Re: Everything you need to know about broadcom hardware (Part 1)

Yeah, you are a really great developer, mbm.
Very useful guide.

9 (edited by raiden 2011-12-02 07:35:38)

Re: Everything you need to know about broadcom hardware (Part 1)

Hello.

Is this kind of architecture still up to date with latest versions of OpenWRT?

Re: Everything you need to know about broadcom hardware (Part 1)

Excellent and thank you. Perhaps, you ought to put this to OpenWRT Wiki.

Mazi