OpenWrt Forum Archive

Topic: wget and utf-8

The content of this topic has been archived on 29 Mar 2018. There are no obvious gaps in this topic, but there may still be some posts missing at the end.

Hi!
I am trying to retrieve a UTF-8 document with wget.
I want the Swedish characters åäö ÅÄÖ to display correctly. Currently they are displayed as "Ã¥ ä ö Ã… Ä Ö".
Is there something that I have to do with wget or with the resulting file?

A UTF-8 example document:
wget -O - http://www.columbia.edu/~fdc/utf8/

Thanks for any help!

This is solely a local terminal display issue; wget does not change the contents at all.
For example, in gnome-terminal you can choose Terminal -> Set Character Encoding -> Unicode (UTF-8), and the output of your command will display correctly.
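
To illustrate the point above, here is a minimal sketch showing that the bytes stay the same and only their interpretation differs. It assumes `od`, `tr` and `iconv` are available, which may not be the case on a stock OpenWrt image; the variable names are made up for the example. The letter å is two bytes in UTF-8, and a terminal set to Latin-1 shows each byte as its own character, which gives the "Ã¥" seen in the question.

```shell
# å encoded as UTF-8 is the two bytes 0xC3 0xA5 (octal 303 245).
utf8_bytes=$(printf '\303\245')

# Hex dump of those bytes: this is what wget delivers, whatever the terminal shows.
hex=$(printf '%s' "$utf8_bytes" | od -An -tx1 | tr -d ' \n')
echo "bytes: $hex"    # prints: bytes: c3a5

# Simulate a Latin-1 terminal: treat each byte as one ISO-8859-1 character
# and re-encode the result as UTF-8 so that a UTF-8 terminal can show it.
garbled=$(printf '%s' "$utf8_bytes" | iconv -f ISO-8859-1 -t UTF-8)
echo "read as Latin-1: $garbled"
```

On a terminal set to UTF-8, the second line should show the two characters Ã and ¥, while the file itself still holds a correctly encoded å.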

Thanks for a swift response!

How can I convert the retrieved document from UTF-8 to whatever OpenWrt needs to display it properly?
i.e. "Ã¥ ä ö Ã… Ä Ö" as åäö ÅÄÖ

Thanks again!

You can't. Again, this has nothing to do with OpenWrt at all; it is a configuration setting of the terminal you use to view it, e.g. gnome-terminal, kterm, or PuTTY on Windows.

Thanks for the heads up! - good help

For my purpose, I did a simple replace in the text file:

sed -i 's/Ã¥/a/g' /www/pd.txt
sed -i 's/ä/a/g' /www/pd.txt
sed -i 's/ö/o/g' /www/pd.txt
sed -i 's/Ã…/A/g' /www/pd.txt
sed -i 's/Ä/A/g' /www/pd.txt
sed -i 's/Ö/O/g' /www/pd.txt
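
A variation on the replacements above, as a sketch only: it matches the actual UTF-8 letters rather than their garbled display forms, and does all six substitutions in one sed call. It assumes the script itself is saved as UTF-8; the function name `strip_swedish` is hypothetical and not part of any tool.

```shell
# Hypothetical helper: read text on stdin and replace the Swedish letters
# å ä ö Å Ä Ö with plain ASCII look-alikes.
# Each s command matches the literal UTF-8 byte sequence of one letter,
# so this does not depend on sed being run in a UTF-8 locale.
strip_swedish() {
    sed -e 's/å/a/g' -e 's/ä/a/g' -e 's/ö/o/g' \
        -e 's/Å/A/g' -e 's/Ä/A/g' -e 's/Ö/O/g'
}

printf 'Smörgåsbord på Öland\n' | strip_swedish    # prints: Smorgasbord pa Oland
```

To apply it to a file, something like `strip_swedish < /www/pd.txt > /tmp/pd.ascii.txt` would work (the output path is just an example). Writing to a new file keeps the original UTF-8 document intact.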

The discussion might have continued from here.