Hi,
I'm having a problem with interactive/streaming traffic, in particular with IPTV streaming of HD (720p-1080p) content, for which a 5-10 Mbit/s download connection should be enough.
The video plays slowly and sometimes it freezes for a second or less.
Since I have already checked and solved some problems with the ADSL link status with my ISP, as reported here, I think it could be related to bufferbloat.
I'm trying to solve this problem with SQM QoS as described here and here.
I have the latest OpenWrt 18.06.1 r7258-5eb055306f with Linux kernel 4.9.120.
OpenWrt reports an ADSL data rate of 16.547 Mb/s / 909 kb/s and a max. attainable data rate (ATTNDR) of 16.660 Mb/s / 909 kb/s. My ADSL2 parameters: G.992.5 (ADSL2+) with PPPoATM and an MTU of 1478 (suggested by the ISP).
I set up layer_cake SQM with 14 Mbit/s download and 750 kbit/s upload (about 85% of the maximum), as you can see:
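In UCI terms, the setup corresponds roughly to this /etc/config/sqm (a sketch with option names as used by the sqm-scripts package; the overhead value is an assumption for PPPoA with VC-Mux, not something I measured):

```
config queue 'eth1'
    option enabled '1'
    option interface 'pppoa-wan'   # the ADSL WAN interface
    option download '14000'        # kbit/s, ~85% of the 16.5 Mb/s sync rate
    option upload '750'            # kbit/s, ~82% of the 909 kb/s sync rate
    option qdisc 'cake'
    option script 'layer_cake.qos'
    option linklayer 'atm'         # ADSL carries IP inside 53-byte ATM cells
    option overhead '10'           # assumption: per-packet overhead for PPPoA, VC-Mux
```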
The strange thing is that a neighbour of mine (same street), with a theoretical rate of 8 Mbit/s, a practical rate of 5 Mbit/s and a different ISP, can watch the same content very well (even with two devices).
What could the problem be?
Thank you
Is there any WLAN link on your path? Temporarily replace it with a wired connection if possible.
Monitor the DSL link usage to check if you can see any reduced throughput. How long does it usually take until the video quality drops?
I would suggest running a speed test for a longer time to reproduce the issue. You could also double-check with a different video source to rule out any bottleneck at the server end.
Does your IPTV stream use TCP? Cake's ack-filter on egress could help if that's the case. Since the down/up ratio of 18.2 (16547/909) is very asymmetric, the ACKs need a significant share of the upstream capacity, competing with other uploads. This can reduce the IPTV download throughput. Also make sure the TV/media player is the only device putting load on the DSL, at least during the test.
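To put a number on that, here is a rough back-of-the-envelope estimate (a sketch: one ACK per two segments is the usual TCP delayed-ACK behaviour, and the 106-byte on-wire ACK size is my assumption of two 53-byte ATM cells per ACK on a PPPoA link):

```shell
# Rough estimate of upstream bandwidth consumed by pure TCP ACKs
# while the downstream is saturated at the 16.547 Mb/s sync rate.
awk 'BEGIN {
    down_bps = 16547000              # downstream sync rate, bit/s
    mtu      = 1478                  # MTU suggested by the ISP, bytes
    segs     = down_bps / (mtu * 8)  # full-size segments per second
    acks     = segs / 2              # delayed ACKs: one ACK per two segments
    ack_wire = 106                   # assumption: one 40-byte ACK -> two ATM cells
    printf "%.0f kbit/s of upstream for ACKs alone\n", acks * ack_wire * 8 / 1000
}'
# prints: 593 kbit/s of upstream for ACKs alone
```

Under these assumptions that is roughly two thirds of the 909 kbit/s uplink, which is why ack-filtering can matter so much on very asymmetric links.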
No, there is no wireless link. I have: modem/router Home Hub 5A (Gigabit Ethernet) -> Netgear GS605v4 switch (Gigabit Ethernet) -> Samsung smart TV (Fast Ethernet) or smart box (Fast Ethernet).
The video quality is good, but playback is slow and not in real time, e.g. sometimes it freezes for a second.
To perform the test I used DSLReports with 60 seconds (the maximum allowed). Do you know better software or a better website?
I asked the service provider for more information, even though I do not think it is the problem, since it works well for my neighbour.
Then I installed tcpdump, captured the traffic with tcpdump -i pppoa-wan -v -U -s0 -w capture.cap and analysed it with Wireshark. To my big surprise, I found that the IPTV uses TCP, so it is a point-to-point connection instead of a smarter multicast one (RTP over UDP).
The use of TCP explains why the video quality is good but playback is slow; as you suggested, it may be related to ACK transmission, i.e. in Wireshark I saw many duplicate ACKs and many TCP out-of-order segments and reassembled PDUs.
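For anyone wanting to reproduce this, the same symptoms can be counted from the capture on the command line with tshark (Wireshark's CLI; the filters are Wireshark's standard tcp.analysis fields):

```shell
# Count duplicate ACKs and out-of-order segments in the capture
tshark -r capture.cap -Y 'tcp.analysis.duplicate_ack' | wc -l
tshark -r capture.cap -Y 'tcp.analysis.out_of_order' | wc -l

# List the TCP conversations to spot the IPTV server's address
tshark -r capture.cap -q -z conv,tcp
```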
How can I set up the ack-filter? I found this interesting paper about Piece of CAKE and OpenWrt. It shows that the ack-filter in aggressive mode produces a great benefit. Moreover, the DiffServ4 option achieves a great improvement over BestEffort.
Do you know how to enable them? I only found this on the bufferbloat mailing list: option eqdisc_opts 'nat dual-srchost ack-filter'
and this comprehensive man page.
Finally, I found that during the evening and at night, when the Internet connection is crowded, the speed drops a lot, and this could be the source of the problem.
Morning
Evening
or
Note: the evening and night upload charts have a big spike at the beginning and at the end of the test.
I agree this is the source of the problem. Check your ADSL link status during such a time of congestion. Is the data rate also reported with lower numbers? Do you see a higher error rate?
Yes, that is right: add ack-filter to option eqdisc_opts.
For 4.83 Mb/s down and 0.112 Mb/s up, the ack-filter might help a bit. But at 1.83/0.009, IPTV over TCP is unlikely to work at all; even web browsing will be very slow.
First, according to the paper, diffserv4 is better than diffserv3 since it also takes the video tin into account, while ack-filter-aggressive is better than ack-filter.
Moreover, according to the documentation here, nat together with dual-dsthost and dual-srchost is better suited than triple-isolate. Finally, I also added pppoa-vcmux.
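Concretely, I ended up with something like this in the /etc/config/sqm queue section (a sketch; option names as used by sqm-scripts, which requires the advanced flags below before it honours *disc_opts, and note the ack-filter only makes sense on egress):

```
config queue 'eth1'
    option qdisc 'cake'
    option script 'layer_cake.qos'
    option qdisc_advanced '1'
    option qdisc_really_really_advanced '1'
    # ingress (download) side:
    option iqdisc_opts 'nat dual-dsthost diffserv4 pppoa-vcmux'
    # egress (upload) side, with the aggressive ACK filter:
    option eqdisc_opts 'nat dual-srchost diffserv4 ack-filter-aggressive pppoa-vcmux'
```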
Some benchmarks seem to give better results.
Morning
Afternoon
I will report how it works during the evening and over the weekend.
In the meantime, I'm also trying to use multiple WAN connections with load balancing: the main ADSL2 connection (wan) and a backup 4G connection (via Wi-Fi tethering from a smartphone) (wwan) with a limited amount of monthly traffic.
I successfully installed the mwan3 package and set up load balancing between both connections according to the documentation.
I would like to use only the main wan for all traffic, except for traffic involving a specific fixed IP belonging to the smart TV.
cat /etc/config/mwan3

config rule 'IPTV_out'
    option proto 'all'
    option sticky '0'
    option use_policy 'balanced'
    option src_ip '192.168.1.200'

config rule 'IPTV_in'
    option proto 'all'
    option sticky '0'
    option use_policy 'balanced'
    option dest_ip '192.168.1.200'

config rule 'default_rule'
    option dest_ip '0.0.0.0/0'
    option proto 'all'
    option sticky '0'
    option use_policy 'wan_only'

config policy 'wan_only'
    list use_member 'wan_m1_w3'
    list use_member 'wan6_m1_w3'

config policy 'balanced'
    option last_resort 'unreachable'
    list use_member 'wan_m1_w3'
    list use_member 'wan6_m1_w3'
    list use_member 'wwan_m1_w2'

config policy 'wan_wwan'
    option last_resort 'unreachable'
    list use_member 'wan_m1_w3'
    list use_member 'wan6_m1_w3'
    list use_member 'wwan_m2_w3'

config member 'wan_m1_w3'
    option interface 'wan'
    option metric '1'
    option weight '3'

config member 'wan6_m1_w3'
    option interface 'wan6'
    option metric '1'
    option weight '3'

config globals 'globals'
    option mmx_mask '0x3F00'
    option local_source 'lan'

config interface 'wan'
    option enabled '1'
    option family 'ipv4'
    option reliability '2'
    option count '1'
    option timeout '2'
    option interval '5'
    option down '3'
    option up '8'
    option initial_state 'online'
    list track_ip '1.1.1.1'
    list track_ip '208.67.222.222'
    list track_ip '208.67.220.220'
    option track_method 'ping'
    option size '56'
    option check_quality '0'
    option failure_interval '5'
    option recovery_interval '5'
    option flush_conntrack 'never'

config interface 'wan6'
    option enabled '0'
    list track_ip '2001:4860:4860::8844'
    list track_ip '2001:4860:4860::8888'
    list track_ip '2620:0:ccd::2'
    list track_ip '2620:0:ccc::2'
    option family 'ipv6'
    option reliability '2'
    option count '1'
    option timeout '2'
    option interval '5'
    option down '3'
    option up '8'

config interface 'wwan'
    option enabled '1'
    option initial_state 'online'
    option family 'ipv4'
    option track_method 'ping'
    option count '1'
    option size '56'
    option check_quality '0'
    option timeout '2'
    option interval '5'
    option failure_interval '5'
    option recovery_interval '5'
    option flush_conntrack 'never'
    option down '3'
    option up '8'
    list track_ip '1.1.1.1'
    list track_ip '208.67.222.222'
    list track_ip '208.67.220.220'
    option reliability '2'

config member 'wwan_m1_w3'
    option interface 'wwan'
    option metric '1'
    option weight '3'

config member 'wwan_m1_w2'
    option interface 'wwan'
    option metric '1'
    option weight '2'

config member 'wwan_m2_w3'
    option interface 'wwan'
    option metric '2'
    option weight '3'
In this way, the smart TV should connect to the Internet if either wan or wwan is up. However, the smart TV only works when wan is up.
mwan3 status
Interface status:
interface wan is online and tracking is active
interface wan6 is unknown and tracking is down
interface wwan is online and tracking is active
Current ipv4 policies:
balanced:
wwan (40%)
wan (60%)
wan_only:
wan (100%)
wan_wwan:
wan (100%)
Current ipv6 policies:
balanced:
unreachable
wan_only:
unreachable
wan_wwan:
unreachable
Directly connected ipv4 networks:
127.0.0.0/8
192.168.43.74
224.0.0.0/3
127.0.0.0
192.168.1.1
192.168.43.0/24
94.38.203.143
192.168.43.0
192.168.43.255
213.205.53.51
127.255.255.255
192.168.1.0/24
127.0.0.1
192.168.1.255
192.168.1.0
Directly connected ipv6 networks:
fe80::/64
fd02:28a1:4744::/64
Active ipv4 user rules:
0 0 - balanced all -- * * 192.168.1.200 0.0.0.0/0
0 0 - balanced all -- * * 0.0.0.0/0 192.168.1.200
7 500 - wan_only all -- * * 0.0.0.0/0 0.0.0.0/0
Active ipv6 user rules:
23 1868 - wan_only all * * ::/0 ::/0
There really is no objectively "better" among the diffserv schemes, only a better or worse fit for your use case ;), and the same goes for the ack-filter; but since you tested that these work better for you, you can safely ignore my comment (which is mainly meant for others reading your post).
For benchmarking, the dual-xxxhost modes seem better suited, since their behaviour under typical benchmark loads is easier to understand and predict than triple-isolate's; for more realistic traffic patterns the differences between the two are less clear, but personally I also prefer the conceptual clarity of the dual-xxxhost keywords.
I made more tests during the most crowded time of the day (evening-night).
Unfortunately, during some periods both upload and download speeds are far too low.
Moreover, all the QoS countermeasures I enabled cannot solve the problem, even though they improve the situation.
So the last possibility I have before giving up is to try load balancing.
I have already made some tests with it, but I'm having two problems:
1. Load balancing between ADSL (wan) and 4G (wwan) works only when both are online. In theory, I expected it to work even if one of the two goes offline, with OpenWrt moving the load to the online link accordingly.
2. I would like to use load balancing only for one specific IP and the ADSL (wan) for the rest of the network, but it does not seem to work.
Yes, it was more or less as usual. OpenWrt reported an ADSL data rate of 16.517 Mb/s / 912 kb/s and a max. attainable data rate (ATTNDR) of 16.460 Mb/s / 909 kb/s.
This seems to indicate congestion upstream of your access link, maybe at the uplink of the DSLAM/MSAN. Could you try to run mtr against, say, the website of a local university (to get a reasonably close, well-connected ICMP responder that is not co-located with your ISP) during the peak hours in the evening? The hallmark of a congested link is that all RTTs after that hop show an increase compared to non-peak hours. Please note that reading traceroute/mtr output is not as easy as it initially seems (see https://www.nanog.org/meetings/nanog47/presentations/Sunday/RAS_Traceroute_N47_Sun.pdf for more details).
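Something like the following, run once off-peak and once at peak hours, produces directly comparable reports (the hostname is a placeholder for whatever nearby, well-connected site you pick):

```shell
# 100 probe cycles in wide report mode; run off-peak and at peak, then compare
mtr --report --report-wide --report-cycles 100 www.example-university.example
```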
I don't know the mwan3 package, but this is easily doable with policy routing in Linux. The more complicated part is if you want the IPTV host to use ADSL unless it's too congested, and then fail over, because you need to detect the congestion...
Thank you for the material, I found it very interesting.
To use mtr as you suggested, I first found the IP addresses of the video source via tcpdump + Wireshark (as explained in my second post in this thread). Then I ran mtr against my "local" university (about 40 km away; I live in the countryside. I also tried mtr against some companies that are close (10-15 km) and have optical fibre connections, without any great difference in the results).
Finally, I ran mtr against the two most used IP addresses of the video source (presumably one for accounting and one for the video stream), and the results are very similar to those of my local university. In particular, in every case (even with different IPs) the third and fourth hops have high worst and standard-deviation values, even though the average value is normal.
Could this be the source of the problem?
University
Account
Stream
Note: I have to repeat the test during the most crowded time on weekend.
P.S. After the test my IP was banned (HTTP 403 Forbidden error) and I cannot watch the IPTV any more, even though I can still reach it via ping, traceroute, etc. I have to use a VPN to access it. It is very strange behaviour...
From my understanding this is harmless; it mostly indicates that these hops are not optimised for responding to ICMP/UDP probes, which is quite typical behaviour for routers. If all worst RTTs after those hops were increased, that would be different...
Sure, do that; but I am pessimistic that this is the root cause of your problems...
You need to talk to the people managing the IPTV source, but it might indicate that it did not like you probing it constantly and misdiagnosed your network debugging as a nefarious attack on its services...
According to the mwan3 documentation, it exploits the policy routing of the Linux kernel. In particular:
mwan3 uses normal Linux policy routing to balance outgoing traffic over multiple WAN connections
Linux outgoing network traffic load-balancing is performed on a per-IP connection basis
As such load-balancing will help speed multiple separate downloads or traffic generated from a group of source PCs all accessing different sites but it will not speed up a single download from one PC
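In plain Linux terms (outside mwan3), the per-source policy routing the documentation describes boils down to something like this (a sketch; the table number, interface name and gateway address are assumptions for illustration, not my actual values):

```shell
# Send traffic originating from the smart TV (192.168.1.200) through a
# dedicated routing table that points at the 4G uplink; everything else
# keeps using the main table's (ADSL) default route.
ip rule add from 192.168.1.200 lookup 100 pref 1000
ip route add default via 192.168.43.1 dev wlan0 table 100  # assumed 4G tether gateway
ip route flush cache
```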
In the meantime, I solved my first problem, the load balancing between ADSL (wan) and 4G (wwan). Now I can randomly connect and disconnect the two connections and it works well. I also added more 4G connections (wwan1 and wwan2) and the "hand-over" works well.
However, I'm still struggling with the second problem, i.e. I cannot understand why load balancing for one specific IP does not work.
These rules should work as described in the documentation (matched from top to bottom). To understand the behaviour better, I set a fixed IP (192.168.1.100) on my laptop and discovered that balanced mode works well only if it is applied to the whole network (only rule 4). Rules 1 and 3 do not produce any effect and can be removed.
In the Luci interface->status->load balancing->details, I saw this:
Active ipv4 user rules:
66 4636 - balanced all -- * * 192.168.1.100 0.0.0.0/0
0 0 - balanced all -- * * 0.0.0.0/0 192.168.1.100
718 45413 - wan_only all -- * * 0.0.0.0/0 0.0.0.0/0
0 0 - wan_only all -- * * 0.0.0.0/0 0.0.0.0/0
It is very strange: rules 1 and 3 show matched traffic, but in practice they do not work.
Could it be a bug? A wrong configuration?
I triple-checked all the configuration (metric, gateway and conntrack for the wan interface, etc.).
I cannot understand why load balancing for one specific IP does not work. @feckert @ptpt52 do you have any idea about this?
Thank you
Well, the Stream results look noticeably worse, especially the last two hops, compared to your off-peak test above, but that still seems somewhat acceptable to me. The small packet loss should not be a reason for concern, but you could use Wireshark during peak-hour streaming and try to see how many missing packets there are in the real traffic.
@erotavlas
This is a stupid question, but do you really need the *_in rules?
Is it not enough to use only *_out?
The connection tracking knows the way back to the host that started the connection.
@feckert thank you very much for your reply.
I do not know exactly how mwan3 works; what I know comes from the official documentation here.
I would like to use the balanced policy only for a single device and the wan_only policy for the rest of the network.
However, I found that the *_in rules are useless (no traffic). Can you confirm this? If so, what is the purpose of *_in, i.e. the destination address in the documentation: dest_ip: match traffic directed to the specified destination IP address
What about the flush_conntrack option under the mwan3 interface sections?
The documentation only talks about conntrack under the firewall settings: option conntrack '1'.
To understand the behaviour better, I changed the configuration of my network a lot.
I used VLANs to separate the LAN, LAN_IPTV and WLAN traffic (as shown in the diagram).
In this scenario, the traffic of the WLAN subnet is completely isolated from the others. However, there is still a problem between the two VLAN subnets. In particular, with the setup reported above everything works well, while if I set balanced only for LAN_IPTV (wan_only for LAN) it does not work.
This behaviour is not normal in my opinion. Why is it necessary to have the LAN balanced too? This resembles the previous case without subnets and VLANs, in which I needed to have the whole network balanced instead of a single IP.
P.S.
The *_in rules are useless in this scenario too.
Active ipv4 user rules:
1 56 - balanced all -- * * 192.168.2.0/24 0.0.0.0/0
0 0 - balanced all -- * * 0.0.0.0/0 192.168.2.0/24
1 118 - wan_only all -- * * 192.168.3.0/24 0.0.0.0/0
0 0 - wan_only all -- * * 0.0.0.0/0 192.168.3.0/24
497 33772 - balanced all -- * * 192.168.1.0/24 0.0.0.0/0
0 0 - balanced all -- * * 0.0.0.0/0 192.168.1.0/24
However, I found that the *_in rules are useless (no traffic). Can you confirm this? If so, what is the purpose of *_in, i.e. the destination address in the documentation:
I think the rules are useless. Only add a rule that matches the source address and then apply the right policy. The returned packets should find their way back to the right client.
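So a minimal rule set for your case would match only the source address (a sketch reusing the addresses and policy names from your posted config):

```
config rule 'iptv'
    option src_ip '192.168.1.200'
    option proto 'all'
    option use_policy 'balanced'

config rule 'default_rule'
    option dest_ip '0.0.0.0/0'
    option proto 'all'
    option use_policy 'wan_only'
```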
What about the flush_conntrack option under the mwan3 interface sections?
This will flush the kernel's conntrack table on up/down events from netifd.