Firewalla & Tailscale: Performance Bottleneck
Hi,
I'm genuinely interested in learning a bit more about firewalla, and this issue seems to be a good place to start.
Issue: There's a big (57%) overhead on the local lan between 2 machines when they go through firewalla using tailscale. This overhead doesn't exist for non-tailscale traffic, or when the 2 machines are connected directly via an unmanaged switch without firewalla in between. It's as though firewalla is picking on tailscale traffic between the 2 machines, but not other traffic.
Here's a bit more detail on the setup and testing:
Machine 1 (M1): 1Gbps network, AMD Ryzen 5 2400G, Unraid 6.12.6
Machine 2 (M2): 2.5Gbps network, AMD Ryzen 7 5600G, Pop!_OS 22.04
Firewalla: Gold SE, V1.977
Switch: Netgear 5x 1Gbps unmanaged.
Tailscale: 1.56.1
Local subnet: 10.*.*.*/24
Tailscale Subnet: 100.*.*.*/32
Here are the two configs:
Config 1 (slow):
- M1 connected to switch. Switch connected to 1Gbps port 2 on firewalla. 1Gbps link speed.
- M2 connected to 2.5Gbps port 1 on firewalla. 2.5Gbps link speed.
Config 2 (fast):
- M1 and M2 connected to switch. Switch connected to 1Gbps port 2 on firewalla. 1Gbps link speed everywhere.
And here are the speed test results between the machines using iperf3:
1. Config 1 + No Tailscale: 943 Mbps
2. Config 1 + Tailscale: 346 Mbps (63% overhead)
3. Config 2 + No Tailscale: 941 Mbps
4. Config 2 + Tailscale: 817 Mbps (13% overhead)
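For reference, the overhead percentages are each Tailscale result measured against the non-Tailscale result in the same config. A minimal sketch of that arithmetic in Python, using only the four iperf3 numbers listed above (the function name `overhead_pct` is made up for illustration):

```python
def overhead_pct(baseline_mbps: float, measured_mbps: float) -> float:
    """Throughput lost relative to the baseline, as a percentage."""
    return (baseline_mbps - measured_mbps) / baseline_mbps * 100.0


# (raw iperf3 Mbps, iperf3-over-Tailscale Mbps) from the results list
results = {
    "Config 1 (through firewalla)": (943, 346),
    "Config 2 (through switch)": (941, 817),
}

for name, (raw, tailscale) in results.items():
    print(f"{name}: {overhead_pct(raw, tailscale):.1f}% overhead")

# Tailscale through firewalla compared with Tailscale through the switch
print(f"firewalla vs switch, Tailscale only: {overhead_pct(817, 346):.1f}%")
```

The last line compares the two Tailscale runs with each other, which isolates what the firewalla path adds on top of Tailscale's own cost.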
I would expect that for a local lan connection physically going through firewalla that there would be little difference between any of the above scenarios, but clearly the firewalla is doing something with tailscale traffic on the local lan.
Why does this matter? 346 Mbps is still fast.
- It's unexpected and I'm curious and wanna learn why. :-)
- No, I don't need to use Tailscale VPN between these two local machines. But it was a testbed to isolate things. The actual issue is that both my friend and I have 1.7Gbps internet connections and want to connect our Unraids for mutual backup and large file transfers. In this case, it has to go through firewalla and it'll be limited to about 346 Mbps instead of something closer to 2-3 times faster.
I'm wondering:
- Is it to do with routing between 10.x and 100.x, even though tailscale is actually directly connected and not going through a relay? It shouldn't be... firewalla should just be seeing 10.x traffic, even if it's routed over tailscale.
- Is it to do with some kind of packet inspection, even for the local lan connections? Even then, the Gold SE is rated higher than this and why doesn't it do it for non-tailscale lan traffic?
- It can't really be because of the physical connection difference since it gets near 1Gbps without tailscale.
- ....?
Thanks in advance,
Trev.
-
I'm interested and very curious to know more about the "many, many, other things" to learn more. I've since done a few more tests and they point more clearly to a firewalla issue for me at the moment. Although firewalla may have zero knowledge of the traffic type, it IS doing something different with the tailscale traffic.
Here are the further tests that brought me to this conclusion:
I simplified and removed the switch from the equation. It is used as a baseline though that illustrates what I expect the firewalla to be able to do (i.e. 943 Mbps raw, 817 Mbps tailscale).
- Machine 1 is connected to firewalla port 2 at 1Gbps.
- Machine 2 is connected to firewalla port 1 at 2.5Gbps.
Nothing else connected to the firewalla at this time. Just these 2 lan ports.
Results:
Raw iperf3, no Tailscale: 940 Mbps.
iperf3 over tailscale: 387 Mbps.
(Compared to 940/817 when both machines are connected to the switch.)
Ports used:
iperf3 raw: Used port 5201 on iperf3 server side.
iperf3 via tailscale: Used port 5201 via port 50380 (tailscale's direct port)
Why do I think Firewalla's software is the bottleneck here between these two traffic types? Firewalla's CPU usage spikes when running the tailscale test.
Using htop:
idle: CPU average around 7.6%, with occasional spikes by FireMain and Redis
iperf3 raw test: CPU average around 30%. FireMain seems to be most of it. Load spread across cores.
iperf3 tailscale test: CPU average around 70%, one CPU pinned to 100%. Again FireMain is most of it.
It's pretty clear to me that the firewalla CPU is getting in the way of the local lan traffic for tailscale: one core pinned to 100% while running iperf3 over tailscale, but a small, well-balanced spike when running it raw.
I'd love to be able to provide more info... any pointers on deeper diagnostics than htop would be good to know.
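One way to go a step beyond htop is to sample `/proc/stat` before and after a test run and look at how much of each core's time went to softirq (kernel packet processing) versus everything else. The sketch below is only that: it assumes SSH access to the box and that `python3` is available there, neither of which is confirmed in this thread. The helper names are hypothetical; the column order is the standard Linux `/proc/stat` layout.

```python
import time

# First eight per-CPU columns of /proc/stat, in kernel order (units: jiffies).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_proc_stat(text):
    """Return {cpu_name: {field: jiffies}} for the per-core lines of /proc/stat."""
    cores = {}
    for line in text.splitlines():
        parts = line.split()
        # Per-core lines start with "cpu0", "cpu1", ...; skip the aggregate "cpu" line.
        if parts and parts[0].startswith("cpu") and parts[0] != "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            cores[parts[0]] = dict(zip(FIELDS, values))
    return cores


def busy_breakdown(before, after):
    """Percent of the interval each core spent busy overall and in softirq."""
    out = {}
    for cpu, a in after.items():
        b = before[cpu]
        delta = {field: a[field] - b[field] for field in a}
        total = sum(delta.values())
        if total == 0:
            continue  # no ticks elapsed for this core
        idle = delta["idle"] + delta["iowait"]
        out[cpu] = {
            "busy": 100.0 * (total - idle) / total,
            "softirq": 100.0 * delta["softirq"] / total,
        }
    return out


def sample(interval=5.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and return the breakdown."""
    with open(path) as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_proc_stat(f.read())
    return busy_breakdown(before, after)
```

Calling `sample()` on the firewalla while iperf3 is running, once raw and once over tailscale, would show whether the pinned core is spending its time in softirq or in a userspace process.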
-
I'm running tailscale on Machine 1 and Machine 2, NOT on Firewalla. Nothing extra is on firewalla vs out of the box. In this case, the firewalla should just be an Ethernet switch between its port 1 and port 2. Also, when I ran the tests, WAN was disconnected and nothing else was connected to the firewalla.
In Firewalla, I have activated the following features, but it should be noted that if I turn "monitoring" off for both devices, there isn't any change in speeds.
- Active Protect
- Ad Block
- DDNS
- DNS Service
- Device Port Scan
- Family
- Firewalla Web
- Network
- Open Ports
- Routes
- Wifi Test
All other features (Data Usage, New Device Quarantine, Smart Queue, VPN Client/Server) are all disabled.
IPerf3 server is on Machine 1, IPerf3 Client is on Machine 2. Speed doesn't meaningfully change if reversed (through tailscale it's the same, raw is a little slower when running the server on Machine 2).
Running iperf3 on firewalla (without tailscale) to each of these machines yields:
Firewalla (client) -> Machine 1 (server): 861 Mbps
Firewalla (client) -> Machine 2 (server): 2.20 Gbps
In terms of networking, both Machine 1 and Machine 2 have both IPv4 and IPv6 addresses from the firewalla DHCP.
-
If you are not doing anything on the firewalla, then unless tailscale is doing something out of the ordinary, firewalla in theory has no knowledge of tailscale at all. All the packets from it are just IP traffic like anything else on the network. (This holds regardless of IPv4 or IPv6; firewalla just sees the IP header and then forwards the traffic.)
-
I've done some more tests, and it's all pointing towards firewalla's software stack getting in the way somewhere - with wireguard too, not just tailscale.
The new test I did was with wireguard, not tailscale. This again is wireguard on both test machines, not on the firewalla. Here are the results.
Netgear 1Gbps switch (GS305P):
- Raw iperf3: 941 Mbps
- Tailscale: 817 Mbps
- Wireguard: 902 Mbps
Firewalla: 1Gbps / 2.5Gbps (or 1Gbps / 1Gbps ports)
- Raw: 941 Mbps
- Tailscale: 421 Mbps
- Wireguard: 481 Mbps
(Firewalla's numbers are slightly better today for some reason, but still about half the throughput of the netgear switch.)
So, both wireguard and tailscale are being hampered by Firewalla to the tune of about 50% throughput on the local lan when compared to the netgear switch.
The testing I have so far tells me it's something in firewalla's software stack doing something extra to the tailscale and wireguard packets:
- No slow-down using my 1Gbps Netgear switch. The slowdown only happens when going through firewalla's physical ports.
- Firewalla's CPU behaves differently for the two traffic types: one core maxes out for the tailscale traffic, but not for the raw traffic.
What diagnostics or tests can I do on firewalla to help diagnose further? Can I safely kill some processes for the test, or inspect CPU usage spikes closer? Wireshark dumps?
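As one possible way to inspect the CPU spikes more closely, the per-CPU `NET_RX` counters in `/proc/softirqs` show which cores are handling received packets. The sketch below assumes a standard Linux `/proc/softirqs` layout and `python3` on the box (not confirmed for the Gold SE); the function names are hypothetical.

```python
def parse_softirqs(text):
    """Parse /proc/softirqs into {softirq_name: {cpu_name: count}}."""
    lines = text.strip().splitlines()
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        name, _, rest = line.partition(":")
        counts = [int(x) for x in rest.split()]
        table[name.strip()] = dict(zip(cpus, counts))
    return table


def net_rx_delta(before, after, kind="NET_RX"):
    """Per-CPU increase in one softirq counter between two snapshots."""
    return {cpu: after[kind][cpu] - before[kind][cpu] for cpu in after[kind]}


def read_softirqs(path="/proc/softirqs"):
    """Take one snapshot of the counters."""
    with open(path) as f:
        return parse_softirqs(f.read())
```

Taking one snapshot with `read_softirqs()` before an iperf3 run and one after, then passing both to `net_rx_delta`, gives the receive work per core. If nearly all of the increase lands on a single CPU during the tailscale or wireguard run but is spread out during the raw run, that would line up with the htop observation above.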
-
Hello guys,
I'm also seeing problems with wireguard UDP traffic on my side. I'm using my MacBook to connect to Mullvad VPN (over wireguard). Without the VPN I'm getting 940 Mbit/s download. With wireguard I'm only getting 350 - 400 Mbit/s. Before I started using my firewalla gold se, I had no problem pushing 920 Mbit/s download over wireguard. When I activate TCP obfuscation in the Mullvad VPN config, I get the full 920 Mbit/s download speed again. So the problem seems to be the UDP traffic (wireguard) through the firewalla. I'm also seeing CPU spikes while using UDP.
My setup:
internet: 1000 down / 50 up
hardware: MacBook Pro M1, Firewalla gold se
software: mullvad vpn (WireGuard)
Greetings from Germany
-
Did you configure Tailscale using UDP or TCP? If you can, is it possible to try TCP? I do know there is some type of acceleration for the different packet types, so maybe the acceleration part is having an issue.
Also, if you do a speedtest (from the Mac) to the internet, do you get 1 Gbit speed?
-
Tailscale is configured in a default mode - it prefers UDP, but falls back to TCP. I don't see a way to force Tailscale to use TCP.
Wireguard always uses UDP.
Note, the slowdown I'm seeing is entirely on the local network between two machines when using wireguard or tailscale - nothing to do with the internet speed. But when I do a speedtest from a machine connected to the 2.5Gbps port, I get 1.7Gbps down and 1.1Gbps up (which matches what I get from the Firewalla speed test).
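Since tailscale can't be forced to TCP and wireguard is UDP-only, one way to separate "UDP through firewalla" from "wireguard through firewalla" might be a plain UDP iperf3 run between the same two machines with no tunnel at all. A rough sketch, assuming iperf3 is installed on both ends and an iperf3 server is already listening; the address in the usage note is a placeholder, and the JSON field names are as I understand iperf3's `-J` output, so check them against your version:

```python
import json
import subprocess


def iperf3_cmd(server, udp=False, seconds=10, port=5201):
    """Build an iperf3 client command line that emits a JSON report."""
    cmd = ["iperf3", "-c", server, "-p", str(port), "-t", str(seconds), "-J"]
    if udp:
        # -u selects UDP; -b 0 lifts iperf3's default 1 Mbit/s UDP rate limit.
        cmd += ["-u", "-b", "0"]
    return cmd


def summarize(report, udp=False):
    """Pull throughput (and UDP loss) out of a parsed iperf3 JSON report."""
    end = report["end"]
    summary = end["sum"] if udp else end["sum_received"]
    result = {"mbps": summary["bits_per_second"] / 1e6}
    if udp:
        result["lost_percent"] = summary.get("lost_percent")
    return result


def run(server, udp=False):
    """Run one client test against `server` and return the summary."""
    proc = subprocess.run(
        iperf3_cmd(server, udp), capture_output=True, text=True, check=True
    )
    return summarize(json.loads(proc.stdout), udp)
```

For example, `run("10.0.0.2")` followed by `run("10.0.0.2", udp=True)` from Machine 2 (with the placeholder replaced by Machine 1's LAN address). If raw UDP also tops out around 400 Mbps, or shows heavy loss, through the firewalla ports but not through the switch, that would point at UDP handling in general rather than anything specific to wireguard.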
-
Can we get an escalation and follow up on this?
Clearly something is going wrong with Firewalla + LAN + Wireguard. Even though the documentation states that "Firewalla does not impact LAN traffic", it clearly does in this instance.