Firewalla & Tailscale: Performance Bottleneck

Comments

37 comments

  • Firewalla

    Unless you run Tailscale on the Firewalla itself, the performance issue is most likely purely at the endpoints (Firewalla has zero knowledge of the traffic type). That limit can come from:

    • encryption used
    • the slowest circuit between the nodes
    • and many, many other things
  • binaryrefinery

    I'm very curious to learn more about the "many, many other things". I've since done a few more tests and they point more clearly to a firewalla issue for me at the moment. Although firewalla may have zero knowledge of the traffic type, it IS doing something different with the tailscale traffic.

    Here's my further tests that brought me to this conclusion:

    I simplified the setup and removed the switch from the equation. The switch still serves as a baseline for what I expect the firewalla to be able to do (i.e. 943 Mbps raw, 817 Mbps tailscale).

    - Machine 1 is connected to firewalla port 2 at 1 Gbps.
    - Machine 2 is connected to firewalla port 1 at 2.5 Gbps.

    Nothing else connected to the firewalla at this time. Just these 2 lan ports.



    Results:

    Raw iperf3, no Tailscale: 940Mbps. Used port 5201 on iperf server.
    iperf3 over tailscale: 387Mbps.

    (Compared to 940/817 when both machines are connected to the hub).

    Ports used:

    iperf3 raw: Used port 5201 on the iperf3 server side.
    iperf3 via tailscale: Used port 5201 via port 50380 (tailscale's direct port)
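
    To make the raw and tailscale runs directly comparable, the two tests can be scripted. This is only a minimal sketch: it assumes iperf3 is installed on the client, that the server listens on the default port 5201, and that the test is a TCP test (iperf3's default). The function names and the example addresses in the comments are hypothetical.

```python
import json
import subprocess


def received_mbps(report: dict) -> float:
    """Return receiver-side throughput in Mbps from an iperf3 JSON report (TCP test)."""
    return report["end"]["sum_received"]["bits_per_second"] / 1e6


def run_iperf3(server: str, port: int = 5201, seconds: int = 10) -> float:
    """Run an iperf3 client test against `server` and return throughput in Mbps."""
    result = subprocess.run(
        ["iperf3", "-c", server, "-p", str(port), "-t", str(seconds), "-J"],
        capture_output=True,
        text=True,
        check=True,
    )
    return received_mbps(json.loads(result.stdout))


# Example use (hypothetical addresses: the server's LAN IP and its Tailscale IP):
#   run_iperf3("192.168.1.10")
#   run_iperf3("100.64.0.10")
```

    Running both calls back to back against the same server keeps everything except the path (LAN address vs. tailscale address) constant.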


    Why do I think Firewalla's software is the bottleneck here between these two traffic types? Firewalla's CPU usage spikes when running the tailscale test.

    Using htop:

    idle: CPU average around 7.6%, with occasional spikes by FireMain and Redis
    iperf3 raw test: CPU average around 30%. FireMain seems to be most of it. Load spread across cores.
    iperf3 tailscale test: CPU average around 70%, one CPU pinned to 100%. Again FireMain is most of it.


    It's pretty clear to me that the firewalla CPU is getting in the way of the local lan traffic for tailscale. One core pinned to 100% while running iperf3 over tailscale, but a small well balanced spike when running it raw.

    I'd love to be able to provide more info... any help on deeper diagnostics than htop would be good to know.
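
    One possible way to look deeper than htop, sketched under assumptions: on Linux, /proc/stat exposes per-core time counters, so sampling it twice shows how much of a pinned core's time went to softirq (kernel packet processing) rather than to a userspace process. This assumes shell access to the box with Python available, which nothing in this thread confirms, and the function names are illustrative only.

```python
import time

# Order of the first eight per-core counters in /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_proc_stat(text: str) -> dict:
    """Map 'cpu0', 'cpu1', ... to their counters from /proc/stat text."""
    cores = {}
    for line in text.splitlines():
        parts = line.split()
        # Per-core lines look like 'cpu0 ...'; skip the aggregate 'cpu' line.
        if parts and parts[0].startswith("cpu") and parts[0] != "cpu":
            cores[parts[0]] = dict(zip(FIELDS, map(int, parts[1:9])))
    return cores


def busy_breakdown(before: dict, after: dict) -> dict:
    """Per core: percent of elapsed time spent busy, and percent spent in softirq."""
    out = {}
    for core, new in after.items():
        delta = {k: new[k] - before[core][k] for k in FIELDS}
        total = sum(delta.values()) or 1  # avoid division by zero
        busy = total - delta["idle"] - delta["iowait"]
        out[core] = {
            "busy_pct": 100.0 * busy / total,
            "softirq_pct": 100.0 * delta["softirq"] / total,
        }
    return out


def sample(interval: float = 5.0) -> dict:
    """Read /proc/stat twice, `interval` seconds apart (Linux only)."""
    with open("/proc/stat") as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval)
    with open("/proc/stat") as f:
        after = parse_proc_stat(f.read())
    return busy_breakdown(before, after)
```

    Calling sample() once during the raw iperf3 run and once during the tailscale run would show whether the pinned core is mostly in softirq or mostly in user/system time.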

  • Firewalla

    How are you running Tailscale? Is it all on your client PCs, with nothing on the Firewalla?

    Next, are you running the iperf3 server on the Firewalla or on your PC?

    (Meaning, in your test, the Firewalla is running only its stock software, with nothing extra configured and no extra processes?)

  • binaryrefinery

    I'm running tailscale on Machine 1 and Machine 2, NOT on Firewalla. Nothing extra is on firewalla vs out of the box. In this case, the firewalla should just be an Ethernet switch between its port 1 and port 2. Also, when I ran the tests, WAN was disconnected and nothing else was connected to the firewalla.

    In Firewalla, I have activated the following features, but it should be noted that if I turn "monitoring" off for both devices, there isn't any change in speeds.

    • Active Protect
    • Ad Block
    • DDNS
    • DNS Service
    • Device Port Scan
    • Family
    • Firewalla Web
    • Network
    • Open Ports
    • Routes
    • Wifi Test

    All other features (Data Usage, New Device Quarantine, Smart Queue, VPN Client/Server) are disabled.

    IPerf3 server is on Machine 1, IPerf3 Client is on Machine 2. Speed doesn't meaningfully change if reversed (through tailscale it's the same, raw is a little slower when running the server on Machine 2).

    Running iperf3 on firewalla (without tailscale) to each of these machines yields:

    Firewalla (client) -> Machine 1 (server): 861 Mbps
    Firewalla (client) -> Machine 2 (server): 2.20 Gbps

    In terms of networking, both Machine 1 and Machine 2 have IPv4 and IPv6 addresses from the firewalla DHCP.

  • Firewalla

    If you are not running anything on the Firewalla, then unless Tailscale is doing something out of the ordinary, the Firewalla in theory has no knowledge of Tailscale at all. All of its packets are just IP traffic like anything else on the network. (This holds regardless of IPv4 or IPv6; the Firewalla just sees the IP header and then forwards the traffic.)

  • binaryrefinery

    I've done some more tests, and it's all pointing towards firewalla's software stack getting in the way somewhere - with wireguard too, not just tailscale.

    The new test I did was with wireguard, not tailscale. This again is wireguard on both test machines, not on the firewalla. Here are the results.

    Netgear 1Gbps hub (GS305P):
    - Raw iperf3: 941 Mbps
    - Tailscale: 817 Mbps
    - Wireguard: 902 Mbps

    Firewalla: 1 Gbps / 2.5 Gbps ports (or 1 Gbps / 1 Gbps ports)
    - Raw: 941 Mbps
    - Tailscale: 421 Mbps
    - Wireguard: 481 Mbps

    (Firewalla's numbers are slightly better today for some reason, but still about half the throughput of the netgear switch).

    So, both wireguard and tailscale are being hampered by Firewalla to the tune of about 50% throughput on the local lan when compared to the netgear switch.
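
    The "about 50%" figure can be checked with a few lines of arithmetic on the numbers posted above. This is just a sketch; the dictionary layout and function name are made up for illustration.

```python
# Throughput in Mbps, copied from the results posted in this comment.
results_mbps = {
    "raw": {"netgear": 941, "firewalla": 941},
    "tailscale": {"netgear": 817, "firewalla": 421},
    "wireguard": {"netgear": 902, "firewalla": 481},
}


def retained_pct(netgear: float, firewalla: float) -> float:
    """Percent of the Netgear-switch throughput still achieved through Firewalla."""
    return 100.0 * firewalla / netgear


for name, r in results_mbps.items():
    print(f"{name}: {retained_pct(r['netgear'], r['firewalla']):.1f}% of switch throughput")
```

    Raw traffic keeps 100% of the switch throughput, while tailscale and wireguard keep roughly 51.5% and 53.3% respectively.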

    The testing I have so far tells me it's something in firewalla's software stack doing something extra to the tailscale and wireguard packets:

    • No slow-down using my 1Gbps Netgear switch. The slowdown only happens when going through firewalla's physical ports.
    • Firewalla's CPU behaves differently for the two traffic types: it maxes out one core for the tailscale traffic, but not for the raw traffic.

    What diagnostics or tests can I do on firewalla to help diagnose further? Can I safely kill some processes for the test, or inspect CPU usage spikes closer? Wireshark dumps?

  • Firewalla

    Let me get your post to a developer and have them take a look.

    Did you have Smart Queue enabled while doing the testing? If so, try disabling it.

  • binaryrefinery

    Thanks for continuing to look into it.

    Smart queue is disabled. I've also tried testing with firewalla monitoring turned off for both machines and it had no effect on performance.

  • Firewalla

    Did you happen to notice the packet size in both tests you are running? I think both services are UDP-based.

  • binaryrefinery

    I didn't see the actual packet sizes used by iperf3 but the default MTU / qlen for the various interfaces are as follows:

    Ethernet: 1500 MTU / qlen 1000
    Wireguard: 1420 MTU / qlen 1000
    Tailscale: 1280 MTU / qlen 500
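
    Why packet size could matter: for the same throughput, a smaller MTU means more packets per second, and any per-packet cost on a forwarding device scales with that rate. The sketch below is a rough estimate only. It assumes every packet is full-size and divides the iperf3-reported throughput by the interface MTU, ignoring header and tunnel overhead; the function name is illustrative.

```python
def packets_per_second(throughput_mbps: float, mtu_bytes: int) -> float:
    """Rough packet rate, assuming every packet carries a full MTU of data."""
    return throughput_mbps * 1e6 / (8 * mtu_bytes)


# Throughput figures (Mbps) are the Firewalla results posted earlier in this thread,
# paired with the interface MTUs listed above.
for label, mbps, mtu in [
    ("raw / ethernet", 941, 1500),
    ("wireguard", 481, 1420),
    ("tailscale", 421, 1280),
]:
    print(f"{label}: ~{packets_per_second(mbps, mtu):,.0f} packets/s")
```

    Comparing these estimates for the switch and Firewalla runs shows whether the slowdown tracks the packet rate rather than the byte rate.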
  • Hydralein

    Hello guys,

    I’m also seeing problems with WireGuard UDP traffic on my side. I'm using my MacBook to connect to Mullvad VPN (over WireGuard). Without the VPN I'm getting 940 Mbit/s download. With WireGuard I’m only getting 350 - 400 Mbit/s. Before using my Firewalla Gold SE I had no problems pushing 920 Mbit/s download over WireGuard. When I activate TCP obfuscation in the Mullvad VPN config, I get the full 920 Mbit/s download speed again. So the problem seems to be the UDP traffic (WireGuard) through the Firewalla. I’m also seeing CPU spikes while using UDP.

    My setup:

    • internet: 1000 down / 50 up
    • hardware: MacBook Pro M1, Firewalla Gold SE
    • software: Mullvad VPN (WireGuard)

    Greetings from Germany

  • Firewalla

    Thank you @Hydralein; let me forward this to our team. In your test, you are not using the Firewalla WireGuard client, right? (It is your own client on your PC/Mac?)

  • Hydralein

    Yes, the WireGuard client is running on my MacBook, so the encryption / decryption is done on the MacBook CPU. The Firewalla should "just forward" the IP traffic. It seems that something (e.g. IDS) is slowing down the traffic.

  • binaryrefinery

    Heya @firewalla - just wondering if there's any progress on this or if we can help diagnose further?

  • Firewalla

    Did you configure Tailscale to use UDP or TCP? If you can, is it possible to try TCP? I do know there is some type of acceleration for the different packet types, so maybe the acceleration part is having an issue.

    Also, if you do a speed test (from the Mac) to the internet, do you get 1 Gbit speed?

  • binaryrefinery

    Tailscale is configured in a default mode - it prefers UDP, but falls back to TCP. I don't see a way to force Tailscale to use TCP.

    Wireguard always uses UDP.

    Note, the slowdown I'm seeing is entirely on the local network between two machines when using wireguard or tailscale - nothing to do with the internet speed. But when I do a speed test from a machine connected to the 2.5 Gbps port, I get 1.7 Gbps down and 1.1 Gbps up (which matches what I get from the Firewalla speed test).

     

  • Hydralein

    Hello,
    is there any news on this from the developers' side?

  • Firewalla

    Let me give them a kick and let you know.

  • binaryrefinery

    Hi,

    Any updates or further thoughts on being able to diagnose this?

    Thanks!

  • Firewalla

    One of our developers took a look and said something about UDP packet acceleration. It may need some tuning for Tailscale to achieve higher speed. I am not sure if he was talking about hardware or software. I will follow up.

  • binaryrefinery

    Can we get an escalation and follow up on this?

    Clearly something is going wrong with the workings on Firewalla + LAN + Wireguard, even though documentation states that "Firewalla does not impact LAN traffic" - it clearly is in this instance.

  • Firewalla

    Let me push our developers a bit; they are stuck with 1.61 issues.

    As for "Firewalla does not impact LAN traffic", are you running WireGuard within the same LAN? I thought you were running WireGuard from a PC to outside of the network.

  • binaryrefinery

    Both machines are on the same LAN. The traffic is not going externally to the WAN.

  • Eric Goldman

    Any updates on this issue?

  • Firewalla

    They have been playing with Tailscale off and on; I'm not sure if they found anything. I'll give them a push.

  • Hydralein

    Any updates here? Still having this problem ...

  • Hydralein

    Hello,

    I checked the new update but cannot find anything related to this in the bugfix section. Are you still investigating this?

  • Firewalla

    @Hydralein, we recently pushed a PPPoE speedup image to the Gold units; I'm not sure if it will make yours a bit faster. Are you on Gold SE or Gold?

  • Hydralein

    Gold SE

  • Hydralein

    When will you push the PPPoE speedup image to the Gold SE?
