Larger/Beefier Firewalls?
We're in the initial stages of a POC for running Firewalla in business environments. As an MSP, we have somewhere around 60-70 potential sites for Firewalla, if we were to decide it is a viable option. It's easy to manage, easy to deploy, and offers pretty much everything an "enterprise-grade" firewall has to offer.
I'm just curious, I know Firewalla's mantra is small, easy, and affordable. But have there been any discussions, roadmaps, or considerations to have higher-end Firewalla boxes made? Full rackmount etc.?
Most of our SMB clients would be perfectly fine on Firewalla. Others would not. This is ok because we already have an enterprise-grade firewall vendor we use, I'm more curious than anything.
-
For a little more context, in our initial POC, we're doing somewhere between 10-15 Firewalla boxes on various sites. Various sizes, needs and use cases. We're at 9 of ~15 right now with the 10th going live next week.
With the options for DNS conditional forwarding, there is nothing we can't do with Firewalla that we can do with even our enterprise firewalls. It's just a matter of scalability imho.
-
The current Firewalla Golds can be mounted via this https://firewalla.com/products/firewalla-gold-rack-mount
We are always on the lookout for bigger / faster units. The only problem is pricing; I know many want the 10gbit unit (LAN or WAN or both), but we can't build that until we see the prices of the critical parts come down a bit.
Are you looking for 10g?
-
10g is going to be a necessity for our larger sites. We can't even consider Firewalla for them without options for fiber and/or 10g. I'm still on the fence about putting firewalla in consideration for those types of sites, largely because of concerns over scalability.
For us, price isn't as much of a barrier as security, features, and scalability are. For something that is as well built and tuned as firewalla is, with the right features and hardware, we'd bend over backwards to replace some of the "bigger" names we use.
-
I think one of the sites we are currently running on a Firewalla Gold unit has somewhere around 150-160 active devices. The unit is handling that site just fine and it is a breath of fresh air. We even have a Purple unit on a smaller remote site which handles video (surveillance) and a large Guest WiFi network like it's nothing. The RAM runs high on the Purple box, but we expected that.
It would be nice to have some charts and/or recommendations from Firewalla on how many "users" or devices each type of box can realistically handle. Not really necessary at this point, but if you do add some larger more scalable units, would be handy to have when sizing boxes for sites.
-
It is fairly hard to put a number on devices/users. The reason is, we've seen the Gold / Gold Plus driving 400 to 500 users/kids at schools at full load (and customers complaining the Gold Plus was too hot running full 2.5 gigabit P2P file sharing).
The load on the box depends more on the type of traffic and the type of devices connecting to it. For example, handling 1000 thermostats likely takes less CPU than one PC running P2P at gigabit rate.
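To put rough numbers on that comparison, here is a minimal sketch that compares packet rates. The thermostat reporting interval (one small packet every 10 seconds) and the 1500-byte packet size for the P2P transfer are assumptions picked for illustration, not measurements from Firewalla, and packet rate is only a rough proxy for CPU load.

```python
def fleet_pps(devices: int, seconds_between_packets: float) -> float:
    """Aggregate packets per second from many mostly idle devices."""
    return devices / seconds_between_packets


def bulk_pps(bits_per_second: float, packet_bytes: int) -> float:
    """Packets per second for one sustained transfer at a given bit rate."""
    return bits_per_second / (packet_bytes * 8)


if __name__ == "__main__":
    # Assumption: each thermostat sends one packet every 10 seconds.
    thermostats = fleet_pps(devices=1000, seconds_between_packets=10)
    # Assumption: the P2P transfer fills 1 Gbit/s with 1500-byte packets.
    p2p = bulk_pps(bits_per_second=1e9, packet_bytes=1500)
    print(f"1000 thermostats: {thermostats:,.0f} packets/s")
    print(f"one gigabit P2P PC: {p2p:,.0f} packets/s")
    print(f"ratio: about {p2p / thermostats:,.0f}x")
```

Under these assumptions the single PC generates several hundred times more packets to inspect than the whole thermostat fleet, which is why a plain device count says little about sizing.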
But let me see if I can poll our ops team and see if they have real-world data on each of the platforms.
-
"That's fewer than I have in my home. My FWG handled it fine and now I have a FWG+ since I have dual gig ISP's..."
No doubt the Gold handles smaller sites really well. We expected to see them start to struggle with ~50 heavy office users and their devices in Teams meetings on a Monday morning... needless to say, they didn't struggle.
-
I'll talk to our product team and ask them to sacrifice a few chickens and beg the parts price to go down far enough to have us build that magical 10gbit unit. I do know they are looking at RJ45 10g, but the pricing of the MAC + CPU + memory is never aligned to give us a unit we can sell below 1000 dollars.
-
"I'll talk to our product team and ask them to sacrifice a few chickens and beg the parts price to go down far enough to have us build that magical 10gbit unit. I do know they are looking at RJ45 10g, but the pricing of the MAC + CPU + memory is never aligned to give us a unit we can sell below 1000 dollars."
Depending on what you pack into these 10g units and what they can realistically handle, I would be OK paying >$1000 for them :D.
-
A consumer router is a simple device where everything is done in the ASIC or SOC;
A Firewalla that inspects packets at 10gbit line rate will require a much faster CPU + memory, since everything has to be 'inspected'. Since all interfaces on the Firewalla are routable, instead of a switch with two 10gbit ethernet ports, it has to use two 10gbit ports with independent MACs, which is also costly.
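As a rough illustration of why line-rate inspection is demanding, the sketch below computes how many packets per second a 10 Gbit/s link can deliver and how many CPU cycles that leaves per packet. The 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter and inter-frame gap) is standard Ethernet; the 2.4 GHz clock is borrowed from the C3758R mentioned later in the thread, and perfect scaling across cores is an optimistic assumption. It says nothing about how many cycles Firewalla's inspection actually needs.

```python
# Standard Ethernet per-frame overhead on the wire:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def packets_per_second(link_bps: float, frame_bytes: int) -> float:
    """Maximum frames per second a link can carry at a given frame size."""
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


def cycles_per_packet(link_bps: float, frame_bytes: int,
                      core_hz: float, cores: int = 1) -> float:
    """CPU cycles available per packet, assuming perfect scaling across cores."""
    return core_hz * cores / packets_per_second(link_bps, frame_bytes)


if __name__ == "__main__":
    for frame in (64, 1518):  # smallest and largest standard frame sizes
        pps = packets_per_second(10e9, frame)
        budget = cycles_per_packet(10e9, frame, core_hz=2.4e9)
        print(f"{frame:>5}-byte frames: {pps / 1e6:6.2f} Mpps, "
              f"{budget:7.0f} cycles per packet on one 2.4 GHz core")
```

With minimum-size frames that works out to roughly 14.9 million packets per second, or about 160 cycles per packet on a single 2.4 GHz core, which leaves very little room for anything beyond simple forwarding.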
The eero pro gateway is a simple switch ... can't really use it to compare
-
"I'll talk to our product team and ask them to sacrifice a few chickens and beg the parts price to go down far enough to have us build that magical 10gbit unit. I do know they are looking at RJ45 10g, but the pricing of the MAC + CPU + memory is never aligned to give us a unit we can sell below 1000 dollars."
Honestly, $1000 is ok for me to pay for a 10gb firewall that can do that rate with inspection. But that is right about my limit. I think quite a few people would be fine paying that right now though.
I see that the Gold Plus has a quad-core Intel CPU and is able to do 5gbps inspection throughput. So I'm curious what you really need for the 10g. Is it a lot more cores and memory at the same core speed? Or does traffic not scale well across cores, so that you really still just need a quad core but at ~2x the speed per core?
I'm not sure what grade of "Intel quad core" you use, but I do know Intel has an Atom C3830 CPU with 12 cores @ 1.9GHz, 21w TDP and 12 PCIe 3.0 lanes. It looks like it has 2x 10gb NIC ports built into the CPU, and it costs $290. If it isn't more cores you need but more core speed, they have an 8 core version with the same TDP that ups the per-core speed to 2.2GHz instead, with a price of $170.
edit: or this model with 8 cores @2.4ghz and a slightly higher TDP but it moves up to 4x 10gb integrated NICs:
https://www.intel.com/content/www/us/en/products/sku/204840/intel-atom-processor-c3758r-16m-cache-2-40-ghz/specifications.html
For this last one, you can also buy a pre-built mini PC from Qotom in either the C3758 (2.2ghz) or the C3758R (2.4ghz) version to do some testing with for quite cheap. Jetway also has a board with this CPU.
ServeTheHome review of the unit:
https://www.youtube.com/watch?v=AKUTzjA1grE
Looks like that last CPU is capable of roughly 18.6gbps iperf3 traffic. Not sure how that will translate to inspected traffic throughput in the firewalla software stack though.
There is also the generational update to that CPU, the C5320, and Supermicro sells the motherboard with CPU, cooling, and all the ports on board for $785. It has 2x 10gb networking built into the chip. It is just missing RAM, OS storage, and a chassis for a complete system. So likely this could be done for $1000 or less. The model is A3SPI-8C-LN6PF.
There is also the AMD Ryzen V3C18I CPU that comes with 8 core/16 thread Zen3 cores at 1.9ghz base / 3.8ghz boost and a 15w TDP. Not sure on price, but these are beefier cores than Atoms usually are. These have 2x 10gb ethernet NICs built in.
https://www.amd.com/system/files/documents/ryzen-embedded-v3000-series-product-brief.pdf
Or there is the Ryzen Z1, which is meant for gaming handhelds. It has no integrated NICs and fewer cores, but still comes with a 15w TDP and increases the base speed to 3.2GHz. So if fewer cores at higher speed are needed, this could be a way to go, using the 20 PCIe 4 lanes to get over to some NIC chips.
There is also the Xeon D-1713NT CPU, which has a significantly higher TDP but the price is around the same and jumps you up to having 2x 25gb ports integrated into it.
edit again: I did some searching for what gives most throughput on pfsense since that has a far larger userbase and testing than other linux gateways. I found some interesting info on it:
The issue is that pfSense is built on top of the BSD Packet Filter (BPF, of the "pf" of "pfSense"). BPF does quite a lot of bit-copying as it processes packets. This tends to eat CPU cycles and memory bandwidth. Performance also tends to degrade non-linearly at high speeds as more and more of these bit copies result in a cache miss. This is why Netgate has started working on TNSR for their next generation router/firewall. TNSR is built on a much more efficient network stack called VPP that makes packet pipelines much more efficient.
That said - you can build a router on pfSense that will work decently using 10Gbe links, especially if you don't expect to see continuous flows at/near line rate.
pfSense needs three things to get there (in order of importance): larger cache, clocks and then memory bandwidth. Cache size makes the bit copies faster, clocks let it handle the interrupt rate, memory bandwidth limits the pain of cache misses. It does not benefit a lot from having more cores - though there is some benefit if you are processing multiple IP flows where the IO handling can be distributed. You also don't need a lot of memory.
While I assume Firewalla running on Ubuntu does not use the BSD Packet Filter, as that would be for BSD, I am assuming the packet filter it does use relies on a similar operating method. If true, it seems no more than 4 cores are likely needed, as that is enough to process plenty of connection streams; what is wanted instead is higher core speed and more cache. If the C3758R CPU has enough GHz and cache, that would seem the best option, as it has more cache than any other CPU and has a nice TDP and price. But the Ryzen V3C18I and Xeon D-1713NT also have really good boost clocks and cache. So for those it really just depends on how much boost you actually get when set to a reasonable TDP limit. If the C3758R doesn't quite have enough grunt, then I'd bet the V3C18I likely would do it. The Ryzen ones also say they have Ubuntu drivers available, so that seems like a win.
one last edit:
If you want to move back to ARM for a 10gb unit, NXP Semiconductor has a CPU series around the same price as all these others I listed here using a newer(ish) ARM A72 core design. These are meant for network devices and include things like a 105Gbit/s L2 switch, 50Gbit/s security accelerator, 88Gbit/s data compression/decompression engine, and 2MB packet caching buffer. These do say they have various options of integrated MACs as well, and one variant is 10x 1/2.5/10gb ports. The CPU models are LX2162A (16 core), LX2122A (12 core), and LX2082A (8 core)
-
Thank you @M.
We are only looking at x86 embedded CPUs at the moment. There are many trade-offs between these CPUs. For example, an Atom may have embedded SFP+ 10g ports, but the CPU cores may not be powerful enough to chew packets, and the SFP+ ports can't really handle any speed besides 10/1 gigabit.
So to be compatible with the majority of modems and your LAN devices, this 10G Gold needs to have RJ45 with 10/5/2.5/1 speeds ...
-
"We are only looking at x86 embedded CPUs at the moment. There are many trade-offs between these CPUs. For example, an Atom may have embedded SFP+ 10g ports, but the CPU cores may not be powerful enough to chew packets, and the SFP+ ports can't really handle any speed besides 10/1 gigabit.
So to be compatible with the majority of modems and your LAN devices, this 10G Gold needs to have RJ45 with 10/5/2.5/1 speeds ..."
Pardon my ignorance on the subject if this is incorrect, but wouldn't the physical port used partly depend on the PHY chip used? The CPUs I listed above do say they come with integrated Ethernet, and most specify that they have integrated MACs only. I was assuming they all, regardless of being called out as MAC only or not, really only had the MAC portion and would still require some PHY chip. In my search, I saw a review of the Qotom Q20331G9-S10 (with the C3758R CPU, $350 for the whole PC with 8GB RAM and 256GB SSD) that stated that while the CPU had 4x 10gb built in, the PC was actually using 4 Marvell PHY chips to get to the SFP+ ports. The CPU also has built-in 1g/2.5g/10g MACs. So I'd think you would be able to use a Marvell PHY here to go to copper ports. Something like the Marvell 88X3340P, which is a single-chip, 4-port PHY that supports 10M/100M/1G/2.5G/5G/10G link rates. Or take the 88X3310 single-port version and use more than one. There is also the Broadcom BCM84891L PHY chip as an alternative, which costs about $4 more but has 0.9w lower typical power consumption compared to the Marvell chip.
https://www.marvell.com/products/ethernet-phys.html
"The flexibility of this device family enables extremely low power across all structured wiring cable lengths, enabling dense 10Gbps applications. The Marvell Alaska X 88X3310/40P supports Category 6- (screened or unscreened), Category 6a- (Augmented) and Category 7-type cables at full IEEE 802.3an range as well as Category 5e type cables for data rates up to 5Gbps and distances up to 100m. The Marvell Alaska X 88X3310/40P enables both copper and fiber applications with its unique auto-media-detect mode. With this media plug-and-play feature the transceiver can automatically detect whether there is a SFP+ fiber link, or if there is an active copper link partner connected to the RJ-45 (10 Gbps/5Gbps/2.5Gbps/1000Mbps/100 Mbps/10 Mbps copper applications.) Depending on the preferred media type, the transceiver will automatically switch to the fiber or copper line-side interface without any involvement from the user"
The Ryzen V3000 embedded CPUs I listed above had some performance benchmarks showing roughly 2.5x performance over the Atom C3758R as well, which should be a really good amount of packet processing power. AMD has included Ubuntu drivers, which is what the Firewalla software stack runs on, and these CPUs also come with 2x 1g/2.5g/10gb MACs. Plus they have 20 PCIe gen 4 lanes, which could be used to get over to additional network ports if more than 2 were desired.
I did some searching for PCs with these embedded CPUs (to get a price baseline for the V3000 products) and unfortunately only found 1 company so far that uses the processors. But the industrial PC that the company sells with the Ryzen V3C18I CPU is a complete PC sold at the retail level for $950, according to the price quote I got. That PC also included 4 additional Intel i226 2.5gb RJ45 ports at that price, in addition to the 2x 10gb ports that come with the CPU. So it's possible to use this embedded CPU for a product that sells at or under the $1000 mark.
edit:
I believe that the NXP ARM chip (LX2162A) was the only one of them I listed that specifically calls out it included built-in PHY for some of its included Ethernet ports.
That C3758R PC can be had easily on Amazon for $350, and if you end up wanting to get a pre-made PC with the Ryzen V3C18I for some initial testing in your lab of the performance I can give you links to two distributors I found if you want.
-
The C3758R can't process (IDS/IPS/packet inspect) at 10Gbit ... its core speed is low (even though it has 8 of them).
As for the Physical layer, they are always outside of the CPU processor. C3758, for example, will have a MAC for SFP+; The RJ45-based MACs are usually a bit more complex since they need to handle 10/5/2.5/1/100mbit speeds.
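To make the compatibility point concrete, here is a simplified sketch that models link setup as "pick the highest speed both ends support". The port speed sets follow the figures given in this thread (SFP+ at 10/1 gigabit, multi-gig RJ45 at 10/5/2.5/1 gigabit plus 100 Mbit); the 2.5G modem is a hypothetical example, and real link negotiation (especially on SFP+ modules) is more involved than a set intersection.

```python
# Speeds in Mbit/s. Port capabilities follow the thread's description.
SFP_PLUS_PORT = {10_000, 1_000}
MULTIGIG_RJ45_PORT = {10_000, 5_000, 2_500, 1_000, 100}


def negotiated_speed(port_speeds, peer_speeds):
    """Return the highest speed both ends support, or None if they share none."""
    common = set(port_speeds) & set(peer_speeds)
    return max(common) if common else None


if __name__ == "__main__":
    modem = {2_500, 1_000, 100}  # hypothetical 2.5G cable modem
    print("SFP+ port:", negotiated_speed(SFP_PLUS_PORT, modem), "Mbit/s")
    print("multi-gig RJ45 port:", negotiated_speed(MULTIGIG_RJ45_PORT, modem), "Mbit/s")
```

In this model the 2.5G modem drops to 1 gigabit on the SFP+ port but keeps its full 2.5 gigabit on the multi-gig RJ45 port, which is the compatibility gap being described.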
-
Hey there Firewalla. Please check out the Minisforum MS-01: 2x 2.5GbE, 2x 10GbE, 32GB of RAM and a Core i9-13900H for about $829 ... More than enough CPU to handle ANYTHING you can throw at it :). Also, for about $699 you can get it with a Core i9-12900H; this CPU also has a TON of horsepower to handle all the advanced routing :). The onboard 10GbE ports are X710 ports with full hardware offload. I hope that this helps :).
-
Awesome to hear! Can't wait to purchase. Also, if there is anything that I can do to help, let me know. As a side note, in the earlier post where the OP said the built-in NICs on the C3758 were dual 10gig, I believe the OP was referring to the integrated Intel MACs and the PHYs built into the board. Detailed here https://ark.intel.com/content/www/us/en/ark/products/204840/intel-atom-processor-c3758r-16m-cache-2-40-ghz.html .
Stoked for Firewalla :). I'll do any dev testing you need :). Also here's an idea ... a limited HCL where you could download the firewalla software onto a specific hardware platform.
The ATOM C3955 should be fine and has 4 MAC's for 1/2.5/5/10 built in ... 16 cores over 2 GHZ should be able to handle all the IPS/IDS stuff :). Also should reduce cost a bunch.
-
Please be patient, it takes a lot of effort to build custom hardware, including heat, speed, cost, planning ... And once we do, we usually will do a pre-sale with early access units fairly fast. Mass production usually takes longer.
Yes C3955 is nice with the quad 10gbit RJ45 ports.