Rule Debugging and use of tcpdump

A user recently had a problem with a home security system. This case study uses that problem to explain some advanced network debugging strategies with Firewalla. It happened to involve HomeKit, but the principles and strategies apply generally.

The Network Configuration

The security Gateway was located on a VLAN and the HomeKit hubs (Apple TVs) were on a different LAN. mDNS was configured correctly across the LAN and VLAN, and the only traffic allowed was from the hubs to the security system Gateway. The Gateway was not allowed to initiate connections to the hubs. The user thought that would be sufficient since the hubs could issue commands to the Gateway.

Red devices are on the LAN and Purple are on an IoT VLAN.

The Problem

The user couldn't figure out why the Gateway would still frequently go offline from HomeKit's perspective and yet work perfectly from the security app.

Part of the problem was that the device's network requirements were unknown, and the manufacturer of the security system was not at all helpful in answering questions.

After checking and replacing cables, confirming that the switch was good, etc., we could narrow it down to one of two causes:

The Gateway had some strange bug.
Or something was wrong with the rules as defined.

In this situation, the easy thing to do is to loosen the rules and see if things begin to work properly.

We made some adjustments to the rules. The first attempt was to give the hubs (Apple TVs) uni-directional access to the Gateway, but that didn't solve the problem. So we granted bi-directional access, and even that didn't help. How bizarre! Maybe the Gateway needed to talk to some other device on the network. But what device?

We expanded access to allow traffic to and from the entire LAN to the Gateway. Things started to work, but it made the VLAN kind of useless because access was so wide open. We made another change. Instead of allowing bi-directional traffic between the Gateway and the LAN in one rule, we made one rule allowing traffic to the Gateway and one rule allowing traffic from it. This let us see the counts of flows on each path.

This was very helpful. It turned out no traffic was going from the Gateway to the LAN. All the traffic was going from the LAN to the Gateway. But we still couldn't understand what other devices needed to talk to the Gateway besides the hubs.

Unfortunately, Firewalla's UI doesn't show flows within local networks. We could use some trial and error approaches using the Hit Counts on Flows to see what is being allowed (or blocked). We can even reset the hit count when we are experimenting to make it really easy to see. We can also use Rule Diagnostics.

But Firewalla has other powerful tools, like tcpdump, which let us see the flows and solve this problem. We can use ssh to access Firewalla and run tcpdump like this:

% sudo tcpdump -i any "src [Security Gateway IP] and (not dst [Apple TV1 IP] and not dst [Apple TV2 IP]) and net [LAN CIDR range]"

In English this says, "Show me traffic that originates with the security Gateway and is not destined to the Apple TVs, but is on the LAN." And boom. We began to see the problem.

This shows that if we try to use HomeKit, say from an iPad, the iPad needs to talk to the security Gateway. Our assumption that the iPad talked to the hub was wrong.

So the answer is all devices need to be able to access the Gateway, but the Gateway does not need to access all the devices.
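The first capture filter can be sketched with shell variables so the placeholders are easy to swap. This is only an illustration: every address below is hypothetical, and the -n flag (skip name lookups) is an addition that was not part of the original command. The two "not dst" tests are folded into a single "not (... or ...)", which matches the same traffic.

```shell
# Hypothetical addresses; substitute the real ones from your network.
GATEWAY_IP="192.168.20.10"
APPLE_TV1_IP="192.168.1.21"
APPLE_TV2_IP="192.168.1.22"
LAN_CIDR="192.168.1.0/24"

# Traffic from the Gateway, not addressed to either hub, involving the LAN.
FILTER="src host $GATEWAY_IP and not (dst host $APPLE_TV1_IP or dst host $APPLE_TV2_IP) and net $LAN_CIDR"
echo "$FILTER"

# On the Firewalla box, pass the filter to tcpdump (commented out here
# because it needs root and a live interface):
# sudo tcpdump -n -i any "$FILTER"
```

Keeping the filter in one variable makes it easier to check the parentheses before running the capture.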

Next, we want to look at it the other way around.

% sudo tcpdump -i any "dst [Gateway IP] and (src [Apple TV1 IP] or src [Apple TV2 IP]) and net [LAN CIDR range]"

In English this says, "Show me traffic that originates from the hubs and is going to the Gateway and is on the LAN."

The result was that the Hubs talk to the Gateway.
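When it is not obvious which LAN devices talk to the Gateway, one option is to tally the source addresses in tcpdump's text output rather than reading it line by line. The helper below is a rough sketch, not something from the original investigation: count_sources is a made-up name, the addresses in the usage comment are hypothetical, and it assumes IPv4 output produced with -n, where the source address is the field just before ">".

```shell
# Tally which source addresses appear in tcpdump text output read from stdin.
count_sources() {
  awk '
    {
      for (i = 2; i <= NF; i++) {
        if ($i == ">") {
          src = $(i - 1)
          n = split(src, p, ".")
          # IPv4 with a port looks like a.b.c.d.port; drop the port if present.
          if (n == 5) src = p[1] "." p[2] "." p[3] "." p[4]
          count[src]++
          break
        }
      }
    }
    END { for (ip in count) print count[ip], ip }
  ' | sort -rn
}

# Hypothetical usage on the Firewalla box (addresses are placeholders):
# sudo tcpdump -n -l -c 200 -i any "dst host 192.168.20.10 and net 192.168.1.0/24" | count_sources
```

The -c 200 option stops the capture after 200 packets so the tally prints on its own; the busiest sources are listed first.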

One last query, this time for traffic that originates from the Gateway and is going to the iPad:

% sudo tcpdump -i any "src [Gateway IP] and (dst [iPad IP])"

So now we know:

iPad talks to the Gateway.
Hubs talk to the Gateway.
Gateway does not need to talk to the iPad.
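A capture of Gateway-to-iPad traffic also includes the Gateway's replies on connections the iPad opened. One way to separate the two, offered here as an assumption and not as a step from the original investigation, is to match only TCP packets with SYN set and ACK clear, which mark the start of a new connection. The addresses are hypothetical, and the test says nothing about UDP.

```shell
# Hypothetical addresses; substitute the real ones from your network.
GATEWAY_IP="192.168.20.10"
LAN_CIDR="192.168.1.0/24"

# SYN set and ACK clear marks the first packet of a new TCP connection.
SYN_ONLY='tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn'

# New TCP connections opened by the Gateway toward the LAN.
INITIATED_BY_GATEWAY="src host $GATEWAY_IP and dst net $LAN_CIDR and ($SYN_ONLY)"
echo "$INITIATED_BY_GATEWAY"

# On the Firewalla box (commented out because it needs root):
# sudo tcpdump -n -i any "$INITIATED_BY_GATEWAY"
```

If this capture stays empty while HomeKit is in use, that is consistent with the Gateway only answering connections and never starting them.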

With that in mind, we can solve the problem with one rule:

ALLOW traffic from LAN to Gateway

This is fine because it is uni-directional. IoT devices do not get access to the trusted LAN. By putting the rule on the Gateway, LAN devices do not get access to all of the other IoT devices. This is the least possible access that will allow things to work. Now the security gateway works perfectly and our network is secure.
