MSP API Calls to Get Flow Data - a practical example
After a lot of research, trial and error, and very patient help from the great Firewalla support team, I can now pull flow data from the MSP portal.
There are two output formats: one is single-line JSON, and the other is a format that can be easily imported into other tools such as a spreadsheet. The commands shown here can be easily tweaked to suit individual requirements.
The JSON output:
{"ts":"04/04/2023 09:55:19","ip":"84.206.126.111","devicePort":"14330","devicePortInfo":null,"protocol":"tcp","country":"HU","fd":"out","direction":null,"blocked":true,"networkName":"WAN","audit":null}
{"ts":"04/04/2023 09:55:17","ip":"87.246.7.206","devicePort":"8080","devicePortInfo":"http-alt","protocol":"tcp","country":"BG","fd":"out","direction":null,"blocked":true,"networkName":"WAN","audit":null}
{"ts":"04/04/2023 09:55:13","ip":"185.156.73.53","devicePort":"35708","devicePortInfo":null,"protocol":"tcp","country":"RU","fd":"out","direction":null,"blocked":true,"networkName":"WAN","audit":null}
{"ts":"04/04/2023 09:55:11","ip":"89.248.165.109","devicePort":"14589","devicePortInfo":null,"protocol":"tcp","country":"NL","fd":"out","direction":null,"blocked":true,"networkName":"WAN","audit":null}
{"ts":"04/04/2023 09:54:58","ip":"94.102.61.44","devicePort":"40694","devicePortInfo":null,"protocol":"tcp","country":"NL","fd":"out","direction":null,"blocked":true,"networkName":"WAN","audit":null}
{"ts":"04/04/2023 09:54:38","ip":"167.248.133.161","devicePort":"9455","devicePortInfo":null,"protocol":"tcp","country":"US","fd":"out","direction":null,"blocked":true,"networkName":"WAN","audit":null}
{"ts":"04/04/2023 09:54:35","ip":"79.124.62.82","devicePort":"33883","devicePortInfo":null,"protocol":"tcp","country":"BG","fd":"out","direction":null,"blocked":true,"networkName":"WAN","audit":null}
{"ts":"04/04/2023 09:54:25","ip":"89.248.163.26","devicePort":"21469","devicePortInfo":null,"protocol":"tcp","country":"NL","fd":"out","direction":null,"blocked":true,"networkName":"WAN","audit":null}
The CSV output:
04/04/2023 09:55:19 84.206.126.111 tcp HU
04/04/2023 09:55:17 87.246.7.206 http-alt tcp BG
04/04/2023 09:55:13 185.156.73.53 tcp RU
04/04/2023 09:55:11 89.248.165.109 tcp NL
04/04/2023 09:54:58 94.102.61.44 tcp NL
04/04/2023 09:54:50 79.124.62.130 tcp BG
04/04/2023 09:54:45 163.47.36.34 tcp BD
04/04/2023 09:54:42 94.102.61.5 tcp NL
The Linux Bash script that generates the output
#!/bin/bash
token="4ab3...143d" # Enter your API token
current_time=$(date +%s)
start_time=$(date -d "4 minutes ago" +%s) # Adjust as needed to define the time span of records to retrieve
limit=50 # Adjust as needed - max seems to vary but is around 5,000
URL="https://foobar.firewalla.net/v1/flows/query" # This is your MSP URL
echo -e "\nPulling flows from now $(date) and $(date -d "4 minutes ago")\n"
curl -s --request POST --url "$URL" \
--header 'Authorization: Token '"$token" \
--header 'Content-Type: application/json' \
--data '{ "limit": '"$limit"', "start": '"$start_time"', "end": '"$current_time"', "filters":[{"key":"direction","values":["out"]},{"key":"status","values":["audit"]}, {"key":"networkName","values":["WAN"]} ] }' \
| jq -c --color-output -r '.[] | {ts: (.ts | tonumber | strflocaltime("%m/%d/%Y %H:%M:%S")),ip,devicePort,devicePortInfo: .devicePortInfo.name,protocol,country,fd,direction,blocked,networkName,audit} '
curl -s --request POST --url "$URL" \
--header 'Authorization: Token '"$token" \
--header 'Content-Type: application/json' \
--data '{ "limit": '"$limit"', "start": '"$start_time"', "end": '"$current_time"', "filters":[{"key":"direction","values":["out"]}, {"key":"status","values":["audit"]}, {"key":"networkName","values":["WAN"]} ] }' \
| jq -c --color-output -r '.[] | select(.fd=="out") | {ts: (.ts | tonumber | strflocaltime("%m/%d/%Y %H:%M:%S")),ip,devicePort,devicePortInfo: .devicePortInfo.name,protocol,country,fd,direction,blocked,networkName,audit} | [.ts,.ip,.devicePortInfo,.protocol,.country] | @csv' \
| sed 's/\"//g' | column -t -s,
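The script above calls date separately for the end time, the start time, and the echo message, so the three values can drift slightly apart, and date -d "4 minutes ago" relies on GNU date. As a minimal sketch (the names window_seconds and end_time are my own, not part of the original script), both ends of the window can be derived from a single timestamp with shell arithmetic:

```shell
#!/bin/bash
# Sketch: derive both ends of the query window from one timestamp.
# window_seconds is a hypothetical name; 240 seconds matches "4 minutes ago".
window_seconds=240
end_time=$(date +%s)                          # seconds since the Unix epoch
start_time=$(( end_time - window_seconds ))   # plain arithmetic, no second date call

# date -d "@N" (GNU date) turns epoch seconds back into a readable time.
echo "Pulling flows between $(date -d "@$start_time") and $(date -d "@$end_time")"
```

The two values can then be used for "start" and "end" in the request in place of start_time and current_time.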
First API Call Step-By-Step
- token="4ab3...143d": Defines a variable called token and assigns it your API token value.
- current_time=$(date +%s): Defines a variable called current_time and sets it to the current time in seconds since the Unix epoch. This is the end of the time range of records to be returned.
- start_time=$(date -d "4 minutes ago" +%s): Defines a variable called start_time and sets it to the time 4 minutes ago, in seconds since the Unix epoch. This is the beginning of the time range of records to be returned.
- limit=50: Defines a variable called limit and sets it to 50, the maximum number of records to return. You can make this fairly high. Roughly 5,000 appears to be the absolute maximum, although the actual limit seems to vary (possibly with server load). If the maximum is exceeded, the message Internal Server Error is returned.
- URL="https://foobar.firewalla.net/v1/flows/query": Defines a variable called URL and sets it to the API endpoint to query, which is based on your MSP site URL.
- echo -e "\nPulling flows from now $(date) and $(date -d "4 minutes ago")\n": Prints a message to the console showing the current time and the time 4 minutes ago, which is the range being pulled. I used this primarily for testing.
- curl -s --request POST --url "$URL" \: Sends a POST request to the API endpoint using the curl command with the following options:
  - The -s option makes curl silent, so it doesn't show the progress meter.
  - The --request POST option sets the request method to POST.
    Not all API calls need --request POST. In this example, POST is used because the request sends a JSON payload to the server in the body of the request. The --header 'Content-Type: application/json' option specifies that the payload is in JSON format, and the --data option supplies the payload itself. In general, POST is needed when an API call submits data in the request body, most commonly to create a new resource on the server; here the body carries the query parameters instead. (curl also switches to POST on its own when --data is given, so the explicit option mainly documents the intent.)
  - The --url "$URL" option specifies the URL of the API endpoint to query.
  - Be very careful with quotes within a curl command. The single-quoted pieces are passed literally, and each variable is spliced in by closing the single quote, adding the double-quoted variable, and reopening the single quote.
- --header 'Authorization: Token '"$token" \: Sets the Authorization header to include the token value defined earlier.
- --data '{ "limit": '"$limit"', "start": '"$start_time"', "end": '"$current_time"', "filters":[{"key":"direction","values":["out"]}, {"key":"status","values":["audit"]}, {"key":"networkName","values":["WAN"]} ] }' \: This --data field contains the JSON payload that is sent along with the POST request to the API endpoint. Here's a breakdown of each key-value pair:
  - "limit": '"$limit"': Sets the value of the "limit" key to the value of the $limit variable, which is 50. This specifies the maximum number of records to retrieve.
  - "start": '"$start_time"': Sets the value of the "start" key to the value of the $start_time variable, which is the Unix timestamp of the time 4 minutes ago. This specifies the start time of the query.
  - "end": '"$current_time"': Sets the value of the "end" key to the value of the $current_time variable, which is the current Unix timestamp. This specifies the end time of the query.
  - "filters":[{"key":"direction","values":["out"]},{"key":"status","values":["audit"]}, {"key":"networkName","values":["WAN"]} ]: Sets the value of the "filters" key to an array of objects, each specifying a filter to apply to the query results. Each object has a "key" naming the field to filter on and a "values" array listing the accepted values. For the first filter:
    - "direction": This specifies the direction of the network flow.
    - ["out"]: This selects only the flows that come from OUTSIDE. In other words, it asks for INBOUND flows.
    The status and networkName filters follow the same pattern.
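The quoting in the --data argument is easy to get wrong. One alternative, sketched here under the assumption that jq 1.5 or later is installed, is to let jq assemble the payload with --argjson so the shell variables never have to be spliced into a quoted string. The names payload, start_ts, and end_ts are my own; the keys and filter values are the ones used in the script.

```shell
#!/bin/bash
limit=50
current_time=$(date +%s)
start_time=$(( current_time - 240 ))   # 4 minutes before the end time

# jq -n builds JSON from scratch; --argjson passes each value in as a number.
payload=$(jq -n -c \
  --argjson limit "$limit" \
  --argjson start_ts "$start_time" \
  --argjson end_ts "$current_time" \
  '{"limit": $limit, "start": $start_ts, "end": $end_ts,
    "filters": [
      {"key": "direction",   "values": ["out"]},
      {"key": "status",      "values": ["audit"]},
      {"key": "networkName", "values": ["WAN"]}
    ]}')

echo "$payload"
```

The result can then be handed to curl as --data "$payload", with no nested quotes to keep track of.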
Piping to JQ
Please read about JQ and why it's being used here.
| jq -c --color-output -r '.[] | {ts: (.ts | tonumber | strflocaltime("%m/%d/%Y %H:%M:%S")), ip, devicePort, devicePortInfo: .devicePortInfo.name, protocol, country, fd, direction, blocked, networkName, audit} '
This pipe takes the output from the curl command and pipes it into jq, which is a powerful command-line JSON processor. The jq command extracts the relevant data from the API response and formats it for human-readable output.
Here's a breakdown of the components of the jq command:
- -c: This option makes the output compact, so each JSON object is printed on a single line rather than pretty-printed. Try running the command without -c and see the difference.
- --color-output: This option colorizes the output, which makes it easier to read.
- -r: This option makes the output raw: when a result is a string, it is printed directly rather than as a quoted, escaped JSON string. It makes no visible difference to the objects produced here, but it matters for the CSV output described later.
Without -r or -c the output looks like this.
{
  "ts": "04/04/2023 10:52:27",
  "ip": "89.248.163.26",
  "devicePort": "17422",
  "devicePortInfo": null,
  "protocol": "tcp",
  "country": "NL",
  "fd": "out",
  "direction": null,
  "blocked": true,
  "networkName": "WAN",
  "audit": null
}
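To see what -c and -r each change, it can help to try them on a small hand-written sample instead of a live API response. The sample record below is made up for illustration.

```shell
#!/bin/bash
# Made-up sample record, not real API output.
sample='[{"ip":"192.0.2.10","protocol":"tcp"}]'

# Default: pretty-printed across several lines.
pretty=$(echo "$sample" | jq '.[]')

# -c: the same object on a single line.
compact=$(echo "$sample" | jq -c '.[]')

# -r only changes string results: the surrounding quotes disappear.
quoted=$(echo "$sample" | jq '.[].ip')
raw=$(echo "$sample" | jq -r '.[].ip')

printf '%s\n' "$pretty" "$compact" "$quoted" "$raw"
```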
The jq command itself is a filter that processes the JSON objects in the API response. Here's a breakdown of what each component of the filter does:
- .[]: This selects all the elements of the top-level array in the API response. In this case, the API response is an array of flow objects.
- {ts: (.ts | tonumber | strflocaltime("%m/%d/%Y %H:%M:%S")), ip, devicePort, devicePortInfo: .devicePortInfo.name, protocol, country, fd, direction, blocked, networkName, audit}: This creates a new object for each flow object in the array, keeping only the fields we're interested in.
The ts (timestamp) field is converted from a Unix timestamp to a human-readable string using the strflocaltime function, which formats the date and time according to the specified format string ("%m/%d/%Y %H:%M:%S") in the local time zone of the machine running the script. This also aligns with what is shown in the mobile app.
The devicePortInfo field is reduced to just its name (.devicePortInfo.name), which is null when the flow has no port name.
The ip, devicePort, protocol, country, fd, direction, blocked, networkName, and audit fields are copied from the returned JSON as is, without modification.
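The .[] step assumes the response is a JSON array. An error body such as Internal Server Error is not valid JSON, so jq would stop with a parse error. A small guard, sketched here with a hypothetical helper named is_flow_array and made-up sample responses, can check the response before formatting it:

```shell
#!/bin/bash
# is_flow_array is a hypothetical helper: it succeeds only when the text on
# stdin parses as JSON and the top-level value is an array.
is_flow_array() {
  jq -e 'type == "array"' >/dev/null 2>&1
}

good='[{"ip":"192.0.2.10"}]'      # made-up sample response
bad='Internal Server Error'       # the error text mentioned earlier

for response in "$good" "$bad"; do
  if printf '%s' "$response" | is_flow_array; then
    printf '%s' "$response" | jq -c '.[]'
  else
    echo "Unexpected response: $response" >&2
  fi
done
```

In the real script the response would come from curl, saved to a variable first so it can be checked and then formatted.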
If strftime were used in the jq command, it would show the time in UTC. In this case, strflocaltime formats the timestamp according to the specified format string in the local time zone.
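The difference between the two functions can be checked without calling the API. In this sketch the timestamp 1680602119 and the TZ value EST5 (a fixed offset five hours behind UTC) are arbitrary choices for illustration, and strflocaltime is assumed to be available (jq 1.6 or later).

```shell
#!/bin/bash
ts=1680602119                 # arbitrary example timestamp
fmt='%m/%d/%Y %H:%M:%S'       # same format string as the script

# strftime formats in UTC, whatever TZ is set to.
utc=$(TZ=EST5 jq -rn --argjson ts "$ts" --arg fmt "$fmt" '$ts | strftime($fmt)')

# strflocaltime formats in the time zone of the process, here forced to EST5.
loc=$(TZ=EST5 jq -rn --argjson ts "$ts" --arg fmt "$fmt" '$ts | strflocaltime($fmt)')

echo "UTC:   $utc"    # 04/04/2023 09:55:19
echo "local: $loc"    # 04/04/2023 04:55:19
```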
Producing CSV output
The second API call reuses the prior curl command and pipes additional filters inside the jq command. For my specific use case I actually wanted easy-to-read output rather than results to feed to a spreadsheet. So, technically, these results are not CSV.
- select(.fd=="out"): Keeps only the flow objects whose fd field is set to "out".
- [.ts,.ip,.devicePortInfo,.protocol,.country]: The object built by the previous step is piped to this filter, which selects only the ts, ip, devicePortInfo, protocol, and country fields. The result is one array per flow object, containing the values of the selected fields.
- @csv: This filter converts each array into one line of comma-separated values (CSV), which is a common format for tabular data. The @csv filter handles the quoting and escaping of special characters in the CSV output.
  In addition to the @csv filter, jq offers other format filters for the processed data. Here are some commonly used ones:
  - @html: Escapes the characters that are special in HTML and XML.
  - @json: Serializes the input as JSON.
  - @tsv: Outputs the array as tab-separated values (TSV).
- sed 's/\"//g': This command uses sed to remove the double quotes that surround each field in the CSV output. The s/\"//g expression replaces all occurrences of double quotes with nothing, effectively removing them.
- column -t -s,: This command uses the column command to format the CSV output into a table with aligned columns. If you really want CSV output, then exclude the pipe to column. Here's what each component of the column command does:
  - -t: This option tells column to work out the number of columns in the input and lay it out as a table.
  - -s,: This option specifies that the input columns are separated by a comma.
Overall, this pipeline of commands filters and formats the API response data to select only the flow objects where the fd field is set to "out", and then outputs a table of the ts, ip, devicePortInfo, protocol, and country fields for each selected flow object, aligned for easy readability. Without the final pipe to column, the same data comes out as plain comma-separated lines like these:
04/04/2023 10:53:23,89.248.163.189,backroomnet,tcp,NL
04/04/2023 10:53:18,79.124.62.78,websm,tcp,BG
04/04/2023 10:53:17,167.94.146.26,,tcp,US
04/04/2023 10:53:10,141.98.11.47,tftp,udp,LT
04/04/2023 10:53:00,79.124.62.130,plethora,tcp,BG
04/04/2023 10:52:57,194.26.29.86,,tcp,RU
04/04/2023 10:52:55,176.111.174.97,,tcp,RU
04/04/2023 10:52:52,79.124.58.146,,tcp,BG
04/04/2023 10:52:38,87.246.7.206,pcsync-https,tcp,BG
04/04/2023 10:52:32,89.248.163.19,,tcp,NL
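One thing to watch for: when devicePortInfo is null, @csv writes an empty field, and some versions of column appear to collapse empty fields, which shifts the remaining values to the left. A possible workaround, sketched here on made-up sample data, is to substitute a placeholder with jq's // operator before the CSV step. The "-" placeholder is an arbitrary choice.

```shell
#!/bin/bash
# Made-up sample records shaped like the fields used above.
sample='[
  {"ts":"04/04/2023 10:53:18","ip":"192.0.2.78","devicePortInfo":{"name":"websm"},"protocol":"tcp","country":"BG"},
  {"ts":"04/04/2023 10:53:17","ip":"192.0.2.26","devicePortInfo":null,"protocol":"tcp","country":"US"}
]'

# (.devicePortInfo.name // "-") yields "-" whenever the name is null or missing.
rows=$(echo "$sample" \
  | jq -r '.[] | [.ts, .ip, (.devicePortInfo.name // "-"), .protocol, .country] | @csv' \
  | sed 's/"//g')

# Every row now has five fields, so the table stays aligned.
echo "$rows" | column -t -s,
```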