The Internet is full of bad guys who constantly scan every IP address, looking for unpatched services and misconfigured servers, or simply gathering information about new targets. Most of those Internet scans are done by automated software, and some of the scanners are worms looking to spread.
In this post we will inspect the syslog messages of a home router and then we’ll have some fun deploying an ELK stack on a Raspberry Pi 3 to analyse what is going on. We’ll also play with several Logstash plugins to enhance our logs and create an interesting Kibana dashboard. We’ll see how to use some fields of the IP and TCP headers to fingerprint the operating system behind each source IP address.
One day I found myself wanting to know more about that interesting matter and… Well, in fact I was playing with my home router and decided I could send the router’s syslog messages to another computer to see what was going on. That’s how it all began. I redirected the router’s syslog messages to UDP port 6514 of a Raspberry Pi on the local network and configured its rsyslogd to receive those messages.
After firing some greps I found some strange events appearing on the kernel facility:
I decided to look at those events carefully:
It turned out to be some strange connections from different IP addresses to multiple ports on my router.
Another interesting fact was that this was occurring roughly every 10 minutes:
I thought that was strange. Maybe an internal check that my ISP runs on its network? I decided it was time to look at those logs with something fancier, such as a dashboard, so I installed a full ELK stack on that Raspberry Pi 3.
ELK, named for Elasticsearch, Logstash and Kibana, is installed on the Raspberry Pi as it would be on any Linux server, except for a couple of extra steps needed to run Logstash and Kibana successfully on an ARM architecture. I’m going to show you the workarounds from a GitHub recipe that definitely helped me solve those problems. The first is to install the jffi library for Logstash from source. The second is to install Kibana by downloading its source code and using npm. An important point for the Kibana installation is to have the matching Node.js version installed; see how to install or upgrade your Node.js.
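One rough way to check that periodicity is to compute the gaps between consecutive event timestamps. Here is a minimal Python sketch, assuming the timestamps have already been extracted as strings in the classic syslog format (`Mon DD HH:MM:SS`). That format carries no year, so the `year` parameter is an assumption, and the sample values are made up for illustration.

```python
from datetime import datetime
from statistics import median

def gaps_in_minutes(timestamps, year=2017):
    """Return the gaps, in minutes, between consecutive syslog timestamps.

    The year is an assumption: classic syslog timestamps do not carry one.
    """
    parsed = sorted(
        datetime.strptime(f"{year} {ts}", "%Y %b %d %H:%M:%S") for ts in timestamps
    )
    return [(b - a).total_seconds() / 60 for a, b in zip(parsed, parsed[1:])]

# Hypothetical timestamps, not taken from the real router log.
sample = ["Jul 12 10:00:03", "Jul 12 10:10:05", "Jul 12 10:19:58", "Jul 12 10:30:01"]
gaps = gaps_in_minutes(sample)
print(f"median gap: {median(gaps):.1f} min")
```

A median close to 10 with little spread would point to a scheduled process rather than random scanning noise.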
I created an Ansible playbook to install the ELK stack here. You can just clone it, modify the version numbers for Logstash, Elasticsearch and Kibana and then fire the ansible-playbook command.
I configured Logstash to listen on UDP port 6514 for syslog messages.
I parsed the kernel intrusion messages:
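A quick way to check that something is really listening on that port is to send it a test message by hand. The Python sketch below is only an illustration and not part of the Logstash setup: the function names are mine, the message is a simplified BSD-syslog line (priority and tag only, no timestamp or hostname), and `127.0.0.1` is a placeholder for the address of the Raspberry Pi.

```python
import socket

def build_syslog_line(message, facility=0, severity=4, tag="kernel"):
    """Build a simplified BSD-syslog line; PRI is facility * 8 + severity."""
    pri = facility * 8 + severity
    return f"<{pri}>{tag}: {message}"

def send_test_message(host, port=6514, message="test message from sender"):
    """Send one UDP datagram carrying a syslog line to host:port."""
    payload = build_syslog_line(message).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return payload

if __name__ == "__main__":
    # Placeholder address: replace it with the address of the Raspberry Pi.
    print(send_test_message("127.0.0.1"))
```

Facility 0 is the kernel facility and severity 4 is "warning", so the default priority is `<4>`. UDP gives no delivery confirmation, so the real check is whether the message shows up on the receiving side.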
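To give an idea of what that parsing step has to do, here is a small Python sketch of the same idea outside Logstash. It assumes the kernel messages use netfilter-style `KEY=value` pairs (`SRC=`, `DST=`, `PROTO=`, `DPT=`, `TTL=`, `WINDOW=`), which may not match every router. The sample line is invented and uses addresses from the documentation ranges.

```python
import re

# KEY=value pairs as written by netfilter-style kernel log lines (assumed format).
PAIR_RE = re.compile(r"\b([A-Z]+)=(\S*)")
INT_FIELDS = {"LEN", "TTL", "SPT", "DPT", "WINDOW"}

def parse_kernel_event(line):
    """Return the KEY=value pairs of a log line as a dict, numeric fields as int."""
    event = {}
    for key, value in PAIR_RE.findall(line):
        if key in INT_FIELDS and value.isdigit():
            event[key] = int(value)
        else:
            event[key] = value
    return event

# Invented sample line; the addresses come from the documentation ranges.
sample = (
    "kernel: IN=ppp0 OUT= SRC=203.0.113.7 DST=198.51.100.20 LEN=40 "
    "TTL=52 PROTO=TCP SPT=45123 DPT=23 WINDOW=5840 SYN"
)
event = parse_kernel_event(sample)
print(event["SRC"], event["PROTO"], event["DPT"], event["TTL"])
```

Keeping the ports, TTL and window size as integers makes them easier to aggregate and filter later on.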
I added the geoip Logstash filter plugin with the Maxmind GeoCityLite free database:
And I configured Logstash to send those parsed log messages to an Elasticsearch index:
Finally, with Logstash, Elasticsearch and Kibana working, we can configure the visualization of indices in Kibana. As Elasticsearch and Kibana only listen for connections from localhost, we need to forward a local port from our machine to the Kibana port on the Raspberry Pi:
Then start searching for data and creating visualizations and dashboards.
I used the tile map to show the geographic location of the source IP address of each connection. The GPS coordinates and the country and city names are added to the log entry by the geoip Logstash plugin in combination with the free MaxMind database.
What I got here is a log of all the Internet scanners that are hitting the router’s Internet IP address. Let’s extract some information from that.
First of all, it is still true that connections hit the router approximately every 10 minutes. This is something that needs further investigation.
The alert message carries information about the source of the alert, and that is always an Internet connection to a closed port. And here is the final dashboard that I have built:
So the basic information I am extracting from these scanner logs is the protocols and services (based on the port number) that are actively being scanned, and the source countries.
One of the things we can do to improve our dashboard is to add a Logstash filter that inserts a descriptive service name for each port number into the data sent to Elasticsearch. The translate filter uses an external file as a lookup table (a dictionary) to add an extra field to the document.
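The lookup itself is nothing more than a dictionary access with a default value. Here is a rough Python equivalent of the idea, not the Logstash configuration. The field names `DPT` and `service` and the tiny table are assumptions made for the example.

```python
# Small hand-written lookup table; a real dictionary file would cover many more ports.
PORT_NAMES = {
    22: "ssh",
    23: "telnet",
    80: "http",
    443: "https",
    1433: "ms-sql-s",
    7547: "cwmp",
}

def add_service_name(event, table=PORT_NAMES, fallback="unknown"):
    """Return a copy of the event with a 'service' field looked up from 'DPT'."""
    enriched = dict(event)
    enriched["service"] = table.get(event.get("DPT"), fallback)
    return enriched

print(add_service_name({"SRC": "203.0.113.7", "DPT": 23}))
```

Ports that are missing from the table get the fallback value, so they still show up in the dashboard instead of silently disappearing.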
And after some time we have our new dashboard with descriptive names for TCP port numbers:
Apart from extracting the geolocation of IP addresses and the names of targeted applications or services, some fingerprinting can be done with the TTL field of the IP header and the Window field of the TCP header:
We can, for example, see which operating systems dominate for each port number used in the scan, based on that information.
For example, given the TTL field of connections hitting port 23 (telnet), it seems that the majority of telnet scanners run on Linux/Unix or Cisco IOS systems.
In the same way, based on the TTL field, there is apparently no Linux/Unix machine scanning port 1433 (Microsoft SQL Server).
Those results make sense. However, they cannot be considered valid: there are too few samples, and there are many factors to take into consideration, such as proxy usage or TCP header manipulation by the sending machines.
We can compare the information we gathered with the threat intelligence that is publicly available on the Internet.
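The usual rule of thumb is that each router on the path decrements the TTL by one, so the observed value can be rounded up to the nearest common initial TTL to guess the sender's operating system family. The Python sketch below uses the commonly cited defaults (64, 128, 255). That mapping is an assumption rather than an authoritative table, and the sketch ignores the Window field entirely.

```python
# Commonly cited default initial TTLs; a rule of thumb, not an authoritative mapping.
INITIAL_TTL_GUESSES = {
    64: "Linux/Unix",
    128: "Windows",
    255: "Cisco IOS or other network gear",
}

def guess_initial_ttl(observed_ttl):
    """Round the observed TTL up to the nearest common initial value."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError(f"TTL out of range: {observed_ttl}")
    for initial in sorted(INITIAL_TTL_GUESSES):
        if observed_ttl <= initial:
            return initial

def guess_os(observed_ttl):
    """Return (os_guess, estimated_hops) for an observed TTL."""
    initial = guess_initial_ttl(observed_ttl)
    return INITIAL_TTL_GUESSES[initial], initial - observed_ttl

for ttl in (52, 113, 243):
    print(ttl, guess_os(ttl))
```

The guess breaks down for hosts that are more than 64 hops away or that tune their TTL, which is one more reason to treat the result as a hint only.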
Let’s start with the Internet Storm Center from SANS.
They have a trends page for scanned ports on the Internet. The top 3 TCP ports are 8889 (ddi-tcp-2), 2022 (down) and 8888 (ddi-tcp-1).
Let’s continue with the F-Secure threat report 2017.
In the section named “Who’s after who?” they say that the popular scanned ports are http/https, SMTP, Telnet and SSDP. The actual ranking says that the top 3 ports are 80, 25 and 443 (presumably TCP). Also, the top 3 source countries for Telnet honeypots are Taiwan, China and India. The report also recaps the discovery of the Mirai-based botnets, which target TCP port 7547.
Our window of exposure has surely been very short compared to the ones used by those security vendors. We also have a far lower number of “sensors” deployed to gather that information. That is why our results are nowhere near the ones seen in the threat reports.
We gained some experience deploying an ELK stack on a Raspberry Pi 3, and we played with several Logstash plugins to enhance our logs and produce an interesting Kibana dashboard. We played around with some IP and TCP header fields to fingerprint the operating system of the source machines. Finally, we have built a modest intelligence gathering system from which we can obtain a list of scanning IP addresses and use it as an IP address blacklist.
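As a closing illustration, turning the collected source addresses into a blacklist can be as simple as counting them. This Python sketch assumes the source IPs have already been exported from Elasticsearch as a plain list of strings. The `min_hits` threshold and the `allowlist` of trusted networks are hypothetical parameters, and the sample addresses come from the documentation ranges.

```python
from collections import Counter
import ipaddress

def build_blacklist(source_ips, min_hits=2, allowlist=("192.168.0.0/16",)):
    """Return addresses seen at least min_hits times, most frequent first."""
    trusted = [ipaddress.ip_network(net) for net in allowlist]
    counts = Counter()
    for raw in source_ips:
        try:
            address = ipaddress.ip_address(raw)
        except ValueError:
            continue  # skip malformed entries
        if any(address in net for net in trusted):
            continue  # never blacklist our own networks
        counts[str(address)] += 1
    return [ip for ip, hits in counts.most_common() if hits >= min_hits]

# Invented sample data using documentation address ranges.
seen = ["203.0.113.7"] * 3 + ["198.51.100.9"] * 2 + ["192.168.1.10"] * 5 + ["192.0.2.1"]
print(build_blacklist(seen))
```

Requiring more than one hit is a cheap way to keep one-off noise out of the list, at the cost of missing scanners that only show up once.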