Honeypots are really useful for collecting security data for research, especially around botnets, scanning hosts, password brute forcers, and other misbehaving systems. They are also the cheapest way to collect this data at scale. Deploying many types of honeypots across geographically diverse locations on the Internet improves the aggregate data quality and provides a holistic view. This provides insight into both global trends of attacks and network activity as well as the behaviors of individual malicious systems. For these reasons, we started the Modern Honey Network, which is both an open source (GPLv3) project and a community of hundreds of MHN servers that manage and aggregate data from thousands of heterogeneous honeypots (Dionaea, Kippo, Amun, Conpot, Wordpot, Shockpot, and Glastopf) and network sensors (Snort, Suricata, p0f) deployed by different individuals and organizations as a distributed sensor network. The project has turned into the largest crowdsourced honeynet in the world, consisting of thousands of diverse sensors deployed across 35 countries and 5 continents. Sensors are operated by all sorts of people, from hobbyists, to academic researchers, to Fortune 1000 companies. In this talk we will discuss our experience in starting this project, analyzing the data, and building a crowdsourced global sensor network for tracking security threats and gathering interesting data for research. We've found that lots of people like honeypots, especially if you give them a cool realtime visualization of their data and make it easy to set up; lots of organizations will share their data with you if it is part of a community; and lots of companies will deploy honeypots as additional network sensors, especially if you make it easy to deploy/manage/integrate with their existing security tools.
More than 10 years of experience in security, primarily building distributed systems, big data analytics, and most recently data science
In this talk, when I say honeypot, I am referring to low interaction honeypots.
Local vs. Global Deployment: is this IP scanning/attacking everyone or just my network?
Anyone go to DerbyCon? Did you see Katherine Trame and David Sharpe’s talk? They are from the GE-CIRT team. This is a slide they presented that showed the types of attacks that their team responded to over the past 3 years. Internet-facing assets represented the vast majority of incidents they responded to. IMO, this makes a strong case for honeypots.
automates the install process for each honeypot: install dependencies, install honeypot, run under supervisord, get data flow going to MHN server using HPFeeds. Makes them manageable.
GNU Lesser General Public License (LGPL)
Start with sensors
hpfeeds -> honeymap
hpfeeds to mnemosyne
hpfeeds to hpfeeds-logger for integrations
web app for users to manage, deploy and explore the data
REST APIs for building apps and automation around MHN
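To make the data flow above concrete, here is a minimal sketch of the publish/subscribe fan-out idea: one sensor event is published once and every subscribed consumer receives it. This is a toy in-memory stand-in, not the real hpfeeds protocol or library. The `Broker` class, the channel name `example.events`, and the three consumer functions are assumptions made up for illustration; they only mirror the roles that honeymap, mnemosyne, and hpfeeds-logger play in the notes.

```python
import json
from collections import defaultdict


class Broker:
    """Toy in-memory stand-in for an hpfeeds broker (not the real protocol)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # channel -> list of callbacks

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, ident, channel, payload):
        # Every subscriber of the channel receives every message.
        for callback in self._subscribers[channel]:
            callback(ident, channel, payload)


map_points = []  # what a honeymap-like visualizer would plot
stored = []      # what a mnemosyne-like store would persist
log_lines = []   # what an hpfeeds-logger-like exporter would write


def to_map(ident, channel, payload):
    map_points.append(json.loads(payload)["src_ip"])


def to_store(ident, channel, payload):
    stored.append({"ident": ident, "channel": channel, "event": json.loads(payload)})


def to_log(ident, channel, payload):
    event = json.loads(payload)
    log_lines.append(
        "sensor=%s channel=%s src_ip=%s" % (ident, channel, event["src_ip"])
    )


broker = Broker()
CHANNEL = "example.events"  # hypothetical channel name
for consumer in (to_map, to_store, to_log):
    broker.subscribe(CHANNEL, consumer)

# A sensor publishes one event; all three consumers see it.
broker.publish(
    "sensor-1", CHANNEL, json.dumps({"src_ip": "203.0.113.7", "dst_port": 22})
)
```

The point of the design is that sensors only need to know about the broker; adding a new integration means adding another subscriber, with no change on the sensor side.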
MHN is also a community of MHN Servers that contribute honeypot events. Anyone can install MHN and then start deploying honeypots. If they opt to share their data, it is contributed to the community and they can get access to the data.
Sharing data back to the community is optional
Anyone that does share can get access to aggregated data on attackers
Currently working on a way to share more granular event data
428 MHN Servers: 413 /24’s and 286 /16’s. This should put a bound on DHCP-related changes
428 MHN Servers, 42 countries, 6 continents (did IP geo on the MHN server IPs)
2,959 Sensors, 35 countries, 5 continents (self reported IP GEO from maxmind)
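The /24 and /16 counts above can be computed by collapsing each server IP to its enclosing network and counting the distinct results. Presumably the reasoning is that a server whose address changes through DHCP usually stays inside the same network, so the number of distinct prefixes is a conservative count of distinct deployments. The sketch below uses Python's standard `ipaddress` module; the sample addresses are made-up documentation-range values, not MHN data, and `distinct_prefixes` is a hypothetical helper name.

```python
import ipaddress


def distinct_prefixes(ips, prefix_len):
    """Return the set of distinct IPv4 networks of the given length covering ips."""
    return {
        ipaddress.ip_network("%s/%d" % (ip, prefix_len), strict=False)
        for ip in ips
    }


# Made-up sample data (documentation ranges), standing in for server IPs.
server_ips = ["198.51.100.10", "198.51.100.77", "203.0.113.5", "192.0.2.1"]

slash24s = distinct_prefixes(server_ips, 24)
slash16s = distinct_prefixes(server_ips, 16)

print(len(server_ips), "servers")
print(len(slash24s), "distinct /24s")
print(len(slash16s), "distinct /16s")
```

In this sample the first two addresses share a /24, so four addresses collapse to three networks.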
Anyone want to speculate why there was a surge in sensors added here? Here’s a hint: this was Sept 30 and Oct 1. Shellshock.
As you can see, Shellshock is what caused the MHN project to really take off.
Forgive the drop-off in late November; we had a collection outage.
The huge spike is from Dionaea sensors, and this is actually not from the surge in sensors added. This was 2 weeks later.
We investigated, and if you look at the attack IPs…
39M events from one sensor. Thanks!
269,746,704 – 39,157,182 = 230,589,522
vast majority of the events that come in are from Dionaea, then Kippo and Amun
notice the RFC 1918 spike is gone
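The notes do not say how the spike was removed from the charts. One plausible way, sketched here as an assumption rather than as what was actually done, is to drop events whose source address falls in the RFC 1918 private ranges before aggregating. The event dictionaries, field names, and the `is_rfc1918` helper are hypothetical.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(ip):
    """True if ip is an address inside one of the RFC 1918 private ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in RFC1918_NETS)


# Made-up sample events; field names are assumptions for illustration.
events = [
    {"sensor": "sensor-a", "src_ip": "192.168.1.20"},
    {"sensor": "sensor-a", "src_ip": "10.4.4.4"},
    {"sensor": "sensor-b", "src_ip": "203.0.113.9"},
    {"sensor": "sensor-c", "src_ip": "172.32.0.1"},  # just outside 172.16.0.0/12
]

public_events = [e for e in events if not is_rfc1918(e["src_ip"])]
print(len(events) - len(public_events), "private-source events dropped")
```

Filtering by source range rather than by sensor keeps any legitimate Internet-sourced events that the same sensor also reported.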
The countries of origin for the events are primarily USA, China, France, Hong Kong, and Taiwan. This is not attribution, this is just stats on the aggregated data we collected.
* crowdsourcing was coined in 2005.
* wikipedia: Crowdsourcing is the process of obtaining needed services, ideas, or content by soliciting contributions from a large group of people, and especially from an online community, rather than from traditional employees or suppliers.
* ThreatStream is a big believer in crowdsourcing, especially for security data. Our Optic platform leverages this concept to enable companies to share diverse threat intelligence with each other. Our MHN project leverages it to collect and share global honeypot data.
Many people I’ve spoken to have set this up primarily for the ThreatMap it provides them
we were all beginners once
There will be many n00bs, help them and be patient
Be willing to provide help beyond the scope of just your project (within reason)
network troubleshooting
misconfigured systems
etc
Courtesy can be lost in translation (literally) – lots of international users and it seems like they use Google translate to create their help emails.
It was submitted to Splunkbase and is waiting for approval
ThreatStream is big on open source contributions. If you go to our GitHub page, you will see 24 publicly shared open source projects (10 are original projects, 14 are forks we’ve made and contributed our changes back). Expect more to come. Here are the main projects that we authored related to MHN.
MHN – the main mhn project
mhn-splunk – the MHN Splunk App
hpfeeds-logger – the generic hpfeeds logger to enable integrations with Splunk and ArcSight
shockpot
Thanks to these contributors, supporters, and vocal users. We appreciate your help and support.
I would highly recommend making a donation to the Honeynet Project. MHN relies on many of their packages and they do awesome work.