UGR'16 Dataset

The dataset presented here is built with real traffic and up-to-date attacks. The data come from several NetFlow v9 collectors strategically located in the network of a Spanish ISP. It is composed of two distinct sets of data, each split into weeks:

  • A CALIBRATION set of data gathered from March to June of 2016 (4 months), containing real background traffic data.
  • A TEST set of data gathered from July to August of 2016, containing real background traffic together with synthetically generated traffic that corresponds to several well-known types of attacks.

The main advantage of this dataset over previous ones is its usefulness for evaluating intrusion detection systems (IDSs) that consider long-term evolution and traffic periodicity. Models that account for differences between day and night, or between working days and weekends, can also be trained and evaluated with it.
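
As an illustration of the periodicity mentioned above, the following minimal Python sketch labels a flow timestamp with its ISO week, a day/night period and a working-day/weekend flag. Everything in it is an assumption made for the example and not part of the dataset documentation: the function name `time_context`, the timestamp format `YYYY-MM-DD HH:MM:SS`, and the 08:00 to 20:00 daytime window.

```python
from datetime import datetime

# Assumed daytime window (08:00 to 20:00); adjust to the traffic being studied.
DAY_START_HOUR = 8
DAY_END_HOUR = 20


def time_context(timestamp: str) -> dict:
    """Label a flow timestamp with ISO week, day/night and day type.

    The timestamp format is an assumption for this sketch.
    """
    ts = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    iso_year, iso_week, _ = ts.isocalendar()
    return {
        # ISO week label, e.g. for grouping flows week by week
        "week": f"{iso_year}-W{iso_week:02d}",
        # "day" inside the assumed window, "night" otherwise
        "period": "day" if DAY_START_HOUR <= ts.hour < DAY_END_HOUR else "night",
        # weekday() returns 5 for Saturday and 6 for Sunday
        "day_type": "weekend" if ts.weekday() >= 5 else "working_day",
    }
```

For example, `time_context("2016-03-19 23:15:00")` describes a flow seen late on a Saturday, so it is labelled as `night` and `weekend`. Labels of this kind could be used as extra features or as keys for grouping flows before fitting a model.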

It can be downloaded here.