TON_IoT Datasets for Cybersecurity Applications based Artificial Intelligence

The TON_IoT datasets are a new generation of Industry 4.0/Internet of Things (IoT) and Industrial IoT (IIoT) datasets for evaluating the fidelity and efficiency of different cybersecurity applications based on Artificial Intelligence (AI), including Machine and Deep Learning algorithms. The dataset directories are available on CloudStor: https://cloudstor.aarnet.edu.au/plus/s/ds5zW91vdgjEj9i.

The datasets are called ‘ToN_IoT’ because they include heterogeneous data sources: Telemetry datasets of IoT and IIoT sensors, Operating system datasets of Windows 7 and 10 as well as Ubuntu 14 and 18 LTS, and Network traffic datasets. The datasets were collected from a realistic, large-scale network designed at the Cyber Range and IoT Labs of UNSW Canberra Cyber, the School of Engineering and Information Technology (SEIT), UNSW Canberra @ the Australian Defence Force Academy (ADFA).

In the Cyber Range Labs of UNSW Canberra, a testbed network for Industry 4.0 that includes IoT and IIoT devices and services was designed. The testbed provides a systematic Industry 4.0/Industrial IoT (IIoT) environment for creating new realistic datasets, as presented in Figure 1. It was deployed using multiple virtual machines and hosts running Windows, Linux and Kali Linux operating systems to manage the interconnection between the three layers of IoT, Cloud and Edge/Fog systems. A set of IoT devices and sensors, such as green gas IoT and industrial IoT actuators, is connected to MQTT gateways to publish and subscribe to various topics, such as temperature and humidity measurements. The datasets were gathered in parallel to collect several normal and cyber-attack events from IoT networks.

Figure 1: An architectural design for generating datasets from the Industry 4.0/IIoT networks

Different hacking techniques, such as DoS, DDoS and ransomware, were launched against web applications, IoT gateways and computer systems across the IIoT network. The directories of the TON_IoT datasets include the following:

1. Raw datasets

1. IoT/IIoT datasets were logged in log and CSV files, where more than 10 IoT and IIoT sensors, such as weather and Modbus sensors, were used to capture telemetry data.

Links of open source tools used:

2. Network datasets were collected in the packet capture (pcap) format, together with log files and CSV files produced by the Bro tool.

Links of open source tools used:

3. Linux datasets were collected by running tracing tools, in particular atop, on Ubuntu 14 and 18 systems to log disk, process, processor, memory and network activities. The data were logged in TXT and CSV files.

Links of open source tools used:

4. Windows datasets were captured by executing data collectors of the Performance Monitor Tool on Windows 7 and 10 systems. The raw datasets were collected in the blg format, which is opened by the Performance Monitor Tool, and the disk, process, processor, memory and network activities were exported in a CSV format.

Link of open source tool used:
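As an illustration of working with the network log files mentioned in item 2, the following is a minimal sketch of reading a Bro log with the Python standard library. It assumes the standard Bro tab-separated ASCII layout, in which header lines start with '#' and a '#fields' line names the columns. The function name read_bro_log and every value in the sample log are hypothetical and are not taken from the TON_IoT files, whose exact fields are not listed here.

```python
import io


def read_bro_log(lines):
    """Yield one dict per record from a Bro tab-separated ASCII log.

    Hypothetical helper: assumes '#'-prefixed header lines and a
    '#fields' line that names the columns.
    """
    fields = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Only the '#fields' header is needed to name the columns.
            if line.startswith("#fields"):
                fields = line.split("\t")[1:]
            continue
        if fields is None:
            raise ValueError("data row seen before the '#fields' header")
        # Values are kept as strings; no type conversion is attempted.
        yield dict(zip(fields, line.split("\t")))


# Made-up sample in the assumed layout (not real TON_IoT data).
SAMPLE_LOG = "\n".join([
    "#separator \\x09",
    "#path\tconn",
    "#fields\tts\tuid\tid.orig_h\tid.resp_h\tid.resp_p\tproto",
    "#types\ttime\tstring\taddr\taddr\tport\tenum",
    "1556186400.000000\tCabc1\t192.168.1.30\t192.168.1.190\t1883\ttcp",
    "1556186401.500000\tCabc2\t192.168.1.31\t192.168.1.190\t80\ttcp",
    "#close\t2019-04-25-10-00-02",
])

records = list(read_bro_log(io.StringIO(SAMPLE_LOG)))
for record in records:
    print(record["id.orig_h"], "->", record["id.resp_h"], record["id.resp_p"])
```

For a real log file, the same function could be given an open file object instead of the in-memory sample, for example with open(path) as handle, where path points at one of the downloaded log files.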