Detection of Unauthorized IoT Devices Using Machine Learning Techniques (1709.04647v1)

Published 14 Sep 2017 in cs.CR and cs.CV

Abstract: Security experts have demonstrated numerous risks imposed by Internet of Things (IoT) devices on organizations. Due to the widespread adoption of such devices, their diversity, standardization obstacles, and inherent mobility, organizations require an intelligent mechanism capable of automatically detecting suspicious IoT devices connected to their networks. In particular, devices not included in a white list of trustworthy IoT device types (allowed to be used within the organizational premises) should be detected. In this research, Random Forest, a supervised machine learning algorithm, was applied to features extracted from network traffic data with the aim of accurately identifying IoT device types from the white list. To train and evaluate multi-class classifiers, we collected and manually labeled network traffic data from 17 distinct IoT devices, representing nine types of IoT devices. Based on the classification of 20 consecutive sessions and the use of majority rule, IoT device types that are not on the white list were correctly detected as unknown in 96% of test cases (on average), and white listed device types were correctly classified by their actual types in 99% of cases. Some IoT device types were identified quicker than others (e.g., sockets and thermostats were successfully detected within five TCP sessions of connecting to the network). Perfect detection of unauthorized IoT device types was achieved upon analyzing 110 consecutive sessions; perfect classification of white listed types required 346 consecutive sessions, 110 of which resulted in 99.49% accuracy. Further experiments demonstrated the successful applicability of classifiers trained in one location and tested on another. In addition, a discussion is provided regarding the resilience of our machine learning-based IoT white listing method to adversarial attacks.

Citations (204)

Summary

The paper's main contribution is a machine learning framework that achieves near-perfect identification of unauthorized IoT devices using supervised techniques.
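It employs a Random Forest classifier with majority voting over consecutive sessions, trained on traffic from 17 IoT devices of nine types. On average, device types absent from the white list were detected as unknown in 96% of test cases, and white-listed types were correctly classified in 99% of cases.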
The results underscore strong transferability and practical applicability, offering automated IoT management insights to enhance network security.

Detection of Unauthorized IoT Devices Using Machine Learning Techniques

In the domain of cybersecurity, the proliferation of the Internet of Things (IoT) poses significant risks to organizations due to its varied and increasingly numerous devices. The paper "Detection of Unauthorized IoT Devices Using Machine Learning Techniques" tackles the problem of automatically recognizing and categorizing network-connected IoT devices, comparing them against a predefined white list of authorized device types. This is increasingly relevant as organizations must secure their networks against potential vulnerabilities inherent in external and unauthorized IoT devices.

Methodology

The authors employ a supervised machine learning approach, using the Random Forest algorithm to classify IoT device types from features extracted from network traffic data. The dataset covers 17 IoT devices of nine different types, spanning a range of consumer IoT technology. The traffic was collected and manually labeled by the authors, which gives the multi-class classifiers ground truth for both training and evaluation.

Classifier training involved generating a labeled training set from device traffic data, followed by optimization on a validation set. The key tuned parameter is the classification threshold, which is chosen to maximize the F-measure, a metric that weights precision and recall equally (their harmonic mean). The trained model was then tested on a separate dataset to check that it generalizes.
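The threshold tuning step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that a session is flagged as "unknown" when the classifier's highest class probability falls below the threshold, and it scores each candidate threshold by the F-measure of the "unknown" class on validation data. The function names, the candidate grid, and the toy validation values are all hypothetical.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def unknown_f1(max_probs, is_unknown, threshold):
    """F1 of the 'unknown' class when sessions whose top class probability
    is below `threshold` are flagged as unknown (assumed decision rule)."""
    flagged = [p < threshold for p in max_probs]
    tp = sum(f and u for f, u in zip(flagged, is_unknown))
    fp = sum(f and not u for f, u in zip(flagged, is_unknown))
    fn = sum((not f) and u for f, u in zip(flagged, is_unknown))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return f_measure(precision, recall)


def best_threshold(max_probs, is_unknown, candidates):
    """Return the candidate threshold with the highest validation F1."""
    return max(candidates, key=lambda t: unknown_f1(max_probs, is_unknown, t))


# Toy validation data, made up for illustration only.
val_max_probs = [0.95, 0.90, 0.85, 0.60, 0.40, 0.35, 0.30]
val_is_unknown = [False, False, False, False, True, True, True]
candidates = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95

threshold = best_threshold(val_max_probs, val_is_unknown, candidates)
```

On this toy data any threshold above 0.40 and up to 0.60 separates the two groups perfectly; `max` returns the first such candidate in the grid. In practice the probabilities would come from the trained classifier's output on held-out validation sessions.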

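A notable methodological feature is the aggregation of classifications over consecutive sessions. Rather than relying on a single session, the method applies a majority rule across a sequence of sessions (20 in the headline results), which smooths out individual misclassifications. Detection of unauthorized types became perfect once 110 consecutive sessions were analyzed. Perfect classification of white-listed types required 346 consecutive sessions, although 110 sessions already yielded 99.49% accuracy.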
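The majority rule over consecutive sessions can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the paper's code: it assumes per-session predictions arrive as a stream of labels, and the function name and tie-breaking behavior are choices made for the example.

```python
from collections import Counter, deque


def majority_vote_stream(session_labels, window=20):
    """Yield the majority label over the last `window` session predictions.

    Nothing is emitted until the window is full. If several labels tie for
    the top count, the one that appears earliest in the window wins.
    """
    recent = deque(maxlen=window)  # oldest prediction drops out automatically
    for label in session_labels:
        recent.append(label)
        if len(recent) == window:
            yield Counter(recent).most_common(1)[0][0]


# Toy stream with one misclassified session; a short window keeps it readable.
labels = ["socket", "socket", "socket", "unknown", "socket", "socket"]
smoothed = list(majority_vote_stream(labels, window=5))
```

Here the single "unknown" prediction is outvoted in both full windows, so `smoothed` contains only "socket". A longer window suppresses more noise but delays the first decision, which matches the trade-off between the 20-session and 110-session results reported above.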

Key Results

The empirical results are strong. Using majority voting over 20 consecutive sessions, the Random Forest classifier detected unauthorized device types as unknown in 96% of test cases on average, and classified white-listed device types correctly in 99% of cases. Some device types were identified faster than others: sockets and thermostats, for example, were detected within five TCP sessions of connecting to the network, which supports the approach's use for timely detection.

The authors also carried out transferability tests, training classifiers in one location and testing them in another. The classifiers generalized successfully across the two locations, which suggests the method does not depend on the specific environment in which the training data was collected.

Noteworthy features that emerged as critical in classification included TTL-related metrics and byte ratio measures within TCP/IP traffic, underscoring the importance of intrinsic communication attributes in effective IoT categorization.
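To make these feature families concrete, the sketch below computes a byte ratio and two TTL statistics for a single session. It is only an illustration: the packet tuple layout, the feature names, and the choice to take TTL statistics over inbound packets are assumptions for this example and are not taken from the paper's feature list.

```python
from statistics import mean


def session_features(packets):
    """Compute a few illustrative per-session features.

    `packets` is a list of (direction, size_bytes, ttl) tuples, where
    direction is "out" (device to server) or "in" (server to device).
    This layout is an assumption made for the example.
    """
    bytes_out = sum(size for d, size, _ in packets if d == "out")
    bytes_in = sum(size for d, size, _ in packets if d == "in")
    ttls_in = [ttl for d, _, ttl in packets if d == "in"]
    return {
        # Ratio of bytes sent by the device to bytes it received.
        "bytes_out_in_ratio": bytes_out / bytes_in if bytes_in else float("inf"),
        # TTL statistics over inbound packets only.
        "ttl_in_min": min(ttls_in) if ttls_in else None,
        "ttl_in_avg": mean(ttls_in) if ttls_in else None,
    }


# Toy session, values made up for illustration.
packets = [("out", 120, 64), ("in", 480, 52), ("out", 60, 64), ("in", 240, 54)]
features = session_features(packets)
```

Feature vectors like this, one per session, are the kind of input a multi-class classifier would be trained on.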

Implications and Future Work

The implications of this paper are multifaceted. Practically, it informs the deployment of automated IoT management systems that feed into existing network security frameworks, such as SIEMs, enabling timely and precise network segmentation and threat-response mechanisms. Theoretically, it broadens our understanding of leveraging network data for device identity verification in highly dynamic environments.

Looking forward, future work could broaden the model by covering communication protocols beyond TCP/IP. Extending data collection to compromised devices would also support the development of anomaly detection capabilities and improve robustness in adversarial network scenarios.

In conclusion, this work substantiates the feasibility of incorporating machine learning techniques in IoT device management, showcasing significant strides towards establishing secure, automated controls in IoT-saturated networks. The approach offers organizations a viable path to proactively manage the cybersecurity risks posed by the expanding IoT landscape.
