Automatically designing CNN architectures using genetic algorithm for image classification (1808.03818v3)

Published 11 Aug 2018 in cs.NE

Abstract: Convolutional Neural Networks (CNNs) have achieved remarkable success on many image classification tasks in recent years. However, the performance of CNNs relies heavily on their architectures. The architectures of most state-of-the-art CNNs are manually designed, which requires expertise in both CNNs and the investigated problems. It is therefore difficult for users without extensive expertise in CNNs to design optimal CNN architectures for their own image classification problems. In this paper, we propose an automatic CNN architecture design method that uses genetic algorithms to effectively address image classification tasks. The main merit of the proposed algorithm lies in its "automatic" characteristic: users do not need domain knowledge of CNNs to use it, yet they can still obtain a promising CNN architecture for the given images. The proposed algorithm is validated on widely used benchmark image classification datasets by comparing it with state-of-the-art peer competitors covering eight manually designed CNNs, seven automatic + manually tuned algorithms, and five automatic CNN architecture design algorithms. The experimental results indicate that the proposed algorithm outperforms the existing automatic CNN architecture design algorithms in terms of classification accuracy, number of parameters, and consumed computational resources. The proposed algorithm also achieves classification accuracy very comparable to the best of the manually designed and automatic + manually tuned CNNs, while consuming far fewer computational resources.

Citations (570)


Summary

The paper introduces a genetic algorithm-based method that automatically designs variable-length CNN architectures for image classification without expert intervention.
It encodes architectures with adjustable depth and integrates skip connections to enhance performance and mitigate vanishing gradients.
Empirical evaluation on CIFAR10 and CIFAR100 reports competitive accuracies of 95.22% and 77.97%, respectively, with fewer parameters and lower computational cost than most competitors.

Overview of Automatically Designing CNN Architectures Using Genetic Algorithm for Image Classification

The paper "Automatically Designing CNN Architectures Using Genetic Algorithm for Image Classification" presents a method to automatically design Convolutional Neural Network (CNN) architectures using Genetic Algorithms (GAs). The approach aims to address the challenge of designing optimal CNN architectures for image classification tasks without requiring substantial domain knowledge from users.

Context and Motivation

Convolutional Neural Networks (CNNs) have become a pivotal technology in image classification. Their performance depends heavily on their architecture, which is typically crafted manually by experts. The expertise required to design such architectures is a barrier for users who lack deep knowledge of neural networks. This paper proposes an automatic design strategy that uses genetic algorithms to evolve CNN architectures, making architecture design accessible to a broader audience.

Methodology

The paper uses genetic algorithms to explore the space of possible CNN architectures. The method rests on four components:

Encoding CNN Architectures: The authors propose a novel encoding strategy that uses variable-length encoding. This allows for the representation of CNN architectures that can vary in depth and complexity, accommodating different design configurations without predefining the depth of the networks.
Incorporation of Skip Connections: The integration of skip connections within the CNNs helps mitigate the vanishing gradient problem and allows for deeper networks, aligning with successful manual architectures like ResNet.
Optimization through Evolutionary Operations: Key genetic operations such as mutation and crossover are adapted to work with the variable-length encoding. The proposed mutation includes adding or removing skip layers and pooling layers, helping explore the search space efficiently.
Fitness Evaluation: A parallel and cache-based system evaluates the performance of CNN architectures on image datasets, optimizing the use of computational resources.
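
The components above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names SkipLayer, PoolLayer, FILTER_CHOICES, genome_key, and cached_fitness, the length bounds, the one-point crossover with independent cut points, and the hash-based cache key are all hypothetical choices made for this example. The evaluate callback stands in for training a decoded CNN and returning its validation accuracy.

```python
import hashlib
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass(frozen=True)
class SkipLayer:
    f1: int  # filters of the first convolution inside the skip layer
    f2: int  # filters of the second convolution inside the skip layer


@dataclass(frozen=True)
class PoolLayer:
    kind: str  # "max" or "mean"


Layer = Union[SkipLayer, PoolLayer]
Genome = List[Layer]  # variable length: network depth is not fixed in advance

FILTER_CHOICES = (64, 128, 256)  # assumed search range, not taken from the paper


def random_skip(rng: random.Random) -> SkipLayer:
    return SkipLayer(rng.choice(FILTER_CHOICES), rng.choice(FILTER_CHOICES))


def random_pool(rng: random.Random) -> PoolLayer:
    return PoolLayer(rng.choice(("max", "mean")))


def random_genome(rng: random.Random, min_len: int = 3, max_len: int = 8) -> Genome:
    """Sample an architecture with a random number of layers."""
    n = rng.randint(min_len, max_len)
    return [random_skip(rng) if rng.random() < 0.5 else random_pool(rng) for _ in range(n)]


def crossover(a: Genome, b: Genome, rng: random.Random) -> Tuple[Genome, Genome]:
    """One-point crossover; each parent gets its own cut point, so the
    children may differ in length from both parents (needs len >= 2)."""
    i = rng.randint(1, len(a) - 1)
    j = rng.randint(1, len(b) - 1)
    return a[:i] + b[j:], b[:j] + a[i:]


def mutate(g: Genome, rng: random.Random) -> Genome:
    """Add a skip layer, add a pooling layer, or remove a layer."""
    g = list(g)  # work on a copy so the parent is left untouched
    op = rng.choice(("add_skip", "add_pool", "remove"))
    pos = rng.randrange(len(g))
    if op == "add_skip":
        g.insert(pos, random_skip(rng))
    elif op == "add_pool":
        g.insert(pos, random_pool(rng))
    elif len(g) > 2:  # "remove" is a no-op on very short genomes
        del g[pos]
    return g


def genome_key(g: Genome) -> str:
    """Stable identifier for an architecture, used as the cache key."""
    return hashlib.sha224(repr(g).encode("utf-8")).hexdigest()


def cached_fitness(g: Genome, evaluate: Callable[[Genome], float],
                   cache: Dict[str, float]) -> float:
    """Evaluate an architecture once; later identical genomes hit the cache."""
    key = genome_key(g)
    if key not in cache:
        cache[key] = evaluate(g)
    return cache[key]


if __name__ == "__main__":
    rng = random.Random(0)
    parent_a, parent_b = random_genome(rng), random_genome(rng)
    child, _ = crossover(parent_a, parent_b, rng)
    child = mutate(child, rng)

    # Placeholder fitness: counts skip layers instead of training a network.
    def toy_fitness(g: Genome) -> float:
        return float(sum(isinstance(layer, SkipLayer) for layer in g))

    cache: Dict[str, float] = {}
    print(len(child), cached_fitness(child, toy_fitness, cache))
```

Because each genome is a plain list, depth changes fall out of ordinary list slicing and insertion, which is what lets crossover and mutation operate without a predefined network depth. The cache only pays off when the same architecture reappears across generations, which is common once a population starts to converge.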

Empirical Evaluation

The proposed CNN-GA algorithm was validated on the CIFAR10 and CIFAR100 benchmark datasets. The results show that CNN-GA achieves competitive performance compared with state-of-the-art manually designed CNNs and existing automated architecture search methods.

On CIFAR10, CNN-GA achieved a classification accuracy of 95.22%, with fewer parameters and significantly less computational cost than most competitors.
For CIFAR100, CNN-GA reached an accuracy of 77.97%, also showing efficiency in terms of parameter count and computational resources.

The paper highlights CNN-GA's ability to discover architectures that rival those achieved through manual and hybrid design methods but without manual intervention.

Implications and Future Work

The automatic design of CNN architectures using GAs opens the potential for broader adoption of deep learning techniques across various domains, especially among users who may not have extensive expertise in neural network design. This democratization can accelerate the development and application of machine intelligence.

Future research could focus on further reducing computational resource requirements, enhancing the scalability of the approach, and exploring its application to more complex datasets or tasks beyond image classification. Furthermore, integrating other evolutionary algorithms or hybrid learning strategies might enhance the efficiency and generalization capabilities of the designed CNNs.

In summary, the paper presents an end-to-end approach to automatically designing CNN architectures with genetic algorithms. The resulting networks are efficient and competitive without expert intervention, which makes the work a notable contribution to automated machine learning (AutoML).
