SEIFER: Scalable Edge Inference for Deep Neural Networks

(2210.12218)
Published Oct 21, 2022 in cs.NI and cs.DC

Abstract

Edge inference is becoming ever more prevalent, with applications ranging from retail to wearable technology. Clusters of networked, resource-constrained edge devices are becoming common, yet there is no production-ready orchestration system for deploying deep learning models over such edge networks that adopts the robustness and scalability of the cloud. We present SEIFER, a framework that uses a standalone Kubernetes cluster to partition a given DNN and place the partitions in a distributed manner across an edge network, with the goal of maximizing inference throughput. The system tolerates node faults and automatically updates deployments when the model's version changes. We provide a preliminary evaluation of a partitioning and placement algorithm that works within this framework, and show that we can improve the inference pipeline throughput by 200% by utilizing sufficient numbers of resource-constrained nodes. We have implemented SEIFER in open-source software that is publicly available to the research community.
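
To illustrate why splitting a DNN across several nodes can raise pipeline throughput, here is a minimal sketch. It is not SEIFER's algorithm, which the abstract does not specify. It assumes a purely sequential model, hypothetical per-layer compute costs, identical nodes, and no communication or memory cost. Under those assumptions a pipeline's steady-state throughput is limited by its slowest stage, so the sketch splits the layers into contiguous groups that minimize the largest group cost. The function name `partition_layers` and the cost values are made up for illustration.

```python
from typing import List, Tuple


def partition_layers(costs: List[float], k: int) -> Tuple[float, List[List[int]]]:
    """Split layers into min(k, n) contiguous, non-empty groups so that the
    largest group cost (the pipeline bottleneck) is as small as possible.

    Returns (bottleneck_cost, groups), where each group is a list of layer indices.
    """
    n = len(costs)
    if n == 0 or k < 1:
        raise ValueError("need at least one layer and one node")
    k = min(k, n)

    # prefix[i] is the total cost of layers 0..i-1.
    prefix = [0.0]
    for c in costs:
        prefix.append(prefix[-1] + c)

    inf = float("inf")
    # best[j][i]: smallest bottleneck when the first i layers form j groups.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    # cut[j][i]: start index of the last group in that optimal split.
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0

    for j in range(1, k + 1):
        for i in range(j, n + 1):
            for s in range(j - 1, i):
                candidate = max(best[j - 1][s], prefix[i] - prefix[s])
                if candidate < best[j][i]:
                    best[j][i] = candidate
                    cut[j][i] = s

    # Walk the cut table backwards to recover the groups.
    groups: List[List[int]] = []
    i = n
    for j in range(k, 0, -1):
        s = cut[j][i]
        groups.append(list(range(s, i)))
        i = s
    groups.reverse()
    return best[k][n], groups


# Hypothetical per-layer costs (arbitrary time units), for illustration only.
layer_costs = [4, 2, 3, 1, 2]
single_node_bottleneck, _ = partition_layers(layer_costs, 1)
three_node_bottleneck, three_node_groups = partition_layers(layer_costs, 3)
```

With these made-up costs, one node has a bottleneck of 12 time units per inference, while three nodes reach a bottleneck of 5, so the idealized throughput (1 / bottleneck) rises by a factor of 12/5. A real placement would also have to weigh inter-node bandwidth and per-node memory limits, both of which this sketch ignores.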
