FactionFormer: Context-Driven Collaborative Vision Transformer Models for Edge Intelligence
Publication Date: 6/26/2023
Event: 8th IEEE International Workshop on Smart Service Systems (SmartSys 2023), co-located with SMARTCOMP 2023
Reference: pp. 349-354, 2023
Authors: Sumaiya Tabassum Nimi, University of Missouri-Kansas City; Md Adnan Arefeen, University of Missouri-Kansas City; Md Yusuf Sarwar Uddin, University of Missouri-Kansas City; Biplob Debnath, NEC Laboratories America, Inc.; Srimat T. Chakradhar, NEC Laboratories America, Inc.
Abstract: Edge Intelligence has received attention in recent times for its potential to improve responsiveness, reduce the cost of data transmission, enhance security and privacy, and enable autonomous decisions by edge devices. However, edge devices lack the power and compute resources necessary to execute most AI models. In this paper, we present FactionFormer, a novel method to deploy resource-intensive deep-learning models, such as vision transformers (ViT), on resource-constrained edge devices. Our method is based on a key observation: edge devices are often deployed in settings where they encounter only a subset of the classes that the resource-intensive AI model is trained to classify, and this subset changes across deployments. Therefore, we automatically identify this subset as a faction, devise on the fly a bespoke resource-efficient ViT called a modelette for the faction, and set up an efficient processing pipeline consisting of a modelette on the device, a wireless network such as 5G, and the resource-intensive ViT model on an edge server, all of which work collaboratively to perform inference. For several ViT models pre-trained on benchmark datasets, FactionFormer’s modelettes are up to 4× smaller than the corresponding baseline models in terms of the number of parameters, and they can infer up to 2.5× faster than the baseline setup where every input is processed by the resource-intensive ViT on the edge server. Our work is the first of its kind to propose a device-edge collaborative inference framework where bespoke deep learning models for the device are automatically devised on the fly for the most frequently encountered subset of classes.
Publication Link: https://www.researchgate.net/publication/371159829_FactionFormer_Context-Driven_Collaborative_Vision_Transformer_Models_for_Edge_Intelligence
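Illustrative sketch: The abstract describes a faction (the most frequently encountered subset of classes) and a pipeline in which an on-device modelette and a server-side ViT share the inference work. The abstract does not say how the faction is selected or how an input is routed, so the Python sketch below is only one plausible reading, not the authors' method. The frequency-coverage rule, the confidence threshold, and the names `identify_faction`, `collaborative_infer`, `modelette`, and `server_model` are all assumptions introduced for illustration.

```python
from collections import Counter


def identify_faction(observed_labels, coverage=0.9):
    """Return the most frequent classes that together cover `coverage`
    of the observed labels. ASSUMPTION: the paper's actual faction
    selection rule is not given in the abstract."""
    counts = Counter(observed_labels)
    total = sum(counts.values())
    faction, covered = set(), 0
    for label, n in counts.most_common():
        if covered / total >= coverage:
            break
        faction.add(label)
        covered += n
    return faction


def collaborative_infer(x, modelette, server_model, faction, threshold=0.8):
    """Try the small on-device model first and offload otherwise.

    `modelette(x)` is a hypothetical callable returning (label, confidence);
    `server_model(x)` is a hypothetical callable returning a label.
    ASSUMPTION: routing by confidence threshold is illustrative only.
    """
    label, confidence = modelette(x)
    if label in faction and confidence >= threshold:
        return label, "device"
    # Low confidence or out-of-faction prediction: send to the edge server.
    return server_model(x), "server"
```

In this sketch, a call such as `collaborative_infer(image, modelette, server_model, faction)` answers on the device when the modelette is confident about a faction class, and otherwise forwards the input over the network to the full model. Raising `threshold` trades latency for accuracy by offloading more inputs.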