The ELEGANT Orchestrator: Bridging the Gap between the Cloud and Edge Computational Ecosystems

In the ever-growing field of Data Science, Cloud Computing is a well-established computational paradigm, while Edge Computing is an emerging and promising alternative for the novel challenges introduced by the Internet of Things (IoT) landscape. What are their pros and cons? Can they meet halfway, mutually amplifying their pros and canceling out their cons? And, most importantly, how does ELEGANT aim to bridge these two ecosystems, leading to a seamless, unified IoT/Big Data infrastructure?


*Cloud vs Edge or Cloud meets Edge?*


Cloud Computing is a decade-old technology developed to provide users and organizations with scalable, secure, uninterrupted and consistent computational and storage services [1]. It was not long before mobile devices (what we nowadays refer to as the Edge or the IoT, depending on the context) "realized" that the processing power provided by the Cloud was too significant to ignore and started utilizing it by "outsourcing" all or part of their services to the Cloud [2]. However, the bandwidth of the networks that carry data between the devices and the Cloud has not kept up with the rapid increase of processing power in modern cloud services and data centers. This introduces a bottleneck in cloud-based applications that, given the ever-increasing rate of data produced on the Edge, becomes harder and harder to ignore [2]. Moreover, modern edge devices such as smartphones, tablets and various smart devices have significant processing and/or storage capabilities that go to waste when all work is outsourced. Lastly, a significant portion of the data produced on the Edge is so sensitive (e.g., medical data recorded by a smart watch) that transferring it anywhere outside the device where it was initially produced or stored would raise major security and privacy concerns [3]. These are some of the most important motivations that gave birth to the Edge Computing paradigm, where the data produced on the Edge stay and get processed on the Edge. Wouldn't it be nice, however, if a unified computational model were designed to keep the best of both worlds, say the processing power of the Cloud and the security and flexibility of the Edge?


*Enter ELEGANT!*


ELEGANT aims to bridge the gap between the Cloud and Edge Computing ecosystems by providing a unified, secure and seamless computational and programming environment for IoT/Big Data analytics. A major obstacle to this unification is deciding where an application (i.e., a piece of code) should be sent for execution: Does it handle sensitive data that should remain on the Edge? Does it need massive amounts of data and/or processing capabilities that can only be provided by the Cloud? These are the questions that the ELEGANT Intelligent Orchestrator will be designed to answer quickly and effectively.


The ELEGANT Orchestrator will comprise two parts: the local and the global optimization layer (or local and global layer for short, respectively). Given a dependency graph of tasks (a task graph hereafter), the local layer will be responsible for selecting the most appropriate device for each task (i.e., each node of the task graph) independently. The global layer will take into consideration the mapping produced by the local layer, as well as (a) the whole graph and the dependencies it expresses, (b) the topology of the IoT network and the state/availability of the devices, (c) the overhead of data transfers and whether it is worth paying, and (d) other restrictions like security policies and Service Level Agreements (SLAs), to produce an optimal global placement for the whole task graph.
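To make the division of labour concrete, here is a minimal sketch in plain Python. It is not the ELEGANT implementation: the task names, the `sensitive` flag, the two device labels and the time estimates in `EST_TIME` are all hypothetical. The local layer picks the fastest device for each task on its own, and two helper functions show what that per-task view misses and the global layer would have to weigh.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    depends_on: list = field(default_factory=list)  # names of prerequisite tasks
    sensitive: bool = False  # hypothetical policy flag: data must stay on the Edge


# Hypothetical three-node task graph: read_sensor -> filter -> train_model
TASKS = [
    Task("read_sensor", sensitive=True),
    Task("filter", depends_on=["read_sensor"], sensitive=True),
    Task("train_model", depends_on=["filter"]),
]

# Hypothetical estimated execution times (seconds) per task and device
EST_TIME = {
    "read_sensor": {"edge": 0.2, "cloud": 0.5},
    "filter": {"edge": 2.0, "cloud": 1.0},
    "train_model": {"edge": 50.0, "cloud": 5.0},
}


def local_layer(tasks, est_time):
    """Choose, for each task independently, the device with the lowest estimate."""
    return {t.name: min(est_time[t.name], key=est_time[t.name].get) for t in tasks}


def cross_device_edges(tasks, mapping):
    """Dependencies whose two endpoints sit on different devices (data must move)."""
    return [
        (dep, t.name)
        for t in tasks
        for dep in t.depends_on
        if mapping[dep] != mapping[t.name]
    ]


def policy_violations(tasks, mapping):
    """Sensitive tasks that the mapping would place outside the Edge."""
    return [t.name for t in tasks if t.sensitive and mapping[t.name] != "edge"]


local_mapping = local_layer(TASKS, EST_TIME)
transfers = cross_device_edges(TASKS, local_mapping)
violations = policy_violations(TASKS, local_mapping)
```

In this toy setup the per-task choice sends `filter` to the Cloud because it is faster there, which both forces a data transfer and breaks the (assumed) policy that sensitive data stay on the Edge. Those are exactly the kinds of conflicts the global layer is meant to resolve.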


*The local layer: Machine Learning*


Following the most recent trends in device mapping and placement optimization [4, 5], the local layer of the ELEGANT Orchestrator employs deep learning (DL) models on a graph representation of the source code of interest. First, the source code of the application is translated into a graph with instructions as nodes and all execution/control/data flows and dependencies encoded as edges. Then, DL models that operate on graphs, known as Graph Neural Networks [6, 7], learn these flows and dependencies. The training phase of this layer attempts to learn the correlation between these graphs (i.e., the different flows and dependencies between instructions) and the optimal placement among devices for which execution and transfer times have been measured (e.g., a mobile phone, a Raspberry Pi, a CPU/GPU, a Cloud service, etc.).
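The core operation behind Graph Neural Networks is message passing: each node updates its representation from those of its neighbours. The sketch below illustrates one such round in plain Python on a hand-written toy graph. It is only an illustration under stated assumptions: the five "instructions", the edge list and the two-dimensional features are invented, and the update is a plain mean with no learned weights, whereas a real model (built, for instance, with the library in [7]) would use trained parameters and treat edge types separately.

```python
# Toy program graph for the snippet:  c = a + b ; d = c * 2
# Nodes are (hypothetical) instructions; edges are (source, target, kind).
NODES = ["load a", "load b", "add", "mul 2", "store d"]
EDGES = [
    (0, 2, "data"),     # a feeds the addition
    (1, 2, "data"),     # b feeds the addition
    (2, 3, "data"),     # the sum feeds the multiplication
    (3, 4, "data"),     # the product is stored
    (0, 1, "control"),  # "load a" executes before "load b"
]

# Hypothetical 2-d node features: [is_memory_op, is_arithmetic_op]
FEATURES = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]


def message_passing_step(features, edges):
    """One round: each node takes the mean of its own feature vector and the
    feature vectors of the nodes that have an edge pointing to it."""
    dim = len(features[0])
    incoming = [[] for _ in features]
    for src, dst, _kind in edges:  # edge kind is ignored in this simplified sketch
        incoming[dst].append(src)
    updated = []
    for node, own in enumerate(features):
        group = [own] + [features[src] for src in incoming[node]]
        updated.append([sum(vec[i] for vec in group) / len(group) for i in range(dim)])
    return updated


def readout(features):
    """Graph-level embedding: the mean of all node feature vectors."""
    dim = len(features[0])
    return [sum(vec[i] for vec in features) / len(features) for i in range(dim)]


updated = message_passing_step(FEATURES, EDGES)
graph_embedding = readout(updated)
```

After one round, the `store d` node carries a mix of memory and arithmetic features because it received a message from `mul 2`. In a trained model, a graph-level vector like `graph_embedding` would be fed to a classifier that predicts the preferred device.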


*The global layer: optimization*


Having the predictions of the local layer at its disposal, the global layer is the component of the ELEGANT Orchestrator that will produce the final mapping of the task graph to the Edge/Cloud devices of the underlying infrastructure. Many approaches have been proposed to tackle this NP-hard problem [8-11], with one of the most promising being the use of Genetic Algorithms (GAs). In short, a GA produces a large number of candidate solutions to an optimization problem, evaluates how good each solution is (its "fitness") using a predefined fitness function, and combines the fitter solutions in various ways, hoping to achieve greater fitness values in the next generation. In our problem, each solution will be a mapping between the tasks (i.e., the nodes of the task graph) and the available processing units, while the fitness function will be an evaluator of the mappings that takes everything we have mentioned so far into account (e.g., the predictions of the local layer, the dependencies between the tasks, network topology changes, any SLAs/security restrictions, etc.). Lastly, in addition to standard GAs like those of the NSGA family [12], our research team will investigate the suitability and power of other optimization techniques, like the algorithms of the swarm optimization family. Most notably, the Particle Swarm Optimization (PSO) algorithm [13] and one of its fuzzy variants, the Fuzzy Self-Tuning PSO [14], are two possible candidates.
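As an illustration of the idea, the sketch below runs a very small single-objective GA over task-to-device mappings. It is a toy under stated assumptions, not the ELEGANT fitness function or an NSGA variant: the tasks, time estimates, `TRANSFER_COST`, `POLICY_PENALTY` and all GA settings are hypothetical, selection is simple truncation, and the population is seeded with an all-Edge mapping as a feasible baseline.

```python
import random

# Hypothetical problem instance
TASKS = ["read_sensor", "filter", "train_model"]
DEVICES = ["edge", "cloud"]
DEPENDENCIES = [("read_sensor", "filter"), ("filter", "train_model")]
EST_TIME = {
    "read_sensor": {"edge": 0.2, "cloud": 0.5},
    "filter": {"edge": 2.0, "cloud": 1.0},
    "train_model": {"edge": 50.0, "cloud": 5.0},
}
TRANSFER_COST = 3.0                              # assumed cost per cross-device dependency
MUST_STAY_ON_EDGE = {"read_sensor", "filter"}    # assumed security policy
POLICY_PENALTY = 1000.0                          # assumed penalty per policy violation


def cost(mapping):
    """Total cost of a mapping (a tuple of devices aligned with TASKS)."""
    placed = dict(zip(TASKS, mapping))
    total = sum(EST_TIME[task][device] for task, device in placed.items())
    total += TRANSFER_COST * sum(placed[a] != placed[b] for a, b in DEPENDENCIES)
    total += POLICY_PENALTY * sum(placed[t] != "edge" for t in MUST_STAY_ON_EDGE)
    return total


def fitness(mapping):
    """Higher is better, so fitness is the negated cost."""
    return -cost(mapping)


def evolve(generations=30, pop_size=12, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    # Start from the all-Edge baseline plus random mappings.
    population = [tuple("edge" for _ in TASKS)]
    population += [tuple(rng.choice(DEVICES) for _ in TASKS) for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]    # truncation selection; the best survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(TASKS))   # one-point crossover
            child = list(a[:cut] + b[cut:])
            for i in range(len(child)):          # per-gene mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.choice(DEVICES)
            children.append(tuple(child))
        population = parents + children
    return max(population, key=fitness)


best = evolve()
best_cost = cost(best)
```

Because the fitter half of each generation is carried over unchanged, the best mapping found can never be worse than the all-Edge baseline. On this toy instance, the cheapest of the eight possible mappings keeps `read_sensor` and `filter` on the Edge and sends only `train_model` to the Cloud: one transfer is worth paying to avoid the slow on-device training. A multi-objective algorithm of the NSGA family would instead keep execution time, transfer overhead and policy compliance as separate objectives rather than folding them into one number.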



References:


[1] Francis, T. (2018). A Comparison of Cloud Execution Mechanisms Fog, Edge, and Clone Cloud Computing. International Journal of Electrical and Computer Engineering (IJECE), 8(6), 4646-4653. doi: 10.11591/ijece.v8i6.pp4646-4653.

[2] W. Shi and S. Dustdar, "The Promise of Edge Computing," in Computer, vol. 49, no. 5, pp. 78-81, May 2016, doi: 10.1109/MC.2016.145.

[3] T. Wang, G. Zhang, A. Liu, M. Z. A. Bhuiyan and Q. Jin, "A Secure IoT Service Architecture With an Efficient Balance Dynamics Based on Cloud and Edge Computing," in IEEE Internet of Things Journal, vol. 6, no. 3, pp. 4831-4843, June 2019, doi: 10.1109/JIOT.2018.2870288.

[4] Cummins, Chris & Fisches, Zacharias & Ben-Nun, Tal & Hoefler, Torsten & Leather, Hugh. (2020). ProGraML: Graph-based Deep Learning for Program Optimization and Analysis.

[5] Xiao, Y., Ma, G., Ahmed, N.K., Willke, T.L., Nazarian, S. and Bogdan, P. (2021). Deep Graph Learning for Program Analysis and System Optimization.

[6] Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang and P. S. Yu, "A Comprehensive Survey on Graph Neural Networks," in IEEE Transactions on Neural Networks and Learning Systems, vol. 32, no. 1, pp. 4-24, Jan. 2021, doi: 10.1109/TNNLS.2020.2978386.

[7] The PyTorch Geometric library: https://pytorch-geometric.readthedocs.io/

[8] Wang, S., Li, Y., Pang, S., Lu, Q., Wang, S., & Zhao, J. (2020). A task scheduling strategy in edge-cloud collaborative scenario based on deadline. Scientific Programming, 2020.

[9] Mijuskovic, A., Chiumento, A., Bemthuis, R., Aldea, A., & Havinga, P. (2021). Resource Management Techniques for Cloud/Fog and Edge Computing: An Evaluation Framework and Classification. Sensors, 21(5), 1832.

[10] Guevara, J. C., & da Fonseca, N. L. (2021). Task scheduling in cloud-fog computing systems. Peer-to-Peer Networking and Applications, 14(2), 962-977.

[11] Han, Y., Shen, S., Wang, X., Wang, S., & Leung, V. (2021). Tailored learning-based scheduling for kubernetes-oriented edge-cloud system. arXiv preprint arXiv:2101.06582.

[12] Bagchi, T.P., 1999. The Nondominated Sorting Genetic Algorithm: NSGA. In Multiobjective Scheduling by Genetic Algorithms (pp. 171-202). Springer, Boston, MA.

[13] https://en.wikipedia.org/wiki/Particle_swarm_optimization

[14] Nobile, M. S., Cazzaniga, P., Besozzi, D., Colombo, R., Mauri, G., & Pasi, G. (2018). Fuzzy Self-Tuning PSO: A settings-free algorithm for global optimization. Swarm and evolutionary computation, 39, 70-85.