Thursday, January 23, 2025
1:00 pm – 2:00 pm
Presenter: Dr. Neiwen Ling, Postdoctoral Associate in the Efficient Computing Lab at Yale, working under the guidance of Prof. Lin Zhong.
Abstract:
The increasing integration of advanced artificial intelligence into time-sensitive applications, such as autonomous driving and embodied robotics, presents challenges in balancing efficiency, responsiveness, and scalability on resource-constrained edge platforms. This talk explores the development of time-sensitive AI systems in resource-constrained environments, including the optimization of deep learning (DL) workloads and the deployment of large language models (LLMs) for embodied agents. I will begin with a segmented LLM serving system for time-sensitive robotic applications, which maximizes time utility and responsiveness for multi-agent systems. Next, I will present a system that optimizes real-time DL inference on heterogeneous platforms through a novel duo-block abstraction and a cross-processor scheduling mechanism, achieving significant reductions in deadline misses. Finally, I will describe a smart roadside infrastructure system that leverages cooperative edge computing to provide timely AI services to autonomous vehicles. I will conclude by outlining future directions for advancing time-sensitive AI systems for embodied agents in real-world applications.