AIDL_B01: Data Centers and Infrastructure for supporting AI

Description

The course “Data Centers and Infrastructure for Supporting Artificial Intelligence” provides a comprehensive understanding of the critical components and technologies that power AI infrastructure. The course explores the role of data centers as the backbone of AI systems and introduces students to the key concepts of managing and scaling AI workloads. Students will be introduced to cloud computing and to Kubernetes, a container orchestration platform widely used in data centers, and will learn how it enables efficient deployment, scaling, and management of AI applications. They will also gain insight into GPU hardware specifically designed for accelerating AI workloads, and understand how these processors improve AI training and performance. Additionally, the course covers Edge AI hardware, addressing the unique challenges and requirements of running AI models on edge devices. Students will explore real-world case studies and gain hands-on experience with publicly available open-source machine learning toolkits built on top of virtualized infrastructures (e.g., Bright, Kubeflow), which streamline the deployment and management of AI workflows in data centers.

Furthermore, students will learn about the most widely used frameworks for cloud deployment of machine learning, deep learning, and computer vision, such as TensorFlow and Vertex. Finally, they will gain hands-on experience with cloud platforms and frameworks for machine learning and artificial intelligence.

Syllabus

Introduction to the fundamentals of Virtualization technology and Cloud Computing
Popular virtualization technologies used for AI: Containers and Kubernetes
The NVIDIA ecosystem
Machine learning and AI frameworks on the cloud
Cloud resources and Hardware accelerators on the cloud
Storage and Networking on the cloud

Assessment

A comprehensive evaluation approach is adopted to gauge students' proficiency in cloud concepts (e.g., containers, Kubernetes) and in infrastructure for AI, including hardware resources, accelerators on the cloud, and machine learning frameworks deployed in cloud environments. Assessment combines theoretical knowledge assessments with practical assignments. To this end, students must successfully complete two projects (midterm: 50%, final: 50%). Each project includes a report, submitted as a Word document or PDF, and a presentation.

Learning Outcomes

- understand the fundamentals of cloud computing and virtualization technology, and dive into GPU-accelerated infrastructures,
- examine the use of virtualization technology in High Performance Computing (HPC),
- explore the NVIDIA ecosystem and the solutions developed to harness the power of GPUs,
- gain hands-on experience in using virtualization technology to deploy and manage AI workloads,
- gain hands-on experience with cloud platforms and frameworks for machine learning (deep learning, computer vision, etc.).

Course Features

Course type: Major

Semester: 2nd

ECTS: 6

Duration: 13 weeks

Courses: In-class lectures + online

Language: English

Assessment: Project based

Instructor

Assistant Professor Christoforos Kachris

Department of Electrical and Electronic Engineering, School of Engineering, UNI.W.A.

Dr. Michalis G. Xevgenis

Department of Electrical and Electronic Engineering, School of Engineering, UNI.W.A., CoNSerT laboratory Researcher
