Mploy - Jobs

Job opening: Cost-Efficient Inference Serving and Routing Optimization - MSc and PhD - Summer Internship 2026 - Research Lab, Haifa


As of 29/12/2025

Haifa

IBM

**Introduction**

At IBM work is more than a job - it’s a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you’ve never thought possible. Are you ready to lead in this new era of technology and solve some of the world’s most challenging problems? If so, let’s talk.

**Your Role And Responsibilities**

We are looking for a highly motivated PhD or MSc student to join our team for a summer internship focused on cost-efficient serving of large-scale AI inference workloads.

The internship will explore advanced routing strategies and KV-cache–aware optimizations in distributed inference systems, with an emphasis on improving performance, scalability, and GPU cost efficiency.

What you will work on

  • Designing and evaluating routing algorithms to optimize inference latency, throughput, and cost
  • Investigating KV cache management strategies for large-scale, distributed inference serving
  • Prototyping, benchmarking, and analyzing inference optimization techniques
  • Working with modern inference frameworks and real production-like workloads

Why join us?

This internship offers a unique opportunity to work at the intersection of AI systems and distributed infrastructure, with real-world impact on scalable, cost-efficient inference serving used in production environments.

**Required Technical And Professional Expertise**

  • MSc or PhD student in Computer Science, Machine Learning Systems, or a related field
  • Strong background or interest in distributed systems, systems research, or ML infrastructure
  • Strong programming skills (Python, Go, or similar)
  • Hands-on experience or familiarity with vLLM (architecture, KV cache behavior, scheduling, or extensions)
  • Interest in AI infrastructure, performance optimization, and cost efficiency
  • Ability to work independently while collaborating effectively within a research and engineering team

Please include your grade sheet with your application.

**Preferred Technical And Professional Experience**

  • Experience with Kubernetes (K8s) and cloud-native systems
  • Familiarity with inference serving stacks, networking, or GPU-based systems
  • Experience with benchmarking, profiling, or performance analysis

