Vacancy Description
Join Amazon Devices as an ML Kernel Performance Engineer focused on Edge AI. Develop high-performance CUDA and Triton kernels to enhance model compression for efficient training and inference.
In this role, you will operate at the hardware-software boundary, collaborating with cross-functional teams to design and implement kernels for quantization-aware training and low-bit inference. Your work will involve analyzing kernel performance, identifying bottlenecks, and tuning kernels for cloud and edge deployments on modern GPU accelerators.
Key Responsibilities:
• Design CUDA and Triton kernels for efficient model training
• Conduct performance analysis to resolve training bottlenecks
• Implement kernel optimizations for compression tasks
• Create a kernel development harness for profiling
• Maintain a comprehensive training kernels library
Requirements:
• 3+ years of software development experience
• 2+ years of experience in system design or architecture
• Experie...
Ready to Apply?
Apply Now
Submit your application for Edge AI ML Kernel Performance Engineer at Amazon Development Centre Canada ULC