Kubernetes-based Cloud-Native Orchestration Platform with GPU Capabilities
Startup (revenue-generating, Series A)

Company size: 40

Future unicorn

Remote-first culture

Smart, fun, low-ego team culture

Compensation: Base Salary 250k++, Equity

Key Responsibilities:

Extend the platform's capabilities beyond current CPU optimization into GPU optimization.

Develop capabilities that help customers rapidly set up ML environments.

Architecture & Development: Architect and develop a Kubernetes-based ML/AI platform.

Leadership & Collaboration: Work with C-staff, product management, engineering, and design partners.

Communication: Create detailed architecture diagrams, documents, and presentations.

User Experience: Focus on the experience of K8s users, infrastructure admins, MLOps staff, etc.

Open Source Community: Stay actively involved with CNCF and related projects.

Enterprise-Class Solutions: Drive and deliver solutions for enterprise-class data, ML, and AI applications.

FinOps & SRE Best Practices: Apply FinOps for cloud financial management and follow modern SRE practices.

Qualifications:

Entrepreneurial mindset and startup experience.

10+ years of infrastructure-level software architecture and development.

Extensive Experience:

Linux and virtualization platforms (hands-on).

AWS, GCP or Azure.

Strong Experience:

Kubernetes-based ML/AI systems (Kubeflow, Kueue, KServe, GPU Operators, DRA, Karpenter)

Deep Knowledge:

ML/AI use cases and customer stories covering model development, training, inference, and hardware accelerator usage (CPU, GPU, TPU).

Modern cloud-native architectures (scalability, availability, reliability, security, observability).

Proven track record of delivering complex distributed systems.

Active involvement in open-source communities, particularly CNCF and related projects.

Strong leadership and team collaboration skills.

Excellent communication skills, both verbal and written.

Preferred Qualifications:

Knowledge of additional ML/AI frameworks and tools.

Experience in DevOps practices and tools.

Certification in Kubernetes or related technologies.

Awareness of FinOps and SRE best practices.

Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.

Living Talent
