AI/ML - Device Cloud SW Engineering Manager, ML Platform & Technology
High-quality software and Machine Learning require large-scale infrastructure to validate and train software prototypes and models. The Apple team at the forefront of building this infrastructure on Apple hardware is looking for an engineering leader! In this exciting role, you will bring best practices, knowledge, and experience from traditional data center operations and apply them to data centers with thousands of Apple devices.
In this role, you’ll have the opportunity to design, implement, and operate large-scale scheduling, systems management, and failure detection. You’ll have a team of software engineers to develop new solutions. You’ll work closely with the SRE team to deploy and operate the solutions you develop. And you’ll have the opportunity to influence the design of Apple hardware to ensure it’s fully compatible with high-density deployments and automation. If you’re interested in Apple devices, and you have a background in data center infrastructure, this role will give you an exciting opportunity to apply your skills to an entirely new problem space.
- Significant experience developing and/or testing large-scale enterprise systems with sophisticated distributed integrations
- Experience leading systems software development projects
- Excellent leadership, communication, prioritization, collaboration, and problem-solving skills
- Experience developing and implementing systems software solutions, e.g., filesystems, scheduling, network routing, and failure detection
- Solid interpersonal skills are a requirement due to the high level of interaction with engineering teams, management, and other organizations within Apple
- Ability to act as a technical leader to increase infrastructure reliability and device availability through process and automation, reduce customer-impacting defects, and improve early fault detection
- You are upbeat, adaptable, and results-oriented
We are looking for a motivated and service-oriented individual who loves working on innovative infrastructure. You must enjoy working on multiple concurrent projects in a fast-paced environment that nurtures growth, collaboration, and innovation.
Some of your responsibilities may include:
- Operating and innovating on a geographically distributed device cloud infrastructure that enables large-scale machine learning workflows across Apple
- Supervising and mentoring systems software engineers, with responsibility for their management, evaluation, and career development
- Applying your strong technical expertise at both the micro and macro level
- Managing software configuration so that required tooling is consistently and optimally deployed across the fleet of machines
- Solving problems and resolving issues in live production environments, and implementing strategies to prevent them from recurring
- Advocating for automation, trusting that it plays a meaningful role in software development and sustenance
- Getting into the details and being hands-on with the scheduling and prioritization needs of internal customers, and with how best to serve those needs on the internal device cloud
Education & Experience
B.S. in Computer Science or equivalent experience/expertise
Nice to have, but not required:
- Experience with macOS and iOS
- Understanding of standard networking protocols and components such as HTTP, DNS, TCP/IP, and subnetting
- Knowledge of Puppet, Ansible, or other configuration management tools
- Familiarity with Git or other source control systems