Principal Software Engineer

Microsoft
United States, Washington, Redmond
Jan 11, 2025
Overview
Microsoft is a company where passionate innovators come to collaborate, envision what can be, and take their careers further. This is a world of more possibilities, more innovation, more openness, and sky-is-the-limit thinking in a cloud-enabled world. Excited to be a part of Microsoft's leading initiative on AI and ML? Everywhere we look, AI/ML is the buzz. Microsoft is at the forefront of this AI/ML revolution, helping enterprises with platforms to make decisions, train models, and learn across their product portfolio. If you like challenges and want to develop services that support AI/ML solutions and empower enterprises, organizations, and AI/ML practitioners to improve their products through AI, then this is your ideal opportunity.
We are the AI Platform team in the Cloud and AI platform group, and we seek experienced developers with distributed systems experience. We build platforms to handle AI workloads for customers and have a significant business impact on products that ship across all Microsoft groups as well as to enterprise customers. One of the key challenges with AI platform solutions is how to distribute, manage, and connect different compute platforms, handle the provisioning and scheduling of workloads across those platforms, and run the user logic on those computes in an efficient and secure way. With increasing demand and a limited supply of GPUs, it becomes critical to schedule workloads so that they consume resources efficiently. Our team within AI Platform is building cloud services on Kubernetes to solve this problem. The area presents challenges on several fronts. The solution involves large-scale backend services that interact with multiple types of compute targets (VMs and containers), a UX, and an open-source SDK. The scale of jobs being submitted and computes being created is huge, and the system needs to interface with different compute platforms.
We need talent with strong system architecture, design, and implementation skills to build a scalable system, as well as experience in state-of-the-art machine learning practice to understand and design the best experience. As a Principal Software Engineer, you will be part of a very strong and fun team, developing advanced and practical machine learning platforms. Here, you have the right environment and strong support to drive your favorite features to a solution. You are empowered to influence millions of end users. You will have opportunities to work with world-class developers and researchers and to stay at the forefront of advancing technologies such as machine learning, big data, deep learning, large-scale models, GPT-4o, OAI, online experimentation, and cloud computing.
We do not just value differences or different perspectives. We seek them out and invite them in so we can tap into the collective power of everyone in the company. As a result, our customers are better served. Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Responsibilities
We seek smart, highly innovative, experienced engineers to chart the path ahead for services that support machine learning solutions, work on highly distributed platforms with 24x7x365 service availability and tight SLAs, and build comprehensive and intuitive end-to-end solutions to delight our customers.
Work on the architecture, design, and development of the core compute services powering AzureML to tackle challenging AI problems using state-of-the-art AI models such as GPT-4o and OSS models.
Develop, test, and maintain backend services written in C# and hosted on Kubernetes clusters and Docker containers.
Support consumption of the APIs through multiple channels: Python SDK, CLI, and UX.
Enhance systems and applications to ensure high stability, efficiency, and maintainability, low latency, and tight cloud security.
Develop and foster a deep understanding of machine learning systems and concepts and their usage by customers.
Collaborate closely with engineers and data scientists within the team, internal Microsoft Research teams, and external enterprises to build better solutions together.
Provide vision, expertise, and technical leadership to other team members. Help to grow talent in these areas.
Other:
Embody our Culture and Values