Sr. Research Support/DevOps Engineer
Posting ID: JP-002661563
Top Skills Details:
Azure, Linux, Ubuntu, PowerShell, Machine Learning, Big Data, Hadoop, Docker, Java, Bash, Kubernetes, Jenkins, Git, Continuous Integration, shell scripting, Go, Ansible
Are you interested in working with an organization that is helping to make the next generation of cutting-edge technology possible? Can you think "outside the box" to engineer technology solutions that solve complex issues? Is coordinating and collaborating with some of the sharpest researchers in the world an exciting challenge for you?
If so, the Technology & Research division needs your technical expertise to help our Machine Learning/Deep Learning researchers troubleshoot and optimize their workloads on our hardware-accelerated supercomputing clusters.
The Research Technology Engineering team supports the technological needs of one of the most diverse corporate research labs in operation today. With over 1000 researchers doing work in more than 55 areas across 6 labs worldwide, the technology exposure, engineering and problem-solving opportunities are diverse and continually changing.
You must be a self-starter who works well as part of a distributed team, as this position works with a global team of support professionals. The team's high-level responsibilities are to engineer, deploy and support the technical solutions that enable activities ranging from basic computer science research to cutting-edge cloud computing and machine learning. We focus on delivering timely solutions engineered with the highest possible levels of customer service, attention to detail, cost effectiveness and trustworthy computing.
The Sr. Research Support Engineer will help understand our users’ data processing needs, optimize and troubleshoot their workloads on our supercomputer systems, engineer solutions to recurring problems and keep our technical documentation current and relevant. You will interact with researchers directly to understand their needs and issues, then consult with them to improve their outcomes. Foundational knowledge of current machine learning/deep learning techniques is critical. Passion for new technology and intellectual curiosity are also key to success in this role.
Daily responsibilities will include (but are not limited to) monitoring our cluster infrastructure and users’ jobs for failures and inefficiencies, consulting with researchers to troubleshoot and optimize job configurations, instructing team members and researchers on best practices for job configuration and performance, and maintaining documentation so that your knowledge reaches a global (internal-only) user base more efficiently.
Qualifications for this role include:
• Proficient in building, maintaining and using Docker containers
• Proficient in running PyTorch and TensorFlow toolkit jobs for AI training and inference tasks with GPU acceleration
• Experience with other ML toolkits such as ONNX, MXNet, scikit-learn, Theano, Keras and others
• Proficient with data loader and serialization/deserialization optimization
• Advanced DevOps and CI/CD experience with scripting, automation and orchestration using Python, Ansible, Bash, C#, PowerShell, Java, Jenkins, etc.
• Proficient knowledge and experience in Linux (Ubuntu) configuration and administration
• Introductory knowledge of Big Data Infrastructure (Hadoop)
• Experience with containerization and with deploying applications and microservices
• Familiarity with HPC (High Performance Computing) and Azure Machine Learning
• Experience troubleshooting and optimizing workloads on GPU server clusters and workstations
• Familiarity with networking protocols
• Sound troubleshooting, analytical and organizational skills
• Strong written and verbal communication and interpersonal skills
• Impeccable documentation skills
• Passion for learning and drive to ramp up quickly on new or unfamiliar technologies
• A BS/BA degree in computer science or a related field is required; a Master's is preferred. A Bachelor’s in computer science plus a professional certification in data science may also be acceptable.
Azure, Linux, Ubuntu, PowerShell, Machine Learning, Big Data, Hadoop, Docker, Java, Bash, Kubernetes, Jenkins, Git, Continuous Integration, shell scripting, Go, Ansible, PyTorch, Python, TensorFlow, C#, GPU clusters, MXNet, scikit-learn, Automation, Orchestration, Troubleshooting, Microservices
Additional Skills & Qualifications:
Passion for technology, learning, troubleshooting and problem solving.
We're partners in transformation. We help clients activate ideas and solutions to take advantage of a new world of opportunity. We are a team of 80,000 strong, working with over 6,000 clients, including 80% of the Fortune 500, across North America, Europe and Asia. As an industry leader in Full-Stack Technology Services, Talent Services, and real-world application, we work with progressive leaders to drive change. That's the power of true partnership. TEKsystems is an Allegis Group company.
Recruiter: Jean Chambers
Phone: (410) 579-3072