Description:
Software Development Engineer (Contract)
Please note that this is a contract role providing services to Microsoft through external staffing partners of Allegis Global Solutions. If you are selected for this role, you will be employed by AGS and will not be an employee of Microsoft.
Summary:
We are seeking a highly skilled and experienced Senior Operations and Reliability Engineer to support live operations, service reliability, release stability, and prototype device monitoring for a cutting-edge hardware and software product. This role is central to ensuring the health and stability of services, applications, and prototype device environments through telemetry monitoring, live issue diagnosis, software release validation, and incident response support. The ideal candidate thrives in a fast-paced engineering environment, is comfortable working with logs, dashboards, alerts, and deployment signals, and can communicate technical findings clearly to cross-functional teams. While the role offers strong support from experienced engineers who will provide guidance on service architecture, telemetry interpretation, and complex debugging, the engineer is expected to take ownership of day-to-day monitoring, release validation, live issue triage, documentation, and operational reporting with a high degree of independence.
Job Responsibilities:
- Monitor telemetry from services, applications, and prototype devices to assess operational health and identify anomalies, failures, performance degradation, or emerging reliability risks
- Analyze real-time metrics, dashboards, alerts, and logs to support troubleshooting across cloud, on-premises, and prototype device environments
- Triage operational issues and communicate findings clearly to engineering, QA, PM, and product teams
- Provide actionable insights based on telemetry trends, system behavior, and recurring failure patterns
- Support software releases by validating deployments, monitoring live systems, and assessing post-deployment stability
- Track service health during rollouts, ring deployments, updates, and release validation windows
- Identify, debug, and help resolve live issues affecting services, devices, internal users, or product readiness
- Partner with engineering teams to support mitigations, fixes, rollbacks, or follow-up validation
- Support incident response by gathering data, summarizing impact, identifying suspected causes, and tracking mitigation progress
- Participate in post-incident reviews and help document lessons learned
- Recommend improvements to monitoring, alerting, operational procedures, and service reliability practices
- Perform in-person troubleshooting for self-hosted systems, prototype devices, or test environments when telemetry or dashboards indicate issues
- Assist with device configuration, deployment, validation, and live verification
- Run smoke checks or readiness checks to confirm device, service, and environment health
- Maintain documentation of hardware configurations, operational procedures, environment setup, and observed issues
- Communicate operational status, risks, and technical findings clearly and promptly to cross-functional teams
- Provide concise summaries of system health, release readiness, incident status, and recommended next steps
- Help reduce operational toil by identifying repeatable troubleshooting steps, documentation gaps, and automation opportunities
Qualifications:
- 5 to 7 years or more of relevant experience in software engineering, DevOps, site reliability engineering, production operations, infrastructure, service reliability, or related technical operations roles
- 3 or more years of hands-on experience with monitoring, telemetry analysis, logging, and live issue troubleshooting
- 3 or more years of demonstrated ability to independently drive technical work and deliver operational value
- Experience with the Android mobile operating system is required; additional Android experience is a strong advantage
- Bachelor's degree in Computer Science, Computer Engineering, Software Engineering, or a related technical field, or equivalent practical experience
Skills:
- Proficiency in monitoring live services, applications, infrastructure, or device environments
- Strong ability to use dashboards, alerts, logs, metrics, and telemetry to diagnose system health and troubleshoot issues
- Experience supporting software releases, deployments, production validation, or service rollouts
- Familiarity with CI/CD workflows, cloud or hybrid infrastructure, release validation, and incident response practices
- Ability to investigate technical issues, summarize findings, and communicate risks clearly to engineering and product teams
- Experience with Android mobile operating system environments
- Strong problem-solving and analytical skills with the ability to identify patterns in complex system behavior
- Excellent written and verbal communication skills for documenting incidents, operational procedures, and technical findings
- Ability to work independently and manage multiple priorities in a fast-moving engineering environment
- Experience collaborating effectively with software, QA, infrastructure, PM, and product teams
Job Details:
- Location: Redmond, WA (Onsite)
- Duration: 13 months
- Pay Range*: $36.50 - $41.50 per hour
- Weekly Schedule: 40 hours
- Job Status: Non-Exempt
- Application Deadline: Apply within 72 hours of the posting date to ensure consideration.
Benefits:
- Medical, dental & vision
- Hospital plans
- 401(k) Retirement Plan - Pre-tax and Roth post-tax contributions available
- Life Insurance (Company paid Basic Life and AD&D as well as voluntary Life & AD&D for the employee and dependents)
- Company paid short and long-term disability
- Health & Dependent Care Spending Accounts (HSA & DCFSA)
- Employee Assistance Program
- Time Off/Leave (PTO, Allegis Group Paid Family Leave, Parental Leave)
AGS is an Equal Opportunity Employer and will consider all applications without regard to race, sex, age, color, religion, national origin, veteran status, disability, sexual orientation, gender identity, genetic information or any characteristic protected by law.
If you require a reasonable accommodation related to the application or interview process due to a disability, please email accommodation@allegisglobalsolutions.com. This inbox is monitored solely for accommodation requests. For questions about open roles or to apply, please submit your application through the job posting, as this inbox is not monitored by recruiters and applications sent here will not be reviewed.
In accordance with the Immigration Reform and Control Act of 1986, employment is contingent upon verification of identity and authorization to work in the United States. All persons hired will be required to complete Form I-9 and provide acceptable documentation as required by law.
Please note that we may use artificial intelligence (AI) tools to screen, assess, or select applicants for this position. These tools may analyze application materials and assist our team in identifying candidates whose qualifications best match the requirements of the role. If you have questions about our use of AI in the hiring process, or would like more information, please contact us.
*We reserve the right to pay above or below the posted wage based on factors unrelated to protected classifications.
Individual compensation offered for this position within this range will depend on many factors, including qualifications, skills, relevant experience, job knowledge, geographic location, internal equity, and other pertinent job-related factors.
Posting ID: 1073388326_crt:1779167042996
