Sr. Site Reliability Engineer
TEKsystems
Posted Monday, May 15, 2023
Posting ID: JP-003777481
Description:
Empower platform users with a rich feature set, high availability, and stellar
performance to pursue their missions. Deliver insights from massive-scale data in
real time, bring fresh ideas, demonstrate a unique and informed viewpoint, and
collaborate with cross-functional teams to develop real-world solutions and positive
user experiences at every interaction.
Essential Functions
May be required to perform one or more of the following functions:
Improve reliability, quality, and time-to-market of our suite of software solutions, through
effective hosting, monitoring, operations, and automation
Proactively deliver on SLOs, identifying and remediating issues to keep production
performance within service SLAs
Build software and systems to manage platform infrastructure and applications, creating
sustainable systems and services through automation and uplifts
Support system cost modeling for all hosted systems
Measure and optimize system performance, with an eye toward pushing our capabilities
forward, getting ahead of customer needs, and innovating to continually improve
Drive primary operational support and engineering for multiple large, distributed
software applications
Develop guidelines and detailed plans for automated systems delivery while maintaining
system and data security
Conduct impact analysis regarding enterprise-wide technology
Perform capacity monitoring with various monitoring tools (Splunk, Dynatrace, etc.) and
make recommendations leading to the integration of new tools that give visibility into
required components
Gather and analyze metrics from both operating systems and applications to assist in
performance tuning, fault finding, and corrective action planning
Site Reliability Engineer VSP Proprietary 3/31/2023
Support system integration, software, and hardware at enterprise level for optimum
performance
Partner with development teams to improve services through rigorous automated
testing and release procedures
Contribute to system architecture planning, and policies and procedures surrounding
enterprise-wide technology
Participate in and lead system design consulting, platform management, and capacity
planning
Guide change with a focus on optimal outcomes
Remain current on new technologies; introduce applicable technology that aligns with
VSP goals and enables creative solutions
Job Specifications
Typically has the following skills or abilities:
Bachelor’s Degree in Computer Science or related field and/or equivalent experience
6+ years of related experience and ability to perform job role independently
Experience with both Windows and Linux, as well as containerization software products
Proficient with continuous integration and continuous delivery
Experience with automation and orchestration using Chef, Puppet, Ansible and
containers
Coding experience beyond simple scripts and knowledge of application architecture
Ability to program (structured and OO) with one or more high level languages, such as
Python, Java, C/C++/C#, Ruby, and JavaScript
Experience with distributed storage technologies like NFS, HDFS, Ceph, and S3, as well as
dynamic resource management frameworks (OpenShift, Kubernetes, YARN)
A proactive approach to spotting problems, finding areas for improvement, and identifying
performance bottlenecks, leading to problem and root cause analysis and risk mitigation
Capacity monitoring and performance planning experience with cloud solutions like
AWS using applications such as Dynatrace, New Relic, and AppDynamics
Meet/exceed the organization's best practices, expectations, and standards while being
security focused (DevSecOps)
Skills:
Site reliability, Windows, Linux, automation, coding
Top Skills Details:
Site reliability, Windows, Linux, automation, coding
Additional Skills & Qualifications:
100% remote. This person will be responsible for helping stand up Site Reliability operations for this client. This is not an order-taker role. This person should have experience helping stand up these operations, as well as experience working with various levels of leadership and different groups within an organization.
Contact Information
Email: mmegery@teksystems.com