Sr. Site Reliability Engineer

TEKsystems

Posted Monday, May 15, 2023

Posting ID: JP-003777481

Remote

Description:

Empower platform users with a rich feature set, high availability, and stellar performance level to pursue their missions. Deliver insights from massive-scale data in real time, bring fresh ideas, demonstrate a unique and informed viewpoint, and collaborate with cross-functional teams to develop real-world solutions and positive user experiences at every interaction.


Essential Functions

May be required to perform one or more of the following functions:

Improve reliability, quality, and time-to-market of our suite of software solutions through effective hosting, monitoring, operations, and automation

Proactively deliver on SLOs by identifying and remediating issues, keeping production performance within service SLAs

Build software and systems to manage platform infrastructure and applications, creating sustainable systems and services through automation and uplifts

Support system cost modeling for all hosted systems

Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve

Drive primary operational support and engineering for multiple large, distributed software applications

Develop guidelines and detailed plans for automated systems delivery while maintaining system and data security

Conduct impact analysis regarding enterprise-wide technology

Perform capacity monitoring with various monitoring tools (Splunk, Dynatrace, etc.) and make recommendations leading to the integration of new tools that give visibility into required components

Gather and analyze metrics from both operating systems and applications to assist in performance tuning, fault finding, and corrective action planning

Site Reliability Engineer VSP Proprietary 3/31/2023

Support system integration, software, and hardware at enterprise level for optimum performance

Partner with development teams to improve services through rigorous automated testing and release procedures

Contribute to system architecture planning, and to policies and procedures surrounding enterprise-wide technology

Participate in and lead system design consulting, platform management, and capacity planning

Guide change with a focus on optimal outcomes

Remain current on new technologies; introduce applicable technology in alignment with VSP goals and for creative solutions

Job Specifications


Typically has the following skills or abilities:

Bachelor’s Degree in Computer Science or related field and/or equivalent experience

6+ years of related experience and ability to perform job role independently

Experience with both Windows and Linux, as well as containerization software products

Proficient with continuous integration and continuous delivery

Experience with automation and orchestration using Chef, Puppet, Ansible, and containers

Coding experience beyond simple scripts and knowledge of application architecture

Ability to program (structured and OO) with one or more high-level languages, such as Python, Java, C/C++/C#, Ruby, and JavaScript

Experience with distributed storage technologies like NFS, HDFS, Ceph, and S3, as well as dynamic resource management frameworks (OpenShift, Kubernetes, YARN)

A proactive approach to spotting problems, areas for improvement, and performance bottlenecks, leading to problem and root cause analysis and risk mitigation

Capacity monitoring and performance planning experience with cloud solutions like AWS, using applications such as Dynatrace, New Relic, and AppDynamics

Meet or exceed the organization's best practices, expectations, and standards while being security focused (DevSecOps)


Skills:

Site reliability, Windows, Linux, automation, coding


Top Skills Details:

Site reliability, Windows, Linux, automation, coding


Additional Skills & Qualifications:

100% remote. This person will be responsible for helping stand up Site Reliability operations for this client. This is not an order-taker role. The candidate should have experience helping stand up these operations, as well as experience working with various levels of leadership and different groups within an organization.


Contact Information

Email: mmegery@teksystems.com
