Senior Systems Administrator - Req. 1900089
Performs analysis and design tasks related to Application Performance Monitoring. Executes on strategic direction and develops tactical plans for improving the performance and stability of mission-critical applications. Position requires extensive contact with development, QA, and admin/operational staff. Effectively identifies opportunities for change, implements change, and introduces new concepts, procedures, policies, and tools while providing a clear explanation of their benefits and purpose.
- Triage degraded or outage situations in a production environment in order to restore system health.
- Use monitoring tools to uncover the backend dependencies of critical applications and work with teams to identify performance bottlenecks and improvements.
- Act as an escalation point for individuals and teams engaged in troubleshooting production issues.
- Engage teams to ensure that operationally significant events are addressed or escalated in a timely manner.
- Work with Command Center and Monitoring teams to ensure the proper level of visibility exists for business-critical applications.
- Develop dashboards which show the overall health of a complex application. This will likely be accompanied by other dashboards showing the health of dependent systems.
Experience and Skills:
- Minimum of a B.S. in Computer Science, MIS, or a related field and five (5) years of related experience, or a combination of education, experience, and training.
- Technical skill set (highly preferred):
  - Experience with Nagios, Splunk, SCOM, ServiceNow, MS PowerShell, BlueStripe, AlertSite, Dynatrace, etc.
  - Event management and integrations (tools like CA Service Operations Insight and ServiceNow, leveraging REST)
  - Understanding of standard protocols/technologies such as DNS/WINS, TCP/IP, FTP, SSH, RDP, Active Directory, HTTP/S, IIS, JBoss, F5, etc.
  - Experience creating dashboards and relevant visualizations.
  - Proficient with Dynatrace, including:
    - Building custom measures
    - Building Business Transactions
    - Creating incident rules that avoid false positives
    - Building dashboards to show application health and KPIs
    - Using Dynatrace to triage a performance problem in any environment
- None Required
- Analysis: Identify and understand issues, problems and opportunities; compare data from different sources to draw conclusions.
- Communication: Clearly convey information and ideas through a variety of media to individuals or groups in a manner that engages the audience and helps them understand and retain the message.
- Exercising Judgment and Decision Making: Use effective approaches for choosing a course of action or developing appropriate solutions; recommend or take action that is consistent with available facts, constraints and probable consequences.
- Technical and Professional Knowledge: Demonstrate a satisfactory level of technical and professional skill or knowledge in position-related areas; remain current with developments and trends in areas of expertise.
- Building Effective Relationships: Develop and use collaborative relationships to facilitate the accomplishment of work goals.