Sr Architect, IT Situational Awareness at Information Technology Senior Management Forum
Dallas, Texas, USA
Full Time


Start Date

Immediate

Expiry Date

30 Oct, 25

Salary

0.0

Posted On

30 Jul, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Documentation, Software Development, Information Systems, Reliability Engineering, AppDynamics, Mainframe, OCI, Distributed Systems, Firewalls, Mission Critical Environments, Collaboration, Programming Languages, Java, Technical Documentation, DevOps, Code, Scripting

Industry

Information Technology/IT

Description

Posted Date
7/29/2025

MINIMUM QUALIFICATIONS - EDUCATION & PRIOR JOB EXPERIENCE

  • Bachelor’s degree in Technology, Computer Science, Information Systems, or a related technical discipline, or equivalent experience/training.
  • 5+ years’ experience in system architecture, software development, cloud computing (e.g., AWS, Azure), and cybersecurity.
  • 3+ years’ experience in designing, deploying, and configuring telemetry solutions to observe diverse tech stacks, ranging from mainframe and midrange to cloud technologies.
  • 5+ years of experience in observability, site reliability engineering, DevOps, or a related technical field.
  • Prior experience architecting observability solutions for high-availability, large-scale, or mission-critical environments.
  • Demonstrable leadership in driving observability initiatives and influencing engineering culture.

PREFERRED QUALIFICATIONS - EDUCATION & PRIOR JOB EXPERIENCE

  • Master’s degree in Computer Science, Computer Engineering, Technology, Information Systems (CIS/MIS), Engineering, or a related technical discipline, or equivalent experience/training.
  • Familiarity with various programming languages and frameworks.
  • Airline industry leadership experience.
  • Certifications in cloud architecture, DevOps, or observability tools.
  • Experience with automation frameworks, Infrastructure as Code (IaC), and integrating observability with deployment pipelines.

SKILLS, LICENSES & CERTIFICATIONS

  • Technical Expertise: In-depth knowledge of observability pillars (metrics, logs, traces), telemetry, and distributed tracing frameworks (OpenTelemetry, Jaeger, Zipkin).
  • Tool Proficiency: Hands-on experience with one or more major observability platforms (e.g., Dynatrace, Prometheus, Grafana, Datadog, New Relic, ELK/EFK Stack, Splunk, AppDynamics).
  • Cloud and Containerization: Experience with cloud platforms (Azure, AWS, IBM Cloud, GCP, OCI) and with containerization and orchestration (Docker, Kubernetes).
  • Programming and Scripting: Proficiency in at least one programming or scripting language (Python, Go, Java, Bash, etc.) for automation and tool integration.
  • Problem-Solving: Strong analytical and troubleshooting skills, especially with complex, distributed systems.
  • Documentation: Ability to produce clear, concise technical documentation and architectural diagrams.
  • Security Awareness: Understanding of security implications within observability pipelines and compliance requirements.
  • Advanced knowledge of IT systems, networks, and applications, with deep knowledge of the observability space (monitoring, logging, AIOps, and notification systems).
  • Hands-on experience in architecting and implementing AIOps (Artificial Intelligence for IT Operations) solutions, using tools such as BigPanda.
  • Mastery of modern resiliency practices and of how observability supports resiliency.
  • Moderate knowledge of designing enterprise-grade infrastructure solutions (on-prem, hybrid, cloud, and virtualization such as VMware) that are scalable and secure.
  • Knowledge of firewalls, routing, VPNs, and load balancers.
  • Strong analytical, technical, and problem-solving skills.
  • Familiarity with DevOps practices and tools.
  • Collaboration: Excellent communication and interpersonal skills; proven ability to work effectively across teams.
  • Strong interpersonal skills to work effectively with cross-functional teams.

Responsibilities

As noted above, this list is intended to reflect the current job but there may be additional essential functions (and certainly non-essential job functions) that are not referenced. Management will modify the job or require other tasks be performed whenever it is deemed appropriate to do so, observing, of course, any legal obligations including any collective bargaining obligations.

  • Architectural Design: Develop and maintain the overall architecture of IT solutions, ensuring they meet business needs and are scalable, secure, and efficient.
  • Stakeholder Collaboration: Work with business leaders, project managers, and IT teams to gather requirements and translate them into technical solutions.
  • System Integration: Oversee the integration of new technologies and systems into the existing IT environment, ensuring compatibility and performance.
  • Documentation: Create detailed architectural documentation, including diagrams, design patterns, and technical specifications.
  • Compliance and Security: Ensure that all solutions comply with industry standards and regulations, and implement robust security measures to protect data and systems.
  • Continuous Improvement: Stay updated with the latest technologies and industry trends, and incorporate relevant advancements into the organization’s IT strategy.

Key Responsibilities

  • Design and architect comprehensive observability frameworks encompassing metrics, traces, logs, and events across cloud and on-premises environments.
  • Develop and enforce best practices for instrumentation, monitoring, alerting, and dashboarding for applications and infrastructure.
  • Lead the evaluation, selection, and deployment of observability tools such as Dynatrace or similar platforms.
  • Collaborate with development, enterprise, and security teams to ensure seamless integration of observability solutions into CI/CD pipelines and operational workflows.
  • Define guidelines and best practices for tracking Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Service Level Agreements (SLAs) in partnership with stakeholders.
  • Analyze observability data to identify trends, perform root-cause analysis, and recommend system improvements.
  • Mentor engineering teams on observability principles and facilitate knowledge-sharing throughout the organization.
  • Stay current with industry trends, emerging technologies, and best practices in observability and site reliability engineering.