Observability Engineer
at EDP PT
Lisboa, Área Metropolitana de Lisboa, Portugal
Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
---|---|---|---|---|---|---|---|
Immediate | 15 Feb, 2025 | Not Specified | 16 Nov, 2024 | N/A | Good communication skills | No | No |
Description:
Country/Region: PT
City: Lisbon
Company: EDP, S.A.
DGU Tech Services is looking to recruit an Observability Engineer.
EDP is a global energy group present in around 30 markets with a particular emphasis on renewable energies. With more than 45 years of experience, we have been consolidating a relevant presence on the world energy scene based on the commitment to be all-green by 2030, leading the energy transition. With more than 13,000 employees around the world, we are committed to using our energy and heart to drive a better tomorrow.
WHAT WE ARE LOOKING FOR:
- Bachelor’s or Master’s degree in Engineering or a related field;
- Extensive experience with monitoring tools, particularly Splunk, Elastic, and ITSI, with the ability to propose effective solutions;
- Proven expertise in problem analysis and incident resolution in complex environments;
- Familiarity with cloud platforms (Azure, AWS, Google Cloud) and their native monitoring tools;
- Experience with ITIL practices and platforms like ServiceNow;
- Proficiency in configuration management tools such as Ansible and Terraform for automating and standardizing system and monitoring configurations;
- Hands-on experience with APM tools like Dynatrace for monitoring critical application performance;
- Strong scripting skills (Python, PowerShell, Shell Script) for automating monitoring tasks and optimizing incident response;
- Focus on continuous improvement through the analysis of performance and availability metrics, proactively optimizing operations and ensuring SLA compliance;
- Competence in developing real-time dashboards and alerts, providing efficient system visibility;
- A proactive approach to refining and optimizing solutions already in place;
- Ability to monitor new system configurations, ensuring that changes do not negatively impact performance or observability;
- Knowledge of OpenTelemetry for distributed tracing, metrics, and logging in complex systems;
- Strong coordination skills with cross-functional teams to integrate monitoring practices and oversee configuration follow-ups;
- Experience with centralized observability platforms, consolidating data and applying best practices for comprehensive application observability.
Beyond academic knowledge and technical skills, we are looking for ambitious people who are enthusiastic about the future and who bring human skills aligned with our purpose.
Responsibilities:
Collaborating closely with teams managing monitoring platforms, this role drives the development of innovative solutions while leading initiatives to enhance system observability.
- Design and implement improvements to monitoring processes so that system performance is tracked effectively and issues can be resolved proactively, optimizing performance, minimizing downtime, and supporting continuous system health monitoring;
- Collaborate with DevOps, development, and cross-functional teams to request, track, and coordinate the implementation of system configurations and observability practices. Ensure the effective adoption of observability solutions in alignment with operational goals;
- Design and implement monitoring solutions to ensure system and application performance observability, using tools such as Splunk, Elastic, and other relevant platforms;
- Develop and optimize automation scripts for monitoring processes and problem resolution, minimizing response times and enhancing operational reliability;
- Collect, analyze, and interpret system performance metrics, identifying bottlenecks and ensuring SLA compliance while optimizing overall performance;
- Configure customized dashboards and automated alerts to provide clear, real-time visibility into system status, enabling quick detection and resolution of incidents;
- Continuously evaluate observability solutions, proposing and implementing improvements and optimizations to meet evolving needs and align with organizational objectives;
- Consolidate data from multiple monitoring tools into a unified view, facilitating efficient IT operations management and informed decision-making;
- Proactively monitor critical business services, implementing preventive measures to mitigate risks and avoid disruptions;
- Collaborate in the implementation and refinement of system and monitoring tool configurations, ensuring that observability solutions align with operational needs;
- Leverage experience with ITIL frameworks and platforms like ServiceNow to align monitoring processes with service management best practices, enhancing incident management, change control, and service delivery.
Employment type: Full-Time
Work site: Hybrid Model
REQUIREMENT SUMMARY
Min: N/A; Max: 5.0 year(s)
Information Technology/IT
IT Software - Other
Software Engineering
Graduate
Engineering
Proficient
1
Lisboa, Portugal