Senior Data Engineer at Stream Systems Ltd
Vancouver, BC V6H 1L9, Canada
Full Time


Start Date

Immediate

Expiry Date

07 May, 2025

Salary

Not specified

Posted On

08 Feb, 2025

Experience

0 year(s) or above

Remote Job

No

Telecommute

No

Sponsor Visa

No

Skills

Python, AWS, SQL, dbt, Java, Jenkins, Confluence, PostgreSQL, Reporting, Scripting, Bitbucket, SQL Server, Platforms, Performance Tuning, Kafka, Security, GraphQL, Demand, RabbitMQ, Azure, Jira

Industry

Information Technology/IT

Description

COMPANY OVERVIEW

Stream Systems (www.streamsystems.ca) is a leading-edge technology company that enables enterprises to optimize operations and make better decisions. Our SimOpti intelligence platform brings together AI, machine learning, and simulation to power business intelligence across complex operations in any industry. Our team of talented individuals is growing quickly; we are targeting 50-75% growth over the next fiscal year. At this pivotal moment for the company, we are embarking on new product roadmaps and new development, giving this role significant influence over product direction.

POSITION SUMMARY

We are looking for an experienced Data Engineer, reporting to the Software Development Manager, to lead the development and implementation of our data layer. Your work will directly support and enable our hybrid and cloud-based platform, our simulation models and engines, our Artificial Intelligence (AI) / Machine Learning (ML) development, and our SaaS products. In this role, you will work directly with the Platform Owner to design, develop, implement, and optimize data ingestion and integration pipelines, data connectors, database schemas, and data-sharing APIs.
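
As an illustration of the data-sharing APIs mentioned above, here is a minimal, purely hypothetical sketch of a GraphQL endpoint in Python using the graphene library. The Sensor type, its fields, and the canned data are invented for demonstration and do not reflect Stream's actual schema.

    import graphene  # third-party GraphQL library, assumed for this sketch

    # Hypothetical record type exposed through a data-sharing API.
    class Sensor(graphene.ObjectType):
        id = graphene.ID()
        name = graphene.String()
        latest_value = graphene.Float()

    class Query(graphene.ObjectType):
        sensors = graphene.List(Sensor)

        def resolve_sensors(root, info):
            # A real resolver would query PostgreSQL / SQL Server;
            # a canned row stands in here for illustration.
            return [Sensor(id="1", name="flow-meter-01", latest_value=42.7)]

    schema = graphene.Schema(query=Query)
    result = schema.execute("{ sensors { name latestValue } }")
    print(result.data)  # {'sensors': [{'name': 'flow-meter-01', 'latestValue': 42.7}]}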

REQUIREMENTS

To ensure success in the role, you will need the following:

  • Experience and formal qualifications in a STEM-related discipline such as Engineering or Computer Science, or certification and extensive work experience in a data-centric development role.
  • Experience designing and implementing data pipeline solutions that access data from multiple big-data sources and frameworks (AWS, Azure), as well as hybrid architectures that include on-premises data connectors (SQL Server, PostgreSQL, historian, and time-series databases).
  • Working proficiency across the data stack in SQL, Python, Java, and API development using GraphQL. Experience with messaging platforms such as Pulsar, RabbitMQ, and Kafka, and with data ingestion tools such as Airbyte, dbt, Databricks, and MLflow, is an asset.
  • Experience in data analytics engineering, orchestrating data pipelines, performance tuning, and developing scripting and test automation.
  • Experience architecting, scripting, and maintaining ETL solutions for a wide variety of sources, data abstractions, and data pipelines covering real-time, streaming, batch, and on-demand workloads (a toy example is sketched after this list).
  • Experience with data security strategies, configuring security for reporting, and certifications such as SOC 2.
  • Functional experience with database IDEs (IntelliJ/DataGrip), Jira, Confluence, Bitbucket, and Jenkins would be considered an asset.
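
As a toy illustration of the watermark-driven incremental loads this kind of ETL work involves, the following self-contained Python sketch uses SQLite as a stand-in for a real source system and warehouse; all table, column, and watermark values are invented.

    import sqlite3

    # Toy watermark-based incremental load; SQLite stands in for a
    # real source database and warehouse staging area.
    src = sqlite3.connect(":memory:")
    src.executescript("""
        CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL, updated_at TEXT);
        INSERT INTO readings VALUES (1, 10.5, '2025-02-01'), (2, 11.2, '2025-02-03');
    """)

    dst = sqlite3.connect(":memory:")
    dst.execute("CREATE TABLE staging_readings (id INTEGER PRIMARY KEY, value REAL, updated_at TEXT)")

    watermark = "2025-02-02"  # normally persisted between runs
    rows = src.execute(
        "SELECT id, value, updated_at FROM readings WHERE updated_at > ?",
        (watermark,),
    ).fetchall()

    # An upsert keeps the load idempotent if the job is retried.
    dst.executemany(
        "INSERT INTO staging_readings VALUES (?, ?, ?) "
        "ON CONFLICT(id) DO UPDATE SET value = excluded.value, updated_at = excluded.updated_at",
        rows,
    )
    dst.commit()
    print(dst.execute("SELECT * FROM staging_readings").fetchall())  # [(2, 11.2, '2025-02-03')]
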
Responsibilities

WHAT YOU’LL BE DOING

At Stream, data drives our product. Data enables our data scientists, simulation analysts, and ML researchers to create amazing new solutions for industry. As a Data Engineer at Stream, you love solving business problems with data. As part of the Platform Team, you will work with the Platform Owner and fellow software developers, architects, product managers, simulation analysts, and data scientists to understand customer data and build our intelligent data platform. You have a passion for creating innovative data pipeline solutions built on industry standards and effective technologies.

THE ROLE’S DIRECT RESPONSIBILITIES INCLUDE:

  • Architect and build data ingestion and data integration solutions, in conjunction with the Platform Owner, to support Stream’s simulation platform on-premises and in the cloud (AWS)
  • Manage and build ETL processes, extracting and integrating data from various data sources
  • Oversee data quality monitoring, including the processing, management, and cleaning of data flows used for analytics, while working closely with QA and Testing teams
  • Create custom data models, in conjunction with Product and Business analysts, for automated analysis, machine learning training, and embedded BI reporting with row-level security (one possible approach is sketched after this list)
  • Leverage best practices and best-in-class products and services to build cost-effective, scalable, and reliable data pipelines
  • Define and manage relational and data warehouse structures and schemas while ensuring optimal query execution (cost and performance)
  • Collaborate with Simulation Analysts and Data Scientists on AI/ML-based solutions, develop and define new data integrations, and support modeling projects
  • Translate data requirements into detailed designs and pipeline architecture, ensuring proper documentation of methodologies
  • Define best practices around data extraction, modeling, consumption, and governance at Stream
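
One possible approach to the row-level security mentioned above is PostgreSQL's built-in RLS, sketched below in Python with the psycopg2 driver. The table, column, tenant value, and connection string are all invented for illustration; note that RLS policies do not constrain table owners or superusers by default.

    import psycopg2  # assumed PostgreSQL driver; any client would work

    # Hypothetical fact table; the policy restricts every query on
    # report_facts to rows matching the session's tenant setting.
    DDL = """
    ALTER TABLE report_facts ENABLE ROW LEVEL SECURITY;
    CREATE POLICY tenant_isolation ON report_facts
        USING (tenant_id = current_setting('app.tenant_id'));
    """

    conn = psycopg2.connect("dbname=analytics")  # invented connection string
    with conn, conn.cursor() as cur:
        cur.execute(DDL)
        # Each embedded-BI session tags itself with its tenant;
        # PostgreSQL then filters report_facts transparently.
        cur.execute("SET app.tenant_id = 'acme'")
        cur.execute("SELECT count(*) FROM report_facts")
        print(cur.fetchone())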
