Principal Data Engineer at Cellpoint Digital
Irving, TX 75038, USA
Full Time


Start Date

Immediate

Expiry Date

07 Nov, 25

Salary

0.0

Posted On

08 Aug, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Protocol Buffers, Kotlin, Python, Code

Industry

Information Technology/IT

Description

PRINCIPAL DATA ENGINEER

We are currently seeking a hands-on data engineer with years of experience building and supporting large-scale, cloud-first platforms.
This role combines several cutting-edge data skills. Primarily, it is a high-performing data engineering role focused on building out a robust, scalable data engineering platform. The ideal candidate has the background and foresight both to architect this solution and to create it. While the skills and detail needed to architect the solution are critical, the job will primarily involve building it out: coding the solution.
As mentioned above, you will be a key member of the small and growing team that is building out an entirely new platform for managing transactions at a massive scale. While CellPoint Digital is a fairly small company, due to our business focus, we work with very large companies that require massive velocity. For example, we have an internal goal of soon supporting 1 million transactions per second. In order to achieve these ambitious targets, we rely heavily on Google Cloud Platform native tools throughout our platform. In our data ETL process, for instance, we rely upon Dataflow to deliver the needed message velocity to our customers in real-time.
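To make the pipeline work concrete: the paragraph above describes high-velocity ETL on Dataflow, whose core operation is grouping a stream of events into fixed time windows and aggregating per key. Below is a minimal, stdlib-only sketch of that windowed aggregation; the event shape and merchant IDs are illustrative assumptions, not CellPoint's actual schema, and a production pipeline would express this with Apache Beam windowing rather than a plain function.

```python
from collections import defaultdict

# Hypothetical transaction events: (epoch_seconds, merchant_id, amount).
# Field names are illustrative, not a real schema.
EVENTS = [
    (0, "m1", 10.0),
    (1, "m1", 5.0),
    (2, "m2", 7.5),
    (61, "m1", 2.5),  # falls into the second 60-second window
]

def fixed_window_counts(events, window_secs=60):
    """Group events into fixed windows and count per (window, merchant),
    mirroring what a Beam FixedWindows + Count.PerKey step computes."""
    counts = defaultdict(int)
    for ts, merchant, _amount in events:
        window_start = (ts // window_secs) * window_secs
        counts[(window_start, merchant)] += 1
    return dict(counts)

print(fixed_window_counts(EVENTS))
# {(0, 'm1'): 2, (0, 'm2'): 1, (60, 'm1'): 1}
```

In Beam, the same shape becomes `beam.WindowInto(FixedWindows(60))` followed by a per-key combine, and Dataflow parallelizes it across workers.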
The role also requires a deep understanding of, and empathy for, robust data governance. You will create, vet, and document rules and methodologies to manage data storage, PII, data retained for auditing purposes, etc. You will do this in conjunction with security experts; we are not looking for someone with deep knowledge of data laws and regulations. However, we are looking for someone passionate about data protection and governance.
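One common governance rule in payments is that raw card numbers (PANs) never land in analytical storage. As a minimal sketch of that idea, the function below replaces a PAN with an irreversible token while keeping the last four digits for support and audit use; the salt handling and token format are assumptions for illustration only (a real system would use a managed KMS or vault, not an inline salt).

```python
import hashlib

def tokenize_pan(pan: str, salt: str = "demo-salt") -> str:
    """Replace a card number with an irreversible token, keeping only
    the last four digits. Hypothetical format: tok_<16 hex chars>_<last4>.
    The hard-coded salt is for demonstration; use a KMS/vault in practice."""
    digest = hashlib.sha256((salt + pan).encode()).hexdigest()[:16]
    return f"tok_{digest}_{pan[-4:]}"

masked = tokenize_pan("4111111111111111")
print(masked)
```

Storing only tokens like this keeps joins and per-card analytics possible while the raw PAN stays inside the secured payment path.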
There will also be the creation of data lineage, discoverability, visualization, and quality tools. We need metrics available to show what data is most useful, what data is most accurate, and what data is rarely accessed and potentially can be deprecated. Also, our data is directly used by customers to determine not only how well we are performing, but how much payment traffic they have seen, how well their payment providers are performing, and many other details of interest to our customers.
Finally, the role will involve creating a data enablement platform. You will develop a data platform that will enable others with little technical expertise to easily generate their own reports and dashboards. Think of this as a software platform but for data products. This will involve creating an infrastructure backbone. All infrastructure work is performed using infrastructure-as-code best practices.
Beyond the short term, once the essential data engineering work is complete and in production, you will be responsible for building out critical machine learning products. For instance, payment processor routing will require machine learning products for optimized, real-time payment processor routing. There are many more solutions beyond this.
You will be a key member deciding what technologies we should adopt within the data team. The team is dynamic and will adopt and deprecate solutions as appropriate.

SKILLS YOU WILL HAVE FINE-TUNED:

  • Coding in core Google Cloud Platform products. The code base is mostly Python, but Java/Kotlin and Go are also viable languages for our platform. (Microservices teams are using Kotlin.)
  • Data testing solutions such as Great Expectations and software testing solutions like Pytest.
  • Massively parallelized, high-velocity data pipeline programming paradigms, like Apache Beam, Protocol Buffers, and asynchronous programming libraries.
  • Robust CI/CD pipeline solutions; we only release via fully automated pipelines.
  • Data accountability practices: exposing data lineage, ensuring least-privilege principles have been adopted, creating discoverability tools, and developing quality metrics.
  • Generous, positive, and collaborative spirit. We have a wonderful team that gets along very well, with some fun jokes thrown in the mix, as well as managers who take full ownership and responsibility.
  • Infrastructure as code: we use either Terraform or Config Connector (a Google solution similar to Crossplane), but you simply need some prior exposure to infra as code.
  • (Mid- to Long-term) Machine learning development, model evaluation, and statistical analysis.
  • Beautiful data visualizations that simplify end-user insights as a core value-add to our customer product. These visualizations must be responsive, dynamic, and interactive with large datasets.
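The data-testing bullet above (Great Expectations, Pytest) boils down to declarative expectations run against datasets. Here is a minimal, stdlib-only stand-in for that style of check; the rows, column names, and result shape are illustrative assumptions modeled loosely on how Great Expectations reports results, not its actual API.

```python
# Hypothetical transaction rows; field names are illustrative only.
ROWS = [
    {"txn_id": "t1", "amount": 12.5, "currency": "USD"},
    {"txn_id": "t2", "amount": 3.0, "currency": "EUR"},
]

def expect_not_null(rows, column):
    """Every row must have a non-null value in `column`."""
    failures = [r for r in rows if r.get(column) is None]
    return {"success": not failures, "unexpected_count": len(failures)}

def expect_values_between(rows, column, low, high):
    """Every value in `column` must fall in [low, high]."""
    failures = [r for r in rows if not (low <= r[column] <= high)]
    return {"success": not failures, "unexpected_count": len(failures)}

print(expect_not_null(ROWS, "currency"))
print(expect_values_between(ROWS, "amount", 0, 100))
```

In practice these checks would be Great Expectations suites wired into the CI/CD pipeline, so a failing expectation blocks a release the same way a failing Pytest case does.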
Responsibilities

Please refer to the job description above for details.
