Data Engineer at Pipl

Petah Tikva, Center District, Israel

Full Time

Start Date

Immediate

Expiry Date

12 Sep, 26

Salary

0.0

Posted On

14 Jun, 26

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Python, SQL, Cloud Computing, dbt, Spark, Airflow, Kafka, ETL Pipelines, Data Modeling, NoSQL, Docker, Kubernetes, Java, Scala, Aerospike, Real-time Data Processing

Industry

Computer and Network Security

Description

Pipl is an AI company that helps global enterprises make better fraud decisions. We're built on more than 20 years of identity data and Elephant, the industry's only large risk model trained on payment fraud.

Pipl's three products give enterprise teams modular access to the intelligence they need across payment and transaction ecosystems, where the cost of getting it wrong is highest. Pipl Trust brings AI-native risk decisioning into payment workflows, with Elephant resolving identity across behavioral, device, and network signals in real time. Pipl Search connects identity intelligence for investigations and background checks, drawing on our global identity graph. Pipl Elements delivers phone and email signals that strengthen existing fraud models and verification workflows.

Our identity graph covers more than 5 billion identities and 740 billion signals. The world's largest payment networks, ecommerce marketplaces, and digital wallet platforms trust Pipl to get it right.

We’re looking for a Data Engineer to join our growing data team and help design, build, and scale our data infrastructure. Our team works closely with product, engineering, and data science to ensure reliable, high-quality data pipelines that power analytics, machine learning models, and data-driven decision-making. As a Data Engineer, you will be responsible for creating and maintaining systems that collect, process, and store vast amounts of data, ensuring it is accessible, reliable, and optimized for performance across the organization.

Responsibilities

Design, build, and maintain scalable ETL pipelines from multiple sources.
Work closely with product managers, data scientists, and analysts to ensure data solutions meet business and technical needs.
Ensure data integrity, accuracy, and security across platforms, implementing monitoring and validation frameworks.
Improve data pipeline efficiency and performance, ensuring low latency and cost-effective solutions.
Recommend and implement new technologies, tools, and best practices for data engineering.
Leverage AI tools and assistants to improve productivity, code quality, and pipeline development, and share best practices with the team.

Requirements

3+ years of experience as a Data Engineer (or in a similar role).
Passionate about AI and actively uses AI tools to accelerate day-to-day work, from writing and debugging code to data exploration, documentation, and problem-solving.
Strong programming skills in Python.
Hands-on experience with cloud environments.
Experience working with dbt for data transformations and modeling (an advantage).
Solid experience with SQL and database design (both relational and NoSQL).
Proven track record in building and maintaining large-scale data pipelines using frameworks such as Spark, Airflow, Kafka, or similar.
Strong understanding of data modeling, warehousing, and ETL best practices.
Self-motivated, detail-oriented, and able to work autonomously.
Excellent communication skills in English.

Advantages

Experience with containerization and orchestration (Docker, Kubernetes).
Programming skills in Java and Scala.
Familiarity with real-time data processing systems.
Exposure to data security and compliance best practices.
Prior experience working in a big data or search engine environment.
Hands-on experience with Aerospike.

How To Apply:

In case you would like to apply to this job directly from the source, please click here

Responsibilities

Design, build, and maintain scalable ETL pipelines to ensure high-quality data for analytics and machine learning. Collaborate with product and data science teams to optimize data infrastructure for performance and reliability.