Data Engineer
Quick Summary
You’ll be part of a team that builds and operates the data platform behind many of the company’s key functions, supporting a variety of teams with diverse use cases.
We’re looking for a Data Engineer to join our Data Engineering team. We build the data platform that powers decision-making across the entire company — from product insights to customer analytics.
If you enjoy building scalable pipelines, working with PySpark, AWS, and Databricks, and want to grow your expertise in a collaborative team, we’d love to hear from you.
Responsibilities
You’ll be part of a team that builds and operates the data platform behind many of the company’s key functions, supporting a variety of teams with diverse use cases, from analytics to internal tools and customer-facing features. To support that range effectively, we focus on standardization, industry-standard tooling, and best practices that make our solutions scalable, reliable, and easy to maintain.
- Build and maintain reliable data pipelines (batch and streaming) using PySpark, SQL, and AWS
- Help develop and scale our company-wide Data Lake on AWS and Databricks (operating at petabyte scale)
- Work with data from diverse sources: APIs, file systems, databases, event streams
- Contribute to internal tooling (e.g., schema registries) to improve workflows
- Write clean, tested code and participate in code reviews
- Collaborate closely with other engineers, analysts, and product teams to deliver data solutions
- Learn and experiment with new tools and best practices in modern data engineering
Must-Have Skills
- Python – clean code, testing, and ability to read existing codebases
- Apache Spark – development and basic performance tuning
- SQL – good understanding and hands-on experience
- Git – solid version control habits
- Strong English – comfortable working and communicating in an international team
- Distributed systems mindset – solid understanding of fault tolerance, data partitioning, shuffling, and parallel processing
Nice to Have
- Delta Lake, Databricks
- Apache Airflow or similar orchestration tools
- Amazon S3, AWS experience overall
- Streaming & messaging technologies – Kafka, Kinesis, RabbitMQ
- Python libraries for RESTful APIs
- Data modeling
- PostgreSQL, ElasticSearch
- Familiarity with JVM languages (e.g., Java, Scala)
Beyond the Tech
Besides strong technical skills, we value clear communication: the ability to explain technical concepts and to contribute your own ideas. We also appreciate a broad understanding of the modern data landscape and awareness of industry best practices.
Tech Stack
- Languages & Tools: Python with PySpark, SQL, Git
- Data & Storage: AWS S3, Databricks, Delta Lake, PostgreSQL, ElasticSearch
- Streaming: Kafka, Kinesis, RabbitMQ
- Workflow & Orchestration: Apache Airflow
- Infrastructure: AWS (core services), Docker
What We Offer
Listing Details
- Posted: January 30, 2026
- First seen: March 26, 2026
- Last seen: April 14, 2026
Posting Health
- Days active: 19
- Repost count: 0
- Trust level: 39%
- Scored at: April 14, 2026