Data Engineer (PySpark)

Dubai, United Arab Emirates. Posted: 23 Jan 2026

Financial

  • Estimate: $80k - $120k*
  • Zero income tax location

Accessibility

  • Hybrid
  • Visa Provided

Requirements

  • Experience: Senior
  • English: Professional

Position

As a Data Engineer specializing in PySpark, you will be responsible for designing, developing, and maintaining highly scalable ETL pipelines on the Cloudera Data Platform. Your duties will include ensuring data integrity and accuracy throughout the data pipeline, implementing data ingestion processes from various sources to the data lake or data warehouse, and utilizing PySpark for data transformation and processing. You will also focus on performance optimization of PySpark code, executing data quality checks, and automating workflows using orchestration tools like Apache Oozie or Airflow. Collaboration with data engineers, analysts, and product managers is essential, along with thorough documentation of your engineering processes.

Responsibilities:

  • Design and maintain ETL pipelines using PySpark.
  • Manage data ingestion from relational databases, APIs, and file systems.
  • Process, cleanse, and transform large datasets for analytical needs.
  • Optimize performance of PySpark code and data processes.
  • Implement monitoring, validation, and quality checks for data accuracy.
  • Automate data workflows with orchestration tools.
  • Troubleshoot and maintain pipeline performance on the Cloudera Data Platform.
  • Document data engineering processes and code.

Qualifications:

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Systems, or a related field.
  • 8+ years of experience as a Data Engineer with a strong focus on PySpark and the Cloudera Data Platform.

Technical Skills:

  • Advanced proficiency in PySpark, working with RDDs, DataFrames, and optimization techniques.
  • Strong experience with Cloudera Data Platform, including Cloudera Manager, Hive, Impala, HDFS, and HBase.
  • Knowledge of data warehousing concepts and SQL-based tools.
  • Familiarity with Hadoop, Kafka, and distributed computing tools.
  • Experience with orchestration tools like Apache Oozie or Airflow.
  • Strong scripting skills on Linux.

Soft Skills:

  • Strong analytical and problem-solving abilities.
  • Excellent verbal and written communication.
  • Ability to work independently and in a collaborative team environment.
  • Attention to detail and commitment to data quality.

Location: Dubai, Dubai, United Arab Emirates

Work Conditions: Hybrid, Full-time

Language Requirements: Not specified.
