Data Engineer (PySpark)

Dubai, United Arab Emirates
Posted: 10 Apr 2025

Financial

  • Estimate: $60k - $93k*
  • Zero income tax location

Accessibility

  • Hybrid
  • Apply from abroad
  • Visa Provided

Requirements

  • Experience: Intermediate

Position

Virtusa is seeking a Data Engineer (PySpark) to join our team. The role centers on the Cloudera Data Platform (CDP) and on developing and maintaining data pipelines for large datasets.

Key Responsibilities:

  • Data Pipeline Development: Design, develop, and maintain highly scalable and optimized ETL pipelines using PySpark, ensuring data integrity and accuracy.
  • Data Ingestion: Implement and manage data ingestion processes from various sources (relational databases, APIs, file systems) to the data lake or data warehouse on CDP.
  • Data Transformation and Processing: Use PySpark to cleanse and transform large datasets into formats that support analytical needs and business requirements.
  • Performance Optimization: Conduct performance tuning of PySpark code and Cloudera components to optimize resource utilization and reduce ETL runtime.
  • Data Quality and Validation: Implement data quality checks, monitoring, and validation routines to ensure data accuracy and reliability.
  • Automation and Orchestration: Automate data workflows using orchestration tools such as Apache Oozie or Apache Airflow within the Cloudera ecosystem.

Technical Skills Required:

  • 3+ years of experience as a Data Engineer, with a focus on PySpark and the Cloudera Data Platform.
  • Advanced proficiency in PySpark, including working with RDDs, DataFrames, and optimization techniques.
  • Strong experience with Cloudera Data Platform (CDP) components, including Cloudera Manager, Hive, Impala, HDFS, and HBase.
  • Knowledge of data warehousing concepts, ETL best practices, and experience with SQL-based tools (e.g., Hive, Impala).
  • Familiarity with big data technologies such as Hadoop and Kafka.
  • Experience with orchestration frameworks such as Apache Oozie and Apache Airflow.
  • Strong scripting skills in a Linux environment.

Location: Dubai, Dubai, United Arab Emirates
Work Conditions: Hybrid, Full-time

About Virtusa:
Join Virtusa, one of the fastest-growing IT Services companies in the Middle East. We offer the opportunity to work on leading Digital Transformation programs with a growing client base in the UAE, KSA, Qatar, and Oman, partnering with top firms in various sectors including Banking, Financial Services, Travel, and Telecom. Our commitment to teamwork, quality of life, and professional development makes Virtusa a nurturing environment for your career growth.
