Data Engineer (PySpark)

Virtusa | Dubai, United Arab Emirates | Posted: 10 Apr 2025

Financial

  • Estimate: $60k - $93k*
  • Zero income tax location

Accessibility

  • Hybrid
  • Apply from abroad
  • Visa Provided

Requirements

  • Experience: Intermediate
  • English: Professional

Position

Virtusa is seeking a Data Engineer (PySpark) to join our team. The role centers on the Cloudera Data Platform (CDP) and on developing and maintaining data pipelines for large datasets.

Key Responsibilities:

  • Data Pipeline Development: Design, develop, and maintain highly scalable and optimized ETL pipelines using PySpark, ensuring data integrity and accuracy.
  • Data Ingestion: Implement and manage data ingestion processes from various sources (relational databases, APIs, file systems) to the data lake or data warehouse on CDP.
  • Data Transformation and Processing: Use PySpark to cleanse and transform large datasets into formats that support analytical needs and business requirements.
  • Performance Optimization: Conduct performance tuning of PySpark code and Cloudera components to optimize resource utilization and reduce ETL runtime.
  • Data Quality and Validation: Implement data quality checks, monitoring, and validation routines to ensure data accuracy and reliability.
  • Automation and Orchestration: Automate data workflows with orchestration tools such as Apache Oozie or Apache Airflow within the Cloudera ecosystem.

Technical Skills Required:

  • 3+ years of experience as a Data Engineer, focusing on PySpark and the Cloudera Data Platform.
  • Advanced proficiency in PySpark, including working with RDDs, DataFrames, and optimization techniques.
  • Strong experience with Cloudera Data Platform (CDP) components, including Cloudera Manager, Hive, Impala, HDFS, and HBase.
  • Knowledge of data warehousing concepts, ETL best practices, and experience with SQL-based tools (e.g., Hive, Impala).
  • Familiarity with big data technologies such as Hadoop and Kafka.
  • Experience with orchestration frameworks like Apache Oozie and Airflow.
  • Strong Linux shell scripting skills.

Location: Dubai, Dubai, United Arab Emirates
Work Conditions: Hybrid, Full-time

About Virtusa:
Join Virtusa, one of the fastest-growing IT services companies in the Middle East. We offer the opportunity to work on leading digital transformation programs with a growing client base in the UAE, KSA, Qatar, and Oman, partnering with top firms in sectors including banking, financial services, travel, and telecom. Our commitment to teamwork, quality of life, and professional development makes Virtusa a nurturing environment for your career growth.

About Virtusa

Virtusa is a global provider of digital strategy, digital engineering, and IT services and solutions. We combine logic, creativity, and curiosity to build, solve, and create innovative solutions for our clients' most pressing business challenges. Our services include consult & design, engineer & automate, and analyze & optimize, across various industries.

Benefits at Virtusa

  • Opportunities for continuous learning and career advancement
  • Flexible work arrangements to accommodate different needs
  • Competitive compensation packages and recognition programs