Data Engineer (PySpark)

Virtusa | Dubai, United Arab Emirates | Posted: 09 Jan 2025

Financial

  • Estimate: $70k - $100k*
  • Zero income tax location

Accessibility

  • Hybrid
  • Visa Provided

Requirements

  • Experience: Unspecified
  • English: Professional

Position

As a Data Engineer at Virtusa, you will design, develop, and maintain scalable, optimized ETL pipelines using PySpark on the Cloudera Data Platform (CDP). You will also ensure data integrity and accuracy across data ingestion, transformation, and processing.

Responsibilities:

  • Data Pipeline Development: Create and maintain ETL pipelines to ensure the seamless movement and transformation of data.
  • Data Ingestion: Manage data ingestion processes from various sources into the data lake or warehouse on CDP.
  • Data Transformation and Processing: Utilize PySpark to cleanse and transform large datasets for analytical purposes.
  • Performance Optimization: Tune PySpark code and Cloudera components for optimal resource utilization.
  • Data Quality and Validation: Implement checks and monitoring to assure data reliability.
  • Automation and Orchestration: Automate workflows using orchestration tools like Apache Oozie or Airflow.
  • Monitoring and Maintenance: Oversee pipeline performance and perform routine maintenance.
  • Collaboration: Work with data engineers, analysts, and product managers to fulfill data requirements.
  • Documentation: Maintain comprehensive documentation for engineering processes and configurations.

Technical Skills Required:

  • PySpark: Strong proficiency in PySpark, including experience with RDDs and DataFrames.
  • Cloudera Data Platform: Familiarity with CDP components like Cloudera Manager, Hive, Impala, HDFS, and HBase.
  • Data Warehousing: Knowledge of ETL best practices and SQL tools.
  • Big Data Technologies: Exposure to Hadoop, Kafka, and distributed computing tools.
  • Orchestration: Experience with Apache Oozie, Airflow, or similar frameworks.
  • Scripting: Proficiency in Linux scripting.

Location: Dubai, United Arab Emirates
Work Conditions: Hybrid, Full-time

About Virtusa:
Virtusa is a leading IT services company in the Middle East, known for its digital transformation initiatives across industries including banking, travel, and telecom. The company values teamwork, professional development, and work-life balance, and actively supports employee growth and well-being.

About Virtusa

Virtusa is a global provider of digital strategy, digital engineering, and IT services and solutions. We combine logic, creativity, and curiosity to build, solve, and create innovative solutions for our clients' most pressing business challenges. Our services include consult & design, engineer & automate, and analyze & optimize, across various industries.

Benefits at Virtusa

  • Opportunities for continuous learning and career advancement
  • Flexible work arrangements to accommodate different needs
  • Competitive compensation packages and recognition programs