Data Engineer (PySpark)

Dubai, United Arab Emirates | Posted: 09 Jan 2025

Financial

  • Estimate: $70k - $100k*
  • Zero income tax location

Accessibility

  • Hybrid
  • Visa Provided

Requirements

  • Experience: Unspecified

Position

As a Data Engineer at Virtusa, you will design, develop, and maintain scalable, optimized ETL pipelines using PySpark on the Cloudera Data Platform (CDP). You will ensure data integrity and accuracy across data ingestion, transformation, and processing.

Responsibilities:

  • Data Pipeline Development: Create and maintain ETL pipelines to ensure the seamless movement and transformation of data.
  • Data Ingestion: Manage data ingestion processes from various sources into the data lake or warehouse on CDP.
  • Data Transformation and Processing: Utilize PySpark to cleanse and transform large datasets for analytical purposes.
  • Performance Optimization: Tune PySpark code and Cloudera components for optimal resource utilization.
  • Data Quality and Validation: Implement checks and monitoring to ensure data reliability.
  • Automation and Orchestration: Automate workflows using orchestration tools like Apache Oozie or Airflow.
  • Monitoring and Maintenance: Oversee pipeline performance and perform routine maintenance.
  • Collaboration: Work with data engineers, analysts, and product managers to fulfill data requirements.
  • Documentation: Maintain comprehensive documentation for engineering processes and configurations.

Technical Skills Required:

  • PySpark: Strong proficiency in PySpark, including experience with RDDs and DataFrames.
  • Cloudera Data Platform: Familiarity with CDP components like Cloudera Manager, Hive, Impala, HDFS, and HBase.
  • Data Warehousing: Knowledge of ETL best practices and SQL tools.
  • Big Data Technologies: Exposure to Hadoop, Kafka, and distributed computing tools.
  • Orchestration: Experience with Apache Oozie, Airflow, or similar frameworks.
  • Scripting: Proficient in Linux scripting.

Location: Dubai, Dubai, United Arab Emirates
Work Conditions: Hybrid, Full-time

About Virtusa:
Virtusa is a leading IT services company in the Middle East, known for its digital transformation initiatives across industries including banking, travel, and telecom. The company values teamwork, professional development, and work-life balance, and actively supports employee growth and well-being.
