Summary
Results-driven Data Engineer with over two years of professional experience, nearly two of them specializing in big data technologies. Proven expertise in architecting and implementing scalable data pipelines, managing large-scale data migrations, and optimizing distributed systems with Apache Spark, Iceberg, and Kafka on AWS and Azure. Adept at ensuring data quality, integrity, and performance in complex data lakehouse and warehouse environments. Backed by a strong software engineering foundation, with hands-on experience building robust backend services, APIs, and real-time data platforms. Skilled in Python, Java, and C++, with a deep understanding of system design patterns and data-driven application development.
Experience
Data Engineer
- Migrated hundreds of full-load and historical tables, totaling terabytes of data, from an Oracle data warehouse to a data lakehouse using a robust in-house PySpark framework orchestrated with Airflow.
- Developed the PySpark framework used to migrate hundreds of tables containing billions of records from the legacy data warehouse to the lakehouse environment.
- Created comprehensive ETL documentation for numerous tables to ensure streamlined data processing and future scalability.
- Conducted data profiling and implemented data quality checks to ensure integrity and consistency throughout the migration process.
- Tools: PySpark, SQL, Excel, Oracle DB, Airflow, Apache Iceberg, MinIO, and Hive
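The data-quality checks described above can be sketched as a simple source/target reconciliation: comparing row counts and an order-independent checksum after each table migration. This is an illustrative, self-contained simplification (the real framework ran inside PySpark; the `reconcile` helper and its field names are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class TableStats:
    """Lightweight reconciliation stats for one table."""
    row_count: int
    checksum: int  # order-independent sum over a hashed key column

def profile(rows, key_col):
    """Compute stats from a list of row dicts (stand-in for a DataFrame)."""
    return TableStats(
        row_count=len(rows),
        checksum=sum(hash(str(r[key_col])) & 0xFFFFFFFF for r in rows),
    )

def reconcile(source_rows, target_rows, key_col):
    """True when source and target agree on count and checksum."""
    return profile(source_rows, key_col) == profile(target_rows, key_col)
```

Because the checksum is a sum, it tolerates row reordering between the Oracle source and the Iceberg target while still catching dropped or duplicated rows.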
Software Engineer
- Developed backend multi-dashboard APIs for forecasting and presenting passenger traffic, efficiently processing data volumes reaching hundreds of terabytes.
- Engineered a backend service that predicts nationality from a person's name.
- Tools: FastAPI, PostgreSQL
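The prediction logic behind a name-nationality service can be sketched as below. This is purely illustrative: the suffix table and labels are hypothetical, and a production service would sit behind a FastAPI endpoint and use a trained model rather than hand-written rules:

```python
# Hypothetical suffix-to-nationality hints; a real service would use a trained model.
SUFFIX_HINTS = {
    "ov": "Bulgarian",
    "ez": "Spanish",
    "sen": "Danish",
}

def predict_nationality(name: str) -> str:
    """Guess a nationality from the surname's suffix; 'unknown' if no rule matches."""
    surname = name.strip().split()[-1].lower()
    # Check longer suffixes first so e.g. "sen" wins over a shorter overlap.
    for suffix, nationality in sorted(SUFFIX_HINTS.items(), key=lambda kv: -len(kv[0])):
        if surname.endswith(suffix):
            return nationality
    return "unknown"
```

Wrapping this function in a FastAPI route would reduce the endpoint to a thin layer over the prediction call, which keeps the model logic independently testable.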
Data Engineer
- Designed and implemented an AI-powered SQL query platform (PrismSQL).
- Worked on multiple R&D projects, enhancing the DigiXT product.
- Engineered a new data-loading service using Spring Boot for integration with Iceberg tables.
- Optimized Spark workflows and maintained Iceberg tables to keep large-scale data operations efficient.
- Tools/Technologies: Spark, Iceberg, Kafka, Airflow, MinIO, Trino, NiFi, Superset, PostgreSQL, MySQL, Azure (ADLS Gen2), and AWS (S3, Glue).
Certifications
- AWS Certified: Solutions Architect - Associate
- Microsoft Certified: Azure Data Engineer Associate
- Databricks Certified: Associate Developer for Apache Spark
- CCNA: Switching, Routing, and Wireless Essentials
Technical Skills
- Programming: Python, Java, C/C++, SQL, PHP, JavaScript
- Big Data Tools: Apache Spark, Superset, Kafka, Airflow, NiFi, Trino, Apache Iceberg
- Cloud Platforms: AWS (Redshift, Lambda, Glue, EC2, S3, IAM, EMR, etc.), Azure (Data Factory, Synapse, Data Lake, Microsoft Purview, Azure Databricks, etc.)
- AI & ML: RAG, Milvus (vector database), indexing
- Cybersecurity: Network Security, Data Integrity, System Troubleshooting
- Data Operations: ETL/ELT, Stream Processing, Distributed Computing, Data Warehousing, Data Modeling, Data Profiling, and Data Quality
- Version Control: Git/GitHub