Responsibilities:
Design, develop, and maintain data processing workflows using Azure Databricks.
Implement scalable and efficient data pipelines to support business analytics and reporting.
Collaborate with data scientists and analysts to deploy machine learning models on Azure Databricks.
Optimize Spark workloads and troubleshoot performance issues within Databricks environments.
Work closely with cross-functional teams to understand business requirements and translate them into technical solutions.
Develop custom Python scripts and applications to enhance data processing capabilities.
Utilize Azure services such as Azure Data Lake Storage, Azure SQL Database, and Azure Key Vault in conjunction with Databricks; a brief illustrative sketch follows this list.
Implement best practices for code versioning, testing, and documentation within the Databricks environment.
Ensure data security and compliance with relevant regulations in Databricks-based solutions.
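For context, day-to-day pipeline work often resembles the minimal PySpark sketch below, runnable as-is in a Databricks notebook. The secret scope ("kv-scope"), storage account ("mydatalake"), container ("raw"), schema ("reporting"), and column names are hypothetical placeholders, not a prescribed setup.

```python
# Minimal ETL sketch for a Databricks notebook (all resource names are assumptions).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# dbutils is injected by the Databricks runtime; this line fails outside Databricks.
# A Key Vault-backed secret scope keeps the storage key out of the notebook source.
storage_key = dbutils.secrets.get(scope="kv-scope", key="adls-access-key")
spark.conf.set("fs.azure.account.key.mydatalake.dfs.core.windows.net", storage_key)

# Read raw CSV files from Azure Data Lake Storage Gen2.
raw = (
    spark.read.option("header", "true")
    .csv("abfss://raw@mydatalake.dfs.core.windows.net/sales/*.csv")
)

# Basic cleansing and aggregation for downstream reporting.
daily = (
    raw.withColumn("amount", F.col("amount").cast("double"))
    .dropna(subset=["order_id", "amount"])
    .groupBy("order_date")
    .agg(F.sum("amount").alias("daily_revenue"))
)

# Persist as a Delta table so analysts and BI tools can query the result.
daily.write.format("delta").mode("overwrite").saveAsTable("reporting.daily_revenue")
```

The secret scope keeps credentials out of version-controlled notebooks, and writing to a Delta table gives downstream consumers an ACID-backed, queryable output.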
Requirements:
Strong proficiency in the Python programming language, with a focus on data processing and analytics.
In-depth experience with Azure Databricks, including cluster management, job execution, and notebook workflows.
Knowledge of big data technologies such as Apache Spark and Spark SQL.
Familiarity with Azure cloud services and the ability to integrate Databricks with other Azure components.
Solid understanding of data engineering concepts, data modeling, and ETL processes.
Experience in performance tuning and optimization of Spark jobs in Databricks; see the tuning sketch after this list.
Ability to work in an agile development environment, following best practices for software development and collaboration.
Strong problem-solving skills and the ability to troubleshoot issues in a complex distributed system.
Excellent communication skills and the ability to convey technical concepts to both technical and non-technical stakeholders.
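As an illustration of the Spark tuning experience expected here, the sketch below shows a few common moves on Databricks: enabling Adaptive Query Execution, broadcasting a small dimension table, and repartitioning by the aggregation key. Table and column names ("sales.orders", "sales.regions", "region_id", "amount") are assumptions for the example.

```python
# Sketch of common Spark tuning techniques (table/column names are illustrative).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# AQE is on by default in recent Databricks runtimes; setting it explicitly
# documents the intent to let Spark adapt shuffle partitions and join strategies.
spark.conf.set("spark.sql.adaptive.enabled", "true")

orders = spark.table("sales.orders")    # large fact table (assumed)
regions = spark.table("sales.regions")  # small dimension table (assumed)

# Broadcasting the small side avoids shuffling the large fact table for the join.
joined = orders.join(F.broadcast(regions), on="region_id")

# Repartition by the aggregation key before the wide transformation to spread
# work evenly across executors and reduce skew.
result = (
    joined.repartition("region_id")
    .groupBy("region_id")
    .agg(F.sum("amount").alias("revenue"))
)

# Inspect the physical plan to confirm the broadcast join was actually chosen.
result.explain(mode="formatted")
```

Broadcasting trades a small amount of driver and executor memory for avoiding a full shuffle of the fact table; AQE can often make this choice automatically, but the explicit hint makes the intent clear in code review.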