Job Description

Title: Data Platform Engineer: Cloud Ops + Data Ops

Location: Dallas, TX (Hybrid, 3 days/week; local candidates only)

Term: Contract

Mandatory Skills: Data Platform Engineer, PySpark, AWS, DevOps, Kubernetes, Databricks administration

Job Description:

As a Data Platform Engineer, you will be responsible for the design, development, and maintenance of our high-scale, cloud-based data platform, treating data as a strategic product. You will lead the implementation of robust, optimized data pipelines using PySpark and the Databricks Unified Analytics Platform, leveraging its full ecosystem for Data Engineering, Data Science, and ML workflows. You will also establish best-in-class DevOps practices using CI/CD and GitHub Actions to ensure automated deployment and reliability. This role demands expertise in large-scale data processing and a commitment to modern, scalable data engineering and AWS cloud infrastructure practices.

Key Responsibilities

Platform Development: Design, build, and maintain scalable, efficient, and reliable ETL/ELT data pipelines to support data ingestion, transformation, and integration across diverse sources.
Big Data Implementation: Serve as the subject matter expert for the Databricks environment, developing high-performance data transformation logic primarily using PySpark and Python. This includes utilizing Delta Live Tables (DLT) for declarative pipeline construction and ensuring governance through Unity Catalog.
Cloud Infrastructure Management: Configure, maintain, and secure the underlying AWS cloud infrastructure required to run the Databricks platform, including virtual private clouds (VPCs), network endpoints, storage (S3), and cross-account access mechanisms.
DevOps & Automation (CI/CD): Own and enforce Continuous Integration/Continuous Deployment (CI/CD) practices for the data platform. Specifically, design and implement automated deployment workflows using GitHub Actions and modern infrastructure-as-code concepts to deploy Databricks assets (Notebooks, Jobs, DLT Pipelines, and Repos).
Data Quality & Testing: Design and implement automated unit, integration, and performance testing frameworks to ensure data quality, reliability, and compliance with architectural standards.
Performance Optimization: Optimize data workflows and cluster configurations for performance, cost efficiency, and scalability across massive datasets.
Technical Leadership: Provide technical guidance on data principles, patterns, and best practices (e.g., Medallion Architecture, ACID compliance) to promote team capabilities and maturity. This includes leveraging Databricks SQL for high-performance analytics.
Documentation & Review: Draft and review architectural diagrams, design documents, and interface specifications to ensure clear communication of data solutions and technical requirements.

Required Qualifications

Experience: 5+ years of professional experience in Data Engineering, focusing on building scalable data platforms and production pipelines.
Big Data Expertise: 3+ years of hands-on experience developing, deploying, and optimizing solutions within the Databricks ecosystem. Deep expertise required in:
Delta Lake (ACID transactions, time travel, optimization).
Unity Catalog (data governance, access control, metadata management).
Delta Live Tables (DLT) (declarative pipeline development).
Databricks Workspaces, Repos, and Jobs.
Databricks SQL for analytics and warehouse operations.
AWS Infrastructure & Security: Proven, hands-on experience (3+ years) with core AWS services and infrastructure components, including:
Networking: Configuring and securing VPCs, VPC Endpoints, Subnets, and Route Tables for private connectivity.
Security & Access: Defining and managing IAM Roles and Policies for secure cross-account access and least-privilege access to data.
Storage: Deep knowledge of Amazon S3 for data lake implementation and governance.
Programming: Expert proficiency (4+ years) in Python for data manipulation, scripting, and pipeline development.
Spark & SQL: Deep understanding of distributed computing and extensive experience (3+ years) with PySpark and advanced SQL for complex data transformation and querying.
DevOps & CI/CD: Proven experience (2+ years) designing and implementing CI/CD pipelines, including proficiency with GitHub Actions or similar tools (e.g., GitLab CI, Jenkins) for automated testing and deployment.
Data Concepts: Full understanding of ETL/ELT, Data Warehousing, and Data Lake concepts.
Methodology: Strong grasp of Agile principles (Scrum).
Version Control: Proficiency with Git for version control.

Preferred Qualifications

AWS Data Ecosystem Experience: Experience with AWS cloud-native data services, such as AWS Glue, Amazon Athena, Amazon Redshift, Amazon RDS, and Amazon DynamoDB.
Knowledge of real-time or near-real-time streaming technologies (e.g., Kafka, Spark Structured Streaming).
Experience in developing feature engineering pipelines for machine learning (ML) consumption.
Background in performance tuning and capacity planning for large Spark clusters.

Job Tags

Contract work, Local area, 3 days per week

Data Platform Engineer: Cloud Ops + Data Ops Job at Net2Source (N2S), Dallas, TX
