To Apply for this Job Click Here
Senior Data Engineer – Clinical & Healthcare Data
Overview
We are seeking a Senior Data Engineer with deep experience working with clinical and real-world healthcare data. This role will focus on building and scaling data pipelines that support analytics, research, and downstream machine learning use cases. The ideal candidate has hands-on experience with OMOP, Databricks, and modern data stacks, and understands the real-world challenges of clinical data harmonization across disparate sources.
Key Responsibilities
- Design, build, and maintain scalable data pipelines for large, complex clinical datasets (EHR, pathology, genomics, etc.)
- Implement and manage data transformations and analytics workflows using Databricks (Spark, Delta Lake)
- Ingest, standardize, and harmonize healthcare data into the OMOP Common Data Model
- Partner with clinical, analytics, and ML teams to ensure data is reliable, well-documented, and fit for downstream use
- Lead data quality, validation, and observability efforts for clinical data pipelines
- Develop data models and schemas that support analytics, research, and ML use cases
- Optimize performance, cost, and reliability across the data platform
- Contribute to best practices around data governance, versioning, lineage, and reproducibility
- Translate data analysis requirements from commercial customers into clinical variables from OMOP, Epic, or other data models
Required Qualifications
- 5+ years of experience as a Data Engineer, with significant experience in healthcare or life sciences
- Strong hands-on experience with Databricks (Spark SQL, PySpark, Delta Lake)
- Deep understanding of OMOP CDM, including:
  - Standard vocabularies (SNOMED, LOINC, RxNorm, ICD, CPT)
  - ETL patterns for clinical data mapping and normalization
- Experience with clinical data harmonization, including:
  - Mapping heterogeneous source systems into a common schema
  - Managing missing, inconsistent, or conflicting clinical data
  - Understanding clinical workflows and data provenance
- Strong cloud experience, preferably in Azure, including Data Factory and other data-related tooling
- Proficiency in Python and SQL
- Experience with modern data stacks, including:
  - Cloud data warehouses or lakehouses (Databricks, Snowflake, BigQuery, Redshift)
  - Orchestration tools (Airflow, Dagster, Prefect)
  - Data transformation frameworks (dbt or equivalent)
- Strong data modeling and analytics engineering skills
Preferred / Nice-to-Have
- Experience working with real-world evidence (RWE), clinical research, or regulatory-facing datasets
- Familiarity with ML or feature engineering pipelines built on clinical data
- Experience supporting downstream LLM, NLP, or ML workloads using healthcare data
- Knowledge of healthcare data standards beyond OMOP (FHIR, HL7)
- Experience operating data systems in HIPAA-compliant environments
Equal Employment Opportunity Statement
Gravity IT Resources is an Equal Opportunity Employer. We are committed to creating an inclusive environment for all employees and applicants. We do not discriminate on the basis of race, color, religion, sex (including pregnancy, sexual orientation, or gender identity), national origin, age, disability, genetic information, veteran status, or any other legally protected characteristic. All employment decisions are based on qualifications, merit, and business needs.