Synthetic Data Engineer (AI Data/Training)

United States · Boston · mid
Data Engineer · Data

Technical Tools
Airflow, ETL

We are seeking a Synthetic Data Engineer to design and implement domain-specific synthetic data generation pipelines and to maintain high-quality data for model training loops. Your expertise will drive data processing and model training within the organization.

 

Responsibilities

  • Design domain-specific synthetic data generation (SDG) pipelines via self-instruct and constitutional prompting.
  • Implement automated quality scoring and de-duplication systems.
  • Manage data pipelines that feed directly into supervised fine-tuning (SFT) and direct preference optimization (DPO) training loops.

Requirements

  • Proven experience building large-scale data pipelines (Airflow, Spark, Ray).
  • Deep knowledge of prompt engineering for data generation.
  • Familiarity with dataset distillation and bias mitigation.

Location & Eligibility

Where is the job: Boston, United States (on-site at the office)
Who can apply: US
Listed under: United States

Listing Details

Posted: April 24, 2026
First seen: April 24, 2026
Last seen: June 14, 2026

Posting Health

Days active: 50
Repost count: 0
Trust level: 21%
Scored at: June 14, 2026

Hyphen Connect Limited

Web3 and AI talent recruitment agency based in Hong Kong with 700+ placements globally
