Synthetic Data Engineer (AI Data/Training)


Quick Summary

Key Responsibilities

Design domain-specific synthetic data generation (SDG) pipelines via self-instruct and constitutional prompting. Implement automated quality scoring and de-duplication systems.

Requirements Summary

Proven experience building large-scale data pipelines (Airflow, Spark, Ray). Deep knowledge of prompt engineering for data generation. Familiarity with dataset distillation and bias mitigation.


We are seeking a Synthetic Data Engineer to design and implement domain-specific synthetic data generation (SDG) pipelines and to manage the quality of the data that feeds model training loops. Your work will directly support data processing and model training across the organization.

Responsibilities

  • Design domain-specific synthetic data generation (SDG) pipelines via self-instruct and constitutional prompting.
  • Implement automated quality scoring and de-duplication systems.
  • Manage data pipelines that feed directly into SFT and DPO training loops.
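
For illustration only, here is a minimal Python sketch of what automated quality scoring and de-duplication of synthetic samples could look like. Everything in it is an assumption rather than a description of the employer's stack: the `Sample` type, the hash-based fingerprint, the toy scoring heuristic, and the 0.5 threshold are all hypothetical. A production system would more likely use near-duplicate detection and a learned or model-based scorer.

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical instruction/response pair produced by an SDG pipeline."""
    instruction: str
    response: str


def normalize(text: str) -> str:
    # Lowercase and collapse punctuation/whitespace so trivial variants match.
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def fingerprint(sample: Sample) -> str:
    # Exact-match fingerprint over normalized text (not fuzzy matching).
    key = normalize(sample.instruction) + "\n" + normalize(sample.response)
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def quality_score(sample: Sample) -> float:
    # Toy heuristic in [0, 1]: rewards longer, less repetitive responses.
    words = sample.response.split()
    if not words:
        return 0.0
    length_score = min(len(words) / 20, 1.0)
    diversity = len({w.lower() for w in words}) / len(words)
    return 0.5 * length_score + 0.5 * diversity


def filter_samples(samples, min_score=0.5):
    # Drop repeated fingerprints first, then drop low-scoring samples.
    seen = set()
    kept = []
    for sample in samples:
        fp = fingerprint(sample)
        if fp in seen:
            continue
        seen.add(fp)
        if quality_score(sample) >= min_score:
            kept.append(sample)
    return kept
```

In this sketch, de-duplication runs before scoring so that repeated samples are never scored twice; the surviving samples would then be the ones passed on to SFT or DPO training data.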

Requirements

  • Proven experience building large-scale data pipelines (Airflow, Spark, Ray).
  • Deep knowledge of prompt engineering for data generation.
  • Familiarity with dataset distillation and bias mitigation.

Location & Eligibility

Where is the job: Singapore (on-site within the country)
Who can apply: SG
Listed under: Singapore

Listing Details

Posted: April 24, 2026
First seen: April 24, 2026
Last seen: May 2, 2026

Posting Health

Days active: 8
Repost count: 0
Trust Level: 35%
Scored at: May 3, 2026

Hyphenconnect
greenhouse

Web3 and AI talent recruitment agency based in Hong Kong with 700+ placements globally
