Talent Network: Lead Data Scientist
Toptal is a global network of top talent in business, design, and technology that enables companies to scale their teams on demand. With $200+ million in annual revenue and team members based around the globe, Toptal is the world’s largest fully remote workforce.
We take the best elements of virtual teams and combine them with a support structure that encourages innovation, social interaction, and fun. We see no borders, move at a fast pace, and are never afraid to break the mold.
We are looking for a Senior Data Scientist to join us as the first Data Scientist on a new product we are building. This is a founding role: you will shape the data science function from the ground up, set technical direction, and own the end-to-end delivery of intelligent systems that define how our product creates value. You will tackle open-ended problems involving Task Mining, Process Mining, behavioral workflow analysis, pattern discovery, predictive modeling, and applied GenAI/ML systems. The goal is not just to build models, but to turn raw interaction data into measurable product and business impact: discovered workflows, bottlenecks, optimization opportunities, and scalable foundations for future DS/ML work.
This is a remote position. We do not offer visa sponsorship or assistance. Resumes and communication must be submitted in English.
Responsibilities
- Act as the founding Data Scientist on the product: define the DS strategy, choose the right tools and frameworks, and establish best practices.
- Design and build Task Mining and Process Mining solutions that transform raw interaction data into discovered workflows, patterns, bottlenecks, and optimization opportunities.
- Design, develop, and deploy ML systems and data pipelines for large-scale structured, unstructured, and event/interaction data.
- Build predictive and pattern-discovery solutions using supervised and unsupervised learning, representation learning, sequence modeling, and LLM/GenAI approaches where appropriate.
- Establish practical foundations for dataset construction, labeling strategy, offline/online evaluation, monitoring, feedback loops, and human-in-the-loop review where needed.
- Own projects end-to-end, from problem framing and experimentation through production deployment and iteration. Collaborate closely with engineering on data instrumentation, pipeline design, deployment, and integration of production-ready services.
- Communicate findings, tradeoffs, and technical concepts effectively to both technical and business stakeholders.
Requirements
- 5+ years of professional experience in Data Science, Machine Learning, or Applied ML roles.
- Demonstrated experience operating as the sole or lead Data Scientist on a product or team — owning problems end-to-end without senior DS supervision.
- Strong experience with supervised and unsupervised ML, modern ML/data tooling, and the judgment to select the right approach for the problem.
- Practical familiarity with representation learning, sequence modeling, Transformers, LLMs, or GenAI systems where relevant to product use cases.
- Experience handling large-scale structured, unstructured, event, or interaction datasets.
- Advanced proficiency in Python and SQL, with hands-on experience using tools such as PyTorch, scikit-learn, pandas/Polars, experiment tracking, and production ML workflows.
- Experience deploying ML models, data pipelines, or intelligent systems into production.
- Familiarity with Task Mining, Process Mining, event-log analysis, behavioral analytics, workflow automation, or adjacent domains.
- Advanced degree in Computer Science, Data Science, AI, Statistics, Mathematics, or a related field is a plus; equivalent practical experience is strongly valued.
- A founder’s mindset: full responsibility for outcomes, not just deliverables.
- Comfort operating in high ambiguity: able to turn unclear product goals, noisy data, and incomplete requirements into an executable roadmap.
- Strong business sense — connects technical work to commercial impact and measurable product value.
- Pragmatic technical judgment — knows when to use advanced ML, when to simplify, and when better data, labeling, or evaluation is the real bottleneck.
- Ability to build foundations for rapid scaling: reusable datasets, pipelines, metrics, evaluation frameworks, and modeling patterns future DS/ML hires can build on.
- Highly proactive problem solver who acts without waiting for detailed instructions.
- Excellent communication skills, with the confidence to push back constructively and propose direction.
Nice to Have
- Previous experience as a first or early Data Scientist at a startup or new product line.
- Direct experience with Task Mining, Process Mining, workflow intelligence, RPA, or productivity analytics.
- Experience with LLMs and Generative AI applications, especially evaluation, structured outputs, semantic labeling, summarization, or human-in-the-loop workflows.
- Experience working with privacy-sensitive behavioral, productivity, or user-interaction data.
- Experience with product experimentation, causal inference, or measuring the impact of workflow/process interventions.
- Knowledge of MLOps and distributed processing frameworks, such as Spark.
- Experience with cloud environments, especially GCP.
Listing Details
- Posted: May 22, 2026
- First seen: May 22, 2026
- Last seen: May 23, 2026
Posting Health
- Days active: 0
- Repost count: 0
- Trust Level: 76%
- Scored at: May 22, 2026