Skywarditsolutions (posted 2mo ago)

USD 180000–200000/yr

Senior Data Scientist

United States, Full-Time, Senior

We are Skyward.

That is, a love for people, for improvement, for human advancement through information technology. We are a people-centered business with a desire to serve others. We are diverse and unified; creative and collaborative; a collection of complementary, not competing talents. And though on the surface we remain relaxed, beneath, a torrent of energy links us to our civic tech mission.

We stand by our values, and we won’t compromise on any of them.

Integrity: We’re conscientious, intentional, and empathetic. Our words and actions align. That’s our character. Please don’t ask us to play another part; we’re poor actors.

Compassionate: If we may borrow a quote from Theodore Roosevelt: “No one cares how much you know until they know how much you care.” Because our team is thoughtful and supportive, caring deeply for each other, our clients, and our work, this comes naturally.

Inquisitive: We remain students by failing openly and turning lessons into solutions.

Unconventional: For us, life isn’t what happens outside of work. Work happens inside of life, and our culture erases the line that often divides the two.

Authentic: Made possible only because we embody the values listed above. We’re relaxed and fun yet intensely curious and driven. Team members are placed with thought, care, and precision to ensure that Trust, Truth, and Transparency continue to represent our brand.

Because of that, we continue Onward, Upward, and Skyward.

We need a Senior Data Scientist.
The kind who looks at a tangle of federal and private datasets that don’t share schemas, don’t share IDs, and were never meant to talk to each other, and gets a little excited. The kind who knows that the answer is almost never “throw a bigger model at it” and is almost always “understand the data first, then pick the model.” The kind who can sit across from a federal subject matter expert and explain what a Leiden community is without making them feel dumb, and without dumbing it down either.

If you’ve ever quietly fixed someone else’s “production” notebook on a Friday afternoon - the one with hard-coded paths, no random seed, and a function called final_FINAL_v3() - this might be you.

Come join us if you're motivated to learn from others, to learn from mistakes, to be part of a future-looking and growth-oriented team.

Let's go Skyward together.

Lead end-to-end data science experiments, from a data readiness assessment through clustering and topological risk modeling to unstructured-data enrichment and entity resolution.

Run exploratory data analyses (EDAs) on government-furnished data inside a government-controlled environment: profile completeness, find the schema mismatches, flag the gaps, and document what the data can and cannot support before a single model gets trained.

Apply graph analytics: Leiden community detection, betweenness and eigenvector centrality, motif analysis, temporal cluster detection, and link prediction. Be able to explain in plain English what each one means and what it doesn’t.

Train interpretable classifiers (logistic regression, gradient boosted trees) and Graph Neural Networks (GraphSAGE, GAT) where the data supports them; reach for unsupervised anomaly detection when labels are thin (and they will be).

Run probabilistic entity resolution across biographic, behavioral, and biometric features using tools such as Senzing. Handle name transliteration, DOB variation, and fuzzy address matching like the working scientist you are.

Apply LLM-based Named Entity Recognition and relationship extraction to unstructured field text and quantify whether the extracted edges actually change the graph (rather than just adding noise that looks impressive in a deck).

Wrangle messy data and recommend supplemental, de-identified data sources that would enrich the analysis, and document the case for each recommendation so the customer’s privacy and legal teams can make an informed call.

Document everything so the customer can rerun it after you’re gone. Reproducibility is a hard acceptance criterion here, not a “nice to have.”

Hold the line on rigor: confidence intervals on everything, null findings documented with the same care as positive ones, and zero appetite for dashboard theater.

An active DHS Public Trust clearance at time of hire.

7+ years of applied data science experience, with at least 2 years that included graph analytics or network analysis as a primary tool, not a side dish.

Strong production-grade Python: pandas, NumPy, scikit-learn, networkx (or graph-tool / igraph), and at least one GNN library such as PyTorch Geometric or DGL.

Real-world experience with probabilistic record linkage / entity resolution - Senzing, dedupe, FEBRL, Magellan, or a homegrown Fellegi-Sunter implementation you’d be willing to defend in a code review.

Comfort working without labels: anomaly detection, positive-unlabeled learning, isolation forests, autoencoders. You know how to evaluate a model when there’s no clean ground truth to compare it to.

Interpretability discipline: SHAP, feature importance, partial dependence plots, and the wisdom to pick the simpler model when it’s the right one.

LLM application experience beyond prompt-and-pray. Entity and relationship extraction on real text, with real evaluation of extraction quality.

Statistical maturity: someone who knows what a power analysis is, why a confidence interval matters, and why p-hacking is bad even when nobody is checking.

Comfort presenting technical findings to non-technical operational stakeholders without losing the nuance.

Reproducibility hygiene to ship code that someone else can actually rerun: version control, parameterized pipelines, deterministic seeds, pinned dependencies, and READMEs that work.

Published research, open-source contributions, or conference talks in entity resolution, graph ML, or applied causal inference.

Knowledge of commercial data sources such as LexisNexis, TransUnion, and Babel Street.

A track record of standing in front of federal subject matter experts and walking them through a null result without flinching.

A reputation among teammates as the person who finds the bug in the pipeline before it hits the deliverable.

Even if you don’t meet 100% of the qualifications, we encourage you to apply. At Skyward, we’re focused on hiring individuals with the right skills and passion to grow, not just checking off every box.

Medical, dental, vision insurance (fully paid for employees)

15 days of paid leave

7 days of sick leave

2 days bereavement leave

11 paid Federal holidays

Up to 40 hours for jury duty

401(k) with 4% employer contribution (and no vesting period)

Up to 4 weeks of paid paternity and maternity leave

Company-provided laptop

$5,000 per year for professional development

$600 per year for technical supplies and equipment

$2,000 referral bonus

Life and disability insurance

HSA and FSA

LegalShield and IDShield voluntary benefits

Opportunity to work in a collaborative, motivated team focused on modernizing government services with cutting-edge technology and innovative solutions. Who says government work can't be exciting?

At Skyward, we support flexible working hours and remote opportunities to help maintain a healthy work-life balance for all employees.

Offers of employment with Skyward are contingent upon acceptable results of a background investigation.

Applicants must have the ability to obtain and maintain a Public Trust security clearance due to the nature of our work as a government contractor.