insightsoftmax

HPC Infrastructure Engineer

Part-Time / mid
Engineering / Devops Engineer

Insight Softmax is a leading organization in the fields of machine learning, data science, and high-performance computing (HPC). We are dedicated to providing innovative solutions in computing technology to solve complex problems in various sectors including research, science, manufacturing, finance, and data analytics.


Description:


  • We are hiring an HPC Infrastructure Engineer for our engineering team. Within the HPC realm, we design and build CFD (computational fluid dynamics) and simulation services.
  • The ideal candidate will have a strong background in building and maintaining high-performance computing services, expertise in distributed systems, and a passion for systems engineering, scaling, and storage architecture. This role involves working closely with our development teams to design, implement, and optimize HPC solutions that meet our growing needs.
  • This opportunity is unusual in that it sits within our skunkworks team at Insight Softmax. While the exact nature and purpose of this work are confidential, it is an extremely exciting set of projects and objectives that touch the HPC, ML, Simulation, and Quantum swimlanes. Our HPC team is building CFD services and architecture that will integrate directly with ML and Simulation services. You will work with some of the most experienced and talented people on the planet across these swimlanes.
  • The infrastructure team we are putting together needs individuals who focus on delivery and results, are hungry to learn, can adapt, and thrive in an environment where not all of the training or instructions needed to complete deliverables will be provided. You will be given top-level engineering goals and milestones, and you are responsible for figuring out and delivering what is required to get there. Folks with startup experience may be more comfortable in this position, as will people who have a mental itch that can never be fully scratched. While senior-level individuals often bring much-needed experience to the table, we are always open to less-experienced individuals who have attributes well suited to our team culture.
  • Depending on engineering objectives, priorities, and team member skill sets, we may in the future swap team members between responsibilities, or all team up to finish one project more quickly, so you may have the opportunity to work on other projects (ML, Simulation, Quantum, etc.) outside of HPC.


Key Responsibilities:


  • Design, build, and maintain high-performance computing (HPC) infrastructure.
  • Develop and implement scalable distributed systems for complex computing tasks.
  • Scale services and systems from smaller deployments of ~50 nodes into larger clusters of ~250+ nodes.
  • Use orchestration, configuration management, virtualization, Linux, CLI and deployment tools, monitoring, and APM.
  • Monitor HPC system performance and implement improvements to ensure scalability and efficiency.
  • Optimize engineering work and deliverables to balance feature development, product quality, and service reliability.


Qualifications:


  • 3-5 or more years of experience in HPC infrastructure, systems engineering, or a similar role.
  • Strong understanding of systems engineering principles and scaling strategies.
  • Deep knowledge of at least one of AWS, GCP, or Azure, preferably AWS.
  • Strong Linux chops.
  • Experience with data architecture and large-scale data processing.


Optional Experience and Skills:


  • HPC CFD experience.
  • Scaling services and systems from single-region clusters into multi-regional deployments.
  • Experience with clusters of greater than 100 nodes.
  • Building on-prem systems (on-premises, data center, rack-n-stack, etc.).
  • Distributing workloads onto both cloud and on-prem systems.
  • Balancing business needs, product delivery dates, and customer satisfaction.


Bonus Experience and Skills:


  • Backend work like SQL, API development, and serverless.
  • Software engineering.
  • Python and/or other programming languages.
  • Harnessing creativity, generating innovative products and features, and ensuring a delightful customer experience.
  • Using Linux as your primary desktop workstation environment.


Attributes that we value:


  • Transparency
  • Grit
  • Effectiveness
  • Willingness to learn


Location and Hours:


  • Full-time position.
  • Remote work environment.
  • 80% of your work schedule will be during the business time zones for Latin America and Canada.
  • Some customer meetings will occur each week at non-standard / nighttime hours, as we currently support a global team across 3-4 continents.
  • Travel for up to 4 weeks per year to customer destinations.


References:


  • 3 references will be required.


Listing Details:


  • Posted: April 7, 2026
  • First seen: May 21, 2026
  • Last seen: May 24, 2026