We run a High-Performance Computing (HPC) platform on AWS with many additional open-source technologies and middleware. All our systems run in the cloud, so we always think cloud-first! Our team uses a mix of Linux and some Windows. We are trying to remove every barrier that would keep the product team from executing faster than our competitors and releasing a clean, quality product. This means supporting and testing our full stack in a public cloud environment, along with distributed schedulers, logging solutions, metrics, storage archiving, and optimization of HPC application cost and performance.
We are looking for a System/Software Engineer (Networking) with strong knowledge of networking concepts and of the design and development of complex distributed systems and/or high-performance computing services. Your primary responsibility will be to help design and develop software to run network simulations using the ns-3 framework.
Develop and maintain Scala’s simulation methodology for performance modeling of networks and devices, including queueing and packet-processing models. Incorporate functional models into performance models.
Identify bottlenecks and bugs, and devise solutions to these problems.
Operate as a self-driven team player, at times independently and with minimal direction, and at times collaborating closely with co-workers and customer engineers.
Own device and system models for large-scale distributed applications such as deep learning.
Specify the methodology and software required to exercise the models: log results, perform regression testing, or correlate against real-life systems.
Leverage simulation efforts for customer validation by adjusting the model to per-customer variants, and drive the evolution of the models to achieve both customer and internal development goals.