Bain & Company is one of the world's leading global business consulting firms, serving clients across six continents. It was founded in 1973 on the principle that consultants must measure their success in terms of their clients' financial results. Bain's clients have outperformed the stock market 4 to 1. With offices in all major cities, Bain has worked with over 4,150 major multinational and other corporations from every economic sector, in every region of the world.
A career at Bain & Company offers the opportunity to learn in a collaborative team environment and to drive impact in support of our 'Results' mission. The firm has a passionate, rich culture and offers an unparalleled business experience that can carry throughout a career. We hire dynamic individuals who are dedicated to achieving both personal and professional goals.
Bain's Global Engineering leads the firm's software development efforts and defines engineering standards for Bain globally. The team ships software solutions to address client and internal needs, ranging from iterative prototypes to enterprise-grade production software.
You will solve cutting-edge problems for a variety of industries as a software engineer specializing in Platform Infrastructure and DevOps. As a member of a diverse engineering team, you will participate in the full engineering life cycle, which includes designing, developing, optimizing, and deploying new machine learning solutions and infrastructure at the production scale of the world's largest companies.
Core Responsibilities and Requirements
- Partner with Data Science, Data Engineering, and Machine Learning Engineering teams to develop and deploy production-quality code
- Develop and champion modern infrastructure concepts to technical audiences and business stakeholders
- Implement new and innovative deployment techniques, tooling, and infrastructure automation within Bain and for our clients
- This position will be located in Palo Alto, Los Angeles, Boston, Dallas, Austin, Seattle, or remotely
- Travel is required (~20%)
Build and deploy highly available, scalable, and fault-tolerant platforms to run production applications that solve business problems:
- Understand the needs and challenges of a client across operations and development, and then formulate solutions that advance their business and technical goals.
- Develop solutions encompassing technology, process, and people for:
  - Continuous Delivery
  - Infrastructure strategy & operations
  - Build and release management
- Work closely with development teams to ensure that solutions are designed with customer user experience, scale/performance, and operability in mind.
Develop infrastructure and a deployment platform to enable production data science and machine learning engineering development:
- Participate in the full software development life cycle including designing distributed systems, writing documentation and unit/integration tests, and conducting code reviews.
- Develop and improve infrastructure including CI/CD, microservice frameworks, distributed computing, and cloud infrastructure needed to support this platform.
Provide technical guidance to external clients and internal stakeholders in Bain:
- Explore new technical innovations in machine learning and data engineering to improve customer results.
- Advise and coach engineering teams on technology stack best practices and operational models to raise their DevOps capabilities.
- 4+ years of experience using one of the following IaC frameworks: CloudFormation or Terraform
- 4+ years of experience working with Docker containers
- 4+ years of experience working in public cloud environments (AWS, GCP, or Azure), with a deep understanding of failover, high availability, scalability, and security
- 2+ years of experience with Unix/Linux system administration and scripting
- 2+ years of experience administering and managing Kubernetes clusters (EKS, GKE, or AKS) and Helm (optional)
- 2+ years of experience programming with Python, C/C++, Java, Go, or a similar programming language
- 2+ years of experience with authentication mechanisms including LDAP, Active Directory, and SAML
- One or more configuration management tools: Ansible, Salt, Puppet, or Chef
- One or more monitoring and analytics platforms: Grafana, Prometheus, Splunk, Sumo Logic, New Relic, Datadog, CloudWatch, Nagios
- CI/CD deployment pipelines: Jenkins, Travis CI, GitLab CI, AWS CodePipeline
- Version control and git workflows
- HashiCorp Vault and integrating it with Kubernetes for secret management
- Deploying end-to-end logging solutions such as the EFK stack
- Deploying Prometheus and various exporters (PostgreSQL, Elasticsearch, etc.)
- Hadoop framework
- Distributed databases and query languages such as SQL or HQL: Hive, Aster Data, Greenplum, Cassandra, Vertica, Amazon Redshift, Snowflake
- Developing frameworks, platforms, APIs
- Developing and maintaining rigorous technical documentation and runbooks
- Collaborating with the Networking and Security infrastructure teams to achieve and maintain baseline security standards
- Serverless frameworks
- Agile development methodology
- Grafana dashboards