As a global leader in cybersecurity, CrowdStrike protects the people, processes, and technologies that drive modern organizations, with a focus on stopping breaches through its AI-native platform. As an Engineer III on the internal SRE team, you'll build and maintain a scalable, highly reliable internal developer platform that enhances engineering productivity. Responsibilities include improving automation, reliability, and observability; supporting CI/CD tools; designing highly available services; responding to incidents; tuning performance; forecasting capacity; and mentoring other engineers. Key duties also involve configuring and optimizing load balancers, databases, and messaging systems.
What You'll Do:
- Build software and systems for platform infrastructure and applications
- Support primary CI/CD build tools including automation for service deployment
- Monitor system health to improve reliability and quality
- Design highly available enterprise-scale services
- Engage with internal customers to develop solutions
- Lead Incident Response and Production Readiness Reviews
- Analyze metrics for performance tuning and root cause analysis
- Forecast resource, capacity, and license needs
- Mentor other engineers
- Configure and optimize load balancers (NGINX, HAProxy, Envoy), databases (relational and non-relational), and key-value stores/message brokers (etcd, Kafka, Redpanda)
What You'll Need:
- Expertise with on-premises and cloud deployment and scaling of CI/CD tools (Bazel, GitHub Actions, Jenkins), IaC tools (Ansible, Chef, Puppet, Salt, Terraform), source code management (Bitbucket, GitLab, GitHub), and monitoring tools (Prometheus/Grafana, Datadog, Honeycomb, New Relic)
- Experience deploying applications on Kubernetes at scale
- 5+ years of experience in large-scale production environments
- Ability to work effectively with local and remote teams
- Strong attention to detail and decision-making skills
- Ability to balance short-term and long-term goals
- Self-directed learning and initiative in fast-paced environments
- Security-first mindset with understanding of cybersecurity principles
Bonus Points:
- Experience integrating AI into workflows
- Familiarity with networking patterns (load balancers, DNS, VIPs, routing, firewall rules)
- Knowledge of multiple cloud providers (AWS, GCP, Azure, Oracle)
- Experience with data science tools (Apache Airflow, Apache Spark)
- Automated reporting experience