The presentation discusses the keys to a successful SPIRE rollout in production, based on lessons from multiple successful production deployments and the questions most commonly asked in the SPIFFE/SPIRE Slack channels.
- Understand trust boundaries and how they map into SPIFFE trust domains
- Consider how this mapping affects your PKI and where to store keys
- Federation between independent SPIFFE systems can affect performance and bundle size
- How much you invest in building your own system depends on how much trust you place in it
- Consider architecture patterns, deployment models, logging, monitoring, security, availability, and performance topics when moving from proof of concept to production
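
To make the trust-domain point concrete: a SPIFFE ID has the form `spiffe://<trust-domain>/<path>`, so the trust domain is the authority component of the URI. The following is a minimal Python sketch rather than anything shown in the talk. The helper names and the `example.org` domains are hypothetical, and it assumes that each trust boundary (here, production versus staging) has been mapped to its own trust domain.

```python
from urllib.parse import urlparse


def trust_domain_of(spiffe_id: str) -> str:
    """Return the trust domain (URI authority) of a SPIFFE ID string."""
    parsed = urlparse(spiffe_id)
    # A SPIFFE ID must use the "spiffe" scheme and carry a non-empty authority.
    if parsed.scheme != "spiffe" or not parsed.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    return parsed.netloc


def same_trust_domain(a: str, b: str) -> bool:
    """True when both IDs belong to the same trust domain (hypothetical helper)."""
    return trust_domain_of(a) == trust_domain_of(b)


# Hypothetical IDs: one trust domain per environment.
prod = "spiffe://prod.example.org/payments/api"
staging = "spiffe://staging.example.org/payments/api"

print(trust_domain_of(prod))
print(same_trust_domain(prod, staging))
```

Workloads whose IDs fall in different trust domains do not share a trust bundle by default, so letting them authenticate each other would require federation between the two domains.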
The speaker spent a couple of years at ByteDance helping to rebuild its authentication and authorization system using SPIRE, which is now the world's largest deployment, running and scaling beyond one million nodes.
You might have heard about SPIFFE and SPIRE, or you may have already read the specifications and run your first proof-of-concept SPIRE deployment to provide your workloads with X.509 or JWT SVIDs. Maybe you are planning to use SPIRE for advanced use cases such as federating with a cloud service provider's IAM or a third-party service, or for a hybrid deployment. Wherever you are on your journey, you have most likely asked yourself: how do I run SPIRE in production? In this presentation, Eli Nesterov will discuss what it means to run SPIRE in production and how that differs from a POC. We'll go through the different stages, from the most common architecture patterns, deployment models, logging, and monitoring to security, availability, and performance topics. The talk is based on lessons from multiple successful production deployments, the most commonly asked questions in the SPIFFE/SPIRE Slack channels, and hours of video conference talks.