✤ The Idea: Building a Modern YouTube-like Service from the Ground Up
The goal of the StreamHive project was to design, build, and deploy a complete, end-to-end video
streaming platform, similar in concept to services like YouTube or Vimeo. The core challenge was
not just to build an application that could play videos, but to architect a system using modern,
cloud-native principles. This means the platform is designed from day one to be highly scalable,
resilient to failures, secure by default, and fully automated.
The project encompasses the entire lifecycle of software development: from a developer writing
code, through automated build and deployment pipelines, to a user accessing the final application
securely and efficiently from anywhere in the world.
✤ The Architecture: A Journey Through a Modern Cloud Application
The architecture is layered to separate concerns, ensuring each part of the system is specialized
and efficient.
» 1. The Entry Point: The Edge & API Gateway
Before a user's request ever reaches our application, it passes through the edge layer. We use
Cloudflare as our primary API Gateway and reverse proxy. It acts as the front door,
providing:
‣ Security: DDoS protection and a Web Application Firewall (WAF) to prevent malicious
traffic.
‣ Performance: A global Content Delivery Network (CDN) to cache static assets closer to users,
dramatically speeding up load times.
‣ DNS Management: Securely routing user traffic to our cluster's entry point.
» 2. The Core Infrastructure: Orchestration on the Cloud
The entire application runs on the Azure cloud platform, leveraging its powerful managed
services:
‣ Azure Kubernetes Service (AKS): The heart of our platform. Kubernetes is the orchestrator that
manages our containerized microservices. It handles automatic scaling, healing, and rolling
updates, ensuring the application is always running and available.
‣ Azure Blob Storage: A highly scalable and cost-effective solution for storing large files. We
use it as the definitive storage for all raw and processed video files.
‣ Azure Key Vault: The secure vault for all our application secrets, like database passwords and
API keys. This ensures sensitive data is never hard-coded in our source code.
» 3. The Internal Network: A Secure and Intelligent Service Mesh
Once inside the cluster, all service-to-service communication is managed by the Istio service
mesh. Istio automatically injects a sidecar proxy into each microservice pod, which gives us the
following capabilities without changing any application code:
‣ Zero-Trust Security: Automatic mutual TLS (mTLS) encryption for all internal traffic, meaning
services communicate securely by default.
‣ Advanced Traffic Management: Intelligent routing, circuit breaking, retries, and timeouts,
making the entire system more resilient to failures.
‣ Deep Observability: Istio generates detailed metrics, logs, and traces for every single request,
giving us a complete picture of our system's health.
» 4. The Application Logic: A Suite of Specialized Microservices
The application itself is broken down into small, independent microservices written in Go and
Node.js. Each service has a single responsibility, making them easy to develop, test, and scale
independently.
‣ Frontend Service: The user-facing web application that provides the UI.
‣ Security Service: Handles user authentication and authorization using JWTs.
‣ Upload Service: Manages the initial ingestion and validation of video files.
‣ Transcoder Service: A background worker that processes uploaded videos into different formats
and resolutions for adaptive streaming.
‣ Video Catalog Service: The central API for all video metadata (titles, descriptions, etc.).
‣ Playback Service: Provides the streaming manifests that video players use to stream
content.
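As an illustration of the JWT handling mentioned for the Security Service, here is a minimal Go sketch that issues and verifies an HS256 token using only the standard library. The `Claims` fields, the function names, and the demo secret are assumptions made for this example; the actual service may use a JWT library, a different algorithm, or additional claims.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// Claims is a hypothetical claim set; the real service may carry more fields.
type Claims struct {
	Subject   string `json:"sub"`
	ExpiresAt int64  `json:"exp"` // Unix seconds
}

var (
	ErrMalformed = errors.New("malformed token")
	ErrSignature = errors.New("invalid signature")
	ErrExpired   = errors.New("token expired")
)

// JWTs use unpadded base64url for all three segments.
var b64 = base64.RawURLEncoding

// sign computes the HMAC-SHA256 signature over "header.payload".
func sign(signingInput string, secret []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(signingInput))
	return b64.EncodeToString(mac.Sum(nil))
}

// IssueToken builds a compact HS256 JWT for the given claims.
func IssueToken(c Claims, secret []byte) (string, error) {
	header := b64.EncodeToString([]byte(`{"alg":"HS256","typ":"JWT"}`))
	body, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	input := header + "." + b64.EncodeToString(body)
	return input + "." + sign(input, secret), nil
}

// VerifyToken checks structure, algorithm, signature, and expiry.
// nowUnix is passed in so callers (and tests) control the clock.
func VerifyToken(token string, secret []byte, nowUnix int64) (Claims, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return Claims{}, ErrMalformed
	}
	// Accept only the algorithm this sketch issues.
	var h struct {
		Alg string `json:"alg"`
	}
	hb, err := b64.DecodeString(parts[0])
	if err != nil || json.Unmarshal(hb, &h) != nil || h.Alg != "HS256" {
		return Claims{}, ErrMalformed
	}
	// Constant-time comparison of the expected and presented signatures.
	expected := sign(parts[0]+"."+parts[1], secret)
	if !hmac.Equal([]byte(expected), []byte(parts[2])) {
		return Claims{}, ErrSignature
	}
	var c Claims
	payload, err := b64.DecodeString(parts[1])
	if err != nil || json.Unmarshal(payload, &c) != nil {
		return Claims{}, ErrMalformed
	}
	if nowUnix >= c.ExpiresAt {
		return Claims{}, ErrExpired
	}
	return c, nil
}

func main() {
	secret := []byte("demo-secret") // assumption: a real key would come from Key Vault
	now := time.Now()
	tok, _ := IssueToken(Claims{Subject: "user-42", ExpiresAt: now.Add(time.Hour).Unix()}, secret)
	c, err := VerifyToken(tok, secret, now.Unix())
	fmt.Println(c.Subject, err)
}
```

Verifying the signature before trusting any claim, and rejecting unexpected algorithms, are the two checks that matter most in a sketch like this.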
» 5. The Data and Eventing Layer: Ensuring Consistency and Performance
The microservices rely on a robust data and messaging layer:
‣ PostgreSQL: Our primary relational database for storing all structured data, like user
information and video metadata.
‣ Redis: An in-memory cache used to store frequently accessed data, dramatically reducing database
load and improving API response times.
‣ RabbitMQ: A powerful message broker that enables asynchronous communication. When a video is
uploaded, the system immediately responds to the user while publishing an event to a queue. This
decouples the upload process from the slow transcoding process, creating a responsive and
fault-tolerant system.
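The decoupling described for RabbitMQ can be sketched in Go as follows. To keep the example self-contained, a buffered channel stands in for the RabbitMQ queue; the `VideoUploaded` event, `HandleUpload`, and `RunTranscoder` are hypothetical names, not the project's actual API.

```go
package main

import (
	"fmt"
	"sync"
)

// VideoUploaded is a hypothetical event payload published after an upload.
type VideoUploaded struct {
	VideoID string
	BlobURL string
}

// Queue is an in-memory stand-in for a RabbitMQ queue (assumption for this sketch).
type Queue chan VideoUploaded

// HandleUpload publishes an event and returns immediately, without waiting
// for transcoding. With a channel, the send blocks only if the buffer is full.
func HandleUpload(q Queue, videoID, blobURL string) string {
	q <- VideoUploaded{VideoID: videoID, BlobURL: blobURL}
	return "accepted: " + videoID
}

// RunTranscoder consumes events until the queue is closed, handing each
// one to the supplied transcode function.
func RunTranscoder(q Queue, transcode func(VideoUploaded)) {
	for ev := range q {
		transcode(ev)
	}
}

func main() {
	q := make(Queue, 16)

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		RunTranscoder(q, func(ev VideoUploaded) {
			fmt.Println("transcoding", ev.VideoID, "from", ev.BlobURL)
		})
	}()

	// The uploader gets its response right away; the worker catches up later.
	fmt.Println(HandleUpload(q, "v1", "raw/v1.mp4"))
	fmt.Println(HandleUpload(q, "v2", "raw/v2.mp4"))

	close(q)
	wg.Wait()
}
```

A real broker adds what the channel cannot: events survive process restarts, and the upload and transcoder services can run and scale as separate deployments.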
» 6. The Automation Engine: A Fully Automated CI/CD GitOps Pipeline
The entire process of building and deploying the StreamHive platform is fully automated, following
modern CI/CD and GitOps principles.
‣ Source Control: All application code and Kubernetes configuration are stored in GitHub.
‣ Continuous Integration (CI): We use Azure DevOps to create a CI pipeline that automatically
listens for code changes in GitHub. It builds the code, runs tests, and publishes a versioned
container image to Docker Hub.
‣ Continuous Deployment (CD) with GitOps: The final step of the CI pipeline is to update a
Kubernetes manifest file in a separate Git repository with the new image tag. ArgoCD, our GitOps
tool running in the cluster, detects this change and automatically synchronizes the application,
safely rolling out the new version with zero downtime. This means our Git repository is the single
source of truth for our entire live environment.
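The "update the manifest with the new image tag" step could look roughly like the Go sketch below. It is only an illustration: the project's pipeline may well use a shell command or a tool such as Kustomize instead, and the `SetImageTag` function and the `streamhive/upload` image name are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// SetImageTag rewrites the tag on every "image: <repo>:<tag>" line that
// refers to repo, and reports whether anything changed. It is a text-level
// sketch: quoted values and digest references (repo@sha256:...) are not handled.
func SetImageTag(manifest, repo, newTag string) (string, bool) {
	re := regexp.MustCompile(
		`(?m)^([ \t]*(?:-[ \t]+)?image:[ \t]*` + regexp.QuoteMeta(repo) + `):\S+[ \t]*$`)
	out := re.ReplaceAllString(manifest, "${1}:"+newTag)
	return out, out != manifest
}

func main() {
	// Hypothetical deployment fragment from the GitOps repository.
	manifest := `spec:
  containers:
    - name: upload
      image: streamhive/upload:1.0.0
`
	updated, changed := SetImageTag(manifest, "streamhive/upload", "1.0.1")
	fmt.Println("changed:", changed)
	fmt.Print(updated)
}
```

Once a change like this is committed to the configuration repository, ArgoCD sees the difference between Git and the cluster and rolls out the new image.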
» 7. The Observability Stack: Monitoring, Visualizing, and Alerting
To ensure the platform is running smoothly, we have a comprehensive observability stack:
‣ Prometheus: A time-series database that automatically scrapes and stores the detailed metrics
generated by the Istio service mesh.
‣ Grafana: A powerful visualization tool that connects to Prometheus. We use it to build real-time
dashboards that monitor the health, performance, and error rates of every microservice in the
system.
‣ Secrets Management: The Secrets Store CSI Driver is a key security component that bridges Azure
Key Vault and our Kubernetes pods, securely mounting secrets as files at runtime.
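Because the Secrets Store CSI Driver exposes secrets as files, a service only needs to read a file at startup. The Go sketch below shows one way to do that; the mount path `/mnt/secrets-store` and the secret name `postgres-password` are assumptions, since both are set by the pod's volume configuration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// cleanSecret strips trailing newlines that tools often append to secret values.
func cleanSecret(raw []byte) string {
	return strings.TrimRight(string(raw), "\r\n")
}

// ReadSecret reads one secret from the directory where the CSI volume is mounted.
func ReadSecret(mountDir, name string) (string, error) {
	raw, err := os.ReadFile(filepath.Join(mountDir, name))
	if err != nil {
		return "", fmt.Errorf("read secret %q: %w", name, err)
	}
	return cleanSecret(raw), nil
}

func main() {
	// Hypothetical mount path and secret name; adjust to the pod spec.
	pw, err := ReadSecret("/mnt/secrets-store", "postgres-password")
	if err != nil {
		fmt.Println("secret unavailable:", err)
		return
	}
	// Never log the secret itself; its length is enough for a smoke check.
	fmt.Println("loaded secret of length", len(pw))
}
```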
✤ Lessons Learned
‣ Kubernetes Journey: From Local Setups to a Managed Service
Our initial approach involved deploying on a local Kind Kubernetes cluster, where we faced and
solved several early-stage issues. We then moved to manually installing Kubernetes on Azure VMs.
This process was a significant learning experience, introducing us to complex concepts like CNI
plugins for networking. Ultimately, these experiences highlighted the value of a managed service,
leading us to adopt Azure Kubernetes Service (AKS) for its stability and operational
efficiency.
‣ Edge Security, DNS, and Observability
Using Cloudflare as our DNS and edge security provider was a new and valuable experience. We
learned how to configure security rules, enable DDoS protection, and set up HTTPS for the entire
cluster. On the monitoring front, setting up Grafana dashboards with Prometheus as the data source
was instrumental in learning how to visualize the health and performance of a distributed system
in real time.
‣ Cloud Integration and CI/CD Pipeline Challenges
Working with Azure as a cloud provider exposed us to its powerful ecosystem. We gained hands-on
experience with services like Azure Load Balancers, Azure Blob Storage, and Azure Key Vault. A key
challenge we solved was in our Azure DevOps pipeline, which required setting up a self-hosted
build agent within the same virtual network as our Kubernetes cluster to enable successful
deployments. Furthermore, securely pulling secrets from Key Vault required installing and
configuring the Secrets Store CSI Driver in the cluster.
‣ Service Mesh Implementation and Application Resiliency
To meet industry standards for a modern API gateway, we implemented the Istio service mesh, which
was a valuable and challenging lesson in itself. We also focused on application resiliency by
implementing the Circuit Breaker pattern in our Go and Node.js services to prevent cascading
failures. On the frontend, we adopted the best practice of building our React app into static
files and then serving them efficiently using a lightweight Nginx web server container.
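For readers unfamiliar with the Circuit Breaker pattern, here is a minimal Go sketch of the idea. It is not the project's implementation: the `Breaker` type, its thresholds, and the `ErrBackend` error are illustrative assumptions, and production services often rely on an existing library instead.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

var (
	ErrOpen    = errors.New("circuit breaker is open")
	ErrBackend = errors.New("backend unavailable") // demo error for this sketch
)

// Breaker trips after maxFailures consecutive failures and rejects calls
// until cooldown has elapsed, after which trial calls are let through.
type Breaker struct {
	mu          sync.Mutex
	maxFailures int
	cooldown    time.Duration
	failures    int
	open        bool
	openedAt    time.Time
}

func NewBreaker(maxFailures int, cooldown time.Duration) *Breaker {
	return &Breaker{maxFailures: maxFailures, cooldown: cooldown}
}

// Call runs fn unless the breaker is open and still cooling down.
func (b *Breaker) Call(fn func() error) error {
	b.mu.Lock()
	if b.open && time.Since(b.openedAt) < b.cooldown {
		b.mu.Unlock()
		return ErrOpen // fail fast instead of hitting a struggling dependency
	}
	b.mu.Unlock()

	err := fn()

	b.mu.Lock()
	defer b.mu.Unlock()
	if err != nil {
		b.failures++
		if b.failures >= b.maxFailures {
			// Trip (or re-trip after a failed trial call) and restart the cooldown.
			b.open = true
			b.openedAt = time.Now()
		}
		return err
	}
	// A success closes the breaker and clears the failure count.
	b.failures = 0
	b.open = false
	return nil
}

func main() {
	b := NewBreaker(2, 30*time.Second)
	flaky := func() error { return ErrBackend }
	for i := 1; i <= 3; i++ {
		fmt.Printf("call %d: %v\n", i, b.Call(flaky))
	}
}
```

In this simplified version, several trial calls may pass at once after the cooldown; stricter implementations allow a single probe in the half-open state.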
✤ Technology Stack Summary
‣ Cloud: Azure (AKS, Blob Storage, Key Vault)
‣ Containerization & Orchestration: Docker, Kubernetes
‣ CI/CD & GitOps: Azure DevOps, ArgoCD, GitHub, Docker Hub
‣ Networking & Service Mesh: Cloudflare, Istio
‣ Backend Languages: Go, Node.js
‣ Databases & Caching: PostgreSQL, Redis
‣ Messaging: RabbitMQ
‣ Observability: Prometheus, Grafana
‣ Security: Secrets Store CSI Driver
✤ Architecture Diagrams
StreamHive System - Detailed Architecture
StreamHive System - High-level Architecture
✤ Project Documentation
📒 Download StreamHive Project Documentation (PDF)