My infrastructure: Ridge

InfrastructureMachine LearningDevOpsSelf-Hosted

Ridge is a self-hosted, reproducible, and SSH-accessible infrastructure designed for end-to-end Data Science and AI project management.

Built on an optimized Ubuntu 24.04 layer via Ansible, it handles filesystem provisioning (SSD+HDD) and security through nftables and Fail2Ban. The core architecture is fully containerized:

Networking: Traefik as a reverse proxy with dynamic routing and automatic HTTPS.
ML Core: MLflow for experiment tracking, MinIO (S3-compatible) for artifacts, and FastAPI for the initial serving layer.
Dev Experience: Per-user VS Code containers with native NVIDIA GPU access.
DevOps & CI/CD: Gitea-hosted repositories with integrated Actions runners for automation.
Ecosystem: Seamless integration between Dev/DevOps components and Hugging Face repositories, partially embedded into personal web projects.
Monitoring: Real-time system, container, and GPU performance tracking via Prometheus and Grafana.
Collaboration: Project management via OpenProject, internal communication through Mattermost, documentation on MediaWiki, and quick note-taking with Memos.

Part of the playbook available on GitHub


← Back to project list