TJ CSL
  • TJ CSL
  • Services
    • Ion
      • Development
        • Overview
        • Setup
          • Docker Setup
          • Vagrant Setup
        • Environment
        • Fixtures
        • PR Workflow
        • Style Guide
        • Maintainer Workflow
        • Repository Maintenance
        • Data Generation
      • Production
      • User Experience
        • User Interface
    • Director
      • Development
        • Vagrant Setup
        • PR Workflow
        • Style Guide
        • Maintainer Workflow
      • Production
    • Workstations
    • Signage
      • Setup
      • Administration
      • Monitoring
      • Troubleshooting
      • Experimental
        • IonTap
        • SignageAdmin
    • Remote Access
      • Setup
      • Administration
    • Cluster
      • FAQ
      • Setup
        • SSH Setup
      • Administration
      • Slurm
      • Slurm Administration
      • Borg
    • Printing
      • Setup
      • Troubleshooting
    • WWW
      • Administration
      • Sites
        • Web Proxy
      • Setup
      • Troubleshooting
    • Academic Services
      • Tin
      • Othello
        • Administration
        • Setup
  • Technologies
    • Web
      • Nginx
      • Django
      • PHP-FPM
      • Node.js
      • Supervisord
    • DBs
      • PostgreSQL
      • MySQL
    • Authentication
      • Passcard
        • GPG Usage
      • SSHD
        • SSH Passwordless Login
      • FreeIPA
    • Storage
      • NFS
      • Ceph
        • Setup
        • Backups
        • CephFS
    • Operating Systems
      • Ubuntu Server
      • AlmaLinux
      • Debian
    • Tools
      • Ansible
      • Slack
      • GitBook
      • GitLab
        • Setup
        • Updating
    • Virtualization
      • QEMU/KVM
      • Libvirt
    • Advanced Computing
      • MPI
      • Tensorflow
    • Networking
      • Netbox
      • Cisco
      • Netboot
      • DNS
      • DHCP
      • NTP
      • BGP
    • Mail
      • Postfix
      • Dovecot
    • Monitoring
      • Prometheus
      • Grafana
      • Sentry
      • Uptime Robot
  • Machines
    • VM Servers
      • Utonium
      • Blossom
      • Bubbles
      • Buttercup
      • Antipodes
      • Chatham
      • Cocos
      • Galapagos
      • Gandalf
      • Gorgona
      • Overlord
      • Waverider
      • Torch
    • Ceph
      • Karel
      • Stobar
      • Wumpus
      • Waitaha
      • Barrel
      • Valdes
    • HPC Cluster
      • Zoidberg
    • Borg Cluster
    • Compute Sticks
    • Other
      • ASM
      • Duke
      • Snowy
      • Sauron
      • Sun Servers
        • Altair
        • Centauri
        • Deneb
        • Sirius
        • Vega
        • Betelgeuse
        • Ohare
    • Switches
      • Core0
      • Xnor
      • Xor
      • Imply
    • UPS
    • History
      • 2008 Sun AEG
      • 2011 Sun Upgrades
      • 2017 VM Disaster
      • 2018 Purchases
      • 2018 Cephpocalypse
    • VLANs
    • Remote Management
      • iLO
      • LOMs
    • Understudy
      • Switch Configuration
      • Server Configuration
        • Setting Up the Operating System
        • Network Configuration
        • Saruman
        • Fiordland
  • General
    • Sysadmins List
    • Organization
    • Documentation
      • Security
      • Runbooks
    • Communication
      • Terminology
    • Understudies
    • Account Structure
    • Machine Room
    • Branding
    • History
      • Fridge
      • The Brick
  • Procedures
    • Data Recovery
    • Account Provisioning
    • tjSTAR
      • Tech Support
    • Onboarding
      • New Sysadmin Onboarding
  • Guides
    • VM Creation
    • sshuttle Usage
    • Linux Wifi Setup
    • VNC Usage
    • Password Changes
    • Sun Server RAID Configuration
  • Policies
    • Data Release Policy
    • Upgrade Policy
    • Account Policy
    • Election Policy
  • Obsolete
    • Arcturus
    • Chuku
    • Cray SV1 Supercomputer
    • Ekhi
    • Mihr
    • Moloch
    • Sol
    • Rockhopper
    • Kerberos
    • LDAP
    • Agni
    • Moon
    • Apocalypse
    • AFS
      • OpenAFS
      • Setup
      • Client Setup
      • Administration
      • Troubleshooting
      • Directory Structure
      • Backups
      • Cross-Cell Authentication
    • Observium
    • OpenVPN
Powered by GitBook
On this page
  • Introduction
  • Servers
  • Architecture
  1. Technologies
  2. Storage

Ceph

PreviousNFSNextSetup

Last updated 5 years ago

Introduction

The Computer Systems Lab uses as its main storage infrastructure. We chose to use Ceph because of its scalability, redundancy, and reliability.

Ceph is developed by Red Hat, Inc. and is used by organizations like CERN, Cisco, T-Mobile, and various educational institutions around the world.

Servers

Currently, we have 3 OSD servers and 3 monitor/manager/metadata servers in the Ceph cluster. Each OSD server (, , and ) has 12×4TB HDD and 2×400GB SSDs.

Architecture

A Ceph promo video is available .

provides a good technical overview of Ceph from the Ceph Project Lead.

Ceph is composed of four different daemons: monitors, managers, object storage daemons, and metadata server daemons. The backend storage layer is called the Reliable Autonomous Distributed Object Store (RADOS).

Ceph was developed from the ground up to protect against hardware failures and corruption by distributing data across multiple hard drives hosted on multiple servers. The data written to each hard drive is managed by an object storage daemon (OSD) and the state of the cluster is managed by a set of monitors (mon). Using a state-of-the art algorithm, data is stored in various placement groups (PGs) which are replicated across multiple OSDs and served out to appropriate clients by the monitors.

The data itself is distributed according to the Controlled, Replicated Under Scalable Hashing (CRUSH) algorithm. At least a majority of the monitors must agree to the state of the cluster and resolves differences in the main cluster map. The monitors also store the CRUSH map.

Manager daemons provide an interface for outside systems to access data about the cluster.

Metadata server daemons are responsible for managing metadata for the CephFS file system.

Ceph
karel
wumpus
stobar
here
This video