Work among the Sysadmins is split among a variety of teams, each working on a specific area of the Lab.
The Ion team is responsible for the administration, maintenance, and development of Ion.
The Director team is responsible for the administration, maintenance, and development of Director. They are also responsible for ensuring the high availability of websites hosted on Director.
The Web Services (or WWW) team is responsible for maintaining the web presence of the Lab not supervised by other teams. This includes tjhsst.edu and sysadmins.tjhsst.edu, as well as any *.tjhsst.edu domains not supervised by other teams or by the Infrastructure Lead. They also manage TJ's proxy configuration file.
The Mail team is responsible for maintaining TJ's mail servers, list servers, and webmail clients (shared with Web Services).
The Signage team is responsible for maintaining TJ's Signage displays. They work closely with the Ion team in this regard.
The Networking team is responsible for managing the CSL's network infrastructure, including switches, network connections, OpenVPN, NTP, DNS, and DHCP. They are responsible for the smooth flow of network traffic. They are also the points of contact when diagnosing network connections on CSL systems.
The Monitoring team is responsible for observability in the CSL, including logging, alerts, and metrics. They are responsible for maintaining systems that provide monitoring capability such as Grafana and Prometheus.
The Documentation team is responsible for accurate, comprehensive, and well-written documentation for the Sysadmins. They assist and strongly encourage other teams in documenting everything in both our Runbooks and this Docsite.
The Academic Services team is responsible for maintaining software that is used by TJHSST classes. This includes Othello, the TJHSST AI Grader, and Tin. Due to the presence of many services, there may be a sub-team for each service.
The Printing team is responsible for printing operations in the Lab, including the CUPS server and the printers.
The Cluster team is responsible for maintaining TJ's clusters (Borg and HPC).
The Advanced Computing Hardware team is responsible for the maintenance of hardware within the Lab used for advanced computing, including GPUs.
The Understudy Coordinator is responsible for leading the Understudy program. The Coordinator is primarily responsible for planning the structure and activities of the Understudy program.
The Infrastructure Lead is one of the Lead Sysadmins who is responsible for broadly supervising all facets of the Lab's infrastructure.
The Infrastructure Lead is also responsible for:
prioritizing work among the Sysadmins
allocating work among the teams
ensuring work is done in a timely manner
ensuring best security practices and policies in the Lab
setting abuse guidelines
spearheading automation efforts
maintaining the GitLab issue tracker
The Infrastructure Lead provides recommendations and feedback on changes to the Lab's architecture and on other substantial technical changes.
The Infrastructure Lead is NOT a person who takes on all responsibility. Instead, the Lead delegates work, authority, and responsibility to other teams and people.
Has an extraordinary knowledge of the Lab and the relationship between its software, services, and technologies
Has a broad range of expertise working with various aspects of the Lab's infrastructure
Has shown an extraordinary level of dedication to the program and its mission/values
CSL architectural decisions
The Lead Sysadmins make up the Sysadmin Leadership Team together with the Faculty Sponsor and are the final decision-makers in the Sysadmins. They make the final call with respect to team organization/membership, access requests, and all decisions related to the Lab.
They are appointed by the outgoing Lead Sysadmins with approval from the Faculty Sponsor.
In another sense, the Lead Sysadmins are the Presidents. They may appoint Junior Lead Sysadmins (Vice Presidents), if those people are expected to become Lead Sysadmins in the next year.
Senior Sysadmins are sysadmins who are seniors. Being a senior does not, by itself, confer any additional rights or responsibilities. Instead, by virtue of having served in the Lab for a long time, they often have the most experience in a specific area and offer a valuable perspective.
Leads are the Directly Responsible Individuals by default on a team. They serve as the primary point of contact for the team. If there is an incident relating to their team, the Lead(s) must report it.
stay apprised of their team's work
have extensive knowledge of the team's functional area
supervise the work done by their team members
report on their work to the broader Sysadmin team
Apple coined the term "directly responsible individual" (DRI) to refer to the one person with whom the buck stopped on any given project. The idea is that every project is assigned a DRI who is ultimately held accountable for the success (or failure) of that project.
They likely won't be the only person working on their assigned project, but it's "up to that person to get it done or find the resources needed."
... What's most important is that they're empowered.
The Deputy (or Deputy Lead) is a backup to the Lead(s) and defers to their opinion. If the Lead(s) is/are not available, the Deputy should be able to temporarily take over. A Deputy is only appointed if the Sysadmin has demonstrated the competence and trustworthiness that would already qualify them to be a Lead.
If there is no Deputy, a team has a Backup, who can step in for a Lead in the Lead's absence. The Backup is generally a Lead Sysadmin or a Sysadmin who has previously led that team.
Team members are people who significantly contribute to a team's goals and operate under the direction of the Team Lead. Passive involvement does not make someone a team member.