tamiwiki:internal:networks:tami_sre [2023/03/05 00:00] – 444b | tamiwiki:internal:networks:tami_sre [2023/05/26 21:51] (current) – removed corshunov | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Tami Site Reliability Engineering ====== | ||
- | This page details our efforts to keep the systems online and reliable. | ||
- | ---- | ||
- | |||
- | Currently, we have a Raspberry Pi in the space that runs a real-time status webpage from UptimeRobot. | ||
- | The status page for our services is here | ||
- | |||
- | |||
- | |||
- | In case the network is not functioning, | ||
- | |||
- | |||
- | ===== Troubleshooting steps ===== | ||
- | The first step is to determine whether the root cause lies in the network or in the infrastructure. | ||
- | Use the [https:// | ||
- | * If everything is down, it is definitely a network issue, but possibly also an infra issue | ||
- | * If the IP address of Tami is reachable (ping and telnet), but the yunohost services are down, it's likely just an infra issue | ||
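The two rules above can be sketched as a small triage helper. This is a minimal sketch and not part of the original page: `tcp_reachable` and `triage` are hypothetical names, and the host and ports to probe are assumptions you have to supply yourself. `tcp_reachable` stands in for the telnet check by attempting a plain TCP connection.

```python
import socket
from typing import Sequence


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Telnet-style check: True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def triage(ip_reachable: bool, services_up: Sequence[bool]) -> str:
    """Classify an outage from the IP check and per-service checks.

    Hypothetical helper: the labels are illustrative, not an official scheme.
    """
    if not services_up:
        return "inconclusive"  # no service checks were supplied
    if not ip_reachable and not any(services_up):
        # Everything is down: a network issue, possibly also infra.
        return "network"
    if ip_reachable and not all(services_up):
        # The IP answers but at least one service does not: likely infra only.
        return "infra"
    if ip_reachable:
        return "ok"
    # IP check failed although some service answered: recheck the probes.
    return "inconclusive"
```

For example, `triage(True, [False, False])` returns `"infra"`, which matches the second rule: the address answers but the services do not.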
- | |||
- | |||
- | ==== Network ==== | ||
- | Relevant Link: [[tamiwiki: | ||
- | |||
- | ==== Infra ==== | ||
- | Relevant Link: [[tamiwiki: | ||
- | === If there is an issue with a single service === | ||
- | * The first step is to see if you can log into [[https:// | ||
- | * Then check the service at [[https:// | ||
- | * Review the logs, restart the service if necessary, and consider sharing the logs via yunopaste with a relevant group in Tami's communication channel | ||
- | === If there is an issue with multiple services === | ||
- | * Attempt the steps above for each service, but if all services are affected, the problem might be related to yunohost or the device it is running on | ||
- | * Try to ssh into yunohost. The password is your yunohost SSO password | ||
- | * ssh < | ||