James Tombleson
d14c2aaa2c
* docker tests are looking good and nfs is able to connect and containers can talk to each other.
* Added pihole support for a new vm
* pihole is not working yet via docker. Installed it by hand without ansible for now.
* added some docker related tasks and working on collins now to see how to use it.
* forgot to push some changes... kube didn't work out as it adds too much overhead for what I need.
* added two roles to help working with backup and restore of docker volume data.
* did some cleanup on old roles.
* pushing for axw testing
* moving to requirements.yml. adding cron jobs for maint.
* roles are being moved out of this repo. Roles are handled by requirements.yml going forward. Dev roles are still in the repo but if they stick around a new repo will be made for it.
* Made a bunch of changes
* fixed a problem
* Added a playbook to deploy grafana and added prometheus role to monitor things.
* Updated cron to test
* Updated cron to test
* Updated cron
* updated discord_webhook and now testing if cron will pick up the changes.
* Fixed plex backup for now.
* docker updates and working on nginx
* pushing pending changes that need to go live for cron testing
* fixed debug roles and updated discord test
* fixed debug roles and updated discord test
* Disabling test cron
* it's been a while... I am not sure what I have done anymore but time to push my changes.
* added newsbot configs, added to jenkins, starting to migrate to collections.
* Updated inventory to support the network changes
* jenkinsfile is now working in my local setup.
* node2 is unhealthy and is removed from inv. I was doing something to this box months ago, but now I don't remember what it was.
* updated images and adding them to jenkins for testing
* removed the old image files and moved to my public image
* Jenkins will now inform discord of jobs. Added post tasks. Added mediaserver common.
* updated the backend update job and adding a jenkins pipeline to handle it for me.
* updated the backup job again
* Updated all the jenkins jobs. Added a jenkins newsbot backup job. Adjusted newsbot plays to add backup and redeploy jobs.
* updated newsbot backup playbook to make older backup files as needed.
* Added debug message to report in CI what version is getting deployed.
* I did something stupid and this device is not letting me login for now.
* removing twitter source for now as I found a bandwidth related bug that won't get pushed for a bit
* Adding a bunch of changes, some is cleanup and some are adds
* updated the images
* updated the kube common playbook
* Started to work on ceph, stopped due to hardware resources, updated common, added monit, and starting to work on a playbook to handle my ssh access.
* Added a role to deploy monit to my servers. Still needs some more updates before it's ready
* Here is my work on ceph, it might go away but I am not sure yet.
* Starting to migrate my common playbook to a role, not done yet.
* updated kube and inventory
* updated gitignore
89 lines
1.6 KiB
YAML
---
# defaults file for jtom38/monit

monit_slack_alert_script: /etc/monit/scripts/slack.sh
monit_discord_alert_script: /etc/monit/scripts/discord.sh

monit_global:
  check_interval: 120
  # NOTE: log_file is set to the same path as statefile below; a dedicated
  # log path (for example /var/log/monit.log) is probably what was intended.
  log_file: /var/lib/monit/state
  statefile: /var/lib/monit/state
  eventqueue:
    basedir: /var/lib/monit/events
    slots: 100

monit_alert_slack:
  deploy: false
  webhook_token: ''
  slack_instance: ''
  channel: "#alerts"
  username: "Monit"

monit_alert_discord:
  deploy: false
  webhook: ''
  username: 'Monit'

monit_http:
  port: 2812
  username: admin
  password: monit

monit_processes:
  - name: ssh
    pidfile: '/var/run/sshd.pid'
    matching: ''
    start: '/bin/systemctl start ssh'
    stop: '/bin/systemctl stop ssh'
    timeout: '30 seconds'
    when:
      - type: 'totalmem'
        usage: '> 80%'
        cycles: 1
        alert: false
        exec: "{{ monit_discord_alert_script }}"

monit_filesystems:
  - name: root
    path: /
    when:
      - usage: '> 80%'
        tries: 1
        cycles: 1
        alert: false
        exec: ""

monit_system:
  hostname: "{{ ansible_hostname }}"
  when:
    - type: cpu
      usage: "usage (user) > 80%"
      cycles: 5
      alert: false
      exec: ""
    - type: cpu
      usage: "usage (system) > 30%"
      cycles: 5
      alert: false
      exec: ""
    - type: cpu
      usage: "usage (wait) > 20%"
      cycles: 5
      alert: false
      exec: ""

    - type: memory
      usage: "usage > 90%"
      cycles: 5
      alert: false
      exec: ""
    - type: swap
      usage: "usage > 25%"
      cycles: 5
      alert: false
      exec: ""
    - type: "loadavg(5min)"
      usage: "> 1"
      cycles: 5
      alert: false
      exec: ""
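
# Example (a sketch, not part of the role's shipped defaults): overriding these
# variables from a playbook, group_vars, or host_vars. The nginx service, its
# paths, and the vault_discord_webhook variable are hypothetical placeholders.
# With Ansible's default hash_behaviour (replace), redefining a dictionary or
# list replaces it entirely, so restate every key or entry you still want
# (redefining monit_processes as below would drop the default ssh check).
#
# monit_alert_discord:
#   deploy: true
#   webhook: "{{ vault_discord_webhook }}"  # hypothetical vaulted variable
#   username: 'Monit'
#
# monit_processes:
#   - name: nginx
#     pidfile: '/var/run/nginx.pid'
#     matching: ''
#     start: '/bin/systemctl start nginx'
#     stop: '/bin/systemctl stop nginx'
#     timeout: '30 seconds'
#     when:
#       - type: 'totalmem'
#         usage: '> 80%'
#         cycles: 1
#         alert: false
#         exec: "{{ monit_discord_alert_script }}"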