Gain experience in monitoring your SAP environment

August 13, 2024 | By: Thomas Schlosser

Our best practice paper "Infrastructure monitoring for SAP Systems" does a really good job of providing detailed information on how to install and customize SUSE Linux Enterprise Server for SAP Applications to monitor metrics. It provides a lot of insights that can help increase the uptime of critical SAP applications.

For people without any experience in monitoring, however, it is challenging to learn all the different components and how they fit together. We all know that the best way to learn is by using things in practice. But getting to the point where playing with all the components is really fun often takes longer than we want to invest.

This is where our Ansible playbook for monitoring comes into play. The playbook deploys most of the monitoring solutions discussed in the best practice paper mentioned above. This means that by changing only a few parameters, a complete monitoring infrastructure can be rolled out.

The components are:

Grafana Dashboard
Prometheus Server
Prometheus Node Exporter
Prometheus Alertmanager
Collectd Exporter
PCM Exporter (only available on SLES for SAP Applications)
Loki Server
Promtail Agent

Get ready for the adventure – We prepare everything

For playing around in particular, it is also possible to have everything in one small virtual machine. In our example we use a virtual machine with 4 CPUs and 4 GB of RAM, on which we install openSUSE Leap 15.6. It is of course also possible to use SUSE Linux Enterprise Server (a compatibility list can be found in the README of the Ansible playbook). Before we run the playbook, a little preparation is needed.

1. The SSH key

Our Ansible playbook of course needs access to the systems we want to deploy to. So the first step is to create a private and public SSH key pair and copy the public key to those systems.

ansible:~ # ssh-keygen
ansible:~ # ssh-copy-id root@<vm01>

Note: When creating the key, it is important NOT to enter a passphrase. Otherwise we have to provide the passphrase for every single Ansible task.
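
For scripted setups, the key pair can also be created without any prompts. The following is a minimal sketch, assuming OpenSSH's ssh-keygen is installed. The function names, the ed25519 key type and the file path are illustrative choices, not something the playbook requires.

```shell
# Create a passphrase-less key pair without any interactive prompts.
# Assumption: the key type (ed25519) is an illustrative choice.
make_deploy_key() (
    key_file="$1"
    # -N "" sets an empty passphrase, -q suppresses the progress output
    ssh-keygen -q -t ed25519 -N "" -f "$key_file"
)

# Succeed only if the private key can be read with an empty passphrase.
key_has_no_passphrase() (
    # -y prints the public key of a private key, -P supplies the passphrase
    ssh-keygen -y -P "" -f "$1" >/dev/null 2>&1
)
```

It could be used as `make_deploy_key ~/.ssh/id_ed25519`, followed by `ssh-copy-id -i ~/.ssh/id_ed25519.pub root@<vm01>`. The target file should not exist yet, because ssh-keygen asks before overwriting an existing key.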

2. Clone the GitHub repository

The next step is of course getting the code. To do this we can simply clone the Ansible playbook from GitHub.

ansible:~ # git clone https://github.com/SUSE/SLES4SAP-sap-infra-monitoring-ansible-playbook.git

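3. Install Ansible

We use pip3 to install Ansible (please keep in mind that it is recommended to use Ansible version < 2.12 and Python version 3.8 or newer). The requirements.yaml file is part of the cloned repository, so the ansible-galaxy command needs to be run from inside the repository directory.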

ansible:~ # pip3 install ansible
ansible:~ # ansible-galaxy collection install -r requirements.yaml
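
Before installing, it can help to confirm that the local Python meets the minimum version mentioned above. This is a minimal sketch in plain POSIX shell. The function names are hypothetical, and the 3.8 threshold is simply taken from the recommendation in this post.

```shell
# Succeed if version $1 (major.minor[.patch]) is at least $2 (major.minor).
version_at_least() (
    have_major=${1%%.*}
    rest=${1#*.}
    have_minor=${rest%%.*}
    want_major=${2%%.*}
    rest=${2#*.}
    want_minor=${rest%%.*}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
)

# Succeed if the python3 found in PATH is version 3.8 or newer.
python_is_new_enough() (
    v=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])') || exit 1
    version_at_least "$v" 3.8
)
```

Running `python_is_new_enough || echo "Python is too old"` before `pip3 install ansible` gives an early hint if the interpreter needs to be updated first.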

Ingredients for the cake – The configuration

Now that we’ve prepared the environment we need to change some configuration. The main “ingredients” are simply our hosts. They can be added to the inventory file.

There are 3 sections:

1. Deployment variables

The first one defines what we would like to deploy. If we want everything we can get, there is nothing to do here.

2. Server Host Groups

This section defines where the server services should run. Server services are, for example, the Grafana dashboard or the Prometheus server.
Each server service can be on an individual host or, as in our example, all on one host (vm01):

# Server Host Group (can be in each group the same host)

grafana_server:
  hosts:
    vm01:

prometheus_server:
  hosts:
    vm01:

prometheus_alertmanager:
  hosts:
    vm01:

loki_server:
  hosts:
    vm01:

3. Agent Host Groups

This section is where the monitored hosts are placed. To be able to group the hosts, nested groups are used. This means we can group hosts into different “departments”, etc. The names of all these “departments” need to start with agents_. The individual hosts are then put under hosts: (see the example agents_test below).

# Agent Host Groups (Please add new groups to the nested group below as well)
# Don't use any host twice.
agents_test:
  hosts:
    vm01:

It is important to also put these agents_ names under children in the next section (see agents_test below).

# Nested group will be used for playbook (Please don't use any single host entry)
# Please add new host groups here as well.
monitored_hosts:
  children:
    agents_test:
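
Forgetting to list a new agents_ group under children is an easy mistake to make. The following is a rough sketch of a check for it. It assumes the inventory uses exactly the layout shown above (agents_ groups starting in the first column, their names indented under children:) and does plain text matching rather than real YAML parsing. The function name is hypothetical.

```shell
# Print every agents_* group that is defined at the top level of the
# inventory but never appears indented (i.e. under children:).
# Assumption: text matching on the layout shown above, not a YAML parser.
missing_agent_groups() (
    inventory="$1"
    # Group definitions start in the first column
    defined=$(grep -oE '^agents_[A-Za-z0-9_]+' "$inventory" | sort -u)
    # References under children: are indented
    nested=$(grep -oE '^[[:space:]]+agents_[A-Za-z0-9_]+' "$inventory" |
        sed 's/^[[:space:]]*//')
    for group in $defined; do
        printf '%s\n' "$nested" | grep -qxF "$group" || printf '%s\n' "$group"
    done
)
```

`missing_agent_groups inventory.yaml` prints nothing when every group is listed. For a view of how Ansible itself resolves the groups, `ansible-inventory -i inventory.yaml --graph` shows the resulting tree.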

Collectd setting

There is only one small thing left. We need to tell collectd which host it can ping to check network health. This is done in the group_vars file.

# Collectd: IP address target
ping_target: <ip address> # Please change this IP to a valid one

Push the button – Execute Ansible

Once everything is prepared we can simply run our Ansible playbook with the following command:

ansible-playbook -i inventory.yaml --user root playbook-monitoring.yaml

If everything works correctly we will get a result like:

[...]
PLAY RECAP *******************************************************************
vm01 : ok=70 changed=34 unreachable=0 failed=0 skipped=47 rescued=0 ignored=0

It is of course important to check that the failed count is zero (failed=0).
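
When the run is scripted, this check can be automated. The following is a minimal sketch that reads the playbook output and inspects the recap lines. The function name is hypothetical, and the check for unreachable=0 is an addition to the failed=0 check described above.

```shell
# Read ansible-playbook output on stdin. Succeed only if at least one
# recap line was seen and every one reports unreachable=0 and failed=0.
recap_is_clean() {
    awk '
        /unreachable=[0-9]+/ && /failed=[0-9]+/ {
            seen = 1
            if ($0 !~ /unreachable=0([^0-9]|$)/ || $0 !~ /failed=0([^0-9]|$)/)
                bad = 1
        }
        END { exit (seen && !bad) ? 0 : 1 }
    '
}
```

For example, the output can be saved with `ansible-playbook -i inventory.yaml --user root playbook-monitoring.yaml | tee run.log` and then checked with `recap_is_clean < run.log`. ansible-playbook also returns a non-zero exit status when a host fails or is unreachable, so testing the exit status directly is an alternative.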

Are you ready to explore?

The Grafana Dashboard

The Ansible playbook already comes with an example Grafana dashboard.

It can be opened by using the following URL:

http://<vm01>:3000

The default login is:

User: admin
Password: SUSE1234

After logging in, we go to Menu – Dashboards – General – Example Dashboard.

The dashboard shows only a small part of what is possible, but it gives a first impression of Grafana dashboards.

But there is even more.

Here are some key points where you can start exploring:

- Check out the Prometheus URL: http://<vm01>:9090
- Alerting: There are already some rules which can be used with the Prometheus Alertmanager.
  - You only need to fill in some mail settings in alertmanager.yaml
- You want to explore log files on the command line? logcli is your friend.
  ```
  logcli labels
  logcli labels <labelname>
  logcli query '{labelname="value"}'
  ```
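
Before opening the web interfaces, a quick check from the command line can confirm that the services answer. This is a minimal sketch, assuming curl is installed and that Grafana and Prometheus listen on the ports used in this post (3000 and 9090). It queries Grafana's /api/health and Prometheus' /-/healthy endpoints. The function names are hypothetical.

```shell
# Succeed if the given URL answers with an HTTP success status.
# -f: fail on HTTP errors, -sS: quiet but show errors,
# --max-time 5: give up after five seconds
endpoint_is_up() {
    curl -fsS --max-time 5 -o /dev/null "$1"
}

# Check the Grafana and Prometheus health endpoints on one host.
# Assumption: ports 3000 and 9090 as used in this post.
check_monitoring_host() {
    host="$1"
    rc=0
    for url in "http://$host:3000/api/health" "http://$host:9090/-/healthy"; do
        if endpoint_is_up "$url"; then
            echo "OK   $url"
        else
            echo "FAIL $url"
            rc=1
        fi
    done
    return "$rc"
}
```

Calling `check_monitoring_host <vm01>` prints one line per endpoint and returns a non-zero status if either of them does not respond.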

Conclusion

Having some experience before going into production is always a good thing. The Ansible playbook gives you that possibility, without reading tons of documentation and without needing a big system in the first place. Once everything is deployed you can start playing around.

But it can also be adapted to a bigger existing environment by changing the options and ports of each service. It will also take care of firewall settings for a secure environment.

Jul 03rd, 2024

Thomas Schlosser