Windows Server system health check

Giteqa

Every system needs constant monitoring, both of its individual components and of the system as a whole. To do this, the system administrator should use dedicated tools that help with monitoring and raise alerts when something goes wrong. Below I list a few of these tools and show how to work with them.

Resource Monitor (Real-time Monitoring)

Let's start with the main tool that is built into the system. With it, you can watch how the server is running and see the load on the processor, memory, and other components. It looks like Task Manager, but it shows system activity in more detail.
To open it, select Server Manager -> Tools -> Resource Monitor.

Resource Monitor will open, where you can view data about all the components of your server.

At first you see an overview of every component. If you are interested in one component in particular, go to its tab. For example, I want to see how heavily my RAM is loaded, so I go to the Memory tab.

Here I can view more detailed information about memory, for example how much memory is available and in use, and how much memory each process uses. I can also end a process or a whole process tree, and so on.
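
The same memory figures can also be read from a PowerShell prompt. The following is only a minimal sketch, assuming an English-language Windows installation (counter paths are localized on other languages); the choice of five processes is arbitrary.

```powershell
# Free physical memory in megabytes (a single sample).
Get-Counter -Counter '\Memory\Available MBytes' |
    Select-Object -ExpandProperty CounterSamples |
    Select-Object -Property Path, CookedValue

# The five processes with the largest working set, shown in megabytes.
Get-Process |
    Sort-Object -Property WorkingSet64 -Descending |
    Select-Object -First 5 -Property Id, ProcessName,
        @{ Name = 'WorkingSetMB'; Expression = { [math]::Round($_.WorkingSet64 / 1MB, 1) } }
```

Once you have the Id of a misbehaving process, Stop-Process -Id can end it, much like ending a process from the Memory tab.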

Performance Monitor

This is another tool built into the system that helps the system administrator monitor performance. Unlike Resource Monitor, it lets you set up an alert for the administrator when there are problems or an overload. You can record up to 10,000 reports for the administrator to review. For example, if there were problems at night, the administrator can come in the morning and check all the system reports. They show when the peak load occurred and what it was associated with.

To open the program, select Tools -> Performance Monitor in Server Manager.

The monitor will open. Initially it shows only the CPU load in real time.

To add other counters, for example the percentage of time the processor spends running user-mode code (% User Time), click the green plus.

After you have added the counters you need, the graph changes. It now shows peaks (heavy load) and dips (normal load), from which you can tell when the server is heavily loaded and when it is running steadily.

This way you can track various characteristics of the system. For example, you can track the number of unsuccessful logins to the system.
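
If you want the same live readings without the graph, Get-Counter can sample counters from PowerShell. This is a rough sketch rather than a replacement for the tool: the two counter paths assume an English-language system, and the 2-second interval and 5 samples are arbitrary values chosen for illustration.

```powershell
# Counters to sample: total CPU load and the share spent in user mode.
$counters = @(
    '\Processor(_Total)\% Processor Time',
    '\Processor(_Total)\% User Time'
)

# Take 5 samples, 2 seconds apart, and print one line per counter per sample.
Get-Counter -Counter $counters -SampleInterval 2 -MaxSamples 5 |
    ForEach-Object {
        foreach ($sample in $_.CounterSamples) {
            '{0:HH:mm:ss}  {1}  {2:N1}' -f $sample.Timestamp, $sample.Path, $sample.CookedValue
        }
    }
```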

Creating monitoring reports

Reports are very important for the system administrator, because the servers may come under heavy load at night and the administrator needs to know about it. To create reports on particular system parameters, you need to create a data collector set. To create one, go to Data Collector Sets -> User Defined, right-click, and select New -> Data Collector Set.

After that, we enter a name that describes what we will track and select Create manually (Advanced).

After that, we choose what kind of data will be recorded by selecting Performance counter.

Next, we select all the counters we want to track and the interval at which they will be sampled. In my case I chose some RAM counters and a sample interval of 5 seconds.

We are then asked where to save the data (I leave the default location) and press Finish to complete the configuration.
After you have created the data collector set, start it by right-clicking it and selecting Start. After a while you can go and check the data.

To check the data, stop the data collector set that you started earlier and go to Reports -> User Defined -> the folder you created. There you will find the data that was collected.
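
The same kind of collection can be scripted with the built-in logman tool, which is handy when several servers need identical settings. The sketch below is an illustration under assumptions, not the exact set built in the wizard: the name RAM_Log, the output path, and the two memory counters are hypothetical choices, the counter paths assume an English-language system, and the commands need an elevated prompt.

```powershell
# Hypothetical name and output location; adjust to your own environment.
$name   = 'RAM_Log'
$folder = 'C:\PerfLogs\Admin'

# Create a performance counter collector sampled every 5 seconds.
logman create counter $name -c '\Memory\Available MBytes' '\Memory\% Committed Bytes In Use' -si 00:00:05 -o "$folder\$name"

# Collect for one minute, then stop.
logman start $name
Start-Sleep -Seconds 60
logman stop $name

# Read the newest binary log back (Import-Counter ships with Windows PowerShell 5.1).
$log = Get-ChildItem -Path $folder -Filter "$name*.blg" |
    Sort-Object -Property LastWriteTime |
    Select-Object -Last 1
Import-Counter -Path $log.FullName |
    Select-Object -ExpandProperty CounterSamples |
    Select-Object -Property Timestamp, Path, CookedValue
```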

This way the administrator can always check what the load on the server was during any period and decide what to do next. For example, this kind of load tracking helps the sysadmin understand why the server stops working in the evenings or at night and how to fix it (if the CPU load suddenly reaches 80-90%, it is worth thinking about upgrading the hardware).

Error Alerts

To get a notification about an error or an overloaded component, you need to create a data collector set as before. We do this in the same place, but add _Alert to the name so that it does not get lost among the sets already created. During creation, again select Create manually (Advanced), and then select Performance Counter Alert.

After that, add the counters you need. For example, I added a processor counter. Also do not forget to specify the threshold in % above which the alert should fire.

Next, finish creating the data collector set and open its properties to configure it.

Specify the sample interval, in seconds or minutes, at which the information will be collected. Next, go to the Alert Action tab and tick the checkbox so that an entry is written to the application event log.

After you have done this, start data collection.

You can also have another data collector set start when the alert fires, in case you need more information about the load on the server. To do this, fill in the Start a data collector set field by selecting the set you need.

To receive notifications on the screen or by email, use the Alert Task tab. There you can specify the task to run when the alert fires (Run this task when an alert is triggered) and the arguments to pass to it. The task you name here is one defined in Task Scheduler, and this is how you can display the error on the screen or report it in other ways.
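
For reference, an alert of this kind can also be created from an elevated prompt with logman. Treat the following as a hedged sketch: the name CPU_Alert, the 80% threshold, the 15-second interval, and the task name ShowToast are all hypothetical values picked for illustration, and the counter path assumes an English-language system. Run logman create alert /? to confirm the switches on your version.

```powershell
# Create an alert that fires when total CPU load exceeds 80%.
#   -th  counter path and threshold
#   -si  sample interval (hh:mm:ss)
#   -el  write an entry to the event log when the alert fires
#   -tn  name of the Task Scheduler task to run (assumed to exist already)
logman create alert CPU_Alert -th '\Processor(_Total)\% Processor Time>80' -si 00:00:15 -el -tn 'ShowToast'

# Start watching the counter.
logman start CPU_Alert
```

Adding -rdcs followed by the name of another data collector set should correspond to the Start a data collector set field.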

For notifications to pop up, you need to install the BurntToast module in PowerShell with the command

Install-Module -Name BurntToast

and confirm the installation by pressing Y.

As an example, I will show how to display an error on the screen using Task Scheduler: which parameters to select and what to run.

You can find it, like the other programs, under Tools.

Next, you need to create the task that will be executed, namely one that launches PowerShell and passes it a command. After launching Task Scheduler, select Create Task in the right-hand pane, and the task creation window will open.

On the General tab, enter a name for the task, and then go to the Triggers tab.

There, click New and select One time, so that the notification pops up only when the system calls it. Next, go to the Actions tab.

Here you need to select PowerShell as the program to run and enter the following code as its argument:

New-BurntToastNotification -Text "Header", "Notification Text"

Thanks to this code, a notification will be displayed on the desktop every time our data collector set calls the task.
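
If you prefer to script the task instead of clicking through the dialog, the ScheduledTasks cmdlets can register it. This is a sketch under assumptions: the task name ShowToast is hypothetical and must match whatever name is entered for the alert, the task is registered without a trigger so that it runs only when something starts it, and the toast is assumed to appear only when the task runs in the session of a logged-on user.

```powershell
# Hypothetical task name; it must match the task named in the alert settings.
$taskName = 'ShowToast'

# The command PowerShell will run when the task starts.
$command = "New-BurntToastNotification -Text 'Header', 'Notification Text'"

# Action: launch powershell.exe and hand it the command.
$action = New-ScheduledTaskAction -Execute 'powershell.exe' `
    -Argument "-NoProfile -WindowStyle Hidden -Command `"$command`""

# Register the task for the current user, with no trigger.
Register-ScheduledTask -TaskName $taskName -Action $action `
    -Description 'Shows a toast notification when the performance alert fires'

# Run it once by hand to check that the notification appears.
Start-ScheduledTask -TaskName $taskName
```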

You can also choose a custom icon for the notification. To do this, add -AppLogo at the end of the command and specify the path to the icon.
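
With an icon, the full command might look like the line below. The image path is a made-up example; point it at a real image file on your server.

```powershell
# C:\Icons\alert.png is a hypothetical path used only for illustration.
New-BurntToastNotification -Text 'Header', 'Notification Text' -AppLogo 'C:\Icons\alert.png'
```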

You can see the alert messages in Event Viewer.

In Event Viewer, go to Applications and Services Logs -> Microsoft -> Windows -> Diagnosis-PLA. There you will find a log in which you can view all the entries created, along with the parameters that you previously set in the data collector set.
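
The same entries can be pulled from PowerShell, which is convenient for a quick morning check. A minimal sketch, assuming the log in question is the operational channel named Microsoft-Windows-Diagnosis-PLA/Operational; the limit of 20 events is arbitrary.

```powershell
# Show the 20 most recent entries from the Diagnosis-PLA operational log.
Get-WinEvent -LogName 'Microsoft-Windows-Diagnosis-PLA/Operational' -MaxEvents 20 |
    Select-Object -Property TimeCreated, Id, Message |
    Format-List
```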

These are the main tools that help the sysadmin monitor the system constantly and have up-to-date information about it.