Victor's Blog

Logrotate: Simplifying Log Management

9 Sep, 2024 Linux Administration Ansible

logrotate is a system utility designed to manage log files, ensuring that logs don’t consume excessive disk space by rotating, compressing, and removing old logs according to user-defined rules. This post will help you get familiar with logrotate configuration and usage.

By configuring and running logrotate effectively, you can automate log management, save disk space, and keep your logs manageable over time.

Main Config

The primary configuration file for logrotate is located at /etc/logrotate.conf. It sets default values and includes the directory that holds additional configuration files (typically /etc/logrotate.d):

# see "man logrotate" for details
# rotate log files weekly
weekly

# keep 4 weeks worth of backlogs
rotate 4

# create new (empty) log files after rotating old ones
create

# use date as a suffix of the rotated file
dateext

# uncomment this if you want your log files compressed
#compress

# RPM packages drop log rotation information into this directory
include /etc/logrotate.d

# no packages own wtmp and btmp -- we'll rotate them here
/var/log/wtmp {
    monthly
    create 0664 root utmp
    minsize 1M
    rotate 1
}

/var/log/btmp {
    missingok
    monthly
    create 0600 root utmp
    rotate 1
}

Configuration Files

The configuration file for each application or log location usually resides under /etc/logrotate.d/ and can contain various options to control how logs are handled.

Common Options

hourly – Rotate logs every hour (requires cron to run logrotate hourly).
daily – Rotate logs daily.
weekly [weekday] – Rotate logs once per week.
monthly – Rotate logs the first time logrotate is run in a month.
yearly – Rotate logs once a year.
rotate [count] – Defines how many rotated logs to keep. If set to 0, logs are deleted instead of being rotated.
minsize [size] – Rotate logs when they exceed the given size, but not before the configured time interval (daily, weekly, etc.) has passed.
size [size] – Rotate only when the log exceeds the given size, regardless of when it was last rotated.
maxage [days] – Remove rotated logs after a specified number of days.
missingok – Continue without error if the log file is missing.
notifempty – Skip rotation if the log file is empty.
create [mode] [owner] [group] – Create a new log file with specified permissions, owner, and group.
compress – Compress rotated logs.
delaycompress – Delay compression until the next rotation cycle.
copytruncate – Truncate the log after copying it to the rotated file.
sharedscripts – Run the prerotate and postrotate scripts only once for all logs matching the pattern, instead of once per rotated log.
postrotate/endscript – Define commands to be executed after log rotation:

postrotate
  /opt/life/tools/stop-approve.sh && /opt/life/tools/start-approve.sh
endscript

Examples

Example 1: Deleting Old NMON Files

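This example removes NMON output files once they grow larger than 5 MB: with rotate 0, a rotated file is deleted rather than kept. The maxage 90 line targets files older than about 3 months, but note that logrotate applies maxage only to rotated copies and checks it only when a log is due for rotation: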

# Logrotate config for nmon
/var/log/nmon/*.nmon {
  rotate 0
  maxage 90
  size 5M
}

Example 2: Compressing LMS Logs

In this case, logrotate rotates a log once it grows larger than 5 MB, keeps up to 10 compressed versions, and removes rotated copies older than 360 days (roughly 12 months):

# Logrotate config for LMS
/home/my_admin/cmd/log/*.log {
  rotate 10
  maxage 360
  size 5M
  compress
}
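Before enabling rules like the two above, it can help to preview which files already exceed the size threshold. The sketch below is not part of logrotate: list_oversized is a hypothetical helper that uses find to list files larger than a given number of megabytes, mirroring the size 5M condition. logrotate itself still makes the real decision at run time.

```shell
#!/bin/sh
# Hypothetical helper (not part of logrotate): list files in a directory
# that match a glob and are larger than the given size in MiB.
# Usage: list_oversized DIR GLOB SIZE_MIB
list_oversized() {
    dir=$1
    glob=$2
    limit_bytes=$(( $3 * 1024 * 1024 ))
    # "+Nc" means strictly more than N bytes, like logrotate's "size" option.
    find "$dir" -maxdepth 1 -type f -name "$glob" -size "+${limit_bytes}c" | sort
}

# Example: preview what "size 5M" would match for the NMON rule.
if [ -d /var/log/nmon ]; then
    list_oversized /var/log/nmon '*.nmon' 5
fi
```

The same call with /home/my_admin/cmd/log and '*.log' previews the LMS rule.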

Creating a New Configuration File

To create a new log rotation rule, drop a configuration file into the /etc/logrotate.d/ directory. Ensure the scheduler for logrotate (either cron or systemd) is set up correctly.

Tip: Double-check your scheduler settings to ensure smooth log rotation.
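One way to double-check the scheduler is a small shell function like the sketch below. It assumes the stock locations used later in this post (logrotate.timer for systemd and /etc/cron.daily/logrotate for cron). The function name is made up for illustration, and some distributions ship both mechanisms, so treat the result as a hint rather than a definitive answer.

```shell
#!/bin/sh
# Report which scheduler appears to drive logrotate on this host.
# Prints one of: systemd-timer, cron-daily, none.
detect_logrotate_scheduler() {
    if command -v systemctl >/dev/null 2>&1 &&
       systemctl is-enabled --quiet logrotate.timer 2>/dev/null; then
        echo "systemd-timer"
    elif [ -x /etc/cron.daily/logrotate ]; then
        echo "cron-daily"
    else
        echo "none"
    fi
}

detect_logrotate_scheduler
```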

Testing and Running Logrotate

Testing

Dry-Run

Before applying your configuration, you can test it with a dry-run using the -d option:

logrotate -d [my_config_file].conf

Verbosity

To view detailed steps, run logrotate with the -v option (useful with -d for dry-run testing):

logrotate -vd [my_config_file].conf
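If you would rather not point a dry-run at real logs, you can test a rule against a scratch directory first. The sketch below assumes nothing about your system: the app name, the paths, and the options in the rule are placeholders, and the -s flag keeps the state file inside the scratch directory.

```shell
#!/bin/sh
# Dry-run a rule against a scratch directory so nothing under /var/log is touched.
workdir=$(mktemp -d)
mkdir "$workdir/logs"
chmod 0755 "$workdir/logs"
echo "sample entry" > "$workdir/logs/app.log"

# Throwaway rule; the paths and the "app" name are placeholders.
cat > "$workdir/app.conf" <<EOF
$workdir/logs/*.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
}
EOF
chmod 0644 "$workdir/app.conf"

# -d makes no changes to logs or state; -v prints each step logrotate would take.
if command -v logrotate >/dev/null 2>&1; then
    logrotate -vd -s "$workdir/status" "$workdir/app.conf" ||
        echo "logrotate reported a problem with the config"
else
    echo "logrotate not found; config written to $workdir/app.conf"
fi
```

Remove the scratch directory with rm -rf "$workdir" when you are done.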

Scheduling Logrotate

logrotate can be scheduled using either cron or systemd. Here’s a quick overview of both methods:

Cron

When using cron, the logrotate job is typically defined in /etc/cron.daily/logrotate:

#!/bin/sh

/usr/sbin/logrotate -s /var/lib/logrotate/logrotate.status /etc/logrotate.conf
EXITVALUE=$?
if [ $EXITVALUE != 0 ]; then
    /usr/bin/logger -t logrotate "ALERT exited abnormally with [$EXITVALUE]"
fi
exit 0

Systemd

Systemd can also manage logrotate via logrotate.timer and logrotate.service.

logrotate.timer:

[Unit]
Description=Daily rotation of log files

[Timer]
OnCalendar=daily
RandomizedDelaySec=1h
Persistent=true

[Install]
WantedBy=timers.target

logrotate.service:

[Unit]
Description=Rotate log files

[Service]
Type=oneshot
ExecStart=/usr/sbin/logrotate /etc/logrotate.conf

Personal Cron Job

You can also schedule logrotate manually through a personal cron job:

# Runs logrotate daily
@daily /sbin/logrotate -s /home/my_admin/cmd/log/logrotate.status /home/my_admin/cmd/logrotate.conf >> /home/my_admin/cmd/log/cron.log 2>&1

Creating Logrotate With Ansible

Managing log rotation through Ansible is a powerful way to automate log maintenance across multiple servers. Below is an example of how you can create a logrotate configuration using Ansible.

Ansible Playbook Example

This playbook installs logrotate (if it’s not already installed) and creates a new configuration file under /etc/logrotate.d/ for an application:

---

- hosts: all
  gather_facts: true
  become: true

  vars:
    logrotate_conf: |
      # Logrotate for application
      /var/log/application/* {
        # Keep 4 versions of file
        rotate 4

        # compress rotated files
        compress

        # Rotates the log files every week
        weekly

        # Ignores the error if the log file is missing
        missingok

        # Does not rotate the log if it is empty
        notifempty

        # Creates a new log file with specified permissions
        create 0755 apache splunk
      }

  tasks:

    - name: Installs logrotate
      package:
        name: logrotate
        state: present

    - name: Creates logrotate configuration
      copy:
        content: "{{ logrotate_conf }}"
        dest: /etc/logrotate.d/application
        owner: root
        group: root
        mode: '0644'

Explanation

logrotate_conf variable: Defines the configuration file for logrotate. It includes options such as file rotation frequency, compression, and file permissions.
- rotate 4 – Keeps 4 old versions of the log.
- compress – Compresses the rotated logs.
- weekly – Rotates the logs every week.
- missingok – Ignores errors if the log file is missing.
- notifempty – Skips log rotation if the log is empty.
- create 0755 apache splunk – Creates a new log file with mode 0755, owned by the user apache and the group splunk.
Installs logrotate: The task ensures that logrotate is installed on the target servers.
Creates logrotate configuration: The copy task creates the custom logrotate configuration file under /etc/logrotate.d/, with appropriate permissions.

By using Ansible, you can streamline the management of log rotation across your environment, ensuring consistency in how logs are maintained across all your systems.

Getting Started with Ansible Console

14 Feb, 2024 Linux Ansible

Are you ready to level up your Linux game and wield the power of automation like a true DevOps ninja? Enter Ansible Console, the interactive command-line interface that puts the force of Ansible at your fingertips!

In this introductory post, I’ll take you on a tour of Ansible Console, teaching you how to run ad-hoc tasks against your inventory with ease.

What is Ansible Console

Ansible Console is an interactive console for executing ad-hoc tasks against hosts defined in your Ansible inventory. It offers built-in command completion and a user-friendly interface, making it a versatile tool for system administrators and automation engineers.

Getting Started

To begin using Ansible Console, launch it by invoking ansible-console with the default inventory, or by specifying a custom inventory file.

Using the default inventory:

$ ansible-console

Specifying an inventory file:

$ ansible-console -i [inventory]

Once launched, the console prompt will display essential information about the current context, including group names, the number of hosts, and forks.

ansible@all (6)[f:10]$
		  |  |    |- forks
		  |  |- number of hosts
		  |- group

Keep an eye on the prompt color – if it turns red, it’s a subtle reminder that the become flag is set to true. Safety first, folks!

Inventory Management

Managing your inventory has never been easier. With Ansible Console, you can list hosts, groups, and even test connections with a flick of your wrist (or a tap of your keyboard).

Need to list all your hosts? No problem, just use the list sub-command:

ansible@all (6)[f:10]$ list
nut-pi
ubuntu-pi4
ubuntu-backup
ubuntu-lenovomini
freenas
pfsense

Want to flex your group management skills? Try this:

ansible@all (6)[f:10]$ list groups
all
centos
freebsd
linux
pfsense
raspberrypi
raspbian
rhel
truenas
ubuntu
ungrouped

And if you’re feeling a bit adventurous, why not test the connection to your hosts using the trusty ping or the ls command?

ls

ansible@all (6)[f:10]$ ls
freenas | CHANGED | rc=0 >>

pfsense | FAILED! => {
		"changed": false,
		"module_stderr": "Shared connection to 192.168.1.1 closed.\r\n",
		"module_stdout": "/bin/sh: /usr/local/bin/python: not found\r\n",
		"msg": "The module failed to execute correctly, you probably need to set the interpreter.\nSee stdout/stderr for the exact error",
		"rc": 127
}
ubuntu-lenovomini | CHANGED | rc=0 >>

ubuntu-backup | CHANGED | rc=0 >>

ubuntu-pi4 | CHANGED | rc=0 >>

nut-pi | CHANGED | rc=0 >>

ping

ansible@all (6)[f:10]$ ping
freenas | SUCCESS => {
		"changed": false,
		"ping": "pong"
}
pfsense | FAILED! => {
		"changed": false,
		"module_stderr": "Shared connection to 192.168.1.1 closed.\r\n",
		"module_stdout": "/bin/sh: /usr/local/bin/python: not found\r\n",
		"msg": "The module failed to execute correctly, you probably need to set the interpreter.\nSee stdout/stderr for the exact error",
		"rc": 127
}
ubuntu-lenovomini | SUCCESS => {
		"changed": false,
		"ping": "pong"
}
ubuntu-backup | SUCCESS => {
		"changed": false,
		"ping": "pong"
}
ubuntu-pi4 | SUCCESS => {
		"changed": false,
		"ping": "pong"
}
nut-pi | SUCCESS => {
		"changed": false,
		"ping": "pong"
}

Selecting Host or Group

You can select a host or a group with the ‘cd’ command (note how the prompt changes).

Host

ansible@all (6)[f:10]$ cd pfsense
ansible@pfsense (1)[f:10]$

Group

ansible@all (6)[f:10]$ cd ubuntu
ansible@ubuntu (3)[f:10]$

You can also use host patterns, e.g.: app*.dc*:!app01*

Configuration Customization

Ansible Console puts the power in your hands with customizable configuration options. Whether you need to specify the remote user, adjust the number of forks, toggle the become flag (and many more), Ansible Console has got you covered.

ansible@all (0)[f:5]$ remote_user [username]
ansible@all (0)[f:5]$ forks 7
ansible@all (0)[f:5]$ become [true|false]
ansible@all (0)[f:5]$ check [true|false]
ansible@all (0)[f:5]$ diff [true|false]
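Some of these settings can also be preset when launching the console, using the usual Ansible command-line flags (--user, --forks, --become, plus --check and --diff). As a small sketch, the hypothetical helper below only prints the command line it would run. The function name and the example values are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print an ansible-console command line that presets
# the remote user, fork count, and (optionally) privilege escalation.
# Usage: console_cmd USER FORKS [true|false]
console_cmd() {
    cmd="ansible-console --user $1 --forks $2"
    if [ "${3:-false}" = "true" ]; then
        cmd="$cmd --become"
    fi
    echo "$cmd"
}

console_cmd ansible 7 true
# prints: ansible-console --user ansible --forks 7 --become
```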

Modules

Ah, modules – the bread and butter of Ansible automation. With Ansible Console, you can unleash the full potential of Ansible modules right from your command line.

Use the help sub-command to get a list of available modules:

ansible@ubuntu (3)[f:10]$ help
	
Documented commands (type help <topic>):
========================================

[... truncated ...]

wti.remote.cpm_syslog_server_info
wti.remote.cpm_temp_info
wti.remote.cpm_time_config
wti.remote.cpm_time_info
wti.remote.cpm_user

Undocumented commands:
======================
ping

Explore the vast array of available modules or dive deep into specific modules for detailed information and usage examples. Simply provide the module name as an argument to help. And don’t forget that Ansible Console has built-in tab completion, so make sure to use and abuse it:

ansible@ubuntu (3)[f:10]$ help copy

Copy files to remote locations
Parameters:
	src Local path to a file to copy to the remote server.
	content When used instead of O(src), sets the contents of a file directly to the specified value.
	dest Remote absolute path where the file should be copied to.
	backup Create a backup file including the timestamp information so you can get the original file back if you somehow clobbered it incorrectly.
	force Influence whether the remote file must always be replaced.
	mode The permissions of the destination file or directory.
	directory_mode Set the access permissions of newly created directories to the given mode. Permissions on existing directories do not change.
	remote_src Influence whether O(src) needs to be transferred or already is present remotely.
	follow This flag indicates that filesystem links in the destination, if they exist, should be followed.
	local_follow This flag indicates that filesystem links in the source tree, if they exist, should be followed.
	checksum SHA1 checksum of the file being transferred.
	decrypt This option controls the autodecryption of source files using vault.
	owner Name of the user that should own the filesystem object, as would be fed to I(chown).
	group Name of the group that should own the filesystem object, as would be fed to I(chown).
	seuser The user part of the SELinux filesystem object context.
	serole The role part of the SELinux filesystem object context.
	setype The type part of the SELinux filesystem object context.
	selevel The level part of the SELinux filesystem object context.
	unsafe_writes Influence when to use atomic operation to prevent data corruption or inconsistent reads from the target filesystem object.
	attributes The attributes the resulting filesystem object should have.
	validate The validation command to run before copying the updated file into the final destination.

Module Usage Examples

Installing a Package

ansible@ubuntu (3)[f:10]$ ansible.builtin.package name=git

Installing a PIP Package

ansible@ubuntu (3)[f:10]$ ansible.builtin.pip name=youtube-dl

Modules vs Commands

Anything you type that isn’t a module name will be run as a shell command. For example:

ansible@ubuntu (3)[f:10]$ cat /etc/hostname
ubuntu-lenovomini | CHANGED | rc=0 >>
ubuntu-lenovomini
ubuntu-pi4 | CHANGED | rc=0 >>
ubuntu-pi4
ubuntu-backup | CHANGED | rc=0 >>
ubuntu-backup

However, if a module exists, ansible-console will invoke that module, or give you an error.

Example with hostname, which is a built-in module:

ansible@ubuntu (3)[f:10]$ help hostname
Manage hostname
Parameters:
	name Name of the host.
	use Which strategy to use to update the hostname.

The module expects a parameter, so it fails:

ansible@ubuntu (3)[f:10]$ hostname
ubuntu-lenovomini | FAILED! => {
		"changed": false,
		"msg": "missing required arguments: name"
}
ubuntu-pi4 | FAILED! => {
		"changed": false,
		"msg": "missing required arguments: name"
}
ubuntu-backup | FAILED! => {
		"changed": false,
		"msg": "missing required arguments: name"
}

Another example, but now with yum:

ansible@ubuntu (3)[f:10]$ yum check-update -q
 [ERROR]: Unable to build command: this task 'yum' has extra params, which is only allowed in the following modules: ansible.legacy.group_by, shell, ansible.legacy.set_fact, ansible.builtin.include_tasks, command, ansible.legacy.script, ansible.legacy.include,
ansible.builtin.group_by, ansible.legacy.shell, set_fact, ansible.builtin.shell, win_command, ansible.legacy.command, ansible.legacy.win_shell, ansible.builtin.set_fact, ansible.builtin.add_host, ansible.legacy.import_role, include_role, ansible.builtin.win_shell, include,
ansible.legacy.win_command, ansible.legacy.include_tasks, include_tasks, ansible.legacy.add_host, ansible.builtin.include_role, group_by, ansible.builtin.meta, ansible.builtin.raw, import_role, ansible.legacy.raw, ansible.builtin.command, ansible.legacy.include_vars, script,
win_shell, ansible.builtin.script, ansible.legacy.import_tasks, add_host, meta, ansible.builtin.import_tasks, ansible.windows.win_shell, ansible.builtin.include, ansible.windows.win_command, ansible.builtin.include_vars, raw, ansible.builtin.import_role, include_vars, import_tasks,
ansible.builtin.win_command, ansible.legacy.meta, ansible.legacy.include_role

But sometimes you may need to run a command that shares its name with a module instead of the module itself. This can easily be done by prefixing the command with !:

ansible@ubuntu-pi4 (1)[f:10]$ !apt list --upgradable
ubuntu-pi4 | CHANGED | rc=0 >>
Listing...
base-files/focal-updates 11ubuntu5.8 arm64 [upgradable from: 11ubuntu5.7]
gnome-shell-common/focal-updates 3.36.9-0ubuntu0.20.04.3 all [upgradable from: 3.36.9-0ubuntu0.20.04.2]
gnome-shell/focal-updates 3.36.9-0ubuntu0.20.04.3 arm64 [upgradable from: 3.36.9-0ubuntu0.20.04.2]
libnss-systemd/focal-updates 245.4-4ubuntu3.23 arm64 [upgradable from: 245.4-4ubuntu3.22]
libpam-systemd/focal-updates 245.4-4ubuntu3.23 arm64 [upgradable from: 245.4-4ubuntu3.22]
libsystemd0/focal-updates 245.4-4ubuntu3.23 arm64 [upgradable from: 245.4-4ubuntu3.22]
libudev1/focal-updates 245.4-4ubuntu3.23 arm64 [upgradable from: 245.4-4ubuntu3.22]
motd-news-config/focal-updates 11ubuntu5.8 all [upgradable from: 11ubuntu5.7]
systemd-sysv/focal-updates 245.4-4ubuntu3.23 arm64 [upgradable from: 245.4-4ubuntu3.22]
systemd-timesyncd/focal-updates 245.4-4ubuntu3.23 arm64 [upgradable from: 245.4-4ubuntu3.22]
systemd/focal-updates 245.4-4ubuntu3.23 arm64 [upgradable from: 245.4-4ubuntu3.22]
tailscale/unknown 1.58.2 arm64 [upgradable from: 1.56.1]
tzdata/focal-updates 2023d-0ubuntu0.20.04 all [upgradable from: 2023c-0ubuntu0.20.04.2]
udev/focal-updates 245.4-4ubuntu3.23 arm64 [upgradable from: 245.4-4ubuntu3.22]
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

Variables

Fact variables are not available by default because ansible-console, unlike ansible-playbook, doesn’t gather facts before running:

ansible@ubuntu-pi4 (1)[f:10]$ debug msg="{{ ansible_facts.architecture }}"
ubuntu-pi4 | FAILED! => {
		"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'architecture'. 'dict object' has no attribute 'architecture'. 'dict object' has no attribute 'architecture'. 'dict object' has no attribute 'architecture'"
}

But fear not, you can manually invoke the setup module and either filter the values, or print them with the debug module:

ansible@ubuntu-pi4 (1)[f:10]$ setup filter="*architecture"
ubuntu-pi4 | SUCCESS => {
		"ansible_facts": {
				"ansible_architecture": "aarch64"
		},
		"changed": false
}

ansible@ubuntu-pi4 (1)[f:10]$ debug msg="{{ ansible_facts.architecture }}"
ubuntu-pi4 | SUCCESS => {
		"msg": "aarch64"
}

Conclusion

With its intuitive interface and powerful functionality, Ansible Console offers a convenient way to execute ad-hoc tasks and manage your Ansible inventory interactively. By mastering Ansible Console, you can streamline your automation workflows and improve operational efficiency in your environment.

Playing a Sound on Headless Server Boot

21 Dec, 2023 Linux Ubuntu

Beep is a versatile utility that transforms your computer speaker into a musical instrument reminiscent of early Nintendo-style sounds. You provide a frequency and length and beep will play the respective sound.

I started using beep in the early 2000s with Smoothwall (think of a Linux predecessor to pfSense) to notify me when it had rebooted. That is still my main use for the utility (less the Smoothwall part), but it can be used for many other things.

We’ll go through a quick setup on Ubuntu using group permission, but you can refer to the main GitHub repo (Permission setup for beep) for alternative setup instructions if you need something different.

Configuration

Let’s start by installing beep:

sudo apt update && sudo apt install beep

Now create the beep system group. Any user who is a member of this group will be able to run beep:

sudo addgroup --system beep

Add it as a secondary group for your user:

sudo usermod -a -G beep [user]

Create the udev rule to allow the group permission to the speaker:

/usr/lib/udev/rules.d/90-pcspkr-beep.rules

# Add write access to the PC speaker for the "beep" group
ACTION=="add", SUBSYSTEM=="input", ATTRS{name}=="PC Speaker", ENV{DEVNAME}!="", GROUP="beep", MODE="0620"

Comment out the existing blacklist for ‘pcspkr’ in /etc/modprobe.d/blacklist.conf:

# ugly and loud noise, getting on everyone's nerves; this should be done by a
# nice pulseaudio bing (Ubuntu: #77010)
# blacklist pcspkr

Load the module:

sudo modprobe pcspkr

Now let’s test to make sure that it works. You will need to log out and log back in, or simply ssh to localhost (this is so the secondary group is loaded). Then run beep with the options below:

beep -f 587 -l 714

If that works, then we have confirmed that everything is configured correctly. Just a reminder: your computer needs to have a speaker, otherwise beep won’t work.

Now let’s create a sample script and the Systemd unit file for the service that will play beep after boot.

Create the following script and give it execute permission:

/usr/local/bin/star-trek.sh

#!/bin/bash
beep -f 587 -l 714
beep -f 784 -l 238
beep -f 1046 -l 1428
beep -f 987 -l 476
beep -f 784 -l 357
beep -f 659 -l 357
beep -f 880 -l 357
beep -f 1174 -l 952

sudo chmod +x /usr/local/bin/star-trek.sh
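The script above calls beep once per note. beep also accepts -n to chain several notes in a single invocation, so the same melody can be kept as a list of frequency:length pairs. The sketch below is one possible way to do that. The beep_args helper and the MELODY variable are made up for illustration, and the note values are copied from star-trek.sh.

```shell
#!/bin/sh
# Notes as "frequency_hz:length_ms" pairs, taken from star-trek.sh above.
MELODY="587:714 784:238 1046:1428 987:476 784:357 659:357 880:357 1174:952"

# Hypothetical helper: turn the note list into arguments for a single beep
# call, using -n to separate consecutive notes.
beep_args() {
    args=""
    for note in $1; do
        freq=${note%%:*}
        len=${note##*:}
        if [ -n "$args" ]; then
            args="$args -n"
        fi
        args="$args -f $freq -l $len"
    done
    # Trim the leading space left by the first append.
    echo "${args# }"
}

# Play the melody only when beep is installed; otherwise just show the command.
if command -v beep >/dev/null 2>&1; then
    # Word splitting of the argument string is intentional here.
    beep $(beep_args "$MELODY") || echo "beep failed (no speaker or no permission?)"
else
    echo "beep $(beep_args "$MELODY")"
fi
```

Keeping the notes in one variable makes it easier to swap melodies without editing a line per note.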

Create /lib/systemd/system/beep-startup.service with the following content. Note that you will need to change the User= property to be the username for your user:

# /lib/systemd/system/beep-startup.service
[Unit]
Description=Plays startup audio
After=network-online.target

[Service]
Type=simple
ExecStart=/usr/local/bin/star-trek.sh
User=[user]

[Install]
WantedBy=default.target

Let’s enable and start the service, and it should play the Star Trek intro sound:

sudo systemctl enable --now beep-startup.service

Conclusion

With these configurations, your machine will serenade you with the iconic Star Trek intro sound after establishing a network connection during boot. Dive into the world of beep, and if you’re looking for more scripts, explore GitHub - victorbrca/beep-scripts for an extensive collection. Happy beeping!

RHCE v8 Practice Exam

27 Nov, 2023 Linux RedHat RHCE

I created this practice exam as part of my preparation for the Red Hat RHCE exam (EX294V84K). It comprises 10 advanced-level tasks, with a focus on incorporating some of the most challenging RHCSA objectives.

Answers are available on my practice environment at victorbrca/rhce8-practice-env.

Pre-requisites

Set up the Vagrant/VirtualBox environment - victorbrca/rhce8env
Register for a free Red Hat Developer subscription - instructions
On the control node:
- Register with your developer subscription. Set it to the 8.4 release
- Uninstall Ansible if installed
- Remove all EPEL repos
- Add the ansible-2.9-for-rhel-8-x86_64-rpms repo via subscription manager
Add an additional 10GB disk to node4
Increase memory on node4 to 1024M

Objectives

The following top-level objectives are covered on this exam, as per ‘EX294V84K’.

1. Be able to perform all tasks expected of a Red Hat Certified System Administrator
2. Understand core components of Ansible
3. Install and configure an Ansible control node
4. Configure Ansible managed nodes
5. Script administration tasks
6. Create and use static inventories to define groups of hosts
7. Create Ansible plays and playbooks
8. Use Ansible modules for system administration tasks that work with:
9. Create and use templates to create customized configuration files
10. Work with Ansible variables and facts
11. Create and work with roles
12. Download roles from an Ansible Galaxy and use them
13. Use Ansible Vault in playbooks to protect sensitive data
14. Use provided documentation to look up specific information about Ansible modules and commands

Tasks

Task 1

Objectives covered:

2. Understand core components of Ansible
3. Install and configure an Ansible control node
4. Configure Ansible managed nodes

Tasks:

Install ansible on the control node
Create a user called ‘ansible’ on the control node
Create the directory /home/ansible/exam-files. This is where all files will be saved
- Create the following sub-directories:
- roles, vars, playbooks, scripts, files
- Create an ssh key for the ‘ansible’ user in this folder
Create an inventory file with the nodes:
- node1
- node2
- node3
- node4
Create an ansible config file as follows:
- Roles path is set to /home/ansible/exam-files/roles
- Inventory file is /home/ansible/exam-files/inventory
- User to SSH to remote nodes is ‘ansible’
- Add the ssh key from the previous task
- Disable:
- Cow output
- Retry files
- Host key checking
SSH to all nodes and create the ‘ansible’ user. Give it a password
Make it so that the ‘ansible’ user can elevate privileges without a password on all nodes
Distribute the ssh key created to the nodes (use any method)
Disable ssh password authentication for the ‘ansible’ user on all nodes
Create the ad-hoc script /home/ansible/exam-files/scripts/check-connection.sh that checks that the ssh connection works to all nodes

Task 2

Objectives covered:

5. Script administration tasks

Tasks:

Create the shell script /home/ansible/exam-files/scripts/get-server-info.sh that:
- Gets the hostname, OS name, OS version, tuned service status, and the tuned profile that is currently active. Output should look like:
```
Hostname: control.ansi.example.com
Name: "Red Hat Enterprise Linux"
Version: "8.0 (Ootpa)"
Tuned status: active
Current active profile: virtual-guest
```
Create the ad-hoc script /home/ansible/exam-files/scripts/task2.sh that:
- Uploads ‘get-server-info.sh’ to ‘node1’ with:
- To /usr/local/bin/get-server-info.sh
- Owned by ‘root:root’
- Permission is ‘rwxr-xr-x’
Run the ad-hoc script

Task 3

Objectives covered:

6. Create and use static inventories to define groups of hosts

Tasks:

Modify the inventory file to have the following groups
- ‘node1’ and ‘node2’ are in the ‘webservers’ group
- ‘node3’ and ‘node4’ are in the ‘databases’ group
- ‘node3’ is in the ‘mysql’ group
- ‘node4’ is in the ‘postgresql’ group
- ‘node1’ is in the ‘version1’ group
- ‘node2’ is in the ‘version2’ group

Task 4

Objectives covered:

7. Create Ansible plays and playbooks

Tasks:

Create the playbook /home/ansible/exam-files/playbooks/task4.yml that:
- Creates the folder /data/backup on the ‘webservers’ group. The folder should have read and execute permission for group and others
- Creates the file /etc/server_role on all servers
  - The content of the file should be ‘webservers’ or ‘databases’ according to the inventory group
- Create a task that uses the rpm command to check if ‘httpd’ is installed on the webservers and databases groups
- This task should only show as changed when it fails
- Create two tasks that display the following output based on the exit code from the rpm task. These tasks should run against the same groups as the rpm task:
  - ‘HTTPD is installed’ if it’s installed
  - ‘HTTPD is not installed’ if it’s not installed
- Makes sure that the default system target is set to ‘multi-user.target’
  - Should only set the target if not already set
  - Should show change on failure
  - Should ignore errors

Task 5

Objectives covered:

1. Be able to perform all tasks expected of a Red Hat Certified System Administrator
8. Use Ansible modules for system administration tasks that work with
10. Work with Ansible variables and facts

Tasks:

Create a bash script called /home/ansible/exam-files/files/root_space_check.sh that gets the used space percent for root (/) and:
- Logs an info message to journald that looks like root_space_check.sh[PID]: / usage is within threshold when usage is below 70%
- Logs a warning message to journald with root_space_check.sh[PID]: / usage is above 70% threshold when usage is above 70%
Create the playbook /home/ansible/exam-files/playbooks/task5.yml that:
- Uploads the root_space_check.sh script to /usr/local/bin/ on all servers and sets the execute bit all across (ugo)
- Adds an entry to root’s crontab to execute the script every hour on all servers
- Does the following on the ‘webservers’ group
- Installs ‘httpd’
- Enables and starts the ‘httpd’ service
- Enables the ‘http’ and ‘https’ services on firewalld (runtime and permanent)
- Sets the Listen option in /etc/httpd/conf/httpd.conf to the internal IP. E.g.: Listen 192.168.55.201:80. Use facts variables for the internal IP
- Whenever httpd.conf is changed
  - Make sure that the ‘httpd’ service is restarted
  - Backs up an archived (zip) version of httpd.conf to /data/backup/httpd.conf-[YYYYMMDD_HHMMSS].zip (change [YYYYMMDD_HHMMSS] to a date string, e.g.: ‘20231123_2400’)
- Configures storage on the mysql group as follows:
- PV using /dev/sdb
- VG named ‘databases_vg’
- LV name ‘databases_lv’
- ext4 filesystem with the volume label of ‘DATABASES’
- Mounted on fstab under /data/databases
- Enables SELinux on the databases group with targeted policy

Task 6

Objectives covered:

9. Create and use templates to create customized configuration files
10. Work with Ansible variables and facts
11. Create and work with roles

Files:

index.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Server Information</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            margin: 20px;
        }
        div {
            margin-bottom: 10px;
        }
    </style>
</head>
<body>
    <h1>Server Information</h1>

    <div>
        <strong>Hostname:</strong> <span id="hostname">[HOSTNAME]</span>
    </div>

    <div>
        <strong>Node Group:</strong> <span id="group">[NODE GROUP]</span>
    </div>

    <div>
        <strong>IP Address:</strong> <span id="ip">[IP ADDRESS]</span>
    </div>

    <div>
        <strong>Timezone:</strong> <span id="timezone">[TIMEZONE]</span>
    </div>

</body>
</html>

Tasks:

Create the role /home/ansible/exam-files/roles/start-page
- Manually convert the index.html file into a jinja2 template that will set the following values and add it to the ‘start-page’ role as a template:

[HOSTNAME] - Should get the node FQDN value from an ansible fact variable
[VERSION] - Version group from the inventory
[IP ADDRESS] - Should get the node internal IP value from an ansible fact variable
[TIMEZONE] - Should

Create the main task for this role to push the template
Create the role /home/ansible/exam-files/roles/journald-persistent. This role should:
- Enable persistent journald with all the required steps
- Set the max storage to 100M
- Reload the service when changes are made
Create the playbook /home/ansible/exam-files/playbooks/task6.yml that applies the ‘start-page’ role to the ‘webservers’ group and the ‘journald-persistent’ role to all servers

Task 7

Objectives covered:

10. Work with Ansible variables and facts

Tasks:

Create a custom fact for the ‘webservers’ group with the structure below:
- app_version should be based on the version specified in the inventory file

"exam": {
    "server_info": {
        "group": "webservers",
        "app_version": "1"
    }
}

NOTE: This task can be done via a playbook or manually

Task 8

Objectives covered:

1. Be able to perform all tasks expected of a Red Hat Certified System Administrator
11. Create and work with roles

Tasks:

Before you start, remember you should have added a 10GB disk to node4 and increased its memory to 1024M

Create the role /home/ansible/exam-files/roles/postgresql that does the following:
- Creates a VDO on the 10G disk with:
  - VDO name is ‘databases_vdo’
  - 20G logical size
  - Deduplication disabled
  - Auto write mode (write policy)
Perform needed VDO steps, as per RHCSA
- Create a logical volume with:
  - PV using the vdo device
  - VG named ‘databases_vg’
  - LV name ‘databases_lv’
- Format and mount with:
  - ext4 filesystem (using VDO requirements)
  - Mounted on fstab under /data/databases
Follow vdo mount requirements, as per RHCSA
- Installs the postgresql package group - @postgresql
- Modifies the value of Environment=PGDATA= in the systemd service for ‘postgresql.service’ to have the value below (remember the old path and make sure new value is reloaded) Environment=PGDATA=/data/databases/postgresql_data
- Creates the directory /data/databases/postgresql_data
- Sets the ownership of /data/databases/postgresql_data to postgres:postgres with rwx------
- Initializes the DB with postgresql-setup --initdb
  - Should only run during setup
- Enables the SELinux boolean selinuxuser_postgresql_connect_enabled
- Enables and starts the service postgresql.service
  - The service should be restarted whenever the systemd unit file for postgresql.service is changed
See warning below
Create the playbook /home/ansible/exam-files/playbooks/deploy-postgresql.yml that pushes this role to the ‘postgresql’ group
Add the following to the same playbook as tasks:
- Creates the dir /data/db_troubleshoot
- Sets the ownership of /data/db_troubleshoot to postgres:postgres with rwx------
- Creates the group ‘pgsqladmin’
- Creates the user ‘dbadmin’ with primary group of ‘pgsqladmin’
- Adds an ACL that gives the ‘pgsqladmin’ group full access to /data/db_troubleshoot. This should also be the default ACL for new files

WARNING

The postgresql service will fail to start. You will need to logon to the server and fix the issue. The solution/fix can be done manually, but it needs to be part of the playbook.

TIP

While creating the VDO device you may run into the error below:
 fatal: [node4]: FAILED! => {
     "changed": false,
     "module_stderr": "Shared connection to node4 closed.\r\n",
     "module_stdout": "/tmp/ansible_vdo_payload_crp07req/ansible_vdo_payload.zip/ansible/modules/system/vdo.py:330: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.\r\n/bin/sh: line 1:  6280 Killed                  /usr/libexec/platform-python /home/ansible/.ansible/tmp/ansible-tmp-1701096243.3300107-7102-276967642935618/AnsiballZ_vdo.py\r\n",
     "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
     "rc": 137
 }
If that’s the case, fully remove the vdo device and then apply the patch below. While this is not part of the exam, it’s a good skill to acquire.

https://github.com/ansible-collections/community.general/pull/5632/files

You can identify the path for the Ansible code with ansible --version. Then browse to the module shown in that commit and modify the two lines. Note that the line numbers may not match exactly, but they should be pretty close.

Task 9

Objectives covered:

12. Download roles from an Ansible Galaxy and use them
13. Use Ansible Vault in playbooks to protect sensitive data

Tasks:

Using ansible-galaxy search for and download the ‘mysql’ role by ‘geerlingguy’
Create a vault password file and add it to ansible.cfg
Create the variable file /home/ansible/exam-files/vars/mysql.yml and add the following variables:
- mysql_root_username: root
- mysql_root_password: sqlrootpassword
Encrypt the variable file with ansible vault
Modify the role so that it:
- Changes the root credentials
- Saves the root credentials to ~/.my.rc
Create the playbook /home/ansible/exam-files/playbooks/deploy-mysql.yml that pushes the role to the mysql group

Task 10

Objectives covered:

Use provided documentation to look up specific information about Ansible modules and commands

Tasks:

Create the file /home/ansible/exam-files/ansible.cfg.template with a dump of all possible env and config values. For example:

ACTION_WARNINGS:
  default: true
  description: [By default Ansible will issue a warning when received from a task
      action (module or action plugin), These warnings can be silenced by adjusting
      this setting to False.]
  env:
  - {name: ANSIBLE_ACTION_WARNINGS}
  ini:
  - {key: action_warnings, section: defaults}
  name: Toggle action warnings
  type: boolean
  version_added: '2.5'

Create the file /home/ansible/exam-files/ansible.cfg.dump with all the current variables/settings. For example:

ACTION_WARNINGS(default) = True
AGNOSTIC_BECOME_PROMPT(default) = True
ALLOW_WORLD_READABLE_TMPFILES(default) = False
ANSIBLE_CONNECTION_PATH(default) = None
ANSIBLE_COW_PATH(default) = None
ANSIBLE_COW_SELECTION(default) = default
ANSIBLE_COW_WHITELIST(default) = ['bud-frogs', 'bunny', 'cheese', 'daemon', 'default', 'dragon', 'elephant-in-snake', '>
ANSIBLE_FORCE_COLOR(default) = False
ANSIBLE_NOCOLOR(default) = False
ANSIBLE_NOCOWS(default) = False

Create the file /home/ansible/exam-files/ansible-modules.txt with a list of all the Ansible modules available on this system. For example:

a10_server                                                    Manage A10 Networks AX/SoftAX/Thunder/vThunder device...
a10_server_axapi3                                             Manage A10 Networks AX/SoftAX/Thunder/vThunder device...
a10_service_group                                             Manage A10 Networks AX/SoftAX/Thunder/vThunder device...
a10_virtual_server                                            Manage A10 Networks AX/SoftAX/Thunder/vThunder device...
aci_aaa_user                                                  Manage AAA users (aaa:User)
aci_aaa_user_certificate                                      Manage AAA user certificates (aaa:UserCert)
aci_access_port_block_to_access_port                          Manage port blocks of Fabric interface policy leaf pr...
aci_access_port_to_interface_policy_leaf_profile              Manage Fabric interface policy leaf profile interface...
aci_access_sub_port_block_to_access_port                      Manage sub port blocks of Fabric interface policy lea...
aci_aep                                                       Manage attachable Access Entity Profile (AEP) objects...
aci_aep_to_domain                                             Bind AEPs to Physical or Virtual Domains (infra:RsDom...
aci_ap                                                        Manage top level Application Profile (AP) objects (fv...
aci_bd                                                        Manage Bridge Domains (BD) objects (fv:BD)

Install jinja2 html documentation

Testing New Hard Drives

21 Nov, 2023 Hardware Storage Linux

So you got a spanking new hard drive for your NAS and you are ready to install it… but wait! What if the drive is bad?

This is not something that most people would think of, but that shiny new drive could already have come with defects (a.k.a. extra features) from the factory. Or maybe it was part of a fun game of “throw the client’s package” that some delivery men like to play (as my preferred social media likes to show me). So before we install this new piece of hardware that has the potential to render all my data, accumulated from years of hoarding, useless, let’s do some testing.

S.M.A.R.T Testing

Let’s start with a S.M.A.R.T (Self-Monitoring, Analysis, and Reporting Technology) test.

Wikipedia: Self-Monitoring, Analysis and Reporting Technology (S.M.A.R.T.)

SMART is an interface between the platform’s BIOS and the storage device. When SMART is enabled in the BIOS (mostly default), the BIOS can process information from the storage device and determine whether to send a warning message about potential failure of the storage device. The purpose of SMART is to warn a user of impending drive failure while there is still time to take action, such as backing up the data or copying the data to a replacement device.

First we need to check whether the drive supports S.M.A.R.T. testing. Most modern drives should.

sudo smartctl -i /dev/sdX

You should get an output similar to the one below:

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD60EFPX-68C5ZN0
Serial Number:    WD-WX12D12312345
LU WWN Device Id: 5 0014ee 26b395dd4
Firmware Version: 81.00A81
User Capacity:    6,001,175,126,016 bytes [6.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        Not in smartctl database 7.3/5528
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Nov  9 08:21:36 2023 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

I got the following error because I’m using a USB-C adapter:

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.5.8-arch1-1] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

/dev/sda: Unknown USB bridge [0x14b0:0x0200 (0x100)]
Please specify device type with the -d option.

Use smartctl -h to get a usage summary

If that’s the same for you, you can try using it with -d sat, and if your adapter is supported it should work.

sudo smartctl -d sat -i /dev/sdX
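If you are checking several drives, reading the info section by eye gets old quickly. Here is a minimal sketch of a helper that reads the smartctl -i output and succeeds only when both "SMART support is" lines look like the ones in the sample output above. The function name smart_ready is my own invention for illustration, and it assumes the output wording matches what is shown above.

```shell
# Hypothetical helper (name is an assumption, not part of smartmontools).
# Reads `smartctl -i` output on stdin and returns 0 only when the drive
# reports SMART as both available and enabled.
smart_ready() {
    info="$(cat)"
    printf '%s\n' "$info" | grep -q '^SMART support is: *Available' &&
        printf '%s\n' "$info" | grep -q '^SMART support is: *Enabled'
}

# Usage (replace /dev/sdX, and add -d sat if you are behind a USB adapter):
#   sudo smartctl -i /dev/sdX | smart_ready && echo "ready for self-tests"
```

It only checks those two lines, so treat it as a quick gate and not a replacement for reading the full output.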

Once we have confirmed that the drive supports S.M.A.R.T. testing we can start. We are interested in the following three tests:

Short - Designed to rapidly identify a defective hard drive, so its maximum run time is 2 minutes
Long - Designed as the final test in production. It is the same as the short test with two differences: there is no time restriction, and the Read/Verify segment checks the entire disk rather than just a section
Conveyance - Identifies damage incurred during transport of the hard disk, and takes just a few minutes

We specify the test using the -t flag:

smartctl -t [short|long|conveyance] [dev]

The test runs in the background and we can check its status by grepping for Self-test execution status.

In the example below we can see that the test is in progress and is 80% complete:

$ sudo smartctl -a /dev/sda | grep -A 1 'Self-test execution status:'
Self-test execution status:      ( 242)	Self-test routine in progress...
					20% of test remaining.

We can use the same command to check the test result. Just change -A 1 to -A 2 in grep:

$ sudo smartctl -a /dev/sda | grep -A 2 'Self-test execution status:'
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever
					been run.
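Instead of re-running that command by hand, the "remaining" percentage can be pulled out and polled. The sketch below assumes the status text has the same shape as the output shown above ("NN% of test remaining."). The helper name selftest_remaining and the 60-second interval are assumptions of mine.

```shell
# Hypothetical helper (name is an assumption). Reads `smartctl -a` output on
# stdin and prints the percentage of the self-test still remaining.
# Prints nothing when no "of test remaining" line is present.
selftest_remaining() {
    sed -n 's/^[[:space:]]*\([0-9][0-9]*\)% of test remaining\..*$/\1/p'
}

# Polling sketch (replace /dev/sdX; the interval is arbitrary):
#   while left="$(sudo smartctl -a /dev/sdX | selftest_remaining)"; [ -n "$left" ]; do
#       echo "${left}% remaining"
#       sleep 60
#   done
```

When the loop exits, the test is no longer running, so that is the moment to look at the actual result.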

Another option is to use the -l selftest flag:

$ sudo smartctl -l selftest /dev/sda
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.5.8-arch1-1] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%        20         -

Also check the following string after each test:

$ sudo smartctl -a /dev/sda | grep 'test result'
SMART overall-health self-assessment test result: PASSED

Now go ahead and run the short and conveyance tests (or all 3 if you have the time). Here’s how long it took for me to run on a 6TB WD Red Plus (WD60EFPX) over USB-C (Nov 2023):

conveyance - 1m13s
short - 2m
long - 11h20m

If you have more than one hard drive to test, and you can plug them in at the same time, you can run the tests in parallel.

Once the tests have completed and you have confirmed that they passed, also check the attributes and their thresholds near the end of the output. It will look similar to this:

Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   100   253   021    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       2
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       46
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       0
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       3
194 Temperature_Celsius     0x0022   107   104   000    Old_age   Always       -       43
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

You want to pay attention to:

Offline_Uncorrectable - Damaged sectors that don’t respond to any read/write requests (bad sectors). These sectors are remapped to spare sectors.
Reallocated_Sector_Ct - Count of damaged sectors that were remapped to spare sectors.
Current_Pending_Sector - Indicates the number of damaged sectors that are yet to be remapped or reallocated. This number could indicate that spare sectors are not available, and data from bad sectors can no longer be remapped.
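On a brand new drive I would expect the raw value of all three to be 0, as in the table above. Here is a rough sketch that picks those three rows out of the attribute table and complains about any non-zero RAW_VALUE. It assumes the ten-column layout shown above with a plain number in the last column, and the name smart_bad_sectors is made up for this example.

```shell
# Hypothetical helper (name is an assumption). Reads the SMART attribute table
# on stdin and prints NAME=RAW_VALUE for any of the three attributes above
# whose raw value (10th column) is not zero.
# Exit status: 0 when all of them are zero, 1 otherwise.
smart_bad_sectors() {
    awk '
        BEGIN { bad = 0 }
        $2 == "Offline_Uncorrectable" ||
        $2 == "Reallocated_Sector_Ct" ||
        $2 == "Current_Pending_Sector" {
            if ($10 + 0 != 0) { print $2 "=" $10; bad = 1 }
        }
        END { exit bad }
    '
}

# Usage (replace /dev/sdX):
#   sudo smartctl -A /dev/sdX | smart_bad_sectors || echo "look closer at this drive"
```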

Badblocks

Before we continue, let’s just make sure that you are indeed testing a new set of spinning rust (a.k.a. hard drive), and not an SSD or an NVMe. We don’t want to run badblocks on the latter two.

Overview

Arch Wiki: badblocks

S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) is featured in almost every HDD still in use nowadays, and in some cases it can automatically retire defective HDD sectors. However, S.M.A.R.T. only passively waits for errors while badblocks can actively write simple patterns to every block of a device and then check them, searching for damaged areas (Just like memtest86* does with RAM).

Now that we have an understanding of what badblocks does, let’s take some time to digest it. We will be writing to every block on your new hard drive and then reading it back to confirm that the data was written correctly. As if that wouldn’t already take long enough, badblocks will do it not once, but four times (with four different patterns).

Arch Wiki

As the pattern is written to every accessible block, the device effectively gets wiped. The default is an extensive test with four passes using four different patterns: 0xaa (10101010), 0x55 (01010101), 0xff (11111111) and 0x00 (00000000). For some devices this will take a couple of days to complete.

As with smartctl, you can run multiple instances of badblocks in parallel if you have multiple disks. You can also shorten the test by increasing the number of blocks that are tested at a time (-c), or by specifying a single pattern to be written with the -t option, e.g.: -t '0xaa', which will force it to do only one pass. If you specify multiple patterns, e.g.: -t '0xaa' -t '0x55', you will essentially be running multiple passes.

Another option is to use the random pattern option with -t random. This will make badblocks use random patterns for the test, with only one pass (unless you specify -p).

Just keep in mind that the different patterns used by the default write-mode work better because they let you validate against stuck bits. Depending on the number of drives, the available drive buses, and the time that you have, you might not be able to run the full test. That’s a decision that only you can make.

I wanted to time my tests to help you make a decision, but as I ran them over a USB-C adapter my badblocks runs seem to have maxed out at around 41 MB/s:

  Total DISK READ :       0.00 B/s | Total DISK WRITE :      41.74 M/s
  Actual DISK READ:       0.00 B/s | Actual DISK WRITE:      41.74 M/s
       TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                          
  1516251 be/4 root        0.00 B/s   41.74 M/s  ?unavailable?  badblocks -wsvb 4096 -t 0x00 /dev/sda -o badblocks-output.log

With that in mind, here are the timings from my latest test on a 6TB WD Red Plus (WD60EFPX) over USB-C (Nov 2023):

Write mode - 83h
Random write mode - 81h
Write mode one pattern - 80h

And spoiler alert… we will be running the long test in smartctl once badblocks finishes. So also take that into account.

Running the Test

First, let’s take a look at what your drive’s recommended blocksize is:

sudo blockdev --getbsz /dev/sdX

Because this test will run for a while, start your preferred terminal multiplexer (e.g.: screen, tmux), change into root, and run badblocks:

time badblocks -wsvb {blocksize} /dev/sdX -o [output_file]

time is a separate command that reports how long badblocks actually ran once it completes.
-w uses the write-mode test, which is a destructive action.
-s shows an estimate of the progress of the scan. This isn’t 100% accurate, but it’s better to see something than nothing at all.
-b {blocksize} specifies the block size. Be sure to replace {blocksize} with the number you found with the previous command (blockdev --getbsz /dev/sdX).
/dev/sdX is the drive you want to test. Replace it with the actual drive. Be extra careful, as you don’t want to accidentally destroy data on the wrong disk.
-v enables verbose mode.
-o sets the output file for the list of bad blocks. Without -o, badblocks simply writes the list to STDOUT.
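Once the run finishes, the output file is the quickest thing to check: badblocks writes one block number per line for every bad block it finds, so an empty file is what we are hoping for. Here is a small sketch of that check. The function name badblocks_found is hypothetical, and badblocks-output.log is just the file name from my run above.

```shell
# Hypothetical helper (name is an assumption). Prints how many bad blocks are
# listed in a badblocks output file (one block number per line).
badblocks_found() {
    # grep -c exits non-zero when the count is 0, so mask that with `|| true`.
    grep -c '[0-9]' "$1" || true
}

# Usage:
#   [ "$(badblocks_found badblocks-output.log)" = "0" ] && echo "no bad blocks listed"
```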

S.M.A.R.T. Again

Once the badblocks test is complete, run another long smartctl test and check to make sure that everything is still good.
