Tag Archives: PostgreSQL

Ansible: Populating variables with results from a database query

In a recent post I explained the use of command substitution in Ansible, focusing on the stdout and stderr attributes when registering variables and using their contents for server configuration. In this post I share an example of populating Ansible variables from a database query that provides status information that could be used to change a setting or remediate a specific condition.

Querying the database

Here we query a PostgreSQL master database that has an associated hot standby to determine its replication synchronisation status. The SQL is as follows:

select sync_state from pg_stat_replication;

The playbook

If the status is “async”, the debug task echoes the current status in the message “The standby database is in async replication mode” and then fails.

---
- hosts: primary1

  pre_tasks:
    - assert: { that: "{{item}} is defined" }
      with_items:
        - db_name
        - db_port

  tasks:

    - name: Perform database replication sync status check
      shell: >
        psql -p {{db_port}} -d {{db_name}} -U postgres -c "select sync_state from pg_stat_replication;"
      become: true
      become_user: postgres
      register: rep_sync_state

    # - debug: var=rep_sync_state

    - debug: msg="The standby database is in {{ item[2] }} replication mode"
      with_nested: "{{ rep_sync_state.stdout_lines }}"
      failed_when: "'async' in rep_sync_state.stdout"

Note the use of the “with_nested” parameter to build an array from the lines returned by the query (rep_sync_state.stdout_lines). With psql's default output, the first two lines are the column header and a separator, so we require the 3rd element ( [2] ) to populate our dynamic variable {{ item }}.
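To see why index 2 is the element that matters, here is a small shell sketch that works on a hard-coded sample of psql's default aligned output instead of a live database. The sample text and the sync_state_from_output function name are assumptions made for illustration; sed -n '3p' selects the third line, which corresponds to stdout_lines[2].

```shell
#!/bin/sh
# Illustrative sample of psql's default (aligned) output for the query above.
# It is hard-coded here; a real run would capture it from psql.
sample_output=' sync_state
------------
 async
(1 row)'

# Print the third line (the header and separator come first) and strip spaces.
sync_state_from_output() {
    printf '%s\n' "$1" | sed -n '3p' | tr -d ' '
}

sync_state_from_output "$sample_output"
```

psql also has the -A (unaligned) and -t (tuples only) flags which, when combined, print only the value, so no line index would be needed at all.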

The runtime output

The playbook runtime output is displayed below with the query results highlighted in red.

[Screenshot: hot-standby-sync_state2]

The debug output

The debug output of the rep_sync_state variable (- debug: var=rep_sync_state ) from the above example displays a wealth of information with all available attributes, as shown below.

[Screenshot: hot-standby-sync_state]

One step further

Taking the concept one step further, we can query more than one table, column or attribute to initialise a two-dimensional array, then use the data to populate variables and display additional status information, as shown in the example below.

[Screenshot: hot-standby-sync_state3]

The playbook

In order to read each element of the array, we must use the following syntax that includes the split function where we specify the delimiter, in this case a pipe ( | ), plus the required element ( [0] ).

{{ item.split('|')[0] }}

---
- hosts: primary1

  pre_tasks:
    - assert: { that: "{{item}} is defined" }
      with_items:
        - db_name
        - db_port

  tasks:

    - name: Perform database replication sync status check
      shell: >
        psql -p {{db_port}} -d {{db_name}} -U postgres -c "select sync_state, state from pg_stat_replication;"
      become: true
      become_user: postgres
      register: rep_sync_state

    - debug: var=rep_sync_state.stdout_lines

    - debug: msg="The standby database is in {{ item.split('|')[0] }} mode with {{ item.split('|')[1] }} replication"
      with_items: "{{ rep_sync_state.stdout_lines[2] }}"
      failed_when: "'async' in rep_sync_state.stdout"
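The same pipe-delimited split can be tried outside Ansible. The sketch below is an illustration only: the row is a hard-coded example of what psql prints for the two-column query, and the field helper is a hypothetical name. It also trims the padding that psql adds around each column, which split('|') on its own leaves in place.

```shell
#!/bin/sh
# Illustrative data row as psql would print it for "select sync_state, state".
row=' async      | streaming'

# Split a pipe-delimited row and print field number $2 (1-based), trimmed.
field() {
    printf '%s\n' "$1" | awk -F'|' -v n="$2" '{ gsub(/^ +| +$/, "", $n); print $n }'
}

field "$row" 1   # sync_state column
field "$row" 2   # state column
```

In the playbook itself, Jinja2's trim filter could serve the same purpose, for example {{ item.split('|')[0] | trim }}.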

PostgreSQL

Fortunately, the PostgreSQL psql utility has an “expanded output” ( \x ) feature that prints each record of a multi-column query with one column per line. This provides excellent readability in the stdout debug messages and removes the need to select individual elements from the array as previously described. The playbook and runtime output are shown below.

The playbook

---
- hosts: primary1

  pre_tasks:
    - assert: { that: "{{item}} is defined" }
      with_items:
        - db_name
        - db_port

  tasks:

    - name: Perform database replication sync status check
      shell: "{{ item }}"
      with_items:
        - echo "\x" > /tmp/tmp.sql
        - echo "select * from pg_stat_replication;" >> /tmp/tmp.sql
      become: true
      become_user: postgres

    - shell: cat /tmp/tmp.sql | psql -p {{db_port}} -d {{db_name}} -U postgres
      become: true
      become_user: postgres
      register: rep_sync_state

    - file: dest=/tmp/tmp.sql state=absent

    - debug: var=rep_sync_state.stdout_lines

The runtime output

[Screenshot: hot-standby-sync_state4]

If anyone has a more elegant way of populating variables or sending multiple commands to a PostgreSQL database without writing a new Ansible module, then please leave a comment.
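One possible simplification, offered as a sketch and not as a tested replacement: psql accepts -x (--expanded) on the command line, so the \x meta-command does not have to be written to a temporary file first. The fragment below runs against a hard-coded, shortened sample of expanded output instead of a live database; the sample text and the expanded_field helper are assumptions made for illustration.

```shell
#!/bin/sh
# Illustrative, shortened sample of expanded output; a real run would come from
#   psql -x -p "$db_port" -d "$db_name" -U postgres -c "select * from pg_stat_replication;"
sample_output='-[ RECORD 1 ]----+----------
state            | streaming
sync_state       | async'

# Print the value of one named field from expanded ("name | value") output.
expanded_field() {
    printf '%s\n' "$1" | awk -F'|' -v name="$2" '
        { key = $1; gsub(/ /, "", key) }
        key == name { val = $2; gsub(/^ +| +$/, "", val); print val }'
}

expanded_field "$sample_output" sync_state
```

Because every line of expanded output carries its own column name, a value can be looked up by name, with no need to count lines or split rows by position.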


Automating PostgreSQL hot standby DB resynch after failover

In a similar fashion to Oracle, the PostgreSQL database offers a DR solution that enables data replication to a standby server. A number of options exist, from logical data replication and log shipping to real-time data streaming to a physical hot standby. The latter enables read-only access to a mirror copy of the master database, known as query offload, and has the added advantage of providing failover to the standby database should the master fail.

One point that is often overlooked: when the standby becomes the master, it cannot automatically return to being a standby database. In fact, if the former master is still online, we have a split-brain situation where an application can read and write to two master databases. It is therefore essential that the former master is shut down when performing failover testing.

The Ansible playbook discussed in this post overcomes this problem by resynchronizing the former master database with the new master using the pg_rewind utility, which synchronizes one PostgreSQL data directory with another and thus allows the hot standby database to be recreated.

Promoting a PostgreSQL standby database to be a master is easy. Simply touch the “trigger” file specified in the recovery.conf file on the standby server to automate the failover. If the standby is synchronized with the master, it will immediately become the new master, the trigger file will be deleted and the recovery.conf file will be renamed to recovery.done. Now we can make the former master a hot standby…
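The promotion step can be sketched in plain shell as follows. This is an illustration under stated assumptions: it works in a scratch directory with a made-up recovery.conf (the file contents and the failover.trigger name are hypothetical), whereas on a real standby the file lives in the PostgreSQL data directory and the touch must be done as the postgres user.

```shell
#!/bin/sh
# Work in a scratch directory with an illustrative recovery.conf;
# on a real standby this file lives in the PostgreSQL data directory.
data_dir=$(mktemp -d)
cat > "$data_dir/recovery.conf" <<EOF
standby_mode = 'on'
primary_conninfo = 'host=192.168.0.191 port=5432 user=repuser'
trigger_file = '$data_dir/failover.trigger'
EOF

# Extract the trigger_file value, dropping the surrounding single quotes.
trigger_file=$(awk -F"'" '/^trigger_file/ { print $2 }' "$data_dir/recovery.conf")

# Touching the file is what signals the standby to promote itself.
touch "$trigger_file"
ls "$trigger_file"
```

Splitting on the single quote avoids the separate sed step for removing quotes and also copes with paths that contain spaces.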

The steps

The automated steps executed by the postgres-cluster-switchover-to-standby.yml playbook are as follows:

  1. Stop the postgres service on the current primary (master) server
  2. Promote the standby database to become a primary by touching the trigger_file on current standby server
  3. Configure the former standby database parameter wal_log_hints in postgresql.conf
  4. Enable postgres user host based access in pg_hba.conf on former standby (new primary) database
  5. Restart the postgres service on the former standby (new primary) server
  6. Execute the pg_rewind utility on the former primary (new standby) server to synchronize with the new primary database
  7. Configure the former primary (new standby) database parameters archive_mode, max_wal_senders and wal_log_hints in postgresql.conf
  8. Configure the former standby (new primary) database parameter archive_command in postgresql.conf
  9. Configure the former standby (new primary) database. Disable postgres user host based access in pg_hba.conf
  10. Configure the former standby (new primary) database. Enable replication user host based access in pg_hba.conf
  11. Restart the postgres service on former standby (new primary) server
  12. Wait for the new standby database to synchronize
  13. Rename recovery.done to recovery.conf on the former primary (new standby) database
  14. Configure the former primary (new standby) database primary_conninfo parameter in recovery.conf
  15. Configure the former primary (new standby) database max_wal_senders and wal_log_hints parameters in postgresql.conf
  16. Restart the postgres service on former primary (new standby) server
  17. Perform database recovery check on new standby database

The playbook

---
- hosts: primary

  # YAML script: postgres-cluster-switchover-to-standby.yml
  # Usage : ansible-playbook postgres-cluster-switchover-to-standby.yml --extra-vars "db_name=postgres db_repuser=repuser db_rep_password=repuser123 db_port=5321 db_data_dir=/var/lib/pgsql/9.5/data primary_ip=192.168.0.191 standby_ip=192.168.0.101"

  pre_tasks:
    # standby_ip is checked as well because later plays depend on it
    - assert: { that: "{{item}} is defined" }
      with_items:
        - db_name
        - db_repuser
        - db_rep_password
        - db_port
        - primary_ip
        - standby_ip
        - db_data_dir

  tasks:

    - name: Stopping the postgres service on the current master
      service: name=postgresql-9.5 state=stopped sleep=30

- hosts: standby
  tasks:

    # Initialising the global variable
    - shell: chdir="{{ db_data_dir }}" grep trigger_file recovery.conf | awk '{print $3}' | sed s/\'//g
      register: triggerfile
      ignore_errors: True
    - debug:
        var: triggerfile.stdout

    - name: Initiating switchover to standby. Promoting to master
      file: path={{ triggerfile.stdout }} state=touch
      become: true
      become_user: postgres
 
- hosts: primary
  tasks:

    - name: Enabling wal_log_hints in postgresql.conf on former master
      replace: dest={{ db_data_dir }}/postgresql.conf regexp="#?wal_log_hints\s+=\s+[o].*" replace="wal_log_hints = on"
      become: true
      become_user: postgres

    - name: Starting the postgres service on former master
      service: name=postgresql-9.5 state=started

    - name: Stopping the postgres service on former master
      service: name=postgresql-9.5 state=stopped

- hosts: standby
  tasks:

    # The check uses Ansible variables (shell variables such as $db_data_dir are
    # not set on the target) and looks for the exact line that is appended.
    - name: Configuring former master DB. Enabling postgres user access in pg_hba.conf
      shell: >
        if ! grep -q "^host postgres postgres {{ primary_ip }}/32 trust" {{ db_data_dir }}/pg_hba.conf; then
        echo "host postgres postgres {{ primary_ip }}/32 trust" >> {{ db_data_dir }}/pg_hba.conf;
        fi
      become: true
      become_user: postgres

    - name: Restarting the postgres service on former standby
      service: name=postgresql-9.5 state=restarted

- hosts: primary
  tasks:

    - name: Executing the pg_rewind utility on former master ..
      shell: /usr/pgsql-9.5/bin/pg_rewind --target-pgdata=/var/lib/pgsql/9.5/data --source-server="host={{ standby_ip }} port={{ db_port }} user=postgres dbname={{ db_name }}"
      become: true
      become_user: postgres

- hosts: standby
  tasks:

    - name: Configuring former master DB. Configuring archive_mode, max_wal_senders and wal_log_hints parameters in postgresql.conf
      replace: dest={{db_data_dir}}/postgresql.conf regexp={{ item.src }} replace={{ item.tgt }}
      with_items:
        - { src: '^#?archive_mode\s+=\s+[o][n]?[ff]?', tgt: 'archive_mode = on' }
        - { src: '^max_wal_senders\s+=\s+.*', tgt: 'max_wal_senders = 3' }
        - { src: '^wal_log_hints\s+=\s+[o].', tgt: '#wal_log_hints = on' }
      become: true
      become_user: postgres

    - name: Configuring former standby DB. Adding archive_command config to postgresql.conf
      replace: dest={{db_data_dir}}/postgresql.conf regexp="^#?archive_command\s+=\s+\'.*\'" replace="archive_command = 'cp -i %p {{ db_data_dir }}/archive/%f'"
      become: true
      become_user: postgres

    - name: Configuring former standby DB. Disabling postgres user access in pg_hba.conf
      replace: dest={{db_data_dir}}/pg_hba.conf regexp="^host\s+postgres\s+postgres\s+.*/32\s+trust" replace="#host postgres postgres {{ primary_ip }}/32 trust"
      become: true
      become_user: postgres

    - name: Configuring former standby DB. Enabling replication user access in pg_hba.conf
      replace: dest={{db_data_dir}}/pg_hba.conf regexp="^host\s+replication\s+.*\s+.*/32\s+md5" replace="host replication {{ db_repuser }} {{ primary_ip }}/32 md5"
      become: true
      become_user: postgres

    - name: Restarting the postgres service on former standby
      service: name=postgresql-9.5 state=restarted

- hosts: primary

  post_tasks:
    - name: Perform database recovery check
      shell: tail -2 $(ls -1rt | tail -1) | grep "database system is ready to accept read only connections" chdir={{db_data_dir}}/pg_log
      register: db_is_in_recovery
      ignore_errors: true
      become: true
      become_user: postgres

    # The message now interpolates the captured stderr instead of printing the variable name
    - fail: msg="Standby DB creation failed - {{ db_is_in_recovery.stderr }}"
      when: db_is_in_recovery.stderr != ""

  handlers:
    - name: restart-postgres
      service: name=postgresql-9.5 state=restarted

  tasks:
    - name: Waiting for standby DB to synchronise ..
      wait_for: path={{ db_data_dir }}/recovery.done timeout=180

    - name: Configuring former master DB. Renaming recovery.done to recovery.conf
      shell: mv {{ db_data_dir }}/recovery.done {{ db_data_dir }}/recovery.conf
      become: true
      become_user: postgres

    - name: Configuring former master DB. Enabling replication user access in recovery.conf
      replace: dest={{db_data_dir}}/recovery.conf regexp="^primary_conninfo\s+=\s+'host=.*\s+port=.*\s+user=.*\s+password=.*" replace="primary_conninfo = 'host={{ standby_ip }} port={{ db_port }} user={{ db_repuser }} password={{ db_rep_password }}'"
      become: true
      become_user: postgres

    - name: Configuring former master DB. Configuring max_wal_senders and wal_log_hints parameters in postgresql.conf
      replace: dest={{db_data_dir}}/postgresql.conf regexp={{ item.src }} replace={{ item.tgt }}
      with_items:
        - { src: '^max_wal_senders\s+=\s+.*', tgt: 'max_wal_senders = 3' }
        - { src: '^wal_log_hints\s+=\s+[o].', tgt: '#wal_log_hints = on' }
      become: true
      become_user: postgres

    - name: finished
      shell: date
      notify:
        - restart-postgres


Command substitution gotcha in Ansible Playbook

Command substitution assigns the output (stdout) of a command, or even of multiple commands, to a variable. The mechanism is useful for dynamically populating variables based on the environment in which a program is executed.
In the case of cloud orchestration, we may wish to reconfigure memory parameters for an application or service. The example I have chosen is for a PostgreSQL database deployment, where I need to adjust the memory parameters to their optimum value based on the required percentage of total available memory.

For example, among others, I wish to change the work_mem parameter from its default value.

On the Ansible server, I execute a playbook that contains the following code snippet, which assigns a value to the “workmem” variable using the Ansible “register” keyword and then uses the “replace” module to replace the parameter value in the postgresql.conf file.

  # Initialising the global variable
  - shell: echo $(cat /proc/meminfo | grep MemTotal | awk '{print $2}') / 100 / 1024 |bc
    register: workmem
    ignore_errors: True

  # Using variable in replace statement
  - name: Configure memory parameters ( work_mem = {{ workmem }}MB )
    replace: dest={{db_data_dir}}/postgresql.conf regexp="^#?work_mem\s+=\s+[1-9]*[kMGT]B" replace="work_mem = {{ workmem }}MB"
    become: true
    become_user: postgres

  - name: Restarting the postgres service
    service: name=postgresql-9.5 state=restarted


This results in the following error when Ansible attempts to restart the PostgreSQL instance at the end of the playbook.

TASK [Restarting the postgres service] *****************************************
 fatal: [10.127.3.18]: FAILED! => {"changed": false, "failed": true, "msg": "Stopping postgresql-9.5 service: [ OK ]\r\nStarting postgresql-9.5 service: [FAILED]\r\n"}

NO MORE HOSTS LEFT *************************************************************
 to retry, use: --limit @/home/ansible/yaml/postgres-main2.retry

PLAY RECAP *********************************************************************
 10.127.3.18 : ok=23 changed=7 unreachable=0 failed=1

The gotcha

The register statement stores the result of a single task in a variable. However, for a shell task that result is not just the string printed by the command: it is the complete result record, including stdout, stderr, the return code, the command line and timing information. This is visible in the postgresql.conf file we are trying to modify on the target server.

postgres@db_host[~] $ cd /var/lib/pgsql/9.5/data
postgres@db_host[data] $ view postgresql.conf

work_mem = {u'changed': True, u'end': u'2016-10-12 03:15:56.780380', u'stdout': u'9', u'cmd': u"echo $(cat /proc/meminfo | grep MemTotal | awk '{print $2}') / 100 / 1024 |bc", u'start': u'2016-10-12 03:15:56.775608', u'delta': u'0:00:00.004772', u'stderr': u'', u'rc': 0, 'stdout_lines': [u'9'], u'warnings': []}MB

The solution

The variable stores a dictionary of values, not a single string. The solution to this problem is to make Ansible substitute only the required element, in this case stdout. This is achieved using the following syntax:

- debug:
    var: <variable_name>.stdout

or

{{ <variable_name>.stdout }}

An example of Ansible command substitution from a shell task is shown below:

  # Initialising the global variable
  - shell: echo $(cat /proc/meminfo | grep MemTotal | awk '{print $2}') / 100 / 1024 |bc
    register: workmem
    ignore_errors: True
  - debug:
      var: workmem.stdout

  # Using variable in replace statement
  - name: Configure memory parameters ( work_mem = {{ workmem.stdout }}MB )
    replace: dest={{db_data_dir}}/postgresql.conf regexp="^#?work_mem\s+=\s+[1-9]*[kMGT]B" replace="work_mem = {{ workmem.stdout }}MB"
    become: true
    become_user: postgres
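As a cross-check of the arithmetic in the shell task, the calculation can be reproduced with shell integer arithmetic. This is a minimal sketch: the MemTotal figure is hard-coded from the totalmem debug output in this post, where the real task reads it from /proc/meminfo, and the variable names are illustrative.

```shell
#!/bin/sh
# MemTotal value in kB, taken from the totalmem debug output in this post.
mem_total_kb=1018628

# 1% of total memory, converted from kB to MB (integer division, like bc).
work_mem_mb=$(( mem_total_kb / 100 / 1024 ))

echo "work_mem = ${work_mem_mb}MB"
```

The result matches the workmem.stdout value of 9 reported in the playbook run.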

The correct playbook execution output for provisioning a single instance PostgreSQL database is shown below:

root@ansible_host[yaml] # ansible-playbook /home/ansible/yaml/postgres-main2.yml --extra-vars "target=postgres node_ip=10.127.3.18 db_name=dbdemo db_user=pgadmin db_password=password db_port=5432 db_data_dir=/var/lib/pgsql/9.5/data"

PLAY [postgres] ****************************************************************

TASK [setup] *******************************************************************
 ok: [10.127.3.18]

TASK [assert] ******************************************************************
 ok: [10.127.3.18] => (item=node_ip)
 ok: [10.127.3.18] => (item=db_name)
 ok: [10.127.3.18] => (item=db_user)
 ok: [10.127.3.18] => (item=db_password)
 ok: [10.127.3.18] => (item=db_port)
 ok: [10.127.3.18] => (item=db_data_dir)

TASK [command] *****************************************************************
 changed: [10.127.3.18]

TASK [debug] *******************************************************************
 ok: [10.127.3.18] => {
 "totalmem.stdout": "1018628"
 }

TASK [command] *****************************************************************
 changed: [10.127.3.18]

TASK [debug] *******************************************************************
 ok: [10.127.3.18] => {
 "sharedbuf.stdout": "248"
 }

TASK [command] *****************************************************************
 changed: [10.127.3.18]

TASK [debug] *******************************************************************
 ok: [10.127.3.18] => {
 "workmem.stdout": "9"
 }

TASK [command] *****************************************************************
 changed: [10.127.3.18]

TASK [debug] *******************************************************************
 ok: [10.127.3.18] => {
 "maintworkmem.stdout": "124"
 }

TASK [command] *****************************************************************
 changed: [10.127.3.18]

TASK [debug] *******************************************************************
 ok: [10.127.3.18] => {
 "effectcachesize.stdout": "746"
 }

TASK [Add the group 'postgres'] ************************************************
 ok: [10.127.3.18]

TASK [Add the user 'postgres' and a primary group of 'postgres'] ***************
 ok: [10.127.3.18]

TASK [Intialise the DB as postgres user] ***************************************
 changed: [10.127.3.18]

TASK [Start the DB server as postgres user and enable at boot] *****************
 ok: [10.127.3.18]

TASK [Set Port binding] ********************************************************
 changed: [10.127.3.18]

TASK [Set Interface binding] ***************************************************
 ok: [10.127.3.18]

TASK [Configure memory parameters ( shared_buffers = 248MB )] ******************
 ok: [10.127.3.18]

TASK [Configure memory parameters ( work_mem = 9MB )] **************************
 ok: [10.127.3.18]

TASK [Configure memory parameters ( maintenance_work_mem = 124MB )] ************
 ok: [10.127.3.18]

TASK [Configure memory parameters ( wal_buffers = 64MB )] **********************
 ok: [10.127.3.18]

TASK [Configure memory parameters ( effective_cache_size = 746MB )] ************
 ok: [10.127.3.18]

TASK [Restarting the postgres service] *****************************************
 changed: [10.127.3.18]

TASK [Create database named dbdemo] ********************************************
 ok: [10.127.3.18]

TASK [Setup database user] *****************************************************
 ok: [10.127.3.18]

TASK [Ensure user does not have unnecessary privileges] ************************
 ok: [10.127.3.18]

TASK [Configuring DB remote access in pg_hba.conf] *****************************
 changed: [10.127.3.18]

TASK [Restarting the postgres service] *****************************************
 changed: [10.127.3.18]

TASK [Perform database connection test] ****************************************
 changed: [10.127.3.18]

TASK [debug] *******************************************************************
 skipping: [10.127.3.18]

PLAY RECAP *********************************************************************
 10.127.3.18 : ok=30 changed=11 unreachable=0 failed=0

Another top tip when populating variables in Ansible is the ability to search in the variable data (stdout & stderr) and act on a keyword.

The following post_tasks example shows the db_is_in_recovery variable being populated with the output of a SQL query that checks whether a PostgreSQL hot standby database is synchronized with its master. The check fails when “(0 rows)” is returned.

post_tasks:
  - shell: sleep 10

  # Column aliases use "as" so that no double quotes are nested inside the -c argument
  - name: Perform database recovery check
    command: psql -p {{db_port}} -d {{db_name}} -U postgres -c "select pg_last_xlog_receive_location() as receive_location, pg_last_xlog_replay_location() as replay_location where pg_last_xlog_receive_location() = pg_last_xlog_replay_location();"
    become: true
    become_user: postgres
    register: db_is_in_recovery
    ignore_errors: True
    failed_when: "'(0 rows)' in db_is_in_recovery.stdout"

  - debug:
      var: db_is_in_recovery.stdout

The runtime output:

TASK [Perform database recovery check] *****************************************
changed: [10.127.3.187]

TASK [debug] *******************************************************************
ok: [10.127.3.187] => {
 "db_is_in_recovery.stdout": " receive_location | replay_location \n------------------+-----------------\n 0/3000060 | 0/3000060\n(1 row)"
}
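The keyword test behind failed_when can be illustrated in plain shell. The sketch below is an assumption-laden example, not part of the playbook: recovery_check is a hypothetical helper and the second sample output is invented to show the failing case.

```shell
#!/bin/sh
# Return non-zero when the captured query output reports no matching rows,
# mirroring failed_when: "'(0 rows)' in db_is_in_recovery.stdout".
recovery_check() {
    case "$1" in
        *"(0 rows)"*) return 1 ;;
        *)            return 0 ;;
    esac
}

# Illustrative outputs: the first is taken from the runtime output above,
# the second is a made-up example of a standby that has not caught up.
in_sync=' receive_location | replay_location
------------------+-----------------
 0/3000060 | 0/3000060
(1 row)'
lagging=' receive_location | replay_location
------------------+-----------------
(0 rows)'

recovery_check "$in_sync" && echo "standby in sync"
recovery_check "$lagging" || echo "standby not in sync"
```

Matching on a fixed string such as "(0 rows)" is simple but depends on psql's footer text, so it is worth keeping the query and the keyword next to each other in the playbook.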
