Overview

Have you ever had more than one job template in an Ansible workflow and wanted to hand over data from one job to the next? If you have had this requirement and did not find a good answer, this article might help.
I’m writing this for Ansible Automation Controller, although I actually first solved the challenge on Ansible Tower. Since Ansible Automation Controller seems more restrictive, I’m confident that what you read here will work on Tower as well.

Problem Statement

I’m working with Ansible Controller, not from the command line. This means the sandbox mechanisms of Controller (or Tower), which ensure that tasks are executed within a sandbox from which they cannot reach the Controller host itself, kick in. Any other security mechanism Ansible Controller or Ansible Tower brings along might also have an impact.
On the other hand, you might find configuration parameters within the Web UI which modify this behaviour to your needs.

We have more than one playbook, all defined as individual job templates within Ansible Automation Controller. There is a chronological order in which these job templates should be run. One of the earlier jobs creates some data which a later job needs to know.

The question is: how can these parameters be handed over from one job template to the next? Or, in other words, how can these parameters be handed over from one playbook to the next within Ansible Automation Controller?

Options

I found multiple strategies to solve this simple request. All seem to have some positive and some negative aspects; the easy “it just works without any downside” option is difficult to find. Please find a short description of what I found in the next paragraphs.

set_fact and caching

The set_fact task sets a fact for a certain system, but typically the fact only lives for the duration of the play it was set in. The additional parameter cacheable: true duplicates the variable as an ansible_fact, which makes it accessible to the cache plugin (see [1]). Maybe it is worth exploring ansible.builtin.jsonfile as a cache plugin which writes data to JSON files. I must confess I’m not convinced that caching is reliable enough for persistence and did not follow up on this approach.

Note: During my research I also found a parameter persist: true as an undocumented parameter to set_fact. But as it is not documented, I’d recommend not using it.
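
For completeness, here is a minimal, untested sketch of how the cacheable route could look. It assumes a fact cache such as ansible.builtin.jsonfile is configured (fact_caching and fact_caching_connection in the ansible.cfg used by the execution environment); the variable names are examples only.

set a cacheable fact
- name: set a fact the configured cache plugin may persist beyond this play
  set_fact:
    my_persisted_var: "{{ var_value }}"   # example names
    cacheable: true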

dynamic inventory

If you are using a dynamic inventory you could add parameters into the source of this inventory. For instance, you add tags to the VMs of your environment; when reading the inventory from the virtualization platform you can evaluate these tags.

This approach is very dependent on the environment you want to automate and on your access rights to that environment. I personally find there might be reasons for Ansible to inject parameters into the CMDB or into the virtualization layer, but this should be reasonable on its own and not just a workaround to get persistence into Ansible across jobs.
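
Just as an illustration, and assuming an AWS-based environment with the amazon.aws collection and an aws_ec2 dynamic inventory (instance_id, the tag name and the value are made-up example variables), the write side could look roughly like this:

tag a VM with the data to persist
- name: persist a value as an instance tag
  amazon.aws.ec2_tag:
    resource: "{{ instance_id }}"   # example variables, not part of the article's setup
    state: present
    tags:
      persisted_var: "{{ var_value }}"
  delegate_to: localhost

When the dynamic inventory is read again later, the tag shows up among the host’s variables and can be evaluated by the next playbook.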

set_stats

Persistence within a workflow seems to be possible with set_stats (see [2]). I tried this a while ago and did not succeed, but I read that others were successful. Reading the documentation [3] of the module itself, “stats” seems to refer to statistics and not so much to anything static. I must confess I did not pay more attention to it.
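
For readers who want to try it anyway, a minimal sketch could look like the task below; according to the workflow documentation [2], the data should then be handed to the following nodes of the same workflow as extra variables (the variable names are examples).

pass a value to the next workflow node
- name: hand a value to later job templates in the workflow
  set_stats:
    data:
      my_persisted_var: "{{ var_value }}"   # example names
    per_host: false   # keep the value global instead of per host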

writing / reading files

I started with the perception that this should be a no-brainer: manipulating files is something Ansible seems to do all the time. I quickly found that things are not that easy in the end. Some modules only write (or at least prefer to write) to the remote system. Delegating these tasks to localhost did do something locally, but somehow not as expected. On the other hand, most modules for reading will only read from the local system.

writing / reading files on remote host

Ansible has many means to manipulate remote files (see template, copy or lineinfile). Persisting a single variable in a single task is probably done most easily with lineinfile.

Reading can be done with read_csv or slurp, both in combination with the task parameter register to save the output of the module into a variable. Unfortunately the play parameter vars_files: will only read from the local filesystem. The same is true for the task include_vars:, which only reads local files.
You could write the variables into a remote facts file, e.g. /etc/ansible/facts.d/myvars.fact. The next play – even in another playbook – would read the facts file at the start of the play and have the persisted variables available.
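
A rough sketch of that facts file approach, assuming the directory /etc/ansible/facts.d already exists on the remote host and using example names:

persist variables via a remote facts file
- name: write variables into a custom facts file (JSON format)
  copy:
    dest: /etc/ansible/facts.d/myvars.fact
    content: "{{ {'my_persisted_var': var_value} | to_nice_json }}"   # example names
    mode: '0644'
  become: true

A later play with gather_facts: true – even in another playbook – then finds the value under ansible_local, e.g. as "{{ ansible_local.myvars.my_persisted_var }}".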

In my use case this would not work, as the remote system was still to be installed: it had no OS yet and was therefore not accessible for file manipulation (nor for anything else).

writing / reading files on the “localhost”

Writing files locally can be achieved by delegating the task to localhost. Unfortunately plays / jobs are executed within a sandbox on the controller node, and the same is true in a similar way for Tower nodes. So a local write will not work out of the box. We will cover this in the next chapter.

persist variable to local file
- name: persist variable to local file
  lineinfile:
    path: "{{ somedir }}/persistence.state"
    regexp: "^{{ var_name }}:"   # anchored so only this variable's line is replaced
    line: "{{ var_name }}: {{ var_value }}"
    state: present
    create: yes
  delegate_to: localhost


Once the write issue is overcome, reading from local files is relatively easy. vars_files: can be used in the header of a play to read additional variables into Ansible’s “top level” variable namespace.
If the play finds “nasty” content within the file, any Ansible variable could be overwritten, leading to unpredictable outcomes. I therefore prefer to use the task include_vars: with the parameter name:. This reads all variable content from the file into the named variable, which avoids overwriting anything critical.

read variables from a local file
- name: read variables from a local file
  include_vars:
    file: "{{ somedir }}/persistence.state"
    name: mynamespace
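
Assuming the variable was written with the lineinfile task shown earlier, the persisted value can afterwards be referenced through the chosen namespace, for example:

use the persisted value
- name: use the persisted value
  debug:
    msg: "{{ mynamespace[var_name] }}"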

Enabling write to local files on controller node

As said, the playbook is executed within a sandbox. In the case of Ansible Controller this sandbox is called an execution environment and is technically just a container. We need to allow access to a directory outside this execution environment. This can be done within the “Settings” of the Ansible Controller Web UI. As I prefer to also automate the configuration of the Controller, please find below a play which prepares Ansible Controller to enable “persistence”.

---
- hosts: controller_hosts
  gather_facts: no
  
  tasks:
  - name: create somedir
    file:
      path: "{{ somedir }}"
      state: directory
      mode: '0755'
      owner: awx
      group: awx
    become: true

  - name: Allow containerized jobs to put files into tmpdir
    ansible.controller.tower_settings:
      tower_host: "{{ ansible_host }}"
      tower_username: "{{ tower_username }}"
      tower_password: "{{ tower_passwords.admin_password }}"
      validate_certs: no 
      name: AWX_ISOLATION_SHOW_PATHS
      value:
        - "{{ somedir }}"
    delegate_to: localhost

Hints:

  • somedir should be a new / unused directory. Ansible wants to set its own SELinux context and will run into a conflict if you choose an existing directory like /tmp.
  • With Tower I used the module ansible.tower.tower_settings instead of ansible.controller.tower_settings.
  • With Tower the parameter to change was called AWX_PROOT_SHOW_PATHS instead of AWX_ISOLATION_SHOW_PATHS.

This is the route I took to solve the persistence challenge, and it works very well for me. While implementing and testing you have full control, as you can read and analyse the data written to the persistence file. You could even prefill the file with the expected content to test the reading play without having the writing play in place yet.

I want to add some feedback I got from colleagues:
  • In general, changing filesystem content on the controller node is not the preferred path.
  • Changing a system-wide parameter is something that needs to be thought through deeply.
  • If you also use automation mesh, results might be different, as it is not clear (to me) what exactly “local” refers to.

Security

Let us have a brief look at the security impact of writing and reading a file during job executions.
An attacker could symlink this file to a system device or to any other file in order to make us write somewhere unhealthy.
Furthermore, an attacker could add malicious content to the file, e.g. ansible_* variables. When reading this file into the top-level variable hierarchy we could easily overwrite parameters and cause unpredictable outcomes.
To avoid both, it is good practice for temporary file names to be unpredictable. Unfortunately we cannot make this file name unpredictable, as the second playbook needs to know which file to read from. If you find a way to pass this information between both playbooks, you can avoid the whole exercise and pass on the variables you wanted to persist directly.
In any case, the writes are executed as the user awx, which prevents overwriting system devices or files not owned by or not accessible to awx. To create a file in our predefined directory the attacker would already need the privileges of the awx user, so our writes would not give the attacker any additional benefit.
As we read the variables into our own variable tree, the worst that can happen is that the expected variables are set to wrong values. Again, the advantage for an attacker is limited compared to the other options an attacker would have with the access rights of the awx user on the Ansible Controller.
I decided to accept this risk. You might easily come to a different conclusion.

Using external data storage

Thinking about the comment Moy made on this article, I’d like to add the following option.
You could make use of any additional data store. Choose a data store with supported Ansible modules available to run a read as well as a write query (see [4], e.g. postgresql_query). My very quick research found PostgreSQL; you could also look at Hiera or maybe at “Azure App Configuration”, but I did not check the availability of modules for those.
Taking Moy’s comment, it could also be an additional (maybe even private) Git repository, which makes sense if it stores the whole configuration of a certain environment.
You could also look at a file or files on a filesystem share. This would enable you to use any structure, such as INI, JSON or XML.
When using an object store you might need to solve the challenge of handing over the object ID, which leaves you in the same situation we started with.
I believe there are many more storage options.
For reading and writing you could easily provide the credentials via Automation Controller credentials; maybe you need to define a custom credential type for this. Writing and reading of the data itself will then be very straightforward.
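
To illustrate only the PostgreSQL variant: the following is a rough, untested sketch. It assumes the community.postgresql collection, an existing key/value table (here called ansible_kv, with name as its primary key) and connection variables that would typically be injected via such a custom credential type.

persist and read a value via PostgreSQL
- name: persist a value in the external data store
  community.postgresql.postgresql_query:
    login_host: "{{ pg_host }}"        # assumed connection variables
    login_user: "{{ pg_user }}"
    login_password: "{{ pg_password }}"
    db: persistence
    query: >-
      INSERT INTO ansible_kv (name, value) VALUES (%s, %s)
      ON CONFLICT (name) DO UPDATE SET value = EXCLUDED.value
    positional_args:
      - "{{ var_name }}"
      - "{{ var_value }}"
  delegate_to: localhost

- name: read the value back in a later job
  community.postgresql.postgresql_query:
    login_host: "{{ pg_host }}"
    login_user: "{{ pg_user }}"
    login_password: "{{ pg_password }}"
    db: persistence
    query: SELECT value FROM ansible_kv WHERE name = %s
    positional_args:
      - "{{ var_name }}"
  register: kv_result
  delegate_to: localhost

The value read back is then available as kv_result.query_result[0].value.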

Conclusion

Working Combinations

To persist variables between different jobs / playbooks within Ansible Controller you will find different options, and some might suit your use case very well. In the following overview I want to summarise the options discussed in an easy-to-consume way.

  • set_fact — Writing: set_fact with cacheable: true; Reading: parameter available; Additional info: needs a cache plugin such as ansible.builtin.jsonfile, works only within a workflow.
  • set_stats — Writing: set_stats; Reading: parameter available; Additional info: might need additional configuration (show_custom_stats).
  • dynamic inventory — Writing: write data into the inventory source; Reading: parameter available; Additional info: dependent on the inventory source and your access rights to that inventory. I’d recommend this ONLY if the data belongs into that inventory source anyhow.
  • remote facts file — Writing: lineinfile; Reading: gather_facts: true; Additional info: the data needs to be in a file called /etc/ansible/facts.d/<myvars>.fact on the remote system. Nice when the OS is up and running already.
  • local file — Writing: lineinfile with delegate_to: localhost; Reading: include_vars: task with name: <mynamespace>; Additional info: a local directory needs to be available and exposed via AWX_ISOLATION_SHOW_PATHS for direct access by the plays. Once set up, this allows the highest flexibility and most control.
  • external data store — Writing / Reading: depending on the solution chosen and the infrastructure available in your environment; this might imply additional admin work, additional security benefits or additional security concerns. It might easily be the most professional way forward as well.

Options to persist data across Ansible playbooks or job templates

I decided to follow the “local file” approach: write a file locally with the lineinfile module and read from that file into my own variable hierarchy via the include_vars task. This concept does not need an overarching workflow but allows both jobs / playbooks to be executed independently. We must not forget to expose a directory within Ansible Controller for this to work properly.

Disclaimer

Please also keep in mind that this is my private opinion. Others within or outside of Red Hat might come to different conclusions (see [5]).

Reference

For your reference I put together two small lists of the options I looked at to write or read files for the purpose of persistence across different jobs / playbooks within Ansible Controller:

Writing to files

  • template: (writes to remote) – Takes a local file, runs it through the Jinja2 interpreter and copies the result to a remote file. The file gets written / overwritten with each alteration.
  • copy: (writes to remote) – Copies the content passed via the content parameter to a remote file. The file gets written / overwritten with each alteration.
  • lineinfile: (writes to remote) – Adds or changes a line in an existing (or not yet existing) remote file.

Reading from files

  • vars_files: (reads local) – In the play header, reads one or more (additional) variable files.
  • include_vars: (reads local) – Task which reads local variable files into a defined variable namespace.
  • slurp: (reads remote) – Task which reads a remote file into memory, base64 encoded. Needs the task attribute register to save the output into a variable and the b64decode filter to make use of the data.
  • read_csv: (reads remote) – Reads a CSV file into memory. Needs the task attribute register to save the output into a variable.

Links

[1] set_fact documentation:
https://docs.ansible.com/ansible/latest/collections/ansible/builtin/set_fact_module.html

[2] set_stats persistence:
https://docs.ansible.com/ansible-tower/latest/html/userguide/workflows.html#extra-variables

[3] set_stats documentation: https://docs.ansible.com/ansible/latest/collections/ansible/builtin/set_stats_module.html

[4] database modules:
https://docs.ansible.com/ansible/2.9/modules/list_of_database_modules.html

[5] personal opinion disclaimer:
https://mdschreier.com/2022/04/07/need-to-know/