Thursday, January 4, 2018

Introduction To Ansible ............. Simple, Easy & it is YAML

Nowadays we hear a lot about automation, specifically infrastructure automation. Technology has already shifted gears towards an automated way of controlling IT infrastructure. This is much needed for today's rapidly changing technology and infrastructure, as customer demand has also taken a sharp turn away from investing in and setting up their own infrastructure and towards hosted infrastructure, which we call the "Cloud". This commonly used term basically refers to converting physical infrastructure into a framework that is flexible and easy to scale up, providing a virtual platform where everything is defined as a service, mainly IaaS (Infrastructure as a Service), PaaS (Platform as a Service) and SaaS (Software as a Service). There is a similar wing where the activities of operations and development are closely associated, which we commonly call "DevOps". In this post I'm talking about one tool that is used to automate infrastructure tasks: "Ansible". It is simple, easy to use, and makes it easy to automate tasks by writing playbooks in YAML (YAML Ain't Markup Language, the language used to write playbooks). On this blog page, I want to cover a few introductory points about Ansible and a little about installing and setting it up.

Ansible:
- is an open-source utility.
- is built on Python.
- is agentless.
- was originally written by Michael DeHaan.
- is easy to learn and understand.
- works by using modules (core & custom).
- uses YAML syntax for creating playbooks.
- automates tasks through playbooks.
- is easy to install and configure.
- works in push mode (it connects to remote hosts and pushes small programs called modules).
- is idempotent (if the desired state is already achieved, running the playbook again does not make any changes).
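
To make the idea of idempotency concrete, here is a minimal sketch in plain Python. It is not how Ansible's modules are implemented; the function name `ensure_line` and the file name `motd` are made up for illustration. The point is only that a task describes a desired state and reports whether it had to change anything.

```python
import os
import tempfile


def ensure_line(path, line):
    """Make sure `line` is present in the file at `path`.

    Returns True if the file had to be changed, False if it was
    already in the desired state.
    """
    lines = []
    if os.path.exists(path):
        with open(path) as handle:
            lines = handle.read().splitlines()
    if line in lines:
        return False  # desired state already reached: do nothing
    lines.append(line)
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return True


def demo():
    """Run the same 'task' twice and report whether each run changed anything."""
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "motd")  # hypothetical file name
        first = ensure_line(target, "managed by automation")
        second = ensure_line(target, "managed by automation")
    return first, second


if __name__ == "__main__":
    print(demo())
```

The first run changes the file, the second run finds the line already there and leaves the file alone. That "changed or not changed" answer is the same kind of result a playbook run reports for each task.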

Ansible can automate most tasks in IT infrastructure. It natively uses the SSH protocol to communicate with remote systems, push tasks to them and get the results back. However, it can't perform a bare-metal installation and can't monitor configuration drift.

Ansible architecture

The architecture consists of a control node and managed hosts. The control node is the system on which Ansible and its related packages are installed and configured. It maintains an inventory, which simply consists of the hostnames or IP addresses of the managed hosts along with other configuration. Playbooks (tasks) are written in YAML syntax on the control node to perform the required tasks. When run, these playbooks execute their tasks in order on the targeted hosts. A control node needs Python 2.6/2.7 or Python 3 (version 3.5 or higher) installed; on Red Hat it can be a RHEL 6.x or 7.x system. More generally, a control node can run Red Hat, Debian, CentOS, OS X, any of the BSDs, and so on (Windows is not supported as a control node). On the other side, all managed hosts need Python 2.4 or later installed; on Red Hat these can be RHEL 5/6/7 systems. On older RHEL 5.x hosts the python-simplejson package should also be installed.

By default, Ansible manages only Unix/Linux systems. However, starting from version 1.7, Ansible can manage Windows systems as well, using native PowerShell remoting instead of SSH. The control node uses the "winrm" connection plugin to talk to remote Windows systems.

Playbook: A file with a .yml extension, written in YAML syntax, that consists of one or more plays. A play maps a set of tasks to the targeted hosts on which those tasks need to run.
Inventory: A file which holds the list of IP addresses or hostnames of the targeted systems. These names can be grouped.

Modules: These are either core (main) or custom modules, written in Python, which perform the required functions. They are pushed to managed hosts as required when playbooks are executed.

Plugins: Connection plugins allow the Ansible control node to talk to managed hosts; SSH is the default. Other connection plugins such as local, paramiko, winrm etc. are also available.
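
To illustrate the grouping of hosts in an inventory mentioned above, here is a rough Python sketch that reads a small INI-style inventory into groups. This is not Ansible's own parser. The host names are hypothetical, and the sketch only handles bare host lines and `[group]` headers (no host variables, ranges or nested groups).

```python
def parse_inventory(text):
    """Parse a simple INI-style inventory into {group: [hosts]}.

    Hosts listed before any [group] header go into "ungrouped".
    """
    groups = {"ungrouped": []}
    current = "ungrouped"
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()
            groups.setdefault(current, [])
        else:
            groups[current].append(line.split()[0])
    return groups


# Hypothetical hosts, for illustration only.
SAMPLE = """\
192.168.10.5

[webservers]
web1.example.com
web2.example.com

[dbservers]
db1.example.com
"""

if __name__ == "__main__":
    for group, hosts in parse_inventory(SAMPLE).items():
        print(group, hosts)
```

With groups in place, a play or an ad-hoc command can target "webservers" or "dbservers" instead of naming each host one by one.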

How does Ansible work?

In simple terms, the Ansible control node establishes connections (via SSH) with managed hosts and executes the tasks defined in playbooks or roles, which in turn push the required modules (Python programs) to the hosts to get the desired results. Those results are retrieved and shown on the control node from which the playbooks were executed.

Control Node <-----------------------------------------------> managed hosts
---->roles (playbook) -----> modules --->

At the core there is the main configuration file "ansible.cfg", an inventory file which holds the list of hosts to be managed, and finally the playbooks (simple text files in YAML syntax) written to perform the required tasks. At a higher level there are roles, which keep the structure easy to maintain while getting the tasks done.

Let's install Ansible

The Ansible package is available in the EPEL (Extra Packages for Enterprise Linux) repo, or it can be obtained from the Ansible project directly.

The easy way is to enable the 'rhel-7-server-extras-rpms' repository and then run "yum install ansible", which pulls in all dependency packages (mostly Python-related) and installs Ansible on the control node. The installation here was done on a RHEL 7.1 system.

> By default the 'rhel-7-server-extras-rpms' repo channel is not enabled, so enable it by running the command "subscription-manager repos --enable rhel-7-server-extras-rpms".

> After this, check the enabled repos using the command "subscription-manager repos --list-enabled".

> Then run the command "yum install ansible", which pulls in dependencies from both the server and extras repo channels and installs Ansible (assuming the RHEL 7.x system was installed with core packages).


Let's Configure Control Node

After installing Ansible on the control node, we need to set up key-based authentication so that the control node communicates with managed nodes using public/private keys; otherwise a password has to be entered for every connection. Once the public key is copied to the managed hosts, the next step is to create a list of managed hosts and put it into an inventory file.

Let's create a simple setup with one managed host. By default, all standard configuration parameters are stored in the "/etc/ansible/ansible.cfg" file. So, let's create a separate working directory, "/ansible-project", and in it create "ansible.cfg" (the configuration file) and "inventory" (the inventory file listing the managed hosts).
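
As a rough sketch of that layout, the Python below writes a minimal "ansible.cfg" and "inventory" into a project directory. I'm assuming the only setting needed is `inventory` under the `[defaults]` section. The helper name `create_project` is my own, and the demo uses a temporary directory as a stand-in for "/ansible-project" so that it can run anywhere without root access.

```python
import configparser
import os
import tempfile


def create_project(project_dir, hosts):
    """Write a minimal ansible.cfg and inventory file into project_dir."""
    os.makedirs(project_dir, exist_ok=True)

    # The inventory is just one managed host per line.
    inventory_path = os.path.join(project_dir, "inventory")
    with open(inventory_path, "w") as handle:
        handle.write("\n".join(hosts) + "\n")

    # ansible.cfg is an INI file; point its [defaults] section at the inventory.
    config = configparser.ConfigParser(interpolation=None)
    config["defaults"] = {"inventory": inventory_path}
    config_path = os.path.join(project_dir, "ansible.cfg")
    with open(config_path, "w") as handle:
        config.write(handle)
    return config_path, inventory_path


def demo():
    """Create the project in a temporary directory and return both file contents."""
    with tempfile.TemporaryDirectory() as root:
        project_dir = os.path.join(root, "ansible-project")
        config_path, inventory_path = create_project(project_dir, ["pxeserver"])
        with open(config_path) as handle:
            config_text = handle.read()
        with open(inventory_path) as handle:
            inventory_text = handle.read()
    return config_text, inventory_text


if __name__ == "__main__":
    for text in demo():
        print(text)
```

Any command run from inside that directory would then pick up this "ansible.cfg" and, through it, the inventory file.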

> Let's check out Ansible version and managed hosts configured:


- In this setup,

control node : rhel77
managed host : pxeserver

parent directory : /ansible-project
configuration file : /ansible-project/ansible.cfg
inventory file : /ansible-project/inventory

> Let's run the ping module and check whether a connection can be established. To run standalone commands the syntax is "ansible <hosts> -i <inventory-file> -m <module-name>". So, we can run the command "ansible all -m ping" (when no inventory file is specified, Ansible takes the inventory file named in the ansible.cfg file in the working directory).

> The current working directory is "/ansible-project" and the configuration file being used is "/ansible-project/ansible.cfg". The inventory file, "/ansible-project/inventory", is set under defaults in that configuration file, and "ansible_local" denotes that this is a local system which doesn't need SSH to connect. The control node "rhel77" can talk to "pxeserver", which is the managed host.

Which ansible.cfg file is used, and how?

> Ansible follows a definite order when looking for the "ansible.cfg" configuration file while executing commands or running playbooks.

So, if the "ANSIBLE_CONFIG" environment variable is defined, it takes precedence over everything else and the file it points to is used. If not, the "ansible.cfg" file in the current working directory is used if found; otherwise, the ".ansible.cfg" file in the user's home directory. If there is no such file in either the current working directory or the user's home directory, Ansible falls back to the standard configuration file, which is "/etc/ansible/ansible.cfg" by default. To check which configuration file is being used, run the command "ansible --version".
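
The lookup order described above can be sketched in a few lines of Python. This is only my illustration of the order, not Ansible's actual code. The function name `find_ansible_config` is made up, and I'm assuming that a candidate only counts if the file really exists.

```python
import os
import tempfile


def find_ansible_config(environ, cwd, home, system_path="/etc/ansible/ansible.cfg"):
    """Return the first existing config file, following the order described above."""
    candidates = []
    if environ.get("ANSIBLE_CONFIG"):
        candidates.append(environ["ANSIBLE_CONFIG"])       # 1. environment variable
    candidates.append(os.path.join(cwd, "ansible.cfg"))    # 2. current working directory
    candidates.append(os.path.join(home, ".ansible.cfg"))  # 3. user's home directory
    candidates.append(system_path)                         # 4. system-wide default
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


def demo():
    """Create files from lowest to highest priority and record the winner each time."""
    with tempfile.TemporaryDirectory() as root:
        cwd = os.path.join(root, "project")
        home = os.path.join(root, "home")
        system_cfg = os.path.join(root, "etc", "ansible", "ansible.cfg")
        env_cfg = os.path.join(root, "custom.cfg")
        for directory in (cwd, home, os.path.dirname(system_cfg)):
            os.makedirs(directory)

        steps = [
            (system_cfg, {}),
            (os.path.join(home, ".ansible.cfg"), {}),
            (os.path.join(cwd, "ansible.cfg"), {}),
            (env_cfg, {"ANSIBLE_CONFIG": env_cfg}),
        ]
        winners = []
        for path, environ in steps:
            with open(path, "w"):
                pass  # an empty file is enough for this demonstration
            chosen = find_ansible_config(environ, cwd, home, system_cfg)
            winners.append(os.path.relpath(chosen, root).replace(os.sep, "/"))
    return winners


if __name__ == "__main__":
    print(demo())
```

Each newly created file sits higher in the order than the previous one, so the winner changes at every step. On a real system you would call it as `find_ansible_config(os.environ, os.getcwd(), os.path.expanduser("~"))`.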

> Let's run some commands in what is called "ad-hoc mode", where we can run any command using the "command" module, the "shell" module or any other module. The syntax for running an ad-hoc command is as shown below:

ansible <host-pattern> -m <module-name> [ -a 'module arguments'] [ -i inventory]

By default, the command module is used when no module is specified in ad-hoc mode (this is pre-defined in the "/etc/ansible/ansible.cfg" file as 'module_name'), so there is no need to mention the module name. Likewise, if you don't mention the inventory file, Ansible uses the default one as per the precedence order. So, to check the hostname or kernel version of the remote managed host, a command with just the '-a' option is enough.


- The '-a' option is used to pass arguments to the command module. When neither "module-name" nor the inventory file is given, Ansible uses "command" as the default module and the "inventory" file defined in the "ansible.cfg" file under the current working directory.
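
To show how the optional parts of the ad-hoc syntax fit together, here is a small Python sketch that assembles the command line from the pieces described above. The helper `build_adhoc_command` is my own, and the arguments "uname -r" and "hostname; uname -r" are only illustrative. The sketch builds the command without running it, so it works on a machine without Ansible.

```python
import shlex


def build_adhoc_command(host_pattern, module=None, args=None, inventory=None):
    """Assemble: ansible <host-pattern> [-m module] [-a 'args'] [-i inventory]."""
    command = ["ansible", host_pattern]
    if module:
        command += ["-m", module]     # omitted: the default module is used
    if args:
        command += ["-a", args]       # arguments passed to the module
    if inventory:
        command += ["-i", inventory]  # omitted: inventory comes from ansible.cfg
    return command


if __name__ == "__main__":
    examples = (
        build_adhoc_command("all", module="ping"),
        build_adhoc_command("all", args="uname -r"),
        build_adhoc_command("all", module="shell", args="hostname; uname -r",
                            inventory="inventory"),
    )
    for command in examples:
        print(shlex.join(command))
```

On a control node, the resulting list could be handed to `subprocess.run(command)` to actually execute it.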

- To get a list of available modules, run the command "ansible-doc -l". To get help on a particular module, run "ansible-doc <module-name>". All of this is ad-hoc mode; such commands/tasks can be grouped into a single playbook to get a job done.

- A playbook, usually a text file with a .yml extension, uses YAML syntax to perform a certain task. Playbooks normally begin with "---" (three dashes) and are run using the command "ansible-playbook <playbook-file.yml>". A playbook defines the tasks to be done, usually as multiple plays, along with the targeted hosts. At a broader level, roles are used. With roles, we can define a structure that separates variables, main tasks, files, etc., which makes it easier to maintain, code, debug and so on.
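
To give a feel for what such a file looks like, the Python sketch below writes a small sample playbook to disk and returns the command that would run it. The playbook content is hypothetical: one play with two tasks using the "package" and "service" modules, with "chrony" picked only as an example package. The file name "site.yml" and the helper `write_playbook` are my own choices too.

```python
import os
import tempfile

# Hypothetical playbook: one play, two tasks, applied to every host in the inventory.
PLAYBOOK = """\
---
- name: Prepare a freshly installed host
  hosts: all
  become: true
  tasks:
    - name: Ensure the chrony package is installed
      package:
        name: chrony
        state: present

    - name: Ensure the chronyd service is started and enabled
      service:
        name: chronyd
        state: started
        enabled: true
"""


def write_playbook(directory, filename="site.yml"):
    """Write the sample playbook and return the command that would run it."""
    path = os.path.join(directory, filename)
    with open(path, "w") as handle:
        handle.write(PLAYBOOK)
    return ["ansible-playbook", path]


def demo():
    """Write the playbook to a temporary directory and inspect the result."""
    with tempfile.TemporaryDirectory() as workdir:
        command = write_playbook(workdir)
        with open(command[1]) as handle:
            first_line = handle.readline().rstrip("\n")
    return command[0], os.path.basename(command[1]), first_line


if __name__ == "__main__":
    print(PLAYBOOK)
    print(demo())
```

Each task names a module and the state it should reach, which is what lets the same playbook be run again and again safely.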

- So, how does this help? How can Ansible do infrastructure automation? Once a system is installed in minimal mode, Ansible can run playbooks against it containing tasks for specific hosts: creating users, setting up passwords, installing packages, starting services, editing configurations and so on. The code run through playbooks is not host-specific; the same playbook can be used to set up other systems as well. To work with Ansible we still need to study how to create playbooks and roles, how to use Galaxy, how to declare and use variables, etc. This is just an introduction.

