YAML Tutorial for DevOps Engineers

YAML has become a foundational skill for DevOps engineers and is used extensively with tools like Kubernetes, Ansible, Docker Compose, and CI/CD pipelines. In this YAML tutorial for DevOps engineers, you’ll learn how to read, write, and structure YAML files effectively. Whether you’re just starting out or looking to refine your YAML skills, this guide covers the essentials and best practices tailored for real-world DevOps workflows.

Why should I learn YAML?

Before diving into any new topic, I believe it’s essential to understand why it’s worth learning in the first place. Clarifying the “why” helps build a stronger sense of purpose and makes the learning journey both more enjoyable and more effective.

When I asked myself why I should learn YAML, the answer became clear pretty quickly. Here are the key reasons that motivated me:

1. A Must-Have for Aspiring DevOps/MLOps Engineers

If you’re aiming to excel in DevOps or MLOps, YAML isn’t optional any more — it’s foundational. From writing configuration files to automating infrastructure, YAML is everywhere.

2. Widely Used in Popular Tools

Whether it’s Kubernetes, Docker, Ansible, or any other mainstream DevOps tool—you’ll find YAML at the core. Mastering YAML is like holding the key to unlocking a whole ecosystem of tools.

3. Essential in Data-Centric Workflows

Working with data often involves tasks like structuring, transforming, or moving data between systems. YAML plays a vital role in serialization and deserialization processes, making it highly relevant in data engineering and related fields.

What is YAML

If you’re reading this section, chances are you’ve already answered the “why” behind learning YAML and are now motivated to learn it and enjoy the process.

Fun fact: YAML originally stood for “Yet Another Markup Language,” and many people (myself included) still think of it that way. However, it is now officially “YAML Ain’t Markup Language,” a recursive acronym.

As per yaml.org, “YAML is a human-friendly data serialization language for all programming languages.”

Breaking down the definition above: YAML is simple and easy to read, which is one of the key reasons it has become so popular with DevOps tools.
It is widely used to structure and transmit data between systems, which makes it a preferred cross-platform data serialization format.

Today, YAML has become the de facto standard for writing configuration files across many tools and platforms, mainly because it is simple to write and human readable. No wonder it is used by tools like Kubernetes (K8s), Docker, GitHub Actions, Ansible, Chef, major cloud providers and many more.

Another major feature of YAML is that it is cross-platform: the same YAML file works the same way on Windows, macOS, and Linux, and parsers are available for most programming languages. As you will see in the sections below, it also supports various data types such as strings, integers, floats, dates, lists, and dictionaries.

Evolution of YAML

Before diving into learning YAML and getting our hands dirty with it, let’s take a brief moment to understand why it came into existence. What is the history behind YAML becoming so popular, and which problems of its predecessors does it solve?

In the field of data serialization and configuration files, YAML’s predecessors are XML and JSON. Let’s start with a small snippet representing one user in each format.

XML
<user>
  <name>John Doe</name>
  <age>30</age>
  <email>john.doe@example.com</email>
  <is_active>true</is_active>
  <roles>
    <role>admin</role>
    <role>editor</role>
  </roles>
</user>
JSON
{
  "user": {
    "name": "John Doe",
    "age": 30,
    "email": "john.doe@example.com",
    "is_active": true,
    "roles": ["admin", "editor"]
  }
}
YAML
user:
  name: John Doe
  age: 30
  email: john.doe@example.com
  is_active: true
  roles:
    - admin
    - editor

As you can see for yourself, YAML is concise and easy to read. Notice how indentation replaces the braces {} and tags <>, which also makes the file lighter. YAML is easy to read and highly intuitive.

It is also worth noting that YAML 1.2 is designed as a superset of JSON, so practically any valid JSON file is also a valid YAML file. You might say JSON is already very close to YAML, but the major difference is that JSON has no support for comments. This makes documenting a config file much easier with YAML.

Without going into too much detail, the foundation of YAML is that it is easy to read, intuitive, less verbose and lightweight.

YAML Syntax

The foundation of YAML is “INDENTATION / SPACING” and the “<KEY>:<SPACE><VALUE>” pair. I can’t emphasize enough how essential these two concepts are for learning and writing correct YAML. If you keep them in mind going forward, learning and excelling at YAML will be easy.

As you will see, proper spacing and indentation are the key to writing YAML. A single wrong space can lead to an error or to data that means something different. Without the space after the colon, “name:frank” is not read as a key/value pair; the parser treats the whole thing as a single string. The correct representation is “name: frank”.
It is also very important to note that a space is not the same as a tab: YAML does not allow tabs for indentation.

YAML is case sensitive

Underscores (_) are allowed in keys. Example of valid YAML with _ in a key: is_student: false
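
The syntax rules above can be sketched in one small file. The keys and values here are made up purely for illustration.

```yaml
# A space after the colon makes a key/value pair
name: frank
# Keys are case sensitive, so this is a different key from "name"
Name: Frank
# Underscores are allowed in keys
is_student: false
# Nesting uses spaces (never tabs); sibling keys share the same indentation
address:
  city: Old City
  state: New Jersey
```

Here city and state belong to address only because they are indented under it with the same number of spaces.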

Comments

Any discussion of a data format is incomplete without mentioning comments. After all, how else will you describe the data that you are representing?

#This code snippet is about comments
#You write comments using #
#Following is YAML for Employee Frank
name: Frank

As you can see in the code snippet above, you can write multi-line comments by putting # at the beginning of each line. In effect, a multi-line comment is just several single-line comments; YAML has no separate block-comment syntax.

NOTE: JSON does not support comments, so converting YAML to JSON will also drop the comments.
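
Comments can also sit at the end of a line. The sketch below, with made-up keys and an example.com placeholder URL, shows when a # starts a comment and when it is part of the value.

```yaml
# A full-line comment
name: Frank  # an inline comment needs a space before the '#'
# The '#' below has no space before it, so it stays part of the value
url: http://example.com/page#section
# Quote a value that begins with '#', otherwise it is read as a comment
tag: "#devops"
```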

YAML formatting styles

There are two formatting styles in which you can write YAML. Flow style is good to know, but block style is the one you will see and use most often.

  • Block style
  • Flow style
Block Style example:

Employees:
- John
- Harry

#Most common YAML style, indentation based
Flow Style example:

Employees: [John, Harry]

# Same Employees list object represented as flow style
# Used to represent small structure in single line
# also known as JSON style
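
Flow style works for mappings as well as lists. As a small sketch (the Address data and the *Block key names are only for illustration), the same data can be written both ways:

```yaml
# Flow style: a list in [] and a mapping in {}
Employees: [John, Harry]
Address: {Street: Main St, City: Old City}

# The same data in block style
EmployeesBlock:
  - John
  - Harry
AddressBlock:
  Street: Main St
  City: Old City
```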

YAML Data types

As you can see in the diagram on the left, YAML data types can be divided into two broad categories.

Basic – strings, numbers, booleans, null

Advanced, which is further subdivided into two categories:

Sequences – lists

Mappings – dictionaries

We also have multi line strings (| and >), which we will study later.

Implicit vs Explicit Types

Implicit typing – data type of the value is inferred from its format without any additional annotations.

Explicit typing – when you want the parser to identify the value as that of specific data type. Declared using ‘!!’

age: 30    #implicit type of age here will be number. YAML will treat age data type as number

age: !!str 30 #You are explicitly specifying YAML to consider age as "string" here.
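
A few more explicit tags, as a sketch with hypothetical keys. !!str, !!int and !!float are standard YAML tags, but it is worth checking how your parser of choice handles them.

```yaml
port: 8080             # implicit integer
port_text: !!str 8080  # explicit string "8080"
ratio: !!float 3       # explicit float 3.0
flag_text: !!str true  # explicit string "true", not a boolean
```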

String data type

Most common data type used with YAML is “string”. You can represent a string with following three options:

String with no quotes

name: John      #one word
String with single quote

name: 'John Smith' #has space
String with Double quote

name: "has \n escape sequence"

As you can see from above, following is the convention to follow:

No quotes needed: Usually one word string, having no special characters or spaces or keywords that could be interpreted as YAML keywords

Use single quotes: When your string has special characters, spaces or reserved keywords, but you do not need escape sequences to be processed. An example of a reserved word is true itself. Without single quotes around it (‘true’), it will be interpreted as a boolean.

Use double quotes: When your string has escape sequences. Single quotes do not interpret escape sequences, so \n is read as the two characters \n and not as a new line.
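
The quoting conventions above can be sketched as follows. All keys and values are made up for illustration.

```yaml
# Unquoted values that are NOT read as strings
enabled: true          # boolean
version: 1.10          # float 1.1, the trailing zero is lost

# Quoted so that they stay strings
enabled_text: 'true'
version_text: '1.10'

# Single quotes keep backslashes as they are; '' stands for one single quote
windows_path: 'C:\new\table'
greeting: 'It''s ready'

# Double quotes process escape sequences such as \n
message: "line one\nline two"
```

Version numbers are a common place to get caught: left unquoted, 1.10 is parsed as a number and becomes 1.1.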

Boolean, Number Data Type

Although these types are self-explanatory and easy to identify, it is good to have an example of each.

# Boolean data type
# valid values: true, false (YAML 1.1 parsers also accept yes/no and on/off)

is_valid: true
# Number data type
# valid values 1,121, 130.98
age: 45
weight: 123.67
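
The remaining basic type, null, has no example above, so here is a minimal sketch with hypothetical keys. It also shows a boolean pitfall that applies to parsers following YAML 1.1.

```yaml
# Three ways to write a null value
middle_name: null
nickname: ~
suffix:

# In YAML 1.1 parsers an unquoted "no" is read as the boolean false,
# so quote it when you mean the text (for example a country code)
country: 'no'
```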

Array/List Data Type

The code snippet below shows how you can represent a list of students and a list of professors. Here are a few points you should notice and remember:

  • Both Students and Professors are followed by ‘:’, which makes each of them a key, in this case the object/entity you are representing.
  • Values are written as a hyphen (-) followed by a space and then the actual value, e.g. “- John” makes John a value that is part of the list.
  • The list of students (John and Harry) has 2 spaces before each hyphen, while the list of professors (Fred and Matt) has no spaces before the hyphen. Both formats are valid, as long as every item in the same list uses the same indentation and follows the format in the bullet point just above.
    To reiterate, you can also start list elements without indentation, as long as each one is a hyphen followed by a space followed by the value.
Students: 
  - John # 2 spaces before -
  - Harry # 2 spaces before -
Professors:
- Fred # No spaces before -
- Matt # No spaces before -

Dictionary/Map Data Type

Properties of an object are written as key/value pairs. In the example below, Address is an object and Street, City and State are its properties, which are themselves key/value pairs.

# Example of Dictionary
Address:
  Street: Main St
  City: Old City
  State: New Jersey

As you can see from the example above, Address has key/value pairs as its elements. Unlike the list data type, whose elements are single entries introduced by a hyphen, the dictionary/map data type has key/value pairs as its elements and uses no hyphens. (Putting a hyphen in front of each pair would instead create a list of one-entry dictionaries.)

Two properties should be at the same level if they belong to the same object, i.e. Street and City in the example above are at the same level because they have an equal number (2) of spaces in front of them and both belong to the Address object.
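
Dictionaries and lists can be nested inside each other, which is how most real configuration files are built. A minimal sketch with hypothetical employee data:

```yaml
# A dictionary nested inside a dictionary
Employee:
  Name: Frank
  Address:
    Street: Main St
    City: Old City

# A list whose items are dictionaries
Employees:
  - Name: John
    Role: admin
  - Name: Harry
    Role: editor
```

In the list of dictionaries, the hyphen starts a new item, and Role lines up with Name because both belong to the same item.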

A Word about List and Dictionary

A dictionary is unordered and a list is ordered. The properties of a dictionary can be defined in any order; two dictionaries are the same as long as they contain the same keys with the same values.
In arrays/lists, the position of each element matters.

Employees: 
- emp1
- emp2

IS NOT SAME AS
Employees:
- emp2
- emp1

Multi line strings (|, >):

YAML supports multi-line strings through two block indicators: literal style (|) and folded style (>).
As the names suggest, a literal block preserves line breaks, while a folded block replaces each single line break with a space, folding the text into one line. In both styles, the text must be indented further than its key.

Each of the two indicators can be combined with ‘+’ or ‘-’ to control the line breaks at the end of the text.

  • >+ or |+ keeps all line breaks at the end of the text.
  • >- or |- removes all line breaks at the end of the text.
  • With neither + nor -, a single line break is kept at the end and any extra ones are removed.
# use of '|' for multi line string.
# literal style block: line breaks are preserved
literal_block: |
  The text goes here.
  It can span multiple lines,
  and every line break is preserved.
# use of '>' for multi line string.
# folded style block: line breaks are folded into spaces
folded_block: >
  The text goes here.
  It can span multiple lines,
  but the line breaks are folded into spaces.
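
The effect of ‘+’ and ‘-’ is easiest to see side by side. In this sketch the key names are made up, and each comment describes the string a parser should produce.

```yaml
# No indicator: exactly one line break is kept at the end
# value: "line one\nline two\n"
clip: |
  line one
  line two

# '-': no line break at the end
# value: "line one\nline two"
strip: |-
  line one
  line two

# '+': every trailing line break is kept, including the blank line below
# value: "line one\nline two\n\n"
keep: |+
  line one
  line two

next_key: value
```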

Validating YAML

The following tools are widely used for validating the YAML you write:

Real World examples

  • Docker Compose file
  • Kubernetes manifest
  • GitHub Actions workflow
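
To give a feel for the first item, here is a minimal sketch of a Docker Compose file. The service names, images, port mapping and environment variable are assumptions chosen for illustration, not taken from a real project.

```yaml
# docker-compose.yml (illustrative sketch)
services:
  web:                     # hypothetical service name
    image: nginx:alpine
    ports:
      - "8080:80"          # quoted so the mapping stays a string
    environment:
      APP_ENV: production
  cache:                   # hypothetical second service
    image: redis:alpine
```

Everything covered above shows up here: nested dictionaries (services, web, environment), a list (ports), a quoted string, and comments.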
