Statefile

Terraform manages infrastructure with the help of a state file. State is one of the most critical components of the Terraform workflow: it ensures that your infrastructure is set up and managed correctly. The Terraform state serves a number of purposes:

Mapping Real-World Resources to Configuration: State maps your real-world resources such as virtual machines, databases, and networks to your Terraform configuration. This in turn helps Terraform understand the relationships that exist between your resources and your configuration.

Keeping Track of Metadata: In addition to the resource mappings, the state tracks metadata such as the dependencies between resources. Terraform uses this information to determine the correct order of operations, for example when it destroys a resource that has been removed from the configuration.

Performance Improvement: The state also caches the attribute values of your resources. With this cache, Terraform can look up information about your resources quickly instead of querying every resource from the provider, which saves time when planning and applying changes to large infrastructures.

By default, Terraform stores the state in a local file named terraform.tfstate. Terraform updates this file whenever an operation changes the state of your infrastructure.
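The default behavior described above can also be written out explicitly. The following is a minimal sketch, assuming you want to keep state on local disk; the path shown is simply the default file name, and any other path would be your own choice.

```hcl
terraform {
  # The "local" backend is what Terraform uses when no backend is configured.
  backend "local" {
    # Relative path to the state file; "terraform.tfstate" is the default.
    path = "terraform.tfstate"
  }
}
```

Leaving the backend block out entirely has the same effect, so spelling it out is mainly useful when you want the state file in a different location.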

State Format

State snapshots are stored in JSON, a common, human-readable data format. The state format is generally backward compatible: newer versions of Terraform can usually read state written by earlier versions, so you can keep managing your existing infrastructure after upgrading.

However, the format of state can change in new versions of Terraform, and software that reads or updates the state directly may need to be updated in order to use a newer version of Terraform.

Purpose of Terraform State

The main purpose of the Terraform state is to store bindings between objects in a remote system and resource instances declared in your configuration. This binding enables Terraform to:

Create Remote Objects: When remote objects are created by Terraform, it records their identity against a particular resource instance. In that way, Terraform can manage the object and, when necessary, update and delete it.

Update or Delete Remote Objects: When you change the configuration, Terraform uses the state to update or delete the corresponding remote objects. This keeps the infrastructure up to date with the configuration at all times.
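To make the idea of a binding concrete, here is a small sketch of a resource that expands into two resource instances. The resource name, AMI ID, and instance type are placeholders chosen for illustration, not values from any real environment.

```hcl
# Hypothetical resource that declares two instances of the same resource.
resource "aws_instance" "web" {
  count         = 2
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID
  instance_type = "t3.micro"              # placeholder instance type
}

# After apply, the state holds one binding per resource instance:
#   aws_instance.web[0] -> ID of the first remote EC2 instance
#   aws_instance.web[1] -> ID of the second remote EC2 instance
```

If count is later reduced to 1, Terraform uses the recorded binding for aws_instance.web[1] to find and delete the matching remote object.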

The Terraform Remote State Data Source

The terraform_remote_state data source retrieves the root module output values from another Terraform configuration. It reads the most recent state snapshot from the given state backend to obtain those output values. It can be used without requiring or configuring a provider, because it is always available through a built-in provider with the source address terraform.io/builtin/terraform.

The syntax of the terraform_remote_state data source is as follows:

data "terraform_remote_state" "LOCAL_NAME" {
    backend = "BACKEND_TYPE"
    config = {
        # Backend configuration
    }
}

The terraform_remote_state data source supports the following arguments:

backend (Required): The remote backend to use for retrieving the state.

config (Optional: object): The configuration of the remote backend. Although this argument is optional, most backends require some configuration. The object accepts the same arguments that you would set in the corresponding backend block.

workspace (Optional): The Terraform workspace to use, if the backend supports workspaces.

defaults (Optional: object): Default values for outputs, in case the state file is empty or does not contain an expected output.
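The four arguments above can be combined as in the following sketch. It assumes an S3 backend; the bucket name, key, region, workspace name, and the "network" local name are all hypothetical.

```hcl
data "terraform_remote_state" "network" {
  # Required: the backend type that holds the other configuration's state.
  backend = "s3"

  # Optional: read the state of a specific workspace (hypothetical name).
  workspace = "prod"

  # Same arguments you would set in a backend "s3" block (placeholder values).
  config = {
    bucket = "example-terraform-state"
    key    = "network/terraform.tfstate"
    region = "us-east-1"
  }

  # Fallback used if the state is empty or has no "env" output.
  defaults = {
    env = "dev"
  }
}
```

The outputs would then be referenced as data.terraform_remote_state.network.outputs.env.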

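The following simplified sample state file contains some basic metadata about a Terraform configuration, along with one output value. It is illustrative only: a real version 4 state file records managed resources in a top-level resources array.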

{
    "version" : 4,
    "terraform_version" : "1.5.0",
    "serial" : 8,
    "lineage" : "60dd9c0a-caa1-de0e",
    "outputs" : {
        "env" : {
            "value" : "prod",
            "type" : "string"
        }
    },
    "resource" : {
        "aws_security_group" : {
            "example" : {
                "name" : "Example Security Group",
                "vpc_id" : "${aws_vpc.example.id}",
                "ingress" : [
                    {
                        "from_port" : 80,
                        "to_port" : 80,
                        "protocol" : "tcp",
                        "cidr_blocks" : ["0.0.0.0/0"]
                    }
                ],
                "egress" : [
                    {
                        "from_port" : 0,
                        "to_port" : 0,
                        "protocol" : "-1",
                        "cidr_blocks" : ["0.0.0.0/0"]
                    }
                ]
            }
        }
    }
}

This state file describes a Terraform configuration that created an AWS security group. The outputs section contains a single output, "env", of type "string" with the value "prod". To read that output from another configuration, declare a terraform_remote_state data source that points to the location of the state file. Example:

data "terraform_remote_state" "local_statefile" {
    backend = "local"
    config = {
        path = "./statefile/ownstate.tfstate"
    }
}

resource "null_resource" "local" {
    provisioner "local-exec" {
        command = "echo 'The Environment is ${data.terraform_remote_state.local_statefile.outputs.env}'"
    }
}

Running terraform apply against this configuration produces output similar to the following:

data.terraform_remote_state.local_statefile: Reading...
data.terraform_remote_state.local_statefile: Read complete after 0s
Terraform will perform the following actions:
    # null_resource.local will be created
    + resource "null_resource" "local" {
      + id = (known after apply)
    }
Plan: 1 to add, 0 to change, 0 to destroy.

Enter a value: yes

null_resource.local: Creating...
null_resource.local: Provisioning with 'local-exec'...
null_resource.local: Executing: ["/bin/sh" "-c" "echo 'The Environment is prod'"]
null_resource.local: The Environment is prod
null_resource.local: Creation complete after 0s [id=4425172414869377883]
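Only root module outputs are visible through terraform_remote_state, so the configuration that wrote the state must declare "env" as an output. A minimal sketch of what that declaration might look like; the description text is an assumption added for illustration.

```hcl
# Declared in the root module of the configuration that produces the state.
output "env" {
  description = "Deployment environment exposed to other configurations"
  value       = "prod"
}
```

Outputs declared only inside child modules are not exposed this way unless the root module re-exports them.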

State Locking

State locking prevents more than one user from writing to the Terraform state at the same time. Concurrent writes can corrupt the state or cause conflicts, so locking ensures that only one operation can change the state at any given time.

If your backend supports it, Terraform locks the state automatically for every operation that could write state. While a lock is held, other users cannot write state until the lock is released. If acquiring the lock fails, Terraform does not proceed with the operation. You can disable state locking for most commands with the -lock=false flag, but this is not recommended because it allows concurrent writes that can lead to conflicts and errors.
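Whether locking is available depends on the backend. As one illustration, the S3 backend has supported locking through a DynamoDB table; the sketch below assumes that setup, and the bucket, key, region, and table names are hypothetical.

```hcl
terraform {
  backend "s3" {
    bucket = "example-terraform-state"   # placeholder bucket name
    key    = "app/terraform.tfstate"     # placeholder object key
    region = "us-east-1"

    # DynamoDB table used for state locking. The table needs a
    # partition key named "LockID" of type String.
    dynamodb_table = "example-terraform-locks"
  }
}
```

Locking mechanisms vary by backend and by Terraform version, so check the documentation of the backend you use before relying on this exact form.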

Force Unlock

Sometimes Terraform fails to release the state lock automatically. In that case, the force-unlock command can be used to unlock the state manually. The command takes the unique lock ID, which Terraform prints when unlocking fails. The lock ID ensures that you remove the intended lock.

  terraform force-unlock <LOCK-ID>

NOTE: Use this command with care. Force-unlocking a state that another user is still working with can result in multiple writers to the state file, which may cause conflicts and errors. Reserve it for situations in which Terraform could not release the state lock automatically.

Workspaces

Each Terraform configuration has an associated backend that defines how Terraform executes operations and where it stores persistent data such as state. The persistent data stored in the backend belongs to a workspace. By default, a configuration has a single workspace named "default" with one associated state. The default workspace cannot be deleted.

Some backends support multiple named workspaces, which allows a single configuration to store multiple states. This lets you deploy multiple distinct instances of a single configuration without configuring a new backend or changing authentication credentials.

Backends Supporting Multiple Workspaces

The following backends support multiple workspaces:

S3

AzureRM

GCS

Kubernetes

Consul

Local

Postgres

Managing Resources in Multiple Workspaces

When you create a new workspace and run terraform plan, Terraform considers only the resources tracked in that workspace's state. Resources managed in other workspaces still exist, but you must switch workspaces to manage them.

You can include the name of the current workspace in your Terraform configuration through the interpolation sequence ${terraform.workspace}. This is useful when you want to customize behavior based on the workspace.
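A short sketch of how the workspace name might drive behavior. The resource, AMI ID, instance type, and the chosen counts are placeholders for illustration only.

```hcl
locals {
  # Run more instances in the "default" workspace than in any other one.
  instance_count = terraform.workspace == "default" ? 5 : 1
}

resource "aws_instance" "example" {
  count         = local.instance_count
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID
  instance_type = "t3.micro"              # placeholder instance type

  tags = {
    # Produces names such as "web-default" or "web-staging".
    Name = "web-${terraform.workspace}"
  }
}
```

Including the workspace name in tags or resource names also helps avoid naming collisions between instances of the same configuration.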

Sensitive Data in Terraform State

The Terraform state can contain sensitive information such as resource IDs, resource attributes, and initial database passwords. If the state is not handled properly, this data can be exposed to attackers.

With local state, everything is stored in plain-text JSON files. Any sensitive information in local state is therefore unencrypted and readable by anyone who has access to the file.

With remote state, Terraform holds the state only in memory while it is using it. Remote state may also be encrypted at rest, depending on the specific backend used. Even when encryption at rest is enabled, sensitive data can still be exposed if the backend is compromised or accessed without authorization.
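Marking a value as sensitive in the configuration does not change this. The sketch below, with a hypothetical db_password variable, shows the sensitive argument, which redacts the value in CLI output only; the value is still written to the state in plain text.

```hcl
variable "db_password" {
  type      = string
  sensitive = true # hides the value in plan and apply output
}

output "db_password" {
  value     = var.db_password
  sensitive = true # required when the value derives from a sensitive input
}

# Note: the output value above is still stored unencrypted in the state,
# so access to the state file itself has to be restricted.
```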

The following remote state backends support encryption, along with other security features:

S3 Backend: The S3 backend supports encryption at rest when the encrypt option is enabled. IAM policies and logging can be used to identify invalid access attempts. In addition, all state requests are made over a secure TLS connection.
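A minimal sketch of an S3 backend with encryption at rest enabled. The bucket, key, region, and KMS key ARN are placeholders, and the kms_key_id line is optional.

```hcl
terraform {
  backend "s3" {
    bucket = "example-terraform-state"   # placeholder bucket name
    key    = "app/terraform.tfstate"     # placeholder object key
    region = "us-east-1"

    # Enable server-side encryption of the state object.
    encrypt = true

    # Optional: encrypt with a specific KMS key (placeholder ARN).
    kms_key_id = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"
  }
}
```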

GCS Backend: The GCS backend supports the use of customer-supplied or customer-managed encryption keys via Cloud KMS.
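For comparison, a sketch of a GCS backend that uses a customer-managed key from Cloud KMS. The bucket, prefix, project, key ring, and key names are all hypothetical.

```hcl
terraform {
  backend "gcs" {
    bucket = "example-terraform-state" # placeholder bucket name
    prefix = "app/state"               # placeholder path prefix

    # Customer-managed Cloud KMS key used to encrypt the state (placeholder).
    kms_encryption_key = "projects/example-project/locations/us/keyRings/example-ring/cryptoKeys/example-key"
  }
}
```

A customer-supplied key would instead be passed through the backend's encryption_key argument. Keeping that value out of version control is advisable either way.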