Terraform Data Sources

Last Updated : 24 Apr, 2025

Terraform is a powerful Infrastructure as Code (IaC) tool that helps you manage and deploy infrastructure. One feature that makes Terraform powerful is its ability to read from data sources.

Terraform data sources are used to get data from external systems or services to dynamically build and manage infrastructure. Whether you're working with cloud services, databases, or online tools, Terraform data sources make it easy to blend external data into your infrastructure plan.

What are Terraform Data Sources?

A data source in Terraform is a way to pull information from external systems or services (such as cloud platforms, APIs, or configuration management tools) and use that data in your Terraform configuration.

These external resources could include things like:

  • The latest Amazon Machine Image (AMI) for AWS.
  • Subnet IDs from a Cloud Provider.
  • Service Configuration from Tools like Consul.
  • External Data from custom APIs.

Data Sources allow you to dynamically fetch information, ensuring that your infrastructure is always using up-to-date data. This helps to build flexible, scalable, and adaptive infrastructure.

Why are Terraform Data Sources Important?

Imagine you need to create a server (like an EC2 instance) in AWS and want to use a specific subnet for it. Instead of manually entering the subnet's ID, which can change, you can dynamically fetch this information using a Terraform data source. This ensures you always have the most up-to-date subnet data without having to update your configuration each time it changes.

In short, Terraform data sources let you fetch current data from outside your configuration and apply it to your infrastructure setup, making the configuration more dynamic and less dependent on hard-coded values.
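The subnet scenario above can be sketched as follows. This is a minimal illustration, not a definitive setup: the region and the tag value app-private-a are assumptions chosen for the example, and the lookup only succeeds if exactly one subnet in your account carries that tag.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-east-1" # assumption: replace with your region
}

# Look the subnet up by its Name tag instead of hard-coding its ID.
data "aws_subnet" "app" {
  tags = {
    Name = "app-private-a" # hypothetical tag value
  }
}

# The ID is resolved each time Terraform plans or applies.
output "app_subnet_id" {
  value = data.aws_subnet.app.id
}
```

Because the configuration refers to the tag rather than the ID, the same code keeps working if the subnet is recreated with a new ID.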

Benefits of Using Terraform Data Sources:

  • Current data: Fetch the latest available data, like the newest AMIs or available IP addresses, each time Terraform plans or applies.
  • Automation: Remove manual updates from your infrastructure configuration, making the process faster and less error-prone.
  • Consistency: Ensure that your resources are configured with the latest correct values, even as the external data changes.

Types of Terraform Data Sources

Terraform supports a variety of data sources depending on what type of data you need. Below are different types of Terraform data sources with examples:

1. Cloud Provider Data Sources

These data sources allow you to get information from major cloud platforms like AWS, Azure, and Google Cloud.

In the example below, we use the AWS AMI data source to fetch the latest Ubuntu AMI:

Figure: Amazon Machine Image (AMI) data source
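A sketch of the configuration the explanation below describes, reconstructed from the arguments it lists. The data source name ubuntu, the region, and the output are assumptions added for illustration.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-east-1" # assumption: replace with your region
}

data "aws_ami" "ubuntu" {
  most_recent = true

  # As described in this article. The AWS provider documentation's own
  # example uses Canonical's account ID, "099720109477", for Ubuntu images.
  owners = ["amazon"]

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }
}

# Expose the resolved image ID so it can be inspected after a plan.
output "ubuntu_ami_id" {
  value = data.aws_ami.ubuntu.id
}
```

If the lookup returns no images in your region, check the owners value first, since it decides which publisher's images are searched.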

Explanation:

  • most_recent = true: Fetches the most recent version of the image.
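  • owners = ["amazon"]: Restricts the search to AMIs whose owner alias is amazon. For Ubuntu images, the AWS provider documentation's example uses Canonical's account ID, 099720109477, instead.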
  • filter: Looks for AMIs with names matching "ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*" to get the latest Ubuntu 20.04 AMI.

This data source makes sure that Terraform resolves the newest matching AMI on each run, keeping your infrastructure setup dynamic and avoiding manual updates.

2. Configuration Management Data Sources

Configuration management data sources allow Terraform to pull data from tools that manage the configuration of your servers and infrastructure. Tools like Consul help to track and manage infrastructure details, and with Terraform you can pull that configuration data into your infrastructure plan.

Here's an example of how to use Consul as a data source to fetch key-value pairs from the Consul key-value store:

Figure: Configuration management data source
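A minimal sketch of such a lookup, assuming a Consul agent reachable at localhost:8500 (an assumption for this example). Note that the Consul provider names the prefix argument path_prefix. The names example and example_settings are illustrative.

```hcl
terraform {
  required_providers {
    consul = {
      source = "hashicorp/consul"
    }
  }
}

provider "consul" {
  address = "localhost:8500" # assumption: a local Consul agent
}

# Read every key stored under the "example/" prefix.
data "consul_key_prefix" "example" {
  path_prefix = "example/"
}

locals {
  # Map of key name (relative to the prefix) to its value.
  example_settings = data.consul_key_prefix.example.subkeys
}
```

Other resources can then read individual settings from local.example_settings by key.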

Explanation:

  • consul_key_prefix: This data source fetches all keys from the Consul key-value store that match the provided prefix.
  • path_prefix = "example/": Specifies the key prefix in Consul. All keys that start with "example/" will be retrieved.

This is useful when you want to dynamically pull configuration data, like service configurations, and use it in your Terraform setup.

3. Custom API Data Sources

Terraform also allows you to integrate data from custom APIs. If your infrastructure relies on data from external systems, custom APIs, or databases, you can use a Terraform data source to retrieve that information.

For example, let's say you have an internal API that provides user data (e.g., user IDs, names, etc.) stored in a custom system. You can create a Terraform data source to automatically query the API for user information and integrate it into your Terraform configuration.

Figure: HTTP data source for user data
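A sketch of what this lookup could look like with the hashicorp/http provider. The endpoint is a placeholder that will not resolve as written, and the Accept header and the assumption that the API returns JSON are illustrative choices, not details from the original example.

```hcl
terraform {
  required_providers {
    http = {
      source  = "hashicorp/http"
      version = ">= 3.0" # response_body is available from 3.0 onward
    }
  }
}

data "http" "user_data" {
  url = "https://api.example.com/users/123" # placeholder endpoint

  request_headers = {
    Accept = "application/json"
  }
}

locals {
  # Assumption: the API responds with a JSON object describing the user.
  user = jsondecode(data.http.user_data.response_body)
}
```

Decoding the body into a local value keeps the parsing in one place, so resources can refer to fields such as local.user.name if the API returns them.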

Explanation:

  • http: This is the data source type. In this case, it is used to fetch data from an HTTP endpoint.
  • user_data: This is the name of the data source. You can use this name to reference the data elsewhere in your Terraform configuration.
  • url = "https://api.example.com/users/123": The URL from which the data is fetched. In this example, Terraform pulls data about user 123 from a custom API.

This data source retrieves information from an external API and makes it available for use in your Terraform configuration.

How to Configure Terraform Data Sources

To configure a Terraform data source, you follow a straightforward syntax: a data block names the type of data source (such as aws_subnet or http), which also identifies the provider it belongs to, followed by a local name and the arguments used to fetch the data.

Syntax of Data Source Configuration

data "provider_type" "resource_name" {
  # Configuration settings
}

Explanation:

  • provider_type: The type of data source you are using (e.g., aws_ami, http, consul_keys). The part before the first underscore normally identifies the provider.
  • resource_name: The local name that you assign to this data source. You use it elsewhere in your configuration to refer to the fetched data.
  • Configuration settings: This is where you define the specific details required to pull data from the provider (e.g., filters, an ID, a URL).

Example for AWS Subnet Data Source

data "aws_subnet" "example_subnet" {
  id = "subnet-abcdef12345"
}

This configuration fetches the subnet details based on the given subnet ID.

How to Use Terraform Data Sources

After configuring a data source, you can reference its attributes within your Terraform resources using the form data.<type>.<name>.<attribute>.

Suppose you want to create an AWS EC2 instance and associate it with a specific subnet using Terraform. You can use the aws_subnet data source to fetch information about the desired subnet and then utilize that information to configure your EC2 instance.

Here's how you could achieve this with an AWS example:

Figure: Fetching a subnet and using it for an EC2 instance
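A minimal sketch of this pattern, reusing the subnet data source from the previous section. The region, the AMI ID, the instance type, and the resource name example_instance are placeholders assumed for illustration.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-east-1" # assumption: replace with your region
}

# Read the existing subnet; Terraform does not create or change it.
data "aws_subnet" "example_subnet" {
  id = "subnet-abcdef12345"
}

resource "aws_instance" "example_instance" {
  ami           = "ami-0123456789abcdef0" # hypothetical AMI ID
  instance_type = "t2.micro"

  # Place the instance in the subnet returned by the data source.
  subnet_id = data.aws_subnet.example_subnet.id
}
```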

In this example, the aws_subnet data source is used to fetch information about a specific AWS subnet. The data.aws_subnet.example_subnet reference can then be used to access attributes of that subnet, such as its ID. Subsequently, an AWS EC2 instance is created using the aws_instance resource. The subnet_id attribute of the EC2 instance is set to the ID of the subnet fetched from the data source, effectively associating the instance with the specified subnet.

Best Practices for Using Terraform Data Sources

  1. Use for_each for repeated lookups: When you need the same kind of lookup for several objects, add for_each to a single data block instead of copying the block. Each instance is still read separately, so keep the set of keys small.
  2. Understand caching: Terraform reads a data source once per plan or apply and reuses the result within that run. The result is not cached between runs, so every run queries the provider again.
  3. Version constraint awareness: Pin provider versions deliberately, because data source arguments and returned attributes can change between provider releases.
  4. State management strategy: Data source results are recorded in Terraform's state, so protect the state file when a lookup returns sensitive values.
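The first and third practices can be sketched together as follows. The variable subnet_ids, its placeholder IDs, the region, and the version constraint are assumptions chosen for the example.

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumption: pin to the major version you have tested
    }
  }
}

provider "aws" {
  region = "us-east-1" # assumption: replace with your region
}

variable "subnet_ids" {
  type    = set(string)
  default = ["subnet-0aaa1111bbbb2222c", "subnet-0ddd3333eeee4444f"] # placeholder IDs
}

# One data block, one instance per subnet ID.
data "aws_subnet" "selected" {
  for_each = var.subnet_ids
  id       = each.value
}

locals {
  # Map of subnet ID to its CIDR block, built from all instances.
  subnet_cidrs = { for id, subnet in data.aws_subnet.selected : id => subnet.cidr_block }
}
```

Each key in var.subnet_ids produces its own read, so the benefit here is less duplicated configuration rather than fewer API calls.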

What is the difference between resources and data sources in Terraform?

In Terraform, resources are the parts of your infrastructure that you want to create, manage, or delete. These can include things like virtual machines, databases, or networks. Resources are the core components of your infrastructure and directly interact with your cloud provider's APIs to create or change actual services.

On the other hand, data sources are used to get information from existing resources or external systems. They don’t create or manage anything but help you fetch specific data that you need to use in your Terraform configuration. Data sources provide the necessary information to help configure your resources correctly.

Aspect | Resources | Data Sources
Definition | Represent infrastructure components that you create, manage, or delete. | Used to query information from existing resources or external systems.
Action | Create, update, or delete resources in your infrastructure. | Only retrieve data; do not create or manage infrastructure.
Purpose | To provision and manage infrastructure, like virtual machines, databases, etc. | To fetch information to use in your Terraform configuration.
Interaction | Directly interact with cloud provider APIs to create or modify resources. | Retrieve data (e.g., AMI IDs, VPC info) from existing services or APIs.
Examples | aws_instance, aws_s3_bucket, aws_vpc | aws_ami, http, consul_keys
Mutability | Can be created, updated, or deleted. | Cannot be modified; only fetch data.
Lifecycle | Managed by Terraform throughout their life (create, update, destroy). | Read during plan or apply; no ongoing management.
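The contrast can be sketched with the same object type used both ways. The region, names, and CIDR block below are assumptions for illustration.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-east-1" # assumption: replace with your region
}

# Resource: Terraform creates this VPC and manages it until it is destroyed.
resource "aws_vpc" "managed" {
  cidr_block = "10.0.0.0/16"
}

# Data source: Terraform only reads the account's existing default VPC.
data "aws_vpc" "existing" {
  default = true
}

output "managed_vpc_id" {
  value = aws_vpc.managed.id
}

output "existing_vpc_cidr" {
  value = data.aws_vpc.existing.cidr_block
}
```

Running terraform destroy would remove aws_vpc.managed but leave the default VPC untouched, because the data source never owned it.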
