Explaining the use of tools
know that it is an AWS product that is intended exclusively for creating infrastructure on AWS. I used Terraform to create virtual machines on an ESXi cluster, and Pulumi to create resources on a Kubernetes cluster. Below I give a brief description of how Terraform and Pulumi are used, to demonstrate my knowledge of these tools; the code examples are abstract and not related to my projects.
At the end, I give an example of a technical problem I ran into and how it was solved, describing in detail the cause of the problem and its solution.
Terraform is a popular Infrastructure as Code (IaC) tool that enables developers and engineers to define, manage, and deploy infrastructure using a declarative configuration language. It is used to automate the process of creating and maintaining infrastructure in the cloud, in on-premises environments, or in hybrid setups.
• Example:
provider "aws" {
region = "us-east-1"
}
• Project initialization: The terraform init command is run, which installs the required providers (e.g. AWS, Azure, GCP).
2. Planning of changes
• Overview of changes: Running terraform plan gives insight into the changes that Terraform will make to the infrastructure, which ensures that nothing is changed unintentionally.
• Example:
terraform plan
3. Applying the infrastructure
• After the review, we run the terraform apply command, which deploys the resources at the defined provider (e.g. AWS).
• Terraform automatically takes care of the order in which resources are created. For example, it will not create an EC2 instance before a VPC exists.
4. Infrastructure management
• State monitoring: Terraform uses the terraform.tfstate file to track the state of the infrastructure. If the infrastructure is changed manually, this can cause inconsistencies, so it is better to always make changes through Terraform.
• Resource update: Changes to the .tf files are applied by running terraform apply again.
• Destroying resources: When resources are no longer needed, terraform destroy is used to remove them.
5. Version Control: Terraform configurations can be versioned via Git, which allows
tracking of change history.
Pulumi is an Infrastructure as Code (IaC) tool that allows defining and managing infrastructure using standard programming languages such as Python, TypeScript, Go, C# and others. This is the main difference from Terraform, which uses a tool-specific declarative language. Pulumi offers the flexibility and power of programming languages, which allows for more complex logic and easier integration with the rest of the code.
• Project creation: The first step is to run pulumi new, where we choose the language and the project type (e.g. AWS, Azure, Kubernetes). This creates the initial files and configuration.
import pulumi
import pulumi_aws as aws
• Use of logic: Since we use real programming languages, we can easily use loops, conditionals and functions (a small function with a conditional is sketched below the loop example).
# Create three identical EC2 instances in a loop
for i in range(3):
    aws.ec2.Instance(f"exampleInstance-{i}",
                     instance_type="t2.micro",
                     ami="ami-12345678")
3. Infrastructure implementation: pulumi up previews and applies the changes to the target provider.
4. Infrastructure management: Pulumi tracks the state of each stack, and resources that are no longer needed are removed with pulumi destroy.
1. Programming languages: Enables the use of well-known languages with all their libraries and tools.
• For example, you can write infrastructure tests using pytest or unittest (a minimal sketch follows below).
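A minimal sketch of such a test, assuming a hypothetical naming helper (make_resource_name is not a Pulumi API, just an illustrative project function):
# test_naming.py - run with: pytest test_naming.py
def make_resource_name(project: str, env: str, suffix: str) -> str:
    """Hypothetical helper that builds resource names as <project>-<env>-<suffix>."""
    return f"{project}-{env}-{suffix}".lower()

def test_resource_name_format():
    assert make_resource_name("Shop", "Dev", "bucket") == "shop-dev-bucket"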
import pulumi_kubernetes as k8s

# Labels used both by the Deployment selector and by the pod template
app_labels = {"app": "nginx"}

deployment = k8s.apps.v1.Deployment(
    "nginx-deployment",
    spec={
        "selector": {"matchLabels": app_labels},
        "replicas": 2,
        "template": {
            "metadata": {"labels": app_labels},
            "spec": {
                "containers": [{"name": "nginx", "image": "nginx:1.14"}]
            },
        },
    },
)
3. CI/CD integration: Pulumi was part of the pipeline that automatically updates the infrastructure on each merge to the main branch.
The scenario
I designed the infrastructure for a serverless application using AWS resources (e.g. Lambda, API Gateway, S3) and integrated Pulumi into the CI/CD pipeline to automate everything from creating the infrastructure to updating it.
project/
│
├── infra/ # Pulumi files for the infrastructure
│ ├── __main__.py # Main script with infrastructure code
│ ├── Pulumi.yaml # Pulumi configuration
│ ├── Pulumi.dev.yaml # Parameters for the development environment
│ └── Pulumi.prod.yaml # Parameters for the production environment
│
├── lambda/ # Code for Lambda functions
│ └── handler.py
│
└── .github/ # CI/CD configuration
└── workflows/
└── pulumi.yml # GitHub Actions pipeline
Creation of the S3 bucket
import pulumi
import pulumi_aws as aws
# Minimal sketch of the bucket the exports below refer to (the arguments are illustrative)
bucket = aws.s3.Bucket("app-bucket", website=aws.s3.BucketWebsiteArgs(index_document="index.html"))
pulumi.export("bucket_name", bucket.bucket)
pulumi.export("bucket_url", bucket.website_endpoint)
Creation of Lambda function and API Gateway
# IAM role that the Lambda function assumes at execution time
lambda_role = aws.iam.Role(
    "lambda-execution-role",
    assume_role_policy="""{
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": "sts:AssumeRole",
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"}
            }
        ]
    }""",
)

# Lambda function packaged from the local ./lambda directory
function = aws.lambda_.Function(
    "my-lambda-function",
    runtime="python3.9",
    role=lambda_role.arn,
    handler="handler.main",
    code=pulumi.AssetArchive({
        ".": pulumi.FileArchive("./lambda"),
    }),
)
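# A minimal sketch of the API Gateway part, assuming an HTTP API (API Gateway v2)
# in quick-create mode that proxies every request to the Lambda function
# (the resource names here are illustrative):
api = aws.apigatewayv2.Api(
    "my-http-api",
    protocol_type="HTTP",
    route_key="ANY /{proxy+}",
    target=function.invoke_arn,
)

# Allow API Gateway to invoke the Lambda function
aws.lambda_.Permission(
    "api-invoke-permission",
    action="lambda:InvokeFunction",
    function=function.name,
    principal="apigateway.amazonaws.com",
    source_arn=pulumi.Output.concat(api.execution_arn, "/*/*"),
)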
pulumi.export("api_url", api.api_endpoint)
I changed the parameters in the files Pulumi.dev.yaml and Pulumi.prod.yaml. For example:
config:
  aws:region: us-east-1
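A brief sketch of how these per-stack parameters can be read inside the Pulumi program (the instanceType key is an illustrative assumption):
import pulumi

config = pulumi.Config()            # values from the Pulumi.<stack>.yaml of the active stack
aws_config = pulumi.Config("aws")   # namespaced values such as aws:region
region = aws_config.require("region")                      # "us-east-1" in the example above
instance_type = config.get("instanceType") or "t2.micro"   # optional key with a default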
The GitHub Actions pipeline (.github/workflows/pulumi.yml):
on:
  push:
    branches:
      - main
      - dev

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      # 1. Checkout code
      - name: Checkout code
        uses: actions/checkout@v3
Explanation of the steps:
1. Checkout code: the pipeline fetches the code from the corresponding branch (main or dev).
2. Pulumi installation and Python libraries: setting up the environment to work with Pulumi.
3. Authentication:
4. Deploy infrastructure:
• pulumi preview verifies the changes before deployment.
• pulumi up deploys the infrastructure for the selected stack (dev or prod).
1. Local development:
• I wrote and tested the Pulumi code locally using the pulumi preview and pulumi up commands.
3. Automatic validation:
• pulumi preview ensured that subtle errors were caught before deployment.
• If the deployment fails, the team is notified via the GitHub Actions logs.
4. Parameterization for environments:
• The parameters for each environment are kept in the separate stack files Pulumi.dev.yaml and Pulumi.prod.yaml.
Example of a technical problem
With a large number of parallel requests, I ran into a deadlock problem. It happened when two or more processes tried to update the same rows in the inventory table, and the locks they held caused a mutual blockade. This resulted in the error: deadlock detected.
The problem was in the design of the transactions. Transactions were acquiring row-level locks on different rows in a different order, which created circular dependencies between the processes. Specifically: one process would lock the row for product_a and then wait for the row for product_b, while another process had already locked product_b and was waiting for product_a.
Solution:
I changed the logic so that all processes lock rows in the same order, regardless of the input
parameters. For example:
SELECT * FROM stock
WHERE product_id IN (product_a, product_b)
ORDER BY product_id
FOR UPDATE;
This change ensures that all processes first lock the row with the smallest product_id, and only then the others.
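The same rule seen from the application side, as a hedged psycopg2 sketch (the table and column names follow the query above; connection handling is simplified):
def reserve_products(connection, product_ids):
    # Lock the affected rows in ascending product_id order, the same order in every process
    ordered_ids = sorted(product_ids)
    with connection.cursor() as cursor:
        cursor.execute(
            "SELECT product_id FROM stock WHERE product_id = ANY(%s) ORDER BY product_id FOR UPDATE;",
            (ordered_ids,),
        )
        for product_id in ordered_ids:
            cursor.execute(
                "UPDATE stock SET quantity = quantity - 1 WHERE product_id = %s AND quantity > 0;",
                (product_id,),
            )
    connection.commit()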
3. Introduction of a retry mechanism
Deadlocks are unavoidable in some situations where the level of parallelism exceeds what the database can handle. I implemented a retry mechanism in the application code:
import psycopg2
from time import sleep

def execute_transaction(connection, product_id):
    # Retry the transaction up to three times if a deadlock is detected
    for attempt in range(3):
        try:
            with connection.cursor() as cursor:
                cursor.execute("BEGIN;")
                # Inventory update query
                cursor.execute("""
                    UPDATE stock
                    SET quantity = quantity - 1
                    WHERE product_id = %s AND quantity > 0;
                """, (product_id,))
                cursor.execute("COMMIT;")
            break
        except psycopg2.errors.DeadlockDetected:
            connection.rollback()  # abort the failed transaction before retrying
            if attempt < 2:
                sleep(0.5)  # wait before retrying
            else:
                raise
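A hypothetical usage of the function above (the connection parameters and product id are purely illustrative):
conn = psycopg2.connect("dbname=shop user=app host=localhost")
execute_transaction(conn, product_id=42)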
To make lock waits and deadlocks visible in the PostgreSQL logs, I also set:
SET log_lock_waits = on;
SET deadlock_timeout = '1s';
1. Consistent locking order: I defined a rule that all processes lock rows in the same order. This eliminates the cause of the deadlock.
2. Retry mechanism: Even in unexpected situations, the system automatically retries the transaction up to three times before reporting an error.
3. Monitoring and alarms: I set up monitoring at the database and application level, so that any new deadlock is reported immediately. This enables proactive problem solving.
4. Optimization of the isolation level: By changing the isolation level of the transactions, the number of situations in which concurrent locking occurs was reduced (a short sketch follows after this list).
5. Testing under load: I simulated high levels of concurrency using tools such as pgbench and custom load scripts. The deadlock did not reoccur.
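For point 4, a minimal psycopg2 sketch, assuming READ COMMITTED as the target isolation level (the connection parameters are illustrative):
import psycopg2
import psycopg2.extensions

conn = psycopg2.connect("dbname=shop user=app host=localhost")
# A less strict isolation level reduces conflicts between concurrent transactions
conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_READ_COMMITTED)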
Results:
• The system became stable, even with a large number of parallel requests.
• Database performance has improved, as the number of locks has been reduced to a
minimum.
• No deadlocks have been reported in the more than six months since the solution was implemented.