Mastering Your Cloud Bill: Getting Started with Pulumi and FinOps Best Practices

Infrastructure as Code (IaC) is a software engineering approach that lets developers and operations teams automate infrastructure deployment, configuration, and management. With IaC, organizations define their infrastructure requirements as code, which can be version-controlled, tested, and deployed like any other software. The approach is gaining popularity, particularly in cloud computing, because of its benefits for cost-effectiveness, agility, consistency, and security. In this blog post, we explore why IaC matters for organizations and show how to implement FinOps best practices using Pulumi, a modern IaC platform.

Why Infrastructure as Code is Important for Organizations

Infrastructure as Code (IaC) provides several critical advantages for organizations, including:

  1. Cost-effectiveness: IaC allows organizations to save costs by automating infrastructure deployment, configuration, and management. By defining infrastructure as code, teams can easily create new resources, adjust capacity as necessary, and eliminate the need for manual intervention. This automation can result in significant savings in time, resources, and infrastructure spend.
  2. Agility: IaC allows organizations to move faster and respond to changes in market demand or customer needs more quickly. With IaC, teams can easily scale resources up or down, test new configurations, and roll out new features and services faster. This agility enables organizations to stay ahead of the competition and adapt to changing business requirements.
  3. Consistency: IaC helps ensure that infrastructure is deployed and configured consistently across environments, including development, testing, and production. This consistency supports effective teamwork, reduces the risk of errors and downtime, and improves overall infrastructure quality.
  4. Security: IaC enables organizations to implement security best practices and compliance policies as part of the infrastructure code. This approach ensures that security requirements are integrated into the infrastructure, reducing the risk of security breaches and compliance violations.

What is Pulumi?

Pulumi is a modern IaC platform. It enables engineers to define and manage cloud infrastructure using familiar programming languages such as JavaScript, Python, and Go. Unlike traditional IaC tools that rely on domain-specific languages (DSLs) or solely on configuration files, Pulumi provides a platform that allows you to use familiar tools and techniques to manage infrastructure as code.

With Pulumi, you can use your favorite programming language and development tools to define cloud infrastructure resources such as virtual machines, databases, and networks. It abstracts away the complexities of cloud infrastructure management, so you can focus on writing code and defining your infrastructure requirements. Some of the critical features of Pulumi include:

  1. Multi-Cloud Support: Supports multiple cloud platforms, including Amazon Web Services (AWS), Azure, Google Cloud Platform (GCP), Kubernetes, and bare metal instances hosted in providers like Equinix. This broad support enables teams to manage infrastructure consistently across multiple providers.
  2. Resource Abstraction: Provides a resource abstraction layer that simplifies cloud infrastructure management. With this, you can define complex infrastructure requirements using high-level constructs, reducing the boilerplate code required.
  3. Stack Management: Provides a powerful stack management system, enabling teams to manage multiple environments. This system allows teams to manage infrastructure requirements for each environment independently, reducing the risk of errors and improving the overall quality of the infrastructure.
  4. Secrets Management: Incorporates a secrets management system that enables teams to securely store and manage sensitive information such as passwords, API keys, and certificates. With Pulumi’s secrets management, you can keep secrets in encrypted form and easily reference them in your infrastructure code. This approach ensures that sensitive information is protected and only accessible to authorized users or applications.
  5. Collaboration: Pulumi enables teams to collaborate on infrastructure requirements using code reviews, pull requests, and shared repositories. This approach allows teams to work together more effectively, reducing the risk of errors and improving the overall quality of the infrastructure.

What is FinOps?

FinOps, a portmanteau of “Finance” and “DevOps,” is a collection of practices and principles that aim to help organizations optimize their cloud spending. The core idea is that cloud resources are a shared responsibility across an organization; therefore, cloud costs should also be allocated and managed across the organization. It requires collaboration between finance, operations, product, and engineering teams to ensure that cloud resources are used efficiently and effectively while maximizing value. By applying FinOps best practices, organizations can optimize their cloud spending, avoid waste, and make data-driven decisions about how best to use cloud resources.

Some standard FinOps practices include identifying areas of waste, optimizing resource utilization, and implementing cost allocation strategies. Using tools like cloud cost management platforms, organizations can track and analyze their cloud spending, identify areas of improvement, and make data-driven decisions about optimizing their cloud resources. Pulumi can help enable FinOps in several different ways. Primarily, it allows teams to manage cloud infrastructure as code, track and allocate cloud costs, and automate cloud resource management.
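To make the idea of identifying waste more concrete, here is a minimal sketch in plain Go of one possible check: flagging resources whose average CPU utilization falls below a threshold. The ResourceUsage type, the Underutilized function, the 5% threshold, and the sample values are all hypothetical and chosen only for illustration; in practice the data would come from your billing and monitoring tools.

```go
package main

import "fmt"

// ResourceUsage is a hypothetical record that pairs a resource's monthly
// cost with its average CPU utilization over the same period.
type ResourceUsage struct {
	ID          string
	Team        string
	MonthlyCost float64 // USD
	AvgCPU      float64 // percent, 0-100
}

// Underutilized returns the resources whose average CPU is strictly below
// threshold, together with the total monthly cost they represent.
func Underutilized(usages []ResourceUsage, threshold float64) ([]ResourceUsage, float64) {
	var flagged []ResourceUsage
	var total float64
	for _, u := range usages {
		if u.AvgCPU < threshold {
			flagged = append(flagged, u)
			total += u.MonthlyCost
		}
	}
	return flagged, total
}

func main() {
	// Illustrative values only, not real billing data.
	usages := []ResourceUsage{
		{ID: "i-aaa", Team: "engineering", MonthlyCost: 70, AvgCPU: 3.5},
		{ID: "i-bbb", Team: "engineering", MonthlyCost: 140, AvgCPU: 62},
		{ID: "i-ccc", Team: "data", MonthlyCost: 35, AvgCPU: 1.2},
	}

	flagged, total := Underutilized(usages, 5)
	for _, u := range flagged {
		fmt.Printf("%s (%s): %.1f%% CPU, $%.2f/month\n", u.ID, u.Team, u.AvgCPU, u.MonthlyCost)
	}
	fmt.Printf("monthly cost of flagged resources: $%.2f\n", total)
}
```

A single CPU threshold is a deliberately crude signal. A real review would likely also consider memory, network, and how long the resource has been idle before resizing or removing anything.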

Here are some steps for getting started with FinOps best practices using Pulumi:

  1. Define your FinOps goals: The first step in implementing FinOps is to define your goals, such as reducing costs, optimizing performance, or improving governance. Once you have defined your goals, you can determine the metrics you will use to measure progress toward those goals.
  2. Tag your resources: To implement FinOps, you need to tag your resources based on their purpose and cost center. This tagging enables you to track and allocate cloud costs to specific teams or projects, helping you optimize your spending.
  3. Use Pulumi for cost monitoring: Pulumi has providers for monitoring tools that can surface cloud costs. For instance, you can use the Datadog provider to deploy monitors that combine cloud provider billing data with resource usage data. These tools help teams identify idle or underutilized resources, optimize reserved instances, and adjust capacity as needed.
  4. Continuously iterate and improve: FinOps is an iterative process, and teams need to monitor and improve their cloud spending continuously. With Pulumi, teams can easily iterate on their infrastructure code, test new configurations, and make changes as needed to optimize their cloud spending.
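The tagging step above lends itself to a small guard in code. The following is a minimal sketch in plain Go, not part of Pulumi itself: MergeTags and MissingTags are hypothetical helpers, and the required keys (environment, team, product) are simply the ones used in the example later in this post. The idea is to merge team-wide default tags with per-resource tags and refuse to continue when a cost-allocation key is missing.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// requiredTags lists the cost-allocation keys every resource should carry.
// These keys are an assumption based on the example in this post.
var requiredTags = []string{"environment", "team", "product"}

// MergeTags copies base into a new map and then applies overrides on top,
// so per-resource tags win over team-wide defaults.
func MergeTags(base, overrides map[string]string) map[string]string {
	merged := make(map[string]string, len(base)+len(overrides))
	for k, v := range base {
		merged[k] = v
	}
	for k, v := range overrides {
		merged[k] = v
	}
	return merged
}

// MissingTags returns the required keys that are absent or blank, sorted.
func MissingTags(tags map[string]string) []string {
	var missing []string
	for _, key := range requiredTags {
		if strings.TrimSpace(tags[key]) == "" {
			missing = append(missing, key)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	defaults := map[string]string{"environment": "staging", "team": "engineering"}
	tags := MergeTags(defaults, map[string]string{"Name": "my-instance"})

	if missing := MissingTags(tags); len(missing) > 0 {
		fmt.Println("missing cost-allocation tags:", strings.Join(missing, ", "))
		return
	}
	fmt.Println("all cost-allocation tags present")
}
```

Run as written, this reports that the product tag is missing. In a Pulumi program, a check like this could run before the merged map is converted to a pulumi.StringMap and passed to a resource, so untagged resources fail fast instead of showing up later as unallocated spend.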

As previously mentioned, one of the key tenets of FinOps is to allocate cloud costs to the teams and engineers responsible for them. In a cloud environment where many engineers can self-provision their own infrastructure, this can be challenging because it requires tracking usage across multiple services and resources.
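As a rough illustration of what allocation means in practice, here is a minimal sketch in plain Go that groups billing line items by a tag and sums their cost. The LineItem type, the AllocateByTag function, and the sample numbers are hypothetical; real line items would come from a billing export such as the Cost and Usage Report discussed below.

```go
package main

import (
	"fmt"
	"sort"
)

// LineItem is a hypothetical, simplified billing record.
type LineItem struct {
	ResourceID string
	Tags       map[string]string
	Cost       float64 // USD
}

// untagged is the bucket for line items that lack the allocation tag.
const untagged = "(untagged)"

// AllocateByTag sums cost per value of the given tag key. Items without the
// tag are grouped under the untagged bucket so that unallocated spend stays
// visible instead of being dropped.
func AllocateByTag(items []LineItem, key string) map[string]float64 {
	totals := make(map[string]float64)
	for _, item := range items {
		owner := item.Tags[key]
		if owner == "" {
			owner = untagged
		}
		totals[owner] += item.Cost
	}
	return totals
}

func main() {
	// Illustrative values only, not real billing data.
	items := []LineItem{
		{ResourceID: "i-aaa", Tags: map[string]string{"team": "engineering"}, Cost: 12.50},
		{ResourceID: "i-bbb", Tags: map[string]string{"team": "data"}, Cost: 30.00},
		{ResourceID: "vol-ccc", Tags: map[string]string{"team": "engineering"}, Cost: 2.50},
		{ResourceID: "i-ddd", Cost: 8.00}, // self-provisioned, never tagged
	}

	totals := AllocateByTag(items, "team")

	// Sort owners so the report is stable from run to run.
	owners := make([]string, 0, len(totals))
	for owner := range totals {
		owners = append(owners, owner)
	}
	sort.Strings(owners)
	for _, owner := range owners {
		fmt.Printf("%-12s $%.2f\n", owner, totals[owner])
	}
}
```

Keeping an explicit untagged bucket is a design choice: its size is a simple measure of how much spend cannot yet be attributed to anyone.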

Here’s an example of how you can use Pulumi with Go to implement cost allocation and a Cost and Usage Report in AWS:

    package main

    import (
        "github.com/pulumi/pulumi-aws/sdk/v5/go/aws/cur"
        "github.com/pulumi/pulumi-aws/sdk/v5/go/aws/ec2"
        "github.com/pulumi/pulumi/sdk/v3/go/pulumi"
    )

    func main() {
        pulumi.Run(func(ctx *pulumi.Context) error {
            // Create a new VPC, tagged for cost allocation
            vpc, err := ec2.NewVpc(ctx, "my-vpc", &ec2.VpcArgs{
                CidrBlock: pulumi.String("10.0.0.0/16"),
                Tags: pulumi.StringMap{
                    "Name":        pulumi.String("my-vpc"),
                    "environment": pulumi.String("staging"),
                    "team":        pulumi.String("engineering"),
                    "product":     pulumi.String("cpu-development"),
                },
            })
            if err != nil {
                return err
            }

            // Create a new EC2 instance with the same cost allocation tags.
            // Note: with no SubnetId set, AWS places the instance in the
            // default VPC; for a real deployment, add a subnet in my-vpc
            // and set SubnetId.
            _, err = ec2.NewInstance(ctx, "my-instance", &ec2.InstanceArgs{
                InstanceType: pulumi.String("t2.micro"),
                VpcSecurityGroupIds: pulumi.StringArray{
                    vpc.DefaultSecurityGroupId,
                },
                Ami: pulumi.String("ami-0bd3b255f1beeae5e"), // Ubuntu 22.04 us-west-2
                Tags: pulumi.StringMap{
                    "Name":        pulumi.String("my-instance"),
                    "environment": pulumi.String("staging"),
                    "team":        pulumi.String("engineering"),
                    "product":     pulumi.String("cpu-development"),
                },
            })
            if err != nil {
                return err
            }

            // Create a Cost and Usage Report. The S3 bucket must already
            // exist and allow AWS billing to write to it. The CUR API is
            // only served from us-east-1, so the provider used for this
            // resource must target that region.
            _, err = cur.NewReportDefinition(ctx, "my-cur-report", &cur.ReportDefinitionArgs{
                ReportName:  pulumi.String("my-report"),
                TimeUnit:    pulumi.String("HOURLY"),
                Format:      pulumi.String("Parquet"),
                Compression: pulumi.String("Parquet"),
                S3Bucket:    pulumi.String("my-bucket"),
                S3Prefix:    pulumi.String("cur/"),
                S3Region:    pulumi.String("us-west-2"),
                AdditionalSchemaElements: pulumi.StringArray{
                    pulumi.String("RESOURCES"),
                },
                AdditionalArtifacts: pulumi.StringArray{
                    pulumi.String("ATHENA"),
                },
                RefreshClosedReports: pulumi.Bool(false),
                // The ATHENA artifact requires OVERWRITE_REPORT versioning.
                ReportVersioning: pulumi.String("OVERWRITE_REPORT"),
            })
            if err != nil {
                return err
            }

            return nil
        })
    }

Let’s take a closer look at what’s going on in this code.

First, the required Pulumi packages are imported, including the Pulumi AWS SDK for EC2 and Cost and Usage Reports.

    import (
        "github.com/pulumi/pulumi-aws/sdk/v5/go/aws/cur"
        "github.com/pulumi/pulumi-aws/sdk/v5/go/aws/ec2"
        "github.com/pulumi/pulumi/sdk/v3/go/pulumi"
    )

Next, the main() function is defined, which will be executed by Pulumi when the program is run.

    func main() {
        pulumi.Run(func(ctx *pulumi.Context) error {
            // Pulumi code goes here
            return nil
        })
    }

Within the pulumi.Run() function, the Pulumi code to create the VPC, EC2 instance, and Cost and Usage Report is written.

First, a new VPC is created using the ec2.NewVpc() function. The function takes in a VpcArgs struct which specifies the CIDR block for the VPC and its tags.

    vpc, err := ec2.NewVpc(ctx, "my-vpc", &ec2.VpcArgs{
        CidrBlock: pulumi.String("10.0.0.0/16"),
        Tags: pulumi.StringMap{
            "Name":        pulumi.String("my-vpc"),
            "environment": pulumi.String("staging"),
            "team":        pulumi.String("engineering"),
            "product":     pulumi.String("cpu-development"),
        },
    })

Then, a new EC2 instance is created using the ec2.NewInstance() function. The function takes in an InstanceArgs struct which specifies the instance type, VPC security group IDs, AMI ID, and tags.

    // Note: with no SubnetId set, AWS places the instance in the default VPC;
    // for a real deployment, add a subnet in my-vpc and set SubnetId.
    _, err = ec2.NewInstance(ctx, "my-instance", &ec2.InstanceArgs{
        InstanceType: pulumi.String("t2.micro"),
        VpcSecurityGroupIds: pulumi.StringArray{
            vpc.DefaultSecurityGroupId,
        },
        Ami: pulumi.String("ami-0bd3b255f1beeae5e"), // Ubuntu 22.04 us-west-2
        Tags: pulumi.StringMap{
            "Name":        pulumi.String("my-instance"),
            "environment": pulumi.String("staging"),
            "team":        pulumi.String("engineering"),
            "product":     pulumi.String("cpu-development"),
        },
    })

Finally, a new Cost and Usage Report is created using the cur.NewReportDefinition() function. The function takes in a ReportDefinitionArgs struct which specifies the report name, time unit, format, compression, S3 bucket, prefix, and region, additional schema elements, additional artifacts, whether to refresh closed reports, and report versioning.

    _, err = cur.NewReportDefinition(ctx, "my-cur-report", &cur.ReportDefinitionArgs{
        ReportName:  pulumi.String("my-report"),
        TimeUnit:    pulumi.String("HOURLY"),
        Format:      pulumi.String("Parquet"),
        Compression: pulumi.String("Parquet"),
        S3Bucket:    pulumi.String("my-bucket"),
        S3Prefix:    pulumi.String("cur/"),
        S3Region:    pulumi.String("us-west-2"),
        AdditionalSchemaElements: pulumi.StringArray{
            pulumi.String("RESOURCES"),
        },
        AdditionalArtifacts: pulumi.StringArray{
            pulumi.String("ATHENA"),
        },
        RefreshClosedReports: pulumi.Bool(false),
        // The ATHENA artifact requires OVERWRITE_REPORT versioning.
        ReportVersioning: pulumi.String("OVERWRITE_REPORT"),
    })

The Cost and Usage Report provides detailed information about the usage and costs incurred by an AWS account. This report can be exported to an S3 bucket in a format that can be read by Amazon Athena. By creating a table in Athena with the schema of the Cost and Usage Report data, you can then run SQL queries against this data to analyze usage patterns, identify cost-saving opportunities, and optimize spending.
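As an illustration of the kind of query this enables, the sketch below builds an Athena SQL statement that totals unblended cost per value of a cost allocation tag for one month. It rests on several assumptions: the table name my_report is hypothetical, the tag has been activated as a cost allocation tag so that it appears as a resource_tags_user_<key> column, and the table is partitioned by string year and month columns. The CostByTagQuery helper is likewise hypothetical and only assembles the query text; it does not call Athena.

```go
package main

import (
	"fmt"
	"regexp"
)

// identPattern restricts table and tag names to simple lowercase
// identifiers so they can be placed in the query text safely.
var identPattern = regexp.MustCompile(`^[a-z][a-z0-9_]*$`)

// CostByTagQuery returns an Athena SQL query that sums unblended cost per
// value of the given tag for one billing month. The column and partition
// names follow the assumptions stated in the lead-in above.
func CostByTagQuery(table, tagKey string, year, month int) (string, error) {
	if !identPattern.MatchString(table) || !identPattern.MatchString(tagKey) {
		return "", fmt.Errorf("invalid identifier: table=%q tagKey=%q", table, tagKey)
	}
	if month < 1 || month > 12 {
		return "", fmt.Errorf("invalid month: %d", month)
	}

	column := "resource_tags_user_" + tagKey
	query := fmt.Sprintf(
		"SELECT %s AS tag_value, SUM(line_item_unblended_cost) AS cost "+
			"FROM %s WHERE year = '%d' AND month = '%d' "+
			"GROUP BY %s ORDER BY cost DESC",
		column, table, year, month, column,
	)
	return query, nil
}

func main() {
	query, err := CostByTagQuery("my_report", "team", 2023, 5)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(query)
}
```

The resulting text can be pasted into the Athena console or submitted through the AWS SDK. Rows where the tag column is empty correspond to resources that were never tagged, which is usually the first thing worth investigating.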

This approach enables more effective tracking of usage and allocation of costs, improving the overall effectiveness of FinOps initiatives.

Conclusion

Infrastructure as Code (IaC) is a powerful tool that can help organizations manage their cloud resources more efficiently and effectively, which in turn can reduce costs. By defining and managing cloud infrastructure as code, teams can automate resource management, track changes over time, and identify cost-saving opportunities more easily.

Additionally, IaC enables teams to implement FinOps best practices by setting cost allocation tags and enabling teams to more effectively track usage and allocate costs to the teams and individuals responsible for them. With IaC, teams can optimize their cloud spending, avoid waste, and make data-driven decisions about how to use cloud resources best.

By leveraging tools like Pulumi, teams can more effectively implement IaC and enable FinOps best practices. Whether you’re just getting started with cloud infrastructure or looking to optimize your existing resources, IaC is a powerful tool that can help you get more value out of your cloud services while also improving your bottom line.