Twelve-Factor App Methodology on the Public Cloud
The popular twelve-factor app methodology is a widely adopted set of best practices for building modern cloud-native applications. In this article, I will discuss how to apply the twelve-factor app methodology on the public cloud with selected cloud-native services to build scalable, resilient applications. The methodology names five key features of modern cloud applications. Such applications:
- Use declarative formats for setup automation, to minimize time and cost for new developers joining the project;
- Have a clean contract with the underlying operating system, offering maximum portability between execution environments;
- Are suitable for deployment on modern cloud platforms, obviating the need for servers and systems administration;
- Minimize divergence between development and production, enabling continuous deployment for maximum agility;
- And can scale up without significant changes to tooling, architecture, or development practices.
These five key features are associated with CI/CD (continuous integration and continuous deployment), microservices, and containerization. I will use cloud-native services from AWS and GCP to describe each factor:
- I. Codebase
- II. Dependencies
- III. Config
- IV. Backing Services
- V. Build, release, run
- VI. Process
- VII. Port Binding
- VIII. Concurrency
- IX. Disposability
- X. Dev/prod parity
- XI. Logs
- XII. Admin Processes
Twelve-Factor App Methodology
You should track your code in a version control system, such as Git or Mercurial. A copy of the revision tracking database is known as a code repository, often shortened to code repo or just repo. The repo provides a place from which to do continuous integration (CI) and continuous deployment (CD). You can set up a private or self-hosted repo. Public cloud providers also offer version control services that you can use to privately store and manage assets (such as documents, source code, and binary files) in the cloud. GCP offers Cloud Source Repositories, a fully featured, scalable, private Git repository service for collaborating on and managing your code. AWS offers CodeCommit, a Git-based service that eliminates the need to manage your own source control system or worry about scaling its infrastructure. We will talk more about CI/CD in V. Build, release, run.
You should explicitly declare dependencies, automate their installation in a CI/CD process, and isolate the application with its dependencies by packaging them into a container. Many programming languages offer a way to explicitly declare dependencies:
- Node.js: npm
- Python: pip
- Java: Maven
- C#: NuGet
- Ruby: Bundler
- Go: Go modules (go get)
The major public cloud providers offer a container registry (e.g. GCP’s Container Registry and AWS Elastic Container Registry) to store, manage, and deploy Docker container images. You can integrate the container registry with your existing CI/CD to simplify your development-to-production workflow.
Storing config as constants in the code is a violation of twelve-factor. Config varies substantially across deploys; code does not. So the best practice is to keep config strictly separate from code and store it externally for each environment. If you still do this with configuration files, consider the better approach on the public cloud of storing configuration in environment variables. For example, leverage Lambda environment variables, which are encrypted at rest, to adjust your function’s behavior without updating code. Or create Kubernetes ConfigMaps to bind environment variables, port numbers, configuration files, command-line arguments, and other configuration artifacts to your pods’ containers and system components at runtime.
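As a minimal sketch of this factor in Python, the function below builds an immutable settings object from environment variables at startup. The variable names (DATABASE_URL, LOG_LEVEL, REQUEST_TIMEOUT_SECONDS) and the defaults are my own assumptions for illustration, not names prescribed by the methodology or by any cloud service.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Deploy-specific configuration; nothing here is hard-coded per environment."""
    database_url: str
    log_level: str
    request_timeout_seconds: float


def load_settings(environ=None) -> Settings:
    """Build Settings from environment variables (hypothetical names)."""
    env = os.environ if environ is None else environ
    # Required value: fail fast at startup if the deploy did not provide it.
    try:
        database_url = env["DATABASE_URL"]
    except KeyError:
        raise RuntimeError("DATABASE_URL must be set in the environment") from None
    return Settings(
        database_url=database_url,
        # Optional values fall back to defaults chosen for this sketch.
        log_level=env.get("LOG_LEVEL", "INFO"),
        request_timeout_seconds=float(env.get("REQUEST_TIMEOUT_SECONDS", "5")),
    )
```

The same code then runs unchanged in every environment. Whether the values come from a Lambda function's environment variables or from a Kubernetes ConfigMap mapped into the container, the application only ever reads the process environment.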
A backing service is any service the app consumes over the network as part of its normal operation, such as databases, messaging/queueing systems, file systems, and caching systems. The app should treat these as attached resources, with their locations and credentials externalized in the configuration as covered in III. Config. The public cloud providers offer cloud-native, fully managed services for these backing services. For example:
- AWS cloud storage services (such as Amazon S3) or GCP Cloud Storage for file storage
- AWS databases or GCP Cloud databases for all kinds of data stores, such as relational, NoSQL, data warehouse, in-memory cache, graph, and ledger
- For message queueing, AWS provides Amazon Simple Queue Service (SQS) and GCP provides Pub/Sub
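To illustrate what "externalized in the configuration" can look like, here is a small Python sketch that reads a backing service's location from a URL held in config instead of from code. The helper name describe_backing_service and the example URLs are hypothetical; a real application would pass the URL to its database or cache client library.

```python
from urllib.parse import urlsplit


def describe_backing_service(url: str) -> dict:
    """Split a resource URL (taken from config) into its connection parts."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,           # which kind of service, e.g. postgres or redis
        "host": parts.hostname,           # where it lives; differs per environment
        "port": parts.port,
        "name": parts.path.lstrip("/"),   # database name or index, if any
    }


# Swapping a local database for a managed one is then only a config change:
local = describe_backing_service("postgres://app:secret@localhost:5432/orders")
managed = describe_backing_service("postgres://app:secret@db.internal.example:5432/orders")
```

Because the code depends only on the URL, moving from a self-hosted database to a managed cloud database should not require a code change or a new build.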
The twelve-factor app uses strict separation between the build, release, and run stages. So you should have a CI/CD process for development and deployment. Deployment tools typically offer release management features. Every release, which is the result of combining a build with an environment’s configuration, should have a unique release ID. The release management tools should be able to roll back to a previous release and track the production deployment history.
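One possible way to derive such an ID is sketched below in Python: it joins a build identifier with a short digest of the environment's configuration. This scheme and the function name release_id are my own illustration; timestamps or incrementing numbers are equally valid release IDs, and CI/CD tools usually generate one for you.

```python
import hashlib
import json


def release_id(build_id: str, config: dict) -> str:
    """Combine a build identifier with a digest of the environment's config."""
    # Serialize with sorted keys so the same config always produces the same digest.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    config_digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
    return f"{build_id}+{config_digest}"
```

With this sketch, the same build deployed with two different configurations yields two different release IDs, and changing only the config produces a new release without a new build.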
The following two sample diagrams show a detailed view of the CI/CD pipeline steps from GCP and AWS:
Twelve-factor processes are stateless and share nothing with each other. Any data that needs to persist must be stored in a stateful backing service, typically a database. If the application relies on “sticky” sessions on-prem, then you need to change the way it handles persistent data in the cloud. Both GCP and AWS offer fully managed in-memory data store services for Redis and Memcached with scalable, secure, and highly available infrastructure. So you can use Amazon ElastiCache or GCP Memorystore as a backing service to cache state for your applications and share common data between processes. You can also leverage AWS Step Functions to coordinate the components of distributed applications and microservices using visual workflows, so that the processes execute in order and as expected.
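The Python sketch below shows the idea of moving session state out of the process. InMemoryStore is only a stand-in I wrote so the example runs on its own; in a real deployment its get and set calls would go to an external store such as Redis, and the key format is an assumption.

```python
class InMemoryStore:
    """Stand-in for an external store such as Redis; for illustration only."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def handle_add_to_cart(store, session_id, item):
    """Stateless handler: all session state is read from and written to the store."""
    key = f"cart:{session_id}"
    cart = list(store.get(key) or [])
    cart.append(item)
    store.set(key, cart)
    return cart
```

Because the handler keeps nothing in process memory between calls, any process instance can serve the next request for the same session, which is what removes the need for sticky sessions.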
The twelve-factor app is completely self-contained and does not rely on runtime injection of a webserver into the execution environment to create a web-facing service. Instead, the app exports its service by binding to a port, so you should not hard-code port numbers in your code. Follow III. Config and store the port in the environment. If you use AWS Elastic Kubernetes Service (EKS) or Google Kubernetes Engine (GKE) to run containerized applications, then Kubernetes provides service discovery by mapping service ports to container ports.
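As a rough sketch in Python, the standard library's HTTP server can stand in for a self-contained web process that binds to whatever port the environment provides. The variable name PORT and the fallback of 8080 are assumptions for this example.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; request logging is out of scope here


def make_server(environ=None) -> HTTPServer:
    """Create a server bound to the port named in the environment."""
    env = os.environ if environ is None else environ
    port = int(env.get("PORT", "8080"))  # PORT is an assumed variable name
    return HTTPServer(("0.0.0.0", port), HelloHandler)
```

Calling make_server().serve_forever() starts the process. The container or Kubernetes Service definition then decides which external port maps to it, without any change to the code.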
You should adopt the microservices architectural approach to software development. Microservices architectures allow a large application to be decomposed into small independent services that communicate over well-defined APIs. It makes the applications easier to scale and faster to develop, enabling innovation and accelerating time-to-market for new features. The following cloud-native services offer auto-scaling:
- AWS Lambda or GCP Cloud Functions
- AWS Auto Scaling groups or GCP managed instance groups with autoscaling
- AWS EKS or GCP GKE
The twelve-factor app’s processes are disposable, meaning they can be started or stopped at a moment’s notice. So you should keep start-up time minimal: watch how long it takes to run the start-up scripts, load the images, initialize the packages, and complete any other start-up tasks. Follow VI. Process and IV. Backing Services for decoupling functionality. Manage environment variables with the III. Config factor so you can use them at runtime. Processes should shut down gracefully when they receive a SIGTERM signal from the process manager (e.g. StopTask in AWS ECS, or pod termination with a grace period in Kubernetes).
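A minimal Python sketch of graceful shutdown is shown below. The worker loop, the process_one_item callback, and the polling interval are hypothetical; the point is only that SIGTERM sets a flag and the process finishes its current unit of work before exiting.

```python
import signal
import threading

shutdown_requested = threading.Event()


def request_shutdown(signum, frame):
    """Signal handler: only record the request; the loop exits on its own."""
    shutdown_requested.set()


def install_signal_handlers():
    # Must be called from the main thread of the process.
    signal.signal(signal.SIGTERM, request_shutdown)
    signal.signal(signal.SIGINT, request_shutdown)


def run_worker(process_one_item, poll_seconds=1.0):
    """Process items until shutdown is requested; return how many were handled."""
    handled = 0
    while not shutdown_requested.is_set():
        if process_one_item():
            handled += 1
        else:
            # Nothing to do: wait, but wake up immediately if shutdown is requested.
            shutdown_requested.wait(poll_seconds)
    return handled
```

A process would call install_signal_handlers() once at start-up and then run_worker(...). The loop never abandons an item halfway, so the work should complete well within the grace period the process manager allows before it forcibly kills the process.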
The twelve-factor app is designed for continuous deployment (CD) by keeping the gap between development and production small. You should follow I. Codebase and V. Build, release, run to manage CI/CD. You can also use AWS CloudFormation or GCP Deployment Manager to model and provision your cloud resources consistently from development to production environments.
A twelve-factor app never concerns itself with routing or storage of its output stream. It should not attempt to write to or manage logfiles. Instead, each process writes its event stream, unbuffered, to stdout and leaves collection to the execution environment. Public cloud providers support this factor by offering operations services that collect these logs and help you track the performance of an application. Examples include AWS CloudWatch and GCP Cloud Logging.
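On the application side, this can be as small as the following Python sketch, which sends every log record to stdout as one JSON object per line. The JSON field names are my own choice, not a format required by CloudWatch or Cloud Logging.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON line (field names are illustrative)."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def configure_logging(stream=None, level=logging.INFO):
    """Send all log records to one stream (stdout by default), never to a file."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.handlers = [handler]  # replace any existing handlers in this sketch
    root.setLevel(level)
    return root
```

After configure_logging(), a call such as logging.getLogger("orders").info("created order %s", 42) emits a single line on stdout, and the platform's log agent takes care of shipping and storing it.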
One-off admin processes should use the same codebase, dependency isolation, and config as any other process in the same release. Admin/management tasks can be triggered by AWS CloudWatch Events, a CronJob in Kubernetes (e.g. on AWS EKS or GCP GKE), AWS Batch for more complicated jobs, or AWS ECS scheduled tasks.
The best practices from the twelve-factor app methodology provide an approach to building cloud-native applications. The public cloud providers offer managed services to ease the design and development cycles. Beyond the twelve-factor app methodology, cloud-native application development also needs to take security on the public cloud into account. I will cover this topic in a future post.