Tapis Vault
Note
This guide is for users wanting to deploy Tapis software in their own datacenter. Researchers who simply want to make use of the Tapis APIs do not need to deploy any Tapis components and can ignore this guide.
Introduction to Tapis Vault
Tapis stores all its secrets in an instance of Hashicorp Vault. It is critical that access to Vault is tightly controlled and its secret content safeguarded. The only Tapis component that interacts directly with Vault is the Security Kernel and its associated utility programs.
When planning a Tapis installation, one should consider the tradeoff between automation and robust secret management. In highly automated environments like Kubernetes, services are automatically restarted when they fail or need to be moved between nodes. This level of automation requires writing at least one secret to a disk that software can read. Vault initialization, on the other hand, is geared toward having a human in the loop to execute its unseal protocol and to protect high-value tokens.
Tapis supports two levels of Vault automation. When running in a Kubernetes cluster, Vault’s unseal keys and a long-lived token are written to disk on the control plane. Kubernetes accesses these secrets when it initializes a new Vault instance or it needs to restart Vault. These secrets are protected by the operating system’s account and access control mechanisms; if those mechanisms are compromised, the Vault’s contents are vulnerable.
Tapis also supports deploying Vault outside a Kubernetes cluster while the rest of Tapis runs in the cluster. In this configuration, a data center can deploy Vault to meet its local security policies and standards, including limiting administrative access to Vault, running Vault in a physically secure location, employing a hardware security module, etc.
In the following two sections we discuss how the community version of Vault can be run inside Kubernetes and on a VM outside of Kubernetes. In both cases, Vault needs to be installed, certain capabilities need to be enabled, Tapis-specific policies and roles need to be created, and administrative processes need to be put in place. Other configurations, such as running enterprise Vault or sharing Vault with other applications, are not covered, but information from this discussion can be adapted to them.
Deploying Vault in Kubernetes
Using Tapis Deployer, Tapis’s top-level burnup script configures and initializes Vault early in the deployment process. The Vault burnup script executes from the Kubernetes control plane and performs these tasks:
Configures Vault’s network and site location
Configures Vault storage
Creates and initializes Vault (new installations only)
Creates vault file containing unseal keys and root token
Creates vault-token file containing root token from vault file (if necessary)
Pushes unseal keys and root token to Kubernetes secrets
Starts the Vault pod and unseals it (if necessary)
The deployer scripts detect the vault file to determine whether Vault has been installed. This file and the vault-token file are placed in the user’s home directory by default, but the location can be changed.
At this point, a Vault docker image is running in a Kubernetes pod and deployment scripts have written the unseal keys and the root token to files. These secrets are also written to Kubernetes secrets to make them available to pods and jobs. Kubernetes has all the secret information needed to restart Vault whenever it detects a failure or needs to move the pod. This automation comes at the cost of having sensitive Vault secrets on disk in the Kubernetes control plane.
The next phase initializes Vault with Tapis roles, policies and secrets using the SkAdmin utility program. SkAdmin’s secrets management capabilities are discussed in depth in their own topic; here we focus on its role during Vault initialization.
SkAdmin connects to Vault with administrative permissions so that it can set up Tapis’s standard and administrative policies. SkAdmin also sets up Tapis’s standard and administrative roles.
Once SkAdmin completes setting up Tapis’s roles and policies, Vault is ready to accept Tapis service and user secrets. See the secrets discussion for details.
Deploying Vault Outside of Kubernetes
The procedure below describes how to build a Vault VM on Linux, which we’ll call “tapis-vault”. This procedure can also be used as a template for building Vault instances in other environments and follows the same general outline as discussed above for Kubernetes environments.
Important security characteristics of the VM installation approach are:
No Vault tokens are ever stored on the VM’s disk.
No unseal keys are ever stored on the VM’s disk.
These secrets are kept off the Vault machine so that, even if the root account is compromised, Vault’s secrets remain inaccessible unless Vault is unsealed and the attacker holds a valid token.
That said, automation inside Kubernetes will typically require a scoped token to start the Security Kernel and that token will most likely be saved in the control plane.
We assume the installation uses the open source version of Hashicorp Vault, so we do not have access to Vault’s enterprise features. Tapis has been tested with Vault v1.8.3 and we assume a version compatible with that is being used.
The procedure can be split into two phases. The first phase requires command line access to the Vault host as root. The second phase can be performed remotely using the Vault REST APIs and a root token.
PHASE I - Command Line Execution
Step 1 - Acquire A VM
Acquire a VM with a TLS certificate installed. We’ll use the fictitious domain “mydomain.com” for illustrative purposes.
Step 2 - Install Hashicorp Vault
SSH into the target VM as root. Follow the installation instructions for your package manager to install Vault natively:
https://learn.hashicorp.com/tutorials/vault/getting-started-install?in=vault/getting-started
https://developer.hashicorp.com/vault/docs/install
Step 3 - Change Private Key Access
Now that Vault is installed, change the group of the private key to “vault” and allow group read. Here are example commands:
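The original example commands are not reproduced in this text. A minimal sketch of what they might look like follows, assuming the key file path used for tls_key_file in the Step 4 example and a “vault” group created by the Vault package. The helper name grant_key_access is hypothetical, and mode 640 is one way to allow group read while denying all other access.

```shell
# Hypothetical helper: hand a private key to a group and allow group read.
# Usage: grant_key_access <key-file> <group>
grant_key_access() {
    chgrp "$2" "$1" && chmod 640 "$1"   # owner rw, group r, others none
}

# As root on the Vault VM (the path is an assumption; use your own key file):
# grant_key_access /etc/pki/tls/private/tapis-vault-key.20230403 vault
```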
It’s also a good idea to create /etc/pki/tls/certs/README.VAULT explaining the steps you took to customize your VM.
Step 4 - Configure Vault for RAFT Storage
Save a copy of the original /etc/vault.d/vault.hcl, then update /etc/vault.d/vault.hcl to use the RAFT backend. Here are the contents of an example vault.hcl file that can serve as a template for your configuration:
# Full configuration options can be found at https://www.vaultproject.io/docs/configuration
ui = true
disable_mlock = true
cluster_addr = "https://tapis-vault.mydomain.com:8201"
api_addr = "https://tapis-vault.mydomain.com:8200"
storage "raft" {
  path = "/opt/vault/data"
  node_id = "node_1"
}
# HTTPS listener
listener "tcp" {
  address = "0.0.0.0:8200"
  tls_cert_file = "/etc/pki/tls/certs/certchain.pem"
  tls_key_file = "/etc/pki/tls/private/tapis-vault-key.20230403"
  tls_client_ca_file = "/etc/pki/tls/certs/certchain.pem"
}
More information about using the RAFT protocol can be found in the Vault documentation on integrated storage.
Step 5 - Start Vault
Test the installation (customize for your hostname):
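One possible form of this test is sketched below. It assumes the package installed a systemd unit named “vault” and that Vault listens on port 8200 as in the example configuration. The helper name start_and_check_vault is hypothetical.

```shell
# Hypothetical helper: start the Vault service and query its status over TLS.
# Usage: start_and_check_vault <vault-hostname>
start_and_check_vault() {
    systemctl start vault || return 1
    export VAULT_ADDR="https://$1:8200"
    # "vault status" exits 0 when unsealed, 2 when sealed, 1 on error.
    vault status
}

# As root on the Vault VM:
# start_and_check_vault tapis-vault.mydomain.com
```

Before Step 6 has been run, a sealed, uninitialized Vault is the expected result, so an exit code of 2 at this point indicates that the server is reachable.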
Step 6 - Initialize Vault
vault operator init
Five unseal keys and the root token will be written to the screen. DO NOT SAVE THIS DATA PERMANENTLY ON THE FILE SYSTEM. Instead, copy the information off the screen and save it securely off the VM.
Step 7 - Unseal Vault
Vault requires 3 of the 5 unseal keys to unseal. Issue the operator unseal command 3 times, each time using a different key.
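The unseal calls can be sketched as a small loop. This assumes VAULT_ADDR is already exported as in Step 5. The helper name unseal_vault is hypothetical.

```shell
# Hypothetical helper: submit unseal keys until the threshold is met.
# With no key argument, "vault operator unseal" prompts for the key and
# does not echo it, so the key never appears in the shell history.
# Usage: unseal_vault [threshold]
unseal_vault() {
    local i=0
    while [ "$i" -lt "${1:-3}" ]; do
        vault operator unseal || return 1
        i=$((i + 1))
    done
}

# unseal_vault      # enter a different unseal key at each of the 3 prompts
```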
Step 8 - Export Root Token
To avoid saving the root token to the command history file:
where the command has a leading space and xxx is the token output by the above operator init command.
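The export described above would look roughly like the following sketch. Bash skips commands with a leading space only when HISTCONTROL contains “ignorespace” or “ignoreboth”, so the sketch sets that variable explicitly rather than assuming the VM’s default.

```shell
# A leading space hides a command from history only when HISTCONTROL
# contains "ignorespace" (or "ignoreboth"), so set it explicitly first.
HISTCONTROL=ignoreboth
 export VAULT_TOKEN=xxx   # leading space; xxx stands for the real root token
```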
Step 9 - Enable Authn Methods and Secrets Engines
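The commands for this step are not reproduced in this text. The sketch below shows the general shape using the standard Vault CLI. Which auth methods and secrets engines Tapis actually requires is an assumption here (AppRole authentication and a version 2 key/value engine mounted at “secret”), so check the Tapis deployment materials before running it. The helper name enable_vault_backends is hypothetical.

```shell
# Hypothetical helper. The choice of AppRole and a KV version 2 engine
# mounted at "secret" is an assumption, not taken from the Tapis docs.
# Requires VAULT_ADDR and VAULT_TOKEN to be exported (Steps 5 and 8).
enable_vault_backends() {
    vault auth enable approle &&
    vault secrets enable -path=secret -version=2 kv
}

# enable_vault_backends
```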
Step 10 - Check Remote Access
Before logging off, test remote access by running a status command that will be used in Phase II. On the remote machine, export the root token.
To avoid saving the root token to the command history file:
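On the remote machine the token can be exported with the same leading-space technique as in Step 8. The exact status command used by the original procedure is not shown here. As a sketch, the standard Vault REST endpoint sys/seal-status confirms that the server is reachable over TLS from the remote machine. The helper name remote_seal_status is hypothetical.

```shell
# Hypothetical helper: fetch Vault's seal status over HTTPS.
# Usage: remote_seal_status <vault-hostname>
remote_seal_status() {
    curl -sS -H "X-Vault-Token: $VAULT_TOKEN" \
        "https://$1:8200/v1/sys/seal-status"
}

# remote_seal_status tapis-vault.mydomain.com
```

A JSON response containing "sealed": false indicates that the Vault is unsealed and reachable.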
Step 11 - Logoff VM (optional)
All further configuration will be performed from the remote machine.
PHASE II - Remote Commands
Step 12 - Create SK Roles
On the remote machine terminal, export the root VAULT_TOKEN as shown in Step 10. Clone the tapis-vault-vm git repo into the current directory.
Step 13 - Test SK Roles (optional)
Step 14 - Create Roles and Policies
The tapis-vault/CreatePolicies.sh script encapsulates the basic policy and role creation needed for Tapis to function. See the comments in the script for details, but basically the script requires:
1. The current directory to be tapis-vault.
2. The VAULT_TOKEN environment variable to be set to a root token.
3. The DNS name of the new Vault VM to be provided on the command line.
Requirements 1 and 2 were already satisfied in the previous two steps, so an invocation of the script looks like this (but with your VM):
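Based on the three requirements, the invocation can be sketched as follows. The wrapper run_create_policies is hypothetical and only adds checks for the first two requirements before calling the script with the VM’s DNS name.

```shell
# Hypothetical wrapper: verify the preconditions, then run the script.
# Usage: run_create_policies <vault-hostname>
run_create_policies() {
    [ -n "${VAULT_TOKEN:-}" ] || { echo "VAULT_TOKEN is not set" >&2; return 1; }
    [ -x ./CreatePolicies.sh ] || { echo "run from the tapis-vault directory" >&2; return 1; }
    ./CreatePolicies.sh "$1"
}

# run_create_policies tapis-vault.mydomain.com
```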
Step 15 - View Roles (optional)
Each of the roles referenced in CreatePolicies.sh should be returned.
Step 16 - View Policies (optional)
Each of the policies listed in CreatePolicies.sh should be returned.
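The listing commands for Steps 15 and 16 are not reproduced in this text. A sketch using Vault’s REST API follows. It assumes the roles were created under the AppRole auth method, which is not confirmed here. The helper names are hypothetical.

```shell
# Hypothetical helpers built on Vault's REST API.
# Usage: list_approle_roles <vault-hostname>
list_approle_roles() {
    # Assumes the roles live under the AppRole auth method.
    curl -sS -X LIST -H "X-Vault-Token: $VAULT_TOKEN" \
        "https://$1:8200/v1/auth/approle/role"
}

# Usage: list_policies <vault-hostname>
list_policies() {
    curl -sS -H "X-Vault-Token: $VAULT_TOKEN" \
        "https://$1:8200/v1/sys/policy"
}

# list_approle_roles tapis-vault.mydomain.com
# list_policies tapis-vault.mydomain.com
```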
Step 17 - Create tapisroot Token
The tapisroot token is a root token that should be used instead of the original root token generated by Vault. If tapisroot gets compromised, it can easily be revoked and replaced.
Create a file named tapisroot.json with the content:
{
"display_name": "tapisroot",
"policies": [ "root" ],
"ttl": 0
}
Run this command:
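The original command is not reproduced in this text. A sketch that posts the request file to Vault’s standard token creation endpoint follows. The helper name create_token is hypothetical, and VAULT_TOKEN is assumed to still hold the original root token.

```shell
# Hypothetical helper: create a token from a JSON request file.
# Usage: create_token <vault-hostname> <request.json>
create_token() {
    curl -sS -X POST -H "X-Vault-Token: $VAULT_TOKEN" \
        --data "@$2" "https://$1:8200/v1/auth/token/create"
}

# create_token tapis-vault.mydomain.com tapisroot.json
```

The new token is returned in the “client_token” field of the “auth” object in the JSON response.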
Save the returned “client_token” in a secure place, such as stache or wherever you saved the original root token and unseal keys.
Step 18 - Test tapisroot Token (optional)
To avoid saving the root token to the command history file:
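As a sketch, the new token can be exported with a leading space, as in Step 8, and then checked with Vault’s lookup-self endpoint, which returns the properties of whatever token is presented. The helper name lookup_self is hypothetical.

```shell
# Hypothetical helper: show the properties of the token in $VAULT_TOKEN.
# Usage: lookup_self <vault-hostname>
lookup_self() {
    curl -sS -H "X-Vault-Token: $VAULT_TOKEN" \
        "https://$1:8200/v1/auth/token/lookup-self"
}

# First export the tapisroot token, typed with a leading space:
#    export VAULT_TOKEN=<client_token from Step 17>
# lookup_self tapis-vault.mydomain.com
```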
Step 19 - Remove Secrets from History
Remove any commands that leaked secrets into the history file. Enter “history” to see the numbered history records. To remove by line number:
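A sketch using bash’s history builtin follows. The helper name forget_history_line is hypothetical. It deletes the numbered entry from the current shell’s history list and then rewrites the history file so the entry is removed from disk as well.

```shell
# Hypothetical helper: delete one numbered history entry, then rewrite
# the history file so the entry is removed from disk as well.
# Usage: forget_history_line <line-number>
forget_history_line() {
    history -d "$1" && history -w
}

# history                    # find the number of the offending record
# forget_history_line 42     # 42 is a placeholder line number
```

Other shells that are still open keep their own in-memory history, so this should be repeated in any session where a secret was typed.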
Vault Backup
Tapis configures Vault to run with the raft storage type by default, which allows Vault to operate normally while its database is backed up. Vault provides these two administrative commands to save and restore backups:
vault operator raft snapshot save <outfile>
vault operator raft snapshot restore <infile>
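As an illustration of the save command, the sketch below writes a timestamped snapshot into a chosen directory. The helper name snapshot_vault and the file naming scheme are hypothetical, and VAULT_ADDR and VAULT_TOKEN are assumed to be exported.

```shell
# Hypothetical helper: save a timestamped snapshot into a directory.
# Usage: snapshot_vault <backup-dir>
snapshot_vault() {
    local out
    out="${1:-.}/vault-$(date +%Y%m%dT%H%M%S).snap"
    # Print the snapshot path only if the save succeeded.
    vault operator raft snapshot save "$out" && echo "$out"
}

# snapshot_vault /path/to/backups
```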
Tapis fills the gap in Vault’s community edition support by automating periodic backups in Vault VM environments. The tapis-vaultbackup repository contains the source code and documentation for a backup utility program. The program can be started in a secure manner to periodically take snapshots of the Vault database (once a day by default). It runs as a daemon until it is shut down. Typically, a separate cron job is set up to copy the backup files from the VM to one or more remote data stores as local policy dictates.
The program is written in Java and packaged as a self-contained executable. The executable is then packaged into an rpm for use on operating systems that support that package manager. There are no plans to support other package managers or container runtimes, but everything needed for such support is available in the repository.
Vault Export
The SkExport utility program provides a quick way to extract many Tapis secrets from Vault. The output is written to stdout as either JSON data or key/value pairs. One use of this program is to acquire Tapis service secrets and then to inject them into docker containers as environment variables. SkExport source code is part of the Security Kernel library and is available as a docker image.
SkExport parameters:
SkExport [options...]
-format (--format) [JSON | ENV] : JSON writes raw Vault data, ENV writes key=value (default: ENV)
-help (--help) : display help information (default: false)
-nosan (--nosanitize) : don't replace unsupported characters with underscore when -format=ENV (default is to sanitize)
-noskip (--noskipusersecrets) : don't skip user secrets (default is to skip)
-quote (--quoteenv) : enclose secret values in single quotes when -format=ENV (default: false)
-v (--verbose) : output statistics in addition to secrets (default no statistics)
-vtok (--vaulttoken) VAL : Vault token with proper authorization
-vurl (--vaulturl) VAL : Vault URL including port, ex: http(s)://host:32342
Running SkExport
The easiest way to execute SkExport is to run its docker image. The -vtok and -vurl parameters are required. Here’s an example of how to export the tapis service secrets (user and system secrets are skipped) in environment variable format with the values single quoted:
export SKEXPORT_PARMS='-quote -vtok xxxx -vurl https://tapis-vault.mydomain.com:8200'
docker run --env SKEXPORT_PARMS tapis/securityexport
This example outputs JSON data:
export SKEXPORT_PARMS='-format=JSON -vtok xxxx -vurl https://tapis-vault.mydomain.com:8200'
docker run --env SKEXPORT_PARMS tapis/securityexport
Since a token with at least as much authorization as the Security Kernel’s token must be used to extract secrets from Vault, and since secrets are being output in the clear, it’s important to take proper security precautions when using SkExport. These precautions include not leaving tokens or secrets in files and deleting sensitive information from the command line history file.
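One way to keep the token out of the history file is to prompt for it rather than type it on the command line. The sketch below builds an SKEXPORT_PARMS value that way. The helper name skexport_parms is hypothetical, and the options it emits are the ones from the first example above.

```shell
# Hypothetical helper: prompt for the Vault token without echoing it and
# print an SKEXPORT_PARMS value, so the token is never typed on a command
# line that could be recorded in the history file.
# Usage: skexport_parms <vault-url>
skexport_parms() {
    local vtok
    printf 'Vault token: ' >&2
    IFS= read -rs vtok
    printf '\n' >&2
    printf -- '-quote -vtok %s -vurl %s\n' "$vtok" "$1"
}

# export SKEXPORT_PARMS="$(skexport_parms https://tapis-vault.mydomain.com:8200)"
# docker run --env SKEXPORT_PARMS tapis/securityexport
# unset SKEXPORT_PARMS
```

The token still lives in the environment of the current shell until SKEXPORT_PARMS is unset, so the variable should be cleared once the export has finished.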