Pods
Introduction to the Pods Service
What is the Pods Service
The Pods Service is a web service and distributed computing platform providing pods-as-a-service (PaaS). The service implements a message broker and processor model that requests pods, alongside a health module to poll for pod data, including logs, status, and health. The primary use of this service is to have quick to deploy long-lived services based on Docker images that are exposed via HTTP or TCP endpoints listed by the API.
- The Pods service provides functionality for two types of pod solutions:
Templated Pods for run-as-is popular images. Neo4J is one example, the template manages TCP ports, user creation, and permissions.
Custom Pods for arbitrary docker images with less functionality. In this case we will expose port 5000 and do nothing else.
Using the Pods Service
Please create issues on our github repo and report problems to Christian R. Garcia. The service is available to researchers and students. To learn more about the the system, including getting access, follow the instructions in Getting Started.
Getting Started
This Getting Started guide will walk you through the initial steps of setting up the necessary accounts and installing the required software before moving to the Pods Quickstart, where you will create your first Pods service pod. If you are already using Docker Hub and the TACC TAPIS APIs, feel free to jump right to the Pods Quickstart or check out the Pods Live Docs.
Account Creation and Software Installation
Create a TACC account
The main instance of the Pods platform is hosted at the Texas Advanced Computing Center (TACC). TACC designs and deploys some of the world’s most powerful advanced computing technologies and innovative software solutions to enable researchers to answer complex questions. To use the TACC-hosted Pods service, please create a TACC account.
Create a Docker account
Docker is an open-source container runtime providing operating-system-level virtualization. The Pods service pulls images for its pods from the public Docker Hub. To register pods you will need to publish images on Docker Hub, which requires a Docker account .
Install the Tapis Python SDK
To interact with the TACC-hosted Abaco platform in Python, we will leverage the Tapis Python SDK, tapipy. To install it, simply run:
$ pip3 install tapipy
Attention
tapipy
works with Python 3.
Working with TACC OAuth
Authentication and authorization to the Tapis APIs uses OAuth2, a widely-adopted web standard. Our implementation of OAuth2 is designed to give you the flexibility you need to script and automate use of Tapis while keeping your access credentials and digital assets secure. This is covered in great detail in our Tenancy and Authentication section, but some key concepts will be highlighted here, interleaved with Python code.
Create an Tapis Client Object
The first step in using the Tapis Python SDK, tapipy, is to create a Tapis Client object. First, import
the Tapis
class and create python object called t
that points to the Tapis server using your TACC
username and password. Do so by typing the following in a Python shell:
Important
Support for Pods service in Tapipy was added in version 1.2.3.
# Import the Tapis object
from tapipy.tapis import Tapis
# Log into you the Tapis service by providing user/pass and url.
t = Tapis(base_url='https://tacc.tapis.io',
username='your username',
password='your password')
Generate a Token
With the t
object instantiated, we can exchange our credentials for an access token. In Tapis, you
never send your username and password directly to the services; instead, you pass an access token which
is cryptographically signed by the OAuth server and includes information about your identity. The Tapis
services use this token to determine who you are and what you can do.
# Get tokens that will be used for authenticated function calls
t.get_tokens()
print(t.access_token.access_token)
Out[1]: eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9...
Note that the tapipy t
object will store and pass your access token for you, so you don’t have to manually provide
the token when using the tapipy operations. You are now ready to check your access to the Tapis APIs. It will
expire though, after 4 hours, at which time you will need to generate a new token. If you are interested, you
can create an OAuth client (a one-time setup step, like creating a TACC account) that can be used to generate
access and refresh tokens. For simplicity, we are skipping that but if you are interested, check out the Tenancy and
Authentication section.
Check Access to the Tapis APIs
The tapipy t
object should now be configured to talk to all Tapis APIs on your behalf. We can check that the client is
configured properly by making any API call. For example, we can use the authenticator service to retrieve the full
TACC profile of our user. To do so, use the get_profile()
function associated with the authenticator
object on
the t
object, passing the username of the profile to retrieve, as follows.
t.authenticator.get_profile(username='apitest')
Out[1]:
create_time: None
dn: cn=apitest,ou=People,dc=tacc,dc=utexas,dc=edu
email: aci-cic@tacc.utexas.edu
username: apitest
Pods Quickstart
In this Quickstart, we will create an Pods service pod from a basic Python function. Then we will get pod credentials and logs.
Registering a templated Pod
To get started we’re going to create a templated Pod. To do this, we will use the Tapis
client object we created above
(see Working with TACC OAuth).
To register an pod using the tapipy library, we use the pods.create_pod()
method and pass the arguments describing
the pod we want to register through the function parameters. For example:
from tapipy.tapis import Tapis
t = Tapis(
base_url='https://tacc.tapis.io',
username='<userid>',
password='*********'
)
t.pods.create_pod(
pod_id='docpod',
pod_template='template/neo4j',
description='My example pod!'
)
As you can see, we’re using pod_template equal to template/neo4j, template being the namespace for our templated pods. You should see a response like this:
creation_ts: None
data_attached: []
data_requests: []
description: My example pod!
environment_variables:
pod_id: test
pod_template: template/neo4j
roles_inherited: []
roles_required: []
status: REQUESTED
status_container:
status_requested: ON
update_ts: None
url: docpod.pods.tacc.develop.tapis.io
Notes:
The pod_id given will be the id used by you to access the pod at all times. It must be lowercase and alphanumeric. It also must be unique within the tenant.
Pods returned a status of
REQUESTED
for the pod; behind the scenes, the Pods service has sent a message requesting the pod described to our backend spawner infrastructure. The pod’s image must be pulled, a pod service must be created (for networking), and the networking changes must propagate to the Pod’s proxy before the Pod is ready for use.When the pod itself has began running, the status will change to
AVAILABLE
. Networking takes time to propagate (expect <1 minute).An
AVAILABLE`
pod only means the pod itself has started, check pod logs to see what your container is writing to stdout.
At any point we can check the details of our pods, including its status, with the following:
t.pods.get_pod(pod_id='docpod')
The response format is identical to that returned from the t.pods.create_pod()
method.
Accessing a Pod
Once your pod is in the AVAILABLE
state your pod’s specified networking ports should be routed to port 443 at specified urls.
Read more at #`Pod Networking`_.
A pod’s access urls specified in the pod’s networking attribute. A pod can have multiple urls, each with different protocols and ports.
Retrieving the Logs
The Pods service collects the latest 10 MB of logs from the pod when running and makes them available
via the logs
endpoint. Let’s retrieve the logs from the pod we just made. We use the get_pod_logs()
method,
passing in pod_id
:
t.pods.get_pod_logs(pod_id='docpod')
The response should be similar to the following:
logs:
Fetching versions.json for Plugin 'apoc' from https://neo4j-contrib.github.io/neo4j-apoc-procedures/versions.json
Installing Plugin 'apoc' from https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/download/4.4.0.6/apoc-4.4.0.6-all.jar to /var/lib/neo4j/plugins/apoc.jar
Applying default values for plugin apoc to neo4j.conf
Fetching versions.json for Plugin 'n10s' from https://neo4j-labs.github.io/neosemantics/versions.json
Installing Plugin 'n10s' from https://github.com/neo4j-labs/neosemantics/releases/download/4.4.0.1/neosemantics-4.4.0.1.jar to /var/lib/neo4j/plugins/n10s.jar
Applying default values for plugin n10s to neo4j.conf
2022-06-16 00:36:14.423+0000 INFO Starting...
2022-06-16 00:36:15.602+0000 INFO This instance is ServerId{eba2fb15} (eba2fb15-713d-47ba-92a5-0a688696264d)
2022-06-16 00:36:17.468+0000 INFO ======== Neo4j 4.4.8 ========
2022-06-16 00:36:21.713+0000 INFO [system/00000000] successfully initialized: CREATE USER podsservice SET PLAINTEXT PASSWORD 'servicepass' SET PASSWORD CHANGE NOT REQUIRED
2022-06-16 00:36:21.734+0000 INFO [system/00000000] successfully initialized: CREATE USER test SET PLAINTEXT PASSWORD 'userpass' SET PASSWORD CHANGE NOT REQUIRED
2022-06-16 00:36:30.268+0000 INFO Upgrading security graph to latest version
2022-06-16 00:36:30.268+0000 INFO Setting version for 'security-users' to 2
2022-06-16 00:36:30.270+0000 INFO Upgrading 'security-users' version property from 2 to 3
2022-06-16 00:36:30.556+0000 INFO Called db.clearQueryCaches(): Query caches successfully cleared of 1 queries.
2022-06-16 00:36:30.667+0000 INFO Bolt enabled on [0:0:0:0:0:0:0:0%0]:7687.
2022-06-16 00:36:31.745+0000 INFO Remote interface available at http://pods-tacc-tacc-docpod:7474/
2022-06-16 00:36:31.750+0000 INFO id: B1F0F170083249DAAF9127203310961EF79B262C90EA04D9F08EB7F077DF19E7
2022-06-16 00:36:31.750+0000 INFO name: system
2022-06-16 00:36:31.751+0000 INFO creationDate: 2022-06-16T00:36:19.073Z
2022-06-16 00:36:31.751+0000 INFO Started.
We can see logs from our Neo4j image during the initialization process.
Conclusion
Congratulations! You’ve now created, registered, and accessed your first pod. There is a lot more you can do with the Pods service though. To learn more about the additional capabilities, please continue on to the Technical Guide.
Neo4j
Assuming the user has created a Neo4j pod and retrieved credentials (user/pass), the user can now connect to the DB with the Neo4j browser interface or Python Neo4j driver.
from neo4j import GraphDatabase
url = "bolt+s://podId.pods.tacc.tapis.io:443"
user = "podId"
passw = "autoRandomizedPassword"
neo = GraphDatabase.driver(url,
auth = (user, passw),
max_connection_lifetime=30)
Use the neo driver as follows to match and return number of nodes in DB.
with neo.session() as session:
result = session.run("MATCH (n) RETURN n")
for record in result:
print(f"Number of nodes in the database: {record}")
Neo4j has a browser based interface that can be used to interact with remote DBs. With this users can use the browser interface here: https://browser.neo4j.io/ with the Pods service.
Simple provide the following url and credentials to connect to the DB in browser.
url = "bolt+s://podId.pods.tacc.tapis.io:443"
user = "podId"
passw = "autoRandomizedPassword"
Users are able to runt he browser interface themselves, but that is not in scope for these docs.
Postgres
Assuming the user has created a Postgres pod and retrieved credentials (user/pass), the user can now connect to the DB with the Postgres’ PgAdmin interface or Python Postgres drivers.
To note, psycopg2 will be the Postgres driver used, there are more, use your preference.
import psycopg2
db_login = {
"host": "podId.pods.tacc.tapis.io",
"port": 443,
"database": "postgres",
"user": "podId",
"password": "autoRandomizedPassword"
}
conn = psycopg2.connect(**db_login)
pg_cursor = conn.cursor()
At this point the user will have a Python postgres driver with a pg_cursor tied to their DB.
For example, to get all tables in the DB, the user can run the following with the pg_cursor.
# get all tables
pg_cursor.execute("select relname from pg_class where relkind='r' and relname !~ '^(pg_|sql_)';")
print(pg_cursor.fetchall())
PgAdmin is an installable interface that can be used to interact with remote DBs. Simple provide the following url and credentials to connect to the DB in browser.
url = "podId.pods.tacc.tapis.io"
port = 443
user = "podId"
passw = "autoRandomizedPassword"
- To note:
PgAdmin can work through the browser.
PgAdmin GUI can be hosted by the Pods service, it just hasn’t been tried yet.
Future work. Only quickstart is currently complete.
Please view our API Reference to see what additional functionality is currently available.
API Reference
The following link is to our live-documentation that takes our OpenAPI v3 specification that is automatically generated and gives users the public endpoints available within the Pods API along with request body expected and descriptions for each field.