From 37a29bf72c9dde67b96b99ea4f141aff42644f76 Mon Sep 17 00:00:00 2001 From: Webster Mudge Date: Tue, 5 Sep 2023 17:29:37 -0400 Subject: [PATCH 01/10] Update requirements docs Signed-off-by: Webster Mudge --- public-cloud/aws/base/README.md | 4 +--- public-cloud/aws/cde/README.md | 4 +--- public-cloud/aws/cdf/README.md | 4 +--- public-cloud/aws/cml/README.md | 4 +--- public-cloud/aws/tf/README.md | 4 +--- 5 files changed, 5 insertions(+), 15 deletions(-) diff --git a/public-cloud/aws/base/README.md b/public-cloud/aws/base/README.md index 308acdd..9b10e74 100644 --- a/public-cloud/aws/base/README.md +++ b/public-cloud/aws/base/README.md @@ -6,12 +6,10 @@ To run, you need: -* Docker (or a Docker clone[^1]) +* Docker (or a Docker alternative) * AWS credentials (set via `AWS_PROFILE`) * CDP credentials (set via `CDP_PROFILE`) -[^1]: For example, [OrbStack](https://orbstack.dev) works well on OSX. - ## Set Up First, set up your `ansible-navigator` aka `cdp-navigator` environment -- follow the instructions in the top-level [README](../../../README.md#setting-up-ansible-navigator). diff --git a/public-cloud/aws/cde/README.md b/public-cloud/aws/cde/README.md index af11202..ffd0fc6 100644 --- a/public-cloud/aws/cde/README.md +++ b/public-cloud/aws/cde/README.md @@ -6,12 +6,10 @@ To run, you need: -* Docker (or a Docker clone[^1]) +* Docker (or a Docker alternative) * AWS credentials (set via `AWS_PROFILE`) * CDP credentials (set via `CDP_PROFILE`) -[^1]: For example, [OrbStack](https://orbstack.dev) works well on OSX. - ## Set Up First, set up your `ansible-navigator` aka `cdp-navigator` environment -- follow the instructions in the top-level [README](../../../README.md#setting-up-ansible-navigator). 
diff --git a/public-cloud/aws/cdf/README.md b/public-cloud/aws/cdf/README.md index 69b39bf..b21e44e 100644 --- a/public-cloud/aws/cdf/README.md +++ b/public-cloud/aws/cdf/README.md @@ -6,12 +6,10 @@ To run, you need: -* Docker (or a Docker clone[^1]) +* Docker (or a Docker alternative) * AWS credentials (set via `AWS_PROFILE`) * CDP credentials (set via `CDP_PROFILE`) -[^1]: For example, [OrbStack](https://orbstack.dev) works well on OSX. - ## Set Up First, set up your `ansible-navigator` aka `cdp-navigator` environment -- follow the instructions in the top-level [README](../../../README.md#setting-up-ansible-navigator). diff --git a/public-cloud/aws/cml/README.md b/public-cloud/aws/cml/README.md index 8e8b86e..d740212 100644 --- a/public-cloud/aws/cml/README.md +++ b/public-cloud/aws/cml/README.md @@ -6,12 +6,10 @@ To run, you need: -* Docker (or a Docker clone[^1]) +* Docker (or a Docker alternative) * AWS credentials (set via `AWS_PROFILE`) * CDP credentials (set via `CDP_PROFILE`) -[^1]: For example, [OrbStack](https://orbstack.dev) works well on OSX. - ## Set Up First, set up your `ansible-navigator` aka `cdp-navigator` environment -- follow the instructions in the top-level [README](../../../README.md#setting-up-ansible-navigator). diff --git a/public-cloud/aws/tf/README.md b/public-cloud/aws/tf/README.md index 2fc41c1..77d7b2c 100644 --- a/public-cloud/aws/tf/README.md +++ b/public-cloud/aws/tf/README.md @@ -8,12 +8,10 @@ To run, you need: -* Docker (or a Docker clone[^1]) +* Docker (or a Docker alternative) * AWS credentials (set via `AWS_PROFILE`) * CDP credentials (set via `CDP_PROFILE`) -[^1]: For example, [OrbStack](https://orbstack.dev) works well on OSX. - ## Set Up First, set up your `ansible-navigator` aka `cdp-navigator` environment -- follow the instructions in the top-level [README](../../../README.md#setting-up-ansible-navigator). 
From 62a437428268241994630ab65caacd9e2ede5031 Mon Sep 17 00:00:00 2001 From: Webster Mudge Date: Tue, 26 Sep 2023 20:37:12 -0400 Subject: [PATCH 02/10] Rename base to datalake and update docs to point at cldr-runner documentation Signed-off-by: Webster Mudge --- public-cloud/aws/{base => datalake}/.gitignore | 0 public-cloud/aws/{base => datalake}/README.md | 13 +++++++++---- .../aws/{base => datalake}/ansible-navigator.yml | 3 --- public-cloud/aws/{base => datalake}/definition.yml | 0 public-cloud/aws/{base => datalake}/inventory.ini | 0 public-cloud/aws/{base => datalake}/main.yml | 0 public-cloud/aws/{base => datalake}/teardown.yml | 0 7 files changed, 9 insertions(+), 7 deletions(-) rename public-cloud/aws/{base => datalake}/.gitignore (100%) rename public-cloud/aws/{base => datalake}/README.md (62%) rename public-cloud/aws/{base => datalake}/ansible-navigator.yml (93%) rename public-cloud/aws/{base => datalake}/definition.yml (100%) rename public-cloud/aws/{base => datalake}/inventory.ini (100%) rename public-cloud/aws/{base => datalake}/main.yml (100%) rename public-cloud/aws/{base => datalake}/teardown.yml (100%) diff --git a/public-cloud/aws/base/.gitignore b/public-cloud/aws/datalake/.gitignore similarity index 100% rename from public-cloud/aws/base/.gitignore rename to public-cloud/aws/datalake/.gitignore diff --git a/public-cloud/aws/base/README.md b/public-cloud/aws/datalake/README.md similarity index 62% rename from public-cloud/aws/base/README.md rename to public-cloud/aws/datalake/README.md index 9b10e74..46b9f84 100644 --- a/public-cloud/aws/base/README.md +++ b/public-cloud/aws/datalake/README.md @@ -12,12 +12,12 @@ To run, you need: ## Set Up -First, set up your `ansible-navigator` aka `cdp-navigator` environment -- follow the instructions in the top-level [README](../../../README.md#setting-up-ansible-navigator). 
+First, set up your `ansible-navigator` aka `cdp-navigator` environment -- follow the instructions in the [NAVIGATOR document](https://github.com/cloudera-labs/cldr-runner/blob/main/NAVIGATOR.md) in `cloudera-labs/cldr-runner`. Then, clone this project and change your working directory. ```bash -git clone https://github.com/cloudera-labs/cloudera-deploy.git; cd cloudera-deploy/public-cloud/aws/base +git clone https://github.com/cloudera-labs/cloudera-deploy.git; cd cloudera-deploy/public-cloud/aws/datalake ``` ## Configure @@ -37,11 +37,16 @@ admin_password: "Secret" # 1 upper, 1 special, 1 number, 8-64 chars. infra_region: us-east-2 ``` -NOTE: You can override these parameters with any typical Ansible _extra variables_ flags, i.e. `-e admin_password=my_password`. See the [FAQ](../../../FAQ.md#how-to-i-add-extra-variables-and-tags-to-ansible-navigator) for details. +> [!NOTE] +> You can override these parameters with any typical Ansible _extra variables_ flags, i.e. `-e admin_password=my_password`. See the [cldr-runner FAQ](https://github.com/cloudera-labs/cldr-runner/blob/main/FAQ.md#how-to-i-add-extra-variables-and-tags-to-ansible-navigator) for details. ### SSH Keys -This definition will create a new SSH keypair on the host in your `~/.ssh` directory if you do not specify a SSH public key. If you wish to use an existing SSH key already loaded into AWS, set `public_key_id` to the key's label. If you wish to use an existing SSH key, but need to have it loaded into AWS, then set `public_key_file` to the key's path. +This definition will create a new SSH keypair on the host in your `~/.ssh` directory if you do not specify a SSH public key. + +If you wish to use an existing SSH key already loaded into AWS, set `public_key_id` to the key's label in AWS. + +If you wish to use an existing SSH key, but need to have it loaded into AWS, then set `public_key_file` to the key's local path. 
## Execute diff --git a/public-cloud/aws/base/ansible-navigator.yml b/public-cloud/aws/datalake/ansible-navigator.yml similarity index 93% rename from public-cloud/aws/base/ansible-navigator.yml rename to public-cloud/aws/datalake/ansible-navigator.yml index a0dc3ef..b4b1e6b 100644 --- a/public-cloud/aws/base/ansible-navigator.yml +++ b/public-cloud/aws/datalake/ansible-navigator.yml @@ -49,9 +49,6 @@ ansible-navigator: arguments: - "--tls-verify=false" volume-mounts: - - src: "${ANSIBLE_COLLECTIONS_PATH}" - dest: "${ANSIBLE_COLLECTIONS_PATH}" - options: "Z" - src: "~/.aws" dest: "/runner/.aws" options: "Z" diff --git a/public-cloud/aws/base/definition.yml b/public-cloud/aws/datalake/definition.yml similarity index 100% rename from public-cloud/aws/base/definition.yml rename to public-cloud/aws/datalake/definition.yml diff --git a/public-cloud/aws/base/inventory.ini b/public-cloud/aws/datalake/inventory.ini similarity index 100% rename from public-cloud/aws/base/inventory.ini rename to public-cloud/aws/datalake/inventory.ini diff --git a/public-cloud/aws/base/main.yml b/public-cloud/aws/datalake/main.yml similarity index 100% rename from public-cloud/aws/base/main.yml rename to public-cloud/aws/datalake/main.yml diff --git a/public-cloud/aws/base/teardown.yml b/public-cloud/aws/datalake/teardown.yml similarity index 100% rename from public-cloud/aws/base/teardown.yml rename to public-cloud/aws/datalake/teardown.yml From 63cfde967c6b6c1601298488215fcd9cf8c2fc71 Mon Sep 17 00:00:00 2001 From: Webster Mudge Date: Tue, 26 Sep 2023 20:38:09 -0400 Subject: [PATCH 03/10] Update FAQ to migrate some entries to cldr-runner and add entry for migration Signed-off-by: Webster Mudge --- FAQ.md | 62 ++++++++-------------------------------------------------- 1 file changed, 8 insertions(+), 54 deletions(-) diff --git a/FAQ.md b/FAQ.md index 67842f3..5533d54 100644 --- a/FAQ.md +++ b/FAQ.md @@ -1,48 +1,29 @@ # Frequently Asked Questions +Be sure to check out the [Discussions > 
Help](https://github.com/cloudera-labs/cloudera-deploy/discussions/categories/help) category for the latest answers. + ## Where did everything go? The project undertook some serious remodeling, but rest assured, your definitions will still work as they did in the previous version of `cloudera-deploy`. Okay, but where did everything go? Well... -1. The `quickstart.sh` migrated to `ansible-navigator`. Both of these applications use a container based on `ansible-runner`, i.e. [`cldr-runner`](https://github.com/cloudera-labs/cldr-runner), to execute the playbooks, yet `ansible-navigator` is configuration-driven and better aligned with how AWX runs Ansible in containers. Also, `ansible-navigator` brings a nifty UI and the ease of use to handle different execution modes. +1. The `quickstart.sh` migrated to `ansible-navigator`. Both of these applications use a container based on `ansible-runner`, i.e. [`cldr-runner`](https://github.com/cloudera-labs/cldr-runner), to execute the playbooks, yet `ansible-navigator` is configuration-driven and better aligned with how AWX runs Ansible in containers. Also, `ansible-navigator` brings a nifty text-based UI (TUI) and the ease of use to handle different execution modes. We also migrated `cldr-runner` to use `ansible-builder`, but you can read more about that effort at the [`cldr-runner`](https://github.com/cloudera-labs/cldr-runner) project. 1. The original `cloudera-deploy` playbooks moved into `cloudera.exe`. Starting with Ansible `2.11`, [collections can contain playbooks](https://docs.ansible.com/ansible/latest/collections_guide/collections_using_playbooks.html#using-a-playbook-from-a-collection). We call the playbooks using `import_playbook` like roles. - PLEASE NOTE, if you are developing your own project playbooks, you must first set up your `cloudera-deploy` variables _before_ calling the playbooks by running the `cloudera.exe.init_deployment` role on `localhost`. 
+ > [!IMPORTANT] + > If you are developing your own project playbooks, you must first set up your `cloudera-deploy` variables _before_ calling the playbooks by running the `cloudera.exe.init_deployment` role. -1. The _run-levels_ still remain; you can still use `-t infra` for example. However, the playbooks themselves are more granular and overall set up and tear down processes are now separate playbooks. +1. The _runlevels_ still remain; you can still use `-t infra` for example. However, the playbooks themselves are more granular and overall set up and tear down processes are now separate playbooks. This change promotes composibility and reusability, and we are going to continue to break apart the functions and operations within `cloudera-deploy` and -- most importantly -- the collections that drive this application. We fully expect that you will want to adapt and create your own "deploy" application, one that caters to _your_ needs and operating parameters. Switching to a more granular, more modular approach is key to this objective. -## How to I add _extra variables_ and tags to `ansible-navigator`? - -If you want to run a playbook with a given tag, e.g. `-t infra`, then simply add it as a parameter to the `ansible-navigator` commandline. For example, `ansible-navigator run playbook.yml -t infra`. - -Like tags, so you can pass _extra variables_ to `ansible-navigator` and the underlying Ansible command. For example, `ansible-navigator run playbook.yml -e @some_config.yml -e some_var=yes`. - -## How do I tell `ansible-navigator` where to find collections and roles? - -By default, `cloudera-deploy` expects to use the collections, roles, and libraries within the _execution environment_ container, that is `cldr-runner`. Make sure you do _not_ have `ANSIBLE_COLLECTIONS_PATH` or `ANSIBLE_ROLES_PATH` set or `ansible-navigator` will pick up these environment variables and pass them to the running container. 
The underlying `ansible` application, like `ansible-playbook` will then pick up these environment variables and attempt to use them if set! This behavior is great if you want to use host-based collections, e.g. local development, but you need to ensure that you update the `ansible-navigator.yml` configuration file to mount the host collection and/or role directories into the execution environment container. - -## `ansible-navigator` hangs when I run my playbook. What is going on? - -`ansible-navigator` does not handle user prompts when running in the `curses` UI, so actions in your playbook like: +## How do I run my `cloudera-deploy` V1 playbooks in `ansible-navigator`? -* Vault passwords -* SSH passphrases -* Debugger statements - -will not work out-of-the-box. You can enable `ansible-navigator` to run with prompts, but doing so will also disable the UI and instead run its operations using `stdout`. Try adding: - -```bash -ansible-navigator run --enable-prompts ... -``` - -to your execution. +See the [Migration V1](MIGRATION_V1.md) document for details. ## How can I view a previous `ansible-navigator` run to debug an issue? @@ -54,14 +35,6 @@ ansible-navigator replay runs/-.json Then you can use the UI to review the plays, tasks, and inventory for the previous run! -## How can I enable the playbook debugger? - -The [playbook debugger](https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_debugger.html) is enabled in `ansible-navigator` by setting the debugger and then enabling prompts. For example, - -```bash -ANSIBLE_ENABLE_TASK_DEBUGGER=True ansible-navigator run --enable-prompts main.yml -``` - ## How can I select just a single subnet using `subnet_filter`, say for a CDE definition? The various `filters`, like `subnet_filter`, `loadbalancer_subnets_filter`, etc., use [JMESPath](https://jmespath.org/) expressions against a list of subnet objects. 
Using expression like: @@ -114,22 +87,3 @@ You can [test sample filters](https://play.jmespath.org/?u=45e4d839-15f9-4569-94 } ] ``` - -## How to I configure SSH to avoid a "Failed to connect to new control master" error? - -When running connecting to a host via SSH while running `ansible-navigator`, in particular when you are working with Terraform inventory managed by the `cloud.terraform` inventory plugin, you might encounter the following error: - -``` -Failed to connect to the host via ssh: Control socket connect(/runner/.ansible/cp/b44b170fff): Connection refused -Failed to connect to new control master -``` - -To resolve, be sure to add the following variable to your `ansible-navigator.yml` configuration file: - -```yaml -ansible-navigator: - execution-environment: - environment-variables: - set: - ANSIBLE_SSH_CONTROL_PATH: "/dev/shm/cp%%h-%%p-%%r" -``` From 110520ffe3e6896d6a5aae791b5661ac3d885727 Mon Sep 17 00:00:00 2001 From: Webster Mudge Date: Tue, 26 Sep 2023 20:41:01 -0400 Subject: [PATCH 04/10] Add migration document Signed-off-by: Webster Mudge --- MIGRATION_V1.md | 187 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 187 insertions(+) create mode 100644 MIGRATION_V1.md diff --git a/MIGRATION_V1.md b/MIGRATION_V1.md new file mode 100644 index 0000000..bd5a1a3 --- /dev/null +++ b/MIGRATION_V1.md @@ -0,0 +1,187 @@ +# Migrating from V1 to V2 of `cloudera-deploy` + +## In Summary + +1. Don't change your `definition.yml` or `cluster.yml` files. +2. Create a playbook within your project to run your setup. You can start by referencing the following: + * [Public Cloud](public-cloud/aws/datalake/main.yml) + * Private Cloud (coming soon!) +3. Create an `ansible-navigator.yml` configuration in your project. You can start by referencing the following: + * [Public Cloud](public-cloud/aws/datalake/ansible-navigator.yml) + * Private Cloud (coming soon!) +4. Run your playbook by using `ansible-navigator` vs. `ansible-playbook`. 
+ * All other arguments apply, so continue to use `-e` and `-t` as needed, e.g. `ansible-navigator run your_playbook.yml -e key=value -t infra,plat,another_tag` + +## In Detail + +So, you may ask yourself, "How do I run my `cloudera-deploy` V1 playbooks in `ansible-navigator`?" + +Previously, you would execute the `quickstart.sh` script to bootstrap the `cldr-runner` image into a shell and then run your scripts _from the container shell_, e.g. `ansible-playbook /opt/cloudera-deploy/main.yml -e "definition_path=examples/sandbox" -t run,default_cluster -vvv`. While this mode is still certainly possible, the introduction of `ansible-navigator` simplifies these actions. + +**The most significant change**: the legacy definitions only contain configuration files -- the `definition.yml`, `cluster.yml`, `application.yml`, and `inventory_*` files -- and the legacy `cloudera-deploy` has local playbooks that orchestrated the whole run by calling a "sequence" role in `cloudera.exe`... No longer! + +So, what to do? First off... + +**Your existing platform configurations -- `definition.yml` and `cluster.yml`, specifically -- remain as they are. No changes are needed.** + +What does need to change? + +**You need to provide an entrypoint playbook.** + +Your project now needs a playbook, ala `main.yml`, to coordinate execution. This change allows for considerable flexibility as to how and when infrastructure and platform runlevels execute - frankly, how and when _any_ tasks, runlevel or otherwise, are run. + +In short, we have moved the responsibility of managing key sections of the "runlevel" from the `cloudera_deploy` _application_ to the project _itself_. This allows you, on a per-project basis, to define _exactly_ what you want, when you need it. Yet, you still can call on the common, shared order-of-operations for installing Cloudera Manager or spinning up a CDP Public Cloud Datalake that the legacy `cloudera-deploy` once had, rather forced you to have. 
A simple `ansible.builtin.import_playbook` pragma will include these _collection playbooks_ from the updated `cloudera.exe` collection. + +Here is an example. The previous `main.yml` file eventually calls the `cloudera.exe.sequence` role, which in turn calls the _runlevel_ roles. + +```yaml +# cloudera.exe.sequence/tasks/main.yml + +- name: Validate Infrastructure Configuration + ansible.builtin.include_role: + name: cloudera.exe.infrastructure + tasks_from: validate + # Truncated for clarity + +- name: Validate Platform Configuration + ansible.builtin.include_role: + name: cloudera.exe.platform + tasks_from: validate + # Truncated for clarity + +- name: Validate Runtime Configuration + ansible.builtin.include_role: + name: cloudera.exe.runtime + tasks_from: validate + # Truncated for clarity +``` + +([See this file in its entirety.](https://github.com/cloudera-labs/cloudera.exe/blob/v1.7.5/roles/sequence/tasks/main.yml)) + +The _v2.x_ of `cloudera.exe` (and via proxy, `cloudera-deploy`) moves this code from the role _into_ a playbook within `cloudera.exe`. + +Here is a _v2.x_ entrypoint playbook. It assumes that you want to handle infrastructure - say, for a sandbox install - as well as the CDP Public Cloud setup. (There is an explicit playbook to teardown.) + +```yaml +# cloudera-deploy/public-cloud/aws/datalake/main.yml + +- name: Set up the cloudera-deploy variables + hosts: localhost + connection: local + gather_facts: yes + tasks: + - name: Read definition variables + ansible.builtin.include_role: + name: cloudera.exe.init_deployment + public: yes + when: init__completed is undefined + tags: + - always + +- name: Set up CDP Public Cloud infrastructure (Ansible-based) + ansible.builtin.import_playbook: cloudera.exe.pbc_infra_setup.yml + +- name: Set up CDP Public Cloud (Env and DL example) + ansible.builtin.import_playbook: cloudera.exe.pbc_setup.yml +``` + +And the new `cloudera.exe` playbooks? 
+ +```yaml +# cloudera.exe/playbooks/pbc_infra_setup.yml + +- name: Set up CDP Public Cloud infrastructure (Ansible-based) + hosts: "{{ target | default('localhost') }}" + environment: "{{ globals.env_vars }}" + gather_facts: yes + tasks: + - name: Validate CDP Public Cloud infrastructure configuration + ansible.builtin.import_role: + name: cloudera.exe.infrastructure + tasks_from: validate + tags: + - validate + - initialize + - infra + + - name: Initialize CDP Public Cloud infrastructure setup + ansible.builtin.import_role: + name: cloudera.exe.infrastructure + tasks_from: initialize_setup + tags: + - initialize + - infra + + - name: Set up CDP Public Cloud infrastructure + ansible.builtin.import_role: + name: cloudera.exe.infrastructure + tasks_from: setup + tags: + - infra +``` + +```yaml +# cloudera.exe/playbooks/pbc_setup.yml + +- name: Set up CDP Public Cloud + hosts: "{{ target | default('localhost') }}" + environment: "{{ globals.env_vars }}" + gather_facts: yes + tasks: + - name: Validate Platform configuration + ansible.builtin.import_role: + name: cloudera.exe.platform + tasks_from: validate + tags: + - validate + - initialize + - plat + - run + + - name: Validate Data Services configuration + ansible.builtin.import_role: + name: cloudera.exe.runtime + tasks_from: validate + tags: + - validate + - initialize + - run + + - name: Initialize Platform setup + ansible.builtin.import_role: + name: cloudera.exe.platform + tasks_from: initialize_setup + tags: + - initialize + - plat + - run + + - name: Set up Platform + ansible.builtin.import_role: + name: cloudera.exe.platform + tasks_from: setup + tags: + - plat + - run + + - name: Initialize Data Services setup + ansible.builtin.import_role: + name: cloudera.exe.runtime + tasks_from: initialize_setup + tags: + - initialize + - run + + - name: Set up Data Services + ansible.builtin.import_role: + name: cloudera.exe.runtime + tasks_from: setup + tags: + - run +``` + +You can see that instead of calling the 
role and passing Ansible tags, you call the playbook, which now has _the very same code_ but without the need for some of the tags or the intermediate role, `cloudera.exe.sequence`. In fact, the playbooks in `cloudera.exe` have become the `cloudera.exe.sequence` role. + +You don't want to use the infrastructure playbook because you have your own process for establishing infrastructure? Great! Remove the `import_playbook` and call whatever is necessary! So long as you have run `cloudera.exe.init_deployment` in your project's playbook(s) _prior_ to importing any of the _collection playbooks_, you can use the collection playbooks anytime in your project playbooks. + +Need to discuss this further? Stop by the [Discussions > Help](https://github.com/cloudera-labs/cloudera.exe/discussions/categories/help)! From c944c40a9fa73c27c61fcb71fd017bfcc9ac4120 Mon Sep 17 00:00:00 2001 From: Webster Mudge Date: Wed, 27 Sep 2023 15:12:11 -0400 Subject: [PATCH 05/10] Update image and pull policies, remove ANSIBLE_COLLECTIONS_PATHS Signed-off-by: Webster Mudge --- public-cloud/aws/cde/ansible-navigator.yml | 8 ++------ public-cloud/aws/cdf/ansible-navigator.yml | 8 ++------ public-cloud/aws/cml/ansible-navigator.yml | 8 ++------ public-cloud/aws/datalake/ansible-navigator.yml | 5 ++--- 4 files changed, 8 insertions(+), 21 deletions(-) diff --git a/public-cloud/aws/cde/ansible-navigator.yml b/public-cloud/aws/cde/ansible-navigator.yml index 81bb177..10fe096 100644 --- a/public-cloud/aws/cde/ansible-navigator.yml +++ b/public-cloud/aws/cde/ansible-navigator.yml @@ -44,14 +44,10 @@ ansible-navigator: ANSIBLE_DEPRECATION_WARNINGS: False ANSIBLE_HOST_KEY_CHECKING: False ANSIBLE_SSH_RETRIES: 10 - image: ghcr.io/cloudera-labs/cldr-runner:aws-devel02 + image: ghcr.io/cloudera-labs/cldr-runner:aws-latest pull: - arguments: - - "--tls-verify=false" + policy: missing volume-mounts: - - src: "${ANSIBLE_COLLECTIONS_PATH}" - dest: "${ANSIBLE_COLLECTIONS_PATH}" - options: "Z" - src: "~/.aws" dest: 
"/runner/.aws" options: "Z" diff --git a/public-cloud/aws/cdf/ansible-navigator.yml b/public-cloud/aws/cdf/ansible-navigator.yml index a0dc3ef..10fe096 100644 --- a/public-cloud/aws/cdf/ansible-navigator.yml +++ b/public-cloud/aws/cdf/ansible-navigator.yml @@ -44,14 +44,10 @@ ansible-navigator: ANSIBLE_DEPRECATION_WARNINGS: False ANSIBLE_HOST_KEY_CHECKING: False ANSIBLE_SSH_RETRIES: 10 - image: ghcr.io/cloudera-labs/cldr-runner:aws-devel + image: ghcr.io/cloudera-labs/cldr-runner:aws-latest pull: - arguments: - - "--tls-verify=false" + policy: missing volume-mounts: - - src: "${ANSIBLE_COLLECTIONS_PATH}" - dest: "${ANSIBLE_COLLECTIONS_PATH}" - options: "Z" - src: "~/.aws" dest: "/runner/.aws" options: "Z" diff --git a/public-cloud/aws/cml/ansible-navigator.yml b/public-cloud/aws/cml/ansible-navigator.yml index a0dc3ef..10fe096 100644 --- a/public-cloud/aws/cml/ansible-navigator.yml +++ b/public-cloud/aws/cml/ansible-navigator.yml @@ -44,14 +44,10 @@ ansible-navigator: ANSIBLE_DEPRECATION_WARNINGS: False ANSIBLE_HOST_KEY_CHECKING: False ANSIBLE_SSH_RETRIES: 10 - image: ghcr.io/cloudera-labs/cldr-runner:aws-devel + image: ghcr.io/cloudera-labs/cldr-runner:aws-latest pull: - arguments: - - "--tls-verify=false" + policy: missing volume-mounts: - - src: "${ANSIBLE_COLLECTIONS_PATH}" - dest: "${ANSIBLE_COLLECTIONS_PATH}" - options: "Z" - src: "~/.aws" dest: "/runner/.aws" options: "Z" diff --git a/public-cloud/aws/datalake/ansible-navigator.yml b/public-cloud/aws/datalake/ansible-navigator.yml index b4b1e6b..10fe096 100644 --- a/public-cloud/aws/datalake/ansible-navigator.yml +++ b/public-cloud/aws/datalake/ansible-navigator.yml @@ -44,10 +44,9 @@ ansible-navigator: ANSIBLE_DEPRECATION_WARNINGS: False ANSIBLE_HOST_KEY_CHECKING: False ANSIBLE_SSH_RETRIES: 10 - image: ghcr.io/cloudera-labs/cldr-runner:aws-devel + image: ghcr.io/cloudera-labs/cldr-runner:aws-latest pull: - arguments: - - "--tls-verify=false" + policy: missing volume-mounts: - src: "~/.aws" dest: 
"/runner/.aws" From 7f6bbf9e2cfa0d3a9e522e20a1834dd43278ddab Mon Sep 17 00:00:00 2001 From: Webster Mudge Date: Wed, 27 Sep 2023 15:13:01 -0400 Subject: [PATCH 06/10] Rename public-cloud/aws/tf to public-cloud/aws/datalake-tf Signed-off-by: Webster Mudge --- public-cloud/aws/{tf => datalake-tf}/.gitignore | 0 public-cloud/aws/{tf => datalake-tf}/README.md | 0 .../aws/{tf => datalake-tf}/ansible-navigator.yml | 8 ++------ public-cloud/aws/{tf => datalake-tf}/config.yml | 0 public-cloud/aws/{tf => datalake-tf}/inventory.ini | 0 public-cloud/aws/{tf => datalake-tf}/main.yml | 0 .../aws/{tf => datalake-tf}/pbc_deploy_tf/main.tf | 0 .../aws/{tf => datalake-tf}/pbc_deploy_tf/outputs.tf | 0 .../aws/{tf => datalake-tf}/pbc_deploy_tf/variables.tf | 0 public-cloud/aws/{tf => datalake-tf}/pbc_infra_tf/main.tf | 0 .../aws/{tf => datalake-tf}/pbc_infra_tf/outputs.tf | 0 .../aws/{tf => datalake-tf}/pbc_infra_tf/variables.tf | 0 public-cloud/aws/{tf => datalake-tf}/teardown.yml | 0 .../templates/cdp_aws_deploy.tfvars.j2 | 0 .../templates/cdp_aws_prereqs.tfvars.j2 | 0 15 files changed, 2 insertions(+), 6 deletions(-) rename public-cloud/aws/{tf => datalake-tf}/.gitignore (100%) rename public-cloud/aws/{tf => datalake-tf}/README.md (100%) rename public-cloud/aws/{tf => datalake-tf}/ansible-navigator.yml (88%) rename public-cloud/aws/{tf => datalake-tf}/config.yml (100%) rename public-cloud/aws/{tf => datalake-tf}/inventory.ini (100%) rename public-cloud/aws/{tf => datalake-tf}/main.yml (100%) rename public-cloud/aws/{tf => datalake-tf}/pbc_deploy_tf/main.tf (100%) rename public-cloud/aws/{tf => datalake-tf}/pbc_deploy_tf/outputs.tf (100%) rename public-cloud/aws/{tf => datalake-tf}/pbc_deploy_tf/variables.tf (100%) rename public-cloud/aws/{tf => datalake-tf}/pbc_infra_tf/main.tf (100%) rename public-cloud/aws/{tf => datalake-tf}/pbc_infra_tf/outputs.tf (100%) rename public-cloud/aws/{tf => datalake-tf}/pbc_infra_tf/variables.tf (100%) rename public-cloud/aws/{tf => 
datalake-tf}/teardown.yml (100%) rename public-cloud/aws/{tf => datalake-tf}/templates/cdp_aws_deploy.tfvars.j2 (100%) rename public-cloud/aws/{tf => datalake-tf}/templates/cdp_aws_prereqs.tfvars.j2 (100%) diff --git a/public-cloud/aws/tf/.gitignore b/public-cloud/aws/datalake-tf/.gitignore similarity index 100% rename from public-cloud/aws/tf/.gitignore rename to public-cloud/aws/datalake-tf/.gitignore diff --git a/public-cloud/aws/tf/README.md b/public-cloud/aws/datalake-tf/README.md similarity index 100% rename from public-cloud/aws/tf/README.md rename to public-cloud/aws/datalake-tf/README.md diff --git a/public-cloud/aws/tf/ansible-navigator.yml b/public-cloud/aws/datalake-tf/ansible-navigator.yml similarity index 88% rename from public-cloud/aws/tf/ansible-navigator.yml rename to public-cloud/aws/datalake-tf/ansible-navigator.yml index a0dc3ef..10fe096 100644 --- a/public-cloud/aws/tf/ansible-navigator.yml +++ b/public-cloud/aws/datalake-tf/ansible-navigator.yml @@ -44,14 +44,10 @@ ansible-navigator: ANSIBLE_DEPRECATION_WARNINGS: False ANSIBLE_HOST_KEY_CHECKING: False ANSIBLE_SSH_RETRIES: 10 - image: ghcr.io/cloudera-labs/cldr-runner:aws-devel + image: ghcr.io/cloudera-labs/cldr-runner:aws-latest pull: - arguments: - - "--tls-verify=false" + policy: missing volume-mounts: - - src: "${ANSIBLE_COLLECTIONS_PATH}" - dest: "${ANSIBLE_COLLECTIONS_PATH}" - options: "Z" - src: "~/.aws" dest: "/runner/.aws" options: "Z" diff --git a/public-cloud/aws/tf/config.yml b/public-cloud/aws/datalake-tf/config.yml similarity index 100% rename from public-cloud/aws/tf/config.yml rename to public-cloud/aws/datalake-tf/config.yml diff --git a/public-cloud/aws/tf/inventory.ini b/public-cloud/aws/datalake-tf/inventory.ini similarity index 100% rename from public-cloud/aws/tf/inventory.ini rename to public-cloud/aws/datalake-tf/inventory.ini diff --git a/public-cloud/aws/tf/main.yml b/public-cloud/aws/datalake-tf/main.yml similarity index 100% rename from public-cloud/aws/tf/main.yml 
rename to public-cloud/aws/datalake-tf/main.yml diff --git a/public-cloud/aws/tf/pbc_deploy_tf/main.tf b/public-cloud/aws/datalake-tf/pbc_deploy_tf/main.tf similarity index 100% rename from public-cloud/aws/tf/pbc_deploy_tf/main.tf rename to public-cloud/aws/datalake-tf/pbc_deploy_tf/main.tf diff --git a/public-cloud/aws/tf/pbc_deploy_tf/outputs.tf b/public-cloud/aws/datalake-tf/pbc_deploy_tf/outputs.tf similarity index 100% rename from public-cloud/aws/tf/pbc_deploy_tf/outputs.tf rename to public-cloud/aws/datalake-tf/pbc_deploy_tf/outputs.tf diff --git a/public-cloud/aws/tf/pbc_deploy_tf/variables.tf b/public-cloud/aws/datalake-tf/pbc_deploy_tf/variables.tf similarity index 100% rename from public-cloud/aws/tf/pbc_deploy_tf/variables.tf rename to public-cloud/aws/datalake-tf/pbc_deploy_tf/variables.tf diff --git a/public-cloud/aws/tf/pbc_infra_tf/main.tf b/public-cloud/aws/datalake-tf/pbc_infra_tf/main.tf similarity index 100% rename from public-cloud/aws/tf/pbc_infra_tf/main.tf rename to public-cloud/aws/datalake-tf/pbc_infra_tf/main.tf diff --git a/public-cloud/aws/tf/pbc_infra_tf/outputs.tf b/public-cloud/aws/datalake-tf/pbc_infra_tf/outputs.tf similarity index 100% rename from public-cloud/aws/tf/pbc_infra_tf/outputs.tf rename to public-cloud/aws/datalake-tf/pbc_infra_tf/outputs.tf diff --git a/public-cloud/aws/tf/pbc_infra_tf/variables.tf b/public-cloud/aws/datalake-tf/pbc_infra_tf/variables.tf similarity index 100% rename from public-cloud/aws/tf/pbc_infra_tf/variables.tf rename to public-cloud/aws/datalake-tf/pbc_infra_tf/variables.tf diff --git a/public-cloud/aws/tf/teardown.yml b/public-cloud/aws/datalake-tf/teardown.yml similarity index 100% rename from public-cloud/aws/tf/teardown.yml rename to public-cloud/aws/datalake-tf/teardown.yml diff --git a/public-cloud/aws/tf/templates/cdp_aws_deploy.tfvars.j2 b/public-cloud/aws/datalake-tf/templates/cdp_aws_deploy.tfvars.j2 similarity index 100% rename from 
public-cloud/aws/tf/templates/cdp_aws_deploy.tfvars.j2
rename to public-cloud/aws/datalake-tf/templates/cdp_aws_deploy.tfvars.j2
diff --git a/public-cloud/aws/tf/templates/cdp_aws_prereqs.tfvars.j2 b/public-cloud/aws/datalake-tf/templates/cdp_aws_prereqs.tfvars.j2
similarity index 100%
rename from public-cloud/aws/tf/templates/cdp_aws_prereqs.tfvars.j2
rename to public-cloud/aws/datalake-tf/templates/cdp_aws_prereqs.tfvars.j2
From a3c90cd62448f6e62d85e929c71bab77e2b10e3f Mon Sep 17 00:00:00 2001
From: Webster Mudge
Date: Wed, 27 Sep 2023 15:13:14 -0400
Subject: [PATCH 07/10] Add CONTRIBUTING.md

Signed-off-by: Webster Mudge
---
 CONTRIBUTING.md | 82 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 82 insertions(+)
 create mode 100644 CONTRIBUTING.md

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..53a1ce7
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,82 @@
+# Contributing to cloudera-deploy
+
+Thank you for considering contributions to the `cloudera-deploy` project!
+
+# Submitting a pull request
+
+You can start work on issues that are not yet part of a [Milestone](https://github.com/cloudera-labs/cloudera-deploy/milestones) -- anything in our issue tracker that isn't assigned to a Milestone is considered the [backlog](https://github.com/cloudera-labs/cloudera-deploy/issues?q=is%3Aopen+is%3Aissue+no%3Amilestone).
+
+Before you start working, please announce that you want to do so by commenting on the issue. _([Create an issue](https://github.com/cloudera-labs/cloudera-deploy/issues/new?labels=enhancement) if there isn't one yet, and you can also check out our [Discussions](https://github.com/cloudera-labs/cloudera-deploy/discussions) for ideas.)_ We try to ensure that all active work is assigned to a Milestone in order to keep our backlog accurate.
+
+**When your work is ready for review, create a branch in your own forked repository from the `devel` branch and submit a pull request against `devel`, referencing the issue.**
+
+As a _best practice_, you can prefix your branches with:
+
+|prefix|Description|Example|
+|------|-----------|-------|
+|`feature/`|A new feature or changes to existing code or documentation|`feature/update-some-params`|
+|`fix/`|A non-urgent bug fix|`fix/refactor-some-params`|
+|`hotfix/`|An urgent bug fix|`hotfix/patch-insecure-params`|
+
+> [!NOTE]
+> :fire_extinguisher: A **hotfix** should branch from `main`. It will then be committed to both the `main` and `devel` branches.
+
+# Signing your commits
+
+Note that we require signed commits in line with [Developer Certificate of Origin](https://developercertificate.org/) best practices for open source collaboration.
+
+A signed commit is a simple one-liner at the end of your commit message that states that you wrote the patch or otherwise have the right to pass the change into open source. Signing your commits means you agree to:
+
+```
+Developer Certificate of Origin
+Version 1.1
+
+Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
+660 York Street, Suite 102,
+San Francisco, CA 94110 USA
+
+Everyone is permitted to copy and distribute verbatim copies of this
+license document, but changing it is not allowed.
+
+
+Developer's Certificate of Origin 1.1
+
+By making a contribution to this project, I certify that:
+
+(a) The contribution was created in whole or in part by me and I
+    have the right to submit it under the open source license
+    indicated in the file; or
+
+(b) The contribution is based upon previous work that, to the best
+    of my knowledge, is covered under an appropriate open source
+    license and I have the right under that license to submit that
+    work with modifications, whether created in whole or in part
+    by me, under the same open source license (unless I am
+    permitted to submit under a different license), as indicated
+    in the file; or
+
+(c) The contribution was provided directly to me by some other
+    person who certified (a), (b) or (c) and I have not modified
+    it.
+
+(d) I understand and agree that this project and the contribution
+    are public and that a record of the contribution (including all
+    personal information I submit with it, including my sign-off) is
+    maintained indefinitely and may be redistributed consistent with
+    this project or the open source license(s) involved.
+```
+
+(See [developercertificate.org](https://developercertificate.org/))
+
+To agree, make sure to add a line at the end of every git commit message, like this:
+
+```
+Signed-off-by: John Doe
+```
+
+> [!NOTE]
+> :rocket: TIP! Add the sign-off automatically when creating the commit via the `-s` flag, e.g. `git commit -s`.
+
+# Have questions? Opinions? Comments?
+
+Come find us on our [Discussions](https://github.com/cloudera-labs/cloudera-deploy/discussions)!
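Reviewer note: the sign-off step that CONTRIBUTING.md describes above can be tried out with a short sketch like the one below. It is illustrative only and assumes a throwaway repository created with `mktemp`, a hypothetical identity (`Jane Doe`, `jane.doe@example.com`), and a branch name borrowed from the prefix table; none of these values come from the project itself.

```shell
#!/bin/sh
set -eu

# Work in a throwaway repository so nothing here touches a real checkout.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Hypothetical identity for illustration; `git commit -s` copies it into the trailer.
git config user.name "Jane Doe"
git config user.email "jane.doe@example.com"

# A DCO sign-off is a message trailer, not a GPG signature; turn GPG signing
# off locally so the sketch runs even where it is enabled globally.
git config commit.gpgsign false

# Branch name follows the `feature/` prefix convention (assumed example name).
git checkout --quiet -b feature/update-some-params

echo "example" > notes.txt
git add notes.txt

# -s appends "Signed-off-by: <user.name> <user.email>" to the commit message.
git commit --quiet -s -m "Add example notes"

# Read back the trailer recorded on the commit.
signoff="$(git log -1 --format=%B | grep '^Signed-off-by:')"
echo "$signoff"
```

If a commit was made without the flag, `git commit --amend -s --no-edit` should add the trailer to the most recent commit, and `git rebase --signoff devel` can add it to every commit on a branch cut from `devel`; both rewrite history, so they are best used before the pull request is opened.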
From d1eb0e72a7fbe7624efabdcd2d6e2c56e9df26c3 Mon Sep 17 00:00:00 2001 From: Webster Mudge Date: Wed, 27 Sep 2023 15:13:34 -0400 Subject: [PATCH 08/10] Remove profile.yml and readme.adoc Signed-off-by: Webster Mudge --- profile.yml | 83 -------- readme.adoc | 577 ---------------------------------------------------- 2 files changed, 660 deletions(-) delete mode 100644 profile.yml delete mode 100644 readme.adoc diff --git a/profile.yml b/profile.yml deleted file mode 100644 index 1912ed8..0000000 --- a/profile.yml +++ /dev/null @@ -1,83 +0,0 @@ -# Copyright 2021 Cloudera, Inc. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -### Mandatory ### - -## Authentication # -## Ideally for toy deployment 12-16 chars, 1 Cap, 1 num, 1 special. This satisfies password requirements for *most* subsystems -## You should set proper and unique passwords for serious deployments using documented overrides -#admin_password: Mysecretpword1! - -### Optional ### -## The following configurations will be applied globally - -## Namespace ## - -## This will be prefixed to most names used. All deployments should be prefixed to allow differentiation and controlled teardown -## Ideally it should not exceed 6 chars, and must start with a letter, and be <= 4 chars for Azure. 
-## Will default to the name_prefix in roles/cloudera_deploy/defaults/main.yml if not set -#name_prefix: cldr - -## Tags ## - -## Tags are applied to deployed infrastructure, particularly chargeable infrastructure on cloud providers -## You should use tags to identify your services to make it easy to track and remove them when no longer needed -#tags: -# owner: dchaffelson@cloudera.com -# enddate: "01012022" - -## Cloud Infrastructure ## - -## Specifies the Cloud Infrastructure provider, CDP presently supports GCP, AWS and Azure -## Not necessary when using static Ansible inventory -#infra_type: aws - -## Specify the default region you prefer for your infrastructure provider -## It must be valid for the selected infra_type -## The automation service attempts to validate that the requested build will work in the given region, so some may be rejected -#infra_region: us-east-1 - -## SSH ## - -## Public Key file if using Azure or GCP as Cloud Infrastructure, or deploying Private Cloud -## Should be in your local profile or the Definition Path -## If not supplied one will be generated in the Definition path, along with a matching Private Key file ignoring the private_key_file setting below -#public_key_file: '~/.ssh/mykey.pub' - -## Private key file -## Required if deploying Dynamic Inventory to set the Ansible Connection Parameters -## Should be in your local profile or the Definition Path -## Must be set if public_key_file is set -#private_key_file: '~/.ssh/mykey.pem' - -## Key Name if using AWS as Cloud Infrastructure -## Defaults to the Namespace if not set -## Must be set if public_key_file is set -#public_key_id: mykey - -## Cloudera License ## - -## Path to your Cloudera License file -## Required if deploying a CDP Cluster for Private Cloud in a mode other than Trial -## Should be in your local profile or the definition directory -## A cluster with a Trial license will be deployed if this is not specified -#license_file: "~/.cdp/my_cloudera_license_2021.txt" - 
-## Cloud Credentials ## - -## Path to Google Cloud Credentials, if using Google Cloud -## Should be in your local profile -## We recommend they should not be located anywhere near a version controlled directory like git to avoid accidental inclusion! -## If using Azure or AWS the credentials will be automatically collected from your local user profile -#gcloud_credential_file: '~/.config/gcloud/mycreds.json' diff --git a/readme.adoc b/readme.adoc deleted file mode 100644 index a81b929..0000000 --- a/readme.adoc +++ /dev/null @@ -1,577 +0,0 @@ -= Cloudera Deploy -cloudera-labs@cloudera.com -v1.6.1 -:page-layout: docs -:description: Cloudera Deploy Documentation -:imagesdir: ./images -:icons: font -:toc: -:toc-placement!: -:sectnums: -:sectnumlevels 3: -ifdef::env-github[] -:tip-caption: :bulb: -:note-caption: :information_source: -:important-caption: :heavy_exclamation_mark: -:caution-caption: :fire: -:warning-caption: :warning: -endif::[] - -toc::[] - -== Automation For the Cloudera Data Platform - -Cloudera Deploy is a toolset for deploying the Cloudera Data Platform (CDP). Its scope includes -** Public Cloud** and **Private Cloud** products, **Private Cloud Base** clusters, and application setup, execution, and other post-deployment functions. - -You can use Cloudera Deploy as your entrypoint for getting started with CDP. The toolset uses straightforward configuration definitions to instruct the automation functions, yet is extensible and highly configurable. The toolset can be a great foundation for custom entrypoints, CI/CD pipelines, and development environments. - -== Quickstart - -=== Prerequisites - -:sectnums: - -==== Install Docker - -Cloudera-Deploy bundles nearly all the software dependencies you need into a convenient Docker Container, so first you will need to get the latest version of **Docker Engine**. 
- -* https://docs.docker.com/docker-for-windows/install/[For Windows] -* https://docs.docker.com/docker-for-mac/install/[For Macs] -* Linux users, use your package manager. - -WARNING: Be sure you uninstall any earlier versions of Docker, i.e. `docker`, and install the latest version, i.e. `docker-ce`. See https://docs.docker.com/engine/install/[Install Docker Engine] for futher details. - -TIP: If you have not used Docker before, consider following their quick https://docs.docker.com/get-started/#start-the-tutorial[Tutorial] to validate it is working and familiarise yourself with the interface - -==== (Optional) Install Git - -NOTE: Git is required if you intend to clone the software for local editing, if you just intend to Run the automation tools you may skip this step. - -There are excellent instructions for installing Git on all Operating Systems on the https://git-scm.com/book/en/v2/Getting-Started-Installing-Git[Git website] - -==== (Optional) Install AWS CLI - -If you are going to be working with AWS, you will want the latest version of the **AWS CLI**. - -* https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2-windows.html[For Windows] -* https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2-mac.html[For Macs] -* Linux users, use your package manager. - -NOTE: The Quickstart image prepackages the AWS CLI, so it is optional to also install it locally - -If this is the first time you are installing the AWS CLI, configure the program by providing your credentials, and test that your credentials work -[source, bash] ----- -aws configure -aws iam get-user ----- - -Visit the https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html[AWS CLI User Guide] for further details regarding credential management. - -==== (Optional) Install CDP CLI - -Get the latest version of the **CDP CLI**. - -** https://docs.cloudera.com/cdp/latest/cli/topics/mc-installing-cdp-client.html[Install CDP CLI]. 
- -NOTE: The Quickstart image prepackages the CPI CLI, so it is optional to also install it locally - -If this is the first time you are installing the CDP CLI, you will need to configure the program by providing the right credentials, and should then test that your credentials work. - -[source, bash] ----- -cdp configure -cdp iam get-user ----- - -Visit the https://docs.cloudera.com/cdp/latest/cli/topics/mc-configuring-cdp-client-with-the-api-access-key.html[CDP CLI User Guide] for further details regarding credential management. - -==== (Recommended) Confirm your SSH Keypair - -Ensure that you have a generated SSH keypair for your local profile. Visit the https://www.ssh.com/academy/ssh/keygen[SSH Keygen How-To] for details. - -NOTE: The Quickstart will generate an SSH keypair if none is provided. - -==== (Recommended) Confirm your SSH Agent - -Ensure that you have a properly configured SSH Agent. Visit the https://www.ssh.com/academy/ssh/keygen#adding-the-key-to-ssh-agent[SSH Agent How-To] for details. - -=== Setup - -==== Option 1: Download the Quickstart script - -The `quickstart.sh` script will set up the Docker container with the software dependencies you need for deployment. - -[source, bash] ----- -curl https://raw.githubusercontent.com/cloudera-labs/cloudera-deploy/main/quickstart.sh -o quickstart.sh ----- - -==== Option 2: Clone the repository - -Clone this, i.e. the `cloudera-deploy`, repository, which contains the `quickstart.sh` script. - -[source, bash] ----- -git clone https://github.com/cloudera-labs/cloudera-deploy.git -cd cloudera-deploy ----- - -WARNING: You are advised not to modify any of the files in the project as a user of the software. The vast majority of changes are managed through configurations provided to these project files. 
- -==== Confirm your Docker service - -Check that **Docker** is running by running the command to list running Docker containers - -[source,bash] -docker ps -a - -If it is not running, please check your prerequisites process for Docker to install, start, and test the service. - -==== Execute the Quickstart script - -Run the `quickstart.sh` entrypoint script. This script will prepare and execute the Ansible Runner container. - -[source, bash] ----- -chmod +x quickstart.sh -./quickstart.sh ----- - -==== Confirm the Quickstart environment - -Confirm that you have the orange `cldr (build)-(version) #>` prompt. + -This is your interactive Ansible Runner environment and provides builtin access to the relevant dependencies for CDP. - -IMPORTANT: Do _NOT_ run the example definition until you have made the changes below. - -==== Setup your user profile - -Modify your local `cloudera-deploy` user profile. Your profile is present in your `$HOME` directory under `~/.config/cloudera-deploy/profiles/default`. - -[source, bash] ----- -vim ~/.config/cloudera-deploy/profiles/default ----- - -===== Properties to change - -* Recommended -** *admin_password:* Note the password requirements (see the link:profile.yml[profile template] comments). -** *name_prefix:* Note the namespace requirements (see the link:profile.yml[profile template] comments). -** *infra_type:* The valid values are `aws`, `gcp`, `azure`. -** *infra_region:* Region is dependent on the value provided in `infra_type`. -* Optional -** *tags* (see the link:profile.yml[profile template] comments) - -WARNING: Please ensure you provide a valid region for your selected Cloud provider for the `infra_type` property. 
- -=== Execution - -==== Check your Credentials - -Before running a Deployment, it is good practice to check that the credentials available to the Automation software are functioning correctly and _match the expected accounts_ - generally it is good practice to compare the user and account IDs produced in the terminal match those found in the Browser UI. - -===== CDP - -If you are deploying CDP Public, check your credential is available in your profile - -[source, bash] ----- -cdp iam get-user ----- - -TIP: If you do not yet have a CDP Public credential, follow the Cloudera Documentation https://docs.cloudera.com/cdp/latest/cli/topics/mc-cli-generating-an-api-access-key.html[here] - -===== AWS - -If you are using AWS cloud infrastructure, check your credential is available in your profile - -[source, bash] ----- -aws iam get-user ----- - -===== Azure - -If you are using Azure cloud infrastructure, check you are logged into your account and your credentials are available - -[source, bash] ----- -az account list ----- - -TIP: If you cannot list your Azure accounts, consider using `az login` to refresh your credential - -===== GCP - -If you are using GCP cloud infrastructure, check your service account credential is being picked up. - -WARNING: You need a provisioning Service Account for GCP setup in your `cloudera-deploy` user profile 'gcloud_credential_file' entry. If you do not yet have a Provisioning Service Account you can follow this process in the https://docs.cloudera.com/cdp/latest/gcp-quickstart/topics/mc-gcp-quickstart-step1.html[CDP Documentation] to generate one. - -[source, bash] ----- -gcloud auth list ----- - -==== Run the main playbook - -Run the main playbook with the defaults and your configuration at the orange _cldr_ prompt. - -NOTE: This will create a ' CDP sandbox', which is both a CDP Public Environment and CDP Private Base cluster using your default Cloud Infrastructure Provider credentials. 
Many other deployments are possible and explained elsewhere. - -[source, bash] ----- -ansible-playbook /opt/cloudera-deploy/main.yml -e "definition_path=examples/sandbox" \ - -t run,default_cluster -vvv ----- - -==== View the Ansible execution logs - -The logs are present at `$HOME/.config/cloudera-deploy/log/latest-` - -[source,bash] ----- -tail -100f $HOME/.config/cloudera-deploy/log/latest-2021-05-08_150448 ----- - -IMPORTANT: The total time to deploy varies from 90 to 150 minutes, depending on CDN, network connectivity, etc. Keep checking the logs; if there are no errors, the scripts are working in the background. - -=== Upgrade - -Cloudera-Deploy is regularly updated by the maintainers with new features and fixes. + -The `quickstart.sh` script will check for an updated Container image to use if there is currently no Container running. + -You may use the following process to trigger this behavior. - -WARNING: This will close any active `cldr` sessions you may have running. - -Stop the cloudera-deploy Docker Container -[source, bash] ----- -docker stop cloudera-deploy ----- - -WARNING: If you have made local uncommitted changes to cloudera-deploy, you must resolve them before updating - -In the cloudera-deploy directory, pull the latest changes with git - -[source, bash] ----- -git fetch --all -git pull ----- - -Finally, rerun the quickstart to download the latest image. - -TIP: You can stop the Docker Container and rerun the quickstart at any time to download the latest image - -[source, bash] ----- -./quickstart.sh ----- - -== Project Details - -CAUTION: Don't change the project configuration without getting comfortable with the *quickstart* a few times. - -NOTE: Below pages will be migrated to Github pages shortly. - -Cloudera Deploy is powered by https://github.com/ansible/ansible[Ansible] and provides a standard configuration and execution model for CDP deployments and their applications. It can be run within a container, or directly on a host. 
- -Specifically, Cloudera Deploy is an Ansible project that uses a set of playbooks, roles, and tags to construct a runlevel-like management experience for cloud and cluster deployments. It leverages several collections, both Cloudera and third-party. - -=== Software Dependencies - -Cloudera Deploy requires a number of host applications, services, and Python libraries for its execution. These dependencies are already packaged for ease-of-use in https://github.com/cloudera-labs/cldr-runner[Cloudera Labs Ansible-Runner], another project within Cloudera Labs, and are made readily accessible through the `quickstart.sh` script. - -Alternatively, and especially if you plan on running Cloudera Deploy in your own environment, you may install the dependencies yourself. - -==== Collections and Roles - -Cloudera Deploy relies directly on a number of Ansible collections: - -- https://github.com/cloudera-labs/cloudera.exe[`cloudera.exe`] -- https://github.com/cloudera-labs/cloudera.cluster[`cloudera.cluster`] -- https://github.com/cloudera-labs/cloudera.cloud[`cloudera.cloud`] - -And roles: - -- `geerlingguy.postgresql` -- `ansible-role-mysql` - -These collection dependencies can be found in the https://github.com/cloudera-labs/cldr-runner/tree/main/payload/deps/ansible.yml[`ansible.yml`] file in the `cldr-runner` project. - -Cloudera Deploy does have a single dependency for its own execution, the https://github.com/ansible-collections/community.crypto[`community.crypto`] collection. 
To install all of these dependencies, you can run the following: - -[source, bash] ----- -# Get the cldr-runner dependency file first -curl https://raw.githubusercontent.com/cloudera-labs/cldr-runner/main/payload/deps/ansible.yml \ - --output requirements.yml - -# Install the collections (and their dependencies) -ansible-galaxy collection install -r requirements.yml - -# Install the roles -ansible-galaxy role install -r requirements.yml - -# Install the crypto collection -ansible-galaxy collection install community.crypto ----- - -==== Python and Clients - -The supporting Python libraries and other clients can be installed using the various https://github.com/cloudera-labs/cldr-runner/tree/main/payload/deps[dependencies] files in the `cldr-runner` project directly. You might find it easier to follow the installation instructions for https://github.com/cloudera-labs/cloudera.exe[`cloudera.exe`] and https://github.com/cloudera-labs/cloudera.cluster[`cloudera.cluster`], the two collections that drive this set of dependencies. - -For the https://github.com/ansible-collections/community.crypto[`community.crypto`] collection dependency, you will need to ensure that the `ssh-keygen` executable is on your Ansible controller. - -The dependencies cover the full range of the automation tooling, from infrastructure on public or private cloud to the relevant Cloudera platform assets. If you are only working with a limited part of the tooling, then you may not need the full list of dependencies. e.g., if you are only working with AWS infrastructure, it is safe to only install those dependencies or use the tagged https://github.com/orgs/cloudera-labs/packages/container/package/cldr-runner[`cldr-runner`] version. - -=== User Input Dependencies - -Cloudera Deploy does require a small set of user-supplied information for a successful deployment. A minimum set of user inputs is defined in a _profile_ file (see the link:profile.yml[profile.yml] template for details). 
For example, the `profile.yml` should define your password for the Administrator account of the deployed services, and you should set a unique `name_prefix` to avoid clashing with other deployments. - -The default location for profiles is `~/.config/cloudera-deploy/profiles/`. Cloudera Deploy looks for the `default` file in this directory unless the Ansible runtime variable `profile` is set, e.g. `-e profile=my_custom_profile`. Creating additional profiles is simple, and you can use the `profile.yml` template as your starting point. - -==== CDP Public Cloud - -For CDP Public Cloud, you will need an _Access Key_ and _Secret_ set in your user profile. The tooling uses your default profile unless you instruct it otherwise. (See https://docs.cloudera.com/cdp/latest/cli/topics/mc-configuring-cdp-client-with-the-api-access-key.html[Configuring CDP client with the API access key].) - -==== Cloud Providers - -For Azure and AWS infrastructure, the process is similar, and these parameters may likewise be overridden. - -For Google Cloud, we suggest you issue a credentials file, store it securely in your profile, and then provide the path to that file in `profile.yml`, as this works best with both CLI and Ansible Gcloud interactions. - -We suggest you set your default `infra_type` in `profile.yml` to match your preferred default Public Cloud Infrastructure credentials. - -==== CDP Private Cloud - -For CDP Private Cloud you will need a valid Cloudera license file in order to download the software from the Cloudera repositories. We suggest this is stored in your user profile in `~/.cdp/` and set in the `profile.yml` config file. - -If you are also using Public Cloud infrastructure to host your CDP Private Cloud clusters, then you will need those credentials as well. 
- -=== Support Matrix -✓ - Supported - -O - Support in CDP, but not in Cloudera-Deploy - -X - Not Supported in CDP - -[width="80%",cols="4,1,1,1"options="header"] -|======================================================== -|Experience |AWS |Azure |GCP -|Environment (Light Duty) |✓ | ✓ | ✓ -|Environment (Medium Duty) |O | O |O -|Data Lake (Light Duty) |✓ | ✓ | ✓ -|Data Lake (Medium Duty) |O |O |O -|Data Hub|✓ |✓ |✓ -|Data Warehouse|✓ |O |X -|Data Engineering|O |O |X -|Data Flow|✓ |X |X -|Machine Learning|✓ |✓ |X -|Operational Database|✓ |✓ |X -|======================================================== - - -== SSH Host Key Checking - -For CDP Private Cloud clusters and other direct inventory scenarios, you will need to manage SSH host key validation appropriate to your specific environment. - -IMPORTANT: By default, the `quickstart.sh` script explicitly sets the `ANSIBLE_HOST_KEY_CHECKING` variable to `False` for ease-of-use with an introductory deployment. However, this setting is *not recommended* for any other deployment type. **For all other deployment types, you should directly manage your SSH host key checking.** - -A common approach is to create your own "startup" script using the `quickstart.sh` as a template, and setting the appropriate https://docs.ansible.com/ansible/latest/reference_appendices/config.html[Ansible SSH configuration variables]. - -In some scenarios, for example, a reused pool of dynamic hosts within a development Openstack environment, you might wish to manage this control from your host machine's SSH config file. For example: - -[source] ----- -# ~/.ssh/config - -# Disable host key checking only for your specific environment -Host *.your.development.domain - StrictHostKeyChecking no ----- - -These settings will flow from your host to the Docker container's environment if you use the `quickstart.sh` script. 
- -== Execution - -Cloudera Deploy utilizes a single entrypoint playbook -- `main.yml` -- that examines the user-provided <> details, a deployment <>, and any optional Ansible `tags` and then runs the appropriate actions. At minimum, you execute a deployment like so: - -[source,bash] ----- -ansible-playbook /main.yml \ - -e "definition_path=" ----- - -NOTE: The location defined by `definition_path` is relative _to the location of the `main.yml` playbook_, and can also be an absolute location. - -=== Tags - -Cloudera Deploy exposes a set of Ansible tags that allows fine-grained inclusion and exclusion of functions, in particular, a runlevel-like management process. - -.Partial List of Available Execution Tags -[cols="1,1"] -|=== -|`infra` -|Infrastructure (cloud provider assets) - -|`plat` -|Platform (CDP Public Cloud Datalakes). Assumes `infra`. - -|`run` -|Runtime (CDP Public Cloud experiences, e.g. Cloudera Machine Learning (CML)). Assumes `infra` and `plat`. - -|`full_cluster` -|CDP Private Cloud Base Clusters. -|=== - -Current Tags: _verify_inventory, verify, full_cluster, default_cluster, verify_definition, custom_repo, verify_parcels, database, security, kerberos, tls, ha, os, users, jdk, mysql_connector, oracle_connector, fetch_ca, cm, license, autotls, prereqs, restart_agents, heartbeat, mgmt, preload_parcels, kts, kms, restart_stale, teardown_ca, teardown_all, teardown_tls, teardown_cluster, infra, init, plat, run, validate_ - -With these tags, you can set your deployment to a given "runlevel" state: - -[source,bash] ----- -# Ensure only the infrastructure layer is available -ansible-playbook main.yml -e "definition_path=my_example" -t infra ----- - -or select or skip a level or function: - -[source,bash] ----- -# Ensure the platform and runtimes are available, but skip any infrastructure -ansible-playbook main.yml -e "definition_path=my_example" -t run --skip-tags infra ----- - -WARNING: Setting a deployment to a lower runlevel, e.g. 
from `run` to `infra` will teardown deployed components in the higher runlevels. - -For further details on the various _runlevel_-like tags for CDP Public Cloud, see the https://github.com/cloudera-labs/cloudera.exe/blob/main/docs/runlevels.md[Runlevel Guide] in the `cloudera.exe` project. - -=== Terraform Deployment Engine - -Terraform can optionally be used to create the cloud infrastructure. This will attempt to create the cloud provider assets at the `infra` (network, storage and compute) and `plat` (IAM policies and roles) runlevels using Terraform resources. A list of Terraform related parameters are shown in the table below. - -.List of parameters used by Terraform deployment engine -[cols="1,1,1,1"] -|=== -|Parameter|Description|Default Value|Notes - -|`infra_deployment_engine` -|The engine (ansible or terraform) that will be used to create the infrastructure resources. -| `ansible` -| Needs to be set to `terraform` for Terraform-deployment. - -4+| The parameters below are specified as keys in the `terraform` dictionary -|`terraform.**base_dir**` -| Top-level directory where all Terraform assets will be placed. Includes processed Jinja template files for Terraform, timestamped artefact of Terraform files and the workspace directory where terraform apply/destroy is run. -| `~/.config/cloudera-deploy/terraform` -| - -|`terraform.**state_storage**` -|The type of backend storage to use for the Terraform state. -| `local` -| Current options are `local` or `remote_s3` - -|`terraform.**auto_remote_state**` -| Flag to allow Cloudera Deploy automatically provision remote state resources as part of its initialization. This will also teardown these resources during cleanup. -| `False` -| - -|`terraform.**remote_state_bucket**` -|The name of the Terraform state storage bucket. -| -| Required if using `remote_s3` state storage. Value is derived from `name_prefix` if terraform_auto_remote_state is True. 
- -|`terraform.**remote_state_lock_table**` -|The name of the table to track locks of remote Terraform state. -| -| Required if using `remote_s3` state storage. Value is derived from `name_prefix` if terraform_auto_remote_state is True. -|=== - -== Definitions - -Cloudera Deploy uses a set of configuration files within a directory to define and coordinate a deployment. This directory also stores any artifacts created during the deployment, such as Ansible inventory files, CDP environment readouts, etc. - -The `main.yml` entrypoint playbook expects the runtime variable `definition_path` which should point at the absolute or relative (to the playbook) directory hosting these configuration files. - -Within the directory, you *must* supply the following files: - -* `definition.yml` -* `application.yml` - -Optionally, if deploying a CDP Private Cloud cluster or need to set up adhoc IaaS infrastructure, you can supply the following : - -* `inventory_static.ini` -* `inventory_template.ini` - -The definition directory can host any other file or asset, such as data files, additional configuration details, additional playbooks. However, Cloudera Deploy will not operate unless the `definition.yml` and `application.yml` files are present. - -=== `definition.yml` - -The required `definition.yml` file contains top-level configuration keys that define and direct the deployment. - -.Top-Level Configuration Keys -[cols="1,1"] -|=== - -|`infra` -|Hosting infrastructure to manage - -|`env` -|CDP Public Cloud Environment deployment (on the infrastructure) - -|`clusters` -.3+|CDP Private Cloud Cluster deployment (on the Infrastructure) -|`mgmt` -|`hosts` -|=== - -Within the top-level keys, you may override the defaults appropriate to that section. - -You may also add other top-level configuration keys if your automation requires it, e.g. if your `application.yml` playbook needs its own configuration details. 
- -More detailed documentation of all the options is beyond the scope of this introductory readme; further documentation is forthcoming. - -=== `application.yml` - -The required `application.yml` file is not a configuration file, it is actually an Ansible playbook. At minimum, this playbook requires a single Ansible play; a basic _no-op_ task works well if you wish to take no additional actions beyond the core deployment. - -For more sophisticated post-deployment actitivies, you can expand this playbook as much as needed. For example, the playbook can interact with hosts and inventory, execute computing jobs on deployment environments, and include additional playbooks and configuration files. - -NOTE: This file is a standard Ansible playbook, and when it is executed (via `import_playbook`) by the `main.yml` entrypoint, the working directory of the Ansible executable is changed to the directory of the `application.yml` playbook. - -=== `inventory_static.ini` - -You may also include an `inventory_static.ini` file that describes your static Ansible inventory. This file will be automatically loaded and added to the Ansible inventory. Note that you can also use the standard Ansible `-i` switch to include other static inventory. - -=== `inventory_template.ini` - -If included, Cloudera Deploy will use a definition's `inventory_template.ini` file, which describes a set of dynamic host inventory, and provision these hosts as infrastructure for the deployment, typically for a CDP Private Cloud cluster. - -NOTE: This currently only works on AWS. - -== Getting Involved - -Contribution instructions are coming soon! - -== License and Copyright - -Copyright 2021, Cloudera, Inc. - -[source,text] ----- -Licensed under the Apache License, Version 2.0 (the "License"); -you may not use this file except in compliance with the License. 
-You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - -Unless required by applicable law or agreed to in writing, software -distributed under the License is distributed on an "AS IS" BASIS, -WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -See the License for the specific language governing permissions and -limitations under the License. ----- From c9853009a4ddec748aa1cd44207e8a0889617d3d Mon Sep 17 00:00:00 2001 From: Webster Mudge Date: Wed, 27 Sep 2023 15:14:30 -0400 Subject: [PATCH 09/10] Update README to reflect v2 changes Signed-off-by: Webster Mudge --- README.md | 241 ++++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 236 insertions(+), 5 deletions(-) diff --git a/README.md b/README.md index 5d6f60c..c2308d3 100644 --- a/README.md +++ b/README.md @@ -1,11 +1,242 @@ -# Cloudera Deploy +# cloudera-deploy - Automation Quickstarts and Examples for the Cloudera Data Platform (CDP) -## Setting up `ansible-navigator` +`cloudera-deploy` is a rich set of examples and quickstart projects for deploying and managing the Cloudera Data Platform (CDP). Its scope includes [**Cloudera Data Platform (CDP) Public Cloud, Private Cloud, and Data Services**](https://www.cloudera.com/products/cloudera-data-platform.html) and the software lifecycle of these platforms and the applications that work upon and with them. -`cloudera-deploy` uses `ansible-navigator` to manage and execute the deployment definitions. Setting up `ansible-navigator` is straightforward; create and activate a new `virtualenv` and install the latest `ansible-core` and `ansible-navigator`. +You can use the definitions and projects in `cloudera-deploy` as your entrypoint for getting started with CDP. These resources use straightforward configurations and playbooks to instruct the automation functions, yet each is extensible and highly configurable. 
-You can name your virtual environment anything you want; by convention, we call it `cdp-navigator`. +`cloudera-deploy` is designed to not only get you up and running quickly with CDP, but also to showcase the underlying toolsets and libraries. These projects demonstrate what you can build and lay out a great foundation for your own entrypoints, CI/CD pipelines, integrations, and general platform and application operations. + +# Quickstart + +The definitions and projects in `cloudera-deploy` are designed to run with `ansible-navigator` and other _Execution Environment_-based tools. + +Follow these steps to get started: + +1. [Install `ansible-navigator`](#installation-and-usage) +1. [Check your requirements](#requirements) +1. [Select and configure your project](#catalog) +1. [Set your credentials](#credentials) +1. [Run your project](#execution) + +If you need help, check out the [Frequently Asked Questions](FAQ.md), the [FAQ for cldr-runner](https://github.com/cloudera-labs/cldr-runner/blob/main/FAQ.md), and drop by the [Discussions > Help](https://github.com/cloudera-labs/cloudera-deploy/discussions/categories/help) board. + +# Catalog + +The catalog of projects, examples, and definitions currently covers CDP Public Cloud for AWS. CDP Private Cloud and individual Data Services, Public and Private, as well as Public Cloud deployments to Azure and Google Cloud, are coming soon. + +| Project | Platform | CSP | Description | +|---------|----------|-----|-------------| +| [`datalake`](public-cloud/aws/datalake/README.md) | public cloud | AWS | **Constructs a CDP Public Cloud Environment and Datalake.** Generates via Ansible the AWS infrastructure and CDP artifacts, including SSH key, cross-account credentials, S3 buckets, etc.
| +| [`datalake-tf`](public-cloud/aws/datalake-tf/README.md) | public cloud | AWS | **Constructs a CDP Public Cloud Environment and Datalake.** Uses the [terraform-cdp-modules](https://github.com/cloudera-labs/terraform-cdp-modules), called via Ansible, to generate the AWS infrastructure pre-requisite resources and the CDP artifacts. | +| [`cde`](public-cloud/aws/cde/README.md) | public cloud | AWS | **Constructs a set of Cloudera Data Engineering (CDE) workspaces within their own CDP Public Cloud Environment and Datalake.** Generates via Ansible the AWS infrastructure and CDP artifacts, including SSH key, cross-account credentials, S3 buckets, etc. | +| [`cdf`](public-cloud/aws/cdf/README.md) | public cloud | AWS | **Constructs a set of Cloudera Data Flow (CDF) workspaces and data hubs within their own CDP Public Cloud Environment and Datalake.** Generates via Ansible the AWS infrastructure and CDP artifacts, including SSH key, cross-account credentials, S3 buckets, etc. | +| [`cml`](public-cloud/aws/cml/README.md) | public cloud | AWS | **Constructs a set of Cloudera Machine Learning (CML) workspaces within their own CDP Public Cloud Environment and Datalake.** Generates via Ansible the AWS infrastructure and CDP artifacts, including SSH key, cross-account credentials, S3 buckets, etc. | + +# Roadmap + +If you want to see what we are working on or have pending, check out: + +* the [Milestones](https://github.com/cloudera-labs/cloudera-deploy/milestones) and [active issues](https://github.com/cloudera-labs/cloudera-deploy/issues?q=is%3Aissue+is%3Aopen+milestone%3A*) to see our current activity, +* the [issue backlog](https://github.com/cloudera-labs/cloudera-deploy/issues?q=is%3Aopen+is%3Aissue+no%3Amilestone) to see what work is pending or under consideration, and +* the [Ideas](https://github.com/cloudera-labs/cloudera-deploy/discussions/categories/ideas) discussion to see what we are considering. + +Are we missing something? 
Let us know by [creating a new issue](https://github.com/cloudera-labs/cloudera-deploy/issues/new) or [posting a new idea](https://github.com/cloudera-labs/cloudera-deploy/discussions/new?category=ideas)! + +# Contributions + +For more information on how to get involved with the `cloudera-deploy` project, head over to [CONTRIBUTING.md](CONTRIBUTING.md). + +# Requirements + +`cloudera-deploy` itself is not an application, but its projects and examples expect to run within an _execution environment_ called `cldr-runner`. This _execution environment_ typically is a container that encapsulates the runtimes, libraries, Python and system dependencies, and general configurations needed to run an Ansible- and Terraform-enabled project. + +> [!NOTE] +> It is worth pointing out that you don't _have_ to use a container, but setting up a local execution environment is out-of-scope of `cloudera-deploy`; the projects in `cloudera-deploy` will run in any _execution environment_ like [AWX](https://github.com/ansible/awx)/[Red Hat Ansible Automation Platform (AAP)](https://www.redhat.com/en/technologies/management/ansible). If you want to learn more about setting up a local execution environment, head over to [cloudera-labs/cldr-runner](https://github.com/cloudera-labs/cldr-runner).
+ +The `cloudera-deploy` projects and their playbooks are built with the automation resources provided by `cldr-runner`, notably, but not exclusively: + +* [`cloudera.cloud`](https://github.com/cloudera-labs/cloudera.cloud) - Cloudera Data Platform (CDP) for Public Cloud +* [`cloudera.cluster`](https://github.com/cloudera-labs/cloudera.cluster) - Cloudera Data Platform (CDP) for Private Cloud and Cloudera Manager (CM) +* [`cloudera.exe`](https://github.com/cloudera-labs/cloudera.exe) - Runlevel Management and Utilities for Cloudera Data Platform (CDP) +* [`cdp-tf-quickstarts`](https://github.com/cloudera-labs/cdp-tf-quickstarts) - CDP quickstarts using the Terraform Module for CDP Prerequisites +* [`terraform-cdp-modules`](https://github.com/cloudera-labs/terraform-cdp-modules) - Terraform Modules for CDP Prerequisites + +Besides these resources within `cldr-runner`, generally `cloudera-deploy` projects will need one or more of the following **credentials**: + +## CDP Public Cloud + +For CDP Public Cloud, you will need an _Access Key_ and _Secret_ set in your user profile. The underlying automation libraries use your `default` profile unless you instruct them otherwise. See [Configuring CDP client with the API access key](https://docs.cloudera.com/cdp-public-cloud/cloud/cli/topics/mc-cli-generating-an-api-access-key.html) for further details. + +## Cloud Providers + +For Azure and AWS infrastructure, the process is similar, and these parameters may likewise be overridden. + +For Google Cloud, we suggest you issue a _credentials file_, store it securely in your profile, and then reference that file as needed by a project's configuration, as this works best with both CLI and Ansible Gcloud interactions. + +## CDP Private Cloud + +For CDP Private Cloud you will need a valid Cloudera license file in order to download the software from the Cloudera repositories. 
We suggest you store this file in your user profile in `~/.cdp/` and reference that file as needed by a project's configuration. + +If you are also using Public Cloud infrastructure to host your CDP Private Cloud clusters, then you will need those credentials as well. + +# Installation and Usage + +To use the projects in `cloudera-deploy`, you need to first set up `ansible-navigator`. + +> [!IMPORTANT] +> Please note each OS has slightly different requirements for installing `ansible-navigator`. :woozy_face: Read more about [installing `ansible-navigator`](https://ansible.readthedocs.io/projects/navigator/installation/#install-ansible-navigator). + +1. Create and activate a new Python `virtualenv`. + + You can name your virtual environment anything you want; by convention, we like to call it `cdp-navigator`. + + ```bash + python -m venv ~/cdp-navigator; source ~/cdp-navigator/bin/activate; + ``` + + This step is _highly recommended_ yet optional. + +2. Install the latest `ansible-core` and `ansible-navigator`. + + These tools can be the latest versions, as the actual execution versions are encapsulated in the _execution environment_ container. + + ```bash + pip install ansible-core ansible-navigator + ``` + +> [!NOTE] Further details can be found in the [NAVIGATOR document](https://github.com/cloudera-labs/cldr-runner/blob/main/NAVIGATOR.md) in `cloudera-labs/cldr-runner`. + +Then, clone this project. ```bash -python -m venv ~/cdp-navigator; source ~/cdp-navigator/bin/activate; pip install ansible-core ansible-navigator +git clone https://github.com/cloudera-labs/cloudera-deploy.git; cd cloudera-deploy; +``` + +## Execution Engine + +`ansible-navigator` can use either `docker` or `podman`. Either way, you will need a container runtime on your host. + +### Confirm your Docker service + +Check that `docker` is available by running the following command to list any active Docker containers. 
+ +```bash +docker ps -a +``` + +If it is not running, please check your prerequisites process for Docker to install, start, and test the service. + +## Credentials + +To check that your various credentials are available and valid -- that they _match the expected accounts_ -- you can use `ansible-navigator` compare the user and account IDs produced via CLI with those found in the browser UI of the associated service. + +> [!IMPORTANT] +> All of the instructions below assume that your project is using the correct CSP-flavored image of `cldr-runner`. If in doubt, you can use the `full` image which has all supported CSP resources. + +### CDP Public Cloud + +``` +ansible-navigator exec -- cdp iam get-user +``` + +> [!IMPORTANT] +> If you do not yet have a CDP Public Cloud credential, follow [these instructions](https://docs.cloudera.com/cdp-public-cloud/cloud/cli/topics/mc-cli-generating-an-api-access-key.html) on the Cloudera website. + +See [CDP CLI](https://docs.cloudera.com/cdp-public-cloud/cloud/cli/topics/mc-cdp-cli.html) for further details. + +### AWS + +```bash +ansible-navigator exec -- aws iam get-user +``` + +See [AWS account requirements](https://docs.cloudera.com/cdp-public-cloud/cloud/requirements-aws/topics/mc-requirements-aws.html) for further details. + +### Azure + +```bash +ansible-navigator exec -- az account list +``` + +> [!NOTE] +> If you cannot list your Azure accounts, consider using `az login` to refresh your local, i.e. host, credential. + +See [Azure subscription requirements](https://docs.cloudera.com/cdp-public-cloud/cloud/requirements-azure/topics/mc-azure-requirements.html) for further details. + +### GCP + +```bash +ansible-navigator exec -- gcloud auth list +``` + +> [!IMPORTANT] +> You need a provisioning Service Account for GCP setup (typically referenced by the `gcloud_credential_file` entry). 
If you do not yet have a Provisioning Service Account you can [learn more](https://docs.cloudera.com/cdp-public-cloud/cloud/requirements-gcp/topics/mc-gcp-permissions.html) on the Cloudera website. + +See [GCP requirements](https://docs.cloudera.com/cdp-public-cloud/cloud/requirements-gcp/topics/mc-requirements-gcp.html) for further details. + +## Execution + +All of the definitions and projects in `cloudera-deploy` are designed to work with `ansible-navigator`. Each project has discrete instructions on what and how to run, but in general, you will end up executing some form of the `ansible-navigator run` subcommand, like: + +```bash +ansible-navigator run main.yml -e @config.yml -t plat +``` + +Occasionally, the instructions may ask you to run an individual module, such as `ansible-navigator exec -- ansible some_group -m ping`. You can learn more about the [available subcommands](https://ansible.readthedocs.io/projects/navigator/subcommands/) on the `ansible-navigator` website. + +> [!NOTE] +> If you want to check out what's in the container, or use the container directly, run `ansible-navigator exec -- /bin/bash`! + +### Logs + +The projects are configured to log their activities. In each, you will find a `runs/` directory that houses all of the runtime artifacts of `ansible-navigator` and `ansible-runner` (the Ansible application and interface that does the actual Ansible command dispatching). + +The log files are structured (JSON) and are by playbook and timestamp. If you want to review -- or _replay_ in `ansible-navigator`-speak -- you can load them into `ansible-navigator`: + +```bash +ansible-navigator replay .json +``` + +### Upgrades + +The `cldr-runner` image updates fairly often to include the latest libraries, new features and fixes. Depending on how `ansible-navigator` is configured, the application will check for an updated container image if it is missing. 
+ +You can easily change this behavior; change your `ansible-navigator.yml` configuration in your project to: + +```yaml +ansible-navigator: + execution-environment: + pull: + policy: always +``` + +Or use the CLI flags `--pp` or `--pull-policy` and set the value to `always`. + +You can read more about [updating this configuration](https://ansible.readthedocs.io/projects/navigator/settings/#pull-policy) on the `ansible-navigator` website. + +# Troubleshooting + +If you need help, here are some resources: + +* [Frequently Asked Questions for `cloudera-deploy`](FAQ.md) +* [Frequently Asked Questions for `cldr-runner` and `ansible-navigator`](https://github.com/cloudera-labs/cldr-runner/blob/main/FAQ.md) + +Be sure to stop by the [Discussions > Help](https://github.com/cloudera-labs/cloudera-deploy/discussions/categories/help) board! + +# License and Copyright + +Copyright 2023, Cloudera, Inc. + +``` +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. ``` From 9dc0dc1a14a0d93b22e36f6f15b042cfc9b751ca Mon Sep 17 00:00:00 2001 From: Webster Mudge Date: Wed, 27 Sep 2023 15:34:23 -0400 Subject: [PATCH 10/10] Tweaks and formats (and grammar!) 
to the README Signed-off-by: Webster Mudge --- README.md | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/README.md b/README.md index c2308d3..33ccf9c 100644 --- a/README.md +++ b/README.md @@ -51,7 +51,7 @@ For more information on how to get involved with the `cloudera-deploy` project, `cloudera-deploy` itself is not an application, but its projects and examples expect to run within an _execution environment_ called `cldr-runner`. This _execution environment_ typically is a container that encapsulates the runtimes, libraries, Python and system dependencies, and general configurations needed to run an Ansible- and Terraform-enabled project. > [!NOTE] -> It is worth pointing out that you don't _have_ to use a container, but setting up a local execution environment is out-of-scope of `cloudera-deploy`; the projects in `cloudera-deploy` will run in any _execution environment_ like [AWX](https://github.com/ansible/awx)/[Red Hat Ansible Automation Platform (AAP)](https://www.redhat.com/en/technologies/management/ansible). If you want to learn more about setting up a local execution environment, head over to [cloudera-labs/cldr-runner](https://github.com/cloudera-labs/cldr-runner). +> It is worth pointing out that you don't _have_ to use a container, but setting up a local execution environment is out-of-scope of `cloudera-deploy`; the projects in `cloudera-deploy` will run in any _execution environment_, for example [AWX](https://github.com/ansible/awx)/[Red Hat Ansible Automation Platform (AAP)](https://www.redhat.com/en/technologies/management/ansible). If you want to learn more about setting up a local execution environment, head over to [cloudera-labs/cldr-runner](https://github.com/cloudera-labs/cldr-runner).
The `cloudera-deploy` projects and their playbooks are built with the automation resources provided by `cldr-runner`, notably, but not exclusively: @@ -61,7 +61,7 @@ The `cloudera-deploy` projects and their playbooks are built with the automation * [`cdp-tf-quickstarts`](https://github.com/cloudera-labs/cdp-tf-quickstarts) - CDP quickstarts using the Terraform Module for CDP Prerequisites * [`terraform-cdp-modules`](https://github.com/cloudera-labs/terraform-cdp-modules) - Terraform Modules for CDP Prerequisites -Besides these resources within `cldr-runner`, generally `cloudera-deploy` projects will need one or more of the following **credentials**: +Besides these resources within `cldr-runner`, `cloudera-deploy` projects generally will need one or more of the following **credentials**: ## CDP Public Cloud @@ -69,7 +69,7 @@ For CDP Public Cloud, you will need an _Access Key_ and _Secret_ set in your use ## Cloud Providers -For Azure and AWS infrastructure, the process is similar, and these parameters may likewise be overridden. +For Azure and AWS infrastructure, the process is similar to CDP Public Cloud, and these parameters may likewise be overridden. For Google Cloud, we suggest you issue a _credentials file_, store it securely in your profile, and then reference that file as needed by a project's configuration, as this works best with both CLI and Ansible Gcloud interactions. @@ -104,7 +104,8 @@ To use the projects in `cloudera-deploy`, you need to first set up `ansible-navi pip install ansible-core ansible-navigator ``` -> [!NOTE] Further details can be found in the [NAVIGATOR document](https://github.com/cloudera-labs/cldr-runner/blob/main/NAVIGATOR.md) in `cloudera-labs/cldr-runner`. +> [!NOTE] +> Further details can be found in the [NAVIGATOR document](https://github.com/cloudera-labs/cldr-runner/blob/main/NAVIGATOR.md) in `cloudera-labs/cldr-runner`. Then, clone this project. 
@@ -128,7 +129,7 @@ If it is not running, please check your prerequisites process for Docker to inst ## Credentials -To check that your various credentials are available and valid -- that they _match the expected accounts_ -- you can use `ansible-navigator` compare the user and account IDs produced via CLI with those found in the browser UI of the associated service. +To check that your various credentials are available and valid -- that they _match the expected accounts_ -- you can use `ansible-navigator` within your project and compare the user and account IDs produced with those found in the browser UI of the associated service. > [!IMPORTANT] > All of the instructions below assume that your project is using the correct CSP-flavored image of `cldr-runner`. If in doubt, you can use the `full` image which has all supported CSP resources. @@ -139,7 +140,7 @@ To check that your various credentials are available and valid -- that they _mat ansible-navigator exec -- cdp iam get-user ``` -> [!IMPORTANT] +> [!NOTE] > If you do not yet have a CDP Public Cloud credential, follow [these instructions](https://docs.cloudera.com/cdp-public-cloud/cloud/cli/topics/mc-cli-generating-an-api-access-key.html) on the Cloudera website. See [CDP CLI](https://docs.cloudera.com/cdp-public-cloud/cloud/cli/topics/mc-cdp-cli.html) for further details. @@ -169,7 +170,7 @@ See [Azure subscription requirements](https://docs.cloudera.com/cdp-public-cloud ansible-navigator exec -- gcloud auth list ``` -> [!IMPORTANT] +> [!NOTE] > You need a provisioning Service Account for GCP setup (typically referenced by the `gcloud_credential_file` entry). If you do not yet have a Provisioning Service Account you can [learn more](https://docs.cloudera.com/cdp-public-cloud/cloud/requirements-gcp/topics/mc-gcp-permissions.html) on the Cloudera website. See [GCP requirements](https://docs.cloudera.com/cdp-public-cloud/cloud/requirements-gcp/topics/mc-requirements-gcp.html) for further details. 
@@ -191,7 +192,7 @@ Occasionally, the instructions may ask you to run an individual module, such as The projects are configured to log their activities. In each, you will find a `runs/` directory that houses all of the runtime artifacts of `ansible-navigator` and `ansible-runner` (the Ansible application and interface that does the actual Ansible command dispatching). -The log files are structured (JSON) and are by playbook and timestamp. If you want to review -- or _replay_ in `ansible-navigator`-speak -- you can load them into `ansible-navigator`: +The log files are structured (JSON) and are indexed by playbook and timestamp. If you want to review, or rather _replay_, these logs, you can load them into `ansible-navigator`: ```bash ansible-navigator replay .json @@ -199,7 +200,7 @@ ansible-navigator replay .json ### Upgrades -The `cldr-runner` image updates fairly often to include the latest libraries, new features and fixes. Depending on how `ansible-navigator` is configured, the application will check for an updated container image if it is missing. +The `cldr-runner` image updates fairly often to include the latest libraries, new features and fixes. Depending on how `ansible-navigator` is configured (see the `ansible-navigator.yml` file), the application will check for an updated container image only if it is missing. You can easily change this behavior; change your `ansible-navigator.yml` configuration in your project to: