@@ -7,41 +7,47 @@ sidebar_position: 20

## Steps to Configure Galexie

### 1. Copy the Sample Configuration
### Kubernetes
> **Contributor:** I don't think we need to mention k8s on this page, since the page's purpose is mostly runtime-environment agnostic and more focused on just config data for the network and bucket storage.
>
> **Author:** Cool, what do you think about flipping the structure here: start with the key config values, then have a section at the bottom that just describes copying the config file locally?


The Galexie Helm chart includes a ConfigMap that contains the configuration files used by the Galexie container.

### Running Galexie on an Instance

#### 1. Copy the Sample Configuration

Start with the provided sample file, [`config.example.toml`](https://github.com/stellar/stellar-galexie/blob/main/config/config.example.toml).

### 2. Rename and Update the Configuration
#### 2. Rename and Update the Configuration

Rename the file to `config.toml` and adjust settings as needed.

#### Key Settings Include
### Key Settings Include

##### Cloud Storage Service
#### Cloud Storage Service

Specify the cloud storage service used to export ledger metadata. Currently, only `GCS` and `S3` are supported.

```toml
type = "GCS"
```

##### Cloud Storage Bucket
#### Cloud Storage Bucket

Specify the cloud storage bucket where Galexie will export Stellar ledger data. Update `destination_bucket_path` to the complete path of your bucket, including subpaths if applicable.

```toml
destination_bucket_path = "stellar-network-data/testnet"
```

##### Stellar Network
#### Stellar Network

Set the Stellar network to be used in creating the data lake.

```toml
network = "testnet"
```
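
Taken together, the three settings above might sit in one file roughly as follows. This is a minimal sketch, not the authoritative layout: the table names (`[datastore_config]`, `[datastore_config.params]`, `[stellar_core_config]`) are assumptions based on the sample file, and the output file name `config.sketch.toml` is hypothetical, chosen so an existing `config.toml` is not overwritten. Check `config.example.toml` before relying on it.

```shell
# Write a minimal config with the three key settings shown above.
# Assumption: the table names mirror config.example.toml; verify them
# against the sample file before use.
cat > config.sketch.toml <<'EOF'
[datastore_config]
type = "GCS"

[datastore_config.params]
destination_bucket_path = "stellar-network-data/testnet"

[stellar_core_config]
network = "testnet"
EOF

# Print the key/value lines that were written, one per line.
grep -E '^[a-z_]+ = ' config.sketch.toml
```

Keeping the sketch in a separate file makes it easy to diff against the renamed sample file and copy over only the values you need.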

##### Data Organization (Optional)
#### Data Organization (Optional)

Configure how the exported data is organized in the storage bucket. The example below stores 1 ledger per file and groups the files into directories (partitions) of 64000 files each.

```toml
ledgers_per_file = 1
files_per_partition = 64000
```
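
To make the numbers concrete, the arithmetic below works out which partition a given ledger would fall into under these settings. It is only a sketch: it assumes partitions are aligned to multiples of `ledgers_per_file * files_per_partition` starting at ledger 0, and it says nothing about the object names Galexie actually writes. The `ledger` value is an arbitrary example.

```shell
ledgers_per_file=1
files_per_partition=64000
ledger=100000   # arbitrary example ledger sequence

# Number of ledgers covered by one partition (directory).
ledgers_per_partition=$(( ledgers_per_file * files_per_partition ))

# Integer division gives the zero-based partition index.
partition_index=$(( ledger / ledgers_per_partition ))
partition_start=$(( partition_index * ledgers_per_partition ))
partition_end=$(( partition_start + ledgers_per_partition - 1 ))

echo "ledger $ledger -> partition $partition_index (ledgers $partition_start-$partition_end)"
```

With these values, one partition spans 64000 ledgers, so ledger 100000 falls in the second partition, which covers ledgers 64000 through 127999.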

##### Use a Custom Core Config (Optional)
#### Use a Custom Core Config (Optional)

In the Galexie `config.toml`, you can specify a custom `core.cfg` file that overrides the default core parameters for the Stellar network set in the `network` parameter.

@@ -5,7 +5,18 @@ sidebar_position: 30

# Installing

To install Galexie, retrieve the Docker image from the [Stellar Docker Hub registry](https://hub.docker.com/r/stellar/stellar-galexie) using the following command:
## Helm

To install Galexie into a Kubernetes cluster, follow these steps:

- Run `helm repo add stellar https://helm.stellar.org/charts`
- Run `helm install <RELEASE_NAME> stellar/galexie --set datastore.params.path=<FULL_PATH_TO_GALEXIE_OBJECTS>`

Additional configuration can be set with further `--set` flags or by passing a [Helm values file](https://helm.sh/docs/chart_template_guide/values_files/) to the chart.
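
As a sketch of the values-file route, the snippet below writes a file that is equivalent to the `--set datastore.params.path=...` flag above and passes it to `helm install` with `-f`. The release name, the bucket path, and the file name `galexie-values.yaml` are hypothetical placeholders; only the `datastore.params.path` key comes from the command above.

```shell
# Hypothetical values; replace with your own release name and bucket path.
release_name="galexie"
datastore_path="stellar-network-data/testnet"

# Equivalent of: --set datastore.params.path=<FULL_PATH_TO_GALEXIE_OBJECTS>
cat > galexie-values.yaml <<EOF
datastore:
  params:
    path: ${datastore_path}
EOF

# Install only when helm is available; otherwise show the command that would run.
if command -v helm >/dev/null 2>&1; then
  helm install "$release_name" stellar/galexie -f galexie-values.yaml
else
  echo "helm not found; would run: helm install $release_name stellar/galexie -f galexie-values.yaml"
fi
```

A values file is easier to review and version than a long list of `--set` flags, which matters once more than one or two chart values are overridden.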

## Running a Container on an Instance

Pull the Galexie container image from the [Stellar Docker Hub registry](https://hub.docker.com/r/stellar/stellar-galexie) using the following Docker command or a similar OCI-compliant image pull command:

```shell
docker pull stellar/stellar-galexie
```

@@ -7,9 +7,9 @@ sidebar_position: 50

### Metrics

Galexie publishes metrics through an HTTP-based admin endpoint, which makes it easier to monitor its performance. This endpoint is configurable in the `config.toml` file, where you can specify the port on which metrics are made available. The data is exposed in Prometheus format, enabling easy integration with existing monitoring and alerting systems.
Galexie publishes metrics through an HTTP-based admin endpoint, which makes it easier to monitor its performance. The data is exposed in Prometheus format, enabling easy integration with existing monitoring and alerting systems.

The admin port can be configured in the `config.toml` file by setting the `admin_port` variable. By default, the `admin_port` is set to `6061`.
The admin port where these metrics are served can be configured by setting the `admin_port` variable. By default, the `admin_port` is set to `6061`.
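
Once Galexie is running, the endpoint can be spot-checked from the command line. The sketch below defines a small helper that keeps only the sample lines of Prometheus text output. The helper name `prom_samples` is hypothetical, and the `/metrics` path in the usage comment is an assumption, so confirm the path against your deployment.

```shell
admin_port=6061   # default value stated above

# Keep only sample lines from Prometheus text output: drop the
# "# HELP" / "# TYPE" comment lines and any blank lines.
prom_samples() {
  grep -v -e '^#' -e '^$'
}

# Assumption: metrics are served at /metrics on the admin port.
# Example, with Galexie running locally:
#   curl -s "http://localhost:${admin_port}/metrics" | prom_samples
```

Filtering out the comment lines leaves one `name value` pair per line, which is convenient for a quick look before wiring the endpoint into a Prometheus scrape job.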

```toml
# Admin port configuration
```

@@ -19,11 +19,18 @@ Galexie exports Stellar ledger metadata to Google Cloud Storage (GCS) or Amazon
- Permissions to create a new S3 bucket, or
- Access to an existing bucket with read/write permissions.

## 2. Docker (Recommended)
## 2. Container Runtime (Recommended)

> **_NOTE:_** While it is possible to natively install Galexie (without Docker), this requires manual dependency management and is recommended only for advanced users.
### Kubernetes

Galexie is available as a Docker image, which simplifies installation and setup. Ensure you have Docker Engine installed on your system ([Docker installation guide](https://docs.docker.com/engine/install/)).
- Kubernetes 1.19+

### Running the Galexie Container on an Instance

- An instance such as an AWS EC2 instance or a GCP VM
- An OCI-compliant container runtime installed on your instance, such as Docker ([Docker installation guide](https://docs.docker.com/engine/install/)).

> **_NOTE:_** While it is possible to natively install Galexie (without a container runtime), this requires manual dependency management and is recommended only for advanced users.

## Hardware Requirements

@@ -5,6 +5,8 @@ sidebar_position: 40

# Running

Currently, the Galexie Helm chart only runs the `append` command. This documentation focuses on containers running on instances.

With the Docker image available and the configuration file set up, you're now ready to run Galexie and start exporting Stellar ledger data to the storage bucket.

## Command Line Usage
101 changes: 84 additions & 17 deletions docs/data/indexers/build-your-own/galexie/admin_guide/setup.mdx
@@ -7,35 +7,102 @@ sidebar_position: 10

## Google Cloud Platform (GCP) for GCS

### Google Cloud Platform (GCP) credentials

Create application default credentials by using your user account for your GCP project by following these steps:

1. Download the [SDK](https://cloud.google.com/sdk/docs/install).
2. Install and initialize the [gcloud CLI](https://cloud.google.com/sdk/docs/initializing).
3. Create [application default credentials](https://cloud.google.com/docs/authentication/provide-credentials-adc#google-idp). They should be stored automatically at `$HOME/.config/gcloud/application_default_credentials.json`.
4. Verify that this file exists before moving on to the next step.

### Google Cloud Storage (GCS) bucket

If you already have a GCS bucket with read and write permissions, you can skip this section. If not, follow these steps:
If you already have a GCS bucket ready for Galexie to push data, you can skip this section. If not, follow these steps:

1. Visit the GCP Console's Storage section (https://console.cloud.google.com/storage) and create a new bucket.
2. Choose a descriptive name for the bucket, such as `stellar-ledger-data`. Refer to [Google Cloud Storage Bucket Naming Guideline](https://cloud.google.com/storage/docs/buckets#naming) for bucket naming conventions. Note down the bucket name; you will need it later during the configuration process.

## Amazon Web Services (AWS) for S3
### Google Cloud Platform (GCP) Authentication

#### Google Kubernetes Engine Cluster

When running Galexie inside a GKE cluster, follow the Google Cloud documentation for [workload identity](https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity) to make sure Galexie has the correct bucket access.

#### GCP VM

1. [Create a Service Account](https://docs.cloud.google.com/iam/docs/service-accounts-create)
2. Use that Service Account when creating the GCP VM
3. Make sure the Service Account has the correct bucket access

#### Credentials (Not Recommended)

To use static credentials, choose the authentication route that works best in the Galexie environment and follow the Google Cloud documentation for [creating credentials](https://developers.google.com/workspace/guides/create-credentials). Make sure the principal of the credentials has access to the correct bucket.

#### IAM Role Permissions

### Amazon Web Services (AWS) credentials
When using GCP IAM to authenticate Galexie to access a bucket, the following permissions are required:
> **Contributor:** This may benefit from describing the breakdown of bucket permissions for the two use cases of Galexie as a Consumer or Publisher. For Consumer, the instance just reads from buckets and would only need a reduced set of read permissions. For Publisher, it would be good to mention the extended write permissions needed.
>
> **Author:** When does Galexie act as a consumer?


Create application default credentials by using your user account for your AWS project by following these steps:
- storage.buckets.get
- storage.buckets.list
- storage.multipartUploads.abort
- storage.multipartUploads.create
- storage.multipartUploads.list
- storage.multipartUploads.listParts
- storage.objects.create
- storage.objects.delete
- storage.objects.get
- storage.objects.list
- storage.objects.restore
- storage.objects.update
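
One way to bundle the permissions listed above is a custom IAM role that is then granted on the bucket. The sketch below joins the permissions into the comma-separated form that `gcloud iam roles create --permissions` expects and prints the two commands for review instead of running them. The project ID, role ID, bucket name, and service account are hypothetical placeholders; whether a custom role or a predefined role fits best depends on your environment.

```shell
# Hypothetical names; replace with your own.
project_id="my-project"
role_id="galexieBucketAccess"
bucket="stellar-ledger-data"
service_account="galexie@${project_id}.iam.gserviceaccount.com"

# The permissions listed above, joined with commas for --permissions.
permissions=$(paste -sd, - <<'EOF'
storage.buckets.get
storage.buckets.list
storage.multipartUploads.abort
storage.multipartUploads.create
storage.multipartUploads.list
storage.multipartUploads.listParts
storage.objects.create
storage.objects.delete
storage.objects.get
storage.objects.list
storage.objects.restore
storage.objects.update
EOF
)

# Print the commands for review rather than running them.
echo "gcloud iam roles create ${role_id} --project=${project_id} --title='Galexie bucket access' --permissions=${permissions}"
echo "gcloud storage buckets add-iam-policy-binding gs://${bucket} --member=serviceAccount:${service_account} --role=projects/${project_id}/roles/${role_id}"
```

Printing first keeps the sketch safe to run anywhere; copy the output into a shell with `gcloud` authenticated once the names are correct.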

1. Download and install the [SDK](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html#getting-started-install-instructions).
2. Create [authentication credentials](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-authentication.html). They should automatically store in this location: `$HOME/.aws/credentials`
3. Verify that this file exists before moving on to the next step.
## Amazon Web Services (AWS) for S3

### Amazon Simple Storage Service (S3) bucket

If you already have an S3 bucket with read and write permissions, you can skip this section. If not, follow these steps:
If you already have an S3 bucket ready for Galexie to push data, you can skip this section. If not, follow these steps:

1. Visit the AWS Console's Storage section (https://console.aws.amazon.com/s3/) and create a new bucket.
2. Choose a descriptive name for the bucket, such as `stellar-ledger-data`. Refer to [S3 General purpose bucket naming rules](https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html) for bucket naming conventions. Note down the bucket name; you will need it later during the configuration process.

### Amazon Web Services (AWS) Authentication

#### EKS Cluster

When running Galexie inside an EKS cluster, follow the AWS documentation for either [IAM roles for service accounts](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html) or [pod identity](https://docs.aws.amazon.com/eks/latest/userguide/pod-identities.html).

#### AWS EC2

1. [Create an IAM Role](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_job-functions_create-policies.html)
2. Use that role in an [instance profile](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2_instance-profiles.html)
3. Use that instance profile in the creation of the EC2 instance
4. Make sure the instance profile has the correct bucket access

#### Credentials (Not Recommended)

To use static credentials, [create an IAM user](https://docs.aws.amazon.com/IAM/latest/UserGuide/getting-started-workloads.html) for Galexie, generate security credentials for it, and make sure the user has access to the correct bucket.

#### IAM Role Permissions

When using AWS IAM to authenticate Galexie to access a bucket, use this example policy, replacing the example bucket name with your destination bucket:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowS3BucketOperations",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads"
      ],
      "Resource": "arn:aws:s3:::my-galexie-bucket-example"
    },
    {
      "Sid": "AllowS3ObjectAccess",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": ["arn:aws:s3:::my-galexie-bucket-example/*"]
    }
  ]
}
```
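
To swap in your own bucket name, a small helper like the one below can rewrite the example policy before it is created in IAM. This is a sketch under stated assumptions: the helper name `render_policy`, the file names, and the policy name in the usage comments are hypothetical, and it relies on the placeholder `my-galexie-bucket-example` appearing exactly as in the policy above.

```shell
# Replace the example bucket name in a policy read from stdin.
# $1: destination bucket name (S3 bucket names contain only lowercase
#     letters, digits, dots, and hyphens, so they are safe in this sed pattern).
render_policy() {
  sed "s/my-galexie-bucket-example/$1/g"
}

# Example usage, assuming the policy above is saved as galexie-policy.json:
#   render_policy stellar-ledger-data < galexie-policy.json > galexie-policy.rendered.json
#   aws iam create-policy --policy-name GalexieBucketAccess \
#     --policy-document file://galexie-policy.rendered.json
```

Both `Resource` entries are rewritten in one pass, so the bucket-level and object-level statements stay in sync.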