Commit 9e433f0

combine CLI pages
1 parent b1765f7 commit 9e433f0

3 files changed

+114
-163
lines changed

docs/source/cli/index.rst

Lines changed: 0 additions & 128 deletions
This file was deleted.

docs/source/index.rst

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -29,7 +29,6 @@ Table of Contents
2929
:caption: Supported Environments
3030

3131
Rust <https://docs.rs/crate/datafusion/>
32-
Command line <cli/index>
3332

3433
.. _toc.guide:
3534

docs/source/user-guide/cli.md

Lines changed: 114 additions & 34 deletions
Original file line numberDiff line numberDiff line change
@@ -17,25 +17,11 @@
1717
under the License.
1818
-->
1919

20-
# DataFusion Command-line Interface
20+
# DataFusion Command-line SQL Utility
2121

22-
The DataFusion CLI allows SQL queries to be executed by an in-process DataFusion context.
23-
24-
```
25-
USAGE:
26-
datafusion-cli [FLAGS] [OPTIONS]
27-
28-
FLAGS:
29-
-h, --help Prints help information
30-
-q, --quiet Reduce printing other than the results and work quietly
31-
-V, --version Prints version information
32-
33-
OPTIONS:
34-
-c, --batch-size <batch-size> The batch size of each query, or use DataFusion default
35-
-p, --data-path <data-path> Path to your data, default to current directory
36-
-f, --file <file>... Execute commands from file(s), then exit
37-
--format <format> Output format [default: table] [possible values: csv, tsv, table, json, ndjson]
38-
```
22+
The DataFusion CLI is a command-line interactive SQL utility that allows
23+
queries to be executed against any supported data files. It is a convenient way to
24+
try DataFusion out with your own data sources.
3925

4026
## Example
4127

@@ -47,31 +33,125 @@ $ echo "1,2" > data.csv
4733

4834
```bash
4935
$ datafusion-cli
36+
DataFusion CLI v11.0.0
37+
❯ CREATE EXTERNAL TABLE foo STORED AS CSV LOCATION 'data.csv';
38+
0 rows in set. Query took 0.017 seconds.
39+
❯ select * from foo;
40+
+----------+----------+
41+
| column_1 | column_2 |
42+
+----------+----------+
43+
| 1 | 2 |
44+
+----------+----------+
45+
1 row in set. Query took 0.012 seconds.
46+
```
47+
48+
## Installation
5049
51-
DataFusion CLI v8.0.0
50+
### Install and run using Cargo
5251
53-
> CREATE EXTERNAL TABLE foo (a INT, b INT) STORED AS CSV LOCATION 'data.csv';
54-
0 rows in set. Query took 0.001 seconds.
52+
The easiest way to install DataFusion CLI is via `cargo install datafusion-cli`.
5553
56-
> SELECT * FROM foo;
57-
+---+---+
58-
| a | b |
59-
+---+---+
60-
| 1 | 2 |
61-
+---+---+
62-
1 row in set. Query took 0.017 seconds.
54+
### Install and run using Homebrew (on macOS)
55+
56+
DataFusion CLI can also be installed via Homebrew (on macOS). Install it like any other pre-built software:
57+
58+
```bash
59+
brew install datafusion
60+
# ==> Downloading https://ghcr.io/v2/homebrew/core/datafusion/manifests/5.0.0
61+
# ######################################################################## 100.0%
62+
# ==> Downloading https://ghcr.io/v2/homebrew/core/datafusion/blobs/sha256:9ecc8a01be47ceb9a53b39976696afa87c0a8
63+
# ==> Downloading from https://pkg-containers.githubusercontent.com/ghcr1/blobs/sha256:9ecc8a01be47ceb9a53b39976
64+
# ######################################################################## 100.0%
65+
# ==> Pouring datafusion--5.0.0.big_sur.bottle.tar.gz
66+
# 🍺 /usr/local/Cellar/datafusion/5.0.0: 9 files, 17.4MB
67+
68+
datafusion-cli
6369
```
6470
65-
## DataFusion-Cli
71+
### Run using Docker
72+
73+
There is no officially published Docker image for the DataFusion CLI, so it is necessary to build from source
74+
instead.
6675
67-
Build the `datafusion-cli`:
76+
Use the following commands to clone this repository and build a Docker image containing the CLI tool. Note
77+
that there is a `.dockerignore` file in the root of the repository that may need to be deleted in order for
78+
this to work.
6879
6980
```bash
70-
cd arrow-datafusion/datafusion-cli
71-
cargo build
81+
git clone https://github.com/apache/arrow-datafusion
82+
cd arrow-datafusion
83+
git checkout 8.0.0
84+
docker build -f datafusion-cli/Dockerfile . --tag datafusion-cli
85+
docker run -it -v "${your_data_location}":/data datafusion-cli
7286
```
7387
74-
## Cli commands
88+
## Usage
89+
90+
```bash
91+
Apache Arrow <[email protected]>
92+
Command Line Client for DataFusion query engine.
93+
94+
USAGE:
95+
datafusion-cli [OPTIONS]
96+
97+
OPTIONS:
98+
-c, --batch-size <BATCH_SIZE> The batch size of each query, or use DataFusion default
99+
-f, --file <FILE>... Execute commands from file(s), then exit
100+
--format <FORMAT> [default: table] [possible values: csv, tsv, table, json,
101+
nd-json]
102+
-h, --help Print help information
103+
-p, --data-path <DATA_PATH> Path to your data, default to current directory
104+
-q, --quiet Reduce printing other than the results and work quietly
105+
-r, --rc <RC>... Run the provided files on startup instead of ~/.datafusionrc
106+
-V, --version Print version information
107+
108+
Type `exit` or `quit` to exit the CLI.
109+
```
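As a rough sketch of non-interactive use, the options listed in the help text can be combined to run a SQL script and print the result as CSV. The file name `query.sql` and the query itself are assumptions made for illustration, and the last step is skipped when `datafusion-cli` is not on the `PATH`.

```bash
# Write a one-statement SQL script (query.sql is a hypothetical file name).
cat > query.sql <<'EOF'
SELECT 1 AS one;
EOF

# -f runs the script and exits, -q reduces extra output,
# and --format selects the output format.
if command -v datafusion-cli >/dev/null 2>&1; then
  datafusion-cli -q -f query.sql --format csv
else
  echo "datafusion-cli not found; skipping" >&2
fi
```

Running a script this way may be convenient for piping results into other tools, since `-f` exits after the commands complete.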
110+
111+
## Registering Parquet Data Sources
112+
113+
Parquet data sources can be registered by executing a `CREATE EXTERNAL TABLE` SQL statement. It is not necessary to provide schema information for Parquet files.
114+
115+
```sql
116+
CREATE EXTERNAL TABLE taxi
117+
STORED AS PARQUET
118+
LOCATION '/mnt/nyctaxi/tripdata.parquet';
119+
```
120+
121+
## Registering CSV Data Sources
122+
123+
CSV data sources can be registered by executing a `CREATE EXTERNAL TABLE` SQL statement.
124+
125+
```sql
126+
CREATE EXTERNAL TABLE test
127+
STORED AS CSV
128+
WITH HEADER ROW
129+
LOCATION '/path/to/aggregate_test_100.csv';
130+
```
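A minimal end-to-end sketch of the header-row case might look like the following. The file names `people.csv` and `people.sql`, the table name, and the column names are all hypothetical. The statement follows the same clause order as the example above.

```bash
# Create a small CSV file whose first line is a header (hypothetical data).
printf 'name,age\nalice,30\nbob,25\n' > people.csv

# Register the file and query it; column names should come from the header row.
cat > people.sql <<'EOF'
CREATE EXTERNAL TABLE people
STORED AS CSV
WITH HEADER ROW
LOCATION 'people.csv';

SELECT name, age FROM people;
EOF

# Run the script only if the CLI is installed.
if command -v datafusion-cli >/dev/null 2>&1; then
  datafusion-cli -f people.sql
else
  echo "datafusion-cli not found; skipping" >&2
fi
```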
131+
132+
It is also possible to provide schema information.
133+
134+
```sql
135+
CREATE EXTERNAL TABLE test (
136+
c1 VARCHAR NOT NULL,
137+
c2 INT NOT NULL,
138+
c3 SMALLINT NOT NULL,
139+
c4 SMALLINT NOT NULL,
140+
c5 INT NOT NULL,
141+
c6 BIGINT NOT NULL,
142+
c7 SMALLINT NOT NULL,
143+
c8 INT NOT NULL,
144+
c9 BIGINT NOT NULL,
145+
c10 VARCHAR NOT NULL,
146+
c11 FLOAT NOT NULL,
147+
c12 DOUBLE NOT NULL,
148+
c13 VARCHAR NOT NULL
149+
)
150+
STORED AS CSV
151+
LOCATION '/path/to/aggregate_test_100.csv';
152+
```
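For a headerless file, an explicit schema supplies the column names. The sketch below reuses the one-line `data.csv` from the example at the top of this page with the columns `a` and `b`. The script name `foo.sql` and the `total` alias are assumptions for illustration.

```bash
# One data row and no header, as in the example earlier in this page.
echo "1,2" > data.csv

# Declare the columns explicitly, then query them by name.
cat > foo.sql <<'EOF'
CREATE EXTERNAL TABLE foo (a INT, b INT)
STORED AS CSV
LOCATION 'data.csv';

SELECT a + b AS total FROM foo;
EOF

# Run the script only if the CLI is installed.
if command -v datafusion-cli >/dev/null 2>&1; then
  datafusion-cli -f foo.sql
else
  echo "datafusion-cli not found; skipping" >&2
fi
```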
153+
154+
## Commands
75155
76156
Available commands inside DataFusion CLI are:
77157
@@ -101,7 +181,7 @@ Available commands inside DataFusion CLI are:
101181
102182
- QuietMode
103183
104-
```
184+
```bash
105185
> \quiet [true|false]
106186
```
107187
