Skip to content
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions docs/StardustDocs/d.tree
Original file line number Diff line number Diff line change
Expand Up @@ -203,4 +203,5 @@
<toc-element topic="gradleReference.md"/>
</toc-element>
<toc-element topic="_shadow_resources.md" hidden="true"/>
<toc-element topic="FAQ.md"/>
</instance-profile>
209 changes: 209 additions & 0 deletions docs/StardustDocs/topics/FAQ.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,209 @@
# Frequently Asked Questions

Here's a list of frequently asked questions about Kotlin DataFrame.
If you haven’t found an answer to yours, feel free to ask it on:

- [GitHub Issues](https://github.com/Kotlin/dataframe/issues)
- [#datascience](https://slack-chats.kotlinlang.org/c/datascience) channel in Kotlin Slack
([request an invite](https://surveys.jetbrains.com/s3/kotlin-slack-sign-up?_gl=1*1ssyqy3*_gcl_au*MTk5NzUwODYzOS4xNzQ2NzkxMDMz*FPAU*MTk5NzUwODYzOS4xNzQ2NzkxMDMz*_ga*MTE0ODQ1MzY3OS4xNzM4OTY1NzM3*_ga_9J976DJZ68*czE3NTE1NDUxODUkbzIyNyRnMCR0MTc1MTU0NTE4NSRqNjAkbDAkaDA.)).

## What is Kotlin DataFrame?

**Kotlin DataFrame** is a Kotlin library for working with tabular data.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add that it's written fully in Kotlin like "framework written in pure Kotlin. It is available from maven central repository"

Its goal is to reconcile Kotlin’s static typing with the dynamic nature of data,
providing a flexible and convenient idiomatic DSL for working with data in Kotlin.

## Is Kotlin DataFrame a Multiplatform Library?

Not yet — Kotlin DataFrame currently supports only the **JVM** target.

We’re actively exploring multiplatform support.
To stay updated on progress, subscribe to the
[corresponding issue](https://github.com/Kotlin/dataframe/issues/24).

### Does Kotlin DataFrame work on Android?

Yes — Kotlin DataFrame can be used in Android projects.

There is no dedicated Android artifact yet, but you can include the standard **JVM artifact**
by setting up a [custom Gradle configuration](gettingStartedGradleAdvanced.md).

## How to start with Kotlin DataFrame
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed ?


If you're new to Kotlin DataFrame, the [Quickstart guide](quickstart.md) is the perfect place to begin —
it gives a brief yet comprehensive introduction to the basics of working with DataFrame.

You can also check out [other guides and examples](Guides-And-Examples.md)
to explore various use cases and deepen your understanding of Kotlin DataFrame.

## What is the best environment to use Kotlin DataFrame?

For the best experience, Kotlin DataFrame is most effective in an interactive environment.

- **[Kotlin Notebook](gettingStartedKotlinNotebook.md)** is ideal for exploring Kotlin DataFrame.
Everything works out of the box — interactivity, rich rendering of DataFrames and plots.
You can instantly see the results of each operation, view the contents of your DataFrames after every transformation,
inspect individual rows and columns, and explore data step-by-step in a live and interactive way.
See the [](quickstart.md) to get started quickly.

- **[Kotlin DataFrame Compiler Plugin for IDEA projects](Compiler-Plugin.md)** enhances your usual
[IntelliJ IDEA](https://www.jetbrains.com/idea/) Kotlin projects by enabling compile-time
[extension properties](extensionPropertiesApi.md) generation.
This allows you to work with DataFrames in a name- and type-safe manner,
integrating seamlessly with the IDE.

## Is `DataFrame` mutable?

No, [`DataFrame`](DataFrame.md) is a completely immutable structure.
Kotlin DataFrame follows the functional style of Kotlin —
each [operation](operations.md) that modifies the data returns a new, updated `DataFrame` instance.

This means original data is never changed in-place, which improves code safety.

## How do I interoperate with collections like `List` or `Map`?

[`DataFrame`](DataFrame.md) integrates seamlessly with Kotlin collections.

You can:
- Create a `DataFrame` from a `Map` using [`toDataFrame()`](createDataFrame.md#todataframe).
- Convert a `DataFrame` back to a `Map` using [`toMap()`](toMap.md).
- Create a [`DataColumn`](DataColumn.md) from a `List` using [`toColumn()`](createColumn.md#tocolumn).
- Convert a `DataColumn` to a `List` of values.
- Convert a `DataFrame<T>` into a `List<T>` of data class instances corresponding to each row
using [`toList()`](toList.md).

## Are there any limitations on the types used in a DataFrame?

No! You can store values of **any Kotlin or Java types** inside a [`DataFrame`](DataFrame.md)
and work with them in a type-safe manner using [extension properties](extensionPropertiesApi.md)
across various [operations](operations.md).

For some commonly used types — such as
[Kotlin basic types](https://kotlinlang.org/docs/basic-types.html) and
[Kotlin date-time types](https://github.com/Kotlin/kotlinx-datetime) —
there is built-in support for automatic conversion and parsing.

## What data sources are supported?

<!------TODO data sources---->
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will you do it in this PR or later?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

later, when #1293 and #1298 are ready


Kotlin DataFrame supports all popular data sources — CSV, JSON, Excel, Apache Arrow, SQL databases, and more!
See the [Data Sources section](Data-Sources.md) for a complete list of supported formats
and instructions on how to integrate them into your workflow.

Some sources — such as Apache Spark, [Exposed](https://www.jetbrains.com/help/exposed/home.html),
and [Multik](https://github.com/Kotlin/multik) — are not supported directly (yet),
but you can find [official integration examples here](Integrations.md).

If the data source you need isn't supported yet,
feel free to open an [issue](https://github.com/Kotlin/dataframe/issues)
and describe your use case — we’d love to hear from you!

## I see magically appearing properties in examples. What is it?

These are [extension properties](extensionPropertiesApi.md) — one of the key features of Kotlin DataFrame.

Extension properties correspond to the columns of a [`DataFrame`](DataFrame.md), allowing you to access and select them
in a **type-safe** and **name-safe** way.

They are generated automatically when working with Kotlin DataFrame in:

- [Kotlin Notebook](gettingStartedKotlinNotebook.md), where extension properties are generated
after each cell execution.
- A Kotlin project in [IntelliJ IDEA](https://www.jetbrains.com/idea/) with the
[](Compiler-Plugin.md) enabled, where the properties are generated at compile time.

## I used the KProperties API in older versions, what should I use now that it's deprecated?

The KProperty API was a useful access mechanism in earlier versions.
However, with the introduction of [extension properties](extensionPropertiesApi.md)
and the [Kotlin DataFrame compiler plugin](Compiler-Plugin.md),
you now have a more flexible and powerful alternative.

Annotate your Kotlin class with [`@DataSchema`](Compiler-Plugin.md#dataschema-declarations),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or alternatively call toDataFrame on list of Kotlin or Java objects, and resulting DataFrame will have schema according to their properties or getters

and the plugin will automatically generate type-safe extension properties
for your [`DataFrame`](DataFrame.md).
Or alternatively, call [`toDataFrame()`](createDataFrame.md#todataframe) on a list of Kotlin or Java objects,
and the resulting `DataFrame` will have schema according to their properties or getters.

See [compiler plugin examples](Compiler-Plugin.md#examples).

## How to visualize data from a DataFrame?

[Kandy](https://kotlin.github.io/kandy) is a Kotlin plotting library
designed to integrate seamlessly with Kotlin DataFrame.
It provides a convenient and idiomatic Kotlin DSL for building charts,
leveraging all Kotlin DataFrame features — including [extension properties](extensionPropertiesApi.md).

See the [Kandy Quick Start Guide](https://kotlin.github.io/kandy/quick-start-guide.html)
and explore the [Examples Gallery](https://kotlin.github.io/kandy/examples.html).

## Can I work with hierarchical/nested data?

Yes, Kotlin DataFrame is designed to work with hierarchical data.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This page could be useful somewhere here: https://kotlin.github.io/dataframe/hierarchical.html

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah, I wanted to, but forgot 😄.


You can read JSON or any other nested format into a [`DataFrame`](DataFrame.md)
with hierarchical structure — using `FrameColumn`
(a column of data frames) and `ColumnGroup` (a column with nested subcolumns).

Both [dataframe schemas](schemas.md) and [extension properties](extensionPropertiesApi.md)
fully support nested data structures, allowing type-safe access and transformations at any depth.

See [](hierarchical.md) for more information.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missed in []?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If [] is empty, writerside automatically adds topic name:
image


Also, you can transform your data into grouped structures using [`groupBy`](groupBy.md) or [`pivot`](pivot.md).

## Does Kotlin DataFrame support OpenAPI schemas?

Yes — the experimental `dataframe-openapi` module adds support for OpenAPI JSON schemas.
You can use it to parse and work with OpenAPI-defined structures directly in Kotlin DataFrame.

See the [OpenAPI Guide](https://github.com/Kotlin/dataframe/blob/master/examples/notebooks/json/KeyValueAndOpenApi.ipynb)
for details and examples.

## Does Kotlin DataFrame support geospatial data?

Yes — the experimental `dataframe-geo` module provides functionality for working with geospatial data,
including support for reading and writing GeoJSON and Shapefile formats, as well as tools for manipulating geometry types.

See the [GeoDataFrame Guide](https://kotlin.github.io/kandy/geo-plotting-guide.html)
for details and examples with beautiful [Kandy](https://kotlin.github.io/kandy) geo visualizations.

## What is the difference between Compiler Plugin, Gradle Plugin, and KSP Plugin?

All these plugins relate to working with [dataframe schemas](schemas.md), but they serve different purposes:

- **[Gradle Plugin](gradleReference.md)** and **[KSP Plugin](https://github.com/Kotlin/dataframe/tree/master/plugins/symbol-processor)**
are used to **generate data schemas** from external sources as part of the Gradle build process.

- **Gradle Plugin**: You declare the data source in your `build.gradle.kts` file
using the `dataframes { ... }` block.

- **KSP Plugin**: You annotate your Kotlin file with `@ImportDataSchema` file annotation,
and the schema will be generated via Kotlin Symbol Processing.

See [Data Schemas in Gradle Projects](https://kotlin.github.io/dataframe/schemasgradle.html) for more.

- **[Compiler Plugin](Compiler-Plugin.md)**
provides **on-the-fly generation** of [extension properties](extensionPropertiesApi.md)
based on an existing schema **during compilation**.
However, when reading data from files or external sources (like SQL),
the schema cannot be inferred automatically — you need to
specify it manually or use the Gradle or KSP plugin to generate it.

## How do I contribute or report an issue?

We’re always happy to receive contributions!

If you’d like to contribute, please refer to our
[contributing guidelines](https://github.com/Kotlin/dataframe/blob/master/CONTRIBUTING.md).

To report bugs or suggest improvements, open an issue on the
[DataFrame GitHub repository](https://github.com/Kotlin/dataframe/issues).

You’re also welcome to ask questions or discuss anything related to Kotlin DataFrame in the
[#datascience](https://slack-chats.kotlinlang.org/c/datascience) channel on Kotlin Slack.
If you’re not yet a member, you can
[request an invite](https://surveys.jetbrains.com/s3/kotlin-slack-sign-up?_gl=1*1ssyqy3*_gcl_au*MTk5NzUwODYzOS4xNzQ2NzkxMDMz*FPAU*MTk5NzUwODYzOS4xNzQ2NzkxMDMz*_ga*MTE0ODQ1MzY3OS4xNzM4OTY1NzM3*_ga_9J976DJZ68*czE3NTE1NDUxODUkbzIyNyRnMCR0MTc1MTU0NTE4NSRqNjAkbDAkaDA.).