Disambiguate duplicated column names when reading data

For example, `DataFrame.readExcel()` currently fails to read sheets with duplicated column names. Having this kind of duplication in data is bad practice, but that's how data ends up on one's desk quite frequently.

kdf should follow the approach implemented in other tabular-data APIs and disambiguate (or repair) duplicates, e.g. by renaming duplicated column names:
* foo (first appearance)
* foo_1 (second appearance)
* foo_2 (third appearance)
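
The renaming scheme listed above could be sketched roughly as follows. This is only an illustration of the idea, not existing kdf code: `disambiguateColumnNames` is a hypothetical helper name, and skipping suffixes that collide with names already present in the header is an assumption about how such collisions should be handled.

```kotlin
// Hypothetical helper (not part of the kdf API): keeps the first appearance of
// each column name and appends _1, _2, ... to later appearances.
fun disambiguateColumnNames(names: List<String>): List<String> {
    // Names that must not be generated as a replacement: every original name
    // plus every replacement handed out so far.
    val taken = names.toMutableSet()
    // Names whose first appearance has already been emitted unchanged.
    val firstSeen = mutableSetOf<String>()
    // Next suffix to try for each duplicated name.
    val nextSuffix = mutableMapOf<String, Int>()

    return names.map { name ->
        if (firstSeen.add(name)) {
            name // first appearance stays as-is
        } else {
            var i = nextSuffix[name] ?: 1
            var candidate = "${name}_$i"
            // Assumption: skip suffixes that would clash with an existing name.
            while (candidate in taken) {
                i++
                candidate = "${name}_$i"
            }
            taken.add(candidate)
            nextSuffix[name] = i + 1
            candidate
        }
    }
}

fun main() {
    println(disambiguateColumnNames(listOf("foo", "bar", "foo", "foo")))
    // prints [foo, bar, foo_1, foo_2]
}
```

A header that already contains `foo_1` next to two `foo` columns would get `foo_2` for the second `foo` under this sketch, so the result is always free of duplicates.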

Such functionality is also referred to as a name-repair strategy; see, for example, the `name_repair` parameter in https://readr.tidyverse.org/reference/read_delim.html

For API consistency, this functionality should be provided by all `DataFrame.read*` methods.

Optionally, the user could be given more control over the repair strategy.
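
One way to give the user that control might look like the sketch below. Everything in it is hypothetical: the `NameRepair` enum, its three values, and `repairNames` do not exist in kdf and are only loosely modelled on the options readr offers. The `MAKE_UNIQUE` branch is deliberately naive and does not guard against a generated name clashing with an existing column.

```kotlin
// Hypothetical API sketch; none of these names exist in kdf.
enum class NameRepair {
    DO_NOTHING,   // pass column names through unchanged
    CHECK_UNIQUE, // fail with an error if any name is duplicated
    MAKE_UNIQUE   // append _1, _2, ... to repeated names
}

fun repairNames(names: List<String>, strategy: NameRepair): List<String> = when (strategy) {
    NameRepair.DO_NOTHING -> names

    NameRepair.CHECK_UNIQUE -> {
        // Collect every name that occurs more than once.
        val duplicates = names.groupingBy { it }.eachCount().filterValues { it > 1 }.keys
        require(duplicates.isEmpty()) { "Duplicated column names: $duplicates" }
        names
    }

    NameRepair.MAKE_UNIQUE -> {
        // Naive suffixing: counts earlier appearances of each name.
        val counts = mutableMapOf<String, Int>()
        names.map { name ->
            val n = counts.getOrDefault(name, 0)
            counts[name] = n + 1
            if (n == 0) name else "${name}_$n"
        }
    }
}

fun main() {
    println(repairNames(listOf("foo", "foo", "bar"), NameRepair.MAKE_UNIQUE))
    // prints [foo, foo_1, bar]
}
```

Each `DataFrame.read*` method could then accept such a strategy as an optional parameter with a sensible default, which would keep the readers consistent with one another.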

Disambiguate duplicated column names when reading data #342
