Custom Prefixes for data column and few-shot column for prompt #45

Currently, the column names passed as `generate_data_for_column` and `fewshot_example_columns` are used as the prefixes in the prompt.

E.g.:

```python
from datasets import Dataset
from ai_dataset_generator.prompts import BasePrompt

fewshot_examples = Dataset.from_dict({
    "text": ["This movie is great!", "This movie is bad!"],
    "label": ["positive", "negative"]
})

prompt_template = BasePrompt(
    task_description="Annotate movie reviews as either: {label_options}",
    label_options=["positive", "negative"],
    generate_data_for_column="label",
    fewshot_example_columns="text",
)
```

This produces the following prompt:

```
Annotate movie reviews as either: positive, negative

text: This movie is great!
label: positive

text: This movie is bad!
label: negative

text: {text}
label: 
```

Here, `text:` and `label:` are the prefixes.

## Proposal/Motivation

What if I use a custom fine-tuned model that does not work well with `text` and `label` as prefixes in the prompt, but was trained with `sentence` and `prediction`?

For more flexibility, those prefixes should be optionally configurable. For example:

```python
from datasets import Dataset
from ai_dataset_generator.prompts import BasePrompt

fewshot_examples = Dataset.from_dict({
    "text": ["This movie is great!", "This movie is bad!"],
    "label": ["positive", "negative"]
})

prompt_template = BasePrompt(
    task_description="Annotate movie reviews as either: {label_options}",
    label_options=["positive", "negative"],
    generate_data_for_column=("label", "prediction"),  # Second tuple item contains the new prefix string
    fewshot_example_columns=("text", "sentence"),  # Second tuple item contains the new prefix string
)
```

This would produce the following prompt:

```
Annotate movie reviews as either: positive, negative

sentence: This movie is great!
prediction: positive

sentence: This movie is bad!
prediction: negative

sentence: {text}
prediction: 
```

The default behaviour could stay the same: when a plain column name is passed, it is used as the prefix. If a tuple (or another structure) is passed, its second item is used as the prefix instead.
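To make the proposed rule concrete, here is a minimal sketch of how a plain column name or a `(column, prefix)` tuple could be resolved and then used to build the prompt. The names `resolve_prefix` and `render_prompt` are hypothetical helpers written for illustration. They are not part of `ai_dataset_generator`, and the actual `BasePrompt` internals may look quite different.

```python
from typing import Dict, List, Tuple, Union

# A column is given either as "name" or as ("name", "custom prefix").
ColumnSpec = Union[str, Tuple[str, str]]


def resolve_prefix(spec: ColumnSpec) -> Tuple[str, str]:
    """Hypothetical helper: return (column_name, prefix) for either form."""
    if isinstance(spec, str):
        return spec, spec  # default: the column name doubles as the prefix
    column, prefix = spec
    return column, prefix


def render_prompt(
    task_description: str,
    examples: List[Dict[str, str]],
    input_spec: ColumnSpec,
    target_spec: ColumnSpec,
) -> str:
    """Hypothetical helper: build a few-shot prompt using the resolved prefixes."""
    input_col, input_prefix = resolve_prefix(input_spec)
    target_col, target_prefix = resolve_prefix(target_spec)

    blocks = [task_description]
    for example in examples:
        blocks.append(
            f"{input_prefix}: {example[input_col]}\n"
            f"{target_prefix}: {example[target_col]}"
        )
    # The final block keeps a "{column}" placeholder for the unlabeled input.
    blocks.append(f"{input_prefix}: {{{input_col}}}\n{target_prefix}: ")
    return "\n\n".join(blocks)


examples = [
    {"text": "This movie is great!", "label": "positive"},
    {"text": "This movie is bad!", "label": "negative"},
]

print(
    render_prompt(
        "Annotate movie reviews as either: positive, negative",
        examples,
        input_spec=("text", "sentence"),
        target_spec=("label", "prediction"),
    )
)
```

Passing plain strings (`input_spec="text"`, `target_spec="label"`) falls back to the column names as prefixes, so existing prompts would not change.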


