StoryBot: Interactive Story Generation For Kids

Group: Ghiwa Lamah, Ram Senthamarai, Nicolas Loffreda

Section Instructor: Natalie Ahn

Repository for the final project of MIDS W266: Natural Language Processing with Deep Learning (Spring 2023), a course in the UC Berkeley MIDS program.

The goal of this project is to develop an NLP sentence-continuation model that can eventually support an interactive story-writing experience for children, making the activity more accessible while reducing the burden on childcare providers. The system takes the last few sentences of a story as input and generates the next sentence while staying on topic.
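The input/output contract described above can be sketched roughly as follows. This is only an illustration, not the code in the notebooks: the helper names (`split_sentences`, `build_prompt`, `next_sentence`), the regex-based sentence splitter, and the default context size of three sentences are all assumptions, and `generate` stands in for whichever fine-tuned model is used.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def build_prompt(story: str, n_context: int = 3) -> str:
    """Keep only the last `n_context` sentences of the story as model input."""
    return " ".join(split_sentences(story)[-n_context:])


def next_sentence(story: str, generate: Callable[[str], str], n_context: int = 3) -> str:
    """Pass the trimmed context to `generate` and return the first sentence it produces.

    `generate` is a hypothetical callable wrapping a text-generation model.
    """
    continuation = generate(build_prompt(story, n_context))
    sentences = split_sentences(continuation)
    return sentences[0] if sentences else ""
```

Passing the model in as a plain callable keeps the sketch independent of any particular library; in practice `generate` would wrap a call to one of the fine-tuned models.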

View Final Presentation Slides

Repo Folders:

- Code:

data_processing.ipynb: The data processing Jupyter notebook
model_finetuning.ipynb: Jupyter notebook used to train OPT and T5
inferencing.ipynb: Jupyter notebook used for initial inference of model outputs
Auto-Evaluation.ipynb: Jupyter notebook used to perform auto-evaluation
bert2bert_train_eval.ipynb: Jupyter notebook used to train and evaluate Bert2Bert (B2B)
annotation.ipynb: Jupyter notebook used to perform manual annotation
common.py: Setup and configurations imported by other notebooks in this folder

- Data:

Raw data file for the Children Stories Text Corpus
The processed S1, S2 and S3 datasets (in a 7-Zip archive because of their large size)

- Evaluation:

- Auto:

final_scores.csv: CSV summarizing the scores for all models
CSVs of auto-evaluation results for each fine-tuned model

- Manual:

prompts_for_manual_evaluation: List of the 30 prompts used in the manual evaluation process
manual_eval_generated_output.csv: Generated output from all models for the list of 30 prompts. Five sequences were generated for each prompt/model combination.
manual_eval_generated_output_with_bib.csv: Annotated version of the CSV above, with flags indicating which returned sequence the model chose as best and which returned sequence the annotator chose as best.
CSVs of the resulting annotations made by each annotator, scoring the returned sequences on the five chosen metrics: Relevance, Readability, Grammar, Non-Redundancy, and Kid-Friendly Language.
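
As a rough illustration of how the manual-evaluation files could be summarized, the sketch below averages the five metrics per model and measures how often the model's best pick matches the annotator's. It is not the analysis used in the report. The row layout and the key names `model`, `prompt_id`, `model_best`, and `annotator_best` are hypothetical, and the actual CSV columns may differ.

```python
from collections import defaultdict
from statistics import mean

# The five manual-evaluation metrics named in this README.
METRICS = ["Relevance", "Readability", "Grammar", "Non-Redundancy", "Kid-Friendly Language"]


def mean_scores(rows):
    """Average each metric per model.

    `rows` is a list of dicts with a hypothetical 'model' key and one key per metric.
    """
    by_model = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for metric in METRICS:
            by_model[row["model"]][metric].append(float(row[metric]))
    return {
        model: {metric: mean(values) for metric, values in metrics.items()}
        for model, metrics in by_model.items()
    }


def best_pick_agreement(rows):
    """Fraction of (prompt, model) groups where one sequence carries both 'best' flags.

    Assumes hypothetical boolean keys 'model_best' and 'annotator_best',
    each set on exactly one sequence per group.
    """
    groups, agreeing = set(), set()
    for row in rows:
        key = (row["prompt_id"], row["model"])
        groups.add(key)
        if row["model_best"] and row["annotator_best"]:
            agreeing.add(key)
    return len(agreeing) / len(groups) if groups else 0.0
```

Rows in this shape could be produced by reading a CSV with `csv.DictReader` and converting the flag columns to booleans.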
