Hi! And thanks for sharing these models and the ebook-summary project — I find it very useful.
I have a question about the training of the obook_summary and obook_title models. From the documentation I understand they are LoRA fine-tunes of Mistral-7B-Instruct v0.2, using TRL and Unsloth. Could you share a bit more detail about:
• The approximate dataset size and composition (e.g. number of texts / chapters, whether based on public domain ebooks, technical texts, etc.)
• Any evaluation process or benchmarks you used to check summarization quality (ROUGE, human eval, etc.)
• Whether human feedback or curation was part of the training process (e.g. data cleaning, style adjustments)
• If possible, the hyperparameters you found important (epochs, learning rate, context length, etc.)
I realize not all details may be public, but even a high-level overview would be very helpful for understanding the strengths and limitations of the models.
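For context, here is roughly the kind of configuration I have in mind when asking about hyperparameters — a minimal sketch with placeholder values I made up myself, not your actual settings (the rank, alpha, learning rate, epochs, and batch sizes are all guesses):

```python
# Hypothetical LoRA fine-tuning hyperparameters for illustration only;
# these are NOT the actual obook_summary / obook_title settings.
lora_cfg = {
    "r": 16,                # LoRA rank
    "lora_alpha": 32,       # scaling factor
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
}

train_cfg = {
    "learning_rate": 2e-4,
    "num_train_epochs": 2,
    "max_seq_length": 4096,             # context length during training
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 8,
}

# Effective batch size = per-device batch size x accumulation steps
effective_batch = (train_cfg["per_device_train_batch_size"]
                   * train_cfg["gradient_accumulation_steps"])
print(effective_batch)  # 16
```

Even just knowing which of these knobs mattered most in your runs would be very informative.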
Thanks again for the great work!