Hi! And thanks for sharing these models and the ebook-summary project — I find it very useful.
I have a question about the training of the obook_summary and obook_title models. From the documentation I understand they are LoRA fine-tunes of Mistral-7B-Instruct v0.2, using TRL and Unsloth. Could you share a bit more detail about:
• The approximate dataset size and composition (e.g. number of texts / chapters, whether based on public domain ebooks, technical texts, etc.)
• Any evaluation process or benchmarks you used to check summarization quality (ROUGE, human eval, etc.)
• Whether human feedback or curation was part of the training process (e.g. data cleaning, style adjustments)
• If possible, the hyperparameters you found important (epochs, learning rate, context length, etc.)
I realize not all details may be public, but even a high-level overview would be very helpful for understanding the strengths and limitations of the models.
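For context, here is roughly the kind of configuration I have in mind when asking about hyperparameters — a minimal sketch with placeholder values I made up myself, not your actual settings (the rank, alpha, learning rate, epochs, and batch sizes are all guesses):

```python
# Hypothetical LoRA fine-tuning hyperparameters for illustration only;
# these are NOT the actual obook_summary / obook_title settings.
lora_cfg = {
    "r": 16,                # LoRA rank
    "lora_alpha": 32,       # scaling factor
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
}

train_cfg = {
    "learning_rate": 2e-4,
    "num_train_epochs": 2,
    "max_seq_length": 4096,             # context length during training
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 8,
}

# Effective batch size = per-device batch size x accumulation steps
effective_batch = (train_cfg["per_device_train_batch_size"]
                   * train_cfg["gradient_accumulation_steps"])
print(effective_batch)  # 16
```

Even just knowing which of these knobs mattered most in your runs would be very informative.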
Thanks again for the great work!