This language data file supports financial numbers such as (5,555) and 12,555.
|
Please make your PR against https://github.com/tesseract-ocr/tessdata_contrib instead of this repository. Include comparative data showing how and why this is better than the existing data. |
|
Financial numbers are not read very accurately with the standard eng.traineddata. With the standard eng.traineddata, numbers like [image] are read as [image]. I found the issue in a personal project, and it is not limited to (4); it also affects other numbers such as (16), 1, etc. Sometimes the issue appears with large numbers as well, so I trained a model starting from eng.traineddata. It includes only the financial number characters (0123456789(),.). |
|
@anuraghkp1: as @Shreeshrii pointed out, could you please make a pull request to the tessdata_contrib repository? |
|
@stweil can the PR be applied to tessdata_contrib directly? It will be useful to other users too. E.g. see request on Shreeshrii/tessdata_shreetest#9 |
Yes, that is possible. @anuraghkp1, could you please describe how you generated the new traineddata? Ideally it should be possible to reproduce your training process. |
|
@anuraghkp1, ping. |
|
@anuraghkp1, do you have some examples where the new model is superior to existing models? Would a white list of expected characters with an existing model achieve the same results as your model? |
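One way to answer the whitelist question is to run the same image through the stock model, the stock model with a character whitelist, and the new model, then compare the three outputs. The following is only a sketch under assumptions: it calls the `tesseract` command-line tool through `subprocess`, uses the `tessedit_char_whitelist` config variable (which, as far as I know, is not honored by the LSTM engine in Tesseract 4.0), and assumes `--psm 7` (single text line) suits the test images. The function names and the `custom_lang` parameter are hypothetical, since the name of the new traineddata file is not given in this thread.

```python
import subprocess

# Characters the new model was trained on, as listed earlier in this thread.
FINANCIAL_CHARS = "0123456789(),."


def build_tesseract_cmd(image_path, lang="eng", whitelist=None, psm=7):
    """Build a tesseract CLI command that prints recognized text to stdout.

    psm=7 (treat the image as a single text line) is an assumption here.
    """
    cmd = ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]
    if whitelist:
        # Restrict recognition to the given characters.
        cmd += ["-c", f"tessedit_char_whitelist={whitelist}"]
    return cmd


def run_ocr(image_path, **kwargs):
    """Run tesseract on one image and return the recognized text."""
    result = subprocess.run(
        build_tesseract_cmd(image_path, **kwargs),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def compare(image_path, custom_lang):
    """Return the output of the three configurations for one image.

    custom_lang is a placeholder for the name of the new traineddata file.
    """
    return {
        "eng": run_ocr(image_path),
        "eng+whitelist": run_ocr(image_path, whitelist=FINANCIAL_CHARS),
        custom_lang: run_ocr(image_path, lang=custom_lang),
    }
```

Calling `compare()` on a handful of images with known ground truth would give the kind of comparative data requested above. The command is passed as a list rather than a shell string, so the parentheses in the whitelist need no quoting.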