TexFinOps

AI-powered document reconciliation platform for cotton procurement in textile operations.

TexFinOps reduces manual effort in matching purchase data across invoices, weighbridge slips, and inward records. It extracts data from uploaded files, validates business rules, computes shortages/debit quantities, and exposes a workflow-friendly dashboard for review.

1) What this project is about

In cotton procurement, teams usually compare:

supplier invoice weight,
actual received weight,
commercial rate (candy rate),
allowable process loss.

This is often done manually in spreadsheets, which is slow and error-prone.

TexFinOps automates that process by combining:

Document upload (invoice/weighbridge),
OCR extraction of text,
LLM + regex parsing into structured fields,
Reconciliation logic to calculate shortage/debit,
UI review for approval and correction.
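
The reconciliation step listed above can be sketched as follows. This is a minimal illustration, not the project's actual service: the function name `reconcile`, the field names, the sample figures, and the formula itself (debit quantity is the shortage beyond the allowable loss, priced at the candy rate) are all assumptions. It also assumes the common Indian trade convention of 1 candy = 355.62 kg.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP

# Assumption: 1 candy = 355.62 kg (common Indian cotton trade convention).
KG_PER_CANDY = Decimal("355.62")


@dataclass(frozen=True)
class ReconciliationResult:
    shortage_kg: Decimal
    allowable_loss_kg: Decimal
    debit_qty_kg: Decimal
    debit_amount: Decimal


def reconcile(invoice_kg: Decimal, received_kg: Decimal,
              candy_rate: Decimal, allowable_loss_pct: Decimal) -> ReconciliationResult:
    """Hypothetical shortage/debit calculation; the real business rules may differ."""
    # Shortage is never negative: receiving more than invoiced is not a shortage.
    shortage = max(invoice_kg - received_kg, Decimal("0"))
    # Allowable process loss as a percentage of the invoiced weight.
    allowable = invoice_kg * allowable_loss_pct / Decimal("100")
    # Only the shortage beyond the allowable loss is debited.
    debit_qty = max(shortage - allowable, Decimal("0"))
    rate_per_kg = candy_rate / KG_PER_CANDY
    debit_amount = (debit_qty * rate_per_kg).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    return ReconciliationResult(shortage, allowable, debit_qty, debit_amount)


# Sample figures are made up for illustration.
result = reconcile(
    invoice_kg=Decimal("10000"),
    received_kg=Decimal("9700"),
    candy_rate=Decimal("60000"),
    allowable_loss_pct=Decimal("2"),
)
print(result)
```

Using `Decimal` rather than `float` keeps the monetary result free of binary rounding surprises, which matters when the output feeds a debit note.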

2) Why this project is significant

Faster turnaround: reduces the time from document receipt to reconciliation.
Lower human error: systematic parsing and deterministic calculations.
Auditability: stores source documents and parsed fields in one place.
Scalability: async worker model supports growing document volume.
Finance control: highlights shortages and debit-note candidates quickly.

3) Technology stack and why each is used

Backend

FastAPI: high-performance API framework with automatic OpenAPI docs (/docs).
SQLModel + SQLAlchemy (async): typed models and async DB operations.
PostgreSQL: reliable relational persistence for transactional records.
Celery: background processing for OCR/parsing tasks.
Redis: message broker/result backend for Celery.
PaddleOCR + pdf2image + Pillow: OCR pipeline for PDFs/images.
LangChain + provider LLM APIs: semantic extraction from noisy document text.
Pydantic / pydantic-settings: config + schema validation.

Frontend

React + TypeScript: typed component-based UI.
Vite: fast local development and build tooling.
React Router: page routing.
TanStack Query: server-state fetching/caching.
React Hook Form + Zod: form handling and validation.
TanStack Table: data grid/table interactions.
Tailwind CSS + Radix UI: fast, accessible UI composition.
react-pdf / react-dropzone: PDF preview + document upload UX.

DevOps / Runtime

Docker + Docker Compose: reproducible multi-service environment.
Service split (web/worker/db/redis/frontend): clear separation of concerns.

4) High-level architecture

flowchart LR
  U[User] --> FE[React Frontend]
  FE -->|HTTP REST| API[FastAPI Backend]
  API --> DB[(PostgreSQL)]
  API --> FS[(Uploads Storage)]
  API -->|Queue task| R[(Redis)]
  R --> W[Celery Worker]
  W --> OCR[OCR Service]
  W --> LLM[LangChain Parser + LLM]
  W --> DB
  W --> FS
  FE -->|poll status/read data| API

Why this architecture

OCR is CPU-intensive and LLM calls have variable latency, so both run asynchronously in a worker instead of inside API requests.
The API stays responsive while the worker processes documents in the background.
Postgres gives consistent state for records and reconciliation outputs.
Redis decouples ingestion from processing for reliability and scale.
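
The enqueue-and-return pattern behind these points can be sketched in a few lines. To stay runnable without any services, this example deliberately swaps in Python's standard-library `queue.Queue` and a thread in place of Redis and Celery. The names `handle_upload` and `worker_loop`, the status strings, and the short sleep standing in for OCR/LLM work are all hypothetical.

```python
import queue
import threading
import time

tasks = queue.Queue()          # stand-in for the Redis broker
status: dict[int, str] = {}    # stand-in for the Document table in Postgres


def handle_upload(document_id: int) -> dict:
    """API side: record the document, enqueue the work, return immediately."""
    status[document_id] = "processing"
    response = {"document_id": document_id, "status": "processing"}
    tasks.put(document_id)
    return response


def worker_loop() -> None:
    """Worker side: pull tasks until a None sentinel arrives."""
    while (document_id := tasks.get()) is not None:
        time.sleep(0.05)                   # placeholder for slow OCR + parsing
        status[document_id] = "completed"
        tasks.task_done()
    tasks.task_done()                      # account for the sentinel itself


worker = threading.Thread(target=worker_loop, daemon=True)
worker.start()

response = handle_upload(1)   # returns before the worker has finished
tasks.put(None)               # ask the worker to stop after draining the queue
tasks.join()                  # wait for processing (only needed in this demo)
final_status = status[1]
```

The caller gets its response while the document is still marked as processing, and the status is updated later by the worker, which is the behavior the frontend relies on when it polls.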

5) Sequential workflow (end-to-end)

User creates a cotton inward record from dashboard.
User uploads invoice and/or weighbridge documents.
Backend validates type/size and stores file under uploads.
Backend creates Document row and queues Celery task.
Worker picks task and runs OCR extraction.
Parser converts OCR text to structured fields (LLM first, regex fallback).
Parsed values update Document and linked CottonInward.
Reconciliation service calculates shortage/debit metrics.
Inward status updates (e.g., processing/reconciled/debit required).
Frontend displays results for review and action.
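
The "LLM first, regex fallback" step in the workflow above can be sketched like this. It is an illustration under stated assumptions: `parse_document`, `parse_with_regex`, the field names, and the regex patterns are hypothetical, and the LLM is represented by any callable that takes OCR text and returns a dict, so no provider API is invoked here.

```python
import re
from typing import Callable, Optional

# Hypothetical patterns; real documents will need template-specific tuning.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:No\.?|Number)\s*[:#]?\s*([A-Z0-9/-]+)", re.I
    ),
    "net_weight_kg": re.compile(
        r"Net\s*(?:Weight|Wt\.?)\s*[:=]?\s*([\d,]+(?:\.\d+)?)\s*kg", re.I
    ),
}


def parse_with_regex(text: str) -> dict:
    """Extract whichever fields the patterns can find; missing ones are None."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        # Strip thousands separators so "9,700.50" becomes "9700.50".
        fields[name] = match.group(1).replace(",", "") if match else None
    return fields


def parse_document(text: str,
                   llm_parse: Optional[Callable[[str], dict]] = None) -> dict:
    """Try the LLM parser first; fall back to regex if it is absent or fails."""
    if llm_parse is not None:
        try:
            fields = llm_parse(text)
            if fields:
                return {**fields, "source": "llm"}
        except Exception:
            pass  # any provider error falls through to the regex path
    return {**parse_with_regex(text), "source": "regex"}


def failing_llm(text: str) -> dict:
    """Simulates a provider outage to exercise the fallback."""
    raise TimeoutError("simulated provider outage")


ocr_text = "TAX INVOICE\nInvoice No: CT/2024/118\nNet Weight: 9,700.50 kg"
parsed = parse_document(ocr_text, llm_parse=failing_llm)
print(parsed)
```

Recording which path produced the values (the `source` key here) is one way to let reviewers see when a document was parsed without the LLM.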

6) Repository structure

backend/ → FastAPI, models, schemas, OCR/parser/reconciliation services, Celery worker.
frontend/ → React dashboard and review UI.
uploads/ → stored uploaded files (invoice/weighbridge).
docker-compose.yml → local orchestration for all services.
.env.example → environment template.

7) API overview

Base URL: http://localhost:8000/api/v1

Cotton inward

GET /cotton-inwards → list with pagination/filter/search
POST /cotton-inwards → create record
GET /cotton-inwards/{id} → details
PATCH /cotton-inwards/{id} → update record
DELETE /cotton-inwards/{id} → delete
POST /cotton-inwards/{id}/reconcile → manual reconcile

Documents

POST /documents/upload → upload invoice/weighbridge
GET /documents/inward/{cotton_inward_id} → list inward documents
GET /documents/{document_id} → document metadata
GET /documents/{document_id}/status → OCR processing status
GET /documents/{document_id}/download → download file
DELETE /documents/{document_id} → remove document/file
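
A client can wait for OCR to finish by polling the status endpoint listed above. The sketch below assumes the response is JSON with a `status` key and that `completed` and `failed` are the terminal values; neither is confirmed by this README, and `wait_for_document` is a hypothetical helper. The fetch function is injectable so the loop can be exercised without a running server.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "http://localhost:8000/api/v1"


def fetch_status(document_id: str) -> dict:
    """Call GET /documents/{document_id}/status and decode the JSON body."""
    url = f"{BASE_URL}/documents/{document_id}/status"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def wait_for_document(document_id: str,
                      fetch: Callable[[str], dict] = fetch_status,
                      interval_s: float = 2.0,
                      max_attempts: int = 30) -> dict:
    """Poll until a terminal status appears or the attempts run out."""
    terminal = {"completed", "failed"}  # assumed status values
    for _ in range(max_attempts):
        payload = fetch(document_id)
        if payload.get("status") in terminal:
            return payload
        time.sleep(interval_s)
    raise TimeoutError(f"document {document_id} is still processing")
```

With the stack running, `wait_for_document("<document id>")` would block until processing ends or roughly a minute has passed with the defaults shown.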

Health/docs (served from the server root, not under /api/v1)

GET /health
GET /docs

8) User guide (how to use)

A. Start the platform (Docker, recommended)

Go to the project root (the directory that contains docker-compose.yml), for example:
- cd /home/hariswar/Documents/TextFinOps
Ensure .env exists; if it does not, copy .env.example to .env and fill in the values.
Start services:
- docker compose up -d --build
Open:
- Frontend: http://localhost:5173
- API docs: http://localhost:8000/docs

B. Daily usage flow

Open dashboard and create an inward entry.
Upload invoice and weighbridge files.
Wait for processing status to complete.
Open inward detail page and review extracted values.
Verify reconciliation output (shortage, allowable loss, debit qty/amount).
Update/override received weight if needed and trigger reconcile.
Download source documents if required for audit.

C. Stop services

docker compose down

D. View logs when troubleshooting

docker compose logs --tail=200 web
docker compose logs --tail=200 worker
docker compose logs --tail=200 frontend

9) Configuration

Important environment variables:

DATABASE_URL → async Postgres URL used by backend/worker.
REDIS_URL / CELERY_BROKER_URL / CELERY_RESULT_BACKEND → queue infra.
UPLOAD_DIR / MAX_UPLOAD_SIZE_MB → document storage rules.
GOOGLE_API_KEY / OPENAI_API_KEY / ANTHROPIC_API_KEY → optional LLM provider key.

Use only one provider key unless your parser logic explicitly supports fallback order.
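
One way to enforce the single-key rule at startup is a small check over the environment. This is a sketch, not the project's configuration code: `select_provider` and the provider labels are hypothetical, and treating "no key set" as regex-only parsing is an assumption based on the keys being optional.

```python
import os
from typing import Mapping, Optional

# Variable names come from the list above; the provider labels are illustrative.
PROVIDER_KEYS = {
    "GOOGLE_API_KEY": "google",
    "OPENAI_API_KEY": "openai",
    "ANTHROPIC_API_KEY": "anthropic",
}


def select_provider(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the single configured provider, or None if no key is set."""
    configured = [
        name for var, name in PROVIDER_KEYS.items() if env.get(var, "").strip()
    ]
    if len(configured) > 1:
        raise ValueError(
            f"multiple LLM provider keys set: {', '.join(configured)}"
        )
    return configured[0] if configured else None
```

Failing fast on an ambiguous configuration avoids silently sending documents to a provider the operator did not intend to use.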

10) Security and operational notes

Do not commit real API keys in .env.
Prefer secrets management in production.
Restrict CORS origins for non-dev environments.
Add authentication/authorization before production usage.
Add rate limits and file scanning for upload hardening.
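
As a first layer of upload hardening, the extension and size checks can be paired with a look at the file's leading bytes. The sketch below is illustrative only: `find_upload_problems`, the allow-list of types, and the 10 MB default are assumptions, and a signature check complements rather than replaces real malware scanning.

```python
from pathlib import Path

# Assumed allow-list: extension -> leading "magic" bytes for that format.
ALLOWED_SIGNATURES = {
    ".pdf": (b"%PDF-",),
    ".png": (b"\x89PNG\r\n\x1a\n",),
    ".jpg": (b"\xff\xd8\xff",),
    ".jpeg": (b"\xff\xd8\xff",),
}


def find_upload_problems(filename: str, content: bytes,
                         max_size_mb: int = 10) -> list[str]:
    """Return a list of human-readable problems; an empty list means accepted."""
    problems = []
    suffix = Path(filename).suffix.lower()
    signatures = ALLOWED_SIGNATURES.get(suffix)
    if signatures is None:
        problems.append(f"unsupported file type: {suffix or '(none)'}")
    elif not content.startswith(signatures):
        # The extension claims one format but the bytes say otherwise.
        problems.append("content does not match the file extension")
    if len(content) > max_size_mb * 1024 * 1024:
        problems.append(f"file exceeds {max_size_mb} MB")
    return problems
```

For example, a PDF renamed to `slip.png` is rejected because its first bytes are `%PDF-` rather than the PNG signature.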

11) Current limitations and next steps

Parsing covers a limited set of document templates; new layouts may need parser tuning.
OCR quality depends on scan quality.
No auth yet (suitable for controlled internal environments only).
Consider adding:
- role-based access control,
- retry/monitoring dashboards for Celery,
- model/parse confidence review queue,
- automated test coverage for parsing/reconciliation.

12) Quick verification checklist

After startup, verify:

http://localhost:8000/health returns healthy JSON.
http://localhost:8000/docs loads API docs.
http://localhost:5173 loads dashboard.
Uploading one sample file creates a document and processing status endpoint responds.

If all pass, TexFinOps is running correctly.
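
The first three checks can be scripted. This is a minimal sketch using only the standard library; `run_checks` and `http_ok` are hypothetical helper names, and "healthy" is reduced here to an HTTP 200 response because the exact health payload is not documented in this README. The upload check is left manual.

```python
import urllib.request
from typing import Callable

# URLs taken from the checklist above.
CHECKS = {
    "health": "http://localhost:8000/health",
    "api docs": "http://localhost:8000/docs",
    "dashboard": "http://localhost:5173",
}


def http_ok(url: str) -> bool:
    """True if the URL answers with HTTP 200 within five seconds."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except OSError:  # URLError, HTTPError and socket timeouts are OSError subclasses
        return False


def run_checks(probe: Callable[[str], bool] = http_ok) -> dict:
    """Probe each URL and report pass/fail per check name."""
    return {name: probe(url) for name, url in CHECKS.items()}
```

With the services up, calling `run_checks()` should return `True` for every entry; any `False` points to the service whose logs are worth checking first.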
