feat: Add research-paper-analyzer kit #163
base: main
Changes from 11 commits
cb94160
e543d63
305f4ce
064f105
63e14cd
b628928
e27509c
92fc63c
4c5e7ea
a594e90
917a247
bdf3c80
45febd5
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| .env | ||
| .env.local | ||
| apps/node_modules | ||
| apps/.next | ||
| apps/.vercel |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,127 @@ | ||
| # Research Paper Analyzer | ||
|
|
||
| An AI-powered kit that takes any academic PDF URL and returns a structured breakdown: | ||
|
|
||
| - **Problem Statement** — what the research is trying to solve and why it matters | ||
| - **Methodology** — how the study was conducted | ||
| - **Key Findings** — the main results and conclusions | ||
| - **Limitations** — acknowledged weaknesses or gaps | ||
| - **Plain English Summary** — jargon-free explanation for non-specialists | ||
| - **Follow-up Questions** — ideas for future research directions | ||
|
|
||
| Built on [Lamatic.ai](https://lamatic.ai) with a **FastAPI** backend and a **React + Vite** frontend (JavaScript/JSX). | ||
|
|
||
| --- | ||
|
|
||
| ## Architecture | ||
|
|
||
| ``` | ||
| React (Vite/JSX) → FastAPI (Python) → Lamatic Flow → LLM | ||
| frontend backend orchestration | ||
| localhost:5173 localhost:8000 | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## Quick Start | ||
|
|
||
| ### 1. Deploy the Flow in Lamatic Studio | ||
|
|
||
| 1. Go to [studio.lamatic.ai](https://studio.lamatic.ai) → New Project → New Flow | ||
| 2. Add nodes in this order: | ||
| - **API Trigger** — input schema: `{ pdf_url: string }` | ||
| - **Extract From File** — file URL: `{{trigger.pdf_url}}` | ||
| - **LLM Node** — use prompt from `prompts/analyze-paper.md`, structured JSON output | ||
| - **API Response** — output: `{{LLMNode.output}}` | ||
|
|
||
| 3. Deploy the flow and copy the **Flow ID** from Settings | ||
|
|
||
| ### 2. Start the FastAPI Backend | ||
|
|
||
| ```bash | ||
| cd apps/backend | ||
| cp .env.example .env | ||
| # Fill in your Flow ID and Lamatic credentials in .env | ||
| pip install -r requirements.txt | ||
| uvicorn main:app --reload | ||
| ``` | ||
|
|
||
| Backend runs at [http://localhost:8000](http://localhost:8000). | ||
| Check [http://localhost:8000/docs](http://localhost:8000/docs) for the auto-generated API docs. | ||
|
|
||
| `.env` values to fill in: | ||
| ``` | ||
| RESEARCH_PAPER_ANALYZER_FLOW_ID=<your-flow-id> | ||
| LAMATIC_API_URL=<from Lamatic Settings> | ||
| LAMATIC_PROJECT_ID=<from Lamatic Settings> | ||
| LAMATIC_API_KEY=<from Lamatic Settings> | ||
| ``` | ||
|
|
||
| ### 3. Start the React Frontend | ||
|
|
||
| ```bash | ||
| cd apps/frontend | ||
| npm install | ||
| npm run dev | ||
| ``` | ||
|
|
||
| Frontend runs at [http://localhost:5173](http://localhost:5173). | ||
| Vite proxies `/analyze` → FastAPI automatically, so no CORS issues in dev. | ||
|
|
||
| --- | ||
|
|
||
| ## Usage | ||
|
|
||
| 1. Paste any publicly accessible PDF URL (e.g., an arXiv paper) | ||
| 2. Click **Analyze Paper** | ||
| 3. Browse the structured breakdown — expand/collapse each section | ||
| 4. Click **JSON** to copy the raw JSON output | ||
|
|
||
| ### Example URLs to try | ||
|
|
||
| - `https://arxiv.org/pdf/2303.08774.pdf` — GPT-4 Technical Report | ||
| - `https://arxiv.org/pdf/1706.03762.pdf` — Attention Is All You Need | ||
|
|
||
| --- | ||
|
|
||
| ## API Reference | ||
|
|
||
| ### `POST /analyze` | ||
|
|
||
| **Request:** | ||
| ```json | ||
| { "pdf_url": "https://arxiv.org/pdf/2303.08774.pdf" } | ||
| ``` | ||
|
|
||
| **Response:** | ||
| ```json | ||
| { | ||
| "success": true, | ||
| "data": { | ||
| "title": "...", | ||
| "authors": ["..."], | ||
| "year": 2023, | ||
| "problem_statement": "...", | ||
| "methodology": "...", | ||
| "key_findings": ["..."], | ||
| "limitations": ["..."], | ||
| "plain_english_summary": "...", | ||
| "follow_up_questions": ["..."] | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## Tech Stack | ||
|
|
||
| - **Backend**: FastAPI, Python, httpx, Pydantic, python-dotenv | ||
| - **Frontend**: React.js, Vite, JavaScript/JSX, Tailwind CSS, Lucide React | ||
| - **AI Orchestration**: Lamatic.ai flows | ||
|
|
||
| --- | ||
|
|
||
| ## Requirements | ||
|
|
||
| - Python 3.10+ | ||
| - Node.js 18+, npm 9+ | ||
| - Lamatic.ai account ([sign up free](https://lamatic.ai)) | ||
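The `POST /analyze` response documented in this README could be consumed from a script along the following lines. This is a minimal sketch under stated assumptions: `PaperAnalysis` and `parse_analyze_response` are hypothetical names that do not appear in the PR, and the field list is taken from the README's example response. It uses only the standard library and assumes `data` arrives as a JSON object rather than a JSON-encoded string.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class PaperAnalysis:
    """Mirrors the `data` object shown in the README's example response."""
    title: str
    authors: list[str]
    year: Optional[int]
    problem_statement: str
    methodology: str
    key_findings: list[str]
    limitations: list[str]
    plain_english_summary: str
    follow_up_questions: list[str]


def parse_analyze_response(raw: str) -> PaperAnalysis:
    """Parse the JSON body returned by POST /analyze (hypothetical helper)."""
    body = json.loads(raw)
    if body.get("success") is not True:
        raise ValueError("Response does not report success.")
    data = body.get("data")
    if not isinstance(data, dict):
        raise ValueError("Response has no 'data' object.")
    names = [f.name for f in fields(PaperAnalysis)]
    missing = [n for n in names if n not in data]
    if missing:
        raise ValueError(f"Missing fields: {', '.join(missing)}")
    # Extra keys in `data` are ignored; only the documented fields are kept.
    return PaperAnalysis(**{n: data[n] for n in names})
```

The helper checks only that each documented field is present, not its type, so a caller that needs stricter guarantees would have to add its own checks.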
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,44 @@ | ||
| # Research Paper Analyzer Agent | ||
|
|
||
| ## Identity | ||
|
|
||
| You are an expert academic research analyst. Your role is to read scientific papers and produce clear, structured analyses that help researchers, students, and professionals quickly understand complex academic work. | ||
|
|
||
| ## Capabilities | ||
|
|
||
| - Extract and articulate the core research problem and motivation | ||
| - Identify and explain the methodology used | ||
| - Summarize key findings and results objectively | ||
| - Surface limitations and potential gaps in the research | ||
| - Translate academic language into plain English for non-specialists | ||
| - Generate thoughtful follow-up research questions | ||
|
|
||
| ## Behavior Guidelines | ||
|
|
||
| - Always base your analysis strictly on the paper content — do not hallucinate facts | ||
| - Be objective; do not editorialize beyond what the paper states | ||
| - If a section of the paper is unclear or missing, state that explicitly | ||
| - Keep the plain-English summary accessible to a smart non-expert | ||
| - Format all output as structured JSON matching the defined schema | ||
|
|
||
| ## Output Schema | ||
|
|
||
|
|
||
| ```json | ||
| { | ||
| "title": "string", | ||
| "authors": ["string"], | ||
| "year": "number | null", | ||
| "problem_statement": "string", | ||
| "methodology": "string", | ||
| "key_findings": ["string"], | ||
| "limitations": ["string"], | ||
| "plain_english_summary": "string", | ||
| "follow_up_questions": ["string"] | ||
| } | ||
| ``` | ||
|
|
||
| ## Constraints | ||
|
|
||
| - Never invent citations, statistics, or claims not present in the paper | ||
| - Refuse requests to misrepresent or plagiarize research | ||
| - If the uploaded file is not an academic paper, respond with a clear error message | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,4 @@ | ||
| RESEARCH_PAPER_ANALYZER_FLOW_ID=your_flow_id_here | ||
| LAMATIC_API_URL=https://your-lamatic-endpoint.lamatic.ai | ||
| LAMATIC_PROJECT_ID=your_project_id_here | ||
| LAMATIC_API_KEY=your_api_key_here |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| .env | ||
| __pycache__/ | ||
| *.py[cod] | ||
| *.pyo | ||
| .venv/ | ||
| venv/ | ||
| env/ |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,128 @@ | ||
| from fastapi import FastAPI, HTTPException | ||
| from fastapi.middleware.cors import CORSMiddleware | ||
| from pydantic import BaseModel, field_validator | ||
| from urllib.parse import urlparse | ||
| import ipaddress | ||
| import httpx | ||
| import os | ||
| from dotenv import load_dotenv | ||
|
|
||
| load_dotenv() | ||
|
|
||
| LAMATIC_API_URL = os.getenv("LAMATIC_API_URL", "").rstrip("/") | ||
| LAMATIC_PROJECT_ID = os.getenv("LAMATIC_PROJECT_ID", "") | ||
| LAMATIC_API_KEY = os.getenv("LAMATIC_API_KEY", "") | ||
| FLOW_ID = os.getenv("RESEARCH_PAPER_ANALYZER_FLOW_ID", "") | ||
|
|
||
| # Lamatic uses GraphQL — single POST endpoint per project | ||
| EXECUTE_QUERY = """ | ||
| query ExecuteWorkflow($workflowId: String!, $payload: JSON) { | ||
| executeWorkflow(workflowId: $workflowId, payload: $payload) { | ||
| status | ||
| result | ||
| } | ||
| } | ||
| """ | ||
|
|
||
| app = FastAPI(title="Research Paper Analyzer", version="1.0.0") | ||
|
|
||
| app.add_middleware( | ||
| CORSMiddleware, | ||
| allow_origins=["http://localhost:5173", "http://localhost:3000"], | ||
| allow_methods=["POST", "GET"], | ||
| allow_headers=["*"], | ||
| ) | ||
|
|
||
|
|
||
|
|
||
| _BLOCKED_RANGES = [ | ||
| ipaddress.ip_network("10.0.0.0/8"), | ||
| ipaddress.ip_network("172.16.0.0/12"), | ||
| ipaddress.ip_network("192.168.0.0/16"), | ||
| ipaddress.ip_network("127.0.0.0/8"), | ||
| ipaddress.ip_network("169.254.0.0/16"), # link-local / cloud IMDS | ||
| ipaddress.ip_network("0.0.0.0/8"), | ||
| ipaddress.ip_network("::1/128"), | ||
| ipaddress.ip_network("fc00::/7"), | ||
| ] | ||
|
|
||
|
|
||
| class AnalyzeRequest(BaseModel): | ||
| pdf_url: str | ||
|
|
||
| @field_validator("pdf_url") | ||
| @classmethod | ||
| def validate_pdf_url(cls, v: str) -> str: | ||
| parsed = urlparse(v) | ||
| if parsed.scheme not in {"https"}: | ||
| raise ValueError("Only HTTPS URLs are accepted.") | ||
| hostname = parsed.hostname or "" | ||
| if not hostname: | ||
| raise ValueError("URL must contain a valid hostname.") | ||
| try: | ||
| ip = ipaddress.ip_address(hostname) | ||
| if any(ip in net for net in _BLOCKED_RANGES): | ||
| raise ValueError("URL resolves to a private or reserved address.") | ||
| except ValueError as exc: | ||
| if "private or reserved" in str(exc): | ||
| raise | ||
| # hostname is a domain name — pass through | ||
| return v | ||
|
Contributor
Mission-critical: the private-network guard is bypassable through hostnames. This validator only rejects literal IP hosts.
🧰 Tools
🪛 Ruff (0.15.15)
[warning] 57-57: Avoid specifying long messages outside the exception class (TRY003)
[warning] 60-60: Avoid specifying long messages outside the exception class (TRY003)
[warning] 64-64: Abstract `raise` to an inner function (TRY301)
[warning] 64-64: Avoid specifying long messages outside the exception class (TRY003)
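The hostname bypass flagged in this comment could be narrowed by resolving the host and checking every returned address, not only literal IPs. The following is a minimal sketch, not the PR's implementation: `is_blocked`, `resolve_host`, and the injectable `resolver` parameter are hypothetical names introduced here, and the blocked ranges are copied from `_BLOCKED_RANGES` in the diff. It assumes a blocking `socket.getaddrinfo` lookup is acceptable at validation time.

```python
import ipaddress
import socket
from urllib.parse import urlparse

# Same ranges as _BLOCKED_RANGES in the diff.
_BLOCKED_RANGES = [ipaddress.ip_network(n) for n in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/8",
    "169.254.0.0/16", "0.0.0.0/8", "::1/128", "fc00::/7",
)]


def is_blocked(ip) -> bool:
    """True if the address falls in a blocked range (hypothetical helper)."""
    # Unwrap IPv4-mapped IPv6 addresses such as ::ffff:127.0.0.1.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    return any(ip in net for net in _BLOCKED_RANGES)


def resolve_host(hostname: str) -> list[str]:
    """Return every address the system resolver reports for the host."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    return [info[4][0] for info in infos]


def validate_pdf_url(url: str, resolver=resolve_host) -> str:
    parsed = urlparse(url)
    if parsed.scheme != "https":
        raise ValueError("Only HTTPS URLs are accepted.")
    hostname = parsed.hostname
    if not hostname:
        raise ValueError("URL must contain a valid hostname.")
    try:
        # Literal IP host: check it directly.
        addresses = [ipaddress.ip_address(hostname)]
    except ValueError:
        # Domain name: resolve it and check every address returned.
        try:
            addresses = [ipaddress.ip_address(a) for a in resolver(hostname)]
        except OSError as exc:
            raise ValueError("Hostname could not be resolved.") from exc
    if not addresses or any(is_blocked(ip) for ip in addresses):
        raise ValueError("URL resolves to a private or reserved address.")
    return url
```

Because the PDF appears to be fetched by the Lamatic flow rather than by this backend, a check like this is best-effort: the address seen at validation time can differ from the one resolved at fetch time (DNS rebinding), so it reduces the exposure rather than eliminating it.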
||
|
|
||
|
|
||
| @app.get("/health") | ||
| def health(): | ||
| return {"status": "ok"} | ||
|
|
||
|
|
||
| @app.post("/analyze") | ||
| async def analyze_paper(req: AnalyzeRequest): | ||
| if not FLOW_ID: | ||
| raise HTTPException(500, "RESEARCH_PAPER_ANALYZER_FLOW_ID is not set.") | ||
| if not LAMATIC_API_URL or not LAMATIC_API_KEY or not LAMATIC_PROJECT_ID: | ||
| raise HTTPException(500, "Lamatic API credentials are not set.") | ||
|
|
||
| headers = { | ||
| "Authorization": f"Bearer {LAMATIC_API_KEY}", | ||
| "x-project-id": LAMATIC_PROJECT_ID, | ||
| "Content-Type": "application/json", | ||
| } | ||
|
|
||
| body = { | ||
| "query": EXECUTE_QUERY, | ||
| "variables": { | ||
| "workflowId": FLOW_ID, | ||
| "payload": {"pdf_url": req.pdf_url}, | ||
| }, | ||
| } | ||
|
|
||
| try: | ||
| async with httpx.AsyncClient(timeout=120.0) as client: | ||
| response = await client.post(LAMATIC_API_URL, headers=headers, json=body) | ||
|
|
||
| if response.status_code == 401: | ||
| raise HTTPException(401, "Invalid Lamatic API key or project ID.") | ||
| if response.status_code != 200: | ||
| raise HTTPException(response.status_code, f"Lamatic API error: {response.text}") | ||
|
|
||
| data = response.json() | ||
|
|
||
| # GraphQL errors surface inside data.errors | ||
| if "errors" in data: | ||
| raise HTTPException(500, f"Flow error: {data['errors'][0].get('message', 'unknown')}") | ||
|
|
||
| execute_result = data.get("data", {}).get("executeWorkflow", {}) | ||
|
|
||
| if execute_result.get("status") != "success": | ||
| raise HTTPException(500, f"Flow did not succeed: {execute_result.get('status')}") | ||
|
|
||
| analysis = execute_result.get("result") | ||
|
|
||
| if not analysis: | ||
| raise HTTPException(500, "No analysis returned by the flow.") | ||
|
|
||
| return {"success": True, "data": analysis} | ||
|
|
||
| except httpx.TimeoutException: | ||
| raise HTTPException(504, "Request timed out. The PDF may be too large.") | ||
| except httpx.RequestError as e: | ||
| raise HTTPException(503, f"Network error: {str(e)}") | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| fastapi>=0.111.0 | ||
| uvicorn[standard]>=0.29.0 | ||
| httpx>=0.27.0 | ||
| python-dotenv>=1.0.0 | ||
| pydantic>=2.0.0 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| # Leave empty when using Vite's built-in proxy (dev mode) | ||
| # Set to your deployed FastAPI URL for production builds | ||
| VITE_BACKEND_URL= |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,4 @@ | ||
| node_modules/ | ||
| dist/ | ||
| .env | ||
| .env.local |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,12 @@ | ||
| <!DOCTYPE html> | ||
| <html lang="en"> | ||
| <head> | ||
| <meta charset="UTF-8" /> | ||
| <meta name="viewport" content="width=device-width, initial-scale=1.0" /> | ||
| <title>Research Paper Analyzer · Lamatic.ai</title> | ||
| </head> | ||
| <body> | ||
| <div id="root"></div> | ||
| <script type="module" src="/src/main.jsx"></script> | ||
| </body> | ||
| </html> |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,23 @@ | ||
| { | ||
| "name": "research-paper-analyzer-frontend", | ||
| "version": "0.1.0", | ||
| "private": true, | ||
| "type": "module", | ||
| "scripts": { | ||
| "dev": "vite", | ||
| "build": "vite build", | ||
| "preview": "vite preview" | ||
| }, | ||
| "dependencies": { | ||
| "lucide-react": "^0.469.0", | ||
| "react": "^18.3.1", | ||
| "react-dom": "^18.3.1" | ||
| }, | ||
| "devDependencies": { | ||
| "@vitejs/plugin-react": "^4.3.1", | ||
| "autoprefixer": "^10.4.19", | ||
| "postcss": "^8.4.38", | ||
| "tailwindcss": "^3.4.4", | ||
|
Contributor
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate and print the relevant package.json with line numbers
FILE="kits/research-paper-analyzer/apps/frontend/package.json"
echo "---- $FILE (tailwindcss line vicinity) ----"
nl -ba "$FILE" | sed -n '1,120p' | sed -n '15,40p'
# Search for tailwind config in this app/kit
echo "---- tailwind config files ----"
find kits/research-paper-analyzer -maxdepth 6 -type f \( -name "tailwind.config.*" -o -name "postcss.config.*" \) | sort
# Search for Tailwind v3/v4 indicators in config
echo "---- tailwind config contents (if present) ----"
for f in $(find kits/research-paper-analyzer -maxdepth 6 -type f -name "tailwind.config.*" 2>/dev/null | sort); do
echo "### $f"
nl -ba "$f" | head -n 220 | sed -n '1,120p'
done
# List CSS files within this app that would trigger the kits/**/*.css guideline
echo "---- CSS files in this app (for tailwind requirement applicability) ----"
find kits/research-paper-analyzer/apps/frontend -type f \( -name "*.css" -o -name "*.scss" -o -name "*.sass" -o -name "*.less" \) | sort
# Show any reference to tailwind directives in CSS (e.g., `@tailwind`)
echo "---- Tailwind directives in CSS ----"
rg -n "@tailwind|tailwindcss" kits/research-paper-analyzer/apps/frontend --glob='*.{css,scss,less,sass,ts,tsx}' || true
Repository: Lamatic/AgentKit
Length of output: 215
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="kits/research-paper-analyzer/apps/frontend/package.json"
echo "---- JSON tailwindcss dependency in $FILE ----"
python3 - <<'PY'
import json
from pathlib import Path
p=Path("kits/research-paper-analyzer/apps/frontend/package.json")
data=json.loads(p.read_text())
for section in ["dependencies","devDependencies"]:
    if section in data and "tailwindcss" in data[section]:
        print(f"{section}.tailwindcss = {data[section]['tailwindcss']}")
PY
echo "---- package.json lines 1-60 (for context) ----"
cat -n "$FILE" | sed -n '1,60p'
echo "---- tailwind/postcss config files under kits/research-paper-analyzer/apps/frontend ----"
find kits/research-paper-analyzer/apps/frontend -maxdepth 4 -type f \( -name "tailwind.config.*" -o -name "postcss.config.*" \) | sort
echo "---- tailwind directives in CSS under apps/frontend ----"
rg -n "@tailwind|tailwindcss" kits/research-paper-analyzer/apps/frontend --glob='*.{css,scss,less,sass}' || true
echo "---- list CSS files under apps/frontend ----"
find kits/research-paper-analyzer/apps/frontend -type f \( -name "*.css" -o -name "*.scss" -o -name "*.sass" -o -name "*.less" \) | sort
Repository: Lamatic/AgentKit
Length of output: 1498
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "---- tailwind.config.js ----"
cat -n kits/research-paper-analyzer/apps/frontend/tailwind.config.js | sed -n '1,200p'
echo "---- postcss.config.js ----"
cat -n kits/research-paper-analyzer/apps/frontend/postcss.config.js | sed -n '1,200p'
Repository: Lamatic/AgentKit
Length of output: 459
Mission risk: Tailwind v3 is powering this kit; upgrade to Tailwind CSS v4+.
||
| "vite": "^5.3.1" | ||
| } | ||
|
Contributor
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="kits/research-paper-analyzer/apps/frontend/package.json"
echo "== File =="
wc -l "$FILE"
echo
echo "== Relevant dependency block =="
# Print around dependencies/devDependencies
sed -n '1,120p' "$FILE" | nl -ba | sed -n '1,120p'
echo
echo "== Check for caret ranges in dependency sections =="
rg -n '"(dependencies|devDependencies)"\s*:' -S "$FILE"
echo
rg -n '\^' "$FILE" || true
Repository: Lamatic/AgentKit
Length of output: 225
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="kits/research-paper-analyzer/apps/frontend/package.json"
echo "== File =="
wc -l "$FILE"
echo
echo "== Show package.json =="
# Small file (23 lines); safe to print fully with line numbers via awk
awk '{print NR ":" $0}' "$FILE"
echo
echo "== Check dependency keys exist =="
rg -n '"dependencies"\s*:' "$FILE" || true
rg -n '"devDependencies"\s*:' "$FILE" || true
echo
echo "== Check for caret (^) semver ranges in the file =="
rg -n '\^' "$FILE" || true
Repository: Lamatic/AgentKit
Length of output: 1093
Mission brief: pin dependency versions.
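The caret search in the script above could also be expressed as a small check that lists every dependency whose version is a range rather than an exact version. This is a rough sketch under stated assumptions: `unpinned_dependencies` and `EXACT_VERSION` are hypothetical names, and "pinned" is taken here to mean a plain `MAJOR.MINOR.PATCH` string with optional pre-release or build suffix, which may be stricter or looser than the repository's actual policy.

```python
import json
import re

# Matches exact versions such as "18.3.1" or "1.0.0-beta.2".
# Range specifiers (^, ~, >=, *, x) do not match.
EXACT_VERSION = re.compile(
    r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)


def unpinned_dependencies(package_json_text: str) -> dict[str, str]:
    """Map 'section.name' to its spec for every dependency that is not exact."""
    manifest = json.loads(package_json_text)
    unpinned = {}
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if not EXACT_VERSION.match(spec):
                unpinned[f"{section}.{name}"] = spec
    return unpinned
```

Run against the `package.json` in this diff, every entry would be reported, since all of them use caret ranges.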
||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| export default { | ||
| plugins: { | ||
| tailwindcss: {}, | ||
| autoprefixer: {}, | ||
|
Contributor
🧩 Analysis chain
🌐 Web query:
💡 Result: Tailwind CSS v4 (official) PostCSS setup 1) Install packages - Install tailwindcss, the dedicated PostCSS plugin Citations:
Mission: Replace Tailwind v3 PostCSS wiring with Tailwind v4’s official setup This
Align this kit’s Tailwind toolchain to v4 to avoid build/lint drift against the Tailwind v4+ kit styling requirement.
||
| }, | ||
| }; | ||
Mission brief: neutralize markdownlint blockers before merge.
At Line 18 and Line 52, fenced blocks are missing language tags (MD040). Around Line 52, Line 91, and Line 96, add blank lines around fenced blocks (MD031) to satisfy the validator.
Proposed patch
`.env` values to fill in: replace the untagged opening fence with a blank line followed by a fence tagged `env`; the four variable lines are unchanged.
**Response:** add a blank line before the opening JSON fence.
Also applies to: 52-57, 91-93, 96-111
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 18-18: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
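The MD040 finding above (fenced blocks without a language tag) could be reproduced locally with a short script. This is a rough approximation written for illustration, not markdownlint's own logic: `fences_missing_language` is a hypothetical name, and the sketch handles only top-level backtick or tilde fences indented by at most three spaces.

```python
import re

# An opening or closing fence: up to three spaces, then 3+ backticks or tildes.
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})(.*)$")


def fences_missing_language(markdown: str) -> list[int]:
    """Return 1-based line numbers of opening fences with no info string."""
    missing = []
    open_marker = None  # fence marker of the block currently open, if any
    for number, line in enumerate(markdown.splitlines(), start=1):
        match = FENCE.match(line)
        if not match:
            continue
        marker, info = match.group(1), match.group(2).strip()
        if open_marker is None:
            # Opening fence: record it and flag it if it has no language.
            open_marker = marker
            if not info:
                missing.append(number)
        elif (marker[0] == open_marker[0]
              and len(marker) >= len(open_marker)
              and not info):
            # Closing fence: same character, at least as long, no info string.
            open_marker = None
    return missing
```

Applied to the README in this diff, a check of this kind would be expected to flag the architecture diagram and the `.env` block, matching the lines the comment names.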