OCR
2 posts tagged OCR.
Apr 26, 2026
Building a Local LLM-Powered Hybrid OCR Engine
How I built a privacy-first, fully offline OCR pipeline that pairs Surya's layout detection with local Vision Language Models (OlmOCR, GLM-OCR, Qwen3-VL) and a Needleman-Wunsch aligner — turning handwriting, forms, and scanned PDFs into pixel-perfect searchable documents on your own laptop.
OCR, LLM, Vision Language Models
12 min read

Feb 25, 2026
From DOI to Markdown: A Two-Repo Pipeline for Faster Research
How DOI Paper Scraper and Papers-to-Markdown work together to convert academic papers into structured Markdown for better reading, search, and research productivity.
Research Productivity, Web Scraping, OCR
4 min read
