OCRmyPDF

OCRmyPDF (ocrmypdf), created 11 years ago, last updated 1 day ago

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Python 65.34MB MPL-2.0 GitHub
Stars: 29.3k
Forks: 2k
Watchers: 188
Open Issues: 129

kkFileView

12.7k Java

Universal File Online Preview Project based on Spring-Boot

1 · created 7 years ago · updated 1 month ago

OpenBB

36.6k Python NOASSERTION

Investment Research for Everyone, Everywhere.

1 · created 4 years ago · updated 3 months ago

markitdown

58k Python MIT

Python tool for converting files and office documents to Markdown.

1 · created 7 months ago · updated 54 minutes ago

OCRmyPDF

29.3k Python MPL-2.0

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

1 · created 11 years ago · updated 1 day ago

browser-use

61.2k Python MIT

🌐 Make websites accessible for AI agents. Automate tasks online with ease.

1 · created 7 months ago · updated 19 days ago

instructor

10.4k Python MIT

structured outputs for llms

1 · created 1 year ago · updated 1 month ago

kokoro-onnx

2k Python MIT

TTS with kokoro and onnx runtime

1 · created 5 months ago · updated 4 days ago

Jobs_Applier_AI_Agent_AIHawk

28.3k Python AGPL-3.0

AIHawk aims to ease the job hunt by automating the job application process. Using artificial intelligence, it enables users to apply for multiple jobs in a tailored way.

1 · created 10 months ago · updated 4 days ago

MoneyPrinterTurbo

19.9k Python MIT

Generate high-definition short videos with one click using AI large language models.

1 · created 1 year ago · updated 4 months ago

tensorflow

190.2k C++ Apache-2.0

An Open Source Machine Learning Framework for Everyone

1 · created 9 years ago · updated 7 days ago

PDFMathTranslate

24.6k Python AGPL-3.0

PDF scientific paper translation with preserved formats: AI-based full-text bilingual translation of PDF documents that fully preserves the layout, supports services such as Google/DeepL/Ollama/OpenAI, and provides CLI/GUI/MCP/Docker/Zotero

1 · created 9 months ago · updated 1 day ago

LLMs-from-scratch

44.3k Jupyter Notebook NOASSERTION

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

1 · created 1 year ago · updated 1 month ago

ChatTTS

36.7k Python AGPL-3.0

A generative speech model for daily dialogue.

1 · created 1 year ago · updated 1 day ago

dify

87.8k TypeScript NOASSERTION

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.

1 · created 2 years ago · updated 2 months ago

paperless-ngx

27.8k Python GPL-3.0

A community-supported supercharged version of paperless: scan, index and archive all your physical documents

1 · created 3 years ago · updated 9 days ago

MinerU

34.9k Python AGPL-3.0

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDFs to Markdown and JSON formats.

1 · created 1 year ago · updated 1 day ago

yt-dlp

113k Python Unlicense

A feature-rich command-line audio/video downloader

1 · created 4 years ago · updated 9 days ago

fastapi

85.8k Python MIT

FastAPI framework, high performance, easy to learn, fast to code, ready for production

1 · created 6 years ago · updated 9 days ago