Image to Text Automation (OCR + Desktop Automation)

A Python-based desktop automation system that extracts text from an image using OCR and automatically writes and saves the result into Notepad.

This project combines:

Computer Vision (OCR)
Text reconstruction using spatial logic
Desktop GUI automation
Window management

Features

Extracts text from images using EasyOCR
Reconstructs text lines based on bounding box coordinates
Automatically opens Notepad
Simulates human typing
Saves the output as a .txt file
Prevents overwriting existing files (auto-increment naming)
Closes unexpected popup windows during execution
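
The auto-increment naming mentioned above could look roughly like the following. This is a minimal sketch, not the project's actual code: the function name `next_available_path`, the default stem `output`, and the `_1`, `_2` suffix scheme are all assumptions made for illustration.

```python
from pathlib import Path


def next_available_path(directory: Path, stem: str = "output", suffix: str = ".txt") -> Path:
    """Return a path inside `directory` that does not exist yet.

    Hypothetical helper: tries output.txt first, then output_1.txt,
    output_2.txt, and so on, so an existing file is never overwritten.
    """
    # Create the output directory if it is missing.
    directory.mkdir(parents=True, exist_ok=True)

    candidate = directory / f"{stem}{suffix}"
    counter = 1
    # Keep bumping the counter until an unused name is found.
    while candidate.exists():
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate
```

Calling `next_available_path(Path("Output"))` before each save would then give a fresh file name for every run.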

How It Works

The system follows this pipeline:

Load the OCR model
Detect and recognize text in the image
Sort detected words vertically (top → bottom)
Group words into lines using a Y-distance threshold
Sort words horizontally inside each line (left → right)
Open Notepad automatically
Type the extracted text
Save as a .txt file in the output directory
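
The sorting and grouping steps of the pipeline can be sketched as below. This is an illustrative version under stated assumptions, not the code in main.py: the function name `reconstruct_lines` and the default threshold of 15 pixels are hypothetical, and each new word is compared against the previous word added to the current line. The input is assumed to have the shape returned by EasyOCR's `readtext()`, a list of `(bounding_box, text, confidence)` tuples where the bounding box is four `[x, y]` corner points.

```python
from typing import List, Sequence, Tuple

# One OCR detection: (four [x, y] corner points, recognized text, confidence).
Detection = Tuple[Sequence[Sequence[float]], str, float]


def reconstruct_lines(detections: List[Detection], y_threshold: float = 15.0) -> List[str]:
    """Rebuild text lines from detections using bounding box coordinates.

    y_threshold is an assumed value in pixels; tune it to the image's font size.
    """
    def top(det: Detection) -> float:
        # Smallest y among the four corners (top edge of the box).
        return min(point[1] for point in det[0])

    def left(det: Detection) -> float:
        # Smallest x among the four corners (left edge of the box).
        return min(point[0] for point in det[0])

    # Sort all detections top to bottom.
    ordered = sorted(detections, key=top)

    # Start a new line whenever the vertical gap to the previous
    # detection exceeds the threshold.
    lines: List[List[Detection]] = []
    for det in ordered:
        if lines and abs(top(det) - top(lines[-1][-1])) <= y_threshold:
            lines[-1].append(det)
        else:
            lines.append([det])

    # Sort each line left to right and join the words.
    return [" ".join(d[1] for d in sorted(line, key=left)) for line in lines]
```

With EasyOCR installed, the detections would typically come from something like `easyocr.Reader(['en']).readtext('img.png')`, and the returned lines can be joined with newlines before being typed into Notepad.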

Install dependencies:

pip install easyocr pyautogui pygetwindow

Technologies Used

EasyOCR (Deep Learning OCR)
PyAutoGUI (Desktop Automation)
PyGetWindow (Window Management)
Python Pathlib (Modern File Handling)
