🚀 DRISHTI: AI VISION FOR THE BLIND

🎓 VIT Bhopal College Project Exhibition - Group 36

📱 Project Overview

DRISHTI is an Android application that helps visually impaired users understand and move through their surroundings. It analyzes the device camera feed with Google's Gemini AI and speaks the results aloud in real time. The project was built at VIT Bhopal for the College Project Exhibition and combines native Android development with AI integration.

🌟 Key Features

1️⃣ Navigation Mode 🧭

Real-time Camera Analysis: Continuous environmental monitoring using device camera
AI-Powered Object Detection: Identifies obstacles, people, vehicles, and environmental hazards
Voice Feedback: Instant audio descriptions of surroundings for safe navigation
Smart Frame Processing: Optimized 3-second intervals for responsive performance
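
The 3-second frame interval above can be sketched as a simple time gate. This is a minimal illustration, not the project's actual code: the class name `FrameThrottle` and its method are hypothetical, and the caller is assumed to pass in the current time in milliseconds for each camera frame.

```kotlin
// Hypothetical helper: decides whether a camera frame should be sent for analysis.
class FrameThrottle(private val intervalMs: Long = 3_000L) {
    // Timestamp of the last accepted frame; null until the first frame arrives.
    private var lastAcceptedMs: Long? = null

    /** Returns true if at least [intervalMs] has elapsed since the last accepted frame. */
    fun shouldProcess(nowMs: Long): Boolean {
        val last = lastAcceptedMs
        if (last != null && nowMs - last < intervalMs) return false
        lastAcceptedMs = nowMs
        return true
    }
}
```

Frames that arrive between accepted ones are simply dropped, so the camera can keep running at its normal rate while only one frame per interval reaches the AI service.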

2️⃣ Assistant Mode 🤖

Interactive Voice Commands: Ask questions about your environment in natural language
Context-Aware Responses: AI understands spatial context and provides relevant information
Environmental Queries: "What color is the car?", "How is the weather?", "Is there a person nearby?"
General Knowledge: Access information about people, objects, and concepts

3️⃣ Voice Mode 📖

Text Recognition: Reads signs, documents, books, and any printed text
Optical Character Recognition (OCR): Extracts text from the camera image so it can be read aloud
High-Accuracy Reading: Powered by Google's Gemini AI for reliable text interpretation
Instant Audio Output: Real-time text-to-speech conversion

🛠️ Technical Specifications

Platform & Architecture

Framework: Android Native (Kotlin)
UI Framework: Jetpack Compose
Architecture: MVVM with Repository Pattern
Minimum SDK: Android 6.0 (API 23)
Target SDK: Android 14 (API 34)
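
To illustrate what MVVM with the Repository Pattern might look like here, the sketch below separates data access from UI state. All names (`SceneRepository`, `NavigationViewModel`, `FakeSceneRepository`) are hypothetical and written in plain Kotlin without Android classes, so it shows only the shape of the pattern, not the project's real classes.

```kotlin
// Repository: hides where scene descriptions come from (network, cache, fake).
interface SceneRepository {
    fun describe(frame: ByteArray): String
}

// ViewModel: holds UI state and notifies observers; knows nothing about the UI toolkit.
class NavigationViewModel(private val repository: SceneRepository) {
    var description: String = ""
        private set

    private val observers = mutableListOf<(String) -> Unit>()

    fun observe(observer: (String) -> Unit) {
        observers += observer
    }

    // Called for each frame chosen for analysis; updates state and notifies observers.
    fun onFrame(frame: ByteArray) {
        description = repository.describe(frame)
        observers.forEach { it(description) }
    }
}

// Fake repository for tests or previews; a real one would call the AI service.
class FakeSceneRepository : SceneRepository {
    override fun describe(frame: ByteArray): String = "frame of ${frame.size} bytes"
}
```

Because the view model depends only on the interface, the AI-backed repository can be swapped for the fake one in tests without touching UI code.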

AI Integration

Primary AI: Google Gemini AI API
Computer Vision: Real-time image analysis and processing
Natural Language Processing: Context-aware voice interactions
Text Recognition: Advanced OCR capabilities
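
As a rough sketch of how an image and a prompt might be packaged for Gemini, the function below builds a JSON body in the shape used by the public Gemini REST `generateContent` method (a `contents` list whose `parts` hold text and base64 `inline_data`). The function names are hypothetical, the JPEG MIME type is an assumption, and the project's own request code in `GeminiAPI.kt` may differ.

```kotlin
import java.util.Base64

// Escapes the characters that must not appear raw inside a JSON string literal.
fun jsonEscape(text: String): String = buildString {
    for (ch in text) {
        when (ch) {
            '"' -> append("\\\"")
            '\\' -> append("\\\\")
            '\n' -> append("\\n")
            '\r' -> append("\\r")
            '\t' -> append("\\t")
            else -> if (ch < ' ') append("\\u%04x".format(ch.code)) else append(ch)
        }
    }
}

// Builds a request body with one text part and one inline image part.
// Assumption: the image is JPEG-encoded; adjust mime_type otherwise.
fun buildGenerateContentBody(prompt: String, jpegBytes: ByteArray): String {
    val imageBase64 = Base64.getEncoder().encodeToString(jpegBytes)
    return """{"contents":[{"parts":[{"text":"${jsonEscape(prompt)}"},{"inline_data":{"mime_type":"image/jpeg","data":"$imageBase64"}}]}]}"""
}
```

In a real app a JSON library would usually replace the hand-written escaping; it is spelled out here only to keep the example free of dependencies.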

Performance Optimizations

Frame Interval: One frame processed every 3 seconds
Camera Resolution: 640x480 for optimal performance
Memory Management: Efficient image processing and cleanup
Speech Rate: 1.2x normal speed, chosen to balance clarity and speed

🚀 Getting Started

Prerequisites

Android Studio Arctic Fox or later
Android SDK 23+
Google Gemini AI API Key
Android device with camera and microphone

Installation Steps

Clone the Repository

git clone https://github.com/your-username/drishti-blind-assistant.git
cd drishti-blind-assistant

API Key Setup

- Visit Google AI Studio
- Generate your Gemini API key
- Add the key to the following files:
  - GeminiAPI.kt
  - GeminiAPI 1.kt
  - GeminiAPI 2.kt

Build & Run

- Open project in Android Studio
- Sync Gradle files
- Connect your Android device
- Click "Run" to install the app
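
The API key step above places the same key in three files. One possible alternative, shown purely as a sketch, is to resolve the key in a single place and have every caller read it from there. The object name `GeminiConfig` and the variable name `GEMINI_API_KEY` are assumptions, not part of the project; the map parameter stands in for whatever source the build actually uses.

```kotlin
// Hypothetical single source of truth for the Gemini API key.
object GeminiConfig {
    // Assumed variable name; not taken from the project.
    const val ENV_VAR = "GEMINI_API_KEY"

    /** Returns the trimmed key from [env], or fails with a clear message if it is missing or blank. */
    fun apiKey(env: Map<String, String> = System.getenv()): String {
        val key = env[ENV_VAR]?.trim().orEmpty()
        require(key.isNotEmpty()) { "Set $ENV_VAR before running" }
        return key
    }
}
```

Keeping the key out of source files also makes it harder to commit it to the repository by accident.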

🎯 Usage Instructions

Navigation Mode

Single Tap: Activate camera and start navigation
Voice Feedback: Automatic environmental descriptions every 3 seconds
Safety Alerts: Immediate warnings for obstacles and hazards

Assistant Mode

Double Tap: Switch to interactive assistant mode
Voice Commands: Ask questions about your surroundings
Natural Language: Use conversational queries for information

Voice Mode

Long Press: Activate text reading mode
Point Camera: Aim at text you want to read
Automatic Reading: Instant text-to-speech conversion
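
The gesture scheme described above (single tap, double tap, long press) amounts to a small lookup from gesture to mode. The sketch below uses hypothetical enum and function names and leaves out the Android gesture detection itself.

```kotlin
// Hypothetical types; the names are illustrative, not taken from the project.
enum class Gesture { SINGLE_TAP, DOUBLE_TAP, LONG_PRESS }
enum class Mode { NAVIGATION, ASSISTANT, VOICE }

// Maps each gesture to the mode it activates, as listed in the usage instructions.
fun modeFor(gesture: Gesture): Mode = when (gesture) {
    Gesture.SINGLE_TAP -> Mode.NAVIGATION
    Gesture.DOUBLE_TAP -> Mode.ASSISTANT
    Gesture.LONG_PRESS -> Mode.VOICE
}
```

Using an exhaustive `when` means the compiler flags any gesture added later that has no mode assigned.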

👥 Team Members - Group 36

Project Lead & Development

Name	Role	Contribution
SUMIT PRASAD	Team Lead & Full-Stack Developer	Core Architecture, AI Integration, UI/UX Design
SUJEET GUPTA
ADVAY BHAGAT
KUMAR AMAN
KRISHANU DAS

Academic Details

Institution: VIT Bhopal (Vellore Institute of Technology, Bhopal Campus)
Course: Bachelor of Technology in Computer Science
Semester: Final Year Project
Academic Year: 2024-2025
Project Type: College Project Exhibition

🔧 Technology Stack

Frontend Technologies

Jetpack Compose: Modern Android UI toolkit
Material Design 3: Latest Material Design components
Kotlin Coroutines: Asynchronous programming
Android Navigation: Screen navigation and routing

Backend & AI

Google Gemini AI: Advanced AI model for vision and language
RESTful APIs: Efficient data communication
JSON Processing: Data serialization and parsing
HTTP Networking: Secure API communication

Development Tools

Android Studio: Primary development environment
Git: Version control and collaboration
Gradle: Build automation and dependency management
Android Debug Bridge (ADB): Device testing and debugging

📊 Performance Metrics

Optimization Results

Frame Processing: Reduced from 12s to 3s (4x improvement)
Camera Resolution: Optimized from 1280x720 to 640x480
Speech Rate: Balanced at 1.2x for clarity and speed
Memory Usage: Efficient resource management
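
The figures above can be checked with a little arithmetic. The sketch below uses hypothetical helper names to compute the pixel count per frame at each resolution and the ratio between the old and new processing intervals.

```kotlin
// Hypothetical helper for comparing the two camera resolutions.
data class Resolution(val width: Int, val height: Int) {
    val pixels: Int get() = width * height
}

// Ratio of a "before" value to an "after" value, e.g. 12 s vs 3 s.
fun ratio(before: Double, after: Double): Double = before / after

val oldResolution = Resolution(1280, 720) // 921,600 pixels per frame
val newResolution = Resolution(640, 480)  // 307,200 pixels per frame

// 921,600 / 307,200 = 3.0, so each frame carries a third of the pixels.
val pixelReduction = ratio(oldResolution.pixels.toDouble(), newResolution.pixels.toDouble())

// 12 s / 3 s = 4.0, matching the stated 4x improvement.
val intervalImprovement = ratio(12.0, 3.0)
```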

User Experience Improvements

Reduced Latency: Faster response times for better user experience
Improved Accuracy: Better AI recognition and processing
Enhanced Accessibility: Optimized for visually impaired users
Battery Efficiency: Better power management

🎓 Academic Impact

Learning Outcomes

Advanced Android Development: Mastery of modern Android development practices
AI Integration: Practical experience with cutting-edge AI technologies
Accessibility Design: Understanding of inclusive design principles
Project Management: Real-world project planning and execution

Innovation Contributions

Assistive Technology: Development of tools for differently-abled individuals
AI for Social Good: Application of AI to solve real-world accessibility challenges
Mobile Innovation: Pushing boundaries of mobile app capabilities
User-Centric Design: Focus on user experience and accessibility

🔮 Future Enhancements

Planned Features

Offline Mode: Local AI processing for areas without internet
Multi-Language Support: Support for regional languages
Gesture Recognition: Advanced hand gesture controls
Cloud Integration: User data synchronization and backup

Technical Improvements

Edge AI: On-device AI processing for faster response
Machine Learning: Continuous learning and improvement
IoT Integration: Smart home and environment connectivity
Wearable Support: Smartwatch and smart glass integration

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Academic Use

This project is developed for academic purposes at VIT Bhopal. Please respect the educational nature of this work and provide appropriate attribution when referencing or building upon this code.

🤝 Acknowledgments

VIT Bhopal Faculty: For guidance and academic support
Google AI Team: For providing the Gemini AI platform
Android Developer Community: For open-source contributions
Accessibility Advocates: For insights into user needs and requirements

📞 Contact Information

Group 36 - VIT Bhopal

Email: sumit.24bce11520@vitbhopal.ac.in
GitHub: Project Repository
Institution: Vellore Institute of Technology, Bhopal Campus
Location: Bhopal, Madhya Pradesh, India

"Empowering the visually impaired through technology and innovation" 🚀
