DRISHTI is an Android application that helps visually impaired users understand and navigate their surroundings through real-time, AI-generated audio feedback. It was built as a final-year project at VIT Bhopal and combines native Android development with Google's Gemini AI.
- Real-time Camera Analysis: Continuous environmental monitoring using device camera
- AI-Powered Object Detection: Identifies obstacles, people, vehicles, and environmental hazards
- Voice Feedback: Instant audio descriptions of surroundings for safe navigation
- Smart Frame Processing: Analyzes one camera frame every 3 seconds to keep feedback responsive
- Interactive Voice Commands: Ask questions about your environment in natural language
- Context-Aware Responses: AI understands spatial context and provides relevant information
- Environmental Queries: "What color is the car?", "How is the weather?", "Is there a person nearby?"
- General Knowledge: Access information about people, objects, and concepts
- Text Recognition: Reads signs, documents, books, and any printed text
- Optical Character Recognition (OCR): Converts visual text to speech
- High-Accuracy Reading: Powered by Google's Gemini AI for reliable text interpretation
- Instant Audio Output: Real-time text-to-speech conversion
- Framework: Android Native (Kotlin)
- UI Framework: Jetpack Compose
- Architecture: MVVM with Repository Pattern
- Minimum SDK: Android 6.0 (API 23)
- Target SDK: Android 14 (API 34)
- Primary AI: Google Gemini AI API
- Computer Vision: Real-time image analysis and processing
- Natural Language Processing: Context-aware voice interactions
- Text Recognition: Advanced OCR capabilities
- Frame Interval: One frame processed every 3 seconds
- Camera Resolution: 640x480 for optimal performance
- Memory Management: Efficient image processing and cleanup
- Speech Rate: 1.2x optimized for clarity and speed
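
The specifications above can be collected in one place and enforced with a small throttle. The following is a minimal sketch, not the project's actual code: `CaptureConfig` and `FrameThrottle` are hypothetical names, and only the numeric values (3 seconds, 640x480, 1.2x) come from the list above.

```kotlin
// Hypothetical container for the documented settings.
// Only the default values are taken from the specifications above.
data class CaptureConfig(
    val frameIntervalMs: Long = 3_000L, // one analysis pass every 3 seconds
    val widthPx: Int = 640,
    val heightPx: Int = 480,
    val speechRate: Float = 1.2f,
)

/** Decides whether a camera frame should be sent for analysis. */
class FrameThrottle(private val intervalMs: Long) {
    private var lastAcceptedMs: Long? = null

    /**
     * Returns true for the first frame, and for any later frame that
     * arrives at least [intervalMs] after the last accepted frame.
     */
    fun shouldProcess(nowMs: Long): Boolean {
        val last = lastAcceptedMs
        if (last != null && nowMs - last < intervalMs) return false
        lastAcceptedMs = nowMs
        return true
    }
}
```

In an app, the camera callback would call `shouldProcess` with the current timestamp and skip any frame for which it returns false, so that frames arriving between the 3-second marks are dropped rather than queued.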
- Android Studio Arctic Fox or later
- Android SDK 23+
- Google Gemini AI API Key
- Android device with camera and microphone
- Clone the Repository
  - `git clone https://github.com/your-username/drishti-blind-assistant.git`
  - `cd drishti-blind-assistant`
- API Key Setup
  - Visit Google AI Studio
  - Generate your Gemini API key
  - Add the key to the following files: `GeminiAPI.kt`, `GeminiAPI 1.kt`, `GeminiAPI 2.kt`
- Build & Run
  - Open the project in Android Studio
  - Sync Gradle files
  - Connect your Android device
  - Click "Run" to install the app
- Single Tap: Activate camera and start navigation
- Voice Feedback: Automatic environmental descriptions every 3 seconds
- Safety Alerts: Immediate warnings for obstacles and hazards
- Double Tap: Switch to interactive assistant mode
- Voice Commands: Ask questions about your surroundings
- Natural Language: Use conversational queries for information
- Long Press: Activate text reading mode
- Point Camera: Aim at text you want to read
- Automatic Reading: Instant text-to-speech conversion
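
The gesture-to-mode mapping described above (single tap, double tap, long press) could be modelled roughly as follows. This is an illustrative sketch only: the `Gesture` and `Mode` enums, the `modeFor` function, and the announcement strings are assumptions, not names taken from the project.

```kotlin
// Hypothetical types; the source only states which gesture opens which mode.
enum class Gesture { SINGLE_TAP, DOUBLE_TAP, LONG_PRESS }

enum class Mode(val announcement: String) {
    NAVIGATION("Navigation mode"), // camera on, descriptions every 3 seconds
    ASSISTANT("Assistant mode"),   // voice questions about the surroundings
    READING("Reading mode"),       // printed text read aloud
}

/** Maps a touch gesture to the mode it activates, as listed above. */
fun modeFor(gesture: Gesture): Mode = when (gesture) {
    Gesture.SINGLE_TAP -> Mode.NAVIGATION
    Gesture.DOUBLE_TAP -> Mode.ASSISTANT
    Gesture.LONG_PRESS -> Mode.READING
}
```

Because the `when` expression is exhaustive over the enum, adding a new gesture without assigning it a mode becomes a compile-time error rather than a silent no-op.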
| Name | Role | Contribution |
|---|---|---|
| SUMIT PRASAD | Team Lead & Full-Stack Developer | Core Architecture, AI Integration, UI/UX Design |
| SUJEET GUPTA | | |
| ADVAY BHAGAT | | |
| KUMAR AMAN | | |
| KRISHANU DAS | | |
- Institution: VIT Bhopal (Vellore Institute of Technology, Bhopal Campus)
- Course: Bachelor of Technology in Computer Science
- Semester: Final Year Project
- Academic Year: 2024-2025
- Project Type: College Project Exhibition
- Jetpack Compose: Modern Android UI toolkit
- Material Design 3: Latest Material Design components
- Kotlin Coroutines: Asynchronous programming
- Android Navigation: Screen navigation and routing
- Google Gemini AI: Advanced AI model for vision and language
- RESTful APIs: Efficient data communication
- JSON Processing: Data serialization and parsing
- HTTP Networking: Secure API communication
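
To illustrate the REST and JSON items above, the sketch below builds a request body for the Gemini `generateContent` REST endpoint with a text prompt and one JPEG frame. It is an assumption about how such a call could be assembled, not the project's implementation: the model name, `endpointFor`, `jsonEscape`, and `buildRequestBody` are all hypothetical. It uses `java.util.Base64`, which on Android requires API 26, so an app with a minimum SDK of 23 would likely use `android.util.Base64` instead.

```kotlin
import java.util.Base64

// Assumption: placeholder model name, not taken from the project.
const val ASSUMED_MODEL = "gemini-1.5-flash"

/** Endpoint URL for a given model; the API key is sent separately (for example in a request header). */
fun endpointFor(model: String): String =
    "https://generativelanguage.googleapis.com/v1beta/models/$model:generateContent"

/** Escapes a string so it can be embedded in a JSON string literal. */
fun jsonEscape(text: String): String = buildString {
    for (ch in text) {
        when (ch) {
            '"' -> append("\\\"")
            '\\' -> append("\\\\")
            '\n' -> append("\\n")
            '\r' -> append("\\r")
            '\t' -> append("\\t")
            else -> if (ch < ' ') append("\\u%04x".format(ch.code)) else append(ch)
        }
    }
}

/** Builds a JSON body containing one text part and one inline JPEG image part. */
fun buildRequestBody(prompt: String, jpegBytes: ByteArray): String {
    val imageBase64 = Base64.getEncoder().encodeToString(jpegBytes)
    return "{\"contents\":[{\"parts\":[" +
        "{\"text\":\"" + jsonEscape(prompt) + "\"}," +
        "{\"inline_data\":{\"mime_type\":\"image/jpeg\",\"data\":\"" + imageBase64 + "\"}}" +
        "]}]}"
}
```

A real app would more likely use a JSON library than hand-built strings; the manual version is shown only to keep the example free of dependencies.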
- Android Studio: Primary development environment
- Git: Version control and collaboration
- Gradle: Build automation and dependency management
- Android Debug Bridge (ADB): Device testing and debugging
- Frame Processing: Reduced from 12s to 3s (4x improvement)
- Camera Resolution: Optimized from 1280x720 to 640x480
- Speech Rate: Balanced at 1.2x for clarity and speed
- Memory Usage: Efficient resource management
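
The figures above can be checked with simple arithmetic: 12 s / 3 s gives the stated 4x improvement in processing interval, and moving from 1280x720 (921,600 pixels) to 640x480 (307,200 pixels) means each frame carries exactly one third of the pixels. A short sketch of that calculation, using hypothetical helper names:

```kotlin
// Hypothetical helpers for verifying the numbers quoted above.
data class Resolution(val width: Int, val height: Int) {
    val pixels: Int get() = width * height
}

/** Ratio of the old interval to the new one; 4.0 means four times as frequent. */
fun speedup(beforeSeconds: Double, afterSeconds: Double): Double =
    beforeSeconds / afterSeconds

/** How many times fewer pixels the new resolution has per frame. */
fun pixelReduction(before: Resolution, after: Resolution): Double =
    before.pixels.toDouble() / after.pixels
```

For example, `speedup(12.0, 3.0)` evaluates to 4.0 and `pixelReduction(Resolution(1280, 720), Resolution(640, 480))` evaluates to 3.0. How much this reduces upload size or latency in practice depends on image compression and network conditions, which the figures above do not specify.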
- Reduced Latency: Faster response times for better user experience
- Improved Accuracy: Better AI recognition and processing
- Enhanced Accessibility: Optimized for visually impaired users
- Battery Efficiency: Better power management
- Advanced Android Development: Mastery of modern Android development practices
- AI Integration: Practical experience with cutting-edge AI technologies
- Accessibility Design: Understanding of inclusive design principles
- Project Management: Real-world project planning and execution
- Assistive Technology: Development of tools for differently-abled individuals
- AI for Social Good: Application of AI to solve real-world accessibility challenges
- Mobile Innovation: Pushing boundaries of mobile app capabilities
- User-Centric Design: Focus on user experience and accessibility
- Offline Mode: Local AI processing for areas without internet
- Multi-Language Support: Support for regional languages
- Gesture Recognition: Advanced hand gesture controls
- Cloud Integration: User data synchronization and backup
- Edge AI: On-device AI processing for faster response
- Machine Learning: Continuous learning and improvement
- IoT Integration: Smart home and environment connectivity
- Wearable Support: Smartwatch and smart glass integration
- Android Developer Documentation
- Jetpack Compose Guide
- Google Gemini AI Documentation
- Material Design Guidelines
- "Computer Vision for the Visually Impaired" - IEEE Access
- "AI-Powered Assistive Technologies" - ACM Digital Library
- "Mobile Accessibility in Modern Applications" - Mobile HCI Conference
This project is licensed under the MIT License - see the LICENSE file for details.
This project is developed for academic purposes at VIT Bhopal. Please respect the educational nature of this work and provide appropriate attribution when referencing or building upon this code.
- VIT Bhopal Faculty: For guidance and academic support
- Google AI Team: For providing the Gemini AI platform
- Android Developer Community: For open-source contributions
- Accessibility Advocates: For insights into user needs and requirements
Group 36 - VIT Bhopal
- Email: sumit.24bce11520@vitbhopal.ac.in
- GitHub: Project Repository
- Institution: Vellore Institute of Technology, Bhopal Campus
- Location: Bhopal, Madhya Pradesh, India
"Empowering the visually impaired through technology and innovation"
© 2024 Group 36, VIT Bhopal. All rights reserved.
