Sumit-5002/DRISHTI


🚀 DRISHTI: AI VISION FOR THE BLIND

🎓 VIT Bhopal College Project Exhibition - Group 36

DRISHTI Logo

📱 Project Overview

DRISHTI is an Android application that helps visually impaired users understand and navigate their surroundings in real time. It analyzes camera frames with Google's Gemini AI and speaks the results aloud. The project was built at VIT Bhopal for the College Project Exhibition and combines native Android development with AI integration.


🌟 Key Features

1️⃣ Navigation Mode 🧭

  • Real-time Camera Analysis: Continuous environmental monitoring using device camera
  • AI-Powered Object Detection: Identifies obstacles, people, vehicles, and environmental hazards
  • Voice Feedback: Instant audio descriptions of surroundings for safe navigation
  • Smart Frame Processing: Optimized 3-second intervals for responsive performance
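The 3-second frame interval can be illustrated with a small throttling helper. This is a minimal sketch, not the project's actual code: the class name `FrameThrottler` and its method `shouldProcess` are hypothetical, and timestamps are passed in explicitly so the logic can run without a camera.

```kotlin
// Hypothetical helper (not taken from the DRISHTI source): decides whether a
// camera frame should be forwarded for AI analysis.
class FrameThrottler(private val intervalMillis: Long = 3_000L) {
    // Timestamp of the last accepted frame; null until the first frame arrives.
    private var lastAcceptedAt: Long? = null

    // Returns true if at least intervalMillis has elapsed since the last
    // accepted frame, and records the new timestamp when it does.
    fun shouldProcess(nowMillis: Long): Boolean {
        val last = lastAcceptedAt
        if (last != null && nowMillis - last < intervalMillis) return false
        lastAcceptedAt = nowMillis
        return true
    }
}

fun main() {
    val throttler = FrameThrottler()
    // Simulated frame timestamps in milliseconds.
    val timestamps = listOf(0L, 1_000L, 2_999L, 3_000L, 4_500L, 6_000L)
    val accepted = timestamps.filter { throttler.shouldProcess(it) }
    println(accepted) // prints [0, 3000, 6000]
}
```

In an app, the timestamp would typically come from the frame itself or from a monotonic clock, and frames that are rejected would simply be closed without analysis.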

2️⃣ Assistant Mode 🤖

  • Interactive Voice Commands: Ask questions about your environment in natural language
  • Context-Aware Responses: AI understands spatial context and provides relevant information
  • Environmental Queries: "What color is the car?", "How is the weather?", "Is there a person nearby?"
  • General Knowledge: Access information about people, objects, and concepts

3️⃣ Voice Mode 📖

  • Text Recognition: Reads signs, documents, books, and any printed text
  • Optical Character Recognition (OCR): Converts visual text to speech
  • High-Accuracy Reading: Powered by Google's Gemini AI for reliable text interpretation
  • Instant Audio Output: Real-time text-to-speech conversion

🛠️ Technical Specifications

Platform & Architecture

  • Framework: Android Native (Kotlin)
  • UI Framework: Jetpack Compose
  • Architecture: MVVM with Repository Pattern
  • Minimum SDK: Android 6.0 (API 23)
  • Target SDK: Android 14 (API 34)

AI Integration

  • Primary AI: Google Gemini AI API
  • Computer Vision: Real-time image analysis and processing
  • Natural Language Processing: Context-aware voice interactions
  • Text Recognition: Advanced OCR capabilities

Performance Optimizations

  • Frame Rate: Optimized 3-second processing intervals
  • Camera Resolution: 640x480 for optimal performance
  • Memory Management: Efficient image processing and cleanup
  • Speech Rate: 1.2x optimized for clarity and speed

🚀 Getting Started

Prerequisites

  • Android Studio Arctic Fox or later
  • Android SDK 23+
  • Google Gemini AI API Key
  • Android device with camera and microphone

Installation Steps

  1. Clone the Repository

    git clone https://github.com/your-username/drishti-blind-assistant.git
    cd drishti-blind-assistant
  2. API Key Setup

    • Visit Google AI Studio
    • Generate your Gemini API key
    • Add the key to the following files:
      • GeminiAPI.kt
      • GeminiAPI 1.kt
      • GeminiAPI 2.kt
  3. Build & Run

    • Open project in Android Studio
    • Sync Gradle files
    • Connect your Android device
    • Click "Run" to install the app

🎯 Usage Instructions

Navigation Mode

  • Single Tap: Activate camera and start navigation
  • Voice Feedback: Automatic environmental descriptions every 3 seconds
  • Safety Alerts: Immediate warnings for obstacles and hazards

Assistant Mode

  • Double Tap: Switch to interactive assistant mode
  • Voice Commands: Ask questions about your surroundings
  • Natural Language: Use conversational queries for information

Voice Mode

  • Long Press: Activate text reading mode
  • Point Camera: Aim at text you want to read
  • Automatic Reading: Instant text-to-speech conversion
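The gesture-to-mode mapping described above can be sketched as a simple lookup. The names `Gesture`, `Mode`, and `modeFor` are hypothetical and chosen for illustration; the real app may wire gestures to screens differently.

```kotlin
// Hypothetical types (not taken from the DRISHTI source) modelling the three
// gestures and the three modes listed in the usage instructions.
enum class Gesture { SINGLE_TAP, DOUBLE_TAP, LONG_PRESS }

enum class Mode(val label: String) {
    NAVIGATION("Navigation Mode"),
    ASSISTANT("Assistant Mode"),
    VOICE("Voice Mode")
}

// Maps each gesture to the mode it activates. The exhaustive `when` means the
// compiler flags any gesture that is added later without a mapping.
fun modeFor(gesture: Gesture): Mode = when (gesture) {
    Gesture.SINGLE_TAP -> Mode.NAVIGATION
    Gesture.DOUBLE_TAP -> Mode.ASSISTANT
    Gesture.LONG_PRESS -> Mode.VOICE
}

fun main() {
    for (gesture in Gesture.values()) {
        println("$gesture -> ${modeFor(gesture).label}")
    }
}
```

Keeping the mapping in one pure function makes it easy to test and to announce the selected mode by voice before switching screens.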

👥 Team Members - Group 36

Project Lead & Development

Name | Role | Contribution
SUMIT PRASAD | Team Lead & Full-Stack Developer | Core Architecture, AI Integration, UI/UX Design
SUJEET GUPTA | |
ADVAY BHAGAT | |
KUMAR AMAN | |
KRISHANU DAS | |

Academic Details

  • Institution: VIT Bhopal (Vellore Institute of Technology, Bhopal Campus)
  • Course: Bachelor of Technology in Computer Science
  • Semester: Final Year Project
  • Academic Year: 2024-2025
  • Project Type: College Project Exhibition

🔧 Technology Stack

Frontend Technologies

  • Jetpack Compose: Modern Android UI toolkit
  • Material Design 3: Latest Material Design components
  • Kotlin Coroutines: Asynchronous programming
  • Android Navigation: Screen navigation and routing

Backend & AI

  • Google Gemini AI: Advanced AI model for vision and language
  • RESTful APIs: Efficient data communication
  • JSON Processing: Data serialization and parsing
  • HTTP Networking: Secure API communication

Development Tools

  • Android Studio: Primary development environment
  • Git: Version control and collaboration
  • Gradle: Build automation and dependency management
  • Android Debug Bridge (ADB): Device testing and debugging

📊 Performance Metrics

Optimization Results

  • Frame Processing: Reduced from 12s to 3s (4x improvement)
  • Camera Resolution: Optimized from 1280x720 to 640x480
  • Speech Rate: Balanced at 1.2x for clarity and speed
  • Memory Usage: Efficient resource management
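The figures above can be checked with a short calculation. This sketch only reproduces the arithmetic from the listed numbers; `Resolution` and `improvementFactor` are hypothetical names, and the pixel-count ratio is derived here rather than stated by the project.

```kotlin
// Hypothetical helper type for comparing camera resolutions by pixel count.
data class Resolution(val width: Int, val height: Int) {
    val pixels: Int get() = width * height
}

// Ratio of an old value to a new one, e.g. 12 s -> 3 s gives 4.0.
fun improvementFactor(before: Double, after: Double): Double = before / after

fun main() {
    val intervalGain = improvementFactor(12.0, 3.0)
    val before = Resolution(1280, 720)
    val after = Resolution(640, 480)
    val pixelRatio = improvementFactor(before.pixels.toDouble(), after.pixels.toDouble())

    println("Processing interval: ${intervalGain}x shorter")
    // prints Processing interval: 4.0x shorter
    println("Pixels per frame: ${before.pixels} -> ${after.pixels} (${pixelRatio}x fewer)")
    // prints Pixels per frame: 921600 -> 307200 (3.0x fewer)
}
```

The lower resolution means each frame carries one third of the pixels, which reduces the amount of image data handled per analysis request.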

User Experience Improvements

  • Reduced Latency: Faster response times for better user experience
  • Improved Accuracy: Better AI recognition and processing
  • Enhanced Accessibility: Optimized for visually impaired users
  • Battery Efficiency: Better power management

🎓 Academic Impact

Learning Outcomes

  • Advanced Android Development: Mastery of modern Android development practices
  • AI Integration: Practical experience with cutting-edge AI technologies
  • Accessibility Design: Understanding of inclusive design principles
  • Project Management: Real-world project planning and execution

Innovation Contributions

  • Assistive Technology: Development of tools for differently-abled individuals
  • AI for Social Good: Application of AI to solve real-world accessibility challenges
  • Mobile Innovation: Pushing boundaries of mobile app capabilities
  • User-Centric Design: Focus on user experience and accessibility

🔮 Future Enhancements

Planned Features

  • Offline Mode: Local AI processing for areas without internet
  • Multi-Language Support: Support for regional languages
  • Gesture Recognition: Advanced hand gesture controls
  • Cloud Integration: User data synchronization and backup

Technical Improvements

  • Edge AI: On-device AI processing for faster response
  • Machine Learning: Continuous learning and improvement
  • IoT Integration: Smart home and environment connectivity
  • Wearable Support: Smartwatch and smart glass integration

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Academic Use

This project is developed for academic purposes at VIT Bhopal. Please respect the educational nature of this work and provide appropriate attribution when referencing or building upon this code.


🤝 Acknowledgments

  • VIT Bhopal Faculty: For guidance and academic support
  • Google AI Team: For providing the Gemini AI platform
  • Android Developer Community: For open-source contributions
  • Accessibility Advocates: For insights into user needs and requirements

📞 Contact Information

Group 36 - VIT Bhopal

"Empowering the visually impaired through technology and innovation" 🚀

© 2024 Group 36, VIT Bhopal. All rights reserved.
