This project is a Python web scraper that extracts Python-related job listings from TimesJobs. It uses the requests library to fetch the HTML of the job search page, and BeautifulSoup with the lxml parser to parse that HTML and extract the relevant job data.
- Job Extraction: Fetches Python-related job listings from TimesJobs.
- Skill Filtering: Allows users to input a skill they are not familiar with and filters out jobs that require that skill.
- Job Details Storage: Extracted job details, including company names, required skills, and job links, are stored in individual text files for easy access and review.
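
The skill filtering feature can be illustrated with a minimal sketch. It assumes each job's required skills arrive as one comma-separated string; the function name `requires_skill` and the sample jobs are hypothetical and are not taken from `main.py`, which may match skills differently (for example by substring).

```python
def requires_skill(required_skills: str, unfamiliar_skill: str) -> bool:
    """Return True if the comma-separated skills string names the given skill.

    Assumption: skills are separated by commas and compared case-insensitively.
    """
    wanted = unfamiliar_skill.strip().lower()
    skills = [skill.strip().lower() for skill in required_skills.split(",")]
    return wanted in skills


# Hypothetical sample data standing in for scraped listings.
jobs = [
    {"company": "Acme Corp", "skills": "python, django, sql"},
    {"company": "Globex", "skills": "python, flask, rest"},
]

# Keep only the jobs that do not require the unfamiliar skill.
kept = [job for job in jobs if not requires_skill(job["skills"], "Django")]
print([job["company"] for job in kept])
```

Comparing whole entries rather than substrings avoids dropping a job that lists "javascript" when the user only wants to exclude "java".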
- The script sends an HTTP request to the TimesJobs website to retrieve job listings related to Python.
- The HTML content is parsed using BeautifulSoup, and job postings are extracted based on specific HTML tags and classes.
- The user is prompted to enter a skill they are not familiar with. The tool then filters out any jobs that list this skill as a requirement.
- For each job that passes the filter, details such as the company name, required skills, and a link to more information are printed to the console and saved in a text file within the store directory.
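
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `main.py`: the HTML snippet, its tag and class names, the function names, and the text file layout are all hypothetical, and an inline string stands in for the response body that requests would return. The sketch uses Python's built-in `html.parser` so that it runs without lxml; pass `"lxml"` instead to match the project's parser.

```python
import tempfile
from pathlib import Path

from bs4 import BeautifulSoup

# Hypothetical markup; the real TimesJobs page uses its own tags and classes.
SAMPLE_HTML = """
<ul>
  <li class="job">
    <h3 class="company">Acme Corp</h3>
    <span class="skills">python, django, sql</span>
    <a href="https://example.com/jobs/1">More details</a>
  </li>
  <li class="job">
    <h3 class="company">Globex</h3>
    <span class="skills">python, flask, rest</span>
    <a href="https://example.com/jobs/2">More details</a>
  </li>
</ul>
"""


def extract_jobs(html):
    """Parse job cards out of the HTML and return them as dictionaries."""
    soup = BeautifulSoup(html, "html.parser")  # swap in "lxml" if installed
    jobs = []
    for card in soup.find_all("li", class_="job"):
        jobs.append({
            "company": card.find("h3", class_="company").get_text(strip=True),
            "skills": card.find("span", class_="skills").get_text(strip=True),
            "link": card.find("a")["href"],
        })
    return jobs


def save_jobs(jobs, unfamiliar_skill, out_dir="store"):
    """Write one text file per job that does not mention the unfamiliar skill."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for index, job in enumerate(jobs):
        # Assumption: a simple case-insensitive substring check.
        if unfamiliar_skill.lower() in job["skills"].lower():
            continue
        path = out / f"{index}.txt"
        path.write_text(
            f"Company Name: {job['company']}\n"
            f"Required Skills: {job['skills']}\n"
            f"More Info: {job['link']}\n",
            encoding="utf-8",
        )
        written.append(path)
    return written


if __name__ == "__main__":
    # A temporary directory keeps the demo from touching a real store folder.
    with tempfile.TemporaryDirectory() as tmp:
        for path in save_jobs(extract_jobs(SAMPLE_HTML), "django", tmp):
            print(path.read_text(encoding="utf-8"))
```

Keeping extraction and saving in separate functions lets the parsing be checked against saved HTML without making a network request.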
- Clone the Repository:
git clone https://github.com/angelshinh1/Web_Scraper_Python.git
- Navigate to the Project Directory:
cd Web_Scraper_Python
- Install Required Libraries:
pip install requests
pip install beautifulsoup4
pip install lxml
- Run the Script:
python main.py
- Follow the Prompts: Enter a skill you want to filter out, and the script will extract the job listings that do not require it and save them.
- Python 3.x
- requests library
- BeautifulSoup (bs4 package)
- lxml parser
You can install the required libraries using:
pip install requests beautifulsoup4 lxml