In this project, you will use Python to explore bike share data for three major cities in the United States: Chicago, New York City, and Washington. You will write code to import the data and answer questions about it by computing descriptive statistics. You will also write a script that takes raw input to create an interactive terminal experience for presenting these statistics.
- Complete the "to do" lines in the template Python file "bike_investigation.py"
- Push your code to a private GitHub repository
- Document what you've done in the code and with a README
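As a starting point for the interactive part, here is a minimal sketch of a prompt that keeps asking until the user names one of the three cities. The names `CITY_FILES` and `ask_city` are hypothetical and not taken from the template; the `read` parameter is an assumption added so the prompt can be exercised without a real terminal.

```python
# Hypothetical mapping from city name to data file (file names from the brief).
CITY_FILES = {
    "chicago": "chicago.csv",
    "new york city": "new_york_city.csv",
    "washington": "washington.csv",
}


def ask_city(read=input):
    """Keep prompting until the answer matches one of the three cities."""
    while True:
        answer = read("Which city? (Chicago, New York City, Washington): ")
        city = answer.strip().lower()  # tolerate stray spaces and capitals
        if city in CITY_FILES:
            return city
        print(f"'{answer}' is not one of the available cities, please try again.")


# Scripted answers stand in for a real user here; calling ask_city() with no
# argument would read from the terminal instead.
answers = iter(["Boston", "  Chicago "])
city = ask_city(lambda prompt: next(answers))
print(CITY_FILES[city])
```

Normalising the answer before the lookup means that "Chicago", "chicago" and " CHICAGO " are all accepted, while anything else triggers another prompt rather than an exception.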
You will investigate bike share use in Chicago, New York City, and Washington by computing a variety of descriptive statistics. In this project, you'll write code to provide the following information:
#1 Popular times of travel (i.e., the values that occur most often in the start time)
- most common month
- most common day of the week
- most common hour of day
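One possible way to compute the popular times is sketched below using only the standard library. It assumes the start times are strings in the `DD/MM/YYYY HH:MM:SS` layout seen in the example data; `popular_times` is a hypothetical helper name and the sample values are made up.

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp layout, inferred from examples such as "23/06/2017 15:09:32".
TIME_FORMAT = "%d/%m/%Y %H:%M:%S"


def popular_times(start_times):
    """Return the most common month name, weekday name and hour of day."""
    parsed = [datetime.strptime(s, TIME_FORMAT) for s in start_times]
    month = Counter(t.strftime("%B") for t in parsed).most_common(1)[0][0]
    day = Counter(t.strftime("%A") for t in parsed).most_common(1)[0][0]
    hour = Counter(t.hour for t in parsed).most_common(1)[0][0]
    return month, day, hour


# Made-up start times, for illustration only.
starts = ["23/06/2017 15:09:32", "23/06/2017 15:40:00", "02/01/2017 08:00:00"]
print(popular_times(starts))
```

When two values are tied, `Counter.most_common` returns the one encountered first, so a real script may want to state how ties are handled.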
#2 Popular stations and trips
- most common start station
- most common end station
- most common trip from start to end (i.e., most frequent combination of start station and end station)
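The station statistics can be sketched in the same way. The example below assumes each trip is available as a `(start_station, end_station)` pair; `popular_stations` is a hypothetical name and the single-letter station names are placeholders.

```python
from collections import Counter


def popular_stations(trips):
    """Return the most common start station, end station and (start, end) pair."""
    trips = list(trips)
    start = Counter(s for s, _ in trips).most_common(1)[0][0]
    end = Counter(e for _, e in trips).most_common(1)[0][0]
    # Counting the pair as a tuple keeps A -> B distinct from B -> A.
    route = Counter(trips).most_common(1)[0][0]
    return start, end, route


# Placeholder trips, for illustration only.
trips = [("A", "B"), ("A", "B"), ("A", "C"), ("B", "C"), ("D", "C")]
print(popular_stations(trips))
```

In this sample the most common start is "A" and the most common end is "C", yet the most common trip is A to B. The combination has to be counted on its own rather than assembled from the two single-station answers.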
#3 Trip duration
- total travel time
- average travel time
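A minimal sketch of the duration statistics follows, assuming the Trip Duration values have already been read as numbers of seconds. `duration_stats` is a hypothetical name and the sample durations are made up.

```python
from datetime import timedelta
from statistics import mean


def duration_stats(durations_seconds):
    """Return (total, average) travel time, both in seconds."""
    durations = list(durations_seconds)
    return sum(durations), mean(durations)


# Made-up durations in seconds, for illustration only.
durations = [321, 600, 1200]
total, average = duration_stats(durations)

# timedelta gives a readable hours:minutes:seconds rendering of the total.
print(timedelta(seconds=total), average)
```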
#4 User info
- counts of each user type
- counts of each gender (only available for NYC and Chicago)
- earliest, most recent, most common year of birth (only available for NYC and Chicago)
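Because gender and year of birth are not available for every city, the user statistics need a guard for missing columns. The sketch below assumes each row is a dict keyed by column name and that the optional columns are called `Gender` and `Birth Year`. Those two names are assumptions, since the brief does not give them, and `user_stats` is a hypothetical helper.

```python
from collections import Counter


def user_stats(rows):
    """Summarise user types and, when the columns exist, gender and birth year."""
    stats = {"user_types": Counter(r["User Type"] for r in rows)}

    # "Gender" and "Birth Year" are assumed column names (NYC and Chicago only).
    if rows and "Gender" in rows[0]:
        stats["genders"] = Counter(r["Gender"] for r in rows if r["Gender"])

    if rows and "Birth Year" in rows[0]:
        # Skip blank cells; float() first in case years are stored as "1989.0".
        years = [int(float(r["Birth Year"])) for r in rows if r["Birth Year"]]
        if years:
            stats["birth_year"] = {
                "earliest": min(years),
                "most_recent": max(years),
                "most_common": Counter(years).most_common(1)[0][0],
            }
    return stats


# Made-up rows, for illustration only.
rows_with_extras = [
    {"User Type": "Subscriber", "Gender": "Female", "Birth Year": "1989.0"},
    {"User Type": "Subscriber", "Gender": "Male", "Birth Year": "1989.0"},
    {"User Type": "Customer", "Gender": "", "Birth Year": ""},
    {"User Type": "Subscriber", "Gender": "Male", "Birth Year": "1975.0"},
]
rows_without_extras = [{"User Type": "Customer"}]

print(user_stats(rows_with_extras))
print(user_stats(rows_without_extras))
```

Checking for the column instead of hard-coding the city name keeps the function usable if a file later gains or loses these fields.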
To answer these questions, you will need to write a Python script. To guide your work, a template with helper code and comments is provided in the bike_investigation.py file, and you will do your scripting there as well. You will need the three city dataset files from the ZIP file "Bike_raw_data": chicago.csv, new_york_city.csv, and washington.csv.
Randomly selected data from https://www.capitalbikeshare.com/system-data for the first six months of 2017 are provided for all three cities. All three data files contain the same six core columns:
- Start Time (e.g., 23/06/2017 15:09:32)
- End Time (e.g., 23/06/2017 15:14:53)
- Trip Duration (in seconds - e.g., 321)
- Start Station (e.g., Wood St & Hubbard St)
- End Station (e.g., Damen Ave & Chicago Ave)
- User Type (Subscriber or Customer)
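The six columns above can be read with the standard library's `csv` module, as in the hedged sketch below. It assumes the header row uses exactly the column names listed and that timestamps follow the `DD/MM/YYYY HH:MM:SS` layout of the examples. `read_trips` is a hypothetical name, and the in-memory sample stands in for a real file.

```python
import csv
import io
from datetime import datetime

# Assumed timestamp layout, inferred from the example "23/06/2017 15:09:32".
TIME_FORMAT = "%d/%m/%Y %H:%M:%S"


def read_trips(handle):
    """Yield one dict per trip with parsed times and an integer duration."""
    for row in csv.DictReader(handle):
        row["Start Time"] = datetime.strptime(row["Start Time"], TIME_FORMAT)
        row["End Time"] = datetime.strptime(row["End Time"], TIME_FORMAT)
        # Go through float() in case a file stores the duration as "321.0".
        row["Trip Duration"] = int(float(row["Trip Duration"]))
        yield row


# One-row sample built from the example values listed above; a real run would
# pass open("chicago.csv", newline="") instead of this in-memory text.
sample = io.StringIO(
    "Start Time,End Time,Trip Duration,Start Station,End Station,User Type\n"
    "23/06/2017 15:09:32,23/06/2017 15:14:53,321,"
    "Wood St & Hubbard St,Damen Ave & Chicago Ave,Subscriber\n"
)
trips = list(read_trips(sample))
print(trips[0]["Start Time"].hour, trips[0]["Trip Duration"])
```

Writing the reader as a generator means rows are parsed one at a time, which may help when the files are too large to hold comfortably in memory.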
Your work will be assessed on the following:
- Quality of the code
- Scalability of the algorithm
- Usage of good practices and modern Python