This portfolio is a compilation of the Data Science and Data Analysis projects I have done for academic, self-learning and hobby purposes. It also contains my work experience, skills and certificates.
Tools: Python, Streamlit, Scikit-Learn, Isolation Forest, Random Forest, OpenAI GPT-4, GeoPandas
This project builds an AI-powered dashboard to detect, predict, and respond to urban disasters using multimodal data. By integrating sensor anomalies, weather data, and social media streams, it enables real-time decision-making through predictive modeling, misinformation detection, and GPT-generated emergency response plans.
Tools: Python, Neo4j, LangChain, OpenAI GPT-4, Streamlit
This project develops an AI-powered Q&A system that transforms unstructured 10-K filings into a structured financial knowledge graph. Using tools like Python, Streamlit, Neo4j, spaCy, and OpenAI GPT-4, it enables investors to query SEC filings in natural language and receive context-aware, grounded answers backed by graph-based retrieval and semantic search. The system automates entity extraction, connects related risks and strategies, and visualizes relationships, turning complex financial documents into accessible, actionable insights for faster, smarter decision-making.
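As a rough illustration of the graph-based retrieval idea, here is a minimal sketch that uses only the Python standard library. It is not the project's code: the project uses Neo4j, spaCy, and GPT-4, while this stand-in keeps the graph in memory and uses plain substring matching in place of entity extraction and semantic search. The company name, relation labels, and helper names (`build_index`, `retrieve_context`, `format_context`) are all made up for the example.

```python
from collections import defaultdict

# Made-up (subject, relation, object) triples standing in for facts
# extracted from a 10-K filing. None of this is real filing data.
TRIPLES = [
    ("ExampleCorp", "FACES_RISK", "supply chain disruption"),
    ("ExampleCorp", "PURSUES_STRATEGY", "supplier diversification"),
    ("supplier diversification", "MITIGATES", "supply chain disruption"),
    ("ExampleCorp", "FACES_RISK", "currency fluctuation"),
]


def build_index(triples):
    """Map each lowercased entity name to the triples it takes part in."""
    index = defaultdict(list)
    for triple in triples:
        subject, _, obj = triple
        index[subject.lower()].append(triple)
        index[obj.lower()].append(triple)
    return index


def retrieve_context(question, index):
    """Return triples touching any entity whose name appears in the question."""
    lowered = question.lower()
    seen, context = set(), []
    for entity, triples in index.items():
        if entity in lowered:
            for triple in triples:
                if triple not in seen:
                    seen.add(triple)
                    context.append(triple)
    return context


def format_context(triples):
    """Render triples as text lines that could be placed in an LLM prompt."""
    return "\n".join(f"{s} -[{r}]-> {o}" for s, r, o in triples)


index = build_index(TRIPLES)
context = retrieve_context("What mitigates supply chain disruption?", index)
print(format_context(context))
```

Grounding the answer in retrieved triples, rather than in the model's own recall, is what lets a system like this point back to the filing facts it relied on.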
Tools: Python, Pandas, Matplotlib, SMOTE, Scikit-Learn, XGBoost
This project applies machine learning techniques to predict customer churn in the telecom industry. By building and comparing Logistic Regression, Decision Tree, Random Forest and XGBoost models, this project provides actionable insights to telecom providers.
Tools: Python, Streamlit, Pandas, Matplotlib, Seaborn
An end-to-end Streamlit web app that lets users upload any CSV and instantly explore their data. Features include data preview, dataset info, descriptive statistics, correlation heatmaps and interactive distribution plots.
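The descriptive statistics and correlation features can be sketched roughly as follows. This is a standard-library approximation for illustration, not the app's code, which is built on Pandas, Seaborn, and Streamlit. The function names and the sample CSV are assumptions introduced for the example.

```python
import csv
import io
import math
import statistics


def numeric_columns(csv_text):
    """Parse CSV text and return {column: [floats]} for fully numeric columns."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = {}
    if not rows:
        return columns
    for name in rows[0]:
        try:
            columns[name] = [float(row[name]) for row in rows]
        except (TypeError, ValueError):
            continue  # skip columns with text or missing values
    return columns


def describe(values):
    """Basic descriptive statistics for one numeric column."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient; NaN if either column is constant."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Nested dict of pairwise correlations: the data behind a heatmap."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical uploaded file, inlined so the sketch runs on its own.
SAMPLE = """name,hours,score
a,1,2
b,2,4
c,3,6
"""
cols = numeric_columns(SAMPLE)
stats = {name: describe(values) for name, values in cols.items()}
corr = correlation_matrix(cols)
```

In a Pandas-based app, the same summaries would more typically come from `DataFrame.describe()` and `DataFrame.corr()`.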
Tools: Python, Pandas, Matplotlib, Seaborn, API
This project dives into my Spotify streaming data to uncover patterns in my listening history, preferences, and behaviors. By performing data cleaning, exploratory analysis, and visualizations, it transforms raw data into actionable insights.
Tools: Python, APIs, BeautifulSoup
This project helps pet owners make informed decisions by recommending breed-specific information, pet-friendly locations, weather-based recommendations, and upcoming pet events.
Equity Research Analyst @ Burkenroad Reports
Tools: Bloomberg, Excel
Performed equity valuation and ran P/TBV ratio and three-statement pro forma models to forecast revenue streams; presented investment recommendations to over 20,000 institutional and individual investors
- Professional: Python, SQL, R, Tableau, Adobe Premiere Pro, Bloomberg
- Language: English (Native), Mandarin Chinese (Native)
- Certificates: Google Data Analytics Specialization
- Areas of Expertise: Data Analysis, Marketing Analytics, Predictive Modeling
Data Analyst @ BroadVision Marketing
As part of the MSBA program, analyzing SEO metrics for a digital marketing firm, streamlining the website analysis workflow and conducting data-driven research to improve online promotion strategies for attorney firms
- Built a web scraper using Python and Google Search API to extract SEO metrics (e.g. backlinks, page speed, domain authority) from attorney firm websites, filtering out irrelevant websites, automating data collection and reducing manual work by 50%
- Developed a machine learning classifier to predict SEO performance, integrating web scraping results with regression analysis; identified three key factors driving 70% of client traffic
- Created a data-driven business insights report with SEO recommendations, projected to reduce client reliance on paid advertising by 25%, enhancing organic search performance and cost efficiency
Data Analyst Intern @ Kantar
- Produced market research reports using SQL for six brand clients across multiple industries, including Heineken, New Balance, Continental, Shangri-La, OSM and Abbott, generating strategies to improve brand awareness across specific media and audience groups
- Created data visualizations in Tableau from the data above; produced six reports analyzing brand performance, campaign impact and benchmarking against competitors, providing strategic recommendations that boosted brand power scores by 15% on average
Private Equity Analyst Intern @ Guotai Junan Securities
- Wrote two in-depth industry research reports on Hydrogen Fuel Cells and Advanced Driver Assistance Systems
- Presented industry research reports to the team (including the Investor Relations Manager); my recommendation not to invest in Shanghai Shen-Li High Tech Co., Ltd because of poor government policy, low production efficiency, and a high level of related-party transactions was adopted by the company, preventing a potential loss of $50 million
Financial Planning and Analysis (FP&A) Intern @ Wuxi Biologics USA
- Extracted data from the 10-K reports of four competitors in the Contract Development and Manufacturing Organization (CDMO) industry; calculated gross profit and EBITDA benchmarks and provided profit improvement recommendations, including pursuing strong M&A or following the industry trend of integrating AI/machine learning into preclinical drug development
- Built two advanced Excel models to analyze profitability using four years of revenue and cost data; presented findings to the Finance team to inform future budget forecasting and decision-making
Master of Science (M.S.) in Business Analytics (MSBA) @ UC Davis
Core Modules: Analytic Decision Making, Big Data, Advanced Statistics and Forecasting, Machine Learning and Artificial Intelligence
Bachelor of Science in Management @ Tulane University
Major: Finance, Minor: Marketing
Core Modules: Business Analytics, Research and Analytics, Financial Modeling, Equity Analysis