EDA - Amazon

Tool: Python
Project Type: Data Cleaning, Data Analysis
Source: kaggle.com



🥅
Goals
  • Analyze pricing and discount patterns across Amazon product categories
  • Identify which categories are most discounted, most expensive, and best rated
  • Understand the relationship between price, discount, and customer ratings
  • Find the top performing products by reviews and ratings
 
💼
Process
  • Cleaned price columns by removing ₹ and comma symbols and converting to float
  • Extracted main category from nested category strings using string splitting
  • Added two engineered features: savings (actual price minus discounted price) and discount tier (discount percentage binned into 5 ranges)
  • Built 19 visualizations across 5 sections: category, price, rating, discount, and product analysis
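
The cleaning and feature-engineering steps above can be sketched in plain Python as follows. This is a minimal illustration, not the project's actual code: the function names, the `|` category separator, and the five tier boundaries (0-20, 21-40, 41-60, 61-80, 81-100) are assumptions, since the write-up only states that five ranges were used.

```python
def parse_price(raw: str) -> float:
    """Strip the rupee sign and thousands separators, then convert to float."""
    return float(raw.replace("₹", "").replace(",", "").strip())


def main_category(path: str, sep: str = "|") -> str:
    """Return the top-level category of a nested category string.

    The '|' separator is an assumption about the raw data format.
    """
    return path.split(sep)[0].strip()


def savings(actual: float, discounted: float) -> float:
    """Absolute saving: actual price minus discounted price."""
    return actual - discounted


# Assumed tier boundaries (inclusive upper bound, label).
TIERS = [
    (20, "0-20%"),
    (40, "21-40%"),
    (60, "41-60%"),
    (80, "61-80%"),
    (100, "81-100%"),
]


def discount_tier(pct: float) -> str:
    """Map a discount percentage to the label of the tier that contains it."""
    if not 0 <= pct <= 100:
        raise ValueError(f"discount percentage out of range: {pct}")
    for upper, label in TIERS:
        if pct <= upper:
            return label
    raise AssertionError("unreachable: pct is within 0..100")


# Hypothetical sample row, for illustration only.
actual = parse_price("₹1,099")
discounted = parse_price("₹399")
print(main_category("Computers&Accessories|Cables|USBCables"))
print(savings(actual, discounted))
print(discount_tier(64))
```

In a pandas workflow the same helpers could be applied column-wise, for example with `Series.map`, and `pandas.cut` is a common alternative for the binning step.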
 
 
Insights
  • Electronics leads the catalog on both measures, with the highest product count (526) and the highest average price (₹10,127), which makes it the likely core revenue driver
  • Office Products has the highest average rating (4.31) with nearly zero discounting, which may indicate loyal professional buyers who don't need price incentives
  • Discount percentage has a weak negative correlation with rating (-0.16), so higher discounts do not translate into better ratings
  • AmazonBasics dominates the most reviewed products with 426,973 reviews, suggesting the in-house brand has earned strong customer trust for basic accessories
  • The 41–60% discount tier is the most common one: 508 products fall here, which suggests Amazon and sellers tend to converge on this range
  • Cables and accessories are discounted by up to 94%, possibly as loss leaders to drive traffic to product pages
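
For readers unfamiliar with how a figure like the -0.16 between discount percentage and rating is obtained, the sketch below computes a Pearson correlation coefficient by hand. The function name and the sample values are made up for illustration and are not taken from the dataset; the project may equally have used a library routine for this.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up discount percentages and ratings, for illustration only.
discounts = [10, 30, 50, 70, 90]
ratings = [4.2, 4.1, 4.3, 4.0, 4.1]
print(round(pearson(discounts, ratings), 2))
```

A coefficient close to 0 means the two columns barely move together, while values near +1 or -1 indicate a strong linear relationship.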

 
 