A data science project for analyzing international trade data, including imports and exports across different countries, products, and modes of transport. This project demonstrates data cleaning, exploratory data analysis, and interactive visualization using Python.
This project processes and analyzes trade data to uncover insights about:
- Top imported and exported products by country
- Trade patterns across different transportation modes
- Comparative trade statistics (imports vs. exports) by country
    learnds/
    ├── README.md                          # This file
    ├── requirements.txt                   # Python dependencies
    ├── data/
    │   └── processed_trade.csv            # Cleaned trade data
    └── src/
        ├── analysis/
        │   ├── process_data.py            # Data cleaning and preprocessing
        │   ├── imported_products.py       # Analysis of top imported products
        │   ├── mode_distribution.py       # Trade by transportation mode analysis
        │   └── overall_trade.py           # Overall import/export comparison
        └── visualization/
            ├── imported_products.html     # Interactive product visualization
            ├── mode_distribution.html     # Transportation mode visualization
            └── overall_trade.html         # Trade comparison visualization
- Python 3.7+
- pip or conda package manager
- Clone or download this repository:
    cd learnds
- Install required dependencies:
    pip install -r requirements.txt

Dependencies:
- pandas - Data manipulation and analysis
- bokeh - Interactive data visualization
- squarify - Treemap visualization library
Run the data cleaning script to prepare raw trade data:
    python src/analysis/process_data.py
This script:
- Loads raw trade data from data/trade.csv
- Converts date formats
- Filters data from 2024 onwards
- Removes unnecessary columns
- Cleans entries with zero values
- Exports cleaned data to data/processed_trade.csv
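
The cleaning steps above could look roughly like the following pandas sketch. It is not the actual content of process_data.py: the sample rows, the `NOTES` column, and the `clean_trade_data` helper are assumptions made for illustration, and the raw file is assumed to store dates as MM/YYYY strings.

```python
import io

import pandas as pd

# Hypothetical raw rows; the real data/trade.csv may contain other columns.
RAW_CSV = """DATE,COUNTRY,MODE,PRODUCT,Trade,VALUE,NOTES
11/2023,Mexico,Road,8703 - Motor cars,Import,120.5,x
01/2024,Mexico,Road,8703 - Motor cars,Import,98.0,x
02/2024,Canada,Rail,2709 - Crude oil,Export,0,x
03/2024,Canada,Air,8542 - Integrated circuits,Import,45.2,x
"""


def clean_trade_data(raw: pd.DataFrame) -> pd.DataFrame:
    """Apply the cleaning steps listed above to a raw trade table."""
    df = raw.copy()
    # Convert MM/YYYY strings to timestamps (first day of the month).
    df["DATE"] = pd.to_datetime(df["DATE"], format="%m/%Y")
    # Keep rows from 2024 onwards.
    df = df[df["DATE"] >= pd.Timestamp("2024-01-01")]
    # Drop columns the analyses do not need (NOTES is a made-up example).
    df = df.drop(columns=["NOTES"], errors="ignore")
    # Remove entries whose value is zero.
    df = df[df["VALUE"] != 0]
    return df.reset_index(drop=True)


raw = pd.read_csv(io.StringIO(RAW_CSV))
cleaned = clean_trade_data(raw)
print(cleaned)
```

In the project, the result would then be written out with something like `cleaned.to_csv("data/processed_trade.csv", index=False)`.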
Generate visualizations by running the analysis scripts:
Top Imported Products Analysis:
    python src/analysis/imported_products.py
Outputs: src/visualization/imported_products.html

Transportation Mode Distribution:
    python src/analysis/mode_distribution.py
Outputs: src/visualization/mode_distribution.html

Overall Trade Comparison:
    python src/analysis/overall_trade.py
Outputs: src/visualization/overall_trade.html
Open the generated HTML files in your web browser to explore interactive visualizations.
imported_products.py identifies and visualizes the top 20 most imported products across different countries. It creates an interactive chart showing:
- Top 20 products by import value
- Top 10 countries contributing to imports
- Product codes and names with import values in millions
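
As a rough illustration of the ranking behind this chart, the sketch below aggregates import values per product and per country with pandas. The sample rows and the `top_imports` helper are hypothetical, and dividing by 1,000,000 assumes VALUE is stored in base currency units, which the data description does not confirm.

```python
import pandas as pd

# Hypothetical sample using the processed_trade.csv column names.
trade = pd.DataFrame(
    {
        "COUNTRY": ["Mexico", "Canada", "Mexico", "China", "China"],
        "PRODUCT": [
            "8703 - Motor cars",
            "2709 - Crude oil",
            "8703 - Motor cars",
            "8517 - Telephones",
            "8517 - Telephones",
        ],
        "Trade": ["Import", "Import", "Import", "Import", "Export"],
        "VALUE": [120_000_000, 300_000_000, 80_000_000, 150_000_000, 999_000_000],
    }
)


def top_imports(df: pd.DataFrame, by: str, n: int) -> pd.Series:
    """Return the n largest import totals grouped by `by`, in millions."""
    imports = df[df["Trade"] == "Import"]
    totals = imports.groupby(by)["VALUE"].sum().nlargest(n)
    return (totals / 1_000_000).round(2)


top_products = top_imports(trade, "PRODUCT", 20)
top_countries = top_imports(trade, "COUNTRY", 10)
print(top_products)
print(top_countries)
```

The export row is excluded by the `Trade == "Import"` filter, so it does not affect either ranking.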
mode_distribution.py analyzes trade patterns by transportation mode. It currently focuses on:
- Top 10 products imported by the United States via Road transport
- Pie chart visualization showing the distribution and percentage of each product
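
The percentages shown in such a pie chart can be derived as in this sketch. It assumes the COUNTRY column literally contains "United States" and the MODE column contains "Road"; the sample rows and the `product_shares` helper are made up for illustration and are not taken from mode_distribution.py.

```python
import pandas as pd

# Hypothetical sample using the processed_trade.csv column names.
trade = pd.DataFrame(
    {
        "COUNTRY": ["United States"] * 4 + ["Canada"],
        "MODE": ["Road", "Road", "Air", "Road", "Road"],
        "PRODUCT": ["Motor cars", "Furniture", "Motor cars", "Motor cars", "Motor cars"],
        "Trade": ["Import"] * 5,
        "VALUE": [30.0, 40.0, 999.0, 30.0, 500.0],
    }
)


def product_shares(df: pd.DataFrame, country: str, mode: str, n: int = 10) -> pd.DataFrame:
    """Top-n imported products for one country and mode, with percentage shares."""
    mask = (
        (df["COUNTRY"] == country)
        & (df["MODE"] == mode)
        & (df["Trade"] == "Import")
    )
    totals = df[mask].groupby("PRODUCT")["VALUE"].sum().nlargest(n)
    shares = pd.DataFrame({"value": totals})
    # Each product's share of the top-n total, as a percentage.
    shares["percent"] = (totals / totals.sum() * 100).round(1)
    return shares


shares = product_shares(trade, "United States", "Road")
print(shares)
```

Note that the shares are computed relative to the top-n total rather than all imports, so they add up to 100% in the chart.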
overall_trade.py provides a comparative view of imports and exports:
- Grouped by country and trade type (Import/Export)
- Side-by-side bar chart comparing import and export values
- Interactive hover tooltips with formatted values
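
One way to prepare the data for a side-by-side bar chart is to group by country and trade type and then reshape to one column per trade type. The sketch below assumes this approach; the sample rows and the `trade_by_country` helper are hypothetical.

```python
import pandas as pd

# Hypothetical sample using the processed_trade.csv column names.
trade = pd.DataFrame(
    {
        "COUNTRY": ["Mexico", "Mexico", "Canada", "Canada", "Mexico"],
        "Trade": ["Import", "Export", "Import", "Export", "Import"],
        "VALUE": [100.0, 70.0, 50.0, 90.0, 25.0],
    }
)


def trade_by_country(df: pd.DataFrame) -> pd.DataFrame:
    """Sum VALUE per country and trade type, one column per trade type."""
    grouped = df.groupby(["COUNTRY", "Trade"])["VALUE"].sum()
    # Move the Trade level into columns; missing combinations become 0.
    return grouped.unstack("Trade", fill_value=0.0)


comparison = trade_by_country(trade)
print(comparison)
```

Each row of `comparison` then holds the two bar heights for one country.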
All visualizations are created using Bokeh and saved as interactive HTML files. Features include:
- Hover tooltips for detailed information
- Interactive tools (zoom, pan, reset)
- Responsive design
- Professional styling
The processed trade data includes the following columns:
- DATE - Reference date in MM/YYYY format
- COUNTRY - Principal trading partner country
- MODE - Mode of transport (Road, Air, Rail, etc.)
- PRODUCT - Product classification with code
- Trade - Trade type (Import/Export)
- VALUE - Trade value or volume
Key features:
- Automated Data Cleaning - Handles missing values, formatting, and filtering
- Interactive Visualizations - Bokeh-based charts for data exploration
- Scalable Analysis - Easy to extend with additional analyses
- Clean Code Structure - Modular functions for data processing and visualization
Possible future enhancements:
- Add time series analysis of trade trends
- Implement correlation analysis between products
- Create dashboard combining multiple visualizations
- Add forecasting models for future trade patterns
- Include geographic mapping of trade data
This project is available for educational purposes.
Data Science Learning Project