Kaggle Competition: American Express Default Prediction
Overview
Led a predictive analytics project that ranked 20th out of 4,874 teams (top 0.4%) and earned a Silver Medal in the American Express Default Prediction Kaggle competition.
Key Technologies
- Machine Learning: LightGBM (DART), XGBoost (GPU-accelerated), CatBoost
- Ensemble Methods: Linear weighted averages, correlation-based weighting
- Feature Engineering: Time-series features, lag features, rolling statistics
Achievement
🏆 Ranked 20th out of 4,874 teams (Top 0.4%)
🥈 Silver Medal
Technical Approach
- Developed a weighted ensemble of LightGBM (DART) and GPU-accelerated XGBoost models
- Processed 16 GB of tabular time-series data covering transactions, balances, delinquencies, and repayments
- Optimized hyperparameters via grid search and stratified 5-fold cross-validation
- Designed diverse feature sets, including:
  - Lag features
  - Rolling statistics
  - Trend indicators
  - Transaction patterns
  - Temporal aggregations
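The feature families listed above can be illustrated with a minimal pandas sketch. It is not the competition code: the column names (`customer_id`, `statement_date`, `balance`), the window size, and the helper name `build_customer_features` are all hypothetical, chosen only to show how lag, rolling, trend, and aggregate features might be derived per customer and collapsed to one row each.

```python
import pandas as pd


def build_customer_features(df, id_col="customer_id", date_col="statement_date",
                            value_col="balance", window=3):
    """Collapse a per-statement history into one feature row per customer.

    All column names and the window size are illustrative assumptions.
    """
    # Order each customer's statements chronologically.
    df = df.sort_values([id_col, date_col]).copy()
    series = df.groupby(id_col, sort=False)[value_col]

    # Lag feature: the previous statement's value (NaN for the first statement).
    df["lag_1"] = series.shift(1)
    # Rolling statistic: mean over the most recent `window` statements.
    df["roll_mean"] = series.transform(
        lambda s: s.rolling(window, min_periods=1).mean()
    )
    # Short-term change: current value minus the previous one.
    df["diff_1"] = df[value_col] - df["lag_1"]

    # Keep the most recent statement per customer as the base row.
    latest = (
        df.groupby(id_col, sort=False).tail(1)
        .set_index(id_col)[[value_col, "lag_1", "roll_mean", "diff_1"]]
        .rename(columns={value_col: "last"})
    )

    # Temporal aggregations over the whole history.
    agg = series.agg(["mean", "min", "max", "first"])
    latest["mean"] = agg["mean"]
    latest["range"] = agg["max"] - agg["min"]
    # Trend indicator: change from the first to the latest statement.
    latest["trend"] = latest["last"] - agg["first"]
    return latest


if __name__ == "__main__":
    history = pd.DataFrame({
        "customer_id": ["A", "A", "A", "B"],
        "statement_date": ["2022-01", "2022-02", "2022-03", "2022-01"],
        "balance": [10.0, 20.0, 40.0, 5.0],
    })
    print(build_customer_features(history))
```

Collapsing to one row per customer keeps the feature table small enough for gradient-boosted models, while the lag and rolling columns preserve some of the ordering information that plain aggregates discard.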
Model Strategy
- Ensemble Architecture: Combined multiple model types with correlation-based weights
- Stability Enhancement: Trained multiple seeds to boost prediction stability
- Feature Diversity: Extracted temporal patterns from time-series credit data
- Efficient Design: Delivered compact, high-performing solution
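As a rough illustration of the seed-averaging and correlation-based weighting ideas above, the sketch below uses NumPy only. The exact weighting scheme used in the competition is not described here, so this assumes one plausible variant: each model's weight is proportional to one minus its mean correlation with the other models, which favors models that add diversity. The function names `seed_average`, `correlation_weights`, and `blend` are hypothetical.

```python
import numpy as np


def seed_average(preds_by_seed):
    """Average predictions from the same model trained with different seeds."""
    return np.mean(np.asarray(preds_by_seed, dtype=float), axis=0)


def correlation_weights(preds):
    """Weight models by how little they correlate with the others.

    preds has shape (n_models, n_samples). This rule is an assumed
    variant of correlation-based weighting, not a confirmed recipe.
    """
    preds = np.asarray(preds, dtype=float)
    n_models = preds.shape[0]
    if n_models < 2:
        return np.ones(n_models)
    corr = np.corrcoef(preds)
    # Mean correlation with the other models (exclude the diagonal of ones).
    mean_corr = (corr.sum(axis=1) - 1.0) / (n_models - 1)
    # Less-correlated models get larger raw scores; clip to stay positive.
    raw = np.clip(1.0 - mean_corr, 1e-12, None)
    return raw / raw.sum()


def blend(preds, weights):
    """Linear weighted average of model predictions."""
    return np.average(np.asarray(preds, dtype=float), axis=0, weights=weights)


if __name__ == "__main__":
    model_preds = np.array([
        [0.1, 0.2, 0.3, 0.4],  # e.g. a seed-averaged LightGBM model
        [0.1, 0.2, 0.3, 0.4],  # a near-duplicate model
        [0.4, 0.1, 0.3, 0.2],  # a more diverse model
    ])
    weights = correlation_weights(model_preds)
    print(weights)
    print(blend(model_preds, weights))
```

In practice the weights would be computed on out-of-fold predictions rather than on test predictions, so the blend can be validated against held-out labels before submission.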
Impact
- Demonstrated ability to work with large-scale financial data
- Showcased expertise in ensemble learning and feature engineering
- Achieved top-tier performance in a highly competitive global competition
- Led model tuning and ensemble strategy for the team
Competition Details
American Express – Default Prediction
Platform: Kaggle
Timeframe: May 2022 – August 2022
Dataset: 16 GB of tabular time-series credit card data
