This project analyzes instructor performance on an EdTech platform using data from 2,000 batches. The goal is to quantify "effectiveness" with a multi-factor composite score and to use machine learning to classify instructors into three performance tiers (Low, Medium, High).
The model processes the following student engagement and outcome metrics:
- Completion Rate: Percentage of students finishing the course.
- Dropout Rate: Percentage of students leaving the course early.
- Score Improvement: Average increase in student performance scores.
- Engagement: Average watch time, quiz scores, assignment submission rates, and forum activity.
- Feedback: Average feedback scores and instructor response rates.
So that features measured on different scales contribute comparably, variables such as `avg_score_improvement` and `avg_quiz_score` were normalized to a common range.
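
The README does not say which normalization was used. As a minimal sketch, assuming min-max scaling to the range [0, 1], the step could look like this. The helper name `min_max_normalize` and the toy values are hypothetical and are not taken from `main.ipynb`.

```python
import pandas as pd

def min_max_normalize(df, columns):
    """Rescale each listed column to [0, 1]; a constant column becomes 0.0."""
    out = df.copy()
    for col in columns:
        lo, hi = out[col].min(), out[col].max()
        span = hi - lo
        out[col] = 0.0 if span == 0 else (out[col] - lo) / span
    return out

# Toy values for illustration only; not from the project dataset.
batches = pd.DataFrame({
    "avg_score_improvement": [5.0, 10.0, 15.0],
    "avg_quiz_score": [60.0, 80.0, 100.0],
})
normalized = min_max_normalize(batches, ["avg_score_improvement", "avg_quiz_score"])
```

After scaling, both columns span the same range, so a 10% weight on one means the same thing as a 10% weight on the other.
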
A custom weighted formula was developed to quantify performance:
- 30%: Completion Rate
- 20%: Retention (1 - Dropout Rate)
- 10% each: Score Improvement, Quiz Scores, Watch Time, and Assignment Submissions
- 5% each: Forum Activity and Feedback Scores
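
The weights above can be written as a small scoring function. This is a sketch rather than the notebook's actual code: apart from `avg_score_improvement` and `avg_quiz_score`, the metric names are assumed, and every input is assumed to be already normalized to [0, 1].

```python
# Weights from the list above; they sum to 1.0.
WEIGHTS = {
    "completion_rate": 0.30,
    "retention": 0.20,                    # derived as 1 - dropout_rate
    "avg_score_improvement": 0.10,
    "avg_quiz_score": 0.10,
    "avg_watch_time": 0.10,               # assumed name
    "assignment_submission_rate": 0.10,   # assumed name
    "forum_activity": 0.05,               # assumed name
    "avg_feedback_score": 0.05,           # assumed name
}

def effectiveness_score(metrics):
    """Weighted sum of normalized metrics for one batch or instructor."""
    values = dict(metrics)
    values["retention"] = 1.0 - values["dropout_rate"]
    return sum(weight * values[name] for name, weight in WEIGHTS.items())

# Hypothetical instructor: strong completion, low dropout, average elsewhere.
example = {
    "completion_rate": 0.8,
    "dropout_rate": 0.1,
    "avg_score_improvement": 0.5,
    "avg_quiz_score": 0.5,
    "avg_watch_time": 0.5,
    "assignment_submission_rate": 0.5,
    "forum_activity": 0.5,
    "avg_feedback_score": 0.5,
}
score = effectiveness_score(example)  # about 0.67
```

Because the weights sum to 1 and the inputs lie in [0, 1], the score also stays in [0, 1], which keeps the quantile-based tiers easy to interpret.
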
- Algorithm: Random Forest Classifier.
- Logic: Each instructor's mean composite score was computed, and instructors were split into 3 tiers (Low, Medium, High) using quantiles.
- Split: 80% Training / 20% Testing.
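
The pipeline described above can be sketched end to end on synthetic data. Several details here are assumptions, not facts from the notebook: the column names, the placeholder composite score, mapping each instructor's tier back onto batch rows, the stratified split, and the `random_state` values.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the batch data (not the project dataset).
rng = np.random.default_rng(0)
n = 600
batches = pd.DataFrame({
    "instructor_id": rng.integers(0, 60, size=n),
    "completion_rate": rng.uniform(0, 1, size=n),
    "dropout_rate": rng.uniform(0, 1, size=n),
    "avg_score_improvement": rng.uniform(0, 1, size=n),
})
# Placeholder composite score; the real formula uses more metrics and other weights.
batches["effectiveness"] = (
    0.5 * batches["completion_rate"]
    + 0.3 * (1 - batches["dropout_rate"])
    + 0.2 * batches["avg_score_improvement"]
)

# Mean score per instructor, cut into three equal-sized quantile bins.
instructor_mean = batches.groupby("instructor_id")["effectiveness"].mean()
tiers = pd.qcut(instructor_mean, q=3, labels=["Low", "Medium", "High"])
batches["tier"] = batches["instructor_id"].map(tiers).astype(str)

# 80/20 split; stratifying on the tier is an assumption.
feature_cols = ["completion_rate", "dropout_rate", "avg_score_improvement"]
X_train, X_test, y_train, y_test = train_test_split(
    batches[feature_cols], batches["tier"],
    test_size=0.2, random_state=42, stratify=batches["tier"],
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
```

On random data like this the accuracy is not meaningful; the sketch only shows how the labeling, the split, and the classifier fit together.
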
The model achieved an accuracy of ~92%.

| Precision | Recall | F1-Score |
| --- | --- | --- |
| 0.94 | 0.92 | 0.92 |

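
The README does not state how the per-class metrics were averaged. The sketch below assumes support-weighted averaging, under which recall equals accuracy. The labels are toy values chosen for illustration, not model output.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Toy labels for illustration only.
y_true = ["Low", "Low", "Medium", "Medium", "High", "High"]
y_pred = ["Low", "Medium", "Medium", "Medium", "High", "High"]

accuracy = accuracy_score(y_true, y_pred)
# average="weighted" is an assumption about how the reported numbers were computed.
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0
)
```
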
- Completion Rate (~33.9%): The strongest predictor of success.
- Dropout Rate (~28.2%): A critical indicator of low engagement.
- Avg Score Improvement (~10.1%): Reflects direct learning outcomes.
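
Percentages like these are typically read from a fitted forest's `feature_importances_` attribute. A minimal sketch on synthetic data follows. The label here depends only on `completion_rate` by construction, so the ranking it produces says nothing about the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic features (not the project dataset).
rng = np.random.default_rng(1)
X = pd.DataFrame({
    "completion_rate": rng.uniform(0, 1, 300),
    "dropout_rate": rng.uniform(0, 1, 300),
    "avg_score_improvement": rng.uniform(0, 1, 300),
})
# Synthetic tier derived from completion_rate alone, for illustration.
y = pd.cut(
    X["completion_rate"], bins=[0, 1 / 3, 2 / 3, 1],
    labels=["Low", "Medium", "High"], include_lowest=True,
).astype(str)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Importances sum to 1; multiply by 100 to report them as percentages.
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
```
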
- Misleading Variables: Assignment submission rates might be influenced more by course difficulty than instructor skill.
- Real-world Risks: The model does not capture qualitative data like teaching style or communication skills.
- Recommendation: The model should be used as a supporting tool rather than the sole metric for performance evaluation.
- Clone this repository.
- Ensure you have `pandas`, `seaborn`, `matplotlib`, and `scikit-learn` installed.
- Update the CSV file path in the first code cell of `main.ipynb`.
- Run the notebook cells sequentially.