About this project
This project involved analyzing a comprehensive HR dataset encompassing various employee attributes and performance metrics. The dataset included details on employee demographics (age, gender, race, marital status, etc.), employment history (hire date, termination date, reasons for termination), compensation (salary), performance evaluations, engagement survey results, and absenteeism records. The goal was to extract meaningful insights regarding employee performance, identify factors influencing employee turnover and satisfaction, and explore potential correlations between different employee characteristics and their performance scores.
My analysis involved several key steps, including data cleaning, exploratory data analysis, and statistical modeling. I explored relationships between performance scores and salary, tenure, demographics, department, and management styles. Additionally, I investigated the factors contributing to employee attrition and employee satisfaction levels. The results were visualized using various charts and graphs to facilitate clear communication and interpretation.
Task to be done
I. Performance and Compensation
Task: Investigate the correlation between PerformanceScore and Salary.
Question: Do higher performers receive higher salaries? The impact of SpecialProjectsCount should also be considered.
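A minimal sketch of how this check could look in pandas, using only the five rows from the dataset sample on this page. It assumes PerfScoreID is an ordinal encoding of PerformanceScore (in the sample, 4 = Exceeds and 3 = Fully Meets) and uses a rank correlation, computed as Pearson on ranks, because the score is ordinal. The helper name `rank_corr` is hypothetical, and five rows are far too few to draw conclusions.

```python
import pandas as pd

# Values copied from the five rows of the dataset sample.
sample = pd.DataFrame({
    "PerfScoreID": [4, 3, 3, 3, 3],
    "Salary": [62506, 104437, 64955, 64991, 50825],
    "SpecialProjectsCount": [0, 6, 0, 0, 0],
})


def rank_corr(df, x, y):
    """Spearman-style correlation: Pearson correlation of the ranks."""
    return df[x].rank().corr(df[y].rank())


# Overall association between performance level and salary.
rho_all = rank_corr(sample, "PerfScoreID", "Salary")

# Median salary per performance level.
median_salary = sample.groupby("PerfScoreID")["Salary"].median()

# Crude control for SpecialProjectsCount: repeat the check among
# employees with no special projects.
no_projects = sample[sample["SpecialProjectsCount"] == 0]
rho_no_projects = rank_corr(no_projects, "PerfScoreID", "Salary")
```

On the full dataset, comparing `rho_all` with `rho_no_projects` gives a first hint of whether special projects drive part of the salary difference.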
II. Performance and Demographics
Task: Analyze if there’s a relationship between PerformanceScore and demographic factors like Sex, RaceDesc, HispanicLatino, or Age (calculated from DOB).
Interpret these results with caution: the focus is on identifying potential disparities, not on making causal claims.
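The age calculation could be sketched as follows, again on the five sample rows only. The reference date `AS_OF` is an assumption made for illustration; the right "as of" date depends on when the data was extracted.

```python
import pandas as pd

# DOB, Sex and PerformanceScore copied from the five sample rows.
sample = pd.DataFrame({
    "DOB": ["7/10/1983", "5/5/1975", "9/19/1988", "9/27/1988", "9/8/1989"],
    "Sex": ["M", "M", "F", "F", "F"],
    "PerformanceScore": ["Exceeds", "Fully Meets", "Fully Meets",
                         "Fully Meets", "Fully Meets"],
})

# Hypothetical reference date, not taken from the dataset.
AS_OF = pd.Timestamp("2019-03-01")

dob = pd.to_datetime(sample["DOB"], format="%m/%d/%Y")

# True where the birthday has not yet occurred in the reference year.
birthday_pending = (dob.dt.month > AS_OF.month) | (
    (dob.dt.month == AS_OF.month) & (dob.dt.day > AS_OF.day)
)
sample["Age"] = AS_OF.year - dob.dt.year - birthday_pending.astype(int)

# Counts of each performance level per group; descriptive only.
counts = pd.crosstab(sample["Sex"], sample["PerformanceScore"])
```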
III. Absenteeism and Performance
Task: To examine the correlation between Absences and PerformanceScore.
Question: Do employees with higher absences tend to perform worse? Consider also the interaction with DaysLateLast30.
IV. Departmental Performance
Task: To compare the average PerformanceScore across different Departments.
Question: Are certain departments consistently outperforming others?
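One way to sketch the departmental comparison is a grouped mean with the group size next to it, since a mean over very few employees is not reliable. The example uses PerfScoreID from the five sample rows as a numeric stand-in for PerformanceScore, which is an assumption.

```python
import pandas as pd

# Department and PerfScoreID copied from the five sample rows.
sample = pd.DataFrame({
    "Department": ["Production", "IT/IS", "Production",
                   "Production", "Production"],
    "PerfScoreID": [4, 3, 3, 3, 3],
})

# Defensive cleanup in case the raw column carries stray whitespace.
sample["Department"] = sample["Department"].str.strip()

# Mean score and headcount per department, best first.
dept_summary = (
    sample.groupby("Department")["PerfScoreID"]
    .agg(["mean", "count"])
    .sort_values("mean", ascending=False)
)
```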
V. Recruitment Source and Performance
Task: To analyze if the RecruitmentSource has an impact on PerformanceScore or employee retention (Termd).
VI. Tenure and Performance
Task: To explore if employee tenure (time in the company, calculated from DateofHire) correlates with PerformanceScore or EmpSatisfaction.
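Tenure could be derived as sketched below. For terminated employees the clock stops at DateofTermination; for active employees the sketch uses a hypothetical reference date `AS_OF`, which is an assumption and not part of the dataset.

```python
import pandas as pd

# Dates and EmpSatisfaction copied from the five sample rows;
# None marks employees without a termination date.
sample = pd.DataFrame({
    "DateofHire": ["7/5/2011", "3/30/2015", "7/5/2011", "1/7/2008", "7/11/2011"],
    "DateofTermination": [None, "6/16/2016", "9/24/2012", None, "9/6/2016"],
    "EmpSatisfaction": [5, 3, 3, 5, 4],
})

# Hypothetical reference date for employees who are still active.
AS_OF = pd.Timestamp("2019-03-01")

hire = pd.to_datetime(sample["DateofHire"], format="%m/%d/%Y")
term = pd.to_datetime(sample["DateofTermination"], format="%m/%d/%Y")

# Use the termination date where present, otherwise the reference date.
end = term.fillna(AS_OF)
sample["TenureDays"] = (end - hire).dt.days
sample["TenureYears"] = sample["TenureDays"] / 365.25

# Average tenure at each satisfaction level.
tenure_by_satisfaction = sample.groupby("EmpSatisfaction")["TenureYears"].mean()
```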
VII. Marital Status and Performance
Task: To investigate if MaritalStatusID or MaritalDesc is related to PerformanceScore or Absences.
VIII. Manager Impact
Task: To analyze if employees under certain managers (ManagerName or ManagerID) exhibit different performance patterns.
Other insights to explore
Employee Turnover
Task: To analyze the TermReason and DateofTermination to understand why employees leave the company. Segment the reasons and explore potential trends.
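Segmenting the termination reasons could start with simple counts among former employees, as in this sketch on the five sample rows. It assumes Termd == 1 marks a terminated employee, which matches the sample.

```python
import pandas as pd

# Termd, TermReason and Department copied from the five sample rows.
sample = pd.DataFrame({
    "Termd": [0, 1, 1, 0, 1],
    "TermReason": ["N/A-StillEmployed", "career change", "hours",
                   "N/A-StillEmployed", "return to school"],
    "Department": ["Production", "IT/IS", "Production",
                   "Production", "Production"],
})

# Keep only employees who have left the company.
leavers = sample[sample["Termd"] == 1]

# How often each reason occurs, as counts and as shares.
reason_counts = leavers["TermReason"].value_counts()
reason_share = leavers["TermReason"].value_counts(normalize=True)

# Reasons broken down by department.
reasons_by_dept = pd.crosstab(leavers["TermReason"], leavers["Department"])
```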
Diversity and Inclusion
Task: To analyze the representation of different demographic groups (RaceDesc, HispanicLatino, Sex) within the company. To calculate representation ratios for each department.
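Representation ratios per department could be computed with a row-normalized cross-tabulation. The sketch below does this for Sex on the five sample rows; the same call should work for RaceDesc or HispanicLatino.

```python
import pandas as pd

# Department and Sex copied from the five sample rows.
sample = pd.DataFrame({
    "Department": ["Production", "IT/IS", "Production",
                   "Production", "Production"],
    "Sex": ["M", "M", "F", "F", "F"],
})

# Share of each group within every department (each row sums to 1).
representation = pd.crosstab(
    sample["Department"], sample["Sex"], normalize="index"
)
```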
Employee Engagement and Satisfaction
Task: To explore relationships between EngagementSurvey, EmpSatisfaction, PerformanceScore, and Absences.
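A first look at these relationships could be a rank-correlation matrix, sketched here on the five sample rows. PerfScoreID is used as a numeric stand-in for PerformanceScore, which is an assumption, and ranks are used because several of these measures are ordinal.

```python
import pandas as pd

# Values copied from the five rows of the dataset sample.
sample = pd.DataFrame({
    "EngagementSurvey": [4.6, 4.96, 3.02, 4.84, 5.0],
    "EmpSatisfaction": [5, 3, 3, 5, 4],
    "PerfScoreID": [4, 3, 3, 3, 3],
    "Absences": [1, 17, 3, 15, 2],
})

# Pearson correlation of the ranks, i.e. a Spearman-style matrix.
rank_corr_matrix = sample.rank().corr()
```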
Salary Distribution
Task: To analyze the distribution of Salary across departments, positions, and performance levels, and to check for potential pay gaps.
Time Series Analysis
Task: To use DateofHire, LastPerformanceReview_Date, and DateofTermination for time-series analyses to spot trends in hiring, performance, and turnover.
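Yearly hiring and turnover counts could be built as in the sketch below, which uses the hire and termination dates of the five sample rows. Years with no events in one of the two series are filled with zero.

```python
import pandas as pd

# Dates copied from the five sample rows; None means no termination date.
sample = pd.DataFrame({
    "DateofHire": ["7/5/2011", "3/30/2015", "7/5/2011", "1/7/2008", "7/11/2011"],
    "DateofTermination": [None, "6/16/2016", "9/24/2012", None, "9/6/2016"],
})

hire = pd.to_datetime(sample["DateofHire"], format="%m/%d/%Y")
term = pd.to_datetime(sample["DateofTermination"], format="%m/%d/%Y")

# Events per calendar year; missing termination dates are dropped first.
hires_per_year = hire.dt.year.value_counts().sort_index()
terms_per_year = term.dropna().dt.year.value_counts().sort_index()

# Align both series on the union of years.
trend = (
    pd.DataFrame({"hires": hires_per_year, "terminations": terms_per_year})
    .fillna(0)
    .astype(int)
)
```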

Tools to be used
Python libraries such as Pandas, NumPy, and Scikit-learn, together with visualization libraries (Matplotlib, Seaborn), will be helpful. Google Project IDX (cloud SaaS) will be used for editing.
Dataset sample
Employee_Name | EmpID | MarriedID | MaritalStatusID | GenderID | EmpStatusID | DeptID | PerfScoreID | FromDiversityJobFairID | Salary | Termd | PositionID | Position | State | Zip | DOB | Sex | MaritalDesc | CitizenDesc | HispanicLatino | RaceDesc | DateofHire | DateofTermination | TermReason | EmploymentStatus | Department | ManagerName | ManagerID | RecruitmentSource | PerformanceScore | EngagementSurvey | EmpSatisfaction | SpecialProjectsCount | LastPerformanceReview_Date | DaysLateLast30 | Absences |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Adinolfi, Wilson K | 10026 | 0 | 0 | 1 | 1 | 5 | 4 | 0 | 62506 | 0 | 19 | Production Technician I | MA | 1960 | 7/10/1983 | M | Single | US Citizen | No | White | 7/5/2011 | | N/A-StillEmployed | Active | Production | Michael Albert | 22 | | Exceeds | 4.6 | 5 | 0 | 1/17/2019 | 0 | 1 |
Ait S, Karthikeyan | 10084 | 1 | 1 | 1 | 5 | 3 | 3 | 0 | 104437 | 1 | 27 | Sr. DBA | MA | 2148 | 5/5/1975 | M | Married | US Citizen | No | White | 3/30/2015 | 6/16/2016 | career change | Voluntarily Terminated | IT/IS | Simon Roup | 4 | Indeed | Fully Meets | 4.96 | 3 | 6 | 2/24/2016 | 0 | 17 |
Akinkuolie, Sarah | 10196 | 1 | 1 | 0 | 5 | 5 | 3 | 0 | 64955 | 1 | 20 | Production Technician II | MA | 1810 | 9/19/1988 | F | Married | US Citizen | No | White | 7/5/2011 | 9/24/2012 | hours | Voluntarily Terminated | Production | Kissy Sullivan | 20 | | Fully Meets | 3.02 | 3 | 0 | 5/15/2012 | 0 | 3 |
Alagbe,Trina | 10088 | 1 | 1 | 0 | 1 | 5 | 3 | 0 | 64991 | 0 | 19 | Production Technician I | MA | 1886 | 9/27/1988 | F | Married | US Citizen | No | White | 1/7/2008 | | N/A-StillEmployed | Active | Production | Elijiah Gray | 16 | Indeed | Fully Meets | 4.84 | 5 | 0 | 1/3/2019 | 0 | 15 |
Anderson, Carol | 10069 | 0 | 2 | 0 | 5 | 5 | 3 | 0 | 50825 | 1 | 19 | Production Technician I | MA | 2169 | 9/8/1989 | F | Divorced | US Citizen | No | White | 7/11/2011 | 9/6/2016 | return to school | Voluntarily Terminated | Production | Webster Butler | 39 | Google Search | Fully Meets | 5 | 4 | 0 | 2/1/2016 | 0 | 2 |


