My Journey into Data Science
A first-hand look at American Tire Distributors’ Data Science Accelerator program
At American Tire Distributors (ATD), we’re committed to helping our associates be at their best by making sure they have the resources needed to take charge of their learning and professional development. One example of this is the Data Science Accelerator (DSA) program. At the end of 2019, our analytics team launched the ATD Data Science Accelerator Program, which gave associates an opportunity to elevate their skills in data science and machine learning.
Meet Sridhar, an inaugural DSA program graduate.
In this blog post, Sridhar shares his journey into data science and how he embraces our culture of knowledge. Going from zero to data scientist in eight months is impressive, and we’re excited for him to share his experience.
For the last 20 years, I’ve held various IT roles and supported multiple enterprise projects at different companies. One thing that stayed constant was my eagerness to learn new technologies. One area I’ve always wanted to learn more about is data science because it blends my favorite subjects of mathematics, statistics and computer science. But I wasn’t sure where to start.
This all changed when I learned about the brand-new DSA program that ATD’s analytics team had put together. Right away I knew this was the start of the journey into data science that I’d been waiting for. With no hesitation, I filled out my application and waited anxiously as the analytics team reviewed applicants.
It was a dream come true when I learned I’d been accepted into the inaugural DSA program. I was filled with positive energy and enthusiasm, ready to learn new skills and immerse myself in the world of data science.
Kicking off the DSA program
To help balance the DSA program and my day-to-day responsibilities, I established a routine to help maximize my time. The program required a commitment of 5 to 10 hours per week for learning activities along with a recurring meeting with the mentors and participants. The meeting provided an opportunity to discuss any issues we were experiencing and what to expect throughout the initial 12 weeks of virtual learning modules.
The virtual learning modules were a mix of courses from DataCamp, Pluralsight, and other online learning sites. Each course finished with an in-depth lab exercise that was extremely helpful. The labs put my new knowledge to the test and helped me learn even more. Being able to learn at my own pace helped grow my knowledge and strengthened my ability to solve complex problems with data-driven solutions.
After completing the virtual learning, I was looking forward to working on my Capstone Project.
The Capstone Project
To leverage my new skills and make a positive impact, I identified an opportunity that would benefit associates across ATD. My Capstone Project focused on improving the way associates interacted with ServiceNow when creating incident tickets. With requests taking several minutes each to complete, I knew there had to be a more efficient way. An associate is required to fill in the correct Business Service, Category and Subcategory, as these data fields assign the appropriate priority ranking and direct each request to the proper team.
My solution would decrease the time it takes to complete an incident ticket and increase priority ranking accuracy. By leveraging a machine learning algorithm, the process of completing the required data fields would be automated, minimizing the data entry required by associates while helping to assign the right priority ranking.
Let’s take a closer look at how the machine learning algorithm was built:
I first needed to obtain data the model could use. My primary source of data was our ServiceNow application, and the dataset contained approximately 45,000 records and 25 columns that could be used to filter down the data. I had my work cut out for me.
Once I gathered the data, I focused on data preprocessing. This is one of the most important steps in machine learning, as it helps improve the ML model’s accuracy. I began cleaning up the raw data by handling the missing data and outliers. I then had to convert the categorical data, like the Business Service, Category and Subcategory, into numeric features so that the information could be processed by my ML model.
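The preprocessing steps described above can be sketched in Python with pandas. This is a minimal illustration, not the code used in the project: the toy rows, the `resolution_minutes` column, the "Unknown" placeholder, the median fill, and the 5th/95th percentile clipping rule are all assumptions made for the example. Only the Business Service, Category and Subcategory field names come from the ServiceNow data described here.

```python
import pandas as pd

# Toy stand-in for the ServiceNow export (hypothetical rows and values).
raw = pd.DataFrame(
    {
        "business_service": ["Email", "Network", None, "Email", "ERP"],
        "category": ["Software", "Hardware", "Software", None, "Software"],
        "subcategory": ["Outlook", "Router", "Outlook", "Outlook", "Login"],
        "resolution_minutes": [30.0, 45.0, None, 20.0, 5000.0],
    }
)

CATEGORICAL = ["business_service", "category", "subcategory"]


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Fill missing values, clip outliers, and encode categories as integers."""
    out = df.copy()

    # Missing categorical values get an explicit placeholder label.
    out[CATEGORICAL] = out[CATEGORICAL].fillna("Unknown")

    # Missing numeric values are filled with the column median.
    col = "resolution_minutes"
    out[col] = out[col].fillna(out[col].median())

    # Outliers are clipped to the 5th and 95th percentiles (an assumed rule).
    low, high = out[col].quantile(0.05), out[col].quantile(0.95)
    out[col] = out[col].clip(lower=low, upper=high)

    # Each categorical column becomes an integer code column.
    for name in CATEGORICAL:
        out[name] = out[name].astype("category").cat.codes

    return out


clean = preprocess(raw)
print(clean)
```

Integer codes are the simplest encoding and work well with tree-based models. For linear models, one-hot encoding (for example with `pd.get_dummies`) is often a safer choice because it does not imply an ordering between categories.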
Using different data visualization packages and tools such as Matplotlib, Plotly, Tableau, and Power BI, I was able to build graphical representations of the data. This helped me identify and understand the trends and patterns within the data.
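As a small example of this kind of exploration, the sketch below draws a bar chart of ticket counts per category with Matplotlib. The category values and the output file name are hypothetical, and the chart is only one of many views that could be built from the ticket data.

```python
from collections import Counter

import matplotlib

matplotlib.use("Agg")  # render off-screen so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical Category values for a handful of tickets.
categories = ["Software", "Hardware", "Software", "Network", "Software", "Hardware"]

# Count tickets per category, most frequent first.
counts = Counter(categories)
names, values = zip(*counts.most_common())

fig, ax = plt.subplots(figsize=(6, 3))
ax.bar(names, values)
ax.set_xlabel("Category")
ax.set_ylabel("Number of tickets")
ax.set_title("Incident tickets by category (toy data)")
fig.tight_layout()
fig.savefig("tickets_by_category.png")
```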
Choosing the right model
Next, I needed to determine which model would perform best. I explored different classification algorithms, including Scikit-learn’s Decision Tree, Random Forest, Support Vector Machine, and Logistic Regression models, as well as XGBoost, a separate library with a Scikit-learn-compatible interface. Knowing I needed a highly efficient and flexible model, I decided to use the XGBoost classification model. Once the model was trained on the dataset and produced its predicted target values, it could be used to automate filling in the incident ticket fields.
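One common way to compare candidate classifiers is cross-validation on the same data. The sketch below is an illustration under assumptions, not the project’s code: it uses a synthetic dataset from `make_classification` in place of the ServiceNow records, mostly default hyperparameters, and mean 5-fold accuracy as the yardstick. XGBoost is left out so the example depends only on Scikit-learn; its `XGBClassifier` follows the same estimator interface and could be added to the dictionary if the `xgboost` package is installed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the encoded ticket data (hypothetical).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, n_classes=3, random_state=0
)

candidates = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Support Vector Machine": SVC(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Mean 5-fold cross-validated accuracy for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in candidates.items()
}

# Print the candidates from best to worst.
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Scoring every model on the same folds keeps the comparison fair. Accuracy alone can be misleading when some categories are rare, so other `scoring` options may be worth checking as well.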
Model evaluation and outcome
Now came the real test. Did I choose the right model, and would it work? To assess the accuracy of the model, I used a confusion matrix, which showed how well the chosen model’s predictions matched the actual values. The bagging and boosting ensemble techniques allowed me to decrease the variance and adjust the weights of my dataset’s classifications, strengthening the model’s performance. The model achieved an accuracy of more than 78%, along with a precision and recall score of 98%. This meant a significant reduction in ticket creation time and improved resolution time for the HelpDesk team. On average, the time to create an incident ticket was reduced by 30 to 90 seconds, equivalent to 3 to 4 hours per day or 125 hours per month collectively. The result was a more positive associate experience and more time back in associates’ day.
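To show how accuracy, precision, and recall are read off a confusion matrix, here is a minimal sketch in plain Python. The eight actual and predicted labels are made up for illustration and are unrelated to the project’s results; in practice, Scikit-learn’s `confusion_matrix` and `classification_report` compute the same quantities.

```python
from collections import Counter

# Hypothetical true and predicted Category labels for eight tickets.
actual = ["Hardware", "Software", "Software", "Network",
          "Hardware", "Software", "Network", "Software"]
predicted = ["Hardware", "Software", "Hardware", "Network",
             "Hardware", "Software", "Software", "Software"]

# Confusion matrix as counts keyed by (actual, predicted) pairs.
matrix = Counter(zip(actual, predicted))
labels = sorted(set(actual) | set(predicted))

# Accuracy: share of tickets on the diagonal of the matrix.
accuracy = sum(matrix[(label, label)] for label in labels) / len(actual)


def precision(label: str) -> float:
    """Of the tickets predicted as `label`, the share that truly were."""
    predicted_total = sum(matrix[(other, label)] for other in labels)
    return matrix[(label, label)] / predicted_total if predicted_total else 0.0


def recall(label: str) -> float:
    """Of the tickets that truly were `label`, the share predicted as such."""
    actual_total = sum(matrix[(label, other)] for other in labels)
    return matrix[(label, label)] / actual_total if actual_total else 0.0


print(f"accuracy: {accuracy:.2f}")
for label in labels:
    print(f"{label}: precision={precision(label):.2f} recall={recall(label):.2f}")
```

Precision and recall are computed per class here. A single reported figure for a multi-class model is usually an average of these per-class values.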
Reflecting on the DSA program
The structure of the program is one of the reasons I chose to apply. The ability to learn in a virtual setting while still having support from the DSA mentors impressed me. The learning modules were very well thought out, and the flow of content provided a great learning experience.
I loved the entire experience and learned more than I could’ve ever imagined. The highlight of the program for me was the Capstone Project and being able to put my new skills to the test. I can’t thank the mentors David, Sid, and Salman enough for their support throughout the program.
I couldn’t be prouder to show off my Certificate of Completion and be a part of the inaugural DSA program!