Analytics Practice from Practicum Project

Angela Wang
Apr 20, 2021

It’s graduation season! Are you ready to start job hunting? It’s going to be a tough process. Luckily, the UCD MSBA program has prepared me well, especially through the practicum project.

The year-long practicum project not only helped me gain industry experience but also honed my technical skills. The experience and knowledge I gained from it will contribute to my future career development. In this blog, I will describe in detail how the UCD MSBA practicum project gave me a unique opportunity to grow my professional capabilities.

  • How practicum prepared me for technical rounds.
  • How practicum prepared me for the business side.

How did practicum prepare me for technical rounds?

Throughout the practicum project, our team used SQL to extract and transform data and built machine learning models to help the organization improve its reliability. We also built dashboards in Tableau to track flight performance and yearly flight progress. Machine learning and dashboard building are among the essential skills that data analyst and data scientist positions look for, and SQL, Python, R, and Tableau are critical tools that many companies currently use. Below, I go into detail on the tasks that have helped me with my data analysis job hunt.

Last quarter, I worked on the Easy to Fill model. The objective of the model is to identify whether a mission is Easy to Fill, so that coordinators can spend their time on missions that are more likely to be canceled because no pilot signs up or that are otherwise difficult to fill. My task was to validate the existing model created by the previous AFW team. I would also make adjustments to improve the model’s accuracy and add new test data covering missions scheduled and canceled during the pandemic.

My first task was to update the dataset with data from October 1st, 2019 to the present, which is the data preparation phase. I extracted the data from the AFW database with SQL and exported it as a CSV file. The data originally contained 34 variables, including mission type, passengers, mission date, etc. After examining each variable and its data type, I removed the unnecessary columns to reduce the risk of over-fitting. In addition, I converted NAs in the columns to 0 and created dummy variables for the age column to account for missing values. I dropped the lead time column, in which more than 80% of the values were null. I ultimately kept 25 variables for further analysis.
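To make the preparation steps concrete, here is a minimal sketch of how they might look in pandas. It is not the project's actual code: the tiny table and every column name in it (`mission_type`, `passengers`, `age`, `lead_time`, `age_missing`) are made up for illustration, and the helper `prepare` is hypothetical.

```python
import pandas as pd

NA = float("nan")

# Hypothetical rows standing in for the CSV exported from the database.
# All column names and values are illustrative assumptions.
raw = pd.DataFrame({
    "mission_type": ["medical", "medical", "other", "medical", "other", "medical"],
    "passengers": [1, 2, NA, 3, 1, 2],
    "age": [34.0, NA, 61.0, NA, 8.0, 45.0],
    "lead_time": [NA, NA, NA, NA, NA, 12.0],
})


def prepare(df: pd.DataFrame, null_threshold: float = 0.8) -> pd.DataFrame:
    """Drop mostly-empty columns, flag missing ages, and fill numeric NAs with 0."""
    out = df.copy()
    # Drop columns in which more than `null_threshold` of the values are null.
    null_share = out.isna().mean()
    out = out.drop(columns=null_share[null_share > null_threshold].index)
    # Dummy indicator that records where age was missing before it is filled.
    out["age_missing"] = out["age"].isna().astype(int)
    # Replace the remaining NAs in numeric columns with 0.
    numeric_cols = out.select_dtypes(include="number").columns
    out[numeric_cols] = out[numeric_cols].fillna(0)
    return out


clean = prepare(raw)
print(clean)
```

Keeping the missing-age indicator next to the zero-filled column lets a model tell a real value of 0 apart from a value that was never recorded.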

After data preparation, I explored the data in Python to derive insights from the dataset. I used the crosstab() function to create crosstab visualizations, which showed that missions for patients with more companions are less likely to be easy to fill. Moreover, missions on weekends and in summer months seem less likely to be easy to fill, while missions for returning passengers tend to be easier to fill. The results make sense to me. Pilots usually do not fly on weekends or in summer, since they would like to take their vacations. Pilots are more likely to serve returning passengers since those passengers already have a good record and a close relationship with the organization.
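As a rough illustration of this kind of crosstab, the sketch below uses `pandas.crosstab` on a handful of made-up missions. The column names `companions` and `easy_to_fill` and all of the values are assumptions for the example, not figures from the project.

```python
import pandas as pd

# Made-up missions: number of companions and whether the mission was easy to fill.
missions = pd.DataFrame({
    "companions": [0, 0, 1, 1, 2, 2, 2, 0],
    "easy_to_fill": [1, 1, 1, 0, 0, 0, 1, 1],
})

# Row-normalized crosstab: for each companion count, the share of missions
# that were not (column 0) and were (column 1) easy to fill.
table = pd.crosstab(
    missions["companions"],
    missions["easy_to_fill"],
    normalize="index",
)
print(table)
```

Normalizing by row makes groups of different sizes comparable. If matplotlib is installed, the same table could be turned into a chart with `table.plot(kind="bar", stacked=True)`.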

After data exploration, I moved on to data modeling. I developed a random forest model in Python to predict flights that get filled with no coordinator intervention. A random forest is an ensemble algorithm, used here for classification, that combines many decision trees. I split the dataset into training data and testing data. The training data used only missions whose cancellations were due to no pilot. This was done so that the model is trained on a richer dataset. For the testing data, I did not apply the no-pilot cancellation filter, since an incoming mission can be of any kind. The accuracy score we currently get is 0.79. To improve accuracy, I ran a grid search and tuned the parameters, which increased accuracy by 3%.
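The modeling workflow can be sketched with scikit-learn as follows. This is an illustration under stated assumptions rather than the project's model: the data is synthetic, the parameter grid is an arbitrary small example, and the training-set filter on no-pilot cancellations is not reproduced here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the mission features and the easy-to-fill label.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=300) > 0).astype(int)

# Hold out a test set that the model never sees during training or tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Baseline random forest with a fixed number of trees.
baseline = RandomForestClassifier(n_estimators=50, random_state=0)
baseline.fit(X_train, y_train)
baseline_acc = accuracy_score(y_test, baseline.predict(X_test))

# Grid search over a small, illustrative grid using 3-fold cross-validation.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=3, scoring="accuracy"
)
search.fit(X_train, y_train)
tuned_acc = accuracy_score(y_test, search.best_estimator_.predict(X_test))

print(f"baseline accuracy: {baseline_acc:.2f}")
print(f"tuned accuracy:    {tuned_acc:.2f}")
print("best parameters:", search.best_params_)
```

The grid search picks its parameters by cross-validation on the training data only, so the test accuracy stays an honest estimate. Tuning is not guaranteed to beat the baseline on every dataset.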

In this task, I practiced data cleaning, data exploration, and data modeling, which are essential steps and requirements for data analyst and data scientist positions. I can turn the tasks above into bullet points for my resume.

How did practicum prepare me for the business side?

Throughout the practicum project, our team worked closely with the CTO and the coordinator team. Working with a non-technical team reminded me of the importance of translating technical solutions into business terms. This is also critical for future jobs because not everyone will understand the technical aspects of the work. Below, I explain in detail how I improved my ability to translate technology into business value.

The practicum project’s Mission Operations Team is responsible for processing incoming mission requests, coordinating mission logistics with volunteers and passengers, and facilitating onboarding for new passengers. A mission can be canceled in two situations: 1) no volunteer pilot signs up for the mission; 2) poor weather. The first is referred to as a “No Pilot Cancellation” and the second as a “Weather Cancellation”. In both cases, coordinators from the Mission Operations Team need to intervene and resolve the problem. For no pilot cancellations, the coordinators need to intervene two days before the scheduled departure by approaching potential volunteers through e-mail and SMS. There is a small chance that the mission will be staffed through this method. For weather cancellations, the coordinators can solicit other volunteers two days before the scheduled departure. However, if the weather is severe, AFW will never endanger its volunteers or its passengers by making volunteers fly against their best judgment. If a weather cancellation happens, the team needs to create a backup plan for the patients.

The organization wants to increase operational efficiency with respect to no pilot and weather cancellations. No pilot cancellations are preventable if the team can successfully solicit a volunteer, but two days is too short a window for the pilot response rate to be high. Weather cancellations cannot be prevented, but the team has the means to provide a backup plan. Knowing which missions might be prone to poor weather makes it easier to back them up proactively. This is where data analytics is applied.

Our practicum team has built machine learning models using historical data that predict whether an upcoming mission is at risk of being canceled or unstaffed. These models will assist coordinators in prioritizing missions that need special attention, help eliminate preventable no-pilot cancellations, and make it possible to systematically back up potential weather cancellations. We also identified the ten most important features, those with the biggest impact on predicted outcomes. We documented the model requirements and usage. We also listed the potential outcomes and business impact of implementing our models. By reading our documentation, coordinators can easily use our models to improve the daily operation process.
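One common way to rank features for a tree-based model is the impurity-based importance that scikit-learn exposes as `feature_importances_`. The sketch below shows that approach on synthetic data. It is an assumption about how such a ranking could be produced, not a record of the method our team used, and the feature names are placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic data with placeholder feature names; only two features drive the label.
rng = np.random.default_rng(1)
feature_names = [f"feature_{i}" for i in range(12)]
X = pd.DataFrame(rng.normal(size=(400, 12)), columns=feature_names)
y = (2 * X["feature_0"] + X["feature_3"] > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# Impurity-based importances sum to 1; sort them and keep the ten largest.
importances = pd.Series(model.feature_importances_, index=feature_names)
top_ten = importances.sort_values(ascending=False).head(10)
print(top_ten)
```

A ranked list like this is easy to share with a non-technical team, because it names the mission attributes the model relies on most without requiring anyone to read the model itself.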

Conclusion

I am very honored to be part of the UCD MSBA class of 2021. This program has helped me a lot along the way in my job hunt. Hard work will pay off. We will all receive job offers from our dream companies. The best is yet to come!
