- A look at the key roles in Data Science
- Karlijn Willems, Datacamp (2015). The Data Science Industry: Who Does What (Infographic)
Author Archives: Kapil Sharma
Researcher Intro Series: Hassan
I have been working with Yoshie as a research assistant since the start of the summer. I was very excited to get on board and do pricing analytics work, which had not been done before here at Executive Education and is also a field that I enjoy very much. Here at the lab I have been working on all things revenue and cost, through various aspects of pricing that range from price optimization to modelling the program choices of our customers.
I am most passionate about price optimization; finding an optimal price point for a product is a notion that really appeals to me. I am also very passionate about the intersection of machine learning and business analytics with pricing. This is the area where I use statistical models to estimate our customers’ behavioral response to pricing, such as in estimating price-response functions and fitting choice models.
The biggest challenge I have faced in the process is dealing with missing data and a lack of price variability. With these problems, one has to come up with unconventional approaches, because the standard ones no longer work.
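As a rough illustration of what estimating a price-response function can look like, here is a minimal sketch that fits a linear demand curve d(p) = a - b * p by ordinary least squares and then finds the price that maximizes revenue p * d(p). The linear form, the function names, and the numbers are all assumptions chosen for illustration; the data is synthetic and is not taken from the lab's work, which may well use different models.

```python
from statistics import mean


def fit_linear_price_response(prices, demands):
    """Least-squares fit of d(p) = a - b * p. Returns (a, b)."""
    p_bar, d_bar = mean(prices), mean(demands)
    sxx = sum((p - p_bar) ** 2 for p in prices)
    if sxx == 0:
        # With a single observed price there is no slope to estimate.
        raise ValueError("prices show no variability; cannot estimate the slope")
    sxy = sum((p - p_bar) * (d - d_bar) for p, d in zip(prices, demands))
    slope = sxy / sxx
    intercept = d_bar - slope * p_bar
    return intercept, -slope


def revenue_maximizing_price(a, b):
    """Maximize p * (a - b * p); setting the derivative to zero gives p = a / (2b)."""
    if b <= 0:
        raise ValueError("demand must decrease as price increases")
    return a / (2 * b)


# Synthetic, illustrative observations (hypothetical prices and enrollments).
prices = [100.0, 120.0, 140.0, 160.0]
demands = [80.0, 70.0, 60.0, 50.0]

a, b = fit_linear_price_response(prices, demands)
best_price = revenue_maximizing_price(a, b)
print(f"a={a:.1f}, b={b:.2f}, revenue-maximizing price={best_price:.1f}")
```

The guard on `sxx` shows why price variability matters: if every observation was made at the same price, the slope of the response curve cannot be estimated from the data at all.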
Researcher Intro Series: Max
Hi, this is Mengxiong Li. It is a great honor to work with Yoshie and my other teammates. My main job in this lab is to clean the data set and build the Recommendation System. The data set available to us is very messy and contains many empty or useless records, so cleaning it is a very time-consuming process. During data cleaning, our team uses predictive models to fill in the missing records. The Recommendation System is one of our team's major projects. We hope to use it to recommend courses to individuals who enter some information into our system. Since the available data set is very limited, so far we have only built a draft Recommendation System, and there is a lot of room for improvement.
Data analytics is one of the hottest topics in the industry. As a statistics student, I have always been very interested in data analytics. I believe that with the development of Machine Learning, Deep Learning and other advanced statistical techniques, the role of data analytics will become even broader.
While data analytics is very interesting, I also face many challenges when working with it. One of the biggest is the limitation of the data set. Even when we have great ideas or statistical algorithms, we are always limited by the data that is available. What’s more, since we are working with a raw data set, it takes a lot of time to clean it.
Overall, it is a great experience to work with Yoshie and her amazing team.
Researcher Intro Series: Kapil
As a Graduate Research Assistant here on ExecEd’s Data Science team, I’ve been working on building the predictive model and recommendation system to improve conversions from inquiries to enrollments. I also manage the blog and social media posts for RoL Lab at ExecEd. This is a great place to learn and get hands-on experience implementing advanced statistical methods on real-world data.
I am passionate about building products and processes that continuously leverage the power of data to better themselves. Most tech product development today is focused on user experience & optimizing the use of software architecture while completely overlooking the insights that could be generated from user data. I want to build products that generate and act upon these insights leading to potential avenues of revenue growth or increased user satisfaction.
The biggest challenge that I face in data analytics is assessing the quality of the data and identifying the key parameters that need to be included in the model. Especially when data pours in from various sources, it is important to make sure that certain parameters do not get over-represented. Often, a good understanding of the business or operations process that generates the data set helps in tackling this problem.
Researcher Intro Series: Aidan
Apart from handling the website and social media sides of the lab, I focus mainly on compiling data from ExecEd’s Canvas site and social media outlets.
I am most passionate about seeing the product of what we do. Working with a tangible source of data, and on the algorithms that turn that information into usable data, makes me appreciate seeing a clean graph or curve emerge from the messy reality in the end.
The biggest challenge in what I do is the constant learning required to get the information we want. There is no one-size-fits-all solution to the problems we face, so I end up spending a small portion of my time accomplishing large amounts of work and a large portion of my time learning the small details required to put all the parts together.

Researcher Intro Series: Kevin
I have been working with Yoshie and the team as a research assistant for two years now. I help with data cleaning, analysis, and occasionally literature reviews, using my knowledge of Excel, R, and other software.
I am most passionate about the idea of exploring the unknown – handling new parameters and navigating through the myriad of data points to discover something new is quite exciting.
For me, data cleaning is sometimes bottlenecked by missing software functionality, which forces manual data entry and can be very time-consuming.

The ROL team congratulates Kevin on his graduation and wishes him good luck on his next adventure.
Researcher Intro Series: Ryo
I have conducted research for and supported many quantitative and qualitative analytics projects over the years, including determining competitive social media efficacy, conducting social network analyses from survey data, and creating a comprehensive regression model to forecast enrollment patterns on both a micro and macro level.
Apart from teasing out the key insights from a large data set that initially might seem random and chaotic, I am most passionate about figuring out how to tell a compelling story from the data because it challenges me every time and forces me to think outside the box.
The biggest challenge by far that I face in data analytics is cleaning the data. Often the data that we get is messy or has not been collected in the way that we want (or is missing key aspects that we need for the analysis). For example, when doing the social network analysis with Gephi, we ran into several roadblocks when converting the XML files into CSV files readable by Gephi.
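To give a sense of what such a conversion can involve, here is a minimal sketch that flattens survey responses stored as XML into an edge-list CSV. The XML layout (`survey`, `response`, `tie`, and the `respondent` and `to` attributes) is entirely hypothetical, since the original files are not described here. The `Source` and `Target` column headers follow the convention that, to my understanding, Gephi's spreadsheet importer expects for edge tables.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_edges_to_csv(xml_text):
    """Convert a (hypothetical) survey XML document into an edge-list CSV string."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    for response in root.iter("response"):
        source = response.get("respondent")
        for tie in response.iter("tie"):
            target = tie.get("to")
            # Skip incomplete records instead of writing half-empty rows.
            if source and target:
                writer.writerow([source, target])
    return out.getvalue()


# Made-up sample input; respondent C names no ties and so yields no edges.
sample = """<survey>
  <response respondent="A"><tie to="B"/><tie to="C"/></response>
  <response respondent="B"><tie to="C"/></response>
  <response respondent="C"/>
</survey>"""

edges_csv = xml_edges_to_csv(sample)
print(edges_csv)
```

In practice the roadblocks tend to come from the parts this sketch skips, such as inconsistent respondent identifiers or missing attributes, which is why incomplete rows are dropped rather than written out.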

The ROL team congratulates Ryo on his graduation and wishes him good luck on his next adventure.