Researcher Intro Series: Max

Hi, this is Mengxiong Li. It’s my great honor to work with Yoshie and my other teammates. My main job in this lab is to clean the data set and build the Recommendation System. The data set available to us is very messy, with many empty or useless records, so cleaning it is a very time-consuming process. During data cleaning, our team uses predictive models to fill in the missing records. The Recommendation System is one of our team’s major projects. We hope that it can recommend courses to individuals who enter some information into our system. Since the available data set is very limited, we have only built a draft Recommendation System so far, and there is a lot of room for improvement.

Data analytics is one of the hottest topics in the industry. As a statistics student, I have always been very interested in this field. I believe that with the development of Machine Learning, Deep Learning, and other advanced statistical techniques, the applications of data analytics will become even broader.

While data analytics is very interesting, I also face many challenges when working with it. One of the biggest is the limitation of the data set. Even when we have many great ideas or statistical algorithms, we are always limited by what data is available. What’s more, since we are working with the raw data set, it takes a lot of time to clean it.

Overall, it is a great experience to work with Yoshie and her amazing team.

Researcher Intro Series: Kapil

As a Graduate Research Assistant on ExecEd’s Data Science team, I’ve been working on building the predictive model and recommendation system to improve conversions from inquiries to enrollments. I also manage the blog and social media posts for RoL Lab at ExecEd. This is a great place to learn and gain hands-on experience implementing advanced statistical methods using real-world data.

I am passionate about building products and processes that continuously leverage the power of data to better themselves. Most tech product development today is focused on user experience and optimizing the use of software architecture, while completely overlooking the insights that could be generated from user data. I want to build products that generate and act upon these insights, leading to potential avenues of revenue growth or increased user satisfaction.

The biggest challenge that I face in data analytics is assessing the quality of the data and identifying the key parameters that need to be included in the model. Especially when data is pouring in from various sources, it is important to make sure that certain parameters do not get over-represented. Often, a good understanding of the business or operations process that generates the data set helps in tackling this problem.