Senior Data Scientist @ Fidelity Investments

I have been working as a Senior Data Scientist at Fidelity Investments since March 2023. I’m in charge of various machine learning projects, ranging from explainable AI to LLM applications.

Applied Scientist II @ Amazon

I worked as an Applied Scientist II starting in October 2021. I worked in Alexa AI and was responsible for entity resolution models. For a voice assistant such as Alexa, entity resolution is the step that searches a huge catalog (hundreds, thousands, or even millions of music tracks, podcasts, videos, etc.) and retrieves the desired result based on the user’s query. During my time there, I worked on two projects:

Ranker-based Entity Exploration Model for Entity Resolution

  • Led, designed, and developed a ranker-based entity exploration model for entity resolution.
  • Applied the model to a use case with more than 500K in weekly traffic. Through an online A/B test and offline analysis, demonstrated an improvement of 5.04% compared to the current production system.

Graph-based Data Augmentation for Entity Resolution

  • Designed and developed a graph-based data augmentation method for entity resolution, for a use case with more than 500K in weekly traffic, to improve robustness to upstream ASR (Automatic Speech Recognition) errors.
  • Experiments achieved a 5.19% improvement in overall accuracy and a 27.86% improvement on harder cases.

Applied Scientist Internship @ Amazon

I worked as an Applied Scientist Intern at Amazon in Summer 2019 and Fall 2020. In both internships, I worked on projects related to natural language processing and learning to rank.

Cross-query Ranker on ASR N-best for Entity Resolution (Fall 2020, Alexa AI)

  • Developed a machine learning ranker that leverages results from upstream ASR (Automatic Speech Recognition) to make entity resolution results robust to ASR errors.
  • Evaluated the ranker on two use cases, with 100K and 1 million examples respectively, and achieved about a 10% gain in accuracy.

Entity Linking on Customer Reviews and Queries (Summer 2019, Amazon Search)

  • Using natural language processing and learning-to-rank methods, developed an entity linking system for customer queries and reviews based on Wikipedia data.
  • Designed an evaluation method using both Wikipedia data and labeled data collected through Mechanical Turk, and achieved about a 20% improvement compared to the baseline.

Computational Social Science Summer School (CSS'18)

I attended the Computational Social Science Summer School in Los Angeles in the summer of 2018. It was a full-week summer school with awesome lectures and one project. The project I participated in was Cyberbullying on Instagram, in collaboration with Qinyu E, Reham Al Tamine, and Sijia Yang, with Prof. Homa Hosseinmardi as our advisor. Our project luckily won the Best Project Award Prize for the Summer School! Here are the slides for our project!

Complex Systems Summer School (CSSS'18)

I participated in the Complex Systems Summer School at the Santa Fe Institute in summer 2018. It is a one-month-long program with extensive lectures, projects, and a lot of fun! I participated in two projects:

  • Understanding music using higher order networks (see more details in Projects)
  • Analyzing Singapore public transportation data and economic sector clustering

Additionally, several of us went to a ranch in Colorado to see yaks and get yak milk to make yak butter tea from scratch!