Where is my mind?
Like everyone else in the industry, I’m obsessed with ChatGPT lately. Beyond that, I’m very interested in Machine Learning, Experiment Design, Causal Inference, and Software Engineering in general. I love exploring GitHub, Stack Overflow, and Kaggle and discovering new stuff.
Skills
Here is a list of subjects I’ve worked with, grouped by topic…
Scripting and Programming Languages
Python (PyData, SciPy, TensorFlow) • R (dplyr, ggplot2, shiny) • DBs (HQL, Presto, PostgreSQL) • JavaScript, PHP (for web-based visualization or apps) • Hadoop, Spark, Flink • Airflow, Docker, Kubernetes
Data Science Stacks
Reflex-Based Models (Binary Classification, Regression, Structured Prediction) • Feature Selection and Dimensionality Reduction • Time Series Analysis • Inferential Statistics • Multi-Armed Bandit
Domain Knowledge
Enterprise Level Experiments (ELE) • Behavioral Design • Advertisement Platform • Marketplace Optimization • Marketplace Monetization • Citrix Cloud Platform Business Manager
Experience
Slack Sr. Data Scientist 2022 - Present
- Lead the DS project that involves developing a deduction-based model comprising 47 components to produce a unified Grid health score. This will aid enterprise customers in comprehending their Slack usage.
- Improved the data foundation for Slack Connect by redesigning and constructing pipelines that meet modeling and analytical requirements, significantly improving data accessibility.
- Introduced the Enterprise Level Experiments (ELE) concept to the team and initiated collaboration with the Slack experimentation platform to implement the statistical methods and tests that enable ELE within the enterprise team.
TikTok Sr. Data Scientist 2021 - 2022
- Designed, executed, and consulted on split tests for new product launches and provided insights to guide the team on next steps for both the product and the experimentation platform.
- Investigated the TikTok Ads experimentation platform horizontally and provided best practices and guidelines for statistical methods and tests to address pain points such as low sample sizes and pre-existing biases.
- Built a framework for an experimentation runbook within the org, covering topics such as questions to answer before experiments, types of experiments needed, and detailed implementation and analysis methods.
- Led the DS effort on Ads Manager Lite v0.5 iterations and Automated Ads Product Solution initiatives, including North Star metric design, impact estimation, and value analysis.
Uber Data Scientist 2019 - 2021
- Designed, constructed, and executed various experiments, such as difference-in-differences, traditional A/B tests, and switchbacks, for new product launches and provided insights to guide the team on next steps based on test results.
- Researched and quantified addressable markets by deep diving into prior trends and forecasting expected KPI goals, and contributed to product roadmap building.
- Communicated potential wins and drawbacks using quantitative measurements, significantly improving efficiency.
- Led, designed, and built guidelines for data instrumentation for all products across the org and integrated a new internal BI event tracking tool to standardize the instrumentation procedure.
- Owned the end-to-end ETL workflow from raw Kafka topics to fact/dimension/aggregation tables that are consumed by all stakeholders across the company in multiple geographic locations.
- Investigated and modified a previously developed search algorithm based on a logistic regression model and iterated on it to meet the needs of the product.
OfferUp Product Data Science 2017 - 2019
- Key contributor to designing and implementing an in-house A/B testing platform that enabled randomized user bucketing and multiple statistical tests, and reduced the time required to analyze and deploy critical revenue-impacting features by 75+%.
- Identified a user randomization issue from unexpected behavior in two supposedly independent experiments, later traced to a correlation between experiments introduced by the hash function, and confirmed the non-orthogonality with a chi-squared test.
- Defined, constructed, communicated, and tracked KPIs and automated analysis and visualization pipelines to enable new revenue streams without cannibalization.
- Provided insights to initiate a P0 hotfix team and pinpointed the root cause by diving into terabytes of unstructured data.
- Projected sales and churn rate with a cohort analysis, reaching 80% accuracy and resulting in millions in revenue.
- Identified pain points and enabled A/B testing for multiple iterations of personalization NLP models, reducing experiment time by 50% and lifting ads revenue by 10%.