"All the models are wrong, but some are useful"
-- George Box
Hi, this is Liz!
I'm a Data Scientist with previous experience in the consulting and banking industries. I am passionate about wrestling with complex data and applying machine learning to tell stories and solve business problems.
I will soon graduate with a Master of Science in Data Science (MSDS) from the University of San Francisco, where I have developed strong programming and statistics skills for tackling business problems involving big data.
As a Data Scientist at Wiser Solution, my primary role is to help my client boost revenue and optimize pricing strategy using predictive modeling techniques and machine learning algorithms.
Below, I'd like to share some of the projects I have completed so far that I found interesting.
Click "more" for details and source code on GitHub.
A data pipeline that automates Ford GoBike data extraction from AWS S3 to MongoDB and connects to Spark to model bike demand.
(ETL, data pipeline, AWS, S3, MongoDB, SparkSQL, SparkML, PySpark)
A complete web product that shows dynamic food truck location and business information in SF.
(AWS, ETL, Flask, MongoDB, Google Analytics)
A fully interactive visualization website to demonstrate gamer strategy in PUBG.
(D3, Plotly, Python, HTML)
A predictive model to classify whether an email is spam or not.
(Python, boosting trees, numpy, XGBClassifier)
A collaborative filtering system to predict a viewer's potential movie ratings.
(matrix factorization, stochastic gradient descent optimization, numpy)
A vanilla neural network to classify digits from images. Trained on the MNIST dataset.
(PyTorch, AWS, neural network tuning)
A time series forecast of the Canadian bankruptcy rate using macroeconomic indicators.
(R, Holt-Winters, SARIMA, VARX)
A regression analysis and business report on house price prediction in Iowa.
(R, OLS, Lasso, Ridge, Elastic Net)
A Random Forest model to predict a given flight's delay rate.
(Python, feature engineering, model interpretation)
A sentiment prediction model to determine whether an IMDB movie review conveys positive or negative sentiment.
(NLTK, Naive Bayes, Word Embedding)
A digested Twitter list page with feeds colored according to their sentiment and average sentiment score.
(Tweepy, vaderSentiment, Jinja, Flask)
An interactive website deployed to recommend articles similar to the one you choose.
(word2vec, Stanford GloVe, AWS, Python)