Carolina Arriaga
Data Scientist | Vallejo, CA | caro.arriaga@gmail.com | github:caroarriaga
As a published researcher, former entrepreneur, and serial hackathon winner, I consistently deliver exceptional results in diverse stakeholder environments. I am eager to join a team to facilitate data-driven decisions, contribute to impactful outcomes, and expand my expertise.
Skills
- Python:
- Spark
- PyTorch
- Streamlit
- Pandas
- Bokeh
- Other languages:
- R
- SQL
- BigQuery
- HTML/CSS
Experience
–
Independent researcher at Progressive Emergence
Highlights
- Lead author of a 2023 working paper on electrification of an early-adopter fleet in Canada.
- Co-authored a paper for IEEE EDUCON 2020 on evaluating quality decision-making and applied sentiment analysis.
–
Co-Founder, Data Scientist at Facets
Highlights
- Implemented data analytics for web and React Native apps using Segment and Mixpanel.
- Conducted product fit surveys, interviews, and data analysis leading to major pivot decisions.
–
Data Scientist, Zero Emission Vehicles Intern at The ICCT
Highlights
- Designed and implemented an energy rates database covering all Canadian provinces.
- Deployed a web app with interactive visualizations to help fleet managers assess EV savings.
–
Senior Integration Engineer, Cooling at Whirlpool's Center of Innovation and Technology (CETEC)
Highlights
- Designed and ran causal experiments using regression models, saving 9M in costs.
- Led the simulation optimization to comply with 2020 DOE energy and refrigerant standards.
- Mined refrigerator reviews and ratings and built a rating predictor for new products.
Education
–
Master of Information and Data Science from University of California, Berkeley
Courses
- Leopard Spotting: a deep learning model to re-identify leopards. Used Roboflow and YOLOv5 combined with a ResNet-18 CNN embedding model for re-identification. Achieved a top-1 accuracy of 0.61, on par with SOTA on other species.
- ShapSum: a framework to predict multi-dimensional human-judgement quality scores for text summarization. Used a random forest model and Shapley value analysis to understand which NLP metrics evaluate a summary on par with human judgement.
- A study of people's memory after exposure to a controlled tweet: we found that people tend to remember false information with positive sentiment.
- Effectiveness of government interventions and COVID-19 deaths: using a linear regression model, we found that early in the pandemic, states with interventions in effect saved more lives than states without them.
–
B.Sc. in Engineering Physics from Tec de Monterrey
Publications
Case study: Electrification of an early-adopter fleet in Canada by The ICCT
Teaching decision making in engineering using a multi-attribute approach and ship design by IEEE
Hybrid twist tray ice maker by USPTO
Awards
Winner - Honorable mention from DataPalooza
Winner - 1st Place from AI and Identity - Okta Hackathon