Carolina Arriaga

Data Scientist | Vallejo, CA | caro.arriaga@gmail.com | github:caroarriaga

As a published researcher, former entrepreneur, and serial hackathon winner, I consistently deliver exceptional results in diverse stakeholder environments. I am eager to join a team to facilitate data-driven decisions, contribute to impactful outcomes, and expand my expertise.

Skills

Python:
  • Spark
  • PyTorch
  • Streamlit
  • Pandas
  • Bokeh
Other languages:
  • R
  • SQL
  • BigQuery
  • HTML/CSS

Experience

    Independent researcher at Progressive Emergence

    Highlights

    • Lead author of a 2023 working paper on electrification of an early-adopter fleet in Canada.
    • Co-authored a paper for IEEE EDUCON 2020 on evaluating quality decision-making and applied sentiment analysis.

    Co-Founder, Data Scientist at Facets

    Highlights

    • Implemented data analytics for web and React Native apps using Segment and Mixpanel.
    • Conducted product-fit surveys, interviews, and data analysis that led to major pivot decisions.

    Data Scientist, Zero Emission Vehicles Intern at The ICCT

    Highlights

    • Designed and implemented an energy rates database covering all Canadian provinces.
    • Deployed a web app with interactive visualizations to help fleet managers assess EV savings.

    Senior Integration Engineer, Cooling at Whirlpool's Center of Innovation and Technology (CETEC)

    Highlights

    • Designed and ran causal experiments using regression models, saving 9M in costs.
    • Led the simulation optimization to comply with 2020 DOE energy and refrigerant standards.
    • Mined refrigerator reviews and ratings and built a rating predictor for new products.

Education

    Master of Information and Data Science from University of California, Berkeley

    Courses

    • Leopard Spotting: a deep learning model to re-identify leopards. Used Roboflow and YOLOv5 combined with a ResNet-18 CNN embedding model for re-identification. Achieved a top-1 accuracy of 0.61, on par with SOTA on other species.
    • ShapSum: a framework to predict multi-dimensional human-judgement quality scores for text summarization. Used a random forest model and Shapley value analysis to understand which NLP metrics evaluate a summary on par with human judgement.
    • A study of people's memory after exposure to a controlled tweet: we found that people tend to remember false information with positive sentiment.
    • Effectiveness of government interventions on COVID-19 deaths: using a linear regression model, we found that early in the pandemic, states with interventions in effect saved more lives than states without them.

    B.Sc. in Engineering Physics from Tec de Monterrey

Awards

    Winner - Honorable mention from DataPalooza

    Winner - 1st Place from AI and Identity - Okta Hackathon

    Winner - TinyMCE Challenge and Clarifai Challenge from API World + AI DevWorld Hackathon
