Yiran Li

Aspiring Data Analyst | Former Data Analyst Intern @ FlexGen | Grad Student @ UC Berkeley

University of California at Berkeley

About Me

👋 Hi there! I’m Yiran, and I’m a data geek with a knack for solving complex problems. I’m currently honing my skills as a Master’s student in Statistics at UC Berkeley, one of the world’s leading institutions for statistical science and data analysis.

🌏 Originally from Beijing, China, I ventured across the world fueled by a deep-seated passion for data. My journey has been nothing short of thrilling, transforming me from an eager student into a data analyst and problem-solver.

🔍 Why Data Science?

For me, data isn’t just a bunch of numbers; it’s a story waiting to be told. I find immense satisfaction in diving into datasets, unraveling complexities, and emerging with actionable insights. Data science is my canvas, and Python, R, and SQL are my brushes.

🛠️ Technical Proficiency

  • Programming: Fluent in Python (NumPy, Pandas, Seaborn, scikit-learn, Spark), SQL, and R (ggplot2, dplyr, tidyverse)
  • Software: Proficient in AWS (S3, Lambda, SageMaker), Databricks, Git, Bash, Tableau, Microsoft Excel, JavaScript, and HTML
  • Courses: Accomplished in Data Modeling & Inference, Data Science Programming, Database Application, and Machine Learning

📈 Real-world Experience

One highlight is my internship as a Data Analyst at FlexGen. I tackled the challenge of optimizing Round-Trip Efficiency (RTE) for 25 battery sites using Python. Building on this, I re-engineered the RTE calculation logic in Databricks, which now delivers real-time insights at 5-minute intervals. This shift set the stage for me to author a 16-page RTE methodology document, enriching the company’s knowledge base and equipping clients for data-driven decision-making. Beyond this, I collaborated with the Analyze Team to develop Python functions that cut data-cleansing time from 2 hours to just 1 minute.

🤝 Soft Skills

Effective communication and customer engagement are as important to me as technical skills. I thrive in collaborative settings and am always eager to share my analytical expertise. I believe in a growth mindset: always learning, always evolving.

🗣️ Language

Fluent in English and native in Mandarin, I can effortlessly bridge cultural and linguistic gaps, making me an asset in diverse work environments.

🌱 What’s Next?

I’m on the lookout for full-time opportunities in data science and data analytics where I can leverage my skills to make a meaningful impact. If you’re interested in solving complex problems and telling stories through data, let’s connect!

Download my resumé.

Interests
  • Machine Learning
  • Quantitative Data Analysis
  • Data Visualization
  • Statistical Inference
Education
  • M.A. Statistics, 2023

    University of California at Berkeley

  • B.S. Statistics and Analytics, 2022

    University of North Carolina at Chapel Hill

  • Teaching Chinese to Speakers of Other Languages (major), 2019

    Shanghai International Studies University

  • Finance (minor), 2019

    Shanghai International Studies University

Skills

R
SQL
Python
Microsoft Suites
Cloud Practitioner
Machine Learning

Experience

Data Analyst Intern
May 2023 – Aug 2023 Durham, NC
  • Metric Design: Developed Python scripts for RTE (Round-Trip Efficiency) calculations across 25 battery sites at diverse time intervals (annual/monthly/weekly), resulting in critical insights for site performance and optimal stability periods
  • Data Pipeline Automation: Automated the real-time data collection and analysis process by implementing the RTE calculation logic in Databricks, delivering insights every 5 minutes and a 2-hour reduction in runtime
  • Supervised Learning: Trained random forest and XGBoost models on 50+ features to predict battery SoC (State of Charge), resulting in essential insights for optimal BESS (Battery Energy Storage System) management
  • Data Visualization: Created informative dashboards using Power BI on time-series data, presenting complex data insights in a user-friendly format, resulting in improved data communication and informed decision-making by stakeholders
  • Documentation: Enriched the company’s Confluence page by publishing a 16-page document on RTE methodology
Undergraduate Research Assistant
Jan 2021 – May 2021 Chapel Hill, NC
  • Data Preprocessing: Transformed raw UK Biobank data, including patients’ age, gender, clinic records, and lifestyle, into a model-ready format, cutting features from 664 to 226 and reducing model run time by 1 hour
  • Feature Engineering: Enhanced model interpretability and performance by developing a composite “Lifestyle Score” by aggregating health factors (alcohol, smoking, diet, sleep, exercise) with weighted values assigned
  • Exploratory Data Analysis: Initiated a thorough EDA process, leading a group of 3 to uncover significant patterns linking patients’ diabetes status with their cholesterol, lipids, and blood pressure levels
  • Supervised Learning: Trained and compared LASSO, Random Forest, and XGBoost models, increasing the accuracy of diabetes detection by 20%
Marketing Intern
Jan 2019 – Apr 2019 Shanghai, China
  • Data Visualization: Boosted the average event ROI by 11.3% through quantitative analysis of event KPIs and visualization of key trends (locational, seasonal) in Tableau
  • Team Collaboration: Led a collaborative effort between 2 stakeholder groups (third-party vendors, internal functions) in event planning, resulting in 8 successful outcomes

Licenses & certifications

Projects