About

About me

I’m Yikai Zhang, a Ph.D. candidate in statistics with 10+ years of experience in statistics and machine learning:

  • Developed innovative algorithms, including the Finite Smoothing Algorithm and Generalized Takeuchi’s Information Criteria, transforming SVM and large-margin classifiers.
  • Proficient in Python, R, SQL, and Fortran; build high-performance tools and open-source packages and integrate Fortran/CUDA kernels.
  • Published author with research showcased at ICML.
  • Successfully collaborated across diverse industries, including insurance and chemical manufacturing, to deliver impactful tools.

I will use this blog to post and update tutorials for my packages.

Tech Stack

  • Languages: Python, R, CUDA C/C++, Fortran, SQL
  • ML: PyTorch, scikit‑learn, XGBoost
  • Systems: HPC, GitHub Actions, packaging (PyPI/R‑pkg)

Work Experience

Data Science Intern — UFG Insurance (Summer 2024)

  • Refitted a bodily injury (BI) cost model for commercial auto (CA) insurance using XGBoost. Used large language models (LLMs) to extract, validate, and interpret multi-source data, improving prediction accuracy by 15% and significantly enhancing model robustness.
  • Built a Python Shiny tool integrating SQL and LLMs to estimate insurable replacement value.
  • Automated fraud detection from police reports with an LLM-based backend, significantly improving claim-flagging efficiency.

Data Science Intern — Dow Inc. (Summer 2023)

  • Developed a DOE simulation app (R + Shiny) that improved design efficiency by 50% and computation speed by 35%.
  • Enhanced usability for statisticians and engineers, achieving 90%+ user satisfaction.

Graduate Researcher — University of Iowa (2019–Present)

  • Researched large-scale kernel SVMs, kernel logistic regression, GPU acceleration, and insurance risk modeling.
  • Built open-source packages (TorchSVM, hdsvm, SAFE, GTIC) in PyTorch and R.
  • Integrated Fortran/CUDA kernels for high-performance computing.