About me
I’m Yikai Zhang, a Ph.D. candidate in statistics with 10+ years of experience in statistics and machine learning:
- Developed new algorithms, including the Finite Smoothing Algorithm and the Generalized Takeuchi’s Information Criterion (GTIC), advancing SVMs and other large-margin classifiers.
- Proficient in Python, R, SQL, and Fortran; deliver high-performance tools and open-source packages and integrate Fortran/CUDA kernels.
- Published author with research showcased at ICML.
- Collaborated across industries, including insurance and chemical manufacturing, to deliver practical tools.
I will use this blog to post and update tutorials for my packages.
Tech Stack
- Languages: Python, R, CUDA C/C++, Fortran, SQL
- ML: PyTorch, scikit‑learn, XGBoost
- Systems: HPC, GitHub Actions, packaging (PyPI/R‑pkg)
Work Experience
Data Science Intern — UFG Insurance (Summer 2024)
- Refit a bodily injury (BI) cost model for commercial auto (CA) insurance using XGBoost; used large language models (LLMs) to extract, validate, and interpret multi-source data, improving prediction accuracy by 15% and enhancing model robustness.
- Built a Python Shiny tool integrating SQL and LLMs to estimate insurable replacement value.
- Automated fraud detection from police reports with an LLM-based backend, significantly improving claim-flagging efficiency.
Data Science Intern — Dow Inc. (Summer 2023)
- Developed a DOE simulation app (R + Shiny) that improved design efficiency by 50% and computation speed by 35%.
- Enhanced the app's usability for statisticians and engineers, achieving 90%+ user satisfaction.
Graduate Researcher — University of Iowa (2019–Present)
- Conduct research on large-scale kernel SVMs, kernel logistic regression, GPU acceleration, and insurance risk modeling.
- Built open-source packages (TorchSVM, hdsvm, SAFE, GTIC) in PyTorch and R.
- Integrated Fortran/CUDA kernels for high-performance computing.