Applied mathematics · research systems · evidence design
Author / Researcher: Li Yunyi

Build what
data can become.

I work across statistical software, predictive modeling, and environmental communication. The common thread is practical: make data work traceable, make models easier to inspect, and make research findings clear enough to use.

03 · Code-backed analytical systems
04 · Feature-task states
5,266 · Source lines in flagship platform
94.67% · Post-event satisfaction
QF / Research signal synthesis
PARAMETRIC_MODEL
DYNAMIC_RENDER
DATA_LINEAGE
// INFERENCE PIPELINE
ingest → validate → model → render → compare → communicate

status: signals converging
01 systems engineering · 02 predictive intelligence · 03 environmental insight
SYSTEMS
01 / RESEARCH SYSTEMS

Models that move.

These projects focus on the full path from input to explanation: data access, parameter settings, computation, visual output, and audit trails. The aim is not simply to display a result, but to preserve the steps that produced it.

Parametric Statistical Modeling & Dynamic Rendering

A full-stack platform for data-source integration, parameterized statistical models, task scheduling, and browser-based rendering. Each model configuration can be saved, compared, run as a task, and mapped to a visual rule, keeping analysis and presentation in the same workflow.

SYSTEM ONLINE
Data Source Integration: MySQL · Redis · files · schema preview
Parameter Workbench: variables · model settings · version control
Statistical Engine: task queue · state transitions
Rendering Template Studio: rules · geometry · color gradients · trajectories
Dynamic Evidence View: Three.js · ECharts · real-time feedback
SYSTEM / 01

Large-Sample Feature Extraction & Visual Analytics

Designed for industrial inspection, spectral analysis, and vibration-signal workflows. It accepts CSV, MAT, XLSX, and Parquet files; uses a directed acyclic graph to organize feature steps; and combines dimensionality reduction, clustering, scatter plots, heatmaps, and report generation in one reproducible workspace.
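
The directed-acyclic-graph organization can be sketched with Python's standard `graphlib` module. The step names below are hypothetical, not the platform's actual step catalog; the sketch only shows how declared dependencies yield a valid run order and how a cycle is rejected.

```python
from graphlib import TopologicalSorter

# Hypothetical feature steps; each key maps to the steps it depends on.
FEATURE_STEPS = {
    "load": set(),
    "clean": {"load"},
    "spectral_features": {"clean"},
    "vibration_features": {"clean"},
    "reduce": {"spectral_features", "vibration_features"},
    "cluster": {"reduce"},
    "report": {"cluster"},
}


def execution_order(steps):
    """Return one valid run order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(steps).static_order())


order = execution_order(FEATURE_STEPS)
```

Because the order is derived from the declared dependencies, rerunning the same graph reproduces the same constraints, which is what makes the workspace repeatable.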

DAG pipeline · PCA / UMAP · Scatter / Heatmap · file formats: CSV, MAT, XLSX, Parquet
06 steps: chunk upload → MD5 check → merge → parse → store → notify
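
A minimal sketch of the first three upload stages (chunk upload, MD5 check, merge), using only `hashlib`. The function names and the manifest layout are assumptions for illustration; the parse, store, and notify stages are not shown.

```python
import hashlib


def split_chunks(data: bytes, size: int):
    """Cut a payload into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def upload_manifest(chunks):
    """Client side: record each chunk's MD5 and the MD5 of the whole file."""
    return {
        "chunk_md5": [hashlib.md5(c).hexdigest() for c in chunks],
        "file_md5": hashlib.md5(b"".join(chunks)).hexdigest(),
    }


def merge_chunks(chunks, manifest):
    """Server side: verify every chunk, merge in order, verify the result."""
    if len(chunks) != len(manifest["chunk_md5"]):
        raise ValueError("chunk count mismatch")
    for i, (chunk, expected) in enumerate(zip(chunks, manifest["chunk_md5"])):
        if hashlib.md5(chunk).hexdigest() != expected:
            raise ValueError(f"chunk {i} failed MD5 check")
    merged = b"".join(chunks)
    if hashlib.md5(merged).hexdigest() != manifest["file_md5"]:
        raise ValueError("merged file failed MD5 check")
    return merged
```

MD5 here serves as an integrity check against transfer damage, not as protection against deliberate tampering.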
SYSTEM / 02

Interactive Situation Analysis & Geometric Expression

A spatial analysis system for event monitoring, geometry checks, layer management, data import, and scenario analysis. It keeps event states, coordinate validation, permissions, and audit records visible so that spatial decisions can be reviewed later.

Event state machine · Geometry validation · Layer composition · Audit logging
5 business centers plus a situation-overview workspace
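
Coordinate validation of the kind described above might look like the sketch below. It assumes longitude/latitude pairs in degrees and polygon rings that repeat their first point as the last; the page does not state the system's actual coordinate reference or rule set, so both are assumptions.

```python
def validate_ring(ring):
    """Return a list of problems found in a polygon ring of (lon, lat) pairs."""
    problems = []
    for i, (lon, lat) in enumerate(ring):
        if not -180.0 <= lon <= 180.0:
            problems.append(f"point {i}: longitude {lon} out of range")
        if not -90.0 <= lat <= 90.0:
            problems.append(f"point {i}: latitude {lat} out of range")
    if len(ring) < 4:
        # Assumed convention: a closed ring needs 3 distinct points plus closure.
        problems.append("ring needs at least 4 points")
    elif ring[0] != ring[-1]:
        problems.append("ring is not closed")
    return problems
```

Returning every problem, rather than stopping at the first, gives an audit record something complete to store.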

Interaction contract

RESTful APIs move validated configuration and model-state data between the interface, services, and storage. Each response carries an explicit status and a trace identifier, which makes failures easier to diagnose and key actions easier to follow.
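
One way to picture this contract is a response envelope that always carries a status and a trace identifier. The field names (`status`, `data`, `message`, `trace_id`) and the `handle` function are hypothetical; the actual API schema is not given on this page.

```python
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class ApiResponse:
    status: str                                   # "ok" or "error"
    data: dict = field(default_factory=dict)
    message: str = ""
    # A fresh identifier per response, so logs can be matched to a request.
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def handle(config: dict) -> dict:
    """Validate a hypothetical model configuration and wrap the outcome."""
    if "model" not in config:
        return asdict(ApiResponse(status="error", message="missing field: model"))
    return asdict(ApiResponse(status="ok", data={"model": config["model"]}))
```

Failures and successes share one shape, so a client can log `trace_id` without first checking which case it received.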

Real-time behavior

Long-running jobs report progress back to the interface. Queues, task states, failures, and completed render payloads remain visible while work continues in the background.
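
A minimal sketch of task states with guarded transitions and progress reporting. The four state names and the retry rule are assumptions chosen for illustration; the page states only that tasks move through tracked states.

```python
from enum import Enum


class TaskState(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


ALLOWED = {
    TaskState.QUEUED: {TaskState.RUNNING, TaskState.FAILED},
    TaskState.RUNNING: {TaskState.SUCCEEDED, TaskState.FAILED},
    TaskState.SUCCEEDED: set(),
    TaskState.FAILED: {TaskState.QUEUED},  # assumed: failed tasks may be retried
}


class Task:
    def __init__(self, task_id):
        self.task_id = task_id
        self.state = TaskState.QUEUED
        self.progress = 0.0
        self.history = [TaskState.QUEUED]  # kept so the interface can show the path

    def move_to(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        self.state = new_state
        self.history.append(new_state)

    def report(self, fraction):
        """Record progress; it never moves backwards and is capped at 1.0."""
        if self.state is not TaskState.RUNNING:
            raise ValueError("progress is only reported while running")
        self.progress = min(1.0, max(self.progress, fraction))
```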

Research utility

Parameter settings are stored as part of the research record. Users can compare versions, reproduce a configuration, and map model outputs to geometry, color, trajectories, or chart states without losing the link to the original assumption.
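
Comparing and reproducing configurations can be sketched as a content hash plus a key-by-key diff. The function names, the `ABSENT` marker, and the example keys are hypothetical; the platform's own versioning scheme is not described here.

```python
import hashlib
import json

ABSENT = "<absent>"  # marker for a key that exists in only one version


def config_version(config: dict) -> str:
    """Short, order-independent fingerprint of a parameter configuration."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def diff_configs(old: dict, new: dict) -> dict:
    """Map each changed key to its (old, new) pair of values."""
    changes = {}
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key, ABSENT), new.get(key, ABSENT)
        if before != after:
            changes[key] = (before, after)
    return changes
```

Storing the fingerprint beside each rendered output is one way to keep a chart tied to the assumption that produced it.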

PREDICT
02 / KAGGLE HOUSE PRICES

Prediction, under a microscope.

The Kaggle House Prices work is presented as a small, repeatable modeling study. It records how raw fields become features, how validation tests assumptions, how model families are compared, and how residuals inform the next revision.

Expanded paper: audit, modeling, validation, and residual diagnostics.

End-to-end modeling protocol

feature → fit → diagnose
01

Audit the raw table

Separate numeric and categorical fields, inspect missingness, flag high-leverage observations, and record the preprocessing rules. Before fitting a model, make sure the table itself is coherent.
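
A standard-library sketch of the audit step: infer whether each column is numeric or categorical and measure its share of missing values. Treating `""`, `"NA"`, and `None` as missing is an assumption to revisit per column, since a literal "NA" can be a meaningful category in some tables. Leverage flagging is not shown.

```python
def _is_number(value):
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def audit_table(rows):
    """Summarize each column: inferred kind and share of missing values."""
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Assumed missing markers; adjust to the table's own conventions.
        present = [v for v in values if v not in (None, "", "NA")]
        numeric = bool(present) and all(_is_number(v) for v in present)
        report[col] = {
            "kind": "numeric" if numeric else "categorical",
            "missing_share": 1 - len(present) / len(values),
        }
    return report
```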

02

Engineer the feature space

Encode categorical fields, derive a few composite indicators, and inspect skewed variables. Keep every feature traceable to an original column so the model still has an interpretable story.
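
Traceability can be made explicit by returning a lineage map alongside the engineered features. The sketch below one-hot encodes a categorical field and adds one composite indicator; the column names and the `TotalArea` feature are illustrative assumptions, not the study's recorded feature set.

```python
def one_hot(rows, column):
    """One-hot encode `column`; return new rows and a feature -> sources map."""
    levels = sorted({row[column] for row in rows})
    lineage = {f"{column}={level}": [column] for level in levels}
    encoded = []
    for row in rows:
        out = {k: v for k, v in row.items() if k != column}
        for level in levels:
            out[f"{column}={level}"] = 1 if row[column] == level else 0
        encoded.append(out)
    return encoded, lineage


def add_total_area(rows, lineage):
    """Hypothetical composite: above-ground living area plus basement area."""
    for row in rows:
        row["TotalArea"] = row["GrLivArea"] + row["TotalBsmtSF"]
    lineage["TotalArea"] = ["GrLivArea", "TotalBsmtSF"]
    return rows, lineage
```

Every engineered name in `lineage` points back to original columns, so a coefficient or importance score can always be read in terms of the raw table.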

03

Transform the target

Inspect the target distribution and compare raw and transformed target scales when useful. The aim is a steadier learning signal and residuals that are easier to read, not transformation for its own sake.
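
A small sketch of that comparison: measure skewness on the raw scale and on a log scale, and keep the inverse so predictions can be reported in the original units. The prices are made-up illustrative values, and `log1p` is one common choice, not necessarily the transform used in the study.

```python
import math


def skewness(values):
    """Population skewness: third central moment over variance to the 3/2."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Illustrative values only, not observations from the dataset.
prices = [100_000, 120_000, 135_000, 150_000, 180_000, 250_000, 600_000]
log_prices = [math.log1p(p) for p in prices]

raw_skew = skewness(prices)
log_skew = skewness(log_prices)

# Predictions made on the log scale are mapped back with the inverse.
restored = [math.expm1(v) for v in log_prices]
```

On these values the log scale is less right-skewed than the raw scale; whether that helps a given model is still a question for validation.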

04

Compare model families

Move from a transparent linear baseline to regularized and gradient-boosting models. The baseline exposes direct structure; regularization checks feature discipline; boosting tests nonlinear interactions.
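
The step from baseline to regularized model can be seen in a one-feature case, where ridge regression with an unpenalized intercept has the closed form slope = Sxy / (Sxx + λ). This is a teaching sketch under that simplification; gradient boosting is not shown, and the `ridge` parameter name is an assumption.

```python
def fit_line(xs, ys, ridge=0.0):
    """Fit y = a + b*x by least squares; ridge > 0 shrinks b toward zero."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / (sxx + ridge)
    intercept = y_bar - slope * x_bar   # the intercept itself is not penalized
    return intercept, slope


baseline = fit_line([1, 2, 3, 4], [2, 4, 6, 8])           # plain least squares
shrunk = fit_line([1, 2, 3, 4], [2, 4, 6, 8], ridge=5.0)  # penalized slope
```

With `ridge=0.0` the function is ordinary least squares, so the two fits differ only in the penalty, which keeps the comparison readable.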

05

Validate, tune, and read residuals

Use cross-validation, bounded parameter search, and error slices rather than optimizing a single training split. Residuals point to missing representation: sparse neighborhoods, extreme homes, or interactions that need a better feature.
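
A standard-library sketch of k-fold validation that returns out-of-fold residuals and then slices their error by group. The contiguous folds assume the rows were shuffled beforehand; the `fit` and `predict` callables and the mean-only model are placeholders for whichever model family is under test.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(xs, ys, fit, predict, k=5):
    """Return out-of-fold residuals (actual minus predicted), in row order."""
    residuals = [None] * len(xs)
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            residuals[i] = ys[i] - predict(model, xs[i])
    return residuals


def rmse_by_slice(residuals, labels):
    """Root-mean-square error per group label, e.g. per neighborhood."""
    sums = {}
    for r, label in zip(residuals, labels):
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + r * r, count + 1)
    return {label: (total / count) ** 0.5 for label, (total, count) in sums.items()}


def fit_mean(xs, ys):
    """Placeholder model: predict the training mean for every row."""
    return sum(ys) / len(ys)


def predict_mean(model, x):
    return model
```

A slice with a much larger error than the rest is a pointer to where the representation is thin, which is the reading of residuals described above.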

Analysis principle: every score is treated as evidence of a modeling decision, not as the final story.

House-price inference lab

A compact visual record of a tabular-regression workflow: inspect, encode, compare, validate, and review residuals.

Method view · no score claim
Illustrative normalized observations · least-squares fit

Model comparison lens

Linear baseline — interpretable anchor

Start with a transparent regression model to surface direct relationships, expose data issues, and anchor later comparisons.

Method note: this panel compares modeling roles, not measured leaderboard scores.
Overall quality: structural rating
Living area: size signal
Year built: temporal context
Garage capacity: utility proxy
Neighborhood: location encoding
Basement area: latent capacity
Research note: this panel visualizes the analytical method and decision logic rather than claiming an unpublished score or ranking.
Decision trail: begin with a reliable baseline, then ask whether a new feature, a transformed target, or a more flexible learner improves validation behavior for a reason that can be explained. A better score matters only when its error pattern is understood.
EARTH
03 / ENVIRONMENTAL RESEARCH

Nature, made legible.

Environmental work connects descriptive statistics, time-series thinking, uncertainty, and public communication. The goal is to interpret a changing system carefully and then present the evidence in a form that people can understand and discuss.

Expanded paper: observation, uncertainty, interpretation, and public communication.

Environmental data as a living system

The work begins with monitoring signals, trend structure, and uncertainty. It then turns those findings into plain visual explanations and interactive exhibits, so audiences can engage with evidence instead of receiving it as a one-way message.

01
Distribution lens

Describe variability, concentration, and outlier structure before making causal claims.

02
Temporal lens

Trace trend, seasonality, anomaly, and persistence across environmental time series.

03
Public lens

Use visual explanation and interaction design to turn evidence into attention and action.
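
The distribution and temporal lenses above can be sketched with the standard library. The 1.5 × IQR outlier rule, the centered moving average, and the fixed threshold are common conventions assumed here, and the readings are illustrative numbers rather than monitoring data.

```python
import statistics


def distribution_summary(values):
    """Median, interquartile range, and points beyond 1.5 * IQR of the quartiles."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "median": median,
        "iqr": iqr,
        "outliers": [v for v in values if v < low or v > high],
    }


def centered_moving_average(series, window):
    """Centered moving average for an odd window; edges are left as None."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    out = [None] * len(series)
    for i in range(half, len(series) - half):
        out[i] = sum(series[i - half:i + half + 1]) / window
    return out


def flag_anomalies(series, window, threshold):
    """Indices where a value departs from the local trend by more than threshold."""
    trend = centered_moving_average(series, window)
    return [i for i, (v, t) in enumerate(zip(series, trend))
            if t is not None and abs(v - t) > threshold]


readings = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]      # illustrative, not measured
series = [10, 10, 10, 10, 25, 10, 10, 10, 10]    # illustrative, not measured
```

Both functions describe the data without explaining it, which matches the order of work stated above: variability and structure first, causal claims later.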

74% of 100 valid respondents attended once a year or less
50% of 100 valid respondents reported satisfaction
83% of 100 valid respondents would participate again
Illustrative signal field: water / motion / pattern

Interactive environmental science communication

An art-based public science initiative used discarded garments, plastic bottles, and interactive exhibits to make water-resource protection tangible. Water footprint, pollution, and circular-use themes became exhibits; audience feedback then tested whether visual artistry, novelty, and interaction improved attention to the science.

94.67% satisfaction

75 returned surveys: visitors most valued visual artistry, interaction, and novelty. The next design priorities were exhibit quantity, spatial layout, and thematic depth.

50 offline visitors
104 online viewers
75 survey responses
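
The uncertainty around a survey percentage can be made visible with a Wilson score interval. The count of 71 satisfied respondents is inferred from the reported 94.67% of 75 surveys, not stated in the source, and the interval assumes the returned surveys behave like a simple random sample, which self-selected responses may not.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


surveys = 75
satisfied = round(0.9467 * surveys)  # 71: inferred count, see the note above
low, high = wilson_interval(satisfied, surveys)
```

Under those assumptions the interval runs from roughly 87% to 98%, so the headline figure is better read as "high, with some width" than as a value precise to two decimals.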
EVIDENCE
Research profile

One quantitative language.
Multiple frontiers.

Across these projects, the working method stays consistent: define the system, test it with data, and present the evidence clearly. Whether the unit is a software module, a validation fold, a geometric layer, or an exhibition visitor, the reasoning should remain visible.

01

Software systems

Three documented applications spanning data integration, statistical modeling, feature extraction, visual analytics, geometric expression, access control, and auditability. Together they show an interest in research infrastructure as well as research outputs.

engineering
02

Predictive analytics

A Kaggle House Prices workflow centered on feature engineering, regression, model comparison, validation, and residual interpretation. It is presented as a reusable method rather than a leaderboard claim.

machine learning
03

Environmental research

Data-informed communication research combining survey signals, interactive art, water-resource science, material reuse, and measurable audience feedback. It expands quantitative practice toward questions of attention, understanding, and action.

interdisciplinary

Quantitative core

Probability · statistical inference · optimization · mathematical modeling · experimental reasoning · uncertainty awareness

Software architecture

Java · Spring Boot · JavaScript · Node.js · Python · Go · REST APIs · service-oriented workflows

Visual computing

Three.js · ECharts · WebGL concepts · interactive dashboards · geometric expression · dynamic rendering templates

Data systems

MySQL · Redis · task queues · data validation · RBAC · JWT sessions · audit logging