Research & Representative Work
Gene Expression Regulation Algorithms
Developing machine-learning-based tools for gene expression and functional enrichment analysis to improve the accuracy and efficiency of data mining in the life sciences.
Biological function exploration and interactive visualization of gene sets, built on knowledge bases and intelligent ranking algorithms.
CGPS
Uses machine learning to integrate the results of multiple mainstream enrichment tools, improving the biological relevance of the top-ranked pathways.
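As a rough illustration of what combining pathway rankings from several enrichment tools can look like, the sketch below averages each pathway's rank across tools. This is plain mean-rank aggregation, a deliberately simple stand-in and not the learned model CGPS uses, which is not described here. The tool names (`tool_a`, `tool_b`, `tool_c`), the pathway lists, and the function `aggregate_ranks` are all hypothetical.

```python
from collections import defaultdict


def aggregate_ranks(tool_rankings):
    """Combine per-tool pathway rankings by mean rank (lower is better).

    tool_rankings maps a tool name to a list of pathways, best first.
    A pathway missing from a tool's list is assigned rank len(list) + 1
    for that tool (an assumption made for this sketch).
    """
    pathways = set()
    for ranking in tool_rankings.values():
        pathways.update(ranking)

    totals = defaultdict(float)
    for ranking in tool_rankings.values():
        position = {p: i + 1 for i, p in enumerate(ranking)}
        penalty = len(ranking) + 1
        for p in pathways:
            totals[p] += position.get(p, penalty)

    n_tools = len(tool_rankings)
    mean_rank = {p: totals[p] / n_tools for p in pathways}
    # Sort by mean rank, breaking ties alphabetically for determinism.
    return sorted(mean_rank.items(), key=lambda kv: (kv[1], kv[0]))


# Hypothetical rankings from three hypothetical tools.
rankings = {
    "tool_a": ["apoptosis", "cell cycle", "p53 signaling"],
    "tool_b": ["cell cycle", "apoptosis"],
    "tool_c": ["cell cycle", "p53 signaling", "apoptosis"],
}
combined = aggregate_ranks(rankings)
for pathway, score in combined:
    print(f"{pathway}: mean rank {score:.2f}")
```

A learned integrator would replace the plain average with weights or a model fitted to benchmark data; the input and output shapes would stay much the same.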
High-Performance Computing for Life Sciences
Full-stack design and optimization, from system architecture and system software to evaluation methods, targeting the memory-intensive and I/O-intensive nature of life science computing.
An open performance evaluation framework for life science computing that characterizes computational behavior using real software workloads and guides cluster design, hardware selection, and scheduling optimization.
Axon OS
An LLM-native cluster operating system for converged supercomputing and AI computing.
Large-Scale Parallel Cluster Architecture
Designed and built a petaflop-scale GPU double-precision computing cluster with an architecture optimized for life science workloads; the measured GPU Linpack efficiency is 74%, which is at an advanced level.
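Linpack efficiency is conventionally reported as the ratio of measured performance (Rmax) to theoretical peak (Rpeak). The sketch below shows that arithmetic. The GPU count, per-GPU peak, and measured value are illustrative numbers chosen to produce a 74% ratio; they are not the actual configuration of the cluster described above, and the function name `linpack_efficiency` is hypothetical.

```python
def linpack_efficiency(rmax_tflops, n_gpus, peak_tflops_per_gpu):
    """Return Rmax / Rpeak, where Rpeak = n_gpus * peak_tflops_per_gpu.

    All values are FP64 TFLOPS. Raises ValueError if Rpeak is not positive.
    """
    rpeak_tflops = n_gpus * peak_tflops_per_gpu
    if rpeak_tflops <= 0:
        raise ValueError("theoretical peak must be positive")
    return rmax_tflops / rpeak_tflops


# Illustrative numbers only: 100 GPUs at 10 TFLOPS FP64 each gives
# Rpeak = 1000 TFLOPS (1 PFLOPS); a measured Rmax of 740 TFLOPS
# corresponds to 74% efficiency.
efficiency = linpack_efficiency(rmax_tflops=740.0, n_gpus=100, peak_tflops_per_gpu=10.0)
print(f"Linpack efficiency: {efficiency:.0%}")
```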
High-Quality Dataset Construction
To address data silos and uneven data quality in biomedicine, proposes a systematic data governance framework for building "readable, understandable, trustworthy" high-quality data assets.
"High-quality datasets do not exist naturally; they are the result of systematic engineering."