Academic Profile · Updated Tuesday, May 5, 2026
Research Overview
Zexing Zhang · PhD Student, National University of Defense Technology
Research on foundation model-driven intelligent perception, trustworthy evaluation, and autonomous decision-making.
I am currently pursuing a PhD at the National University of Defense Technology, with research focused on foundation model-driven intelligent perception and autonomous decision-making. My work centers on large language models, reasoning agents, and multimodal physiological signals, with an emphasis on verifiable evaluation, cross-modal representation learning, and generalization in real-world scenarios.
Foundation Model-Driven Intelligent Perception · Trustworthy Evaluation of Large Models · Multimodal Medical Signals · Biometrics and Wearables
National University of Defense Technology · PhD Student
Sorted by CCF / JCR rank by default, filterable by year, with author names automatically highlighted.
From Teacher Pathways to Invariant Manifolds: Consensus Subspace Distillation for TSFMs
Zexing Zhang · First Author
International Conference on Machine Learning, 2026
CCF A (Main Conference) · EI · Scopus · Accepted
Abstract
Time-series foundation models (TSFMs) deliver strong cross-domain generalization, but their scale makes deployment costly. Knowledge distillation is a natural compression route, yet prior TSFM distillation typically imitates teacher outputs, features, or pairwise relations, and therefore remains tightly coupled to teacher-specific training trajectories while underutilizing two empirical properties: (i) high-level representations across model scales tend to converge toward a shared, approximately low-rank geometry, and (ii) layer-wise utility follows a long-tail pattern. We propose consensus subspace distillation, which reframes distillation as aligning a student to a model-agnostic geometric object: a scale-invariant low-rank consensus subspace together with its center statistics. Offline, we screen high-contribution layers via drop-layer marginal loss, estimate a shrinkage-stabilized covariance from their embeddings, and derive a truncated eigensubspace that defines a consensus projector. Online, we project student embeddings into this subspace and match the teacher’s projected mean and covariance using a lightweight mean-covariance objective, enabling stable optimization without rigid pointwise feature binding. To mitigate subset-induced bias, we further introduce a frequency-domain uncertainty injection mechanism that inflates spectral density based on characteristic-function discrepancies and injects dispersion only within the consensus directions. Across forecasting and imputation, the distilled student matches or slightly improves upon the teacher, while exhibiting a predictable trade-off under strict zero-shot classification. With MOMENT-Large as teacher, we achieve about 90% parameter reduction and substantial distillation-time savings while retaining comparable performance across multiple time-series tasks. Code and compressed weights are available at anonymous.4open.science/r/CSD-13C3/.
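The offline stage described in the abstract (shrinkage-stabilized covariance, truncated eigensubspace, projected mean/covariance matching) can be sketched as follows. This is a minimal illustration assuming pooled teacher embeddings as plain NumPy arrays; the function names, rank, and shrinkage scheme are hypothetical, not the paper's implementation.

```python
import numpy as np

def consensus_projector(teacher_embs, rank=8, shrinkage=0.1):
    """Estimate a shrinkage-stabilized covariance from teacher embeddings
    and return the top-`rank` eigenvector basis plus center statistics."""
    mu = teacher_embs.mean(axis=0)
    X = teacher_embs - mu
    cov = X.T @ X / max(len(X) - 1, 1)
    # Shrink toward a scaled identity to stabilize the eigensubspace estimate
    d = cov.shape[0]
    cov = (1 - shrinkage) * cov + shrinkage * (np.trace(cov) / d) * np.eye(d)
    w, V = np.linalg.eigh(cov)
    P = V[:, np.argsort(w)[::-1][:rank]]   # (d, rank) consensus basis
    return P, mu, P.T @ cov @ P            # basis, center, projected covariance

def mean_cov_loss(student_embs, P, teacher_mu, teacher_cov_proj):
    """Lightweight mean-covariance objective inside the consensus subspace."""
    Z = (student_embs - student_embs.mean(axis=0)) @ P
    mu_s = (student_embs @ P).mean(axis=0)
    cov_s = Z.T @ Z / max(len(Z) - 1, 1)
    mu_t = teacher_mu @ P
    return np.sum((mu_s - mu_t) ** 2) + np.sum((cov_s - teacher_cov_proj) ** 2)
```

A student whose projected statistics already match the teacher's incurs near-zero loss; a shifted student is penalized only along the consensus directions, which is the point of avoiding rigid pointwise feature binding.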
PPGPT: Transferring Next-Token Modeling from Language to PPG Signals
The success of large language models (LLMs) in cognitive tasks prompts the question of whether their next-token prediction (NTP) paradigm can be adapted to model physiological signals from wearable devices. A key target for this adaptation is photoplethysmography (PPG), the most prevalent sensing modality in consumer wearables for non-invasive monitoring of diverse physiological conditions. Unlike in NLP, where NTP aligns with generative objectives, physiological signal analysis involves fundamentally different tasks, such as continuous parameter estimation (regression) and discrete state recognition (classification). This disparity creates a semantic mismatch between the pre-training paradigm and the downstream tasks. To bridge this gap, we propose PPGPT, the first foundation model that reformulates NTP into next-feature token prediction (NFTP), learning hierarchical feature transition probabilities to unify pre-training and downstream objectives. PPGPT features a novel dual-stream encoder that generates feature tokens by jointly modeling temporal dynamics and local-global morphological patterns. The model is developed using a two-stage training framework: it is first pre-trained on a large-scale mixed dataset of 1.6 billion data points and then validated on our newly released BioMTL benchmark, which includes data from 172 subjects over 285 days across seven different tasks. Extensive experiments show that PPGPT significantly outperforms competing methods, achieving a 16.5% improvement in F1-score and a 25.9% reduction in Mean Absolute Error (MAE). Furthermore, the model demonstrates robust few-shot learning capabilities.
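One plausible reading of the "feature token" idea above — quantizing continuous encoder features to discrete tokens so that a next-token objective applies — can be sketched as below. The codebook, its learning, and the dual-stream encoder are all omitted; names and shapes are illustrative assumptions, not the PPGPT implementation.

```python
import numpy as np

def feature_tokens(features, codebook):
    """Map each continuous feature vector to the id of its nearest codebook
    entry, yielding a discrete token sequence a next-token predictor can model.
    (Illustrative assumption; PPGPT's actual tokenization may differ.)"""
    # Pairwise squared distances between T feature vectors and K codes: (T, K)
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)  # token id per timestep
```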
Who's Adam? Benchmarking Hallucinations in Scientific Dialogue
Zexing Zhang · First Author
ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2026
CCF A (Main Conference) · EI · Scopus · Accepted
Abstract
LLMs and LMMs are increasingly applied in scientific dialogue, but it remains unclear whether they can reliably ground specific dialogue statements to paper-based evidence. A central challenge is paper-grounded hallucination under a paper-as-truth setting: statements that are contradicted by, not found in, or otherwise not decidable from the paper PDF. These hallucinations can be caused by both human misinterpretations and model-generated assertions, ultimately undermining the efficiency, fairness, and credibility of scientific dialogue. Existing benchmarks often overlook this issue, focusing either on subjective macro-level quality assessments or lacking cross-modal evidence localization. We introduce ADAM-Bench (Auditing Dialogue Assertions with Multimodal Evidence), a benchmark for paper-grounded hallucinations in scientific dialogue. Starting from around 27,000 papers, ADAM-Bench is a multi-layer benchmark with three tiers: Scale, Core, and Gold. ADAM-Bench pairs approximately 1 million atomic claims with over 7 million multimodal evidence objects extracted from the corresponding PDFs. We build it through a four-stage pipeline of claim atomization, candidate evidence recall, model-assisted pre-alignment, and human verification. Based on this dataset, we define two tasks: hallucination detection and minimal evidence set localization. Additionally, to avoid the brittleness introduced by single-rationale supervision, we formalize minimal evidence as a set of equivalent evidence sets and evaluate localization by best-matching against multiple gold evidence sets. We conduct a comprehensive benchmark of 34 LLMs and 10 LMMs, spanning large proprietary models (Claude-Opus-4-6, GPT-5.2) and open-source models (Qwen3-235B, GLM-4.6V 106B). Results are markedly low (25.2%–51.1%), indicating that grounding conversational hallucinations in real multimodal papers remains far from solved. 
We hope this benchmark will contribute to building scientific assistants that make calibrated judgments, cite minimal, auditable evidence, and mitigate the impact of hallucinations in scientific discovery evaluation.
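The "best-matching against multiple gold evidence sets" evaluation described above can be illustrated with a small scoring function. This is a hedged sketch: the benchmark's exact matching metric is not specified here, so set-level F1 is assumed purely for illustration.

```python
def evidence_set_f1(predicted, gold_sets):
    """Score a predicted evidence set against each equivalent gold evidence
    set and keep the best match, avoiding single-rationale brittleness.
    (Illustrative; ADAM-Bench's exact scoring may differ.)"""
    def f1(pred, gold):
        if not pred or not gold:
            return 0.0
        tp = len(pred & gold)
        if tp == 0:
            return 0.0
        p, r = tp / len(pred), tp / len(gold)
        return 2 * p * r / (p + r)
    pred = set(predicted)
    return max(f1(pred, set(g)) for g in gold_sets)
```

Because minimal evidence is a set of equivalent sets, a model is rewarded for matching any one valid rationale rather than a single annotated one.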
Integrated Channel Equally-Divided and Coordinate Attention Feature Pyramid for subtype detection of lung cancer
Lung cancer, a leading global cause of cancer incidence and mortality, demands accurate detection. Current research primarily targets early nodule detection and distinguishing benign from malignant tumors, with limited focus on lung cancer classification. Identifying various lung cancer types is vital for tailored treatments, while challenges persist in localizing small lesions and early tumors. To tackle these challenges, we propose a framework integrating the Channel Equally-Divided (CED) module and the Coordinate Attention Feature Pyramid Network (CAFPN). CAFPN, an innovative feature pyramid structure, integrates the Semantic Information Enhanced (SIE) module and Coordinate Attention. The SIE module filters redundant semantic information, enriches texture features through Coordinate Attention, emphasizes shallow semantic details, and amplifies trustworthy specifics for enhanced deep semantic information. Additionally, the CED module is devised to proficiently extract local contextual information across channels for more precise feature representations. The superior performance of the proposed method was empirically validated through comparative experiments on two mainstream datasets, with Competition Performance Metric (CPM) scores and Mean Average Precision (mAP) values reaching 0.942 and 99.18%, respectively, outperforming current state-of-the-art approaches.
Spore: Spatio-Temporal Collaborative Perception and representation space disentanglement for remote heart rate measurement
Zexing Zhang · First Author
Neurocomputing, 2025
JCR Q1 · SCI-E · EI · Scopus · INSPEC
Abstract
Remote Photoplethysmography (rPPG) leverages standard RGB cameras for contactless heart rate monitoring, overcoming the limitations of traditional PPG technology in telemedicine and offering a highly scalable, cost-effective health monitoring solution. Despite the advancements of current deep learning methods, which utilize spatiotemporal convolutional networks to capture subtle rPPG signals, these approaches often fail to fully exploit local similarities and global quasi-periodicity in both spatial and temporal dimensions. Additionally, non-physiological noise remains prevalent in the representation space, impeding the accurate estimation of physiological parameters across diverse representation domains. To address these measurement challenges, we propose Spore, a novel training strategy that integrates a Spatio-Temporal Cooperative Perception Network (STCPNet) and a Separable Network (SpNet). Spore effectively disentangles noise and extracts physiological signals through differential orthogonal disentanglement and parallel approximation techniques, ensuring precise measurement of heart rate. STCPNet meticulously aggregates semantic context across spatial and temporal dimensions, enhancing global-level and trend cross-correlations in a fine-grained manner. Meanwhile, the resource-efficient SpNet identifies and constructs target representation spaces by realigning the distribution of the source latent space, thereby adaptively capturing disentangled physiological signal patterns from the computationally intensive STCPNet. For validation, extensive experiments were conducted not only on multiple benchmark datasets but also through deployment testing in real-world scenarios. The results demonstrate that our proposed training strategy achieves state-of-the-art performance in heart rate measurement while maintaining resource efficiency. The code will be released at https://github.com/zacheryzhang/spore.
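Once an rPPG waveform has been recovered, heart rate is typically read off as the dominant spectral peak in the physiological band. The snippet below is generic post-processing of that kind, not the Spore pipeline itself; the sampling rate and band limits are illustrative assumptions.

```python
import numpy as np

def heart_rate_bpm(signal, fs=30.0, lo=0.7, hi=3.0):
    """Estimate heart rate from an rPPG waveform by locating the dominant
    spectral peak within the typical HR band (0.7-3 Hz, i.e. 42-180 bpm).
    (Generic sketch; Spore's measurement head is not reproduced here.)"""
    x = signal - np.mean(signal)                   # remove DC component
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)    # frequency grid in Hz
    power = np.abs(np.fft.rfft(x)) ** 2            # periodogram
    band = (freqs >= lo) & (freqs <= hi)           # restrict to HR band
    return 60.0 * freqs[band][np.argmax(power[band])]
```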
A general framework for generative self-supervised learning in non-invasive estimation of physiological parameters using photoplethysmography
Aligning physiological parameter labels with large-scale photoplethysmographic (PPG) data for deep learning is challenging and resource-intensive. While self-supervised representation learning (SSRL) can handle limited annotated data, the challenge lies in learning robust shared representations from vast unlabeled data and integrating various contextual cues to learn distinctive representations. To alleviate these challenges, a generative SSRL framework TS2TC is proposed to collaboratively utilize the temporal, spectrogram, and temporal-spectrogram mixed domains to explore and incorporate the unique features of PPG for universal and non-invasive physiological parameter estimation. Initially, a pretext task named Cross-Temporal Fusion Generative Anchor (CTFGA) is designed, modeling temporal dependencies and reconstructing independent segments at a coarse level to provide robust global feature extraction and local semantic contextual representation. The framework also includes sub-signals from PPG with diverse frequency scales and order derivatives reflecting hemodynamics to facilitate learning shared representations at varying semantic levels. Secondly, an advanced cognitive-inspired dual-process transfer (DPT) strategy is formulated, consisting of prior-dependent autonomous processes and posterior observation reasoning processes, to leverage the independent and integrated advantages of shared and specific representations. Furthermore, TS2TC introduces a novel bilinear temporal-spectrogram fusion method in the mixed domain, aligning latent representations from different domains, and establishing fine-grained contextual interactions at the feature level across multiple sources of information. Extensive experiments on physiological parameter estimation tasks showed that the joint performance of CTFGA and DPT outperforms standard generative learning significantly. 
TS2TC achieved an average 2.49% improvement in RMSE over the current state-of-the-art estimation methods with only 10% training data.
DeBeauty: A Joint Framework for Facial Beautification Removal Based on Spatial Collaborative Adaptation and Hyperplane Relocation
IEEE International Conference on Acoustics, Speech and Signal Processing, 2025 · DOI
CCF B · EI · Scopus
Abstract
Facial beautification removal presents a formidable inverse challenge due to the inherent diversity and unpredictability of beautification processes. Current methodologies often fall short in effectively restoring facial structural alterations and preserving texture features during makeup removal. This paper introduces DeBeauty, an innovative joint framework for facial beautification removal, comprising two primary workflows: Adversarial De-Makeup Flow (ADF) and Relocation Deformation Flow (RDF). ADF incorporates Multi-Level Perception Collaborative Discrimination (MLPCD) and Discriminator-Guided Spatial Adaptive Multi-Scale Attention (SAMA), which enhance the detection of subtle makeup and facilitate the comprehensive removal of extensive makeup while preserving facial texture. RDF introduces a Hyperplane Relocation Strategy that adjusts the latent code of the input face to align with the original structural distribution. Experimental evaluations on the newly proposed Multivariate Beautified Face dataset demonstrate that this approach effectively restores the original color and structural context of the face while preserving essential facial features, achieving state-of-the-art performance.
A survey on deep learning-based object detection for crop monitoring: pest, yield, weed, and growth applications (H. Lu et al.)
Modern agriculture faces significant challenges in enhancing crop production efficiency and management. Crop monitoring has emerged as a critical component for achieving precision and intelligent agricultural management. This paper presents a comprehensive review of the latest advancements in deep learning-based object detection techniques applied to crop monitoring. Object detection methods are categorized into single-stage and two-stage approaches, further classified based on feature extraction techniques, namely CNN-based and SSM-based methods. This analysis highlights the significant contributions and limitations of these methods across four primary application domains: pest and disease detection, crop growth monitoring, yield estimation, and weed detection. Statistical data indicates that research in these domains accounts for 84% of the total studies in crop monitoring. Additionally, challenges related to data collection and processing, model selection, and optimization are discussed, along with potential solutions. A summary of publicly available datasets, commonly used evaluation metrics, and performance comparisons of mainstream models in crop monitoring research is also provided. In the future, emphasis is placed on algorithm performance optimization, improved dataset quality, and the development of customized solutions tailored for practical agricultural applications. Addressing these challenges will further advance the modernization and intelligence of agricultural practices.
MBRSTCformer: a knowledge embedded local-global spatiotemporal transformer for emotion recognition
Emotion recognition is an essential prerequisite for realizing generalized BCI, which possesses an extensive range of applications in real life. EEG-based emotion recognition has become mainstream due to its real-time mapping of brain emotional activities, so a robust EEG-based emotion recognition model is of great interest. However, most existing deep learning emotion recognition methods perform feature extraction on the EEG signal as a whole, which destroys local stimulation differences and fails to extract local features of brain regions well. Inspired by the cognitive mechanisms of the brain, we propose the multi-brain regions spatiotemporal collaboration transformer (MBRSTCformer) framework for EEG-based emotion recognition. First, inspired by prior knowledge, we propose the Multi-Brain Regions Collaboration Network. The EEG data are processed separately after being divided by brain regions, and stimulation scores are computed to quantify the stimulation produced by different brain regions and feed the stimulation degree back to MBRSTCformer. Second, we propose a Cascade Pyramid Spatial Fusion Temporal Convolution Network for multi-brain-region EEG feature fusion. Finally, we conduct comprehensive experiments on two mainstream emotion recognition datasets to validate the effectiveness of the proposed MBRSTCformer framework. We achieved accuracies of 98.63, 98.15, and 98.58 on the three dimensions (arousal, valence, and dominance) of the DEAP dataset, and 97.66, 97.07, and 97.97 on the DREAMER dataset.
SDA-SAM: Semantic-Driven Adaptive Mixed-Precision Quantization for Segment Anything Model
Zexing Zhang · Co-corresponding Author
Proceedings of the International Joint Conference on Neural Networks, 2026
CCF C · EI · Scopus · Accepted
PPG Sensor-Based Biometric Identification and Physiological Analysis via Temporal-Frequency Disentanglement with Liquid Neural Networks
Photoplethysmography (PPG) sensors support both physiological monitoring and biometric identification, making them key components in wearable sensing systems. However, real-world applications face challenges from signal nonstationarity and physiological variability. This work proposes a temporal-frequency manifold disentanglement framework to improve the robustness and accuracy of PPG-based biometric recognition. A closed-form continuous-time (CfC) liquid neural network captures temporal and spectral features from raw PPG signals, while an orthogonal manifold projection separates identity-related and physiological representations. To support physiological analysis, we construct and release a new multiphysiological PPG dataset with synchronized annotations for body mass index (BMI), blood pressure, blood glucose, and heart rate. Our method achieves 94.12% accuracy (F1-score: 0.93), outperforming eight state-of-the-art approaches. Further analysis reveals that BMI, blood glucose, and heart rate strongly influence identity features, highlighting the need for physiologically aware modeling in sensor systems. The proposed framework enhances PPG sensor signal interpretation, offering a scalable solution for real-time biometric sensing applications.
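The orthogonal-projection idea — splitting features into an identity component and a physiological residual orthogonal to it — can be sketched as follows. The learned identity basis and the CfC network are omitted, and all names here are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def orthogonal_split(features, identity_basis):
    """Decompose feature vectors into a component inside the identity subspace
    (spanned by the orthonormal columns of `identity_basis`) and the residual
    orthogonal to it, which would carry physiological variation.
    (Sketch of the concept only, not the paper's learned projection.)"""
    B = identity_basis                      # (d, k), orthonormal columns
    identity_part = features @ (B @ B.T)    # project onto span(B)
    physio_part = features - identity_part  # orthogonal complement
    return identity_part, physio_part
```

By construction the two parts sum back to the input, and the physiological residual has zero projection onto the identity subspace.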
PIKGMA: PrIori Knowledge-Guided Multimodal Alignment And Domain Adaptation For Emotion Recognition
IEEE International Conference on Computer Research and Development, 2025 · DOI
EI · Scopus
Abstract
Multimodal domain-adaptive emotion recognition aims to utilize the feature information of multiple modalities to achieve cross-domain emotion recognition, and properly constraining the inconsistency between inner-domain and cross-domain modalities has been a research hot spot. To address these challenges, we propose a priori knowledge-guided multimodal alignment domain-adaptive emotion recognition model, PIKGMA. PIKGMA is a semi-supervised multimodal physiological-signal domain adaptation model for emotion recognition. It aligns multimodal emotion features with the support of inner-domain and cross-domain modality alignment. Meanwhile, to better utilize the priori knowledge from the source domain (the data distribution of the source domain), we propose the priori dual-alignment bank strategy. The memory bank can be used to assist cross-domain modality alignment and pseudo-label generation for domain adaptation. We conducted detailed experiments using two physiological signals on the DREAMER dataset, and the results show that PIKGMA performs very well and is superior to existing methods.
UAVEL-YOLO: An Efficient and Lightweight Target Detection Method for Aerial Imagery Captured by UAVs
International Conference on Geology, Mapping and Remote Sensing, 2025 · DOI
EI · Scopus
Abstract
To address the challenge of target detection in UAV aerial images, which is exacerbated by large scale variations and complex backgrounds, this paper proposes a novel target detection model named UAVEL-YOLO. First, we design a lightweight Self-Adaptive Multi-Scale Contextual Perception Attention (SAMCPA) mechanism that enables the network to more effectively capture contextual information in the image and focus on more important regions, thereby enhancing the perceptual and interpretive abilities of the model. Second, we propose a Dual Branch Linear Separable Kernel (DBLSK) module, which not only suppresses the exponential growth of the parameter count but also provides richer gradient flow information. Moreover, to better detect small, dense, and variably sized objects, we incorporate a P2 detection head to enhance the model’s ability to perceive small targets. The proposed model is evaluated on the VisDrone2019 dataset. Compared with YOLO11s, our model improves mAP@.5, mAP@.5:.95, Precision, and Recall by 6.5%, 4.3%, 3.9%, and 5.5%, respectively, while reducing the model’s parameter count by 66.96%.
MILD: A Multimodal Biometric Recognition Framework Integrating Large Foundation Models
Chinese Conference on Biometric Recognition, 2024 · DOI
EI · CPCI-S · Scopus
Abstract
Traditional unimodal biometric recognition technologies, while widely applied across various fields, still face limitations such as environmental interference, spoofing attacks, and individual differences, leading to insufficient accuracy and reliability. Consequently, multimodal biometric recognition technology enhances recognition performance by integrating multiple biometric features. However, effectively merging the semantic information of different modalities remains a key challenge. This paper proposes a multimodal biometric recognition framework with integrated large models (MILD). The framework incorporates foundational large models for audio, language, and images, and innovatively designs modality adapters and multimodal decoders to address the semantic alignment issue of large models. Additionally, MILD uniquely combines voiceprints, electrocardiograms (ECG), and palm prints to enhance the anti-spoofing performance of biometric recognition. Experimental results validate the effectiveness of the MILD framework in cross-modal feature fusion and accurate recognition, demonstrating the potential of foundational large models in complex scenarios, with the highest cross-dataset recognition accuracy reaching 97.65%.
MultiBioGM: a hand multimodal biometric model combining texture prior knowledge to enhance generalization ability
Zexing Zhang · First Author
Chinese Conference on Biometric Recognition, 2023 · DOI
EI · CPCI-S · Scopus
Abstract
Authentication through hand texture features is one of the crucial directions in biometric identification, and some recognition methods based on traditional machine learning or deep learning have been proposed. However, the generalization ability of these methods is not satisfying due to the different entities, backgrounds, and sensors. In this paper, based on the three modalities of fingerprint, fingervein, and palmprint, the texture prior knowledge extractor (PKE) is innovatively designed as a unified paradigm for texture extraction, aiming to improve the model generalization ability through prior knowledge. The feature vectors of texture images are obtained for matching by a knowledge embedding extractor (KEG) based on the Siamese Network. The credibility algorithm is proposed for multimodal decision-level feature fusion. Cascading PKE and KEG is our proposed multimodal biometric generalization model MultiBioGM. Experimental results on three multimodal datasets demonstrate the effectiveness of our model for biometrics, which achieves 0.098%, 0.024%, and 0.117% EERs on unobserved data.
Pre-clustered Generative Adversarial Network Model for Mongolian Font Style Transfer
International Conference on Optimization, Simulation and Control, 2022 · DOI
Scopus
Abstract
Font style transfer has important application value in the field of data augmentation and can be used to alleviate the problem of insufficient data in fields such as handwritten character recognition, glyph inference and restoration, and ancient book restoration. The complexity of traditional Mongolian brings many challenges to character recognition and the restoration of ancient books. This paper first builds a small-scale Mongolian font-style dataset and a graph cluster aggregator algorithm. Secondly, an improved conditional generative adversarial network model with an MSE loss function is proposed, and the self-built dataset is used to train the model after image aggregation. The experimental results show that the model can learn the traditional Mongolian font style and transfer it to text with the same semantics with a small amount of training, generating images with a prominent style.
Preprints & Working Papers
Dynamic Incremental Learning for Non-invasive Blood Glucose Estimation from Wearable Physiological
Zexing Zhang · First Author
Engineering Applications of Artificial Intelligence, 2026
JCR Q1 · SCI-E · EI · Scopus
Service & Exchanges
Reviewing, conference exchanges, and academic talks
Presented and exchanged research at the AAAI Conference on Artificial Intelligence.
2024.12 · Changchun, China
Attended the CCF Jilin Graduate Academic Exchange Seminar and gave a keynote presentation; received the Outstanding Paper Award.
2024.10 · Hangzhou, China
Attended the China National Computer Congress.
2023.12 · Xuzhou, China
Presented and exchanged research at the Chinese Conference on Biometric Recognition; received a Best Paper Nomination.