Showing results 241-260 of 10,000 matching records (search time: 126 ms).
241.
Existing trajectory generation methods based on generative adversarial imitation learning (GAIL) mostly model human mobility patterns with a Markov decision process (MDP). With limited training data, these approaches struggle to learn the latent relationship between action selection and location, and their state transition functions ignore the distance constraints between locations, so the quality of the generated trajectories remains limited. To address this, this paper proposes a …
242.
Few-shot image classification aims to learn a classifier from limited labeled data. Although existing methods have made notable progress, the limited number of training samples, large intra-class variation, and small inter-class variation make support and query samples easy to confuse, so existing methods still face challenges in extracting useful features and accurately distinguishing image categories. To address these problems, we design a novel multi-embedding enhancement network …
243.
  
Traditional Chinese medicine (TCM) has thousands of years of experience in disease prevention, and TCM body constitution, an important component of TCM, is closely related to individual health and therefore plays an important role in disease prevention and treatment. In recent years, the rapid development of information technology and artificial intelligence has driven the wide application of many intelligent techniques to TCM constitution identification. These techniques not only make the traditional constitution identification process more scientific and systematic, but also offer the modernization of TCM …
244.
  
Given the complexity of advanced persistent threats (APT), and because provenance graphs can link system events through causal relationships, existing research has attempted to apply provenance graph techniques to detecting such attacks and to forensic analysis. To address the explosion in provenance graph scale, the over-smoothing caused by imbalanced data samples, and the data sparsity that overly diverse system event types cause for relational algorithms, a multi-dimensional edge-optimized provenance graph …
245.
  
Prompt learning can recast downstream tasks as masked-prediction tasks in the form of the pre-training task. However, when prompt learning is applied to relation extraction, the masked output is a semantic vector representation of the label category, which tends to make the generation space too large and the semantic resolution of the mask insufficient. To address this problem, a relation extraction method based on prompt-label collaboration is proposed, which builds two sets of synonym labels for each relation class; one set is used to …
246.
  
The rapid development of the Global Positioning System (GPS) and the Global System for Mobile Communications (GSM), together with the widespread use of mobile devices, has produced massive amounts of trajectory data. Current trajectory processing methods usually require fixed-length vectors as model input, so converting variable-length trajectories into fixed-length, low-dimensional embedding vectors is essential. Trajectory representation learning aims to transform trajectory data into more expressive and interpretable representations …
247.
  
To address object overlap and occlusion, difficulty in extracting key features, and complex background interference in X-ray contraband images, an X-ray contraband detection model improved with multi-branch lightweight convolution and attention mechanisms is proposed. The model introduces a spatial and channel reconstruction attention mechanism (SCAM) in the backbone network, which reorganizes feature maps along the channel and spatial dimensions to separate redundant from non-redundant information, strengthening key feature extraction and suppressing …
248.
  
In recent years, lossless networks have been widely used in high-performance computing, data centers, and other fields. Lossless networks use link-layer flow control to ensure that in-network switches never drop packets due to buffer overflow, avoiding data loss and retransmission and greatly improving application latency and throughput. However, the side effects of link-layer flow control (congestion spreading, deadlock, etc.) mean that large-scale deployment of lossless networks still faces many …
249.
Stochastic block models can fit the generative process of various networks and uncover their latent structure and hidden relationships, giving them a clear advantage in community detection. The generalized stochastic block model (GSB) discovers generalized communities based on the idea of link communities, but it applies only to directed, attribute-free networks. For undirected attributed networks, this work models node attributes alongside the network topology and proposes a degree-corrected generalized stochastic block model for attributed networks, DC…
王笑  戴芳  郭文艳  王军锋 《软件学报》2025,36(5):2308-2320
250.
  
Supercomputers are strategic national assets. During the 14th Five-Year Plan period, China is building post-exascale domestic supercomputers to support major computing applications vital to the national economy and people's livelihood. As one of the core system software components of a supercomputer, the operating system introduces overhead that affects whole-machine performance, so operating system evaluation has become an important research topic for the technical roadmap of the new generation of domestic supercomputers. openEuler has good performance on systems equipped with Kunpeng processors …
251.
  
With the rapid development of cloud storage services, more and more data owners are willing to store their data on cloud servers to reduce their local storage burden. However, once a data owner uploads data to a cloud server, the data is no longer kept locally and the owner loses direct control over it. To guarantee the integrity of remote data stored on cloud servers, data integrity verification is indispensable; it enables the data …
252.
  
The high spatial resolution of remote sensing images, the large scale differences between different types of objects, and class imbalance are the main challenges facing accurate semantic segmentation. To improve the accuracy of remote sensing image semantic segmentation, an improved U-Net-based multi-scale feature fusion network (MFFNet) for remote sensing image semantic segmentation is proposed. …
253.
  
Knowledge graph completion aims to predict the missing entities and relations in a given triple to improve the completeness and quality of the knowledge graph. Existing completion methods usually consider only the structural information of the triple itself or a single type of additional entity information (such as textual descriptions or topological structure), and ignore fusing multiple kinds of additional information to enrich entity features, leading to poor performance when completing missing entities. To address …
254.
  
Given the high-performance and low-power requirements of edge AI, and targeting practical digital signal processing problems on edge devices, an application-specific instruction-set processor for edge AI is designed based on the RISC-V instruction set architecture. With limited hardware overhead, it improves the execution efficiency and reduces the energy consumption of edge AI, and can meet the needs of efficient large language model (LLM) inference in edge AI applications. …
255.
Objective: Ultrasound imaging plays a crucial role in medical diagnosis due to its convenience, non-invasive nature, and cost-effectiveness, m…
《中国图象图形学报》2025,30(5):1303-1317
256.
Objective: Regong art, originating from the Longwu River valley in the Tibetan region of Huangnan, Qinghai Province, has flourished in this ar…
《中国图象图形学报》2025,30(5):1377-1388
257.
Objective: Single-modality medical imaging is often insufficient for providing a comprehensive review of lesion characteristics, including structure, metabolism, and other critical details. Medical images can generally be categorized into anatomical medical imaging and functional medical imaging. Anatomical medical imaging offers rich information on the structure of the body but lacks insight into metabolic processes; functional medical imaging is the opposite. In clinical applications, doctors use medical imaging from multiple modalities to diagnose diseases, localize lesions, and plan surgeries. However, simultaneously observing multimodal medical images is not intuitive and may not fully capture all the relevant features of the lesion. Therefore, multimodal medical image fusion is commonly employed in practice to integrate and enhance the information from different imaging techniques. How to fully retain the unique features of each modality while effectively integrating the shared features between modalities is a common challenge in medical image fusion. In currently used two-branch image coding methods, the information interaction of shared modal features is often underdeveloped, which limits the establishment of feature correlations between multimodal images. A multiscale medical image fusion network is designed to address these issues. This network is based on progressive feature extraction, frequency-domain information supplementation, and image reconstruction by Swin Transformer and convolutional neural network (CNN).
Method: First, a multiscale feature extraction module guided by gradient information was designed, which can be integrated into a three-branch feature extraction architecture. The left and right branches are responsible for extracting the unique features from each modality of the medical images, while the middle branch extracts the shared features between modalities. The extraction architecture comprises several multiscale feature extraction modules, each based on gradient information guidance. These submodules can simultaneously integrate features from all scale levels. The extraction architecture fully considers the information interaction between modalities and can progressively extract the common and unique features across different modalities. In addition, this extraction architecture effectively integrates multiscale features from multimodal medical images. A progressive fusion module that integrates cross-attention mechanisms was designed to fully utilize the frequency-domain information and guide the fusion process at the modal level. This fusion module enhances the interaction of spatial-domain information between different modalities and leverages high- and low-frequency positional information from the frequency domain, guiding the model toward more targeted multimodal fusion. Finally, a Swin-CNN reconstruction module was designed to determine the relationship between global and local area features of medical images. The reconstruction module uses Swin Transformer to capture global information, such as the overall structure and shape of the image, while simultaneously employing CNN to extract regional features, such as local texture details. By integrating the global and local feature information of medical images simultaneously, the reconstruction module effectively improves the quality of the fused images.
Result: The datasets used for the experiments include the MRI-SPECT and MRI-PET fusion datasets from the whole-brain database at Harvard Medical School and the GFP-PC fusion dataset from the John Innes Center. In terms of visual effect, the proposed fusion model effectively preserves the structural and functional features of different medical image modalities and improves the quality of the fused images. The advantages of the fused images generated by this model are as follows: 1) The fused image has richer texture details and sharper features such as edges and contours, effectively preserving the information-rich regions of each modal image. 2) The fused image also preserves the visual features of all original medical images, ensuring no bias toward information from only one modality. 3) The fused image is rendered effectively, with no artifacts affecting the visual effect. In terms of quantitative indicators, the model achieves the best results on all eight image fusion evaluation metrics in the MRI-SPECT and MRI-PET fusion tasks. Compared with the second-best model, mutual information (MI) and discrete cosine transform feature mutual information (FMIdct) are drastically improved: MI improves by 4.42% and 17.30%, respectively, and FMIdct by 5.17% and 11%, respectively. In the GFP-PC fusion task, six optimal and two sub-optimal results are achieved; compared with the second-best model, MI and visual information fidelity (VIF) improve substantially, by 16.43% and 16.87%, respectively. Ablation experiments on the network structure and loss function were also conducted to analyze the experimental results and evaluate the effectiveness of each part of the model. The results show that all model components and the loss function enhance the image fusion effect.
Conclusion: The proposed fusion model leverages the common and unique features of different medical image modalities and progressively integrates multiscale information using a three-branch architecture. The model also utilizes a progressive fusion module that incorporates cross-attention to fuse high- and low-frequency features in a highly targeted manner. Furthermore, the model focuses on the global and local attribute information of medical images in the reconstruction process, effectively enhancing the quality of multimodal medical image fusion. The proposed model performs well in three medical image fusion tasks with good generalization capability. It can provide multimodal medical fusion images with clear contour structures and rich texture details, aiding doctors in clinical diagnosis and improving diagnostic efficiency and accuracy. Future studies will investigate the constraints or effects of downstream tasks, such as medical semantic segmentation, on image fusion, and the network architecture will be optimized for specific tasks to ensure a close integration between tasks such as semantic segmentation and image fusion. This research aims to improve the quality of fused images while enhancing the performance of downstream tasks, thereby expanding the application possibilities of multimodal medical image fusion.
《中国图象图形学报》2025,30(5):1510-1527
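As a companion to entry 257, the sketch below illustrates the kind of cross-attention step its progressive fusion module describes, with queries taken from one modality and keys/values from the other. This is an illustrative Python/PyTorch sketch only, not the authors' implementation; the module name CrossModalAttention, the tensor shapes, and the residual wiring are assumptions made for the example.

```python
# Illustrative sketch (not the paper's code): cross-modal attention where one
# modality supplies queries and the other supplies keys/values.
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):  # hypothetical module name
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x_a: torch.Tensor, x_b: torch.Tensor) -> torch.Tensor:
        # x_a, x_b: (batch, tokens, dim) flattened feature maps of two modalities
        fused, _ = self.attn(query=x_a, key=x_b, value=x_b)
        return self.norm(x_a + fused)  # residual keeps modality-A detail

# Toy usage: fuse 64x64 MRI feature tokens (queries) with SPECT tokens (keys/values).
mri = torch.randn(2, 64 * 64, 128)
spect = torch.randn(2, 64 * 64, 128)
print(CrossModalAttention(dim=128)(mri, spect).shape)  # torch.Size([2, 4096, 128])
```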
258.
Objective: Colorectal cancer, a high-incidence and extremely harmful disease, represents a serious threat to human health. Statistics show tha…
《中国图象图形学报》2025,30(5):1479-1496
259.
Objective: Celadon is not only a dazzling pearl among the cultural treasures of the Chinese nation but also a cultural messenger in cultural exchanges between China and other countries. It has rich historical and cultural connotations and demonstrates excellent artistic value. Its elegant shape and moist glaze make it an outstanding representative of traditional Chinese craft aesthetics. The production of celadon embodies the wisdom and creativity of ancient craftsmen and is an important carrier for the inheritance of excellent traditional Chinese culture. In the context of cultural digitization, constructing a cross-modal knowledge graph of celadon is one of the key technologies for promoting the protection and inheritance of celadon culture. In this process, matching the same entities across different modalities, which involves aligning the different modal features of equivalent entities, is crucial. However, the inherent structural differences between cross-modal data present challenges for alignment tasks. Traditional methods that rely on manually annotated data can ensure the accuracy of alignment to some extent, but they suffer from low efficiency and high cost. In addition, coarse-grained annotated data can hardly meet the requirements for fine-grained concepts and entity recognition when constructing a cross-modal knowledge graph. At present, vision-language pretraining (VLP) models can effectively capture cross-modal semantic associations by learning rich cross-modal representations from large-scale unlabeled image-text pair data. The strong cross-modal understanding ability of the VLP model can provide precise semantic associations and fine-grained entity recognition for aligning entities of different modalities during graph construction. Here, a cross-modal entity alignment method based on the VLP model, which maps multiple features of images, is proposed to maximize the degree of matching between celadon images and text.
Method: The proposed cross-modal entity alignment method, which maps multiple features of images, initializes both the image and text encoders with the publicly available VLP model, and the parameters of the encoders remain unchanged during training. The method consists of four parts. First, based on the visual characteristics of celadon images, local features describing contour, texture, and color are extracted. Then, a gated multi-fusion unit is introduced to adaptively assign weights to the image features, and the extracted local image features are used to generate reliable fused features. Furthermore, a multilayer fully connected mapper is designed to learn the mapping of the fused features to an appropriate intermediate representation space through multiple layers of nonlinear transformations, guiding the text encoder to generate text features that match the image features more closely. Finally, the model is trained and optimized with the information noise contrastive estimation (InfoNCE) loss, that is, by computing the cosine similarity between cross-modal features to maximize the similarity of positive sample pairs and the difference of negative sample pairs, thereby establishing the connection between image features and text features.
Result: The proposed method was compared experimentally with four recent benchmark methods: contrastive VLP in Chinese (CN-CLIP), context optimization (CoOp), conditional context optimization (CoCoOp), and mapping pictures to words (Pic2Word). The quantitative evaluation metrics are the recall rates R@1, R@5, and R@10 and the mean recall (MR). The experiments were conducted on the ChinaWare dataset, so all methods were trained on this dataset, and a table comparing each method's recall-rate performance was provided. In terms of MR, the proposed method outperformed zero-shot CN-CLIPViT-B/16 by 3.2% in the text-to-image alignment task and by 7.5% in the image-to-text task. CoOp focuses on text features; the proposed method outperforms it by 11.4% and 12.1%, respectively. CoCoOp considers image features on the basis of CoOp, and the proposed method outperforms it by 8.4% and 9.5%, respectively. Pic2Word focuses on original image features and does not fully utilize other local image features to improve model performance; the proposed method outperforms it by 5.8% and 5.6%, respectively.
Conclusion: The proposed cross-modal entity alignment method can fully exploit an effective intermediate representation of image features to reconstruct text features without changing the parameters of the VLP model, thereby improving the cross-modal recognition accuracy for the details of celadon. The experimental results show that the method is superior to several state-of-the-art methods and improves alignment performance. Ultimately, a celadon cross-modal knowledge graph with 8,949 nodes and 18,211 relationships was successfully constructed by applying technologies such as ontology modeling, data mining, and the proposed cross-modal entity alignment method.
《中国图象图形学报》2025,30(5):1318-1333
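As a companion to entry 259, the following minimal Python sketch shows a symmetric InfoNCE-style objective of the kind the abstract describes: cosine similarity between paired image and text embeddings with in-batch negatives. It is not the authors' code; the function name, the temperature value, and the embedding dimensions are assumptions.

```python
# Minimal sketch (assumptions noted above) of a symmetric InfoNCE contrastive loss.
import torch
import torch.nn.functional as F

def info_nce_loss(img_emb: torch.Tensor, txt_emb: torch.Tensor,
                  temperature: float = 0.07) -> torch.Tensor:
    # L2-normalise so the dot product equals cosine similarity
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature          # (batch, batch)
    targets = torch.arange(img_emb.size(0), device=img_emb.device)
    # Matched pairs lie on the diagonal; average image-to-text and text-to-image terms.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

# Toy usage: a batch of 8 image-text embedding pairs, 512-d each.
loss = info_nce_loss(torch.randn(8, 512), torch.randn(8, 512))
print(loss.item())
```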
260.
  
Speech-to-speech translation (S2ST) is an emerging research direction in intelligent speech, aiming to translate speech in one language accurately into speech in another language. As the demand for cross-lingual communication grows, S2ST has attracted wide attention and related research keeps emerging. Traditional cascaded models suffer from many problems in the S2ST pipeline, such as error propagation, inference latency, and the inability to translate languages without a writing system; therefore …