Estimated reading time: 6 minutes
Checking system version information on CentOS
cat /etc/redhat-release # CentOS Stream release 8
cat /proc/version # Linux version 4.18.0-489.el8.x86_64 (mockbuild@x86-05.stream.rdu2.redhat.com) (gcc […]

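[…] From the earlier post "TF-IDF与N-gram One-hot文档嵌入算法原理" (on the principles of TF-IDF and N-gram one-hot document embedding), we learned that a biological sequence can be decomposed into k-mers, and that the resulting set of "words" can represent the sequence as a document. In this way, biological sequences of different lengths are embedded as numeric vectors of the same length, which can then be used in downstream data processing. Here, suppose we extract all the genes from a genome and annotate each gene with its EC number by BLAST alignment; the genome can then be represented as a set of EC numbers. With this representation, any genome, whatever its size or gene composition, can be embedded as a numeric vector with the same feature dimensions, ready for tasks such as machine learning modeling. […]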
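The idea of representing each genome as a set of EC numbers and embedding it as a fixed-length vector can be sketched as below. This is only a minimal illustration, not the post's own implementation: the function names `build_vocabulary` and `embed`, the genome names, and the EC number sets are all hypothetical, and a simple presence/absence (one-hot style) encoding is assumed rather than a TF-IDF weighting.

```python
from typing import Dict, Iterable, List, Set


def build_vocabulary(genomes: Dict[str, Set[str]]) -> List[str]:
    """Collect every EC number seen in any genome, in a fixed sorted order."""
    return sorted(set().union(*genomes.values()))


def embed(ec_numbers: Iterable[str], vocabulary: List[str]) -> List[int]:
    """Encode one genome as a presence/absence vector over the vocabulary."""
    present = set(ec_numbers)
    return [1 if ec in present else 0 for ec in vocabulary]


# Hypothetical annotation results: genome name -> EC numbers assigned to its genes.
genomes = {
    "genome_A": {"1.1.1.1", "2.7.1.1", "3.1.3.11"},
    "genome_B": {"2.7.1.1", "4.1.2.13"},
}

vocabulary = build_vocabulary(genomes)
vectors = {name: embed(ecs, vocabulary) for name, ecs in genomes.items()}

for name, vector in vectors.items():
    print(name, vector)
```

Because every vector is indexed by the same shared vocabulary, genomes with different numbers of genes end up with the same dimension, which is what makes them usable as feature rows for a model.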