
Analysis steps for bacterial isolates

Results are stored in /data/Food/analysis/R0987_nextgen/Erkang.Zhang/combined_analysis/

anvi'o pangenome analysis

METABOLIC analysis results (pathway heatmap)

#!/bin/sh
#SBATCH --error /data/Food/analysis/R0987_nextgen/Erkang.Zhang/AAB/logfile/err_metabolictest_20260325
#SBATCH --output /data/Food/analysis/R0987_nextgen/Erkang.Zhang/AAB/logfile/out_metabolictest_20260325
#SBATCH --job-name metabolictest_20260325
#SBATCH --mail-user Erkang.Zhang@teagasc.ie
#SBATCH --mail-type END,FAIL
#SBATCH --cpus-per-task=20
#SBATCH -p Priority
#SBATCH -N 1


perl /data/Food/analysis/R0987_nextgen/Erkang.Zhang/AAB/food_bins_overall_METABOLIC/METABOLIC/METABOLIC/METABOLIC-G.pl \
-in-gn /data/Food/analysis/R0987_nextgen/Erkang.Zhang/test/KLI1101_hybrid_test_2 \
-o /data/Food/analysis/R0987_nextgen/Erkang.Zhang/test/metabolic_test_KLI1101_2 -p single

Secondary metabolite prediction with antiSMASH

#!/bin/sh
#SBATCH --error /data/Food/analysis/R0987_nextgen/Erkang.Zhang/AAB/logfile/err_antismashtest_20260326
#SBATCH --output /data/Food/analysis/R0987_nextgen/Erkang.Zhang/AAB/logfile/out_antismashtest_20260326
#SBATCH --job-name antismashtest_20260326
#SBATCH --mail-user Erkang.Zhang@teagasc.ie
#SBATCH --mail-type END,FAIL
#SBATCH --cpus-per-task=20
#SBATCH -p Priority
#SBATCH -N 1


antismash KLA1304.fasta --genefinding-tool prodigal --output-dir KLA1304 --output-basename KLA1304 --cc-mibig --cb-knownclusters --cb-general --cb-subclusters -v --logfile KLA1304/KLA1304_log.txt

antismash KLS1202.fasta --genefinding-tool prodigal --output-dir KLS1202 --output-basename KLS1202 --cc-mibig --cb-knownclusters --cb-general --cb-subclusters -v --logfile KLS1202/KLS1202_log.txt
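
With more than a couple of isolates, the two near-identical commands above can be generated from the FASTA file names instead of being typed by hand. The following is only a sketch: `antismash_cmd` is a hypothetical helper, the flags are copied unchanged from the commands above, and it assumes every genome is named `<sample>.fasta`. It prints the commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print the antiSMASH command for one genome.
# The flags are the same ones used in the hand-written commands above.
antismash_cmd() {
    fasta=$1
    sample=$(basename "$fasta" .fasta)   # KLA1304.fasta -> KLA1304
    printf 'antismash %s --genefinding-tool prodigal --output-dir %s --output-basename %s --cc-mibig --cb-knownclusters --cb-general --cb-subclusters -v --logfile %s/%s_log.txt\n' \
        "$fasta" "$sample" "$sample" "$sample" "$sample"
}

# Print one command per genome; pipe the output into sh to run them.
for fasta in KLA1304.fasta KLS1202.fasta; do
    antismash_cmd "$fasta"
done
```

Replacing the fixed list with `*.fasta` would cover every genome in the working directory.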

Annotation of antibiotic resistance genes

#!/bin/sh
#SBATCH --error /data/Food/analysis/R0987_nextgen/Erkang.Zhang/AAB/logfile/err_rgi_20260407
#SBATCH --output /data/Food/analysis/R0987_nextgen/Erkang.Zhang/AAB/logfile/out_rgi_20260407
#SBATCH --job-name rgi_20260407
#SBATCH --mail-user Erkang.Zhang@teagasc.ie
#SBATCH --mail-type END,FAIL
#SBATCH --cpus-per-task=10
#SBATCH -p Priority
#SBATCH -N 1


rgi load --card_json /data/Food/analysis/R0987_nextgen/Erkang.Zhang/AAB/CARD_database/card.json --local

rgi main --input_sequence /data/Food/analysis/R0987_nextgen/Erkang.Zhang/combined_analysis/gluconobacter_potus/KLA1304.fasta \
--output_file /data/Food/analysis/R0987_nextgen/Erkang.Zhang/combined_analysis/gluconobacter_potus/rgi_result/KLA1304_rgi --local --clean

rgi main --input_sequence /data/Food/analysis/R0987_nextgen/Erkang.Zhang/combined_analysis/gluconobacter_potus/KLS1202.fasta \
--output_file /data/Food/analysis/R0987_nextgen/Erkang.Zhang/combined_analysis/gluconobacter_potus/rgi_result/KLS1202_rgi --local --clean
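
Once the RGI jobs finish, a quick tally of hits per cut-off category helps with comparing isolates. The sketch below assumes that `rgi main` writes a tab-delimited report named `<output_file>.txt` whose header contains a `Cut_Off` column (true of the RGI versions I am aware of, but worth checking against your own output). `count_cutoff` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: count rows per Cut_Off value in an RGI tab-delimited report.
# The column is located by its header name, so its position does not matter.
count_cutoff() {
    awk -F '\t' '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "Cut_Off") col = i; next }
        col     { n[$col]++ }
        END     { for (k in n) print k "\t" n[k] }
    ' "$1" | sort
}

# Assumed report path, derived from the --output_file prefix used above.
report=rgi_result/KLA1304_rgi.txt
if [ -f "$report" ]; then
    count_cutoff "$report"
fi
```

If the header has no `Cut_Off` column, the function prints nothing instead of guessing a column.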

Alternatively, use the annotation service on the RGI website, or use abricate.

# Alternatively, annotate with abricate; a different database can be selected with --db
abricate KLS1202.fasta >abricate_result/KLS1202.txt
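
To compare databases, the same genome can be screened once per database. This is a rough sketch: `abricate_cmds` is a hypothetical helper, the output file naming is my own choice, and the database names `ncbi`, `card` and `resfinder` are only examples; `abricate --list` shows which databases are actually installed. The commands are printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper: print one abricate command per database for a genome.
# Usage: abricate_cmds GENOME.fasta DB [DB ...]
abricate_cmds() {
    fasta=$1
    shift
    sample=$(basename "$fasta" .fasta)
    for db in "$@"; do
        printf 'abricate --db %s %s > abricate_result/%s_%s.txt\n' \
            "$db" "$fasta" "$sample" "$db"
    done
}

# Example database names; confirm them with `abricate --list` first.
abricate_cmds KLS1202.fasta ncbi card resfinder
```

The `abricate_result` directory has to exist before the printed commands are run.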

Horizontal gene transfer analysis

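# Building the database is a bit of a hassle: the script bundled with the program is broken, so the files have to be downloaded manually from the network drive the authors provide (linked on the GitHub page) and the database built by hand with diamond
# The database is in /data/Food/primary/R0987_nextgen/Erkang.Zhang/hgtdb_20230102
# The input is a .faa file, so proteins must be predicted first; running METABOLIC first will generate the .faa files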

hgtector search -i /data/Food/primary/R0987_nextgen/Erkang.Zhang/o55h7.faa.gz -o . -m diamond -p 16 -d /data/Food/primary/R0987_nextgen/Erkang.Zhang/ref107/diamond/db -t /data/Food/primary/R0987_nextgen/Erkang.Zhang/ref107/taxdump

hgtector analyze -i o55h7.tsv -o . -t /data/Food/primary/R0987_nextgen/Erkang.Zhang/ref107/taxdump --donor-name
# Results seem to be somewhat better without grid
  • One can force the potential donors to be reported at a certain rank using the --donor-rank parameter (e.g., “genus”). Donors below this rank will be raised to this rank (e.g., “E. coli” becomes “Escherichia”); however, donors above this rank will be discarded. Since it is not uncommon that the true donor cannot be accurately determined using the taxonomy of extant organisms, we recommend not using this parameter, or setting it to a high rank (e.g., “phylum”).

  • The finished TSV file can be run through taxonkit to look up the taxid and taxonomic classification again

    cat KLA1304/hgts/assembly.txt | taxonkit name2taxid --data-dir /data/Food/primary/R0987_nextgen/Erkang.Zhang/taxonkit_database --name-field 3 --show-rank -o test.tsv
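
For the manual database build mentioned in the HGTector notes above, the steps look roughly like the sketch below. Treat it as an outline under assumptions, not the exact procedure used here: the file names `taxon.map.gz`, `db.faa` and `taxdump/` are assumed names for the manually downloaded files, `make_acc2taxid` is a hypothetical helper, and the three-column map layout follows my reading of the HGTector documentation, so check it against the current docs before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: read "accession.version<TAB>taxid" lines on stdin and
# write a three-column accession-to-taxid table with a header line.
make_acc2taxid() {
    printf 'accession\taccession.version\ttaxid\n'
    awk -F '\t' -v OFS='\t' '{ split($1, a, "."); print a[1], $1, $2 }'
}

# Assumed inputs: taxon.map.gz, db.faa and taxdump/ from the manual download.
# The build only runs when diamond and the input files are present.
if command -v diamond >/dev/null 2>&1 && [ -f taxon.map.gz ] && [ -f db.faa ]; then
    gzip -dc taxon.map.gz | make_acc2taxid > prot.accession2taxid
    mkdir -p diamond
    diamond makedb --threads 16 --in db.faa \
        --taxonmap prot.accession2taxid \
        --taxonnodes taxdump/nodes.dmp \
        --taxonnames taxdump/names.dmp \
        --db diamond/db
fi
```

The resulting `diamond/db` prefix and the `taxdump` directory are what `hgtector search` receives through `-d` and `-t`.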