Application Information
Basic Info
Runs the Prokka annotation pipeline on a genome and converts the input FASTA (.fa) file into annotated output files in .faa, .gff, and .gbk formats.
More Details

Whole genome annotation is the process of identifying features of interest in a set of genomic DNA sequences, and labelling them with useful information.  Prokka is a software tool to annotate bacterial, archaeal and viral genomes quickly and produce standards-compliant output files.
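
As a rough illustration of the workflow above, the following sketch assembles a Prokka command line and lists the output files this application reports (.faa, .gff, .gbk). The file name genome.fa, the output directory annotation, and the prefix sample1 are hypothetical placeholders, and the helper functions are illustrative only; the flags shown (--outdir, --prefix, --kingdom, --cpus) are standard Prokka options. The sketch only builds the command and does not run it.

```python
import shlex
from pathlib import Path


def build_prokka_command(fasta, outdir, prefix, kingdom="Bacteria", cpus=4):
    """Return the argument list for a Prokka run (nothing is executed here)."""
    return [
        "prokka",
        "--outdir", str(outdir),   # directory that will hold all result files
        "--prefix", prefix,        # base name shared by every output file
        "--kingdom", kingdom,      # Bacteria, Archaea or Viruses
        "--cpus", str(cpus),
        str(fasta),                # input contigs in FASTA format
    ]


def expected_outputs(outdir, prefix, extensions=("faa", "gff", "gbk")):
    """List the converted files named in this application's description."""
    return [Path(outdir) / f"{prefix}.{ext}" for ext in extensions]


# Hypothetical input and output names, for illustration only.
cmd = build_prokka_command("genome.fa", "annotation", "sample1")
print(shlex.join(cmd))
for path in expected_outputs("annotation", "sample1"):
    print(path)
```

To actually run the annotation, the list could be passed to subprocess.run(cmd, check=True) on a machine where Prokka is installed.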

What's New

Prokka uses a variety of databases when trying to assign function to the predicted CDS features. It takes a hierarchical approach to keep annotation fast.
A small, core set of well-characterized proteins is searched first using BLAST+. This combination of a small database and a fast search typically completes about 70% of the workload. A series of slower but more sensitive HMM databases is then searched using HMMER3.
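
The hierarchical idea can be sketched as a first-match-wins lookup over an ordered list of search tiers. This is a toy model under stated assumptions, not Prokka's implementation: the dictionaries stand in for real BLAST+ and HMMER3 searches, and the CDS identifiers, product names, and the "unassigned" label are hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

# A searcher takes a CDS identifier and returns a product name, or None if no hit.
Searcher = Callable[[str], Optional[str]]


def annotate(cds_id: str, tiers: Sequence[Tuple[str, Searcher]]) -> Tuple[str, str]:
    """Try each tier in order and stop at the first one that returns a hit."""
    for tier_name, search in tiers:
        product = search(cds_id)
        if product is not None:
            return tier_name, product
    # No tier produced a hit; "unassigned" is a placeholder label for this sketch.
    return "none", "unassigned"


# Toy stand-ins for a small, fast database and a slower, more sensitive one.
fast_db = {"cds_1": "transposase", "cds_2": "beta-lactamase"}
slow_db = {"cds_3": "ABC transporter permease"}

tiers = [("fast", fast_db.get), ("slow", slow_db.get)]

for cds in ("cds_1", "cds_3", "cds_4"):
    print(cds, annotate(cds, tiers))
```

Because most features are resolved in the first tier, the slower tiers only see the remainder, which is where the speed-up comes from.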

The three core databases, applied in order, are:

  1. ISfinder: Only the transposase (protein) sequences; the whole transposon is not annotated.

  2. NCBI Bacterial Antimicrobial Resistance Reference Gene Database: Antimicrobial resistance genes curated by NCBI.

  3. UniProtKB (SwissProt): For each --kingdom we include curated proteins that (i) are from Bacteria (or Archaea or Viruses); (ii) are not "Fragment" entries; and (iii) have an evidence level ("PE") of 2 or lower, which corresponds to experimental mRNA or proteomics evidence.
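
The three UniProtKB inclusion criteria above amount to a simple filter. The sketch below is only an illustration of that logic: the Entry record, its field names, and the accessions are hypothetical and do not reflect how Prokka or UniProt actually store this information.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    accession: str     # hypothetical identifier
    kingdom: str       # "Bacteria", "Archaea" or "Viruses"
    is_fragment: bool  # True for "Fragment" entries
    pe_level: int      # protein existence ("PE") level; lower means stronger evidence


def keep_entry(entry: Entry, kingdom: str = "Bacteria", max_pe: int = 2) -> bool:
    """Apply the three criteria: matching kingdom, not a fragment, PE <= max_pe."""
    return (
        entry.kingdom == kingdom
        and not entry.is_fragment
        and entry.pe_level <= max_pe
    )


# Made-up records covering each way an entry can pass or fail.
entries = [
    Entry("X0001", "Bacteria", False, 1),  # passes all three criteria
    Entry("X0002", "Bacteria", True, 1),   # rejected: fragment
    Entry("X0003", "Bacteria", False, 3),  # rejected: PE level above 2
    Entry("X0004", "Archaea", False, 2),   # rejected for Bacteria, kept for Archaea
]

kept = [e.accession for e in entries if keep_entry(e)]
print(kept)
```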

Additional Information

Related links:

https://github.com/tseemann/prokka

Literature

Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 2014;30(14):2068-2069. doi:10.1093/bioinformatics/btu153

About
  • APPID: 01733
  • Compute cost: Free
  • Running time: < 5 min
  • Current version: 1.0.0
  • Last update: 2025-07-10