OMP ID format
Copied over from the Google Doc.
Prefixes
<protect>
Pan-genome | OMP_PG: vs OMP_PGM vs ? | Jim and Michelle suggested OMP_TX (for taxon) or OMP_SP (for species) |
---|---|---|
Pan-gene | OMP_GN: vs OMP_PGN vs ? | |
Strain/Substrain | OMP_ST: vs OMP_STR vs ? | |
Allele | OMP_AL: vs OMP_ALL vs ? | |
Phenotype Annotation | OMP_AN: vs OMP_ANN vs ? | |
</protect>
Jon:
Michelle/JH: for Pan-genome, use TX (for taxon) or SP (for species).
Number
With or without leading zeros? If with, how many?
Example: OMP_GN:000785 vs. OMP_GN:785
No leading zeros
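To make the two numbering options concrete, here is a minimal Python sketch of parsing and rendering IDs in the prefix-plus-number form discussed above. The function names `parse_omp_id` and `format_omp_id`, the `width` parameter, and the regular expression are illustrative assumptions, not an agreed OMP specification; the prefix in the example is just one of the candidates from the table.

```python
import re

# Assumed shape: "OMP_" + 2-3 uppercase letters, a colon, then digits.
# This pattern is a guess based on the candidate prefixes, not a fixed rule.
ID_PATTERN = re.compile(r"^(OMP_[A-Z]{2,3}):(\d+)$")


def parse_omp_id(text):
    """Split an ID such as 'OMP_GN:000785' into (prefix, number)."""
    match = ID_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not an OMP-style ID: {text!r}")
    prefix, digits = match.groups()
    # int() drops any leading zeros, so both spellings parse to the same number.
    return prefix, int(digits)


def format_omp_id(prefix, number, width=0):
    """Render an ID; width=0 means no leading zeros, width=6 pads to six digits."""
    return f"{prefix}:{str(number).zfill(width)}"


print(format_omp_id("OMP_GN", 785))           # unpadded form
print(format_omp_id("OMP_GN", 785, width=6))  # zero-padded form
```

Parsing to an integer means that both spellings refer to the same record, so the choice of padding only affects how IDs are displayed and sorted as text.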
Pangenome:
- Includes only the genomes of strains.
- Includes 3 different categories:
1. Core genome: genes present in all strains
2. Dispensable genome: genes present in two or more strains, but not all
3. Unique genes: genes specific to a single strain
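The three categories can be sketched as a small Python function over a gene-presence table. The function name `classify_pangenome`, the input shape (gene name mapped to the set of strains carrying it), and the gene and strain names are all made-up placeholders for illustration. The sketch assumes the full strain set is the union of all strains that appear in the input.

```python
def classify_pangenome(presence):
    """Group genes into core / dispensable / unique.

    presence: dict mapping gene name -> set of strain names carrying that gene.
    """
    # Assumption: every strain of interest carries at least one listed gene.
    all_strains = set().union(*presence.values()) if presence else set()
    categories = {"core": [], "dispensable": [], "unique": []}
    for gene, strains in sorted(presence.items()):
        if not strains:
            continue  # gene found in no strain: not part of the pan-genome
        if strains == all_strains:
            categories["core"].append(gene)
        elif len(strains) == 1:
            categories["unique"].append(gene)
        else:
            categories["dispensable"].append(gene)
    return categories


# Toy data with placeholder names, not real annotations.
toy = {
    "geneA": {"S1", "S2", "S3"},
    "geneB": {"S1", "S2"},
    "geneC": {"S3"},
}
print(classify_pangenome(toy))
```

The core check comes first, so a pan-genome built from a single strain reports all of its genes as core rather than unique.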
Strains and Sub-strains:
In EcoliWiki, the prefix for all strains and their derivatives is “Strain:”. Should we just omit this prefix in OMP?
OMP_ST:00004 ! K-12 vs. OMP_ST:00004 ! Strain:K-12
OMP_ST:00062 ! MG1655 vs. OMP_ST:00062 ! Strain:MG1655
Should we distinguish between a strain and its derivatives?
OMP_ST:00004 ! K-12
OMP_ST:00062 ! K-12_MG1655 vs. OMP_ST:00062 ! MG1655
- E.coli_K-12_MG1655
- We will have our own unique IDs and cross-reference them with NCBI.