
Protein datasets
The full protein dataset is available in FASTA format (the link downloads a gzipped file from JaponicusDB).
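Because the download is gzip-compressed, it can be read directly without unpacking it first. The following is a minimal sketch using only the Python standard library. The function names `read_fasta` and `load_gzipped_fasta` are illustrative, not part of any JaponicusDB tooling, and it assumes that the first whitespace-delimited token of each header line is the protein identifier.

```python
import gzip
from typing import Dict, Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open FASTA text handle."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def load_gzipped_fasta(path: str) -> Dict[str, str]:
    """Map each record's identifier to its sequence.

    Assumption: the identifier is the first whitespace-delimited
    token of the header line.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return {
            (header.split() or [""])[0]: seq
            for header, seq in read_fasta(handle)
        }
```

For example, `proteins = load_gzipped_fasta("peptide.fa.gz")` would return a dictionary of sequences keyed by identifier, where `peptide.fa.gz` is a placeholder for whatever name the downloaded file has.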
The protein data directory contains assorted data (see the README for file formats):
- PeptideStats.tsv - Predicted molecular weight (kDa), predicted pI, charge, length (residues), codon adaptation index (CAI)
- protein_domains_and_features.tsv - Protein features such as domains and family assignments
- aa_composition.tsv - Amino acid composition
- transmembrane_domain_coords_and_seqs.tsv - Sequences and coordinates for transmembrane domains
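
The files listed above are tab-separated, so they can be loaded with a generic TSV reader. The sketch below uses the Python standard library; `read_tsv` is an illustrative helper, and it assumes by default that the first line of the file is a header row. The README remains the authority on the actual column layout; if a file has no header row, column names can be passed in explicitly.

```python
import csv
from typing import Dict, List, Optional, Sequence


def read_tsv(
    path: str, fieldnames: Optional[Sequence[str]] = None
) -> List[Dict[str, str]]:
    """Read a tab-separated file into a list of row dictionaries.

    Assumption: if `fieldnames` is None, the first line is treated as
    the header row. All values are returned as strings.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(
            handle,
            fieldnames=fieldnames,
            delimiter="\t",
            quoting=csv.QUOTE_NONE,  # treat quote characters as plain text
        )
        return list(reader)
```

For example, `rows = read_tsv("PeptideStats.tsv")` would give one dictionary per protein. Numeric columns such as molecular weight or length arrive as strings and need converting (for instance with `float()` or `int()`) before use.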
See also the JaponicusDB protein modification annotations.