VenomsBase / VenomZone

Protein-centric data architecture and web platform for venom research.

Data Model

Three object types:

Protein Record: The primary data container. Each protein carries its ProtT5 embedding, sequences, structure pointers, taxonomy, classification, expression, proteomics, activity, and literature as typed data slots.
Genome Record: Standalone. Holds gene annotations and scaffolds parsed from GFF3. Linked to proteins via gene IDs, but not hierarchically above them.
Raw Data: Stored as files (reads, spectra, matrices). Features extracted from these files are written into protein and genome records.

See docs/architecture_proposal.md for the full data model.
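
The relationship between the three object types can be sketched in a few dataclasses. This is only an illustration of the description above, not the schema from docs/architecture_proposal.md: every class, field, and function name here (ProteinRecord, GeneAnnotation, genomic_context, and so on) is a hypothetical stand-in, and the toy values are made up.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProteinRecord:
    """Primary container: one typed slot per kind of data (names are hypothetical)."""
    protein_id: str
    sequence: str
    embedding: list[float] = field(default_factory=list)  # ProtT5 vector
    gene_id: Optional[str] = None  # link into a GenomeRecord, if known
    taxonomy: Optional[str] = None
    classification: Optional[str] = None
    structure_paths: list[str] = field(default_factory=list)
    literature: list[str] = field(default_factory=list)


@dataclass
class GeneAnnotation:
    """One gene feature, as it might be read from a GFF3 line."""
    gene_id: str
    scaffold: str
    start: int
    end: int
    strand: str


@dataclass
class GenomeRecord:
    """Standalone record: proteins point at it, it does not own them."""
    genome_id: str
    species: str
    genes: dict[str, GeneAnnotation] = field(default_factory=dict)


@dataclass
class RawDataFile:
    """Raw data stays on disk; only a pointer and a type tag are kept."""
    path: str
    kind: str  # e.g. "reads", "spectra", "matrix"


def genomic_context(
    protein: ProteinRecord, genomes: list[GenomeRecord]
) -> Optional[tuple[GenomeRecord, GeneAnnotation]]:
    """Resolve a protein's gene ID to (genome, annotation), or None if unlinked."""
    if protein.gene_id is None:
        return None
    for genome in genomes:
        annotation = genome.genes.get(protein.gene_id)
        if annotation is not None:
            return genome, annotation
    return None


# Toy values for illustration only; not taken from the real datasets.
genome = GenomeRecord("toy_genome", "Apis mellifera")
genome.genes["gene_0001"] = GeneAnnotation("gene_0001", "scaffold_1", 100, 400, "+")
protein = ProteinRecord("toy_protein", "MKTLLV", gene_id="gene_0001")
hit = genomic_context(protein, [genome])
```

The lookup runs from protein to genome through the gene ID, which mirrors the protein-centric design: a genome record can exist without any proteins, and a protein without a gene ID is still a complete record.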

Prototypes

Interactive HTML viewers demonstrating protein-centric exploration:

VenomsBase_Snakes.html: 2,235 snake venom proteins (3FTx, KLK/SP, and PLA2) with real ProtT5 embeddings, publication colors, and genomic context
VenomsBase_Bees.html: 4,606 bee venom proteins (33 families) with real ProtT5 embeddings and 739 GFF3 genomic regions across 23 species

Open either file in Chrome. Click a protein to inspect it, and color the view by any annotation dimension.

Data

Data files are not tracked in this repository. Use the setup scripts:

python scripts/download_data.py          # Download reference datasets
python scripts/setup_local_symlinks.py   # Symlink to parent repository data (dev only)

Pipeline

python 04_build_bee_viewer.py     # Bee venom viewer
python 05_build_snake_viewer.py   # Snake venom viewer

Authors

Ivan Koludarov (TUM)
Todd A. Castoe (UT Arlington)

References

Castoe TA, Daly M, Jungo F, Kirchhoff KN, Koludarov I, et al. (2025). A Vision for VenomsBase: An Integrated Knowledgebase for the Study of Venoms and Their Applications. Integrative Organismal Biology. doi:10.1093/iob/obaf026

