
CPP Knowledge Graph

A structured RDF representation of endocytic and non-endocytic pathways through which Cell-Penetrating Peptides (CPPs) enter cells. The graph encodes relationships between mechanisms, genes, inhibitors, cell lines, and cargo types — enabling semantic queries and knowledge discovery across the CPP literature. Source on GitHub ↗


Global Metrics
Total Triples
Unique Nodes
Unique Edge Types

Instances per Class

Click a row to browse that class.

Class | Description | # Instances
Association Types

Usage counts for the seven key relationship types.

Association | Description | # Uses
Data Sources & Ontologies
Ensembl
Gene identifiers (ENSG…). ensembl.org ↗
ChEBI
Inhibitor compound IDs (CHEBI_…). ebi.ac.uk/chebi ↗
Gene Ontology
Biological process & cellular component terms. geneontology.org ↗
CLO — Cell Line Ontology
Standardised cell line identifiers. obofoundry.org ↗
SIO — Semanticscience Integrated Ontology
Core entity types and semantic relations. semanticscience.org ↗
Cell-Penetrating Peptide Ontology
Domain-specific schema: Experiment, CPP-Complex, CellPenetratingPeptide, CellPenetratingPeptideRole, Cargo, CargoRole, UptakeMechanism. cppkg.bio2vec.net/dataset ↗

Browse Entities

Browse all instances of each entity class. Click any row to view the full entity detail page.


ID | Label | URI
Select a class above to browse entities.

SPARQL Query Interface

Write or select a SPARQL SELECT query to interrogate the Knowledge Graph directly.
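As an illustration of the kind of SELECT query this interface accepts, the following Python sketch assembles a query that lists regulators (genes or inhibitors) together with the uptake mechanisms they act on. The prefix URIs and the two SIO properties are the ones listed in the Documentation section of this page; the function name `build_regulator_query`, the variable names, and the presence of `rdfs:label` on entities are assumptions made for the example, not confirmed features of the graph.

```python
# Prefix URIs as listed under "Namespace Prefixes" on this page.
PREFIXES = {
    "sio": "http://semanticscience.org/resource/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

# SIO properties as listed under "Key Properties" on this page.
POSITIVELY_REGULATES = "sio:SIO_001401"  # Gene -> Uptake Mechanism
NEGATIVELY_REGULATES = "sio:SIO_001402"  # Inhibitor -> Uptake Mechanism


def build_regulator_query(predicate: str, limit: int = 25) -> str:
    """Return a SPARQL SELECT query listing regulators and their mechanisms.

    Hypothetical helper: the query shape is a guess at how the schema is used.
    """
    header = "\n".join(f"PREFIX {p}: <{uri}>" for p, uri in PREFIXES.items())
    body = (
        "SELECT ?regulator ?regulatorLabel ?mechanism WHERE {\n"
        f"  ?regulator {predicate} ?mechanism .\n"
        # Assumption: entities may carry an rdfs:label; OPTIONAL keeps rows without one.
        "  OPTIONAL { ?regulator rdfs:label ?regulatorLabel }\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )
    return header + "\n" + body


if __name__ == "__main__":
    print(build_regulator_query(POSITIVELY_REGULATES, limit=10))
```

Passing `NEGATIVELY_REGULATES` instead would produce the corresponding query for inhibitors. The resulting string can be pasted into the query box as-is.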


Quick Examples
Example Questions

Documentation

Schema description, ontology reuse, data provenance, and how to cite this resource.


Entity Classes
Class | URI / Prefix | Description
Cell-Penetrating Peptide | cpp:CellPenetratingPeptide | A short amino acid sequence with intrinsic ability to traverse biological membranes.
CPP-Complex | cpp:CPP-Complex | A molecular assembly formed between a CPP and its associated cargo molecule.
Cargo | cpp:Cargo | A therapeutic or reporter molecule transported into cells via a CPP.
Uptake Mechanism | cpp:UptakeMechanism | A cellular route (endocytic or non-endocytic) through which the CPP-complex crosses the membrane.
Gene | sio:SIO_010035 | Upregulator genes that positively regulate an endocytic uptake mechanism.
Inhibitor | sio:SIO_010435 | A chemical compound used to selectively block an endocytic uptake mechanism.
Cell Line | sio:SIO_010054 | The cultured cell population serving as the experimental host model.
Subcellular Entity | sio:SIO_001400 | An organelle or intracellular compartment where CPPs/cargo are delivered.
Experiment | sio:SIO_000994 | The experimental validation for CPP-complex internalization activity.
Document | sio:SIO_000148 | The scientific article or patent evidencing the experimental validation.
Key Properties
Property | URI | Meaning
Is component part of | sio:SIO_000313 | CPP is a component part of a CPP-Complex.
Is participant in | sio:SIO_000062 | CPP-Complex or Cell Line participates in an Experiment.
Has component part | sio:SIO_000369 | CPP-Complex has component part Cargo and Cell-Penetrating Peptide.
Is located in | sio:SIO_000061 | CPP-Complex is located in a Subcellular Entity.
Is about | sio:SIO_000332 | An Experiment is about an Uptake Mechanism.
Positively regulates | sio:SIO_001401 | Gene positively regulates an Uptake Mechanism.
Negatively regulates | sio:SIO_001402 | Inhibitor negatively regulates an Uptake Mechanism.
Is realized in | sio:SIO_000356 | A CPP or Cargo role is realized in an Uptake Mechanism.
Is described by | sio:SIO_000557 | An Experiment is described by a Document (PubMed/patent).
sequence | cppS:sequence | Amino acid sequence of the CPP (one-letter code).
peptideName | cppS:peptideName | Common name of the CPP (e.g., Penetratin, TAT).
cargoType | cppS:cargoType | Annotated type of cargo given in the source experiment.
uptakeEfficiency | cppS:uptakeEfficiency | Qualitative or quantitative uptake efficiency from the experiment.
Namespace Prefixes
Prefix | URI | Description
cpp: | https://cppkg.bio2vec.net/dataset/ | CPP dataset instances
cppS: | https://cppkg.bio2vec.net/schema# | CPP custom schema / ontology
sio: | http://semanticscience.org/resource/ | Semanticscience Integrated Ontology
obo: | http://purl.obolibrary.org/obo/ | OBO Foundry (GO, CLO terms)
rdf: | http://www.w3.org/1999/02/22-rdf-syntax-ns# | RDF vocabulary
rdfs: | http://www.w3.org/2000/01/rdf-schema# | RDF Schema
owl: | http://www.w3.org/2002/07/owl# | Web Ontology Language
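The prefix table above is enough to convert between compact names such as sio:SIO_001401 and full IRIs. A minimal Python sketch of that mapping follows; the namespace URIs are copied from the table, while the helper names `expand_curie` and `compact_iri` are invented for this example and are not part of any published client.

```python
# Namespace URIs copied from the "Namespace Prefixes" table.
PREFIXES = {
    "cpp": "https://cppkg.bio2vec.net/dataset/",
    "cppS": "https://cppkg.bio2vec.net/schema#",
    "sio": "http://semanticscience.org/resource/",
    "obo": "http://purl.obolibrary.org/obo/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}


def expand_curie(curie: str) -> str:
    """Turn 'prefix:local' into a full IRI (hypothetical helper)."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return PREFIXES[prefix] + local


def compact_iri(iri: str) -> str:
    """Turn a full IRI back into 'prefix:local'; return it unchanged if no prefix matches."""
    # Try the longest namespace first so a more specific namespace would win.
    for prefix, namespace in sorted(PREFIXES.items(), key=lambda kv: -len(kv[1])):
        if iri.startswith(namespace):
            return f"{prefix}:{iri[len(namespace):]}"
    return iri


if __name__ == "__main__":
    print(expand_curie("sio:SIO_001401"))
    print(compact_iri("https://cppkg.bio2vec.net/schema#sequence"))
```

Full IRIs are what endpoints such as /api/browse?class=URI appear to expect, so a helper of this kind is mainly useful when moving between SPARQL text and API parameters.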
Provenance

The CPP Knowledge Graph was built from 5,288 peptide sequences collected from CPPsite3 [1], all composed exclusively of standard amino acids, together with their annotations. Four kinds of experimental annotation, (1) uptake mechanisms, (2) cell line models, (3) cargo molecule types, and (4) subcellular delivery destinations, were standardised against public databases (Ensembl, ChEBI, GO, CLO) and serialised as RDF/OWL 2 in Turtle format using the SIO framework and a custom CPP domain schema.
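The restriction to standard amino acids can be pictured as a simple check on the one-letter sequence. The sketch below is only an illustration of such a filter and is not the authors' actual curation script; the function name is hypothetical, and treating lowercase letters as non-standard is an assumption made here because some peptide databases use lowercase for modified or D-residues.

```python
# The 20 standard amino acids in one-letter code.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def has_only_standard_residues(sequence: str) -> bool:
    """Return True if the sequence is non-empty and uses only the 20 standard letters.

    Assumption: case is significant, so lowercase letters are rejected
    rather than silently uppercased.
    """
    residues = sequence.strip()
    return bool(residues) and set(residues) <= STANDARD_AMINO_ACIDS


if __name__ == "__main__":
    for seq in ["RRRRRRRR", "RRXRR", "rrrr"]:
        print(seq, has_only_standard_residues(seq))
```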

The uptake-mechanism regulation network focuses on: (1) the identification of endocytic uptake routes; (2) the genes that regulate each pathway; and (3) the chemical inhibitors used to validate pathway specificity.

[1] Bajiya, N. et al. CPPsite3: An updated large repository of experimentally validated cell-penetrating peptides. Drug Discovery Today, 104421 (2025). https://doi.org/10.1016/j.drudis.2025.104421

How to Cite
@dataset{cpp_kg,
  title   = {A knowledge graph and dataset of natural cell-penetrating peptides},
  author  = {Gomez-Castillo, Maria and Renn, Dominik and Rueping, Magnus and Hoehndorf, Robert},
  year    = {2026},
  url     = {https://github.com/bio-ontology-research-group/Cell-penetrating-peptides},
  license = {CC-BY 4.0}
}

Download

Download the full knowledge graph in standard RDF formats, or access it programmatically via the REST API.


Turtle (.ttl)

Full RDF graph in Turtle serialization. Compatible with Protégé, rdflib, Apache Jena, and all major triple stores.

JSON-LD (.jsonld)

Full RDF graph in JSON-LD format. Ideal for web applications and Schema.org integration.

Download JSON-LD
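Once downloaded, the JSON-LD file can be inspected with nothing more than a JSON parser. The sketch below counts nodes per `@type` using only the Python standard library. The layout of the exported file is not documented on this page, so the code accepts either a top-level array of nodes or an object with an `@graph` array; both shapes, and the function name `count_types`, are assumptions for this example.

```python
import json
from collections import Counter


def count_types(jsonld_text: str) -> Counter:
    """Count how many nodes declare each @type in a JSON-LD document.

    Assumption: nodes sit in a top-level list or under "@graph";
    other layouts would need a proper JSON-LD processor.
    """
    doc = json.loads(jsonld_text)
    if isinstance(doc, dict):
        nodes = doc.get("@graph", [doc])
    else:
        nodes = doc
    counts = Counter()
    for node in nodes:
        if not isinstance(node, dict):
            continue
        types = node.get("@type", [])
        if isinstance(types, str):  # a single type may be given as a plain string
            types = [types]
        counts.update(types)
    return counts


if __name__ == "__main__":
    # Tiny made-up sample; the example.org identifiers are placeholders, not real entities.
    sample = json.dumps([
        {"@id": "https://example.org/g1",
         "@type": ["http://semanticscience.org/resource/SIO_010035"]},
        {"@id": "https://example.org/g2",
         "@type": "http://semanticscience.org/resource/SIO_010035"},
    ])
    print(count_types(sample))
```

For anything beyond quick counts, a full RDF library that understands JSON-LD contexts is the safer choice.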
Source Repository

Source data, conversion scripts, and ontology files are maintained on GitHub.

GitHub Repository
REST API Reference
Endpoint | Method | Description
/api/status | GET | Graph load status and triple count
/api/metrics | GET | Global graph metrics (nodes, predicates)
/api/classes | GET | Instance counts per class (includes URI)
/api/browse?class=URI&page=1 | GET | Paginated entity browser
/api/entity/{id} | GET | Full entity detail (properties + graph)
/api/search | POST JSON | Peptide search across multiple fields
/api/sparql | POST JSON | Ad-hoc SPARQL SELECT queries
/api/charts/mechanisms | GET | Mechanism distribution data
/api/charts/cargo | GET | Cargo type distribution data
/api/charts/celllines | GET | Cell line frequency data
/api/download/ttl | GET | Download full graph as Turtle
/api/download/jsonld | GET | Download full graph as JSON-LD
/dataset/{id} | GET | Content-negotiated entity access (RDF or HTML redirect)
/robots.txt | GET | Robots exclusion file
/sitemap.xml | GET | XML sitemap of all entities
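To show how the endpoints above might be called from a script, here is a minimal Python sketch that only builds the requests and does not send them. Several details are assumptions rather than documented behaviour: the base URL (taken from the cppkg.bio2vec.net links on this page), the `"query"` key in the JSON body for /api/sparql, and `text/turtle` as the media type requested from /dataset/{id}. The function names are likewise hypothetical.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumption: the API is served from the same host as the dataset links on this page.
BASE_URL = "https://cppkg.bio2vec.net"


def browse_url(class_uri: str, page: int = 1) -> str:
    """Build the URL for /api/browse with a percent-encoded class URI."""
    return f"{BASE_URL}/api/browse?" + urlencode({"class": class_uri, "page": page})


def sparql_request(query: str) -> Request:
    """Build a POST request for /api/sparql carrying the query as JSON."""
    payload = json.dumps({"query": query}).encode("utf-8")  # "query" key is an assumption
    return Request(
        f"{BASE_URL}/api/sparql",
        data=payload,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def entity_rdf_request(entity_id: str) -> Request:
    """Build a GET request for /dataset/{id} that asks for RDF via the Accept header."""
    return Request(
        f"{BASE_URL}/dataset/{quote(entity_id, safe='')}",
        headers={"Accept": "text/turtle"},  # assumed media type
        method="GET",
    )


if __name__ == "__main__":
    print(browse_url("http://semanticscience.org/resource/SIO_010035", page=2))
    req = sparql_request("SELECT * WHERE { ?s ?p ?o } LIMIT 1")
    print(req.get_method(), req.full_url)
```

A request built this way can be sent with `urllib.request.urlopen(req)`. The shape of the response is not specified on this page, so it should be inspected before being parsed.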

About

Background, architecture, team, and contact information for the CPP Knowledge Graph.


What is the CPP Knowledge Graph?

The CPP Knowledge Graph is a machine-readable resource that organizes cell-penetrating peptides and their biological context into an interconnected network. It links peptide sequences to key experimental details, including their uptake mechanisms, cell lines, cargo molecules, and subcellular delivery locations.

Data Sources

To keep the data standardized and interoperable, biological entities are mapped to community ontologies and databases: Ensembl, ChEBI, Gene Ontology, CLO, and SIO.

Architecture
Uptake Mechanisms & Regulation

Maps cellular uptake mechanisms and directly connects them to the specific genes that activate them and the chemicals that inhibit them.

Biological Entities

Each CPP-Complex represents the peptide conjugated to its cargo. Biological entities link these physical entities to their functional roles.

Experimental Context

Each experiment links the CPP-Complex to the specific cell line used, the final subcellular delivery location, and the source literature (PubMed/patents).

Team
María de los Ángeles Gómez Castillo
Biological and Environmental Science & Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
Dominik Renn
Biological and Environmental Science & Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
Magnus Rueping
Biological and Environmental Science & Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
Robert Hoehndorf
Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
License

The CPP Knowledge Graph is released under the Creative Commons Attribution 4.0 International License (CC BY 4.0).

Contact & Feedback

For questions, bug reports, or suggestions, please contact: maria.castillo@kaust.edu.sa