CPP Knowledge Graph
A structured RDF representation of the endocytic and non-endocytic pathways through which Cell-Penetrating Peptides (CPPs) enter cells. The graph encodes relationships between mechanisms, genes, inhibitors, cell lines, and cargo types, enabling semantic queries and knowledge discovery across the CPP literature.
Global Metrics
Instances per Class
Click a row to browse that class.
| Class | Description | # Instances |
|---|---|---|
| Loading… | ||
Association Types
Usage counts for the seven key relationship types.
| Association | Description | # Uses |
|---|---|---|
| Loading… | ||
Data Sources & Ontologies
Ensembl: Gene identifiers (ENSG…). ensembl.org
ChEBI: Inhibitor compound IDs (CHEBI_…). ebi.ac.uk/chebi
Gene Ontology (GO): Biological process and cellular component terms. geneontology.org
Cell Line Ontology (CLO): Standardised cell line identifiers. obofoundry.org
Semanticscience Integrated Ontology (SIO): Core entity types and semantic relations. semanticscience.org
CPP schema: Domain-specific classes Experiment, CPP-Complex, CellPenetratingPeptide, CellPenetratingPeptideRole, Cargo, CargoRole, UptakeMechanism. cppkg.bio2vec.net/dataset
Browse Entities
Browse all instances of each entity class. Click any row to view the full entity detail page.
| ID | Label | URI | |
|---|---|---|---|
| Select a class above to browse entities. | |||
Peptide Search
Search peptides and retrieve all associations — mechanism, cargo, cell line, subcellular delivery, and source document.
SPARQL Query Interface
Write or select a SPARQL SELECT query to interrogate the Knowledge Graph directly.
Quick Examples
Example Questions
Documentation
Schema description, ontology reuse, data provenance, and how to cite this resource.
Entity Classes
| Class | URI / Prefix | Description |
|---|---|---|
| Cell-Penetrating Peptide | cpp:CellPenetratingPeptide | A short amino acid sequence with an intrinsic ability to traverse biological membranes. |
| CPP-Complex | cpp:CPP-Complex | A molecular assembly formed between a CPP and its associated cargo molecule. |
| Cargo | cpp:Cargo | A therapeutic or reporter molecule transported into cells via a CPP. |
| Uptake Mechanism | cpp:UptakeMechanism | A cellular route (endocytic or non-endocytic) through which the CPP-complex crosses the membrane. |
| Gene | sio:SIO_010035 | A gene that positively regulates (upregulates) an endocytic uptake mechanism. |
| Inhibitor | sio:SIO_010435 | A chemical compound used to selectively block an endocytic uptake mechanism. |
| Cell Line | sio:SIO_010054 | The cultured cell population serving as the experimental host model. |
| Subcellular Entity | sio:SIO_001400 | An organelle or intracellular compartment where CPPs/cargo are delivered. |
| Experiment | sio:SIO_000994 | The experimental validation for CPP-complex internalization activity. |
| Document | sio:SIO_000148 | The scientific article or patent evidencing the experimental validation. |
Key Properties
| Property | URI | Meaning |
|---|---|---|
| Is component part of | sio:SIO_000313 | CPP is a component part of a CPP-Complex. |
| Is participant in | sio:SIO_000062 | CPP-Complex or Cell Line participates in an Experiment. |
| Has component part | sio:SIO_000369 | CPP-Complex has component parts Cargo and Cell-Penetrating Peptide. |
| Is located in | sio:SIO_000061 | CPP-Complex is located in a Subcellular Entity. |
| Is about | sio:SIO_000332 | An Experiment is about an Uptake Mechanism. |
| Positively regulates | sio:SIO_001401 | Gene positively regulates an Uptake Mechanism. |
| Negatively regulates | sio:SIO_001402 | Inhibitor negatively regulates an Uptake Mechanism. |
| Is realized in | sio:SIO_000356 | A CPP or Cargo role is realized in an Uptake Mechanism. |
| Is described by | sio:SIO_000557 | An Experiment is described by a Document (PubMed/patent). |
| sequence | cppS:sequence | Amino acid sequence of the CPP (one-letter code). |
| peptideName | cppS:peptideName | Common name of the CPP (e.g., Penetratin, TAT). |
| cargoType | cppS:cargoType | Annotated type of cargo given in the source experiment. |
| uptakeEfficiency | cppS:uptakeEfficiency | Qualitative or quantitative uptake efficiency from the experiment. |
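The properties above can be combined into a SPARQL SELECT query. The following is a minimal sketch, held in a Python string so it can be pasted into the query interface or sent programmatically. It assumes the triple directions given in the Meaning column (gene or inhibitor as subject, uptake mechanism as object) and assumes that mechanisms carry an `rdfs:label`; neither has been checked against the published graph. The function name `regulators_query` is hypothetical.

```python
# SPARQL text kept as a Python string. Triple directions follow the
# "Meaning" column of the Key Properties table; the rdfs:label lookup
# is an assumption about how mechanisms are annotated.
QUERY_BODY = """\
PREFIX sio: <http://semanticscience.org/resource/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?mechanism ?mechanismLabel ?gene ?inhibitor
WHERE {
  # Gene positively regulates an uptake mechanism.
  ?gene sio:SIO_001401 ?mechanism .
  # Inhibitor negatively regulates the same mechanism, if one is recorded.
  OPTIONAL { ?inhibitor sio:SIO_001402 ?mechanism . }
  OPTIONAL { ?mechanism rdfs:label ?mechanismLabel . }
}
"""


def regulators_query(limit: int = 50) -> str:
    """Return the query with a LIMIT clause appended."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return f"{QUERY_BODY}LIMIT {int(limit)}\n"


if __name__ == "__main__":
    print(regulators_query(10))
```

The two `OPTIONAL` blocks keep mechanisms that have a regulating gene but no recorded inhibitor or label; drop them if only fully annotated mechanisms are wanted.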
Namespace Prefixes
| Prefix | URI | Description |
|---|---|---|
| cpp: | https://cppkg.bio2vec.net/dataset/ | CPP dataset instances |
| cppS: | https://cppkg.bio2vec.net/schema# | CPP custom schema / ontology |
| sio: | http://semanticscience.org/resource/ | Semanticscience Integrated Ontology |
| obo: | http://purl.obolibrary.org/obo/ | OBO Foundry (GO, CLO terms) |
| rdf: | http://www.w3.org/1999/02/22-rdf-syntax-ns# | RDF vocabulary |
| rdfs: | http://www.w3.org/2000/01/rdf-schema# | RDF Schema |
| owl: | http://www.w3.org/2002/07/owl# | Web Ontology Language |
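As an illustration of how these prefixes relate to full URIs, the sketch below expands a prefixed name such as `sio:SIO_000313` and renders the table as SPARQL `PREFIX` declarations. The prefix map is copied from the table above; the helper names `expand_curie` and `sparql_prefix_header` are hypothetical and not part of the resource.

```python
# Prefix map copied from the Namespace Prefixes table.
PREFIXES = {
    "cpp": "https://cppkg.bio2vec.net/dataset/",
    "cppS": "https://cppkg.bio2vec.net/schema#",
    "sio": "http://semanticscience.org/resource/",
    "obo": "http://purl.obolibrary.org/obo/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}


def expand_curie(curie: str) -> str:
    """Expand a prefixed name such as 'sio:SIO_000313' to a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return PREFIXES[prefix] + local


def sparql_prefix_header() -> str:
    """Render the prefix map as SPARQL PREFIX declarations, one per line."""
    return "\n".join(f"PREFIX {p}: <{uri}>" for p, uri in PREFIXES.items())


if __name__ == "__main__":
    print(expand_curie("sio:SIO_000313"))
    print(sparql_prefix_header())
```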
Provenance
The CPP Knowledge Graph was constructed from 5,288 peptide sequences collected from CPPsite3 [1], restricted to sequences containing only standard amino acids, together with their annotations. The experimental annotations, namely (1) uptake mechanisms, (2) cell line models, (3) cargo molecule types, and (4) subcellular delivery destinations, were standardised against public databases (Ensembl, ChEBI, GO, CLO) and serialised as RDF/OWL 2 in Turtle format using the SIO framework and a custom CPP domain schema.
The uptake-mechanism regulation network focuses on: (1) identifying endocytic uptake routes; (2) the genes that regulate each pathway; and (3) the chemical inhibitors used to validate pathway specificity.
[1] Bajiya, N. et al. CPPsite3: An updated large repository of experimentally validated cell-penetrating peptides. Drug Discovery Today, 104421 (2025). https://doi.org/10.1016/j.drudis.2025.104421
How to Cite
Download
Download the full knowledge graph in standard RDF formats, or access it programmatically via the REST API.
Turtle (.ttl)
Full RDF graph in Turtle serialization. Compatible with Protégé, rdflib, Apache Jena, and all major triple stores.
JSON-LD (.jsonld)
Full RDF graph in JSON-LD format. Ideal for web applications and Schema.org integration.
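For a quick look at the JSON-LD file without an RDF library, the sketch below counts nodes per `@type` using only the Python standard library. It assumes the download is either a top-level list of node objects or an object with an `@graph` array, which are common JSON-LD layouts but have not been confirmed for this file. The sample document and its node identifiers are hypothetical; only the class URIs come from the Entity Classes table.

```python
import json
from collections import Counter

# Tiny stand-in for the downloaded file. The node identifiers are
# hypothetical; only the class URIs come from the Entity Classes table.
SAMPLE_JSONLD = """
[
  {"@id": "https://cppkg.bio2vec.net/dataset/example-gene-1",
   "@type": ["http://semanticscience.org/resource/SIO_010035"]},
  {"@id": "https://cppkg.bio2vec.net/dataset/example-gene-2",
   "@type": ["http://semanticscience.org/resource/SIO_010035"]},
  {"@id": "https://cppkg.bio2vec.net/dataset/example-document-1",
   "@type": "http://semanticscience.org/resource/SIO_000148"}
]
"""


def count_types(jsonld_text: str) -> Counter:
    """Count how many nodes declare each @type."""
    doc = json.loads(jsonld_text)
    # Accept a bare list of nodes, an object with "@graph", or a single node.
    if isinstance(doc, dict):
        nodes = doc.get("@graph", [doc])
    else:
        nodes = doc
    counts = Counter()
    for node in nodes:
        types = node.get("@type", [])
        if isinstance(types, str):  # "@type" may be a string or a list
            types = [types]
        counts.update(types)
    return counts


if __name__ == "__main__":
    for type_uri, n in count_types(SAMPLE_JSONLD).most_common():
        print(n, type_uri)
```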
Source Repository
Source data, conversion scripts, and ontology files are maintained on GitHub.
REST API Reference
| Endpoint | Method | Description |
|---|---|---|
| /api/status | GET | Graph load status and triple count |
| /api/metrics | GET | Global graph metrics (nodes, predicates) |
| /api/classes | GET | Instance counts per class (includes URI) |
| /api/browse?class=URI&page=1 | GET | Paginated entity browser |
| /api/entity/{id} | GET | Full entity detail (properties + graph) |
| /api/search | POST JSON | Peptide search across multiple fields |
| /api/sparql | POST JSON | Ad-hoc SPARQL SELECT queries |
| /api/charts/mechanisms | GET | Mechanism distribution data |
| /api/charts/cargo | GET | Cargo type distribution data |
| /api/charts/celllines | GET | Cell line frequency data |
| /api/download/ttl | GET | Download full graph as Turtle |
| /api/download/jsonld | GET | Download full graph as JSON-LD |
| /dataset/{id} | GET | Content-negotiated entity access (RDF or HTML redirect) |
| /robots.txt | GET | Robots exclusion file |
| /sitemap.xml | GET | XML sitemap of all entities |
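A minimal sketch of preparing requests for three of the endpoints above with the Python standard library. Several details are assumptions rather than documented behaviour: the base URL is taken from the dataset namespace host, the JSON field name `query` for `/api/sparql` is a guess, and `text/turtle` is only one plausible media type for the content-negotiated `/dataset/{id}` route. The helper names are hypothetical. The functions only build requests; nothing is sent until `urllib.request.urlopen` is called.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the API is served from the same host as the dataset namespace.
BASE_URL = "https://cppkg.bio2vec.net"


def browse_url(class_uri: str, page: int = 1) -> str:
    """Build the paginated /api/browse URL, percent-encoding the class URI."""
    query = urllib.parse.urlencode({"class": class_uri, "page": page})
    return f"{BASE_URL}/api/browse?{query}"


def sparql_request(query: str) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST to /api/sparql.

    The field name "query" is an assumption about the expected JSON body.
    """
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/sparql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def entity_rdf_request(entity_id: str) -> urllib.request.Request:
    """Prepare a GET for /dataset/{id} that asks for RDF via the Accept header.

    text/turtle is an assumed media type; the service may offer others.
    """
    path = urllib.parse.quote(entity_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/dataset/{path}",
        headers={"Accept": "text/turtle"},
        method="GET",
    )


if __name__ == "__main__":
    print(browse_url("http://semanticscience.org/resource/SIO_010035", page=2))
    req = sparql_request("SELECT * WHERE { ?s ?p ?o } LIMIT 1")
    print(req.get_method(), req.full_url)
```

To execute a prepared request, pass it to `urllib.request.urlopen(req)` and decode the response body with `json.load` where the endpoint returns JSON.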
About
Background, architecture, team, and contact information for the CPP Knowledge Graph.
What is the CPP Knowledge Graph?
The CPP Knowledge Graph is a machine-readable resource that organises cell-penetrating peptides and their biological context into an interconnected network. It links peptide sequences to key experimental details, including their uptake mechanisms, cell lines, cargo molecules, and subcellular delivery locations.
Data Sources
To keep the data standardised and interoperable, biological factors are mapped to community ontologies and public databases (Ensembl, ChEBI, GO, CLO).
Architecture
Uptake Mechanisms & Regulation
Maps cellular uptake mechanisms and directly connects them to the specific genes that activate them and the chemicals that inhibit them.
Biological Entities
Each CPP-Complex represents the peptide conjugated to its cargo. Biological entities link these physical entities to their functional roles.
Experimental Context
Each experiment links the CPP-Complex to the specific cell line used, the final subcellular delivery location, and the source literature (PubMed/patents).
Team
License
The CPP Knowledge Graph is released under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
Contact & Feedback
For questions, bug reports, or suggestions, please contact: maria.castillo@kaust.edu.sa