OrientDB Interface Methods

ebel.manager.orientdb.odb_meta.Graph

Bases: ABC

Generic parent class for BioDBs.

classes: Tuple[OClass, ...] property

Return generic, node and edge classes as Tuple[OClass, ...].

number_of_generics: Dict[str, int] property

Return the number of entries in OrientDB classes and RDB tables. Tables have priority.

number_of_nodes property

Return node count.

number_of_edges property

Return edge count.

__init__(generics: Tuple[Generic] = (), nodes: Tuple[Node] = (), edges: Tuple[Edge] = (), indices: Tuple[OIndex] = (), urls: dict = None, biodb_name: str = '', tables_base=None, config_params: Optional[dict] = None, overwrite_config: bool = False)

Init method.

__config_params_check(overwrite_config: bool = False)

Go through passed/available configuration params.

execute(command_str: str) -> List[OrientRecord]

Execute a command directly in the OrientDB server.

Parameters

command_str: str The SQL to be executed

Raises

PyOrientCommandException Caused by a disconnect from the ODB server. eBEL will try to reconnect if possible.

Returns

ODB response.
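The reconnect behaviour described for execute can be pictured with a minimal sketch. It is not eBEL's implementation: DisconnectError, FlakyClient and execute_with_reconnect are hypothetical stand-ins, and the fake client fails only on its first call so the retry path is exercised without a server.

```python
class DisconnectError(Exception):
    """Hypothetical stand-in for the driver's disconnect exception."""


class FlakyClient:
    """Fake client whose first command fails, then succeeds."""

    def __init__(self):
        self.calls = 0

    def command(self, command_str):
        self.calls += 1
        if self.calls == 1:
            raise DisconnectError("connection lost")
        return [{"result": command_str}]


def execute_with_reconnect(get_client, command_str, max_retries=1):
    """Run command_str; on disconnect, fetch a client again and retry."""
    client = get_client()
    for attempt in range(max_retries + 1):
        try:
            return client.command(command_str)
        except DisconnectError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the error
            client = get_client()  # reconnect before the next attempt


client = FlakyClient()
rows = execute_with_reconnect(lambda: client, "SELECT FROM V LIMIT 1")
```

Bounding the retries keeps a permanently unreachable server from causing an endless loop.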

set_configuration_parameters()

Set configuration for OrientDB database client instance using configuration file or passed params.

get_client() -> OrientDB

Attempts to connect to the OrientDB client. This is currently done by using session tokens.

__repr__()

Represent the class.

insert_data() -> Dict[str, int] abstractmethod

Insert all generic data.

update_interactions() -> int abstractmethod

Update BEL interactions.

clear_and_import_data() -> Dict[str, int]

Clears the associated table and inserts the data from raw downloaded data.

create_index_rdbms(table_name: str, columns)

Creates index on column(s) in RDBMS.

clear_edges_by_bel_doc_rid(bel_document_rid: str, even_if_other_doc_rids_exists=True)

Delete all edges linked to a specified BEL document rID.

clear_documents() -> int

Clear all document info. Returns number of deleted documents.

get_number_of_bel_statements_by_document_rid(bel_document_rid: str) -> int

Return BEL statement count with a given document ID.

get_documents()

Return all document info as pandas DataFrame.

get_documents_as_dict()

Return all document info as dictionaries.

add_keyword(keyword: str, description: str) -> pd.DataFrame

Add a keyword and description used for tagging BEL documents.

Parameters

keyword: str The name of a project the work is based on or type of work

description: str Detailed explanation of the keyword

get_info_class(class_name)

Return info about class.

get_info_properties(class_name: str, short: bool = True)

Get the property information for a specified table.

entry_exists(class_name, **params) -> bool

Check if an entry of class_name with the given parameters exists.

query(sql: str) -> pd.DataFrame

Return a pandas DataFrame results table.

query_get_dict(sql: str) -> List[dict]

Return list of dictionaries using a given SQL query.

query_class(class_name: str, limit: int = 0, skip: int = 0, columns: Iterable[str] = None, with_rid=True, with_class=False, print_sql: bool = False, group_by: List[str] = None, distinct=False, as_dataframe: bool = False, where_list: Tuple[str] = (), **params) -> Union[List[dict], pd.DataFrame]

Query class by params and return a list of dictionaries, or a pandas DataFrame if as_dataframe is True.

query_class_chunks(class_name: str, chunk_size: int = 10000, columns: Iterable[str] = None, with_rid=True, with_class=False, print_sql: bool = False, group_by: List[str] = None, distinct: bool = False)

Query class by params and only return a set of results in batches. Creates a generator.
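Batched querying of this kind is commonly built on skip/limit paging. The sketch below shows the generator pattern under that assumption. The names iter_chunks, fetch_page and RECORDS are hypothetical, and an in-memory list stands in for the OrientDB class.

```python
# In-memory stand-in for a class holding 25 records.
RECORDS = [{"id": i} for i in range(25)]


def fetch_page(skip, limit):
    """Hypothetical page fetcher: return up to `limit` rows after `skip`."""
    return RECORDS[skip:skip + limit]


def iter_chunks(fetch, chunk_size=10000):
    """Yield lists of rows until the source returns an empty page."""
    skip = 0
    while True:
        rows = fetch(skip, chunk_size)
        if not rows:
            return
        yield rows
        skip += len(rows)  # advance past the rows already delivered


chunk_lengths = [len(chunk) for chunk in iter_chunks(fetch_page, chunk_size=10)]
```

Because the generator only holds one page at a time, memory use stays bounded by chunk_size rather than by the size of the class.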

query_rid(rid, columns: list = None)

Query specified columns of a given rID entry.

download(url_dict: Dict[str, str] = None, biodb: str = None, expiration_days: int = 100) -> Dict[str, bool]

Download url to file_path if not older than expiration_days.

download_file(url: str, biodb: str, expiration_days: int = 100, addtional_header: dict = None) -> bool staticmethod

Download file. Returns True if the file needed to be downloaded.
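An expiration check like the one controlled by expiration_days can be sketched with file modification times. This assumes a download is needed when the local file is missing or older than expiration_days. The helper needs_download and the file name data.tsv are hypothetical and not part of eBEL.

```python
import os
import tempfile
import time

SECONDS_PER_DAY = 86400


def needs_download(file_path, expiration_days=100, now=None):
    """Return True if file_path is missing or older than expiration_days."""
    if not os.path.exists(file_path):
        return True
    now = time.time() if now is None else now
    age_days = (now - os.path.getmtime(file_path)) / SECONDS_PER_DAY
    return age_days > expiration_days


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.tsv")
    missing = needs_download(path)  # no local copy yet
    open(path, "w").close()
    fresh = needs_download(path)  # file was just created
    # Pretend 101 days have passed since the file was written.
    stale = needs_download(path, now=time.time() + 101 * SECONDS_PER_DAY)
```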

index_exists(index_name: str)

Check if index_name exists.

create_index(index: OIndex)

Create index.

get_index_name(index: OIndex)

Return index name.

create_all_classes()

Create all classes.

create_all_indices()

Create indices.

create_indices(indices: Tuple[OIndex])

Create indices.

drop_all_indices()

Drop indices.

drop_indices(indices: List[OIndex])

Drop indices.

drop_index(index: OIndex)

Drop index.

create_node_classes()

Create node classes.

create_edge_classes()

Create edge classes.

create_generic_classes()

Create generic classes.

create_classes(oclasses: Tuple[OClass] = None)

Create classes.

oclasses: list of classes (odb_structure.OClass), each an OrientDB class of type v, e or g for vertex, edge or generic

create_class(oclass: OClass, print_sql=False)

Create class.

oclass: OrientDB class of type v, e or g for vertex, edge or generic

create_class_property(class_name: str, prop: OProperty, print_sql: bool = False)

Create OrientDB class property.

class_exists(class_name: str) -> bool

Check if OrientDB class exists.

classes_exists(list_of_class_names)

Check if the listed OrientDB classes exist.

drop_class(class_name: str)

Drop the specified table.

__drop_classes(classes: Iterable[OClass])

Delete the classes in opposite order.
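Dropping in the opposite order presumably ensures that subclasses are removed before the classes they extend. A minimal sketch of that ordering follows. The class names and the drop_in_reverse helper are hypothetical.

```python
def drop_in_reverse(created_order, drop_class):
    """Call drop_class for each name, last created first."""
    for class_name in reversed(list(created_order)):
        drop_class(class_name)


# Hypothetical creation order: each parent is created before its child.
created_order = ["entity", "protein", "enzyme"]
dropped = []
drop_in_reverse(created_order, dropped.append)
```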

drop_all_classes()

Drop all classes.

drop_generic_classes()

Drop all generic classes.

drop_node_classes()

Drop all node classes.

drop_edge_classes()

Drop all edge classes.

clear()

Clear (delete entries) from all classes.

is_abstract_class(class_name: str) -> bool

Returns true if class is abstract.

clear_edges() -> Dict[str, int]

Delete all edges.

clear_class(class_name)

Delete all entries from class if exists.

clear_nodes()

Delete all nodes.

clear_nodes_with_no_edges()

Delete all nodes from a class with no edges.

clear_generics()

Delete all entries in generic tables.

recreate_tables()

Recreate SQLAlchemy tables in relational database.

clear_nodes_and_edges()

Delete all nodes and edges of a specific biodb.

clear_all_nodes_and_edges()

Delete all nodes and edges in the whole database.

clear_exp_edges()

Delete all DEA experiment associated edges.

recreate()

Recreate OrientDB collection.

table_exists(table: Table)

Checks if the table exists in RDBMS.

__get_sql_where_part(params, where_list: Tuple[str] = ()) staticmethod

Return an ODB SQL WHERE part built from params.
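How keyword params and where_list might combine into a WHERE part can be sketched as below. This illustrates the idea only and is not the private method's actual logic: build_where is a hypothetical helper, quoting is simplified (strings are double-quoted via json.dumps), and all conditions are assumed to be joined with AND.

```python
import json


def build_where(params, where_list=()):
    """Build ' WHERE a = 1 AND ...' from a dict plus raw condition strings."""
    conditions = []
    for column, value in params.items():
        if value is None:
            conditions.append(f"{column} IS NULL")
        elif isinstance(value, bool):
            # Check bool before int, since bool is a subclass of int.
            conditions.append(f"{column} = {str(value).lower()}")
        elif isinstance(value, (int, float)):
            conditions.append(f"{column} = {value}")
        else:
            conditions.append(f"{column} = {json.dumps(str(value))}")
    conditions.extend(where_list)  # raw conditions are appended unchanged
    if not conditions:
        return ""
    return " WHERE " + " AND ".join(conditions)


where = build_where({"name": "TP53", "taxid": 9606}, ("score > 0.5",))
```

Returning an empty string when there are no conditions lets the caller append the result to a SELECT statement unconditionally.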

get_number_of_class(class_name, distinct_column_name: str = None, **params)

Return count of unique values for a given class_name and column name.

get_cluster_ids(class_name: str) -> list

Get all cluster ids by class name.

insert_record(class_name: str, value_dict: dict, print_sql=False) -> Optional[str]

Insert new entry in class with values from dictionary. Returns rid.

create_record(class_name: str, value_dict: dict) -> Optional[str]

Create a record (insert into class_name) with the content of value_dict.

update_record(class_name: str, value_dict: dict) -> str

Update record with content of value_dict.

edge_exists(class_name: str, from_rid: str, to_rid: str, value_dict: dict = {}) -> str

Check if edge exists. Return rid if exists else None.

node_exists(class_name: str, value_dict: dict = {}, check_for: Union[Iterable[str], str] = None, print_sql: bool = False) -> str

Check if node exists. Return rid if exists else None.

create_edge(class_name: str, from_rid: str, to_rid: str, value_dict: dict = {}, print_sql=False, if_not_exists=False, ignore_empty_values=False) -> str

Create edge from from_rid(@rid) to to_rid(@rid) with content of value_dict.

get_create_rid(class_name: str, value_dict: dict, check_for=None, print_sql=False) -> str

Return class_name.@rid by value_dict. Create record/insert if not exists.
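The get-or-create behaviour described for get_create_rid can be sketched against an in-memory store. The dictionary store, the rid format and get_or_create are hypothetical. The sketch assumes check_for restricts which keys are compared, and that all keys of value_dict are compared when it is omitted.

```python
def get_or_create(store, class_name, value_dict, check_for=None):
    """Return the rid of a matching record, inserting one if none matches."""
    if check_for is None:
        keys = list(value_dict)
    elif isinstance(check_for, str):
        keys = [check_for]
    else:
        keys = list(check_for)

    for rid, (stored_class, values) in store.items():
        same_class = stored_class == class_name
        if same_class and all(values.get(k) == value_dict.get(k) for k in keys):
            return rid  # existing record found, nothing inserted

    rid = f"#1:{len(store)}"  # made-up rid, for illustration only
    store[rid] = (class_name, dict(value_dict))
    return rid


store = {}
first = get_or_create(store, "protein", {"name": "TP53", "namespace": "HGNC"})
again = get_or_create(store, "protein", {"name": "TP53", "namespace": "HGNC"})
other = get_or_create(store, "protein", {"name": "EGFR", "namespace": "HGNC"})
```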

update_correlative_edges() -> List[str]

Create a reverse edge for every correlative edge.
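Creating a reverse edge for each correlative edge amounts to making the relation symmetric. A small sketch on (from_rid, to_rid) pairs follows. The helper add_reverse_edges and the rids are hypothetical, and the sketch assumes an existing reverse edge is not duplicated.

```python
def add_reverse_edges(edges):
    """Return the reverse edges that are missing from `edges`."""
    existing = set(edges)
    created = []
    for from_rid, to_rid in edges:
        reverse = (to_rid, from_rid)
        if reverse not in existing:
            existing.add(reverse)
            created.append(reverse)
    return created


edges = [("#1:0", "#1:1"), ("#1:1", "#1:0"), ("#1:2", "#1:3")]
new_edges = add_reverse_edges(edges)
```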

update_document_info()

Update document metadata.

update_pmcids() -> int

Add PMC ID to bel_relation if one exists.

update_pmids(edge_name='bel_relation')

Update PMID metadata for all edges of the specified edge_name.

import_dataframe(dataframe: pd.DataFrame, class_name: str, replace_nulls_with_nones: bool = True, standardize_column_names: bool = True, replace: bool = True) -> int

Import a dataframe into the OrientDB class named class_name.

batch_insert(dataframe: pd.DataFrame, database: str, chunk_size: int = 100, desc: str = None, standardize_column_names: bool = False, replace: bool = True, replace_nulls_with_nones: bool = False) -> int

Adds rows of a dataframe into specified generic table in batches.

Parameters

dataframe: pandas DataFrame A dataframe of information to be inserted into the generic table.

database: str Name of the generic table to be inserted into.

chunk_size: int (optional) Number of chunks to break the dataframe into for batching.

desc: str A description for tqdm about what is being iterated through.

standardize_column_names: bool If True (default=False), standardize column names.

replace: bool If True (default), content of dataframe replaces old data.

replace_nulls_with_nones: bool If True (default=False), replace numpy.nan with None (==null in OrientDB).

Returns

int Number of rows inserted into the table.
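The batching and null replacement described for batch_insert can be sketched with plain Python rows. This is an illustration under assumptions: it treats chunk_size as the number of rows per batch, and nan_to_none and batch_insert_rows are hypothetical helpers operating on lists of dictionaries rather than a DataFrame.

```python
import math


def nan_to_none(value):
    """Map float NaN to None; leave every other value unchanged."""
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


def batch_insert_rows(rows, insert_batch, chunk_size=100):
    """Send rows to insert_batch in slices of chunk_size; return row count."""
    inserted = 0
    for start in range(0, len(rows), chunk_size):
        batch = [
            {key: nan_to_none(value) for key, value in row.items()}
            for row in rows[start:start + chunk_size]
        ]
        insert_batch(batch)
        inserted += len(batch)
    return inserted


rows = [{"id": i, "score": float("nan") if i % 2 else 1.0} for i in range(250)]
batches = []
total = batch_insert_rows(rows, batches.append, chunk_size=100)
```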

get_set_gene_rids_by_position(chromosome: str, position: int, gene_types=['mapped', 'downstream', 'upstream']) -> Dict[str, List[str]]

Return dictionary of mapped genes by chromosomal position.

ALERT: creates a new BEL HGNC gene if it does not exist.

class_is_descendant_of(child_name: str, descendant_name: str) -> bool

Returns True if child_name is a child class of descendant_name.

class_has_children(class_name) -> bool

Checks if class_name has children that inherit from class_name.

get_child_classes(class_name) -> List[str]

Get list of child classes for given class_name.

get_leaf_classes_of(class_name: str) -> List[str]

Return list of children classes for the given class_name.
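The hierarchy helpers above (descendant check, child classes, leaf classes) can be illustrated on a small child-to-parent map. The class names and the functions below are hypothetical. In particular, the sketch assumes that leaf classes are descendants without children of their own, which may differ from what get_leaf_classes_of returns.

```python
# Hypothetical hierarchy: child class -> parent class.
PARENT_OF = {
    "entity": "V",
    "gene": "entity",
    "protein": "entity",
    "enzyme": "protein",
}


def is_descendant_of(child_name, ancestor_name):
    """Walk up the parent chain looking for ancestor_name."""
    current = PARENT_OF.get(child_name)
    while current is not None:
        if current == ancestor_name:
            return True
        current = PARENT_OF.get(current)
    return False


def child_classes(class_name):
    """Return the direct children of class_name in map order."""
    return [child for child, parent in PARENT_OF.items() if parent == class_name]


def leaf_classes_of(class_name):
    """Return descendants of class_name that have no children themselves."""
    leaves = []
    for child in child_classes(class_name):
        below = leaf_classes_of(child)
        leaves.extend(below if below else [child])
    return leaves
```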

insert() -> Dict[str, int]

Check whether files for download are missing or the generic table is empty. If so, insert data.

update() -> None

Check generics and update BEL interactions.

update_bel() -> None

Delete and update all class specific edges.

delete_nodes_with_no_edges(class_name=None) -> int

Delete all nodes without any edges.

get_pure_symbol_rids_dict_in_bel_context(class_name='protein', namespace='HGNC') -> Dict[str, str]

Return dictionary with HGNC names as key and OrientDB @rid as value.

Applies to all pure nodes in graph with class name directly or indirectly involved in BEL statement. This method could be helpful to avoid graph explosion.

get_pure_uniprots_in_bel_context() -> Set[str]

Returns a set of all UniProt accessions in BEL annotation context.

get_pure_symbol_rid_df_in_bel_context(class_name='protein', namespace='HGNC') -> pd.DataFrame

Return pandas DataFrame with gene symbols and node rIDs.

get_pure_symbol_rids_dict(class_name='protein', namespace='HGNC') -> Dict[str, str]

Return dictionary with protein name as keys and node rIDs as values.

get_pure_rid_by_uniprot(uniprot: str)

Get rIDs of node based on UniProt ID.

get_pure_uniprot_rid_dict_in_bel_context() -> Dict[str, str]

Return dictionary with UniProt accession id as key and OrientDB @rid as value.

Applies to all pure nodes in graph with class name directly or indirectly involved in BEL statement. This method could be helpful to avoid graph explosion.

get_pure_uniprot_rids_dict()

Return dictionary with UniProt IDs as keys and node rIDs as values.