OrientDB Interface Methods

`ebel.manager.orientdb.odb_meta.Graph`

Bases: ABC

Generic parent class for BioDBs.

`classes: Tuple[OClass, ...]` `property`

Return generic, node and edge classes as a tuple of OClass.

`number_of_generics: Dict[str, int]` `property`

Return the number of entries in OrientDB classes and RDB tables. Tables have priority.

`number_of_nodes` `property`

Return node count.

`number_of_edges` `property`

Return edge count.

`__init__(generics: Tuple[Generic] = (), nodes: Tuple[Node] = (), edges: Tuple[Edge] = (), indices: Tuple[OIndex] = (), urls: dict = None, biodb_name: str = '', tables_base=None, config_params: Optional[dict] = None, overwrite_config: bool = False)`

Init method.

`__config_params_check(overwrite_config: bool = False)`

Go through passed/available configuration params.

`execute(command_str: str) -> List[OrientRecord]`

Execute a command directly in the OrientDB server.

Parameters

command_str: str The SQL to be executed

Raises

PyOrientCommandException Raised when the connection to the ODB server is lost. eBEL will try to reconnect if possible.

Returns

ODB response.
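The reconnect behaviour described for `execute` can be pictured as a bounded retry loop. The following is a minimal sketch, not the eBEL implementation: `ConnectionLost`, `execute_with_reconnect`, `fake_run` and `fake_reconnect` are hypothetical names introduced here, and the simulated server stands in for a real OrientDB session.

```python
from typing import Callable, List


class ConnectionLost(Exception):
    """Hypothetical stand-in for a driver error such as PyOrientCommandException."""


def execute_with_reconnect(
    run: Callable[[str], List[dict]],
    reconnect: Callable[[], None],
    command_str: str,
    max_retries: int = 1,
) -> List[dict]:
    """Run a command; on a lost connection, reconnect and retry a bounded number of times."""
    attempt = 0
    while True:
        try:
            return run(command_str)
        except ConnectionLost:
            if attempt >= max_retries:
                raise  # give up and let the caller see the error
            attempt += 1
            reconnect()


# Simulated server (assumption for illustration): fails until reconnect is called.
state = {"connected": False, "reconnects": 0}


def fake_run(sql: str) -> List[dict]:
    if not state["connected"]:
        raise ConnectionLost("session expired")
    return [{"sql": sql}]


def fake_reconnect() -> None:
    state["connected"] = True
    state["reconnects"] += 1


result = execute_with_reconnect(fake_run, fake_reconnect, "SELECT count(*) FROM V")
```

Bounding the retries keeps a permanently unreachable server from turning into an endless loop.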

`set_configuration_parameters()`

Set configuration for OrientDB database client instance using configuration file or passed params.

`get_client() -> OrientDB`

Attempts to connect to the OrientDB client. This is currently done by using session tokens.

`__repr__()`

Represent the class.

`insert_data() -> Dict[str, int]` `abstractmethod`

Insert all generic data.

`update_interactions() -> int` `abstractmethod`

Insert all generic data.

`clear_and_import_data() -> Dict[str, int]`

Clears the associated table and inserts the data from raw downloaded data.

`create_index_rdbms(table_name: str, columns)`

Creates index on column(s) in RDBMS.

`clear_edges_by_bel_doc_rid(bel_document_rid: str, even_if_other_doc_rids_exists=True)`

Delete all edges linked to a specified BEL document rID.

`clear_documents() -> int`

Clear all document info. Returns number of deleted documents.

`get_number_of_bel_statements_by_document_rid(bel_document_rid: str) -> int`

Return BEL statement count with a given document ID.

`get_documents()`

Return all document info as pandas DataFrame.

`get_documents_as_dict()`

Return all document info as pandas DataFrame.

`add_keyword(keyword: str, description: str) -> pd.DataFrame`

Add a keyword and description used to tag BEL documents.

Parameters

keyword : str The name of a project the work is based on or type of work

description : str Detailed explanation of the keyword

`get_info_class(class_name)`

Return info about class.

`get_info_properties(class_name: str, short: bool = True)`

Get the property information for a specified table.

`entry_exists(class_name, **params) -> bool`

Check if an entry of class_name with the given parameters exists.

`query(sql: str) -> pd.DataFrame`

Return a pandas DataFrame results table.

`query_get_dict(sql: str) -> List[dict]`

Return list of dictionaries using a given SQL query.

`query_class(class_name: str, limit: int = 0, skip: int = 0, columns: Iterable[str] = None, with_rid=True, with_class=False, print_sql: bool = False, group_by: List[str] = None, distinct=False, as_dataframe: bool = False, where_list: Tuple[str] = (), **params) -> Union[List[dict], pd.DataFrame]`

Query class by params and return the results as a list of dicts or, if as_dataframe is True, as a pandas DataFrame.

`query_class_chunks(class_name: str, chunk_size: int = 10000, columns: Iterable[str] = None, with_rid=True, with_class=False, print_sql: bool = False, group_by: List[str] = None, distinct: bool = False)`

Query class by params and only return a set of results in batches. Creates a generator.
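Batched querying of this kind is commonly built on skip/limit paging wrapped in a generator. The sketch below illustrates that pattern only; `iter_chunks` and `fake_fetch` are hypothetical names, and an in-memory list replaces the database so the example runs on its own.

```python
from typing import Callable, Iterator, List


def iter_chunks(
    fetch_page: Callable[[int, int], List[dict]],
    chunk_size: int = 10000,
) -> Iterator[List[dict]]:
    """Yield successive pages until the source returns an empty page."""
    skip = 0
    while True:
        page = fetch_page(skip, chunk_size)
        if not page:
            return
        yield page
        skip += len(page)


# In-memory stand-in for a class with seven records (assumption for illustration).
records = [{"id": i} for i in range(7)]


def fake_fetch(skip: int, limit: int) -> List[dict]:
    # Mimics "SELECT ... SKIP <skip> LIMIT <limit>" on the list above.
    return records[skip:skip + limit]


chunks = list(iter_chunks(fake_fetch, chunk_size=3))
chunk_sizes = [len(chunk) for chunk in chunks]
```

Because the generator is lazy, only one chunk needs to be held in memory at a time when the caller iterates instead of calling `list`.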

`query_rid(rid, columns: list = None)`

Query specified columns of a given rID entry.

`download(url_dict: Dict[str, str] = None, biodb: str = None, expiration_days: int = 100) -> Dict[str, bool]`

Download url to file_path if not older than expiration_days.

`download_file(url: str, biodb: str, expiration_days: int = 100, addtional_header: dict = None) -> bool` `staticmethod`

Download file. Return True if the file had to be downloaded.
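One plausible reading of `expiration_days` is that a file is fetched again when it is missing or when its local copy is older than the given number of days. The sketch below encodes that assumption; `needs_download` is a hypothetical helper, not part of eBEL, and it uses the file modification time as the age.

```python
import os
import tempfile
import time

SECONDS_PER_DAY = 24 * 60 * 60


def needs_download(file_path: str, expiration_days: int = 100) -> bool:
    """Return True if the file is missing or older than expiration_days."""
    if not os.path.exists(file_path):
        return True
    age_seconds = time.time() - os.path.getmtime(file_path)
    return age_seconds > expiration_days * SECONDS_PER_DAY


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.tsv")
    missing = needs_download(path)  # no file yet

    with open(path, "w") as handle:
        handle.write("example")
    fresh = needs_download(path)  # file was just written

    # Pretend the file was last modified 200 days ago.
    old = time.time() - 200 * SECONDS_PER_DAY
    os.utime(path, (old, old))
    stale = needs_download(path)
```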

`index_exists(index_name: str)`

Check if index_name exists.

`create_index(index: OIndex)`

Create index.

`get_index_name(index: OIndex)`

Return index name.

`create_all_classes()`

Create all classes.

`create_all_indices()`

Create indices.

`create_indices(indices: Tuple[OIndex])`

Create indices.

`drop_all_indices()`

Drop indices.

`drop_indices(indices: List[OIndex])`

Drop indices.

`drop_index(index: OIndex)`

Drop index.

`create_node_classes()`

Create node classes.

`create_edge_classes()`

Create edge classes.

`create_generic_classes()`

Create generic classes.

`create_classes(oclasses: Tuple[OClass] = None)`

Create classes.

list of classes (odb_structure.OClass) OrientDB class v, e or g for vertex, edge or generic

`create_class(oclass: OClass, print_sql=False)`

Create class.

OrientDB class v, e or g for vertex, edge or generic

`create_class_property(class_name: str, prop: OProperty, print_sql: bool = False)`

Create OrientDB class property.

`class_exists(class_name: str) -> bool`

Check if OrientDB class exists.

`classes_exists(list_of_class_names)`

Check if list of OrientDB classes exists.

`drop_class(class_name: str)`

Drop the specified table.

`__drop_classes(classes: Iterable[OClass])`

Delete the classes in opposite order.

`drop_all_classes()`

Drop all classes.

`drop_generic_classes()`

Drop all generic classes.

`drop_node_classes()`

Drop all node classes.

`drop_edge_classes()`

Drop all edge classes.

`clear()`

Clear (delete entries) from all classes.

`is_abstract_class(class_name: str) -> bool`

Returns true if class is abstract.

`clear_edges() -> Dict[str, int]`

Delete all edges.

`clear_class(class_name)`

Delete all entries from class if exists.

`clear_nodes()`

Delete all nodes.

`clear_nodes_with_no_edges()`

Delete all nodes from a class with no edges.

`clear_generics()`

Delete all entries in generic tables.

`recreate_tables()`

Recreate SQLAlchemy tables in relational database.

`clear_nodes_and_edges()`

Delete all nodes and edges of a specific biodb.

`clear_all_nodes_and_edges()`

Delete all nodes and edges in the whole database.

`clear_exp_edges()`

Delete all DEA experiment associated edges.

`recreate()`

Recreate OrientDB collection.

`table_exists(table: Table)`

Checks if the table exists in RDBMS.

`__get_sql_where_part(params, where_list: Tuple[str] = ())` `staticmethod`

Return an ODB SQL WHERE part built from params.
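Building a WHERE part from keyword parameters plus a tuple of raw conditions might look like the sketch below. This is an illustration under assumptions, not the eBEL code: `build_where` is a hypothetical function, the string escaping is deliberately simplistic, and column names are trusted as given.

```python
from typing import Any, Dict, List, Tuple


def build_where(params: Dict[str, Any], where_list: Tuple[str, ...] = ()) -> str:
    """Combine equality conditions from params with raw conditions from where_list."""
    parts: List[str] = []
    for column, value in params.items():
        if value is None:
            parts.append(f"{column} IS NULL")
        elif isinstance(value, bool):  # check bool before int: bool is an int subclass
            parts.append(f"{column} = {str(value).lower()}")
        elif isinstance(value, (int, float)):
            parts.append(f"{column} = {value}")
        else:
            # Simplistic escaping of backslashes and single quotes (assumption).
            escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
            parts.append(f"{column} = '{escaped}'")
    parts.extend(where_list)
    return ("WHERE " + " AND ".join(parts)) if parts else ""


where_sql = build_where({"symbol": "APP", "taxid": 9606}, ("score > 0.5",))
empty_sql = build_where({})
```

Returning an empty string when there are no conditions lets the caller append the result to a `SELECT` statement unconditionally.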

`get_number_of_class(class_name, distinct_column_name: str = None, **params)`

Return count of unique values for a given class_name and column name.

`get_cluster_ids(class_name: str) -> list`

Get all cluster ids by class name.

`insert_record(class_name: str, value_dict: dict, print_sql=False) -> Optional[str]`

Insert new entry in class with values from dictionary. Returns rid.

`create_record(class_name: str, value_dict: dict) -> Optional[str]`

Create a record (insert into class_name) with the content of value_dict.

`update_record(class_name: str, value_dict: dict) -> str`

Update record with content of value_dict.

`edge_exists(class_name: str, from_rid: str, to_rid: str, value_dict: dict = {}) -> str`

Check if edge exists. Return rid if exists else None.

`node_exists(class_name: str, value_dict: dict = {}, check_for: Union[Iterable[str], str] = None, print_sql: bool = False) -> str`

Check if node exists. Return rid if exists else None.

`create_edge(class_name: str, from_rid: str, to_rid: str, value_dict: dict = {}, print_sql=False, if_not_exists=False, ignore_empty_values=False) -> str`

Create edge from from_rid(@rid) to to_rid(@rid) with content of value_dict.

`get_create_rid(class_name: str, value_dict: dict, check_for=None, print_sql=False) -> str`

Return class_name.@rid by value_dict. Create record/insert if not exists.
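The get-or-create behaviour can be sketched with an in-memory store. Everything here is hypothetical and for illustration only: `InMemoryStore`, its `find_rid` helper and the `#10:n` rIDs are made up, and the real method talks to OrientDB instead of a dict.

```python
from typing import Dict, Iterable, Optional, Tuple, Union


class InMemoryStore:
    """Hypothetical stand-in for a graph store; not the eBEL implementation."""

    def __init__(self) -> None:
        # Maps rid -> (class_name, stored values)
        self._records: Dict[str, Tuple[str, dict]] = {}

    def find_rid(
        self,
        class_name: str,
        value_dict: dict,
        check_for: Union[Iterable[str], str, None] = None,
    ) -> Optional[str]:
        """Return the rid of the first record matching on the check_for keys."""
        if check_for is None:
            keys = list(value_dict)  # compare on every provided key
        elif isinstance(check_for, str):
            keys = [check_for]
        else:
            keys = list(check_for)
        for rid, (stored_class, stored) in self._records.items():
            if stored_class == class_name and all(
                stored.get(key) == value_dict.get(key) for key in keys
            ):
                return rid
        return None

    def get_create_rid(self, class_name: str, value_dict: dict, check_for=None) -> str:
        """Return the rid of a matching record, inserting one first if none exists."""
        rid = self.find_rid(class_name, value_dict, check_for)
        if rid is None:
            rid = f"#10:{len(self._records)}"
            self._records[rid] = (class_name, dict(value_dict))
        return rid


store = InMemoryStore()
first = store.get_create_rid("protein", {"namespace": "HGNC", "name": "APP"})
second = store.get_create_rid("protein", {"namespace": "HGNC", "name": "APP"})
other = store.get_create_rid("protein", {"namespace": "HGNC", "name": "MAPT"})
```

Calling the method twice with the same values yields the same rID, which is what makes it safe to use inside import loops.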

`update_correlative_edges() -> List[str]`

Create a reverse edge for every correlative edge.
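Conceptually, a correlative relation is symmetric, so each such edge gets a counterpart pointing the other way. The sketch below shows the idea on plain tuples; `add_reverse_edges` and the edge class names are hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str, str]  # (from_rid, to_rid, edge_class)


def add_reverse_edges(edges: Set[Edge], correlative: Set[str]) -> List[Edge]:
    """Add the missing reverse edge for every correlative edge; return the new edges."""
    created: List[Edge] = []
    for from_rid, to_rid, edge_class in sorted(edges):
        reverse = (to_rid, from_rid, edge_class)
        if edge_class in correlative and reverse not in edges and reverse not in created:
            created.append(reverse)
    edges.update(created)
    return created


graph_edges: Set[Edge] = {
    ("#1:0", "#1:1", "positive_correlation"),  # already has its reverse
    ("#1:1", "#1:0", "positive_correlation"),
    ("#1:2", "#1:3", "negative_correlation"),  # reverse is missing
    ("#1:4", "#1:5", "increases"),             # directed, left untouched
}
new_edges = add_reverse_edges(
    graph_edges, {"positive_correlation", "negative_correlation"}
)
```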

`update_document_info()`

Update document metadata.

`update_pmcids() -> int`

Add PMC ID to bel_relation if one exists.

`update_pmids(edge_name='bel_relation')`

Update PMID metadata for all edges of the specified edge_name.

`import_dataframe(dataframe: pd.DataFrame, class_name: str, replace_nulls_with_nones: bool = True, standardize_column_names: bool = True, replace: bool = True) -> int`

Import dataframe into the OrientDB class named class_name.

`batch_insert(dataframe: pd.DataFrame, database: str, chunk_size: int = 100, desc: str = None, standardize_column_names: bool = False, replace: bool = True, replace_nulls_with_nones: bool = False) -> int`

Adds rows of a dataframe into specified generic table in batches.

Parameters

dataframe: pandas DataFrame A dataframe of information to be inserted into the generic table.

database: str Name of the generic table to be inserted into.

chunk_size: int (optional) Number of chunks to break the dataframe into for batching.

desc: str A description for tqdm about what is being iterated through.

standardize_column_names: bool If True (default=False), standardize column names.

replace: bool If True (default), content of dataframe replaces old data.

replace_nulls_with_nones: bool If True (default=False), replace numpy.nan with None (==null in OrientDB).

Returns

int Number of rows inserted into the table.
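The batching idea can be sketched without pandas or a database. In this illustration `iter_batches` and `batch_insert_rows` are hypothetical helpers, rows are plain dicts instead of a DataFrame, and `chunk_size` is treated as the number of rows per batch, which is an assumption about the parameter.

```python
from typing import Callable, Iterator, List


def iter_batches(rows: List[dict], chunk_size: int = 100) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most chunk_size rows."""
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


def batch_insert_rows(
    rows: List[dict],
    insert_many: Callable[[List[dict]], None],
    chunk_size: int = 100,
) -> int:
    """Insert rows batch by batch and return the number of rows inserted."""
    inserted = 0
    for batch in iter_batches(rows, chunk_size):
        insert_many(batch)
        inserted += len(batch)
    return inserted


# A list stands in for the target table (assumption for illustration).
table: List[dict] = []
source_rows = [{"n": i} for i in range(250)]
inserted_count = batch_insert_rows(source_rows, table.extend, chunk_size=100)
batch_sizes = [len(batch) for batch in iter_batches(source_rows, 100)]
```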

`get_set_gene_rids_by_position(chromosome: str, position: int, gene_types=['mapped', 'downstream', 'upstream']) -> Dict[str, List[str]]`

Return dictionary of mapped genes by chromosomal position.

ALERT: creates a new BEL HGNC gene if it does not exist.

`class_is_descendant_of(child_name: str, descendant_name: str) -> bool`

Return True if child_name is a child class of descendant_name.

`class_has_children(class_name) -> bool`

Checks if class_name has children that inherit from class_name.

`get_child_classes(class_name) -> List[str]`

Get list of child classes for given class_name.

`get_leaf_classes_of(class_name: str) -> List[str]`

Return list of children classes for the given class_name.
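The difference between direct children and leaf classes is easiest to see on a small hierarchy. The sketch below is illustrative only: `get_leaf_classes` is a hypothetical function, the class names are invented, and the hierarchy is a plain child-to-parent mapping instead of OrientDB schema metadata. It assumes that the queried class itself is never reported as a leaf.

```python
from typing import Dict, List, Optional


def get_leaf_classes(parents: Dict[str, Optional[str]], class_name: str) -> List[str]:
    """Return all descendants of class_name that have no children of their own."""
    # Invert the child -> parent mapping into parent -> children.
    children: Dict[str, List[str]] = {}
    for child, parent in parents.items():
        if parent is not None:
            children.setdefault(parent, []).append(child)

    leaves: List[str] = []
    stack = [class_name]
    while stack:
        current = stack.pop()
        kids = children.get(current, [])
        if not kids and current != class_name:
            leaves.append(current)
        stack.extend(kids)
    return sorted(leaves)


# Hypothetical hierarchy: child -> parent
hierarchy = {
    "bel": None,
    "genetic_flow": "bel",
    "abundance": "bel",
    "gene": "genetic_flow",
    "rna": "genetic_flow",
    "protein": "genetic_flow",
}
leaves_of_bel = get_leaf_classes(hierarchy, "bel")
leaves_of_flow = get_leaf_classes(hierarchy, "genetic_flow")
leaves_of_gene = get_leaf_classes(hierarchy, "gene")
```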

`insert() -> Dict[str, int]`

Check whether files to download are missing or the generic table is empty. If so, insert data.

`update() -> None`

Check generics and update BEL interactions.

`update_bel() -> None`

Delete and update all class specific edges.

`delete_nodes_with_no_edges(class_name=None) -> int`

Delete all nodes without any edges.

`get_pure_symbol_rids_dict_in_bel_context(class_name='protein', namespace='HGNC') -> Dict[str, str]`

Return dictionary with HGNC names as key and OrientDB @rid as value.

Applies to all pure nodes in graph with class name directly or indirectly involved in BEL stmt. This method could be helpful to avoid graph explosion.

`get_pure_uniprots_in_bel_context() -> Set[str]`

Return a set of all UniProt accessions in BEL annotation context.

`get_pure_symbol_rid_df_in_bel_context(class_name='protein', namespace='HGNC') -> pd.DataFrame`

Return DataFrame with gene symbols and node rIDs.

`get_pure_symbol_rids_dict(class_name='protein', namespace='HGNC') -> Dict[str, str]`

Return dictionary with protein name as keys and node rIDs as values.

`get_pure_rid_by_uniprot(uniprot: str)`

Get rIDs of node based on UniProt ID.

`get_pure_uniprot_rid_dict_in_bel_context() -> Dict[str, str]`

Return dictionary with UniProt accession id as key and OrientDB @rid as value.

Applies to all pure nodes in graph with class name directly or indirectly involved in BEL statement. This method could be helpful to avoid graph explosion.

`get_pure_uniprot_rids_dict()`

Return dictionary with UniProt IDs as keys and node rIDs as values.
