OrientDB Interface Methods
ebel.manager.orientdb.odb_meta.Graph
Bases: ABC
Generic parent class for BioDBs.
classes: Tuple[OClass, ...]
property
Return generic, node and edge classes as a tuple of OClass.
number_of_generics: Dict[str, int]
property
Return the number of entries in OrientDB classes and RDB tables. Tables have priority.
number_of_nodes
property
Return node count.
number_of_edges
property
Return edge count.
__init__(generics: Tuple[Generic] = (), nodes: Tuple[Node] = (), edges: Tuple[Edge] = (), indices: Tuple[OIndex] = (), urls: dict = None, biodb_name: str = '', tables_base=None, config_params: Optional[dict] = None, overwrite_config: bool = False)
Init method.
__config_params_check(overwrite_config: bool = False)
Go through passed/available configuration params.
execute(command_str: str) -> List[OrientRecord]
Execute a command directly in the OrientDB server.
Parameters
command_str: str The SQL to be executed
Raises
PyOrientCommandException Caused by a disconnect from the ODB server. eBEL will try to reconnect if possible.
Returns
ODB response.
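The reconnect behaviour mentioned under Raises can be pictured as a small retry wrapper. This is only a sketch of the general pattern, not eBEL's implementation: `ConnectionLost`, `run_with_reconnect`, `send` and `reconnect` are hypothetical names, and a real client would catch the driver's own exception instead.

```python
from typing import Callable, List


class ConnectionLost(Exception):
    """Hypothetical stand-in for the driver's disconnect error."""


def run_with_reconnect(
    send: Callable[[str], List[dict]],
    reconnect: Callable[[], None],
    command_str: str,
    max_retries: int = 1,
) -> List[dict]:
    """Run a command; on a lost connection, reconnect and retry."""
    attempts = 0
    while True:
        try:
            return send(command_str)
        except ConnectionLost:
            if attempts >= max_retries:
                raise  # give up once the retry budget is used
            attempts += 1
            reconnect()  # re-establish the session, then try again
```

With `max_retries=0` the error is re-raised immediately, which keeps the failure visible to the caller.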
set_configuration_parameters()
Set configuration for OrientDB database client instance using configuration file or passed params.
get_client() -> OrientDB
Attempts to connect to the OrientDB client. This is currently done by using session tokens.
__repr__()
Represent the class.
insert_data() -> Dict[str, int]
abstractmethod
Insert all generic data.
update_interactions() -> int
abstractmethod
Insert all generic data.
clear_and_import_data() -> Dict[str, int]
Clears the associated table and inserts the data from raw downloaded data.
create_index_rdbms(table_name: str, columns)
Creates index on column(s) in RDBMS.
clear_edges_by_bel_doc_rid(bel_document_rid: str, even_if_other_doc_rids_exists=True)
Delete all edges linked to a specified BEL document rID.
clear_documents() -> int
Clear all document info. Returns number of deleted documents.
get_number_of_bel_statements_by_document_rid(bel_document_rid: str) -> int
Return BEL statement count with a given document ID.
get_documents()
Return all document info as pandas DataFrame.
get_documents_as_dict()
Return all document info as a dictionary.
add_keyword(keyword: str, description: str) -> pd.DataFrame
Add a keyword and description used to tag BEL documents.
Parameters
keyword: str The name of a project the work is based on or type of work
description: str Detailed explanation of the keyword
get_info_class(class_name)
Return info about class.
get_info_properties(class_name: str, short: bool = True)
Get the property information for a specified table.
entry_exists(class_name, **params) -> bool
Check if an entry in class_name with the given parameters exists.
query(sql: str) -> pd.DataFrame
Return a pandas DataFrame results table.
query_get_dict(sql: str) -> List[dict]
Return list of dictionaries using a given SQL query.
query_class(class_name: str, limit: int = 0, skip: int = 0, columns: Iterable[str] = None, with_rid=True, with_class=False, print_sql: bool = False, group_by: List[str] = None, distinct=False, as_dataframe: bool = False, where_list: Tuple[str] = (), **params) -> Union[List[dict], pd.DataFrame]
Query class by params and return the results as a list of dictionaries or, if as_dataframe is True, as a pandas DataFrame.
query_class_chunks(class_name: str, chunk_size: int = 10000, columns: Iterable[str] = None, with_rid=True, with_class=False, print_sql: bool = False, group_by: List[str] = None, distinct: bool = False)
Query class by params and return the results in batches of chunk_size. Creates a generator.
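A generator that returns results in batches can be sketched as a paging loop. The example below assumes SKIP/LIMIT-style paging through a hypothetical `fetch_page(skip, limit)` callable; whether eBEL pages this way internally is not stated here.

```python
from typing import Callable, Iterator, List


def iter_chunks(
    fetch_page: Callable[[int, int], List[dict]],
    chunk_size: int = 10000,
) -> Iterator[List[dict]]:
    """Yield successive pages until the source returns an empty page."""
    skip = 0
    while True:
        # A real fetch_page might run "... SKIP {skip} LIMIT {chunk_size}".
        page = fetch_page(skip, chunk_size)
        if not page:
            return
        yield page
        skip += len(page)
```

Because the pages are produced lazily, only one chunk needs to be held in memory at a time.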
query_rid(rid, columns: list = None)
Query specified columns of a given rID entry.
download(url_dict: Dict[str, str] = None, biodb: str = None, expiration_days: int = 100) -> Dict[str, bool]
Download url to file_path if not older than expiration_days.
download_file(url: str, biodb: str, expiration_days: int = 100, addtional_header: dict = None) -> bool
staticmethod
Download file. Returns True if the file had to be downloaded.
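The role of expiration_days can be illustrated with a simple file-age check. This sketch assumes a file is fetched again when it is missing or its modification time is older than expiration_days; `needs_download` is a hypothetical helper, not part of eBEL.

```python
import os
import time


def needs_download(file_path: str, expiration_days: int = 100) -> bool:
    """Return True if the file is missing or older than expiration_days."""
    if not os.path.exists(file_path):
        return True
    age_seconds = time.time() - os.path.getmtime(file_path)
    return age_seconds > expiration_days * 24 * 60 * 60
```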
index_exists(index_name: str)
Check if index_name exists.
create_index(index: OIndex)
Create index.
get_index_name(index: OIndex)
Return index name.
create_all_classes()
Create all classes.
create_all_indices()
Create indices.
create_indices(indices: Tuple[OIndex])
Create indices.
drop_all_indices()
Drop indices.
drop_indices(indices: List[OIndex])
Drop indices.
drop_index(index: OIndex)
Drop index.
create_node_classes()
Create node classes.
create_edge_classes()
Create edge classes.
create_generic_classes()
Create generic classes.
create_classes(oclasses: Tuple[OClass] = None)
Create classes.
Parameters
oclasses: Tuple[OClass] List of classes (odb_structure.OClass); each is an OrientDB class of type v, e or g for vertex, edge or generic
create_class(oclass: OClass, print_sql=False)
Create class.
Parameters
oclass: OClass OrientDB class of type v, e or g for vertex, edge or generic
create_class_property(class_name: str, prop: OProperty, print_sql: bool = False)
Create OrientDB class property.
class_exists(class_name: str) -> bool
Check if OrientDB class exists.
classes_exists(list_of_class_names)
Check if list of OrientDB classes exists.
drop_class(class_name: str)
Drop the specified table.
__drop_classes(classes: Iterable[OClass])
Delete the classes in opposite order.
drop_all_classes()
Drop all classes.
drop_generic_classes()
Drop all generic classes.
drop_node_classes()
Drop all node classes.
drop_edge_classes()
Drop all edge classes.
clear()
Clear (delete entries) from all classes.
is_abstract_class(class_name: str) -> bool
Returns true if class is abstract.
clear_edges() -> Dict[str, int]
Delete all edges.
clear_class(class_name)
Delete all entries from class if exists.
clear_nodes()
Delete all nodes.
clear_nodes_with_no_edges()
Delete all nodes from a class with no edges.
clear_generics()
Delete all entries in generic tables.
recreate_tables()
Recreate SQLAlchemy tables in relational database.
clear_nodes_and_edges()
Delete all nodes and edges of a specific biodb.
clear_all_nodes_and_edges()
Delete all nodes and edges in the whole database.
clear_exp_edges()
Delete all DEA experiment associated edges.
recreate()
Recreate OrientDB collection.
table_exists(table: Table)
Checks if the table exists in RDBMS.
__get_sql_where_part(params, where_list: Tuple[str] = ())
staticmethod
Return an ODB SQL WHERE clause built from params.
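How keyword parameters and a where_list might be combined into one WHERE clause can be sketched as follows. `build_where` is a hypothetical function; the quoting and backslash-escaping rules are assumptions chosen for illustration and may differ from what eBEL generates.

```python
from typing import Any, Dict, Tuple


def build_where(params: Dict[str, Any], where_list: Tuple[str, ...] = ()) -> str:
    """Build a WHERE clause from equality params plus raw conditions."""
    conditions = []
    for column, value in params.items():
        if value is None:
            conditions.append(f"{column} IS NULL")
        elif isinstance(value, bool):  # check bool before int: bool is an int subclass
            conditions.append(f"{column} = {str(value).lower()}")
        elif isinstance(value, (int, float)):
            conditions.append(f"{column} = {value}")
        else:
            # Escape backslashes first, then single quotes.
            escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
            conditions.append(f"{column} = '{escaped}'")
    conditions.extend(where_list)  # raw conditions are appended unchanged
    return " WHERE " + " AND ".join(conditions) if conditions else ""
```

All conditions are joined with AND, and an empty input yields an empty string so the result can be appended to a SELECT unconditionally.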
get_number_of_class(class_name, distinct_column_name: str = None, **params)
Return count of unique values for a given class_name and column name.
get_cluster_ids(class_name: str) -> list
Get all cluster ids by class name.
insert_record(class_name: str, value_dict: dict, print_sql=False) -> Optional[str]
Insert new entry in class with values from dictionary. Returns rid.
create_record(class_name: str, value_dict: dict) -> Optional[str]
Create a record in (insert into) class_name with the content of value_dict.
update_record(class_name: str, value_dict: dict) -> str
Update record with content of value_dict.
edge_exists(class_name: str, from_rid: str, to_rid: str, value_dict: dict = {}) -> str
Check if edge exists. Return rid if exists else None.
node_exists(class_name: str, value_dict: dict = {}, check_for: Union[Iterable[str], str] = None, print_sql: bool = False) -> str
Check if node exists. Return rid if exists else None.
create_edge(class_name: str, from_rid: str, to_rid: str, value_dict: dict = {}, print_sql=False, if_not_exists=False, ignore_empty_values=False) -> str
Create edge from from_rid(@rid) to to_rid(@rid) with content of value_dict.
get_create_rid(class_name: str, value_dict: dict, check_for=None, print_sql=False) -> str
Return class_name.@rid by value_dict. Create record/insert if not exists.
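The "return the rID, creating the record if needed" behaviour is a get-or-create pattern. The sketch below uses a hypothetical in-memory stand-in (`InMemoryClass`) with made-up rIDs instead of a database, and assumes check_for restricts which keys are compared.

```python
from typing import Dict, Iterable, Optional, Union


class InMemoryClass:
    """Hypothetical stand-in for one OrientDB class, keyed by fake rIDs."""

    def __init__(self) -> None:
        self.records: Dict[str, dict] = {}

    def find(
        self,
        value_dict: dict,
        check_for: Optional[Union[Iterable[str], str]] = None,
    ) -> Optional[str]:
        """Return the rID of the first record matching the compared keys."""
        if check_for is None:
            keys = list(value_dict)
        elif isinstance(check_for, str):
            keys = [check_for]
        else:
            keys = list(check_for)
        for rid, record in self.records.items():
            if all(record.get(k) == value_dict.get(k) for k in keys):
                return rid
        return None

    def get_create_rid(self, value_dict: dict, check_for=None) -> str:
        """Return the rID of a matching record, inserting one if none exists."""
        rid = self.find(value_dict, check_for)
        if rid is None:
            rid = f"#1:{len(self.records)}"  # made-up rID for illustration
            self.records[rid] = dict(value_dict)
        return rid
```

Calling `get_create_rid` twice with the same values returns the same rID, so repeated imports do not duplicate nodes.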
update_correlative_edges() -> List[str]
Create a reverse edge for every correlative edge.
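Adding reverse edges can be pictured on a plain set of (from_rid, to_rid) pairs. This is a conceptual sketch only: `add_reverse_edges` is a hypothetical helper, and it assumes a reverse edge is created only when it does not already exist.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str]


def add_reverse_edges(edges: Set[Edge]) -> List[Edge]:
    """Add the missing (to, from) edge for every (from, to) edge.

    Returns the newly created edges in sorted order.
    """
    created = sorted((dst, src) for src, dst in edges if (dst, src) not in edges)
    edges.update(created)
    return created
```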
update_document_info()
Update document metadata.
update_pmcids() -> int
Add PMC ID to bel_relation if one exists.
update_pmids(edge_name='bel_relation')
Update PMID metadata for all edges of the specified edge_name.
import_dataframe(dataframe: pd.DataFrame, class_name: str, replace_nulls_with_nones: bool = True, standardize_column_names: bool = True, replace: bool = True) -> int
Import dataframe into the OrientDB class with the given name.
batch_insert(dataframe: pd.DataFrame, database: str, chunk_size: int = 100, desc: str = None, standardize_column_names: bool = False, replace: bool = True, replace_nulls_with_nones: bool = False) -> int
Adds rows of a dataframe into specified generic table in batches.
Parameters
dataframe: pandas DataFrame A dataframe of information to be inserted into the generic table.
database: str Name of the generic table to be inserted into.
chunk_size: int (optional) Number of chunks to break the dataframe into for batching.
desc: str A description for tqdm about what is being iterated through.
standardize_column_names: bool If True (default=False), standardize column names.
replace: bool If True (default), content of dataframe replaces old data.
replace_nulls_with_nones: bool If True (default=False), replace numpy.nan with None (==null in OrientDB).
Returns
int Number of rows inserted into the table.
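Batched insertion can be sketched without a database by slicing rows into groups. The example uses plain dictionaries rather than a DataFrame to stay dependency-free, and it assumes chunk_size means rows per batch, which is an interpretation and not something the parameter description confirms. `iter_batches` is a hypothetical helper.

```python
import math
from typing import Iterator, List


def iter_batches(
    rows: List[dict],
    chunk_size: int = 100,
    replace_nulls_with_nones: bool = False,
) -> Iterator[List[dict]]:
    """Yield rows in batches of at most chunk_size rows."""
    for start in range(0, len(rows), chunk_size):
        batch = rows[start:start + chunk_size]
        if replace_nulls_with_nones:
            # Replace float NaN with None so it can be stored as null.
            batch = [
                {
                    key: None if isinstance(value, float) and math.isnan(value) else value
                    for key, value in row.items()
                }
                for row in batch
            ]
        yield batch
```

Each yielded batch could then be sent as one insert command, with the total row count accumulated across batches.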
get_set_gene_rids_by_position(chromosome: str, position: int, gene_types=['mapped', 'downstream', 'upstream']) -> Dict[str, List[str]]
Return dictionary of mapped genes by chromosomal position.
ALERT: creates a new BEL HGNC gene if it does not exist.
class_is_descendant_of(child_name: str, descendant_name: str) -> bool
Returns True if child_name is a child class of descendant_name.
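An inheritance check like this amounts to walking up the superclass chain. The sketch below assumes single inheritance and a hypothetical `parents` mapping from class name to superclass name; the class names in the usage note are illustrative only.

```python
from typing import Dict, Optional


def is_descendant_of(
    parents: Dict[str, Optional[str]],
    child_name: str,
    ancestor_name: str,
) -> bool:
    """Return True if ancestor_name appears above child_name in the hierarchy."""
    seen = set()
    current = parents.get(child_name)
    while current is not None and current not in seen:
        if current == ancestor_name:
            return True
        seen.add(current)  # guard against accidental cycles in the mapping
        current = parents.get(current)
    return False
```

For a mapping such as `{"protein": "genetic_flow", "genetic_flow": "bel", "bel": None}`, `is_descendant_of(parents, "protein", "bel")` is True, while a class is not treated as its own descendant.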
class_has_children(class_name) -> bool
Checks if class_name has children that inherit from class_name.
get_child_classes(class_name) -> List[str]
Get list of child classes for given class_name.
get_leaf_classes_of(class_name: str) -> List[str]
Return list of children classes for the given class_name.
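Collecting leaf classes is a tree traversal that keeps only classes without children. The sketch below assumes a hypothetical `children` mapping from class name to direct subclasses and returns an empty list when class_name itself has no children; eBEL's behaviour in that case is not specified here.

```python
from typing import Dict, List


def get_leaf_classes(children: Dict[str, List[str]], class_name: str) -> List[str]:
    """Return all descendants of class_name that have no children themselves."""
    leaves: List[str] = []
    # Reverse so that popping from the end visits children in listed order.
    stack = list(reversed(children.get(class_name, [])))
    while stack:
        current = stack.pop()
        sub_classes = children.get(current, [])
        if sub_classes:
            stack.extend(reversed(sub_classes))
        else:
            leaves.append(current)
    return leaves
```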
insert() -> Dict[str, int]
Check if files missing for download or generic table empty. If True then insert data.
update() -> None
Check generics and update BEL interactions.
update_bel() -> None
Delete and update all class specific edges.
delete_nodes_with_no_edges(class_name=None) -> int
Delete all nodes without any edges.
get_pure_symbol_rids_dict_in_bel_context(class_name='protein', namespace='HGNC') -> Dict[str, str]
Return dictionary with HGNC names as key and OrientDB @rid as value.
Applies to all pure nodes in graph with class name directly or indirectly involved in BEL stmt. This method could be helpful to avoid graph explosion.
get_pure_uniprots_in_bel_context() -> Set[str]
Returns a set of all UniProt accessions in BEL annotation context.
get_pure_symbol_rid_df_in_bel_context(class_name='protein', namespace='HGNC') -> pd.DataFrame
Return a pandas DataFrame of gene symbols and their node rIDs.
get_pure_symbol_rids_dict(class_name='protein', namespace='HGNC') -> Dict[str, str]
Return dictionary with protein name as keys and node rIDs as values.
get_pure_rid_by_uniprot(uniprot: str)
Get rIDs of node based on UniProt ID.
get_pure_uniprot_rid_dict_in_bel_context() -> Dict[str, str]
Return dictionary with UniProt accession id as key and OrientDB @rid as value.
Applies to all pure nodes in graph with class name directly or indirectly involved in BEL statement. This method could be helpful to avoid graph explosion.
get_pure_uniprot_rids_dict()
Return dictionary with UniProt IDs as keys and node rIDs as values.