Controlled Vocabulary Objects
psims uses controlled vocabularies to refer to externally controlled and organized terms that describe the entities being written about in the file formats it produces. These domain-specific vocabularies can be updated independently of the file schemas, which allows faster update and maintenance cycles.
The ControlledVocabulary type represents a parsed and interpreted controlled vocabulary, a collection of Entity objects.
- class psims.controlled_vocabulary.controlled_vocabulary.ControlledVocabulary(terms, id=None, metadata=None, version=None, name=None, import_resolver: Optional[Callable[[str], psims.controlled_vocabulary.controlled_vocabulary.ControlledVocabulary]] = None)
A Controlled Vocabulary is a collection of terms or entities with controlled meanings and semantics.
This object makes entities resolvable by name, accession number, or synonym.
This object implements the Mapping protocol.
- version
A string describing the version of this controlled vocabulary. Not all vocabularies are versioned the same way, so this value is not interpreted further automatically.
- Type
- id
An identifier for this controlled vocabulary that is unique within a particular context.
- Type
- classmethod from_obo(handle, **kwargs)
Construct a new instance from an OBO format stream.
- Parameters
handle (file-like) – A file-like object over OBO-format data.
- Returns
- Return type
- Raises
ValueError – When the controlled vocabulary produced contains no terms.
- names()
A key-view over all the names in this controlled vocabulary, distinct from accessions.
- Returns
- Return type
collections.KeysView
- query(key)
Search for a term whose id, name, or synonym matches key.
This search is case-insensitive, but an exact-case match is preferred when one exists.
- Parameters
key (str) – The key to look up.
- Returns
term – The found entity, if any.
- Return type
- Raises
KeyError – If no term in this vocabulary matches key.
See also
search, __getitem__
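The lookup behaviour described for query() can be illustrated with a small, self-contained sketch. ToyVocabulary, its term dictionaries, and the TOY: accessions are hypothetical and are not part of psims; the sketch only assumes that an exact-case match wins over a case-insensitive one, as the description above suggests.

```python
from collections.abc import Mapping


class ToyVocabulary(Mapping):
    """Hypothetical stand-in for a controlled vocabulary (not psims code)."""

    def __init__(self, terms):
        terms = list(terms)
        self._by_id = {term["id"]: term for term in terms}
        self._exact = {}   # exact-case key -> term
        self._folded = {}  # lower-cased key -> term
        for term in terms:
            keys = [term["id"], term["name"], *term.get("synonyms", [])]
            for key in keys:
                # setdefault keeps the first term registered under a key
                self._exact.setdefault(key, term)
                self._folded.setdefault(key.lower(), term)

    # Mapping protocol, keyed by accession
    def __getitem__(self, key):
        return self._by_id[key]

    def __iter__(self):
        return iter(self._by_id)

    def __len__(self):
        return len(self._by_id)

    def query(self, key):
        # Assumption: try the exact-case key first, then ignore case.
        if key in self._exact:
            return self._exact[key]
        try:
            return self._folded[key.lower()]
        except KeyError:
            raise KeyError(key) from None


# Made-up terms; the two names differ only in case.
cv = ToyVocabulary([
    {"id": "TOY:0001", "name": "scan", "synonyms": ["spectrum scan"]},
    {"id": "TOY:0002", "name": "Scan"},
])

exact_hit = cv.query("Scan")            # exact-case match on a name
folded_hit = cv.query("SPECTRUM SCAN")  # case-insensitive match on a synonym
```

Keeping two indexes makes the preference explicit: the case-folded index is consulted only after the exact-case index misses.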
Caching
psims accesses controlled vocabularies from the internet to retrieve the most up-to-date version of each vocabulary. If an internet connection is unavailable, it will fall back to a vendored copy of a specific version of each controlled vocabulary bundled with psims at build time.
Additionally, an application might choose to save a copy of each required controlled vocabulary file on the file system in a specific location. This can be accomplished with the psims.controlled_vocabulary.controlled_vocabulary.obo_cache object, an instance of the OBOCache type.
Setting cache_path specifies the directory in which to cache files, and setting enabled toggles whether or not the cache is used. If the cache is enabled and a copy of the controlled vocabulary is not in the cache, a new copy will be downloaded, or loaded from the vendored copy if the download is unavailable, and written to the cache directory for future re-use.
If a library wants to create its own separate cache directory, it can create a new instance of OBOCache and configure it separately. This custom cache instance can be passed to all XML file writing classes as the vocabulary_resolver parameter.
Note
OBOCache has two behavioral switches that interact:
- OBOCache.enabled: When this is True, files from the cache directory will be used and new files will be added to the cache directory. Otherwise, a new copy of each CV file will be requested when accessing a vocabulary.
- OBOCache.use_remote: When this is True, new copies of CV files will be requested over the network, falling back to the packaged copy in psims only when the network request fails. Otherwise, the packaged copy will be used automatically.
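How the two switches combine can be sketched as a small decision function. This is an illustration under stated assumptions, not the psims implementation: load_vocabulary, the dict used in place of the cache directory, and the fetch_remote and load_packaged callables are all hypothetical names introduced here.

```python
def load_vocabulary(name, cache, enabled, use_remote, fetch_remote, load_packaged):
    """Return (text, origin) for the vocabulary file called name.

    Hypothetical sketch: cache is a dict standing in for the cache
    directory; fetch_remote and load_packaged are caller-supplied callables.
    """
    # enabled=True: reuse a cached copy when one exists.
    if enabled and name in cache:
        return cache[name], "cache"

    text, origin = None, None
    if use_remote:
        try:
            text, origin = fetch_remote(name), "remote"
        except OSError:
            # Assumption: a failed network request surfaces as OSError.
            pass
    if text is None:
        # use_remote=False, or the network request failed.
        text, origin = load_packaged(name), "packaged"

    # enabled=True: store the new copy for future re-use.
    if enabled:
        cache[name] = text
    return text, origin


def offline(name):
    raise OSError("network unreachable")


def online(name):
    return f"remote copy of {name}"


def packaged(name):
    return f"packaged copy of {name}"


cache = {}
first = load_vocabulary("toy.obo", cache, True, True, offline, packaged)
second = load_vocabulary("toy.obo", cache, True, True, offline, packaged)
uncached = load_vocabulary("other.obo", {}, False, True, online, packaged)
local_only = load_vocabulary("other.obo", {}, False, False, online, packaged)
```

In this sketch the first call falls back to the packaged copy because the simulated network is down, and the second call is served from the cache that the first call populated.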
- class psims.controlled_vocabulary.controlled_vocabulary.OBOCache(cache_path='.obo_cache', enabled=True, resolvers=None, use_remote=True, user_agent_emulation=True)
A cache for retrieved ontology sources stored on the file system, and an abstraction layer to make registered controlled vocabularies constructable from a URI even if they are not in the same format.
- resolvers
A mapping from ontology URL to a function which will be called instead of opening the URL to retrieve the ControlledVocabulary object. A resolver is any callable that takes an OBOCache instance as its only argument.
- Type
- use_remote
Whether or not to try to access remote repositories over the network to retrieve controlled vocabularies. If not, either the cached copy or the packaged fallback copy is used automatically.
- Type
- user_agent_emulation
Whether or not to try to emulate a web browser’s user agent when trying to download a controlled vocabulary.
- Type
- fallback(uri)
Obtain a stream for the vocabulary specified by uri from the packaged bundle distributed with psims.
- Parameters
uri (str) – The URI to retrieve a fallback stream for.
- Returns
result – Returns a backup stream, or None if no fallback exists.
- Return type
file-like or None
- path_for(name, setext=False)
Construct a path for a given controlled vocabulary file in the cache on the file system.
Note
If the cache directory does not exist, this will create it.
- resolve(uri)
Get a readable file-like object for the controlled vocabulary referred to by uri.
If uri has a custom resolver, as reported by has_custom_resolver(), the custom resolver function will be called instead.
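The dispatch between custom resolvers and the default path can be pictured with a toy class. Everything here is hypothetical: ToyResolverTable and open_default are illustration-only names, has_custom_resolver is assumed to take the URI as its argument, the example URL is a placeholder, and the resolver is assumed to return a readable stream. Only the rule that a resolver receives the cache instance as its single argument comes from the description above.

```python
import io


class ToyResolverTable:
    """Hypothetical stand-in for the resolver lookup (not psims code)."""

    def __init__(self, resolvers=None):
        # Mapping from URI to a callable taking this object as its only argument
        self.resolvers = dict(resolvers or {})

    def has_custom_resolver(self, uri):
        # Assumption: a URI has a custom resolver if it is a key in the mapping.
        return uri in self.resolvers

    def open_default(self, uri):
        # Placeholder for the usual cache / network / packaged-copy path.
        return io.StringIO(f"default stream for {uri}")

    def resolve(self, uri):
        if self.has_custom_resolver(uri):
            # The resolver is called with the cache instance, not the URI.
            return self.resolvers[uri](self)
        return self.open_default(uri)


def custom_resolver(cache):
    # A resolver may inspect the cache it was handed; here it only
    # returns a fixed in-memory stream.
    return io.StringIO("custom stream")


table = ToyResolverTable({"http://example.org/toy.obo": custom_resolver})
custom_text = table.resolve("http://example.org/toy.obo").read()
default_text = table.resolve("http://example.org/other.obo").read()
```

Passing the cache instance rather than the URI lets a resolver reuse the cache's own settings, which is presumably why the signature is defined that way.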
Semantic Data
Terms in a controlled vocabulary define entities, categories, properties and relationships between them. The Entity type is how these are represented in memory.
- class psims.controlled_vocabulary.entity.Entity(vocabulary=None, **attributes)
Represent a term in a controlled vocabulary.
While this type implements the Mapping protocol, it also supports attribute access notation for keys.
- vocabulary
The source vocabulary. May be used for upward references.
- Type
- is_of_type(tp: Union[str, psims.controlled_vocabulary.entity.Entity]) -> bool
Test if tp is an ancestor of this Entity.
- parent() -> Union[None, psims.controlled_vocabulary.entity.Entity, List[psims.controlled_vocabulary.entity.Entity]]
Fetch the parent or parents of this Entity in the bound controlled vocabulary.
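The combination of Mapping behaviour, attribute access, and ancestor checks can be sketched with a toy class. ToyEntity, the is_a attribute holding parent accessions, the plain dict used as the vocabulary, and the TOY: accessions are assumptions made for this illustration; the sketch matches ancestors by accession only and is not how psims implements Entity.

```python
from collections.abc import Mapping


class ToyEntity(Mapping):
    """Hypothetical stand-in for a vocabulary term (not psims code)."""

    def __init__(self, vocabulary=None, **attributes):
        self.vocabulary = vocabulary
        self._data = dict(attributes)

    # Mapping protocol over the term's attributes
    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self.__dict__["_data"][name]
        except KeyError:
            raise AttributeError(name) from None

    def parent(self):
        # Assumption: parents are listed by accession under "is_a".
        parents = [self.vocabulary[acc] for acc in self._data.get("is_a", [])]
        if not parents:
            return None
        return parents[0] if len(parents) == 1 else parents

    def is_of_type(self, tp):
        # Accept an accession string or another entity.
        target = tp if isinstance(tp, str) else tp["id"]
        stack = list(self._data.get("is_a", []))
        seen = set()
        while stack:
            current = stack.pop()
            if current == target:
                return True
            if current not in seen:
                seen.add(current)
                stack.extend(self.vocabulary[current]._data.get("is_a", []))
        return False


# A made-up three-level hierarchy stored in a plain dict.
vocab = {}
vocab["TOY:1"] = ToyEntity(vocab, id="TOY:1", name="instrument")
vocab["TOY:2"] = ToyEntity(vocab, id="TOY:2", name="mass spectrometer", is_a=["TOY:1"])
vocab["TOY:3"] = ToyEntity(vocab, id="TOY:3", name="ion trap", is_a=["TOY:2"])

leaf = vocab["TOY:3"]
same_value = leaf["name"] == leaf.name      # key access and attribute access agree
is_instrument = leaf.is_of_type("TOY:1")    # ancestor two levels up
```

The walk in is_of_type follows the upward references through the bound vocabulary, which is why each entity keeps a handle on its source vocabulary.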