Knowledge assembly modules (indra_world.assembly)

Statement preprocessing (indra_world.assembly.preprocess)

indra_world.assembly.preprocess.preprocess_statements(raw_statements, steps)[source]

Run a preprocessing pipeline on raw statements.

Parameters:
  • raw_statements (List[Statement]) – A list of INDRA Statements to preprocess.

  • steps (List[Dict[str, Any]]) – A list of AssemblyPipeline steps that define the steps of preprocessing.

Returns:

A list of preprocessed INDRA Statements.

Return type:

List[Statement]
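As a rough illustration of the shape of steps, the sketch below runs a list of step dicts over plain strings that stand in for Statements. The "function" and "kwargs" keys follow the step layout used by INDRA's AssemblyPipeline. The REGISTRY dict, the two step functions, and run_steps are hypothetical stand-ins written for this example and are not part of indra_world.

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry of step functions, used only in this sketch.
REGISTRY: Dict[str, Callable[..., List[str]]] = {}


def register(fun):
    """Make a function available to steps under its own name."""
    REGISTRY[fun.__name__] = fun
    return fun


@register
def drop_short(stmts: List[str], min_len: int = 3) -> List[str]:
    """Keep only items with at least min_len characters."""
    return [s for s in stmts if len(s) >= min_len]


@register
def lowercase(stmts: List[str]) -> List[str]:
    """Lowercase every item."""
    return [s.lower() for s in stmts]


def run_steps(raw: List[str], steps: List[Dict[str, Any]]) -> List[str]:
    """Apply each step in order, feeding one step's output to the next."""
    stmts = raw
    for step in steps:
        fun = REGISTRY[step["function"]]
        stmts = fun(stmts, **step.get("kwargs", {}))
    return stmts


steps = [
    {"function": "drop_short", "kwargs": {"min_len": 4}},
    {"function": "lowercase"},
]
result = run_steps(["Rain", "to", "Flood"], steps)
```

Because each step is plain data, a pipeline defined this way can be stored as JSON and passed around without importing the functions it names.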

Assembly operations (indra_world.assembly.operations)

class indra_world.assembly.operations.CompositionalRefinementFilter(ontology, nproc=None)[source]
extend(stmts_by_hash)[source]

Extend the initial data structures with a set of new statements.

Parameters:

stmts_by_hash (dict[int, indra.statements.Statement]) – A dict of statements keyed by their hashes.

get_related(stmt, possibly_related=None, direction='less_specific')[source]

Return a set of statement hashes that a given statement is potentially related to.

Parameters:
  • stmt (indra.statements.Statement) – The INDRA statement whose potential relations we want to filter.

  • possibly_related (set or None) – A set of statement hashes that this statement is potentially related to, as determined by some other filter. If this parameter is a set (including an empty set), this function should return a subset of it (intuitively, this filter can only further eliminate some of the potentially related hashes that were previously determined to be potential relations). If this argument is None, the function must assume that no previous filter was run before, and should therefore return all the possible relations that it determines.

  • direction (str) – One of ‘less_specific’ or ‘more_specific’. Since refinements are directed relations, this function can operate in two different directions: it can either find less specific potentially related statements, or it can find more specific potentially related statements, as determined by this argument.

Returns:

A set of INDRA Statement hashes that are potentially related to the given statement.

Return type:

set of int
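The filtering contract described above (return a subset when possibly_related is a set, return everything found when it is None) can be sketched with a toy filter. ToyPrefixFilter is hypothetical: it stores string labels instead of Statements, takes a hash rather than a statement object, and treats "label A starts with label B" as a stand-in for "A may refine B". It does not reflect how CompositionalRefinementFilter works internally.

```python
from typing import Dict, Optional, Set


class ToyPrefixFilter:
    """Hypothetical filter over string labels keyed by integer hash."""

    def __init__(self) -> None:
        self.labels: Dict[int, str] = {}

    def initialize(self, labels_by_hash: Dict[int, str]) -> None:
        # Build the initial lookup structure once, before any queries.
        self.labels = dict(labels_by_hash)

    def extend(self, labels_by_hash: Dict[int, str]) -> None:
        # Add new entries without rebuilding what already exists.
        self.labels.update(labels_by_hash)

    def get_related(self, stmt_hash: int,
                    possibly_related: Optional[Set[int]] = None,
                    direction: str = "less_specific") -> Set[int]:
        label = self.labels[stmt_hash]
        if direction == "less_specific":
            found = {h for h, other in self.labels.items()
                     if h != stmt_hash and label.startswith(other)}
        else:
            found = {h for h, other in self.labels.items()
                     if h != stmt_hash and other.startswith(label)}
        # None means no earlier filter ran, so return everything found.
        if possibly_related is None:
            return found
        # Otherwise this filter may only narrow the earlier result.
        return found & possibly_related


filt = ToyPrefixFilter()
filt.initialize({1: "food", 2: "food/price", 3: "food/price/rise", 4: "water"})
less_specific = filt.get_related(3)
narrowed = filt.get_related(3, possibly_related={2, 4})
filt.extend({5: "food/aid"})
more_specific = filt.get_related(1, direction="more_specific")
```

Note that an empty set passed as possibly_related yields an empty result, which is different from passing None.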

initialize(stmts_by_hash)[source]

Initialize the filter class with a set of statements.

The filter can build up some useful data structures in this function before being applied to any specific statements.

Parameters:

stmts_by_hash (dict[int, indra.statements.Statement]) – A dict of statements keyed by their hashes.

indra_world.assembly.operations.get_expanded_events_influences(stmts)[source]

Return a list of all standalone events from a list of statements.

indra_world.assembly.operations.location_matches_compositional(stmt)[source]

Return a matches_key which takes geo-location into account.

indra_world.assembly.operations.location_refinement_compositional(st1, st2, ontology, entities_refined=True)[source]

Return True if there is a location-aware refinement between stmts.

indra_world.assembly.operations.make_display_name(comp_grounding)[source]

Return display name from a compositional grounding with ‘of’ linkers.

indra_world.assembly.operations.make_display_name_linear(comp_grounding)[source]

Return display name from compositional grounding with linear joining.

indra_world.assembly.operations.merge_deltas(stmts_in)[source]

Gather and merge original Influence delta information from evidence.

This function is only applicable to Influence Statements that have subj and obj deltas. All other statement types are passed through unchanged. Polarities and adjectives for subjects and objects, respectively, are collected and merged by traversing all evidences of a Statement.

Parameters:

stmts_in (list[indra.statements.Statement]) – A list of INDRA Statements whose influence deltas should be merged. These Statements are meant to have been preassembled and potentially have multiple pieces of evidence.

Returns:

stmts_out – The list of Statements now with deltas merged at the Statement level.

Return type:

list[indra.statements.Statement]
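The merging idea can be sketched on plain dicts. The example below assumes that each piece of evidence carries a delta with a "polarity" and a list of "adjectives", that the merged polarity is the most frequent non-None value, and that adjectives are concatenated in order. These merge rules and the merge_delta helper are assumptions made for illustration; the text above only states that the values are collected and merged.

```python
from collections import Counter
from typing import Any, Dict, List


def merge_delta(evidence_deltas: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge per-evidence deltas into one statement-level delta."""
    # Collect polarities, skipping evidences that do not specify one.
    polarities = [d["polarity"] for d in evidence_deltas
                  if d.get("polarity") is not None]
    # Collect adjectives from every evidence, preserving their order.
    adjectives = [adj for d in evidence_deltas
                  for adj in d.get("adjectives", [])]
    # Assumed rule: take the most common polarity, or None if none is given.
    polarity = Counter(polarities).most_common(1)[0][0] if polarities else None
    return {"polarity": polarity, "adjectives": adjectives}


subj_deltas = [
    {"polarity": 1, "adjectives": ["large"]},
    {"polarity": 1, "adjectives": []},
    {"polarity": -1, "adjectives": ["severe"]},
    {"polarity": None, "adjectives": ["large"]},
]
merged = merge_delta(subj_deltas)
```

In this sketch the same helper would be called once for the subject deltas and once for the object deltas of each Influence.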

indra_world.assembly.operations.remove_namespaces(stmts, namespaces)[source]

Remove unnecessary namespaces from Concept grounding.

indra_world.assembly.operations.remove_raw_grounding(stmts)[source]

Remove the raw_grounding annotation to decrease output size.

Matches functions (indra_world.assembly.matches)

indra_world.assembly.matches.event_location_time_matches(event)[source]

Return Event matches key which takes location and time into account.
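To illustrate why location and time belong in a matches key, the sketch below builds a key from a tuple of fields. The function toy_location_time_key and its field choice are hypothetical and only show the effect on matching: two events with the same concept but different locations get different keys and are therefore not treated as duplicates.

```python
from typing import Optional


def toy_location_time_key(concept: str, polarity: Optional[int],
                          location: Optional[str],
                          time: Optional[str]) -> str:
    """Return a string key; events match only if all four fields agree."""
    return str((concept, polarity, location, time))


k1 = toy_location_time_key("flooding", 1, "Ethiopia", "2018")
k2 = toy_location_time_key("flooding", 1, "Kenya", "2018")
k3 = toy_location_time_key("flooding", 1, "Ethiopia", "2018")
```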

indra_world.assembly.matches.get_location(stmt)[source]

Return the grounded geo-location context associated with a Statement.

indra_world.assembly.matches.get_location_from_object(loc_obj)[source]

Return geo-location from a RefContext location object.

indra_world.assembly.matches.get_time(stmt)[source]

Return the time context associated with a Statement.

indra_world.assembly.matches.has_location(stmt)[source]

Return True if a Statement has grounded geo-location context.

indra_world.assembly.matches.has_time(stmt)[source]

Return True if a Statement has time context.

indra_world.assembly.matches.location_matches(stmt)[source]

Return a matches_key which takes geo-location into account.

indra_world.assembly.matches.location_matches_compositional(stmt)[source]

Return a matches_key which takes geo-location into account.

Refinement functions (indra_world.assembly.refinement)

class indra_world.assembly.refinement.CompositionalRefinementFilter(ontology, nproc=None)[source]
extend(stmts_by_hash)[source]

Extend the initial data structures with a set of new statements.

Parameters:

stmts_by_hash (dict[int, indra.statements.Statement]) – A dict of statements keyed by their hashes.

get_related(stmt, possibly_related=None, direction='less_specific')[source]

Return a set of statement hashes that a given statement is potentially related to.

Parameters:
  • stmt (indra.statements.Statement) – The INDRA statement whose potential relations we want to filter.

  • possibly_related (set or None) – A set of statement hashes that this statement is potentially related to, as determined by some other filter. If this parameter is a set (including an empty set), this function should return a subset of it (intuitively, this filter can only further eliminate some of the potentially related hashes that were previously determined to be potential relations). If this argument is None, the function must assume that no previous filter was run before, and should therefore return all the possible relations that it determines.

  • direction (str) – One of ‘less_specific’ or ‘more_specific’. Since refinements are directed relations, this function can operate in two different directions: it can either find less specific potentially related statements, or it can find more specific potentially related statements, as determined by this argument.

Returns:

A set of INDRA Statement hashes that are potentially related to the given statement.

Return type:

set of int

initialize(stmts_by_hash)[source]

Initialize the filter class with a set of statements.

The filter can build up some useful data structures in this function before being applied to any specific statements.

Parameters:

stmts_by_hash (dict[int, indra.statements.Statement]) – A dict of statements keyed by their hashes.

indra_world.assembly.refinement.event_location_refinement(st1, st2, ontology, entities_refined, ignore_polarity=False)[source]

Return True if there is a location-aware refinement between Events.

indra_world.assembly.refinement.event_location_time_refinement(st1, st2, ontology, entities_refined)[source]

Return True if there is a location/time refinement between Events.

indra_world.assembly.refinement.get_agent_key(agent, comp_idx)[source]

Return a key for an Agent for use in refinement finding.

Parameters:

agent (indra.statements.Agent or None) – An INDRA Agent whose key should be returned.

Returns:

The key that maps the given agent to the ontology, with special handling for ungrounded and None Agents.

Return type:

tuple or None

indra_world.assembly.refinement.location_refinement(st1, st2, ontology, entities_refined)[source]

Return True if there is a location-aware refinement between stmts.

indra_world.assembly.refinement.location_refinement_compositional(st1, st2, ontology, entities_refined=True)[source]

Return True if there is a location-aware refinement between stmts.

indra_world.assembly.refinement.location_time_refinement(st1, st2, ontology, entities_refined)[source]

Return True if there is a location/time refinement between stmts.

Incremental Assembler (indra_world.assembly.incremental_assembler)

class indra_world.assembly.incremental_assembler.AssemblyDelta(new_stmts, new_evidences, new_refinements, beliefs, matches_fun=None)[source]

Represents changes to the assembly structure as a result of new statements added to a set of existing statements.

new_stmts

A dict of new statements keyed by hash.

Type:

dict[str, indra.statements.Statement]

new_evidences

A dict of new evidences for existing or new statements keyed by statement hash.

Type:

dict[str, indra.statements.Evidence]

new_refinements

A list of statement hash pairs representing new refinement links.

Type:

list[tuple]

beliefs

A dict of belief scores keyed by all statement hashes (both old and new).

Type:

dict[str, float]

matches_fun

An optional custom matches function. When using a custom matches function for assembly, providing it here is necessary to get correct JSON serialization.

Type:

Optional[Callable[[Statement], str]]

to_json()[source]

Return a JSON representation of the assembly delta.

class indra_world.assembly.incremental_assembler.IncrementalAssembler(prepared_stmts, refinement_filters=None, matches_fun=<function location_matches_compositional>, curations=None, post_processing_steps=None, ontology=None)[source]

Assemble a set of prepared statements and allow incremental extensions.

Parameters:
  • prepared_stmts (list[indra.statements.Statement]) – A list of prepared INDRA Statements.

  • refinement_filters (Optional[list[indra.preassembler.refinement.RefinementFilter]]) – A list of refinement filter classes to be used for refinement finding. Default: the standard set of compositional refinement filters.

  • matches_fun (Optional[function]) – A custom matches function for determining matching statements and calculating hashes. Default: matches function that takes compositional grounding and location into account.

  • curations (dict[dict]) – A dict of user curations to be integrated into the assembly results, keyed by statement hash.

  • post_processing_steps (list[dict]) – Steps that can be used in an INDRA AssemblyPipeline to do post-processing on statements.

refinement_edges

A set of tuples of statement hashes representing refinement links between statements.

Type:

set

add_statements(stmts)[source]

Add new statements for incremental assembly.

Parameters:

stmts (list[indra.statements.Statement]) – A list of new prepared statements to be incrementally assembled into the set of existing statements.

Returns:

An AssemblyDelta object representing the changes to the assembly as a result of the new added statements.

Return type:

AssemblyDelta
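The relationship between incremental addition and the returned delta can be sketched with a toy assembler. ToyIncrementalAssembler and ToyDelta are hypothetical and cover only de-duplication and evidence tracking; refinements, beliefs, and curations are left out. Statements are represented as (text, evidence) pairs, and the default matches function simply lowercases the text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ToyDelta:
    """What changed as a result of one add_statements call."""
    new_stmts: Dict[str, str] = field(default_factory=dict)
    new_evidences: Dict[str, List[str]] = field(default_factory=dict)


class ToyIncrementalAssembler:
    def __init__(self, prepared: List[Tuple[str, str]],
                 matches_fun: Callable[[str], str] = str.lower) -> None:
        self.matches_fun = matches_fun
        self.stmts_by_key: Dict[str, str] = {}
        self.evs_by_key: Dict[str, List[str]] = {}
        self.add_statements(prepared)

    def add_statements(self, stmts: List[Tuple[str, str]]) -> ToyDelta:
        delta = ToyDelta()
        for text, evidence in stmts:
            key = self.matches_fun(text)
            # A key seen for the first time means a genuinely new statement.
            if key not in self.stmts_by_key:
                self.stmts_by_key[key] = text
                delta.new_stmts[key] = text
            # Evidence is recorded for new and existing statements alike.
            self.evs_by_key.setdefault(key, []).append(evidence)
            delta.new_evidences.setdefault(key, []).append(evidence)
        return delta


asm = ToyIncrementalAssembler([("Rain causes floods", "doc1")])
delta = asm.add_statements([
    ("rain causes floods", "doc2"),
    ("Drought reduces yield", "doc3"),
])
```

Here the first added pair matches an existing statement and contributes only evidence, while the second introduces a new statement along with its evidence.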

static annotate_evidences(stmt)[source]

Add annotations to evidences of a given statement.

apply_curations()[source]

Apply the set of curations to the de-duplicated statements.

static build_refinements_graph(stmts_by_hash, refinement_edges)[source]

Return a refinements graph based on statements and refinement edges.

deduplicate()[source]

Build hash-based statement and evidence data structures to deduplicate.

get_all_supporting_evidence(sh)[source]

Return direct and indirect evidence for a statement hash.
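One way to picture direct versus indirect evidence is as a traversal of the refinement links. The sketch below assumes that each edge points from a statement hash to the hash of a more specific statement that supports it; this edge direction, the all_supporting_evidence helper, and the plain-dict data are assumptions for illustration and not the class's actual data structures.

```python
from typing import Dict, Iterable, List, Set, Tuple


def all_supporting_evidence(sh: int,
                            evs_by_hash: Dict[int, List[str]],
                            edges: Iterable[Tuple[int, int]]) -> Set[str]:
    """Collect evidence of sh and of everything reachable from it."""
    # Index edges by source so each node's successors are easy to find.
    successors: Dict[int, Set[int]] = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    evidence: Set[str] = set()
    seen = {sh}
    stack = [sh]
    # Depth-first traversal; the seen set guards against revisiting nodes.
    while stack:
        current = stack.pop()
        evidence.update(evs_by_hash.get(current, []))
        for nxt in successors.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return evidence


evs = {1: ["e1"], 2: ["e2"], 3: ["e3"], 4: ["e4"]}
refinement_edges = {(1, 2), (2, 3)}
supporting = all_supporting_evidence(1, evs, refinement_edges)
```

Evidence for hash 1 is direct, while evidence reached through hashes 2 and 3 is indirect.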

get_beliefs()[source]

Calculate and return beliefs for all statements.

get_curation_effect(old_hash, curation)[source]

Return changed matches hash as a result of curation.

get_refinements()[source]

Calculate refinement relationships between de-duplicated statements.

get_statements()[source]

Return a flat list of statements with their evidences.

indra_world.assembly.incremental_assembler.parse_factor_grounding_curation(cur)[source]

Parse details from a curation that changes a concept’s grounding.

indra_world.assembly.incremental_assembler.parse_factor_polarity_curation(cur)[source]

Parse details from a curation that changes an event’s polarity.

Statistics (indra_world.assembly.stats)