Extractors

IntermediateExtractorBase

class activation_extractor.extractors.intermediateExtractorBase.IntermediateExtractorBase(model, layer_list)

Extractor for intermediate model outputs. Captures the outputs of a specified list of modules in a PyTorch model during inference.

Parameters:
  • model – the PyTorch model object

  • layer_list (list of strings) – list of module names to get outputs from

clear_all_hooks()

Clears ALL the forward hooks registered to the model.

create_hook(layer_name)

Creates a PyTorch forward hook that saves the output of a given module/layer in the model. A forward hook is a function that PyTorch executes after the module's forward pass.

Parameters:

layer_name (str) – name of the module/layer

Returns:

the corresponding hook function

Return type:

function

detach_hooks()

Detaches all the registered hooks saved in the hook_handles attribute.

get_outputs()

Returns the intermediate activation outputs.

Returns:

dictionary with intermediate outputs for each specified module/layer.

Return type:

dictionary

register_hooks()

Registers all the hooks for the specified layers. It saves the hook handles to the hook_handles attribute.
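The hook lifecycle described for this class (create a hook per layer, register it, read the outputs, detach) can be sketched in plain PyTorch as follows. This is a minimal illustration under stated assumptions, not the library's implementation: the toy model and the names make_hook, outputs and handles are hypothetical. Only named_modules, register_forward_hook and the handle's remove method are standard PyTorch calls.

```python
import torch
import torch.nn as nn

# Toy model; nn.Sequential names its submodules "0", "1", "2".
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
layer_list = ["0", "2"]  # hypothetical choice of modules to record

outputs = {}   # layer name -> output tensor (assumed storage layout)
handles = []   # hook handles, kept so the hooks can be removed later


def make_hook(layer_name):
    """Return a forward hook that stores the module output under layer_name."""
    def hook(module, inputs, output):
        outputs[layer_name] = output.detach()
    return hook


# Register one forward hook per requested module name.
modules = dict(model.named_modules())
for name in layer_list:
    handles.append(modules[name].register_forward_hook(make_hook(name)))

# The hooks fire during the forward pass.
with torch.no_grad():
    model(torch.randn(3, 4))

# Detach the hooks: later forward passes no longer update `outputs`.
for handle in handles:
    handle.remove()
handles.clear()
```

Keeping the handles is what makes a targeted detach possible: each handle removes exactly the hook it was returned for.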

IntermediateExtractor

class activation_extractor.extractors.intermediateExtractor.IntermediateExtractor(model, layer_list)

Extends the functionality of the IntermediateExtractorBase to automatically save activations.

emb_reformatting(outputs, emb_format='full', sequence_axis=1, custom_position=None)

Takes an output to save and reformats it according to emb_format:

  • full: the output is left unchanged

  • mean: mean along the sequence axis

  • LT: last token in the sequence

  • FT: first token in the sequence

  • custom: token at a custom position in the sequence (custom_position)
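The formats listed for emb_format can be illustrated with a small NumPy sketch. The function reformat_embedding below is a hypothetical stand-in written for this example, not the library's code; it assumes the activations are a NumPy array and that custom_position is an integer index along sequence_axis.

```python
import numpy as np


def reformat_embedding(outputs, emb_format="full", sequence_axis=1, custom_position=None):
    """Hypothetical stand-in for emb_reformatting, operating on a NumPy array."""
    if emb_format == "full":
        return outputs
    if emb_format == "mean":
        return outputs.mean(axis=sequence_axis)
    if emb_format == "LT":
        return np.take(outputs, -1, axis=sequence_axis)
    if emb_format == "FT":
        return np.take(outputs, 0, axis=sequence_axis)
    if emb_format == "custom":
        if custom_position is None:
            raise ValueError("custom_position is required when emb_format='custom'")
        return np.take(outputs, custom_position, axis=sequence_axis)
    raise ValueError(f"unknown emb_format: {emb_format!r}")


# Activations shaped (batch, sequence, hidden) = (2, 5, 3).
acts = np.arange(30, dtype=np.float32).reshape(2, 5, 3)

full = reformat_embedding(acts, "full")   # shape (2, 5, 3)
mean = reformat_embedding(acts, "mean")   # shape (2, 3)
last = reformat_embedding(acts, "LT")     # shape (2, 3)
```

Every format except full removes the sequence axis, so the saved array is smaller by a factor equal to the sequence length.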

gpu_to_cpu()

Moves all the intermediate activations stored in the extractor object to the CPU (if they are on the GPU) and converts them to NumPy arrays.
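A conversion of this kind can be sketched in plain PyTorch as below. The dictionary layout and the layer names are assumptions made for the example; the method's actual internals are not shown in this reference.

```python
import torch

# Assumed storage layout: layer name -> activation tensor.
device = "cuda" if torch.cuda.is_available() else "cpu"
activations = {
    "encoder.layer.0": torch.randn(2, 5, 3, device=device),  # hypothetical names
    "encoder.layer.1": torch.randn(2, 5, 3, device=device),
}

# detach() drops the autograd graph, cpu() returns the tensor unchanged if it
# is already on the CPU, and numpy() requires a CPU tensor.
activations = {name: t.detach().cpu().numpy() for name, t in activations.items()}
```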

save_outputs(output_folder, output_id, reset=False, move_to_cpu=True, save_method='numpy_compressed', emb_formats=['LT', 'FT'], sequence_axis=1, custom_position=None)

Save intermediate activation dictionary to output folder. You can choose:

  • the saving function (numpy_compressed or numpy)

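  • the embedding formats (emb_formats): a list of the formats accepted by emb_reformatting, e.g. full, mean, LT (last token), FT (first token)

  • sequence_axis: the axis that indexes sequence positions, along which the mean is taken or a token is selected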
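As an illustration of compressed saving, the sketch below writes one array per layer into a single .npz archive with NumPy's savez_compressed and reads it back. That the numpy_compressed method maps to this call, as well as the file naming and the dictionary layout, are assumptions made for the example rather than documented behaviour.

```python
import os
import tempfile

import numpy as np

# Assumed layout: layer name -> already-reformatted NumPy activations.
activations = {
    "layer_a": np.ones((2, 3), dtype=np.float32),
    "layer_b": np.zeros((2, 3), dtype=np.float32),
}

output_folder = tempfile.mkdtemp()
output_id = "batch_0"  # hypothetical identifier
path = os.path.join(output_folder, f"{output_id}.npz")  # naming is an assumption

# One compressed archive holding one array per layer.
np.savez_compressed(path, **activations)

# Reload to check the round trip.
with np.load(path) as archive:
    restored = {name: archive[name] for name in archive.files}
```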