Proposal: refactor probing pipeline args  #93

@oserikov

I propose allowing model_config, model, and tokenizer to be passed as optional arguments to the experiment class, rather than setting them after the object has been constructed. That is, you either pass just the model's name, like here, or pass the whole config-model-tokenizer triplet, like this:

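    from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

    # Build the config explicitly so hidden states and attentions are returned.
    model_config = AutoConfig.from_pretrained(
        model_name,
        output_hidden_states=True,
        output_attentions=True,
    )
    # Load the model with that config; device_map="auto" places it on the available devices.
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        config=model_config,
        device_map="auto",
        torch_dtype=dtype,
        max_memory=get_max_memory_per_gpu_dict(dtype, model_name),
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name, config=model_config)
    # Pass the ready-made model and tokenizer straight to the pipeline.
    experiment = ProbingPipeline(
        config=config,
        model=model,
        tokenizer=tokenizer,
        device=device,
        metric_names=["f1", "accuracy"],
        encoding_batch_size=encoding_batch_size,
        classifier_batch_size=classifier_batch_size,
    )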
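
To make the proposal concrete, here is a rough sketch of how a constructor could resolve the two calling styles. `ProbingPipelineSketch` and its parameter names (including `model_name`) are hypothetical and only illustrate the idea; the real `ProbingPipeline` signature may differ, and the use of `AutoModel` for the name-only path is an assumption.

```python
from typing import Any, Optional


class ProbingPipelineSketch:
    """Hypothetical stand-in for the experiment class, not the project's code."""

    def __init__(
        self,
        model_name: Optional[str] = None,
        model_config: Optional[Any] = None,
        model: Optional[Any] = None,
        tokenizer: Optional[Any] = None,
    ) -> None:
        if model is not None and tokenizer is not None:
            # Caller supplied ready-made objects: use them as they are.
            self.model = model
            self.tokenizer = tokenizer
            # Fall back to the model's own config if none was passed.
            if model_config is None:
                model_config = getattr(model, "config", None)
            self.model_config = model_config
        elif model is None and tokenizer is None and model_name is not None:
            # Only a name was given: load everything here.
            # Imported lazily so the triplet path does not need transformers.
            from transformers import AutoConfig, AutoModel, AutoTokenizer

            if model_config is None:
                model_config = AutoConfig.from_pretrained(
                    model_name,
                    output_hidden_states=True,
                    output_attentions=True,
                )
            self.model_config = model_config
            self.model = AutoModel.from_pretrained(model_name, config=model_config)
            self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        else:
            # Reject ambiguous input such as a model without a tokenizer.
            raise ValueError(
                "Pass either model_name alone, or both model and tokenizer."
            )
```

With this shape, `ProbingPipelineSketch(model_name=...)` would cover the simple case, while `ProbingPipelineSketch(model=model, tokenizer=tokenizer, model_config=model_config)` would cover custom loading such as `device_map="auto"`. Raising on a half-specified pair is one possible design choice; silently loading the missing piece from `model_name` would be another.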
    
