
[Feature]: Support for Falcon-11B model (Falcon 2) #4902

Closed
s-smits opened this issue May 18, 2024 · 1 comment

s-smits commented May 18, 2024

🚀 The feature, motivation and pitch

Falcon-11B is trained on multilingual data, so there is a lot of potential for serving this model where those languages are preferred. Functional, working inference in fp16 would be a great addition in my opinion.
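
For illustration, once support lands, serving it in fp16 should look like any other vLLM model; a minimal sketch (the tiiuae/falcon-11B repo id, prompt, and sampling settings are just example values):

from vllm import LLM, SamplingParams

# Hypothetical usage once Falcon-11B loads correctly; dtype="float16"
# requests fp16 inference explicitly.
llm = LLM(model="tiiuae/falcon-11B", dtype="float16")
params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Write a short summary of the Falcon 2 release:"], params)
print(outputs[0].outputs[0].text)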

Additional context

The tokenizer has remained consistent; the architecture, however, has changed from:

    "model_type": "falcon",
    "architectures": [
        "FalconForCausalLM"
    ],
    "pre_weights": [
        {
            "name": "transformer.word_embeddings.weight",
            "is_embed": true
        }
    ],
    "post_weights": [
        {
            "name": "transformer.ln_f.weight"
        },
        {
            "name": "transformer.ln_f.bias"
        },
        {
            "name": "lm_head.weight",
            "is_embed": true
        }
    ],
    "num_layers_config_key": "num_hidden_layers",
    "layer_templates": {
        "weights": [
            {
                "name": "transformer.h.${layer_index}.ln_attn.bias"
            },
            {
                "name": "transformer.h.${layer_index}.ln_attn.weight"
            },
            {
                "name": "transformer.h.${layer_index}.ln_mlp.bias"
            },
            {
                "name": "transformer.h.${layer_index}.ln_mlp.weight"
            },
            {
                "name": "transformer.h.${layer_index}.mlp.dense_4h_to_h.weight"
            },
            {
                "name": "transformer.h.${layer_index}.mlp.dense_h_to_4h.weight"
            },
            {
                "name": "transformer.h.${layer_index}.self_attention.dense.weight"
            },
            {
                "name": "transformer.h.${layer_index}.self_attention.query_key_value.weight"
            }
        ]
    }
}

to

    "model_type": "falcon",
    "architectures": [
        "FalconForCausalLM"
    ],
    "pre_weights": [
        {
            "name": "transformer.word_embeddings.weight",
            "is_embed": true
        }
    ],
    "post_weights": [
        {
            "name": "transformer.ln_f.weight"
        },
        {
            "name": "transformer.ln_f.bias"
        },
        {
            "name": "lm_head.weight",
            "is_embed": true
        }
    ],
    "num_layers_config_key": "num_hidden_layers",
    "layer_templates": {
        "weights": [
            {
                "name": "transformer.h.${layer_index}.input_layernorm.bias"
            },
            {
                "name": "transformer.h.${layer_index}.input_layernorm.weight"
            },
            {
                "name": "transformer.h.${layer_index}.mlp.dense_4h_to_h.weight"
            },
            {
                "name": "transformer.h.${layer_index}.mlp.dense_h_to_4h.weight"
            },
            {
                "name": "transformer.h.${layer_index}.self_attention.dense.weight"
            },
            {
                "name": "transformer.h.${layer_index}.self_attention.query_key_value.weight"
            }
        ]
    }
}
In other words, the per-layer layer norms have changed: the separate ln_attn and ln_mlp of the original Falcon architecture are replaced by a single input_layernorm, so vLLM's Falcon weight loader no longer finds the parameter names it expects.
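
A quick way to confirm which naming scheme a given checkpoint uses is to list the per-layer tensor names from its safetensors index; a minimal sketch, assuming the checkpoint is already downloaded to ./falcon-11B:

import json

# The safetensors index maps every tensor name to the shard file holding it.
with open("falcon-11B/model.safetensors.index.json") as f:
    index = json.load(f)

# Print the tensor names for the first decoder layer. For Falcon-11B this
# shows transformer.h.0.input_layernorm.{weight,bias} rather than the
# ln_attn / ln_mlp names that vLLM's falcon.py expects.
for name in sorted(n for n in index["weight_map"] if n.startswith("transformer.h.0.")):
    print(name)

The failure this causes when loading the model through vLLM (here via ScandEval):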
model-00001-of-00005.safetensors: 100%|████████| 4.98G/4.98G [18:21<00:00, 4.52MB/s]
[rank0]: Traceback (most recent call last):
[rank0]:   File "/usr/local/bin/scandeval", line 8, in <module>
[rank0]:     sys.exit(benchmark())
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
[rank0]:     return self.main(*args, **kwargs)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
[rank0]:     rv = self.invoke(ctx)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
[rank0]:     return ctx.invoke(self.callback, **ctx.params)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
[rank0]:     return __callback(*args, **kwargs)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/scandeval/cli.py", line 332, in benchmark
[rank0]:     benchmarker(model=models)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/scandeval/benchmarker.py", line 770, in __call__
[rank0]:     return self.benchmark(*args, **kwargs)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/scandeval/benchmarker.py", line 593, in benchmark
[rank0]:     benchmark_output = self._benchmark_single(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/scandeval/benchmarker.py", line 720, in _benchmark_single
[rank0]:     results, metadata_dict, model, tokenizer = dataset(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/scandeval/benchmark_dataset.py", line 601, in __call__
[rank0]:     return self.benchmark(*args, **kwargs)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/scandeval/benchmark_dataset.py", line 146, in benchmark
[rank0]:     model, tokenizer = load_model(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/scandeval/model_loading.py", line 52, in load_model
[rank0]:     model, tokenizer = setup.load_model(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/scandeval/model_setups/hf.py", line 311, in load_model
[rank0]:     model = VLLMModel(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/scandeval/vllm_models.py", line 132, in __init__
[rank0]:     self._model = self._initialise(vllm_kwargs=vllm_kwargs)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/scandeval/vllm_models.py", line 145, in _initialise
[rank0]:     model = LLM(**vllm_kwargs)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/llm.py", line 123, in __init__
[rank0]:     self.llm_engine = LLMEngine.from_engine_args(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 292, in from_engine_args
[rank0]:     engine = cls(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 160, in __init__
[rank0]:     self.model_executor = executor_class(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/executor/executor_base.py", line 41, in __init__
[rank0]:     self._init_executor()
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/executor/gpu_executor.py", line 23, in _init_executor
[rank0]:     self._init_non_spec_worker()
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/executor/gpu_executor.py", line 69, in _init_non_spec_worker
[rank0]:     self.driver_worker.load_model()
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/worker.py", line 118, in load_model
[rank0]:     self.model_runner.load_model()
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 164, in load_model
[rank0]:     self.model = get_model(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/__init__.py", line 19, in get_model
[rank0]:     return loader.load_model(model_config=model_config,
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/loader.py", line 224, in load_model
[rank0]:     model.load_weights(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/falcon.py", line 418, in load_weights
[rank0]:     param = params_dict[name]
[rank0]: KeyError: 'transformer.h.12.input_layernorm.weight'
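
For context, Falcon-11B sets new_decoder_architecture=True in its config but keeps a single pre-attention layer norm, while vLLM's falcon.py assumes every new-architecture checkpoint carries ln_attn/ln_mlp, hence the KeyError. Below is a sketch of the kind of config-driven branching a fix needs; it is illustrative only, not the actual patch, and it assumes the num_ln_in_parallel_attn field that Falcon 2 configs appear to carry:

import torch.nn as nn

class FalconLayerNorms(nn.Module):
    # Illustrative sketch: choose the layer-norm layout from the checkpoint
    # config instead of hard-coding it from new_decoder_architecture alone.
    def __init__(self, config):
        super().__init__()
        hidden = config.hidden_size
        eps = config.layer_norm_epsilon
        # Falcon-40B style: two parallel layer norms, so checkpoints carry
        # ln_attn and ln_mlp tensors.
        # Falcon2-11B style: a single input_layernorm feeds both the
        # attention and MLP branches.
        if config.new_decoder_architecture and getattr(
                config, "num_ln_in_parallel_attn", 2) == 2:
            self.ln_attn = nn.LayerNorm(hidden, eps=eps)
            self.ln_mlp = nn.LayerNorm(hidden, eps=eps)
        else:
            self.input_layernorm = nn.LayerNorm(hidden, eps=eps)

With parameter names derived this way, params_dict in load_weights would contain transformer.h.N.input_layernorm.weight for Falcon-11B and the lookup above would no longer fail.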
mgoin (Collaborator) commented Jun 4, 2024

Resolved by #5069

mgoin closed this as completed Jun 4, 2024