Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Paper • 1908.10084 • Published • 15
How to use summerstars/MARK-Embedding with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("summerstars/MARK-Embedding")
sentences = [
"Nterprise Linux Services is expected to be available before then end of this year.",
"Beta versions of Nterprise Linux Services are expected to be available on certain HP ProLiant servers in July.",
"Spain turning back the clock on siestas",
"I don't like many flavored drinks."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]これは、訓練済みのsentence-transformersモデルです。このモデルは、文と段落を256次元の密なベクトル空間にマッピングし、意味的テキスト類似性、意味検索、言い換えマイニング、テキスト分類、クラスタリングなどに使用できます。
SentenceTransformer(
(0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
(1): Pooling({'word_embedding_dimension': 256, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
まず、Sentence Transformersライブラリをインストールします:
pip install -U sentence-transformers
次に、このモデルをロードして推論を実行できます。
from sentence_transformers import SentenceTransformer
# 🤗 Hubからダウンロード
model = SentenceTransformer("sentence_transformers_model_id")
# 推論を実行
sentences = [
'Comcast Class A shares were up 8 cents at $30.50 in morning trading on the Nasdaq Stock Market.',
'The stock rose 48 cents to $30 yesterday in Nasdaq Stock Market trading.',
'Malaysia: Chinese satellite found object in ocean',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 256]
# 埋め込みベクトルの類似度スコアを取得
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5752, 0.2980],
# [0.5752, 1.0000, 0.2161],
# [0.2980, 0.2161, 1.0000]])
| メトリクス | 値 |
|---|---|
| pearson_cosine | 0.464 |
| spearman_cosine | 0.4595 |
sentence_0, sentence_1, label| sentence_0 | sentence_1 | label | |
|---|---|---|---|
| 型 | string | string | float |
| 詳細 |
|
|
|
| sentence_0 | sentence_1 | label |
|---|---|---|
Forecasters said warnings might go up for Cuba later Thursday. |
Watches or warnings could be issued for eastern Cuba later on Thursday. |
0.8 |
Death toll in Lebanon bombings rises to 47 |
1 suspect arrested after Lebanon car bombings kill 45 |
0.5599999904632569 |
Three dogs running on a racetrack. |
Three dogs round a bend at a racetrack. |
0.9600000381469727 |
CosineSimilarityLoss 以下のパラメータを使用:{
"loss_fct": "torch.nn.modules.loss.MSELoss"
}
eval_strategy: stepsper_device_train_batch_size: 16per_device_eval_batch_size: 16multi_dataset_batch_sampler: round_robinoverwrite_output_dir: Falsedo_predict: Falseeval_strategy: stepsprediction_loss_only: Trueper_device_train_batch_size: 16per_device_eval_batch_size: 16per_gpu_train_batch_size: Noneper_gpu_eval_batch_size: Nonegradient_accumulation_steps: 1eval_accumulation_steps: Nonetorch_empty_cache_steps: Nonelearning_rate: 5e-05weight_decay: 0.0adam_beta1: 0.9adam_beta2: 0.999adam_epsilon: 1e-08max_grad_norm: 1num_train_epochs: 3max_steps: -1lr_scheduler_type: linearlr_scheduler_kwargs: {}warmup_ratio: 0.0warmup_steps: 0log_level: passivelog_level_replica: warninglog_on_each_node: Truelogging_nan_inf_filter: Truesave_safetensors: Truesave_on_each_node: Falsesave_only_model: Falserestore_callback_states_from_checkpoint: Falseno_cuda: Falseuse_cpu: Falseuse_mps_device: Falseseed: 42data_seed: Nonejit_mode_eval: Falseuse_ipex: Falsebf16: Falsefp16: Falsefp16_opt_level: O1half_precision_backend: autobf16_full_eval: Falsefp16_full_eval: Falsetf32: Nonelocal_rank: 0ddp_backend: Nonetpu_num_cores: Nonetpu_metrics_debug: Falsedebug: []dataloader_drop_last: Falsedataloader_num_workers: 0dataloader_prefetch_factor: Nonepast_index: -1disable_tqdm: Falseremove_unused_columns: Truelabel_names: Noneload_best_model_at_end: Falseignore_data_skip: Falsefsdp: []fsdp_min_num_params: 0fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}fsdp_transformer_layer_cls_to_wrap: Noneaccelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}parallelism_config: Nonedeepspeed: Nonelabel_smoothing_factor: 0.0optim: adamw_torch_fusedoptim_args: Noneadafactor: Falsegroup_by_length: Falselength_column_name: lengthddp_find_unused_parameters: Noneddp_bucket_cap_mb: Noneddp_broadcast_buffers: Falsedataloader_pin_memory: Truedataloader_persistent_workers: Falseskip_memory_metrics: Trueuse_legacy_prediction_loop: Falsepush_to_hub: Falseresume_from_checkpoint: Nonehub_model_id: Nonehub_strategy: every_savehub_private_repo: Nonehub_always_push: Falsehub_revision: Nonegradient_checkpointing: Falsegradient_checkpointing_kwargs: Noneinclude_inputs_for_metrics: Falseinclude_for_metrics: []eval_do_concat_batches: Truefp16_backend: autopush_to_hub_model_id: Nonepush_to_hub_organization: Nonemp_parameters: auto_find_batch_size: Falsefull_determinism: Falsetorchdynamo: Noneray_scope: lastddp_timeout: 1800torch_compile: Falsetorch_compile_backend: Nonetorch_compile_mode: Noneinclude_tokens_per_second: Falseinclude_num_input_tokens_seen: Falseneftune_noise_alpha: Noneoptim_target_modules: Nonebatch_eval_metrics: Falseeval_on_start: Falseuse_liger_kernel: Falseliger_kernel_config: Noneeval_use_gather_object: Falseaverage_tokens_across_devices: Falseprompts: Nonebatch_sampler: batch_samplermulti_dataset_batch_sampler: round_robinrouter_mapping: {}learning_rate_mapping: {}| エポック | ステップ | 訓練損失 | spearman_cosine |
|---|---|---|---|
| 1.0 | 360 | - | 0.2967 |
| 1.3889 | 500 | 0.11 | 0.3338 |
| 2.0 | 720 | - | 0.3665 |
| 2.7778 | 1000 | 0.0857 | 0.4101 |
| 3.0 | 1080 | - | 0.4595 |
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}