Describe the bug
After training a model with ATEPCTrainer on Spanish data, including custom reviews I added, the model fails to identify the correct aspects and sentiments, even on the exact reviews it was trained on. Its predictions do not match the labels in the training data.
To Reproduce
Steps to reproduce the behavior:
- Use the following dataset for training:
https://github.com/yangheng95/ABSADatasets/blob/v2.0/datasets/atepc_datasets/120.SemEval2016Task5/127.spanish/restaurants_train_spanish.xml.dat.atepc
(plus custom Spanish reviews, see below)
- Add this example to the training data:
Muy O -100
amables B-ASP Positive
todos O -100
! O -100
Betty O -100
súper O -100
amable B-ASP Positive
y O -100
simpática O -100
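For reference, the example above uses three columns per line: token, aspect tag, and polarity, with -100 as the polarity of non-aspect tokens. Below is a minimal sketch for sanity-checking custom lines before appending them to the training file. `parse_atepc_lines` is a hypothetical helper, not part of PyABSA, and the allowed tag set is an assumption based on the linked dataset.

```python
# Hypothetical helper (not part of PyABSA): validates token/tag/polarity lines.
EXAMPLE = """\
Muy O -100
amables B-ASP Positive
todos O -100
! O -100
Betty O -100
súper O -100
amable B-ASP Positive
y O -100
simpática O -100
"""

ALLOWED_TAGS = {"O", "B-ASP", "I-ASP"}  # assumed tag set


def parse_atepc_lines(text):
    """Return (token, tag, polarity) tuples, raising ValueError on malformed lines."""
    rows = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        parts = line.split()
        if len(parts) != 3:
            raise ValueError(f"line {line_no}: expected 3 columns, got {len(parts)}")
        token, tag, polarity = parts
        if tag not in ALLOWED_TAGS:
            raise ValueError(f"line {line_no}: unexpected tag {tag!r}")
        # Non-aspect tokens carry -100; aspect tokens must carry a real polarity.
        if (tag == "O") != (polarity == "-100"):
            raise ValueError(f"line {line_no}: tag {tag!r} conflicts with polarity {polarity!r}")
        rows.append((token, tag, polarity))
    return rows


rows = parse_atepc_lines(EXAMPLE)
labeled_aspects = [(tok, pol) for tok, tag, pol in rows if tag == "B-ASP"]
print(labeled_aspects)  # [('amables', 'Positive'), ('amable', 'Positive')]
```

The example passes these checks, so the mismatch reported below does not come from a malformed line in this sample.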
- Train the model using the following code:
import os

from pyabsa import AspectTermExtraction as ATEPC

if __name__ == '__main__':
    dataset_name = "MySpanishRestaurant"

    # Start from the multilingual ATEPC config and override what differs.
    config = ATEPC.ATEPCConfigManager.get_atepc_config_multilingual()
    config.model = ATEPC.ATEPCModelList.FAST_LCF_ATEPC
    config.pretrained_bert = "BSC-LT/roberta-base-bne"
    config.num_epoch = 20
    config.evaluate_begin = 0
    config.max_seq_len = 128
    config.log_step = 10
    config.learning_rate = 2e-5
    config.batch_size = 16
    config.patience = 3
    config.device = "auto"
    config.log_step = -1  # overrides the log_step = 10 set above
    config.l2reg = 1e-8
    config.seed = 42
    config.load_aug = True

    print("Starting model training...")
    trainer = ATEPC.ATEPCTrainer(
        config=config,
        dataset=dataset_name
    )

    trained_model_path = trainer.inference_model
    print(f"Training complete! Model saved at: {trained_model_path}")

    # Check that the reported path exists on disk.
    if os.path.exists(trained_model_path):
        print("Verification successful: The model file exists at the specified path.")
    else:
        print("Verification FAILED: The model file does NOT exist at the specified path.")
- After training, test the model on the same review:
Muy amables todos ! Betty súper amable y simpática
Actual behavior
The model outputs:
Aspects/Sentiments Found: [('todos', 'Positive'), ('y', 'Positive')]
This does not match the training data, which labels amables and amable (both Positive) as the aspects.
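Comparing token positions, each predicted aspect is the token immediately after a labeled one (amables -> todos, amable -> y). The sketch below only makes that shift explicit. It does not use PyABSA, and the one-token offset is an observation from this single review, not a confirmed cause.

```python
# Whitespace tokens of the test review, matching the training example.
tokens = "Muy amables todos ! Betty súper amable y simpática".split()

expected = ["amables", "amable"]  # B-ASP tokens in the training example
predicted = ["todos", "y"]        # aspect terms returned by the model


def positions(words, tokens):
    """Index of the first exact match of each word in the token list."""
    return [tokens.index(word) for word in words]


# Positive offset means the prediction lies to the right of the label.
offsets = [
    pred - exp
    for exp, pred in zip(positions(expected, tokens), positions(predicted, tokens))
]
print(offsets)  # [1, 1]
```

A constant offset of one would be consistent with a misalignment between labels and tokens somewhere in preprocessing or decoding, but I have not verified where it is introduced.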
Expected behavior
The model should correctly identify the aspects and sentiments as provided in the training data, especially when evaluating on the exact same examples.
Environment:
- PyABSA version: 2.4.2
- Transformers version: 4.56.0
- Torch version: 2.8.0+cudaNone
- Device: Unknown
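The Torch and Device entries above are incomplete. Here is a small sketch for collecting them more reliably, assuming the packages are installed under their usual distribution names (`pyabsa`, `transformers`, `torch`). `package_version` is a hypothetical helper name.

```python
from importlib import metadata


def package_version(name):
    """Installed version of a distribution, or a placeholder if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


report = {name: package_version(name) for name in ("pyabsa", "transformers", "torch")}

try:
    import torch

    # torch.version.cuda is None on CPU-only builds.
    report["cuda_build"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
except ImportError:
    report["cuda_build"] = None
    report["cuda_available"] = False

for key, value in report.items():
    print(f"{key}: {value}")
```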
Additional context
- I have more examples with similar issues and can share them if needed.
- Training logs
Screenshots
N/A