add CFE dataset by Jiawen-CS · Pull Request #591 · microsoft/BC-Bench

Jiawen-CS · 2026-03-18T08:38:51Z

This project investigates transfer failures in large language models (LLMs) when generating code for niche programming languages, using AL (the programming language of Microsoft Dynamics 365 Business Central) as a case study.

We design BC-Bench-CF, a benchmark suite that includes realistic AL development tasks and minimal counterfactual variants. The goal is to evaluate not only functional correctness, but also robustness to small specification changes and sensitivity to AL-specific execution semantics.
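To make the two evaluation targets concrete, here is a minimal sketch of how functional correctness and robustness to counterfactual variants could be scored side by side. The `TaskResult` record, its field names, and the `pass_rate` and `robustness` helpers are hypothetical illustrations, not the schema or metrics used in this PR; the robustness definition (share of solved base tasks whose variants are all solved too) is one possible choice among several.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class TaskResult:
    """Outcome for one base task and its counterfactual variants (hypothetical schema)."""
    task_id: str                       # illustrative identifier only
    base_passed: bool                  # did the model solve the original task?
    variant_passed: Tuple[bool, ...]   # one flag per counterfactual variant


def pass_rate(results: Sequence[TaskResult]) -> float:
    """Fraction of base tasks solved (functional correctness)."""
    if not results:
        return 0.0
    return sum(r.base_passed for r in results) / len(results)


def robustness(results: Sequence[TaskResult]) -> float:
    """Among solved base tasks, fraction whose variants were all solved as well.

    Assumed definition: a model that solves the base task but fails any
    small specification change counts as non-robust on that task.
    """
    solved = [r for r in results if r.base_passed]
    if not solved:
        return 0.0
    return sum(all(r.variant_passed) for r in solved) / len(solved)


results = [
    TaskResult("a", True, (True, True)),
    TaskResult("b", True, (True, False)),
    TaskResult("c", False, (True,)),
    TaskResult("d", True, (True,)),
]
print(f"pass rate:  {pass_rate(results):.2f}")
print(f"robustness: {robustness(results):.2f}")
```

Reporting the two numbers separately keeps a high pass rate from hiding brittleness: a model can solve most base tasks and still fail once a single requirement changes.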

Our analysis is grounded in a layered failure framework, which attributes model errors to different abstraction levels, including syntax, validation semantics, event-driven paradigms, workflow composition, and ecosystem constraints.
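As an illustration of the layered framework, the sketch below models the five layers named above as an enumeration and tallies labelled failures per layer. The `FailureLayer` and `tally_by_layer` names are hypothetical, and the numeric ordering simply follows the order in which the layers are listed here; it is an assumption, not something the dataset prescribes.

```python
from collections import Counter
from enum import IntEnum
from typing import Dict, Iterable


class FailureLayer(IntEnum):
    """Abstraction levels from the description (ordering is an assumption)."""
    SYNTAX = 1
    VALIDATION_SEMANTICS = 2
    EVENT_DRIVEN_PARADIGM = 3
    WORKFLOW_COMPOSITION = 4
    ECOSYSTEM_CONSTRAINTS = 5


def tally_by_layer(labels: Iterable[FailureLayer]) -> Dict[FailureLayer, int]:
    """Count failures per layer, reporting zero for layers with no failures."""
    counts = Counter(labels)
    return {layer: counts.get(layer, 0) for layer in FailureLayer}


# Hypothetical labels assigned to three failed model outputs.
labels = [
    FailureLayer.SYNTAX,
    FailureLayer.EVENT_DRIVEN_PARADIGM,
    FailureLayer.SYNTAX,
]
for layer, count in tally_by_layer(labels).items():
    print(f"{layer.name:<24}{count}")
```

Listing every layer, including those with a zero count, makes per-model breakdowns directly comparable.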

tests/test_counterfactual.py

haoranpb · 2026-03-18T10:06:00Z

You can mark the PR as a draft so it won't get merged accidentally.

…ter_thesis

…into master_thesis

add CFE dataset

e788fec

Jiawen-CS temporarily deployed to ado-read March 18, 2026 08:39 — with GitHub Actions Inactive

github-code-quality bot found potential problems Mar 18, 2026


tests/test_counterfactual.py (fixed)

Jiawen Sun added 2 commits March 18, 2026 10:24

make github action could validate cf-dataset

e3f5b30

update

d1c355c

Jiawen-CS temporarily deployed to ado-read March 18, 2026 09:25 — with GitHub Actions Inactive

Able to eidt git diff locally

27aa061

Jiawen-CS temporarily deployed to ado-read March 18, 2026 10:14 — with GitHub Actions Inactive

Add dataset

0b196db

Jiawen-CS marked this pull request as draft March 18, 2026 13:02

fix ruff error

0688398

Jiawen-CS temporarily deployed to ado-read March 18, 2026 13:11 — with GitHub Actions Inactive

Jiawen-CS temporarily deployed to ado-read March 18, 2026 13:17 — with GitHub Actions Inactive

Jiawen Sun added 3 commits March 18, 2026 15:57

Add dataset and fix test

7ea76d2

Change order

8584070

add dataset until 10 th

bf728d4

Jiawen-CS temporarily deployed to ado-read March 18, 2026 18:35 — with GitHub Actions Inactive

Jiawen-CS temporarily deployed to ado-read March 20, 2026 17:59 — with GitHub Actions Inactive

Jiawen-CS had a problem deploying to ado-read March 20, 2026 17:59 — with GitHub Actions Error

Jiawen-CS temporarily deployed to ado-read March 20, 2026 18:14 — with GitHub Actions Inactive

Add dataset from no.11-20

9e871ed

Jiawen-CS temporarily deployed to ado-read March 20, 2026 18:49 — with GitHub Actions Inactive

Jiawen Sun and others added 11 commits March 22, 2026 14:48

add dataset form 21 to 30

e1d174c

fix dataset issue

a53928b

Merge branch 'main' of https://github.com/microsoft/BC-Bench into mas…

5d0cc30

…ter_thesis

Add dataset form 31-40

cb1ad9a

Merge branch 'main' into master_thesis

6e47d79

Merge branch 'main' of https://github.com/microsoft/BC-Bench into mas…

bad3fc7

…ter_thesis

Add dataset from 41 to 45

9d0d03c

Merge branch 'main' into master_thesis

96302b4

Merge branch 'main' of https://github.com/microsoft/BC-Bench into mas…

4ce1a90

…ter_thesis

Merge branch 'master_thesis' of https://github.com/microsoft/BC-Bench …

4579a27

…into master_thesis

Add dataset from 46-50

277a732

Jiawen-CS wants to merge 22 commits into main from master_thesis


2 participants
