Match Me Up Before I Go-Go!—Estimating Matching Functions for VET Students in Spatially Connected Labor Markets
This repository contains the R scripts and data necessary to reproduce the analysis and results presented in the paper: "Match Me Up Before I Go-Go!—Estimating Matching Functions for VET Students in Spatially Connected Labor Markets" by Dennis Oliver Kubitza.
Please note that this paper is currently under review for publication. The code and methods presented in this repository should be considered preliminary and may be subject to change in the final version.
This paper explores the spatial factors that affect the matching process for Vocational Education and Training (VET) students in Germany. Local labor markets are interconnected, and job-matching is influenced by spillovers from neighboring markets. Using a Matching Model with spatially lagged stocks of unemployment and vacancies, the study analyzes efficiency, elasticities, and spatial spillovers across five professional groups.
The findings highlight the importance of accurate spatial modeling, showing that only indices based on realistic travel times or commuting data produce measurable spillovers. The results reveal professional heterogeneity in regional impacts on matching efficiency and find that better-connected regions exhibit a more balanced matching process.
The project is organized into the following main directories:
- ./config/: Contains a list of all locally installed packages, providing detailed control over the R environment.
- ./renv/: Contains the output of the renv package. This folder allows other users to restore and replicate the exact package environment used for this analysis, ensuring full reproducibility.
- ./DataPreparation/: Contains the initial data preparation script (1_DataPreparation.Rmd), raw input data (under Tables, Shapes, Wikidata), and the Google API script.
- ./DataBackups/: Stores pre-processed .Rdata files, including the crucial distance matrix data, to avoid re-running time-consuming scripts.
- ./Outputs/: The destination for all final outputs relevant to the paper, including figures, LaTeX tables, and regression results.
- ./: The root directory contains the four main analysis scripts (1_Analysis_AAB_based.Rmd to 4_Analysis_Profession_based.Rmd).
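Restoring the package environment from the renv folder could look like the minimal sketch below. It assumes the working directory is the repository root and that an renv.lock file sits there; this README does not confirm the lockfile location, and the helper name restore_environment is hypothetical. The function defaults to a dry run so that nothing is installed by accident.

```r
# Sketch only: restore the package versions recorded by renv.
# Assumption: an renv.lock file exists in the project root (not confirmed by the README).
restore_environment <- function(project = ".", dry_run = TRUE) {
  lockfile <- file.path(project, "renv.lock")
  if (!file.exists(lockfile)) {
    message("No renv.lock found in ", normalizePath(project, mustWork = FALSE))
    return(invisible(FALSE))
  }
  if (dry_run) {
    # Report what would happen without touching the library
    message("Would call renv::restore() using ", lockfile)
    return(invisible(TRUE))
  }
  # Install renv itself if it is not available yet
  if (!requireNamespace("renv", quietly = TRUE)) {
    install.packages("renv")
  }
  # Reinstall the packages listed in the lockfile
  renv::restore(project = project, prompt = FALSE)
  invisible(TRUE)
}
```

Calling restore_environment(dry_run = FALSE) from the repository root would then perform the actual restore.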
To keep the repository size manageable, not every file generated during the analysis (e.g., intermediate tables, all regression outputs) is included here. However, the complete HTML protocol, which shows the full process and all outputs, is available for viewing online.
You can view the main analysis protocol using the following links:
- Analysis Protocol (1_Analysis_AAB_based.html)
- Analysis Protocol (2_Analysis_Spillover_based.html)
- Analysis Protocol (3_Analysis_Best_combination.html)
- Analysis Protocol (4_Analysis_Profession_based.html)
Regression tables are published under Outputs/Regressions.
To reproduce the results, please follow the steps below. The analysis scripts are designed to be run sequentially.
First, you must initialize the data by running the main data preparation script.
- Navigate to the DataPreparation folder and execute 1_DataPreparation.Rmd.
All necessary input data for this script is provided within the DataPreparation/Tables, DataPreparation/Shapes, and DataPreparation/Wikidata subfolders.
Note: It is not necessary to re-run the 2_Collect_from_google.R script. The required travel time and distance data has been pre-fetched and is available in the DataBackups folder.
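To inspect the pre-fetched data before running anything, the backup files can be loaded into a separate environment. The following is a minimal sketch: the helper name load_backups is hypothetical, and because the README does not list the individual file names, it picks up every file ending in .Rdata (case-insensitive) in the given folder.

```r
# Sketch only: load all .Rdata backups into their own environment so that
# they do not overwrite objects in the global workspace.
load_backups <- function(backup_dir = "DataBackups") {
  files <- list.files(
    backup_dir,
    pattern = "\\.rdata$",
    ignore.case = TRUE,
    full.names = TRUE
  )
  env <- new.env()
  for (f in files) {
    # load() returns the names of the restored objects
    loaded <- load(f, envir = env)
    message(basename(f), ": ", paste(loaded, collapse = ", "))
  }
  env
}
```

For example, backups <- load_backups() followed by ls(backups) lists the restored objects.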
The main analysis is split into four R Markdown scripts located in the root directory. They should be executed consecutively. A critical preliminary step involves preparing the data for each professional group using the 1_Analysis_AAB_based.Rmd script.
I recommend the following order of execution:
- Run 1_Analysis_AAB_based.Rmd multiple times. This script must be configured and run for each of the six professional groups (including "all professions").
  - Before each run, open 1_Analysis_AAB_based.Rmd and change the selected profession group in the configuration section (around line 6).
  - You must run this for the "all professions" category before proceeding to scripts 2 and 3.
  - To run script 4, you must first execute script 1 additionally for all five specific profession groups.
- Run 2_Analysis_Spillover_based.Rmd. This script uses the output generated for "all professions."
- Run 3_Analysis_Best_combination.Rmd. This also uses the output generated for "all professions."
- Run 4_Analysis_Profession_based.Rmd. This script requires the outputs from all five specific profession groups generated in step 1.
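Once script 1 has been run for every profession group, the remaining scripts can be knitted in order from the R console. The sketch below is one way to do this, assuming the rmarkdown package is installed and the working directory is the repository root. The helper name run_remaining_scripts is hypothetical, and the file names follow the naming pattern described in this README, so adjust them if the files on disk differ. Script 1 is left out on purpose, because its profession group is selected by editing the file itself.

```r
# Sketch only: knit scripts 2 to 4 in the required order.
analysis_scripts <- c(
  "2_Analysis_Spillover_based.Rmd",
  "3_Analysis_Best_combination.Rmd",
  "4_Analysis_Profession_based.Rmd"
)

run_remaining_scripts <- function(root = ".",
                                  scripts = analysis_scripts,
                                  dry_run = TRUE) {
  paths <- file.path(root, scripts)
  # Fail early if any script is missing, before anything is knitted
  missing_paths <- paths[!file.exists(paths)]
  if (length(missing_paths) > 0) {
    stop("Script(s) not found: ", paste(missing_paths, collapse = ", "))
  }
  if (!dry_run) {
    for (p in paths) {
      # Knit each script in a fresh environment to avoid leftover objects
      rmarkdown::render(p, envir = new.env(parent = globalenv()))
    }
  }
  # Return the scripts in execution order
  invisible(paths)
}
```

With the default dry_run = TRUE the function only checks that the files exist; run_remaining_scripts(dry_run = FALSE) knits them.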
All outputs that are relevant for the paper (figures, tables, regression summaries) will be saved to the ./Outputs directory. Intermediate results and protocols can be viewed in the HTML files generated by knitting the R Markdown scripts.
If you use this code or the findings from the paper, please cite:
@article{kubitza2025matchmeup,
title = {Match Me Up Before I Go-Go!---Estimating Matching Functions for VET Students in Spatially Connected Labor Markets},
author = {Kubitza, Dennis Oliver},
year = {2025},
journal = {Journal Title}
}
For any questions regarding the code or the paper, please contact:
- Dennis Oliver Kubitza - d.kubitza@maastrichtuniversity.nl