Bac Data

This is the reference page for the Bac Data project containing links to the data itself, analyses that use this data and relevant discussions on the topic. Feel free to comment on the topic or to download and use any of the data provided. If you do find the data useful, please cite this page as the source, so that others can also find and use the resources here.

Background

The “Bac” (short for Bacalaureat) is a yearly national exam for Romanian high school graduates. Although the data should be publicly available for any use or purpose, the Romanian Ministry of Education and its contractor, the SIVECO company, actively hinder any use of such data by publishing it in formats that do not allow direct access to the complete data set. Moreover, the formats used for publishing the data change from year to year and there is no support for analysts who need access to a clean and complete data set.

Given the above issues, I’ve written code to mine the data from the Ministry’s website and I’ve made all the Bac data (from 2006 to 2015) available in a useful format for anybody interested. Moreover, I’ve run a few analyses and published the results on my blog (in Romanian).

Data

The results data contain the list of candidates for each year (first exam call) and are provided in .zip archives with the MD5 checksum provided below for each file. Currently the data covers years 2006 – 2015:

bac data_2006: 49d7a8f28507ff6e81cc4bb116e874d4

bac data_2007: aca4382b9189f104d43745b32c11746d

bac data_2008: f5846b68b27de00f068b78aa43be54ec

bac data_2009: e4457f1403a9ba0bfa1bbdba6e48b049

bac data_2010: fc4774a8f3df1357a6000b7c3bbc3b62

bac data_2011: 57a971f735e62b7525c40fb80c20559a

bac data_2012: df026420e3851fc672dfcc4d11c54f87

bac_data_2013: bbeb10637f5a6183f90214c65c039500

Updated (thanks to Andrei Filip for running the scripts for 2014 and sending me the file):
bac_data_2014: 653cb7accb52f5393382158f03567ca1

Updated 05/03/2015 (upon request for data from the 2nd exam session in autumn):

bac_data_2014_autumn_exams: 6F0BE97368E5B6F393CFA0BF3055D6E7

Updated 31/07/2015 (thanks to Gabriel Kreindler, who sent me the cleaned data for 2015):

bac_data_2015: da17176824e849776d2f15e90d2ebdef
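To check that a downloaded archive matches its listed MD5 checksum, something along these lines should work. This is only a minimal sketch: the local file name bac_data_2015.zip is an assumption (use whatever name your download has), and the helper names md5_of_file and matches_checksum are made up for illustration. The comparison ignores letter case, since one of the checksums above is listed in upper case.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large archives do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare the file's MD5 with `expected`, ignoring case and spaces."""
    return md5_of_file(path) == expected.strip().lower()


if __name__ == "__main__":
    # Hypothetical local file name: replace with the archive you downloaded.
    archive = Path("bac_data_2015.zip")
    # Checksum listed on this page for the 2015 archive.
    expected = "da17176824e849776d2f15e90d2ebdef"
    if archive.exists():
        print("OK" if matches_checksum(archive, expected) else "MISMATCH")
    else:
        print(f"{archive} not found")
```

A mismatch usually means an incomplete or corrupted download, so fetching the file again is the first thing to try.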

 

As others contacted me asking for additional data sets related to the main Bac data, I’ve also mined and published data related to the grading centres for the exam, from 2006 to 2015 (thanks, Gabriel Kreindler!):

centres 2015:  4e96a9b94b3d8f24d6f44d8d0c8a0ed7

cleaned dataset centres 2006-2014: eb0996d48005b905859e6d9ee62fa19d

centre 2006 : c622dd2437e8dc50cad2c61a465d79e0

centre 2007 : b3adae3fc41c07759a651a5060e8c68a

centre 2008 : 12f472cc43af776cdfe1f7252ae144dc

centre 2009 : 2d524bc5cbc1308c6889df37ecffb430

centre 2010 : 2b2a432bc53921f4479ded67ed366ddf

centre 2011 : cc81b27dde5736b786fb3a0600169240

centre 2012 : cd6f1e768153efb1c658a6e145309758

Please note that these data sets were cleaned only superficially and hence errors or mismatches might still be present. (I cleaned basically what could be cleaned automatically, but a thorough cleaning would require some manual input too.)

Analyses

In 2011, when I first started mining this data, I ran a few analyses of it and published the results on my blog (in Romanian).

Several other people used the data I provided and published their own analyses and results. I’ve published (in Romanian) a summary of the main analyses of this data set, indicating also the people involved.

More recently, I found out about a project by researchers from the University of Gothenburg who use the Bac data to investigate “the consequences of monitoring on corruption and the education opportunities for the poor” (Oana Borcan, Andreea Mitrut and Mikael Lindahl).

Citation:

If you use the above data, please reference this page as the source. Here is an example:

Coman, D. (2013). Bac Data [online] Available at: <http://dianacoman.com/bac-data>

Your Input

If you use or plan to use Bac data, I am interested in hearing about your project. If there is some additional data you need, feel free to leave a message below and I might be able to help you.

Any other related comments and suggestions are welcome.
