Data Sets
This page gives an overview of the two families of data you can select for labs 5-8. Each family contains several different data sets from the same instrument.
HERA data
The HERA data are in HDF5 files (this is a good thing). Unfortunately, they are too large for Canvas to host, so here are Dropbox links to the three files in this family:
LHC data
The LHC data have two files per set. If you are working in Python, it is best to read them using the pickle module (pickle itself ships with Python, but you may need to add the libraries that the pickled objects depend on to your environment).
Set 1: higgs_100000_pt_250_500.pkl; qcd_100000_pt_250_500.pkl
Set 2: higgs_100000_pt_1000_1200.pkl; qcd_100000_pt_1000_1200.pkl
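As a minimal sketch of reading these files with pickle: the helper name `load_pickle` is hypothetical, and the code assumes the Set 1 files have been saved in the current working directory. The type of the stored object is not stated on this page, so the sketch only prints it rather than assuming a particular structure.

```python
import pickle
from pathlib import Path


def load_pickle(path):
    """Return the object stored in a pickle file (hypothetical helper)."""
    # Pickle files are binary, so they must be opened in "rb" mode.
    with open(path, "rb") as f:
        return pickle.load(f)


# File names are the Set 1 names listed above; the location is an assumption.
for name in ("higgs_100000_pt_250_500.pkl", "qcd_100000_pt_250_500.pkl"):
    if Path(name).exists():
        data = load_pickle(name)
        # Inspect the type first, since the stored structure is not documented here.
        print(name, type(data))
    else:
        print(name, "not found in the current directory")
```

Unpickling can run arbitrary code, so only load pickle files from a source you trust. If the files hold objects from a third-party library, that library has to be installed before `pickle.load` will succeed.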
If you have issues reading these data because of the pickle version, here are HDF5 versions of the files:
Set 1: higgs_100000_pt_250_500.h5; qcd_100000_pt_250_500.h5
Set 2: higgs_100000_pt_1000_1200.h5; qcd_100000_pt_1000_1200.h5
One disadvantage of loading the HDF5 versions is that the data attributes are not labelled. The rows will be in the same order as described in the document.
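One possible way to work with the unlabelled HDF5 files is sketched below. It assumes the third-party h5py package is installed. Both function names (`list_datasets`, `label_rows`) are hypothetical, and the internal layout of the .h5 files is not described on this page, so the first helper only lists what a file contains. The second helper attaches names to rows by position. The names themselves have to come from the document that describes the row order, and they are not filled in here.

```python
def list_datasets(path):
    """Return {dataset_name: (shape, dtype)} for every dataset in an HDF5 file."""
    import h5py  # third-party; imported here so label_rows works without it

    found = {}

    def record(name, obj):
        # visititems calls this for every group and dataset in the file.
        if isinstance(obj, h5py.Dataset):
            found[name] = (obj.shape, str(obj.dtype))

    with h5py.File(path, "r") as f:
        f.visititems(record)
    return found


def label_rows(rows, names):
    """Pair each row with a name, in order, and fail if the counts differ."""
    rows = list(rows)
    if len(rows) != len(names):
        raise ValueError(f"expected {len(names)} rows, got {len(rows)}")
    return dict(zip(names, rows))
```

A typical use would be to call `list_datasets("higgs_100000_pt_250_500.h5")` to find the dataset name, read that dataset, and then pass its rows to `label_rows` together with the attribute names copied from the description document. The length check is there so that a mismatch between the names and the data is caught immediately instead of silently mislabelling rows.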