Fitting multiple objects: fit_catalogue

This section describes fitting a catalogue of objects with the same model using the fit_catalogue class. See the sixth IPython notebook example for a quick-start guide.

API documentation: fit_catalogue

class bagpipes.fit_catalogue(IDs, fit_instructions, load_data, spectrum_exists=True, photometry_exists=True, make_plots=False, cat_filt_list=None, vary_filt_list=False, redshifts=None, redshift_sigma=0.0, run='.', analysis_function=None, time_calls=False, n_posterior=500, full_catalogue=False, load_indices=None, index_list=None, track_backlog=False)

Fit a model to a catalogue of galaxies.

Parameters:
  • IDs (list) – A list of ID numbers for galaxies in the catalogue.

  • fit_instructions (dict) – A dictionary containing the details of the model to be fitted to the data.

  • load_data (function) – Function which takes ID as an argument and returns the observed spectrum and photometry for that object. Spectrum should come first and be an array with a column of wavelengths in Angstroms, a column of fluxes in erg/s/cm^2/A and a column of flux errors in the same units. Photometry should come second and be an array with a column of fluxes in microjanskys and a column of flux errors in the same units.

  • spectrum_exists (bool - optional) – If the objects do not have spectroscopic data set this to False. In this case, load_data should only return photometry.

  • photometry_exists (bool - optional) – If the objects do not have photometric data set this to False. In this case, load_data should only return a spectrum.

  • run (string - optional) – The subfolder into which outputs will be saved, useful e.g. for fitting more than one model configuration to the same data.

  • make_plots (bool - optional) – Whether to make output plots for each object.

  • cat_filt_list (list - optional) – The filt_list, or list of filt_lists for the catalogue.

  • vary_filt_list (bool - optional) – If True, changes the filter list for each object. When True, each entry in cat_filt_list is expected to be a different filt_list corresponding to each object in the catalogue.

  • redshifts (list - optional) – List of redshift values, one per object, to which the redshift of each object will be fixed.

  • redshift_sigma (float - optional) – If this is set, the redshift for each object will be assigned a Gaussian prior centred on the value in redshifts with this standard deviation. Hard limits will be placed at 3 sigma.

  • analysis_function (function - optional) – A function to be run on each completed fit. It must take the fit object as its only argument.

  • time_calls (bool - optional) – Whether to print information on the average time taken for likelihood calls.

  • n_posterior (int - optional) – How many equally weighted samples should be generated from the posterior once fitting is complete for each object. Default 500.

  • full_catalogue (bool - optional) – If True, adds minimum chi-squared values and rest-frame UVJ magnitudes to the output catalogue. This takes extra time. Default False.
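
As a minimal sketch of the load_data contract for a photometry-only catalogue, the example below returns a two-column array of fluxes and flux errors in microjanskys for each ID. The in-memory _PHOT table, the helper names redshift_limits and build_fit, the run name and the redshift_sigma value are illustrative assumptions, not part of bagpipes. fit_instructions and filt_list are assumed to be defined elsewhere.

```python
import numpy as np

# Hypothetical in-memory photometry: ID -> (fluxes, errors) in microjanskys.
# In practice these values would be read from your own catalogue file.
_PHOT = {
    "1": ([0.52, 0.81, 1.10], [0.05, 0.06, 0.08]),
    "2": ([2.30, 2.95, 3.40], [0.11, 0.12, 0.15]),
}


def load_data(ID):
    """Return photometry for one object as an (n_bands, 2) array.

    Column 0 holds fluxes and column 1 holds flux errors, both in
    microjanskys. Only photometry is returned, so the catalogue fit
    must be created with spectrum_exists=False.
    """
    fluxes, errors = _PHOT[str(ID)]
    return np.column_stack([fluxes, errors])


def redshift_limits(z, sigma):
    """Hard limits of the Gaussian redshift prior, 3 sigma either side of z."""
    return z - 3.0 * sigma, z + 3.0 * sigma


def build_fit(IDs, fit_instructions, filt_list, redshifts):
    """Set up a photometry-only catalogue fit (illustrative helper)."""
    import bagpipes as pipes  # imported here so load_data works without it

    return pipes.fit_catalogue(IDs, fit_instructions, load_data,
                               spectrum_exists=False,
                               cat_filt_list=filt_list,
                               redshifts=redshifts,
                               redshift_sigma=0.05,  # assumed value
                               run="phot_only")      # assumed run name
```

With redshift_sigma=0.05, an object with an input redshift of 1.0 would have hard prior limits of roughly 0.85 and 1.15, as given by redshift_limits(1.0, 0.05).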

fit(verbose=False, n_live=400, mpi_serial=False, track_backlog=False, sampler='multinest', pool=1)

Run through the catalogue fitting each object.

Parameters:
  • verbose (bool - optional) – Set to True to get progress updates from the sampler.

  • n_live (int - optional) – Number of live points. Reducing this speeds up the fit but may lead to unreliable results.

  • mpi_serial (bool - optional) – When running through mpirun/mpiexec, the default behaviour is to fit one object at a time, using all available cores. When mpi_serial=True, each core will fit different objects.

  • track_backlog (bool - optional) – When using mpi_serial, report the number of objects waiting to be added to the catalogue by the “zero” core that compiles results from all the others. High numbers mean cores are waiting around doing nothing.

Saving of output catalogues

fit_catalogue will generate an output catalogue of posterior percentiles for all fit parameters plus some basic derived parameters. This is saved in the pipes/cats folder as <run>.fits.
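
The output catalogue can be loaded with any FITS table reader. The sketch below builds the path described above and reads the table with astropy, which is an assumption here rather than a stated requirement. The helper names catalogue_path and read_catalogue are hypothetical.

```python
import os


def catalogue_path(run, base="pipes"):
    """Path of the output catalogue for a given run name."""
    return os.path.join(base, "cats", run + ".fits")


def read_catalogue(run):
    """Load the output catalogue as a table (assumes astropy is installed)."""
    from astropy.table import Table

    return Table.read(catalogue_path(run))
```

After a fit has completed, read_catalogue("phot_only").colnames lists the columns that were written for a run named "phot_only".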

Parallelisation

Bagpipes supports MPI parallelisation through the Python package mpi4py. Both fit and fit_catalogue can be run under MPI by launching the script with mpirun or mpiexec, e.g. mpirun -n nproc python fit_with_bagpipes.py, where nproc is the number of processes. The default behaviour is to fit one object at a time using all available cores, which is useful for complicated models (e.g. fitting spectroscopy).

For catalogue fitting, an alternative approach is also available in which multiple objects are fitted at once, each using one core. This is activated by passing mpi_serial=True to the fit method of fit_catalogue. It is better suited to fitting relatively simple models to large catalogues of photometry, and can readily be scaled up to catalogues of tens to hundreds of thousands of objects using ~100 cores on a computing cluster.
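
The two modes differ only in the keyword arguments passed to fit. The sketch below wraps that choice in a hypothetical helper, run_fit, which takes an already constructed fit_catalogue object. The helper and its argument names are assumptions for illustration.

```python
# Launch under MPI, for example:
#   mpirun -n nproc python fit_with_bagpipes.py


def run_fit(cat_fit, one_object_per_core=True):
    """Start a catalogue fit in the chosen MPI mode.

    one_object_per_core=True  -> mpi_serial=True, each core fits
                                 different objects.
    one_object_per_core=False -> default behaviour, one object at a
                                 time using all available cores.
    """
    cat_fit.fit(verbose=False,
                mpi_serial=one_object_per_core,
                # The backlog report only applies when using mpi_serial.
                track_backlog=one_object_per_core)
```

Keeping the mode in a single argument makes it easy to run the same script with all cores per object for spectroscopic fits and with one object per core for large photometric catalogues.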

This feature no longer requires a special distribution of pymultinest, and will work with bagpipes >= v0.8.5 using normal distributions of pymultinest >= v2.11.