parser.add_option("--inj_file",default=None,action="store",type="string",help="""Injection file path leading to the `.ini` file specifying the parameters of the injection.""")
parser.add_option("--inj_file",default=None,action="store",type="string",help="""[!BROKEN!] Injection file path leading to the `.ini` file specifying the parameters of the injection.""")
parser.add_option("--event",default=None,action="store",type="string",help="""Event label part of the catalog GWTC-1""")
parser.add_option("--trigger_file",default=None,action="store",type="string",help="""File .ini describing the parameters where to start the chain. If left to None and --event used, a default file corresponding to the event will be used.""")
parser.add_option("--config_file",default='../Codes/config_default.ini',action="store",type="string",help="""Configuration file path leading to the `.ini` file specifying some key options to run the HMC. The default file is '../Codes/config_default.ini'. Options set in the command line which are redundant with that of the config.ini will (should...) overwrite them.""")
parser.add_option("--outdir",default=None,action="store",type="string",help="""Output directory where results and plots will be stored. If not specified, it will be created automatically in the current working directory under ./outdir""")
parser.add_option("--trigger_file",default=None,action="store",type="string",help="""File .ini describing the parameter values where to start the chain. If left to None and --event used, a default file corresponding to the event will be used.""")
parser.add_option("--config_file",default='../Codes/config_default.ini',action="store",type="string",help="""Configuration file path leading to the `.ini` file specifying some key options to run the HMC. The default file is '../Codes/config_default.ini'. Options set in the command line which are redundant with that of the config.ini will (should...) overwrite them. Note that this config file must contain the path to a `prior_file` and to a `config_hmc_parameters_file.json`. The latter essentially defines the `search_parameter_keys` of the run.""")
parser.add_option("--outdir",default=None,action="store",type="string",help="""[!BROKEN!] Output directory where results and plots will be stored. If not specified, it will be created automatically in the current working directory under ./outdir""")
parser.add_option("-v","--verbose",default=True,action="store_true",help="""Print out information of interest during the run.""",dest='verbose')
parser.add_option("-v","--verbose",default=True,action="store_true",help="""[!BROKEN!] Print out information of interest during the run.""",dest='verbose')
parser.add_option("--plot_traj",default=False,action="store_true",help="""Will save positions, momentas, hamiltonian values etc of every point along every trajectory in phase to be able to plot them with `plot_trajectories.py` script.""")
parser.add_option("--sub_dir",default=None,action="store",type="string",help="""Optional sub-directory that will be appended at the endto the default output directory such that the final output directory is: '/default-output-dir/sub_dir/'""")
parser.add_option("--sub_dir",default=None,action="store",type="string",help="""Optional sub-directory that will be appended at the endto the default output directory such that the final output directory is: '/default-output-dir/sub_dir/'""")
parser.add_option("--fit_method",default='dnn',action="store",type="string",help="""Method used to fit the numerical gradients of the log likelihood accumulated in phase1. """)
parser.add_option("--no_seed",default=False,action="store_true",help="""New noise realisation from PSDs is generated""")
parser.add_option("--on_CIT",default=False,action="store_true",help="""Set this to True if running this script on Caltech's cluster. Will set the right path for the ROQ basis.""")
parser.add_option("--on_CIT",default=False,action="store_true",help="""[!BROKEN!] Set this to True if running this script on Caltech's cluster. Will set the right path for the ROQ basis.""")
cut.logger.info(f'Options for the sampler are: {sampler_dict_formatted}')
# SET THE OUTPUT DIRECTORY WHERE OUTPUT FILES WILL BE SAVED
# The typical the structure is set as `'../__output_data/[data_name]/[search_parameter_nbs]/[psd_option]/[config_options]/[approximant_name]/`; cf doc string of the function.
# Retrieve the PSDs of the ifos from the psd_file given in the config_file.
# For GWTC1 events, PSDs are available on DCC and formated for each event as a `GWTC1_{opts.event}_PSDs.dat` file, hence the single_file possibility.
# [TO BE MODIFIED]: so I made a shortcut if they are stored correctly under f'../__input_data/{opts.event}/GWTC1_{opts.event}_PSDs.dat' but this should be simply removed and `gwtc1_psds_file` should not exist.
cut.logger.info(f'No PSD path filled in options `psds_single_file` or `psd_dict` or could not be found. PSDs will therefore be computed from the strain files.')
# Retrieve the names of the .hdf5 files in the input directory of the event.
# [TO BE MODIFIED]: path of input strain files should be set in the configuration files and not half hard-coded as they are bellow...
# The minimun frequency for each interferometer needs to be set if we want the strain derived from the psd to start being non zero at that frequency and not at the default one which is 20Hz
# if type(ext_analysis_dict['minimum_frequency']) == str:
First, this function is clearly messy and needs some non-negligible re-writing... The idea is essentially to retrieve from the input parameters the GPS start_time of the analysis and the duration of the segment. Then some checks are performed, such as on the sampling frequency; the interferometers names are put in a list etc...
Parameters
----------
mass_1: float
Mass, in solar mass units, of the first component of the binary.
mass_2: float
Mass, in solar mass units, of the second component of the binary.
geocent_time: float
GPS time, as measured at the center of the earth, of the merger.
chirp_mass: float
Chirp mass of the binary.
analysis_kwargs: dictionary
Additional options set in the '[analysis]' section of the config.ini file. The dictionary must contain the keys ['approximant', 'roq', 'minimum_frequency_ifos', 'ifos', 'reference_frequency', 'maximum_frequency',
Returns
-------
ext_analysis_dict: dictionary
Dictionary containing most importantly the duration of the segment analyzed and `tc_3p5PN` which is the duration to coalescence, and the GPS start_time.
Takes the dictionary containing the list of search parameter keys and updates the key `geocent_time` to `geocent_duration` since it will be the duration used by the HMC and not GPS time for numerical precision reasons when computing numerical gradients on that parameter.
Parameters
----------
sampler_dict: dictionary
Dictionary which must contain the keys `search_parameter_keys` (which is the array describing the parameters of the search), `trajectory_functions` (like 'log' for the keys `chirp_mass` and `reduced_mass` or `cosine` for `theta_jn`) and `parameters_offsets` (offsets used for each key when computing numerical gradients, typically 1e-7).
Returns
-------
sampler_dict: dictionary
Returns the input dictionary where the search_parameter_key `geocent_time` has been updated to `geocent_duration`.
"""
cut.logger.info('As usual, for numerical precision reasons for its gradients, the HMC samples in geocent_duration instead of the GPS geocent_time and converts back to it afterwards.')
List of the keys defining the parameters of the search.
config_dict: dictionary
Dict containing some key options for the run such as the total number of trajectories in phase 1
sub_dir: string, optional
Optional string which will be appended at the end of the output_dir string built from the previous inputs.
Returns
-------
outdir: string
The path of the directory where output files will be stored, note this string ends with a slash `/`.
Notes
-----
At the moment `outdir` is hard coded to start as `outdir = '../__output_data/[inj_name]/... `.
The end of the string is built to have the structure `'../__output_data/[inj_name]/[search_parameter_nbs]/[psd_option]/[config_options]/[approximant_name]/`.
- [search_parameter_nbs] is built from `search_parameter_keys` and assigns a predefined integer to each key to built a string.
- [psd_option]: should be removed, it was there to distinguish runs using fake PSDs vs PSDs estimated from the strain vs PSDs read from input files.
Formats a dictionary such that the outputted string will be printed with the nested structure.
Parameters:
indent: int, default=3
number of spaces used to indent the nested dictionaries
align_keys: boolean, default=True
aligns vertically the printed values
decimal_format: int, default=None
number of decimals to print for numbers in the dictionary using the format {:.xf} where x is `decimal_format`. If left to None not format will be applied.
Parameters
----------
indent: int, default=3
Number of spaces used to indent the nested dictionaries
align_keys: boolean, default=True
Aligns vertically the printed values
decimal_format: int, default=None
Number of decimals to print for numbers in the dictionary using the format {:.xf} where x is `decimal_format`. If left to None not format will be applied.
Returns
-------
string_to_print: string
String to be printed with line breaks that follow the nested structure of the input dictionary.
# 1: sets the PSDs from official GWTC-1 open PSD data; 2: computes PSDs using Welch methods from gwosc open strain data; 3: uses bilby's analytical pre-stored PSD files.
- The `--event` option defines the name of the event and will be used to find input files in the directory `local_path_to/gwhmc/__input_data/GW170817/`, knowing that `main.py` is stored in `/gwhmc/main.py`.
## Configuration files
The current way config files work is unfortunately a bit complicated and confusing, it should be simplified in the long term.
### The `config.ini` file
- Path given in the `--config_file` option of the bash `python main.py` command line.
- Create a `config.ini` file, following the example of `gwhmc/examples/config_files/config_GW170817_IMRPhenomD_NRTidal.ini`: it must contain 3 sections named `[analysis]`, `[parameters]` and `[hmc]`:
```ini
[analysis]
# Interferometers to run the analysis on; must be separated by a comma.
ifos=H1,L1,V1
approximant=IMRPhenomD_NRTidal
# Names of the ifo in the dictionary here must match those defined above.
# 1: sets the PSDs from official GWTC-1 open PSD data; 2: computes PSDs using Welch methods from gwosc open strain data; 3: uses bilby's analytical pre-stored PSD files.
# Number of trajectories using numericalgradients in phase 1.
n_traj_fit=1500
# Number of iterations if phase1 loop to go over for this run. This permits to split phase1 into several chunks. If set to 0, then the value will be set to that of n_traj_fit
n_traj_for_this_run=300000
# Number of points used to sub-select points in Ordered Look-Up Tables for local fit bimodal parameters like (cos(inc), psi, logD)
n_fit_1=2000
# Number of points sub-selected from the above n1 ones to perform a linera fit
n_fit_2=200
# Length of numerical trajectories during phase1
length_num_traj=200
# Stepsize between steps of trajectories used for phase1.
epsilon0=5e-3
# Python file to configure your search parameter space
# Number of statistically independant samples desired.
n_sis=5000
```
- PSDs can either be set from a single file containing the PSDs of each ifos. They are stored on DCC in such files for GWTC1 events. In that case use the `psds_single_file =` option.
- If each PSD is stored in one file then use the `psd_dict = ` option to define each path.
- If none of the above, then PSDs are estimated strom the strain data using the Welch method.
- The `psd = 1` (or 2 or 3) option was used in case of injections but it certainly broken now.
-`n_fit_1` and `n_fit_2` should be removed from the code since we use the DNN now and not OLUTs.
### The `prior.prior` file
- Its path is defined in the `[parameters]` section of the above `config.ini` file after the key `prior_file =`.
- At the moment some examples of prior files can be found in... ...`/gwhmc/gw/prior_files`.
- They are handled as Bilby does since they defines Bilby prior functions and look like:
```
# Example of priors for GW170817 using IMRPhenomD_NRTidal
- Each line will be translated into the corresponding Bilby prior function, just as if we had written python code in this text file.
- The `boundary='reflective'` argument is not used in DeepHMC since we handle boundaries differently.
- Similar to what is done in Bilby, the prior on `geocent_time` is set internally by DeepHMC to be flat at `+/- 0.1` around the merger time.
### The `config_params.json` file
- Path defined in the `[hmc]` section of the `config.ini` file under the key `config_hmc_parameters_file = `.
- The information it contains could probably be directly defined in the `config.ini` file, which would be simpler, by writing some "string-dictionary" (just like for the `minimum_frequency_ifos = {H1: 20.0, L1: 30.0, V1: 20.0}`) which would then be converted into proper dictionaries using the `core.utils.convert_string_to_dict` method, but at the time I did not know about that handy bilby_pipe function so I used a json file to handle the nested structure of the information.
- So this json file typically looks like:
```json
{
"search_parameter_keys":[
"theta_jn",
"psi",
"luminosity_distance",
"chirp_mass",
"reduced_mass",
"dec",
"ra",
"geocent_time",
"chi_1",
"chi_2",
"lambda_1",
"lambda_2"
],
"trajectory_functions":{
"theta_jn":"cosine",
"psi":"None",
"luminosity_distance":"log",
"chirp_mass":"log",
"reduced_mass":"log",
"dec":"sine",
"ra":"None",
"geocent_time":"log",
"chi_1":"None",
"chi_2":"None",
"lambda_1":"None",
"lambda_2":"None"
},
"search_parameter_keys_local_fit":[
"theta_jn",
"psi",
"luminosity_distance"
],
"parameters_offsets":{
"theta_jn":1e-7,
"phase":1e-7,
"psi":1e-7,
"luminosity_distance":1e-7,
"chirp_mass":1e-7,
"reduced_mass":1e-7,
"dec":1e-7,
"ra":1e-7,
"geocent_time":1e-7
}
}
```
- Most importantly it defines the search parameter space of the analysis, so here 12 dimensional and implicitely marginalizing over phase at coalescence (which option must therefore be set to `True` in the `config.ini` file).
-`geocent_time` will automatically be converted internally into `geocent_duration` because otherwise the HMC suffers from numerical precision issues when computing numerical gradients on a GPS time. But final results are then converted back to `geocent_time` by simply adding the `start_time` of the analysis.
- Each search parameter can have a "trajectory function" which defines the space in which trajectories are computed. Priors, defined for instance on `luminosity_distance` and not on `log(luminosity_distance)`, automatically adapt to the trajectory functions used. If set to `None` or not set in this file then the identity function is used.
-`search_parameter_keys_local_fit` made sense for the OLUTs but should be removed from the code now (removing the key directly in the file might break the code so careful...).
-`parameters_offsets` defines the offsets which will be used when computing numerical differenciation. If not specified for a parameter, the default value will be `1e-7`.
- Note that priors can be defined on component masses even though we sample in `log(chirp_mass)` and `log(reduced_mass)`.
- Note that if not marginalizing over the phase at coalescence, then the key `phase` refers to the *orbital* phase, to be consistent with Bilby, and not the phase at coalescence.
### The `trigger.ini` file
- Path given in the `--trigger_file` option of the bash `python main.py` command line.
- It defines the parameter values of the starting point of the analysis.
- For GW170817 using IMRPhenomD_NRTidal and marginalizing over the phase at coalescence, it can look like
```ini
[parameters]
mass_1=1.72592086595
mass_2=1.10730509114
luminosity_distance=21.1702759705
theta_jn=2.04318308535
psi=1.61012417485
geocent_time=1187008882.43
ra=3.44616
dec=-0.408084
chi_1=0.027842426346
chi_2=0.0278262466181
lambda_1=130.651548464
lambda_2=210.695175639
```
-`chirp_mas`, `total_mass`, `symmetric_mass_ratio`, `mass_ratio` and `reduced_mass` are automatically computed from the two component masses, but not the other way around, hence one must specify `mass_1` and `mass_2` in this file.
## Input `.hdf5` strain files
- The `--event` option of the bash `python main.py` command line defines the name of the event and will be used to find input files in the directory `local_path_to/gwhmc/__input_data/[event_name]/`, knowing that `main.py` is stored in `/gwhmc/main.py`. But note that at the moment is it hard-coded to look in the relative path `../__input_data/[event_name]`. This behavior should of course be modified...
- Since at the moment the path for each strain file is not specified in any configuration file, the code lists every `.hdf5` file present in the directory `../__input_data/[event_name]/` assuming they all refer to one strain file per interferometer (for some reason I hard-coded in `main.py` that for GW170817 it looks into `../__input_data/GW170817/LOSC/` ...).
- Then to differentiate between each ifo it is assumed that each file starts with a consistent prefix: `H-H1_[...].hdf5` or `L-L1_[...].hdf5` or `V-V1_[...].hdf5`.
- These files can be downloaded from the GWOSC website.
- If the sampling frequency defined in `config.ini` is higher than that used in the `.hdf5` file, an error is thrown, if smaller then the strain data is automatically down-sampled accordingly by keeping only 1 strain sample every `N` where `N = int(sampling_frequency_from_strain_file / sampling_frequency_from_config_file)`.