Command-line documentation and usage of GhostKnockoffGWAS

Usage

Simple run

GhostKnockoffGWAS --zfile example_zfile.txt --LD-files EUR --N 506200 --genome-build 38 --out example_output

Required inputs

| Option name | Argument | Description |
|---|---|---|
| --zfile | String | Input file containing Z-scores as well as CHR/POS/REF/ALT. See Acceptable Z-score files for the detailed requirements on this file. |
| --LD-files | String | Input directory containing the pre-processed LD files. Most users download these from the Downloads Page. |
| --N | Int | Sample size of the target (original) study |
| --genome-build | Int | The human genome build used for SNP positions in zfile (this value must be 19 or 38) |
| --out | String | Output file name (without extension) |
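
The constraints in the table above can be checked before launching a run. The following is a minimal sketch, not part of GhostKnockoffGWAS itself: the function name print_gkg_command is hypothetical, and requiring --N to be a positive whole number is an assumption based on its Int type. It validates the values and prints the command line without running it.

```shell
# print_gkg_command ZFILE LD_DIR N BUILD OUT   (hypothetical helper)
# Validate the five required values, then print the command without running it.
# Paths containing spaces would need extra quoting before the line is reused.
print_gkg_command() {
  zfile=$1; ld_dir=$2; n=$3; build=$4; out=$5

  # --genome-build must be 19 or 38
  case "$build" in
    19|38) ;;
    *) echo "genome build must be 19 or 38, got: $build" >&2; return 1 ;;
  esac

  # --N is an Int; assumed here to be a positive whole number
  case "$n" in
    ''|*[!0-9]*|0) echo "N must be a positive integer, got: $n" >&2; return 1 ;;
  esac

  echo "GhostKnockoffGWAS --zfile $zfile --LD-files $ld_dir --N $n --genome-build $build --out $out"
}

# Dry run with the values from the simple example
print_gkg_command example_zfile.txt EUR 506200 38 example_output
```

Printing rather than executing keeps the check safe to run on a machine where the LD files are not yet downloaded; the printed line can be pasted into the terminal once it looks right.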

Optional inputs

| Option name | Argument | Description |
|---|---|---|
| --CHR | Int | The column in zfile that will be read as the chromosome number. Values must be integers; chr22, X, chrX, etc. are NOT acceptable. [If not specified, we will search for a column with header CHR] |
| --POS | Int | The column in zfile that will be read as the SNP position. [If not specified, we will search for a column with header POS] |
| --REF | Int | The column in zfile that will be read as the REF (non-effective) allele. [If not specified, we will search for a column with header REF] |
| --ALT | Int | The column in zfile that will be read as the ALT (effective) allele. [If not specified, we will search for a column with header ALT] |
| --Z | Int | The column in zfile that will be read as Z-scores. [If not specified, we will search for a column with header Z] |
| --seed | Int | Sets the random seed. [If not specified, defaults to 2023] |
| --verbose | Bool | Whether to print intermediate messages. [If not specified, defaults to true] |
| --random-shuffle | Bool | Whether to randomly permute the order of Z-scores and their knockoffs to adjust for potential ordering bias in Lasso solvers. In our simulations we never observed such bias, so this is turned off by default. [If not specified, defaults to false] |
| --skip-shrinkage-check | Bool | Whether to allow the knockoff analysis to proceed even with large (>0.25) LD shrinkages. [If not specified, defaults to false] |
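
Because the chromosome column must contain integers, a Z-score file that labels chromosomes as chr22 or X needs cleaning before it is passed to --zfile. The sketch below is one possible way to do that, assuming a tab-delimited file whose header row contains a column named CHR (the layout GhostKnockoffGWAS actually accepts is described in Acceptable Z-score files). The function name normalize_chr is hypothetical. Note that rows on non-integer chromosomes such as X are dropped, so those SNPs are excluded from the analysis.

```shell
# normalize_chr INPUT   (hypothetical helper)
# Print INPUT with a leading "chr" stripped from the CHR column and with rows
# whose chromosome is still not an integer removed.
# Assumes a tab-delimited file with a header row containing "CHR".
normalize_chr() {
  awk -F '\t' -v OFS='\t' '
    NR == 1 {
      # locate the CHR column from the header row
      for (i = 1; i <= NF; i++) if ($i == "CHR") col = i
      if (!col) { print "no CHR column found" > "/dev/stderr"; exit 1 }
      print
      next
    }
    {
      sub(/^chr/, "", $col)           # chr22 -> 22
      if ($col ~ /^[0-9]+$/) print    # keep integer chromosomes only
    }
  ' "$1"
}

# Usage (file names are placeholders):
#   normalize_chr raw_zfile.txt > example_zfile.txt
```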

Output format

  1. A summary file, e.g. example_output_summary.txt. This file contains a broad summary of the analysis.
  2. A comma-separated file, e.g. example_output.txt. This file contains the full GhostKnockoffGWAS output, with one SNP per row.
  3. (optional) Manhattan plots, which can be generated by following step 5 of the detailed example.

For a more detailed explanation of the two output files, see the Tutorial.
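
After a run finishes, a quick check that both files were written can catch a failed or interrupted job. This is a small sketch that assumes the file names follow the pattern of the example above, i.e. the value given to --out followed by _summary.txt and .txt; the function name check_outputs is hypothetical.

```shell
# check_outputs PREFIX   (hypothetical helper)
# Report whether PREFIX_summary.txt and PREFIX.txt exist and are non-empty.
# Returns non-zero if either file is missing or empty.
check_outputs() {
  prefix=$1
  status=0
  for f in "${prefix}_summary.txt" "${prefix}.txt"; do
    if [ -s "$f" ]; then
      echo "found: $f"
    else
      echo "missing or empty: $f" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage, with the prefix from the simple example:
#   check_outputs example_output
```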