Cannot run vep #244
Hello @toanddt! Thank you for submitting this issue. I am afraid the malformed header line error reported above might be masking the real issue. Cannoli is expecting valid VCF on stdout from the executable it pipes to. Might you be able to look for any error message(s) reported by vep itself?
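A minimal sketch of the suggestion above: capture the child process's stdout and stderr separately, so any diagnostics vep writes are visible on their own instead of being interleaved with (and corrupting) the VCF stream that Cannoli parses. This uses `sys.executable` as a stand-in for the vep executable, which is an assumption for illustration; substitute your actual command.

```python
import subprocess
import sys

# Stand-in for the vep invocation: a child process that writes a VCF-like
# header to stdout and a diagnostic message to stderr. In practice this
# would be the vep executable and its arguments (an assumption here).
child = [
    sys.executable, "-c",
    "import sys;"
    "sys.stdout.write('##fileformat=VCFv4.2\\n');"
    "sys.stderr.write('WARNING: something vep wants you to know\\n')",
]

# capture_output=True keeps the two streams separate, so error text
# reported by the tool does not get mistaken for malformed VCF output.
result = subprocess.run(child, capture_output=True, text=True, check=True)

print("stdout (VCF):", result.stdout.strip())
print("stderr (diagnostics):", result.stderr.strip())
```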
Thanks for your quick response @heuermh.

STACK Bio::EnsEMBL::VEP::Config::check_config /home/spark/ensembl-vep/modules/Bio/EnsEMBL/VEP/Config.pm:649

class Vep( override def apply(variantContexts: VariantContextDataset): VariantContextDataset = {
What version of VEP did you use for testing?
Thanks for the additional information! Oh, that is frustrating -- why would they make an incompatible change?
Hi @heuermh
Thanks @heuermh. But I am still not able to run cannoli vep; another issue occurred. Are you able to run cannoli vep on your side?
I am also not able to run the current version of vep via its Bioconda/Biocontainers Docker image.
I tried with both the indexed and non-indexed versions of the VEP cache, no difference. Will continue to investigate...
See also bigdatagenomics/adam#2150
Hello,
Thank you @dmaziec1! I left a few questions as comments on your commit.
To answer some of my questions, from http://grch37.ensembl.org/info/docs/tools/vep/vep_formats.html#output
Thus it appears for Cannoli we should be using: vep --format vcf --output_file STDOUT --vcf --vcf_info_field ANN --no_stats --offline --dir_cache ...
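For illustration, the role of each flag in that invocation can be sketched as an argument list one might build programmatically. The cache directory is a placeholder assumption; the flag meanings follow the VEP documentation linked above.

```python
# Sketch of the vep argument list suggested above, with the purpose of
# each flag noted. The --dir_cache value is a placeholder, not a real path.
vep_args = [
    "vep",
    "--format", "vcf",          # treat the piped input as VCF
    "--output_file", "STDOUT",  # stream results to stdout for Cannoli to read
    "--vcf",                    # emit VCF rather than the default tab-delimited output
    "--vcf_info_field", "ANN",  # write annotations into the ANN INFO field
    "--no_stats",               # suppress the stats summary, which would pollute stdout
    "--offline",                # annotate from the local cache, no database connection
    "--dir_cache", "/path/to/.vep",  # placeholder cache location
]

print(" ".join(vep_args))
```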
I've discovered multiple issues that will prevent mapping Ensembl VEP
Properly mapping Ensembl VEP
I can run the normal vep tool with the same input file, but errors occurred when I run cannoli vep with the following command:
cannoli-submit --master local --deploy-mode client -- vep /input/sample.vcf /output/out -executable /home/spark/ensembl-vep/vep -cache file:///home/spark/.vep/homo_sapiens/95_GRCh38/ -single
Log:
Caused by: htsjdk.tribble.TribbleException$InvalidHeader: Your input file has a malformed header: We never saw the required CHROM header line (starting with one #) for the input VCF file
  at htsjdk.variant.vcf.VCFCodec.readActualHeader(VCFCodec.java:119)
  at org.bdgenomics.adam.rdd.variant.VCFOutFormatter.read(VCFOutFormatter.scala:104)
  at org.bdgenomics.adam.rdd.OutFormatterRunner.<init>(OutFormatter.scala:32)
  at org.bdgenomics.adam.rdd.GenomicDataset$$anonfun$15.apply(GenomicDataset.scala:876)
  at org.bdgenomics.adam.rdd.GenomicDataset$$anonfun$15.apply(GenomicDataset.scala:838)
  at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:801)
  at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:801)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  at org.apache.spark.scheduler.Task.run(Task.scala:121)
  at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
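The exception is raised while htsjdk parses the header of whatever vep wrote to stdout. The check it performs can be sketched, in simplified form (this is an illustration, not the actual htsjdk implementation), as: consume `##` meta lines, then require a `#CHROM` column-header line before any records. Anything else on stdout, such as vep's own log or warning text, fails the check.

```python
def read_actual_header(lines):
    """Simplified sketch of the check behind the exception above:
    a valid VCF header is '##' meta lines followed by one '#CHROM'
    column-header line. Not the real htsjdk code."""
    header = []
    for line in lines:
        if line.startswith("##"):
            header.append(line)   # meta-information line, keep scanning
        elif line.startswith("#CHROM"):
            header.append(line)   # required column-header line found
            return header
        else:
            # Any other text (e.g. diagnostics printed to stdout)
            # triggers the "malformed header" failure.
            raise ValueError("We never saw the required CHROM header line")
    raise ValueError("We never saw the required CHROM header line")

good = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
]
print(len(read_actual_header(good)))  # both header lines accepted
```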
Please help resolve this issue.