Most Popular

1500 questions
3
votes
1 answer

Compare and Reorganize Fasta Headers Python

I want to compare the headers from the fasta file to file1, and if there's a match, reorganize the header and put the match first. If there's no match between fasta file and file1, look at file2 and if there's a match, put that name first on the…
nora job
  • 33
  • 3
3
votes
1 answer

Filter with bcftools

I want to filter a SNP, specifically CHROM1:POS:630128. I tried to use bcftools for this. I came up with the following variant of a bcftools call: bcftools filter -e "CHROM=1&POS=63018"…
mugdi
  • 145
  • 5
3
votes
1 answer

Using SED command in Nextflow script

I was wondering if someone can help me figure out this error message. I’m at the end of my Nextflow pipeline and I want to change the header in a FASTA file and it works when I use this command: sed -i "s/^>/>${assembly_id}_${gene_id}_/g"…
rimo
  • 453
  • 9
3
votes
1 answer

heatmap of specific sequence motifs in aligned fasta files

I have a collection of fasta files, each containing three aligned sequences. I am interested in understanding the distribution of the specific sequence motifs PGP, RGP and KGP in all the alignments. I was wondering if there is a possible way in R to…
Jalan
  • 31
  • 2
3
votes
1 answer

What does it mean when a gene and its transcript have opposite orientations in a GFF3 file?

I was working with a given GFF3 file, and I observed that some transcripts have an orientation opposite to that of their genes. Here is a snippet: chr1 . gene 2189548 2194772 . + . …
Maximilian Press
  • 3,309
  • 5
  • 21
3
votes
1 answer

PL and QUAL values on VCF file?

I want to filter my VCF file to include only the relevant information I need, but I have some questions about the results I got. This is what the first 10 lines of my VCF file look like: #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT …
3
votes
2 answers

What can be the bias of aligning paired-reads in a single-end mode?

I don't know if I'm in the right place but I have a technical problem to fix. I would have to align paired-end reads from Illumina sequencing to compare a normal genome with a tumor one. When I align with bwa-mem (default parameters) with paired-end…
cucalorda
  • 43
  • 5
3
votes
2 answers

Remote blastn - what is breaking my (bash) loop?

I am trying to search the NCBI non-redundant database for sequences similar to a few other (~40) sequences that I already have. So I've tried running blastn remotely, and looping through multiple queries. I've tried it both as a job (via slurm) and…
Laura
  • 733
  • 3
  • 11
3
votes
1 answer

Does cutadapt trim trailing N's first and then use max_n to filter reads?

Background: I want to trim leading and (likely just) trailing N's from my WES (Illumina NextSeq500) reads with cutadapt (--trim-n). I also want to filter for reads with > 40%? N's (--max-n). Question: Does cutadapt first remove the trailing/leading…
Dandelion
  • 129
  • 6
3
votes
1 answer

How to improve this WGCNA analysis

This is my first time using WGCNA on a microarray dataset. One of the major problems I am facing is merging close modules, which is not really working well. I have quantile normalised the data before working on it. I have put my code below…
3
votes
2 answers

Searching for HLA-B in DNA results

I'm trying to find the HLA-B*15:01 variant in my DNA results, prompted by this research paper:…
stan
  • 131
  • 2
3
votes
0 answers

Aligning PacBio HiFi reads to reference genome using pbmm2

I am trying to align a yeast strain sequenced by PacBio HiFi reads to the reference genome S288C (https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000146045.2/) using pbmm2 but for some reason I am getting an error that I don't understand. I first…
3
votes
1 answer

Is there an alternative to bulked segregant analysis for insects?

This strategy seems to be most commonly used for plants. When crossing animals isn't possible, how can I do a similar study to identify a particular locus responsible for a polymorphic trait (specifically, the pigmentation of a beetle)?
Caterina
  • 257
  • 1
  • 4
3
votes
1 answer

How does one distinguish nuclear DNA from mitochondrial DNA when doing WGS?

I'm interested in doing de-novo sequencing but also phylogenetic analysis. In particular, after de-novo sequencing and annotating the genome, I need to align the CO1 gene and the nuclear 28S rRNA gene of several species. When extracting DNA and…
Caterina
  • 257
  • 1
  • 4
3
votes
0 answers

How to select the best ligands in a virtual screening matrix

I have the results of a virtual screening experiment using docking simulations with Autodock Vina. The result is a matrix of 7 (proteins) by 28000 (ligands) with the calculated binding energies for each protein/ligand pair: Ligands -…
albertr
  • 71
  • 4