  • I’m a bioinformatician. The problem with using bioinformatics software here is that the input or output data size is huge for most tasks, which makes submitting jobs off site much more difficult.

    Bacterial genome assembly isn’t too bad though. I use Nanopore sequencing data, and the input is usually on the order of a few gigabytes per task for an output file of a few megabytes (pulling numbers outta my butt, but shouldn’t be too far off). But multiply this by 48 or 96, which is the number of samples our machine can run at the same time, and you’re getting into hundreds of gigabytes of input data. It’s just tough to manage this with cloud services.
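To put rough numbers on that, here is a back-of-the-envelope sketch. The 3 GB per sample and the 100 Mbit/s uplink are made-up assumptions for illustration only (the comment just says "a few gigabytes"), so swap in your own figures.

```python
# Rough estimate of input volume and upload time for one sequencing run.
# ASSUMPTIONS (not from the original comment): 3 GB of input per sample,
# a 100 Mbit/s uplink, and decimal units (1 GB = 8000 Mbit).

def run_input_gb(samples: int, gb_per_sample: float) -> float:
    """Total input size in GB for one run."""
    return samples * gb_per_sample

def upload_hours(total_gb: float, uplink_mbit_s: float) -> float:
    """Hours needed to push total_gb through the given uplink."""
    seconds = total_gb * 8000 / uplink_mbit_s
    return seconds / 3600

for samples in (48, 96):
    total = run_input_gb(samples, 3.0)
    hours = upload_hours(total, 100.0)
    print(f"{samples} samples: {total:.0f} GB input, ~{hours:.1f} h to upload")
```

With those assumed numbers a 96-sample run is 288 GB of input, which would take several hours just to send off-site before any assembly starts.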

    But if you go simpler, you could offer a BLAST server. You just need to host your own database and accept queries. Not sure if you can split it into smaller tasks though. If you segment the main database, your significance scores (E-values) will change, since they scale with the size of the database being searched.
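On the database-splitting point, here is a minimal sketch of why the numbers shift. It uses the textbook bit-score form of the E-value, E = m * n / 2^S', where m is the query length and n is the database length. Real BLAST uses "effective" lengths with edge corrections, so treat this as an approximation. The bit score, query length, database size, and shard count below are hypothetical.

```python
# Why sharding a BLAST database changes E-values.
# Simplified formula: E = m * n / 2**S'  (m = query length, n = database
# length, S' = bit score). All numbers below are hypothetical.

def evalue(bit_score: float, query_len: float, db_len: float) -> float:
    """Expected number of chance hits scoring at least bit_score."""
    return query_len * db_len / 2.0 ** bit_score

FULL_DB_LEN = 1_000_000_000   # hypothetical total database length (letters)
SHARDS = 10                   # hypothetical number of equal-sized shards
BIT_SCORE = 50.0
QUERY_LEN = 200

e_full = evalue(BIT_SCORE, QUERY_LEN, FULL_DB_LEN)
e_shard = evalue(BIT_SCORE, QUERY_LEN, FULL_DB_LEN / SHARDS)
# Correcting a shard search: plug in the FULL database length instead.
e_corrected = evalue(BIT_SCORE, QUERY_LEN, FULL_DB_LEN)

print(f"full database : E = {e_full:.3e}")
print(f"single shard  : E = {e_shard:.3e}  ({e_full / e_shard:.0f}x too optimistic)")
print(f"shard, fixed n: E = {e_corrected:.3e}")
```

So the same hit looks about 10x more significant when searched against one of 10 shards. As far as I know, BLAST+ has a `-dbsize` option for setting the effective database length, which should let each shard report E-values as if it were the full database, but check the docs before relying on that.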