This repository was archived by the owner on Aug 9, 2018. It is now read-only.

A .NET application for performing quality control of high-throughput sequencing reads. My intern project at Microsoft Research (2011).

kcha/seqcos

Sequence Quality Control Studio (SeQCoS)

SeQCoS is an open-source .NET software suite designed to perform quality control of massively parallel sequencing reads.

Project Description

  • SeQCoS generates a series of standard plots to illustrate the quality of the input data. These plots (saved in JPEG file format) provide information on commonly observed measurements, such as GC content and distribution of quality scores at position-specific and sequence-specific levels.
  • Basic trimming and discarding functions are provided to manipulate sequence files, according to:
    • Minimum read length; or
    • Minimum base quality score; or
    • Pattern matching by regular expression
  • (Experimental) As an optional step, SeQCoS can invoke NCBI BLAST (standalone blast+ toolkit) to search the input against a BLAST-formatted database. A pre-formatted database of NCBI UniVec, a repository of vector sequences, adapters, linkers and PCR primers used in DNA sequencing, is provided; however, users are free to supply their own database for searching.
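
The trimming and discarding options listed above can be illustrated with a minimal Python sketch. This is not SeQCoS code (SeQCoS itself is written in C#): the `Read` type and all function names are hypothetical, and the trimming strategies shown (cutting low-quality bases from the 3' end, cutting from the first regular-expression match onward) are assumptions, since the description does not specify them.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Read:
    """Hypothetical container for one sequencing read."""
    name: str
    sequence: str
    qualities: Tuple[int, ...]  # one Phred score per base


def trim_by_quality(read: Read, min_quality: int) -> Read:
    """Assumed strategy: drop trailing bases whose score is below min_quality."""
    end = len(read.sequence)
    while end > 0 and read.qualities[end - 1] < min_quality:
        end -= 1
    return Read(read.name, read.sequence[:end], read.qualities[:end])


def trim_by_pattern(read: Read, pattern: str) -> Read:
    """Assumed strategy: cut the read at the first regex match (e.g. an adapter)."""
    match = re.search(pattern, read.sequence)
    if match is None:
        return read
    cut = match.start()
    return Read(read.name, read.sequence[:cut], read.qualities[:cut])


def discard_short(reads: Iterable[Read], min_length: int) -> Iterator[Read]:
    """Keep only reads that are at least min_length bases long."""
    return (r for r in reads if len(r.sequence) >= min_length)


if __name__ == "__main__":
    read = Read("r1", "ACGTACGT", (30, 30, 30, 30, 30, 10, 5, 2))
    trimmed = trim_by_quality(read, min_quality=20)
    print(trimmed.sequence)  # ACGTA
    print([r.name for r in discard_short([trimmed], min_length=6)])  # []
```

In this sketch the three criteria are independent functions, so they can be applied alone or chained, for example trimming by quality first and then discarding reads that have become too short.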

SeQCoS was written in C# using .NET 4.0 and integrates with .NET Bio, an open-source bioinformatics library. It uses .NET Bio to handle sequence data and also draws on Sho, a data analysis and visualization application. The SeQCoS GUI was developed using Windows Presentation Foundation 4.0.

Formats Supported

Currently, the input and output formats supported by SeQCoS are limited to FASTA and FASTQ. For FASTA, only sequence-level analysis is performed.
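
To make the difference between the two formats concrete, here is a minimal Python sketch of the kind of measurements that FASTQ allows: per-read GC content and mean quality at each read position. It is an illustration only, not how SeQCoS (which uses .NET Bio) is implemented. The function names are hypothetical, and the parser assumes simple four-line FASTQ records with Phred+33 quality encoding. Because FASTA records carry no quality string, only the sequence-level part (`gc_content` here) would apply to them.

```python
import io
from typing import Iterable, Iterator, List, TextIO, Tuple

FastqRecord = Tuple[str, str, List[int]]  # (name, sequence, Phred scores)


def parse_fastq(handle: TextIO, offset: int = 33) -> Iterator[FastqRecord]:
    """Parse four-line FASTQ records (assumption: no line-wrapped sequences)."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of input
        header = header.strip()
        if not header:
            continue  # skip blank lines between records
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().strip()
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        yield header[1:], sequence, [ord(c) - offset for c in quality]


def gc_content(sequence: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    if not sequence:
        return 0.0
    upper = sequence.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


def mean_quality_by_position(records: Iterable[FastqRecord]) -> List[float]:
    """Average Phred score at each read position across all records."""
    totals: List[int] = []
    counts: List[int] = []
    for _, _, qualities in records:
        for i, q in enumerate(qualities):
            if i == len(totals):
                totals.append(0)
                counts.append(0)
            totals[i] += q
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


if __name__ == "__main__":
    data = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nGGC\n+\n5II\n")
    records = list(parse_fastq(data))
    print([gc_content(seq) for _, seq, _ in records])  # [0.5, 1.0]
    print(mean_quality_by_position(records))  # [30.0, 40.0, 40.0, 40.0]
```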

Documentation

Please visit the project wiki page for more information.

System requirements

  • Windows 7 or newer (older versions of Windows should work but have not been tested)
  • Microsoft .NET 4.0
  • Sho 2.0.5 or higher (tested on 2.0.5)
  • 4 GB RAM or more
  • (optional) Standalone blast+ (NCBI BLAST for Windows) - for executing BLAST searches

History of the project

The SeQCoS project was initiated by the Microsoft Biology Initiative group. I started and developed the project while interning at Microsoft Research (Redmond, WA) in the summer of 2011. Prior to being hosted on GitHub, the code was originally hosted on CodePlex (https://archive.codeplex.com/?p=seqcos). In November 2011, SeQCoS was featured in the Microsoft Research Connections Blog.
