Generic Text File Import

If your data is in a non-standard format, you can use the generic text file import tool to bring it into SeqMonk. You can import multiple files at once, as long as they all use exactly the same format.

The only restrictions on the import are:

Each file should contain data for only one sample
The file must be a plain text file (not a binary file such as a .doc or .xls)
The fields in the file must be separated by a delimiting character (tab, space or comma)
The file must contain a chromosome name, a start position and an end position. It can optionally also include a strand, a count, or both
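
As a rough way to check a file against these restrictions before importing it, the following Python sketch may help. It is not part of SeqMonk; the function names, the 1-based column numbers and the default column order (chromosome, start, end) are assumptions made for illustration.

```python
def check_line(line, delimiter="\t", chr_col=1, start_col=2, end_col=3):
    """Return True if the line has a chromosome name and integer start/end.

    Column numbers are 1-based (an assumption made for this sketch).
    """
    fields = line.rstrip("\r\n").split(delimiter)
    if len(fields) < max(chr_col, start_col, end_col):
        return False  # not enough fields for the chosen columns
    if not fields[chr_col - 1].strip():
        return False  # empty chromosome name
    try:
        int(fields[start_col - 1])
        int(fields[end_col - 1])
    except ValueError:
        return False  # start or end is not a whole number
    return True


def first_bad_row(path, delimiter="\t", start_at_row=1, **columns):
    """Return the 1-based number of the first data row that fails, or None."""
    # Opening the file as UTF-8 text will usually raise UnicodeDecodeError
    # on binary files such as .xls, which is a useful early warning.
    with open(path, encoding="utf-8") as handle:
        for row_number, line in enumerate(handle, start=1):
            if row_number < start_at_row:
                continue  # skip header rows
            if not check_line(line, delimiter, **columns):
                return row_number
    return None
```

For example, first_bad_row("sample1.txt", start_at_row=2) would skip one header line and report the first row that cannot be read with the default columns (the file name here is hypothetical). Note that this sketch also flags blank lines as bad rows.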

If your data satisfies these constraints then you can import it into SeqMonk.

Once you have selected the files you want to import you will see a screen which looks like this:

[Figure: Text file import dialog]

The values you need to set are all contained in the drop-down boxes on the right.

Column Delimiter: This is the character used to separate the fields in your file.
Start At Row: Set this to the row number of the first line of actual data. This lets you skip any header lines at the top of your file.
Chr Col: The column containing the chromosome information. This can be a raw chromosome name or a name starting with chr followed by the chromosome name.
Start / End Col: The columns containing the start and end positions of the read. It doesn't matter if the end is lower than the start, as SeqMonk will swap them automatically.
Strand Col: This column is optional so you can leave it unset. The parser understands '+', '1', or 'FF' for forward strand and '-', '-1' or 'RF' for reverse strand.
Count Col: This column is optional so you can leave it unset. If set, it should contain a count value, and the read on that row will be duplicated that number of times.
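
To make the effect of these options concrete, here is a minimal Python sketch of how a row might be interpreted under the settings described above. It is only an illustration and not SeqMonk's own parser: the function names, the 1-based column numbers, the stripping of the chr prefix and the handling of unrecognised strand values (treated as unknown) are all assumptions.

```python
# Illustrative only: mimics the documented behaviour of the import options.
FORWARD = {"+", "1", "FF"}
REVERSE = {"-", "-1", "RF"}


def parse_row(line, delimiter="\t", chr_col=1, start_col=2, end_col=3,
              strand_col=None, count_col=None):
    """Turn one data line into a list of (chromosome, start, end, strand).

    Column numbers are 1-based (an assumption). strand is '+', '-' or None.
    """
    fields = line.rstrip("\r\n").split(delimiter)

    chromosome = fields[chr_col - 1]
    # Both 'chr1' and '1' are accepted; dropping the prefix is an assumption.
    if chromosome.lower().startswith("chr"):
        chromosome = chromosome[3:]

    start = int(fields[start_col - 1])
    end = int(fields[end_col - 1])
    if end < start:
        start, end = end, start  # reversed coordinates are swapped

    strand = None  # stays None when the strand column is unset
    if strand_col is not None:
        value = fields[strand_col - 1]
        if value in FORWARD:
            strand = "+"
        elif value in REVERSE:
            strand = "-"

    count = 1
    if count_col is not None:
        count = int(fields[count_col - 1])

    # The read is duplicated 'count' times.
    return [(chromosome, start, end, strand)] * count


def parse_lines(lines, start_at_row=1, **options):
    """Parse all lines from start_at_row (1-based) onwards."""
    reads = []
    for row_number, line in enumerate(lines, start=1):
        if row_number < start_at_row or not line.strip():
            continue  # skip headers and blank lines
        reads.extend(parse_row(line, **options))
    return reads
```

With Start At Row set to 2, Strand Col to 4 and Count Col to 5, a row such as chr1, 200, 100, -, 2 would give two identical reverse-strand reads on chromosome 1 spanning 100 to 200.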

Once you've set your options, press the import button to import your data. Each sample will be named after the file it came from.