Work 071012

generated glam2 output for 183 sequences "L:\storage\CIM Research Folder\DR\2012\7-10-12\glam2 for 183 seq"

How well does glam2 find motifs and give scores with just 2 peptides?

I ran glam2 on these two peptides:

>1
TISKYVMVEPMRQHEEW
>5
RVGEMPMREYDISGGSG

and the program gave me

Score: 9.24529 Columns: 6 Sequences: 2


1 10 PMRQHE 15 + 9.74
5 6 PMREYD 11 + 10.1

I then ran glam2 on

>1
PMRQHE
>2
PMREYD

and got

Score: 9.24529 Columns: 6 Sequences: 2

1 1 PMRQHE 6 + 9.74
2 1 PMREYD 6 + 10.1

It looks like the program is consistent, and it can also compare and score two short motifs. I think I will use GLAM2 rather than GEMODA to generate the initial list of motifs. I will then use GLAM2 to score these motifs.

What threshold score should I choose for GLAM2? To make this decision, I will see what kind of score GLAM2 gives when comparing pairs of random sequences. I'll do this about 20 times and obtain an average and standard deviation. I think this can help me set the threshold needed to decide whether two sequences are similar enough or not. Here are some random peptide sequences 071012
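A minimal sketch of that plan in Python, under several assumptions that are mine rather than anything fixed in these notes: random peptides are drawn uniformly from the 20 standard amino acids, they are 17 residues long (matching the two test peptides), and the cutoff is taken as the mean plus k standard deviations with k = 2. All function names are hypothetical. The glam2 run itself is left out; each pair would be written to a FASTA file, run through glam2, and its "Score:" value collected by hand or by a parser.

```python
import random
import statistics

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def random_peptide(length, rng):
    """Return a peptide with residues drawn uniformly at random (an assumption)."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def random_pairs(n_pairs=20, length=17, seed=0):
    """Return n_pairs pairs of random peptides; the seed makes runs repeatable."""
    rng = random.Random(seed)
    return [(random_peptide(length, rng), random_peptide(length, rng))
            for _ in range(n_pairs)]


def to_fasta(seqs):
    """Format sequences as FASTA text with numeric names >1, >2, ..."""
    return "".join(f">{i}\n{s}\n" for i, s in enumerate(seqs, start=1))


def threshold_from_scores(scores, k=2.0):
    """Return (mean, sample standard deviation, mean + k * sd).

    k = 2 is an illustrative choice, not a value taken from the notes.
    """
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    return mean, sd, mean + k * sd
```

Each pair from random_pairs() can be passed to to_fasta() to produce one glam2 input file. Once the 20 scores are collected, threshold_from_scores() returns the average, the standard deviation, and a candidate cutoff.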

Oh, I'm also curious what kind of score two short sequences are assigned. Two sequences like

>1
HEE
>2
HEE

They get a score like this:

Score: 7.85694 Columns: 3 Sequences: 2

1 1 HEE 3 + 7.94
2 1 HEE 3 + 7.94

see GLAM2 Significance Threshold

I don't think I need the GEMODA code in my project anymore, but here it is in case I want to refer to it: GEMODA Code 071012

I also don't think I need the significance explorer code any longer, but here it is in case I want it: "L:\storage\CIM Research Folder\DR\2012\7-10-12\Significance Explorer code 071012.txt"

made processGLAM2Output, getMotif, getScore, and createFASTAFileFromList of Sequences functions for Glam2Handler class
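For reference, here is a rough Python sketch of what parsing the glam2 text shown earlier could look like. It is not the Glam2Handler code itself: the function name parse_glam2_text and the dictionary keys are hypothetical, and the line patterns are inferred only from the "Score: ... Columns: ... Sequences: ..." header and the "name start site end strand score" rows recorded in these notes. Allowing "." inside the aligned segment is an assumption about how gaps would appear.

```python
import re

# Header line, e.g. "Score: 9.24529 Columns: 6 Sequences: 2"
SCORE_RE = re.compile(
    r"Score:\s*(-?[\d.]+)\s+Columns:\s*(\d+)\s+Sequences:\s*(\d+)")

# Alignment row, e.g. "1 10 PMRQHE 15 + 9.74"
# (sequence name, start, aligned site, end, strand, per-sequence score)
ROW_RE = re.compile(
    r"^\s*(\S+)\s+(\d+)\s+([A-Za-z.]+)\s+(\d+)\s+([+-])\s+(-?[\d.]+)\s*$")


def parse_glam2_text(text):
    """Parse the first alignment in glam2-style text; return None if absent."""
    result = None
    for line in text.splitlines():
        header = SCORE_RE.search(line)
        if header:
            if result is not None:
                break  # a second header starts the next alignment; stop here
            result = {
                "score": float(header.group(1)),
                "columns": int(header.group(2)),
                "sequences": int(header.group(3)),
                "rows": [],
            }
            continue
        row = ROW_RE.match(line)
        if row and result is not None:
            name, start, site, end, strand, score = row.groups()
            result["rows"].append({
                "name": name,
                "start": int(start),
                "site": site,
                "end": int(end),
                "strand": strand,
                "score": float(score),
            })
    return result
```

Applied to the first result above, this would give an overall score of 9.24529 and the two sites PMRQHE and PMREYD, which is the information getScore and getMotif are meant to expose.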

Started the main MotifPeptideAnalysis class; made initialize, setStartingPeptideList, and generateAllMotifs functions