Motif Peptide Analysis

When an antibody binds to random peptides on an array, some of these peptides may be more relevant to the original antigen than others. It may be possible to analyze the bound peptides to find out which ones are most important.

Map of Work 11-29-11

1st idea about strategy before 071012

Here is the basic strategy I am attempting to use (071012).
 * Find motifs in the peptides.
 * To do this, I will use glam2 to compare every possible pair of peptides in the list. If two peptides share a motif with a score greater than the GLAM2 Significance Threshold, this motif will be added to the list of motifs.
 * Maybe I'll call this: generateAllMotifs
 * Group similar motifs so that they just represent one motif group (or it could just be called a motif)
 * To do this, I will compute a similarity score between every pair of motifs using glam2. For example, motif 2 will be selected and compared against all of the other motifs in the list after it. If the score is below some threshold, then the two motifs will be grouped together. Once motif 2 has been placed in a group, every other motif will be compared to motif 2 to see whether it should also be included in the same group, and the same check is then repeated for each motif that joins. Therefore, this will be a recursive function.
 * Perhaps find the most representative sequence in the motif group. I might be able to do this by finding the similarity scores of every motif to a particular motif. I would then average all of the scores for that motif. The motif with the highest average score would be the representative sequence for the motif group.
 * Find how many peptides belong to each motif group
 * glam2scan will be used to assign peptides to a group.
 * Find the B cell epitope score for each peptide belonging to a motif group, then average these scores and assign the average to the motif group.
 * Plot the motif groups based on the number of peptides that belong to them and their B cell epitope score. Then see where the ELISA confirmed peptides fall within this plot. Perhaps the best peptides belong to motif groups that have good B cell epitope scores and many peptides sharing the same motif.
 * Do a short BLAST search for all of the representative sequences of the motif groups. Then see if any proteins appear in more than one of the BLAST result lists. These overlapping proteins may be the proteins that the antibodies binding to these random peptides originally bound to.
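The grouping and representative-sequence steps above can be sketched in Java as follows. This is only a minimal sketch under stated assumptions, not the code in the mpa source archive: the class and method names (MotifGrouping, group, collect, representative) are hypothetical, the pairwise glam2 scores are assumed to have already been collected into a symmetric matrix, and the numbers in main are made up. Because the notes group on a score "below some threshold" but pick the representative by the "highest average score", the grouping test is left as a caller-supplied predicate. The demo assumes a higher score means more similar; passing `s -> s < threshold` instead would follow the wording of the notes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoublePredicate;

// Hypothetical sketch: groups motifs from a precomputed pairwise score matrix.
final class MotifGrouping {

    // Returns groups of motif indices. Two motifs end up in the same group when
    // they are linked directly or through a chain of motifs that pass the test.
    public static List<List<Integer>> group(double[][] score, DoublePredicate sameGroup) {
        boolean[] assigned = new boolean[score.length];
        List<List<Integer>> groups = new ArrayList<>();
        for (int seed = 0; seed < score.length; seed++) {
            if (assigned[seed]) {
                continue; // already pulled into an earlier group
            }
            List<Integer> members = new ArrayList<>();
            collect(seed, score, sameGroup, assigned, members);
            groups.add(members);
        }
        return groups;
    }

    // Recursive step: add this motif, then compare every unassigned motif to it.
    private static void collect(int motif, double[][] score, DoublePredicate sameGroup,
                                boolean[] assigned, List<Integer> members) {
        assigned[motif] = true;
        members.add(motif);
        for (int other = 0; other < score.length; other++) {
            if (!assigned[other] && sameGroup.test(score[motif][other])) {
                collect(other, score, sameGroup, assigned, members);
            }
        }
    }

    // Picks the member with the highest average score to the other members
    // of its group (assumption: the average is taken within the group only).
    public static int representative(double[][] score, List<Integer> members) {
        int best = members.get(0);
        double bestAverage = Double.NEGATIVE_INFINITY;
        for (int candidate : members) {
            double sum = 0.0;
            int count = 0;
            for (int other : members) {
                if (other != candidate) {
                    sum += score[candidate][other];
                    count++;
                }
            }
            double average = (count == 0) ? 0.0 : sum / count;
            if (average > bestAverage) { // ties keep the earlier motif
                bestAverage = average;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up symmetric scores for five motifs (higher = more similar here).
        double[][] score = {
            { 0, 12, 11,  1,  2},
            {12,  0, 15,  2,  1},
            {11, 15,  0,  1,  1},
            { 1,  2,  1,  0, 14},
            { 2,  1,  1, 14,  0},
        };
        List<List<Integer>> groups = group(score, s -> s >= 10.0);
        System.out.println(groups); // [[0, 1, 2], [3, 4]]
        System.out.println(representative(score, groups.get(0))); // 1
    }
}
```

With a few thousand motifs the recursion depth can approach the number of motifs in the largest group; an explicit stack or queue would give the same groups without deep recursion.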

I will make use of the following programs to accomplish this.
 * GEMODA (I may not use this program after all) (see Gemoda 1.0): this program will list all possible motifs when given several sequences
 * Note that GEMODA only tells you whether the similarity "signal" you are detecting is substantially different from the background noise.
 * see note from Mark 7-6-11
 * GLAM2: this program will try to find the best motif when given several sequences
 * GLAM2Scan: this program will try to see if certain sequences contain a given motif
 * Bepipred: this program can assign a B cell epitope score
 * BLAST: used to determine the similarity of a sequence to the database of known protein sequences.
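Once these programs have been run, their outputs still have to be combined per motif group. The Java sketch below shows one way that could look, under several assumptions: the class and method names (MotifGroupSummary, meanEpitopeScore, sharedHits) are hypothetical, each peptide is assumed to already have a single Bepipred-derived score, each BLAST result list is assumed to have been reduced to a set of protein identifiers, and all sequences, scores, and identifiers in main are made up. No tool output parsing is shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: summarizes one motif group and compares BLAST hit lists.
final class MotifGroupSummary {

    // Averages the per-peptide epitope scores of a group's peptides.
    // Peptides without a score are skipped; NaN is returned if none has one.
    public static double meanEpitopeScore(List<String> peptides, Map<String, Double> epitopeScore) {
        double sum = 0.0;
        int counted = 0;
        for (String peptide : peptides) {
            Double score = epitopeScore.get(peptide);
            if (score != null) {
                sum += score;
                counted++;
            }
        }
        return (counted == 0) ? Double.NaN : sum / counted;
    }

    // Returns the proteins that occur in at least minLists of the hit lists.
    public static Set<String> sharedHits(Collection<Set<String>> hitLists, int minLists) {
        Map<String, Integer> listsContaining = new TreeMap<>();
        for (Set<String> hits : hitLists) {
            for (String protein : hits) {
                listsContaining.merge(protein, 1, Integer::sum);
            }
        }
        Set<String> shared = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : listsContaining.entrySet()) {
            if (entry.getValue() >= minLists) {
                shared.add(entry.getKey());
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        // Made-up peptides assigned to one motif group, with made-up scores.
        List<String> groupPeptides = Arrays.asList("ACDEFGHIKL", "ACDEWGHIKM", "QCDEFGHRKL");
        Map<String, Double> epitopeScore = new HashMap<>();
        epitopeScore.put("ACDEFGHIKL", 0.2);
        epitopeScore.put("ACDEWGHIKM", 0.6);
        epitopeScore.put("QCDEFGHRKL", 0.4);
        System.out.println("peptides in group: " + groupPeptides.size());
        System.out.println("mean epitope score: " + meanEpitopeScore(groupPeptides, epitopeScore));

        // Made-up hit lists, one per representative sequence.
        List<Set<String>> hitLists = new ArrayList<>();
        hitLists.add(new HashSet<>(Arrays.asList("proteinA", "proteinB")));
        hitLists.add(new HashSet<>(Arrays.asList("proteinB", "proteinC")));
        hitLists.add(new HashSet<>(Arrays.asList("proteinA", "proteinB")));
        System.out.println("shared hits: " + sharedHits(hitLists, 2));
    }
}
```

The peptide count and mean score printed for a group are the two coordinates of the proposed plot, and sharedHits with minLists set to 2 returns every protein that appears in more than one BLAST result list.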

Java Classes

Most recent version of the Java classes can be found here: "C:\kurt\storage\CIM Research Folder\DR\2012\10-2-12\code\mpa src 10-3-12.zip"

Backup of classes on Google:
https://docs.google.com/file/d/0B55PHkbittziaWdCa25naFNOOEE/edit
https://docs.google.com/file/d/0B55PHkbittziWmxpX2lNa3Z1Qlk/edit

BLAST protein database stored here: S:\Research\Cancer_Eradication\Users\kwhittem\DR\2012\9-26-12_database\nr

setting up computer to run program 9-23-12

Note that in some early work looking at this, I found that there were 2,044 motif groups. This may not have been entirely accurate.

see also: Amino Acid Substitution Matrix

Related papers 2-13-13