Yes, you are right. After checking all patterns that start with "a" and all patterns that start with "b", PrefixSpan checks all patterns that start with "c" by following the same process. In this case, PrefixSpan finds no frequent patterns that start with "c". That is why the step is omitted in the video, and also to keep the video short. But I should have made this clearer. Thanks for the feedback!
Here is the updated PowerPoint with the step for "c" added: www.philippe-fournier-viger.com/COURSES/Pattern_mining/PrefixSpan_the_presentation.pdf
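To illustrate the point about each starting item being processed the same way, here is a minimal sketch of the PrefixSpan recursion. It is not the SPMF implementation: it assumes the simplified case where every itemset holds a single item, and the toy database and the names `prefixspan` and `toy_db` are made up for this example (it is not the database from the video).

```python
from collections import Counter

def prefixspan(database, min_support):
    """Simplified PrefixSpan sketch: each sequence is a list of single items."""
    results = []

    def mine(prefix, projected):
        # Count each item at most once per projected sequence.
        counts = Counter()
        for suffix in projected:
            counts.update(set(suffix))
        # The same process is repeated for every candidate item: "a", "b", "c", ...
        for item in sorted(counts):
            support = counts[item]
            if support < min_support:
                continue  # nothing frequent starts with prefix + item
            pattern = prefix + [item]
            results.append((pattern, support))
            # Project: keep what follows the first occurrence of the item.
            new_projected = [s[s.index(item) + 1:] for s in projected if item in s]
            mine(pattern, new_projected)

    mine([], database)
    return results

# Hypothetical toy database, chosen so that nothing frequent starts with "c".
toy_db = [list("ab"), list("abc"), list("ba")]
patterns = prefixspan(toy_db, min_support=2)
for pattern, support in patterns:
    print(pattern, support)
```

With this toy data, "c" is still examined like "a" and "b", but it does not reach the minimum support, so no pattern starting with "c" is output.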
Thanks for the great videos, sir. If we already have the result, for example a pattern with support 3, how could we retrieve which rows of our sequence database contain it? Thanks.
Hi, thanks. If you want to find which sequences of the database contain a pattern, say one whose first itemset is {a} and whose second itemset is {c}, you could write a simple algorithm. The algorithm would read each sequence. For a sequence, the algorithm would first try to find the first itemset {a}. Once it is found, it would continue reading the sequence to find the second itemset {c}, and if there are more itemsets, it would continue to find the third one, etc. If all the itemsets are found, it is a match. By repeating this for all sequences of the database, you can find all the sequences that contain the pattern. It is not complicated to do, just a bit of programming. If you use my software SPMF, there is an algorithm called OCCUR which does exactly that. You can give a list of patterns to OCCUR and it will find the sequences that contain each pattern and output them. Another option is to modify an algorithm like PrefixSpan so that it outputs the list of sequences containing each pattern. This is also not very hard to do. In my software SPMF, you have that option for some algorithms. It is called "show sequence IDs".
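The scan described above could be sketched in Python roughly as follows. This is only an illustration, not the OCCUR algorithm from SPMF: the function names and the toy database are invented for this example, and it assumes that an itemset of the pattern is "found" when it is a subset of an itemset of the sequence.

```python
from typing import FrozenSet, List, Sequence

Itemset = FrozenSet[str]

def contains_pattern(sequence: Sequence[Itemset], pattern: Sequence[Itemset]) -> bool:
    """Return True if the itemsets of `pattern` appear, in order, in `sequence`."""
    pos = 0  # index of the next pattern itemset still to be found
    for itemset in sequence:
        # Assumption: a pattern itemset matches when it is a subset of the itemset read.
        if pos < len(pattern) and pattern[pos] <= itemset:
            pos += 1
    return pos == len(pattern)

def matching_sequence_ids(database: Sequence[Sequence[Itemset]],
                          pattern: Sequence[Itemset]) -> List[int]:
    """Return the 0-based positions of the sequences that contain the pattern."""
    return [sid for sid, seq in enumerate(database) if contains_pattern(seq, pattern)]

# Hypothetical toy database (not the one from the video).
toy_database = [
    [frozenset("a"), frozenset("b"), frozenset("c")],
    [frozenset("c"), frozenset("a")],
    [frozenset("ab"), frozenset("d"), frozenset("ce")],
    [frozenset("b"), frozenset("d")],
]
pattern_ac = [frozenset("a"), frozenset("c")]
ids = matching_sequence_ids(toy_database, pattern_ac)
print(ids)  # [0, 2]
```

Matching each pattern itemset at its earliest possible position is enough here, because taking the earliest match never rules out a later match for the remaining itemsets.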
@philfv Thanks for the reply and for providing us the software 😊. I already tried your SPMF software through the Python wrapper github.com/lolei/spmf-py with the option "show sequence IDs" set. It works! I was just curious about the case where we already have the pattern and want to query for it. I will try OCCUR. Once again, thanks for sharing, sir.
Good evening, thanks for your suggestions. I have some slides about the SPADE algorithm that I use for teaching, but I need to polish them a bit more before recording a video about SPADE. I will keep your suggestion in mind and try to make a video when I have time. In the meantime, if you want slides for SPADE, you can send an e-mail to philfv8@yahoo.com and I can share my current slides with you.
Can you make a video about the BIDE algorithm? I read the paper by the authors of the algorithm, but as far as I remember, some of the points were not clear.
That paper is very tricky. It took me a long time to understand it when I first read it. As I remember, the problem is that the paper explains only the simple case where each itemset has a single item. But when we try to implement it for the general case, it becomes very complex to make it work. The implementation of BIDE in SPMF has a bug that I did not fix because it is too complicated to fix, and there are other algorithms that find closed sequential patterns. For BIDE, I actually spent weeks on the first implementation in SPMF. Then I tried to redo it completely from scratch to make it better, but I still had trouble with the general case. It is not an easy algorithm to implement correctly for the general case. I don't think that I will make a video about BIDE now, but I will keep your suggestion in mind. I might do it later, or at least make a video about closed sequential pattern mining that outlines the main idea.
Hi, you could check the SPMF wrapper for Weka that was developed by someone else: github.com/christopher-beckham/spmf-wrapper
It could help you call the sequential pattern mining algorithms from SPMF, such as PrefixSpan, from within Weka. I have not tried it, but it may work.