Riffomonas Project
I'm Pat Schloss! I produce videos about how to use data science tools to answer questions about the world around me. I believe that anyone can answer their own questions. Do you?! I'd love to learn more about the world around you. Share the questions you would like to answer and we can take them on together!
Comments
@ianworthington2324 1 day ago
Difficult to believe that the size of a punch card remains the recommended line length all these years later.
@Riffomonas 23 hours ago
Hah!
@user-ps4fb1oy2r 2 days ago
Hello, which package or option did you use to check the integrity of your code after passing it through styler? I'm referring to the one you used via the Build button.
@Riffomonas 1 day ago
That's my package - phylotypr - that I run Build on. Is that what you're asking about?
@user-ps4fb1oy2r 1 day ago
@@Riffomonas yes, thanks
@djangoworldwide7925 2 days ago
Say GitHub Actions detected a lint issue, does it automatically run styler to fix it? If not, what benefit does it have?
@Riffomonas 2 days ago
It doesn't, but there is a separate GHA for running styler. We'll readdress this in the next episode so you can detect problems before pushing. I think the benefit of this particular GHA is for pull requests from others, so that their code can be run through lintr before you pull it into your codebase.
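[Editor's note: a minimal sketch of running the same checks locally before pushing, assuming the {styler} and {lintr} packages are installed; the order shown is just one reasonable workflow, not the one from the video.]

```r
# Style first, then lint whatever styler couldn't fix automatically.
styler::style_pkg()    # rewrites package files in place per the tidyverse style guide
lintr::lint_package()  # reports remaining issues (naming, line length, etc.)
```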
@zjardynliera-hood5609 2 days ago
Hello Patrick, I've watched many of your videos and made my first R package to generate, filter, and sort relative abundance tables and make plots. We do a lot of amplicon sequencing from environmental samples at uWaterloo. The GitHub repo is zjardyn/bubbler and the package is mostly done. I am still scared to submit it to CRAN lol.
@joshstat8114 5 days ago
Thank you for showing the benchmark of their performance (I still recommend the `bench` package, though). How about `tidypolars` (in R, not Python)?
@Riffomonas 2 days ago
I'll have to check out the tidypolars package, this was a new one to me. Thanks for watching!
@matthewson8917 5 days ago
It was surprising that base pipe was generally slower than magrittr pipe
@Riffomonas 5 days ago
Thanks for watching! More experimenting with both suggests that it really depends on the context. Any difference is really minimal
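[Editor's note: a quick sketch of how such a pipe comparison might be run with {microbenchmark}; the subset() example is illustrative, not the benchmark from the video.]

```r
library(magrittr)         # provides %>%
library(microbenchmark)

microbenchmark(
  magrittr = mtcars %>% subset(cyl == 4) %>% nrow(),  # magrittr pipe
  base     = mtcars |> subset(cyl == 4) |> nrow(),    # base pipe (R >= 4.1)
  times = 1000
)
```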
@jlntp1642 9 days ago
Thank you, I love this series. I am wondering whether an R analysis project could be done in a package-driven way. What is your opinion on this?
@Riffomonas 9 days ago
Definitely, I've seen people create papers as R packages - check out this pre-print peerj.com/preprints/3192v2/
@jlntp1642 9 days ago
@@Riffomonas Thanks for sharing. I also tried to embed my analytic code into a package... but at some point I felt that the generated data were too heavy for an R package. Also, I never considered (but would like to try) using testthat for an analytical workflow. In addition, beyond report generation, it would be nice to include GitHub Actions within the workflow.
@Riffomonas 8 days ago
Something like Test Driven Development for data analysis is always rolling around in the back of my head :) It's hard because it requires functions to test, and most data analyses don't use homemade functions; they use functions from other packages, which are hopefully already tested.
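[Editor's note: one hedged sketch of what "tests for an analysis" could look like with {testthat}: instead of testing homemade functions, assert properties the cleaned data must satisfy. The data frame here is made up for illustration.]

```r
library(testthat)

# Stand-in for a data frame produced earlier in an analysis script
clean <- data.frame(sample_id = c("a1", "a2"),
                    rel_abund = c(0.4, 0.6))

test_that("cleaned data meet basic expectations", {
  expect_false(any(duplicated(clean$sample_id)))                 # ids are unique
  expect_true(all(clean$rel_abund >= 0 & clean$rel_abund <= 1))  # proportions in [0, 1]
})
```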
@djangoworldwide7925 9 days ago
Great notes on the upper right of the screen for further reading. Thanks Pat! We're all excited about you submitting your pkg to CRAN 🤞🏻
@Riffomonas 9 days ago
Thanks! A few more weeks 🤓
@MrJL-xk5jb 12 days ago
Your videos are so useful, lots of thanks. How do you comment (#) more than one line at the same time?
@Riffomonas 10 days ago
I highlight the lines and then use the shortcut to comment them. On my Mac it's Shift-Command-C.
@user-vp4ix6ff3b 12 days ago
Ours also uses slurm
@grahamsharpe9812 14 days ago
Why do you use tibble? And not modify data in excel? Or is this a preference thing?
@Riffomonas 14 days ago
It's about reproducibility and transparency. Modifying data in Excel is very much not reproducible or transparent; it's next to impossible to document changes in Excel files. Also, Excel is $$$ whereas R is free. With R, we can document all of the changes in the script and rerun the script multiple times without having to worry about breaking things. And if I make a mistake, I can easily correct it by reprocessing the raw data with the corrected code.
@kpicsoffice4246 15 days ago
Thank you. Great work
@Riffomonas 15 days ago
My pleasure - thanks for watching!🤓
@kkanden 16 days ago
Despite not currently developing an R package or planning to, I really enjoy watching your series on making one, especially since you're showing the backstage, raw and "ugly" side of coding (the typos in particular). If it ever comes to me creating a package, I'll make sure to use this series as a reference. Cheers!
@Riffomonas 16 days ago
wonderful - thanks for watching!
@chooby364 17 days ago
I don't like my axis labels to repeat themselves. Is there a way to have a single centered x and/or y label? I've been trying to figure it out but cannot manage to get it right.
@Riffomonas 17 days ago
This type of thing is much easier using the {patchwork} package. Thanks for watching!
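[Editor's note: a minimal sketch of the {patchwork} approach, assuming patchwork >= 1.2.0 for the axis_titles option; the plots are toy examples.]

```r
library(ggplot2)
library(patchwork)

p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(disp, mpg)) + geom_point()

# axis_titles = "collect" merges identical axis titles across panels,
# so the shared y-axis label appears once, centered on the combined plot.
(p1 | p2) + plot_layout(axis_titles = "collect")
```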
@user-ro9ex5im2p 19 days ago
This was great. Thank you :)
@Riffomonas 17 days ago
My pleasure! thanks for watching🤓
@Universe624 19 days ago
How can I access the glb.ts+dsst.csv data file?
@Riffomonas 17 days ago
You can get it through the links in the blog post linked in the show notes. github.com/riffomonas/climate_viz/tree/a428f64b7db493145bf84ec6f38f8e89da258675/data
@hassanhijazi4757 21 days ago
Hey Pat, what is the best practice when you want to populate a list but you don't know upfront how long it might grow? What do you do in this case?
@Riffomonas 21 days ago
You can certainly grow the list, which really isn't a problem if you don't think it will be long. Alternatively, I've also seen people initialize a list that's larger than you think it will be and then prune it after you know how big it actually should be
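[Editor's note: a sketch of the preallocate-then-prune idea described above; the cutoff used here is arbitrary, just to give the loop an end.]

```r
n_max <- 1000                    # generous guess at the upper bound
results <- vector("list", n_max) # preallocate so the list isn't copied as it grows

i <- 0
repeat {
  value <- runif(1)
  if (value > 0.95) break        # stand-in for "no more data to collect"
  i <- i + 1
  results[[i]] <- value
}

results <- results[seq_len(i)]   # prune the unused tail
```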
@samlawrence4627 22 days ago
Thank you for the video. I just have a question about the @examples section. When I preview the documentation file under the help tab in Rstudio, I see a link that says "Run examples." When I click this link, it takes me to a blank page that says "Example/<function name> not found." When I look at other packages, I can click on this link, and it shows the output given by the examples. Is there something I have to do to get this link to work? Or does this happen after the package is submitted?
@Riffomonas 22 days ago
I think that only works on packages already on CRAN
@samlawrence4627 21 days ago
@@Riffomonas Okay, thank you
@ericagardner8249 26 days ago
Thank you, this is so helpful :)
@Riffomonas 23 days ago
my pleasure! thanks for watching 🤓
@djangoworldwide7925 29 days ago
Using a tmp dir is such a flex. I really gotta use this more often...
@Riffomonas 29 days ago
hah - thanks!
@souIsynapse 1 month ago
I faxed this to the author of my favorite package, whose last update was from the early Devonian. Thanks!
@Riffomonas 29 days ago
lol - thanks for watching 🤓
@markusmuller65656 1 month ago
Thanks for sharing.
@Riffomonas 29 days ago
absolutely - thanks for watching!
@rags3791 1 month ago
excellent
@Riffomonas 29 days ago
my pleasure! thanks for watching :)
@SuperDashdash 1 month ago
Sir, I have always been thrilled by your R techniques and your way of explaining them. Your videos led me to switch to R [and of course RStudio] from Python [Jupyter] and got me genuinely enjoying analytics with R. I work in the aerospace industry. While the organization leverages a couple of premium visualization tools, even for exploratory analysis, I have been using ggplot extensively along with my basic statistical knowledge since discovering your amazing videos, and of course teaching my colleagues. Thank you, @Riffomonas. Please keep posting more videos to enlighten thirsty analysts like me.
@Riffomonas 29 days ago
Thanks for your very kind comment!
@mocabeentrill 1 month ago
Wow! Thanks Pat. This was the most advanced episode in the series and it requires one to be well versed in the intricacies of base R. Thoroughly enjoyed it!
@Riffomonas 29 days ago
wonderful! sometimes it's fun to go into the weeds a bit :)
@chuckbecker4983 1 month ago
Great instructional video, thanks! During the pandemic I became proficient at this but haven't used Git in a couple of years. You provided just the guidance I needed.
@Riffomonas 29 days ago
So glad to hear it - thanks for watching!
@Jeep-d7c 1 month ago
Have you tried the join from the collapse package? It is very fast in my tests. collapse::join(x, y, how="inner", on=c("a"="b"))
@Riffomonas 29 days ago
Thanks - I'll have to check that out
@user-rf9ow8ck2l 1 month ago
One of the frustrating limitations of the conda approach for R is that only a fraction of the packages on CRAN are compiled and installable via conda. Other packages are not on CRAN, and it's not clear how to install those via conda. Any thoughts on this?
@Riffomonas 1 month ago
I have found that the major packages are available in one of the conda collections. It's not a horrible process to contribute a conda package if you need to. This SO link is somewhat helpful stackoverflow.com/questions/52061664/install-r-package-from-github-using-conda
@user-ro9ex5im2p 1 month ago
This was great! Thank you
@Riffomonas 1 month ago
Thanks for watching - I'm glad you enjoyed it!
@meronghirmay4960 1 month ago
Three years since this video was posted, and here I am too making a nice figure. And this is not the only video that has helped me. Thank you very much, Dr Schloss.
@Riffomonas 1 month ago
Hah! Thanks so much for watching. I'm glad you're finding my videos helpful 🤓
@ahmed007Jaber 1 month ago
Hi Pat, thank you for this. Could you please check the blog post? I guess it is not uploaded yet. Thank you so much for the knowledge sharing and the effort you put in, which have helped me immensely.
@Riffomonas 1 month ago
Thanks for the heads up - it's there now
@mariliaamaralmarcondes6943 1 month ago
I am from Brazil, and your explanation is so good that I can understand all of your class. Thank you very much.
@Riffomonas 1 month ago
Wonderful - my pleasure!
@monzerthejoker343 1 month ago
I didn't understand anything
@Riffomonas 1 month ago
Sorry! If there's anything specific let me know what was confusing
@djangoworldwide7925 1 month ago
3 mins ago. I must be your number one fan. Thanks Pat! Your series is great.
@Riffomonas 1 month ago
lol - well, you're #1 today 😂
@AdamHillier-h7p 1 month ago
Hi, I could not load the BiodiversityR package. Error: package 'tcltk' could not be loaded. Any ideas?
@Riffomonas 1 month ago
Sorry, I'm not familiar with the BiodiversityR package. You might try installing tcltk and then try BiodiversityR again
@miissJoceLyn 1 month ago
This is an outstanding explanation of everything you are doing here. This is amazing, this content really contributes to science. Thank you so much <3
@Riffomonas 1 month ago
my pleasure! thanks for watching🤓
@user-sb9oc3bm7u 1 month ago
I love how Julia is always an honoured guest in these debates
@Riffomonas 1 month ago
Hey, if your local community uses Julia, go for it. Same goes for Fortran, Haskell, whatever. There's no debate. The only rule is that people need to learn to program. It's best to learn what your local community uses. No interest in engaging in any type of language wars here 🤓
@PhilippusCesena 1 month ago
Great video as always!
@Riffomonas 1 month ago
Thanks Philippus!
@djangoworldwide7925 1 month ago
11:09 "will" instead of "with" - not sure if you fixed it later. 23:45 should be fixed to @returns.
@Riffomonas 1 month ago
Thanks! I think I got the @returns after editing the video :) It really doesn't matter whether you use @return or @returns, but @returns is the newer convention.
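[Editor's note: for reference, a minimal roxygen2 skeleton using the newer @returns tag; the function and wording are made up for illustration.]

```r
#' Add two numbers
#'
#' @param x A numeric vector.
#' @param y A numeric vector.
#'
#' @returns A numeric vector: the element-wise sum of `x` and `y`.
#' @export
#'
#' @examples
#' add_pair(1, 2)
add_pair <- function(x, y) {
  x + y
}
```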
@djangoworldwide7925 1 month ago
I wonder how the process might look today if you did something like this with ChatGPT: providing the function to ChatGPT, uploading the roxygen documentation (optional), and asking it to write documentation with the important key headers, including examples. That way you can make sure your wording is consistent across functions, as well as argument names. You must provide some context, but I bet it can save a lot of time.
@shadyamigo 1 month ago
I’ve used it for this very purpose. It’s very good and saves so much time
@Riffomonas 1 month ago
I'll make a deal with you and all of my other viewers... I promise that I will never intentionally use ChatGPT or its ilk to generate code, documentation, anything on my channel :)
@djangoworldwide7925 1 month ago
I believe you, seeing how well you recall complex regex ;) @@Riffomonas
@joshstat8114 1 month ago
May I recommend using `bench::mark()` whenever you benchmark expressions?
@Riffomonas 1 month ago
Thanks - I've used it in other episodes, but I find that {microbenchmark} is easier to use for some applications
@elforich 1 month ago
Very well put together, thanks
@Riffomonas 1 month ago
Thanks for watching!
@tedhermann3424 1 month ago
Just to note, using as.data.table() or setDT() will be considerably faster than data.table(). data.table also comes with its own version of merge() so you don't have to use the funky syntax for a full merge.
@Riffomonas 1 month ago
Thanks for the feedback - I'm finding that if I use as.data.table or setDT, I get similar results to plain data.table and inner_join
@tedhermann3424 1 month ago
@@Riffomonas I made synthetic datasets since I don't have your fasta data. In my test, dtA was the fastest, followed by using setDT and as.data.table. data.table() was similar to dplyr with inner_join. Here is my code:

each_num <- 1e4

animal_legs <- map_dfr(
  data.frame(animal = c("cow", "fish", "chicken", "dog", "sheep"),
             n_legs = c(4, 0, 2, 4, 4)),
  rep, each = each_num) %>%
  mutate(n = 1:nrow(.))

animal_sounds <- map_dfr(
  data.frame(animal = c("cow", "chicken", "cat", "sheep", "dog"),
             sounds = c("mooo", "cluck", "meow", "baaa", "bark")),
  rep, each = each_num) %>%
  mutate(n = 1:nrow(.))

# make a copy for setDT because it changes things in place,
# which would affect everything else relying on animal_legs and animal_sounds
# in microbenchmark.
animal_legs_2 <- copy(animal_legs)
animal_sounds_2 <- copy(animal_sounds)

# data.table for dtA test
animal_legs_dt <- data.table::data.table(animal_legs, key = "n")
animal_sounds_dt <- data.table::data.table(animal_sounds, key = "n")

microbenchmark::microbenchmark(
  base = base::merge(animal_legs, animal_sounds, by = "n", all = FALSE),
  ij = dplyr::inner_join(animal_legs, animal_sounds, by = "n"),
  dt = {
    animal_legs_dt_test <- data.table::data.table(animal_legs, key = "n")
    animal_sounds_dt_test <- data.table::data.table(animal_sounds, key = "n")
    animal_legs_dt_test[animal_sounds_dt_test, nomatch = NULL, on = .(n)] # inner join
  },
  dtA = animal_legs_dt[animal_sounds_dt, nomatch = NULL, on = .(n)],
  as_dt = {
    animal_legs_dt_test <- data.table::as.data.table(animal_legs, key = "n")
    animal_sounds_dt_test <- data.table::as.data.table(animal_sounds, key = "n")
    animal_legs_dt_test[animal_sounds_dt_test, nomatch = NULL, on = .(n)]
  },
  set_dt = setDT(animal_legs_2)[setDT(animal_sounds_2), nomatch = NULL, on = .(n)]
)

Here's the results table on my computer:

Unit: milliseconds
   expr       min        lq      mean    median        uq        max neval cld
   base 29.754131 31.562985 33.248288 32.509195 34.936214  38.573338   100 a
     ij  3.847486  4.260528  5.063156  4.359981  4.581972  10.221760   100 b
     dt  3.840432  4.029632  6.483946  4.263007  4.815313 145.038057   100 b
    dtA  2.474058  2.599248  4.473535  2.822705  3.102000 137.916202   100 b
  as_dt  3.255442  3.403854  4.397897  3.637299  4.121683  11.064186   100 b
 set_dt  2.881791  2.995842  3.809361  3.250692  3.767623   7.726401   100 b

Base R was actually the fastest when I did this test with the original animal_* datasets, but it clearly doesn't scale very well.
@tedhermann3424 1 month ago
@@Riffomonas I've tried replying numerous times, but my comment gets removed each time; I think it doesn't like the code snippet I'm trying to share. Anyway, I made synthetic datasets ~50,000 rows long, where each row is a unique group, so that it is comparable to your fasta data. dtA is consistently fastest, followed by setDT and as.data.table. One thing I had to control for was using a copy of the data frame for setDT (e.g., df_copy <- copy(df)) because setDT works in place. If you use the same df reference for all items in microbenchmark, you run the risk of setDT changing your data frame in place to a data.table; then any subsequent runs with data.table() or as.data.table() take the same amount of time because df is already a data.table. Maybe YouTube will let me share the code as a GitHub link: github.com/mrguyperson/joins_example/blob/main/R/joins.R
@PhilippusCesena 1 month ago
Great video. I was used to dplyr, and it is very interesting to see other approaches.
@Riffomonas 1 month ago
Thanks! Glad you enjoyed it 🤓
@rayflyers 1 month ago
A few thoughts come to mind.

1) dplyr always outputs tibbles. If you're going to use dplyr, it might be worth using tibbles throughout your package. The loss in performance is worth the consistent formatting, and tibbles are just better.

2) dplyr allows for multiple backends (dtplyr, dbplyr, duckplyr, arrow, etc). Would those affect your code? If I call duckplyr::methods_overwrite(), and a package has a custom function that calls dplyr::inner_join() under the hood, would it now call duckplyr::inner_join() under the hood instead?

3) Similarly, if I pipe a dataframe into dtplyr::lazy_dt() and then into a custom join function that calls dplyr::inner_join() under the hood, would it work and use the data.table method? Or would dtplyr just not know how to translate the code?

I know that you're not planning to write a custom join function, but your video still sparked these curiosities in me. Lately I've been looking at these dplyr backends as a way to scale up our work for big data projects without making my team learn new syntax, so they've been on my mind a lot. Great video as always!
@Riffomonas 1 month ago
Great - thanks for the feedback. For now, the input to the phylotypr functions will be data.frames, but they should work fine if people provide tibbles or data.tables. The output will be base R structures like lists and character strings.
@jmoggridge 1 month ago
> class(iris)
[1] "data.frame"
> x <- tibble::tibble(Species = iris$Species[1])
> class(x)
[1] "tbl_df"     "tbl"        "data.frame"
> iris |> dplyr::inner_join(x) |> class()
[1] "data.frame"
> iris |> dplyr::inner_join(x) |> tibble::as_tibble() |> class()
[1] "tbl_df"     "tbl"        "data.frame"
@spacelem 1 month ago
That was super helpful, thank you! I'll admit that joining was something I hadn't really got the hang of, and even though I have gone through the tutorials, I didn't really appreciate what was going on. Only rolling joins left to figure out, and then I can say I've mastered data.table! I would add, though, that animal_legs_dt[animal_sounds_dt[uniq_animals]] for a full join is... pretty ugly! Instead, data.table provides its own version of merge that looks exactly like base R.
@Riffomonas 1 month ago
Thanks - I hadn't seen data.table::merge. That would simplify things considerably
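[Editor's note: a small sketch of the merge() method for data.tables, with toy data; merge.data.table dispatches automatically when both inputs are data.tables.]

```r
library(data.table)

animal_legs_dt   <- data.table(animal = c("cow", "fish", "dog"),
                               n_legs = c(4, 0, 4))
animal_sounds_dt <- data.table(animal = c("cow", "dog", "cat"),
                               sound  = c("moo", "bark", "meow"))

# Same interface as base::merge(); all = TRUE gives a full join
merge(animal_legs_dt, animal_sounds_dt, by = "animal", all = TRUE)
```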
@aidanmorales2576 1 month ago
Excellent video as always! You might want to check out the tidytable package by Mark Fairbanks. It provides a fast data.table backend for many dplyr, purrr, and tidyr functions, with tidyverse syntax. I find it works great for speeding up R package development and code, while keeping dependencies down and keeping the code readable/maintainable for those not as familiar with data.table.
@markrandall7631 2 months ago
I have been trying this, and some URLs exit cleanly without being in single quotes while others need the URL in single quotes to exit. I've switched to wrapping all URLs in single quotes.
@Riffomonas 1 month ago
Yeah, bash can sometimes do different things with single vs double quotes. A backslash can be useful for escaping quotes if you need quotes within quotes.
@markrandall7631 1 month ago
@@Riffomonas this was without the URL in any quotes, like your script.
@guani2155 2 months ago
Hi Pat, thanks for the nice video! When I use nmds <- metaMDS(shared, autotransform = FALSE) and then scores(nmds), the output has both $sites (which is the Group here) and $species (OTUs). I cannot directly pipe it to ggplot. I wonder how you deal with it? Thanks!
@Riffomonas 2 months ago
Hmmm, I'm not sure - why are you giving metaMDS shared instead of a distance matrix? Could that be the difference between what you and I are doing? github.com/riffomonas/distances/blob/main/code/nmds.R
@guani2155 2 months ago
@@Riffomonas But at 12:27, you were using nmds <- metaMDS(shared, autotransform = FALSE), with shared as the input?
@Riffomonas 2 months ago
The rest of the video goes on to say that the defaults were not ideal and that rarefaction of the data was necessary
@guani2155 1 month ago
@@Riffomonas I see, thank you Pat!
@danielkwawuvi_tutorials 2 months ago
Thank you for the guidance. Do you have a video on performing Principal Component Analysis on microbial data? It will be helpful to see you do this. I am learning a lot from you, Prof.
@Riffomonas 2 months ago
Thanks for watching! Check out these two videos: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-G5Qckqq5Erw.html and ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-h7OrVmT7Ja8.html