Download the file ⬇ - goodly.co.in/combine-data-from-multiple-pdf-files-excel Tackle even the most challenging data-cleaning problems. Check out the M Language course and push beyond the user interface ↗ - rb.gy/a2zsnn
Hello, Chandeep! I have a question: doesn't the use of the Table.Combine function (at 6:07) have the same effect as using the Table.ColumnNames + Table.ExpandTableColumn (that you showed next)? It seems like the same result and it would be simpler, but I don't know if I am missing something here. Thanks! Your videos are great!
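A rough sketch of where the two approaches differ, using made-up inline tables rather than the video's files (the `File`, `Data`, `Item`, `Qty` and `Price` names are assumptions for illustration). Both give the same data columns, but `Table.Combine` works only on the list of nested tables, so any other column in the outer table, such as a file name, is not carried along. `Table.ExpandTableColumn` keeps it.

```powerquery
let
    // Hypothetical stand-in for the folder query: one row per file, one nested table per row
    Source = #table(
        {"File", "Data"},
        {
            {"a.pdf", #table({"Item", "Qty"}, {{"Pen", 1}})},
            {"b.pdf", #table({"Item", "Price"}, {{"Ink", 5}})}
        }
    ),

    // Option 1: append the nested tables; columns missing from a table are filled with null,
    // and the File column is dropped because only Source[Data] is passed in
    Combined = Table.Combine(Source[Data]),

    // Option 2: gather every column name across the nested tables, then expand in place
    AllNames = List.Distinct(List.Combine(List.Transform(Source[Data], Table.ColumnNames))),

    // The File column survives alongside the expanded columns
    Expanded = Table.ExpandTableColumn(Source, "Data", AllNames)
in
    Expanded
```

So if only the nested data is needed, `Table.Combine` is likely the shorter route; if the file name or other outer columns matter, expanding keeps them.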
Thank you so much for such a great video! I searched Google for a way to solve this problem but didn't find any good solution. I am happy I watched your video, which solved my problem.
Great video... Question though: what if Power Query does not read the PDF in a workable format? I have an invoice PDF where, when I import it into PQ, the columns get jumbled up and I am not able to clean the data for reconciliation. I have not used the method from this video yet, but I will. Any thoughts other than using third-party apps? Thank you!
I've used PQ to import PDF data a lot. One thing I found is that PDFs from the "Print to PDF" printer built into Chrome-based browsers are the easiest to work with. I have a full license for Acrobat, but the Adobe PDF printer produces some of the most difficult PDFs to work with. The Microsoft Print to PDF printer isn't much better. If anyone knows of settings to adjust in these printers to make the PDFs easier to work with, please reply!
This was super helpful, but I have recently also started to use extraktAI, it saves me a ton of time tbh 🔥 but either way a great video guys, keep up the good work!
@goodly I have a PDF file where the data sits right below the column headers, and there are 300 pages in that file. How am I going to get the data out? Please make a video on this, or suggest another way.
Thank you so much! I get PO copies in PDF format ... There are different sections on the PO like Supplier address/buyer address, PO # section / PO issued date / Section for LE name of the Business unit from where the PO was issued, and then the Milestone description with the amount that needs to be billed once milestone is completed. How can I put them on a table from different sections?
I have a PDF file with a total of 50 pages (page001 to page050). Each page has the same column structure with different records, but every page also contains a header and footer. I have to remove all those header and footer rows, plus some unwanted columns, before I can combine all 50 pages into one table. I created a function in Power Query to repeat the cleaning steps for all 50 pages, but on some pages it detects fewer or more columns than on the rest, even though they all have the same structure when viewed in a PDF reader. How do I deal with that issue?
Ahh....been facing a similar problem here, different number of columns detected even though they looked the same in the PDF reader app. Any solution would be appreciated
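One possible workaround, sketched under the assumption that each page still has a usable header row after promotion, so the columns can be picked by name rather than by position. The page tables and the `Date`, `Amount` and `Column2` names below are invented for illustration. `Table.SelectColumns` with `MissingField.UseNull` keeps only the wanted columns and adds any missing one as null, so every page ends up with the same shape before combining.

```powerquery
let
    // Hypothetical pages: the second page was detected with an extra, empty column
    Page1 = #table({"Date", "Amount"}, {{"2024-01-01", 10}}),
    Page2 = #table({"Date", "Column2", "Amount"}, {{"2024-01-02", null, 20}}),
    Pages = {Page1, Page2},

    // The columns wanted in the final table (assumed names)
    Wanted = {"Date", "Amount"},

    // Keep only the wanted columns on each page; a missing column becomes null instead of an error
    Aligned = List.Transform(Pages, each Table.SelectColumns(_, Wanted, MissingField.UseNull)),

    // Every page now has the same columns, so the append is straightforward
    Result = Table.Combine(Aligned)
in
    Result
```

This does not help when the extra column appears because a real column was split in two; in that case the pieces would need to be merged back on the affected pages first.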
Great job! A next step could be to clean up the column names, in case they are written a little differently or sometimes contain stray white space.
You can do it in the last step as well, without adding new Rename steps. As you can see in the hard-coded version, there are two lists: one that needs to match the data source, and a second one with the new names you want to rename to.
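A small sketch of both ideas together: trimming stray spaces from the headers, then renaming from two parallel lists. The table and the column names are made up for illustration, and this is not necessarily how the video's hard-coded step is written.

```powerquery
let
    // Hypothetical table with stray spaces and inconsistent header text
    Source = #table({" Emp ID ", "emp name"}, {{1, "Asha"}}),

    // Trim leading and trailing whitespace from every column name
    Trimmed = Table.TransformColumnNames(Source, Text.Trim),

    // Two parallel lists: names as they appear in the source, and the names wanted
    OldNames = {"Emp ID", "emp name"},
    NewNames = {"Employee ID", "Employee Name"},

    // List.Zip pairs them up as {old, new}; MissingField.Ignore skips old names that are absent
    Renamed = Table.RenameColumns(Trimmed, List.Zip({OldNames, NewNames}), MissingField.Ignore)
in
    Renamed
```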
I had nested tables after grouping data. All the tables had the same number of rows (4). I needed one of the columns to be replaced with a fixed list of names (departments, for example), but I just could not get it to work. As a workaround, I added an Index column (1, 2, 3, 4) to the nested tables, expanded them, and then merged in the department names from another query. So the process was: 1. Create the departments in an Excel sheet and make it a table. 2. Import this into a query. 3. Import the other data from a table in Excel, group the data into nested tables, add an Index to these tables and expand. 4. Create a third query to merge 2 and 3. I am sure I should be able to get that list into the nested tables the same way I inserted the Index column. Please help.
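One way this might be done without the Index and Merge, sketched with invented data: the `Departments` list, the `Region`, `Rows`, `Dept` and `Value` names, and the `ReplaceDept` helper are all hypothetical. The idea is to split each nested table into its columns, swap one column for the fixed list, and rebuild the table. It relies on every nested table having exactly as many rows as the list has items.

```powerquery
let
    // Hypothetical department list; in practice this could be a column from another query
    Departments = {"Sales", "Finance", "HR", "IT"},

    // Hypothetical grouped data: each nested table has four rows
    Grouped = #table(
        {"Region", "Rows"},
        {
            {"North", #table({"Dept", "Value"}, {{"a", 1}, {"b", 2}, {"c", 3}, {"d", 4}})},
            {"South", #table({"Dept", "Value"}, {{"e", 5}, {"f", 6}, {"g", 7}, {"h", 8}})}
        }
    ),

    // Hypothetical helper: replace the Dept column of one nested table with the fixed list
    ReplaceDept = (t as table) as table =>
        let
            Names = Table.ColumnNames(t),
            Pos = List.PositionOf(Names, "Dept"),
            // Swap the list of Dept values for the Departments list, by position
            Cols = List.ReplaceRange(Table.ToColumns(t), Pos, 1, {Departments})
        in
            Table.FromColumns(Cols, Names),

    // Apply the helper to every nested table, then expand as usual
    Replaced = Table.TransformColumns(Grouped, {{"Rows", ReplaceDept}}),
    Expanded = Table.ExpandTableColumn(Replaced, "Rows", {"Dept", "Value"})
in
    Expanded
```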
Chandeep, your videos are really deep, simple and practical, detailing all steps from zero to the last. Can you show how to combine PDF files that are password protected, where the password is known to the user? One way is to open those files, print them as PDF and then store them in the folder, which is cumbersome. Is there any shortcut?
Thanks as usual. But can you provide an example for when we need to remove some data from the rows of that PDF? Also, every page has the name and ID of each employee, and we need to add both of them as columns.
I have invoice PDFs, 30 of them each month, with multiple tables and scattered data. I tried a lot of manipulation but wasn't able to get the desired output.
Hello Goodly/all, it would be wonderful if someone could confirm this: I was practicing along with the video, and in that situation it seemed the following two produce the same result: ✓ Table.ExpandTableColumn and ✓ Table.Combine. Is my understanding correct? Thank you!
Bro bro brooo, in previous videos, instead of adding new columns and deleting them afterwards, you used Table.TransformColumns, which I liked a lot, and I now use this practice thanks to you. Is there a reason why you added new columns this time? (Is it faster, more effective, etc.?)