
How to extract PDF Data from PDF File using PDF.js 

Recoding

In this video we will learn how to extract data from a PDF file using PDF.js.
Timeline
Intro (00:00)
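As a rough illustration of the kind of extraction the video covers, here is a minimal sketch that collects the text of every page. The helper name `extractText` is hypothetical, and the PDF.js module is passed in as a parameter because the correct import path depends on the installed version (see the comments below). It assumes the standard `getDocument`, `getPage`, and `getTextContent` calls.

```javascript
// Sketch only: `pdfjs` is the PDF.js module, loaded however your version
// requires, and `src` is anything getDocument accepts, such as a URL string
// or { data: Uint8Array }.
async function extractText(pdfjs, src) {
  // getDocument returns a loading task; the document arrives via `promise`.
  const doc = await pdfjs.getDocument(src).promise;
  const pages = [];
  for (let pageNumber = 1; pageNumber <= doc.numPages; pageNumber++) {
    const page = await doc.getPage(pageNumber); // pages are 1-indexed
    const content = await page.getTextContent();
    // Each text item carries its string in `str`; skip items without one.
    const text = content.items
      .filter((item) => typeof item.str === "string")
      .map((item) => item.str)
      .join(" ");
    pages.push(text);
  }
  return pages; // one string per page
}
```

Joining items with a space is a simplification; real layouts may need the position data that each item also carries.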

Published: 5 Oct 2024

Comments: 40
@vxyvxolkckklfjds8245 3 months ago
Works fine. Just add "type": "module" to package.json, and in the JS file use import * as pdfjs from 'pdfjs-dist/legacy/build/pdf.mjs'
@JadeclonOfficial 2 years ago
Great video, works fine! Note that you now have to define it like this: const pdfjs = require("pdfjs-dist/legacy/build/pdf");
@rohit1kumar 1 year ago
If you are working with Node 18, this also works fine: pdfjs = require("pdfjs-dist/build/pdf")
@jimmylexfernandezr6365 3 years ago
Good video, but how do I extract images from a PDF file?
@andrehasibuan3214 3 years ago
How do I solve this? Uncaught (in promise) TypeError: Cannot read property 'getDocument' of undefined
@peakonepainters5728 1 year ago
Can't get this to work. :( It's throwing weird errors asking me to use a 'legacy' build because it lacks native functionality.
@samehkhalil1769 2 years ago
Thank you. How can I extract data from fields in the PDF? I know the field names.
@subhanahmedkhan17 1 year ago
I'm not getting the font name as expected; instead I get an identifier like g_d0_f2.
@vitor-is5od 8 months ago
Is there a way to extract the PDF data client-side, in the user's browser?
@Jeff-zc6rr 2 years ago
Is the src from const file = event.target.files[0]; and then const src = URL.createObjectURL(file);? What is the src?
@gauravgupta9594 2 years ago
Bro, did you figure out what src is?
@ags5 1 year ago
This code is working fine 👍
@Jeff-zc6rr 2 years ago
getPage is not a function?
@jarboesolutions3519 3 years ago
Error: Cannot find module 'pdfjs-dist/es5/build/pdf'
@bondifrench3382 3 years ago
See my answer to Manoj Tewari
@alisonprieto9435 3 years ago
Excellent work
@Recoding 3 years ago
Thank you
@irfanbhuiyan620 3 years ago
Is it possible to run this without Node.js?
@ahmadhasan185 3 years ago
👍😍
@mateusalmeida358 3 years ago
Can I modify the PDF document with this library? For example, change the content of the hyperlinks or modify some words in the title?
@Recoding 2 years ago
Yes, you surely can. After extracting the data, you can change whatever you want.
@Jeff-zc6rr 2 years ago
const file = event.target.files[0]; const uri = URL.createObjectURL(file); var pdf = await pdfjs.getDocument({ url: uri }); var page = await pdf.getPage(1);
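A note on the snippet above: in recent PDF.js versions `getDocument` returns a loading task rather than the document itself, so awaiting it directly is a likely cause of the "getPage is not a function" error reported elsewhere in these comments. The sketch below awaits the task's `promise` instead. The helper name `loadFirstPage` is hypothetical, the module is passed in as `pdfjs`, and the file bytes are handed over through the `data` option instead of an object URL, which is one of several workable choices.

```javascript
// Sketch only: `pdfjs` is the loaded PDF.js module and `file` is a File/Blob,
// for example event.target.files[0] from an <input type="file"> element.
async function loadFirstPage(pdfjs, file) {
  // Read the selected file into memory; getDocument accepts raw bytes as `data`.
  const data = new Uint8Array(await file.arrayBuffer());
  // getDocument returns a loading task, not the document.
  const loadingTask = pdfjs.getDocument({ data });
  // The document object (with getPage, numPages, ...) is behind `promise`.
  const pdf = await loadingTask.promise;
  return pdf.getPage(1); // pages are 1-indexed
}
```

The object-URL approach from the comment should also work as `pdfjs.getDocument({ url: uri }).promise`; the important part is the `.promise`.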
@roberttroop1861 3 years ago
What is in the .vscode folder, if you don't mind?
@Recoding 2 years ago
It is a folder that VS Code uses to store project-related settings.
@manojtewari3450 3 years ago
This code is not working... please tell us your version.
@Recoding 3 years ago
What version?
@bondifrench3382 3 years ago
At the time of the video (May 2021), the version must have been 2.7.570; the code in the video works with that version. The next version was 2.8.335, which no longer had an es5 folder, so you need to require("pdfjs-dist/legacy/build/pdf") instead for it to work. With the latest version (2.9.359), the code won't work because it now seems mandatory to set up a fake worker. So it is probably best to use the previous versions at this stage.
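The version history described above can be summarised as a small lookup. This is only a sketch: the function name `pdfjsEntryPoint` is hypothetical, the 2.7/2.8 boundary comes from this comment, and the assumption that the `.mjs` build mentioned in another comment starts with major version 4 is mine and should be checked against the release notes. Versions older than those discussed here are not covered.

```javascript
// Sketch only: map a pdfjs-dist version string to the entry point discussed
// in this thread. The major-version-4 boundary is an assumption.
function pdfjsEntryPoint(version) {
  const [major, minor] = version.split(".").map(Number);
  if (major >= 4) {
    // ES module build; needs import and "type": "module" (or an .mjs file).
    return { path: "pdfjs-dist/legacy/build/pdf.mjs", esm: true };
  }
  if (major > 2 || (major === 2 && minor >= 8)) {
    // From 2.8.335 on, the es5 folder was replaced by legacy.
    return { path: "pdfjs-dist/legacy/build/pdf", esm: false };
  }
  // 2.7.570 and the setup shown in the video.
  return { path: "pdfjs-dist/es5/build/pdf", esm: false };
}
```

For example, `pdfjsEntryPoint("2.7.570").path` gives the path used in the video, while `pdfjsEntryPoint("2.8.335").path` gives the legacy path.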
@rohit1kumar 1 year ago
@bondifrench3382 If you are working with Node 18, this also works fine: pdfjs = require("pdfjs-dist/build/pdf")
@cmwebcreations 3 years ago
What if the PDF data is an embedded JPEG?
@jeremyico7581 1 year ago
You have to try it, bro.
@tryhuckd9331 2 years ago
Excellent video. How do I extract images?
@Recoding 2 years ago
You need to filter out the img tags inside the PDF using the array filter() method.
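PDF.js does not expose HTML-style tags, so one possible reading of this reply is to filter a page's operator list for image-painting operators. The sketch below does that with the array `filter` method. The helper name `listImageNames` is hypothetical, `pdfjs` and `page` are passed in, and it assumes the `page.getOperatorList()` call and the `OPS.paintImageXObject` constant.

```javascript
// Sketch only: list the internal names of images painted on a page.
// `pdfjs` is the loaded PDF.js module; `page` comes from pdf.getPage(n).
async function listImageNames(pdfjs, page) {
  // The operator list holds parallel arrays: operator codes and their arguments.
  const { fnArray, argsArray } = await page.getOperatorList();
  return fnArray
    .map((fn, i) => ({ fn, args: argsArray[i] }))
    // Keep only the operators that paint an image XObject.
    .filter((op) => op.fn === pdfjs.OPS.paintImageXObject)
    // The first argument is the image's object name.
    .map((op) => op.args[0]);
}
```

The names returned are internal identifiers; turning them into pixel data is a further step (typically via the page's object store) and varies between versions, so it is left out here.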
@billmiller8201 3 years ago
The example does not work. Thumbs down.
@Recoding 3 years ago
What problem are you facing? Please describe it.
@billmiller8201 3 years ago
@Recoding As Jarboe stated in his comment, it can't find the required module 'pdfjs-dist/es5/build/pdf'
@bondifrench3382 3 years ago
See my answer to Manoj Tewari
@rentonx2006 2 years ago
@billmiller8201 In the latest versions, "es5" has been replaced by "legacy".
@shekhar7266 2 years ago
Getting this error: Cannot find module 'pdfjs-dist/es5/build/pdf'
@rohit1kumar 1 year ago
Change the path to 'pdfjs-dist/legacy/build/pdf' if you are using Node 16. If you are using Node 18, either 'pdfjs-dist/build/pdf' or 'pdfjs-dist/legacy/build/pdf' works fine.
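Since the working path depends on the installed package and the Node version, one way to avoid hard-coding it is to try the candidate paths in order. This is only a sketch: the helper name `requireFirst` is hypothetical, and the loader is passed in (normally Node's `require`) so the logic stays independent of the module system.

```javascript
// Sketch only: return the first module id that loads, trying ids in order.
// `load` is normally Node's `require`; it is a parameter here for flexibility.
function requireFirst(candidates, load) {
  const missing = [];
  for (const id of candidates) {
    try {
      return { id, module: load(id) };
    } catch (err) {
      // Only swallow "not found"; anything else is a real error.
      if (err.code !== "MODULE_NOT_FOUND") throw err;
      missing.push(id);
    }
  }
  throw new Error(`None of these modules could be loaded: ${missing.join(", ")}`);
}

// Possible usage in a CommonJS file (paths taken from the comments above):
// const { module: pdfjs } = requireFirst(
//   ["pdfjs-dist/legacy/build/pdf", "pdfjs-dist/build/pdf", "pdfjs-dist/es5/build/pdf"],
//   require
// );
```

Putting the legacy path first follows the advice above that it works on both Node 16 and Node 18.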
@tmoby420 1 year ago
@rohit1kumar I appreciate you!