
Web Scraping With Selenium Python: Delayed JavaScript Rendering 

Smartproxy

Want to learn web scraping with Selenium? This Web Scraping With Selenium Python tutorial shows how to handle dynamic content that JavaScript renders after a delay, and how to scrape in both headless and headful modes.
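The headless/headful switch comes down to a single Chrome argument, the same `--headless=new` flag that appears commented out in the code further down. Here is a minimal sketch of making that a toggle. The helper names `chrome_arguments` and `build_options` and the window-size value are illustrative assumptions, not part of the tutorial code.

```python
from typing import List


def chrome_arguments(headless: bool) -> List[str]:
    """Return the Chrome command-line arguments for the chosen mode.

    Hypothetical helper for illustration; the window size is an assumed value.
    """
    args = ["--window-size=1920,1080"]
    if headless:
        # Same flag that is commented out in the tutorial code.
        args.append("--headless=new")
    return args


def build_options(headless: bool):
    """Build ChromeOptions for either mode (requires the selenium package)."""
    # Imported lazily so chrome_arguments() can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in chrome_arguments(headless):
        options.add_argument(arg)
    return options


print(chrome_arguments(headless=True))
print(chrome_arguments(headless=False))
```

Headful mode opens a visible browser window, which helps while debugging selectors. Headless mode runs without a window and suits servers and scheduled jobs.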
🚀 Try Smartproxy proxies today: bit.ly/3CXBREx
⚙️ You can find Selenium documentation here: www.selenium.d...
⚙️ Beautiful Soup documentation: readthedocs.or...
⚙️ Find the full code archive on our GitHub: github.com/Sma...
The requirements for the code:
webdriver-manager
selenium
bs4
Copy the code:
from webdriver_manager.chrome import ChromeDriverManager
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from extension import proxies  # local "extension" module with the proxy extension helper (see the GitHub repo)
from bs4 import BeautifulSoup
import json
username = 'spkjz8uhm3'
password = 'dwnacUgGr28wQh41yU'
endpoint = 'gate.smartproxy.com'
port = '7000'
# Set up Chrome WebDriver
chrome_options = webdriver.ChromeOptions()
proxies_extension = proxies(username, password, endpoint, port)
chrome_options.add_extension(proxies_extension)
# chrome_options.add_argument("--headless=new")
chrome = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=chrome_options)
# Open the desired webpage
url = "quotes.toscrap..."
chrome.get(url)
# Wait for the "quotes" divs to load
wait = WebDriverWait(chrome, 30)
quote_elements = wait.until(EC.presence_of_all_elements_located((By.CLASS_NAME, "quote")))
# Extract the HTML of all "quote" elements, parse them with BS4 and save to JSON
quote_data = []
for quote_element in quote_elements:
    print(quote_element.get_attribute("outerHTML"))
    soup = BeautifulSoup(quote_element.get_attribute("outerHTML"), 'html.parser')
    quote_text = soup.find('span', class_='text').text
    author = soup.find('small', class_='author').text
    tags = [tag.text for tag in soup.find_all('a', class_='tag')]
    quote_info = {
        "Quote": quote_text,
        "Author": author,
        "Tags": tags
    }
    quote_data.append(quote_info)
with open('quote_info.json', 'w') as json_file:
    json.dump(quote_data, json_file, indent=4)
# Close the WebDriver
chrome.quit()
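The explicit wait is what makes the code above work on a page that renders its content late: `WebDriverWait(chrome, 30).until(...)` keeps re-checking the condition until it returns something truthy or the 30 seconds run out. The sketch below illustrates that polling idea in plain Python. It is not Selenium's implementation, and `wait_until` and `FakePage` are hypothetical names that simulate a page whose quotes appear after a delay.

```python
import time


def wait_until(condition, timeout=30.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class FakePage:
    """Stand-in for a page whose quotes are rendered after a delay (simulation only)."""

    def __init__(self, ready_after):
        self._ready_at = time.monotonic() + ready_after

    def find_quotes(self):
        # Returns an empty list until the simulated rendering has finished.
        if time.monotonic() >= self._ready_at:
            return ["quote-1", "quote-2"]
        return []


page = FakePage(ready_after=0.2)
quotes = wait_until(page.find_quotes, timeout=2.0, poll_interval=0.05)
print(quotes)
```

An empty list is falsy, so the loop keeps polling until at least one element exists, which mirrors waiting for the "quote" divs to be present before parsing them.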
💡 For more web scraping with Python tutorials, check out our playlist: • Python Web Scraping Tu...
❓ Why use Python for web scraping?
Python is widely considered one of the most convenient programming languages for web scraping. It is general-purpose and offers a variety of web scraping frameworks and libraries, such as Selenium, Beautiful Soup, and Scrapy. It is also easy to pick up, even for beginners, thanks to its gentle learning curve.

Published: Sep 29, 2024

Comments: 9
@Smartproxy, 11 months ago
Hey, now you can find the requirements file, proxy extension, and the code itself on our GitHub account: github.com/Smartproxy/selenium-delayed-js
@UmeshaKaushan-ve3bl, 3 months ago
Mm
@UmeshaKaushan-ve3bl, 3 months ago
Mm
@icrmsoftware59, 5 months ago
What if we want to use only city specific proxies ?
@DesarrolladorVerificante, 10 months ago
How would it work for Selenium Firefox !!
@gxbytes, 1 year ago
very less ip pool can you help me in this
@Smartproxy, 5 months ago
If you would need any help, feel free to contact our support team: direct.lc.chat/12092754/1
@darksideishere, 1 year ago
Thanks a lot for this!
@Smartproxy, 11 months ago
We're glad to help!