{"id":64448,"date":"2025-01-24T13:18:38","date_gmt":"2025-01-24T13:18:38","guid":{"rendered":"https:\/\/proxidize.com\/?post_type=blog&#038;p=64448"},"modified":"2026-01-21T19:26:50","modified_gmt":"2026-01-21T19:26:50","slug":"python-libraries-for-web-scraping","status":"publish","type":"blog","link":"https:\/\/proxidize.com\/blog\/python-libraries-for-web-scraping\/","title":{"rendered":"4 Best Python Libraries for Web Scraping in 2026"},"content":{"rendered":"\n<p>Developers are always looking for the best tools for the job. Nowadays the internet is full of resources and libraries that you can use to optimise your work. Unfortunately, in our enthusiasm to try the latest tools and technologies, we can jump the gun. Many tools are still in their early stages or not yet mature enough to handle complex use cases, which leaves us disappointed.<\/p>\n\n\n\n<p>In this article we\u2019re going to talk about the best Python libraries for web scraping. Python is one of the most used programming languages in the world for <a href=\"https:\/\/proxidize.com\/blog\/web-scraping\/\" target=\"_blank\" rel=\"noreferrer noopener\">web scraping<\/a>. It has rich libraries and a very large community. <a href=\"https:\/\/proxidize.com\/blog\/what-is-python\/\" target=\"_blank\" rel=\"noreferrer noopener\">Python<\/a> is also generally the go-to language for AI, which helps if you use AI to code.<\/p>\n\n\n\n<p>Let\u2019s take a real example from a fellow developer. Say you work at a company that does intelligent pricing for books and sells the data as a service. You are tasked with creating a Python script that scrapes <a href=\"https:\/\/books.toscrape.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">this website<\/a> and gets the following data: book name, price, and availability. 
Your script has to save that data so it can be sold to users later.<\/p>\n\n\n\n<p>After a meeting with stakeholders you eventually go look at the website and see how its HTML is structured. You start to put together a plan for how to collect the data you need. You begin looking for the best tool for the job, which leads you down the rabbit hole of Python libraries.<\/p>\n\n\n\n<p>Let\u2019s talk about the four best Python libraries for web scraping. 
We\u2019ll walk you through code examples, when to use each library, and each one\u2019s pros and cons, so that by the end you will have a better understanding of how to use them and how to make your code more efficient.<\/p>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large centered\"><img fetchpriority=\"high\" decoding=\"async\" width=\"1024\" height=\"536\" src=\"https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/best_practices_before_web_scraping-img-1024x536.jpg\" alt=\"a drawing of the python logo, a browser and a magnifying glass under the title &quot;best practices before web scraping&quot;\" class=\"wp-image-95421\" srcset=\"https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/best_practices_before_web_scraping-img-1024x536.jpg 1024w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/best_practices_before_web_scraping-img-300x157.jpg 300w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/best_practices_before_web_scraping-img-768x402.jpg 768w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/best_practices_before_web_scraping-img-600x314.jpg 600w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/best_practices_before_web_scraping-img.jpg 1200w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Best Practices Before Web Scraping<\/h2>\n\n\n\n<p>Before you pick a Python web scraping library, there are some steps you should take.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Check The Website Structure<\/h3>\n\n\n\n<p>As a developer you should always inspect the website you are scraping before you write any code. 
By that I mean opening the browser\u2019s <a href=\"https:\/\/developer.chrome.com\/docs\/devtools\/inspect-mode\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">inspect window<\/a> and reading the target site\u2019s code. Pay attention to how it structures its data so you can build your code around the website.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Always Use a Virtual Environment in Python<\/h3>\n\n\n\n<p>To get started with these libraries, you need to install them. This is where some devs go wrong by installing every package globally, so that all their projects share the same set of packages. This is a great way to run into version conflicts.<\/p>\n\n\n\n<p>To solve this you need to use something called <a href=\"https:\/\/docs.python.org\/3\/library\/venv.html\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">venv<\/a> (short for virtual environment). It creates a virtual environment for just the current project and won\u2019t affect any other projects on your device. This keeps each project\u2019s dependencies isolated from the others.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Use a Package Manager<\/h3>\n\n\n\n<p>Using a package manager can make your life easier as a developer. You can control all your packages in one place. For this article, we used <a href=\"https:\/\/docs.astral.sh\/uv\/concepts\/projects\/sync\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">uv<\/a>. 
It&#8217;s fast and easy to use.<\/p>\n\n\n\n<p>Now that we\u2019ve covered the pre-project checks, let\u2019s dive into the four best Python libraries for web scraping.<\/p>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large centered\"><img decoding=\"async\" width=\"1024\" height=\"536\" src=\"https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/beautifulsoup4-img-1024x536.jpg\" alt=\"a drawing of the beautifulsoup logo under the title &quot;BeautifulSoup4&quot;\" class=\"wp-image-95420\" srcset=\"https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/beautifulsoup4-img-1024x536.jpg 1024w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/beautifulsoup4-img-300x157.jpg 300w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/beautifulsoup4-img-768x402.jpg 768w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/beautifulsoup4-img-600x314.jpg 600w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/beautifulsoup4-img.jpg 1200w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">BeautifulSoup4<\/h2>\n\n\n\n<p><a href=\"https:\/\/proxidize.com\/blog\/what-is-beautifulsoup\/\" target=\"_blank\" rel=\"noreferrer noopener\">BeautifulSoup<\/a> is a library used to scrape information from web pages. It sits on top of an HTML or XML parser and provides idioms for iterating over, searching, and modifying the parse tree. This library has a great deal of credibility because it has been around for about 20 years. 
It&#8217;s very mature and well maintained.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Installing BeautifulSoup4<\/h3>\n\n\n\n<p>Installing the library is as simple as installing any other library; simply copy and paste the command into your terminal and keep in mind the recommended download techniques we discussed.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(1 * 0.6 * .75rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#e0def4;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>pip install beautifulsoup4<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #EA9A97\">pip<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">install<\/span><span style=\"color: #E0DEF4\"> 
<\/span><span style=\"color: #F6C177\">beautifulsoup4<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Web Scraping with BeautifulSoup4<\/h3>\n\n\n\n<p>Once it is installed, we can continue our example task and scrape the book name, price, and availability of the books on our target website. BeautifulSoup4 only parses HTML, so to <a href=\"https:\/\/proxidize.com\/blog\/web-scraping-with-beautiful-soup\/\" target=\"_blank\" rel=\"noreferrer noopener\">web scrape with BeautifulSoup4<\/a> you also need an HTTP client to download the page. This example uses <a href=\"https:\/\/pypi.org\/project\/requests\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Requests<\/a>, which you install separately with pip install requests.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(2 * 0.6 * .75rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#e0def4;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>import requests\nfrom bs4 import BeautifulSoup\n\nurl = \"https:\/\/books.toscrape.com\/\"\nresponse = requests.get(url)\nresponse.raise_for_status()\n\nsoup = BeautifulSoup(response.text, \"html.parser\")\n\nbooks = soup.select(\"article.product_pod\")\n\nscraped_data = []\n\nfor book in books:\n    title = book.select_one(\"h3 a\")&#091;\"title\"&#093;\n    price = book.select_one(\"p.price_color\").text\n    availability = book.select_one(\"p.instock.availability\").text.strip()\n\n    scraped_data.append({\n        \"title\": title,\n        \"price\": 
price,\n        \"availability\": availability\n    })\n\nfor idx, book in enumerate(scraped_data, start=1):\n    print(f\"{idx}. {book&#091;'title'&#093;} \u2014 {book&#091;'price'&#093;} \u2014 {book&#091;'availability'&#093;}\")<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> requests<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">from<\/span><span style=\"color: #E0DEF4\"> bs4 <\/span><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> BeautifulSoup<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">url <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;https:\/\/books.toscrape.com\/&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">response <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> requests<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">get<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">url<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: 
#E0DEF4\">response<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">raise_for_status<\/span><span style=\"color: #908CAA\">()<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">soup <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> BeautifulSoup<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">response<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">text<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;html.parser&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">books <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> soup<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">select<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;article.product_pod&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">scraped_data <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #908CAA\">[]<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">for<\/span><span style=\"color: #E0DEF4\"> book <\/span><span style=\"color: #3E8FB0\">in<\/span><span style=\"color: #E0DEF4\"> books<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    title <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> book<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">select_one<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;h3 
a&quot;<\/span><span style=\"color: #908CAA\">)&#091;<\/span><span style=\"color: #F6C177\">&quot;title&quot;<\/span><span style=\"color: #908CAA\">&#093;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    price <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> book<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">select_one<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;p.price_color&quot;<\/span><span style=\"color: #908CAA\">).<\/span><span style=\"color: #E0DEF4\">text<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    availability <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> book<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">select_one<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;p.instock.availability&quot;<\/span><span style=\"color: #908CAA\">).<\/span><span style=\"color: #E0DEF4\">text<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">strip<\/span><span style=\"color: #908CAA\">()<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    scraped_data<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">append<\/span><span style=\"color: #908CAA\">({<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        <\/span><span style=\"color: #F6C177\">&quot;title&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> title<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        <\/span><span style=\"color: #F6C177\">&quot;price&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> price<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: 
#E0DEF4\">        <\/span><span style=\"color: #F6C177\">&quot;availability&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> availability<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">})<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">for<\/span><span style=\"color: #E0DEF4\"> idx<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> book <\/span><span style=\"color: #3E8FB0\">in<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #EB6F92; font-style: italic\">enumerate<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">scraped_data<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #C4A7E7; font-style: italic\">start<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #EA9A97\">1<\/span><span style=\"color: #908CAA\">):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #3E8FB0\">f<\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #3E8FB0\">{<\/span><span style=\"color: #E0DEF4\">idx<\/span><span style=\"color: #3E8FB0\">}<\/span><span style=\"color: #F6C177\">. 
<\/span><span style=\"color: #3E8FB0\">{<\/span><span style=\"color: #E0DEF4\">book<\/span><span style=\"color: #908CAA\">&#091;<\/span><span style=\"color: #F6C177\">&#39;title&#39;<\/span><span style=\"color: #908CAA\">&#093;<\/span><span style=\"color: #3E8FB0\">}<\/span><span style=\"color: #F6C177\"> \u2014 <\/span><span style=\"color: #3E8FB0\">{<\/span><span style=\"color: #E0DEF4\">book<\/span><span style=\"color: #908CAA\">&#091;<\/span><span style=\"color: #F6C177\">&#39;price&#39;<\/span><span style=\"color: #908CAA\">&#093;<\/span><span style=\"color: #3E8FB0\">}<\/span><span style=\"color: #F6C177\"> \u2014 <\/span><span style=\"color: #3E8FB0\">{<\/span><span style=\"color: #E0DEF4\">book<\/span><span style=\"color: #908CAA\">&#091;<\/span><span style=\"color: #F6C177\">&#39;availability&#39;<\/span><span style=\"color: #908CAA\">&#093;<\/span><span style=\"color: #3E8FB0\">}<\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Pros of Web Scraping with BeautifulSoup4<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Beginner friendly:<\/strong> If you are just starting as a developer, you will be in safe hands with this library. It\u2019s one of the most beginner-friendly web scraping libraries out there, with clear documentation and a simple API.<\/li>\n\n\n\n<li><strong>Strong community support:<\/strong> Because it&#8217;s been around for about 20 years, many people have used it and built projects on top of it. That means there are plenty of people online to ask for help if you need it.<\/li>\n\n\n\n<li><strong>Lightweight: <\/strong>BeautifulSoup only parses HTML, so there is no browser to launch and very little overhead when you scrape static websites. 
Most of the time when dealing with this kind of website you don&#8217;t need JavaScript execution.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Cons of Web Scraping with BeautifulSoup4<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Requires external libraries:<\/strong> BeautifulSoup is a parsing library. It has no way of its own to send requests or run automation, so it needs other libraries, as we saw earlier in the code.<\/li>\n\n\n\n<li><strong>Relatively slower than other libraries:<\/strong> With Python\u2019s built-in html.parser, as used in the code above, parsing is slower than with alternatives such as lxml, which BeautifulSoup can also use as its parser. The Requests library used alongside it is synchronous as well, so each request waits for the previous one to finish.<\/li>\n\n\n\n<li><strong>You always have to maintain the code: <\/strong>Because BeautifulSoup relies on HTML structure, you will have to update your code whenever the website\u2019s structure changes.<\/li>\n<\/ol>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large centered\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"536\" src=\"https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/scrapy-img-1024x536.jpg\" alt=\"a drawing of the scrapy logo under the title &quot;scrapy&quot;\" class=\"wp-image-95419\" srcset=\"https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/scrapy-img-1024x536.jpg 1024w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/scrapy-img-300x157.jpg 300w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/scrapy-img-768x402.jpg 768w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/scrapy-img-600x314.jpg 600w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/scrapy-img.jpg 1200w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 
class=\"wp-block-heading\">Scrapy<\/h2>\n\n\n\n<p><a href=\"https:\/\/www.scrapy.org\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Scrapy<\/a> is one of the most established Python frameworks for web scraping and has been around since 2008. It has a very good reputation and developers love it because it\u2019s both free and open source. It can take care of everything for you, including managing requests and storing data in an organized way, which makes it easier to start scraping.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Installing Scrapy<\/h3>\n\n\n\n<p>Scrapy\u2019s website offers very clear documentation and an extremely simple installation process. Basically, you copy and paste the command into your terminal and you are ready to start scraping.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(1 * 0.6 * .75rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#e0def4;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>pip install scrapy<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" 
stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #EA9A97\">pip<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">install<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">scrapy<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Web Scraping with Scrapy<\/h3>\n\n\n\n<p>Now you should be able to start scraping the book website and extract the data needed from it.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(2 * 0.6 * .75rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#e0def4;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>import scrapy\n\nclass BooksSpider(scrapy.Spider):\n    name = \"books\"\n    allowed_domains = &#091;\"books.toscrape.com\"&#093;\n    start_urls = &#091;\"https:\/\/books.toscrape.com\/\"&#093;\n\n    def parse(self, response):\n        for book in response.css(\"article.product_pod\"):\n            availability = \"\".join(book.css(\"p.instock.availability::text\").getall()).strip()\n            \n            yield {\n                \"title\": book.css(\"h3 a::attr(title)\").get(),\n                \"price\": 
book.css(\"p.price_color::text\").get(),\n                \"availability\": availability\n            }<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> scrapy<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">class<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #9CCFD8\">BooksSpider<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #C4A7E7; font-style: italic\">scrapy<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #C4A7E7; font-style: italic\">Spider<\/span><span style=\"color: #908CAA\">):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    name <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;books&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    allowed_domains <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #908CAA\">&#091;<\/span><span style=\"color: #F6C177\">&quot;books.toscrape.com&quot;<\/span><span style=\"color: #908CAA\">&#093;<\/span><\/span>\n<span class=\"line\"><span style=\"color: 
#E0DEF4\">    start_urls <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #908CAA\">&#091;<\/span><span style=\"color: #F6C177\">&quot;https:\/\/books.toscrape.com\/&quot;<\/span><span style=\"color: #908CAA\">&#093;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #3E8FB0\">def<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #EA9A97\">parse<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #C4A7E7; font-style: italic\">self<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #C4A7E7; font-style: italic\">response<\/span><span style=\"color: #908CAA\">):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        <\/span><span style=\"color: #3E8FB0\">for<\/span><span style=\"color: #E0DEF4\"> book <\/span><span style=\"color: #3E8FB0\">in<\/span><span style=\"color: #E0DEF4\"> response<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">css<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;article.product_pod&quot;<\/span><span style=\"color: #908CAA\">):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            availability <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;&quot;<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">join<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">book<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">css<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;p.instock.availability::text&quot;<\/span><span style=\"color: #908CAA\">).<\/span><span style=\"color: #E0DEF4\">getall<\/span><span style=\"color: 
#908CAA\">()).<\/span><span style=\"color: #E0DEF4\">strip<\/span><span style=\"color: #908CAA\">()<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            <\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            <\/span><span style=\"color: #3E8FB0\">yield<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #908CAA\">{<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">                <\/span><span style=\"color: #F6C177\">&quot;title&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> book<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">css<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;h3 a::attr(title)&quot;<\/span><span style=\"color: #908CAA\">).<\/span><span style=\"color: #E0DEF4\">get<\/span><span style=\"color: #908CAA\">(),<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">                <\/span><span style=\"color: #F6C177\">&quot;price&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> book<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">css<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;p.price_color::text&quot;<\/span><span style=\"color: #908CAA\">).<\/span><span style=\"color: #E0DEF4\">get<\/span><span style=\"color: #908CAA\">(),<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">                <\/span><span style=\"color: #F6C177\">&quot;availability&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> availability<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            <\/span><span style=\"color: #908CAA\">}<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Pros of Web Scraping with Scrapy<\/h3>\n\n\n\n<ol 
class=\"wp-block-list\">\n<li><strong>High performance and efficiency: <\/strong>Scrapy is built on asynchronous networking (the Twisted framework), which lets it keep many requests in flight at once instead of waiting for each response before sending the next. This makes it highly efficient for large crawls.<\/li>\n\n\n\n<li><strong>Built-in features:<\/strong> Scrapy ships with many features that make a developer\u2019s life easier, such as support for CSS selectors and XPath expressions, plus an AutoThrottle extension that adjusts the crawl speed to avoid overloading websites.<\/li>\n\n\n\n<li><strong>Multiple output formats: <\/strong>Scrapy\u2019s feed exports can write your scraped items as JSON, JSON Lines, CSV, or XML. That is handy when you work with data or AI frequently, because having your collected data in a variety of formats can save you a lot of time and effort.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Cons of Web Scraping with Scrapy<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Overkill for small projects: <\/strong>Scrapy is great, but you only take full advantage of it on larger projects. That\u2019s where you\u2019ll actually need its many features.<\/li>\n\n\n\n<li><strong>Steep learning curve:<\/strong> Scrapy can be a lot for a first-time user or junior dev. 
You\u2019ll get used to it, but it\u2019ll take some time.<\/li>\n<\/ol>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large centered\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"536\" src=\"https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/playwright-img-1024x536.jpg\" alt=\"a drawing of the playwright logo, a laptop, the chrome and firefox logos under the title &quot;playwright&quot;\" class=\"wp-image-95418\" srcset=\"https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/playwright-img-1024x536.jpg 1024w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/playwright-img-300x157.jpg 300w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/playwright-img-768x402.jpg 768w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/playwright-img-600x314.jpg 600w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/playwright-img.jpg 1200w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Playwright<\/h2>\n\n\n\n<p><a href=\"https:\/\/github.com\/microsoft\/playwright\" target=\"_blank\" rel=\"noreferrer noopener\">Playwright<\/a> is an open-source browser automation framework developed and backed by <a href=\"https:\/\/www.microsoft.com\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Microsoft<\/a>. It lets you automate multiple browsers, such as Chromium, Firefox, and WebKit, through a single API, which makes it a good fit for scraping pages that render their content with JavaScript.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Installing Playwright<\/h3>\n\n\n\n<p>Installing the framework is straightforward. 
As with the other libraries in this article, we install it with pip. The <code>pytest-playwright<\/code> package used here pulls in the core <code>playwright<\/code> library as a dependency.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(1 * 0.6 * .75rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#e0def4;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>pip install pytest-playwright<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #EA9A97\">pip<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">install<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">pytest-playwright<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Web Scraping with Playwright<\/h3>\n\n\n\n<p>After installation, you will be able to 
start automating. You\u2019ll also need a browser for Playwright to drive, which you can download with <code>playwright install chromium<\/code>. The script below also uses asyncio, which ships with Python\u2019s standard library, so there is nothing extra to install for it.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(2 * 0.6 * .75rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#e0def4;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>import asyncio\nimport json\nfrom playwright.async_api import async_playwright\n\nasync def scrape_books():\n    async with async_playwright() as p:\n        browser = await p.chromium.launch(headless=True)\n        page = await browser.new_page()\n\n        await page.goto(\"https:\/\/books.toscrape.com\/\")\n        await page.wait_for_selector(\"article.product_pod\")\n\n        books = []\n\n        book_elements = await page.query_selector_all(\"article.product_pod\")\n\n        for book in book_elements:\n            title = await book.eval_on_selector(\n                \"h3 a\", \"el => el.getAttribute('title')\"\n            )\n            price = await book.eval_on_selector(\n                \"p.price_color\", \"el => el.textContent\"\n            )\n            availability = await book.eval_on_selector(\n                \"p.instock.availability\",\n                \"el => el.textContent.replace(\/\\\\s+\/g, ' ').trim()\"\n            )\n\n            books.append({\n                \"title\": title,\n                \"price\": price,\n                \"availability\": availability\n            })\n\n        await browser.close()\n        
return books\n\n\nasync def main():\n    books = await scrape_books()\n\n    print(f\"Scraped {len(books)} books\\n\")\n    print(json.dumps(books, indent=2, ensure_ascii=False))\n\n    with open(\"books_playwright.json\", \"w\", encoding=\"utf-8\") as f:\n        json.dump(books, f, indent=2, ensure_ascii=False)\n\n    print(\"Data saved to books_playwright.json\")\n\n\nif __name__ == \"__main__\":\n    asyncio.run(main())<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> asyncio<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> json<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">from<\/span><span style=\"color: #E0DEF4\"> playwright<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">async_api <\/span><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> async_playwright<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">async<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #3E8FB0\">def<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #EA9A97\">scrape_books<\/span><span 
style=\"color: #908CAA\">():<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #3E8FB0\">async<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #3E8FB0\">with<\/span><span style=\"color: #E0DEF4\"> async_playwright<\/span><span style=\"color: #908CAA\">()<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #3E8FB0\">as<\/span><span style=\"color: #E0DEF4\"> p<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        browser <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #3E8FB0\">await<\/span><span style=\"color: #E0DEF4\"> p<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">chromium<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">launch<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #C4A7E7; font-style: italic\">headless<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #EA9A97\">True<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        page <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #3E8FB0\">await<\/span><span style=\"color: #E0DEF4\"> browser<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">new_page<\/span><span style=\"color: #908CAA\">()<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        <\/span><span style=\"color: #3E8FB0\">await<\/span><span style=\"color: #E0DEF4\"> page<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">goto<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;https:\/\/books.toscrape.com\/&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span 
class=\"line\"><span style=\"color: #E0DEF4\">        <\/span><span style=\"color: #3E8FB0\">await<\/span><span style=\"color: #E0DEF4\"> page<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">wait_for_selector<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;article.product_pod&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        books <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #908CAA\">[]<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        book_elements <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #3E8FB0\">await<\/span><span style=\"color: #E0DEF4\"> page<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">query_selector_all<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;article.product_pod&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        <\/span><span style=\"color: #3E8FB0\">for<\/span><span style=\"color: #E0DEF4\"> book <\/span><span style=\"color: #3E8FB0\">in<\/span><span style=\"color: #E0DEF4\"> book_elements<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            title <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #3E8FB0\">await<\/span><span style=\"color: #E0DEF4\"> book<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">eval_on_selector<\/span><span style=\"color: #908CAA\">(<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">                <\/span><span 
style=\"color: #F6C177\">&quot;h3 a&quot;<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;el =&gt; el.getAttribute(&#39;title&#39;)&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            <\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            price <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #3E8FB0\">await<\/span><span style=\"color: #E0DEF4\"> book<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">eval_on_selector<\/span><span style=\"color: #908CAA\">(<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">                <\/span><span style=\"color: #F6C177\">&quot;p.price_color&quot;<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;el =&gt; el.textContent&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            <\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            availability <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #3E8FB0\">await<\/span><span style=\"color: #E0DEF4\"> book<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">eval_on_selector<\/span><span style=\"color: #908CAA\">(<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">                <\/span><span style=\"color: #F6C177\">&quot;p.instock.availability&quot;<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">                <\/span><span style=\"color: #F6C177\">&quot;el =&gt; el.textContent.replace(\/<\/span><span style=\"color: #3E8FB0\">\\\\<\/span><span style=\"color: 
#F6C177\">s+\/g, &#39; &#39;).trim()&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            <\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            books<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">append<\/span><span style=\"color: #908CAA\">({<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">                <\/span><span style=\"color: #F6C177\">&quot;title&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> title<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">                <\/span><span style=\"color: #F6C177\">&quot;price&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> price<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">                <\/span><span style=\"color: #F6C177\">&quot;availability&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> availability<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            <\/span><span style=\"color: #908CAA\">})<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        <\/span><span style=\"color: #3E8FB0\">await<\/span><span style=\"color: #E0DEF4\"> browser<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">close<\/span><span style=\"color: #908CAA\">()<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        <\/span><span style=\"color: #3E8FB0\">return<\/span><span style=\"color: #E0DEF4\"> books<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">async<\/span><span style=\"color: #E0DEF4\"> <\/span><span 
style=\"color: #3E8FB0\">def<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #EA9A97\">main<\/span><span style=\"color: #908CAA\">():<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    books <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #3E8FB0\">await<\/span><span style=\"color: #E0DEF4\"> scrape_books<\/span><span style=\"color: #908CAA\">()<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #3E8FB0\">f<\/span><span style=\"color: #F6C177\">&quot;Scraped <\/span><span style=\"color: #3E8FB0\">{<\/span><span style=\"color: #EB6F92; font-style: italic\">len<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">books<\/span><span style=\"color: #908CAA\">)<\/span><span style=\"color: #3E8FB0\">}<\/span><span style=\"color: #F6C177\"> books<\/span><span style=\"color: #3E8FB0\">\\n<\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">json<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">dumps<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">books<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #C4A7E7; font-style: italic\">indent<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #EA9A97\">2<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #C4A7E7; font-style: italic\">ensure_ascii<\/span><span style=\"color: 
#3E8FB0\">=<\/span><span style=\"color: #EA9A97\">False<\/span><span style=\"color: #908CAA\">))<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #3E8FB0\">with<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #EB6F92; font-style: italic\">open<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;books_playwright.json&quot;<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;w&quot;<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #C4A7E7; font-style: italic\">encoding<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #F6C177\">&quot;utf-8&quot;<\/span><span style=\"color: #908CAA\">)<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #3E8FB0\">as<\/span><span style=\"color: #E0DEF4\"> f<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        json<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">dump<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">books<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> f<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #C4A7E7; font-style: italic\">indent<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #EA9A97\">2<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #C4A7E7; font-style: italic\">ensure_ascii<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #EA9A97\">False<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: 
#EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;Data saved to books_playwright.json&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">if<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #9CCFD8\">__name__<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #3E8FB0\">==<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;__main__&quot;<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    asyncio<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">run<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">main<\/span><span style=\"color: #908CAA\">())<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Pros of Web Scraping with Playwright<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>It&#8217;s open source: <\/strong>Developers love open-source projects. We like being in control of our tooling, and because Playwright\u2019s source is public, you can inspect it and adapt it as you see fit.<\/li>\n\n\n\n<li><strong>Automatic waits: <\/strong>If you\u2019ve ever dabbled in web scraping, you know how annoying it is to write explicit waits to make sure elements are visible before you interact with them. I struggled with this when I was building a <a href=\"https:\/\/proxidize.com\/blog\/twitter-scraper\/\" target=\"_blank\" rel=\"noreferrer noopener\">Twitter\/X scraper<\/a> until I switched to Playwright. It runs its own actionability checks and waits for elements automatically before acting on them, which can save you a significant amount of time.<\/li>\n\n\n\n<li><strong>Headless or GUI mode:<\/strong> Speed is another thing devs care about. Getting there isn\u2019t possible without trial, error, and debugging. 
Luckily, Playwright lets you run the browser in headed (GUI) mode while you debug and test your code, for example by passing <code>headless=False<\/code> to <code>launch()<\/code>. Then, once the code works and you need speed, you switch back to headless mode.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Cons of Web Scraping with Playwright<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Learning curve:<\/strong> Playwright isn\u2019t <em>difficult<\/em> to learn, but a developer unfamiliar with it will need some time to get used to it. The same is true of any new tool.<\/li>\n\n\n\n<li><strong>Large installation size<\/strong>: Playwright needs its own browser binaries to be downloaded before you get started, and these can take up a considerable amount of disk space.<\/li>\n\n\n\n<li><strong>Smaller community: <\/strong>Although it&#8217;s growing and more people are switching to Playwright, its community is still smaller than that of something like Selenium. That means there will be fewer people around to help you if you really hit a wall.<\/li>\n<\/ol>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large centered\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"536\" src=\"https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/selenium-img-1024x536.jpg\" alt=\"a drawing of the selenium logo and a browser under the title &quot;selenium&quot;\" class=\"wp-image-95417\" srcset=\"https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/selenium-img-1024x536.jpg 1024w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/selenium-img-300x157.jpg 300w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/selenium-img-768x402.jpg 768w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/selenium-img-600x314.jpg 600w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/selenium-img.jpg 1200w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<div 
style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Selenium<\/h2>\n\n\n\n<p>Selenium is an open-source browser automation framework and one of the oldest tools in this space. It has a very large community and lots of support, and it lets you automate browser tasks through the WebDriver API.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Installing Selenium<\/h3>\n\n\n\n<p>The scraping example in this section also imports <code>webdriver_manager<\/code>, a separate package you can add with <code>pip install webdriver-manager<\/code>. To install the latest version of Selenium itself, run the following command in your terminal:<\/p>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(1 * 0.6 * .75rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#e0def4;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>pip install selenium<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" 
tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #EA9A97\">pip<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">install<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">selenium<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Web Scraping with Selenium<\/h3>\n\n\n\n<p>After the installation you can start <a href=\"https:\/\/proxidize.com\/blog\/web-scraping-with-selenium\/\" target=\"_blank\" rel=\"noreferrer noopener\">web scraping with Selenium<\/a> via your code:<\/p>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(2 * 0.6 * .75rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#e0def4;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>import json\nfrom selenium import webdriver\nfrom selenium.webdriver.common.by import By\nfrom selenium.webdriver.chrome.service import Service\nfrom selenium.webdriver.chrome.options import Options\nfrom selenium.webdriver.support.ui import WebDriverWait\nfrom selenium.webdriver.support import expected_conditions as EC\nfrom webdriver_manager.chrome import ChromeDriverManager\n\n\ndef scrape_books():\n    chrome_options = Options()\n    chrome_options.add_argument(\"--headless\")\n    chrome_options.add_argument(\"--no-sandbox\")\n    chrome_options.add_argument(\"--disable-dev-shm-usage\")\n\n    print(\"Initializing Chrome WebDriver...\")\n    service = 
Service(ChromeDriverManager().install())\n    driver = webdriver.Chrome(service=service, options=chrome_options)\n\n    try:\n        print(\"Navigating to https:\/\/books.toscrape.com\/\")\n        driver.get(\"https:\/\/books.toscrape.com\/\")\n\n        wait = WebDriverWait(driver, 10)\n\n        wait.until(EC.title_contains(\"Books\"))\n        wait.until(EC.presence_of_all_elements_located(\n            (By.CSS_SELECTOR, \"article.product_pod\")\n        ))\n\n        book_elements = driver.find_elements(\n            By.CSS_SELECTOR, \"article.product_pod\"\n        )\n\n        books_data = []\n\n        for book in book_elements:\n            title = book.find_element(\n                By.CSS_SELECTOR, \"h3 a\"\n            ).get_attribute(\"title\")\n\n            price = book.find_element(\n                By.CSS_SELECTOR, \"p.price_color\"\n            ).text\n\n            availability = \" \".join(\n                book.find_element(\n                    By.CSS_SELECTOR, \"p.instock.availability\"\n                ).text.split()\n            )\n\n            books_data.append({\n                \"title\": title,\n                \"price\": price,\n                \"availability\": availability\n            })\n\n        return books_data\n\n    finally:\n        driver.quit()\n\n\ndef main():\n    print(\"Starting Selenium scraper...\\n\")\n\n    books = scrape_books()\n\n    print(f\"Scraped {len(books)} books:\\n\")\n    print(json.dumps(books, indent=2, ensure_ascii=False))\n\n    with open(\"books_selenium.json\", \"w\", encoding=\"utf-8\") as f:\n        json.dump(books, f, indent=2, ensure_ascii=False)\n\n    print(\"\\n\u2713 Data saved to books_selenium.json\")\n\n\nif __name__ == \"__main__\":\n    main()<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" 
stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> json<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">from<\/span><span style=\"color: #E0DEF4\"> selenium <\/span><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> webdriver<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">from<\/span><span style=\"color: #E0DEF4\"> selenium<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">webdriver<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">common<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">by <\/span><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> By<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">from<\/span><span style=\"color: #E0DEF4\"> selenium<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">webdriver<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">chrome<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">service <\/span><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> Service<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">from<\/span><span style=\"color: #E0DEF4\"> selenium<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">webdriver<\/span><span 
style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">chrome<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">options <\/span><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> Options<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">from<\/span><span style=\"color: #E0DEF4\"> selenium<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">webdriver<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">support<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">ui <\/span><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> WebDriverWait<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">from<\/span><span style=\"color: #E0DEF4\"> selenium<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">webdriver<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">support <\/span><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> expected_conditions <\/span><span style=\"color: #3E8FB0\">as<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #3E8FB0\">EC<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">from<\/span><span style=\"color: #E0DEF4\"> webdriver_manager<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">chrome <\/span><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> ChromeDriverManager<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">def<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #EA9A97\">scrape_books<\/span><span style=\"color: #908CAA\">():<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    chrome_options <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: 
#E0DEF4\"> Options<\/span><span style=\"color: #908CAA\">()<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    chrome_options<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">add_argument<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;--headless&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    chrome_options<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">add_argument<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;--no-sandbox&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    chrome_options<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">add_argument<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;--disable-dev-shm-usage&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;Initializing Chrome WebDriver...&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    service <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> Service<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">ChromeDriverManager<\/span><span style=\"color: #908CAA\">().<\/span><span style=\"color: #E0DEF4\">install<\/span><span style=\"color: #908CAA\">())<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    driver <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> webdriver<\/span><span style=\"color: #908CAA\">.<\/span><span 
style=\"color: #E0DEF4\">Chrome<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #C4A7E7; font-style: italic\">service<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\">service<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #C4A7E7; font-style: italic\">options<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\">chrome_options<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #3E8FB0\">try<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;Navigating to https:\/\/books.toscrape.com\/&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        driver<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">get<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;https:\/\/books.toscrape.com\/&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        wait <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> WebDriverWait<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">driver<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #EA9A97\">10<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        wait<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">until<\/span><span 
style=\"color: #908CAA\">(<\/span><span style=\"color: #3E8FB0\">EC<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">title_contains<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;Books&quot;<\/span><span style=\"color: #908CAA\">))<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        wait<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">until<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #3E8FB0\">EC<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">presence_of_all_elements_located<\/span><span style=\"color: #908CAA\">(<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            <\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">By<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #3E8FB0\">CSS_SELECTOR<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;article.product_pod&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        <\/span><span style=\"color: #908CAA\">))<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        book_elements <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> driver<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">find_elements<\/span><span style=\"color: #908CAA\">(<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            By<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #3E8FB0\">CSS_SELECTOR<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;article.product_pod&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">      
  <\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        books_data <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #908CAA\">[]<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        <\/span><span style=\"color: #3E8FB0\">for<\/span><span style=\"color: #E0DEF4\"> book <\/span><span style=\"color: #3E8FB0\">in<\/span><span style=\"color: #E0DEF4\"> book_elements<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            title <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> book<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">find_element<\/span><span style=\"color: #908CAA\">(<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">                By<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #3E8FB0\">CSS_SELECTOR<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;h3 a&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            <\/span><span style=\"color: #908CAA\">).<\/span><span style=\"color: #E0DEF4\">get_attribute<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;title&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            price <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> book<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">find_element<\/span><span style=\"color: #908CAA\">(<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">                By<\/span><span style=\"color: 
#908CAA\">.<\/span><span style=\"color: #3E8FB0\">CSS_SELECTOR<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;p.price_color&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            <\/span><span style=\"color: #908CAA\">).<\/span><span style=\"color: #E0DEF4\">text<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            availability <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot; &quot;<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">join<\/span><span style=\"color: #908CAA\">(<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">                book<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">find_element<\/span><span style=\"color: #908CAA\">(<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">                    By<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #3E8FB0\">CSS_SELECTOR<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;p.instock.availability&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">                <\/span><span style=\"color: #908CAA\">).<\/span><span style=\"color: #E0DEF4\">text<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">split<\/span><span style=\"color: #908CAA\">()<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            <\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            books_data<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">append<\/span><span style=\"color: #908CAA\">({<\/span><\/span>\n<span 
class=\"line\"><span style=\"color: #E0DEF4\">                <\/span><span style=\"color: #F6C177\">&quot;title&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> title<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">                <\/span><span style=\"color: #F6C177\">&quot;price&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> price<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">                <\/span><span style=\"color: #F6C177\">&quot;availability&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> availability<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">            <\/span><span style=\"color: #908CAA\">})<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        <\/span><span style=\"color: #3E8FB0\">return<\/span><span style=\"color: #E0DEF4\"> books_data<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #3E8FB0\">finally<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        driver<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">quit<\/span><span style=\"color: #908CAA\">()<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">def<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #EA9A97\">main<\/span><span style=\"color: #908CAA\">():<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;Starting Selenium 
scraper...<\/span><span style=\"color: #3E8FB0\">\\n<\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    books <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> scrape_books<\/span><span style=\"color: #908CAA\">()<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #3E8FB0\">f<\/span><span style=\"color: #F6C177\">&quot;Scraped <\/span><span style=\"color: #3E8FB0\">{<\/span><span style=\"color: #EB6F92; font-style: italic\">len<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">books<\/span><span style=\"color: #908CAA\">)<\/span><span style=\"color: #3E8FB0\">}<\/span><span style=\"color: #F6C177\"> books:<\/span><span style=\"color: #3E8FB0\">\\n<\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">json<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">dumps<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">books<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #C4A7E7; font-style: italic\">indent<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #EA9A97\">2<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #C4A7E7; font-style: italic\">ensure_ascii<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #EA9A97\">False<\/span><span style=\"color: 
#908CAA\">))<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #3E8FB0\">with<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #EB6F92; font-style: italic\">open<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;books_selenium.json&quot;<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;w&quot;<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #C4A7E7; font-style: italic\">encoding<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #F6C177\">&quot;utf-8&quot;<\/span><span style=\"color: #908CAA\">)<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #3E8FB0\">as<\/span><span style=\"color: #E0DEF4\"> f<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">        json<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">dump<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">books<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> f<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #C4A7E7; font-style: italic\">indent<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #EA9A97\">2<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #C4A7E7; font-style: italic\">ensure_ascii<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #EA9A97\">False<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: 
#908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #3E8FB0\">\\n<\/span><span style=\"color: #F6C177\">\u2713 Data saved to books_selenium.json&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">if<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #9CCFD8\">__name__<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #3E8FB0\">==<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;__main__&quot;<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    main<\/span><span style=\"color: #908CAA\">()<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Pros of Web Scraping with Selenium<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>It&#8217;s open source:<\/strong> As stated before, developers love open-source projects because they can tinker with the codebase until it fulfills their exact requirements.<\/li>\n\n\n\n<li><strong>Large community: <\/strong>Selenium has been available for just over two decades, so many people have used it and contributed to it. That gives it an advantage over many comparable frameworks: a lot of people have already solved many different problems with Selenium. 
The chances that you\u2019re trying to accomplish something truly unique with it are small, and you can fall back on an experienced community for help.<\/li>\n\n\n\n<li><strong>Broadly supported: <\/strong>Selenium runs on macOS, Windows, and Linux and works with most major browsers, such as Google Chrome, Firefox, and Opera.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Cons of Web Scraping with Selenium<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Learning curve:<\/strong> A long-lived library accumulates many updates, and keeping up with them adds to the learning curve. Teams that want to minimize maintenance often avoid Selenium for this reason.<\/li>\n\n\n\n<li><strong>Lacks built-in capabilities: <\/strong>Selenium is great, but unlike some other libraries it doesn&#8217;t come with many built-in tools. In many cases you\u2019ll need a third-party tool for certain tasks, which costs money and time.<\/li>\n<\/ol>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large centered\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"536\" src=\"https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/python_libraries_for_web_scraping_compared-img-1024x536.jpg\" alt=\"a drawing of a laptop under the title &quot;python libraries for web scraping compared: practical test&quot;\" class=\"wp-image-95416\" srcset=\"https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/python_libraries_for_web_scraping_compared-img-1024x536.jpg 1024w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/python_libraries_for_web_scraping_compared-img-300x157.jpg 300w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/python_libraries_for_web_scraping_compared-img-768x402.jpg 768w, https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/python_libraries_for_web_scraping_compared-img-600x314.jpg 600w, 
https:\/\/proxidize.com\/wp-content\/uploads\/2026\/01\/python_libraries_for_web_scraping_compared-img.jpg 1200w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Python Libraries for Web Scraping Compared: Practical Test<\/h2>\n\n\n\n<p>The easiest and fastest way to compare these four Python libraries for web scraping is to have each of them scrape the same <a href=\"https:\/\/books.toscrape.com\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">target website<\/a> and compare the results. For each library, I wrote a simple script that had to do the following:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Open the website<\/li>\n\n\n\n<li>Scrape the website and retrieve the title, price, and availability of each book<\/li>\n\n\n\n<li>Return the output in JSON format<\/li>\n<\/ol>\n\n\n\n<p>This is a very simple test, one that admittedly doesn\u2019t capture the full complexity of each library\u2019s features, but it can give us an indication of which one is easier and faster to use.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Output<\/h3>\n\n\n\n<p>Each library outputs the same data. 
We scraped the website with the intent of getting the title, price and availability of each book.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(2 * 0.6 * .75rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#e0def4;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>&#091;\n  {\n    \"title\": \"A Light in the Attic\",\n    \"price\": \"\\u00a351.77\",\n    \"availability\": \"In stock\"\n  },\n  {\n    \"title\": \"Soumission\",\n    \"price\": \"\\u00a350.10\",\n    \"availability\": \"In stock\"\n  },\n  {\n    \"title\": \"Sharp Objects\",\n    \"price\": \"\\u00a347.82\",\n    \"availability\": \"In stock\"\n  },\n  {\n    \"title\": \"Sapiens: A Brief History of Humankind\",\n    \"price\": \"\\u00a354.23\",\n    \"availability\": \"In stock\"\n  },\n  {\n    \"title\": \"The Requiem Red\",\n    \"price\": \"\\u00a322.65\",\n    \"availability\": \"In stock\"\n  },\n  {\n    \"title\": \"The Dirty Little Secrets of Getting Your Dream Job\",\n    \"price\": \"\\u00a333.34\",\n    \"availability\": \"In stock\"\n  },\n&#093;<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 
9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #908CAA\">&#091;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">  <\/span><span style=\"color: #908CAA\">{<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #9CCFD8\">title<\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;A Light in the Attic&quot;<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #9CCFD8\">price<\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #3E8FB0\">\\u00a3<\/span><span style=\"color: #F6C177\">51.77&quot;<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #9CCFD8\">availability<\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;In stock&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">  <\/span><span style=\"color: #908CAA\">},<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">  <\/span><span style=\"color: #908CAA\">{<\/span><\/span>\n<span 
class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #9CCFD8\">title<\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;Soumission&quot;<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #9CCFD8\">price<\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #3E8FB0\">\\u00a3<\/span><span style=\"color: #F6C177\">50.10&quot;<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #9CCFD8\">availability<\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;In stock&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">  <\/span><span style=\"color: #908CAA\">},<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">  <\/span><span style=\"color: #908CAA\">{<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #9CCFD8\">title<\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;Sharp Objects&quot;<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #9CCFD8\">price<\/span><span style=\"color: 
#908CAA\">&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #3E8FB0\">\\u00a3<\/span><span style=\"color: #F6C177\">47.82&quot;<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #9CCFD8\">availability<\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;In stock&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">  <\/span><span style=\"color: #908CAA\">},<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">  <\/span><span style=\"color: #908CAA\">{<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #9CCFD8\">title<\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;Sapiens: A Brief History of Humankind&quot;<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #9CCFD8\">price<\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #3E8FB0\">\\u00a3<\/span><span style=\"color: #F6C177\">54.23&quot;<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #9CCFD8\">availability<\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: 
#908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;In stock&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">  <\/span><span style=\"color: #908CAA\">},<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">  <\/span><span style=\"color: #908CAA\">{<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #9CCFD8\">title<\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;The Requiem Red&quot;<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #9CCFD8\">price<\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #3E8FB0\">\\u00a3<\/span><span style=\"color: #F6C177\">22.65&quot;<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #9CCFD8\">availability<\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;In stock&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">  <\/span><span style=\"color: #908CAA\">},<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">  <\/span><span style=\"color: #908CAA\">{<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #9CCFD8\">title<\/span><span style=\"color: #908CAA\">&quot;<\/span><span 
style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;The Dirty Little Secrets of Getting Your Dream Job&quot;<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #9CCFD8\">price<\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #3E8FB0\">\\u00a3<\/span><span style=\"color: #F6C177\">33.34&quot;<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #9CCFD8\">availability<\/span><span style=\"color: #908CAA\">&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;In stock&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">  <\/span><span style=\"color: #908CAA\">},<\/span><\/span>\n<span class=\"line\"><span style=\"color: #908CAA\">&#093;<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Speed Comparison<\/h3>\n\n\n\n<p>As you can see from the results, BeautifulSoup4 is the winner here. 
It was the easiest of the four to set up and had the lowest average time, and it is well suited to static, easy-to-scrape sites.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Library<\/th><th>Setup Difficulty<\/th><th>Average Time<\/th><th>Difference<\/th><\/tr><\/thead><tbody><tr><td>BeautifulSoup4<\/td><td>Very easy<\/td><td>0.896s<\/td><td>Fastest<\/td><\/tr><tr><td>Scrapy<\/td><td>Medium<\/td><td>0.910s<\/td><td>+0.014s<\/td><\/tr><tr><td>Selenium<\/td><td>Medium<\/td><td>1.671s<\/td><td>+0.775s<\/td><\/tr><tr><td>Playwright<\/td><td>Easy<\/td><td>2.330s<\/td><td>+1.434s<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>Scrapy was only 0.014s behind BeautifulSoup4. Selenium and Playwright, which both drive a real browser, trailed by 0.775s and 1.434s respectively.<\/p>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>When looking for a Python library for web scraping, you have a lot of options to choose from as a developer. We hope this article has made that choice easier. Remember to communicate with your stakeholders to understand their requirements and needs. Let those inform your choice of web scraping library.<\/p>\n\n\n\n<p><strong>Key takeaways:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Always have a clear set of requirements in mind before you start any web scraping project. 
Choose your tech stack based on the project\u2019s goals.<\/li>\n\n\n\n<li>If you have a small project and a limited budget, choose BeautifulSoup, since it&#8217;s fast and easy to set up.<\/li>\n\n\n\n<li>If you care about speed and efficiency, choose Scrapy, a very fast library with a lot of built-in features that will save you time and money.<\/li>\n\n\n\n<li>Always install Python packages inside a virtual environment (venv); you don\u2019t want one project\u2019s dependencies to conflict with other projects on your device.<\/li>\n<\/ul>\n\n\n\n<p>Although there are other Python libraries for web scraping beyond those mentioned in this article, these are the four that stand out from the perspective of a developer. Regardless of the project\u2019s scale or scope, choosing the right web scraping library can make all the difference.<\/p>\n\n\n\n<p>A library with a large community will likely be able to offer broader support, which might make your life easier. Treading new ground is always more demanding than being able to borrow solutions from people who have already solved the problems you\u2019re 
facing.<\/p>\n","protected":false},"author":8854,"featured_media":95422,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","format":"standard","categories":[110],"tags":[],"class_list":["post-64448","blog","type-blog","status-publish","format-standard","has-post-thumbnail","hentry","category-web-scraping-and-automation"],"acf":[],"_links":{"self":[{"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/blog\/64448","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/blog"}],"about":[{"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/types\/blog"}],"author":[{"embeddable":true,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/users\/8854"}],"replies":[{"embeddable":true,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/comments?post=64448"}],"version-history":[{"count":4,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/blog\/64448\/revisions"}],"predecessor-version":[{"id":95428,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/blog\/64448\/revisions\/95428"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/media\/95422"}],"wp:attachment":[{"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/media?parent=64448"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/categories?post=64448"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/tags?post=64448"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}