{"id":54543,"date":"2024-06-14T16:52:03","date_gmt":"2024-06-14T15:52:03","guid":{"rendered":"https:\/\/proxidize.com\/?post_type=use-cases&#038;p=54543"},"modified":"2025-10-02T12:17:14","modified_gmt":"2025-10-02T11:17:14","slug":"web-scraping-with-beautiful-soup","status":"publish","type":"blog","link":"https:\/\/proxidize.com\/blog\/web-scraping-with-beautiful-soup\/","title":{"rendered":"Basics of Web Scraping with Beautiful Soup"},"content":{"rendered":"\n<p>Web scraping is an important technique for extracting information from websites, allowing users to gather data efficiently and systematically. By automating the process of collecting data, web scraping can save time and provide access to large amounts of information that would be impossible to gather manually.<\/p>\n\n\n\n<p>In this article, we will put our knowledge into practice with the basics of web scraping with Beautiful Soup, a powerful Python library designed for parsing HTML and XML documents. For a comprehensive understanding of web scraping, you can read our previous articles on <a href=\"http:\/\/proxidize.com\/use-cases\/web-scraping\/\">web scraping<\/a> and <a href=\"http:\/\/proxidize.com\/use-cases\/web-scraping-tools\/\">web scraping tools<\/a>. 
They provide a solid foundation for anyone new to the field and looking to hit the ground running.<\/p>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large is-resized centered\"><img fetchpriority=\"high\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/what-is-beautifulsoup--1024x576.png\" alt=\"what is beautifulsoup\" class=\"wp-image-54556\" style=\"object-fit:cover\" srcset=\"https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/what-is-beautifulsoup--1024x576.png 1024w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/what-is-beautifulsoup--300x169.png 300w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/what-is-beautifulsoup--768x432.png 768w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/what-is-beautifulsoup--1536x864.png 1536w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/what-is-beautifulsoup--600x338.png 600w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/what-is-beautifulsoup-.png 1920w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">What is Beautiful Soup?<\/h2>\n\n\n\n<p>Beautiful Soup is a versatile Python library for web scraping. It pulls data out of HTML and XML files by building a parse tree from the page source, which you can then navigate to extract data. This lends itself well to applications in research, business intelligence, and competitive analysis, among others.<\/p>\n\n\n\n<p>HTML and XML are standard languages used to create and structure data on the web. HTML is mainly used to create web pages and web applications. It defines elements like headings, paragraphs, and links. 
XML, on the other hand, is designed to store and transport data in a way that is readable by both humans and machines.<\/p>\n\n\n\n<p>A parse tree, in the context of web scraping, is a hierarchical structure that represents the syntax of a document. It breaks the document down into its constituent parts, such as tags, attributes, and text, in a tree-like format. This structure allows for easy navigation of the document, making data extraction more efficient.<\/p>\n\n\n\n<p>When you make an HTTP request to a webpage, the server responds with the page\u2019s HTML content, which you can see by viewing the page source of any web page. Beautiful Soup takes this raw HTML and transforms it into a parse tree, which lets you find specific elements as needed.<\/p>\n\n\n\n<p>Compared to other web scraping tools, Beautiful Soup has its advantages. Unlike Selenium, which is designed for interacting with dynamic and JavaScript-heavy web pages, Beautiful Soup excels at parsing and extracting data from static HTML content. 
It is also simpler and more lightweight than comprehensive frameworks like Scrapy.<\/p>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large is-resized centered\"><img decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/understanding-html-structure-1024x576.png\" alt=\"HTML Structure\" class=\"wp-image-54557\" style=\"object-fit:cover\" srcset=\"https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/understanding-html-structure-1024x576.png 1024w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/understanding-html-structure-300x169.png 300w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/understanding-html-structure-768x432.png 768w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/understanding-html-structure-1536x864.png 1536w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/understanding-html-structure-600x338.png 600w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/understanding-html-structure.png 1920w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Understanding HTML Structure<\/h2>\n\n\n\n<p>HTML documents are structured with tags, which identify elements such as headings, paragraphs, links, and images. Each tag is enclosed in angle brackets (e.g. &lt;tag&gt;), and many elements have opening and closing tags (e.g. &lt;p&gt;&lt;\/p&gt; for paragraphs). Tags can also have attributes that provide additional information about the element, such as id, class, and src for images.<\/p>\n\n\n\n<p>Understanding HTML is important for web scraping because it lets you navigate to and extract the specific data you need from the page. 
This, in turn, will enable you to write more precise scripts.<\/p>\n\n\n\n<p>Here is a simple example of HTML.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers cbp-highlight-hover\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(2 * 0.6 * .75rem);--cbp-line-highlight-color:rgba(215, 211, 255, 0.2);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#232136;display:none;background-color:#e0def4\" aria-label=\"Copy\" data-copied-text=\"Copied!\" data-has-text-button=\"textSimple\" data-inside-header-type=\"none\" aria-live=\"polite\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>&lt;!DOCTYPE html>\n\n&lt;html>\n\n&lt;head>\n\n\u00a0\u00a0\u00a0\u00a0&lt;title>Sample HTML Document&lt;\/title>\n\n&lt;\/head>\n\n&lt;body>\n\n\u00a0\u00a0\u00a0\u00a0&lt;h1>Welcome to My Web Page&lt;\/h1>\n\n\u00a0\u00a0\u00a0\u00a0&lt;p>This is a paragraph of text on my web page.&lt;\/p>\n\n\u00a0\u00a0\u00a0\u00a0&lt;a href=\"https:\/\/example.com\">Visit Example.com&lt;\/a>\n\n\u00a0\u00a0\u00a0\u00a0&lt;img src=\"image.jpg\" alt=\"Sample Image\">\n\n&lt;\/body>\n\n&lt;\/html><\/textarea><\/pre><span class=\"cbp-btn-text\">Copy<\/span><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #3E8FB0\">&lt;<\/span><span style=\"color: #EB6F92\">!<\/span><span style=\"color: #3E8FB0\">DOCTYPE<\/span><span style=\"color: #E0DEF4\"> html<\/span><span style=\"color: #3E8FB0\">&gt;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span 
class=\"line\"><span style=\"color: #3E8FB0\">&lt;<\/span><span style=\"color: #E0DEF4\">html<\/span><span style=\"color: #3E8FB0\">&gt;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">&lt;<\/span><span style=\"color: #E0DEF4\">head<\/span><span style=\"color: #3E8FB0\">&gt;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">\u00a0\u00a0\u00a0\u00a0<\/span><span style=\"color: #3E8FB0\">&lt;<\/span><span style=\"color: #E0DEF4\">title<\/span><span style=\"color: #3E8FB0\">&gt;<\/span><span style=\"color: #E0DEF4\">Sample <\/span><span style=\"color: #3E8FB0\">HTML<\/span><span style=\"color: #E0DEF4\"> Document<\/span><span style=\"color: #3E8FB0\">&lt;\/<\/span><span style=\"color: #E0DEF4\">title<\/span><span style=\"color: #3E8FB0\">&gt;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">&lt;\/<\/span><span style=\"color: #E0DEF4\">head<\/span><span style=\"color: #3E8FB0\">&gt;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">&lt;<\/span><span style=\"color: #E0DEF4\">body<\/span><span style=\"color: #3E8FB0\">&gt;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">\u00a0\u00a0\u00a0\u00a0<\/span><span style=\"color: #3E8FB0\">&lt;<\/span><span style=\"color: #E0DEF4\">h1<\/span><span style=\"color: #3E8FB0\">&gt;<\/span><span style=\"color: #E0DEF4\">Welcome to My Web Page<\/span><span style=\"color: #3E8FB0\">&lt;\/<\/span><span style=\"color: #E0DEF4\">h1<\/span><span style=\"color: #3E8FB0\">&gt;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">\u00a0\u00a0\u00a0\u00a0<\/span><span style=\"color: #3E8FB0\">&lt;<\/span><span style=\"color: #E0DEF4\">p<\/span><span style=\"color: #3E8FB0\">&gt;<\/span><span style=\"color: #E0DEF4\">This <\/span><span 
style=\"color: #3E8FB0\">is<\/span><span style=\"color: #E0DEF4\"> a paragraph of text on my web page<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #3E8FB0\">&lt;\/<\/span><span style=\"color: #E0DEF4\">p<\/span><span style=\"color: #3E8FB0\">&gt;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">\u00a0\u00a0\u00a0\u00a0<\/span><span style=\"color: #3E8FB0\">&lt;<\/span><span style=\"color: #E0DEF4\">a href<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #F6C177\">&quot;https:\/\/example.com&quot;<\/span><span style=\"color: #3E8FB0\">&gt;<\/span><span style=\"color: #E0DEF4\">Visit Example<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">com<\/span><span style=\"color: #3E8FB0\">&lt;\/<\/span><span style=\"color: #E0DEF4\">a<\/span><span style=\"color: #3E8FB0\">&gt;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">\u00a0\u00a0\u00a0\u00a0<\/span><span style=\"color: #3E8FB0\">&lt;<\/span><span style=\"color: #E0DEF4\">img src<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #F6C177\">&quot;image.jpg&quot;<\/span><span style=\"color: #E0DEF4\"> alt<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #F6C177\">&quot;Sample Image&quot;<\/span><span style=\"color: #3E8FB0\">&gt;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">&lt;\/<\/span><span style=\"color: #E0DEF4\">body<\/span><span style=\"color: #3E8FB0\">&gt;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">&lt;\/<\/span><span style=\"color: #E0DEF4\">html<\/span><span style=\"color: #3E8FB0\">&gt;<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<p>And here you can see how the HTML translates to a web page.<\/p>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" 
class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-us.googleusercontent.com\/docsz\/AD_4nXe6IQFou5sRtPdwsY4Mn1dTuuD9jdJ149T_sc0MaYpyKlgRlRAflt1RkHRzOh17rh4Kn3IM0rARt4zpBFs1hRme8gF9rKc62wi_mj9Mi3d44YEEZJ2mTfyvzuAG3XMxxnyB8ZARjOA7nIUO985XCEHEjA0?key=c4WT7w-DsTsNiODVZkCwAA\" alt=\"\"\/><\/figure>\n\n\n\n<div style=\"height:34px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>Now that we have a basic understanding of HTML structure, let\u2019s apply it in a hypothetical situation.<\/p>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large is-resized centered\"><img decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/step-by-step-guide-to-web-scraping-with-beautiful-soup-1024x576.png\" alt=\"Guide to web scraping with beautifulsoup\" class=\"wp-image-54558\" style=\"object-fit:cover\" srcset=\"https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/step-by-step-guide-to-web-scraping-with-beautiful-soup-1024x576.png 1024w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/step-by-step-guide-to-web-scraping-with-beautiful-soup-300x169.png 300w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/step-by-step-guide-to-web-scraping-with-beautiful-soup-768x432.png 768w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/step-by-step-guide-to-web-scraping-with-beautiful-soup-1536x864.png 1536w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/step-by-step-guide-to-web-scraping-with-beautiful-soup-600x338.png 600w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/step-by-step-guide-to-web-scraping-with-beautiful-soup.png 1920w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 
class=\"wp-block-heading\">Step-By-Step Guide to Web Scraping With Beautiful Soup<\/h2>\n\n\n\n<p>Web scraping with Beautiful Soup and mobile proxies is a powerful combination for extracting data from websites while maintaining anonymity and avoiding IP bans. Here&#8217;s a step-by-step guide on how to use Beautiful Soup with mobile proxies:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 1: Install Python<\/h3>\n\n\n\n<p>Make sure you have Python installed on your system. If not, download and install it from the <a href=\"https:\/\/www.python.org\/\" target=\"_blank\" rel=\"noopener\">official Python website<\/a>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 2: Install Required Libraries<\/h3>\n\n\n\n<p>If you haven&#8217;t already, install the following libraries using pip (Python&#8217;s package manager). The openpyxl package is included because pandas needs an Excel engine such as openpyxl to write .xlsx files:<\/p>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers cbp-highlight-hover\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(1 * 0.6 * .75rem);--cbp-line-highlight-color:rgba(215, 211, 255, 0.2);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#232136;display:none;background-color:#e0def4\" aria-label=\"Copy\" data-copied-text=\"Copied!\" data-has-text-button=\"textSimple\" data-inside-header-type=\"none\" aria-live=\"polite\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>pip install requests\n\npip install beautifulsoup4\n\npip install pandas\n\npip install openpyxl<\/textarea><\/pre><span class=\"cbp-btn-text\">Copy<\/span><\/span><pre 
class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #E0DEF4\">pip install requests<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">pip install beautifulsoup4<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">pip install pandas<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">pip install openpyxl<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 3: Import Libraries<\/h3>\n\n\n\n<p>In your Python script, import the necessary libraries:<\/p>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers cbp-highlight-hover\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(1 * 0.6 * .75rem);--cbp-line-highlight-color:rgba(215, 211, 255, 0.2);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#232136;display:none;background-color:#e0def4\" aria-label=\"Copy\" data-copied-text=\"Copied!\" data-has-text-button=\"textSimple\" data-inside-header-type=\"none\" aria-live=\"polite\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>import requests\n\nfrom bs4 import BeautifulSoup\n\nimport pandas as pd\n\n#Added pandas library for Excel export<\/textarea><\/pre><span class=\"cbp-btn-text\">Copy<\/span><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> requests<\/span><\/span>\n<span class=\"line\"><\/span>\n<span 
class=\"line\"><span style=\"color: #3E8FB0\">from<\/span><span style=\"color: #E0DEF4\"> bs4 <\/span><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> BeautifulSoup<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> pandas <\/span><span style=\"color: #3E8FB0\">as<\/span><span style=\"color: #E0DEF4\"> pd<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\">Added pandas library for Excel export<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 4: Make an HTTP Request<\/h3>\n\n\n\n<p>Use the requests library to make an HTTP request to your target URL. In this case, we\u2019ll use the hypothetical URL https:\/\/example.com\/product-page.<\/p>\n\n\n\n<p>We\u2019ll go one step further and check whether the request was successful.<\/p>\n\n\n\n<p>If the request is successful, we\u2019ll create a Beautiful Soup object called <strong>soup<\/strong> from the HTML document. 
If it is unsuccessful, our code will print an error message with the status code.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers cbp-highlight-hover\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(2 * 0.6 * .75rem);--cbp-line-highlight-color:rgba(215, 211, 255, 0.2);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#232136;display:none;background-color:#e0def4\" aria-label=\"Copy\" data-copied-text=\"Copied!\" data-has-text-button=\"textSimple\" data-inside-header-type=\"none\" aria-live=\"polite\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>url = \"https:\/\/example.com\/product-page\"\n\nresponse = requests.get(url)\n\n# Check if the request was successful\n\nif response.status_code == 200:\n\n    soup = BeautifulSoup(response.text, 'html.parser')\n\n    # You can now use Beautiful Soup to parse the HTML content.\n\nelse:\n    print(f\"Request failed with status code: {response.status_code}\")<\/textarea><\/pre><span class=\"cbp-btn-text\">Copy<\/span><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #E0DEF4\">url <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;https:\/\/example.com\/product-page&quot;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">response <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> 
requests<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">get<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">url<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Check if the request was successful<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">if<\/span><span style=\"color: #E0DEF4\"> response<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">status_code <\/span><span style=\"color: #3E8FB0\">==<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #EA9A97\">200<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    soup <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> BeautifulSoup<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">response<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">text<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&#39;html.parser&#39;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> You can now use Beautiful Soup to parse the HTML content.<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">else<\/span><span 
style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #3E8FB0\">f<\/span><span style=\"color: #F6C177\">&quot;Request failed with status code: <\/span><span style=\"color: #3E8FB0\">{<\/span><span style=\"color: #E0DEF4\">response<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">status_code<\/span><span style=\"color: #3E8FB0\">}<\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 5: Scrape Product Name and Price with Beautiful Soup<\/h3>\n\n\n\n<p>Now we can use Beautiful Soup to parse and extract the names and prices of all the products on the page.<\/p>\n\n\n\n<p>We\u2019ll start by creating lists to store the names and prices of each product.<\/p>\n\n\n\n<p>For the sake of this example, we\u2019ll assume that each product on the page is identified by the class <strong>product<\/strong>, with its name in an element with the class <strong>product-name<\/strong> and its price in one with the class <strong>product-price<\/strong>.<\/p>\n\n\n\n<p>In reality, these class names will differ on the website you want to scrape. 
You\u2019ll have to amend your script to reflect this.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers cbp-highlight-hover\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(2 * 0.6 * .75rem);--cbp-line-highlight-color:rgba(215, 211, 255, 0.2);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#232136;display:none;background-color:#e0def4\" aria-label=\"Copy\" data-copied-text=\"Copied!\" data-has-text-button=\"textSimple\" data-inside-header-type=\"none\" aria-live=\"polite\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly># Lists to store product names and prices\n\nproduct_names = []\n\nproduct_prices = []\n\n# Find all the div elements with class 'product'\n\nproducts = soup.find_all('div', class_='product')\n\n# Iterate over each product found\n\nfor product in products:\n\n    # Find the product name\n\n    name = product.find(class_='product-name').text\n\n    product_names.append(name)\n\n    # Find the product price\n\n    price = product.find(class_= 'product-price').text\n\n    product_prices.append(price)<\/textarea><\/pre><span class=\"cbp-btn-text\">Copy<\/span><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Lists to store product names and prices<\/span><\/span>\n<span 
class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">product_names <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #908CAA\">[]<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">product_prices <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #908CAA\">[]<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Find all the div elements with class &#39;product&#39;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">products <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> soup<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">find_all<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&#39;div&#39;<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #C4A7E7; font-style: italic\">class_<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #F6C177\">&#39;product&#39;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Iterate over each product found<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">for<\/span><span style=\"color: #E0DEF4\"> product <\/span><span style=\"color: #3E8FB0\">in<\/span><span style=\"color: #E0DEF4\"> products<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">\u00a0\u00a0\u00a0\u00a0<\/span><span style=\"color: #908CAA; 
font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Find the product name<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">\u00a0\u00a0\u00a0\u00a0name <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> product<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">find<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #C4A7E7; font-style: italic\">class_<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #F6C177\">&#39;product-name&#39;<\/span><span style=\"color: #908CAA\">).<\/span><span style=\"color: #E0DEF4\">text<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">\u00a0\u00a0\u00a0\u00a0product_names<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">append<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">name<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">\u00a0\u00a0\u00a0\u00a0<\/span><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Find the product price<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">\u00a0\u00a0\u00a0\u00a0price <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> product<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">find<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #C4A7E7; font-style: italic\">class_<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&#39;product-price&#39;<\/span><span style=\"color: #908CAA\">).<\/span><span style=\"color: #E0DEF4\">text<\/span><\/span>\n<span class=\"line\"><\/span>\n<span 
class=\"line\"><span style=\"color: #E0DEF4\">\u00a0\u00a0\u00a0\u00a0product_prices<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">append<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">price<\/span><span style=\"color: #908CAA\">)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<p>For every product on the page, the script will find the name and price and add them to their respective lists.<\/p>\n\n\n\n<p>The <strong>.text<\/strong> attribute in the code strips the HTML tags and returns only the text inside the element.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 6: Store Scraped Data in an Excel Sheet<\/h3>\n\n\n\n<p>Next, we\u2019ll store the product names and prices in a pandas DataFrame and save it to an Excel sheet.<\/p>\n\n\n\n<p>Then we\u2019ll have our script print a message confirming that the data was saved.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers cbp-highlight-hover\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(2 * 0.6 * .75rem);--cbp-line-highlight-color:rgba(215, 211, 255, 0.2);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#232136;display:none;background-color:#e0def4\" aria-label=\"Copy\" data-copied-text=\"Copied!\" data-has-text-button=\"textSimple\" data-inside-header-type=\"none\" aria-live=\"polite\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly># Store the product name and price data in a DataFrame\n\ndata = { \"Product Name\": product_names, \"Price\": product_prices }\n\ndf = 
pd.DataFrame(data)\n\n# Define the Excel file name\n\nexcel_file = \"product_pricing.xlsx\"\n\n# Save the DataFrame to an Excel file\n\ndf.to_excel(excel_file, index=False)\n\n# Print a message to confirm that the data is saved\n\nprint(f\"Product pricing saved to {excel_file}\")\n\nelse: print(f\"Request failed with status code: {response.status_code}\")<\/textarea><\/pre><span class=\"cbp-btn-text\">Copy<\/span><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Store the product name and price data in a DataFrame<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">data <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #908CAA\">{<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;Product Name&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> product_names<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;Price&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> product_prices <\/span><span style=\"color: #908CAA\">}<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">df <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> pd<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">DataFrame<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">data<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Define the Excel file 
name<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">excel_file <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;product_pricing.xlsx&quot;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Save the DataFrame to an Excel file<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">df<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">to_excel<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">excel_file<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #C4A7E7; font-style: italic\">index<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #EA9A97\">False<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Print a message to confirm that the data is saved<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #3E8FB0\">f<\/span><span style=\"color: #F6C177\">&quot;Product pricing saved to <\/span><span style=\"color: #3E8FB0\">{<\/span><span style=\"color: #E0DEF4\">excel_file<\/span><span style=\"color: #3E8FB0\">}<\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">else<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #EB6F92; font-style: 
italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #3E8FB0\">f<\/span><span style=\"color: #F6C177\">&quot;Request failed with status code: <\/span><span style=\"color: #3E8FB0\">{<\/span><span style=\"color: #E0DEF4\">response<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">status_code<\/span><span style=\"color: #3E8FB0\">}<\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<div style=\"height:34px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large is-resized centered\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/common-web-scraping-challenges-1024x576.png\" alt=\"Web Scraping challenges\" class=\"wp-image-54559\" style=\"object-fit:cover\" srcset=\"https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/common-web-scraping-challenges-1024x576.png 1024w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/common-web-scraping-challenges-300x169.png 300w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/common-web-scraping-challenges-768x432.png 768w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/common-web-scraping-challenges-1536x864.png 1536w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/common-web-scraping-challenges-600x338.png 600w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/06\/common-web-scraping-challenges.png 1920w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Common Web Scraping Challenges<\/h2>\n\n\n\n<p>Web scraping is rarely straightforward. 
Several challenges can hamper your scraping, and your scripts will have to account for each of them.<\/p>\n\n\n\n<p>One common challenge is missing or inconsistent data. Websites may have varying structures, and not all the elements you&#8217;re looking for will be present on every page. Your script must handle these gaps, for example by checking that an element was actually found before reading its text.<\/p>\n\n\n\n<p>Dynamic content is another obstacle, as some websites load data using JavaScript after the initial HTML page has been delivered. In these cases, tools like Selenium, which drive a real browser, are more appropriate because they can execute JavaScript and access the dynamically loaded content.<\/p>\n\n\n\n<p>Anti-scraping mechanisms like CAPTCHAs are designed to prevent automated access to websites. They are a nuisance, but CAPTCHA solvers can help bypass them. Websites may also implement rate limiting and IP blocking to combat bots. Using a pool of proxies can distribute your requests over several IPs and reduce the likelihood of being blocked.<\/p>\n\n\n\n<p>Lastly, websites frequently change their structure, which can break your scripts. This means you\u2019ll need to routinely update your scripts to handle any changes in the website\u2019s layout or structure.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>In this article, we covered the basics of web scraping with Beautiful Soup. We learned how to set up our environment by installing the necessary libraries and how to use Beautiful Soup to navigate and extract data from HTML content.<\/p>\n\n\n\n<p>We also explored the structure of HTML documents, understanding how tags, attributes, and text elements are used to build web pages.<\/p>\n\n\n\n<p>With our step-by-step guide, we learned how to make HTTP requests, parse HTML content with Beautiful Soup, and extract product names and prices from a sample e-commerce page.
We then demonstrated how to store the extracted data in a DataFrame and save it to an Excel file, making it easy to manage and analyze the collected information.<\/p>\n\n\n\n<p>With this basic guide, you should feel more comfortable starting your own web scraping project and gathering data from websites effectively.<\/p>\n","protected":false},"author":2627,"featured_media":76632,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","format":"standard","categories":[110],"tags":[],"class_list":["post-54543","blog","type-blog","status-publish","format-standard","has-post-thumbnail","hentry","category-web-scraping-and-automation"],"acf":[],"_links":{"self":[{"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/blog\/54543","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/blog"}],"about":[{"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/types\/blog"}],"author":[{"embeddable":true,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/users\/2627"}],"replies":[{"embeddable":true,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/comments?post=54543"}],"version-history":[{"count":3,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/blog\/54543\/revisions"}],"predecessor-version":[{"id":84855,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/blog\/54543\/revisions\/84855"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/media\/76632"}],"wp:attachment":[{"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/media?parent=54543"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/categories?post=54543"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/tags?post=54543"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}