{"id":60703,"date":"2024-11-08T16:03:35","date_gmt":"2024-11-08T16:03:35","guid":{"rendered":"https:\/\/proxidize.com\/?post_type=use-cases&#038;p=60703"},"modified":"2025-10-23T11:53:50","modified_gmt":"2025-10-23T10:53:50","slug":"scraping-websites-with-login-pages-python","status":"publish","type":"blog","link":"https:\/\/proxidize.com\/blog\/scraping-websites-with-login-pages-python\/","title":{"rendered":"Scraping Websites with Login Pages Using Python"},"content":{"rendered":"\n<p>Scraping a website that sits behind a login page means getting past that page before you can collect the data you need. With valid credentials, a script can log in for you and go straight to scraping. This guide covers the challenges of scraping websites with login pages, how to analyze the login mechanism, how to set up an environment, and how to write the Python script that logs in and scrapes. It applies only to websites for which you already have login credentials. 
Bypassing a login page without having any login credentials is unethical and could be illegal.<\/p>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large is-resized centered\"><img fetchpriority=\"high\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/understanding-the-challenges-1024x576.png\" alt=\"A drawing of a person tapping on a big screen under the title &quot;Understanding the Challenges&quot;.\" class=\"wp-image-60805\" style=\"object-fit:cover\" srcset=\"https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/understanding-the-challenges-1024x576.png 1024w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/understanding-the-challenges-300x169.png 300w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/understanding-the-challenges-768x432.png 768w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/understanding-the-challenges-1536x864.png 1536w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/understanding-the-challenges-600x338.png 600w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/understanding-the-challenges.png 1920w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Understanding the Challenges<\/h2>\n\n\n\n<p>Scraping websites that require a login is more complex than scraping public pages because it involves extra steps that ordinary scraping does not: maintaining an active session, managing cookies, and dealing with CAPTCHAs.<\/p>\n\n\n\n<p>Scraping a public page only requires sending a GET request to retrieve its HTML content. 
With authenticated pages, you must first log in to the website by submitting your credentials, just as a normal user would when accessing their account. This means sending a <a href=\"https:\/\/www.w3schools.com\/tags\/ref_httpmethods.asp\" target=\"_blank\" rel=\"noopener\">POST request<\/a> with form data such as a username, a password, and any security tokens. The session must then be maintained after login so that later requests to protected resources are treated as authenticated.<\/p>\n\n\n\n<p>Once logged in, every request must be authenticated through session cookies or tokens. This session management adds a layer of complexity, as improper handling can result in denied access. Websites with login requirements use sessions and cookies to keep track of authenticated users, so a scraper needs to keep one session active for its whole run. In Python, this is done with a requests.Session() object, which stores cookies and session data across multiple requests. Code examples appear further down the article.<\/p>\n\n\n\n<p>Cookies need to be saved and sent along with each subsequent request so that the session remains authenticated. Without cookies, the server may treat each request as unauthenticated and deny access. Some websites also use <a href=\"https:\/\/owasp.org\/www-community\/attacks\/csrf#:~:text=Cross%2DSite%20Request%20Forgery%20(CSRF,which%20they&#039;re%20currently%20authenticated.\" target=\"_blank\" rel=\"noopener\">Cross-Site Request Forgery<\/a> (CSRF) protection, which places a hidden token in the form that must be submitted along with the login details. If the token is missing or incorrect, the server will reject the request.<\/p>\n\n\n\n<p>Finally, CAPTCHAs pose a significant roadblock when it comes to scraping in general. 
The usual options are integrating <a href=\"https:\/\/proxidize.com\/antidetect-browser\/captcha-solvers\/\">CAPTCHA-solving services<\/a> or avoiding websites that use advanced CAPTCHA methods.<\/p>\n\n\n\n<p>Some websites implement more advanced detection techniques. One way around them is to use <a href=\"https:\/\/proxidize.com\/proxy-server\/mobile-proxy\/\">mobile proxies<\/a>, which hide your IP address and make the traffic appear to come from somewhere else. Rotating proxies go a step further by regularly changing the IP address, which makes the traffic harder to link together.<\/p>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large is-resized centered\"><img decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/analyzing-the-login-mechanism--1024x576.png\" alt=\"A drawing of a person inspecting a big laptop under the title &quot;Analyzing the Login Mechanism&quot;.\" class=\"wp-image-60807\" style=\"object-fit:cover\" srcset=\"https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/analyzing-the-login-mechanism--1024x576.png 1024w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/analyzing-the-login-mechanism--300x169.png 300w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/analyzing-the-login-mechanism--768x432.png 768w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/analyzing-the-login-mechanism--1536x864.png 1536w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/analyzing-the-login-mechanism--600x338.png 600w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/analyzing-the-login-mechanism-.png 1920w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 
class=\"wp-block-heading\">Analyzing the Login Mechanism<\/h2>\n\n\n\n<p>Scraping websites with a login page requires understanding how the website\u2019s login process works. To do this, inspect the login form, identify its key fields, and observe how the form is submitted. This can be done by following these steps:<\/p>\n\n\n\n<p>Inspect the Login Form on the Website: Once on the login page of the desired website, open the developer tools by right-clicking on the page and selecting \u201cInspect\u201d or by pressing the hotkey Ctrl+Shift+I. Look at the HTML structure of the form to find the input elements for the login. These typically include a username field (for example name=username or id=username), a password field (name=password), and any CSRF token, all of which must be included in the form submission to authenticate successfully.<\/p>\n\n\n\n<p>Understand Form Submissions: Check whether the form uses the POST method for submission. The form\u2019s \u201caction\u201d attribute specifies the URL that the data is sent to. The data typically includes the username, the password, and hidden fields such as the CSRF token and session identifiers.<\/p>\n\n\n\n<p>Use Browser Developer Tools to Monitor Network Requests During Login: With the developer tools open, navigate to the \u201cNetwork\u201d tab and submit the login form with test credentials. Look at the network requests made during the form submission and locate the one corresponding to the login attempt, which is usually labeled as POST. Click on the request to see its details, which include the headers, form data, and response. 
This information tells the user how the server handles authentication and what data needs to be sent for a successful login.<\/p>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large is-resized centered\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/python-script-for-scraping-login-pages-1024x576.png\" alt=\"A drawing of a person standing at a big computer under the title &quot;Creating Python Script for Scraping Websites with Login Pages&quot;.\" class=\"wp-image-60806\" style=\"object-fit:cover\" srcset=\"https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/python-script-for-scraping-login-pages-1024x576.png 1024w, 
https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/python-script-for-scraping-login-pages-300x169.png 300w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/python-script-for-scraping-login-pages-768x432.png 768w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/python-script-for-scraping-login-pages-1536x864.png 1536w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/python-script-for-scraping-login-pages-600x338.png 600w, https:\/\/proxidize.com\/wp-content\/uploads\/2024\/11\/python-script-for-scraping-login-pages.png 1920w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Creating Python Script for Scraping Websites with Login Pages<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Setting up the Environment<\/h3>\n\n\n\n<p>The first and most crucial step in any <a href=\"https:\/\/proxidize.com\/use-cases\/web-scraping\/\">web scraping<\/a> project is to set up the core environment. Experienced scrapers can skip this section. For everyone else, let us walk through what this means and how to do it.<\/p>\n\n\n\n<p>Setting up an environment means choosing and installing the <a href=\"https:\/\/proxidize.com\/use-cases\/web-scraping-tools\/\">tools your script<\/a> will use. Conceptually, it is like the toolbox you fill before you build something: without a hammer, you would not get far. The environment differs depending on your specific task or project. For scraping websites with login pages using Python, the two main \u201ctools\u201d are the Requests library and Beautiful Soup.<\/p>\n\n\n\n<p>Requests handles HTTP requests: it lets you send data to servers, maintain sessions, and manage cookies easily. 
It lets you log in to websites by sending POST requests and then access authenticated pages within the same session. BeautifulSoup parses the content retrieved by Requests and extracts data from it. It can navigate, search, and modify HTML documents, which makes it well suited to pulling data out of web pages. Together, Requests and BeautifulSoup provide a straightforward approach for logging in and extracting data from websites.<\/p>\n\n\n\n<p>In the terminal of your <a href=\"https:\/\/aws.amazon.com\/what-is\/ide\/#:~:text=An%20integrated%20development%20environment%20(IDE,programmers%20develop%20software%20code%20efficiently.\" target=\"_blank\" rel=\"noopener\">integrated development environment<\/a> (IDE), enter the following commands:<\/p>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(1 * 0.6 * .75rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#232136;display:none;background-color:#e0def4\" aria-label=\"Copy\" data-copied-text=\"Copied!\" data-has-text-button=\"textSimple\" data-inside-header-type=\"none\" aria-live=\"polite\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>pip install requests\npip install beautifulsoup4<\/textarea><\/pre><span class=\"cbp-btn-text\">Copy<\/span><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #EA9A97\">pip<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: 
#F6C177\">install<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">requests<\/span><\/span>\n<span class=\"line\"><span style=\"color: #EA9A97\">pip<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">install<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">beautifulsoup4<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<p>Once that is done, you are ready to start scraping websites with login pages.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Handling Sessions<\/h3>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(1 * 0.6 * .75rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#232136;display:none;background-color:#e0def4\" aria-label=\"Copy\" data-copied-text=\"Copied!\" data-has-text-button=\"textSimple\" data-inside-header-type=\"none\" aria-live=\"polite\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly># Create a session object to maintain cookies and headers across requests\nsession = requests.Session()\n\n# Send a POST request to log in using the session\nresponse = session.post(login_url, data=payload)\n\n# Access another page using the same session\nprotected_page_url = \"https:\/\/www.example.com\/protected-page\"\nprotected_response = session.get(protected_page_url)<\/textarea><\/pre><span class=\"cbp-btn-text\">Copy<\/span><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span 
class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Create a session object to maintain cookies and headers across requests<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">session <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> requests<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">Session<\/span><span style=\"color: #908CAA\">()<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Send a POST request to log in using the session<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">response <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> session<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">post<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">login_url<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #C4A7E7; font-style: italic\">data<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\">payload<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Access another page using the same session<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">protected_page_url <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;https:\/\/www.example.com\/protected-page&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">protected_response <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> 
session<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">get<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">protected_page_url<\/span><span style=\"color: #908CAA\">)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Cookie Management<\/h3>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(1 * 0.6 * .75rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#232136;display:none;background-color:#e0def4\" aria-label=\"Copy\" data-copied-text=\"Copied!\" data-has-text-button=\"textSimple\" data-inside-header-type=\"none\" aria-live=\"polite\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly># check if the request was successful\nif response.status_code == 200:\n    print(\"Login successful!\")\n    # Display cookies received after login\n    print(\"Cookies after login:\")\n    print(session.cookies.get_dict())\nelse:\n    print(f\"Login failed with status code: {response.status_code}\")<\/textarea><\/pre><span class=\"cbp-btn-text\">Copy<\/span><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> check if the request was successful<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">if<\/span><span style=\"color: #E0DEF4\"> response<\/span><span style=\"color: 
#908CAA\">.<\/span><span style=\"color: #E0DEF4\">status_code <\/span><span style=\"color: #3E8FB0\">==<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #EA9A97\">200<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;Login successful!&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Display cookies received after login<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;Cookies after login:&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">session<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">cookies<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">get_dict<\/span><span style=\"color: #908CAA\">())<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">else<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #3E8FB0\">f<\/span><span style=\"color: #F6C177\">&quot;Login failed with status code: <\/span><span style=\"color: #3E8FB0\">{<\/span><span style=\"color: #E0DEF4\">response<\/span><span style=\"color: 
#908CAA\">.<\/span><span style=\"color: #E0DEF4\">status_code<\/span><span style=\"color: #3E8FB0\">}<\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<p>If you already have valid session cookies for your account, for example copied from a browser in which you are logged in, you can load them into the session and skip the login request with the following script:<\/p>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(2 * 0.6 * .75rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#232136;display:none;background-color:#e0def4\" aria-label=\"Copy\" data-copied-text=\"Copied!\" data-has-text-button=\"textSimple\" data-inside-header-type=\"none\" aria-live=\"polite\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>import requests\nfrom bs4 import BeautifulSoup\n\n# Create a requests session\nsession = requests.Session()\n\n# Example saved cookies (replace these with your actual cookies)\nsaved_cookies = {\n    \"session_id\": \"your_session_id_value\",\n    \"auth_token\": \"your_auth_token_value\"\n}\n\n# Load cookies into the session\nfor name, value in saved_cookies.items():\n    session.cookies.set(name, value)\n\n# Use the session to access a protected page\nprotected_page_url = \"https:\/\/www.scrapingcourse.com\/dashboard\"\nresponse = session.get(protected_page_url)\n\n# Parse the response using BeautifulSoup\nif response.status_code == 200:\n    soup = BeautifulSoup(response.text, 'html.parser')\n    page_title = 
soup.title.string if soup.title else \"No title found\"\n    print(f\"Page title: {page_title}\")\nelse:\n    print(f\"Failed to access the protected page: {response.status_code}\")\n    print(response.text)  # For debugging\n<\/textarea><\/pre><span class=\"cbp-btn-text\">Copy<\/span><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> requests<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">from<\/span><span style=\"color: #E0DEF4\"> bs4 <\/span><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> BeautifulSoup<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Create a requests session<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">session <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> requests<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">Session<\/span><span style=\"color: #908CAA\">()<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Example saved cookies (replace these with your actual cookies)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">saved_cookies <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #908CAA\">{<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #F6C177\">&quot;session_id&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;your_session_id_value&quot;<\/span><span style=\"color: 
#908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #F6C177\">&quot;auth_token&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;your_auth_token_value&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #908CAA\">}<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Load cookies into the session<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">for<\/span><span style=\"color: #E0DEF4\"> name<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> value <\/span><span style=\"color: #3E8FB0\">in<\/span><span style=\"color: #E0DEF4\"> saved_cookies<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">items<\/span><span style=\"color: #908CAA\">():<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    session<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">cookies<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">set<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">name<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> value<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Use the session to access a protected page<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">protected_page_url <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;https:\/\/www.scrapingcourse.com\/dashboard&quot;<\/span><\/span>\n<span class=\"line\"><span 
style=\"color: #E0DEF4\">response <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> session<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">get<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">protected_page_url<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Parse the response using BeautifulSoup<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">if<\/span><span style=\"color: #E0DEF4\"> response<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">status_code <\/span><span style=\"color: #3E8FB0\">==<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #EA9A97\">200<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    soup <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> BeautifulSoup<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">response<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">text<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&#39;html.parser&#39;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    page_title <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> soup<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">title<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">string <\/span><span style=\"color: #3E8FB0\">if<\/span><span style=\"color: #E0DEF4\"> soup<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">title <\/span><span style=\"color: 
#3E8FB0\">else<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;No title found&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #3E8FB0\">f<\/span><span style=\"color: #F6C177\">&quot;Page title: <\/span><span style=\"color: #3E8FB0\">{<\/span><span style=\"color: #E0DEF4\">page_title<\/span><span style=\"color: #3E8FB0\">}<\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">else<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #3E8FB0\">f<\/span><span style=\"color: #F6C177\">&quot;Failed to access the protected page: <\/span><span style=\"color: #3E8FB0\">{<\/span><span style=\"color: #E0DEF4\">response<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">status_code<\/span><span style=\"color: #3E8FB0\">}<\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">response<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">text<\/span><span style=\"color: #908CAA\">)<\/span><span style=\"color: #E0DEF4\">  <\/span><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> For debugging<\/span><\/span>\n<span class=\"line\"><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Full Script for Scraping 
Websites with Login Pages<\/h3>\n\n\n\n<p>For the script, we will be using the website https:\/\/www.scrapingcourse.com\/login as an example. With everything compiled, the script should look like this:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(2 * 0.6 * .75rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#232136;display:none;background-color:#e0def4\" aria-label=\"Copy\" data-copied-text=\"Copied!\" data-has-text-button=\"textSimple\" data-inside-header-type=\"none\" aria-live=\"polite\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>import requests\nfrom bs4 import BeautifulSoup\n\n# Create a session to persist cookies and headers\nsession = requests.Session()\n\n# the URL of the login page\nlogin_url = \"https:\/\/www.scrapingcourse.com\/login\"\n\n# the payload with your login credentials\npayload = {\n    \"email\": \"admin@example.com\",\n    \"password\": \"password\",\n}\n\n# send the POST request to login using the session\nresponse = session.post(login_url, data=payload)\n\n# check if the request was successful\nif response.status_code == 200:\n    print(\"Login successful!\")\nelse:\n    print(f\"Login failed with status code: {response.status_code}\")\n\n# access another page after login, maintaining the session\nprotected_page_url = \"https:\/\/www.scrapingcourse.com\/protected-page\"\nprotected_response = session.get(protected_page_url)\n\n# parse the protected page content using BeautifulSoup\nsoup = 
BeautifulSoup(protected_response.text, \"html.parser\")\n\n# find the page title\npage_title = soup.title.string if soup.title else \"No title found\"\nprint(f\"Page title: {page_title}\")\n\n# Example of extracting data from the protected page\ndata = soup.find('div', class_='data-class')  # Adjust selector based on your needs\nif data:\n    print(f\"Extracted data: {data.text}\")\nelse:\n    print(\"No data found with the specified tag\/class.\")<\/textarea><\/pre><span class=\"cbp-btn-text\">Copy<\/span><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> requests<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">from<\/span><span style=\"color: #E0DEF4\"> bs4 <\/span><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> BeautifulSoup<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Create a session to persist cookies and headers<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">session <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> requests<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">Session<\/span><span style=\"color: #908CAA\">()<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> the URL of the login page<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">login_url <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;https:\/\/www.scrapingcourse.com\/login&quot;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span 
style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> the payload with your login credentials<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">payload <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #908CAA\">{<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #F6C177\">&quot;email&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;admin@example.com&quot;<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #F6C177\">&quot;password&quot;<\/span><span style=\"color: #908CAA\">:<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;password&quot;<\/span><span style=\"color: #908CAA\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #908CAA\">}<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> send the POST request to login using the session<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">response <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> session<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">post<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">login_url<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #C4A7E7; font-style: italic\">data<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\">payload<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: 
italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> check if the request was successful<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">if<\/span><span style=\"color: #E0DEF4\"> response<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">status_code <\/span><span style=\"color: #3E8FB0\">==<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #EA9A97\">200<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;Login successful!&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">else<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #3E8FB0\">f<\/span><span style=\"color: #F6C177\">&quot;Login failed with status code: <\/span><span style=\"color: #3E8FB0\">{<\/span><span style=\"color: #E0DEF4\">response<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">status_code<\/span><span style=\"color: #3E8FB0\">}<\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> access another page after login, maintaining the session<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">protected_page_url <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: 
#F6C177\">&quot;https:\/\/www.scrapingcourse.com\/protected-page&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">protected_response <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> session<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">get<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">protected_page_url<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> parse the protected page content using BeautifulSoup<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">soup <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> BeautifulSoup<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">protected_response<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">text<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;html.parser&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> find the page title<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">page_title <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> soup<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">title<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">string <\/span><span style=\"color: #3E8FB0\">if<\/span><span style=\"color: #E0DEF4\"> soup<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">title <\/span><span style=\"color: #3E8FB0\">else<\/span><span 
style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;No title found&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #3E8FB0\">f<\/span><span style=\"color: #F6C177\">&quot;Page title: <\/span><span style=\"color: #3E8FB0\">{<\/span><span style=\"color: #E0DEF4\">page_title<\/span><span style=\"color: #3E8FB0\">}<\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Example of extracting data from the protected page<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">data <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> soup<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">find<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&#39;div&#39;<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #C4A7E7; font-style: italic\">class_<\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #F6C177\">&#39;data-class&#39;<\/span><span style=\"color: #908CAA\">)<\/span><span style=\"color: #E0DEF4\">  <\/span><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Adjust selector based on your needs<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">if<\/span><span style=\"color: #E0DEF4\"> data<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #3E8FB0\">f<\/span><span style=\"color: 
#F6C177\">&quot;Extracted data: <\/span><span style=\"color: #3E8FB0\">{<\/span><span style=\"color: #E0DEF4\">data<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">text<\/span><span style=\"color: #3E8FB0\">}<\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">else<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;No data found with the specified tag\/class.&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<p>Adding the script above to your code lets you get past the login page when scraping a website. This saves time when scraping a social media site or any other website that requires a login. Note that this code alone does not scrape a full website; it only gets you through the login page and prints the page title and one sample element. To learn how to write the scraping code itself, see our article on writing a <a href=\"https:\/\/proxidize.com\/use-cases\/web-scraping-with-beautiful-soup\/\">Python script for web scraping<\/a>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Receiving Visual Confirmation<\/h3>\n\n\n\n<p>The code provided above gets you past the login page. However, you might want visual confirmation that the script is working. To get it, you can introduce Selenium into the mix. 
First, install the Selenium package from your terminal by running:<\/p>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(1 * 0.6 * .75rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#232136;display:none;background-color:#e0def4\" aria-label=\"Copy\" data-copied-text=\"Copied!\" data-has-text-button=\"textSimple\" data-inside-header-type=\"none\" aria-live=\"polite\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>pip install selenium<\/textarea><\/pre><span class=\"cbp-btn-text\">Copy<\/span><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #EA9A97\">pip<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">install<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">selenium<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<p>Once it is installed, add a few lines of code that open a visible browser window and log in through it. 
The fully updated script should look like this:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#e0def4;--cbp-line-number-width:calc(2 * 0.6 * .75rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#232136;display:none;background-color:#e0def4\" aria-label=\"Copy\" data-copied-text=\"Copied!\" data-has-text-button=\"textSimple\" data-inside-header-type=\"none\" aria-live=\"polite\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>from selenium import webdriver\nfrom selenium.webdriver.common.by import By\nfrom selenium.webdriver.common.keys import Keys\nimport time\nimport requests\nfrom bs4 import BeautifulSoup\n\n# Initialize Selenium WebDriver\ndriver = webdriver.Chrome()  # Ensure you have the ChromeDriver installed\ndriver.get(\"https:\/\/www.scrapingcourse.com\/login\")\n\n# Perform login using Selenium (adjust selectors as needed)\nemail_input = driver.find_element(By.NAME, \"email\")\nemail_input.send_keys(\"admin@example.com\")\npassword_input = driver.find_element(By.NAME, \"password\")\npassword_input.send_keys(\"password\")\npassword_input.send_keys(Keys.RETURN)\n\n# Allow some time for login to process\ntime.sleep(5)\n\n# Extract cookies from Selenium and transfer to requests session\nsession = requests.Session()\nfor cookie in driver.get_cookies():\n    session.cookies.set(cookie&#091;'name'&#093;, cookie&#091;'value'&#093;)\n\n# Keep the browser open instead of quitting\ninput(\"Press Enter to continue after verifying the page is loaded...\")\n\n# Use the requests session to access a 
protected page\nprotected_page_url = \"https:\/\/www.scrapingcourse.com\/dashboard\"\nresponse = session.get(protected_page_url)\n\n# Parse the response using BeautifulSoup\nif response.status_code == 200:\n    soup = BeautifulSoup(response.text, 'html.parser')\n    page_title = soup.title.string if soup.title else \"No title found\"\n    print(f\"Page title: {page_title}\")\nelse:\n    print(f\"Failed to access the protected page: {response.status_code}\")\n    print(response.text)  # For debugging<\/textarea><\/pre><span class=\"cbp-btn-text\">Copy<\/span><\/span><pre class=\"shiki rose-pine-moon\" style=\"background-color: #232136\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #3E8FB0\">from<\/span><span style=\"color: #E0DEF4\"> selenium <\/span><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> webdriver<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">from<\/span><span style=\"color: #E0DEF4\"> selenium<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">webdriver<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">common<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">by <\/span><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> By<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">from<\/span><span style=\"color: #E0DEF4\"> selenium<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">webdriver<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">common<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">keys <\/span><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> Keys<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> time<\/span><\/span>\n<span class=\"line\"><span style=\"color: 
#3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> requests<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">from<\/span><span style=\"color: #E0DEF4\"> bs4 <\/span><span style=\"color: #3E8FB0\">import<\/span><span style=\"color: #E0DEF4\"> BeautifulSoup<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Initialize Selenium WebDriver<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">driver <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> webdriver<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">Chrome<\/span><span style=\"color: #908CAA\">()<\/span><span style=\"color: #E0DEF4\">  <\/span><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Ensure you have the ChromeDriver installed<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">driver<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">get<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;https:\/\/www.scrapingcourse.com\/login&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Perform login using Selenium (adjust selectors as needed)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">email_input <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> driver<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">find_element<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">By<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #3E8FB0\">NAME<\/span><span 
style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;email&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">email_input<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">send_keys<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;admin@example.com&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">password_input <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> driver<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">find_element<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">By<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #3E8FB0\">NAME<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;password&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">password_input<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">send_keys<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;password&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">password_input<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">send_keys<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">Keys<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #3E8FB0\">RETURN<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Allow some time for login to 
process<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">time<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">sleep<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #EA9A97\">5<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Extract cookies from Selenium and transfer to requests session<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">session <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> requests<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">Session<\/span><span style=\"color: #908CAA\">()<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">for<\/span><span style=\"color: #E0DEF4\"> cookie <\/span><span style=\"color: #3E8FB0\">in<\/span><span style=\"color: #E0DEF4\"> driver<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">get_cookies<\/span><span style=\"color: #908CAA\">():<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    session<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">cookies<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">set<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">cookie<\/span><span style=\"color: #908CAA\">&#091;<\/span><span style=\"color: #F6C177\">&#39;name&#39;<\/span><span style=\"color: #908CAA\">&#093;,<\/span><span style=\"color: #E0DEF4\"> cookie<\/span><span style=\"color: #908CAA\">&#091;<\/span><span style=\"color: #F6C177\">&#39;value&#39;<\/span><span style=\"color: #908CAA\">&#093;)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; 
font-style: italic\"> Keep the browser open instead of quitting<\/span><\/span>\n<span class=\"line\"><span style=\"color: #EB6F92; font-style: italic\">input<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #F6C177\">&quot;Press Enter to continue after verifying the page is loaded...&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Use the requests session to access a protected page<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">protected_page_url <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;https:\/\/www.scrapingcourse.com\/dashboard&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">response <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> session<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">get<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">protected_page_url<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> Parse the response using BeautifulSoup<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">if<\/span><span style=\"color: #E0DEF4\"> response<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">status_code <\/span><span style=\"color: #3E8FB0\">==<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #EA9A97\">200<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    soup <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> 
BeautifulSoup<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">response<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">text<\/span><span style=\"color: #908CAA\">,<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&#39;html.parser&#39;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    page_title <\/span><span style=\"color: #3E8FB0\">=<\/span><span style=\"color: #E0DEF4\"> soup<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">title<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">string <\/span><span style=\"color: #3E8FB0\">if<\/span><span style=\"color: #E0DEF4\"> soup<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">title <\/span><span style=\"color: #3E8FB0\">else<\/span><span style=\"color: #E0DEF4\"> <\/span><span style=\"color: #F6C177\">&quot;No title found&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #3E8FB0\">f<\/span><span style=\"color: #F6C177\">&quot;Page title: <\/span><span style=\"color: #3E8FB0\">{<\/span><span style=\"color: #E0DEF4\">page_title<\/span><span style=\"color: #3E8FB0\">}<\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #3E8FB0\">else<\/span><span style=\"color: #908CAA\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #3E8FB0\">f<\/span><span style=\"color: #F6C177\">&quot;Failed to access the protected page: <\/span><span style=\"color: #3E8FB0\">{<\/span><span style=\"color: 
#E0DEF4\">response<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">status_code<\/span><span style=\"color: #3E8FB0\">}<\/span><span style=\"color: #F6C177\">&quot;<\/span><span style=\"color: #908CAA\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E0DEF4\">    <\/span><span style=\"color: #EB6F92; font-style: italic\">print<\/span><span style=\"color: #908CAA\">(<\/span><span style=\"color: #E0DEF4\">response<\/span><span style=\"color: #908CAA\">.<\/span><span style=\"color: #E0DEF4\">text<\/span><span style=\"color: #908CAA\">)<\/span><span style=\"color: #E0DEF4\">  <\/span><span style=\"color: #908CAA; font-style: italic\">#<\/span><span style=\"color: #6E6A86; font-style: italic\"> For debugging<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p><\/p>\n\n\n\n<p>This lets you confirm that the code is functioning correctly and logging into the correct page.<\/p>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>Scraping websites with login pages is fairly straightforward. Understanding the login page parameters helps explain how the website handles login requests. Most websites follow a similar mechanism with slightly different parameters, so code written for one website is easy to adapt to another.<\/p>\n\n\n\n<p>Handling sessions and managing cookies is a vital part of the code: without them, the session could time out or be invalidated, possibly resulting in an IP ban. We have written articles detailing how to <a href=\"https:\/\/proxidize.com\/antidetect-browser\/bypass-captcha\/\">implement a CAPTCHA bypass tool<\/a> as well as how to add a proxy to your script, both of which can be added to your scraping workflow. 
With all of these tools working together, you should have no trouble scraping websites with login pages using Python.<\/p>\n","protected":false,"author":2627,"featured_media":75801,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","format":"standard","categories":[110],"tags":[],"class_list":["post-60703","blog","type-blog","status-publish","format-standard","has-post-thumbnail","hentry","category-web-scraping-and-automation"],"acf":[],"_links":{"self":[{"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/blog\/60703","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/blog"}],"about":[{"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/types\/blog"}],"author":[{"embeddable":true,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/users\/2627"}],"replies":[{"embeddable":true,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/comments?post=60703"}],"version-history":[{"count":4,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/blog\/60703\/revisions"}],"predecessor-version":[{"id":87237,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/blog\/60703\/revisions\/87237"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/media\/75801"}],"wp:attachment":[{"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/media?parent=60703"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/categories?post=60703"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/proxidize.com\/wp-json\/wp\/v2\/tags?post=60703"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}