Configure Scraping Browser

How to configure the Scraping Browser

Before using the Scraping Browser, some configuration is needed. This article guides you through setting up your credentials, configuring the Scraping Browser, running sample scripts, and working with real-time browser sessions in the Page Operations Console. Follow the detailed instructions to ensure efficient use of the Scraping Browser for web scraping.

Before you start using the Scraping Browser, get your credentials: the username and password you will use with the web automation tool. We assume you have already obtained valid credentials; if not, get them from ABCproxy.
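As a quick sanity check, the credentials slot directly into the Scraping Browser connection string. The helper below is a hypothetical sketch (not part of the ABCproxy API); the host name is taken from the sample script later in this article.

```python
# Hypothetical helper showing how the username and password combine
# into the WebSocket endpoint used by the sample script below.
def build_ws_endpoint(username: str, password: str) -> str:
    return f'wss://{username}:{password}@upg-scbr.abcproxy.com'

# Replace the placeholders with your own ABCproxy credentials.
print(build_ws_endpoint('PROXY-FULL-ACCOUNT', 'PASSWORD'))
```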

Sample Code

We have provided some scraping examples to help you get started with our Scraping Browser more efficiently. Simply replace the credentials and target URL with your own, then customize the scripts to fit your business scenario.

To run the scripts in your local environment, refer to the following examples. Ensure the required dependencies are installed locally, configure your credentials, and then execute the script to obtain the desired data.

If the webpage you are accessing encounters CAPTCHAs or verification challenges, don't worry: we handle them for you seamlessly.

import asyncio
from playwright.async_api import async_playwright

AUTH = 'PROXY-FULL-ACCOUNT:PASSWORD'
SBR_WS_SERVER = f'wss://{AUTH}@upg-scbr.abcproxy.com'

async def run(pw):
    print('Connecting to Scraping Browser...')
    browser = await pw.chromium.connect_over_cdp(SBR_WS_SERVER)
    try:
        print('Connected! Navigating to target...')
        page = await browser.new_page()
        await page.goto('https://example.com', timeout=2 * 60 * 1000)

        # Take a screenshot of the page
        print('Taking screenshot of page...')
        await page.screenshot(path='./remote_screenshot_page.png')

        # Extract the HTML content
        print('Scraping page content...')
        html = await page.content()
        print(html)
    finally:
        # Always close the browser explicitly to release the session
        await browser.close()

async def main():
    async with async_playwright() as playwright:
        await run(playwright)

if __name__ == '__main__':
    asyncio.run(main())
 

Scraping Browser Initial Navigation and Workflow Management

The scraping browser session architecture allows each session to perform only one initial navigation. This initial navigation refers to the first instance of loading the target website that will be used for subsequent data extraction. After this initial phase, users can freely navigate the website within the same session through clicks, scrolls, and other interactive actions. However, to start a new scraping job from the initial navigation phase—whether targeting the same site or a different one—a new session must be created.

Session Time Limits

1. Regardless of how you operate, note that session timeout limits apply. If a browser session is not explicitly closed in your script, the system automatically terminates it after a maximum of 60 minutes.

2. When using the Scraping Browser via the web console, the system enforces a strict rule of one active session per account. To ensure optimal performance and experience, always explicitly close the browser session in your script.
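Long-running scripts can guard against the 60-minute cap by giving each session a time budget and closing the browser themselves. The helper below is a sketch of our own (`with_session_budget` and the 55-minute default are illustrative choices, not part of the Scraping Browser API), built on standard `asyncio.wait_for`.

```python
import asyncio

# The service force-closes unclosed sessions after 60 minutes (see above).
SESSION_CAP_SECONDS = 60 * 60

async def with_session_budget(task, budget_seconds=55 * 60):
    # Finish (or fail with TimeoutError) with headroom below the cap,
    # so the script can close the browser itself in a finally block
    # instead of being force-terminated mid-task.
    return await asyncio.wait_for(task, timeout=budget_seconds)

async def demo():
    async def quick_job():
        return 'done'
    # A trivial job completes well within a 5-second budget.
    return await with_session_budget(quick_job(), budget_seconds=5)

print(asyncio.run(demo()))  # prints: done
```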
