# Scrape behind login walls

> Learn how to scrape data from protected websites

## Overview

This guide demonstrates how to scrape data from websites that require user authentication by using browser sessions. You'll learn to launch a session, authenticate manually via a live URL, and then run automated tasks within the same authenticated session.

<Steps>
  <Step title="Launch Session and Authenticate">
    Create a profile and session. The profile will persist your authentication cookies for future sessions.

    ```python Python theme={null}
    from smooth import SmoothClient

    client = SmoothClient()

    # Create the profile first (only needed once)
    client.create_profile(profile_id="gmail-session")

    with client.session(profile_id="gmail-session", url="https://mail.google.com") as session:
        # Get the live URL for manual authentication
        print(f"Please log in at: {session.live_url()}")

        # Wait for user to authenticate
        input("Press Enter after you've logged in...")

        # Now run tasks in the authenticated session
        result = session.run_task(
            task="Get the subject and sender of the most recent email"
        )

        print(f"Last email: {result.output}")
    ```

    Open the printed live URL in your browser and log in to Gmail. Once you are logged in, return to the terminal and press Enter to continue with the automated task.
  </Step>

  <Step title="Reuse the Profile">
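    In later runs, pass the same profile ID to skip manual authentication: the profile already holds your login cookies. If the site has expired the login in the meantime, repeat the first step to sign in again.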

    ```python Python theme={null}
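    from smooth import SmoothClient

    client = SmoothClient()

    # The profile already exists, so create_profile is not called again
    with client.session(profile_id="gmail-session", url="https://mail.google.com") as session: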
        result = session.run_task(
            task="Get the subject and sender of my 5 most recent emails",
            response_model={
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "subject": {"type": "string", "description": "Email subject"},
                        "sender": {"type": "string", "description": "Sender name or email"}
                    }
                }
            }
        )

        for email in result.output:
            print(f"From: {email['sender']} - {email['subject']}")
    ```
  </Step>
</Steps>
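
The two steps above can be folded into one helper that performs the manual login only on the first run. This is a minimal sketch, not part of the Smooth SDK: `run_in_profile`, `first_run`, and `wait_for_login` are hypothetical names, and `client` is assumed to behave like the `SmoothClient` instance used in the steps (only `create_profile`, `session`, `live_url`, and `run_task` are called, exactly as shown there).

```python
PROFILE_ID = "gmail-session"
START_URL = "https://mail.google.com"


def run_in_profile(client, task, first_run=False, wait_for_login=input):
    """Run `task` in a session bound to PROFILE_ID and return its output.

    `client` is assumed to expose the same methods as the SmoothClient
    in the steps above. `first_run` and `wait_for_login` are hypothetical
    parameters introduced for this sketch.
    """
    if first_run:
        # Profile creation is only needed once (see step 1).
        client.create_profile(profile_id=PROFILE_ID)

    with client.session(profile_id=PROFILE_ID, url=START_URL) as session:
        if first_run:
            # Pause until the user has logged in through the live URL.
            print(f"Please log in at: {session.live_url()}")
            wait_for_login("Press Enter after you've logged in...")

        result = session.run_task(task=task)
        return result.output
```

Call it once with `first_run=True` and a `SmoothClient()` instance to create the profile and log in, then omit `first_run` on later runs. Passing the client in as an argument keeps the helper easy to exercise with a stand-in object before pointing it at a real account.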

## Key Benefits

* **Persistent Authentication**: Profiles maintain login state across multiple sessions
* **Manual Control**: You log in yourself through the live URL, so credentials never need to appear in your script or task prompts
* **Automated Execution**: Once authenticated, run complex tasks automatically
* **Session Reuse**: The same profile can be used for multiple related tasks

## Best Practices

* Use descriptive profile IDs for better organization
* Keep profiles secure and don't share profile IDs
* Test authentication manually before running automated tasks
* Handle rate limits and be respectful to the target website
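
One way to act on the last point is to retry a failed task with exponential backoff instead of immediately hitting the site again. The sketch below is an illustration under assumptions: `run_with_backoff` and its parameters are hypothetical, and because this guide does not say how the SDK reports rate limits, it retries on any exception. Narrow the `except` clause once you know the specific error type.

```python
import random
import time


def run_with_backoff(action, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call `action()` and retry with exponential backoff plus jitter.

    Hypothetical helper: it retries on any exception because the error
    raised for a rate limit is not specified in this guide.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == max_attempts:
                # Out of attempts: surface the last error to the caller.
                raise
            # Delay doubles each attempt (2s, 4s, 8s, ...) plus up to 1s of jitter.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 1)
            sleep(delay)


# Usage inside an open session (assumes `session` from the steps above):
# result = run_with_backoff(
#     lambda: session.run_task(task="Get the subject and sender of the most recent email")
# )
```

The jitter spreads retries out so that several scripts sharing one profile do not all retry at the same moment.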

<Note>
  Browser profiles persist cookies and authentication state, which makes them well suited to accessing protected content. Because you log in manually, your credentials stay out of your automation code.
</Note>

## Community

<Card title="Join Discord" icon="discord" href="https://discord.gg/VcdgMwUmMG">
  Join our community for support and showcases
</Card>
