Claude’s API can generate and analyze code, but it’s not just a fancy autocomplete.

Let’s say you’re building a Python script to scrape data from a website. You’ve got the basic structure, but you’re stuck on how to handle pagination or parse complex HTML.

import requests
from bs4 import BeautifulSoup

def scrape_website(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')
    # ... stuck here ...
    return data

# Example usage (hypothetical, as we don't have a real URL for this example)
# data = scrape_website("https://example.com/page/1")
# print(data)

You could ask Claude to help:

Prompt:

"I’m writing a Python script to scrape data from a website. I’m using requests and BeautifulSoup. My current code is:

import requests
from bs4 import BeautifulSoup

def scrape_website(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')
    # I need to extract product names and prices from this HTML.
    # The product names are in 'h2' tags with class 'product-title'.
    # The prices are in 'span' tags with class 'price'.
    # The website has pagination, with links like '<a href="/page/2">Next</a>'.
    # Please provide the Python code to extract this data and handle pagination,
    # assuming the base URL is 'https://example.com'.
    return data

# Example usage (hypothetical, as we don't have a real URL for this example)
# data = scrape_website("https://example.com/page/1")
# print(data)

Can you complete the scrape_website function to extract product names and prices, and also add logic to follow pagination links up to a maximum of 5 pages?"

Claude’s Potential Output (Illustrative):

import requests
from bs4 import BeautifulSoup

def scrape_website(url):
    all_products = []
    current_page = 1
    max_pages = 5

    while current_page <= max_pages:
        page_url = f"{url.split('/page/')[0]}/page/{current_page}"
        print(f"Scraping: {page_url}")

        try:
            response = requests.get(page_url)
            response.raise_for_status() # Raise an exception for bad status codes
            soup = BeautifulSoup(response.content, 'html.parser')

            products_on_page = soup.find_all('div', class_='product-item') # Assuming a common wrapper
            if not products_on_page:
                print(f"No products found on page {current_page}. Stopping.")
                break

            for product_div in products_on_page:
                name_tag = product_div.find('h2', class_='product-title')
                price_tag = product_div.find('span', class_='price')

                name = name_tag.text.strip() if name_tag else "N/A"
                price = price_tag.text.strip() if price_tag else "N/A"

                all_products.append({'name': name, 'price': price})

            # Check for next page link (optional, if the page count is reliable)
            # next_page_link = soup.find('a', text='Next')
            # if not next_page_link:
            #     print(f"No 'Next' link found on page {current_page}. Stopping.")
            #     break

            current_page += 1

        except requests.exceptions.RequestException as e:
            print(f"Error fetching page {current_page}: {e}")
            break
        except Exception as e:
            print(f"An unexpected error occurred on page {current_page}: {e}")
            break

    return all_products

# Example usage:
base_url = "https://example.com" # Replace with a real URL if testing
data = scrape_website(base_url)
print(data)

This isn’t just code generation; it’s reasoned code generation. Claude understood the context of your existing code, the specific HTML elements you needed to target, and the logic for pagination. It then constructed a solution that incorporates these requirements, including error handling for network requests and a loop to manage multiple pages.

The underlying mechanism involves Claude processing your prompt as a complex instruction. It breaks down the request into sub-problems:

  1. HTML Parsing: Identify product names (h2.product-title) and prices (span.price).
  2. Data Extraction: Extract text content from these tags and clean it (.strip()).
  3. Pagination Logic: Construct URLs for subsequent pages and iterate.
  4. Looping/Control Flow: Implement a while loop to manage page limits.
  5. Error Handling: Add try-except blocks for robust network operations.

Claude’s training data includes vast amounts of code and natural language descriptions of code. It uses this to build a probabilistic model of how code should be structured to fulfill a given request. It’s not "thinking" in the human sense, but it’s incredibly adept at pattern matching and sequence prediction on a massive scale.

When you provide code, Claude doesn’t just read it; it parses it into an internal representation that it can reason about alongside your natural language instructions. This allows it to understand what your code does and how your request modifies that behavior.

The one thing most people don’t realize is that Claude’s code generation is often influenced by the structure and style of the code you provide in the prompt. If your starter code is well-formatted, uses specific libraries, and follows certain conventions, Claude is more likely to generate code that aligns with those choices. It’s like giving it a blueprint and asking it to fill in the missing pieces in the same architectural style.

The next step is often integrating this generated code into a larger application or refining the scraping logic to handle more complex scenarios like JavaScript-rendered content.

Want structured learning?

Take the full Claude-api course →