Guide Scraper

Extract structured data from HTML or websites using customizable guides. Ideal for web scraping and data analysis.


Note: Extraction might be slower when the HTML content is very large.


How to Use
  1. Create a guide key using the guide format (see 'Creating a Guide Key' section below).
  2. Choose between HTML input or URL input using the toggle switch in the form above.
  3. Enter your HTML content or the URL you want to extract from.
  4. Provide your guide key in the designated field.
  5. Click the 'Extract' button to extract the content based on your guide.
  6. View the results in the display area below the form.
Creating a Guide Key

Before using the parser, you need to create a guide key. This key defines what content to extract and how to locate it in the HTML.

Scraping with Guides

This tool uses a guide format to extract specific content from HTML. Here's an example of the guide format:

{
  "title": "Exact Title",
  "content": "Exact paragraph/long description.",
  "author": "Exact Author Name",
  "date": "Exact Published date"
}

In this format, each key is the name of a piece of data you want to extract, and each value is the exact text used to locate the matching element.
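To illustrate the idea of locating an element by its exact text, here is a minimal sketch using Python's standard html.parser module. It is not the tool's actual implementation, which is not documented here; the names ExactTextFinder and locate are hypothetical, and the sketch only reports the chain of open tags around the first text node that equals the guide value.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional

# Tags that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ExactTextFinder(HTMLParser):
    """Record the open-tag path of the first text node equal to the target."""

    def __init__(self, target: str) -> None:
        super().__init__()
        self.target = target.strip()
        self.stack: List[str] = []               # currently open tags
        self.match: Optional[List[str]] = None   # tag path of the first hit

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag (tolerates unclosed tags).
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if self.match is None and data.strip() == self.target:
            self.match = list(self.stack)


def locate(html: str, guide: Dict[str, str]) -> Dict[str, Optional[List[str]]]:
    """Map each guide key to the tag path where its exact text was found."""
    result = {}
    for name, text in guide.items():
        finder = ExactTextFinder(text)
        finder.feed(html)
        result[name] = finder.match
    return result


sample_html = (
    "<html><body><h1>Exact Title</h1>"
    "<p class='by'>Exact Author Name</p></body></html>"
)
paths = locate(sample_html, {"title": "Exact Title", "author": "Exact Author Name"})
print(paths)
```

Once the path to one example is known, the same path could in principle be reused to pull the corresponding element from similar pages, which is presumably why the guide values have to match the page text exactly.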

Key Format Options

  • key:list: Selects a list of items. You need to provide the exact text of the item you want to select.
  • key:ulist: Same as list, but each item of the list has a bullet.
  • key:olist: Same as list, but adds a number on each item (e.g., ["1. Item 1", "2. Item 2"]).
  • key:content: Similar to list, but returns a long string with items joined by "\n\n".
  • key:rm={text to remove}: Removes the specified text from the result. Note: For now, if you use this option together with others, put it last. Example: key:urllist:prefix=prefix:rm={text to remove}.
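The effect of these options on a list of extracted items can be sketched as follows. This is an illustration under assumptions, not the tool's code: format_items is a hypothetical helper, and applying rm before adding bullets or numbers is a guess about the order of operations.

```python
from typing import List, Optional, Union


def format_items(
    items: List[str],
    list_type: str = "list",
    as_content: bool = False,
    rm: Optional[str] = None,
) -> Union[List[str], str]:
    """Format extracted items the way the list options are described."""
    # rm: strip the unwanted text from every item (assumed to happen first).
    cleaned = [item.replace(rm, "").strip() if rm else item for item in items]

    if list_type == "ulist":
        # ulist: each item gets a bullet.
        cleaned = [f"• {item}" for item in cleaned]
    elif list_type == "olist":
        # olist: each item gets its 1-based position.
        cleaned = [f"{i}. {item}" for i, item in enumerate(cleaned, start=1)]

    # content: return one long string joined by blank lines instead of a list.
    return "\n\n".join(cleaned) if as_content else cleaned


print(format_items(["Item 1", "Item 2"], list_type="olist"))
print(format_items(["Item 1", "Item 2"], list_type="ulist", as_content=True))
```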

Newly Added Options

  • key:url: Gets the URL of an element. Provide the exact URL as the value.
  • key:urllist: Gets a list of URLs. Provide one exact URL from that list.
  • key:img: Gets the image URL of an element. Provide the exact image URL.
  • key:gallery: Gets a list of images, for example from a carousel. Provide one exact image URL from the list of images you want. This might not work if the images are lazy-loaded.
  • key:pairs: Best used when the data is formatted as key-value pairs, for example a product specification. You only need to provide at least one of those key-value pairs.
  • key:pairs:switch: Sometimes the value appears before the key. Use this option to swap them, so that the key becomes the value and the value becomes the key.
  • key:prefix=https://www.claudejera.me: Adds a prefix to a value. If the value is a list, the prefix will be added to each item in the list.
  • key:suffix=suffix text: Adds a suffix to a value. If the value is a list, the suffix will be added to each item in the list.
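Because an option value such as a prefix URL can itself contain a colon, splitting a key on every ":" would not work. The sketch below shows one possible way to parse a key, splitting only at colons that are followed by a known option name. The helper names parse_key and apply_affixes are hypothetical, and stripping the optional braces around the rm text is an assumption based on the examples in this guide.

```python
import re
from typing import Dict, List, Tuple, Union

FLAGS = ("urllist", "url", "ulist", "olist", "list",
         "content", "img", "gallery", "pairs", "switch")
PARAMS = ("prefix", "suffix", "rm")

# Split only at a colon that starts a known option, so "https://..." survives.
_SPLIT = re.compile(
    r":(?=(?:%s)(?::|$)|(?:%s)=)" % ("|".join(FLAGS), "|".join(PARAMS))
)


def parse_key(key: str) -> Tuple[str, List[str], Dict[str, str]]:
    """Return (name, flags, params) for a guide key."""
    name, *options = _SPLIT.split(key)
    flags: List[str] = []
    params: Dict[str, str] = {}
    for opt in options:
        head, sep, value = opt.partition("=")
        if sep and head in PARAMS:
            # Assumption: braces around the value are optional decoration.
            if value.startswith("{") and value.endswith("}"):
                value = value[1:-1]
            params[head] = value
        else:
            flags.append(opt)
    return name, flags, params


def apply_affixes(value: Union[str, List[str]], prefix: str = "", suffix: str = ""):
    """Add prefix/suffix to a single value, or to each item of a list."""
    if isinstance(value, list):
        return [f"{prefix}{item}{suffix}" for item in value]
    return f"{prefix}{value}{suffix}"


print(parse_key("key:urllist:prefix=prefix:rm={text to remove}"))
print(parse_key("key:prefix=https://www.claudejera.me"))
print(apply_affixes(["/a", "/b"], prefix="https://www.claudejera.me"))
```

A limitation of this approach is that an rm text containing something like ":list" would be split incorrectly, which is one more reason to keep rm as the last option.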

Value Format Options

  • value: The default format.
  • Guide=>value: The text before => acts as a guide that leads to the desired value, which is the text after it (for example, By=>John Mitzewich).
  • Key||value: Use this to separate the key and value for the "pairs" option.
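The three value formats can be told apart with simple string splitting. The following is a rough sketch only: parse_value is a hypothetical helper, and removing a trailing colon from a pair key is an assumption inferred from the example on this page, where "Prep Time:||25 mins" yields the key "Prep Time".

```python
from typing import Dict


def parse_value(value: str, pairs: bool = False, switch: bool = False) -> Dict[str, str]:
    """Classify a guide value as a pair, a guided value, or a plain value."""
    if pairs and "||" in value:
        left, right = (part.strip() for part in value.split("||", 1))
        # switch: the value was written first, so swap the two sides.
        key, val = (right, left) if switch else (left, right)
        # Assumption: a trailing colon on the key is dropped in the output.
        return {"kind": "pair", "key": key.rstrip(":"), "value": val}
    if "=>" in value:
        guide, target = (part.strip() for part in value.split("=>", 1))
        return {"kind": "guided", "guide": guide, "value": target}
    return {"kind": "plain", "value": value}


print(parse_value("By=>John Mitzewich"))
print(parse_value("Prep Time:||25 mins", pairs=True))
print(parse_value("143||Calories", pairs=True, switch=True))
```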

Combined Formats

You can combine multiple formats. For example:

"key:olist:ulist:content:rm=text to remove": "Exact text to add"

In this case, the first list type that appears in the key (olist) is used as the list type, and the specified text is removed from each item in the list.
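The "first list type wins" rule can be expressed in a few lines. This is a sketch under assumptions: resolve_list_options is a hypothetical helper that takes the options already split out of the key, and treating content as a separate flag that combines with the chosen list type is inferred from the ingredients example on this page.

```python
from typing import List, Optional, Tuple

LIST_TYPES = ("list", "ulist", "olist")


def resolve_list_options(options: List[str]) -> Tuple[Optional[str], bool]:
    """Return (first list type found or None, whether 'content' is present)."""
    # The first list type in key order wins; later list types are ignored.
    list_type = next((opt for opt in options if opt in LIST_TYPES), None)
    return list_type, "content" in options


print(resolve_list_options(["olist", "ulist", "content", "rm=text to remove"]))
print(resolve_list_options(["ulist", "list", "content", "olist"]))
```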

Example Guide

{
  "title": "Philly Cheese Steak Dip",
  "short_desc": "This Philly cheesesteak dip would make a handsome addition to your snack table for the Big Game. Like all great party foods, it's wonderful hot, warm, room temp, and, I've heard from a reliable source, even delicious cold. Serve alongside sliced baguette. Keep it hot and fresh for guests by baking it in 2 batches, or feel free to do this all at once in a 9x13-inch baking dish.",
  "submitted_by": "By=>John Mitzewich",
  "info:pairs": "Prep Time:||25 mins",
  "ingredients:ulist:list:content:olist": "1 pound beef top sirloin steaks",
  "directions:rm=Repeat with the second batch of dip:olist": "Slice steak into thick pieces. Season generously on both sides with salt and pepper.",
  "facts:pairs:switch": "143||Calories"
}

Resulting Test Data

{
  "title": "Philly Cheese Steak Dip",
  "short_desc": "This Philly cheesesteak dip would make a handsome addition to your snack table for the Big Game. Like all great party foods, it's wonderful hot, warm, room temp, and, I've heard from a reliable source, even delicious cold. Serve alongside sliced baguette. Keep it hot and fresh for guests by baking it in 2 batches, or feel free to do this all at once in a 9x13-inch baking dish.",
  "Prep Time": "25 mins",
  "Cook Time": "1 hr",
  "Total Time": "1 hr 25 mins",
  "Servings": "24",
  "ingredients": "• 1 pound beef top sirloin steaks\n\n• salt and freshly ground black pepper to taste\n\n• 1 tablespoon olive oil\n\n• 1 yellow onion, diced\n\n• 1 tablespoon butter\n\n• 1 red bell pepper\n\n• 1 green bell pepper\n\n• ½ cup pepperoncini peppers\n\n• 7 pickled red peppers (such as Peppadew®)\n\n• 3 jalapeño peppers\n\n• 2 (8 ounce) packages cream cheese, softened\n\n• ½ pound provolone cheese\n\n• ½ teaspoon Worcestershire sauce\n\n• 1 pinch cayenne pepper",
  "directions": [
    "1. Slice steak into thick pieces. Season generously on both sides with salt and pepper.",
    "2. Heat olive oil in a pan over high heat until smoking. Add steak slices and sear until bottoms are browned, 3 to 4 minutes. Flip over and reduce heat to medium. Cook until juices appear on the tops, 3 to 4 minutes more. Transfer to a bowl to cool.",
    "3. Add diced onion, a big pinch of salt, and butter to the meat juices in the pan. Cook and stir over medium heat, scraping up the browned bits, until onions start to soften, 5 to 7 minutes.",
    "4. Dice up the red bell pepper, green bell pepper, pepperoncini, pickled red peppers, and jalapenos until you have 1 1/2 to 2 cups. Add to the onions; cook and stir until starting to soften, about 5 minutes.",
    "5. Preheat the oven to 400 degrees F (200 degrees C).",
    "6. Chop steak into small pieces. Place back into the bowl with the accumulated meat juices. Add the onion-pepper mixture and cream cheese. Grate in the provolone cheese, saving a little for the top. Drizzle in Worcestershire sauce and sprinkle in cayenne pepper. Mix dip thoroughly.",
    "7. Transfer 1/2 of the dip into a small baking dish. Smooth the top with a fork. Place baking dish on top of a sheet pan. Sprinkle the remaining provolone cheese on top. Save the other 1/2 to bake fresh for your guests.",
    "8. Bake in the preheated oven until dip is bubbling and heated through, 20 to 25 minutes. Broil until top is browned, about 1 minute more. Repeat with the second batch of dip. crys"
  ],
  "Calories": "143",
  "Fat": "11g",
  "Carbs": "3g",
  "Protein": "7g"
}

Note: You can check the source website by visiting this URL to compare and fully understand how guides are made.

Need Help or Have Suggestions?

I'm always looking to improve my HTML/URL Scraper and make it even more useful for you. If you have any suggestions or questions, or run into issues, please don't hesitate to reach out to me. I'd love to hear from you!

Also, make sure to visit my Facebook page for the latest updates!

If you're facing a scraping challenge or difficulty, feel free to reach out! I'd be more than happy to assist you in finding a solution. Let's tackle it together!