Start Agent From URL List
WebSundew can scrape data starting from a list of URLs. You can create either an Agent or an Extractor; this walkthrough uses an Agent. If you are not familiar with Agents, Extractors, and other concepts, you can read about them here.
Step 1 - Create New Project
Click New Project in the application toolbar.
Step 2 - Create New Agent
Click New Agent in the application toolbar.
The New Agent dialog will appear:
Select URL List. Then load the URLs from text files, paste them from the clipboard, or add them manually.
https://demo.websundew.io/ecommerce/details?id=1
https://demo.websundew.io/ecommerce/details?id=2
https://demo.websundew.io/ecommerce/details?id=3
https://demo.websundew.io/ecommerce/details?id=4
https://demo.websundew.io/ecommerce/details?id=5
Copy the URLs above and paste them into the list.
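If the list is long, it may be easier to generate it with a script than to type it by hand. The following is a minimal sketch in Python, not part of WebSundew itself: it builds the same five demo URLs and writes them to a text file. The helper names and the file name `urls.txt` are hypothetical, and the one-URL-per-line layout is an assumption about what the URL List dialog accepts from a text file.

```python
from urllib.parse import urlencode

# Demo product page used in this walkthrough.
BASE_URL = "https://demo.websundew.io/ecommerce/details"


def build_url_list(first_id, last_id):
    """Return one details-page URL per product id, first_id..last_id inclusive."""
    return [
        f"{BASE_URL}?{urlencode({'id': product_id})}"
        for product_id in range(first_id, last_id + 1)
    ]


def save_url_list(urls, path):
    """Write the URLs to a text file, one per line (assumed input format)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(urls) + "\n")


urls = build_url_list(1, 5)

if __name__ == "__main__":
    # "urls.txt" is a hypothetical file name; load it in the URL List dialog
    # or copy its contents to the clipboard.
    save_url_list(urls, "urls.txt")
    print("\n".join(urls))
```

The resulting file can then be loaded in the dialog, or its contents pasted from the clipboard.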
Click Ok to finish creating the Agent. The agent's editor will open, and the page at the first URL will be shown in the browser part of the agent editor.
Other Steps
You have configured an agent that starts from the URL list. Now you need to capture and export the extracted data. These steps depend on the HTML structure of the pages and on the required export format. You can read more about capture and about export, or follow our tutorials.
Edit Agent
You can modify the URL list after you have created the agent:
- Open the Agent for editing.
- Select Loop in the agent's graph.
- Open the Properties View and modify the URLs.