Starting any type of site capture process requires you to define a crawler in the Site Capture interface. To help you get started quickly, Site Capture comes with two sample crawlers, Sample and FirstSiteII. This guide assumes the crawlers were installed during the Site Capture installation process and uses the Sample crawler primarily. To create your own crawler, you must name the crawler (typically, after the target site) and upload a Groovy configuration file, which controls the site capture process. For any capture mode, you can configure crawlers to email reports as soon as they are generated.
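As an illustration, a crawler's Groovy configuration file typically extends a base configurator class and overrides a handful of crawl settings. The sketch below is hypothetical: the class and method names (BaseConfigurator, getStartUri, getMaxLinks, getMaxCrawlDepth) and the package name are assumptions modeled on common Site Capture samples, and should be checked against the Sample crawler shipped with your installation.

```groovy
// Hypothetical sketch of a crawler configuration file.
// Class, method, and package names are assumptions; compare with the
// Sample crawler installed with Site Capture before using.
package com.mycompany.crawler

class CrawlerConfigurator extends BaseConfigurator {

    // Seed URL(s) where the crawl begins, typically the target site's home page.
    String[] getStartUri() {
        return ["http://www.example.com/home"]
    }

    // Stop after this many links have been crawled.
    int getMaxLinks() {
        return 150
    }

    // Do not follow links deeper than this many levels from the start URI.
    int getMaxCrawlDepth() {
        return 4
    }
}
```

Uploading a file like this when you define the crawler is what tells Site Capture where to start, how much to crawl, and when to stop.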
In static mode, a crawled site is stored as files ready to be served. Only the latest capture is kept (the previously stored files are overwritten). You can initiate static crawl sessions manually from the application interface or after a publishing session. However, you can manage downloaded sites only from the Site Capture file system.

In archive mode, all crawled sites are kept, stored as zip files (archives) in time-stamped folders. Pointers to the zip files are created in the Site Capture database. Like static sessions, you can initiate archive crawl sessions manually from the Site Capture interface or after a publishing session. However, because the zip files are referenced by pointers in the Site Capture database, you can manage them from the Site Capture interface: you can download the files, preview the archived sites, and set capture schedules.

For any capture mode, logs are generated after the crawl session to provide such information as crawled URLs, HTTP status, and network conditions. In static capture, you must obtain the logs from the file system; in archive capture, you can download them from the Site Capture interface.
Table 36-1 Static Capture Mode and Archive Mode

Static Mode: Static mode supports rapid deployment and high-availability scenarios.
Archive Mode: Archive mode is used to maintain copies of websites on a regular basis for compliance purposes or similar reasons.