APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed

LWP (Library for WWW in Perl)


© May 2005 Tony Lawrence

If you want to automatically process web pages to extract data, you have a number of tools available. You can bring a web page down to your computer using "curl" or "wget":

curl http://aplawrence.com > mysite
 

If you don't really want the HTML, use "lynx --dump https://whatever.com > /yourstorage/whatever.txt" to get a text representation of the page. Check the man page for options you might want, like "--nolist", which leaves the list of links out of the dump.

You can also easily be selective and pull only the data you want from a page with simple Perl scripts.

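#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;    # https URLs also need LWP::Protocol::https installed

my $url = 'https://aplawrence.com';
my $content = get($url);    # returns undef if the fetch fails
die "Could not fetch $url\n" unless defined $content;
print $content;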
 

And then of course you'd process the $content as desired. It's only a little more complex if you are dealing with forms.
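As a minimal sketch of that processing step, the script below pulls the page title out of the fetched HTML with a regular expression. The extract_title helper and the idea of taking URLs from the command line are hypothetical illustrations, not part of LWP, and a regex like this is only adequate for simple pages; a real parser such as HTML::Parser is the safer choice for anything messier.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw(get);

# Hypothetical helper: return the text of the first <title> element,
# with surrounding whitespace trimmed, or undef if there is none.
sub extract_title {
    my ($html) = @_;
    return unless defined $html;
    if ($html =~ m{<title[^>]*>\s*(.*?)\s*</title>}is) {
        return $1;
    }
    return;
}

# Fetch each URL named on the command line and print its title.
# With no arguments, nothing is fetched.
for my $url (@ARGV) {
    my $content = get($url);
    unless (defined $content) {
        warn "Could not fetch $url\n";
        next;
    }
    my $title = extract_title($content);
    print "$url: ", (defined $title ? $title : '(no title found)'), "\n";
}
```

You might run it as "perl title.pl https://aplawrence.com" (the script name is just an example); the same pattern works for any other piece of the page you can describe with a match.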

A book that covers LWP is reviewed at /Books/webc.html.
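For the form case mentioned above, a rough sketch using LWP::UserAgent and HTTP::Request::Common might look like this. The build_form_request helper, the field name "q", and the example URL in the comment are hypothetical placeholders; substitute whatever fields the real form expects.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request::Common qw(POST);

# Hypothetical helper: build a POST request that carries the given
# name/value pairs as url-encoded form data.
sub build_form_request {
    my ($url, $fields) = @_;
    return POST($url, $fields);
}

# Submit the form only when a URL is given on the command line, e.g.
#   perl form.pl https://example.com/search.cgi
if (my $url = shift @ARGV) {
    my $ua = LWP::UserAgent->new(timeout => 30);
    my $response = $ua->request(build_form_request($url, [ q => 'lwp' ]));
    $response->is_success
        or die "Request failed: ", $response->status_line, "\n";
    print $response->decoded_content;
}
```

The page that comes back from the form can then be processed just like any other $content.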











